Extract rows and calculate average

JC3019

I set up 2 problems as follows:

I have two matrices (Mat1 and Mat2). Both matrices are of equal size. I have four output matrices (Output1, Output2, Output3, Output4 respectively) both the same size as Mat1 and Mat2.

Problem 1:

  1. In Mat2, Identify the row that contains the maximum value in column1. Lets assume this is row 1.

  2. Go to Row 1 of Mat1 and extract the first 3 columns of Mat1 Row1 and store in Output1. Store all other rows in Mat2 for the first 3 columns. At this stage Output1 is 1x3. Output 2 is (n-1)x3.

  3. Move to Column 4 of Mat2. Identify the row that contains the maximum value. Lets say this is row 5.

  4. Go to row 5 of Mat1 column 4. Store Row5 columns 4,5,6 in Output1. Store all other rows of Mat1 for columns 4,5,6 in Output2. ... Repeat this process for all columns in Mat2 following the sequence:1,4,7,9 etc. In this case, i have 25 columns for Mat1 and Mat2, so the sequence will end at 24.

I need to be able to change the sequence from 1,4,7,9 etc, to 1,13,25 etc.

Problem2: is equivalent to problem 1, except this time i identify the rows that contain the top-two values in every stage.

  1. In Mat2, Identify the rows that contain the top-two values in Column1. Lets say these rows are 2 and 5. Store the first 3 columns of rows 2 and 5 of Mat1 in Output3. Store all remaining rows (column 1-3) of Mat1 in Output4.

  2. Move to Column 4 of Mat2. Identify the rows which contain the top two values in column 4 of Mat2. lets say rows 1 and 2.

  3. Move to Column 4 of Mat 1. Store column 4,5,6 Row 1 and 2 into Output3. Store all remaining rows in Output4.

Sidenote: This process must be easily extended for matrices with 1000x1000 dimensions. So would prefer not to do this manually.

Mat1 <- data.frame(matrix(nrow = 10, ncol =25, data = rnorm(250,0,1)))
Mat2 <- data.frame(matrix(nrow = 10, ncol =25, data = rnorm(250,0,1)))

> Mat1
            X1          X2         X3          X4         X5           X6          X7          X8          X9         X10         X11        X12        X13
1  -2.22415466  0.98712728  1.0084356  0.58447183  0.2608830 -0.341029099 -0.71693894 -0.61653058 -0.24790470  0.10777970 -1.68562271 -1.6638535 -0.5538468
2   1.11444365 -0.34865955  0.7518822 -0.07573724  0.1336811 -0.831275643 -0.15564822 -0.68849375 -0.05094047  0.21990082 -0.69879135 -0.6348292  1.0172304
3  -0.05367747  0.08654206 -0.3023270 -0.67335942 -1.1173279  0.004670625  0.52482501  0.78330982  1.18795853 -0.06513613  0.42353439 -0.4152209  1.7174158
4   0.42118984 -0.43257583 -1.3368036  1.64849798  0.8294276  1.256987496 -0.50440892  1.07686292  0.94196135  2.90916270 -0.08714083  0.1094395  1.1715895
5  -0.13720451 -0.94864452  1.9751962 -0.70523555  0.1431405  0.569928767  0.54877505 -0.44571903 -1.16282161 -1.65590032 -0.17710859 -0.8904316  0.3252576
6   0.64336424 -0.38277541 -1.6512377 -0.06542054 -0.1195322  0.666255832  0.60826054  1.88822842 -0.52952627 -0.44776682  0.04321836 -0.6190585 -0.9529690
7  -1.04160098  1.10952094 -0.9186759  0.77437293 -0.2284926 -0.113106151 -0.32092624  1.34157301  2.33813068  1.21812714  0.13165646  0.5532299 -1.3470645
8   1.22940987 -1.26271164 -1.2483658 -2.00578793 -0.6773794 -0.228135998 -0.06223206 -1.97606848  1.67339247 -0.47268196 -0.83544561 -0.3313278 -0.2373613
9   0.08485706 -1.60594589  0.8549923 -0.23394708 -0.5978692 -0.321839877 -0.55298452 -0.08387815 -0.99196489  0.83364114 -0.19579612 -0.8017648 -0.2238073
10 -1.71702699  0.39086484 -0.9974210  0.86232862 -0.2755329 -0.160656438  0.49669949  0.73763073 -0.42380390  1.91208332 -0.27778479  0.7866471  0.1813511

> Mat2
            X1         X2          X3         X4         X5          X6          X7         X8           X9        X10        X11        X12        X13
1   0.11053732 -0.5750170  2.58105259 -1.6895285  0.0508918 -0.54188929 -0.92292169  1.4972970  0.009239807 -0.1706461 -0.8942262 -1.6351505  0.2029262
2  -0.83802776 -0.9322157 -0.34753884  0.8164819  0.7318198  0.09162218  0.15971493 -2.6731067  1.554323641 -0.3161967  0.4622101 -1.9521229 -1.3229961
3   0.61368153 -1.3650360  0.95674229  0.4582117 -0.6959545 -0.59627428  1.94172156  1.6784237 -0.482524695 -0.0514944 -0.4608930 -0.5456863 -0.1340540
4  -1.03156503 -0.2516495  0.76770177 -0.7841354 -3.2404904 -1.76276859  1.57421914  0.9782458 -1.364451438 -0.6437429  0.7485424 -0.8778284  1.7587504
5   0.01183232  0.6825633  1.39634308  1.4136879  0.5166420  0.76930390  0.67210932  1.3007904 -0.284451411  0.5163457  0.3198626  0.8030497 -1.4320064
6  -0.06110883 -0.6762991  0.56105196  0.9767543 -1.0016294 -0.84811626 -0.83319744 -1.1777865 -1.185631394 -0.5673733  0.2956725  0.5425602 -1.0510479
7  -0.56195630  1.3883881  0.09995573  0.6722959 -1.6205290  0.32085867 -0.94243554 -0.2340429 -1.299085265 -0.4433517  0.4424583 -2.8887970  0.1679859
8   1.04612102  0.8360530  0.07005306  0.4818317  1.1857504  0.13649605  1.35261983  0.8008935 -0.101922164  0.6773003 -1.0265770  0.1859912  0.2678461
9   0.88419676 -1.7012899 -1.09656000 -0.4360276  0.6238451 -2.03256276 -1.12575579  1.8407234  0.522372401 -0.6229582  0.6727720 -0.5695190  0.6298388
10 -0.68648649 -0.6689894 -0.56849261 -1.9012760  1.1418180  0.46377789 -0.08107475  1.4378120 -1.489367198 -0.7682887 -0.2858680  0.9584056  1.3178700

So for example:

which.max(Mat2[,1]) # 8 

so go to row 8 of Mat1 and store the first 3 cols in Output1.

Output1[1, 1:3] #  1.22940987 -1.26271164 -1.2483658

Store all other rows of Mat1 for cols 1 to 3 in Output2.

which.max(Mat2[, 4]) # 5

implies

Output1[1, 4:6] # -0.70523555  0.1431405  0.569928767

And so on and so forth.

Robin Gertenbach

Does this do what you need?

Juggle <- function(m1, m2, col, step, top_n = 1) {
  if (length(col) == 1) {
    output_rows <- sort(m2[, col], index.return = TRUE, decreasing = TRUE)$ix
    col_idx <- seq(col, col + step - 1)
    top_idx <- seq(top_n)
    bottom_idx <- seq(top_n + 1, nrow(m2))
    max_out <- m1[output_rows[top_idx], col_idx]
    rest_out <- m1[output_rows[bottom_idx], col_idx]
    return(list(max_out = max_out, rest_out = rest_out))
  } else {
    left <- Juggle(m1, m2, col[1], step, top_n)
    right <- Juggle(m1, m2, col[seq(2, length(col))], step, top_n)
    return(list(
      max_out = cbind(left$max_out, right$max_out),
      rest_out = cbind(left$rest_out, right$max_out)
    ))
  }
}

m1 and m2 are the matrices or data frames.
col is a single or sequence of columns like 1, 4, 7, 9 or 1,13,25.
step is the amount of columns to extract.
top_n is the amount of rows to extract.

This will return a list with components max_out for the top_n row extracts and rest_out for the N-top_n row extracts.

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

Calculate average of columns and rows

Calculate average from multiple rows

Calculate the average of the rows for each group

How to calculate average of adjacent rows?

How to calculate the average of rows/selected rows in the matrix

How to calculate an average value across database rows?

Calculate average date difference of multiple rows

How to calculate average unique rows in SQL?

Calculate the average of rows with similar values in R

Calculate weighted average of dataframe rows with missing values

Calculate a Weighted Rolling Average by rows by group in r?

Calculate the average date every x rows

Calculate average of every n rows in pandas and assign new labels to rows

Pandas: Calculate the average every 2 rows of a column and put it into the a new column

Calculate average of every n rows from a csv file

Add new rows to calculate the sum and average from exiting pandas dataframe

calculate average and maximum value for subset of rows in pandas dataframe

Calculate Average between columns by comparing two rows in SQL Server

how to calculate average when some rows does not exist?

Fetch values from the selected rows and calculate its average

Pandas Calculate Average Bias By Rows from 2 Columns

Googlesheets - ArrayFormula to calculate the weighted average of two rows ignoring text cells

Calculate average between rows in SQL by using lag and ignore first row

Calculate average and standard deviation per 5 rows in a pandas dataframe

Calculate running average in R using first 50 rows

How to calculate moving average from previous rows in data.table?

Calculate average values for rows with different ids in MS Excel

How to calculate the daywise average and multiply it to number of rows in that particular day

c# datatable using compute to calculate an average on a range of rows