R - Automatically stack every nth column of a data frame and save them as new objects

Hanx70 Published at Dev

Hanx70

I have a date frame called DF with, say, three variables that repeat each other cyclically:

      A      B      C      A      B      C 
1    a1     b1     c1     a5     b5     c5
2    a2     b2     c2     a6     b6     c6
3    a3     b3     c3     a7     b7     c7
4    a4     b4     c4     a8     b8     c8

I want to stack the first A column on the second A column (and on the third, and fourth and so on, if they exist), and do the same with the other variables, and then save the result as new objects (as vectors, for example). So what I want to obtain is

V_A <- c(a1,a2,a3,a4,a5,a6,a7,a8)
V_B <- c(b1,b2,b3,b4,b5,b6,b7,b8)
V_C <- c(c1,c2,c3,c4,c5,c6,c7,c8)

While it's very easy to do it manually, like this

V_A <- DF[,seq(1, ncol(DF), 3]
V_A <- stack(DF)
V_B <- DF[,seq(2, ncol(DF), 3]
V_B <- stack(DF)
V_C <- DF[,seq(3, ncol(DF), 3]
V_C <- stack(DF)

what I'm looking for is a code that does this automatically, so that it will work for data frames with every number of variables without having to write ad-hoc codes every time. To sum up, the code should: 1) select every nth column in the data frame 2) stack this columns 3) save the result in new objects automatically created

I feel there must be a way to do this but I haven't succeeded so far. Thanks very much in advance.

EDIT Let's say I am in a slightly different situation, in which the columns repeat but not with exactly the same name, and I still want to do the same thing. So I have:

     A1      B1      C1      A2      B2      C2 
1    a11     b11     c11     a25     b25     c25
2    a12     b12     c12     a26     b26     c26
3    a13     b13     c13     a27     b27     c27
4    a14     b14     c14     a28     b28     c28

and I want:

V_A <- c(a11,a12,a13,a14,a25,a26,a27,a28)
V_B <- c(b11,b12,b13,b14,b25,b26,b27,b28)
V_C <- c(c11,c12,c13,c14,c25,c26,c27,c28)

How can I do it?

G. Grothendieck

Here are some alternatives. No packages are used.

1) aperm Create a 3d array a, permute the dimensions and reshape into a matrix m and then convert that to a data frame. This one only works if all values are of the same type. (2) and (3) do not have this limitation.

k <- 3
nr <- nrow(DF)
nc <- ncol(DF)
unames <- unique(names(DF))

a <- array(as.matrix(DF), c(nr, k, nc/k))
m <- matrix(aperm(a, c(1, 3, 2)),, k, dimnames = list(NULL, unames))
as.data.frame(m, stringsAsFactors = FALSE)

giving:

   A  B  C
1 a1 b1 c1
2 a2 b2 c2
3 a3 b3 c3
4 a4 b4 c4
5 a5 b5 c5
6 a6 b6 c6
7 a7 b7 c7
8 a8 b8 c8

If we are in the situation given in the question's EDIT then replace unames with the following where DF2 is DF with the revised names as per Note at end:

unames <- unique(sub("\\d*$", "", names(DF2)))

2) lapply This generalizes the code in the question. unames is defined above:

L <- lapply(split(as.list(DF), names(DF)), unlist)
as.data.frame(L, stringsAsFactors = FALSE)

giving:

   A  B  C
1 a1 b1 c1
2 a2 b2 c2
3 a3 b3 c3
4 a4 b4 c4
5 a5 b5 c5
6 a6 b6 c6
7 a7 b7 c7
8 a8 b8 c8

With the input shown in the question's EDIT it could be done like this where DF2 is given reproducibly in the Note at the end.

names0 <- sub("\\d*$", "", names(DF2))   # names without the trailing digits
L <- lapply(split(as.list(DF2), names0), unlist)
as.data.frame(L, stringsAsFactors = FALSE)

3) reshape nc and unames are from above. varying is a list with k components such as that the ith component contains the index vector c(i, i+k, ...). It seems that reshape does not like duplicated names so we have given it setNames(DF, 1:nc) as the input. This solution does have the advantage of also generating the index vectors time and id which relate the output to the input data.

varying <- split(1:nc, names(DF))
reshape(setNames(DF, 1:nc), dir = "long", varying = varying, v.names = unames)

giving:

    time  A  B  C id
1.1    1 a1 b1 c1  1
2.1    1 a2 b2 c2  2
3.1    1 a3 b3 c3  3
4.1    1 a4 b4 c4  4
1.2    2 a5 b5 c5  1
2.2    2 a6 b6 c6  2
3.2    2 a7 b7 c7  3
4.2    2 a8 b8 c8  4

With the input shown in the question's EDIT it actually simplifies. We no longer need to use setNames(DF, 1:nc) but can just use the data frame as is as input. Also, we can use varying=TRUE (also see @thelatemail's comment) instead of calculating a complex argument for varying. The input DF2 is as shown in the Note at the end and names0 is as in (2) above.

reshape(DF2, dir = "long", varying = TRUE, v.names = unique(names0))

Note:

Lines <- "      A      B      C      A      B      C 
1    a1     b1     c1     a5     b5     c5
2    a2     b2     c2     a6     b6     c6
3    a3     b3     c3     a7     b7     c7
4    a4     b4     c4     a8     b8     c8"
DF <- read.table(text = Lines, as.is = TRUE, check.names = FALSE)

DF2 <- setNames(DF, c("A1", "B1", "C1", "A2", "B2", "C2")) # test input

Upate: A number of simplifications. Also added DF2 in Note at end and discuss in each alternative how to modify the code to deal with it. (A general method might be just to reduce DF2 to DF as I discussed in the comments below.)

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at2020-10-28

Comments

0 comments

TOP Ranking

Article

R - Automatically stack every nth column of a data frame and save them as new objects

R - Automatically stack every nth column of a data frame and save them as new objects

pump.io port in URL

Loopback Error: connect ECONNREFUSED 127.0.0.1:3306 (MAMP)

Can't pre-populate phone number and message body in SMS link on iPhones when SMS app is not running in the background

How to import an asset in swift using Bundle.main.path() in a react-native native module

Failed to listen on localhost:8000 (reason: Cannot assign requested address)

Spring Boot JPA PostgreSQL Web App - Internal Authentication Error

ngClass error (Can't bind ngClass since it isn't a known property of div) in Angular 11.0.3

Using Response.Redirect with Friendly URLS in ASP.NET

Can a 32-bit antivirus program protect you from 64-bit threats

Double spacing in rmarkdown pdf

How to fix "pickle_module.load(f, **pickle_load_args) _pickle.UnpicklingError: invalid load key, '<'" using YOLOv3?

3D Touch Peek Swipe Like Mail

Bootstrap 5 Static Modal Still Closes when I Click Outside

Assembly definition can't resolve namespaces from external packages

Vector input in shiny R and then use it

Emulator wrong screen resolution in Android Studio 1.3

Svchost high CPU from Microsoft.BingWeather app errors

Graphics Context misaligned on first paint

Python connect to firebird docker database

Is this docker-for-mac password dialog legit?

How to save models trained locally in Amazon SageMaker?