I have a data.frame with more than 120,000 rows. It looks like this:
> head(mydf)
   ID MONTH.YEAR VALUE
1 110  JAN. 2012  1000
2 111  JAN. 2012  1000
3 121  FEB. 2012  3000
4 131  FEB. 2012  3000
5 141  MAR. 2012  5000
6 142  MAR. 2012  4000
I want to group the data.frame by the MONTH.YEAR and VALUE columns and count the rows in each group. The expected result should look like this:
MONTH.YEAR VALUE count
 JAN. 2012  1000     2
 FEB. 2012  3000     2
 MAR. 2012  5000     1
 MAR. 2012  4000     1
I tried splitting it and using sapply to count the rows in each group. This is my code:
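# split into one data.frame per (MONTH.YEAR, VALUE) combination, dropping empty ones
sp <- split(mydf, list(mydf$MONTH.YEAR, mydf$VALUE), drop = TRUE)
# count the rows of each piece once and reuse the result
counts <- sapply(sp, nrow)
result <- data.frame(yearandvalue = names(counts), count = counts)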
However, this is very slow. Is there a more efficient way to implement this? Thank you very much.
Try
aggregate(ID~., mydf, length)
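To make the aggregate() call concrete, here is a minimal sketch on the six sample rows from the question. The data.frame is reconstructed from the head(mydf) output, so the column types are an assumption, and renaming the result column to count is an addition for readability rather than part of the one-liner above.

```r
# Sample data reconstructed from the head(mydf) output in the question
# (column types are assumed: numeric ID and VALUE, character MONTH.YEAR)
mydf <- data.frame(
  ID = c(110, 111, 121, 131, 141, 142),
  MONTH.YEAR = c("JAN. 2012", "JAN. 2012", "FEB. 2012",
                 "FEB. 2012", "MAR. 2012", "MAR. 2012"),
  VALUE = c(1000, 1000, 3000, 3000, 5000, 4000),
  stringsAsFactors = FALSE
)

# ID ~ . groups by every other column (here MONTH.YEAR and VALUE);
# length() then counts how many ID values fall into each group
res <- aggregate(ID ~ ., data = mydf, FUN = length)

# aggregate() keeps the name ID for the counts, so rename it
names(res)[names(res) == "ID"] <- "count"
res
```

Note that ID ~ . only works as intended when MONTH.YEAR and VALUE are the only other columns. With more columns, spelling the formula out as ID ~ MONTH.YEAR + VALUE is the safer choice.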
Or
library(dplyr)
mydf %>%
  group_by(MONTH.YEAR, VALUE) %>%
  summarise(count = n())
Or
library(data.table)
setDT(mydf)[, list(count = .N), by = list(MONTH.YEAR, VALUE)]
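As a rough illustration of the data.table variant, the sketch below runs it on the six sample rows from the question. The data.frame is again reconstructed from the head(mydf) output (column types assumed), and as.data.table() is used instead of setDT() only so that mydf itself is left untouched; setDT() converts the object in place.

```r
library(data.table)

# Sample data reconstructed from the question (column types are assumed)
mydf <- data.frame(
  ID = c(110, 111, 121, 131, 141, 142),
  MONTH.YEAR = c("JAN. 2012", "JAN. 2012", "FEB. 2012",
                 "FEB. 2012", "MAR. 2012", "MAR. 2012"),
  VALUE = c(1000, 1000, 3000, 3000, 5000, 4000),
  stringsAsFactors = FALSE
)

# as.data.table() makes a copy; setDT(mydf) would convert mydf by reference
dt <- as.data.table(mydf)

# .N is the number of rows in each group defined by `by`
res <- dt[, .(count = .N), by = .(MONTH.YEAR, VALUE)]
res
```

With by, data.table returns the groups in order of first appearance, which matches the row order of the expected result in the question. Using keyby instead would sort the result by the grouping columns.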