how can I split a dataframe by two columns and count number of rows based on group more efficient

Zihu Guo Published at Dev

Zihu Guo

I have a data.frame with more than 120000 rows, it looks like this

> head(mydf)
ID MONTH.YEAR VALUE
1 110  JAN. 2012  1000
2 111  JAN. 2012  1000
3 121  FEB. 2012  3000
4 131  FEB. 2012  3000
5 141  MAR. 2012  5000
6 142  MAR. 2012  4000

and I want to split the data.frame depend on the MONTH.YEAR and VALUE column, and count the rows of each group, my expect answer should looks like this

MONTH.YEAR VALUE count
JAN. 2012  1000  2
FEB. 2012  3000  2
MAR. 2012  5000  1
MAR. 2012  4000  1

I tried to split it and use the sapply count the number of each group, and this is my code

sp <- split(mydf, list(mydf$MONTH.YEAR, mydf$VALUE), drop=TRUE);
result <- data.frame(yearandvalue = names(sapply(sp, nrow)), count = sapply(sp, nrow))

but I find the process is very slow. Is there a more efficient way to impliment this? thank you very much.

akrun

Try

aggregate(ID~., mydf, length)

library(dplyr)
 mydf %>%
    group_by(MONTH.YEAR, VALUE) %>%
    summarise(count=n())

library(data.table)
setDT(mydf)[, list(count=.N) , list(MONTH.YEAR, VALUE)]

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at2020-10-22

Comments

0 comments

Count number of unique rows based on two columns, by group

How can I count the number of occurrences of an element across two columns of a dataframe?

How can I combine rows in a pandas dataframe based on comparing values in two columns?

How can I remove rows from a dataframe based on having NA in two columns?

How can I separate two columns of count by group by?

How can I group the number of values into columns based on month in sql

How can I group by elements based on multiple columns in pandas dataframe and save the number of elements of each group in another column?

Conditional count of rows in a dataframe based on two columns in another dataframe

How can I find sequences of rows based on two columns?

How to split a dataframe in two dataframes based on the total number of rows in the original dataframe

How can i group pandas dataframe based on two colunms?

Group dataframe by two columns and then find average count based on one of the groups

How can I group with multiple columns and count?

How can I split this dataframe in multiple columns?

How can i split a column in more columns based on a specific value in certain cells

How can i count the number of rows duplicated?

How can I split index into two columns?

How can I split the difference between two timestamps that contain more than one month in a Pandas DataFrame

How can I split the difference between two timestamps that contain more than one date in a Pandas DataFrame

How to count the number of rows that have the value 1 for all the columns in a dataframe?

DAX/Power BI: How can I filter distinct rows conditionally on two (or more) columns?

How to split rows into columns in a dataframe

How to count occurrence in previous rows based on two columns value

How to count rows that have the same values in two columns in Dataframe

How can I count the number of rows within each group using SQL?

How can I count the number of rows in multiple datasets, group them by one column (month), in a Select statement in Bigquery?

Pandas dataframe, how can I group by multiple columns and apply sum for specific column and add new count column?

How to split a columns based on the index of the string in the columns while using a efficient method to parse all the Dataframe

Find if more rows exist based on two columns

TOP Ranking

Article

how can I split a dataframe by two columns and count number of rows based on group more efficient

how can I split a dataframe by two columns and count number of rows based on group more efficient

pump.io port in URL

Loopback Error: connect ECONNREFUSED 127.0.0.1:3306 (MAMP)

Can't pre-populate phone number and message body in SMS link on iPhones when SMS app is not running in the background

How to import an asset in swift using Bundle.main.path() in a react-native native module

Failed to listen on localhost:8000 (reason: Cannot assign requested address)

Spring Boot JPA PostgreSQL Web App - Internal Authentication Error

ngClass error (Can't bind ngClass since it isn't a known property of div) in Angular 11.0.3

Using Response.Redirect with Friendly URLS in ASP.NET

Can a 32-bit antivirus program protect you from 64-bit threats

Double spacing in rmarkdown pdf

How to fix "pickle_module.load(f, **pickle_load_args) _pickle.UnpicklingError: invalid load key, '<'" using YOLOv3?

3D Touch Peek Swipe Like Mail

Bootstrap 5 Static Modal Still Closes when I Click Outside

Assembly definition can't resolve namespaces from external packages

Vector input in shiny R and then use it

Emulator wrong screen resolution in Android Studio 1.3

Svchost high CPU from Microsoft.BingWeather app errors

Graphics Context misaligned on first paint

Python connect to firebird docker database

Is this docker-for-mac password dialog legit?

How to save models trained locally in Amazon SageMaker?