I have a data set with 5 columns:
store_id year event item units
123 2015 sale_2 abc 2
234 2015 sale_3 def 1
345 2015 sale_2 xyz 5
I'm trying to rotate out the items
by store_id, year, and event
to get the sum
. For instance
store_id year event abc def xyz
123 2015 sale_2 2 0 0
234 2015 sale_3 0 1 0
345 2015 sale_2 0 0 5
I'm having trouble figuring out the best method. Normally I'd use dummyVars in caret to do this but I need sums instead of flag. I've looked at tapply but it can't handle more than 2 grouping variables.
Any other suggestions?
library(reshape2)
dcast(df, store_id + year + event ~ item, fun.aggregate = sum, value.var='units')
# store_id year event abc def xyz
# 1: 123 2015 sale_2 2 0 0
# 2: 234 2015 sale_3 0 1 0
# 3: 345 2015 sale_2 0 0 5
For large datasets consider
# uses dcast.data.table, much faster
library(data.table)
setDT(df)
dcast(df, store_id + year + event ~ item, fun.aggregate = sum, value.var='units')
Este artigo é coletado da Internet.
Se houver alguma infração, entre em [email protected] Delete.
deixe-me dizer algumas palavras