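I have a dataset with many values submitted by individual IDs, and the values are organized into subsets. For each ID I want to compute one value: (mean of the ID's scores) / (mean of the Subset's scores). I have tried many combinations of group_by(), summarize() and spread(), but I can't get it to work.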
library(dplyr)

df <- data.frame(stringsAsFactors = FALSE,
                 Subset = c("A","B","C","D","A","B","C","D","A","B","C","D"),
                 ID = c(1,2,3,4,5,3,1,5,2,3,4,1),
                 score = c(123,42,564,234,123,345,6678,87,543,121,123,55))

averages <-
  df %>%
  group_by(Subset) %>%
  summarise(mean.subs = mean(score)) %>%
  ungroup() %>%
  group_by(ID) %>%
  summarise(mean.id = mean(score) / mean.subs)
I would appreciate any help.
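I think you want to use mutate instead of summarize. summarize collapses each group to a single row and keeps only the grouping column and the new summary, so ID and score are no longer there for the second step. mutate keeps every row and adds the group mean as a new column: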
library(dplyr)

df %>%
  dplyr::group_by(Subset) %>%
  dplyr::mutate(mean.subs = mean(score)) %>%
  dplyr::ungroup() %>%
  dplyr::group_by(ID) %>%
  dplyr::mutate(mean.id = mean(score)) %>%
  dplyr::rowwise() %>%
  dplyr::mutate(value = mean.id / mean.subs)
   Subset    ID score mean.subs mean.id  value
   <chr>  <dbl> <dbl>     <dbl>   <dbl>  <dbl>
 1 A          1   123       263   2285.   8.69
 2 B          2    42      169.    292.   1.73
 3 C          3   564      2455    343.  0.140
 4 D          4   234      125.    178.   1.42
 5 A          5   123       263     105  0.399
 6 B          3   345      169.    343.   2.03
 7 C          1  6678      2455   2285.  0.931
 8 D          5    87      125.     105  0.838
 9 A          2   543       263    292.   1.11
10 B          3   121      169.    343.   2.03
11 C          4   123      2455    178. 0.0727
12 D          1    55      125.   2285.   18.2
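Also, the last step uses rowwise so that the division is done row by row. Since / is already vectorized in R, the result should be the same without it. You may also want to add one more pipe step with ungroup at the end so that the output is no longer grouped.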
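For what it's worth, here is a minimal sketch of the same pipeline that ends with ungroup(), so the result comes back as a plain, ungrouped tibble. It also leaves out rowwise(), on the assumption that element-wise division is all that is needed here. The name result is only for illustration and is not part of the original answer.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  Subset = c("A","B","C","D","A","B","C","D","A","B","C","D"),
  ID     = c(1,2,3,4,5,3,1,5,2,3,4,1),
  score  = c(123,42,564,234,123,345,6678,87,543,121,123,55),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Subset) %>%
  mutate(mean.subs = mean(score)) %>%   # subset mean, repeated on every row of the subset
  group_by(ID) %>%                      # by default this replaces the Subset grouping
  mutate(mean.id = mean(score)) %>%     # ID mean, repeated on every row of the ID
  ungroup() %>%                         # drop grouping before the final column
  mutate(value = mean.id / mean.subs)   # vectorized, element-wise division

print(result)
```

If you really need exactly one row per ID, you would still have to decide how to combine the rows where an ID appears in more than one Subset. This sketch keeps one row per original observation, as in the output above.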