I am working with the following data frame:
Year Month Day X Y Color
2018 January 1 4.5 6 Red
2018 January 4 3.2 8.1 Red
2018 January 11 1.1 2.3 Blue
2018 February 7 5.4 2.2 Blue
2018 February 15 1.5 4.4 Red
2019 January 3 8.6 2.3 Red
2019 January 22 1.1 2.5 Blue
2019 January 23 5.5 7.8 Red
2019 February 5 6.9 1.1 Red
2019 February 10 1.8 1.3 Red
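I am looking for a new column that gives, for each Year and Month, the number of observations where X is greater than Y and Color is "Red".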
Year Month Day X Y Color XGreaterThanYCount
2018 January 1 4.5 6 Red 0
2018 January 4 3.2 8.1 Red 0
2018 January 11 1.1 2.3 Blue 0
2018 February 7 5.4 2.2 Blue 0
2018 February 15 1.5 4.4 Red 0
2019 January 3 8.6 2.3 Red 1
2019 January 22 1.1 2.5 Blue 1
2019 January 23 5.5 7.8 Red 1
2019 February 5 6.9 1.1 Red 2
2019 February 10 1.8 1.3 Red 2
I posted a similar question a short while ago; I am reposting because I had to adjust the question slightly.
We can build a logical expression by group (X > Y and (&) Color == "Red") and take the sum of that logical expression.
library(dplyr)
df1 %>%
  group_by(Year, Month) %>%
  mutate(XGreaterThanYCount = sum(X > Y & Color == 'Red')) %>%
  ungroup()
-output
# A tibble: 10 x 7
# Year Month Day X Y Color XGreaterThanYCount
# <int> <chr> <int> <dbl> <dbl> <chr> <int>
# 1 2018 January 1 4.5 6 Red 0
# 2 2018 January 4 3.2 8.1 Red 0
# 3 2018 January 11 1.1 2.3 Blue 0
# 4 2018 February 7 5.4 2.2 Blue 0
# 5 2018 February 15 1.5 4.4 Red 0
# 6 2019 January 3 8.6 2.3 Red 1
# 7 2019 January 22 1.1 2.5 Blue 1
# 8 2019 January 23 5.5 7.8 Red 1
# 9 2019 February 5 6.9 1.1 Red 2
#10 2019 February 10 1.8 1.3 Red 2
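The count works because `sum()` treats each TRUE in a logical vector as 1 and each FALSE as 0. Here is a minimal base R sketch of that step on its own, using the three 2019 January rows from the question; the vector names `x`, `y`, `color` and `cond` are only illustrative. The `NA` part is an assumption about data not shown in the question: if X, Y or Color could be missing, the sum may come back as `NA` unless `na.rm = TRUE` is passed.

```r
# Values copied from the 2019 January rows of the question's data
x     <- c(8.6, 1.1, 5.5)
y     <- c(2.3, 2.5, 7.8)
color <- c("Red", "Blue", "Red")

# One TRUE/FALSE per row: is X greater than Y AND is the colour red?
cond <- x > y & color == "Red"
print(cond)

# Summing a logical vector counts its TRUE values
n_hits <- sum(cond)
print(n_hits)

# Hypothetical case with a missing value: the plain sum is NA,
# while na.rm = TRUE counts only the known TRUE values
with_na   <- c(TRUE, NA, FALSE)
plain_sum <- sum(with_na)
safe_sum  <- sum(with_na, na.rm = TRUE)
print(plain_sum)
print(safe_sum)
```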
Or in base R with ave
df1$XGreaterThanYCount <- with(df1, ave(X > Y & Color == "Red",
Year, Month, FUN = sum))
-data
df1 <- structure(list(Year = c(2018L, 2018L, 2018L, 2018L, 2018L, 2019L,
2019L, 2019L, 2019L, 2019L), Month = c("January", "January",
"January", "February", "February", "January", "January", "January",
"February", "February"), Day = c(1L, 4L, 11L, 7L, 15L, 3L, 22L,
23L, 5L, 10L), X = c(4.5, 3.2, 1.1, 5.4, 1.5, 8.6, 1.1, 5.5,
6.9, 1.8), Y = c(6, 8.1, 2.3, 2.2, 4.4, 2.3, 2.5, 7.8, 1.1, 1.3
), Color = c("Red", "Red", "Blue", "Blue", "Red", "Red", "Blue",
"Red", "Red", "Red")), class = "data.frame", row.names = c(NA,
-10L))
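To cross-check the `ave` result, it can help to look at one row per group rather than the repeated per-row counts. The following is a rough sketch using base R's `aggregate`; the helper column `hit` and the object `counts` are names introduced here for illustration and are not part of the original answer. The data frame is rebuilt compactly with the same values as `df1` so the block runs on its own.

```r
# Same values as df1 above, rebuilt compactly so this block is standalone
df1 <- data.frame(
  Year  = rep(c(2018L, 2019L), each = 5),
  Month = rep(c("January", "February", "January", "February"),
              times = c(3, 2, 3, 2)),
  Day   = c(1L, 4L, 11L, 7L, 15L, 3L, 22L, 23L, 5L, 10L),
  X     = c(4.5, 3.2, 1.1, 5.4, 1.5, 8.6, 1.1, 5.5, 6.9, 1.8),
  Y     = c(6, 8.1, 2.3, 2.2, 4.4, 2.3, 2.5, 7.8, 1.1, 1.3),
  Color = c("Red", "Red", "Blue", "Blue", "Red",
            "Red", "Blue", "Red", "Red", "Red")
)

# Per-row count, as in the ave() approach above
df1$XGreaterThanYCount <- with(df1, ave(X > Y & Color == "Red",
                                        Year, Month, FUN = sum))

# Illustrative helper column: TRUE where the row satisfies both conditions
df1$hit <- with(df1, X > Y & Color == "Red")

# One row per Year/Month group with the number of matching rows
counts <- aggregate(hit ~ Year + Month, data = df1, FUN = sum)
print(counts)
```

Each group's value in `counts` should equal the value repeated on that group's rows in `XGreaterThanYCount`.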