我的代码很脏。如果条件小于2,则名称=不受欢迎。
df <- data.frame(vote=c("A","A","A","B","B","B","B","B","B","C","D"),
val=c(rep(1,11))
)
df %>% group_by(vote) %>% summarise(val=sum(val))
out
vote val
<fct> <dbl>
1 A 3
2 B 6
3 C 1
4 D 1
但是我需要
vote val
<fct> <dbl>
1 A 3
2 B 6
3 unpopular 2
我的想法是
df2 <- df %>% group_by(vote) %>% summarise(val=sum(val))
df2$vote[df2$val < 2] <- "unpop"
df2 %>% group_by....
这不酷。
您知道任何有用的酷功能吗?
我们可以进行双重分组
library(dplyr)
df %>%
group_by(vote) %>%
summarise(val=sum(val)) %>%
group_by(vote = replace(vote, val <2, 'unpop')) %>%
summarise(val = sum(val))
-输出
# A tibble: 3 x 2
# vote val
# <chr> <dbl>
#1 A 3
#2 B 6
#3 unpop 2
或其他选择 rowsum
df %>%
group_by(vote = replace(vote, vote %in%
names(which((rowsum(val, vote) < 2)[,1])), 'unpopular')) %>%
summarise(val = sum(val))
或使用fct_lump_n
从forcats
library(forcats)
df %>%
group_by(vote = fct_lump_n(vote, 2, other_level = "unpop")) %>%
summarise(val = sum(val))
# A tibble: 3 x 2
# vote val
# <fct> <dbl>
#1 A 3
#2 B 6
#3 unpop 2
或使用 table
df %>%
group_by(vote = replace(vote,
vote %in% names(which(table(vote) < 2)), 'unpop')) %>%
summarise(val = sum(val))
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句