Group by and conditional summarise in R

hy-jp

My code is dirty. If the count is less than 2, then name = unpopular.

library(dplyr)  # provides %>%, group_by() and summarise()

df <- data.frame(vote = c("A","A","A","B","B","B","B","B","B","C","D"),
                 val = rep(1, 11))

df %>% group_by(vote) %>% summarise(val=sum(val))
Output:

  vote    val
  <fct> <dbl>
1 A         3
2 B         6
3 C         1
4 D         1

But I need

  vote        val
  <fct>     <dbl>
1 A             3
2 B             6
3 unpopular     2

My idea is

df2 <- df %>% group_by(vote) %>% summarise(val=sum(val))
df2$vote[df2$val < 2] <- "unpop"
df2 %>% group_by....

This is not cool.

Do you know of any useful, cool function for this?

akrun

We can group twice: first sum val per vote, then regroup with the small totals relabelled

library(dplyr)
df %>% 
    group_by(vote) %>% 
    summarise(val=sum(val)) %>%
    group_by(vote = replace(vote, val <2, 'unpop')) %>% 
    summarise(val = sum(val))

-output

# A tibble: 3 x 2
# vote    val
#  <chr> <dbl>
#1 A         3
#2 B         6
#3 unpop     2
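
As a rough sketch, the double grouping above can be wrapped in a small function so the threshold and label are not hard-coded. The names lump_small_groups, min_total and label are hypothetical, made up for illustration. The as.character() step is an assumption about the input: if vote is a factor (as the question's printed <fct> column suggests), replace() cannot introduce a new level and would give NA instead.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): sum `val` per `vote`,
# then merge every group whose total is below `min_total` into `label`.
lump_small_groups <- function(data, min_total = 2, label = "unpop") {
  data %>%
    mutate(vote = as.character(vote)) %>%  # guard: a factor cannot take a new label
    group_by(vote) %>%
    summarise(val = sum(val), .groups = "drop") %>%
    group_by(vote = replace(vote, val < min_total, label)) %>%
    summarise(val = sum(val), .groups = "drop")
}

df <- data.frame(vote = c("A","A","A","B","B","B","B","B","B","C","D"),
                 val = rep(1, 11))

res <- lump_small_groups(df)
print(res)
```

Calling lump_small_groups(df, min_total = 4) would, under the same assumptions, also fold A into the merged group.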

Or another option with rowsum

df %>% 
   group_by(vote = replace(vote, vote %in% 
     names(which((rowsum(val, vote) < 2)[,1])), 'unpopular')) %>% 
   summarise(val = sum(val))

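Or use fct_lump_n from forcats, which keeps the 2 most frequent levels and lumps the rest together (the same result for this data)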

library(forcats)
df %>% 
  group_by(vote = fct_lump_n(vote, 2, other_level = "unpop")) %>%
  summarise(val = sum(val))
# A tibble: 3 x 2
#  vote    val
#  <fct> <dbl>
#1 A         3
#2 B         6
#3 unpop     2
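
Note that fct_lump_n selects by rank, not by a cutoff. If the rule really is "total below 2", a threshold-based sketch could use fct_lump_min from the same forcats family instead. Passing w = val is an assumption here: it makes the cutoff apply to the summed val per level instead of the row count (the two coincide in this data because every val is 1).

```r
library(dplyr)
library(forcats)

df <- data.frame(vote = c("A","A","A","B","B","B","B","B","B","C","D"),
                 val = rep(1, 11))

# Lump every level whose weighted total (sum of val) is below 2 into "unpop".
res_min <- df %>%
  group_by(vote = fct_lump_min(vote, min = 2, w = val, other_level = "unpop")) %>%
  summarise(val = sum(val))

print(res_min)
```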

Or using table

df %>%
   group_by(vote = replace(vote, 
      vote %in% names(which(table(vote) < 2)), 'unpop'))  %>%
   summarise(val = sum(val))
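
For comparison, the same relabel-then-sum idea can be sketched without any packages. This is only an illustration, assuming vote is stored as a character column (hence stringsAsFactors = FALSE); the intermediate names totals and grp are made up for the example.

```r
df <- data.frame(vote = c("A","A","A","B","B","B","B","B","B","C","D"),
                 val = rep(1, 11),
                 stringsAsFactors = FALSE)

# Per-vote totals as a named vector: A = 3, B = 6, C = 1, D = 1
totals <- tapply(df$val, df$vote, sum)

# Look up each row's group total by name and relabel the small groups
grp <- ifelse(as.vector(totals[df$vote]) < 2, "unpop", df$vote)

# Sum val again over the relabelled groups
res_base <- aggregate(val ~ vote,
                      data = data.frame(vote = grp, val = df$val),
                      FUN = sum)
print(res_base)
```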
