Collapsing rows in R

德英乐

I have a data frame:

df <- data.frame(id1 = c("a" , "b", "b", "c"),
                 id2 = c(NA,"a","a",NA),
                 id3 = c("a", "a", "a", "e"),
                 n1 = c(2,2,2,3),
                 n2 = c(2,1,1,1),
                 n3 = c(0,1,1,3),
                 n4 = c(0,1,1,2))

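I want to collapse the 2nd and 3rd rows into one. After that, I want to aggregate the rows that share the same value in id3 (i.e. a).

My real data frame is very long and contains many different Latin names, so filtering by name (i.e. a in this case) does not make sense. I want to collapse the rows using the condition id3 == id2, but I could not get it to work. Any suggestions?

My desired output looks like this: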

id1 id2 id3 n1 n2 n3 n4
a   NA  a   2  2  0  0
b   a   a   2  1  1  1
c   NA  e   3  1  3  2

# After that, it should be
id1 id3 n1 n2 n3 n4
a    a   4  3  1  1
c    e   3  1  3  2

(I just updated the data frame, sorry for my mistake.)

阿克伦

We can use distinct to keep only the unique rows, which gives the first expected output:

library(dplyr)
df %>%
  distinct
  id1  id2 id3 n1 n2 n3 n4
1   a <NA>   a  2  2  0  0
2   b    a   a  2  1  1  1
3   c <NA>   e  3  1  3  2

For the final output, we can continue from the distinct step above: group by the coalesced 'id2' and 'id1' together with 'id3', and then take the sum of the numeric columns.
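To make the grouping key easier to follow, here is a minimal sketch of what coalesce does with the two id columns from the question. The vector name key is only illustrative and is not part of the original answer.

```r
library(dplyr)

# The id1 and id2 columns from the question's data frame
id1 <- c("a", "b", "b", "c")
id2 <- c(NA, "a", "a", NA)

# coalesce() takes, position by position, the first non-missing value:
# id2 where it is present, otherwise id1
key <- coalesce(id2, id1)
print(key)
```

Rows whose id2 is "a" therefore end up with the same key as the row whose id1 is "a", which is why they are summed together in the pipeline that follows.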

df %>%
    distinct %>%
    group_by(id1 = coalesce(id2, id1), id3) %>% 
    summarise(across(where(is.numeric), sum), .groups = 'drop')

-output

# A tibble: 2 × 6
  id1   id3      n1    n2    n3    n4
  <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 a     a         4     3     1     1
2 c     e         3     1     3     2
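If dplyr is not available, the same steps can probably be sketched in base R as well. This is an alternative written under the assumption that the real data has the same column layout as the example; the names u and res are placeholders introduced here.

```r
df <- data.frame(id1 = c("a", "b", "b", "c"),
                 id2 = c(NA, "a", "a", NA),
                 id3 = c("a", "a", "a", "e"),
                 n1 = c(2, 2, 2, 3),
                 n2 = c(2, 1, 1, 1),
                 n3 = c(0, 1, 1, 3),
                 n4 = c(0, 1, 1, 2),
                 stringsAsFactors = FALSE)

# Step 1: drop exact duplicate rows (base R counterpart of distinct)
u <- unique(df)

# Step 2: use id2 as the key where it is present, otherwise keep id1
u$id1 <- ifelse(is.na(u$id2), u$id1, u$id2)

# Step 3: sum the numeric columns within each id1/id3 group
res <- aggregate(cbind(n1, n2, n3, n4) ~ id1 + id3, data = u, FUN = sum)
print(res)
```

The formula interface of aggregate only looks at the columns named in the formula, so the NA values in id2 do not cause rows to be dropped here.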
