Matching values between two data frames in R

Nancy

I have two data frames that I need to compare. The first one is "sum":

> head(sum)
   File_pdb  Res1      Chain1     Res2      Chain2
1:  7LD1_CM  GLN 81      M       ASN 501      C
2:  7LD1_CM  TYR 128     M       PHE 377      C
3:  7LD1_CM  ILE 78      M       SER 375      C
4:  7LD1_CM  ASN 76      M       ALA 372      C
5:  7LD1_CM  THR 20      M       TYR 369      C
6:  7LD1_CM  ARG 408     C       LEU 131      M

The second one is "mut":

> head(mut)
   RefAA  MutAA LineagesCount
1  VAL 3  GLY 3             1
2  LEU 5  PHE 5             2
3  LEU 8  VAL 8             1
4 SER 13 ILE 13             2
5 LEU 18 PHE 18             5
6 THR 20 ILE 20             1

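I need to check whether any value in sum$Res1 or sum$Res2 is equal to a value in mut$RefAA. If it is, I need to append the whole matching row of mut (RefAA, MutAA, LineagesCount) next to sum$Res1 or sum$Res2.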

Here is an example of the desired output:

    File_pdb  Res1      Chain1     Res2      Chain2 RefAA  MutAA  LineagesCount
1:  7LD1_CM  GLN 81      M       ASN 501      C
2:  7LD1_CM  TYR 128     M       PHE 377      C
3:  7LD1_CM  ILE 78      M       SER 375      C
4:  7LD1_CM  ASN 76      M       ALA 372      C
5:  7LD1_CM  THR 20      M       TYR 369      C     THR 20   ILE 20     1
6:  7LD1_CM  ARG 408     C       LEU 131      M

How can I do this? I have been trying the merge and join functions, but I am not very experienced and need more practice. Can someone help me? Thanks!

Roman

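I had to tidy the data slightly (removing the space inside the residue labels) so that it could be imported easily. After that you can try the tidyverse: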

library(tidyverse)
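SUM %>% 
  mutate(index = 1:n()) %>%            # remember which original row each value came from
  pivot_longer(c(Res1, Res2)) %>%      # one row per residue: name = Res1/Res2, value = label
  # attach the MUT row wherever the residue label equals RefAA
  left_join(mutate(MUT, value=RefAA), by = "value") %>%  
  group_by(index) %>% 
  # copy the matched MUT columns onto both rows of the same pair
  fill(MutAA, RefAA, LineagesCount, .direction = "downup") %>% 
  ungroup() %>% 
  # back to one row per pair, with Res1 and Res2 as columns again
  pivot_wider(names_from = name, values_from = value, values_fn = toString) %>% 
  # record which column matched (NA when there was no match)
  mutate(which_Res = ifelse(RefAA == Res1, "Res1", "Res2"))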
# A tibble: 6 x 10
  File_pdb Chain1 Chain2 index RefAA MutAA LineagesCount Res1   Res2   which_Res
  <chr>    <chr>  <chr>  <int> <chr> <chr>         <int> <chr>  <chr>  <chr>    
1 7LD1_CM  M      C          1 NA    NA               NA GLN81  ASN501 NA       
2 7LD1_CM  M      C          2 NA    NA               NA TYR128 PHE377 NA       
3 7LD1_CM  M      C          3 NA    NA               NA ILE78  SER375 NA       
4 7LD1_CM  M      C          4 NA    NA               NA ASN76  ALA372 NA       
5 7LD1_CM  M      C          5 THR20 ILE20             1 THR20  TYR369 Res1     
6 7LD1_CM  C      M          6 NA    NA               NA ARG408 LEU131 NA   

Data

SUM <- read.table(text = "   File_pdb  Res1      Chain1     Res2      Chain2
1:  7LD1_CM  GLN81      M       ASN501      C
2:  7LD1_CM  TYR128     M       PHE377      C
3:  7LD1_CM  ILE78      M       SER375      C
4:  7LD1_CM  ASN76      M       ALA372      C
5:  7LD1_CM  THR20      M       TYR369      C
6:  7LD1_CM  ARG408     C       LEU131      M") 
SUM

MUT <- read.table(text = " RefAA  MutAA LineagesCount
1  VAL3  GLY3             1
2  LEU5  PHE5             2
3  LEU8  VAL8             1
4 SER13 ILE13             2
5 LEU18 PHE18             5
6 THR20 ILE20             1")
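
For comparison, the same lookup can be sketched in base R with match(), without reshaping the data. This is only a minimal sketch under some assumptions: the names sum_df, mut_df and squash are made up for illustration and do not appear in the original post, the data keeps the "GLN 81" spacing from the question (the helper strips it before comparing), and each row is assumed to match at most one mut row, with a Res1 match taking priority over a Res2 match.

```r
# Toy data copied from the question, with the original "GLN 81"-style spacing.
sum_df <- data.frame(
  File_pdb = "7LD1_CM",
  Res1   = c("GLN 81", "TYR 128", "ILE 78", "ASN 76", "THR 20", "ARG 408"),
  Chain1 = c("M", "M", "M", "M", "M", "C"),
  Res2   = c("ASN 501", "PHE 377", "SER 375", "ALA 372", "TYR 369", "LEU 131"),
  Chain2 = c("C", "C", "C", "C", "C", "M"),
  stringsAsFactors = FALSE
)
mut_df <- data.frame(
  RefAA = c("VAL 3", "LEU 5", "LEU 8", "SER 13", "LEU 18", "THR 20"),
  MutAA = c("GLY 3", "PHE 5", "VAL 8", "ILE 13", "PHE 18", "ILE 20"),
  LineagesCount = c(1L, 2L, 1L, 2L, 5L, 1L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop all whitespace so "THR 20" and "THR20" compare equal.
squash <- function(x) gsub("\\s+", "", x)

key  <- squash(mut_df$RefAA)
hit1 <- match(squash(sum_df$Res1), key)  # row of mut_df whose RefAA equals Res1, else NA
hit2 <- match(squash(sum_df$Res2), key)  # row of mut_df whose RefAA equals Res2, else NA

# Assumption: prefer the Res1 match and fall back to the Res2 match.
hit <- ifelse(is.na(hit1), hit2, hit1)

# Indexing with NA yields an all-NA row, so unmatched rows get NA in the mut columns.
result <- cbind(sum_df, mut_df[hit, , drop = FALSE])
rownames(result) <- NULL

# Record which column produced the match (NA when neither matched).
result$which_Res <- ifelse(!is.na(hit1), "Res1",
                           ifelse(!is.na(hit2), "Res2", NA_character_))
print(result)
```

Unlike the pivot-based pipeline, this keeps the columns in their original order and never duplicates rows, but it attaches at most one mut row per pair. If both Res1 and Res2 could match different mutations, the two matches would need separate sets of columns.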
