Linux: combining two different text files

清除彩

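I want to do the following using awk, sed, or other tools.

  1. Compare two files that both contain IDs (File1, File2).
  2. Where the IDs match, bring the corresponding data from File2 into File1.

For example:

First file name: File1.txt
Contents (tab-delimited table format):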

ID      Match     Length
100      OK        1000
200      OK        1000
300      OK        2000
400      OK        2000
500      OK        3000

Second file name: File2.fasta
It contains the following information:

>100
ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG
>200
CTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGA
>300
TGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGAC
>400
GACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACT
>500
ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG

So I want to add one more column, taken from File2.fasta, to File1.txt, giving this final result:

ID      Match     Length     Sequence
100      OK        1000     ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG
200      OK        1000     CTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGA
300      OK        2000     TGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGAC
400      OK        2000     GACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACT
500      OK        3000     ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG

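Does anyone have a good idea for how to merge these two files?

Tyler Peryea

I believe you are looking for join.

First, you need to sort both files and put them in a common format (same delimiter).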

sed 's/$/\t/' File2.fasta | tr -d '\n' | sed 's/>/\n/g' | sort > File2.fasta.sorted
sort File1.txt > File1.txt.sorted
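The sed and tr pipeline above uses `\t` and `\n` in the sed replacement text, which GNU sed accepts but which may not behave the same in other sed implementations. As a rough alternative, the flattening step could be sketched in awk. The function name `fasta_to_tsv` is hypothetical, and the sketch assumes each FASTA header line consists only of `>` followed by the ID. Unlike the pipeline above, it emits no leading empty line and no trailing tab.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer):
# print one "ID<TAB>sequence" line per FASTA record.
# Assumes the header line is just ">" followed by the ID.
fasta_to_tsv() {
    awk '
        /^>/ {
            if (id != "") print id "\t" seq   # flush the previous record
            id = substr($0, 2)                # ">100" -> "100"
            seq = ""
            next
        }
        { seq = seq $0 }                      # join wrapped sequence lines
        END { if (id != "") print id "\t" seq }
    ' "$1"
}

# Small demonstration on throwaway data.
demo_dir=$(mktemp -d)
printf '>200\nCTGA\nCTGA\n>100\nACTGACTG\n' > "$demo_dir/File2.fasta"
fasta_to_tsv "$demo_dir/File2.fasta" | sort > "$demo_dir/File2.fasta.sorted"
cat "$demo_dir/File2.fasta.sorted"
```

The sorted output can then be used in place of File2.fasta.sorted in the join step.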

Then you just need to join them like this:

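TAB=$(printf '\t')
join -a1 -t "$TAB" File1.txt.sorted File2.fasta.sorted

Note that $TAB here holds a literal tab character. It must be in double quotes: inside single quotes the shell does not expand it, and join would receive the text $TAB instead of a tab.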

This will produce something like the following:

100 OK  1000    ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG    
200 OK  1000    CTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGA    
300 OK  2000    TGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGAC    
400 OK  2000    GACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACT    
500 OK  3000    ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG    
ID  Match   Length

This is what you want (except for the column names and their position).
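If the header should stay on the first line and gain a Sequence column, a single awk pass is one possible alternative to sort and join. The following is a minimal sketch, not the method from the answer above: the function name `merge_fasta_column` is hypothetical, File1.txt is assumed to be tab-delimited with the ID in the first column, and each FASTA header is assumed to be just `>` followed by the ID. It also assumes the FASTA file is not empty.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer):
#   merge_fasta_column TABLE FASTA
# Appends the sequence for each ID in FASTA as a new last column of TABLE.
merge_fasta_column() {
    awk -F '\t' -v OFS='\t' '
        # First input (the FASTA file): remember the sequence for each ID.
        NR == FNR {
            if ($0 ~ /^>/) id = substr($0, 2)           # ">100" -> "100"
            else if (id != "") seq[id] = seq[id] $0     # join wrapped lines
            next
        }
        # Second input (the table): the header line gets the new column name.
        FNR == 1 { print $0, "Sequence"; next }
        # Data lines: append the sequence, or an empty field for unknown IDs.
        { print $0, (($1 in seq) ? seq[$1] : "") }
    ' "$2" "$1"
}

# Small demonstration on throwaway data.
demo_dir=$(mktemp -d)
printf 'ID\tMatch\tLength\n100\tOK\t1000\n200\tOK\t1000\n' > "$demo_dir/File1.txt"
printf '>100\nACTGACTG\n>200\nCTGACTGA\n' > "$demo_dir/File2.fasta"
merge_fasta_column "$demo_dir/File1.txt" "$demo_dir/File2.fasta"
```

Because the table is read in its original order, no sorting is needed, and rows whose ID has no FASTA record are kept with an empty last field, much as `join -a1` keeps unpaired lines from the first file.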
