我有一个大数据文件,看起来像:
//
ID 1.1.1.258
DE 6-hydroxyhexanoate dehydrogenase.
CA 6-hydroxyhexanoate + NAD(+) = 6-oxohexanoate + NADH.
CC -!- Involved in the cyclohexanol degradation pathway in Acinetobacter
CC NCIB 9871.
//
ID 1.1.1.259
DE 3-hydroxypimeloyl-CoA dehydrogenase.
CA 3-hydroxypimeloyl-CoA + NAD(+) = 3-oxopimeloyl-CoA + NADH.
CC -!- Involved in the anaerobic pathway of benzoate degradation in
CC bacteria.
//
ID 1.1.1.260
DE Sulcatone reductase.
CA Sulcatol + NAD(+) = sulcatone + NADH.
CC -!- Studies on the effects of growth-stage and nutrient supply on the
CC stereochemistry of sulcatone reduction in Clostridia pasteurianum,
CC C.tyrobutyricum and Lactobacillus brevis suggest that there may be at
CC least two sulcatone reductases with different stereospecificities.
//
我想提取此文件中包含工作的部分anaerobic
。我特别想要ID行。
是否可以在ID和//之间搜索文件,以查找anaerobic
输出并将其打印到新文件?如果整个部分都被打印出来,如我所料,我可以将其重新打印出来。
预期应该是
ID 1.1.1.259
要么
ID 1.1.1.259
DE 3-hydroxypimeloyl-CoA dehydrogenase.
CA 3-hydroxypimeloyl-CoA + NAD(+) = 3-oxopimeloyl-CoA + NADH.
CC -!- Involved in the anaerobic pathway of benzoate degradation in
CC bacteria.
//
用awk很简单
awk '/anaerobic/' RS='//\n' ORS='\n//' ./file.txt
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句