Let's say I have a file with several million lines, organized like this:
@1:N:0:ABC
XYZ
@1:N:0:ABC
ABC
I am trying to write a one-line grep/sed/awk matching function that returns both lines if the string after the last : in the first line (ABC in this sample) is found in the second line.
When I try to use grep -A1 -P and pipe the matches with a pattern like '(?<=:)[A-Z]{3}', I get stuck. I think my creativity is failing me here.
With awk
$ awk -F: 'NF==1 && $0 ~ s{print p ORS $0} {s=$NF; p=$0}' ip.txt
@1:N:0:ABC
ABC
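-F:  use : as the delimiter, which makes it easy to get the last column
s=$NF; p=$0  save the last column value and the entire line for printing later
NF==1  true if the line doesn't contain :
$0 ~ s  true if the line contains the last column value saved previously
If the saved value can contain regex metacharacters, use index($0,s) instead to search literally.
Note that this assumes input as shown in the question: a line containing : followed by a line which doesn't have :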
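To illustrate the difference between the regex match and the literal index() search, here is a small sketch. The sample data (a tag of A.C) and the names sample, regex_out and literal_out are made up for this illustration and are not from the question.

```shell
# Hypothetical sample: the tag after the last ':' contains '.', a regex metacharacter
sample() { printf '%s\n' '@1:N:0:A.C' 'ABC' '@1:N:0:A.C' 'XA.CY'; }

# Regex match ($0 ~ s): 'A.C' as a regex also matches 'ABC', so both pairs are printed
regex_out=$(sample | awk -F: 'NF==1 && $0 ~ s {print p ORS $0} {s=$NF; p=$0}')

# Literal match (index): only the pair whose second line contains 'A.C' verbatim
literal_out=$(sample | awk -F: 'NF==1 && index($0, s) {print p ORS $0} {s=$NF; p=$0}')

printf 'regex:\n%s\n\nliteral:\n%s\n' "$regex_out" "$literal_out"
```

With the regex version, the ABC line is reported as a match for the A.C tag; with index() only the XA.CY line is.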
With GNU sed (might work with other versions too, though the syntax might differ)
$ sed -nE '/:/{N; /.*:(.*)\n.*\1/p}' ip.txt
@1:N:0:ABC
ABC
/:/  if the line contains :
N  add the next line to the pattern space
/.*:(.*)\n.*\1/  capture the string after the last : and check if it is present in the next line
Again, this assumes input like that shown in the question. It won't work for cases like:
@1:N:0:ABC
@1:N:0:XYZ
XYZ
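For comparison, the awk one-liner from earlier in this answer appears to cope with this input, because s and p are overwritten on every line and so only the header directly above a colon-free line is compared with it. A minimal sketch, with edge and edge_out as hypothetical names used only here:

```shell
# Hypothetical edge-case input: two header lines in a row before a data line
edge() { printf '%s\n' '@1:N:0:ABC' '@1:N:0:XYZ' 'XYZ'; }

# s and p always hold the tag and text of the previous line, so the
# colon-free line XYZ is checked against the second header only
edge_out=$(edge | awk -F: 'NF==1 && $0 ~ s {print p ORS $0} {s=$NF; p=$0}')

printf '%s\n' "$edge_out"
```

This prints the second header and the XYZ line. Whether that pairing is the desired behaviour depends on what the real data looks like.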