Remove line from file

PDM

I have data with 4 columns and tens of thousands of rows. I am trying to remove every row whose 2nd column value is less than the previous line's 3rd column value.

chr1        10        20        3 
chr1        15        30        9 
chr1        55        60        1

Sparhawk

awk '$2 >= prev; {prev=$3}' file.txt

Explanation

  • awk <commands> file.txt: run awk on file.txt.
  • $2 >= prev: check whether the second field $2 is greater than or equal to the contents of the variable prev. (prev is unset for the first line, so it compares as 0 there.) If the test is true, awk defaults to printing the entire line; if it is false, nothing is printed, which effectively deletes the line.
  • {prev=$3}: store the contents of the third field $3 in the variable prev.
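
As a quick check of how these pieces fit together, here is a minimal sketch that runs the same command on the three sample rows from the question. The scratch file name sample.txt is only an assumption for illustration; substitute your own file.

```shell
# Recreate the three sample rows in a scratch file
# (sample.txt is a hypothetical name, not from the question).
tmpdir=$(mktemp -d)
printf 'chr1 10 20 3\nchr1 15 30 9\nchr1 55 60 1\n' > "$tmpdir/sample.txt"

# Print a row only if its 2nd field is >= the 3rd field of the previous row;
# prev is updated on every row, printed or not.
result=$(awk '$2 >= prev; {prev=$3}' "$tmpdir/sample.txt")
printf '%s\n' "$result"
# Prints:
# chr1 10 20 3
# chr1 55 60 1

rm -r "$tmpdir"
```

The middle row is dropped because its second field (15) is less than the third field of the row before it (20).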

This repeats for every line: awk compares the second field with prev, which by then holds the third field of the line before (whether or not that line was printed). A couple of things to note:

  • I'm not sure what you wanted for the first line, so I'd just manually include/exclude it as you see fit.
  • If the data are actually tab delimited, just add the following flag to awk to let it know: -F'\t'.
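
Both notes can be combined in one command. The following is only a sketch under two assumptions that are not in the question: the input is tab delimited, and the first row should be dropped. The file name sample.tsv is hypothetical.

```shell
tmpdir=$(mktemp -d)
# Tab-delimited copy of the sample rows (sample.tsv is a hypothetical name).
printf 'chr1\t10\t20\t3\nchr1\t15\t30\t9\nchr1\t55\t60\t1\n' > "$tmpdir/sample.tsv"

# -F'\t' makes awk split fields on tabs only.
# NR > 1 skips the first row, which has no previous line to compare against.
without_first=$(awk -F'\t' 'NR > 1 && $2 >= prev; {prev=$3}' "$tmpdir/sample.tsv")
printf '%s\n' "$without_first"

rm -r "$tmpdir"
```

To keep the first row unconditionally instead, the pattern could be written as NR == 1 || $2 >= prev.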
