print lines that have similar columns with multiple delimiters

cosmictypist

I have two files:

file1.txt

dn_id101_400_CT_TC    string1
dn_id111_60_TT_AA    string2

file2.txt

dn_id101_400_XX_XX    diffstring1
dn_id400_40_XY_YX    diffstring2
dn_id111_60_GG_CC    diffstring3

I want to print each line from file2.txt whose first three _-separated elements match the first three elements of some line in file1.txt. Here is my desired output:

dn_id101_400_XX_XX    diffstring1
dn_id111_60_GG_CC    diffstring3

Is there a way to do this? Maybe by changing the delimiter in awk? I'm not sure how to handle multiple delimiters in an awk command. Here's an example of what I'd like to use:

awk -F"\t" 'FNR==NR {a[$1]; next}; $1 in a' file1.txt file2.txt

dawg

You can do:

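$ awk -F"\t" '
            {s=$1; sub(/_[[:upper:]]+_[[:upper:]]+$/, "", s)}  # key: $1 minus its last two uppercase parts
    FNR==NR { arr[s]++}                                        # first file: record each key
    FNR<NR && (s in arr)' f1 f2                                # second file: print lines whose key was recorded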
dn_id101_400_XX_XX  diffstring1
dn_id111_60_GG_CC   diffstring3

That assumes that /_[[:upper:]]+_[[:upper:]]+$/ correctly describes the part you need to remove to make the data keys overlap between the two files.
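To check that assumption against your own data, it can help to print the key that the sub() call leaves behind. The following is only a sketch: it feeds the two sample lines from file1.txt through the same substitution, and assumes the columns are tab-separated, as the -F"\t" in the question implies. The variable name keys is just for illustration.

```shell
# Feed the sample file1.txt lines (tab-separated) through the same sub() call
# and print only the resulting key, so the regex assumption can be inspected.
keys=$(printf 'dn_id101_400_CT_TC\tstring1\ndn_id111_60_TT_AA\tstring2\n' |
  awk -F'\t' '{ s = $1; sub(/_[[:upper:]]+_[[:upper:]]+$/, "", s); print s }')
printf '%s\n' "$keys"
```

For the sample data this prints dn_id101_400 and dn_id111_60. A first column whose trailing parts are not purely uppercase letters would come out unchanged, which is a sign the regex needs adjusting.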

If you want to go left to right (irrespective of the number of _ after the first three) use split instead:

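$ awk -F"\t" '
            { split($1, a, /_/); s=a[1]"_"a[2]"_"a[3]}  # key: first three _-separated parts of $1
    FNR==NR { arr[s]++}                                 # first file: record each key
    FNR<NR && (s in arr)' f1 f2                         # second file: print lines whose key was recorded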
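On the question's idea of changing the delimiter: awk treats a multi-character -F value as a regular expression, so one possible variant (not part of the answer above) is to split on either _ or a tab and rebuild the key from the first three fields. The sketch below recreates the sample files in a temporary directory so that it runs on its own. The tmp directory and the names key and seen are illustrative, and the data is assumed to be tab-separated.

```shell
# Recreate the sample data from the question (tab-separated columns assumed).
tmp=$(mktemp -d)
printf 'dn_id101_400_CT_TC\tstring1\ndn_id111_60_TT_AA\tstring2\n' > "$tmp/file1.txt"
printf 'dn_id101_400_XX_XX\tdiffstring1\ndn_id400_40_XY_YX\tdiffstring2\ndn_id111_60_GG_CC\tdiffstring3\n' > "$tmp/file2.txt"

# FS is a regex here: fields are split on either "_" or a tab.
result=$(awk -F'[_\t]' '
    { key = $1 "_" $2 "_" $3 }     # rebuild the three-part key
    FNR == NR { seen[key]; next }  # first file: remember its keys
    key in seen                    # second file: print lines whose key was seen
' "$tmp/file1.txt" "$tmp/file2.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Because $0 is left untouched, the matching lines from file2.txt are printed exactly as they appear in the file. The trade-off is that the second column is also split if it happens to contain an underscore, which does not matter here since only $1 to $3 are used.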

edited at 2020-10-23
