Coding,Value,Meaning,54-1.0,54-2.0,431-2.0,212-0.0,212-1.0
1,1,Yes,0.4,0.3,0.7,0.1,0.6
2,0,Other job (free text entry),0,0.7,0.3,0.7,0.8
2,1,Managers and Senior Officials,0.5,0.2,0.4,0.7,0.7
2,11,Corporate Managers,0.1,0.7,0.4,0.2,0.4
2,111,Corporate Managers And Senior Officials,0,0.8,0.8,0.4,0.8
2,1111,Senior officials in national government,0.9,0.6,0.4,0.2,0.9
2,1111001,AM (National Assembly),0.8,0.3,0.2,0,0.2
2,1111002,Ambassador (Foreign and Commonwealth Office),0.9,0.9,0.7,0.1,0.2
2,1111003,Band 0 (Health and Safety Executive),0.6,0.4,0,0.4,0.8
2,1111004,Band 1B (Meteorological Office),0.6,0.1,0.6,1,0.8
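I have a csv.gz file like the one above. I want to extract columns whose names match certain strings, for example column names matching "54-" and "212-".
I found the solution below, but I would like to know whether it can be modified to extract columns that match any element of a list of strings, for example "Meaning", "54-" and "212-".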
zcat test.csv.gz |awk -F, 'NR==1{for(i=1;i<=NF;i++)if($i~/54-/)f[n++]=i}{for(i=0;i<n;i++)printf"%s%s",i?" ":"",$f[i];print""}'
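I would also like to save the result to a csv.gz file. But when I add > outputfile.csv at the end, the output is not comma-separated. Where in this command should I put OFS=","?
Example output (in a csv.gz file):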
Meaning,54-1.0,54-2.0,212-0.0,212-1.0
Yes,0.4,0.3,0.1,0.6
Other job (free text entry),0,0.7,0.7,0.8
Managers and Senior Officials,0.5,0.2,0.7,0.7
Corporate Managers,0.1,0.7,0.2,0.4
Corporate Managers And Senior Officials,0,0.8,0.4,0.8
Senior officials in national government,0.9,0.6,0.2,0.9
AM (National Assembly),0.8,0.3,0,0.2
Ambassador (Foreign and Commonwealth Office),0.9,0.9,0.1,0.2
Band 0 (Health and Safety Executive),0.6,0.4,0.4,0.8
Band 1B (Meteorological Office),0.6,0.1,1,0.8
Thank you.
Hope this helps.
Change the variable get to suit your needs:
One-liner:
$ awk -v get='^(Meaning|54-|212-)' 'BEGIN{FS=OFS=","}FNR==1{for(i=1;i<=NF;i++)if($i~get)cols[++c]=i}{for(i=1; i<=c; i++)printf "%s%s", $(cols[i]), (i<c ? OFS : ORS)}' file
Meaning,54-1.0,54-2.0,212-0.0,212-1.0
Yes,0.4,0.3,0.1,0.6
Other job (free text entry),0,0.7,0.7,0.8
Managers and Senior Officials,0.5,0.2,0.7,0.7
Corporate Managers,0.1,0.7,0.2,0.4
Corporate Managers And Senior Officials,0,0.8,0.4,0.8
Senior officials in national government,0.9,0.6,0.2,0.9
AM (National Assembly),0.8,0.3,0,0.2
Ambassador (Foreign and Commonwealth Office),0.9,0.9,0.1,0.2
Band 0 (Health and Safety Executive),0.6,0.4,0.4,0.8
Band 1B (Meteorological Office),0.6,0.1,1,0.8
In your case:
$ zcat test.csv.gz | awk -v get='^(Meaning|54-|212-)' 'BEGIN{FS=OFS=","}FNR==1{for(i=1;i<=NF;i++)if($i~get)cols[++c]=i}{for(i=1; i<=c; i++)printf "%s%s", $(cols[i]), (i<c ? OFS : ORS)}'
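To also save the result as a csv.gz file, as the question asks, the awk output can be piped through gzip before redirecting. The following is a minimal sketch, not part of the original answer: the file names sample.csv.gz and sample_out.csv.gz are hypothetical, the two data rows are copied from the input, and gzip -dc is used in place of zcat only because it behaves the same way on more systems.

```shell
# Build a small compressed sample (hypothetical file name) from the question's data.
printf '%s\n' \
  'Coding,Value,Meaning,54-1.0,54-2.0,431-2.0,212-0.0,212-1.0' \
  '1,1,Yes,0.4,0.3,0.7,0.1,0.6' \
  '2,0,Other job (free text entry),0,0.7,0.3,0.7,0.8' | gzip > sample.csv.gz

# Decompress, keep the matching columns, and compress the result again.
gzip -dc sample.csv.gz |
awk -v get='^(Meaning|54-|212-)' '
BEGIN { FS = OFS = "," }
FNR == 1 { for (i = 1; i <= NF; i++) if ($i ~ get) cols[++c] = i }
{ for (i = 1; i <= c; i++) printf "%s%s", $(cols[i]), (i < c ? OFS : ORS) }
' | gzip > sample_out.csv.gz

# Inspect the compressed result.
gzip -dc sample_out.csv.gz
```

Because the separator is set in the BEGIN block and passed to printf explicitly, the saved file stays comma-separated.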
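More readable:
awk -v get='^(Meaning|54-|212-)' '
BEGIN {
    # Read and write comma-separated fields
    FS = OFS = ","
}
FNR == 1 {
    # Header line: remember the positions of the matching columns
    for (i = 1; i <= NF; i++)
        if ($i ~ get) cols[++c] = i
}
{
    # Every line, header included: print only the remembered columns
    for (i = 1; i <= c; i++)
        printf "%s%s", $(cols[i]), (i < c ? OFS : ORS)
}' file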
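As for where OFS belongs in the command from the question: that command writes its separator through printf, which does not consult OFS, so the space comes from the hard-coded " " rather than from any variable. A minimal sketch of the smallest change to that original command, assuming the same header layout, is to replace " " with "," and to list the wanted strings as a regex alternation. The sample rows are copied from the input and the variable name result is only for illustration.

```shell
# Two-line sample in the shape of the question's data.
result=$(
  printf '%s\n' \
    'Coding,Value,Meaning,54-1.0,54-2.0,431-2.0,212-0.0,212-1.0' \
    '1,1,Yes,0.4,0.3,0.7,0.1,0.6' |
  awk -F, 'NR==1{for(i=1;i<=NF;i++)if($i~/Meaning|54-|212-/)f[n++]=i}
           {for(i=0;i<n;i++)printf "%s%s",(i?",":""),$f[i];print ""}'
)

# Show the comma-separated result.
printf '%s\n' "$result"
```

Note that this pattern is unanchored, so it matches the strings anywhere in a column name, whereas the ^(...) form used in the answer only matches at the start of the name.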
Input:
$ cat file
Coding,Value,Meaning,54-1.0,54-2.0,431-2.0,212-0.0,212-1.0
1,1,Yes,0.4,0.3,0.7,0.1,0.6
2,0,Other job (free text entry),0,0.7,0.3,0.7,0.8
2,1,Managers and Senior Officials,0.5,0.2,0.4,0.7,0.7
2,11,Corporate Managers,0.1,0.7,0.4,0.2,0.4
2,111,Corporate Managers And Senior Officials,0,0.8,0.8,0.4,0.8
2,1111,Senior officials in national government,0.9,0.6,0.4,0.2,0.9
2,1111001,AM (National Assembly),0.8,0.3,0.2,0,0.2
2,1111002,Ambassador (Foreign and Commonwealth Office),0.9,0.9,0.7,0.1,0.2
2,1111003,Band 0 (Health and Safety Executive),0.6,0.4,0,0.4,0.8
2,1111004,Band 1B (Meteorological Office),0.6,0.1,0.6,1,0.8
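If the strings to match are held as a list rather than typed into the regex by hand, they can be joined with | before being passed to awk. The sketch below is one way to do that under stated assumptions: pick_cols and sample.csv are hypothetical names, not part of the original answer, and the prefixes are treated as regex fragments, so characters such as . or ( in a prefix keep their regex meaning.

```shell
# Hypothetical helper: pick_cols FILE PREFIX...
# Prints the columns of FILE whose header starts with any PREFIX.
pick_cols() {
  file=$1
  shift
  # Join the remaining arguments with "|" to form a regex alternation.
  pattern=$(IFS='|'; printf '%s' "$*")
  awk -v get="^($pattern)" '
    BEGIN { FS = OFS = "," }
    FNR == 1 { for (i = 1; i <= NF; i++) if ($i ~ get) cols[++c] = i }
    { for (i = 1; i <= c; i++) printf "%s%s", $(cols[i]), (i < c ? OFS : ORS) }
  ' "$file"
}

# Small sample file (hypothetical name) in the shape of the input above.
printf '%s\n' \
  'Coding,Value,Meaning,54-1.0,54-2.0,431-2.0,212-0.0,212-1.0' \
  '1,1,Yes,0.4,0.3,0.7,0.1,0.6' > sample.csv

pick_cols sample.csv Meaning 54- 212-
```

The columns are printed in the order they appear in the file, not in the order the prefixes are given.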