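I'm currently taking a class where we submit our code to an autograder, which then returns the results. The format it returns is hard to read at a glance, so I'd like to write a script that I can use in a pipeline to make it easier to read.
This is the autograder's output: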
Problem,Correct?,Correct Answer,Agent's Answer
"Challenge Problem B-04",0,4,-1
"Basic Problem B-12",0,1,-1
"Challenge Problem B-05",0,6,-1
"Challenge Problem B-07",0,6,-1
"Challenge Problem B-06",0,3,-1
"Basic Problem B-11",0,1,-1
"Basic Problem B-10",0,3,-1
"Challenge Problem B-03",0,3,-1
"Challenge Problem B-02",0,1,-1
"Challenge Problem B-01",0,6,-1
"Challenge Problem B-09",0,4,-1
"Challenge Problem B-08",0,4,-1
"Basic Problem B-08",0,6,-1
"Basic Problem B-09",0,5,-1
"Basic Problem B-04",0,3,-1
"Basic Problem B-05",0,4,-1
"Basic Problem B-06",0,5,-1
"Basic Problem B-07",0,6,-1
"Basic Problem B-01",0,2,-1
"Basic Problem B-02",0,5,-1
"Basic Problem B-03",0,1,-1
"Challenge Problem B-10",0,4,-1
"Challenge Problem B-11",0,5,-1
"Challenge Problem B-12",0,1,-1
{
"Basic Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Basic Problems B"
},
"Challenge Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Challenge Problems B"
}
}
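It is a mix of comma-separated values and JSON. It would be nice to have all of it in one tidy table that I can read.
Currently, I have something like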
python submit.py --provider gt --assignment error-check | column -t -s, | less -S
which outputs:
{
"Basic Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Basic Problems B"
},
"Challenge Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Challenge Problems B"
}
}
Problem                   Correct?  Correct Answer  Agent's Answer
"Challenge Problem B-04"  0         4               -1
"Basic Problem B-12"      0         1               -1
"Challenge Problem B-05"  0         6               -1
"Challenge Problem B-07"  0         6               -1
"Challenge Problem B-06"  0         3               -1
"Basic Problem B-11"      0         1               -1
"Basic Problem B-10"      0         3               -1
"Challenge Problem B-03"  0         3               -1
"Challenge Problem B-02"  0         1               -1
"Challenge Problem B-01"  0         6               -1
"Challenge Problem B-09"  0         4               -1
"Challenge Problem B-08"  0         4               -1
"Basic Problem B-08"      0         6               -1
"Basic Problem B-09"      0         5               -1
"Basic Problem B-04"      0         3               -1
"Basic Problem B-05"      0         4               -1
"Basic Problem B-06"      0         5               -1
"Basic Problem B-07"      0         6               -1
"Basic Problem B-01"      0         2               -1
"Basic Problem B-02"      0         5               -1
"Basic Problem B-03"      0         1               -1
"Challenge Problem B-10"  0         4               -1
"Challenge Problem B-11"  0         5               -1
"Challenge Problem B-12"  0         1               -1
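This gets me most of the way there. Now I'm wondering whether there is a way to handle the JSON as well?
I can't rely on splitting the output at a fixed line number, but I think I could split it at the first occurrence of {.
I'd like to keep this as minimal as possible so that I can share it with my classmates. So the fewer dependencies, the better.
I have looked at other posts on parsing JSON, and they suggest using external tools.
The ideal output would look like this: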
Problem                   Correct?  Correct Answer  Agent's Answer
"Challenge Problem B-04"  0         4               -1
"Basic Problem B-12"      0         1               -1
"Challenge Problem B-05"  0         6               -1
"Challenge Problem B-07"  0         6               -1
"Challenge Problem B-06"  0         3               -1
"Basic Problem B-11"      0         1               -1
"Basic Problem B-10"      0         3               -1
"Challenge Problem B-03"  0         3               -1
"Challenge Problem B-02"  0         1               -1
"Challenge Problem B-01"  0         6               -1
"Challenge Problem B-09"  0         4               -1
"Challenge Problem B-08"  0         4               -1
"Basic Problem B-08"      0         6               -1
"Basic Problem B-09"      0         5               -1
"Basic Problem B-04"      0         3               -1
"Basic Problem B-05"      0         4               -1
"Basic Problem B-06"      0         5               -1
"Basic Problem B-07"      0         6               -1
"Basic Problem B-01"      0         2               -1
"Basic Problem B-02"      0         5               -1
"Basic Problem B-03"      0         1               -1
"Challenge Problem B-10"  0         4               -1
"Challenge Problem B-11"  0         5               -1
"Challenge Problem B-12"  0         1               -1
Set                           Incorrect Skipped   Correct
Basic Problems B              0         12        0
Challenge Problems B          0         12        0
Separating the JSON from the rest is very easy. This will give you only the non-JSON part:
python submit.py --provider gt --assignment error-check | sed '/{/,$d'
And this, only the JSON:
python submit.py --provider gt --assignment error-check | sed -n '/{/,$p'
To illustrate, I saved your example input as file:
$ sed '/{/,$d' file
Problem,Correct?,Correct Answer,Agent's Answer
"Challenge Problem B-04",0,4,-1
"Basic Problem B-12",0,1,-1
"Challenge Problem B-05",0,6,-1
"Challenge Problem B-07",0,6,-1
"Challenge Problem B-06",0,3,-1
"Basic Problem B-11",0,1,-1
"Basic Problem B-10",0,3,-1
"Challenge Problem B-03",0,3,-1
"Challenge Problem B-02",0,1,-1
"Challenge Problem B-01",0,6,-1
"Challenge Problem B-09",0,4,-1
"Challenge Problem B-08",0,4,-1
"Basic Problem B-08",0,6,-1
"Basic Problem B-09",0,5,-1
"Basic Problem B-04",0,3,-1
"Basic Problem B-05",0,4,-1
"Basic Problem B-06",0,5,-1
"Basic Problem B-07",0,6,-1
"Basic Problem B-01",0,2,-1
"Basic Problem B-02",0,5,-1
"Basic Problem B-03",0,1,-1
"Challenge Problem B-10",0,4,-1
"Challenge Problem B-11",0,5,-1
"Challenge Problem B-12",0,1,-1
and
$ sed -n '/{/,$p' file
{
"Basic Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Basic Problems B"
},
"Challenge Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Challenge Problems B"
}
}
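One caveat about the pattern: /{/ matches a brace anywhere on a line, so the range would start too early if a CSV row ever contained one. Whether the autograder can emit such a row is not known here; the sample below is hypothetical and only illustrates the difference. A minimal sketch, assuming the JSON always begins on a line that starts with {, anchors the pattern with ^:

```shell
#!/bin/sh
# Hypothetical sample: the first data row has a brace inside the problem name.
sample='Problem,Correct?,Correct Answer
"Basic Problem {B-01}",0,2
"Basic Problem B-02",0,5
{
"Basic Problems B": {
"Set": "Basic Problems B"
}
}'

# Unanchored: the range starts at the first line containing "{" anywhere,
# so it begins at the first data row and only the header survives.
loose=$(printf '%s\n' "$sample" | sed '/{/,$d')

# Anchored: the range starts at the first line that begins with "{".
strict=$(printf '%s\n' "$sample" | sed '/^{/,$d')
json=$(printf '%s\n' "$sample" | sed -n '/^{/,$p')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

With the output shown in the question both forms behave the same, so the anchor is only a precaution.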
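Now, you are already handling the non-JSON part nicely, so I won't change that. Ideally, JSON data should be parsed with a proper JSON parser such as jq. Sadly, I don't know jq well enough to do this properly, so the best I could come up with is this rather fragile solution. At least it does what you asked for (replace cat file with the python submit.py --provider gt --assignment error-check command):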
$ cat file | sed -n 's/[,"]//g; s/^ *//; /{/,$p' | tac | awk -F': ' 'BEGIN{printf "%-30s%-10s%-10s%-10s\n", "Set", "Incorrect", "Skipped", "Correct"} NF==2 && !/\{/{if($1=="Set"){set=$2;data[set]["Incorrect"] = 0;data[set]["Skipped"] = 0;data[set]["Correct"] = 0;} data[set][$1]=$2}END{for(set in data){printf "%-30s%-10s%-10s%-10s\n", set,data[set]["Incorrect"],data[set]["Skipped"],data[set]["Correct"]}}'
Set                           Incorrect Skipped   Correct
Challenge Problems B          0         12        0
Basic Problems B              0         12        0
Putting all of this into a shell script gives:
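#!/bin/bash
# Run the grader once and keep its output, so both parts come from the same run.
tmpFile=$(mktemp)
python submit.py --provider gt --assignment error-check > "$tmpFile"

# Part 1: everything before the first line containing "{" is CSV; align it.
sed '/{/,$d' "$tmpFile" | column -t -s,

# Part 2: strip commas, quotes and leading spaces from the JSON, reverse it with
# tac so each "Set" line comes before its counters, then tabulate.
# Note: data[set][key] (arrays of arrays) requires GNU awk (gawk) 4.0 or later.
sed -n 's/[,"]//g; s/^ *//; /{/,$p' "$tmpFile" |
    tac |
    awk -F': ' '
        BEGIN {
            printf "%-30s%-10s%-10s%-10s\n", "Set", "Incorrect", "Skipped", "Correct"
        }
        NF == 2 && !/\{/ {
            if ($1 == "Set") {
                set = $2
                data[set]["Incorrect"] = 0
                data[set]["Skipped"] = 0
                data[set]["Correct"] = 0
            }
            data[set][$1] = $2
        }
        END {
            for (set in data) {
                printf "%-30s%-10s%-10s%-10s\n", set,
                    data[set]["Incorrect"],
                    data[set]["Skipped"],
                    data[set]["Correct"]
            }
        }'
rm "$tmpFile"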
which produces the following output:
$ foo.sh
Problem                   Correct?  Correct Answer  Agent's Answer
"Challenge Problem B-04"  0         4               -1
"Basic Problem B-12"      0         1               -1
"Challenge Problem B-05"  0         6               -1
"Challenge Problem B-07"  0         6               -1
"Challenge Problem B-06"  0         3               -1
"Basic Problem B-11"      0         1               -1
"Basic Problem B-10"      0         3               -1
"Challenge Problem B-03"  0         3               -1
"Challenge Problem B-02"  0         1               -1
"Challenge Problem B-01"  0         6               -1
"Challenge Problem B-09"  0         4               -1
"Challenge Problem B-08"  0         4               -1
"Basic Problem B-08"      0         6               -1
"Basic Problem B-09"      0         5               -1
"Basic Problem B-04"      0         3               -1
"Basic Problem B-05"      0         4               -1
"Basic Problem B-06"      0         5               -1
"Basic Problem B-07"      0         6               -1
"Basic Problem B-01"      0         2               -1
"Basic Problem B-02"      0         5               -1
"Basic Problem B-03"      0         1               -1
"Challenge Problem B-10"  0         4               -1
"Challenge Problem B-11"  0         5               -1
"Challenge Problem B-12"  0         1               -1
Set                           Incorrect Skipped   Correct
Challenge Problems B          0         12        0
Basic Problems B              0         12        0
However, this feels hacky, and I hope someone can come up with a cleaner solution that uses a dedicated JSON parser.
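A portability note on the awk approach: the data[set][key] syntax (arrays of arrays) is a GNU awk extension, and tac comes from GNU coreutils, so classmates on other systems may not have either. The following is a rough sketch, not a replacement for a real JSON parser, that does the same job with POSIX awk only. It assumes the layout shown in the question: one key per line, values that never contain ": ", and each set closed by a line that is just } once commas are stripped. The inline json sample is a stand-in for the JSON tail of the grader output.

```shell
#!/bin/sh
# Stand-in for the JSON tail of the autograder output (copied from the question).
json='{
"Basic Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Basic Problems B"
},
"Challenge Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Challenge Problems B"
}
}'

summary=$(printf '%s\n' "$json" |
    sed 's/[,"]//g; s/^ *//' |
    awk -F': ' '
        BEGIN {
            printf "%-30s%-10s%-10s%-10s\n", "Set", "Incorrect", "Skipped", "Correct"
        }
        # A "key: value" line inside a set (skip the "Name: {" opener).
        NF == 2 && $2 != "{" { val[$1] = $2 }
        # A closing brace after a "Set" line was seen: print that set as one row.
        $0 == "}" && ("Set" in val) {
            printf "%-30s%-10s%-10s%-10s\n",
                val["Set"], val["Incorrect"], val["Skipped"], val["Correct"]
            delete val["Set"]
        }
    ')

printf '%s\n' "$summary"
```

Because each row is printed when its closing brace is reached, the sets come out in the order they appear in the JSON rather than in awk's arbitrary for-in order.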
Steeldriver was kind enough to provide a proper jq solution in the comments, so if we incorporate it, things become much simpler (and safer):
#!/bin/bash
tmpFile=$(mktemp)
python submit.py --provider gt --assignment error-check > "$tmpFile";
sed '/{/,$d' "$tmpFile" | column -t -s,
sed -n '/{/,$p' "$tmpFile" |
jq -r '["Set","Incorrect","Skipped","Correct"], (.[] | [.Set,.Incorrect,.Skipped,.Correct]) | @tsv'
rm "$tmpFile"
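One optional refinement: the temporary file is only removed if the script reaches its last line. A minimal sketch of the same temp-file pattern with an EXIT trap is shown below; it should clean up on any exit path, for example if error handling such as set -e is added later. Here run_grader is a hypothetical stand-in for the python submit.py command, and its output is shortened sample data.

```shell
#!/bin/sh
# run_grader is a hypothetical stand-in for
# `python submit.py --provider gt --assignment error-check`.
run_grader() {
    printf '%s\n' \
        'Problem,Correct?' \
        '"Basic Problem B-01",0' \
        '{' \
        '"Basic Problems B": {' \
        '"Set": "Basic Problems B"' \
        '}' \
        '}'
}

tmpFile=$(mktemp) || exit 1
# Remove the temporary file when the shell exits, normally or via `exit`.
trap 'rm -f "$tmpFile"' EXIT

# Run the grader once; both parts below read the same saved output.
run_grader > "$tmpFile"

csv_part=$(sed '/{/,$d' "$tmpFile")
json_part=$(sed -n '/{/,$p' "$tmpFile")

printf '%s\n' "$csv_part"
printf '%s\n' "$json_part"
```

In the real script, csv_part would go to column -t -s, and json_part to jq as above.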