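I have an event log file generated by a third-party tool that I cannot change. The log file is one huge JSON array in which the odd-positioned elements contain metadata and the even-positioned elements contain the body message associated with that metadata. I want to split the file according to the metadata, so that the information is collected into separate files by topic.
I am doing this on Windows, and I am trying to do it with a batch file and jq.
Basically, the array looks like this: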
[
{ "type": "abc123"},
{"name":"first component of type abc123"},
{ "type": "abc123"},
{"name":"second component of type abc123"},
{ "type": "def124"},
{"name":"first component of type def124"},
{ "type": "xyz999"},
{"name":"first component of type xyz999"},
{ "type": "abc123"},
{"name":"third component of type abc123"},
{ "type": "def124"},
{"name":"second component of type def124"},
{ "type": "abc123"},
{"name":"fifth component of type abc123"},
{ "type": "abc123"},
{"name":"sixth component of type abc123"},
{ "type": "def124"},
{"name":"third component of type def124"},
{ "type": "def124"},
{"name":"fourth component of type def124"},
{ "type": "abc123"},
{"name":"seventh component of type abc123"},
{ "type": "xyz999"},
{"name":"second component of type xyz999"}
...
]
I know that I only have 3 types, so what I want to achieve is to create one file for each of them. Something like:
First file:
{
"componentLog": {
"type": "abc123",
"information": [
"first component of type abc123",
"second component of type abc123",
"third component of type abc123",
...
]
}
}
Second file:
{
"componentLog": {
"type": "def124",
"information": [
"first component of type def124",
"second component of type def124",
"third component of type def124",
...
]
}
}
Third file:
{
"componentLog": {
"type": "xyz999",
"information": [
"first component of type xyz999",
"second component of type xyz999",
"third component of type xyz999",
...
]
}
}
I know I can separate out the metadata with this:
jq.exe ".[] | select(.type==\"product\")" file.json
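Then I tried to do some arithmetic on index, but index only returns the index of the first item matching the select statement... so I don't know how to solve this.
The following bash script is a bit messy because it assumes that none of the files (input or output) fits in memory.
If you do not have access to bash, sed and awk in your computing environment, you may want to consider installing WSL, MinGW or the like, or you could adapt the script accordingly, for example by using gawk for Windows or Ruby for Windows.
Another major assumption, not stated in the original question, is that it is acceptable to delete the log-type*.tmp files and to overwrite log-TYPE.json for the various "type" values.
Be sure to set input to the appropriate input file name.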
# The input file name:
input=file.json
# Remove leftovers from a previous run; -f keeps rm quiet if there are none
rm -f log-type*.tmp
# Use jq to produce a stream of .type and .name values
# as per the jq FAQ
jq -cn --stream '
fromstream(1|truncate_stream(inputs))
| if .type then .type else .name end' "$input" |
awk '
NR%2 {fn=$1; sub("^\"","",fn); sub("\"$","", fn); next;}
{ print > ("log-type." fn ".tmp") }
'
for f in log-type.*.tmp ; do
echo "formatting $f ..."
g=$(sed -e 's/^log-type\.//' -e 's/\.tmp$//' <<< "$f")
echo g="$g"
awk -v type="\"$g\"" '
BEGIN { print "{\"componentLog\": { \"type\": " type " ,";
print "\"information\": ["; }
NR==1 { print; next }
{print ",", $0}
END {print "]}}"; }' "$f" > "log-$g.json"
done
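If the input does fit in memory, the same result can probably be reached with jq alone plus a small shell loop. What follows is a minimal sketch under stated assumptions, not a drop-in replacement for the script above: the function name split_by_type is made up for illustration, every metadata object is assumed to sit at an even zero-based index with its message directly after it, and log-TYPE.json is overwritten just as in the script above.

```shell
# Hypothetical helper: split_by_type is an illustrative name, not part of jq.
# Assumes the whole array fits in memory and that elements alternate between
# metadata ({"type": ...}) and message ({"name": ...}).
split_by_type() {
  jq -c '
    # Pair the object at each even index with the one that follows it.
    [ range(0; length; 2) as $i
      | { type: .[$i].type, name: .[$i + 1].name } ]
    # One group per distinct type, then reshape into the target layout.
    | group_by(.type)[]
    | { componentLog: { type: .[0].type, information: map(.name) } }
  ' "$1" |
  while IFS= read -r doc; do
    # Read the type back out of each document to build the file name.
    kind=$(printf '%s\n' "$doc" | jq -r '.componentLog.type')
    printf '%s\n' "$doc" > "log-$kind.json"
  done
}

# Usage: split_by_type file.json
```

Each output file holds one compact JSON document; running jq . log-abc123.json pretty-prints it. Pairing by position with range avoids the problem with index, because the position is generated directly instead of being looked up after select.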