Get an item and the item that follows it, based on the first item's attribute

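Username

I have an event log file generated by a third-party tool that I cannot change. The log file is one huge JSON array in which the odd-numbered elements (1st, 3rd, ...) contain metadata and the even-numbered elements contain the body message associated with that metadata. I would like to split the file based on the metadata, so that the information is collected by topic into separate files.

I am working on this on Windows, and I am attempting it with a batch file and jq.

Basically, the array looks like this: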

[
  { "type": "abc123"},
  {"name":"first component of type abc123"},
   { "type": "abc123"},
  {"name":"second component of type abc123"},
  { "type": "def124"},
  {"name":"first component of type def124"},
  { "type": "xyz999"},
  {"name":"first component of type xyz999"},
  { "type": "abc123"},
  {"name":"third component of type abc123"},
  { "type": "def124"},
  {"name":"second component of type def124"},
  { "type": "abc123"},
  {"name":"fifth component of type abc123"},
  { "type": "abc123"},
  {"name":"sixth component of type abc123"},
  { "type": "def124"},
  {"name":"third component of type def124"},
  { "type": "def124"},
  {"name":"fourth component of type def124"},
  { "type": "abc123"},
  {"name":"seventh component of type abc123"},
  { "type": "xyz999"},
  {"name":"second component of type xyz999"}
  ...
]

I know there are only 3 types, so what I want to achieve is to create one file for each of them. Something like:

First file

{
  "componentLog": {
       "type": "abc123",
       "information": [
          "first component of type abc123",
          "second component of type abc123",
          "third component of type abc123",
          ...
       ]
     }
}

Second file

{
  "componentLog": {
       "type": "def124",
       "information": [
          "first component of type def124",
          "second component of type def124",
          "third component of type def124",
          ...
       ]
     }
}

Third file

{
  "componentLog": {
       "type": "xyz999",
       "information": [
          "first component of type xyz999",
          "second component of type xyz999",
          "third component of type xyz999",
          ...
       ]
     }
}

I know I can isolate the metadata with this:

jq.exe ".[] | select(.type==\"product\")" file.json

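I then tried doing arithmetic on the index, but index only returns the index of the first item matched by the select statement... so I don't know how to solve this...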

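peak

The following bash script is a bit messy because it assumes that none of the files (input or output) fit in memory.

If you do not have access to bash, sed and awk in your computing environment, you might wish to consider installing WSL, MinGW or some such, or you could adapt the script appropriately, e.g. using gawk for Windows or Ruby for Windows.

The other main assumption, which is not already built into the original question, is that it is acceptable to delete any log-type*.tmp files and to overwrite log-TYPE.json for the various "type" values.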
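If you are not sure that assumption holds in your working directory, it can be checked before anything is run. The following is only a minimal sketch: `list_clobber_candidates` is a hypothetical helper name, not part of the script, and the two glob patterns are my reading of which files are deleted or overwritten.

```bash
# Hypothetical helper (not part of the original script): print every file in
# directory $1 (default: current directory) that the script would delete
# (log-type*.tmp) or could overwrite (log-*.json).
list_clobber_candidates() {
    dir=${1:-.}
    for f in "$dir"/log-type*.tmp "$dir"/log-*.json; do
        # An unmatched glob stays as the literal pattern, so test for existence
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage: an empty result means nothing in the current directory is at risk
list_clobber_candidates .
```

If it prints anything you want to keep, move those files away or run the script from an empty directory.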

Be sure to set input to the appropriate input file name.

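# The input file name:
input=file.json

# Remove intermediate files left over from a previous run
# (-f: do not complain if there are none)
rm -f log-type*.tmp

# Use jq's streaming parser to emit, one per line and in array order,
# the .type or .name value of each element (as per the jq FAQ)
jq -cn --stream '
   fromstream(1|truncate_stream(inputs))
   | if .type then .type else .name end'  "$input" |
 awk '
      # Odd lines hold a type: strip the surrounding quotes and remember it
      NR%2 {fn=$1; sub("^\"","",fn); sub("\"$","", fn); next;} 
      # Even lines hold a name: append it to the file for the current type
      # (the parentheses keep the redirection target portable across awks)
      { print > ("log-type." fn ".tmp") }
'

# Wrap each per-type list of names in the requested JSON structure
for f in log-type.*.tmp ; do
    echo "formatting $f ..."
    # Recover the type from the file name
    g=$(sed -e 's/^log-type\.//' -e 's/\.tmp$//' <<< "$f")
    echo g="$g"
    awk -v type="\"$g\"" '
      BEGIN { print "{\"componentLog\": { \"type\": " type " ,";
      print "\"information\": ["; }
      NR==1 { print; next }
      {print ",", $0} 
      END {print "]}}"; }' "$f" > "log-$g.json"
done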
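The script above goes through jq's streaming parser and temporary files only because nothing is assumed to fit in memory. If the input does fit in memory, jq on its own can do the pairing. The following is a sketch of that alternative, not the method described above: `split_by_type` is a made-up name, and it assumes that the array strictly alternates between a "type" object and a "name" object and that every type value is safe to use in a file name.

```bash
# Sketch of an in-memory alternative; split_by_type is a hypothetical name.
# $1: input JSON file. Writes one log-TYPE.json per type into the current directory.
split_by_type() {
    input=$1
    # First pass: list the distinct types found at the even indices 0, 2, 4, ...
    jq -r '. as $a | [range(0; length; 2) | $a[.].type] | unique[]' "$input" |
    while IFS= read -r t; do
        # One pass per type: collect the name that follows each matching type
        jq --arg t "$t" '
            . as $a
            | { componentLog: {
                  type: $t,
                  information: [ range(0; length; 2)
                                 | select($a[.].type == $t)
                                 | $a[. + 1].name ] } }' "$input" > "log-$t.json"
    done
}

# Usage: split_by_type file.json
```

This reads the whole input once per type plus once more, which should be fine for three types. The two jq programs do not depend on bash themselves, so they could probably be saved to files and run with jq -f from a batch file, though I have not tried that on Windows.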

Edited on 2021-01-21
