How can I extract two data fields (1 scalar and 1 array) for each node from a very large (> 100,000 lines) JSON file?

Nepomuk

I have a JSON file with 139,000 lines whose structure is essentially as follows (it is an extract from OpenStreetMap):

{
  "type": "FeatureCollection",
  "generator": "overpass-ide",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "@id": "relation/7859",
        "TMC:cid_58:tabcd_1:Class": "Area",
        "TMC:cid_58:tabcd_1:LCLversion": "9.00",
        "TMC:cid_58:tabcd_1:LocationCode": "4934",
        "leisure": "park",
        "name": "Platnersberg",
        "type": "multipolygon",
        "@geometry": "center"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          11.128184,
          49.4706035
        ]
      },
      "id": "relation/7859"
    },
    {
      "type": "Feature",
      "properties": {
        "@id": "relation/62370",
        "TMC:cid_58:tabcd_1:Class": "Area",
        "TMC:cid_58:tabcd_1:LCLversion": "8.00",
        "TMC:cid_58:tabcd_1:LocationCode": "1157",
        "admin_level": "6",
        "boundary": "administrative",
        "de:place": "city",
        "name": "Eisenach",
        "type": "boundary",
        "@geometry": "center"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          10.2836229,
          50.9916015
        ]
      },
      "id": "relation/62370"
    }
  ]
}

From this file I now want to get the name, the TMC location code, and the coordinates of each feature, preferably as a CSV file:

location_code,name,latitude,longitude

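I know I could craft a regular expression that kicks out all the superfluous nodes, but it would be a rather complicated one. I also have the jq tool installed on my OpenSuSE Leap 15.1 machine here, but I am a complete novice when it comes to that tool.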

Any ideas on how to do this extraction?

steeldriver

I'm a jq novice myself, but I think

$ jq -r '.features[] | select(.type == "Feature") | [.properties."TMC:cid_58:tabcd_1:LocationCode",.properties.name,.geometry.coordinates[]] | @csv' file.json
"4934","Platnersberg",11.128184,49.4706035
"1157","Eisenach",10.2836229,50.9916015

should do it. The select(.type == "Feature") filter may not be necessary; I don't know whether any other types are possible.
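
One thing to watch: GeoJSON stores coordinates as [longitude, latitude], so the command above prints the longitude first, while the header in the question asks for latitude,longitude. Here is a minimal sketch of a variant that writes the header row and swaps the two coordinates. The file names sample.json and out.csv are made up for the illustration, and the sample is just the two features from the question with the unused properties dropped.

```shell
# sample.json: hypothetical cut-down copy of the question's data
cat > sample.json <<'EOF'
{
  "type": "FeatureCollection",
  "features": [
    { "type": "Feature",
      "properties": { "TMC:cid_58:tabcd_1:LocationCode": "4934", "name": "Platnersberg" },
      "geometry": { "type": "Point", "coordinates": [11.128184, 49.4706035] } },
    { "type": "Feature",
      "properties": { "TMC:cid_58:tabcd_1:LocationCode": "1157", "name": "Eisenach" },
      "geometry": { "type": "Point", "coordinates": [10.2836229, 50.9916015] } }
  ]
}
EOF

# First emit the header array, then one array per feature;
# coordinates[1] is the latitude, coordinates[0] the longitude.
# The comma binds tighter than the pipe, so @csv formats every array.
jq -r '
  ["location_code", "name", "latitude", "longitude"],
  (.features[]
   | [ .properties."TMC:cid_58:tabcd_1:LocationCode",
       .properties.name,
       .geometry.coordinates[1],
       .geometry.coordinates[0] ])
  | @csv
' sample.json > out.csv
```

@csv quotes every string field, including the header names. Features without a name or location code would probably come out with empty fields, so you may want to filter those out first if your extract contains any.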
