Python parse nested JSON file and take out specific attributes

John

I have one big JSON file that looks like this:

data = {
    "Module1": {
        "Description": "",
        "Layer": "1",
        "SourceDir": "pathModule1",
        "Attributes": {
            "some",
        },
        "Vendor": "comp",
        "components":{
            "Component1": {
               "path": "something",
               "includes": [
                   "include1",
                   "include2",
                   "include3",
                   "include4",
                   "include5"
               ],
               "generated": "txt",
               "memory": "txt"
               # etc.
            },
            "Component2":{
               "path": "something",
               "includes": [
                   "include1",
                   "include2",
                   "include3",
                   "include4",
                   "include5"
               ],
               "generated": "txt",
               "memory": "txt"
               # etc.
            }
        }
    },
    "Module2": {
        "Description": "",
        "Layer": "2",
        "SourceDir": "pathModule2",
        "Attributes": {
            "some",
        },
        "Vendor": "comp",
        "components":{
            "Component1": {
               "path": "something",
               "includes": [
                   "include1",
                   "include2",
                   "include3",
                   "include4",
                   "include5"
               ],
               "generated": "txt",
               "memory": "txt"
               # etc.
            },
            "Component2":{
               "path": "something",
               "includes": [
                   "include1",
                   "include2",
                   "include3",
                   "include4",
                   "include5"
               ],
               "generated": "txt",
               "memory": "txt"
               # etc.
            }
        }
    },
    "Module3": {
        "Description": "",
        "Layer": "3",
        "SourceDir": "path",
        "Attributes": {
            "some",
        },
        "Vendor": "",
    },
    "Module4": {
        "Description": "",
        "Layer": "4",
        "SourceDir": "path",
        "Attributes": {
            "some",
        }
    }
}

I have to go through it and extract some of the data, as follows:

Whenever the Vendor field is equal to "comp", take that module into consideration: take its SourceDir field, all of its components, and each component's path and includes.

So output would be:

Module1, "pathModule1", components: [Component1, path, includes: [include1, include2, include3, include4, include5]], [Component2, path, includes: [include1, include2, include3, include4, include5]]

Module2, "pathModule2", components: [Component1, path, includes: [include1, include2, include3, include4, include5]], [Component2, path, includes: [include1, include2, include3, include4, include5]]

I am really struggling with accessing all the fields that I need.

My current code is this:

import json

with open("DB.json", "r") as f:
    modules = json.load(f)

list_components = []
sourceDirList = []
list_sw_objects = []

for k in modules.keys():
    try:
        if modules[k]["Vendor"] == "comp":
            list_components.append(k)
            sourceDirList.append(modules[k]["SourceDir"])
            for i in modules[k]["components"]:
                list_sw_objects.append(modules[k]["components"])
    except KeyError:
        continue

I am managing to get only Module1 and its SourceDir, but not Component1, Component2 and their attributes. How can I achieve this?

Thanks!

PirateNinjas

I would start by filtering out the items you're not interested in, by doing something like:

data2 = {k: v for k, v in data.items() if v.get("Vendor") == "comp"}

This drops all the modules you don't want. It's a bit inefficient, because you end up iterating over the dictionary a second time to get the data into the format you want, but it's easier to reason about as a first step.
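As a small sketch of why the filter uses `.get("Vendor")` rather than `v["Vendor"]`: some modules in the question (Module4, for example) have no Vendor key at all. The `sample` dictionary here is a made-up, trimmed-down stand-in for the real data, not the actual file.

```python
# Hypothetical two-module sample; "Module4" has no "Vendor" key, as in the question.
sample = {
    "Module1": {"Vendor": "comp", "SourceDir": "pathModule1"},
    "Module4": {"SourceDir": "path"},
}

# dict.get returns None for a missing key instead of raising KeyError,
# so modules without a "Vendor" field are simply filtered out.
filtered = {k: v for k, v in sample.items() if v.get("Vendor") == "comp"}
print(list(filtered))
```

This is also why no try/except KeyError is needed around the filter.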

At this point you could iterate over the dictionary again if needed - you would have something like:

{'Module1': {'Attributes': {'some'},
             'Description': '',
             'Layer': '1',
             'SourceDir': 'pathModule1',
             'Vendor': 'comp',
             'components': {'Component1': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'},
                            'Component2': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'}}},
 'Module2': {'Attributes': {'some'},
             'Description': '',
             'Layer': '2',
             'SourceDir': 'pathModule2',
             'Vendor': 'comp',
             'components': {'Component1': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'},
                            'Component2': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'}}}}

To print only the source directories and the components, you could do:

for k,v in data2.items():
    print(k, v["SourceDir"], v["components"])

which would give you:

Module1 pathModule1 {'Component1': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}, 'Component2': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}}
Module2 pathModule2 {'Component1': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}, 'Component2': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}}

Edit: To refine the output further, you can change the above loop to be:

for k,v in data2.items():
    components = [(comp_name, comp_data["path"], comp_data["includes"]) for comp_name, comp_data in v["components"].items()]
    print(k, v["SourceDir"], components)

which will give you:

Module1 pathModule1 [('Component1', 'something', ['include1', 'include2', 'include3', 'include4', 'include5']), ('Component2', 'something', ['include1', 'include2', 'include3', 'include4', 'include5'])]
Module2 pathModule2 [('Component1', 'something', ['include1', 'include2', 'include3', 'include4', 'include5']), ('Component2', 'something', ['include1', 'include2', 'include3', 'include4', 'include5'])]
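If you need the result as a data structure rather than printed text, the same idea can be wrapped in a function. This is only a sketch: `extract_vendor_modules` and its `vendor` parameter are names I made up, the `raw` string is a hypothetical cut-down stand-in for the contents of DB.json, and it assumes that a matching module without a "components" key should simply yield an empty list.

```python
import json


def extract_vendor_modules(modules, vendor="comp"):
    """Return {module_name: {"SourceDir": ..., "components": [...]}} for matching modules."""
    result = {}
    for name, module in modules.items():
        # Skip modules whose Vendor is missing or different.
        if module.get("Vendor") != vendor:
            continue
        # One (name, path, includes) tuple per component.
        components = [
            (comp_name, comp.get("path"), comp.get("includes", []))
            for comp_name, comp in module.get("components", {}).items()
        ]
        result[name] = {"SourceDir": module.get("SourceDir"), "components": components}
    return result


# Hypothetical, trimmed-down stand-in for the contents of DB.json.
raw = """
{
  "Module1": {"Vendor": "comp", "SourceDir": "pathModule1",
              "components": {"Component1": {"path": "something",
                                            "includes": ["include1", "include2"]}}},
  "Module3": {"Vendor": "", "SourceDir": "path"}
}
"""
extracted = extract_vendor_modules(json.loads(raw))
print(extracted)
```

With a real file you would pass the result of `json.load(f)` instead of `json.loads(raw)`.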
