Include the JSON section number as a column in the df when converting JSON to a DataFrame


I have a nested JSON like the following:

{
"group": {
    "groupname": "grp1",
    "groupid": 1,
    "city": "London"
},
"persons": {
    "0": {
        "name": "john",
        "age": 12,
        "gender": "M",
        "groupid": 1
    },
    "1": {
        "name": "maat",
        "age": 15,
        "gender": "M",
        "groupid": 1
    },
    "2": {
        "name": "chrissle",
        "age": 10,
        "gender": "F",
        "groupid": 1
    },
    "3": {
        "name": "stacy",
        "age": 11,
        "gender": "F",
        "groupid": 1
    },
    "4": {
        "name": "mark",
        "age": 12,
        "gender": "M",
        "groupid": 1
    },
    "5": {
        "name": "job",
        "age": 12,
        "gender": "M",
        "groupid": 1
    }
},
"group": {
    "groupname": "grp1",
    "groupid": 2,
    "city": "NewYork"
},
"persons": {
    "0": {
        "name": "will",
        "age": 12,
        "gender": "M",
        "groupid": 2
    },
    "1": {
        "name": "phil",
        "age": 15,
        "gender": "M",
        "groupid": 2
    },
    "2": {
        "name": "winnie",
        "age": 10,
        "gender": "F",
        "groupid": 2
    }
}
}

I want to split the two sections, group and persons, into two separate DataFrames.

For the second df, persons, I want to include the section number (the "0", "1", "2", ... key) as a column, like this:

id    name    age    gender    groupid
0     john     12      M         1
1     maat     15      M         1
2     chrissle 10      F         1

I have loaded the JSON as a list of dicts and converted it to a df:

data= pd.DataFrame.from_dict(data)

Then I can get the persons column:

personsdf = data[['persons']]

However, this gives me a df with a single column in which each row holds the dict of one persons section.

I tried to unnest the dict rows as follows:

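finaldf = pd.DataFrame()
for index, row in personsdf.iterrows():
    row_data = row['persons']  # dict of persons for one group
    row_data = pd.DataFrame.from_dict(row_data)
    row_data = row_data.T  # one row per person, indexed by section number
    # ignore_index=True discards that index (DataFrame.append was removed in pandas 2.0)
    finaldf = finaldf.append(row_data, ignore_index=True)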

But then I get all the columns except the section number, which is lost. Is there a better way to solve this?

Andrej Kesely

If I understand you correctly, you want to create two DataFrames: one for the groups and a second one for the persons:

import pandas as pd

data = [
    {
        "group": {"groupname": "grp1", "groupid": 1, "city": "London"},
        "persons": {
            "0": {"name": "john", "age": 12, "gender": "M", "groupid": 1},
            "1": {"name": "maat", "age": 15, "gender": "M", "groupid": 1},
            "2": {"name": "chrissle", "age": 10, "gender": "F", "groupid": 1},
            "3": {"name": "stacy", "age": 11, "gender": "F", "groupid": 1},
            "4": {"name": "mark", "age": 12, "gender": "M", "groupid": 1},
            "5": {"name": "job", "age": 12, "gender": "M", "groupid": 1},
        },
    },
    {
        "group": {"groupname": "grp1", "groupid": 2, "city": "NewYork"},
        "persons": {
            "0": {"name": "will", "age": 12, "gender": "M", "groupid": 2},
            "1": {"name": "phil", "age": 15, "gender": "M", "groupid": 2},
            "2": {"name": "winnie", "age": 10, "gender": "F", "groupid": 2},
        },
    },
]

# One row per group.
df1 = pd.DataFrame([d["group"] for d in data])
# One row per person; the dict key (section number) becomes the "id" column.
df2 = pd.DataFrame(
    [{"id": k, **v} for d in data for k, v in d["persons"].items()]
)
print(df1)
print(df2)

Prints:

  groupname  groupid     city
0      grp1        1   London
1      grp1        2  NewYork

  id      name  age gender  groupid
0  0      john   12      M        1
1  1      maat   15      M        1
2  2  chrissle   10      F        1
3  3     stacy   11      F        1
4  4      mark   12      M        1
5  5       job   12      M        1
6  0      will   12      M        2
7  1      phil   15      M        2
8  2    winnie   10      F        2
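
Note that the id values above are still strings, because they come from the JSON keys. As a possible alternative, here is a minimal sketch that lets pandas build each persons frame with the keys as its index and then turns that index into an integer id column. It uses a trimmed-down copy of the sample data, and the names frames and persons_df are only illustrative, not part of the original question:

```python
import pandas as pd

# Trimmed-down sample with the same shape as the data above.
data = [
    {
        "group": {"groupname": "grp1", "groupid": 1, "city": "London"},
        "persons": {
            "0": {"name": "john", "age": 12, "gender": "M", "groupid": 1},
            "1": {"name": "maat", "age": 15, "gender": "M", "groupid": 1},
        },
    },
    {
        "group": {"groupname": "grp1", "groupid": 2, "city": "NewYork"},
        "persons": {
            "0": {"name": "will", "age": 12, "gender": "M", "groupid": 2},
        },
    },
]

# orient="index" makes the outer keys ("0", "1", ...) the row index.
frames = [pd.DataFrame.from_dict(d["persons"], orient="index") for d in data]

# Stack the per-group frames, then move the index into a column named "id".
persons_df = pd.concat(frames).rename_axis("id").reset_index()

# The keys are strings in the JSON; convert them if numeric ids are wanted.
persons_df["id"] = persons_df["id"].astype(int)

print(persons_df)
```

Both approaches keep the section number; this one mainly helps if you would rather not build the row dicts by hand, and it stays close to the from_dict call used in the question.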
