I have a nested JSON like this:
{
"group": {
"groupname": "grp1",
"groupid": 1,
"city": "London"
},
"persons": {
"0": {
"name": "john",
"age": 12,
"gender": "M",
"groupid": 1
},
"1": {
"name": "maat",
"age": 15,
"gender": "M",
"groupid": 1
},
"2": {
"name": "chrissle",
"age": 10,
"gender": "F",
"groupid": 1
},
"3": {
"name": "stacy",
"age": 11,
"gender": "F",
"groupid": 1
},
"4": {
"name": "mark",
"age": 12,
"gender": "M",
"groupid": 1
},
"5": {
"name": "job",
"age": 12,
"gender": "M",
"groupid": 1
}
},
"group": {
"groupname": "grp1",
"groupid": 2,
"city": "NewYork"
},
"persons": {
"0": {
"name": "will",
"age": 12,
"gender": "M",
"groupid": 2
},
"1": {
"name": "phil",
"age": 15,
"gender": "M",
"groupid": 2
},
"2": {
"name": "winnie",
"age": 10,
"gender": "F",
"groupid": 2
}
}
}
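I want to split the two sections, group and persons, into two separate DataFrames.
For the second df, persons, I want to include the section number as a column, like this: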
id name age gender groupid
0 john 12 M 1
1 maat 15 M 1
2 chrissle 10 F 1
I have loaded the JSON as a list of dicts and converted it to a df:
data= pd.DataFrame.from_dict(data)
Then I can get persons:
personsdf= personsdf['persons']
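However, this gives me a df with a single column in which each row is a dict holding one persons section.
I tried to unnest the dict rows as follows: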
finaldf= pd.DataFrame()
for index, row in personsdf.iterrows():
row_data=row['personsdf']
row_data = pd.DataFrame.from_dict(row_data)
row_data = row_data.T
finaldf= finaldf.append(row_data, ignore_index=True)
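But then I get all the columns except the section number, which is lost. Is there a better way to solve this?
If I understand correctly, you want to create two DataFrames: one for the groups and a second one for the persons: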
import pandas as pd

data = [
{
"group": {"groupname": "grp1", "groupid": 1, "city": "London"},
"persons": {
"0": {"name": "john", "age": 12, "gender": "M", "groupid": 1},
"1": {"name": "maat", "age": 15, "gender": "M", "groupid": 1},
"2": {"name": "chrissle", "age": 10, "gender": "F", "groupid": 1},
"3": {"name": "stacy", "age": 11, "gender": "F", "groupid": 1},
"4": {"name": "mark", "age": 12, "gender": "M", "groupid": 1},
"5": {"name": "job", "age": 12, "gender": "M", "groupid": 1},
},
},
{
"group": {"groupname": "grp1", "groupid": 2, "city": "NewYork"},
"persons": {
"0": {"name": "will", "age": 12, "gender": "M", "groupid": 2},
"1": {"name": "phil", "age": 15, "gender": "M", "groupid": 2},
"2": {"name": "winnie", "age": 10, "gender": "F", "groupid": 2},
},
},
]
df1 = pd.DataFrame([d["group"] for d in data])
df2 = pd.DataFrame(
[{"id": k, **v} for d in data for k, v in d["persons"].items()]
)
print(df1)
print(df2)
Prints:
groupname groupid city
0 grp1 1 London
1 grp1 2 NewYork
id name age gender groupid
0 0 john 12 M 1
1 1 maat 15 M 1
2 2 chrissle 10 F 1
3 3 stacy 11 F 1
4 4 mark 12 M 1
5 5 job 12 M 1
6 0 will 12 M 2
7 1 phil 15 M 2
8 2 winnie 10 F 2
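The attempt in the question most likely loses the section number because, after the transpose, the keys "0", "1", ... sit in the index, and ignore_index=True discards that index when appending. DataFrame.append was also removed in pandas 2.0, so pd.concat is the usual replacement. Below is a minimal sketch of that route, assuming the data is a list of dicts shaped like the one above; the sample is shortened, and the names frames and persons_df are made up for illustration.

```python
import pandas as pd

# Shortened sample with the same shape as the data above.
data = [
    {
        "group": {"groupname": "grp1", "groupid": 1, "city": "London"},
        "persons": {
            "0": {"name": "john", "age": 12, "gender": "M", "groupid": 1},
            "1": {"name": "maat", "age": 15, "gender": "M", "groupid": 1},
        },
    },
    {
        "group": {"groupname": "grp1", "groupid": 2, "city": "NewYork"},
        "persons": {
            "0": {"name": "will", "age": 12, "gender": "M", "groupid": 2},
        },
    },
]

# One frame per persons section: the dict keys become the index,
# which is then named "id" and moved into a regular column.
frames = [
    pd.DataFrame.from_dict(d["persons"], orient="index")
    .rename_axis("id")
    .reset_index()
    for d in data
]

# Stack the per-section frames; the ids are already a column,
# so dropping the old index here loses nothing.
persons_df = pd.concat(frames, ignore_index=True)
print(persons_df)
```

The id values stay strings here because they come from JSON object keys; persons_df["id"].astype(int) would convert them if integers are needed.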