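I want to convert nested JSON into a Pandas DataFrame. I tried, but the resulting data does not come out in the right shape.
I fetch the JSON and store it in a dictionary called innings, then try to convert that dictionary to a Pandas DataFrame.
The conversion does not produce the format I expect.
My JSON looks like this: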
{
'1164223': [
{
'ball_limit': '300',
'balls': '300',
'batted': '1',
'batting_team_id': '2591',
'bowling_team_id': '1832',
'bpo': '6',
'byes': '1',
'event': '0',
'event_name': None,
'extras': '11',
'innings_number': '1',
'innings_numth': '1st',
'lead': '308',
'legbyes': '4',
'live_current': '0',
'live_current_name': None,
'minutes': None,
'noballs': '0',
'old_penalty_or_bonus': '0',
'over_limit': '50.0',
'over_limit_run_rate': '6.16',
'over_split_limit': '0.0',
'overs': '50.0',
'overs_docked': '0',
'penalties': '0',
'penalties_field_end': '0',
'penalties_field_start': '0',
'run_rate': '6.16',
'runs': '308',
'target': '0',
'wickets': '6',
'wides': '6'
},
{
'ball_limit': '300',
'balls': '294',
'batted': '1',
'batting_team_id': '1832',
'bowling_team_id': '2591',
'bpo': '6',
'byes': '0',
'event': '0',
'event_name': None,
'extras': '10',
'innings_number': '2',
'innings_numth': '1st',
'lead': '3',
'legbyes': '1',
'live_current': '1',
'live_current_name': 'current innings',
'minutes': None,
'noballs': '1',
'old_penalty_or_bonus': '0',
'over_limit': '50.0',
'over_limit_run_rate': '6.22',
'over_split_limit': '0.0',
'overs': '49.0',
'overs_docked': '0',
'penalties': '0',
'penalties_field_end': '0',
'penalties_field_start': '0',
'run_rate': '6.34',
'runs': '311',
'target': '309',
'wickets': '6',
'wides': '8'
}
],
'1165045': [
{
'ball_limit': '300',
'balls': '271',
'batted': '1',
'batting_team_id': '1003',
'bowling_team_id': '2989',
'bpo': '6',
'byes': '0',
'event': '1',
'event_name': 'all out',
'extras': '10',
'innings_number': '1',
'innings_numth': '1st',
'lead': '169',
'legbyes': '4',
'live_current': '0',
'live_current_name': None,
'minutes': None,
'noballs': '1',
'old_penalty_or_bonus': '0',
'over_limit': '50.0',
'over_limit_run_rate': '3.38',
'over_split_limit': '0.0',
'overs': '45.1',
'overs_docked': '0',
'penalties': '0',
'penalties_field_end': '0',
'penalties_field_start': '0',
'run_rate': '3.74',
'runs': '169',
'target': '0',
'wickets': '10',
'wides': '5'
},
{
'ball_limit': '300',
'balls': '239',
'batted': '1',
'batting_team_id': '2989',
'bowling_team_id': '1003',
'bpo': '6',
'byes': '0',
'event': '3',
'event_name': 'target reached',
'extras': '12',
'innings_number': '2',
'innings_numth': '1st',
'lead': '1',
'legbyes': '6',
'live_current': '1',
'live_current_name': 'current innings',
'minutes': None,
'noballs': '0',
'old_penalty_or_bonus': '0',
'over_limit': '50.0',
'over_limit_run_rate': '3.40',
'over_split_limit': '0.0',
'overs': '39.5',
'overs_docked': '0',
'penalties': '0',
'penalties_field_end': '0',
'penalties_field_start': '0',
'run_rate': '4.26',
'runs': '170',
'target': '170',
'wickets': '3',
'wides': '6'
}
]
}
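Use concat with a dictionary comprehension, where j is the dictionary shown above. The outer keys become the first level of a MultiIndex: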
df = pd.concat({k: pd.DataFrame(v) for k, v in j.items()})
print (df)
ball_limit balls batted batting_team_id bowling_team_id bpo byes \
1164223 0 300 300 1 2591 1832 6 1
1 300 294 1 1832 2591 6 0
1165045 0 300 271 1 1003 2989 6 0
1 300 239 1 2989 1003 6 0
event event_name extras ... overs overs_docked penalties \
1164223 0 0 None 11 ... 50.0 0 0
1 0 None 10 ... 49.0 0 0
1165045 0 1 all out 10 ... 45.1 0 0
1 3 target reached 12 ... 39.5 0 0
penalties_field_end penalties_field_start run_rate runs target \
1164223 0 0 0 6.16 308 0
1 0 0 6.34 311 309
1165045 0 0 0 3.74 169 0
1 0 0 4.26 170 170
wickets wides
1164223 0 6 6
1 6 8
1165045 0 10 5
1 3 6
[4 rows x 32 columns]
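If the MultiIndex produced by concat is not what you want, the outer key can be moved into an ordinary column. The following is a minimal sketch on a trimmed-down stand-in for j (only three of the fields are kept). The level names match_id and row are hypothetical labels chosen for illustration and are not part of the original data.

```python
import pandas as pd

# Trimmed-down stand-in for the dictionary `j` from the question
j = {
    '1164223': [
        {'innings_number': '1', 'runs': '308', 'wickets': '6'},
        {'innings_number': '2', 'runs': '311', 'wickets': '6'},
    ],
    '1165045': [
        {'innings_number': '1', 'runs': '169', 'wickets': '10'},
        {'innings_number': '2', 'runs': '170', 'wickets': '3'},
    ],
}

# One DataFrame per outer key; the keys become the outer index level
df = pd.concat({k: pd.DataFrame(v) for k, v in j.items()})

# Name both index levels ('match_id' and 'row' are illustrative names),
# move the outer level into a column, and discard the inner level
flat = (
    df.rename_axis(['match_id', 'row'])
      .reset_index(level='match_id')
      .reset_index(drop=True)
)
print(flat)
```

This keeps the grouping information from the outer keys while giving a flat frame with a plain integer index.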
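Another solution is to loop over the dictionary in a list comprehension, add the outer dictionary's key to each inner dictionary, and finally pass the resulting list of dictionaries to the DataFrame constructor: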
df = pd.DataFrame([dict(x, **{'_id':k}) for k, v in j.items() for x in v])
print (df)
_id ball_limit balls batted batting_team_id bowling_team_id bpo byes \
0 1164223 300 300 1 2591 1832 6 1
1 1164223 300 294 1 1832 2591 6 0
2 1165045 300 271 1 1003 2989 6 0
3 1165045 300 239 1 2989 1003 6 0
event event_name ... overs overs_docked penalties \
0 0 None ... 50.0 0 0
1 0 None ... 49.0 0 0
2 1 all out ... 45.1 0 0
3 3 target reached ... 39.5 0 0
penalties_field_end penalties_field_start run_rate runs target wickets wides
0 0 0 6.16 308 0 6 6
1 0 0 6.34 311 309 6 8
2 0 0 3.74 169 0 10 5
3 0 0 4.26 170 170 3 6
[4 rows x 33 columns]
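Note that every value in the source data is a string, so columns such as runs come out as object dtype. If numeric columns are needed, a conversion step along these lines may help. This is a sketch on a trimmed-down stand-in for j, and the list num_cols is an assumption about which fields should be numeric. The position of the _id column can also differ between pandas versions.

```python
import pandas as pd

# Trimmed-down stand-in for the dictionary `j` from the question
j = {
    '1164223': [
        {'runs': '308', 'wickets': '6', 'run_rate': '6.16', 'event_name': None},
        {'runs': '311', 'wickets': '6', 'run_rate': '6.34', 'event_name': None},
    ],
    '1165045': [
        {'runs': '169', 'wickets': '10', 'run_rate': '3.74', 'event_name': 'all out'},
        {'runs': '170', 'wickets': '3', 'run_rate': '4.26', 'event_name': 'target reached'},
    ],
}

# Same flattening as above: copy each inner dict and add the outer key as '_id'
df = pd.DataFrame([dict(x, **{'_id': k}) for k, v in j.items() for x in v])

# Columns assumed to hold numbers (an illustrative choice, adjust as needed)
num_cols = ['runs', 'wickets', 'run_rate']

# Convert column by column; values that cannot be parsed become NaN
df[num_cols] = df[num_cols].apply(pd.to_numeric, errors='coerce')
print(df.dtypes)
```

Text columns such as event_name are left untouched, so None values there are preserved.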