Given the following two dataframes:
df1:
id city district year price
0 1 bjs cyq 2018 12
1 2 bjs cyq 2019 6
2 3 sh hp 2018 4
3 4 shs hpq 2019 3
df2:
id city district year
0 1 bj cy 2018
1 2 bj cy 2019
2 4 sh hp 2019
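Say some of the city and district values in df1 contain errors, so I need to update the city and district values in df1 with those from df2, matched on id. My expected result looks like this: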
id city district year price
0 1 bj cy 2018 12
1 2 bj cy 2019 6
2 3 sh hp 2018 4
3 4 sh hp 2019 3
How can I do this in pandas? Thanks.
Update:
Solution 1:
cities = df2.set_index('id')['city']
district = df2.set_index('id')['district']
df1['city'] = df1['id'].map(cities)
df1['district'] = df1['id'].map(district)
Solution 2:
df1[["city","district"]] = pd.merge(df1,df2,on=["id"],how="left")[["city_y","district_y"]]
print(df1)
Output:
id city district year price
0 1 bj cy 2018 12
1 2 bj cy 2019 6
2 3 NaN NaN 2018 4
3 4 sh hp 2019 3
Note that city and district for id 3 are NaN, because df2 has no row with that id, but I want to keep the values from df1 there.
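One possible way to keep the df1 values for ids that df2 does not contain is to fall back on the original column with fillna after the map step from Solution 1. This is only a sketch: it rebuilds the two frames from the tables above, assumes id is unique in df2, and the name lookup is introduced here for illustration.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "city": ["bjs", "bjs", "sh", "shs"],
    "district": ["cyq", "cyq", "hp", "hpq"],
    "year": [2018, 2019, 2018, 2019],
    "price": [12, 6, 4, 3],
})
df2 = pd.DataFrame({
    "id": [1, 2, 4],
    "city": ["bj", "bj", "sh"],
    "district": ["cy", "cy", "hp"],
    "year": [2018, 2019, 2019],
})

# Hypothetical helper name: df2 indexed by id (assumes id is unique in df2).
lookup = df2.set_index("id")

for col in ["city", "district"]:
    # map() yields NaN for ids absent from df2 (id 3 here);
    # fillna() then restores the original df1 value for those rows.
    df1[col] = df1["id"].map(lookup[col]).fillna(df1[col])

print(df1)
```

With this fallback, the row with id 3 keeps sh and hp from df1, while ids 1, 2 and 4 take the corrected values from df2.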
Try combine_first:
df2.set_index('id').combine_first(df1.set_index('id')).reset_index()
Output:
id city district price year
0 1 bj cy 12.0 2018.0
1 2 bj cy 6.0 2019.0
2 3 sh hp 4.0 2018.0
3 4 sh hp 3.0 2019.0
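In the output above the columns come back in alphabetical order and the numeric columns are shown as floats. If the original layout of df1 matters, one option is to reorder the columns and cast back to the dtypes of df1 afterwards. The following is a sketch under the assumption that no NaN remains in the integer columns after combining; the names fixed and restored are introduced here for illustration.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "city": ["bjs", "bjs", "sh", "shs"],
    "district": ["cyq", "cyq", "hp", "hpq"],
    "year": [2018, 2019, 2018, 2019],
    "price": [12, 6, 4, 3],
})
df2 = pd.DataFrame({
    "id": [1, 2, 4],
    "city": ["bj", "bj", "sh"],
    "district": ["cy", "cy", "hp"],
    "year": [2018, 2019, 2019],
})

# Values from df2 take priority; gaps are filled from df1.
fixed = df2.set_index("id").combine_first(df1.set_index("id")).reset_index()

# Restore the column order of df1, then cast each column back to its
# original dtype (safe here only because no NaN is left in year or price).
restored = fixed[df1.columns].astype(df1.dtypes.to_dict())

print(restored)
```

The cast would raise an error if an integer column still contained NaN, so this step fits only when every id in df2 also appears in df1 or when the affected columns are allowed to stay as floats.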