I have a first DataFrame:
df1:
A        B      C  D
Car             0
Bike            0
Train           0
Plane           0
Other_1  Plane  2
Other_2  Plane  3
Other_3  Plane  4
and a second one:
df2:
A B
Car 4 %
Bike 5 %
Train 6 %
Plane 7 %
I would like to combine them into this:
df1:
A        B      C  D
Car             0  4 %
Bike            0  5 %
Train           0  6 %
Plane           0  7 %
Other_1  Plane  2  2
Other_2  Plane  3  3
Other_3  Plane  4  4
What is the best way to do this?
If df (df1 in the question) and df2 share the same index, you can use:
df['D'] = df2['B'].combine_first(df['C'])
Output:
         A      B  C    D
0      Car    NaN  0  4 %
1     Bike    NaN  0  5 %
2    Train    NaN  0  6 %
3    Plane    NaN  0  7 %
4  Other_1  Plane  2    2
5  Other_2  Plane  3    3
6  Other_3  Plane  4    4
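As a self-contained sketch of the aligned-index case, the frames below are rebuilt from the question. The empty cells are assumed to be NaN, and both frames are assumed to carry the default 0..n-1 index.

```python
import numpy as np
import pandas as pd

# Reconstruction of the question's data; empty cells are assumed to be NaN.
df = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane", "Other_1", "Other_2", "Other_3"],
    "B": [np.nan] * 4 + ["Plane"] * 3,
    "C": [0, 0, 0, 0, 2, 3, 4],
    "D": np.nan,
})
df2 = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane"],
    "B": ["4 %", "5 %", "6 %", "7 %"],
})

# df2["B"] only covers index 0..3. combine_first works on the union of both
# indexes and takes the value from df["C"] wherever df2["B"] is missing.
df["D"] = df2["B"].combine_first(df["C"])
print(df)
```

This relies purely on index alignment: column A is never compared, so it only gives the right answer when row i of df2 really describes row i of df.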
If the indexes do not line up, you can merge on column A instead:
df_out = df.merge(df2, on='A', how='left', suffixes=('', 'y'))
df_out.assign(D=df_out.By.fillna(df_out.C)).drop('By', axis=1)
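Or use the improved one-liner from @piRSquared (on its own it leaves D as NaN for rows of df with no match in df2, so fill those from C afterwards if needed):
df.drop(columns='D').merge(df2.rename(columns={'B': 'D'}), how='left', on='A')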
Output (with the unmatched rows filled from C):
         A      B  C    D
0      Car    NaN  0  4 %
1     Bike    NaN  0  5 %
2    Train    NaN  0  6 %
3    Plane    NaN  0  7 %
4  Other_1  Plane  2    2
5  Other_2  Plane  3    3
6  Other_3  Plane  4    4
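For the case where the indexes do not match, here is a runnable sketch of the merge approach. The data is rebuilt from the question (empty cells assumed to be NaN), and df2 is deliberately given in a different row order, which is an assumption made only to show that the match happens on column A rather than on the index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane", "Other_1", "Other_2", "Other_3"],
    "B": [np.nan] * 4 + ["Plane"] * 3,
    "C": [0, 0, 0, 0, 2, 3, 4],
    "D": np.nan,
})
# Same content as df2 in the question, but in a different row order.
df2 = pd.DataFrame({
    "A": ["Plane", "Train", "Bike", "Car"],
    "B": ["7 %", "6 %", "5 %", "4 %"],
})

# Drop the empty D, bring in df2's B under the name D via a left merge on A.
out = df.drop(columns="D").merge(
    df2.rename(columns={"B": "D"}), on="A", how="left"
)
# Rows without a match in df2 are NaN in D at this point; fall back to C.
out["D"] = out["D"].fillna(out["C"])
print(out)
```

A left merge keeps every row of df in its original order, so the result has the same shape as df with only D changed.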