I have two DataFrames:
import pandas as pd
df_Billed = pd.DataFrame({'Bill_Number': [220119, 220120, 220219, 220219, 220419, 220519, 220619, 221219], 'Date': ['1/31/2019', '2/20/2020', '2/28/2019', '6/30/2019', '6/30/2019', '6/30/2019', '6/30/2019', '12/31/2019'], 'Amount': [3312.5, 832.0, 10000.0, -3312.5, 8725.0, 1862.5, 3637.5, 1587.5]})
df_Received = pd.DataFrame({'Bill_Number': [220119, 220219, 220419, 220519, 220619], 'Date': ['4/16/2019', '5/21/2019', '8/2/2019', '8/2/2019', '8/2/2019'], 'Amount': [3312.5, 6687.5, 8725, 1862.5, 3637.5]})
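I am trying to look up each 'Bill_Number' in df_Billed to see whether it exists in df_Received. Ideally, if it does, I would like to compute the date difference between df_Billed and df_Received for that bill number (to see how many days it took to be paid). If the bill number does not exist in df_Received, I would simply like to return all rows for that bill number in df_Billed.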
EX: Since df_Billed Bill_Number 220119 is in df_Received, it would return 75 (which is the number of days it took for the bill to be paid 4/16/2019 - 1/31/2019).
EX: Since df_Billed Bill_Number 221219 is not in df_Received, it would return 12/31/2019 (which is the date it was billed).
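As a quick check of the first example, the 75-day figure can be reproduced with two timestamps. This is only a sketch: the variable names are illustrative, and the dates are assumed to be month/day/year strings.

```python
import pandas as pd

# Dates taken from the example for Bill_Number 220119 (assumed m/d/Y format).
billed = pd.to_datetime("1/31/2019", format="%m/%d/%Y")
received = pd.to_datetime("4/16/2019", format="%m/%d/%Y")

# Subtracting two Timestamps yields a Timedelta; .days is its whole-day part.
days_to_pay = (received - billed).days
print(days_to_pay)  # 75
```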
You could start with a left merge on Bill_Number:
df_Billed=df_Billed.merge(df_Received,on='Bill_Number',how='left')
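Then use apply and pandas.to_datetime to compute the difference between the dates, keeping the billing date where there is no matching payment: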
df_Billed['result'] = df_Billed.apply(
    lambda x: x.Date_x if pd.isnull(x.Date_y)
    else abs(pd.to_datetime(x.Date_x) - pd.to_datetime(x.Date_y)).days,
    axis=1)
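If the row-wise apply becomes slow on larger data, a vectorized variant may be worth trying. The sketch below is not the answer's method: the suffixes `_billed` and `_received`, the `days_to_pay` column, and the `unpaid` frame are names chosen here for illustration, and the dates are assumed to be month/day/year strings. It keeps the day count in its own numeric column rather than mixing dates and integers in a single column.

```python
import pandas as pd

df_Billed = pd.DataFrame({
    "Bill_Number": [220119, 220120, 220219, 220219, 220419, 220519, 220619, 221219],
    "Date": ["1/31/2019", "2/20/2020", "2/28/2019", "6/30/2019",
             "6/30/2019", "6/30/2019", "6/30/2019", "12/31/2019"],
    "Amount": [3312.5, 832.0, 10000.0, -3312.5, 8725.0, 1862.5, 3637.5, 1587.5],
})
df_Received = pd.DataFrame({
    "Bill_Number": [220119, 220219, 220419, 220519, 220619],
    "Date": ["4/16/2019", "5/21/2019", "8/2/2019", "8/2/2019", "8/2/2019"],
    "Amount": [3312.5, 6687.5, 8725, 1862.5, 3637.5],
})

# Left merge keeps every billed row; unmatched bills get NaN in the *_received columns.
merged = df_Billed.merge(df_Received, on="Bill_Number", how="left",
                         suffixes=("_billed", "_received"))

# Parse whole columns at once; missing received dates become NaT.
billed = pd.to_datetime(merged["Date_billed"], format="%m/%d/%Y")
received = pd.to_datetime(merged["Date_received"], format="%m/%d/%Y")

# Absolute difference in days (mirrors the abs() in the apply version); NaN if unpaid.
merged["days_to_pay"] = (received - billed).abs().dt.days

# Rows of df_Billed whose bill number never appears in df_Received.
unpaid = merged[merged["days_to_pay"].isna()]
print(merged[["Bill_Number", "Date_billed", "days_to_pay"]])
print(unpaid[["Bill_Number", "Date_billed", "Amount_billed"]])
```

Keeping `days_to_pay` numeric means it can be averaged or filtered directly, while the unpaid bills remain available as ordinary rows.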
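Finally, I assume you want a single new column holding the final result, so I drop the merged columns Date_y and Amount_y and rename Date_x and Amount_x back to Date and Amount: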
df_Billed.drop(['Date_y','Amount_y'],axis=1,inplace=True)
df_Billed.rename(columns={"Date_x": "Date","Amount_x":"Amount"},inplace=True)
Final DataFrame:
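To reproduce the final DataFrame, the merge, apply, drop, and rename steps can be run end to end as in the sketch below. It assumes the dates are stored as month/day/year strings, and it uses a separate name, `final`, instead of overwriting df_Billed in place; that name is a choice made here, not part of the answer.

```python
import pandas as pd

df_Billed = pd.DataFrame({
    "Bill_Number": [220119, 220120, 220219, 220219, 220419, 220519, 220619, 221219],
    "Date": ["1/31/2019", "2/20/2020", "2/28/2019", "6/30/2019",
             "6/30/2019", "6/30/2019", "6/30/2019", "12/31/2019"],
    "Amount": [3312.5, 832.0, 10000.0, -3312.5, 8725.0, 1862.5, 3637.5, 1587.5],
})
df_Received = pd.DataFrame({
    "Bill_Number": [220119, 220219, 220419, 220519, 220619],
    "Date": ["4/16/2019", "5/21/2019", "8/2/2019", "8/2/2019", "8/2/2019"],
    "Amount": [3312.5, 6687.5, 8725, 1862.5, 3637.5],
})

# Step 1: left merge; overlapping columns get the default _x / _y suffixes.
final = df_Billed.merge(df_Received, on="Bill_Number", how="left")

# Step 2: days between billing and payment, or the billing date if never paid.
final["result"] = final.apply(
    lambda x: x.Date_x if pd.isnull(x.Date_y)
    else abs(pd.to_datetime(x.Date_x) - pd.to_datetime(x.Date_y)).days,
    axis=1)

# Step 3: drop the received-side columns and restore the original column names.
final = final.drop(["Date_y", "Amount_y"], axis=1)
final = final.rename(columns={"Date_x": "Date", "Amount_x": "Amount"})

print(final)
```

Note that Bill_Number 220219 appears twice in df_Billed, so both of its rows are matched against the single payment dated 5/21/2019.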