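I have two columns in a Pandas DataFrame, and I want to assert that they are equal, that one is greater than the other, or that some other boolean test on the two columns holds.
Right now I am doing something like this: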
# Roll the fields up so we can compare both reports.
# Goal: Show that `Gross Sales per Bar` is equal to `Gross Sales per Category`
#
# Do a GROUP BY of all the service bars and sum their Gross Sales per Bar
# Since the same value should be in this field for every 'Gross Sales per Bar' field,
# grab the first one, so we can compare them below
df_bar_sum = sbbac.groupby(['Bar'], as_index=False)['Gross Sales per Bar'].first()
df_bar_sum2 = sbbac.groupby(['Bar'], as_index=False)['Gross Sales per Category'].sum()
# Rename the 'Gross Sales per Category' column to 'Summed Gross Sales per Category'
df_bar_sum2.rename(columns={'Gross Sales per Category':'Summed Gross Sales per Category'}, inplace=True)
# Add the 'Gross Sales per Bar' column to the df_bar_sum2 Data Frame.
df_bar_sum2['Gross Sales per Bar'] = df_bar_sum['Gross Sales per Bar']
# See if they match...they should since the value of 'Gross Sales per Bar' should be equal to 'Gross Sales per Category' summed.
df_bar_sum2['GrossSalesPerCat_GrossSalesPerBar_eq'] = df_bar_sum2.apply(lambda row: 1 if row['Summed Gross Sales per Category'] == row['Gross Sales per Bar'] else 0, axis=1)
# Print the result
df_bar_sum2
All I get is a column containing 1 if the values match and 0 if they don't.
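I would like to use an assert here to test whether they match, because a mismatch would then fail the whole test run with an error showing what went wrong. Maybe that is not a good way to handle tabular data, I am not sure, but if it is a reasonable idea I would rather compare them with an assert.
An assert might also be harder to read, which would be bad. I am not sure...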
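import numpy as np
# Raises AssertionError unless every pair of values matches within floating-point tolerance.
assert np.allclose(your_df['Summed Gross Sales per Category'],
                   your_df['Gross Sales per Bar'])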
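To make the suggestion concrete, here is a minimal sketch of the whole check from roll-up to assertion. The sample data and the names `rolled`, `gross_per_bar`, `summed_per_category`, and `mismatches` are assumptions made for illustration; only `sbbac` and its column names come from the question. It also shows `pd.testing.assert_series_equal`, which may give a more readable failure message than a bare `assert`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the `sbbac` frame in the question.
sbbac = pd.DataFrame({
    "Bar": ["A", "A", "B", "B"],
    "Gross Sales per Bar": [30.0, 30.0, 12.5, 12.5],
    "Gross Sales per Category": [10.0, 20.0, 5.0, 7.5],
})

# A single groupby with named aggregation puts both roll-ups side by side,
# so no second frame or column copy is needed.
rolled = sbbac.groupby("Bar", as_index=False).agg(
    gross_per_bar=("Gross Sales per Bar", "first"),
    summed_per_category=("Gross Sales per Category", "sum"),
)

# Tolerant comparison: fails with AssertionError if any pair differs
# by more than np.allclose's default tolerances.
assert np.allclose(rolled["summed_per_category"], rolled["gross_per_bar"])

# pandas' own helper reports which values differ when the check fails.
# check_names=False because the two Series deliberately have different names.
pd.testing.assert_series_equal(
    rolled["summed_per_category"], rolled["gross_per_bar"], check_names=False
)

# For debugging, keep the rows that disagree instead of a 1/0 flag column.
mismatches = rolled[
    ~np.isclose(rolled["summed_per_category"], rolled["gross_per_bar"])
]
```

If the assertion fails, inspecting `mismatches` shows which bars are off. A tolerant comparison is used instead of `==` because summed floating-point values can differ from the expected total by rounding error.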