# How to count the frequency of records based on a condition in another column in pandas?

``````
In [1]: import pandas as pd

In [2]: df = pd.DataFrame({
   ...:     'donorID':[101,101,101,102,103,101,101,102,103],
   ...:     'recipientID':[11,11,21,21,31,11,21,31,31],
   ...:     'amount':[100,200,500,200,200,300,200,200,100],
   ...:     'year':[2014,2014,2014,2014,2014,2015,2015,2015,2015]
   ...: })

In [3]: df
Out[3]:
   amount  donorID  recipientID  year
0     100      101           11  2014
1     200      101           11  2014
2     500      101           21  2014
3     200      102           21  2014
4     200      103           31  2014
5     300      101           11  2015
6     200      101           21  2015
7     200      102           31  2015
8     100      103           31  2015
``````

Desired output:

``````   donorID  num_donation_2_years
0      101                     2
1      102                     0
2      103                     1
``````

BEN_YO

``````df1=df.groupby('donorID').apply(lambda x : x.groupby(x.recipientID).year.nunique().gt(1).sum())
df1
Out[102]:
donorID
101    2
102    0
103    1
dtype: int64
``````

``````df1.to_frame('num_donation_2_years').reset_index()
Out[104]:
   donorID  num_donation_2_years
0      101                     2
1      102                     0
2      103                     1
``````
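
The nested `groupby` above counts, for each donor, the recipients that appear in more than one distinct year. To make that logic explicit, here is a minimal plain-Python sketch of the same count. The `rows` list and the helper names `years_by_pair` and `counts` are illustrative assumptions, not part of the original answer.

```python
from collections import defaultdict

# (donorID, recipientID, year) triples taken from the question's DataFrame.
rows = [
    (101, 11, 2014), (101, 11, 2014), (101, 21, 2014),
    (102, 21, 2014), (103, 31, 2014), (101, 11, 2015),
    (101, 21, 2015), (102, 31, 2015), (103, 31, 2015),
]

# Collect the distinct years seen for every (donor, recipient) pair.
years_by_pair = defaultdict(set)
for donor, recipient, year in rows:
    years_by_pair[(donor, recipient)].add(year)

# Start every donor at zero so donors with no repeat recipients still appear.
counts = {donor: 0 for donor, _, _ in rows}
for (donor, _recipient), years in years_by_pair.items():
    if len(years) > 1:  # same recipient in more than one year
        counts[donor] += 1

print(counts)  # {101: 2, 102: 0, 103: 1}
```

The set per pair plays the role of `nunique()`, and the `len(years) > 1` check corresponds to `.gt(1)`.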

# `sum(level=0)` was removed in pandas 2.0; group on the index level instead
df1=df.groupby(['donorID','recipientID']).year.nunique().gt(1).groupby(level=0).sum()
df1
Out[109]:
donorID
101    2
102    0
103    1
Name: year, dtype: int64

df1.to_frame('num_donation_2_years').reset_index()
Out[104]:
   donorID  num_donation_2_years
0      101                     2
1      102                     0
2      103                     1
``````
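
For convenience, the second approach can be put together as one self-contained script. This is a sketch that assumes a recent pandas version; the intermediate name `years_per_pair` is illustrative and does not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'donorID': [101, 101, 101, 102, 103, 101, 101, 102, 103],
    'recipientID': [11, 11, 21, 21, 31, 11, 21, 31, 31],
    'amount': [100, 200, 500, 200, 200, 300, 200, 200, 100],
    'year': [2014, 2014, 2014, 2014, 2014, 2015, 2015, 2015, 2015],
})

# Number of distinct years in which each (donor, recipient) pair appears.
years_per_pair = df.groupby(['donorID', 'recipientID'])['year'].nunique()

# Flag pairs seen in more than one year, then count the flags per donor.
result = (
    years_per_pair.gt(1)
    .groupby(level='donorID')
    .sum()
    .reset_index(name='num_donation_2_years')
)
print(result)
```

Passing `name=` to `reset_index` names the count column directly, so the separate `to_frame` step is not needed.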
