I have the following pandas DataFrame:
                     EventID Institution_Name
TimeCreated
2021-03-22 15:34:46       40               H1
2021-03-22 18:17:19       40               H2
2021-03-22 20:37:47       40               H2
2021-03-22 20:40:20       40               H2
2021-03-22 21:37:32       40               H2
2021-03-22 22:16:32       40               H2
2021-03-22 23:19:49       40               H2
2021-03-22 23:26:40       40               H2
2021-03-23 00:26:03       40               H3
2021-03-23 01:25:43       40               H4
2021-03-23 04:00:24       40               H5
2021-03-23 13:09:42       40               H6
2021-03-23 13:13:23       40               H1
2021-03-23 15:49:33       40               H7
2021-03-23 17:22:30       40               H8
2021-03-23 17:22:37       40               H8
2021-03-23 17:23:49       40               H9
2021-03-23 18:19:56       40               H2
2021-03-23 18:22:14       40               H2
2021-03-23 18:52:36       40              H10
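I want to count the number of events per institution per day and sort the counts in descending order, while keeping the days in ascending order. For example, the final result should look like this: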
TimeCreated Institution_Name EventID_count
2021-03-22 H2 7
2021-03-22 H1 1
....
2021-03-23 H2 2
and so on
I am using the following:
grouper = df.groupby([pd.Grouper(freq='1D'), 'Institution_Name'])
grouper['EventID'].count().reset_index().sort_values(['TimeCreated'],ascending=True).sort_values('EventID', ascending=False).head(5)
However, this does not give the desired result.
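A likely reason is that each `sort_values` call re-sorts the whole frame, so the second call (by `EventID`) discards the date ordering produced by the first. The sketch below illustrates this on a small, made-up frame of already-aggregated counts; the `counts` values are hypothetical and not taken from the data above.

```python
import pandas as pd

# Hypothetical, already-aggregated counts (illustrative values only).
counts = pd.DataFrame({
    "TimeCreated": pd.to_datetime(
        ["2021-03-22", "2021-03-22", "2021-03-23", "2021-03-23"]
    ),
    "Institution_Name": ["H2", "H1", "H2", "H8"],
    "EventID": [7, 1, 2, 3],
})

# Chained calls: the second sort replaces the ordering from the first.
chained = (
    counts.sort_values(["TimeCreated"], ascending=True)
          .sort_values("EventID", ascending=False)
)

print(chained)
# The days end up interleaved rather than ascending.
print(chained["TimeCreated"].is_monotonic_increasing)
```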
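# Count events per day and institution. key='TimeCreated' assumes a column; if it is the index, call df.reset_index() first.
grouper = df.groupby([pd.Grouper(key='TimeCreated', freq='1D'), 'Institution_Name'])
# Regroup the counts by day so that each day can be sorted on its own.
grouper = grouper.count().groupby('TimeCreated', group_keys=False)
# Within each day, sort the counts in descending order.
grouper_count_desc = grouper.apply(lambda x: x.sort_values(by='EventID', ascending=False))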
In[65]: grouper_count_desc
Out[65]:
                              EventID
TimeCreated Institution_Name
2021-03-22  H2                      7
            H1                      1
2021-03-23  H2                      2
            H8                      2
            H1                      1
            H10                     1
            H3                      1
            H4                      1
            H5                      1
            H6                      1
            H7                      1
            H9                      1
grouper_date_asc = grouper_count_desc.sort_values(by='TimeCreated', ascending=True)
In[70]: grouper_date_desc = grouper_count_desc.sort_values(by='TimeCreated', ascending=False) # to show result, I used descending
In[71]: grouper_date_desc
Out[71]:
                              EventID
TimeCreated Institution_Name
2021-03-23  H2                      2
            H8                      2
            H1                      1
            H10                     1
            H3                      1
            H4                      1
            H5                      1
            H6                      1
            H7                      1
            H9                      1
2021-03-22  H2                      7
            H1                      1
print(grouper_date_asc.reset_index())
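As a possible alternative, the same ordering can usually be obtained with a single `sort_values` call that takes both keys and a per-key `ascending` list. The following is a minimal sketch, assuming `TimeCreated` is the DatetimeIndex as in the question; it uses only a handful of rows from the sample data, and the names `result` and `EventID_count` are chosen here for illustration.

```python
import pandas as pd

# A few rows from the question, with TimeCreated as the DatetimeIndex.
df = pd.DataFrame(
    {
        "EventID": [40] * 6,
        "Institution_Name": ["H1", "H2", "H2", "H3", "H2", "H2"],
    },
    index=pd.to_datetime([
        "2021-03-22 15:34:46",
        "2021-03-22 18:17:19",
        "2021-03-22 20:37:47",
        "2021-03-23 00:26:03",
        "2021-03-23 18:19:56",
        "2021-03-23 18:22:14",
    ]),
)
df.index.name = "TimeCreated"

result = (
    # Group by calendar day (taken from the index) and by institution.
    df.groupby([pd.Grouper(freq="1D"), "Institution_Name"])["EventID"]
      .count()
      .reset_index(name="EventID_count")
      # One sort with two keys: days ascending, counts descending within a day.
      .sort_values(["TimeCreated", "EventID_count"], ascending=[True, False])
      .reset_index(drop=True)
)

print(result)
```

Sorting on both keys at once avoids relying on a second sort preserving the order left by the first.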