Below is my DataFrame:
info date time file msg
0 INFO: 2018-09-12 16:10:10: view.py: phone
1 INFO: 2018-09-12 16:10:10: view.py: asdasd
2 INFO: 2018-09-12 16:10:43: view.py: contact start
3 INFO: 2018-09-12 16:10:43: view.py: contact end
4 INFO: 2018-09-12 16:11:36: view.py: app start
5 INFO: 2018-09-12 16:11:36: view.py: busy start
6 INFO: 2018-09-12 16:12:08: view.py: busy end
7 INFO: 2018-09-12 16:12:08: view.py: contact end
8 INFO: 2018-09-12 16:12:08: view.py: app end
9 INFO: 2018-09-12 16:12:08: view.py: phone
7 INFO: 2018-09-12 16:12:08: view.py: contact end
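I want to split this DataFrame into multiple DataFrames based on the values in the msg column. If I split on the value "phone", the resulting DataFrames should look like this: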
df1:
info date time file msg
0 INFO: 2018-09-12 16:10:10: view.py: phone
1 INFO: 2018-09-12 16:10:10: view.py: asdasd
2 INFO: 2018-09-12 16:10:43: view.py: contact start
3 INFO: 2018-09-12 16:10:43: view.py: contact end
4 INFO: 2018-09-12 16:11:36: view.py: app start
5 INFO: 2018-09-12 16:11:36: view.py: busy start
6 INFO: 2018-09-12 16:12:08: view.py: busy end
7 INFO: 2018-09-12 16:12:08: view.py: contact end
8 INFO: 2018-09-12 16:12:08: view.py: app end
df2:
info date time file msg
9 INFO: 2018-09-12 16:12:08: view.py: phone
7 INFO: 2018-09-12 16:12:08: view.py: contact end
Use a dictionary when you need a variable number of related variables. Here, you can combine GroupBy with cumsum:
d = dict(tuple(df.groupby(df['msg'].eq('phone').cumsum())))
Then access the DataFrames via d[1], d[2], ... d[n].
Result:
{1: info date time file msg
0 INFO: 2018-09-12 16:10:10: view.py: phone
1 INFO: 2018-09-12 16:10:10: view.py: asdasd
2 INFO: 2018-09-12 16:10:43: view.py: contact start
3 INFO: 2018-09-12 16:10:43: view.py: contact end
4 INFO: 2018-09-12 16:11:36: view.py: app start
5 INFO: 2018-09-12 16:11:36: view.py: busy start
6 INFO: 2018-09-12 16:12:08: view.py: busy end
7 INFO: 2018-09-12 16:12:08: view.py: contact end
8 INFO: 2018-09-12 16:12:08: view.py: app end,
2: info date time file msg
9 INFO: 2018-09-12 16:12:08: view.py: phone
7 INFO: 2018-09-12 16:12:08: view.py: contact end}
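To see why the one-liner works, it may help to break it into steps. The sketch below uses a small hypothetical frame with only a msg column (the toy data and the names is_start and block_id are assumptions for illustration, not part of the original question). Comparing msg to "phone" marks the first row of each block, and the running sum of those marks gives every row its block number, which GroupBy then uses as the key.

```python
import pandas as pd

# Hypothetical toy frame; only the msg column matters for the split.
df = pd.DataFrame({
    "msg": ["phone", "asdasd", "contact start", "contact end",
            "phone", "contact end"]
})

# True on every row whose msg is exactly "phone", i.e. the start of a block.
is_start = df["msg"].eq("phone")

# Running count of block starts: each row gets the number of its block.
block_id = is_start.cumsum()

# One DataFrame per block number, collected into a dictionary.
d = dict(tuple(df.groupby(block_id)))

print(block_id.tolist())
print(d[2])
```

Note that any rows appearing before the first "phone" row would get block number 0, so in that case the dictionary would also contain d[0].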