I have a dataframe in which one column holds single values and another column holds lists of values:
period node key_players
0 0 ZF1013 [ZF1128, ZF176, ZF434, ZF469, ZF659]
1 0 ZF1014 [ZF1128, ZF176, ZF434, ZF469, ZF659]
2 0 ZF1015 [ZF1128, ZF176, ZF434, ZF469, ZF659]
3 0 ZF1020 [ZF1128, ZF176, ZF434, ZF469, ZF659]
4 0 ZF1025 [ZF1128, ZF176, ZF434, ZF469, ZF659]
... ... ... ...
1565 4 ZF898 [ZF1336, ZF1346, ZF3, ZF434, ZF481]
1566 4 ZF945 [ZF1336, ZF1346, ZF3, ZF434, ZF481]
1567 4 ZF948 [ZF1336, ZF1346, ZF3, ZF434, ZF481]
1568 4 ZF97 [ZF1336, ZF1346, ZF3, ZF434, ZF481]
1569 4 ZFM264 [ZF1336, ZF1346, ZF3, ZF434, ZF481]
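I want to keep only the rows where "node" appears in "key_players".
I used a version of the visible part of your df (for future questions, please follow: How to make good reproducible pandas examples).
I modified a few rows so that node appears in key_players.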
import ast
from io import StringIO

import pandas as pd

# Columns are separated by two or more spaces; a regex separator needs the python engine.
# The header has one field fewer than the data rows, so the first field becomes the index.
df = pd.read_csv(StringIO(
"""
period  node  key_players
0  0  ZF1013  ['ZF1128', 'ZF176', 'ZF434', 'ZF469', 'ZF659']
1  0  ZF1014  ['ZF1014', 'ZF176', 'ZF434', 'ZF469', 'ZF659']
2  0  ZF1015  ['ZF1128', 'ZF176', 'ZF434', 'ZF469', 'ZF659']
3  0  ZF1020  ['ZF1128', 'ZF176', 'ZF434', 'ZF469', 'ZF659']
4  0  ZF1025  ['ZF1128', 'ZF1025', 'ZF434', 'ZF469', 'ZF659']
1565  4  ZF898  ['ZF1336', 'ZF1346', 'ZF3', 'ZF434', 'ZF481']
1566  4  ZF945  ['ZF1336', 'ZF1346', 'ZF3', 'ZF434', 'ZF481']
1567  4  ZF948  ['ZF1336', 'ZF1346', 'ZF3', 'ZF434', 'ZF481']
1568  4  ZF97  ['ZF1336', 'ZF1346', 'ZF3', 'ZF434', 'ZF481']
1569  4  ZFM264  ['ZF1336', 'ZF1346', 'ZF3', 'ZF434', 'ZF481']
"""), sep=r'\s\s+', engine='python')
# Turn the stringified lists into real Python lists (literals only, nothing is executed).
df['key_players'] = df['key_players'].apply(ast.literal_eval)
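When a column of stringified lists like this has to be converted, `ast.literal_eval` accepts only Python literals, whereas `eval` executes any expression it is given. The following is a minimal sketch on a hypothetical two-row Series (`raw` is not part of the question's data), shown only to illustrate the difference:

```python
import ast

import pandas as pd

# Hypothetical column of stringified lists, similar in shape to key_players.
raw = pd.Series(["['ZF1128', 'ZF176']", "['ZF1014', 'ZF434']"])

# literal_eval parses literals (lists, strings, numbers, ...) and nothing else.
parsed = raw.apply(ast.literal_eval)

# A string that is not a plain literal is rejected instead of being executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
```

For data you typed yourself either function gives the same lists; the distinction matters mostly when the strings come from a file you do not control.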
We unpack the lists in key_players with explode and keep the rows where the unpacked value matches node
df2 = df.assign(kp = df['key_players']).explode('kp')
df2[df2['kp'] == df2['node']].drop(columns = 'kp')
This prints:
period node key_players
-- -------- ------ -----------------------------------------------
1 0 ZF1014 ['ZF1014', 'ZF176', 'ZF434', 'ZF469', 'ZF659']
4 0 ZF1025 ['ZF1128', 'ZF1025', 'ZF434', 'ZF469', 'ZF659']
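One caveat with the explode approach: if a node occurs more than once in its own list, the matching row comes back once per occurrence. The sketch below, on a small hypothetical frame (`demo` is made up for illustration), reduces the exploded comparison to one boolean per original row, so each row is returned at most once:

```python
import pandas as pd

# Hypothetical data: row 1 lists its own node twice.
demo = pd.DataFrame({
    "node": ["ZF1013", "ZF1014", "ZF1025"],
    "key_players": [
        ["ZF1128", "ZF176"],
        ["ZF1014", "ZF1014"],
        ["ZF1128", "ZF1025"],
    ],
})

# Same idea as in the answer: explode a copy of the list column and compare.
naive = demo.assign(kp=demo["key_players"]).explode("kp")
naive_matches = naive[naive["kp"] == naive["node"]].drop(columns="kp")  # row 1 appears twice

# explode repeats the original index label for every list element.
exploded = demo["key_players"].explode()

# Compare each element with the node of the row it came from ...
hits = pd.Series(
    exploded.to_numpy() == demo.loc[exploded.index, "node"].to_numpy(),
    index=exploded.index,
)
# ... then collapse to one boolean per original row.
mask = hits.groupby(level=0).any()

matched = demo[mask]
```

If the lists never contain repeated values, this gives the same rows as the shorter version.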
If you don't mind iterating over the rows (generally discouraged in pandas), you can do this instead
df[df.apply(lambda row: row['node'] in row['key_players'], axis=1)]
which gives the same output.
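The same row-wise membership test can also be written as a list comprehension over the two columns, which avoids building a row object for every row as `apply(..., axis=1)` does. This is a sketch on a hypothetical frame (`demo` is made up for illustration), not a benchmarked recommendation:

```python
import pandas as pd

# Hypothetical data shaped like the question's dataframe.
demo = pd.DataFrame({
    "period": [0, 0, 0],
    "node": ["ZF1013", "ZF1014", "ZF1025"],
    "key_players": [
        ["ZF1128", "ZF176"],
        ["ZF1014", "ZF176"],
        ["ZF1128", "ZF1025"],
    ],
})

# One boolean per row: is this row's node contained in this row's list?
mask = [node in players for node, players in zip(demo["node"], demo["key_players"])]

matched = demo[mask]
```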