How can I use Python to split the list stored in one DataFrame column into two columns? For example:
row | column_A
==================================
1 |[('Ahli', 'NNP'), |
| ('paleontologi', 'NNP'), |
| ('Thomas', 'NNP'), |
| ('dan', 'CC'), |
| ('timnya', 'RB'), |
| ('.', 'Z')], |
2 |[('fosil', 'NN'), |
| ('mamalia', 'NN'), |
| ('yang', 'SC'), |
| ('menghuni', 'VB'), |
| ('Antartika', 'NNP')] |
I only want to take the second string from each tuple in the list:
row | column_A | postag
=======================================
1 |[('Ahli', 'NNP'), |[('NNP'),
| ('paleontologi', 'NNP'), | ('NNP'),
| ('Thomas', 'NNP'), | ('NNP'),
| ('dan', 'CC'), | ('CC'),
| ('timnya', 'RB'), | ('RB'),
| ('.', 'Z')], | ('Z')],
2 |[('fosil', 'NN'), |[('NN'),
| ('mamalia', 'NN'), | ('NN'),
| ('yang', 'SC'), | ('SC'),
| ('menghuni', 'VB'), | ('VB'),
| ('Antartika', 'NNP')] | ('NNP')]
Use Series.map on column_A to apply a custom mapping function that turns each list of tuples into the list you need:
df['postag'] = df['column_A'].map(lambda l: [b for a, b in l])
Another possible approach is a plain list comprehension:
df['postag'] = [[y for x, y in lst] for lst in df['column_A']]
Result:
# print(df)
column_A postag
0 [(Ahli, NNP), (paleontologi, NNP), (Thomas, NN... [NNP, NNP, NNP, CC, RB, Z]
1 [(fosil, NN), (mamalia, NN), (yang, SC), (meng... [NN, NN, SC, VB, NNP]
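For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. It assumes each cell of column_A already holds a real Python list of (word, tag) tuples, as in the question, and not a string representation of one. The helper name extract_tags is made up for illustration; it is not part of pandas.

```python
import pandas as pd

# Sample data copied from the question: each cell is a list of (word, tag) tuples.
df = pd.DataFrame({
    "column_A": [
        [("Ahli", "NNP"), ("paleontologi", "NNP"), ("Thomas", "NNP"),
         ("dan", "CC"), ("timnya", "RB"), (".", "Z")],
        [("fosil", "NN"), ("mamalia", "NN"), ("yang", "SC"),
         ("menghuni", "VB"), ("Antartika", "NNP")],
    ]
})


def extract_tags(pairs):
    """Hypothetical helper: return the second element of every (word, tag) tuple."""
    return [tag for _word, tag in pairs]


# Series.map calls extract_tags once per row and collects the results in a new column.
df["postag"] = df["column_A"].map(extract_tags)

print(df["postag"].tolist())
```

Note that the DataFrame gets the default 0-based index, which is why the rows print as 0 and 1 where the question numbers them 1 and 2.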