Is there a good existing method for converting between these two forms?
+--------+-------------+
| FRUIT  | ATTRIBUTES  |
+--------+-------------+
| banana | long|yellow |
+--------+-------------+
| kiwi   | brown|oval  |
+--------+-------------+
and
+--------+-----------+
| FRUIT  | ATTRIBUTE |
+--------+-----------+
| banana | long      |
+--------+-----------+
| banana | yellow    |
+--------+-----------+
| kiwi   | brown     |
+--------+-----------+
| kiwi   | oval      |
+--------+-----------+
I am currently iterating over the rows to do the unpacking, which I have heard is discouraged.
import pandas as pd

packed = pd.DataFrame([['banana', 'long|yellow'],
                       ['kiwi', 'brown|oval']],
                      columns=['FRUIT', 'ATTRIBUTES'])
pack_delim = '|'
per_fruit_frames = []
for row in packed.itertuples(index=True, name='Pandas'):
    # Split the packed attribute string into its individual values.
    row_attribs = row.ATTRIBUTES
    row_attribs_split = row_attribs.split(pack_delim)
    row_attribs_series = pd.Series(row_attribs_split)
    ras_len = len(row_attribs_split)
    # row[0] is the index, so row[1] is FRUIT; repeat it once per attribute.
    fruit_rep = [row[1]] * ras_len
    frs = pd.Series(fruit_rep)
    temp = pd.concat([frs, row_attribs_series], axis=1)
    per_fruit_frames.append(temp)
# Stack the per-fruit frames and restore the original column names.
unpacked = pd.concat(per_fruit_frames)
unpacked.columns = packed.columns
Try:
packed = (packed.assign(ATTRIBUTES=packed['ATTRIBUTES'].str.split('|'))
                .explode('ATTRIBUTES', ignore_index=True))
Or, in two steps:
packed['ATTRIBUTES'] = packed['ATTRIBUTES'].str.split('|')
packed = packed.explode('ATTRIBUTES', ignore_index=True)
Output of packed:
    FRUIT ATTRIBUTES
0  banana       long
1  banana     yellow
2    kiwi      brown
3    kiwi       oval
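The question also asks about the opposite direction (going from the long form back to the packed form), which the answer above does not cover. One possible approach, offered as a sketch rather than the answerer's method, is to group by FRUIT and join each group's attributes with the delimiter. The names unpacked, pack_delim and repacked are chosen here for illustration, and the column is assumed to be called ATTRIBUTES as in the output above:

```python
import pandas as pd

# Long form: one row per (FRUIT, attribute) pair, matching the output above.
unpacked = pd.DataFrame(
    [['banana', 'long'], ['banana', 'yellow'],
     ['kiwi', 'brown'], ['kiwi', 'oval']],
    columns=['FRUIT', 'ATTRIBUTES'])

pack_delim = '|'

# Group the rows by fruit and join each group's attributes into one string.
# sort=False keeps the fruits in order of first appearance.
repacked = (unpacked.groupby('FRUIT', sort=False)['ATTRIBUTES']
                    .agg(pack_delim.join)
                    .reset_index())
print(repacked)
```

This assumes every attribute is a string and that none of them contains the delimiter itself; missing values would need to be dropped or filled before joining.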