I have a DataFrame as follows:
import pandas as pd
df = pd.DataFrame({'ORDER':["A", "A", "A", "B", "B","B"], 'GROUP': ["A1C", "A1", "B1", "B1C", "M1", "M1C"]})
df['_A1_XYZ'] = 1
df['_A1C_XYZ'] = 2
df['_B1_XYZ'] = 3
df['_B1C_XYZ'] = 4
df['_M1_XYZ'] = 5
df
ORDER GROUP _A1_XYZ _A1C_XYZ _B1_XYZ _B1C_XYZ _M1_XYZ
0 A A1C 1 2 3 4 5
1 A A1 1 2 3 4 5
2 A B1 1 2 3 4 5
3 B B1C 1 2 3 4 5
4 B M1 1 2 3 4 5
5 B M1C 1 2 3 4 5
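I want to create a column "NEW" based on the column "GROUP" and all the columns ending in XYZ, as follows: for each row, df["NEW"] = df["_" + GROUP + "_XYZ"], using that row's GROUP value.
For example, for the first row GROUP = A1C, so "NEW" = 2 (from _A1C_XYZ); for the second row "NEW" = 1 (from _A1_XYZ).
My expected output: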
ORDER GROUP _A1_XYZ _A1C_XYZ _B1_XYZ _B1C_XYZ _M1_XYZ NEW
0 A A1C 1 2 3 4 5 2
1 A A1 1 2 3 4 5 1
2 A B1 1 2 3 4 5 3
3 B B1C 1 2 3 4 5 4
4 B M1 1 2 3 4 5 5
5 B M1C 1 2 3 4 5
Use pd.DataFrame.lookup (available in pandas before 2.0; it was deprecated in 1.2 and removed in 2.0):
df['NEW'] = df.lookup(df.index, '_'+df['GROUP']+'_XYZ')
df
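Output (a _M1C_XYZ column with value 6 was added to df first, because lookup raises a KeyError when a requested column label is missing):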
ORDER GROUP _A1_XYZ _A1C_XYZ _B1_XYZ _B1C_XYZ _M1_XYZ _M1C_XYZ NEW
0 A A1C 1 2 3 4 5 6 2
1 A A1 1 2 3 4 5 6 1
2 A B1 1 2 3 4 5 6 3
3 B B1C 1 2 3 4 5 6 4
4 B M1 1 2 3 4 5 6 5
5 B M1C 1 2 3 4 5 6 6
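Since DataFrame.lookup is no longer available in pandas 2.0 and later, the same row-by-row pick can be sketched with NumPy positional indexing. This is only one possible replacement, not the answer's own method; the names wanted, col_pos, row_pos and found are illustrative, and the sketch assumes the column labels are unique. Rows whose target column does not exist get NaN instead of raising.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ORDER': ["A", "A", "A", "B", "B", "B"],
    'GROUP': ["A1C", "A1", "B1", "B1C", "M1", "M1C"],
})
for value, name in enumerate(
        ['_A1_XYZ', '_A1C_XYZ', '_B1_XYZ', '_B1C_XYZ', '_M1_XYZ'], start=1):
    df[name] = value

# Column label each row should read from, e.g. "A1C" -> "_A1C_XYZ".
wanted = '_' + df['GROUP'] + '_XYZ'

# Position of each wanted label among the columns; -1 marks a missing column.
col_pos = df.columns.get_indexer(wanted)
row_pos = np.arange(len(df))
found = col_pos != -1

# Start from NaN and fill only the rows whose target column exists.
new = np.full(len(df), np.nan)
new[found] = df.to_numpy()[row_pos[found], col_pos[found]].astype(float)
df['NEW'] = new
```

Because NaN is used for the missing _M1C_XYZ column, the NEW column comes out as float rather than int.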
Or use stack and reindex:
df['New'] = (df.stack().reindex(list(zip(df.index, '_'+df['GROUP']+'_XYZ')))
             .rename('NEW').reset_index(level=1, drop=True))
df
Output:
ORDER GROUP _A1_XYZ _A1C_XYZ _B1_XYZ _B1C_XYZ _M1_XYZ New
0 A A1C 1 2 3 4 5 2
1 A A1 1 2 3 4 5 1
2 A B1 1 2 3 4 5 3
3 B B1C 1 2 3 4 5 4
4 B M1 1 2 3 4 5 5
5 B M1C 1 2 3 4 5 NaN
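To make the stack-and-reindex idea easier to follow, here is a step-by-step sketch of the same technique. It is a variation rather than the answer's exact code: it stacks only the columns ending in _XYZ (so the values stay numeric) and builds the (row, column) pairs with pd.MultiIndex.from_arrays. The names value_cols, stacked, pairs and picked are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'ORDER': ["A", "A", "A", "B", "B", "B"],
    'GROUP': ["A1C", "A1", "B1", "B1C", "M1", "M1C"],
})
for value, name in enumerate(
        ['_A1_XYZ', '_A1C_XYZ', '_B1_XYZ', '_B1C_XYZ', '_M1_XYZ'], start=1):
    df[name] = value

# Step 1: stack the value columns into one long Series
# indexed by (row label, column name).
value_cols = [c for c in df.columns if c.endswith('_XYZ')]
stacked = df[value_cols].stack()

# Step 2: build the (row label, wanted column) pair for every row.
pairs = pd.MultiIndex.from_arrays([df.index, '_' + df['GROUP'] + '_XYZ'])

# Step 3: reindex picks one value per pair; pairs that do not exist
# in the stacked Series (here row 5, "_M1C_XYZ") become NaN.
picked = stacked.reindex(pairs)

# Assign by position so the MultiIndex does not need to be reset.
df['NEW'] = picked.to_numpy()
```

Unlike lookup, reindex does not raise on a missing label, which is why the last row ends up as NaN.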