I have a DataFrame that contains values for several factors, sorted by date:
date_and_time date hour factor st0 st1 st2
0 2012-01-01 00:00:00 2012-01-01 0 PM25 69.4000 58.00000 51.60000
1 2012-01-01 00:00:00 2012-01-01 0 CO 1.3800 1.15714 1.06585
2 2012-01-01 01:00:00 2012-01-01 1 PM25 127.2000 77.33330 89.20000
3 2012-01-01 01:00:00 2012-01-01 1 CO 1.5800 1.57143 1.63169
How would I get:
date_and_time date hour PM25_st0 PM25_st1 PM25_st2 CO_st0 CO_st1 CO_st2
0 2012-01-01 00:00:00 2012-01-01 0 69.4000 58.0000 51.6000 1.3800 1.15714 1.06585
1 2012-01-01 01:00:00 2012-01-01 1 127.200 77.3333 89.2000 1.5800 1.57143 1.63169
Basically, I want to pivot the data and create the corresponding columns. First I tried this:
import pandas as pd

df_data = {
    'date_and_time': ['2012-01-01 00:00:00', '2012-01-01 00:00:00', '2012-01-01 01:00:00', '2012-01-01 01:00:00'],
    'date': ['2012-01-01', '2012-01-01', '2012-01-01', '2012-01-01'],
    'hour': [0, 0, 1, 1],
    'factor': ['PM25', 'CO', 'PM25', 'CO'],
    'st0': [69.4000, 1.3800, 127.2000, 1.5800],
    'st1': [58.00000, 1.15714, 77.33330, 1.57143],
    'st2': [51.60000, 1.06585, 89.20000, 1.63169]
}
df_test = pd.DataFrame(df_data, columns=['date_and_time', 'date', 'hour', 'factor', 'st0', 'st1', 'st2'])
# One row per timestamp; the columns become a MultiIndex of (station, factor)
df_pivoted = pd.pivot_table(df_test, index=["date_and_time", "date", "hour"], columns=["factor"], values=["st0", "st1", "st2"])
and got a pivoted DataFrame with MultiIndex columns.
Is there a way to customize the columns by factor?
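Use DataFrame.sort_index on the second level of the MultiIndex in the columns, then flatten the columns with a list comprehension of f-strings with the two levels swapped, and finally call DataFrame.reset_index to turn the MultiIndex in the index back into regular columns: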
df_pivoted = df_pivoted.sort_index(axis=1, level=1)
df_pivoted.columns = [f'{b}_{a}' for a, b in df_pivoted.columns]
df_pivoted = df_pivoted.reset_index()
print (df_pivoted)
date_and_time date hour CO_st0 CO_st1 CO_st2 PM25_st0 \
0 2012-01-01 00:00:00 2012-01-01 0 1.38 1.15714 1.06585 69.4
1 2012-01-01 01:00:00 2012-01-01 1 1.58 1.57143 1.63169 127.2
PM25_st1 PM25_st2
0 58.0000 51.6
1 77.3333 89.2
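The output above puts the CO columns before the PM25 columns because sort_index orders the factor level alphabetically. If the order from the question (PM25 first) matters, one possible variation is to flatten the column names first and then select them in an explicit order. This is only a sketch: the names factor_order, stations, ordered_cols and df_wide are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

df_test = pd.DataFrame({
    'date_and_time': ['2012-01-01 00:00:00', '2012-01-01 00:00:00',
                      '2012-01-01 01:00:00', '2012-01-01 01:00:00'],
    'date': ['2012-01-01', '2012-01-01', '2012-01-01', '2012-01-01'],
    'hour': [0, 0, 1, 1],
    'factor': ['PM25', 'CO', 'PM25', 'CO'],
    'st0': [69.4000, 1.3800, 127.2000, 1.5800],
    'st1': [58.00000, 1.15714, 77.33330, 1.57143],
    'st2': [51.60000, 1.06585, 89.20000, 1.63169],
})

df_pivoted = pd.pivot_table(
    df_test,
    index=['date_and_time', 'date', 'hour'],
    columns=['factor'],
    values=['st0', 'st1', 'st2'],
)

# Flatten the (station, factor) column tuples into 'factor_station' names
df_pivoted.columns = [f'{factor}_{station}' for station, factor in df_pivoted.columns]

# Assumed desired order (hypothetical names): factors as listed, stations within each
factor_order = ['PM25', 'CO']
stations = ['st0', 'st1', 'st2']
ordered_cols = [f'{factor}_{station}' for factor in factor_order for station in stations]

# Select the columns in that order, then move the index levels back into columns
df_wide = df_pivoted[ordered_cols].reset_index()
print(df_wide)
```

Selecting by an explicit list keeps the ordering visible in the code, at the cost of having to list the factors by hand; with many factors, the sort_index approach above is less work.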