I have a DataFrame like the one shown below:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'subject_ID': [1, 2, 3, 4, 5],
    'date_visit': ['1/1/2020', '3/3/2200', '13/11/2100', '24/05/2198', '30/03/2071'],
    'a11fever': ['Yes', 'No', 'Yes', 'Yes', 'No'],
    'a12diagage': [36, 34, 42, 40, np.nan],
    'a12diagyr': [2021, 3213, 2091, 4567, 8901],
    'a12diagyrago': [6, np.nan, 9, np.nan, np.nan]})
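I would like to reshape this DataFrame into long format, with one row per subject and measurement.
I can do this successfully with pd.melt and with stack, but I cannot get it to work with pd.wide_to_long.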
pd.melt(df, id_vars=['subject_ID', 'date_visit'], value_vars=['a11fever', 'a12diagage', 'a12diagyr', 'a12diagyrago'])  # works fine
pd.wide_to_long(df, stubnames=['measurement', 'val'], i=['subject_ID', 'date_visit'], j='grp').sort_index(level=0)  # returns 0 records
df.set_index(['subject_ID', 'date_visit']).stack().reset_index()  # works fine
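I also have two further questions:
a) Do we always have to list every column we want to unpivot in the value_vars argument of pd.melt? My real data has more than 120 columns. Do I have to name them one by one?
b) Could you also help me with how to use pd.wide_to_long here?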
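pd.wide_to_long is not suited to this use case, because it will produce the wrong output. You have to pass stubnames, and those stubs become the columns of the result (a11 and a12), while the rest of each column name becomes the value of the new index level. See the example: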
# suffix is a regex; a raw string avoids the invalid-escape warning for '\D'
melt = pd.wide_to_long(df,
                       i=['subject_ID', 'date_visit'],
                       stubnames=['a11', 'a12'],
                       suffix=r'\D+',
                       j='fever_diag').reset_index()
    subject_ID  date_visit  fever_diag  a11     a12
0            1    1/1/2020     diagage  NaN    36.0
1            1    1/1/2020      diagyr  NaN  2021.0
2            1    1/1/2020   diagyrago  NaN     6.0
3            1    1/1/2020       fever  Yes     NaN
4            2    3/3/2200     diagage  NaN    34.0
5            2    3/3/2200      diagyr  NaN  3213.0
6            2    3/3/2200   diagyrago  NaN     NaN
7            2    3/3/2200       fever   No     NaN
8            3  13/11/2100     diagage  NaN    42.0
9            3  13/11/2100      diagyr  NaN  2091.0
10           3  13/11/2100   diagyrago  NaN     9.0
11           3  13/11/2100       fever  Yes     NaN
12           4  24/05/2198     diagage  NaN    40.0
13           4  24/05/2198      diagyr  NaN  4567.0
14           4  24/05/2198   diagyrago  NaN     NaN
15           4  24/05/2198       fever  Yes     NaN
16           5  30/03/2071     diagage  NaN     NaN
17           5  30/03/2071      diagyr  NaN  8901.0
18           5  30/03/2071   diagyrago  NaN     NaN
19           5  30/03/2071       fever   No     NaN
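
As for the question about value_vars: it is optional in pd.melt. When it is omitted, every column that is not listed in id_vars gets unpivoted, so a frame with 120+ columns does not need each name typed out. Below is a minimal sketch on the sample data. The output column names measurement and val are borrowed from the stubnames attempt in the question and are only an assumption about the desired layout, and the a12 prefix filter is purely illustrative.

```python
import numpy as np
import pandas as pd

# Same sample frame as in the question.
df = pd.DataFrame({
    'subject_ID': [1, 2, 3, 4, 5],
    'date_visit': ['1/1/2020', '3/3/2200', '13/11/2100', '24/05/2198', '30/03/2071'],
    'a11fever': ['Yes', 'No', 'Yes', 'Yes', 'No'],
    'a12diagage': [36, 34, 42, 40, np.nan],
    'a12diagyr': [2021, 3213, 2091, 4567, 8901],
    'a12diagyrago': [6, np.nan, 9, np.nan, np.nan],
})

id_cols = ['subject_ID', 'date_visit']

# value_vars omitted: every column not in id_vars is unpivoted.
# 'measurement' and 'val' are assumed names for the two output columns.
long_all = pd.melt(df, id_vars=id_cols,
                   var_name='measurement', value_name='val')

# To unpivot only a subset, build the list from df.columns instead of typing it.
# The 'a12' prefix is just an example filter.
a12_cols = [c for c in df.columns if c.startswith('a12')]
long_a12 = pd.melt(df, id_vars=id_cols, value_vars=a12_cols,
                   var_name='measurement', value_name='val')

print(long_all.head())
```

Because the unpivoted columns mix strings and numbers, the val column ends up with object dtype; if that matters, melting the text and numeric columns separately is one way around it.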