pandas DataFrame reshape by multiple column values

endangeredoxen

I'm trying to free myself of JMP for data analysis but cannot determine the pandas equivalent of JMP's Split Columns function. I'm starting with the following DataFrame:

In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'Level0': [0,0,0,0,0,0,1,1,1,1,1,1], 'Level1': [0,1,0,1,0,1,0,1,0,1,0,1], 'Vals': [1,3,2,4,1,6,7,5,3,3,2,8]})
In [3]: df
Out[3]:
    Level0  Level1  Vals
0        0       0     1
1        0       1     3
2        0       0     2
3        0       1     4
4        0       0     1
5        0       1     6
6        1       0     7
7        1       1     5
8        1       0     3
9        1       1     3
10       1       0     2
11       1       1     8

I can handle some of the output scenarios of JMP's function using the pivot_table function, but I'm stumped on the case where the Vals column is split by unique combinations of Level0 and Level1 to give the following output:

Level0   0       1
Level1   0   1   0   1
0        1   3   7   5
1        2   4   3   3
2        1   6   2   8

I tried pd.pivot_table(df, values='Vals', columns=['Level0', 'Level1']) but this gives mean values for the different combinations:

Level0  Level1
0       0         1.333333
        1         4.333333
1       0         4.000000
        1         5.333333

I also tried pd.pivot_table(df, values='Vals', index=df.index, columns=['Level0', 'Level1']), which gives me the column headers I want. It doesn't work, though, because it forces the output to have the same number of rows as the original, so the output has a lot of NaN values:

Level0   0       1
Level1   0   1   0   1
0        1 NaN NaN NaN
1      NaN   3 NaN NaN
2        2 NaN NaN NaN
3      NaN   4 NaN NaN
4        1 NaN NaN NaN
5      NaN   6 NaN NaN
6      NaN NaN   7 NaN
7      NaN NaN NaN   5
8      NaN NaN   3 NaN
9      NaN NaN NaN   3
10     NaN NaN   2 NaN
11     NaN NaN NaN   8

Any suggestions?

ayhan

It's a bit of a workaround, but you can do:

df.pivot_table(index=df.groupby(['Level0', 'Level1']).cumcount(), 
               columns=['Level0', 'Level1'], values='Vals', aggfunc='first')
Out: 
Level0  0     1   
Level1  0  1  0  1
0       1  3  7  5
1       2  4  3  3
2       1  6  2  8

The idea here is that the index of the output is not readily available in the original DataFrame. You can get it with the following:

df.groupby(['Level0', 'Level1']).cumcount()
Out: 
0     0
1     0
2     1
3     1
4     2
5     2
6     0
7     0
8     1
9     1
10    2
11    2
dtype: int64

Now if you pass this as the index of the pivot_table, any value-preserving aggfunc (mean, min, max, first or last) should work for you, since each index-column pair has only one entry.
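
As a quick check of that claim, here is a self-contained sketch that rebuilds the question's DataFrame and runs the pivot with several aggfuncs. The names pos, results and unstacked are made up for illustration. The last part is an alternative that is not from the answer above: it assumes you are happy to skip aggregation entirely and reshape with set_index and unstack instead.

```python
import pandas as pd

df = pd.DataFrame({
    'Level0': [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    'Level1': [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    'Vals':   [1, 3, 2, 4, 1, 6, 7, 5, 3, 3, 2, 8],
})

# Position of each row within its (Level0, Level1) group: 0, 1, 2, ...
pos = df.groupby(['Level0', 'Level1']).cumcount()

# Each (pos, Level0, Level1) combination occurs exactly once, so every
# cell of the pivot aggregates a single value and the aggfunc choice
# does not change the numbers (mean only changes the dtype to float).
results = {
    func: df.pivot_table(index=pos, columns=['Level0', 'Level1'],
                         values='Vals', aggfunc=func)
    for func in ['first', 'last', 'min', 'max', 'mean']
}

# Alternative sketch without aggregation: move the keys into the index,
# then unstack the two Level columns into column headers.
unstacked = (
    df.set_index([pos, 'Level0', 'Level1'])['Vals']
      .unstack(['Level0', 'Level1'])
)

print(results['first'])
print(unstacked)
```

One difference worth noting: pivot_table silently aggregates if a combination ever repeats, while unstack raises an error on duplicate index entries, which can be useful if you want to be told when the uniqueness assumption breaks.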


Edited at 2020-10-25
