Python match a column name based on a column value in another dataframe

Tim B

Apologies if this is a duplicate of some sort. I looked at 20 different questions, but none of them helped me. If someone can point me to a question that answers this, I'll happily delete my question.

I have two dataframes. The first, df_full, has a long list of columns, one of which is called 'Industry' and contains industry names as strings. df_full['Industry'].head() is:

INDEX Industry
0 Service
1 Service
2 Trade
3 Service
4 Manufacturing

My second dataframe is called df_industry and has quantiles based on each of the industries. df_industry['profit_sales'] is:

Industry
Financial      0.25    0.025616
               0.50    0.219343
               0.75    0.410408
Manufacturing  0.25   -0.012373
               0.50    0.002032
               0.75    0.010331
Service        0.25   -0.012660
               0.50    0.003375
               0.75    0.064102
Trade          0.25   -0.102178
               0.50    0.001715
               0.75    0.018705
Transport      0.25   -0.042755
               0.50   -0.042755
               0.75    0.056487

I am trying to add a new column to the first dataframe that holds the 0.5 quantile for the industry given in each row's 'Industry' column.

The desired output, df_full[['Industry','quantile_05']].head(), should look like this:

INDEX Industry quantile_05
0 Service 0.003375
1 Service 0.003375
2 Trade 0.001715
3 Service 0.003375
4 Manufacturing 0.002032

So far I have tried the following, to no avail: df_full['quantile_05'] = df_full.apply(lambda x: df_industry['profit_sales'][df_full['Industry'][x]][0.5], axis=1)

Quang Hoang

It looks like you can do a map:

df_full['quantile_05'] = df_full['Industry'].map(df_industry['profit_sales'].unstack()[0.5])

Output:

             Industry  quantile_05
INDEX                             
0             Service     0.003375
1             Service     0.003375
2               Trade     0.001715
3             Service     0.003375
4       Manufacturing     0.002032
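
For anyone who wants to try this without the original data, here is a minimal, self-contained sketch. The two frames are rebuilt by hand from the values shown in the question (only three industries are included), and the level name "quantile" and the variable median_by_industry are my own assumptions, not something from the original data. It uses xs to pick the 0.5 level directly, which should give the same lookup Series as unstack()[0.5]:

```python
import pandas as pd

# Hand-built stand-in for df_full (assumed; only the 'Industry' column matters here).
df_full = pd.DataFrame(
    {"Industry": ["Service", "Service", "Trade", "Service", "Manufacturing"]}
)

# Hand-built stand-in for df_industry with an (Industry, quantile) MultiIndex.
# The level name "quantile" is an assumption; the original level is unnamed.
index = pd.MultiIndex.from_product(
    [["Manufacturing", "Service", "Trade"], [0.25, 0.5, 0.75]],
    names=["Industry", "quantile"],
)
df_industry = pd.DataFrame(
    {
        "profit_sales": [
            -0.012373, 0.002032, 0.010331,   # Manufacturing
            -0.012660, 0.003375, 0.064102,   # Service
            -0.102178, 0.001715, 0.018705,   # Trade
        ]
    },
    index=index,
)

# Take the cross-section at quantile 0.5 (second index level).
# The result is a Series indexed by Industry.
median_by_industry = df_industry["profit_sales"].xs(0.5, level=1)

# map looks up each Industry value in that Series' index.
df_full["quantile_05"] = df_full["Industry"].map(median_by_industry)
print(df_full)
```

Any industry in df_full that is missing from the lookup Series would come out as NaN rather than raising an error.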

If you want all three quantiles, you can do a merge as suggested by Kyle:

df_full.merge(df_industry['profit_sales'].unstack(),
              left_on='Industry',
              right_index=True,
              how='left')

Output:

             Industry      0.25       0.5      0.75
INDEX                                              
0             Service -0.012660  0.003375  0.064102
1             Service -0.012660  0.003375  0.064102
2               Trade -0.102178  0.001715  0.018705
3             Service -0.012660  0.003375  0.064102
4       Manufacturing -0.012373  0.002032  0.010331
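
Note that after the merge the new column labels are the floats 0.25, 0.5 and 0.75, which can be awkward to work with. Below is a small sketch of one way to give them string names before merging. The data is rebuilt by hand from the question, and the "q_" prefix and the names profit_sales, wide and merged are purely illustrative assumptions:

```python
import pandas as pd

# Hand-built stand-ins for the data in the question (assumed, three industries only).
df_full = pd.DataFrame({"Industry": ["Service", "Trade", "Manufacturing"]})

index = pd.MultiIndex.from_product(
    [["Manufacturing", "Service", "Trade"], [0.25, 0.5, 0.75]]
)
profit_sales = pd.Series(
    [
        -0.012373, 0.002032, 0.010331,   # Manufacturing
        -0.012660, 0.003375, 0.064102,   # Service
        -0.102178, 0.001715, 0.018705,   # Trade
    ],
    index=index,
)

# One row per industry, one column per quantile.
wide = profit_sales.unstack()

# Replace the float labels with strings: q_0.25, q_0.5, q_0.75 (hypothetical naming).
wide.columns = [f"q_{q:g}" for q in wide.columns]

# Left merge keeps every row of df_full, in its original order.
merged = df_full.merge(wide, left_on="Industry", right_index=True, how="left")
print(merged)
```

With how='left', an industry that has no row in the quantile table would get NaN in all three quantile columns.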
