How can I create a column in an actual dataframe by indexing another dataframe using the values in two columns from the actual dataframe

ManuelVillamil

Good day. I have two datasets (df1, df2). I am trying to fill the column 'values' in df2 by using the column 'site_before' as the row label in df1 and the column 'site' as the column label in df1.

The dataset df1:

       ANA01  PHO01  ATL  BAL12  BOS07
ANA01      0      0    3      3      3
PHO01      0      0    3      3      3
ATL       -3     -3    0      0      0
BAL12     -3     -3    0      0      0
BOS07     -3     -3    0      0      0

The first column holds the row index labels.

The dataset df2:

     Game_ID       site_before  site   values
1    ANA199804010  ANA01        ANA01
3    ANA199804020  ANA01        ATL
5    ANA199804030  ANA01        BAL12
7    ANA199804040  ANA01        BOS07
9    ANA199804050  ANA01        ANA01
674  BOS199804300  BOS07        BOS07
31   ANA199805010  BOS07        ANA01
33   ANA199805020  PHO01        ANA01
35   ANA199805030  PHO01        PHO01
37   ANA199805040  PHO01        ATL
39   ANA199805050  PHO01        BAL12

I tried to do:

df2['values'] = df1.loc[df2['site_before'], df2['site']].values

but I got this error: ValueError: Wrong number of items passed 4864, placement implies 1

The result I am expecting is:

     Game_ID       site_before  site   values
1    ANA199804010  ANA01        ANA01       0
3    ANA199804020  ANA01        ATL         3
5    ANA199804030  ANA01        BAL12       3
7    ANA199804040  ANA01        BOS07       3
9    ANA199804050  ANA01        ANA01       0
674  BOS199804300  BOS07        BOS07       0
31   ANA199805010  BOS07        ANA01      -3
33   ANA199805020  PHO01        ANA01       0
35   ANA199805030  PHO01        PHO01       0
37   ANA199805040  PHO01        ATL         3
39   ANA199805050  PHO01        BAL12       3

jezrael

Use DataFrame.join with a new MultiIndex Series created by DataFrame.stack:

df2 = df2.join(df1.stack().rename('new').rename_axis(('site_before','site')),
               on=['site_before','site'])
print(df2)
          Game_ID site_before   site  new
1    ANA199804010       ANA01  ANA01    0
3    ANA199804020       ANA01    ATL    3
5    ANA199804030       ANA01  BAL12    3
7    ANA199804040       ANA01  BOS07    3
9    ANA199804050       ANA01  ANA01    0
674  BOS199804300       BOS07  BOS07    0
31   ANA199805010       BOS07  ANA01   -3
33   ANA199805020       PHO01  ANA01    0
35   ANA199805030       PHO01  PHO01    0
37   ANA199805040       PHO01    ATL    3
39   ANA199805050       PHO01  BAL12    3
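
As background on why the attempt in the question fails: passing two label sequences to .loc selects every row/column combination, so the result is an n x n block rather than one value per row of df2. The sketch below rebuilds df1 from the question and uses a shortened four-row df2 (Game_ID is left out); the names sites, block, lookup and out are only for illustration and are not part of the original answer.

```python
import pandas as pd

sites = ['ANA01', 'PHO01', 'ATL', 'BAL12', 'BOS07']
df1 = pd.DataFrame(
    [[0, 0, 3, 3, 3],
     [0, 0, 3, 3, 3],
     [-3, -3, 0, 0, 0],
     [-3, -3, 0, 0, 0],
     [-3, -3, 0, 0, 0]],
    index=sites, columns=sites)

# Shortened df2 (assumption: four sample rows taken from the question)
df2 = pd.DataFrame(
    {'site_before': ['ANA01', 'ANA01', 'BOS07', 'PHO01'],
     'site': ['ANA01', 'ATL', 'ANA01', 'BAL12']},
    index=[1, 3, 31, 39])

# .loc with two label sequences returns the cross product of the labels
block = df1.loc[df2['site_before'], df2['site']]
print(block.shape)  # 4 x 4 block instead of 4 values

# stack() reshapes df1 into a Series keyed by (row label, column label),
# which join() can then match against the two columns of df2
lookup = df1.stack().rename('new').rename_axis(['site_before', 'site'])
out = df2.join(lookup, on=['site_before', 'site'])
print(out)
```

Because join is a left join on df2, the original index labels of df2 (1, 3, 31, 39 here) are kept.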

An alternative is to use DataFrame.melt with DataFrame.merge and a left join:

df3 = df1.rename_axis('site_before').reset_index().melt('site_before', var_name='site', value_name='new')

df2 = df2.merge(df3, how='left')
print(df2)
         Game_ID site_before   site  new
0   ANA199804010       ANA01  ANA01    0
1   ANA199804020       ANA01    ATL    3
2   ANA199804030       ANA01  BAL12    3
3   ANA199804040       ANA01  BOS07    3
4   ANA199804050       ANA01  ANA01    0
5   BOS199804300       BOS07  BOS07    0
6   ANA199805010       BOS07  ANA01   -3
7   ANA199805020       PHO01  ANA01    0
8   ANA199805030       PHO01  PHO01    0
9   ANA199805040       PHO01    ATL    3
10  ANA199805050       PHO01  BAL12    3
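
Note that merge returns a fresh 0..n-1 index, as the output above shows. If the original labels of df2 matter, they can be put back afterwards. A minimal sketch, assuming df1 has unique row and column labels (so the left merge keeps exactly one output row per row of df2, in the same order) and using a shortened four-row df2; the name merged is only for illustration.

```python
import pandas as pd

sites = ['ANA01', 'PHO01', 'ATL', 'BAL12', 'BOS07']
df1 = pd.DataFrame(
    [[0, 0, 3, 3, 3],
     [0, 0, 3, 3, 3],
     [-3, -3, 0, 0, 0],
     [-3, -3, 0, 0, 0],
     [-3, -3, 0, 0, 0]],
    index=sites, columns=sites)

# Shortened df2 (assumption: four sample rows taken from the question)
df2 = pd.DataFrame(
    {'site_before': ['ANA01', 'ANA01', 'BOS07', 'PHO01'],
     'site': ['ANA01', 'ATL', 'ANA01', 'BAL12']},
    index=[1, 3, 31, 39])

# Long format: one row per (site_before, site) pair, values in column 'new'
df3 = (df1.rename_axis('site_before')
          .reset_index()
          .melt('site_before', var_name='site', value_name='new'))

# Left merge on both key columns, then restore the labels of df2.
# This is only safe because each key pair occurs once in df3.
merged = df2.merge(df3, how='left', on=['site_before', 'site'])
merged.index = df2.index
print(merged)
```

Without value_name, melt would name the value column 'value' rather than 'new'.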
