Using Python how can I merge two columns and overwrite data from one column only if data in other column exists?

smpat04

I am trying to merge some data and have been unable to get the results I'm looking for. I have two data frames with two columns each: CID and Notional. DF1 has every CID and DF2 has just some of the CIDs. I want to merge DF2's data into DF1 so that if DF2 has a value for a CID it overwrites DF1's value, and if not, DF1 retains its data.

I have tried using pd.merge and I end up with a DataFrame that has columns CID, Notional_x, Notional_y. I have also tried 'update', but it just replaces all of the old DataFrame's data.

Here's an example of what I'm looking for:

#Example of Data (couldn't find a better way to explain this)
df1 = pd.DataFrame({'CID':[1,25,100], 'Notional': [1000, 2500, 5500]})
df2 = pd.DataFrame({'CID':[25], 'Notional': [0]})

the output would return a DataFrame that looks like this:

pd.DataFrame({'CID': [1,25,100], 'Notional': [1000,0,5500]})

(note that the merge set CID 25 to 0, the value found in df2, without changing anything else)

The documentation suggests that 'merge' should accomplish it but it just... doesn't.

test = pd.merge(df1, df2, how='left', on='CID')

This seems to merge the dataframes without merging the data (it just appends a column on the end)

Any help would be greatly appreciated. Thank you.

Randall Goodwin

In your case, both the left and right tables of the join have a data column with the same name ("Notional") that is not part of the merge key ("CID"). The merge function has no option for deciding which value to use for Notional, so it keeps both and renames them Notional_x (from df1) and Notional_y (from df2).
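As a quick illustration of that behavior, here is a minimal sketch using the sample data from the question. The variable name merged is only for illustration.

```python
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame({'CID': [1, 25, 100], 'Notional': [1000, 2500, 5500]})
df2 = pd.DataFrame({'CID': [25], 'Notional': [0]})

# A left merge keeps every row of df1. Because both frames have a
# non-key column called 'Notional', pandas keeps both and adds the
# default suffixes '_x' (left frame) and '_y' (right frame).
merged = df1.merge(df2, how='left', on='CID')

print(list(merged.columns))
print(merged)
```

Rows whose CID is missing from df2 get NaN in Notional_y, which is what makes it possible to tell afterwards which values should be overwritten.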

You can add one more line of code though to take care of this.

import pandas as pd
import numpy as np

# make the data
df1 = pd.DataFrame({'CID':[1,25,100], 'Notional': [1000, 2500, 5500]})
df2 = pd.DataFrame({'CID':[25], 'Notional': [0]})

# merge the data
test = df1.merge(df2, how='left', on='CID')

# If Notional from df2 is not missing, use it; otherwise use df1's Notional
test['Notional'] = np.where(test['Notional_y'].isna(), test['Notional_x'], test['Notional_y'])

You could then drop Notional_x and Notional_y from the dataframe, leaving your newly created Notional.
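Putting the steps together, the clean-up might look like the sketch below. The final cast to int64 is an assumption on my part: the NaN values in Notional_y turn the combined column into floats, and casting back only makes sense if your notionals are always whole numbers.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame({'CID': [1, 25, 100], 'Notional': [1000, 2500, 5500]})
df2 = pd.DataFrame({'CID': [25], 'Notional': [0]})

# Left merge, then prefer df2's value wherever it exists
test = df1.merge(df2, how='left', on='CID')
test['Notional'] = np.where(test['Notional_y'].isna(),
                            test['Notional_x'],
                            test['Notional_y'])

# Drop the two suffixed helper columns, leaving only CID and Notional
test = test.drop(columns=['Notional_x', 'Notional_y'])

# Assumption: notionals are whole numbers, so undo the float conversion
test['Notional'] = test['Notional'].astype('int64')

print(test)
```

With the sample data this leaves CID 25 at 0 and the other two rows unchanged.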
