How to fill Non-Null values from some columns in Pandas Dataframe into a new column? How to use np.where() for multiple conditions?

one_random_python_learner

I have a question about np.where().

Currently, I have 2 columns; each contains Null values and categorical values. The values in the two columns are distinct and do not overlap.

I want to copy all the Non-Null values from these 2 columns into a new column, and fill the remaining NaN values in the new column with a categorical value.

My idea is to use np.where():

df['C']=np.where(df['A']=='user1', 'user1',(df['B']=='user2','user2','user3'))

The basic idea is: if df['A']=='user1', fill 'user1' into the new column first; elif df['B']=='user2', fill 'user2' into the new column as well; else fill 'user3' for all the remaining NaN values.

However, this raises an error (a ValueError, not a syntax error):

ValueError: operands could not be broadcast together with shapes (544,) () (3,) 

Thanks for the help always!

Sample data:

A       B       C       Desired col C
user1   Null    Null    user1
user1   Null    Null    user1
user1   Null    Null    user1
user1   Null    Null    user1
Null    user2   Null    user2
Null    user2   Null    user2
Null    user2   Null    user2
Null    user2   Null    user2
Null    user2   Null    user2
Null    user2   Null    user2
Null    Null    Null    user3
Null    Null    Null    user3
Null    Null    Null    user3
Null    Null    Null    user3
CreekGeek

Assuming your initial df is only cols A, B, and C:

# convert the values you don't want (the string 'Null') to NaN
df = df.where(df != 'Null')

# temporary list
lst = []

# iterate row-wise
for _, row in df.iterrows():
    # nunique() ignores NaN, so 1 means the row holds exactly one distinct non-null value
    if row.nunique() == 1:
        # find the value that is a string and append it to the list
        a, b, c = row  # *this is specific to your example with three cols*
        for i in [a, b, c]:
            if isinstance(i, str):
                lst.append(i)
                break  # one entry per row, even if the value repeats
    else:
        # otherwise (e.g. every value in the row is NaN) append the specified value
        lst.append('user3')

df['D'] = lst

It's verbose and will be a bit slow for very large dfs, but it produces your expected result. And it's readable.
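If speed matters, a vectorized version along the lines of what the question attempted may be worth trying. The ValueError in the question most likely comes from passing the tuple (df['B']=='user2','user2','user3') as the third argument to np.where() instead of a second np.where() call; the (3,) in the message is the shape of that tuple. Below is a minimal sketch, assuming the missing entries are the literal string 'Null' as in the sample data. The small frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up frame mirroring the shape of the sample data in the question
df = pd.DataFrame({
    "A": ["user1", "user1", "Null", "Null", "Null"],
    "B": ["Null", "Null", "user2", "user2", "Null"],
})

# Treat the literal string 'Null' as missing
df = df.where(df != "Null")

# Nested np.where: use A where present, else B where present, else 'user3'
df["C"] = np.where(
    df["A"].notna(), df["A"],
    np.where(df["B"].notna(), df["B"], "user3"),
)
```

Testing with notna() rather than comparing against 'user1' and 'user2' keeps the sketch working if the columns hold other category labels.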

The loop would be simpler if you didn't have the rows with all nulls. A one-liner using df.where(), .apply(lambda), or a masked array would then be more feasible.
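One possible one-liner, shown here as a sketch rather than the approach used above, chains Series.fillna() so that the all-null rows are handled by the final fallback. It assumes the missing entries are real NaN/None values (for example after the df.where(df != 'Null') step), and the tiny frame is made up for illustration.

```python
import pandas as pd

# Made-up frame: missing entries are real None/NaN values, not the string 'Null'
df = pd.DataFrame({
    "A": ["user1", None, None],
    "B": [None, "user2", None],
})

# Take A, fall back to B where A is missing, then to a default label
df["C"] = df["A"].fillna(df["B"]).fillna("user3")
```

Because the two columns never overlap, the order of A and B in the chain should not matter here; it would matter if both could be filled on the same row.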
