Pandas: calculate a new column from multiple other columns and a subset of rows

Sid

I'm curious how to calculate a new DataFrame column based on other existing columns and a subset of rows.

Example: I have a DataFrame of rooms (111, 222, 333, 444). Each room can have beds AAA and BBB, and each bed has a status of Occupied or Free. I want to calculate a new column, "Active", using the following criteria:

  1. Only one bed per room is active.
  2. An Occupied bed takes first preference to be active in a room.
  3. If a room has only one bed, that bed must be active irrespective of its status.

Sample DataFrame:

   Room  Bed    Status
0   111  AAA  Occupied
1   111  BBB      Free
2   222  AAA      Free
3   333  AAA      Free
4   333  BBB  Occupied
5   444  BBB  Occupied

Expected Output:

   Room  Bed    Status  Active
0   111  AAA  Occupied    True
1   111  BBB      Free   False
2   222  AAA      Free    True
3   333  AAA      Free   False
4   333  BBB  Occupied    True
5   444  BBB  Occupied    True

I considered converting this to a dictionary and looping through it, but I have a strong feeling this can be done with pandas built-in functions.

TIA

jezrael

Use:

print(df)
   Room  Bed    Status
0   111  AAA  Occupied
1   111  BBB      Free
2   222  AAA      Free
3   333  AAA      Free
4   333  BBB  Occupied
5   444  BBB  Occupied
6   555  AAA  Occupied
7   555  BBB  Occupied

# True where the bed is Occupied
m1 = df['Status'].eq('Occupied')
# True for the first Occupied bed per Room (matters for Room 555):
# other statuses are masked to NaN before checking for duplicates
m2 = ~(df.assign(Status=df['Status'].where(m1))
         .duplicated(['Room', 'Status']))
# True where the Room has exactly one bed
m3 = df['Room'].map(df['Room'].value_counts()).eq(1)

# combine masks: & is element-wise AND, | is element-wise OR
df['Active'] = (m1 & m2) | m3
print(df)
   Room  Bed    Status  Active
0   111  AAA  Occupied    True
1   111  BBB      Free   False
2   222  AAA      Free    True
3   333  AAA      Free   False
4   333  BBB  Occupied    True
5   444  BBB  Occupied    True
6   555  AAA  Occupied    True
7   555  BBB  Occupied   False
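
As a self-contained check of the masks above, the following sketch rebuilds the frame and adds a room 666 whose two beds are both Free. That room is not in the data above, so treat it as an assumed extra case, and `first_attempt` is just an illustrative name. Under that assumption, `(m1 & m2) | m3` marks no bed in room 666 as active, because neither bed is Occupied and the room has more than one bed.

```python
import pandas as pd

# Frame from the answer, plus an assumed room 666 with two Free beds
df = pd.DataFrame({
    'Room': [111, 111, 222, 333, 333, 444, 555, 555, 666, 666],
    'Bed': ['AAA', 'BBB', 'AAA', 'AAA', 'BBB',
            'BBB', 'AAA', 'BBB', 'AAA', 'BBB'],
    'Status': ['Occupied', 'Free', 'Free', 'Free', 'Occupied',
               'Occupied', 'Occupied', 'Occupied', 'Free', 'Free'],
})

# True where the bed is Occupied
m1 = df['Status'].eq('Occupied')
# True for the first Occupied bed per Room
m2 = ~df.assign(Status=df['Status'].where(m1)).duplicated(['Room', 'Status'])
# True where the Room has exactly one bed
m3 = df['Room'].map(df['Room'].value_counts()).eq(1)

first_attempt = (m1 & m2) | m3

# Room 666 ends up with no active bed under these three masks
print(df.assign(Active=first_attempt))
```

This is the case the EDIT addresses.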

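EDIT: To also handle rooms where every bed is Free (Room 666 below), mark the first Free bed in such rooms as active: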

# True where the bed is Occupied
m1 = df['Status'].eq('Occupied')
# True for the first Occupied bed per Room (matters for Room 555)
m2 = ~(df.assign(Status=df['Status'].where(m1))
         .duplicated(['Room', 'Status']))

# True where the bed is Free
m4 = df['Status'].eq('Free')
# True for the first Free bed per Room (matters for Room 666)
m5 = ~(df.assign(Status=df['Status'].where(m4))
         .duplicated(['Room', 'Status']))

# True where every bed in the Room is Free
m6 = m4.groupby(df['Room']).transform('all')

# first Free bed in an all-Free room, or first Occupied bed otherwise
df['Active'] = (m5 & m6) | (m1 & m2)
print(df)
   Room  Bed    Status  Active
0   111  AAA  Occupied    True
1   111  BBB      Free   False
2   222  AAA      Free    True
3   333  AAA      Free   False
4   333  BBB  Occupied    True
5   444  BBB  Occupied    True
6   555  AAA  Occupied    True
7   555  BBB  Occupied   False
8   666  AAA      Free    True
9   666  BBB      Free   False
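
A more compact alternative, offered only as a sketch and not as part of the answer above: each room needs exactly one active bed, the first Occupied bed if there is one and the first bed otherwise, so a per-room `idxmax` on an Occupied flag can pick that row directly. It assumes the index labels are unique and that row order defines which bed counts as "first". The names `occupied` and `active_idx` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Room': [111, 111, 222, 333, 333, 444, 555, 555, 666, 666],
    'Bed': ['AAA', 'BBB', 'AAA', 'AAA', 'BBB',
            'BBB', 'AAA', 'BBB', 'AAA', 'BBB'],
    'Status': ['Occupied', 'Free', 'Free', 'Free', 'Occupied',
               'Occupied', 'Occupied', 'Occupied', 'Free', 'Free'],
})

# 1 for Occupied beds, 0 otherwise
occupied = df['Status'].eq('Occupied').astype(int)

# idxmax returns the index label of the first maximum in each Room:
# the first Occupied bed if any, else the first bed of the Room
active_idx = occupied.groupby(df['Room']).idxmax()

# Flag exactly those rows (assumes unique index labels)
df['Active'] = df.index.isin(active_idx)
print(df)
```

Because `idxmax` always returns one label per group, single-bed rooms and all-Free rooms need no separate mask here.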
