How can I drop duplicate rows from a dataframe based on a filter or condition on another column?

Mazil_tov998

I have a large dataframe in long format, which can be created as follows:

import pandas as pd
df = pd.DataFrame({'period':['2021-01-01','2021-02-01','2021-03-01','2021-03-01','2021-04-01','2021-04-01'],
                    'indi':['pop','vacced','tot_num_cases','tot_num_cases','pop','pop'],
                    'value':[10000,200,8999,8999,27000,27000]})

I want to drop duplicate rows based on the condition below:

df[df['indi'] == 'tot_num_cases'].drop_duplicates(keep="last")  

but only on the rows that match the condition. How can I do that without dropping all duplicate rows in the dataframe? The result would look like:

[Image: final result]

Guy

Split the condition into two parts: the row is a duplicate in the indi column (keeping the last occurrence), and its indi value is 'tot_num_cases'. Drop only the rows where both parts are true:

df = df[~(df.duplicated(subset='indi', keep='last') & df['indi'].eq('tot_num_cases'))]
print(df)

Output

       period           indi  value
0  2021-01-01            pop  10000
1  2021-02-01         vacced    200
3  2021-03-01  tot_num_cases   8999
4  2021-04-01            pop  27000
5  2021-04-01            pop  27000
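
Note that subset='indi' treats any two rows with the same indi as duplicates, even when period or value differ. If only fully identical rows should be dropped, as in the question's drop_duplicates(keep="last") attempt, a variant that compares all columns may be closer to what is wanted. The following is a minimal sketch under that assumption; the helper name drop_duplicates_where and the extra 2021-05-01 row are made up for illustration and are not part of pandas or the original data.

```python
import pandas as pd

df = pd.DataFrame({'period': ['2021-01-01', '2021-02-01', '2021-03-01',
                              '2021-03-01', '2021-04-01', '2021-04-01'],
                   'indi': ['pop', 'vacced', 'tot_num_cases',
                            'tot_num_cases', 'pop', 'pop'],
                   'value': [10000, 200, 8999, 8999, 27000, 27000]})


def drop_duplicates_where(frame, mask, keep='last'):
    """Hypothetical helper: drop fully duplicated rows, but only where mask is True."""
    # duplicated() without subset compares every column of each row
    is_dup = frame.duplicated(keep=keep) & mask
    return frame[~is_dup]


# On the question's data this matches the output shown above
result = drop_duplicates_where(df, df['indi'].eq('tot_num_cases'))

# Hypothetical extra row: same indi, but a different period and value
extra = pd.concat(
    [df, pd.DataFrame({'period': ['2021-05-01'],
                       'indi': ['tot_num_cases'],
                       'value': [9100]})],
    ignore_index=True)

# Full-row comparison drops only the exact repeat (row 2)
result_extra = drop_duplicates_where(extra, extra['indi'].eq('tot_num_cases'))

# subset='indi' keeps only the last 'tot_num_cases' row (drops rows 2 and 3)
by_indi = extra[~(extra.duplicated(subset='indi', keep='last')
                  & extra['indi'].eq('tot_num_cases'))]

print(result_extra)
print(by_indi)
```

Which variant is appropriate depends on whether rows with the same indi but different period or value should count as duplicates.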
