Removing rows in a pandas DataFrame where the row contains a string present in a list?

Pyderman

I know how to remove rows from a single-column ('From') pandas DataFrame where the row contains a given string, e.g. given df and someString:

df = df[~df.From.str.contains(someString)]

Now I want to do something similar, but this time remove any row that contains any of the strings in a list. Were I not using pandas, I would use a for loop with the if ... not ... in approach. But how do I take advantage of pandas' own functionality to achieve this? Given the list of items to remove, ignorethese, extracted from EMAILS_TO_IGNORE, a file of comma-separated strings, I tried:

with open(EMAILS_TO_IGNORE) as emails:
    ignorethese = emails.read().split(', ')
    df = df[~df.From.isin(ignorethese)]

Am I convoluting matters by first decomposing the file into a list? Given that it is a plain text file of comma-separated values, can I bypass this with something simpler?

Anand S Kumar

Series.str.contains supports regular expressions, so you can build a regex from your list of emails to ignore by joining them with | (regex OR) and pass that to contains. Example:

df[~df.From.str.contains('|'.join(ignorethese))]

Demo:

In [109]: df
Out[109]:
                                         From
0         Grey Caulfu <[email protected]>
1  Deren Torculas <[email protected]>
2    Charlto Youna <[email protected]>

In [110]: ignorelist = ['[email protected]','[email protected]']

In [111]: ignorere = '|'.join(ignorelist)

In [112]: df[~df.From.str.contains(ignorere)]
Out[112]:
                                       From
2  Charlto Youna <[email protected]>

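Note that, as mentioned in the documentation, str.contains uses re.search(), so each list entry is treated as a regular expression: characters such as . and + in an email address are interpreted as regex metacharacters unless you escape them with re.escape().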
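
Because the joined string is searched as a regex, it may be safer to escape each entry before joining. The following is a minimal sketch, not a definitive implementation: the helper name drop_rows_matching, the sample addresses, and the raw string standing in for the contents of the EMAILS_TO_IGNORE file are all made up for illustration.

```python
import re

import pandas as pd


def drop_rows_matching(df, column, substrings):
    """Return df without the rows whose `column` contains any of `substrings`."""
    # Strip whitespace/newlines and skip empty entries; an empty alternative
    # in the pattern would match (and therefore drop) every row.
    cleaned = [s.strip() for s in substrings if s.strip()]
    if not cleaned:
        return df
    # re.escape makes '.', '+', etc. match literally instead of as regex syntax.
    pattern = "|".join(re.escape(s) for s in cleaned)
    # na=False treats missing values as "no match", so those rows are kept.
    mask = df[column].str.contains(pattern, regex=True, na=False)
    return df[~mask]


# Hypothetical sample data.
df = pd.DataFrame({"From": [
    "Alice <alice@example.com>",
    "Bob <bob+news@example.org>",
    "Carol <carol@example.net>",
]})

# Stand-in for emails.read() on a comma-separated file.
raw = "alice@example.com, bob+news@example.org\n"

result = drop_rows_matching(df, "From", raw.split(","))
print(result)
```

Without the escaping step, the + in the second address would be read as a quantifier and that row would not be matched. Splitting on "," and stripping each piece also tolerates files that are inconsistent about spaces after the commas.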
