I've got a pandas DataFrame that looks like this:
molecule species
0 a [dog]
1 b [horse, pig]
2 c [cat, dog]
3 d [cat, horse, pig]
4 e [chicken, pig]
and I'd like to extract a DataFrame containing only those rows whose species list contains any of selection = ['cat', 'dog'].
So the result should look like this:
molecule species
0 a [dog]
1 c [cat, dog]
2 d [cat, horse, pig]
What would be the simplest way to do this?
For testing:
import pandas as pd
selection = ['cat', 'dog']
df = pd.DataFrame({'molecule': ['a','b','c','d','e'], 'species' : [['dog'], ['horse','pig'],['cat', 'dog'], ['cat','horse','pig'], ['chicken','pig']]})
IIUC, you can re-create the species lists as their own DataFrame, then use isin with any. This should be faster than apply:
df[pd.DataFrame(df.species.tolist()).isin(selection).any(axis=1)]
Out[64]:
molecule species
0 a [dog]
2 c [cat, dog]
3 d [cat, horse, pig]
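To make the one-liner easier to follow, here is a sketch that splits it into steps, using the test data from the question. The names wide, mask, result, mask_explode and result_explode are mine, not from the original answer. Passing index=df.index is an extra I added so the mask stays aligned if df does not have the default 0..n-1 index. The explode variant at the end is an alternative I'm suggesting, assuming pandas 0.25 or newer; I have not timed it against the isin approach.

```python
import pandas as pd

selection = ['cat', 'dog']
df = pd.DataFrame({
    'molecule': ['a', 'b', 'c', 'd', 'e'],
    'species': [['dog'], ['horse', 'pig'], ['cat', 'dog'],
                ['cat', 'horse', 'pig'], ['chicken', 'pig']],
})

# Step 1: spread each list across columns, one row per original row.
# Shorter lists are padded with missing values.
wide = pd.DataFrame(df['species'].tolist(), index=df.index)

# Step 2: element-wise membership test, then "does any cell in the row match?"
# Missing values never match, so the padding is harmless.
mask = wide.isin(selection).any(axis=1)

# Step 3: boolean indexing keeps the original index labels (0, 2, 3).
result = df[mask]
print(result)

# Alternative sketch (assumes pandas >= 0.25 for Series.explode):
# one row per list element, test membership, then collapse back per original row.
mask_explode = df['species'].explode().isin(selection).groupby(level=0).any()
result_explode = df[mask_explode]
```

Note that the filtered frame keeps the original labels 0, 2, 3. If you want 0, 1, 2 as in the question, add .reset_index(drop=True) to the result.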