I have a pandas DataFrame with 4 columns, the first being "ID Number". I am trying to filter on "ID Number" so that rows sharing the same value are bundled together. After that I want to write each group of rows with the same value to its own csv file, named after that value.
DataFrame:
ID Number col2 col3 DATE
0 111 0.5 -0.6 20160104
1 118 -0.1 -0.6 20160104
2 11D 0.3 -1.1 20160104
3 111 -0.7 -0.9 20150102
***Output I need:***
ID Number col2 col3 DATE
0 111 0.5 -0.6 20160104
1 111 -0.7 -0.9 20150102
I have tried a few things, but I could not find anything online about how to filter a column and then extract the matching rows. Thank you!
You can use duplicated with param keep=False so it returns True for all duplicated rows, and use that to mask the df:
In [16]:
df[df['ID Number'].duplicated(keep=False)]
Out[16]:
ID Number col2 col3 DATE
0 111 0.5 -0.6 20160104
3 111 -0.7 -0.9 20150102
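For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same mask. The ID values are assumed to be strings (because of "11D"), and the variable names mask and repeated are just illustrative:

```python
import pandas as pd

# Sample data from the question; IDs are kept as strings because of "11D".
df = pd.DataFrame({
    "ID Number": ["111", "118", "11D", "111"],
    "col2": [0.5, -0.1, 0.3, -0.7],
    "col3": [-0.6, -0.6, -1.1, -0.9],
    "DATE": [20160104, 20160104, 20160104, 20150102],
})

# keep=False marks every occurrence of a repeated value as True,
# not just the second and later ones.
mask = df["ID Number"].duplicated(keep=False)

# Boolean indexing keeps only the rows whose ID appears more than once.
repeated = df[mask]
print(repeated)
```

Note that the original index labels (0 and 3) are preserved; call reset_index(drop=True) on the result if you want them renumbered from 0.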
For the second part you can do:
gp = df[df['ID Number'].duplicated(keep=False)].groupby('ID Number')
gp.apply(lambda x: x.to_csv(str(x.name) + '.csv'))
EDIT
Actually if you're just wanting to write all rows with the same ID number to a named csv then:
df.groupby('ID Number').apply(lambda x: x.to_csv(str(x.name) + '.csv'))
Should do what you want
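If you prefer something more explicit than apply, a plain loop over the groups does the same job. This is only a sketch: the temporary output directory, the index=False choice and the names out_dir and written are my assumptions, not part of the question:

```python
import os
import tempfile

import pandas as pd

# Sample data from the question; IDs are kept as strings because of "11D".
df = pd.DataFrame({
    "ID Number": ["111", "118", "11D", "111"],
    "col2": [0.5, -0.1, 0.3, -0.7],
    "col3": [-0.6, -0.6, -1.1, -0.9],
    "DATE": [20160104, 20160104, 20160104, 20150102],
})

# Assumption: write into a throwaway directory; swap in your own path.
out_dir = tempfile.mkdtemp()
written = []

# Iterating a groupby yields (key, sub-DataFrame) pairs, one per distinct ID.
for id_number, group in df.groupby("ID Number"):
    path = os.path.join(out_dir, f"{id_number}.csv")
    # index=False leaves the original row labels out of the file.
    group.to_csv(path, index=False)
    written.append(path)

print(written)
```

This writes one file per distinct ID, including IDs that occur only once. To write only the repeated IDs, filter with duplicated(keep=False) before grouping.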