Pandas groupby conditional filter

Posted by ccsv in Dev

ccsv

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({'First': ['Sam', 'Greg', 'Steve', 'Sam',
                             'Jill', 'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'],
                   'Last': ['Stevens', 'Hamcunning', 'Strange', 'Stevens',
                            'Vargas', 'Simon', 'Purple', 'Green', 'Simon', 'Simon'],
                   'Address': ['112 Fake St',
                               '13 Crest St',
                               '14 Main St',
                               '112 Fake St',
                               '2 Morningwood',
                               '7 Cotton Dr',
                               '14 Main St',
                               '20 Main St',
                               '7 Cotton Dr',
                               '7 Cotton Dr'],
                   'Status': ['Infected', '', 'Infected', '', '', '', '','', '', 'Infected'],
                   })

I then apply the following groupby code:

df_index = df.groupby(['Address', 'Last']).filter(lambda x: (x['Status'] == 'Infected').any()).index
df.loc[df_index, 'Status'] = 'Infected'

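This marks every row in a matching group as 'Infected'. Is there a way to select only the values that are about to be updated, so that they can be marked with a different value instead? For example: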

df2 = df.copy(deep=True)
df2['Status'] = ['Infected', '', 'Infected', 'Infected2', '', 'Infected2', '', '', 'Infected2', 'Infected']

Marius

I think this gives you the result you want, although the approach is slightly different:

def infect_new_people(group):
    # Work on a copy so the group handed in by apply is not mutated
    status = group['Status'].copy()
    if (status == 'Infected').any():
        # Only affect people not already infected
        status[status != 'Infected'] = 'Infected2'
    return status

# Need group_keys=False so that each group has the same index
#   as the original dataframe
df['Status'] = df.groupby(['Address', 'Last'], group_keys=False).apply(infect_new_people)

df
Out[36]: 
         Address    First        Last     Status
0    112 Fake St      Sam     Stevens   Infected
1    13 Crest St     Greg  Hamcunning           
2     14 Main St    Steve     Strange   Infected
3    112 Fake St      Sam     Stevens  Infected2
4  2 Morningwood     Jill      Vargas           
5    7 Cotton Dr     Bill       Simon  Infected2
6     14 Main St      Nod      Purple           
7     20 Main St  Mallory       Green           
8    7 Cotton Dr     Ping       Simon  Infected2
9    7 Cotton Dr    Lamar       Simon   Infected
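
If the per-group function turns out to be slow on a larger frame, the same labelling can probably be done without apply. The following is only a minimal sketch, assuming the same df as in the question and the same rule (any non-infected row that shares both Address and Last with an infected row becomes 'Infected2'). The names is_infected and group_has_infected are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'First': ['Sam', 'Greg', 'Steve', 'Sam', 'Jill',
              'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'],
    'Last': ['Stevens', 'Hamcunning', 'Strange', 'Stevens', 'Vargas',
             'Simon', 'Purple', 'Green', 'Simon', 'Simon'],
    'Address': ['112 Fake St', '13 Crest St', '14 Main St', '112 Fake St',
                '2 Morningwood', '7 Cotton Dr', '14 Main St', '20 Main St',
                '7 Cotton Dr', '7 Cotton Dr'],
    'Status': ['Infected', '', 'Infected', '', '', '', '', '', '', 'Infected'],
})

# True for rows that are already marked as infected
is_infected = df['Status'].eq('Infected')

# True for every row whose (Address, Last) group contains at least one
# infected row; transform('any') broadcasts the group result back to rows
group_has_infected = (
    is_infected.groupby([df['Address'], df['Last']]).transform('any')
)

# Relabel only the rows in an infected group that are not infected themselves
df.loc[group_has_infected & ~is_infected, 'Status'] = 'Infected2'

print(df[['Address', 'First', 'Last', 'Status']])
```

Because the two boolean masks are kept separate, the rows that are about to change can be inspected (or given any other label) before the assignment is made.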
