Filter a DataFrame using the values in a column

Paulo Cortez

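I have a DataFrame containing employee names, employee e-mails, manager names, and manager e-mails. I need to filter this DataFrame using all the unique values of the manager e-mail column, keeping only those that also appear in the employee e-mail column. This ensures that every manager is also in the database as an employee.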

For example, given this DataFrame:

Employee Name            Employee E-mail            Manager Name            Manager E-mail
Pedro                    [email protected]            Paul                    [email protected]
Paul                     N/A                        Carlos                  [email protected]
Richard                  [email protected]          Josh                    [email protected]
Carlos                   [email protected]           Peter                   #
Maria                    #                          Bob                     N/A
Josh                     [email protected]             Carlos                  [email protected]

This should return the following DataFrame:

Employee Name            Employee E-mail            Manager Name            Manager E-mail
Richard                  [email protected]          Josh                    [email protected]
Josh                     [email protected]             Carlos                  [email protected]

What is the best way to do this?

mozway

IIUC, you can use masks and boolean indexing:

# is the employee email valid? you can use a different pattern, e.g. r'@company\.com'
# na=False treats missing values as invalid without a separate fillna step
m1 = df['Employee E-mail'].str.contains('@', na=False)
# is the manager email valid?
m2 = df['Manager E-mail'].str.contains('@', na=False)
# is the manager also an employee?
m3 = df['Manager E-mail'].isin(df['Employee E-mail'])

# keep only the rows where all three conditions are True
df2 = df.loc[m1 & m2 & m3]

Output:

  Employee Name    Employee E-mail Manager Name    Manager E-mail
2       Richard  [email protected]         Josh    [email protected]
5          Josh     [email protected]       Carlos  [email protected]
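
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same masking approach. The e-mail addresses in the post are obfuscated, so the `example.com` values below are placeholders made up to mirror the shape of the sample data, and the helper name `filter_managers_who_are_employees` is hypothetical, not something from pandas.

```python
import pandas as pd

# Hypothetical sample data: the real addresses are not visible in the post,
# so these example.com values are placeholders with the same structure.
df = pd.DataFrame({
    "Employee Name": ["Pedro", "Paul", "Richard", "Carlos", "Maria", "Josh"],
    "Employee E-mail": ["pedro@example.com", "N/A", "richard@example.com",
                        "carlos@example.com", "#", "josh@example.com"],
    "Manager Name": ["Paul", "Carlos", "Josh", "Peter", "Bob", "Carlos"],
    "Manager E-mail": ["paul@example.com", "carlos@example.com",
                       "josh@example.com", "#", "N/A", "carlos@example.com"],
})


def filter_managers_who_are_employees(frame):
    """Keep rows whose employee and manager e-mails both look valid
    and whose manager e-mail also appears as an employee e-mail."""
    emp = frame["Employee E-mail"]
    mgr = frame["Manager E-mail"]
    # Crude validity check: the value contains '@'; missing values count as invalid.
    valid_emp = emp.str.contains("@", na=False)
    valid_mgr = mgr.str.contains("@", na=False)
    # Membership test against the valid employee addresses only.
    mgr_is_emp = mgr.isin(emp[valid_emp])
    return frame.loc[valid_emp & valid_mgr & mgr_is_emp]


result = filter_managers_who_are_employees(df)
print(result)
```

With this placeholder data, only the rows for Richard and Josh remain, matching the output above. The `'@'` check is deliberately loose; a stricter pattern could be passed to `str.contains` if the data needs it.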
