Pandas: how to keep track of the indices of matching data entries between two DataFrames?

Yii硕士培训

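I'm still new to pandas.

I am trying to cache the indices of matching data entries between two pandas DataFrames in a Python dictionary, so that later computations are more time-efficient and lookups can go through a hash.

For example, I have two DataFrames that represent relations.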

R1:                     R2:

A        B              B        C
1        2              2        18
2        2              5        18
3        6              6        26
4        7              6        31
                        7        32

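Column B is the attribute shared by R1 and R2. I want to build a dictionary that maps each value of R1[B] to the indices of the matching data entries in R2.

For example, the desired output is: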

{2: [0],        (2 from R1[B] matches with the 0th entry in R2)
 6: [2,3]       (6 from R1[B] matches with the 2nd and 3rd entry in R2)
 7: [4]}        (7 from R1[B] matches with the 4th entry in R2)

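Is there an efficient way to do this? Values of R1[B] that have no match in R2 may either appear in the output dictionary with an empty list as their value or be skipped entirely.

Thanks!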

BEN_YO

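Use reset_index and groupby on column B of R2 to collect the index values of each group into a list, then chain .loc to select the values that occur in R1.B: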

R2.reset_index().groupby('B')['index'].apply(list).loc[R1.B.unique()]  # append .to_dict() at the end if you need a dict
B
2       [0]
6    [2, 3]
7       [4]
Name: index, dtype: object
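
For completeness, here is a self-contained sketch of the same idea that can be run as-is. The helper name match_indices and its keep_empty flag are hypothetical, introduced only for illustration. It assumes R2 has a default, unnamed index (so reset_index produces a column called 'index'). It also filters with isin rather than .loc, because in recent pandas versions .loc raises a KeyError when the list contains a label that is missing, which would happen for a value of R1.B that has no match in R2.

```python
import pandas as pd

# Sample data from the question.
R1 = pd.DataFrame({"A": [1, 2, 3, 4], "B": [2, 2, 6, 7]})
R2 = pd.DataFrame({"B": [2, 5, 6, 6, 7], "C": [18, 18, 26, 31, 32]})


def match_indices(left, right, key, keep_empty=False):
    """Map each distinct left[key] value to the row labels of right that hold it.

    Hypothetical helper; assumes `right` has a default, unnamed index.
    """
    # For every distinct key value in `right`, collect its row labels in a list.
    index_lists = right.reset_index().groupby(key)["index"].apply(list)

    # Keep only the key values that also occur in `left`.
    wanted = left[key].unique()
    found = index_lists[index_lists.index.isin(wanted)].to_dict()

    if keep_empty:
        # Give unmatched values an empty list instead of skipping them.
        return {k: found.get(k, []) for k in wanted.tolist()}
    return found


matches = match_indices(R1, R2, "B")
print(matches)
```

For the sample data this should give the dictionary asked for in the question, with 2 mapped to [0], 6 to [2, 3] and 7 to [4]. Passing keep_empty=True keeps unmatched values of R1.B as keys with an empty list.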
