How do I summarize a DataFrame into lists grouped by ID?

Posted by Ella in Dev

Ella

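I have buyers (buyerid), and each buyer can purchase several different cars (carid). I want to list which cars each buyer bought: for every buyer, collect all of that buyer's cars and store them as a list.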

For example, buyer 1 bought the cars with IDs 1 and 2, so the list for that buyer should contain [1, 2]. How can I build these lists?

If I call .values.tolist(), I get one list per row, but I want the carid values aggregated per buyer.

import pandas as pd
d = {'Buyerid': [1,1,2,2,3,3,3,4,5,5,5],
     'Carid': [1,2,3,4,4,1,2,4,1,3,5],
     'Carid2': [1,2,3,4,4,1,2,4,1,3,5]}

df = pd.DataFrame(data=d)

print(df)

ls = df.values.tolist()
print(ls)

    Buyerid  Carid  Carid2
0         1      1       1
1         1      2       2
2         2      3       3
3         2      4       4
4         3      4       4
5         3      1       1
6         3      2       2
7         4      4       4
8         5      1       1
9         5      3       3
10        5      5       5

[[1, 1, 1], [1, 2, 2], [2, 3, 3], [2, 4, 4], [3, 4, 4], [3, 1, 1], [3, 2, 2], [4, 4, 4], [5, 1, 1], [5, 3, 3], [5, 5, 5]]

# What I want as list
[[1,2],[3,4],[4,1,2],[4],[1,3,5]]

jezrael

If order does not matter, select the columns to process and use GroupBy.apply with np.unique:

import numpy as np  # needed for np.unique

L = (df.groupby(['Buyerid'])[['Carid','Carid2']]
       .apply(lambda x: np.unique(x).tolist()).tolist())

Or, if you need to process all columns except Buyerid:

L  = (df.set_index('Buyerid')
        .groupby('Buyerid')
        .apply(lambda x: np.unique(x).tolist())
        .tolist())

print (L)
[[1, 2], [3, 4], [1, 2, 4], [4], [1, 3, 5]]
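
To see why this approach loses the original order, it may help to look at a single group in isolation. The sketch below rebuilds the question's data and applies np.unique to buyer 3 only. The names group and sorted_unique are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame({'Buyerid': [1, 1, 2, 2, 3, 3, 3, 4, 5, 5, 5],
                   'Carid':   [1, 2, 3, 4, 4, 1, 2, 4, 1, 3, 5],
                   'Carid2':  [1, 2, 3, 4, 4, 1, 2, 4, 1, 3, 5]})

# Rows for buyer 3 hold the cars in purchase order: 4, 1, 2
group = df.loc[df['Buyerid'] == 3, ['Carid', 'Carid2']]

# np.unique flattens the 2-D values and returns the distinct values sorted
sorted_unique = np.unique(group.to_numpy()).tolist()
print(sorted_unique)
```

Buyer 3 bought cars 4, 1, 2 in that order, but np.unique returns them sorted, which is why the third list above differs from the desired output.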

If order matters, unpivot with DataFrame.melt and remove duplicates with DataFrame.drop_duplicates:

L1 = (df.melt('Buyerid')
        .drop_duplicates(['Buyerid','value'])
        .groupby('Buyerid')['value']
        .agg(list)
        .tolist())
print (L1)

[[1, 2], [3, 4], [4, 1, 2], [4], [1, 3, 5]]
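
As a possible alternative to melt, here is a minimal sketch that de-duplicates each group with dict.fromkeys, which keeps first-seen order. The helper ordered_unique and the variable L2 are hypothetical names, not from the original answer. Note one assumption: this version walks each group row by row, whereas melt stacks all Carid values before all Carid2 values, so the two can order values differently when the columns disagree. For this data they give the same result.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({'Buyerid': [1, 1, 2, 2, 3, 3, 3, 4, 5, 5, 5],
                   'Carid':   [1, 2, 3, 4, 4, 1, 2, 4, 1, 3, 5],
                   'Carid2':  [1, 2, 3, 4, 4, 1, 2, 4, 1, 3, 5]})

def ordered_unique(group):
    """Return the distinct values of a group in first-seen (row-by-row) order."""
    flat = group.to_numpy().ravel().tolist()  # row-major flatten to Python ints
    return list(dict.fromkeys(flat))          # dict keys keep insertion order

# groupby iterates over the groups in sorted Buyerid order
L2 = [ordered_unique(g[['Carid', 'Carid2']]) for _, g in df.groupby('Buyerid')]
print(L2)
```

If the buyer IDs are needed alongside the lists, the same loop can build a dict keyed by the group key instead of a plain list.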
