Pandas: percentage of NaN in each column

Posted by Icy in Python


Icy:

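Goal: get the percentage of missing values for each column of df, broken down by client.

My df, which holds the tickets that were created: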

          id                type  ...      priority          Client
0     56 113            Incident  ...          Low           client1
1     56 267             Demande  ...          High          client1
2     56 294            Incident  ...          Nan           NaN
3     56 197             Demande  ...          Low           client3
4     56 143             Demande  ...          Nan           client4

First attempt:

df.notna().sum()/len(agg_global)*100
Out[29]:                       
id                       97.053453   
type                     76.415869   
priority                 82.626625    
client                   84.596443

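This is very useful, but I would like to add more detail to the output by putting the "Client" dimension in the columns, like this:

The output I would like to create: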

                     Client1     Client2     Client3         NaN
id                100.000000  100.000000  100.000000   66.990424
type               76.415869   66.990424   76.415869   43.761970
status            100.000000  100.000000   66.990424   76.415869
category           66.990424   43.761970   76.415869   43.761970
entity             43.761970  100.000000   76.415869   76.415869
source_demande     84.596443  100.000000   76.415869   43.761970

I tried using "groupby", but I could not get the desired output:

                   id       type  ...      priority         Client
client                            ...                             
True        97.053453  76.415869  ...      29.98632       29.98632

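Any suggestions would be much appreciated. Thank you for your attention!

jezrael:

You can drop the Client column so that it is not itself tested for missing values, test the remaining columns with DataFrame.isna, group by Client with its NaNs replaced by a label so that those rows are not lost, aggregate with mean, and finally transpose with DataFrame.T. The mean of the boolean flags is a fraction between 0 and 1; multiply it by 100 to get a percentage: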

print (df)
       id      type priority   Client
0     NaN  Incident      Low  client1
1     NaN       NaN     High  client1
2  56 294  Incident      Nan      NaN
3  56 197       NaN      Low  client3
4     NaN   Demande      NaN  client4


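# Exclude Client from the missing-value test, then average the isna() flags per client
df = (df.drop(columns='Client')
        .isna()
        .groupby(df['Client'].fillna('NaN'))  # relabel missing clients so that group is kept
        .mean()
        .rename_axis(None)
        .T)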
print (df)
          NaN  client1  client3  client4
id        0.0      1.0      0.0      1.0
type      0.0      0.5      1.0      0.0
priority  0.0      0.0      0.0      1.0
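
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same approach that returns percentages rather than fractions. The sample frame is rebuilt by hand from the printed df above. The helper name `missing_pct_by_client` and its `client_col` and `na_label` parameters are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the frame printed in the answer above
df = pd.DataFrame({
    "id": [np.nan, np.nan, "56 294", "56 197", np.nan],
    "type": ["Incident", np.nan, "Incident", np.nan, "Demande"],
    "priority": ["Low", "High", "Nan", "Low", np.nan],  # 'Nan' is a plain string, not a missing value
    "Client": ["client1", "client1", np.nan, "client3", "client4"],
})


def missing_pct_by_client(frame, client_col="Client", na_label="NaN"):
    """Hypothetical helper: percent of missing values, columns as rows, clients as columns."""
    # Give rows without a client a label so groupby does not drop them
    clients = frame[client_col].fillna(na_label)
    return (frame.drop(columns=client_col)  # do not test the client column itself
                 .isna()                    # True where a value is missing
                 .groupby(clients)
                 .mean()                    # share of missing values per client
                 .mul(100)                  # fraction -> percentage
                 .rename_axis(None)
                 .T)


result = missing_pct_by_client(df)
print(result)
```

In pandas 1.1 and later, passing `dropna=False` to `groupby` should be another way to keep the rows whose client is missing, without relabelling them first.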
