I have a DataFrame "data" that looks like this:
Name Quality city
Tom High A
nick Medium B
krish Low A
Jack High A
Kevin High B
Phil Medium B
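I want to group by city and, based on the "Quality" column, create new columns that count each quality level and compute the average (each level's percentage share within the city), as follows: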
city High Medium Low High_Avg Medium_AVG Low_avg
A 2 0 1 66.66 0 33.33
B 1 1 0 50 50 0
I tried the following script, but I know it is completely wrong:
data_average = data_df.groupby(['city'], as_index=False).count()
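Get the frequency counts with pd.crosstab, divide each row by the sum across its columns (the total for that city), and finally concatenate the two DataFrames into one: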
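import pandas as pd
# df is the question's DataFrame; count each Quality value per city
result = pd.crosstab(df.city, df.Quality)
# divide each row by its total and convert to a percentage
averages = result.div(result.sum(1).array, axis=0).mul(100).round(2).add_suffix("_Avg")
# combine the dataframes
pd.concat((result, averages), axis=1)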
Quality  High  Low  Medium  High_Avg  Low_Avg  Medium_Avg
city
A           2    1       0     66.67    33.33        0.00
B           1    0       2     33.33     0.00       66.67
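For anyone who wants to run the answer end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and uses the normalize="index" option of pd.crosstab as an alternative to dividing by the row sums by hand. The variable names (order, counts, shares, combined) and the High, Medium, Low column order are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Tom", "nick", "krish", "Jack", "Kevin", "Phil"],
    "Quality": ["High", "Medium", "Low", "High", "High", "Medium"],
    "city": ["A", "B", "A", "A", "B", "B"],
})

# Column order the question asked for (assumed to be the desired layout).
order = ["High", "Medium", "Low"]

# Count of each Quality value per city.
counts = pd.crosstab(df["city"], df["Quality"]).reindex(columns=order, fill_value=0)

# normalize="index" divides each row by its own total, giving per-city shares.
shares = (
    pd.crosstab(df["city"], df["Quality"], normalize="index")
    .reindex(columns=order, fill_value=0)
    .mul(100)
    .round(2)
    .add_suffix("_Avg")
)

# Counts and percentages side by side, one row per city.
combined = pd.concat([counts, shares], axis=1)
print(combined)
```

The reindex step only reorders the columns; dropping it gives the alphabetical High, Low, Medium order shown in the output above.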