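I want to create a stacked bar plot of the Titanic dataset. The plot needs to be grouped by "Pclass", "Sex" and "Survived". I have managed to do this with a lot of tedious numpy manipulation to produce the normalized plot below (where "M" is male and "F" is female).
Is there a way to do this using pandas' built-in plotting functionality?
I have tried this: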
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('train.csv')
df_grouped = df.groupby(['Survived','Sex','Pclass'])['Survived'].count()
df_grouped.unstack().plot(kind='bar',stacked=True, colormap='Blues', grid=True, figsize=(13,5));
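This is not what I want. Is there any way to produce the first plot using pandas plotting? Thanks in advance.
The bars in the resulting plot will not sit next to each other as in the first figure, but apart from that, pandas lets you do the following: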
# Share of survivors and non-survivors per (Pclass, Sex) group
df_g = df.groupby(['Pclass', 'Sex'])['Survived'].agg(['mean', lambda x: 1 - x.mean()])
df_g.columns = ['Survived', 'Died']
df_g.plot.bar(stacked=True)
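To see what table the stacked plot consumes, here is a minimal sketch on a made-up frame. The `toy` data and the name `shares` are hypothetical and not part of the original answer. It uses `pd.crosstab` with `normalize='index'` as an alternative way to get per-group shares that sum to 1; this is a swapped-in technique, not the aggregation used above.

```python
import pandas as pd

# Hypothetical stand-in for train.csv, with only the columns used here.
toy = pd.DataFrame({
    'Pclass':   [1, 1, 1, 1, 3, 3, 3, 3],
    'Sex':      ['female', 'female', 'male', 'male',
                 'female', 'female', 'male', 'male'],
    'Survived': [1, 1, 1, 0, 1, 0, 0, 0],
})

# One row per (Pclass, Sex) group; normalize='index' makes each row sum to 1.
shares = pd.crosstab([toy['Pclass'], toy['Sex']], toy['Survived'],
                     normalize='index')

# Give the 0/1 outcome columns readable names and put 'Survived' first.
shares = shares.rename(columns={0: 'Died', 1: 'Survived'})[['Survived', 'Died']]
print(shares)
```

Calling `shares.plot.bar(stacked=True)` on a table of this shape should give the same kind of normalized stacked bars as `df_g` does.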
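Here, the horizontal grouping of the patches is complicated by the stacking requirement. If, for example, we only cared about the value of "Survived", pandas could handle it out of the box: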
df.groupby(['Pclass', 'Sex'])['Survived'].mean().unstack().plot.bar()
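The reason this works out of the box is the shape `unstack()` produces: one row per class and one column per sex, and pandas draws one group of bars per row with one bar per column. A small sketch on made-up data (the `toy` frame and the name `rates` are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for train.csv, with only the columns used here.
toy = pd.DataFrame({
    'Pclass':   [1, 1, 1, 1, 3, 3, 3, 3],
    'Sex':      ['female', 'female', 'male', 'male',
                 'female', 'female', 'male', 'male'],
    'Survived': [1, 1, 1, 0, 1, 0, 0, 0],
})

# The mean of a 0/1 column is the survival rate; unstack() moves the
# inner index level ('Sex') into the columns.
rates = toy.groupby(['Pclass', 'Sex'])['Survived'].mean().unstack()
print(rates)
```

Each row of `rates` becomes one cluster on the x axis, which is why no manual positioning is needed for the unstacked case.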
If an ad hoc solution that post-processes the plot is good enough, doing so is not terribly complicated either:
import numpy as np
# Share of survivors and non-survivors per (Pclass, Sex) group
df_g = df.groupby(['Pclass', 'Sex'])['Survived'].agg(['mean', lambda x: 1 - x.mean()])
df_g.columns = ['Survived', 'Died']
ax = df_g.plot.bar(stacked=True)
# Move back every second patch
for i in range(6):
    new_x = ax.patches[i].get_x() - (i % 2) / 2
    ax.patches[i].set_x(new_x)
    ax.patches[i + 6].set_x(new_x)
# Update tick locations correspondingly
minor_tick_locs = [x.get_x() + 1/4 for x in ax.patches[:6]]
major_tick_locs = np.array(minor_tick_locs).reshape(3, 2).mean(axis=1)
ax.set_xticks(minor_tick_locs, minor=True)
ax.set_xticks(major_tick_locs)
# Use indices from dataframe as tick labels
minor_tick_labels = df_g.index.get_level_values(1)
major_tick_labels = df_g.index.levels[0].values
ax.xaxis.set_ticklabels(minor_tick_labels, minor=True)
ax.xaxis.set_ticklabels(major_tick_labels)
# Remove ticks and organize tick labels to avoid overlap
ax.tick_params(axis='x', which='both', bottom=False)
ax.tick_params(axis='x', which='minor', rotation=45)
ax.tick_params(axis='x', which='major', pad=35, rotation=0)
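The position arithmetic in the loop can be checked without drawing anything. The sketch below assumes the six bars start out centred on the integers 0 to 5 with a width of 0.5, which is what pandas appears to use by default for bar plots. Both values are assumptions here; if the width is changed, the `1/4` offset above has to change with it.

```python
import numpy as np

bar_width = 0.5                               # assumed default bar width
left_edges = np.arange(6) - bar_width / 2     # assumed: bars centred on 0..5

# Shift every second bar left by 0.5 so that it touches its neighbour.
shifted = left_edges - (np.arange(6) % 2) / 2

# Minor ticks sit at the bar centres, major ticks at the centre of each pair.
minor_ticks = shifted + bar_width / 2
major_ticks = minor_ticks.reshape(3, 2).mean(axis=1)

print(minor_ticks)
print(major_ticks)
```

Under these assumptions each pair of bars ends up adjacent, with the class label centred between the two sex labels.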