I have the following data:
id, approach, outcome
a1, approach1, outcome1
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a2, approach1, outcome2
a2, approach1, outcome1
a2, approach1, outcome1
a2, approach1, outcome2
a2, approach1, outcome1
a2, approach2, outcome1
a2, approach2, outcome1
a2, approach2, outcome2
a2, approach2, outcome1
a2, approach2, outcome2
a2, approach3, outcome2
a2, approach3, outcome2
a2, approach3, outcome1
a2, approach3, outcome2
a2, approach3, outcome1
However, we have ids here rather than years, and not fruits.
This is what I have done so far:
import pandas
import matplotlib.pyplot as plt
df = pandas.read_csv("test.txt", sep=r',\s+', engine = "python")
fig, ax = plt.subplots(1, 1, figsize=(5.5, 4))
data = df[df.approach == "approach1"].groupby(["id", "outcome"], sort=False)["outcome"].count().unstack(level=1)
data.plot.bar(width=0.5, position=0.6, color=["g", "r"], stacked=True, ax=ax)
data = df[df.approach == "approach2"].groupby(["id", "outcome"], sort=False)["outcome"].count().unstack(level=1)
data.plot.bar(width=0.5, position=-0.6, color=["g", "r"], stacked=True, ax=ax)
# "Activate" minor ticks
ax.minorticks_on()
rects_locs = []
p = 0
for patch in ax.patches:
    rects_locs.append(patch.get_x() + patch.get_width())
    # p += 0.01
# Set minor ticks there
ax.set_xticks(rects_locs, minor = True)
# Labels for the rectangles
new_ticks = ["Approach1"] * 10 + ["Approach2"] * 10
# Set the labels
from matplotlib import ticker
ax.xaxis.set_minor_formatter(ticker.FixedFormatter(new_ticks)) #add the custom ticks
# Move the category label further from x-axis
ax.tick_params(axis='x', which='major', pad=15)
# Remove minor ticks where not necessary
ax.tick_params(axis='x', which='both', top=False)
ax.tick_params(axis='y', which='both', left=False, right=False)
plt.xticks(rotation=0)
So basically I want id as the major x-tick (so there should be 2 such x values), and then for each id there should be 3 grouped stacked bars (approach1, approach2, approach3).
Well, I'm not proud of this, but it works. Hopefully someone more knowledgeable will come up with a better solution.
I first set up your data:
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np
import pandas as pd
data = np.array([
'id', 'approach', 'outcome',
'a1', 'approach1', 'outcome1',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a2', 'approach1', 'outcome2',
'a2', 'approach1', 'outcome1',
'a2', 'approach1', 'outcome1',
'a2', 'approach1', 'outcome2',
'a2', 'approach1', 'outcome1',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome2',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome2',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome1',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome1'])
data = data.reshape(data.size // 3, 3)
df = pd.DataFrame(data[1:], columns=data[0])
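As an aside, the same kind of frame can probably be built straight from the comma-separated text, without the NumPy reshape. The following is only a sketch: it assumes the text is available as a string (here just the first six data rows from the question, held in a variable I named `raw`) and relies on `skipinitialspace=True` to drop the blank after each comma.

```python
import io

import pandas as pd

# Excerpt of the question's data (first six rows only); `raw` is an
# illustrative name, not something from the original post.
raw = """id, approach, outcome
a1, approach1, outcome1
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach2, outcome1
"""

# skipinitialspace=True strips the space that follows each comma,
# so no regular-expression separator is needed.
df = pd.read_csv(io.StringIO(raw), skipinitialspace=True)
print(df)
```

With the real file, passing the file name instead of the `io.StringIO` object should behave the same way.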
Next, I count all occurrences of "outcome1" and "outcome2" for each approach and id. (I'm sure this can be done directly in pandas, but I'm something of a pandas novice):
# Nested counts: counts[id][approach][outcome] -> number of matching rows
counts = {}
for id in 'a1', 'a2':
    counts[id] = {}
    for approach in 'approach1', 'approach2', 'approach3':
        counts[id][approach] = {}
        for outcome in 'outcome1', 'outcome2':
            counts[id][approach][outcome] = ((df['id'] == id)
                                             & (df['approach'] == approach)
                                             & (df['outcome'] == outcome)).sum()
plot_data = pd.DataFrame(counts)
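For what it's worth, the counting step can most likely be done in pandas directly with `pd.crosstab`. This is a sketch rather than a drop-in replacement: the compact `spec` dictionary is only my shorthand for rebuilding the question's 30 rows, and the result is indexed by (id, approach) with one column per outcome, which is a different layout from `plot_data` above.

```python
import pandas as pd

# Shorthand for the question's data: (id, approach) -> {outcome: row count}.
# `spec` is an illustrative helper, not part of the original post.
spec = {
    ("a1", "approach1"): {"outcome1": 1, "outcome2": 4},
    ("a1", "approach2"): {"outcome1": 5},
    ("a1", "approach3"): {"outcome1": 5},
    ("a2", "approach1"): {"outcome1": 3, "outcome2": 2},
    ("a2", "approach2"): {"outcome1": 3, "outcome2": 2},
    ("a2", "approach3"): {"outcome1": 2, "outcome2": 3},
}
rows = [(id_, approach, outcome)
        for (id_, approach), outcomes in spec.items()
        for outcome, n in outcomes.items()
        for _ in range(n)]
df = pd.DataFrame(rows, columns=["id", "approach", "outcome"])

# One row per (id, approach), one column per outcome; missing
# combinations are filled with 0 by crosstab.
counts = pd.crosstab([df["id"], df["approach"]], df["outcome"])
print(counts)
```

A single count is then looked up as, for example, `counts.loc[("a1", "approach1"), "outcome2"]`.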
Now all that is left is to do the plotting.
fig, ax = plt.subplots(1, 1)
i = 0
tick_positions = []
for id in 'a1', 'a2':
    for approach in 'approach1', 'approach2', 'approach3':
        # Stack the outcome2 count (red) on top of the outcome1 count (green)
        ax.bar(i, plot_data[id][approach]["outcome1"], color='g')
        ax.bar(i, plot_data[id][approach]["outcome2"],
               bottom=plot_data[id][approach]["outcome1"], color='r')
        tick_positions.append(i)
        i += 1
    i += 1  # leave a one-bar gap between the two ids
# Pin the ticks to the bar positions so that each label lines up with its bar
ax.set_xticks(tick_positions)
ax.set_xticklabels(['approach1', 'approach2', 'approach3'] * 2, rotation=45)
custom_lines = [Line2D([0], [0], color='g', lw=4),
                Line2D([0], [0], color='r', lw=4)]
ax.legend(custom_lines, ['Outcome 1', 'Outcome 2'])
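The question also asked for the id to appear as a group label under each set of three bars. One way this might be done is to write the id as a text label centred under its group. The sketch below is self-contained and makes a few assumptions of its own: the `spec` shorthand for the data, the `Agg` backend (so it runs without a display), the vertical offset of the id labels, and the output file name are all my choices, not part of the original post.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Patch

# Shorthand for the question's data: (id, approach) -> {outcome: row count}.
spec = {
    ("a1", "approach1"): {"outcome1": 1, "outcome2": 4},
    ("a1", "approach2"): {"outcome1": 5},
    ("a1", "approach3"): {"outcome1": 5},
    ("a2", "approach1"): {"outcome1": 3, "outcome2": 2},
    ("a2", "approach2"): {"outcome1": 3, "outcome2": 2},
    ("a2", "approach3"): {"outcome1": 2, "outcome2": 3},
}
rows = [(id_, approach, outcome)
        for (id_, approach), outcomes in spec.items()
        for outcome, n in outcomes.items()
        for _ in range(n)]
df = pd.DataFrame(rows, columns=["id", "approach", "outcome"])
counts = pd.crosstab([df["id"], df["approach"]], df["outcome"])

colors = {"outcome1": "g", "outcome2": "r"}
ids = counts.index.get_level_values("id").unique()
approaches = counts.index.get_level_values("approach").unique()

fig, ax = plt.subplots(figsize=(6, 4))
bar_positions, bar_labels, group_centers = [], [], []
x = 0
for id_ in ids:
    start = x
    for approach in approaches:
        bottom = 0
        for outcome in counts.columns:
            height = counts.loc[(id_, approach), outcome]
            ax.bar(x, height, bottom=bottom, color=colors[outcome])
            bottom += height  # the next outcome is stacked on top
        bar_positions.append(x)
        bar_labels.append(approach)
        x += 1
    group_centers.append((start + x - 1) / 2)
    x += 1  # one-bar gap between ids

# Approach names go on the ticks, one per bar.
ax.set_xticks(bar_positions)
ax.set_xticklabels(bar_labels, rotation=45)

# The id labels sit below the tick labels: x in data units, y in axes units.
for center, id_ in zip(group_centers, ids):
    ax.text(center, -0.3, id_, transform=ax.get_xaxis_transform(),
            ha="center", va="top", fontweight="bold")
fig.subplots_adjust(bottom=0.35)  # room for both rows of labels

ax.legend(handles=[Patch(color=c, label=name) for name, c in colors.items()])
fig.savefig("grouped_stacked_bars.png")
```

The `-0.3` offset and the bottom margin will probably need adjusting for other figure sizes or label rotations.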