Python/Matplotlib subplots - stacked bar chart - setting fixed colors for categories

Jerry

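I want to create subplots of stacked bar charts from a pandas df (using df.pivot_table) and keep the category colors consistent (i.e. fixed) across the subplots.

The problem is that not every index value in the pivot table ('domain' in the sample df) has the same number of categories, so matplotlib restarts the category coloring in each subplot, and the same color ends up being used for two different categories.

Here is dummy code for illustration: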

df:

    main domain category  val
0   cat1      a    apple    1
1   cat1      a   orange    1
2   cat1      a  broccli    1
3   cat1      b    apple    1
4   cat1      b   orange    1
5   cat1      a     plum    1
6   cat1      c    apple    1
7   cat2      b   orange    1
8   cat2      b   orange    1
9   cat2      b    apple    1
10  cat2      b   orange    1
11  cat2      c     plum    1
12  cat2      c    apple    1
13  cat2      b   orange    1
14  cat2      b   orange    1

The code is:

import pandas as pd
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(15, 10))
sub_plot_grid = (4, 10)
sub_plot_col_size = 5
sub_plot_row_size = 3

ax1 = plt.subplot2grid(sub_plot_grid, (0, 0), colspan=sub_plot_col_size, rowspan=sub_plot_row_size)
ax2 = plt.subplot2grid(sub_plot_grid, (0, 5), colspan=sub_plot_col_size, rowspan=sub_plot_row_size, sharey=ax1, sharex=ax1)
ax2.tick_params(axis='y', which='both', length=0)
list_of_ax = [ax1, ax2]

sub_df_1 = df[df['main'] == 'cat1']
sub_df_2 = df[df['main'] == 'cat2']
list_of_df = [sub_df_1, sub_df_2]

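# Use a separate loop variable so the original df is not overwritten
for ax, sub_df in zip(list_of_ax, list_of_df):
    # Sum 'val' per domain/category, order domains by their total, then drop the 'All' margins
    df1 = sub_df.pivot_table(index='domain', columns='category', values='val', aggfunc='sum', dropna=False, margins=True).sort_values(by='All', ascending=False).drop('All').drop('All', axis=1)
    # Drop domains (rows) and categories (columns) whose total is zero
    df1.drop(df1.loc[df1.sum(axis=1) == 0].index, inplace=True)
    df1.drop(columns=df1.columns[df1.sum() == 0], inplace=True)
    df1.plot(kind='barh', stacked=True, alpha=0.7, ax=ax)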

plt.show()

The plot is:

[Image: resulting plot]

The problem is visible in the 'orange' and 'plum' categories: in the first subplot the 'orange' category is green, but in the second subplot it is orange. The 'plum' category is red in the first subplot and green in the second.
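To make the cause of the mismatch concrete, here is a minimal sketch in plain Python. It assumes colors are handed out by column position from a default cycle that restarts for every plot. The helper `assign_by_position`, the informal color names, and the hard-coded column lists are illustrative assumptions, not pandas or matplotlib API.

```python
# Informal names for the first four colors of Matplotlib's default cycle
# (assumption: the default "tab10"-style cycle is in use).
default_cycle = ["blue", "orange", "green", "red"]

# Columns of each pivot table once empty categories are dropped,
# in the alphabetical order that pivot_table produces.
columns_cat1 = ["apple", "broccli", "orange", "plum"]
columns_cat2 = ["apple", "orange", "plum"]


def assign_by_position(columns, cycle):
    """Pair the i-th column with the i-th color of the cycle (hypothetical helper)."""
    return {col: cycle[i % len(cycle)] for i, col in enumerate(columns)}


colors_cat1 = assign_by_position(columns_cat1, default_cycle)
colors_cat2 = assign_by_position(columns_cat2, default_cycle)

# Compare the color each shared category receives in the two subplots.
for cat in columns_cat2:
    print(cat, colors_cat1[cat], colors_cat2[cat])
# apple blue blue
# orange green orange
# plum red green
```

Because 'broccli' is missing from the second table, every column after it shifts one position, which reproduces the colors described in the question. Passing a list of colors has the same weakness, since a list is also matched by position.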

I need the colors for each category to remain the same for all sub-plots.

I have searched for solutions for a while now and tried a few different things, including trying to manually pass a list of colors or use colormaps, but the problem of matplotlib restarting colors with each sub-plot remains.

Any help would be appreciated.

Tom

See the docs for the color argument for bar charts with pandas. Specifically, the part about passing a dict:

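A dict of the form {column name : color}, so that each column will be colored accordingly. For example, if your columns are called a and b, then passing {'a': 'green', 'b': 'red'} will color bars for column a in green and bars for column b in red.

So if you define: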

colors = {'apple':'red', 'orange':'orange', 'plum':'purple', 'broccli':'green'}

and then plot:

df1.plot(kind='barh', stacked=True, alpha=0.7, ax=ax, color=colors)

you get the following:

[Image: resulting plot]
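For anyone who wants to try this end to end, here is a self-contained sketch under a few assumptions. It rebuilds the sample df from the question, and instead of typing the dict by hand it derives one from every category in the full frame, using matplotlib's default color cycle. The variable names (`rows`, `palette`, `categories`, `table`) and the simplified `plt.subplots` layout are my own choices, not part of the original code. I believe dict support for `color` needs pandas 1.1 or newer.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data from the question ('val' is 1 in every row).
rows = [
    ("cat1", "a", "apple"), ("cat1", "a", "orange"), ("cat1", "a", "broccli"),
    ("cat1", "b", "apple"), ("cat1", "b", "orange"), ("cat1", "a", "plum"),
    ("cat1", "c", "apple"), ("cat2", "b", "orange"), ("cat2", "b", "orange"),
    ("cat2", "b", "apple"), ("cat2", "b", "orange"), ("cat2", "c", "plum"),
    ("cat2", "c", "apple"), ("cat2", "b", "orange"), ("cat2", "b", "orange"),
]
df = pd.DataFrame(rows, columns=["main", "domain", "category"])
df["val"] = 1

# Build the mapping once, from all categories in the full frame,
# so a category keeps its color even when a subset lacks other categories.
palette = plt.rcParams["axes.prop_cycle"].by_key()["color"]
categories = sorted(df["category"].unique())
colors = {cat: palette[i % len(palette)] for i, cat in enumerate(categories)}

# One subplot per value of 'main'; only the x axis is shared because
# the domains on the y axis differ between the two subsets.
fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharex=True)
for ax, (name, sub_df) in zip(axes, df.groupby("main")):
    table = sub_df.pivot_table(index="domain", columns="category",
                               values="val", aggfunc="sum", fill_value=0)
    # Keys in the dict that are not columns of this table are simply unused.
    table.plot(kind="barh", stacked=True, alpha=0.7, ax=ax, color=colors)
    ax.set_title(name)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. The point of deriving the dict from the whole df is that the mapping no longer depends on which categories happen to appear in a given subplot.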
