如何使用Pandas中的变量之一绘制堆积的条形图？

shunyo 发表于 Dev

顺阳

我正在尝试使用这个csv文件，该文件已作为pandas.Dataframe黑色购物者的黑色星期五购买数据输入，并提供了各种变量来理解他们的购买方式。

User_ID,Product_ID,Gender,Age,Occupation,City_Category,Stay_In_Current_City_Years,Marital_Status,Product_Category_1,Product_Category_2,Product_Category_3,Purchase
1000001,P00069042,F,0-17,10,A,2,0,3,,,8370
1000001,P00248942,F,0-17,10,A,2,0,1,6,14,15200
1000001,P00087842,F,0-17,10,A,2,0,12,,,1422
1000001,P00085442,F,0-17,10,A,2,0,12,14,,1057
1000002,P00285442,M,55+,16,C,4+,0,8,,,7969
1000003,P00193542,M,26-35,15,A,3,0,1,2,,15227
1000004,P00184942,M,46-50,7,B,2,1,1,8,17,19215
1000004,P00346142,M,46-50,7,B,2,1,1,15,,15854
1000004,P0097242,M,46-50,7,B,2,1,1,16,,15686
1000005,P00274942,M,26-35,20,A,1,1,8,,,7871
1000005,P00251242,M,26-35,20,A,1,1,5,11,,5254
1000005,P00014542,M,26-35,20,A,1,1,8,,,3957
1000005,P00031342,M,26-35,20,A,1,1,8,,,6073
1000005,P00145042,M,26-35,20,A,1,1,1,2,5,15665
1000006,P00231342,F,51-55,9,A,1,0,5,8,14,5378
1000006,P00190242,F,51-55,9,A,1,0,4,5,,2079
1000006,P0096642,F,51-55,9,A,1,0,2,3,4,13055
1000006,P00058442,F,51-55,9,A,1,0,5,14,,8851
1000007,P00036842,M,36-45,1,B,1,1,1,14,16,11788
1000008,P00249542,M,26-35,12,C,4+,1,1,5,15,19614
1000008,P00220442,M,26-35,12,C,4+,1,5,14,,8584
1000008,P00156442,M,26-35,12,C,4+,1,8,,,9872
1000008,P00213742,M,26-35,12,C,4+,1,8,,,9743
1000008,P00214442,M,26-35,12,C,4+,1,8,,,5982
1000008,P00303442,M,26-35,12,C,4+,1,1,8,14,11927
1000009,P00135742,M,26-35,17,C,0,0,6,8,,16662
1000009,P00039942,M,26-35,17,C,0,0,8,,,5887
1000009,P00161442,M,26-35,17,C,0,0,5,14,,6973
1000009,P00078742,M,26-35,17,C,0,0,5,8,14,5391
1000010,P00085942,F,36-45,1,B,4+,1,2,4,8,16352
1000010,P00118742,F,36-45,1,B,4+,1,5,11,,8886
1000010,P00297942,F,36-45,1,B,4+,1,8,,,5875
1000010,P00266842,F,36-45,1,B,4+,1,5,,,8854
1000010,P00058342,F,36-45,1,B,4+,1,3,4,,10946
1000010,P00032442,F,36-45,1,B,4+,1,5,,,5152
1000010,P00105942,F,36-45,1,B,4+,1,5,,,7089
1000010,P00182642,F,36-45,1,B,4+,1,2,4,9,12909
1000010,P00186942,F,36-45,1,B,4+,1,5,12,,8770
1000010,P00155442,F,36-45,1,B,4+,1,1,11,15,15212
1000010,P00221342,F,36-45,1,B,4+,1,1,2,5,15705
1000010,P00087242,F,36-45,1,B,4+,1,14,,,7947
1000010,P00111142,F,36-45,1,B,4+,1,1,15,16,18963
1000010,P00259342,F,36-45,1,B,4+,1,5,9,,8718
1000010,P0094542,F,36-45,1,B,4+,1,2,4,9,16406
1000010,P00148642,F,36-45,1,B,4+,1,6,10,13,12642
1000010,P00312142,F,36-45,1,B,4+,1,8,,,10007
1000010,P00113242,F,36-45,1,B,4+,1,1,6,8,11562

现在，我想按城市和性别创建一个总购买量的叠加图，如下所示：这是我尝试过的：

import pandas
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter
import numpy as np
with open('BlackFriday.csv') as csv_file:
    df = pandas.read_csv(csv_file, sep=',')
    # Group by user id, city and gender
    users_by_city_gender = df.groupby(['City_Category','Gender'])['Purchase'].agg('sum').to_frame()
    ax3 = pandas.DataFrame({'City-A': users_by_city_gender.groupby('City_Category').get_group('A').Purchase,
                            'City-B': users_by_city_gender.groupby('City_Category').get_group('B').Purchase,
                            'City-C': users_by_city_gender.groupby('City_Category').get_group('C').Purchase}).plot.hist(stacked=True)
    ## Switch off ticks
    ax3.tick_params(axis="both", which="both", bottom=False, top=False, labelbottom=False, left=False, right=False,
                    labelleft=True)

    # Draw horizontal axis lines
    # vals = ax.get_yticks()
    # for tick in vals:
    #    ax.axhline(y=tick, linestyle='dashed', alpha=0.4, color='#eeeeee', zorder=1)

    # Remove title
    ax3.set_title("Total purchase by city and gender")

    # Set x-axis label
    ax3.set_xlabel("City category", labelpad=20, weight='bold', size=12)

    # Set y-axis label
    ax3.set_ylabel("Total purchase [dollars]", labelpad=20, weight='bold', size=12)

    # Format y-axis label
    ax3.yaxis.set_major_formatter(StrMethodFormatter('{x:,g}'))
    plt.show()

生成的图似乎与我想要的图完全不同。对其进行调试，users_by_city_gender使其成为一系列城市（A，B和C）的数据框，每个城市均包含按性别划分的购买总额（M和F）。因此，我认为这是正确绘制图表所需的数据。

我已经查看了关于stackexchange的其他问题，以便为熊猫数据框创建堆叠的条形图，但是我无法为我的问题找到解决方案。

用户名

您可以使用groupby和pivot_table：

s = (df.pivot_table(
        index='City_Category', columns='Gender', values='Purchase', aggfunc='sum'))

s.plot(kind='bar', stacked=True)
plt.show()

为了说明，这是结果的pivot样子：

Gender                F         M
City_Category
A               55412.0   54047.0
B              201995.0   62543.0
C                   NaN  108604.0

本文收集自互联网，转载请注明来源。

如有侵权，请联系 [email protected] 删除。

编辑于 2020-12-13

我来说两句

0 条评论

登录后参与评论

上一篇：如何缩小TFS 2017快速增长的tbl_content表

熊猫-绘制堆积的条形图

如何为熊猫分组特征绘制堆积的条形图？

如何使用Pandas中的变量之一绘制堆积的条形图？

如何使用Pandas中的变量之一绘制堆积的条形图？

Qt Creator Windows 10 - “使用 jom 而不是 nmake”不起作用

使用next.js时出现服务器错误，错误：找不到react-redux上下文值；请确保组件包装在<Provider>中

SQL Server中的非确定性数据类型

Swift 2.1-对单个单元格使用UITableView

如何避免每次重新编译所有文件？

在同一Pushwoosh应用程序上Pushwoosh多个捆绑ID

Hashchange事件侦听器在将事件处理程序附加到事件之前进行侦听

应用发明者仅从列表中选择一个随机项一次

在 Avalonia 中是否有带有柱子的 TreeView 或类似的东西？

HttpClient中的角度变化检测

在Wagtail管理员中，如何禁用图像和文档的摘要项？

如何了解DFT结果

Camunda-根据分配的组过滤任务列表

错误：找不到存根。请确保已调用spring-cloud-contract：convert

为什么此后台线程中未处理的异常不会终止我的进程？

构建类似于Jarvis的本地语言应用程序

使用分隔符将成对相邻的数组元素相互连接

您如何通过 Nativescript 中的 Fetch 发出发布请求？

通过iwd从Linux系统上的命令行连接到wifi（适用于Linux的无线守护程序）

使用React / Javascript在Wordpress API中通过ID获取选择的多个帖子/页面

使用 text() 獲取特定文本節點的 XPath