I have 3 CSV files:
file1.csv
,DATE,DAY,OPEN,2PM,CLOSE,STATUS
0,2021-05-18,Tuesday,538.8,530.45,530.8,0
1,2021-05-19,Wednesday,530.65,532.6,536.85,0
2,2021-05-20,Thursday,536.95,537.05,536.35,1
3,2021-05-21,Friday,538.0,538.2,537.55,1
4,2021-05-24,Monday,537.3,535.05,532.85,1
5,2021-05-25,Tuesday,535.9,531.35,529.65,1
6,2021-05-26,Wednesday,532.95,530.55,532.1,0
7,2021-05-27,Thursday,532.95,529.65,529.85,0
file2.csv
,DATE,DAY,OPEN,2PM,CLOSE,STATUS
0,2021-05-18,Tuesday,538.8,530.45,530.8,1
1,2021-05-19,Wednesday,530.65,532.6,536.85,0
2,2021-05-20,Thursday,536.95,537.05,536.35,1
3,2021-05-21,Friday,538.0,538.2,537.55,1
4,2021-05-24,Monday,537.3,535.05,532.85,2
5,2021-05-25,Tuesday,535.9,531.35,529.65,1
6,2021-05-26,Wednesday,532.95,530.55,532.1,0
7,2021-05-27,Thursday,532.95,529.65,529.85,0
file3.csv
,DATE,DAY,OPEN,2PM,CLOSE,STATUS
0,2021-05-18,Tuesday,538.8,530.45,530.9,0
1,2021-05-19,Wednesday,530.65,532.6,536.85,1
2,2021-05-20,Thursday,536.95,537.05,536.35,0
3,2021-05-21,Friday,538.0,538.2,537.55,1
4,2021-05-24,Monday,537.3,535.05,532.85,1
5,2021-05-25,Tuesday,535.9,531.35,529.65,0
6,2021-05-26,Wednesday,532.95,530.55,532.1,0
7,2021-05-27,Thursday,532.95,529.65,529.85,1
I can plot the graph for an individual CSV file like this:
import pandas as pd
df = pd.read_csv("file1.csv")
df.groupby('DAY')['STATUS'].value_counts(normalize=True).unstack().plot.bar()
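The resulting plot for one file has 5 pairs of bars (Monday, Tuesday, Wednesday, etc.).
However, I want to plot the "Monday" data from all 3 files in a single figure. Can anyone tell me how to handle multiple files?
In other words, the plot should have 3 pairs of bars, each pair representing Monday from one file, e.g.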
Monday from file1.csv
Monday from file2.csv
Monday from file3.csv
I want to draw this plot for Monday across all 3 files.
Create a FILE column for each df before concatenating them. Then filter for the desired day (Tuesday in this example) and group by both DAY and FILE:
import matplotlib.pyplot as plt  # needed for plt.xticks below
import pandas as pd
df1 = pd.read_csv('file1.csv').assign(FILE=1)
df2 = pd.read_csv('file2.csv').assign(FILE=2)
df3 = pd.read_csv('file3.csv').assign(FILE=3)
df = pd.concat([df1, df2, df3]).reset_index(drop=True)
# or concat via generator (reset_index is called on the concatenated frame, not on the tuple)
# df = pd.concat(pd.read_csv(f'file{i}.csv').assign(FILE=i) for i in (1, 2, 3)).reset_index(drop=True)
(df[df.DAY.eq('Tuesday')]
.groupby(['DAY', 'FILE'])['STATUS']
.value_counts(normalize=True)
.unstack().plot.bar())
plt.xticks(rotation=0)
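To see what the grouped table looks like without the CSV files on disk, here is a minimal sketch that rebuilds only the DAY and STATUS columns from the sample data in the question. The helper name `status_shares` is made up for illustration, and it groups by FILE alone, which is assumed to be equivalent here because the rows are already filtered to a single day.

```python
import pandas as pd

# DAY and STATUS columns copied from the three sample files in the question.
days = ["Tuesday", "Wednesday", "Thursday", "Friday",
        "Monday", "Tuesday", "Wednesday", "Thursday"]
statuses = {
    1: [0, 0, 1, 1, 1, 1, 0, 0],  # file1.csv
    2: [1, 0, 1, 1, 2, 1, 0, 0],  # file2.csv
    3: [0, 1, 0, 1, 1, 0, 0, 1],  # file3.csv
}

# One frame per file, tagged with a FILE column, then stacked into one frame.
df = pd.concat(
    [pd.DataFrame({"DAY": days, "STATUS": s}).assign(FILE=i)
     for i, s in statuses.items()],
    ignore_index=True,
)


def status_shares(frame, day):
    """Hypothetical helper: share of each STATUS value per FILE for one weekday."""
    return (frame[frame["DAY"].eq(day)]
            .groupby("FILE")["STATUS"]
            .value_counts(normalize=True)
            .unstack(fill_value=0.0))


shares = status_shares(df, "Tuesday")
print(shares)
# shares.plot.bar(rot=0) would then draw one group of bars per file.
```

Passing "Monday" instead of "Tuesday" gives the table the question asks for; with this sample data each file contains only a single Monday row.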
To filter by a given threshold, save the value counts to an intermediate counts df and use it to filter:
day, threshold = 'Tuesday', 0.8
counts = df[df.DAY.eq(day)].groupby(['DAY', 'FILE'])['STATUS'].value_counts(normalize=True).unstack()
counts[counts > threshold].plot.bar()
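A small sketch of what that boolean mask does, using a hand-built counts table with the Tuesday shares worked out from the sample data (missing statuses filled with 0.0). The extra `dropna(how="all")` step is an optional addition, not part of the answer above: it assumes you would rather drop files that have no share above the threshold than plot them as empty groups.

```python
import pandas as pd

# Tuesday STATUS shares per file, worked out by hand from the sample data.
counts = pd.DataFrame(
    {0: [0.5, 0.0, 1.0], 1: [0.5, 1.0, 0.0]},
    index=pd.Index([1, 2, 3], name="FILE"),
)
counts.columns.name = "STATUS"

threshold = 0.8

# Boolean masking keeps the table's shape; cells at or below the threshold become NaN.
masked = counts[counts > threshold]

# Optional (assumption): remove files where every cell was masked out.
kept = masked.dropna(how="all")

print(masked)
print(kept)
# kept.plot.bar(rot=0) would plot only the files that pass the threshold.
```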