I have a text file with 8 features and 1 class. The data in my file (data.txt) is:
1,1,3,2,1,1,1,3,HIGH
1,1,3,1,2,1,1,3,HIGH
1,1,1,1,3,3,1,2,HIGH
1,3,2,1,3,3,3,3,HIGH
1,3,1,2,3,1,2,1,HIGH
2,3,1,2,1,2,2,1,HIGH
2,2,2,2,2,1,2,3,HIGH
2,2,1,1,1,2,2,3,HIGH
3,2,1,3,1,3,3,3,HIGH
3,2,1,2,2,3,3,2,HIGH
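In the file above, the first 8 columns are the features. Each one is labelled with a number that can be 1, 2, or 3. The last column is the class name (HIGH). Now I want to plot these features according to their label number. I could do it for the first 3 columns with the following code: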
import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_csv('data.txt', header=None)
# Features are : A,B,C,...,H
df.columns = ['A', 'B','C', 'D', 'E', 'F', 'G', 'H', 'class']
X = df.ix[:, 0:8].values
y = df.ix[:, 8].values
kind = ['barstacked']
deg = ['HIGH']
pos = ['left','right','mid']
col = ['r','b','y']
with plt.style.context('seaborn-whitegrid'):
    plt.figure(figsize=(8, 6))
    for j in range(0,3):
        for i in range(1):
            plt.hist(X[y == deg[i], j],
                     label=deg[i],
                     bins=30,
                     alpha=0.6, histtype=kind[i], align=pos[j], color=col[j])
    plt.tick_params(axis='both', which='major', labelsize=17)
    plt.xlim(0.75, 3.25)
    plt.tight_layout()
    plt.savefig("figure.png" , format='png', dpi=700)
    plt.show()
However, I could not plot the other 5 columns because I did not know how to put them next to each other, as there are only 3 align options (left, mid and right). What I am looking for is a histogram plot for all 8 features that separates the features based on the tag number. A graph like this:
[figure: example of the desired plot, not preserved in the source]
You don't need a histogram here and you can easily generate the required figure using a bar chart because you are just plotting a single frequency here. The idea is as follows:
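- For each column, use the Counter module from collections to get the frequency of 1, 2, and 3.
- Place the bars of the different columns next to each other by adding the offset (j-4)*0.1 to the x values. Here 0.1 is a good choice for the bar width.
- The extra loop over i is not needed here, because i is always 0.
- df.ix is deprecated in newer pandas versions. You will have to use df.iloc.
Here is how you can do it.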
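from collections import Counter
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

df = pd.read_csv('data.txt', header=None)
df.columns = ['A', 'B','C', 'D', 'E', 'F', 'G', 'H', 'class']
# df.ix is deprecated, so select by position with df.iloc
X = df.iloc[:, 0:8].values
y = df.iloc[:, 8].values
deg = ['HIGH']

# On matplotlib >= 3.6 this style is named 'seaborn-v0_8-whitegrid'
with plt.style.context('seaborn-whitegrid'):
    plt.figure(figsize=(8, 6))
    for j in range(0,8):
        # Frequency of each label (1, 2, 3) in column j for class HIGH
        freqs = Counter(X[y == deg[0], j])
        xvalues = np.array(list(freqs.keys()))
        # Shift column j by (j-4)*0.1 so the 8 bars sit side by side
        plt.bar(xvalues+(j-4)*0.1, list(freqs.values()), width=0.1,
                alpha=0.9, edgecolor='k', lw=2)
    plt.tick_params(axis='both', which='major', labelsize=17)
    plt.xlim(0.25, 3.75)
    plt.xticks([1,2,3])
    plt.tight_layout()
    plt.show()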
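To see what the Counter and offset steps produce without drawing anything, here is a minimal sketch using only the standard library. The rows are the ten feature rows from data.txt with the class column dropped. The names rows, WIDTH and bar_layout are made up for this illustration and are not part of pandas or matplotlib.

```python
from collections import Counter

# Feature columns A..H of the ten rows in data.txt (class column dropped).
rows = [
    [1, 1, 3, 2, 1, 1, 1, 3],
    [1, 1, 3, 1, 2, 1, 1, 3],
    [1, 1, 1, 1, 3, 3, 1, 2],
    [1, 3, 2, 1, 3, 3, 3, 3],
    [1, 3, 1, 2, 3, 1, 2, 1],
    [2, 3, 1, 2, 1, 2, 2, 1],
    [2, 2, 2, 2, 2, 1, 2, 3],
    [2, 2, 1, 1, 1, 2, 2, 3],
    [3, 2, 1, 3, 1, 3, 3, 3],
    [3, 2, 1, 2, 2, 3, 3, 2],
]

WIDTH = 0.1  # bar width suggested in the answer


def bar_layout(rows, width=WIDTH):
    """Hypothetical helper: map column index -> [(x position, count), ...]."""
    n_cols = len(rows[0])
    layout = {}
    for j in range(n_cols):
        # Frequency of each label (1, 2, 3) in column j
        freqs = Counter(row[j] for row in rows)
        # Same shift as (j-4)*0.1 when there are 8 columns
        offset = (j - n_cols / 2) * width
        layout[j] = [(value + offset, count)
                     for value, count in sorted(freqs.items())]
    return layout


layout = bar_layout(rows)
for j, bars in layout.items():
    print(j, bars)
```

Each (x position, count) pair corresponds to one bar that plt.bar would draw. The offsets run from -0.4 for column A to +0.3 for column H, so the eight bars of a label fit in a band 0.8 wide around the tick at 1, 2 or 3, which is why they do not run into the neighbouring group.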