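I'm new to machine learning and I'm trying to predict the lira exchange rate with Keras. I think the predicted values are correct, but I can't get them to plot properly. The plot looks like this: [image]
This is my code (the CSV file is in German, so here is the translation of the column names: Datum -> Date, Erster -> Open, Hoch -> High, Tief -> Low, Schlusskurs -> Close):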
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM
X_train = []
y_train = []
csv_file = "wkn_A0C32V_historic.csv" #csv file (path)
data = pd.read_csv(csv_file, sep=";") #reading the csv file
data["Erster vorher"] = data["Erster"].shift(-1) #moving the data in Erster(Open) one step backwards
data["Erster"] = data["Erster"].str.replace(",", ".") #replacing all commas with dots in order to calculate with float numbers
data["Erster vorher"] = data["Erster vorher"].str.replace(",", ".") #same here
data["Changes"] = (data["Erster"].astype(float) / data["Erster vorher"].astype(float)) - 1 #calculating the changes
data = data.dropna() #dropping the NaNs
changes = data["Changes"]
#X_train = (number_of_examples, sequence_length, input_dimension)
for i in range(len(changes) - 20):
    X_train.append(np.array(changes[i+1:i+21][::-1]))
    y_train.append(changes[i])
X_train = np.array(X_train).reshape(-1, 20, 1)
y_train = np.array(y_train)
print("X_train shape: " + str(X_train.shape))
print("y_train shape: " + str(y_train.shape))
#Training the data
model = Sequential()
model.add(LSTM(1, input_shape=(20, 1)))
model.compile(optimizer="rmsprop", loss="mse", metrics=["accuracy"])
model.fit(X_train, y_train, batch_size=32, epochs=10)
preds = model.predict(X_train)
preds = preds.reshape(-1)
print("Shape of predictions: " + str(preds.shape))
preds = np.append(preds, np.zeros(20))
data["predictions"] = preds
data["Open_predicted"] = data["Erster vorher"].astype(float) * (1 + data["predictions"].astype(float)) #calculating the new Open with the predicted numbers
print(data)
import matplotlib.pyplot as plt
dates = np.array(data["Datum"]).astype(np.datetime64)
#HERE BEGINS THE PROBLEM...
plt.plot(dates, data["Erster"], label="Erster")
plt.plot(dates, data["Open_predicted"], label="Erster (predicted)")
plt.legend()
plt.show()
Output:
Epoch 9/10
32/3444 [..............................] - ETA: 0s - loss: 9.5072e-05 - accuracy: 0.1250
448/3444 [==>...........................] - ETA: 0s - loss: 1.8344e-04 - accuracy: 0.0513
960/3444 [=======>......................] - ETA: 0s - loss: 1.2734e-04 - accuracy: 0.0583
1472/3444 [===========>..................] - ETA: 0s - loss: 1.0480e-04 - accuracy: 0.0577
1984/3444 [================>.............] - ETA: 0s - loss: 9.7956e-05 - accuracy: 0.0600
2464/3444 [====================>.........] - ETA: 0s - loss: 9.0399e-05 - accuracy: 0.0621
2976/3444 [========================>.....] - ETA: 0s - loss: 8.5287e-05 - accuracy: 0.0649
3444/3444 [==============================] - 0s 122us/step - loss: 8.1555e-05 - accuracy: 0.0633
Epoch 10/10
32/3444 [..............................] - ETA: 0s - loss: 5.5561e-05 - accuracy: 0.0312
544/3444 [===>..........................] - ETA: 0s - loss: 6.1705e-05 - accuracy: 0.0662
1056/3444 [========>.....................] - ETA: 0s - loss: 1.2215e-04 - accuracy: 0.0644
1536/3444 [============>.................] - ETA: 0s - loss: 9.9676e-05 - accuracy: 0.0651
2048/3444 [================>.............] - ETA: 0s - loss: 9.2219e-05 - accuracy: 0.0625
2592/3444 [=====================>........] - ETA: 0s - loss: 8.8050e-05 - accuracy: 0.0625
3104/3444 [==========================>...] - ETA: 0s - loss: 8.1685e-05 - accuracy: 0.0651
3444/3444 [==============================] - 0s 118us/step - loss: 8.1349e-05 - accuracy: 0.0633
Shape of predictions: (3444,)
           Datum  Erster    Hoch  ...    Changes  predictions  Open_predicted
   0  2020-09-04  8.8116  8,8226  ...   0.011816     0.000549        8.713479
   1  2020-09-03  8.7087  8,8263  ...  -0.006457     0.001141        8.775301
   2  2020-09-02  8.7653  8,7751  ...  -0.005051     0.001849        8.826093
   3  2020-09-01  8.8098  8,8377  ...   0.009465     0.001102        8.736818
   4  2020-08-31  8.7272  8,7993  ...   0.000069     0.001149        8.736630
 ...         ...     ...     ...  ...        ...          ...             ...
3459  2009-01-07  2.0449  2,1288  ...  -0.021392     0.000000        2.089600
3460  2009-01-06  2.0896  2,0922  ...  -0.020622     0.000000        2.133600
3461  2009-01-05  2.1336  2,1477  ...   0.002914     0.000000        2.127400
3462  2009-01-04  2.1274  2,1323  ...  -0.005377     0.000000        2.138900
3463  2009-01-02  2.1389  2,1521  ...   0.000000     0.000000        2.138900

[3464 rows x 9 columns]
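Two things stand out in the plot: (1) Erster and Erster (predicted) appear to be on different scales, and (2) the mass of tick labels on the y-axis is reminiscent of what you get when you plot datetimes instead of numbers. I suspect something is getting mixed up somewhere, but it isn't obvious where.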
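My suggestions for troubleshooting: (i) plot Erster against Erster (predicted) to check whether the scales are similar, and (ii) print the output of data.info() to check that the data types are what you expect.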
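As a rough sketch of those two checks, the snippet below uses a small made-up dataframe in place of the real CSV. Only the column names come from the question; the three rows are illustrative. It assumes "Erster" still holds strings, which is what `str.replace` leaves behind when the result is not converted with `astype(float)`.

```python
import pandas as pd

# Hypothetical stand-in for the question's dataframe: "Erster" holds strings,
# "Open_predicted" holds floats. Only the column names come from the question.
data = pd.DataFrame({
    "Datum": ["2020-09-04", "2020-09-03", "2020-09-02"],
    "Erster": ["8.8116", "8.7087", "8.7653"],
    "Open_predicted": [8.713479, 8.775301, 8.826093],
})

# (ii) Inspect the column types. A non-numeric dtype (object or string) on a
# price column means matplotlib would receive text rather than numbers.
data.info()
non_numeric = [
    col for col in ["Erster", "Open_predicted"]
    if not pd.api.types.is_numeric_dtype(data[col])
]
print("Non-numeric columns:", non_numeric)

# Convert any such column before plotting; errors="coerce" turns cells that
# cannot be parsed into NaN instead of raising.
for col in non_numeric:
    data[col] = pd.to_numeric(data[col], errors="coerce")

# (i) Compare the two series numerically to see whether their scales match.
print(data[["Erster", "Open_predicted"]].describe())
```

Once both columns are numeric, `plt.scatter(data["Erster"], data["Open_predicted"])` would give the visual version of check (i).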
Side note: I'd recommend sorting the dataframe so that the dates are in ascending order.
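A minimal sketch of that sort, again on a made-up three-row frame rather than the real file (the column names are from the question, the rows are illustrative):

```python
import pandas as pd

# Hypothetical sample in the same newest-first order as the question's CSV.
data = pd.DataFrame({
    "Datum": ["2020-09-04", "2020-09-03", "2020-09-02"],
    "Erster": [8.8116, 8.7087, 8.7653],
})

# Parse the dates, then sort oldest first and renumber the rows from 0.
data["Datum"] = pd.to_datetime(data["Datum"])
data = data.sort_values("Datum").reset_index(drop=True)
print(data)
```

Note that the question's code relies on the newest-first order: `shift(-1)` fetches the previous day's open only because older rows come later. On a frame sorted this way, the shift direction and the window slicing would presumably need to be reversed.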