Python-使用带有Keras的LSTM递归神经网络进行模式预测

Desta Haileselassie Hagos

我正在处理来自具有三列(time_stamp,X和Y-其中Y为实际值)的格式化CSV数据集的模式预测我想预测基于过去值时间指数从Y X的值,这里是我是如何处理与LSTM回归神经网络的问题在Python与Keras

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.preprocessing.sequence import TimeseriesGenerator
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

np.random.seed(7)

df = pd.read_csv('test32_C_data.csv')
n_features=100

values = df.values

for i in range(0,n_features):
    df['X_t'+str(i)] = df['X'].shift(i) 
    df['X_tp'+str(i)] = (df['X'].shift(i) - df['X'].shift(i+1))/(df['X'].shift(i))

print(df)
pd.set_option('use_inf_as_null', True)

#df.replace([np.inf, -np.inf], np.nan).dropna(axis=1)
df.dropna(inplace=True)

X = df.drop('Y', axis=1)
y = df['Y']


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.40)

X_train = X_train.drop('time', axis=1)
X_train = X_train.drop('X_t1', axis=1)
X_train = X_train.drop('X_t2', axis=1)
X_test = X_test.drop('time', axis=1)
X_test = X_test.drop('X_t1', axis=1)
X_test = X_test.drop('X_t2', axis=1)


sc = MinMaxScaler()

X_train = np.array(df['X'])
X_train = X_train.reshape(-1, 1)
X_train = sc.fit_transform(X_train)

y_train = np.array(df['Y'])
y_train=y_train.reshape(-1, 1)
y_train = sc.fit_transform(y_train)


model_data = TimeseriesGenerator(X_train, y_train, 100, batch_size = 10) 

# Initialising the RNN
model = Sequential() 

# Adding the input layerand the LSTM layer
model.add(LSTM(4, input_shape=(None, 1)))

# Adding the output layer
model.add(Dense(1)) 

# Compiling the RNN
model.compile(loss='mse', optimizer='rmsprop') 

# Fitting the RNN to the Training set
model.fit_generator(model_data)

# evaluate the model
#scores = model.evaluate(X_train, y_train)
#print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))


# Getting the predicted values
predicted = X_test
predicted = sc.transform(predicted)
predicted = predicted.reshape((-1, 1, 1))
y_pred = model.predict(predicted)
y_pred = sc.inverse_transform(y_pred)

当我这样绘制预测时

plt.figure
plt.plot(y_test, color = 'red', label = 'Actual')
plt.plot(y_pred, color = 'blue', label = 'Predicted')
plt.title('Prediction')
plt.xlabel('Time [INdex]')
plt.ylabel('Values')
plt.legend()
plt.show()

以下是我得到的情节。

在此处输入图片说明

但是,如果我们分别绘制每列,

groups = [1, 2]
i = 1
# plot each column
plt.figure()
for group in groups:
    plt.subplot(len(groups), 1, i)
    plt.plot(values[:, group])
    plt.title(df.columns[group], y=0.5, loc='right')
    i += 1
plt.show()

以下是我们得到的图。

在此处输入图片说明

我们如何提高预测准确性?

克里斯·法尔

我会让您从这里拿走,但这至少应该可以帮助您。

注意:我看到您所预测的变量有些困惑。为此,我正在预测通常是标准的“ Y”。如果那是不正确的,只需在放入create_sequences函数之前交换订单即可。该代码仍然可以正常工作,无论如何,这只是一个起点,您需要多花一点时间才能获得一个性能良好的网络。

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense
from sklearn.preprocessing import MinMaxScaler
import pandas as pd

np.random.seed(7)

df = pd.read_csv('test32_C_data.csv')
n_features = 100


def create_sequences(data, window=14, step=1, prediction_distance=14):
    x = []
    y = []

    for i in range(0, len(data) - window - prediction_distance, step):
        x.append(data[i:i + window])
        y.append(data[i + window + prediction_distance][0])

    x, y = np.asarray(x), np.asarray(y)

    return x, y


# Scaling prior to splitting
scaler = MinMaxScaler(feature_range=(0.01, 0.99))
scaled_data = scaler.fit_transform(df.loc[:, ["Y", "X"]].values)

# Build sequences
x_sequence, y_sequence = create_sequences(scaled_data)

# Create test/train split
test_len = int(len(x_sequence) * 0.15)
valid_len = int(len(x_sequence) * 0.15)
train_end = len(x_sequence) - (test_len + valid_len)
x_train, y_train = x_sequence[:train_end], y_sequence[:train_end]
x_valid, y_valid = x_sequence[train_end:train_end + valid_len], y_sequence[train_end:train_end + valid_len]
x_test, y_test = x_sequence[train_end + valid_len:], y_sequence[train_end + valid_len:]

# Initialising the RNN
model = Sequential()

# Adding the input layerand the LSTM layer
model.add(LSTM(4, input_shape=(14, 2)))

# Adding the output layer
model.add(Dense(1))

# Compiling the RNN
model.compile(loss='mse', optimizer='rmsprop')

# Fitting the RNN to the Training set
model.fit(x_train, y_train, epochs=5)

# Getting the predicted values
y_pred = model.predict(x_test)

# Plot results
pd.DataFrame({"y_test": y_test, "y_pred": np.squeeze(y_pred)}).plot()

差异:

  • 使用自定义序列生成器,14步窗口,14步“前瞻”进行预测

  • 自定义训练/测试/有效拆分,您可以在训练中使用验证集来提前停止

  • 将输入形状更改为包括2个要素,且其窗口为14 input_shape =(14,2)

  • 5个纪元

预测结果

本文收集自互联网,转载请注明来源。

如有侵权,请联系 [email protected] 删除。

编辑于
0

我来说两句

0 条评论
登录 后参与评论

相关文章