Implementing an LSTM regression model with TensorFlow

Praveen

I am trying to implement a TensorFlow LSTM regression model for a list of input numbers. Example:

 input_data = [1, 2, 3, 4, 5]
 time_steps = 2
    -> X == [[1, 2], [2, 3], [3, 4]]
    -> y == [3, 4, 5]
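
For context, a minimal sketch of how such windows could be built with NumPy (the helper name make_windows is mine and only illustrates the idea; it is not the actual load_data() used below):

import numpy as np

def make_windows(data, time_steps):
    # Slide a window of length `time_steps` over the sequence;
    # the value right after each window is the regression target.
    X, y = [], []
    for i in range(len(data) - time_steps):
        X.append(data[i:i + time_steps])
        y.append(data[i + time_steps])
    return np.array(X), np.array(y)

# make_windows([1, 2, 3, 4, 5], 2)
# -> X == [[1, 2], [2, 3], [3, 4]], y == [3, 4, 5]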

The code is as follows:

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

TIMESTEPS = 20
num_hidden = 20

Xd, yd = load_data()

train_input = Xd['train']
train_input = train_input.reshape(-1,20,1)
train_output = yd['train']

# train_input = [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],..
# train_output  = [[21],[22],[23]....

test_input = Xd['test']
test_output = yd['test']

X = tf.placeholder(tf.float32, [None, 20, 1])
y = tf.placeholder(tf.float32, [None, 1])

cell = tf.nn.rnn_cell.LSTMCell(num_hidden, state_is_tuple=True)

val, state = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
val = tf.Print(val, [tf.argmax(val,1)], 'argmax(val)=' , summarize=20, first_n=7)

val = tf.transpose(val, [1, 0, 2])
val = tf.Print(val, [tf.argmax(val,1)], 'argmax(val2)=' , summarize=20, first_n=7)

# Take only the last output after 20 time steps
last = tf.gather(val, int(val.get_shape()[0]) - 1)
last = tf.Print(last, [tf.argmax(last,1)], 'argmax(val3)=' , summarize=20, first_n=7)

# define variables for weights and bias
weight = tf.Variable(tf.truncated_normal([num_hidden, int(y.get_shape()[1])]))
bias = tf.Variable(tf.constant(0.1, shape=[y.get_shape()[1]]))

# Prediction is matmul of the last output and weight, plus bias
prediction = tf.matmul(last, weight) + bias

# Cost function using softmax
# y is the true distribution and prediction is the predicted
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(prediction), reduction_indices=[1]))
#cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))

optimizer = tf.train.AdamOptimizer()
minimize = optimizer.minimize(cost)

from tensorflow.python import debug as tf_debug
inita = tf.initialize_all_variables()
sess = tf.Session()
sess.run(inita)

batch_size = 100
no_of_batches = int(len(train_input)/batch_size)
epoch = 10
test_size = 100
for i in range(epoch):
    for start, end in zip(range(0, len(train_input), batch_size), range(batch_size, len(train_input)+1, batch_size)):
        sess.run(minimize, feed_dict={X: train_input[start:end], y: train_output[start:end]})

    test_indices = np.arange(len(test_input))  # Get A Test Batch
    np.random.shuffle(test_indices)
    test_indices = test_indices[0:test_size]
    print (i, mean_squared_error(np.argmax(test_output[test_indices], axis=1), sess.run(prediction, feed_dict={X: test_input[test_indices]})))

print ("predictions", prediction.eval(feed_dict={X: train_input}, session=sess))
y_pred = prediction.eval(feed_dict={X: test_input}, session=sess)
sess.close()
test_size = test_output.shape[0]
ax = np.arange(0, test_size, 1)
plt.plot(ax, test_output, 'r', ax, y_pred, 'b')
plt.show()

However, I cannot get the cost to go down; the computed MSE increases at every step instead of decreasing. I suspect the problem lies in the cost function I am using.

Any ideas or suggestions on what I am doing wrong?

Thanks

Anthony Damato

As mentioned in the comments, you have to change the loss function to an MSE function and lower the learning rate. Does your error converge to zero?
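
A minimal sketch of what that change could look like, keeping the rest of the graph above unchanged (the learning rate of 1e-4 is only an illustrative value, not something given in the original post):

# Mean squared error between the regression output and the true targets
cost = tf.reduce_mean(tf.square(prediction - y))

# Pass an explicit, lower learning rate instead of Adam's default
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)
minimize = optimizer.minimize(cost)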
