How do I accumulate gradients in TensorFlow?

Hello Lili

I have a question similar to this one.

Because my resources are limited and I am training a triplet network with a deep model (VGG-16), I want to accumulate the gradients from 128 training examples with a batch size of one, and then propagate the error and update the weights.

It's not clear to me how to do this. I use TensorFlow, but any implementation/pseudocode is welcome.

Pop

Let's walk through the code proposed in one of the answers you linked to:

import tensorflow as tf

## Optimizer definition - nothing different from any classical example
opt = tf.train.AdamOptimizer()

## Retrieve all trainable variables you defined in your graph
tvs = tf.trainable_variables()
## Create a list of non-trainable variables with the same shapes as the
## trainable ones, initialized with zeros: these hold the running gradient sums
accum_vars = [tf.Variable(tf.zeros_like(tv.initialized_value()), trainable=False) for tv in tvs]
## Ops that reset every accumulator back to zero
zero_ops = [av.assign(tf.zeros_like(av)) for av in accum_vars]

## Call the optimizer's compute_gradients function to obtain the list of
## (gradient, variable) pairs; `rmse` is the loss tensor defined elsewhere
gvs = opt.compute_gradients(rmse, tvs)

## Add each gradient to its accumulator (works because accum_vars and gvs
## are in the same order)
accum_ops = [accum_vars[i].assign_add(gv[0]) for i, gv in enumerate(gvs)]

## Define the training step (the part that updates the variable values),
## applying the accumulated gradients instead of the single-batch ones
train_step = opt.apply_gradients([(accum_vars[i], gv[1]) for i, gv in enumerate(gvs)])

The first part basically adds new variables and ops to the graph, which let you

  1. accumulate the gradients in the list of variables accum_vars with the ops accum_ops
  2. update the model weights with the op train_step
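
One detail worth noting: accum_ops sums the raw gradients, so after a full cycle the accumulators hold a value n_minibatches times larger than a typical single-batch gradient. If you want the update to approximate the gradient of one large batch, you can average before applying. A minimal variant of the train_step above, assuming n_minibatches is the same constant used in the training loop below:

## Optional: apply the averaged gradient instead of the summed one
train_step = opt.apply_gradients(
    [(accum_vars[i] / n_minibatches, gv[1]) for i, gv in enumerate(gvs)])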

Then, to use it when training, you have to follow these steps (still from the answer you linked):

## The while loop for training
while ...:
    # Run zero_ops to reset the accumulators
    sess.run(zero_ops)
    # Accumulate the gradients 'n_minibatches' times in accum_vars using accum_ops
    for i in range(n_minibatches):
        sess.run(accum_ops, feed_dict={X: Xs[i], y: ys[i]})
    # Run the train_step op to update the weights based on your accumulated gradients
    sess.run(train_step)
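
To make the pattern concrete end to end, here is a minimal, self-contained sketch under assumed names: a toy one-layer model with placeholders X and y and a squared-error loss standing in for `rmse`. In your case the model would be the VGG-16 triplet network and n_minibatches would be 128 with batch size one.

import numpy as np
import tensorflow as tf  # written against the 1.x API, like the answer above

# Toy model: a single dense layer (stand-in for VGG-16)
X = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])
pred = tf.layers.dense(X, 1)
loss = tf.reduce_mean(tf.square(pred - y))  # stand-in for the triplet loss

opt = tf.train.AdamOptimizer()
tvs = tf.trainable_variables()
accum_vars = [tf.Variable(tf.zeros_like(tv.initialized_value()), trainable=False)
              for tv in tvs]
zero_ops = [av.assign(tf.zeros_like(av)) for av in accum_vars]
gvs = opt.compute_gradients(loss, tvs)
accum_ops = [accum_vars[i].assign_add(gv[0]) for i, gv in enumerate(gvs)]
train_step = opt.apply_gradients(
    [(accum_vars[i], gv[1]) for i, gv in enumerate(gvs)])

n_minibatches = 128  # accumulate 128 examples, each with batch size one
Xs = np.random.randn(n_minibatches, 1, 4).astype(np.float32)  # dummy data
ys = np.random.randn(n_minibatches, 1, 1).astype(np.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(zero_ops)
    for i in range(n_minibatches):
        sess.run(accum_ops, feed_dict={X: Xs[i], y: ys[i]})
    sess.run(train_step)  # one weight update from 128 accumulated gradients

On TensorFlow 2.x the same idea is usually expressed with tf.GradientTape: compute tape.gradient(loss, variables) for each mini-batch, add the results into a list of accumulator tensors, and call optimizer.apply_gradients once per accumulation cycle.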
