Fix part of a variable during eager execution training in TensorFlow

Hooked

Is there a way to only update some of the variables during an eager execution update step? Consider this minimal working example:

import tensorflow as tf
tf.enable_eager_execution()
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

x = tf.Variable([1.0, 2.0])

def train(x):
    with tf.GradientTape() as tape:
        loss = x[0]**2 + x[1]**2 + 1/(x[0]+x[1])
        variables = [x]
        grads = tape.gradient(loss, variables)
        optimizer.apply_gradients(zip(grads, variables))

for _ in range(2000):
    train(x)
    print(x.numpy())

This converges to [0.5, 0.5]. I'd like to fix the value of x[0] to its initial value while keeping everything else the way it is. What I've tried so far:

  • Adding an x[0].assign(1.0) operation to the training step, which grows the graph unnecessarily
  • Changing to variables = [x[:-1]], which gives ValueError: No gradients provided for any variable: ['tf.Tensor([1.], shape=(1,), dtype=float32)']
  • Adding grads = [grads[0][1:]], which gives tensorflow.python.framework.errors_impl.InvalidArgumentError: var and delta do not have the same shape[2] [1] [Op:ResourceApplyGradientDescent]
  • Doing both, which gives TypeError: 'NoneType' object is not subscriptable

For this MWE I can easily use two separate variables, but I'm interested in the generic case where I only want to update a known slice of an array.

gorjan

You can set the gradient of the indices you don't want to update to 0. In the code snippet below, the mask tensor indicates which elements we want to update (value 1) and which elements we want to keep fixed (value 0).

import tensorflow as tf
tf.enable_eager_execution()

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

x = tf.Variable([1.0, 2.0])
mask = tf.constant([0.0, 1.0])

def train(x):
    with tf.GradientTape() as tape:
        loss = x[0]**2 + x[1]**2 + 1/(x[0]+x[1])
    variables = [x]
    # zero the gradient entries for the indices we want to keep fixed
    grads = [g * mask for g in tape.gradient(loss, variables)]
    optimizer.apply_gradients(zip(grads, variables))

for _ in range(100):
    train(x)
    print(x.numpy())
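
For the generic case in the question, where a known slice of a larger variable should stay fixed, the mask does not have to be written out by hand. A minimal sketch of building it programmatically (the slice_mask helper here is hypothetical, not a TensorFlow API):

import tensorflow as tf
tf.enable_eager_execution()

# Hypothetical helper: mask for a 1-D variable of length n in which the
# indices in `frozen` keep their values (their gradient entries become 0).
def slice_mask(n, frozen):
    return tf.constant([0.0 if i in frozen else 1.0 for i in range(n)])

mask = slice_mask(4, frozen={0, 1})
print(mask.numpy())  # [0. 0. 1. 1.]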

Another possible solution is to stop the gradient from flowing through x[0] with tf.stop_gradient. For example:

import tensorflow as tf
tf.enable_eager_execution()

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

x = tf.Variable([1.0, 2.0])

def train(x):
    with tf.GradientTape() as tape:
        # tf.stop_gradient blocks backpropagation through x[0], so its
        # gradient entry is zero and its value never changes
        loss = tf.stop_gradient(x[0])**2 + x[1]**2 + 1/(tf.stop_gradient(x[0])+x[1])
    variables = [x]
    grads = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(grads, variables))

for _ in range(100):
    train(x)
    print(x.numpy())
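
For completeness, the split-into-two-variables approach the question mentions also generalizes to a known slice if the fixed part is held as a constant and the full vector is reassembled with tf.concat inside the tape. A minimal sketch under that assumption:

import tensorflow as tf
tf.enable_eager_execution()

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

x_fixed = tf.constant([1.0])   # frozen slice, never updated
x_free = tf.Variable([2.0])    # trainable slice

def train():
    with tf.GradientTape() as tape:
        # rebuild the full vector so the loss reads exactly as before
        x = tf.concat([x_fixed, x_free], axis=0)
        loss = x[0]**2 + x[1]**2 + 1/(x[0] + x[1])
    grads = tape.gradient(loss, [x_free])
    optimizer.apply_gradients(zip(grads, [x_free]))

for _ in range(100):
    train()
    print(x_free.numpy())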
