tf.GradientTape giving None gradient while writing custom training loop

Al Shahreyaj

I'm trying to write a custom training loop. Here is a sample of what I'm trying to do: I have two trainable parameters, and one parameter is used to update the other. See the code below:

import tensorflow as tf

x1 = tf.Variable(1.0, dtype=tf.float32)
x2 = tf.Variable(1.0, dtype=tf.float32)

with tf.GradientTape() as tape:
    n = x2 + 4        # n depends on x2
    x1.assign(n)      # update x1 from x2
    x = x1 + 1
    y = x**2
    val = tape.gradient(y, [x1, x2])
    for v in val:
        print(v)

and the output is

tf.Tensor(12.0, shape=(), dtype=float32)
None

It seems like GradientTape is not watching the second parameter (x2). Both parameters are tf.Variables, so GradientTape should watch both of them. I also tried tape.watch(x2), which did not help either. Am I missing something?

AloneTogether

Check the docs regarding a gradient of None. The gradient with respect to x1 comes through fine: x1 is a trainable tf.Variable, so the tape watches it automatically (the explicit tape.watch(x) below is redundant, but harmless):

x1 = tf.Variable(1.0, dtype=tf.float32)
x2 = tf.Variable(1.0, dtype=tf.float32)

with tf.GradientTape() as tape:
    n = x2 + 4
    x1.assign(n)
    x = x1 + 1
    tape.watch(x)   # redundant: x already derives from the watched variable x1
    y = x**2

dv0, dv1 = tape.gradient(y, [x1, x2])
print(dv0)
print(dv1)
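
This should print 12.0 for x1 and None for x2: after the assign, x1 holds 5.0, so x = 6.0 and dy/dx1 = 2 * 6 = 12.

tf.Tensor(12.0, shape=(), dtype=float32)
None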

However, regarding x2, the output y is not connected to x2 at all: x1.assign(n) is a stateful update that the tape does not record, which is why that gradient is None. This is consistent with the docs:

State stops gradients. When you read from a stateful object, the tape can only observe the current state, not the history that led to it.

A tf.Tensor is immutable. You can't change a tensor once it's created. It has a value, but no state. All the operations discussed so far are also stateless: the output of a tf.matmul only depends on its inputs.

A tf.Variable has internal state—its value. When you use the variable, the state is read. It's normal to calculate a gradient with respect to a variable, but the variable's state blocks gradient calculations from going farther back.
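
The same mechanism in isolation (a minimal sketch; the variables a and b are hypothetical, not from the question):

import tensorflow as tf

a = tf.Variable(3.0)
b = tf.Variable(0.0)

with tf.GradientTape() as tape:
    b.assign(a * 2.0)   # stateful write: the tape does not connect b back to a
    y = b ** 2          # reads b's current state (6.0)

da, db = tape.gradient(y, [a, b])
print(da)  # None -- the path from a to y is severed by assign()
print(db)  # tf.Tensor(12.0, ...), since dy/db = 2 * b = 12.0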

If, for example, you replace the assign with a plain Python rebinding, like this:

x1 = tf.Variable(1.0, dtype=tf.float32)
x2 = tf.Variable(1.0, dtype=tf.float32)

with tf.GradientTape() as tape:
    n = x2 + 4
    x1 = n          # rebinds the Python name x1 to the tensor n; no stateful assign
    x = x1 + 1
    tape.watch(x)   # again redundant: x already derives from the watched x2
    y = x**2

dv0, dv1 = tape.gradient(y, [x1, x2])

It should work: since x1 is now just a tensor derived from x2, the tape can trace all the way back, and both gradients come out as tf.Tensor(12.0, shape=(), dtype=float32).
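
More generally, in a custom training loop the usual pattern is to keep only the differentiable computation inside the tape and to perform stateful updates (assign, assign_sub, or an optimizer step) after taking the gradient. A minimal sketch, where the quadratic loss and the learning rate are placeholder choices:

import tensorflow as tf

w = tf.Variable(5.0)
lr = 0.1

for step in range(3):
    with tf.GradientTape() as tape:
        loss = (w - 2.0) ** 2       # only differentiable work inside the tape
    grad = tape.gradient(loss, w)   # gradient is taken before any state changes
    w.assign_sub(lr * grad)         # stateful update happens outside the tape
    print(step, float(loss), float(w))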

