I noticed that TensorFlow's automatic differentiation does not give the same values as finite differences when the loss function converts its input to a numpy array to compute the output value. Here is a minimal working example of the problem:
import tensorflow as tf
import numpy as np
def lossFn(inputTensor):
    # Input is a rank-2 square tensor
    return tf.linalg.trace(inputTensor @ inputTensor)

def lossFnWithNumpy(inputTensor):
    # Same function, but converts the input to a numpy array before taking the trace
    inputArray = inputTensor.numpy()
    return tf.linalg.trace(inputArray @ inputArray)
N = 2
tf.random.set_seed(0)
randomTensor = tf.random.uniform([N, N])

# Prove that the two functions give the same output; evaluates to exactly zero
print(lossFn(randomTensor) - lossFnWithNumpy(randomTensor))

theoretical, numerical = tf.test.compute_gradient(lossFn, [randomTensor])
# These two values match
print(theoretical[0])
print(numerical[0])

theoretical, numerical = tf.test.compute_gradient(lossFnWithNumpy, [randomTensor])
# The theoretical value is [0 0 0 0]
print(theoretical[0])
print(numerical[0])
The function tf.test.compute_gradient computes the "theoretical" gradient using automatic differentiation and the numerical gradient using finite differences. As the code shows, if .numpy() is used inside the loss function, automatic differentiation does not compute the gradient.
Can someone explain why?
From the guide Introduction to gradients and automatic differentiation:
The tape can't record the gradient path if the calculation exits TensorFlow. For example:
x = tf.Variable([[1.0, 2.0], [3.0, 4.0]], dtype=tf.float32)

with tf.GradientTape() as tape:
    x2 = x**2

    # This step is calculated with NumPy
    y = np.mean(x2, axis=0)

    # Like most ops, reduce_mean will cast the NumPy array to a constant tensor
    # using `tf.convert_to_tensor`.
    y = tf.reduce_mean(y, axis=0)

print(tape.gradient(y, x))
输出
None
The numpy value is passed back into the call to tf.linalg.trace as a constant tensor, so TensorFlow cannot compute gradients with respect to it.