如何在pytorch中定义依赖于模型wrt输入的偏导数的损失函数？

在遵循论文《神经常微分方程》和使用JAX库的博客上阅读了有关如何使用神经网络求解ODE的知识后，我尝试使用“普通” Pytorch做相同的事情，但发现一个点相当“晦涩”：如何正确处理使用输入参数之一的函数（在这种情况下为模型）的偏导数。

为了恢复如图2所示的当前问题，打算在条件-2（= x <= 2）中以条件y（x = 0）= 1求解ODE y'= -2 * x * y。有限差分法的解决方案是用具有10个节点的单层的NN（y（x）= NN（x））代替。

我设法（或多或少）使用以下代码复制博客

import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np 

# Define the NN model to solve the problem
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)

    def forward(self, x):
        x = torch.sigmoid(self.lin1(x))
        x = torch.sigmoid(self.lin2(x))
        return x

model = Model()

# Define loss_function from the Ordinary differential equation to solve
def ODE(x,y):
    dydx, = torch.autograd.grad(y, x, 
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True)

    eq = dydx + 2.* x * y # y' = - 2x*y
    ic = model(torch.tensor([0.])) - 1.    # y(x=0) = 1
    return torch.mean(eq**2) + ic**2

loss_func = ODE

# Define the optimization
# opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.99,nesterov=True) # Equivalent to blog
opt = optim.Adam(model.parameters(),lr=0.1,amsgrad=True) # Got faster convergence with Adam using amsgrad

# Define reference grid 
x_data = torch.linspace(-2.0,2.0,401,requires_grad=True)
x_data = x_data.view(401,1) # reshaping the tensor

# Iterative learning
epochs = 1000
for epoch in range(epochs):
    opt.zero_grad()
    y_trial = model(x_data)
    loss = loss_func(x_data, y_trial)

    loss.backward()
    opt.step()

    if epoch % 100 == 0:
        print('epoch {}, loss {}'.format(epoch, loss.item()))

# Plot Results
plt.plot(x_data.data.numpy(), np.exp(-x_data.data.numpy()**2), label='exact')
plt.plot(x_data.data.numpy(), y_data.data.numpy(), label='approx')
plt.legend()
plt.show()

从这里，我设法得到如图所示的结果。在此处输入图片说明

问题在于，在ODE函数的定义中，我宁愿传递类似（x，fun）（其中的乐趣是我的模型）的东西，而不传递（x，y），以便对函数的偏导数和特定求值。可以通过调用完成模型。所以，像

def ODE(x,fun):
    dydx, = "grad of fun w.r.t x as a function"

    eq = dydx(x) + 2.* x * fun(x)        # y' = - 2x*y
    ic = fun( torch.tensor([0.]) ) - 1.  # y(x=0) = 1
    return torch.mean(eq**2) + ic**2

有任何想法吗？提前致谢

编辑：

经过一些试验后，我找到了一种将模型作为输入传递的方法，但发现了另一个奇怪的行为……新问题是用BC y（x = -2）= -1来求解ODE y''= -2。 y（x = 2）= 1，其解析解为y（x）= -x ^ 2 + x / 2 + 4

让我们将之前的代码修改为：

import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np 

# Define the NN model to solve the equation
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)

    def forward(self, x):
        y = torch.sigmoid(self.lin1(x))
        z = torch.sigmoid(self.lin2(y))
        return z

model = Model()

# Define loss_function from the Ordinary differential equation to solve
def ODE(x,fun):
    y = fun(x)

    dydx = torch.autograd.grad(y, x, 
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True)[0]

    d2ydx2 = torch.autograd.grad(dydx, x, 
    grad_outputs=dydx.data.new(dydx.shape).fill_(1),
    create_graph=True, retain_graph=True)[0]

    eq  = d2ydx2 + torch.tensor([ 2.])             # y'' = - 2
    bc1 =  fun(torch.tensor([-2.])) - torch.tensor([-1.]) # y(x=-2) = -1
    bc2 =  fun(torch.tensor([ 2.])) - torch.tensor([ 1.]) # y(x= 2) =  1
    return torch.mean(eq**2) + bc1**2 + bc2**2

loss_func = ODE

因此，在这里，我将模型作为参数传递，并成功导出了两次……到目前为止，一切都很好。但是，在这种情况下使用S形函数不仅不是必需的，而且所得到的结果与分析结果相去甚远。

如果我将NN更改为：

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,1)
        self.lin2 = nn.Linear(1,1)

    def forward(self, x):
        y = self.lin1(x)
        z = self.lin2(y)
        return z

在这种情况下，我希望优化通过两个线性函数的两次通过，从而检索一个二阶函数。

RuntimeError：差异的张量之一似乎未在图中使用。如果这是所需的行为，则设置allow_unused = True。

将选项添加到dydx的定义不能解决问题，将其添加到d2ydx2可以提供NoneType定义。

图层是否存在问题？

乌曼·古普塔

快速解决方案：

添加allow_unused=True到.grad功能。所以，改变

dydx = torch.autograd.grad(
    y, x,
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True)[0]

d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
    dydx.shape).fill_(1), create_graph=True, retain_graph=True)[0]

至

dydx = torch.autograd.grad(
    y, x,
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True, allow_unused=True)[0]

d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
    dydx.shape).fill_(1), create_graph=True, retain_graph=True, allow_unused=True)[0]

更多说明：

看看该怎么allow_unused做：

allow_unused (bool, optional): If ``False``, specifying inputs that were not
        used when computing outputs (and therefore their grad is always zero)
        is an error. Defaults to ``False``.

因此，如果您尝试将wrt区分为一个不用于计算值的变量，它将产生错误。另外，请注意，仅在使用线性图层时才会发生错误。

这是因为当您使用线性图层时，您拥有y=W1*W2*x + b = Wx+b而dy/dx不是的函数x，它只是W。因此，当您尝试区分dy/dxwrt时x会引发错误。使用Sigmoid后，此错误就会消失，因为这dy/dx将是的函数x。为避免该错误，请确保dy/dx是x的函数或使用allow_unused=True

本文收集自互联网，转载请注明来源。

如有侵权，请联系 [email protected] 删除。

编辑于 2021-01-21

我来说两句

0 条评论

登录后参与评论

上一篇：创建IAM角色并将策略附加到该角色时出错

TOP 榜单

文章

如何在pytorch中定义依赖于模型wrt输入的偏导数的损失函数？

如何在pytorch中定义依赖于模型wrt输入的偏导数的损失函数？

蓝屏死机没有修复解决方案

计算数据帧中每行的NA

UITableView的项目向下滚动后更改颜色，然后快速备份

Node.js中未捕获的异常错误，发生调用

在 Python 2.7 中。如何从文件中读取特定文本并分配给变量

Linux的官方Adobe Flash存储库是否已过时？

验证REST API参数

ggplot：对齐多个分面图-所有大小不同的分面

Mac OS X更新后的GRUB 2问题

通过 Git 在运行 Jenkins 作业时获取 ClassNotFoundException

带有错误“ where”条件的查询如何返回结果？

用日期数据透视表和日期顺序查询

VB.net将2条特定行导出到DataGridView

如何从视图一次更新多行（ASP.NET - Core）

Java Eclipse中的错误13，如何解决？

尝试反复更改屏幕上按钮的位置 - kotlin android studio

离子动态工具栏背景色

应用发明者仅从列表中选择一个随机项一次

当我尝试下载 StanfordNLP en 模型时，出现错误

python中的boto3文件上传

在同一Pushwoosh应用程序上Pushwoosh多个捆绑ID