Momentum in neural networks

Jorgen

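Neural networks and momentum

Should the momentum factor be tied to [both the dataset instance and the individual weight] or to [the weight only]? For example: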

def get_momentum(instance, weight):
    ...  # returns a float: the momentum factor to apply

instance1 = ...  # vector, 1 x n
instance2 = ...  # vector, 1 x n
weights   = ...  # vector, 1 x n

# Option 1: momentum depends on both the instance and the weight
get_momentum(instance1, weights[0])  # e.g. returns 0.1
get_momentum(instance2, weights[0])  # e.g. returns 0.3 <-- same weight, different momentum

# Option 2: momentum depends on the weight only
get_momentum(instance1, weights[0])  # e.g. returns 0.1
get_momentum(instance2, weights[0])  # e.g. returns 0.1 <-- same weight, same momentum

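The second option has lower memory complexity. I believe it would also make the learning algorithm more prone to getting stuck in a local optimum than the first option. Option 1 should give a stronger momentum "pull".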
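To make the difference between the two options concrete, here is a minimal sketch of one possible reading: the momentum term reuses the previous weight update, and the only thing that changes is whether that previous update is stored per (instance, weight) pair or per weight. The names make_state and momentum_step, the learning rate lr and the momentum factor mu are all assumptions made for illustration; they are not taken from the backprop.py used in the tests.

```python
# Hypothetical sketch: function names, lr and mu are illustrative assumptions.

def make_state(n_instances, n_weights, per_instance):
    """Allocate the buffer of previous updates used by the momentum term."""
    if per_instance:
        # Option 1: one previous update per (instance, weight) pair
        return [[0.0] * n_weights for _ in range(n_instances)]
    # Option 2: one previous update per weight, shared by all instances
    return [0.0] * n_weights


def momentum_step(weights, grads, state, instance_idx, lr=0.1, mu=0.9):
    """Apply one in-place weight update for a single training instance."""
    # Use the row for this instance (option 1) or the shared buffer (option 2).
    prev = state[instance_idx] if isinstance(state[0], list) else state
    for j, g in enumerate(grads):
        delta = -lr * g + mu * prev[j]  # gradient step plus momentum term
        weights[j] += delta
        prev[j] = delta  # remember this update for the next momentum term
    return weights
```

With per_instance=True the buffer holds n_instances * n_weights values, and each instance only ever sees its own previous update. With per_instance=False it holds n_weights values, and every instance reads and overwrites the same buffer.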

Jorgen

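Tested

I have run some tests of my hypothesis. The two approaches perform almost identically, but the first approach gives a small, consistent improvement.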

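Memory complexity of the momentum data structure:

Approach 1: O( instances * weights )
Approach 2: O( weights )

Results:

Each round uses a predefined set of initial weights. Both versions were trained with the same sets of weights.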

$ pypy backprop.py # First approach
Round: 1/10     Required epochs: 40995
Round: 2/10     Required epochs: 40997
Round: 3/10     Required epochs: 40996
Round: 4/10     Required epochs: 40997
Round: 5/10     Required epochs: 40997
Round: 6/10     Required epochs: 40997
Round: 7/10     Required epochs: 40999
Round: 8/10     Required epochs: 40996
Round: 9/10     Required epochs: 40996
Round: 10/10    Required epochs: 40997

$ pypy backprop.py # Second approach
Round: 1/10     Required epochs: 41070
Round: 2/10     Required epochs: 41072
Round: 3/10     Required epochs: 41069
Round: 4/10     Required epochs: 41069
Round: 5/10     Required epochs: 41070
Round: 6/10     Required epochs: 41071
Round: 7/10     Required epochs: 41072
Round: 8/10     Required epochs: 41069
Round: 9/10     Required epochs: 41070
Round: 10/10    Required epochs: 41071

As the tests show, the second approach (the one with lower memory complexity) needs more training epochs before it reaches the required accuracy.
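To put a number on the gap, the epoch counts from the two runs above can be averaged. This is just arithmetic on the logged values; the variable names are my own.

```python
from statistics import mean

# "Required epochs" copied from the two console logs above
first = [40995, 40997, 40996, 40997, 40997, 40997, 40999, 40996, 40996, 40997]
second = [41070, 41072, 41069, 41069, 41070, 41071, 41072, 41069, 41070, 41071]

gap = mean(second) - mean(first)  # extra epochs needed by the second approach
relative = gap / mean(first)      # the same gap as a fraction of the first mean

print(f"first: {mean(first):.1f}, second: {mean(second):.1f}")
print(f"gap: {gap:.1f} epochs ({relative:.2%})")
```

This works out to a mean of about 40996.7 epochs for the first approach and 41070.3 for the second, a gap of roughly 73.6 epochs (about 0.18%).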