Main question: I define the same model in two different ways. Why do I get different results? They seem to be the same model.
Secondary question (answered below): If I run the code again, I get different results again. I have set the seed at the beginning to fix the randomness. Why is that happening?
import numpy as np
np.random.seed(1)
from keras.models import Model, Sequential
from keras.layers import Input, Dense
model1 = Sequential([
    Dense(20, activation='sigmoid', kernel_initializer='glorot_normal',
          input_shape=(2,)),
    Dense(2, activation='linear', kernel_initializer='glorot_normal'),
])
model1.compile(optimizer='adam', loss='mean_squared_error')
ipt = Input(shape=(2,))
x = Dense(20, activation='sigmoid', kernel_initializer='glorot_normal')(ipt)
out = Dense(2, activation='linear', kernel_initializer='glorot_normal')(x)
model2 = Model(ipt, out)
model2.compile(optimizer='adam', loss='mean_squared_error')
x_train = np.array([[1, 2], [3, 4], [3, 4]])
model1.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)
model2.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)
The first time, the output is:
2/2 [==============================] - 0s 68ms/step - loss: 14.4394 - val_loss: 21.5747
Epoch 2/2
2/2 [==============================] - 0s 502us/step - loss: 14.3199 - val_loss: 21.4163
Train on 2 samples, validate on 1 samples
Epoch 1/2
2/2 [==============================] - 0s 72ms/step - loss: 11.0523 - val_loss: 17.7059
Epoch 2/2
2/2 [==============================] - 0s 491us/step - loss: 10.9833 - val_loss: 17.5785
The second time, the output is:
2/2 [==============================] - 0s 80ms/step - loss: 14.4394 - val_loss: 21.5747
Epoch 2/2
2/2 [==============================] - 0s 501us/step - loss: 14.3199 - val_loss: 21.4163
Train on 2 samples, validate on 1 samples
Epoch 1/2
2/2 [==============================] - 0s 72ms/step - loss: 11.0523 - val_loss: 17.6733
Epoch 2/2
2/2 [==============================] - 0s 485us/step - loss: 10.9597 - val_loss: 17.5459
Update after reading the answer: The answer below resolved my secondary question. I changed the beginning of my code to:
import numpy as np
np.random.seed(1)
import random
random.seed(2)
import tensorflow as tf
tf.set_random_seed(3)
Now I am getting the same numbers across runs, so it is stable. But my main question remains unanswered: why do the two seemingly equivalent models give different results each time?
Here is the result I get every time:
results 1:
Epoch 1/2
2/2 [==============================] - 0s 66ms/sample - loss: 11.9794 - val_loss: 18.9925
Epoch 2/2
2/2 [==============================] - 0s 268us/sample - loss: 11.8813 - val_loss: 18.8572
results 2:
Epoch 1/2
2/2 [==============================] - 0s 67ms/sample - loss: 5.4743 - val_loss: 9.3471
Epoch 2/2
2/2 [==============================] - 0s 3ms/sample - loss: 5.4108 - val_loss: 9.2497
The problem's rooted in the expected vs. actual behavior of model definition and randomness. To see what's going on, we must understand how an "RNG" works:
- When the RNG function, e.g. RNG(), is called, it returns a "random" value and increments its internal counter by 1. Call this counter n - then: random_value = RNG(n)
- When you set a seed, you set n according to the value of that seed (but not to that seed itself); we can represent this offset via a + c in the counter
- c is a constant produced by a non-linear, but deterministic, function of the seed: f(seed)
import numpy as np
np.random.seed(4) # internal counter = 0 + c
print(np.random.random()) # internal counter = 1 + c
print(np.random.random()) # internal counter = 2 + c
print(np.random.random()) # internal counter = 3 + c
np.random.seed(4) # internal counter = 0 + c
print(np.random.random()) # internal counter = 1 + c
print(np.random.random()) # internal counter = 2 + c
print(np.random.random()) # internal counter = 3 + c
0.9670298390136767
0.5472322491757223
0.9726843599648843
0.9670298390136767
0.5472322491757223
0.9726843599648843
Suppose model1 has 100 weights, and you set a seed (n = 0 + c). After model1 is built, your counter is at 100 + c. If you don't reset the seed, even if you build model2 with the exact same code, the models will differ, as model2's weights are initialized per n from 100 + c to 200 + c.
There are three seeds to set for better reproducibility:
import numpy as np
np.random.seed(1) # for Numpy ops
import random
random.seed(2) # for Python ops
import tensorflow as tf
tf.set_random_seed(3) # for tensorflow ops - e.g. Dropout masks
This'll give pretty good reproducibility, but not perfect if you're using a GPU, due to the parallelism of operations; this video explains it well. For even better reproducibility, set your PYTHONHASHSEED - that and other info is in the official Keras FAQ.
"Perfect" reproducibility is rather redundant, as your results should agree within 0.1% the majority of the time - but if you really need it, likely the only way currently is to switch to CPU and stop using CUDA - but that'll slow down training tremendously (by 10x or more).
Sources of randomness:
- Weight initializations (every default Keras initializer uses a seed)
- Noise ops (dropout, adding Gaussian noise, etc.)
- Hashing for string-based ops (controlled via PYTHONHASHSEED)
- GPU parallelism (see the linked video)
Model randomness demo:
import numpy as np
np.random.seed(4)
model1_init_weights = [np.random.random(), np.random.random(), np.random.random()]
model2_init_weights = [np.random.random(), np.random.random(), np.random.random()]
print("model1_init_weights:", model1_init_weights)
print("model2_init_weights:", model2_init_weights)
model1_init_weights: [0.9670298390136767, 0.5472322491757223, 0.9726843599648843]
model2_init_weights: [0.7148159936743647, 0.6977288245972708, 0.21608949558037638]
Restart kernel. Now run this:
import numpy as np
np.random.seed(4)
model2_init_weights = [np.random.random(), np.random.random(), np.random.random()]
model1_init_weights = [np.random.random(), np.random.random(), np.random.random()]
print("model1_init_weights:", model1_init_weights)
print("model2_init_weights:", model2_init_weights)
model1_init_weights: [0.7148159936743647, 0.6977288245972708, 0.21608949558037638]
model2_init_weights: [0.9670298390136767, 0.5472322491757223, 0.9726843599648843]
Thus, flipping the order of model1 and model2 in your code also flips the losses. This is because the seed does not reset itself between the two models' definitions, so your weight initializations are totally different.
If you wish them to be the same, reset the seed before defining EACH MODEL, and before FITTING each model - a handy function for this is below. But your best bet is to restart the kernel and work in separate .py files.
def reset_seeds():
    np.random.seed(1)
    random.seed(2)
    tf.set_random_seed(3)
    print("RANDOM SEEDS RESET")
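Usage would look something like this (a sketch; build_model is a hypothetical helper wrapping either of your two definitions, and it assumes the keras imports and x_train from your question). With all three seeds reset before each definition and each fit, the two models should produce matching losses, modulo the GPU caveats above:
def build_model():  # hypothetical helper - either of your two definitions works
    model = Sequential([
        Dense(20, activation='sigmoid', kernel_initializer='glorot_normal',
              input_shape=(2,)),
        Dense(2, activation='linear', kernel_initializer='glorot_normal'),
    ])
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

reset_seeds()
model1 = build_model()
reset_seeds()
model2 = build_model()

reset_seeds()
model1.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)
reset_seeds()
model2.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)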