This is how I load the data from 2 folders containing the image data:
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="training",
    seed=123,
    image_size=(dim, dim))
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="validation",
    seed=123,
    image_size=(dim, dim))
Loading the training data from the folders gives:
Found 6457 files belonging to 2 classes.
Using 5812 files for training.
Found 6457 files belonging to 2 classes.
Using 645 files for validation.
This is how I train the model:
model = tf.keras.models.Sequential([
    tf.keras.layers.experimental.preprocessing.Rescaling(1. / 255),
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss=tf.losses.BinaryCrossentropy(from_logits=True), optimizer="adam", metrics=["accuracy"])
es = EarlyStopping(monitor='val_accuracy', min_delta=0.1, patience=5)
model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epc,
    callbacks=[es])
This is how I get the results:
y_pred = model.predict(val_ds)
predicted_categories = tf.argmax(y_pred, axis=1)
true_categories = tf.concat([y for x, y in val_ds], axis=0)
print(classification_report(true_categories, predicted_categories))
The contradictory output is:
Epoch 1/100
182/182 [==============================] - 8s 44ms/step - loss: 0.6617 - accuracy: 0.5139 - val_loss: 0.6466 - val_accuracy: 0.3442
Epoch 2/100
182/182 [==============================] - 8s 46ms/step - loss: 0.6613 - accuracy: 0.5712 - val_loss: 0.6460 - val_accuracy: 0.6558
Epoch 3/100
182/182 [==============================] - 8s 44ms/step - loss: 0.6611 - accuracy: 0.5594 - val_loss: 0.6474 - val_accuracy: 0.3442
Epoch 4/100
182/182 [==============================] - 8s 46ms/step - loss: 0.6315 - accuracy: 0.6504 - val_loss: 0.4623 - val_accuracy: 0.9690
Epoch 5/100
182/182 [==============================] - 8s 46ms/step - loss: 0.4780 - accuracy: 0.9554 - val_loss: 0.4597 - val_accuracy: 0.9690
Epoch 6/100
182/182 [==============================] - 8s 45ms/step - loss: 0.4831 - accuracy: 0.9434 - val_loss: 0.4517 - val_accuracy: 0.9845
Epoch 7/100
182/182 [==============================] - 8s 45ms/step - loss: 0.4720 - accuracy: 0.9658 - val_loss: 0.4546 - val_accuracy: 0.9736
Epoch 8/100
182/182 [==============================] - 8s 44ms/step - loss: 0.4719 - accuracy: 0.9652 - val_loss: 0.4507 - val_accuracy: 0.9860
Epoch 9/100
182/182 [==============================] - 8s 44ms/step - loss: 0.4747 - accuracy: 0.9597 - val_loss: 0.4528 - val_accuracy: 0.9814
              precision    recall  f1-score   support

           0       0.34      1.00      0.51       222
           1       0.00      0.00      0.00       423

    accuracy                           0.34       645
   macro avg       0.17      0.50      0.26       645
weighted avg       0.12      0.34      0.18       645
Moreover, I get a different answer on every run. Can someone please explain why the classification report shows 34% accuracy while the model's val_accuracy is around 0.94?
The tf.keras.preprocessing.image_dataset_from_directory method has a parameter called label_mode, whose default value is 'int'. That default is suited to sparse_categorical_crossentropy and the like. Since your classification is binary, you should change it to label_mode='binary'.
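As a rough sketch (reusing the main_folder and dim variables from the question, and assuming the directory layout is unchanged), the loader call would become the following; with label_mode='binary' each label comes out as a float32 tensor of shape (batch_size, 1), matching the single sigmoid output unit:

```python
import tensorflow as tf

# Sketch: same loader as in the question, but with binary labels.
# main_folder and dim are assumed to be defined as in the question.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="training",
    seed=123,
    label_mode="binary",   # labels become float32 with shape (batch_size, 1)
    image_size=(dim, dim))
```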
The contradiction is here:
tf.keras.layers.Dense(1, activation='sigmoid')
predicted_categories = tf.argmax(y_pred, axis=1)
With sigmoid, each output of your model is a list with shape (1,). When you take the argmax of that list, it always returns zero as the index, because the list has only one index. So you need to apply some thresholding when using sigmoid. Sigmoid squashes the output into the range [0, 1], so you can do this:
predicted_categories = [1 * (x[0]>=0.5) for x in y_pred]
This means that if the predicted value is greater than 0.5 it belongs to the second class. You can adjust the threshold as needed.
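The argmax pitfall and the thresholding fix can be illustrated with plain NumPy (the values below are made-up sigmoid outputs, not taken from the actual model):

```python
import numpy as np

# Simulated sigmoid outputs, as returned by model.predict: shape (4, 1).
y_pred = np.array([[0.1], [0.7], [0.4], [0.9]])

# argmax over axis=1 of an (n, 1) array is always 0 -- there is only one column.
argmax_labels = np.argmax(y_pred, axis=1)
print(argmax_labels.tolist())  # [0, 0, 0, 0]

# Thresholding at 0.5 recovers the intended class labels.
threshold_labels = (y_pred[:, 0] >= 0.5).astype(int)
print(threshold_labels.tolist())  # [0, 1, 0, 1]
```

This is exactly why the classification report above shows every sample predicted as class 0.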