在Keras fit_generator中将shuffle设置为True时，精度会降低

Obiii 发表于 Dev

死了

我正在使用的数据非常不平衡。

我正在使用VGG16训练图像分类器。我冻结了VGG16中的所有层，接受最后两个完全连接的层。

BATCH_SIZE = 128

EPOCHS = 80

当我设置shuffle = False时，每个类的精度和查全率都很高（介于.80-.90之间），但是当我将shuffle = True时，每个类的精度和查全率都降至0.10-0.20。我不确定发生了什么。可以帮忙吗？

下面是代码：

img_size = 224
trainGen = trainAug.flow_from_directory(
    trainPath,
    class_mode="categorical",
    target_size=(img_size, img_size),
    color_mode="rgb",
    shuffle=False,
    batch_size=BATCH_SIZE)
valGen = valAug.flow_from_directory(
    valPath,
    class_mode="categorical",
    target_size=(img_size, img_size),
    color_mode="rgb",
    shuffle=False,
    batch_size=BATCH_SIZE)

testGen = valAug.flow_from_directory(
    testPath,
    class_mode="categorical",
    target_size=(img_size, img_size),
    color_mode="rgb",
    shuffle=False,
    batch_size=BATCH_SIZE)

baseModel = VGG16(weights="imagenet", include_top=False,input_tensor=Input(shape=(img_size, img_size, 3)))
headModel = baseModel.output
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(512, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(PFR_NUM_CLASS, activation="softmax")(headModel)
# place the head FC model on top of the base model (this will become
# the actual model we will train)
model = Model(inputs=baseModel.input, outputs=headModel)
# loop over all layers in the base model and freeze them so they will
# *not* be updated during the first training process
for layer in baseModel.layers:
    layer.trainable = False

班级权重计算如下：

from sklearn.utils import class_weight
import numpy as np

class_weights = class_weight.compute_class_weight(
               'balanced',
                np.unique(trainGen.classes), 
                trainGen.classes)

这些是班级权重：

array([0.18511007, 2.06740331, 1.00321716, 3.53018868, 2.48637874,
       2.27477204, 1.57557895, 6.68214286, 1.04233983, 4.02365591])

培训代码为：

# compile our model (this needs to be done after our setting our layers to being non-trainable
print("[INFO] compiling model...")
opt = SGD(lr=1e-5, momentum=0.8)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
# train the head of the network for a few epochs (all other layers
# are frozen) -- this will allow the new FC layers to start to become
#initialized with actual "learned" values versus pure random
print("[INFO] training head...")
H = model.fit_generator(
    trainGen,
    steps_per_epoch=totalTrain // BATCH_SIZE,
    validation_data=valGen,
    validation_steps=totalVal // BATCH_SIZE,
    epochs=EPOCHS,
    class_weight=class_weights,
    verbose=1,
    callbacks=callbacks_list)
# reset the testing generator and evaluate the network after
# fine-tuning just the network head

蒂姆布斯·卡林（Timbus Calin）

就您而言，设置的问题shuffle=True在于，如果您对验证集进行混洗，结果将很混乱。碰巧预测是正确的，但与错误的索引进行比较可能会导致误导性结果，就像您的情况一样。

始终shuffle=True在训练集以及shuffle=False验证集和测试集上。

本文收集自互联网，转载请注明来源。

如有侵权，请联系 [email protected] 删除。

编辑于 2021-01-23

我来说两句

0 条评论

登录后参与评论

上一篇：Python Numpy插值给出错误的输出

在Keras fit_generator中将shuffle设置为True时，精度会降低

在Keras fit_generator中将shuffle设置为True时，精度会降低

计算数据帧R中的字符串频率

Android Studio Kotlin：提取为常量

Excel 2016图表将增长与4个参数进行比较

获取并汇总所有关联的数据

如何使用Redux-Toolkit重置Redux Store

http：// localhost：3000 /＃！/为什么我在localhost链接中得到“＃！/”。

将加号/减号添加到jQuery菜单

算术中的c ++常量类型转换

TYPO3：将 Formhandler 添加到新闻扩展

TreeMap中的自定义排序

如何开始为Ubuntu开发

在 Python 2.7 中。如何从文件中读取特定文本并分配给变量

无法使用 envoy 访问 .ssh/config

在Ubuntu和Windows中，触摸板有时会滞后。硬件问题？

遍历元素数组以每X秒在浏览器上显示

在Jenkins服务器中使用Selenium和Ruby进行的黄瓜测试失败，但在本地计算机中通过

警告消息：在matrix（unlist（drop.item），ncol = 10，byrow = TRUE）中：数据长度[16]不是列数的倍数[10]>？

未捕获的SyntaxError：带有Ajax帖子的意外令牌u

如何使用tweepy流式传输来自指定用户的推文（仅在该用户发布推文时流式传输）

尝试在Dell XPS13 9360上安装Windows 7时出错

如果从DB接收到的值为空，则JMeter JDBC调用将返回该值作为参数名称