



Why does the training accuracy of the model increase very slowly (almost stays stable) over some number of epochs?































I am working on the CIFAR-10 dataset and trying to reach the benchmark, or at least 90% accuracy. I have tried all of the approaches mentioned below, but most of them lead to the same result: after some number of epochs the training accuracy stops improving and stays stable, while the validation accuracy fluctuates a little.



The dataset directory is laid out as:

cifar/
    train/         (40,000 images total; 4,000 images per class; 10 classes)
        airplane/
        automobile/
        ...        (one subdirectory per class; test and validation follow the same structure)
    test/          (10,000 images total; 1,000 images per class)
    validation/    (10,000 images total; 1,000 images per class)
code.py


I have tried the following:

  1. Optimizers: Adam, Nadam, Adadelta, and SGD.

  2. Batch sizes: 16 and 32.

  3. Architecture: I initially started with 2 convolution layers. First I trained with
     64 filters in both, then with 128 filters in both. Now I have added a
     3rd convolution layer.

Here is the code:



import keras
from keras.models import Sequential  # needed for Sequential below (missing in the original)
from keras.layers import (Conv2D, MaxPool2D, Flatten, Dense, Dropout,
                          Activation, BatchNormalization, GlobalAveragePooling2D)
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers, regularizers

# Three Conv-BN-ReLU-MaxPool blocks with 64 filters each, followed by
# global average pooling and a 10-way softmax.
classifier = Sequential()
classifier.add(Conv2D(filters=64, kernel_size=(3,3), input_shape=(32,32,3), use_bias=False))
classifier.add(BatchNormalization())
classifier.add(Activation('relu'))
classifier.add(MaxPool2D(pool_size=(2,2)))
classifier.add(Conv2D(filters=64, kernel_size=(3,3), use_bias=False))
classifier.add(BatchNormalization())
classifier.add(Activation('relu'))
classifier.add(MaxPool2D(pool_size=(2,2)))
classifier.add(Conv2D(filters=64, kernel_size=(3,3), use_bias=False))
classifier.add(BatchNormalization())
classifier.add(Activation('relu'))
classifier.add(MaxPool2D(pool_size=(2,2)))
classifier.add(GlobalAveragePooling2D())
classifier.add(Dense(units=10, activation='softmax'))

# SGD optimizer I experimented with earlier (currently unused):
# sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

classifier.compile(optimizer='nadam', loss='categorical_crossentropy', metrics=['accuracy'])

# Augmented generator for training; plain rescaling/normalization for validation.
# Note: the featurewise_* options need the generator to be fit on sample data first.
train_datagen = ImageDataGenerator(rescale=1./255,
    featurewise_center=True, featurewise_std_normalization=True,
    shear_range=0.2, rotation_range=20, width_shift_range=0.2,
    height_shift_range=0.2, horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255,
    featurewise_center=True, featurewise_std_normalization=True)

train_dataset = train_datagen.flow_from_directory(
    directory='cifar/train', target_size=(32,32),
    batch_size=16, class_mode='categorical')

test_dataset = test_datagen.flow_from_directory(
    directory='cifar/validation', target_size=(32,32),
    batch_size=16, class_mode='categorical')

# 2500 steps x batch 16 = 40,000 training images per epoch;
# 625 steps x batch 16 = 10,000 validation images.
classifier.fit_generator(train_dataset,
    steps_per_epoch=2500, epochs=50,
    validation_data=test_dataset, validation_steps=625)
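
(One thing worth double-checking in this code: featurewise_center and featurewise_std_normalization need dataset-wide statistics, and ImageDataGenerator only computes those when its fit() method is called on an array of sample images. With flow_from_directory alone the statistics are never set, and Keras warns and skips the normalization. A minimal sketch of the workaround, using a placeholder array where the real training images would go:)

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255,
    featurewise_center=True, featurewise_std_normalization=True)

# Placeholder standing in for a (N, 32, 32, 3) array of real training images
# loaded from cifar/train; random data is used here only so the sketch runs
# on its own.
sample_images = np.random.rand(1000, 32, 32, 3)

# Computes the dataset mean and std that the featurewise_* options use.
train_datagen.fit(sample_images)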


And here are the epoch observations:

Epoch 17/50
2500/2500 [==============================] - 259s 103ms/step - loss: 0.9305 - acc: 0.6840 - val_loss: 0.8195 - val_acc: 0.7111

Epoch 18/50
2500/2500 [==============================] - 257s 103ms/step - loss: 0.9280 - acc: 0.6817 - val_loss: 0.9981 - val_acc: 0.6816

Epoch 19/50
2500/2500 [==============================] - 260s 104ms/step - loss: 0.9112 - acc: 0.6896 - val_loss: 0.9393 - val_acc: 0.6786

Epoch 20/50
2500/2500 [==============================] - 257s 103ms/step - loss: 0.9053 - acc: 0.6881 - val_loss: 0.8509 - val_acc: 0.7172

Epoch 21/50
2500/2500 [==============================] - 259s 104ms/step - loss: 0.9110 - acc: 0.6874 - val_loss: 0.8427 - val_acc: 0.7211

Epoch 22/50
2500/2500 [==============================] - 257s 103ms/step - loss: 0.8967 - acc: 0.6944 - val_loss: 0.7139 - val_acc: 0.7592

Epoch 23/50
2500/2500 [==============================] - 257s 103ms/step - loss: 0.8825 - acc: 0.6967 - val_loss: 0.8611 - val_acc: 0.7066

Epoch 24/50
2500/2500 [==============================] - 260s 104ms/step - loss: 0.8819 - acc: 0.6967 - val_loss: 0.7436 - val_acc: 0.7447

Epoch 25/50
2500/2500 [==============================] - 270s 108ms/step - loss: 0.8780 - acc: 0.6995 - val_loss: 0.8129 - val_acc: 0.7310

Epoch 26/50
2500/2500 [==============================] - 279s 112ms/step - loss: 0.8756 - acc: 0.7010 - val_loss: 0.7890 - val_acc: 0.7276

Epoch 27/50
2500/2500 [==============================] - 283s 113ms/step - loss: 0.8680 - acc: 0.7027 - val_loss: 0.8185 - val_acc: 0.7307

Epoch 28/50
2500/2500 [==============================] - 287s 115ms/step - loss: 0.8651 - acc: 0.7043 - val_loss: 0.7457 - val_acc: 0.7460

Epoch 29/50
2500/2500 [==============================] - 286s 114ms/step - loss: 0.8531 - acc: 0.7065 - val_loss: 1.1669 - val_acc: 0.6483

Epoch 30/50
2500/2500 [==============================] - 290s 116ms/step - loss: 0.8521 - acc: 0.7085 - val_loss: 0.7221 - val_acc: 0.7565

Epoch 31/50
2500/2500 [==============================] - 289s 116ms/step - loss: 0.8518 - acc: 0.7072 - val_loss: 0.7308 - val_acc: 0.7549

Epoch 32/50
2500/2500 [==============================] - 291s 116ms/step - loss: 0.8465 - acc: 0.7119 - val_loss: 0.8550 - val_acc: 0.7182

Epoch 33/50
2500/2500 [==============================] - 302s 121ms/step - loss: 0.8406 - acc: 0.7121 - val_loss: 1.0259 - val_acc: 0.6770

Epoch 34/50
2500/2500 [==============================] - 286s 115ms/step - loss: 0.8424 - acc: 0.7120 - val_loss: 0.6924 - val_acc: 0.7646

Epoch 35/50
2500/2500 [==============================] - 273s 109ms/step - loss: 0.8337 - acc: 0.7143 - val_loss: 0.8744 - val_acc: 0.7220

Epoch 36/50
2500/2500 [==============================] - 285s 114ms/step - loss: 0.8332 - acc: 0.7144 - val_loss: 1.0132 - val_acc: 0.6753

Epoch 37/50
2500/2500 [==============================] - 275s 110ms/step - loss: 0.8382 - acc: 0.7122 - val_loss: 0.7873 - val_acc: 0.7366


I am a beginner in deep learning, so pardon me if I have made any silly mistakes. Please guide me on how I should proceed further.










keras deep-learning conv-neural-network






asked Mar 27 at 12:29









Atharva Kalsekar










  • Your validation accuracy and loss are oscillating; this may be caused by a high momentum value in your optimizer. I also suggest trying a lower learning rate or increasing the decay.

    – Eric
    Mar 27 at 13:05
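
    (A rough sketch of this suggestion, reusing classifier, train_dataset, and test_dataset from the question; the learning-rate, decay, and momentum values are illustrative, not from the thread:)

    from keras import optimizers
    from keras.callbacks import ReduceLROnPlateau

    # Lower learning rate, stronger decay, and less momentum than the
    # commented-out SGD in the question.
    sgd = optimizers.SGD(lr=0.001, decay=1e-4, momentum=0.5, nesterov=True)
    classifier.compile(optimizer=sgd, loss='categorical_crossentropy',
                       metrics=['accuracy'])

    # Optionally shrink the learning rate further whenever val_loss plateaus.
    reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                                  patience=3, min_lr=1e-6)

    classifier.fit_generator(train_dataset, steps_per_epoch=2500, epochs=50,
                             validation_data=test_dataset, validation_steps=625,
                             callbacks=[reduce_lr])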











  • Maybe it would be better to change the number of filters in the convolution layers. Start with one layer of 64 filters and the other two with 128 filters. You can read this similar issue; it might be helpful.

    – Eric
    Mar 27 at 13:08
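
    (A hedged sketch of the layout this comment suggests, with filters of (64, 128, 128) and the rest of the question's architecture unchanged:)

    from keras.models import Sequential
    from keras.layers import (Conv2D, MaxPool2D, Dense, Activation,
                              BatchNormalization, GlobalAveragePooling2D)

    classifier = Sequential()
    # First block keeps 64 filters.
    classifier.add(Conv2D(filters=64, kernel_size=(3,3), input_shape=(32,32,3), use_bias=False))
    classifier.add(BatchNormalization())
    classifier.add(Activation('relu'))
    classifier.add(MaxPool2D(pool_size=(2,2)))
    # Second and third blocks widen to 128 filters.
    for _ in range(2):
        classifier.add(Conv2D(filters=128, kernel_size=(3,3), use_bias=False))
        classifier.add(BatchNormalization())
        classifier.add(Activation('relu'))
        classifier.add(MaxPool2D(pool_size=(2,2)))
    classifier.add(GlobalAveragePooling2D())
    classifier.add(Dense(units=10, activation='softmax'))
    classifier.compile(optimizer='nadam', loss='categorical_crossentropy',
                       metrics=['accuracy'])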











  • @Eric, thank you for the suggestion. I will try that. Also, would the order of the convolution layers, i.e. filters of (128,128,64) vs. (128,64,128) vs. (64,128,128), affect the results significantly?

    – Atharva Kalsekar
    Mar 28 at 11:51











  • I think so, because every layer learns a specific property (contours, regions, ...). But I only suggested it because the usual configuration is (64, 128, 128); the best way to know is to compare the results of these three configurations.

    – Eric
    Mar 29 at 9:12
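
    (One way to run the comparison this comment describes — a sketch using a hypothetical build_model helper that wraps the question's layer stack; train_dataset and test_dataset are the generators from the question:)

    from keras.models import Sequential
    from keras.layers import (Conv2D, MaxPool2D, Dense, Activation,
                              BatchNormalization, GlobalAveragePooling2D)

    def build_model(filters):
        # 'filters' is a 3-tuple such as (64, 128, 128); this helper is
        # hypothetical and not part of the original code.
        model = Sequential()
        model.add(Conv2D(filters[0], (3,3), input_shape=(32,32,3), use_bias=False))
        model.add(BatchNormalization())
        model.add(Activation('relu'))
        model.add(MaxPool2D(pool_size=(2,2)))
        for f in filters[1:]:
            model.add(Conv2D(f, (3,3), use_bias=False))
            model.add(BatchNormalization())
            model.add(Activation('relu'))
            model.add(MaxPool2D(pool_size=(2,2)))
        model.add(GlobalAveragePooling2D())
        model.add(Dense(10, activation='softmax'))
        model.compile(optimizer='nadam', loss='categorical_crossentropy',
                      metrics=['accuracy'])
        return model

    # Short runs are enough to rank the three layouts before a full 50-epoch run.
    for config in [(128, 128, 64), (128, 64, 128), (64, 128, 128)]:
        model = build_model(config)
        history = model.fit_generator(train_dataset, steps_per_epoch=2500,
                                      epochs=10, validation_data=test_dataset,
                                      validation_steps=625)
        print(config, 'best val_acc:', max(history.history['val_acc']))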











  • OK, thanks. I will try that as well, @Eric.

    – Atharva Kalsekar
    Mar 29 at 10:11











