Validation loss increases after 3 epochs but validation accuracy keeps increasing
Training and validation are healthy for 2 epochs, but after 2-3 epochs the val_loss keeps increasing while the val_acc also keeps increasing.
I'm trying to train a CNN model to classify a given review into a single class from 1 to 5, so I treated it as a multi-class classification problem.
I've divided the dataset into 3 sets: 70% training, 20% testing and 10% validation.
The distribution of the training data across the 5 classes is as follows:
1 - 31613, 2 - 32527, 3 - 61044, 4 - 140005, 5 - 173023
To compensate for the imbalance, I've added class weights as follows:
1: 5.47, 2: 5.32, 3: 2.83, 4: 1.26, 5: 1
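The weights above look like inverse class frequency, normalized so the most frequent class gets weight 1. A quick sketch of how they could be computed (the exact formula isn't shown in the post; this reproduces classes 1, 2, 3 and 5 exactly, while class 4 comes out 1.24 rather than the listed 1.26):

```python
# Class counts from the training split above
counts = {1: 31613, 2: 32527, 3: 61044, 4: 140005, 5: 173023}

# Inverse-frequency weights, normalized so the most frequent class has weight 1
max_count = max(counts.values())
class_weights = {c: round(max_count / n, 2) for c, n in counts.items()}
print(class_weights)  # → {1: 5.47, 2: 5.32, 3: 2.83, 4: 1.24, 5: 1.0}
```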
The model structure is as below.
from tensorflow.keras.layers import (Input, Embedding, Conv1D, GlobalMaxPooling1D,
                                     Dense, Dropout, Activation, concatenate)
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping

input_layer = Input(shape=(max_length,), dtype='int32')
embedding = Embedding(vocab_size, 200, input_length=max_length)(input_layer)

# Three parallel convolution channels over different n-gram widths (2, 3, 4)
channel1 = Conv1D(filters=100, kernel_size=2, padding='valid', activation='relu', strides=1)(embedding)
channel1 = GlobalMaxPooling1D()(channel1)
channel2 = Conv1D(filters=100, kernel_size=3, padding='valid', activation='relu', strides=1)(embedding)
channel2 = GlobalMaxPooling1D()(channel2)
channel3 = Conv1D(filters=100, kernel_size=4, padding='valid', activation='relu', strides=1)(embedding)
channel3 = GlobalMaxPooling1D()(channel3)

merged = concatenate([channel1, channel2, channel3], axis=1)
merged = Dense(256, activation='relu')(merged)
merged = Dropout(0.6)(merged)
merged = Dense(5)(merged)
output = Activation('softmax')(merged)

model = Model(inputs=[input_layer], outputs=[output])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])

# `callback` was not defined in the original snippet; an EarlyStopping along
# these lines matches what the comments describe
callback = [EarlyStopping(monitor='val_categorical_accuracy', patience=2, restore_best_weights=True)]
model.fit(final_X_train, final_Y_train, epochs=5, batch_size=512,
          validation_data=(final_X_val, final_Y_val), callbacks=callback, class_weight=class_weights)
1/5 - loss: 1.8733 - categorical_accuracy: 0.5892 - val_loss: 0.7749 - val_categorical_accuracy: 0.6558
2/5 - loss: 1.3908 - categorical_accuracy: 0.6917 - val_loss: 0.7421 - val_categorical_accuracy: 0.6784
3/5 - loss: 0.9587 - categorical_accuracy: 0.7734 - val_loss: 0.7595 - val_categorical_accuracy: 0.6947
4/5 - loss: 0.6402 - categorical_accuracy: 0.8370 - val_loss: 0.7921 - val_categorical_accuracy: 0.7216
5/5 - loss: 0.4520 - categorical_accuracy: 0.8814 - val_loss: 0.8556 - val_categorical_accuracy: 0.7331
Final accuracy = 0.7328754744261703
This looks like overfitting behavior, but adding dropout layers didn't help, and adding more data made the results even worse.
I'm totally new to deep learning, so if anyone has any suggestions for improvement, please let me know.
python tensorflow deep-learning classification multilabel-classification
Increasing validation loss is perfectly fine as long as accuracy keeps improving. Google cross-entropy loss if that isn't clear. I would try removing the class weights: although the data is unbalanced, you still have a relatively large number of samples for every class. Instead, I would shuffle the data at the beginning of every epoch and train longer than 5 epochs, maybe 50-100 epochs.
– Vlad, Mar 25 at 12:16
I added an EarlyStopping callback to stop training once val_categorical_accuracy starts dropping. I managed to train up to 9 epochs, then val_accuracy started to decrease and training stopped at 0.76 accuracy. Evaluating on the test set gave a similar accuracy, but the loss kept increasing after 4 epochs.
– stranger, Mar 28 at 8:22
asked Mar 24 at 3:49 by stranger (157 rep), edited Mar 24 at 4:05
1 Answer
"val_loss keeps increasing while the val_acc keeps increasing" — this is probably because of the loss function: the loss is computed from the actual predicted probabilities, while accuracy is computed from the one-hot (argmax) predictions.
Take a 4-class example for simplicity. For one review the true class is, say, the second one, so the one-hot target is [0, 1, 0, 0], and the predicted probabilities are [0.25, 0.30, 0.25, 0.20]. According to categorical_accuracy the output is correct, but since the probability mass is spread so thinly, categorical_crossentropy will still give a high loss.
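The divergence in the example above can be checked numerically. This small sketch (plain Python, using the hypothetical numbers from the example) shows the argmax prediction counting as correct while the cross-entropy term stays large:

```python
import math

probs = [0.25, 0.30, 0.25, 0.20]  # predicted probabilities for one review
true_idx = 1                      # true class, one-hot target [0, 1, 0, 0]

pred_idx = max(range(len(probs)), key=probs.__getitem__)  # argmax
is_correct = pred_idx == true_idx       # counts as a hit for categorical_accuracy
loss = -math.log(probs[true_idx])       # categorical_crossentropy ≈ 1.204
```

So a batch full of barely-correct predictions like this one can raise val_loss even while val_acc improves.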
As for the overfitting problem, I'm not really sure why introducing more data is causing trouble. Two things to try:
Try increasing the strides.
Don't make the data more imbalanced by adding data to any particular class.
I still have a doubt about selecting the best loss function for my scenario. I expect one output, which should be a class from 1 to 5. Do you think categorical_crossentropy is best, or maybe mean_squared_error, since I always use argmax to take the class with the highest probability and ignore the rest of the classes with low probabilities?
– stranger, Mar 27 at 7:15
mean_squared_error is not recommended for multi-class classification problems; I think you should stick with categorical_crossentropy. You should read up on the combinations of activation and loss functions recommended for multi-class classification (single- and multi-label).
– ashutosh singh, Mar 27 at 9:02
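One way to see why mean_squared_error behaves poorly here: as a softmax model becomes confidently wrong, cross-entropy grows without bound, while MSE against the one-hot target saturates near a constant, so it barely distinguishes "wrong" from "very confidently wrong". A small sketch with made-up probability vectors:

```python
import math

def ce(probs, true_idx):
    # categorical cross-entropy for one sample
    return -math.log(probs[true_idx])

def mse(probs, true_idx):
    # mean squared error against the one-hot target
    return sum((p - (1.0 if i == true_idx else 0.0)) ** 2
               for i, p in enumerate(probs)) / len(probs)

true_idx = 1
for eps in (1e-2, 1e-4, 1e-8):
    # nearly all probability mass on the wrong class 0
    probs = [1 - 3 * eps, eps, eps, eps]
    print(eps, ce(probs, true_idx), mse(probs, true_idx))
# cross-entropy keeps growing as eps shrinks; MSE stays stuck near 0.5
```

That vanishing penalty gradient is one of the standard reasons MSE is avoided with softmax outputs.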
answered Mar 25 at 11:39 by ashutosh singh (976 rep)