What is a "Lear Processor" and how did it work?
How effective would wooden scale armor be in a medieval setting?
Should I include code in my research paper?
Is a request to book a business flight ticket for a graduate student an unreasonable one?
Is it okay to roll multiple attacks that all have advantage in one cluster?
LED glows slightly during soldering
What do three diagonal dots above a letter mean in the "Misal rico de Cisneros" (Spain, 1518)?
What is the minimum time required for final wash in film development?
Why does the US seem to have a rather low economic interest in Africa?
Is it OK to leave real names & info visible in business card portfolio?
Is there a strong legal guarantee that the U.S. can give to another country that it won't attack them?
The origin of a particular self-reference paradox
What are the original Russian words for a prostitute?
How can I effectively communicate to recruiters that a phone call is not possible?
What is this little owl-like bird?
Addressing unnecessary daily meetings with manager?
Received a dinner invitation through my employer's email, is it ok to attend?
What attributes and how big would a sea creature(s) need to be able to tow a ship?
Write a function
Is there a minimum field size for peah to apply?
How do native German speakers usually express skepticism (using even) about a premise?
To what extent would a wizard be able to combine feats to learn to mimic unknown spells?
When did "&" stop being taught alongside the alphabet?
How do you move up one folder in Finder?
Validation loss oscillates a lot, validation accuracy > learning accuracy, but test accuracy is high. Is my model overfitting?
- I am training a model with the author's original learning rate (I also use their GitHub code). The validation loss oscillates a lot: it decreases, then suddenly jumps to a large value, then decreases again, but it never really converges; the lowest it reaches is about 2, while the training loss converges to roughly 0.0x, well below 1.
At each epoch I record the training accuracy, and at the end of the epoch the validation accuracy. The validation accuracy is always greater than the training accuracy.
When I test on real test data I get good results, but I wonder whether my model is overfitting. I would expect a good model's validation loss to converge in a similar fashion to the training loss, but this doesn't happen, and the fact that the validation loss occasionally jumps to very large values worries me.
- After adjusting the learning rate, scheduler, and so on, I got the validation and training losses to trend downward with less oscillation, but this time my test accuracy stays low (as do the training and validation accuracies).
I tried a couple of optimizers (Adam, SGD, Adagrad) with a step scheduler and also PyTorch's plateau scheduler, and I played with step sizes, but it didn't really help; neither did clipping gradients.
- Is my model overfitting?
- If so, how can I reduce the overfitting besides data augmentation?
- If not (some people on Quora said it is nothing to worry about, though I would think it must be overfitting), how can I justify it? Even if I got similar results in a k-fold experiment, would that be good enough? I don't feel it would justify the oscillation. How should I proceed?
optimization deep-learning conv-neural-network
asked Mar 26 at 0:00 by dusa
1 Answer
The training loss at each epoch is usually computed on the entire training set.
The validation loss at each epoch is often computed on only one minibatch of the validation set, so it is normal for it to be noisier.
Solution: you can report an exponential moving average (EMA) of the validation loss across epochs to reduce the fluctuations.
It is not overfitting, since your validation accuracy is not lower than your training accuracy. In fact, it sounds like your model is underfitting, since validation accuracy > training accuracy.
answered Mar 26 at 17:41 by Soroush
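As a rough illustration of this suggestion, here is a minimal sketch of an exponential moving average applied to per-epoch validation losses; the smoothing factor `alpha` and the example `val_losses` values are illustrative assumptions, not taken from the question's code.

```python
# Minimal sketch: EMA-smooth a noisy series of per-epoch validation losses.
def ema_smooth(values, alpha=0.3):
    """Return the EMA-smoothed series; smaller alpha means stronger smoothing."""
    smoothed, avg = [], None
    for v in values:
        avg = v if avg is None else alpha * v + (1 - alpha) * avg
        smoothed.append(avg)
    return smoothed

# Hypothetical per-epoch validation losses, just to show the effect.
val_losses = [2.8, 2.1, 4.5, 1.9, 3.7, 1.8, 2.0]
print(ema_smooth(val_losses))
```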
Thanks! Reporting an exponential moving average sounds like a good idea. – dusa, Mar 26 at 21:05
But doesn't it also mean the model is not good at generalizing if the result is noisy? I suppose there is room for improvement, as you said, for the underfitting. – dusa, Mar 26 at 21:18
The result is (most likely) noisy because you're computing the validation loss on a small subsample of the validation set instead of on the whole validation set every time. That is the problem, and it has nothing to do with how good your model is. When you randomly draw a subsample from the validation set to measure the loss, you will sometimes get a subsample with a higher loss and sometimes one with a lower loss; this (partly) creates the fluctuations. You will see fewer fluctuations if you use a larger minibatch size for the validation set (at the cost of more computation). – Soroush, Mar 26 at 23:02
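Soroush's last comment suggests evaluating on the full validation set; a minimal sketch of that, assuming `model`, `val_loader`, and `criterion` are already defined in the training script, might look like this:

```python
import torch

# Sketch: average the loss over the entire validation set instead of one minibatch.
# `model`, `val_loader`, `criterion`, and `device` are assumed to exist elsewhere.
def full_validation_loss(model, val_loader, criterion, device="cpu"):
    model.eval()
    total_loss, total_samples = 0.0, 0
    with torch.no_grad():  # no gradients needed during evaluation
        for inputs, targets in val_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            loss = criterion(model(inputs), targets)
            total_loss += loss.item() * inputs.size(0)  # weight by batch size
            total_samples += inputs.size(0)
    return total_loss / total_samples  # per-sample average over the whole set
```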