Validation loss oscillates a lot, validation accuracy > training accuracy, but test accuracy is high. Is my model overfitting?


1. I am training a model with the author's original learning rate (I am also using their GitHub code). The validation loss oscillates a lot: it decreases, then suddenly jumps to a large value, then decreases again, but it never really converges; the lowest it gets is about 2, while the training loss converges to roughly 0.0x, well below 1.

At each epoch I record the training accuracy, and at the end the validation accuracy. The validation accuracy is always greater than the training accuracy.

When I test on real test data I get good results, but I wonder whether my model is overfitting. I would expect a good model's validation loss to converge in a similar fashion to the training loss, but this doesn't happen, and the fact that the validation loss sometimes oscillates up to very large values worries me.



2. By adjusting the learning rate, the scheduler, and so on, I got the validation loss and training loss trending downward with less oscillation, but this time my test accuracy stays low (as do the training and validation accuracies).

I did try a couple of optimizers (Adam, SGD, Adagrad) with a step scheduler and also PyTorch's plateau scheduler (ReduceLROnPlateau), and I played with step sizes and so on, but it didn't really help; neither did clipping gradients.
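For reference, roughly the kind of setup I mean, as a minimal PyTorch sketch; the model, the synthetic data, and the hyperparameter values here are placeholders for illustration, not the actual ones from the author's repository:

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    # Placeholder model and synthetic data standing in for the real ones.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    train_set = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,)))
    val_set = TensorDataset(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))
    train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=32)
    criterion = nn.CrossEntropyLoss()

    # One of the optimizers tried (Adam / SGD / Adagrad) plus PyTorch's plateau scheduler.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min",
                                                           factor=0.1, patience=5)

    for epoch in range(20):
        model.train()
        for inputs, targets in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            # Gradient clipping, which also did not remove the oscillation.
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
            optimizer.step()

        # Per-epoch validation loss that the plateau scheduler steps on.
        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(x), y).item() for x, y in val_loader) / len(val_loader)
        scheduler.step(val_loss)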



My questions:

1. Is my model overfitting?

2. If so, how can I reduce the overfitting besides data augmentation?

3. If not (some people on Quora said it is nothing to worry about, though I would think it must be overfitting), how can I justify it? Even if I got similar results in a k-fold experiment, would that be good enough? I don't feel it would justify the oscillation. How should I proceed?









Tags: optimization, deep-learning, conv-neural-network






asked Mar 26 at 0:00 by dusa






















1 Answer






The training loss at each epoch is usually computed over the entire training set.

The validation loss at each epoch is usually computed on only one minibatch of the validation set, so it is normal for it to be noisier.

Solution: you can report the exponential moving average of the validation loss across epochs to get a curve with fewer fluctuations.

It is not overfitting, since your validation accuracy is not lower than your training accuracy. In fact, it sounds like your model is underfitting, since your validation accuracy is greater than your training accuracy.







answered Mar 26 at 17:41 by Soroush
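For illustration, a minimal plain-Python sketch of the exponential-moving-average smoothing the answer suggests; the smoothing factor 0.9 and the example loss values are arbitrary choices, not taken from the answer:

    def ema(values, alpha=0.9):
        # Exponential moving average of a sequence of per-epoch validation losses.
        # alpha is the weight kept from the running average: larger alpha = smoother curve.
        smoothed, avg = [], None
        for v in values:
            avg = v if avg is None else alpha * avg + (1 - alpha) * v
            smoothed.append(avg)
        return smoothed

    # Example: a noisy per-epoch validation-loss curve and its smoothed version to report.
    val_losses = [2.9, 2.1, 3.4, 1.8, 2.6, 1.7, 3.1, 1.6]
    print(ema(val_losses))

The raw values can still be logged; the smoothed curve is just an easier-to-read summary of the trend.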












• Thanks! Reporting the exponential moving average sounds like a good idea.
– dusa, Mar 26 at 21:05

• But doesn't it also mean the model is not good at generalizing if the result is noisy? I suppose there is room for improvement, like you said, for the underfitting.
– dusa, Mar 26 at 21:18

• The result is (most likely) noisy because you're computing the validation loss on a small subsample of the validation set instead of on the whole validation set every time. That is the problem, and it has nothing to do with how good your model is. When you randomly draw a subsample of the validation set to measure the loss, you will sometimes get a subsample with a higher loss and sometimes one with a lower loss, and this (partly) creates the fluctuations. You will see fewer fluctuations if you use a larger minibatch size for the validation set, at the cost of more computation.
– Soroush, Mar 26 at 23:02
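For illustration, a sketch of the alternative the last comment describes: averaging the loss over the entire validation set each epoch rather than reporting the loss of a single minibatch (it assumes a standard PyTorch DataLoader and a loss function that averages over the batch):

    import torch

    def full_validation_loss(model, val_loader, criterion, device="cpu"):
        # Average the loss over the whole validation set, weighting each batch by its size,
        # instead of reporting the loss of one randomly drawn validation minibatch.
        model.eval()
        total_loss, total_examples = 0.0, 0
        with torch.no_grad():
            for inputs, targets in val_loader:
                inputs, targets = inputs.to(device), targets.to(device)
                loss = criterion(model(inputs), targets)
                total_loss += loss.item() * targets.size(0)
                total_examples += targets.size(0)
        return total_loss / total_examples

This costs a full pass over the validation set per epoch, which is the extra computation the comment mentions.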

















