How to get the same results in different iterations in RandomForest in sklearn



I am using a Random Forest classifier for classification, and I get different results on each run. My code is as follows.

    import pandas as pd
    from sklearn import metrics
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    input_file = 'sample.csv'

    df1 = pd.read_csv(input_file)
    df2 = pd.read_csv(input_file)
    X = df1.drop(['lable'], axis=1)  # Features
    y = df2['lable']  # Labels
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

    clf = RandomForestClassifier(random_state=42, class_weight="balanced")
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

As suggested by other answers, I added the parameters n_estimators and random_state. However, it did not work for me.

I have attached the csv file here:

I am happy to provide more details if needed.

python scikit-learn classification random-forest

asked Mar 28 at 4:10 by EmJ, edited Mar 28 at 4:23 by Venkatachalam N

          1 Answer

You need to set the random state for the train-test split as well. Without it, train_test_split shuffles the rows differently on every run, so the forest is trained and scored on different data each time even though its own random_state is fixed.

The following code gives reproducible results. The recommended approach is to leave the random_state value alone rather than change it to improve performance.

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn import metrics

    df1 = pd.read_csv('sample.csv')

    X = df1.drop(['lable'], axis=1)  # Features
    y = df1['lable']  # Labels
    # Fixing random_state here makes the split identical on every run
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=5)

    clf = RandomForestClassifier(random_state=42, class_weight="balanced")
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

Output:

    Accuracy: 0.6777777777777778
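For readers who do not have sample.csv, here is a minimal sketch of the same idea on synthetic data. The make_classification dataset, the run_once helper and n_estimators=50 are assumptions made for illustration and are not part of the original post. With both seeds pinned, two separate runs should report the same accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for sample.csv (an assumption; the real data is not shown here).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)


def run_once(split_seed=5, model_seed=42):
    """Train and score one forest with both sources of randomness pinned."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=split_seed
    )
    clf = RandomForestClassifier(
        n_estimators=50, random_state=model_seed, class_weight="balanced"
    )
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))


# Two independent runs with the same seeds should report the same accuracy.
first = run_once()
second = run_once()
print("Run 1:", first)
print("Run 2:", second)
print("Identical:", first == second)
```

Removing random_state from the train_test_split call inside run_once is a quick way to see the original symptom: the split, and usually the accuracy, then changes from call to call.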








answered Mar 28 at 4:22 by Venkatachalam N

• Once you fix a value for random_state, don't change it during your modelling process (for example, while you tune other parameters such as n_estimators, max_depth, etc.).
  – Venkatachalam N, Mar 28 at 4:31

• Also kindly take some time to review stackoverflow.com/help/someone-answers
  – Venkatachalam N, Mar 28 at 4:36

• Glad I could help! You can use different values for the two random_state arguments.
  – Venkatachalam N, Mar 28 at 4:59

• You should not try to find the optimal random_state. Read here. Just assign it some arbitrary value!
  – Venkatachalam N, Mar 28 at 5:00

• Please let me know if you know an answer for this: stackoverflow.com/questions/55466081/… Thank you :)
  – EmJ, Apr 2 at 5:10
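The advice in these comments can be illustrated with a rough sketch. It again uses synthetic data, and the accuracy_for_split helper, the MODEL_SEED constant, n_estimators=50 and the range of ten seeds are all hypothetical choices, not something from the thread. The model seed stays fixed while only the split seed varies, which shows how much the score can move purely because of which rows land in the test set. Picking the seed that happens to score best would reward that noise rather than a better model.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the asker's CSV (an assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

MODEL_SEED = 42  # chosen once and left alone while other settings change


def accuracy_for_split(split_seed):
    """Score a forest with a fixed model seed on the split given by split_seed."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=split_seed
    )
    clf = RandomForestClassifier(
        n_estimators=50, random_state=MODEL_SEED, class_weight="balanced"
    )
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))


# The two seeds are independent: the split seed varies, the model seed does not.
scores = np.array([accuracy_for_split(seed) for seed in range(10)])
print("Accuracy per split seed:", np.round(scores, 3))
print("Mean: %.3f, std: %.3f" % (scores.mean(), scores.std()))
```

Reporting the mean and spread over several splits, or using cross-validation, is generally a more honest summary than the score from any single seed.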











