How to get the same results in different iterations in RandomForest in sklearn
I am using Random Forest classifier for the classification and in each iteration I get different results. My code is as follows.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics

input_file = 'sample.csv'
df1 = pd.read_csv(input_file)
df2 = pd.read_csv(input_file)
X = df1.drop(['lable'], axis=1)  # Features
y = df2['lable']  # Labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
clf = RandomForestClassifier(random_state=42, class_weight="balanced")
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
As suggested by other answers, I added the parameters n_estimators and random_state. However, it did not work for me.
I have attached the csv file here:
I am happy to provide more details if needed.
python scikit-learn classification random-forest
asked Mar 28 at 4:10 by EmJ
edited Mar 28 at 4:23 by Venkatachalam N
1 Answer
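You need to set random_state for train_test_split as well. Without it, the data is shuffled differently on every run, so the classifier is trained and evaluated on a different split each time, even though the forest itself is seeded.
The following code gives reproducible results. The recommended approach is not to change the random_state value in order to improve performance.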
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics

df1 = pd.read_csv('sample.csv')
X = df1.drop(['lable'], axis=1)  # Features
y = df1['lable']  # Labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=5)
clf = RandomForestClassifier(random_state=42, class_weight="balanced")
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
Output:
Accuracy: 0.6777777777777778
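To see the effect of the two seeds side by side, here is a minimal sketch. It assumes a synthetic dataset from make_classification as a stand-in for sample.csv (the original file is not available here), and run_once is a hypothetical helper introduced only for this illustration. With both seeds fixed, repeated runs should return identical scores; with an unseeded split, the scores are free to vary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for sample.csv (an assumption, not the asker's data).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)


def run_once(split_seed=None, model_seed=42):
    """Split, fit, and score once; split_seed=None means an unseeded split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=split_seed
    )
    clf = RandomForestClassifier(random_state=model_seed, class_weight="balanced")
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))


# Both seeds fixed: every run sees the same split and builds the same forest.
seeded = [run_once(split_seed=5) for _ in range(3)]
print("seeded:  ", seeded)

# Only the model is seeded: the split changes on each call, so scores may differ.
unseeded = [run_once(split_seed=None) for _ in range(3)]
print("unseeded:", unseeded)
```

The split seed (5) and the model seed (42) are independent of each other; they do not need to match.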
Once you fix some value for random_state, don't change it during your modelling process (when you tune other parameters like n_estimators, max_depth, etc.)
– Venkatachalam N, Mar 28 at 4:31
Also, kindly take some time to review stackoverflow.com/help/someone-answers
– Venkatachalam N, Mar 28 at 4:36
Glad that I could help! You can have different values for these two random_states.
– Venkatachalam N, Mar 28 at 4:59
You should not try to find the optimal random_state. Read here. You have to assign some random value to it!
– Venkatachalam N, Mar 28 at 5:00
Please let me know if you know an answer for this: stackoverflow.com/questions/55466081/… Thank you :)
– EmJ, Apr 2 at 5:10
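The comments above advise against searching for an optimal random_state. One common alternative, sketched here under assumptions, is to repeat the evaluation over a handful of seeds and report the mean and spread instead of picking the seed with the best score. The synthetic dataset, the choice of five seeds, and n_estimators=50 are all illustrative assumptions, not part of the original answer.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for sample.csv (an assumption, not the asker's data).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

scores_by_seed = []
for seed in range(5):
    # Seed both the fold shuffling and the forest so each run is repeatable.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    clf = RandomForestClassifier(
        n_estimators=50, random_state=seed, class_weight="balanced"
    )
    # Mean accuracy over the 5 folds for this seed.
    scores_by_seed.append(cross_val_score(clf, X, y, cv=cv).mean())

scores_by_seed = np.array(scores_by_seed)
print("mean accuracy:   ", scores_by_seed.mean())
print("std across seeds:", scores_by_seed.std())
```

The spread across seeds gives a rough sense of how much of a score difference is noise, which is why a single lucky seed is not a reliable basis for comparing models.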
answered Mar 28 at 4:22 by Venkatachalam N