How to get the same results in different iterations in RandomForest in sklearn

I am using a Random Forest classifier for classification, and I get different results on each run. My code is as follows.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics

input_file = 'sample.csv'

df1 = pd.read_csv(input_file)
df2 = pd.read_csv(input_file)
X = df1.drop(['lable'], axis=1)  # Features
y = df2['lable']  # Labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

clf = RandomForestClassifier(random_state=42, class_weight="balanced")
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

As suggested in other answers, I added the n_estimators and random_state parameters. However, that did not work for me: the results still differ between runs.

I have attached the csv file here:

I am happy to provide more details if needed.

edited Mar 28 at 4:23 by Venkatachalam N

asked Mar 28 at 4:10 by EmJ

python scikit-learn classification random-forest

1 Answer

You need to set random_state for the train-test split as well.

The following code gives reproducible results. The recommended approach is not to change the random_state value in order to improve performance.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics

df1 = pd.read_csv('sample.csv')

X = df1.drop(['lable'], axis=1)  # Features
y = df1['lable']  # Labels
# Fix the seed of the split so the same rows land in the test set on every run.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=5)

clf = RandomForestClassifier(random_state=42, class_weight="balanced")
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

Output:

Accuracy: 0.6777777777777778
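As a rough way to check the point above without the original CSV, the sketch below repeats the split-fit-score cycle several times on synthetic data. The data from make_classification and the helper run_once are assumptions made for illustration; they stand in for sample.csv, which is not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for sample.csv (an assumption; the real file is not shown).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)


def run_once(split_seed):
    """Split, fit, and score once; split_seed=None leaves the split unseeded."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=split_seed
    )
    clf = RandomForestClassifier(random_state=42, class_weight="balanced")
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))


# Both seeds fixed: every run sees the same split and grows the same forest.
seeded_scores = [run_once(split_seed=5) for _ in range(3)]
print("seeded:  ", seeded_scores)

# Split left unseeded: the test set changes between runs, so the score may too.
unseeded_scores = [run_once(split_seed=None) for _ in range(3)]
print("unseeded:", unseeded_scores)
```

With both seeds fixed the three scores should be identical; with the split unseeded they will usually, though not necessarily, differ.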

answered Mar 28 at 4:22 by Venkatachalam N

Once you fix a value for random_state, don't change it during your modelling process (for example, when you tune other parameters such as n_estimators, max_depth, etc.)

– Venkatachalam N, Mar 28 at 4:31
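One possible reading of this advice, sketched on synthetic data: keep both seeds constant and vary only the parameter being tuned, so that any change in the score comes from the parameter and not from a different split or forest. The constants SPLIT_SEED and MODEL_SEED, the candidate values, and the dataset are assumptions for illustration. In practice a separate validation set or cross-validation would normally be used for tuning, not the test set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

SPLIT_SEED = 5   # fixed once for the train-test split (illustrative value)
MODEL_SEED = 42  # fixed once for the forest (illustrative value)

# Synthetic stand-in for the real data (an assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=SPLIT_SEED
)

# Vary only n_estimators; the seeds stay the same for every candidate.
results = {}
for n_estimators in (10, 50, 100):
    clf = RandomForestClassifier(n_estimators=n_estimators, random_state=MODEL_SEED)
    clf.fit(X_train, y_train)
    results[n_estimators] = accuracy_score(y_test, clf.predict(X_test))

for n_estimators, acc in results.items():
    print(f"n_estimators={n_estimators}: accuracy={acc:.3f}")
```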

Also kindly take some time to review stackoverflow.com/help/someone-answers

– Venkatachalam N, Mar 28 at 4:36

Glad I could help! You can use different values for these two random_state parameters.

– Venkatachalam N, Mar 28 at 4:59

You should not try to find the optimal random_state. Read here. Just assign some arbitrary value to it.

– Venkatachalam N, Mar 28 at 5:00
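To illustrate why hunting for a "best" seed is not meaningful, the sketch below scores the same model on several different splits of synthetic data. The spread between seeds reflects which rows happened to land in the test set, not a better model. The dataset and the seed range are assumptions for illustration; cross_val_score is shown as one common way to get a less split-dependent estimate.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the real data (an assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Same model, different split seeds: the accuracy moves with the split.
scores = []
for seed in range(5):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)
    scores.append(clf.score(X_test, y_test))

print("per-seed accuracy:", np.round(scores, 3))
print("mean:", np.mean(scores), "std:", np.std(scores))

# Averaging over folds gives an estimate that depends less on any single split.
cv_scores = cross_val_score(RandomForestClassifier(random_state=42), X, y, cv=5)
print("5-fold CV accuracy:", cv_scores.mean())
```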

Please let me know if you know an answer for this: stackoverflow.com/questions/55466081/… Thank you :)

– EmJ, Apr 2 at 5:10
