


What columns should I use in order to train a Random Forest?


Background



I am new to machine learning. I want to train a model using the Random Forest algorithm. My dataset has 9 columns in total: 8 are independent variables and the last (9th) column, 'Class', is the dependent variable. 'Class' is the response to be predicted and has 3 categories: S, N, R. All independent variables except 2 have more than 53 categories, and the code fails with an error as soon as a variable has more than 53 categories. I want to train the model to identify whether a row of the dataset is Suspicious (S), Normal (N), or Robot (R). Columns 4 and 7 each have more than 19k categories/levels. These are the important columns because they contain the attack entries/features, and deriving other variables from them is complicated.
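A quick way to see which columns are affected is to count the levels of each factor column. The following is a minimal sketch on a made-up data frame; `toy` and its column names (`ip`, `agent`, `bytes`) are hypothetical stand-ins, not the actual columns of data1.csv.

```r
# Hypothetical toy data; the real data1.csv columns are not known here.
toy <- data.frame(
  ip    = factor(paste0("ip", rep(1:200, times = 5))),     # 200 distinct levels
  agent = factor(rep(c("a", "b", "c", "d"), times = 250)), # 4 levels
  bytes = seq_len(1000) / 10,                              # numeric, no levels
  Class = factor(rep(c("S", "N", "R", "N"), times = 250))  # response, 3 levels
)

# Count levels per column; non-factor columns are reported as NA.
n_levels <- sapply(toy, function(col) if (is.factor(col)) nlevels(col) else NA_integer_)

# Names of the factor columns with more than 53 levels.
too_many <- names(n_levels)[!is.na(n_levels) & n_levels > 53]

print(n_levels)
print(too_many)
```

On the real data the same two expressions could be applied to `database` after it has been read in, assuming the categorical columns are stored as factors.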



Code



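library('ROCR')
library('randomForest')
library('caret')
library('ranger')

database <- read.csv('data1.csv')
set.seed(1000)  # make the train/test split reproducible
# Draw 217239 of the 310341 row indices (about 70%) without replacement for training
train <- sample(1:310341, 217239, replace = FALSE)
traindata <- database[train, ]
testdata <- database[-train, ]  # the remaining rows are held out for testing
# fit <- train(Class ~ ., data = traindata, method = "ranger")
# Note: proximity = TRUE stores an n-by-n matrix, which is very large for this many rows
fit <- randomForest(Class ~ ., data = traindata, ntree = 500, importance = TRUE, proximity = TRUE, na.action = na.roughfix)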


Efforts taken



I have tried the code above, but because of the 4th and 7th columns it fails with an error saying that predictors with more than 53 categories cannot be used.



Any help in resolving this issue would be appreciated.
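One common way around a level limit like this is to keep only the most frequent levels of a high-cardinality column and merge the rest into a single catch-all level. The sketch below is an illustration under assumptions, not a recommendation for this particular data: the helper `lump_rare_levels`, its `keep` and `other` arguments, and the toy vector are all hypothetical, and merging 19k levels down to 53 may discard much of the signal in columns 4 and 7.

```r
# Hypothetical helper: keep the `keep` most frequent levels of `x`
# and merge every other level into the single level `other`.
lump_rare_levels <- function(x, keep = 52, other = "Other") {
  x <- as.character(x)
  counts <- sort(table(x), decreasing = TRUE)               # level frequencies
  top <- names(counts)[seq_len(min(keep, length(counts)))]  # levels to keep
  x[!is.na(x) & !(x %in% top)] <- other                     # merge the rest
  factor(x)
}

# Toy example: 5 levels reduced to the 2 most frequent plus "Other".
x <- factor(c(rep("a", 5), rep("b", 3), "c", "d", "e"))
y <- lump_rare_levels(x, keep = 2)
print(table(y))
```

With the default `keep = 52`, the result has at most 53 levels including the catch-all. In practice the set of kept levels would be chosen on the training rows and reused for the test rows, so that both share the same levels.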










Possible duplicate of R - Random Forest and more than 53 categories
– divibisan, Apr 2 at 17:49

















r random-forest






edited Mar 26 at 18:22
DSD

asked Mar 25 at 11:06
DSD
12 bronze badges






