What columns should I use to train a Random Forest?
Background
I am new to machine learning and want to train a model with the Random Forest algorithm. My dataset has 9 columns: 8 independent variables and a dependent variable, 'Class', in the 9th column. 'Class' is the response I want to predict and has 3 categories: S, N, and R. All of the independent variables except 2 have more than 53 categories, and the code throws an error whenever a variable has more than 53 categories. I want to train the model to identify whether a row of the dataset is Suspicious (S), Normal (N), or Robot (R). Columns 4 and 7 each contain more than 19k categories/levels. These are the important columns because they contain the attack entries/features, and deriving other variables from them is complicated.
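For reference, the level counts described above can be checked with something like the following. This is only a sketch on a toy data frame: the column names `ip`, `agent`, and `bytes` are made up for illustration and are not the real columns of `data1.csv`.

```r
# Toy stand-in for the real data; column names and values are hypothetical.
toy <- data.frame(
  ip    = factor(paste0("10.0.0.", 1:60)),    # 60 distinct levels
  agent = factor(rep(c("a", "b", "c"), 20)),  # 3 distinct levels
  bytes = 1:60,                               # numeric, not a factor
  Class = factor(rep(c("S", "N", "R"), 20))
)

# Number of levels per factor column (NA for non-factor columns).
level_counts <- sapply(toy, function(col) {
  if (is.factor(col)) nlevels(col) else NA_integer_
})
print(level_counts)

# Predictors (everything except Class) with more than 53 levels.
over_limit <- !is.na(level_counts) & level_counts > 53
too_many <- setdiff(names(level_counts)[over_limit], "Class")
print(too_many)
```

On the real data, the same `sapply` call on `database` should list columns 4 and 7 among the offenders.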
Code
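library('ROCR')
library('randomForest')
library('caret')
library('ranger')

# Read strings as factors explicitly (no longer the default since R 4.0)
database <- read.csv('data1.csv', stringsAsFactors = TRUE)

# 70/30 train/test split (217239 of 310341 rows)
set.seed(1000)
train_idx <- sample(seq_len(nrow(database)), 217239, replace = FALSE)
traindata <- database[train_idx, ]
testdata <- database[-train_idx, ]

# fit <- train(Class ~ ., data = traindata, method = "ranger")
# proximity = TRUE is omitted: it would build a 217239 x 217239 matrix
fit <- randomForest(Class ~ ., data = traindata, ntree = 500, importance = TRUE, na.action = na.roughfix)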
Efforts taken
I have tried the code above, but because of the 4th and 7th columns it fails with an error saying that predictors with more than 53 categories cannot be used.
Any help resolving this issue would be appreciated.
r random-forest
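One direction that might get past the 53-category limit is to shrink the high-cardinality columns before fitting. The sketch below uses base R only and shows two common options: lumping rare levels into a single "Other" level, and frequency encoding (replacing each level with its count). Both `lump_levels` and `freq_encode` are hypothetical helpers written for this example, not functions from randomForest, and the `url` column is invented. Whether either encoding preserves enough of the attack information in columns 4 and 7 is an open question.

```r
# Keep the `keep` most frequent levels and merge the rest into `other`.
# Hypothetical helper, not part of any package.
lump_levels <- function(x, keep = 52, other = "Other") {
  x <- as.character(x)
  counts <- sort(table(x), decreasing = TRUE)
  top <- names(counts)[seq_len(min(keep, length(counts)))]
  x[!is.na(x) & !(x %in% top)] <- other
  factor(x)
}

# Replace each level with the number of times it occurs.
# Hypothetical helper, not part of any package.
freq_encode <- function(x) {
  x <- as.character(x)
  counts <- table(x)
  as.numeric(counts[x])
}

# Toy column with 102 distinct levels (values are made up).
url <- factor(c(rep("/home", 50), rep("/login", 30), paste0("/page", 1:100)))

lumped <- lump_levels(url, keep = 52)  # 52 kept levels + "Other"
freq   <- freq_encode(url)             # numeric column, no level limit

print(nlevels(url))
print(nlevels(lumped))
print(head(freq))
```

With `keep = 52` the lumped factor ends up with at most 53 levels, which stays within the limit. The frequency-encoded column is numeric, so the limit does not apply to it at all, but distinct levels that happen to share a count become indistinguishable.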
Possible duplicate of R - Random Forest and more than 53 categories
– divibisan
Apr 2 at 17:49
asked Mar 25 at 11:06 by DSD, edited Mar 26 at 18:22