Decision Tree status column & related numerical value column
I have a dataset that includes two related columns: one is categorical and shows the status of a feature, and the other is numerical and shows the corresponding value, as in the example below:
I want to run a decision tree algorithm on this data with scikit-learn. I am not sure how to handle these two columns, because conceptually I cannot figure out how to tie these two highly correlated features together. Normally we are not supposed to leave null values in the data, but here the numerical column is null by nature for some rows. If we fill those cells with 0, that value carries a different meaning.
So, how should I pre-process this data so that the decision tree algorithm works properly?
scikit-learn numeric decision-tree categorical-data
Please share what you have tried so far and what specific programming issues you face. SO is not a code design service; I kindly suggest you re-read "How to Ask" and "What topics can I ask about here?".
– desertnaut
Mar 21 at 18:03
Thanks for the insight.
– BTurkeli
Mar 22 at 5:04
asked Mar 21 at 16:58
BTurkeli
1 Answer
My professor provided a reasonable answer, summarized below.
First, fill the null cells with 0.
If you feed the data into a decision tree algorithm with these two features, there are two cases:
If "Status" is split on first: the tree separates the 0's and 1's into two branches. Under the 0 branch, all Amount values are already 0, so Amount will not be chosen for a further split. Under the 1 branch, there are no rows with Status 0.
If "Amount" is split on first: all rows with Status 0 end up in a single branch, together with the rows that have very small amounts.
So, if the Amount data is noisy, it might be helpful to keep the Status column. Otherwise, I would remove the Status column.
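The steps above can be sketched roughly as follows with pandas and scikit-learn. This is only an illustration under stated assumptions: the toy rows and the `target` column are hypothetical and invented for the example (the original data is not shown), and `fit_tree` is a helper name introduced here. It fills the structurally missing Amount values with 0 and then fits one tree with both columns and one with Amount only, so the two resulting trees can be compared.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: Amount is missing (NaN) whenever Status is 0.
# The "target" column is an invented label, used only for illustration.
df = pd.DataFrame({
    "Status": [0, 0, 0, 1, 1, 1, 1, 1],
    "Amount": [np.nan, np.nan, np.nan, 0.5, 2.0, 40.0, 55.0, 60.0],
    "target": [0, 0, 0, 1, 1, 0, 0, 0],
})

# Step from the answer: fill the null Amount cells with 0.
df["Amount"] = df["Amount"].fillna(0)


def fit_tree(columns):
    """Fit a decision tree on the given feature columns (hypothetical helper)."""
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(df[columns], df["target"])
    return tree


# Variant 1: keep both Status and Amount.
both = fit_tree(["Status", "Amount"])
# Variant 2: drop Status and rely on Amount alone.
amount_only = fit_tree(["Amount"])

# Print the learned split rules so the two variants can be compared.
print(export_text(both, feature_names=["Status", "Amount"]))
print(export_text(amount_only, feature_names=["Amount"]))
```

In this toy data no row with Status 1 has an Amount of exactly 0, so Amount alone is enough to separate the filled rows. If genuine zero amounts can occur, the Status column is the only thing that lets the tree tell them apart from filled values, which is another reason to keep it.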
answered Mar 22 at 12:06
BTurkeli