One hot encoding using sklearn preprocessing Label Binarizer
I am trying to use sklearn.preprocessing.LabelBinarizer() to create a one-hot encoding of labels that have only two classes, i.e. I only want to categorize two sets of objects. In this case, when I use fit(range(0, 2)) and then transform the labels, the result has a single column instead of two. This is fine in itself, but when I want to use the result in TensorFlow, it should really have one column per class for dimensional consistency. Please advise how I can resolve this.
Here is the code:
from sklearn import preprocessing
lb = preprocessing.LabelBinarizer()
lb.fit(range(0, 3))
Calling lb.transform([1, 0]), the result is:
[[0 1 0]
 [1 0 0]]
whereas when we change 3 to 2, i.e. lb.fit(range(0, 2)), the result is
[[1]
 [0]]
instead of
[[0 1]
 [1 0]]
This creates problems for algorithms that expect one column per class for any number of classes n. Is there any way to resolve this issue?
python scikit-learn
edited Mar 26 at 18:29
Eskapp
asked Mar 26 at 14:10
HamidReza Mirkhani
Can you explain which method you call to get the result? lb.fit() only returns the fitted binarizer, not an encoded array.
– Eskapp
Mar 26 at 14:19
Sorry, I forgot to include it. Here is the code: print(lb.transform([1, 0]))
– HamidReza Mirkhani
Mar 26 at 14:53
First of all, this is not an issue with the method. According to the documentation: "Binary targets transform to a column vector" (scikit-learn.org/stable/modules/generated/…). You can build the array you want from the column vector result when the number of classes is 2. I'll try to write an answer if this is unclear.
– Eskapp
Mar 26 at 15:50
Thanks. I wouldn't call it an issue with the method either; however, to me, a better implementation would allow developers to control the output type to keep it consistent. As you highlighted, I have to write another customized method just for the case n=2, for example.
– HamidReza Mirkhani
Mar 26 at 16:01
2 Answers
LabelBinarizer's purpose, according to the documentation, is:
Binarize labels in a one-vs-all fashion
Several regression and binary classification algorithms are available in scikit-learn. A simple way to extend these algorithms to the multi-class classification case is to use the so-called one-vs-all scheme.
If your data has only two types of labels, you can feed them directly to a binary classifier. Hence, one column is enough to capture two classes in one-vs-rest fashion. The documentation shows exactly this case:
Binary targets transform to a column vector
>>> from sklearn import preprocessing
>>> lb = preprocessing.LabelBinarizer()
>>> lb.fit_transform(['yes', 'no', 'no', 'yes'])
array([[1],
       [0],
       [0],
       [1]])
If your intention is just to create a one-hot encoding, use OneHotEncoder instead:
>>> from sklearn.preprocessing import OneHotEncoder
>>> enc = OneHotEncoder()
>>> enc.fit_transform([['yes'], ['no'], ['no'], ['yes']]).toarray()
array([[0., 1.],
       [1., 0.],
       [1., 0.],
       [0., 1.]])
I hope this clarifies why scikit-learn's LabelBinarizer does not convert two-class data into a two-column output.
answered Mar 26 at 17:32
ai_learning
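The OneHotEncoder approach can also be applied to the integer labels from the question. The following is only a minimal sketch, assuming the classes are 0 and 1 and that a dense float array is acceptable; the variable names `labels` and `one_hot` are made up for illustration and do not come from the original posts.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# OneHotEncoder expects 2-D input, so the integer labels from the
# question are reshaped into a single column (one row per sample).
labels = np.array([1, 0]).reshape(-1, 1)

# Assumption: the two known classes are 0 and 1.
enc = OneHotEncoder()
enc.fit(np.array([0, 1]).reshape(-1, 1))

# transform() returns a sparse matrix by default; toarray() makes it dense.
one_hot = enc.transform(labels).toarray()

print(one_hot.shape)  # (2, 2)
print(one_hot)
```

Unlike LabelBinarizer, this produces one column per class even when there are only two classes, which is the shape the question asks for.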
As already said in a comment, this is not an issue with the method. According to the documentation, binary targets transform to a column vector. When there are exactly 2 classes, you can build the array you want from the column vector result.
A direct and simple way to do this is:
import numpy as np
from sklearn import preprocessing
lb = preprocessing.LabelBinarizer()
lb.fit(range(2))  # range(0, 2) is the same as range(2)
a = lb.transform([1, 0])  # column vector: [[1], [0]]
# Column 0 flags class 0 and column 1 flags class 1, matching the multi-class layout.
result_2d = np.array([[1 - item[0], item[0]] for item in a])
answered Mar 26 at 17:36
Eskapp
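The same idea can be wrapped in a small helper so that calling code does not need a special case for n=2. This is a sketch under stated assumptions, not part of scikit-learn: the function name `one_hot_labels` is hypothetical, and it assumes the binarizer was created with its default settings (dense output, neg_label=0, pos_label=1).

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer


def one_hot_labels(lb, y):
    """Encode y with a fitted LabelBinarizer, returning one column per class.

    Hypothetical helper; assumes default LabelBinarizer settings.
    """
    encoded = lb.transform(y)
    # With exactly two classes the binarizer emits a single column that
    # flags the second class, so prepend its complement for the first class.
    if len(lb.classes_) == 2 and encoded.shape[1] == 1:
        encoded = np.hstack((1 - encoded, encoded))
    return encoded


lb2 = LabelBinarizer().fit([0, 1])
lb3 = LabelBinarizer().fit([0, 1, 2])

two_class = one_hot_labels(lb2, [1, 0])    # shape (2, 2)
three_class = one_hot_labels(lb3, [1, 0])  # shape (2, 3)
print(two_class)
print(three_class)
```

Using np.hstack on the whole array avoids the per-row Python loop, and the multi-class output passes through unchanged.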