One hot encoding using sklearn preprocessing Label Binarizer

I am trying to use sklearn.preprocessing.LabelBinarizer() to one-hot encode labels that take only two values, i.e. I only want to distinguish two classes of objects. In this case, when I call fit(range(0, 2)), transform returns a single column per sample instead of two. This is fine on its own, but when I use the labels in TensorFlow, each label should really have two entries for dimensional consistency. Please advise how I can resolve this.

Here is the code:

from sklearn import preprocessing
lb = preprocessing.LabelBinarizer()
lb.fit(range(0, 3))

Calling lb.transform([1, 0]), the result is:

[[0 1 0]
 [1 0 0]]

whereas when we change 3 to 2, i.e. lb.fit(range(0, 2)), the result would be

[[1]
 [0]]

instead of

[[0 1]
 [1 0]]

This creates problems for algorithms that expect one column per class regardless of the number of classes. Is there any way to resolve this issue?

edited Mar 26 at 18:29 by Eskapp

asked Mar 26 at 14:10 by HamidReza Mirkhani

Can you explain which method you call to get the result? lb.fit() does not return the transformed array.

– Eskapp
Mar 26 at 14:19

Sorry, I forgot to include it. Here is the code: print(lb.transform([1, 0]))

– HamidReza Mirkhani
Mar 26 at 14:53

First of all, this is not an issue with the method. According to the documentation: Binary targets transform to a column vector (scikit-learn.org/stable/modules/generated/…). When there are two classes, you can build the array you want from the column vector result. I'll try to write an answer if this is unclear.

– Eskapp
Mar 26 at 15:50

Thanks. I wouldn't call it an issue with the method either; however, to me, a better implementation would let developers control the output shape to keep it consistent. As you highlighted, I have to write a custom method just for the n=2 case, for example.

– HamidReza Mirkhani
Mar 26 at 16:01

python scikit-learn

2 Answers
According to the documentation, the purpose of LabelBinarizer is to:

Binarize labels in a one-vs-all fashion

Several regression and binary classification algorithms are available in scikit-learn. A simple way to extend these algorithms to the multi-class classification case is to use the so-called one-vs-all scheme.

If your data has only two classes, you can feed the labels directly to a binary classifier. Hence, a single column is enough to represent both classes in a one-vs-rest fashion. As the documentation puts it:

Binary targets transform to a column vector

>>> from sklearn import preprocessing
>>> lb = preprocessing.LabelBinarizer()
>>> lb.fit_transform(['yes', 'no', 'no', 'yes'])
array([[1],
       [0],
       [0],
       [1]])
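
To make the difference concrete, here is a minimal sketch that compares the two-class and three-class cases. It assumes the integer labels from the question and the default LabelBinarizer settings; the variable names are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Two classes: one column is enough, because a second column
# would only be the complement of the first.
lb2 = LabelBinarizer().fit([0, 1])
binary = lb2.transform([1, 0])

# Three classes: one column per class.
lb3 = LabelBinarizer().fit([0, 1, 2])
multi = lb3.transform([1, 0])

print(binary.shape)  # (2, 1)
print(multi.shape)   # (2, 3)

# The single column still maps back to the original labels.
recovered = lb2.inverse_transform(binary)
print(recovered)     # [1 0]
```

So no information is lost in the binary case; only the shape differs from what a strict one-hot layout would give.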

If you just want a one-hot encoding, use OneHotEncoder instead:

>>> from sklearn.preprocessing import OneHotEncoder
>>> enc = OneHotEncoder()
>>> enc.fit_transform([['yes'], ['no'], ['no'], ['yes']]).toarray()
array([[0., 1.],
       [1., 0.],
       [1., 0.],
       [0., 1.]])
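
For the integer labels in the question, the same approach might look like the sketch below. Passing `categories=[[0, 1]]` is an assumption made here so that the column order and count stay fixed; it is not something the question or the original answer specifies.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array([1, 0])

# OneHotEncoder expects a 2-D array of shape (n_samples, n_features),
# so the label vector is reshaped into a single feature column.
# Listing the categories explicitly keeps two output columns even if
# a batch happens to contain only one of the classes.
enc = OneHotEncoder(categories=[[0, 1]])
one_hot = enc.fit_transform(labels.reshape(-1, 1)).toarray()

print(one_hot)
# [[0. 1.]
#  [1. 0.]]
```

Note that the result is a float array, whereas LabelBinarizer returns integers; cast with `astype(int)` if the downstream code needs integer labels.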

I hope this clarifies why scikit-learn's LabelBinarizer does not convert two-class data into a two-column output.

answered Mar 26 at 17:32 by ai_learning

As already mentioned in a comment, this is not an issue with the method. According to the documentation: Binary targets transform to a column vector. When there are two classes, you can build the array you want from the column vector result.

A direct and simple way to do this is:

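import numpy as np
from sklearn import preprocessing

lb = preprocessing.LabelBinarizer()
lb.fit(range(2))  # range(0, 2) is the same as range(2)
a = lb.transform([1, 0])  # column vector that marks class 1
# Put the complement first so that column 0 marks class 0 and column 1 marks class 1.
result_2d = np.array([[1 - item[0], item[0]] for item in a])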
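
The same idea can be wrapped in a small helper so that the two-class case does not need special handling at every call site. This is only a sketch: `binarize_full` is a hypothetical name, not part of scikit-learn, and it assumes the default `neg_label=0`, `pos_label=1` and dense output.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer


def binarize_full(lb, y):
    """Transform y with a fitted LabelBinarizer, returning one column per class.

    Hypothetical helper, not part of scikit-learn.
    """
    encoded = lb.transform(y)
    # In the binary case scikit-learn returns a single column that marks
    # the second class; prepend its complement so column 0 marks the first.
    if len(lb.classes_) == 2 and encoded.shape[1] == 1:
        encoded = np.hstack([1 - encoded, encoded])
    return encoded


lb = LabelBinarizer().fit([0, 1])
result_2d = binarize_full(lb, [1, 0])
print(result_2d)
# [[0 1]
#  [1 0]]
```

With three or more classes the helper returns the output of `transform` unchanged, so the result always has as many columns as there are classes.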

answered Mar 26 at 17:36 by Eskapp
