Python: how to compare the similarity between clustering using k-means algorithm?
I have two observations of the same event, say X and Y.
I assume there are nc clusters, and I am using sklearn to do the clustering:
from sklearn.cluster import KMeans
x = KMeans(n_clusters=nc).fit_predict(X)
y = KMeans(n_clusters=nc).fit_predict(Y)
Is there a measure that lets me compare x and y, i.e. one that equals 1 if the clusterings x and y are the same?
Tags: python, cluster-analysis, k-means
asked May 13 '16 at 21:29 by emax
2 Answers
Extract the cluster centers from your fitted KMeans objects (see the docs). Note that fit_predict returns only the label array, so keep a reference to the fitted estimator:
km_x = KMeans(n_clusters=nc).fit(X)
km_y = KMeans(n_clusters=nc).fit(Y)
x_centers = km_x.cluster_centers_
y_centers = km_y.cluster_centers_
Then you have to decide which metric to use to compare them. Keep in mind that the centers are floating-point values, that k-means is a heuristic, and that it is randomly initialized. As a result, two clusterings will very likely not be exactly identical, even for models trained on the same data.
This link discusses some approaches and their problems.
answered May 13 '16 at 21:34 by sascha (edited Apr 13 '17 at 12:44 by Community)
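One way the center-based comparison above might be sketched: since cluster indices are arbitrary, the centers first need to be paired up before measuring how far apart they are. The pairing via `scipy.optimize.linear_sum_assignment` (the Hungarian method), the helper name `matched_center_distance`, the fixed seeds, and the synthetic data are all assumptions for illustration, not part of the original answer. The sketch also assumes X and Y live in the same feature space, otherwise distances between centers are not meaningful.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans


def matched_center_distance(X, Y, nc, seed=0):
    """Mean Euclidean distance between optimally paired cluster centers.

    Hypothetical helper: fits one KMeans model per data set, then pairs
    the centers so that the total distance between pairs is minimal.
    """
    km_x = KMeans(n_clusters=nc, n_init=10, random_state=seed).fit(X)
    km_y = KMeans(n_clusters=nc, n_init=10, random_state=seed).fit(Y)
    # cost[i, j] = distance between center i of X and center j of Y
    cost = cdist(km_x.cluster_centers_, km_y.cluster_centers_)
    # One-to-one pairing of centers with minimal total distance
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].mean())


# Synthetic data (assumption): three well-separated blobs,
# and a slightly perturbed second observation of the same points.
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in true_centers])
Y = X + rng.normal(scale=0.1, size=X.shape)

same = matched_center_distance(X, X, nc=3)
close = matched_center_distance(X, Y, nc=3)
print(same, close)
```

Unlike a score bounded by 1, this gives a distance in the units of the data, where 0 means the paired centers coincide; whether that is the right notion of similarity depends on the application.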
The Rand Index and its adjusted version do exactly this. Two cluster assignments that match get a score of 1, even if the labels themselves (which are treated as arbitrary) differ. For the plain Rand Index, a value of 0 means they don't agree at all. The Adjusted Rand Index is corrected for chance: its baseline is a random assignment of points to clusters, which scores close to 0, and it can go negative when agreement is worse than chance.
answered Mar 27 at 15:28 by Sam A.
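A minimal sketch of how this could look with scikit-learn's `adjusted_rand_score`, applied to the label arrays from the question. It assumes that row i of X and row i of Y describe the same item, so that x[i] and y[i] are labels for the same sample; the synthetic data and the seeds are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# Synthetic data (assumption): three well-separated blobs, observed twice.
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in true_centers])
Y = X + rng.normal(scale=0.1, size=X.shape)

nc = 3
# Different seeds on purpose: the cluster indices may come out permuted.
x = KMeans(n_clusters=nc, n_init=10, random_state=0).fit_predict(X)
y = KMeans(n_clusters=nc, n_init=10, random_state=1).fit_predict(Y)

# Compares the partitions, not the label values themselves.
score = adjusted_rand_score(x, y)

# The same partition under swapped label names still counts as a full match.
relabeled = adjusted_rand_score([0, 0, 1, 1], [1, 1, 0, 0])
print(score, relabeled)
```

Because the score depends only on which points are grouped together, no explicit matching of cluster indices between the two runs is needed.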
Would Adjusted Rand or just Rand make more sense in the OP's case? – serafeim, Jul 30 at 17:50
@serafeim What is OP? – Sam A., Aug 11 at 6:40
OP = original post. – serafeim, Aug 12 at 21:49
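To get a feel for the difference between the two indices raised in the comments, one could score two unrelated random labelings with both. This is only an illustrative sketch, not a recommendation from the thread: the random labelings are an assumption, and `rand_score` is assumed to be available (it was added to `sklearn.metrics` in scikit-learn 0.24).

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score, rand_score

# Two independent random labelings of 1000 points into 3 clusters (assumption).
rng = np.random.default_rng(0)
a = rng.integers(0, 3, size=1000)
b = rng.integers(0, 3, size=1000)

# Plain Rand Index: fraction of point pairs on which the labelings agree.
plain = rand_score(a, b)
# Adjusted Rand Index: the same agreement, corrected for chance.
adjusted = adjusted_rand_score(a, b)
print(plain, adjusted)
```

The plain index stays well above 0 even for unrelated labelings, because many pairs agree by chance, while the adjusted index lands near 0. That makes the adjusted version easier to interpret when the question is whether two clusterings agree more than random assignments would.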