How can I determine if ID's are grouped similarly? [duplicate]
This question already has an answer here:
How can you compare two cluster groupings in terms of similarity or overlap in Python?
2 answers
I have applied two different clustering algorithms to my data, and I would like to quantify how much their results have in common.
The data is organized as follows:
- "ID" = Identifier
- "Group_1" = Results from first clustering algorithm
- "Group_2" = Results from second clustering algorithm.
Group_1 is the output of a hierarchical clustering, which had the highest CVI at k = 5, and Group_2 is the output of k-means clustering, which had the highest CVI at k = 10.
I would like to determine the similarity of the results.
Here is the data for which I am trying to determine the similarity:
structure(list(ID = c(400100L, 400101L, 400106L, 442306L, 443110L,
443300L, 443301L, 443302L, 443303L, 443304L, 443307L, 443309L,
443311L, 443312L, 443313L, 443314L, 443316L, 443317L, 443322L,
443324L, 443328L, 443329L, 443330L, 443331L, 443332L, 443333L,
443334L, 443339L, 443344L, 443345L, 443351L, 443365L, 443366L,
443371L, 443378L, 443382L, 443383L, 443388L, 443390L, 443392L,
443396L, 443398L, 443399L, 443506L, 443507L, 443511L, 443512L,
443514L, 443521L, 443522L, 443800L, 443802L, 443816L, 443817L,
443819L, 443820L, 443823L, 443825L, 443828L, 443829L, 443833L,
443842L, 443855L, 443859L, 443876L, 443877L, 443879L, 444101L,
444104L, 444202L, 444204L, 444207L, 444251L, 444305L, 444307L,
444309L, 444312L, 444314L, 444325L, 444327L, 444328L, 444334L,
444335L, 444339L, 444341L, 444346L, 444359L, 444501L, 444504L,
444508L, 444509L, 444511L, 444512L, 444514L, 444517L, 444520L,
444521L, 444547L, 444548L, 444554L, 445101L, 445106L, 445112L,
445113L, 445115L, 445120L, 445141L, 445302L, 445303L, 445304L,
445309L, 445312L, 445313L, 445315L, 445316L, 445318L, 445319L,
445322L, 445327L, 445330L, 445333L, 445404L, 445405L, 445409L,
445510L, 445522L, 445552L, 445560L, 451704L, 451705L, 452503L,
452514L), Group_1 = c(1L, 1L, 2L, 2L, 3L, 2L, 4L, 2L, 2L, 1L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 5L, 2L, 2L, 4L, 4L, 4L, 5L, 5L,
2L, 2L, 1L, 1L, 2L, 2L, 3L, 4L, 4L, 3L, 2L, 2L, 1L, 3L, 1L, 1L,
3L, 2L, 3L, 2L, 1L, 4L, 2L, 5L, 4L, 5L, 3L, 4L, 1L, 2L, 3L, 2L,
2L, 5L, 4L, 2L, 2L, 5L, 1L, 1L, 1L, 2L, 5L, 4L, 4L, 2L, 3L, 3L,
1L, 2L, 1L, 4L, 2L, 4L, 5L, 1L, 4L, 2L, 4L, 2L, 3L, 2L, 2L, 2L,
1L, 2L, 2L, 3L, 4L, 2L, 2L, 3L, 4L, 1L, 1L, 5L, 2L, 2L, 3L, 4L,
3L, 5L, 4L, 1L, 1L, 1L, 2L, 4L, 3L, 4L, 4L, 1L, 2L, 1L, 1L, 2L,
5L, 4L, 4L, 2L, 4L, 3L, 1L, 1L, 3L, 5L), Group_2 = c(7, 7, 7,
7, 8, 3, 3, 7, 3, 9, 6, 1, 7, 7, 10, 7, 4, 6, 7, 7, 6, 3, 3,
10, 7, 6, 1, 7, 9, 1, 6, 7, 3, 1, 5, 3, 7, 2, 5, 6, 5, 4, 6,
10, 1, 1, 1, 10, 1, 6, 7, 6, 6, 3, 7, 7, 6, 5, 7, 6, 9, 7, 8,
6, 3, 7, 9, 3, 7, 6, 6, 2, 6, 3, 3, 2, 7, 1, 6, 6, 6, 3, 6, 6,
3, 7, 7, 1, 3, 7, 3, 6, 8, 6, 3, 7, 6, 7, 7, 1, 3, 6, 7, 3, 7,
3, 7, 3, 3, 5, 5, 2, 6, 3, 1, 6, 7, 6, 7, 5, 2, 7, 6, 5, 7, 1,
8, 7, 3, 9, 7, 6)), row.names = c(NA, -132L), class = c("data.frame"))
I would like to know the percentage agreement between the two groupings, but I cannot figure out how to calculate it.
Ultimately, I would like to arrive at something like:
IDs grouped together in both "Group_1" and "Group_2" divided by N
My assumption is that IDs grouped similarly by both algorithms are correctly labelled, so I could then redo the clustering with the remaining IDs.
r cluster-analysis similarity
marked as duplicate by Anony-Mousse, Mar 23 at 23:13
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
edited Mar 23 at 20:56 by markus
asked Mar 23 at 20:46 by JPM
1 Answer
Standard clustering evaluation measures such as
- adjusted Rand index (ARI)
- normalized mutual information (NMI)
can be used to evaluate the similarity of two clusterings. Both are symmetric: swapping the two clusterings gives the same value, so neither result has to be treated as the reference.
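As a rough illustration of the first measure, here is a minimal base R sketch of the adjusted Rand index computed from the contingency table of two label vectors. The function name `adjusted_rand_index` and the toy vectors `g1`, `g2`, `g3` are made up for this example and are not part of the question; packaged implementations exist as well and may be preferable in practice.

```r
# Adjusted Rand index from two label vectors, using base R only.
# `adjusted_rand_index` is a hypothetical helper written for this sketch.
adjusted_rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                       # contingency table of the two labelings
  pairs <- function(x) sum(choose(x, 2))   # number of object pairs inside each count
  sum_cells <- pairs(tab)                  # pairs placed together by both labelings
  sum_rows  <- pairs(rowSums(tab))         # pairs placed together by `a`
  sum_cols  <- pairs(colSums(tab))         # pairs placed together by `b`
  total     <- choose(length(a), 2)        # all unordered pairs
  expected  <- sum_rows * sum_cols / total # value expected under random labeling
  max_index <- (sum_rows + sum_cols) / 2
  # Note: the denominator is zero in degenerate cases such as
  # both labelings putting everything in a single cluster.
  (sum_cells - expected) / (max_index - expected)
}

# Toy labels (illustrative only, not the question's data)
g1 <- c(1, 1, 2, 2, 3, 3)
g2 <- c(2, 2, 3, 3, 1, 1)   # same partition as g1, different label names
g3 <- c(1, 1, 1, 2, 2, 2)   # a genuinely different partition

ari_same <- adjusted_rand_index(g1, g2)
ari_diff <- adjusted_rand_index(g1, g3)
ari_swap <- adjusted_rand_index(g3, g1)  # argument order does not matter
```

Assuming the data frame from the question is stored as `df`, the call would be `adjusted_rand_index(df$Group_1, df$Group_2)`. The label values themselves do not need to match, and the two clusterings may use different numbers of clusters (5 and 10 here).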
Maybe I was not clear in my description. One thing is that I want to determine the symmetry between the groupings, and here you are right: I could use one of the above. However, this does not help me determine which IDs have been clustered similarly by both the first and the second method, and thus does not help me establish the certainty of the clusters.
– JPM
Mar 24 at 12:55
If you study these measures, you'll see that clustering is predicting whether two objects are in the same cluster, or in different clusters. So the level you'll need to argue is on pairs of objects, not single IDs.
– Anony-Mousse
Mar 24 at 14:35
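The pair-level view from the comment above can be sketched in base R as follows. This is only one possible reading of it, not code from the answer: the helper `pair_agreement` and the toy inputs are hypothetical. It builds a co-membership matrix per clustering, reports the share of pairs on which both clusterings agree (the unadjusted Rand index), and lists the pairs of IDs that are grouped together in both.

```r
# Pair-level agreement between two clusterings, base R only.
# `pair_agreement` is a hypothetical helper written for this sketch.
pair_agreement <- function(ids, a, b) {
  stopifnot(length(ids) == length(a), length(a) == length(b))
  same_a <- outer(a, a, "==")        # TRUE where two objects share a cluster in `a`
  same_b <- outer(b, b, "==")        # TRUE where two objects share a cluster in `b`
  upper  <- upper.tri(same_a)        # count each unordered pair exactly once
  agree  <- same_a == same_b         # both say "together" or both say "apart"
  both   <- which(same_a & same_b & upper, arr.ind = TRUE)
  list(
    rand_index = mean(agree[upper]), # fraction of pairs treated the same way
    together_in_both = data.frame(
      id_1 = ids[both[, 1]],
      id_2 = ids[both[, 2]]
    )
  )
}

# Toy example (illustrative only, not the question's data)
ids <- c("a", "b", "c", "d")
res <- pair_agreement(ids, a = c(1, 1, 2, 2), b = c(1, 1, 1, 2))
res$rand_index        # share of the 6 pairs on which both clusterings agree
res$together_in_both  # pairs placed in the same cluster by both clusterings
```

With the question's data frame stored as `df`, the call would be `pair_agreement(df$ID, df$Group_1, df$Group_2)`. The full pairwise matrices grow quadratically with the number of IDs, which is unproblematic for 132 rows but worth keeping in mind for larger data.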
answered Mar 23 at 23:11 by Anony-Mousse