
Linear Assignment

Like NMI, linear assignment does not depend on the row/column ordering. Unlike NMI, however, it penalizes over- or under-fitting. As shown in Table 3, the score between the KMeans and FullEM results is very high relative to their scores against the ground truth, even though the NMI scores against the ground truth were almost 1.0. This shows that the clusterings did not come close to finding the true number of clusters. In this case, the number of clusters was under-specified.
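The behavior described above can be illustrated with a minimal sketch. It assumes the score is the number of points covered by the best one-to-one matching of clusters, divided by the total number of points; the report does not give the exact normalization, so this definition, the function names, and the toy labels are all hypothetical. The matching is found by brute force over permutations, which is only practical for a handful of clusters; a polynomial-time solver such as the Hungarian algorithm would be used on real data.

```python
from itertools import permutations


def confusion_matrix(labels_a, labels_b):
    """Count how often each label in labels_a co-occurs with each label in labels_b."""
    rows = sorted(set(labels_a))
    cols = sorted(set(labels_b))
    counts = [[0] * len(cols) for _ in rows]
    for a, b in zip(labels_a, labels_b):
        counts[rows.index(a)][cols.index(b)] += 1
    return counts


def linear_assignment_score(counts):
    """Best one-to-one cluster matching as a fraction of all points.

    The normalization by the total count is an assumption, not taken
    from the report.
    """
    n_rows, n_cols = len(counts), len(counts[0])
    size = max(n_rows, n_cols)
    # Pad with zeros to a square matrix so surplus clusters match nothing.
    square = [[counts[i][j] if i < n_rows and j < n_cols else 0
               for j in range(size)] for i in range(size)]
    total = sum(sum(row) for row in counts)
    # Brute force over all permutations: fine for a few clusters only.
    best = max(sum(square[i][p[i]] for i in range(size))
               for p in permutations(range(size)))
    return best / total


# Toy data (hypothetical): four true clusters of two points each.
truth = [0, 0, 1, 1, 2, 2, 3, 3]
relabeled = [2, 2, 0, 0, 3, 3, 1, 1]  # same partition, different label order
merged = [0, 0, 0, 0, 1, 1, 1, 1]     # two clusters where the truth has four

score_relabeled = linear_assignment_score(confusion_matrix(truth, relabeled))
score_merged = linear_assignment_score(confusion_matrix(truth, merged))
print(score_relabeled, score_merged)
```

Under these assumptions, relabeling the clusters leaves the score unchanged, while merging true clusters lowers it, because each merged cluster can be matched to only one of the true clusters it covers.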

Of course, with real data one has no idea of the true number of clusters, so a bootstrapping method is needed. By computing scores of many clusterings against each other, one may be able to identify "consistent" clusters, which may be part of an underlying structure in the data.


Table 3: Linear Assignment Comparison Scores
Clusters                   Linear Assignment Score
KMeans vs. FullEM          0.7427
KMeans vs. Ground Truth    0.1453
FullEM vs. Ground Truth    0.1453



Lucas Scharenbroich 2003-08-27