
HutSOM

Requirements:

HutSOM is a Self-Organizing Map implementation from the Helsinki University of Technology. Self-organizing maps attempt to interpret and organize high-dimensional datasets by modeling the observations with a restricted, low-dimensional set of prototype vectors.
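
To make the idea concrete, the sketch below shows a minimal 1-D SOM training loop in plain Python/NumPy. It is only an illustration of the algorithm, not the HutSOM implementation; the function and parameter names (train_som, n_units, and so on) are invented for this example.

import numpy

def train_som(data, n_units=15, n_iterations=100, seed=1234):
    # Map high-dimensional observations onto a 1-D grid of prototype vectors.
    rng = numpy.random.RandomState(seed)
    n_points, dim = data.shape
    prototypes = rng.uniform(data.min(axis=0), data.max(axis=0), (n_units, dim))
    grid = numpy.arange(n_units, dtype=float)
    for t in range(n_iterations):
        # Learning rate and neighborhood width shrink as training proceeds.
        frac = 1.0 - float(t) / n_iterations
        alpha = 0.5 * frac
        sigma = max(1.0, (n_units / 2.0) * frac)
        for x in data[rng.permutation(n_points)]:
            # Find the best-matching unit (the closest prototype) ...
            bmu = numpy.argmin(((prototypes - x) ** 2).sum(axis=1))
            # ... and pull it and its grid neighbors toward the observation.
            h = numpy.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
            prototypes += alpha * h[:, numpy.newaxis] * (x - prototypes)
    return prototypes

After training, each observation can be assigned to the unit whose prototype is closest to it, which is the kind of per-row labeling the wrapper reports.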

The HutSOM wrapper is instantiated identically to the previous examples. Let's run the HutSOM algorithm and compare its results to those of the EM algorithms above.

>>> from compClust.mlx.dataset import Dataset
>>> from compClust.mlx.wrapper.HutSOM import HutSOM
>>> ds = Dataset('synth_t_15c3_p_0750_d_03_v_0d3.txt')
>>> parameters = {}
>>> parameters['transform_method'] = 'none'
>>> parameters['init_method'] = 'random'
>>> parameters['som_x_dimension'] = 1    # a 1 x 15 map, one unit per cluster
>>> parameters['som_y_dimension'] = 15
>>> parameters['num_iterations'] = 100
>>> parameters['seed'] = 1234
>>> hutsom = HutSOM(ds, parameters)
>>> hutsom.validate()
1
>>> hutsom.run()
1
>>> results = hutsom.getLabeling()
>>> map(lambda x : len(results.getRowsByLabel(x)),
... results.getLabels())
[76, 21, 45, 48, 54, 76, 68, 36, 61, 20, 67, 23, 38, 43, 74]

The SOM appears to create a fairly balanced partitioning, with a standard deviation of the cluster sizes of only 19.9. If we increase som_y_dimension to 45 to reflect the true number of primitive clusters in the dataset, the standard deviation drops to 7.85. However, this may not be a true improvement: with more clusters the average cluster size shrinks, so a smaller spread is expected. If we instead multiply each standard deviation by its number of clusters (15 and 45), we get aggregate values of 298.5 and 353.25, respectively, so we might consider the 15-class clustering superior.
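
These figures can be recomputed directly from the labeling returned above. The snippet below is only an illustrative calculation built on the getLabels and getRowsByLabel calls already shown; it is not part of the wrapper API.

>>> import math
>>> counts = [len(results.getRowsByLabel(x)) for x in results.getLabels()]
>>> mean = sum(counts) / float(len(counts))
>>> std = math.sqrt(sum([(c - mean) ** 2 for c in counts]) / (len(counts) - 1))
>>> aggregate = std * len(counts)   # spread scaled by the number of clusters

Here std is the sample standard deviation of the cluster sizes, and aggregate is the quantity compared across the two SOM configurations above.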

