Class: DistanceFromMean

compClust/mlx/models/DistanceFromMean.py

Produces a model which determines fitness by summing the squared distances from each data point to the mean of its closest cluster.

This model has the shortcoming that the score keeps improving as the number of clusters k increases. The reason is that the average squared distance from a point to its closest cluster mean keeps decreasing, which raises the fitness score.
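The shortcoming can be illustrated with a small sketch. This is not the module's own code: the helper `mean_squared_distance`, the one-dimensional sample points, and the hand-picked means are all assumptions made for the example. Each set of means extends the previous one, so the average squared distance to the closest mean can only shrink as k grows.

```python
def mean_squared_distance(points, means):
    """Average squared distance from each 1-D point to its closest mean.

    Hypothetical helper for illustration; not part of DistanceFromMean.
    """
    total = 0.0
    for p in points:
        total += min((p - m) ** 2 for m in means)
    return total / len(points)


# Illustrative 1-D data with three visible groups.
points = [0.0, 1.0, 4.0, 5.0, 10.0, 11.0]

# Hand-picked means for k = 1, 2, 3; each set contains the previous one.
means_by_k = {
    1: [5.0],
    2: [5.0, 0.5],
    3: [5.0, 0.5, 10.5],
}

for k in sorted(means_by_k):
    msd = mean_squared_distance(points, means_by_k[k])
    # The fitness score is the reciprocal of the average squared distance.
    print(k, msd, 1.0 / msd)
```

The printed average falls at every step, so the reciprocal rises even though the data have not changed. A score of this kind therefore cannot be used on its own to choose k.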

Base Classes
IModel

Methods

__computeClosestClusterDistances
__init__
__repr__
evaluateFitness
initFromLabels

__computeClosestClusterDistances

__computeClosestClusterDistances ( self,  data_points )

Given a data point, find its closest cluster and return the distance between the point and that cluster.

Exceptions
ValueError(("Dimensionality of the data point [%d] must equal " + "the dimensionality of the cluster means [%d]") % (len(point), len(means[0])))
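A minimal sketch of the closest-cluster lookup, under stated assumptions: the function name `closest_cluster_distances` is hypothetical, points and means are plain sequences of numbers, and Euclidean distance is assumed because the documentation does not name the metric. The dimensionality check reuses the error message documented above.

```python
import math


def closest_cluster_distances(data_points, means):
    """Return, for each point, the distance to its closest cluster mean.

    Hypothetical stand-in for the private method; Euclidean distance
    is an assumption, not something the documentation states.
    """
    distances = []
    for point in data_points:
        # Reject points whose dimensionality differs from the means.
        if len(point) != len(means[0]):
            raise ValueError(
                ("Dimensionality of the data point [%d] must equal "
                 + "the dimensionality of the cluster means [%d]")
                % (len(point), len(means[0])))
        # Keep the smallest distance over all cluster means.
        distances.append(min(
            math.sqrt(sum((p - m) ** 2 for p, m in zip(point, mean)))
            for mean in means))
    return distances


example = closest_cluster_distances(
    [(0.0, 0.0), (3.0, 4.0)],
    [(0.0, 0.0), (6.0, 8.0)])
print(example)
```

In this example the second point is equally far from both means, so either one counts as its closest cluster.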

__init__

__init__ (
        self,
        means=None,
        data=None,
        labels=None,
        )

__repr__

__repr__ ( self )

evaluateFitness

evaluateFitness ( self,  data )

Return the fitness of the model given a particular set of data.

The fitness equation is:

    fitness = N / \sum_{k=0}^{N-1} (mean_a - point_k)^2

where point_k is one of the data points, mean_a is the closest cluster mean, and N is the number of data points.
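The fitness equation can be sketched in a few lines. The function name `fitness_from_means` and the tuple-based data layout are assumptions for illustration, and the real method takes only the data because the means are held by the model. How the original handles a zero sum is not documented; this sketch lets Python raise ZeroDivisionError in that case.

```python
def fitness_from_means(data, means):
    """N divided by the sum of squared distances to each closest mean.

    Hypothetical helper for illustration; not the module's own code.
    """
    sum_sq = 0.0
    for point in data:
        # Squared Euclidean distance to the closest cluster mean.
        sum_sq += min(
            sum((p - m) ** 2 for p, m in zip(point, mean))
            for mean in means)
    # Raises ZeroDivisionError if every point sits exactly on a mean.
    return len(data) / sum_sq


data = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
means = [(1.0, 0.0), (11.0, 0.0)]
score = fitness_from_means(data, means)
print(score)
```

Here every point lies at squared distance 1 from its closest mean, so the sum is 4 and, with N = 4, the fitness is 1.0.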

initFromLabels

initFromLabels (
        self,
        data,
        labels,
        )

Given a dataset and a labeling, compute the mean of each labeled class and use these as the cluster means.
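A sketch of how per-class means might be derived from a labeling. It assumes the labels arrive as a plain list aligned with the data, which may not match the labeling objects the library actually uses, and the function name `means_from_labels` is hypothetical.

```python
def means_from_labels(data, labels):
    """Return a dict mapping each label to the mean of its points.

    Hypothetical helper; assumes labels[i] is the label of data[i].
    """
    sums = {}
    counts = {}
    for point, label in zip(data, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
            counts[label] = 0
        # Accumulate the coordinate sums for this label.
        for i, value in enumerate(point):
            sums[label][i] += value
        counts[label] += 1
    # Divide each coordinate sum by the number of points with that label.
    return {label: [s / counts[label] for s in sums[label]]
            for label in sums}


class_means = means_from_labels(
    [(0.0, 0.0), (2.0, 2.0), (10.0, 10.0)],
    ["a", "a", "b"])
print(class_means)
```

The resulting means could then serve as the cluster means against which new data are scored.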

This document was automatically generated on Wed Aug 27 14:25:03 2003 by HappyDoc version 2.1