The ML_Algorithm framework

All clustering algorithms are derived from the ML_Algorithm base class. The name of this class simply stands for "Machine-Learning Algorithm", and an instance of the class represents an algorithm which can be run on a dataset to produce a classification result.

Only two methods are essential to running an ML_Algorithm: run() and getLabeling(). Together with the constructor, these two methods allow you to execute the algorithm and retrieve its results.
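
The construct, run, retrieve pattern described above can be sketched as follows. This is only an illustration in Python: the class MLAlgorithm, the toy subclass ThresholdClusterer, the constructor argument, and the list-of-integers labeling are all assumptions made for the example, not the framework's actual interface. Only the method names run() and getLabeling() come from the text.

```python
from abc import ABC, abstractmethod


class MLAlgorithm(ABC):
    """Hypothetical stand-in for ML_Algorithm; not the framework's real class."""

    def __init__(self, dataset):
        # Assumption: the dataset is handed over at construction time.
        self.dataset = list(dataset)
        self._labeling = None

    @abstractmethod
    def run(self):
        """Execute the algorithm and store a labeling of the dataset."""

    def getLabeling(self):
        """Return the labeling computed by the most recent run()."""
        if self._labeling is None:
            raise RuntimeError("call run() before getLabeling()")
        return self._labeling


class ThresholdClusterer(MLAlgorithm):
    """Toy algorithm (hypothetical): splits 1-D points at the dataset mean."""

    def run(self):
        mean = sum(self.dataset) / len(self.dataset)
        # Points below the mean get label 0, all others label 1.
        self._labeling = [0 if x < mean else 1 for x in self.dataset]


algo = ThresholdClusterer([1.0, 1.2, 0.8, 5.0, 5.3])  # construct
algo.run()                                            # execute
labels = algo.getLabeling()                           # retrieve results
print(labels)
```

In this sketch, calling getLabeling() before run() raises an error; whether the real framework behaves that way is not stated in the text.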

There are two major classes of machine learning algorithms: supervised and unsupervised. Unsupervised algorithms are given only a dataset, and from that they must determine, based on their internal criterion, what the proper classification of the data is.

Supervised algorithms, on the other hand, go through two distinct phases: a training phase and a testing phase. During the training phase, the algorithm is provided with a dataset and also a Labeling of the data. The Labeling identifies which data points belong to which class. After the supervised algorithm constructs a model which fits the training set, it can enter the testing phase, in which data points are input to the model and their estimated class is returned as output.
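
The two supervised phases can be illustrated with a minimal nearest-centroid classifier on one-dimensional data. This is a sketch under stated assumptions rather than code from the framework: the class name NearestCentroidClassifier, the method names train() and test(), and the use of plain Python lists for the dataset and the Labeling are all hypothetical.

```python
class NearestCentroidClassifier:
    """Hypothetical supervised algorithm; names are not from the framework."""

    def __init__(self):
        self.centroids = {}  # class label -> mean of its training points

    def train(self, dataset, labeling):
        """Training phase: build a model from data points and their labels."""
        sums, counts = {}, {}
        for x, label in zip(dataset, labeling):
            sums[label] = sums.get(label, 0.0) + x
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {label: sums[label] / counts[label] for label in sums}

    def test(self, dataset):
        """Testing phase: return the estimated class of each data point."""
        if not self.centroids:
            raise RuntimeError("train() must be called before test()")
        return [
            min(self.centroids, key=lambda c: abs(x - self.centroids[c]))
            for x in dataset
        ]


clf = NearestCentroidClassifier()
# Training: four points with a labeling that assigns them to classes "a" and "b".
clf.train([1.0, 1.4, 4.8, 5.2], ["a", "a", "b", "b"])
# Testing: unlabeled points are assigned the class of the closest centroid.
predicted = clf.test([0.9, 5.0, 2.0])
print(predicted)
```

Here the "model" is simply one mean per class; any other model that fits the training set would follow the same train-then-test structure.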

Lucas Scharenbroich 2003-08-27