All clustering algorithms are derived from the ML_Algorithm base class. The name of this class simply stands for "Machine-Learning Algorithm", and an instance of the class represents an algorithm which can be run on a dataset to produce a classification result.
Only two methods are essential to running an ML_Algorithm: run() and getLabeling(). Together with constructing the instance, these two methods allow you to execute the algorithm and retrieve its results.
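The construct, run, and retrieve pattern can be sketched as follows. This is only an illustration: the ML_Algorithm class below is a simplified stand-in written for this example, not the library's own code, and the constructor arguments, the ThresholdClusterer subclass, the "cutoff" parameter, and the use of a plain list as the labeling are all assumptions.

```python
class ML_Algorithm:
    """Simplified stand-in for the base class (assumed shape, not the real code)."""

    def __init__(self, dataset, parameters=None):
        # Assumed constructor: a dataset plus an optional parameter dictionary.
        self.dataset = dataset
        self.parameters = parameters or {}
        self._labeling = None

    def run(self):
        # Subclasses implement the actual algorithm here.
        raise NotImplementedError

    def getLabeling(self):
        # Returns the result of the most recent run(), or None if not yet run.
        return self._labeling


class ThresholdClusterer(ML_Algorithm):
    """Hypothetical toy algorithm: splits 1-D points at a cutoff value."""

    def run(self):
        cutoff = self.parameters.get("cutoff", 0.0)
        self._labeling = ["high" if x > cutoff else "low" for x in self.dataset]
        return True


# Construct, run, then ask for the labeling.
algorithm = ThresholdClusterer([0.2, 3.1, 0.4, 2.8], {"cutoff": 1.0})
algorithm.run()
labels = algorithm.getLabeling()
print(labels)
```

Whatever the concrete algorithm, the calling code stays the same: only the class being constructed and its parameters change.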
There are two major classes of machine learning algorithms: supervised and unsupervised. Unsupervised algorithms are given only a dataset, and from that they must determine, based on their internal criterion, what the proper classification of the data is.
Supervised algorithms, on the other hand, go through two distinct phases: a training phase and a testing phase. During the training phase, the algorithm is provided with a dataset and also a Labeling of the data. The Labeling identifies which data points belong to which class. After the supervised algorithm constructs a model which fits the training set, it can enter the testing phase, in which data points are input to the model and their estimated classes are returned as output.
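The two phases of a supervised algorithm can be illustrated with a small nearest-mean classifier. This is a sketch under stated assumptions, not one of the library's algorithms: the train() and classify() functions are hypothetical names, the data are one-dimensional numbers, and the Labeling is represented as a plain list of class names.

```python
def train(points, labels):
    """Training phase: build a model (one mean per class) from labeled data."""
    sums, counts = {}, {}
    for x, label in zip(points, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(model, points):
    """Testing phase: assign each point to the class with the nearest mean."""
    return [min(model, key=lambda label: abs(x - model[label])) for x in points]


# Training: the dataset comes with a labeling of every point.
model = train([0.1, 0.3, 2.9, 3.1], ["low", "low", "high", "high"])

# Testing: new points go in, estimated classes come out.
predicted = classify(model, [0.0, 2.5, 1.0])
print(predicted)
```

An unsupervised algorithm would have only the first argument of train() to work with, and would have to decide on the classes itself.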