Module: KMeans (compClust/mlx/wrapper/KMeans.py)

Usage: KMeans.py parameterFilename datasetFilename resultsFilename

Wrapper for the kmeans algorithm.

Depends on the following environment variable:

KMEANS_COMMAND (e.g., /proj/cluster_gazing2/bin/kmeans)

Algorithm parameters include the following name-value pairs. Unless a default is indicated, the parameter is required.

distance_metric
    Either "correlation" or "euclidean" (include the quotes).

init_means
    One of "church", "random", "random_range", or "random_sample" (include the quotes).

k
    The number of clusters, k, to find.

k_strict
    If "true", kmeans treats k as a strict parameter: if k clusters cannot be found (after any restarts permitted by max_restarts, in the case of randomly initialized means), no result is reported. Defaults to "false".

num_iterations
    The number of kmeans iterations.

max_restarts
    The maximum number of restarts in the case of collapsed clusters (valid only for randomly initialized means). Defaults to 0.

num_mean_samples
    If init_means = "random_sample", the number of datapoints to sample (without replacement) when estimating initial means. Defaults to 3.

seed
    The seed for the pseudo-random number generator (valid only for randomly initialized means). Defaults to 42.

An example parameter file:

    distance_metric = "euclidean"
    init_means = "random_sample"
    k = 5
    num_iterations = 100
    max_restarts = 10
    num_mean_samples = 3
    seed = 1234
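The parameter file format and the usage line above can be produced from a script. The following is a minimal sketch, not part of the wrapper itself: the helper names `format_value`, `render_parameters`, and `build_command` are hypothetical, and the sketch assumes the parameter file holds one `name = value` pair per line, with string values in double quotes as in the example. It only builds the text and the argument list; it does not run kmeans.

```python
import os

# Parameters documented above without a default, and therefore required.
REQUIRED = ("distance_metric", "init_means", "k", "num_iterations")


def format_value(value):
    """Render a value as in the example file: strings quoted, numbers bare."""
    if isinstance(value, str):
        return '"%s"' % value
    return str(value)


def render_parameters(params):
    """Return parameter-file text, one `name = value` pair per line (assumed layout)."""
    missing = [name for name in REQUIRED if name not in params]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    return "".join(
        "%s = %s\n" % (name, format_value(value)) for name, value in params.items()
    )


def build_command(parameter_file, dataset_file, results_file):
    """Build the argument list from the usage line; KMEANS_COMMAND must be set."""
    if "KMEANS_COMMAND" not in os.environ:
        raise EnvironmentError("KMEANS_COMMAND is not set")
    return ["KMeans.py", parameter_file, dataset_file, results_file]
```

Passing `k_strict` as the string `"true"` or `"false"` keeps the quoting consistent with the description above. The list returned by `build_command` could be handed to `subprocess.run` once the parameter text has been written to disk.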