Module: TSplit | compClust/mlx/wrapper/TSplit.py | |||||||
---|---|---|---|---|---|---|---|---|

Usage: `TSplit.py parameter_filename input_filename output_filename`

Wrapper for the tsplit algorithm.

Note: The class labels will have the extension you specify on the command line, and the tsplit intermediate file, if saved, will have a `.gtr` extension.

Depends on the following environment variables:

- `TSPLIT_COMMAND` (e.g., `/proj/cluster_gazing2/bin/tsplit`)

Brief Algorithm Description:

Required Parameters (note: each bracketed list gives the possible values that parameter can take):

- `distance_metric = [correlation, euclidean, Bhattacharyya]`
  - `Bhattacharyya`: takes into account not only the difference between the two mean vectors, but also the distributions of the two groups of data points.
- `agglomerate_method = [none, native, size, clusterNumber]`
  - `none`: do not agglomerate; just generate the normal tsplit output files.
  - `native`: use tsplit's built-in agglomeration to produce as close to K clusters as possible.
  - `size`: perform a size-threshold agglomeration. Starting at the root, recurse through the tree, attempting to agglomerate at each node and stopping only when the number of genes in the agglomerated sub-tree is less than the parameter `size_threshold`.
  - `clusterNumber`: return as close to K clusters as possible, using the `size` agglomeration method to partition the tree.

  The `size` and `clusterNumber` agglomeration is identical to the agglomeration used in xclust.
- `splitting_method = [`

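The `Bhattacharyya` option is described above only in words. As a rough illustration of why it reacts to both the means and the spread of two groups, here is a minimal sketch of the standard Bhattacharyya distance between two Gaussians. It assumes each group is modeled as a Gaussian with diagonal covariance; the function name `bhattacharyya_distance` and the `eps` guard are hypothetical, and tsplit's own computation may differ.

```python
import math
from statistics import fmean, pvariance


def bhattacharyya_distance(group_a, group_b, eps=1e-12):
    """Bhattacharyya distance between two groups of points.

    Assumption: each group is Gaussian with a diagonal covariance matrix,
    so the distance is a sum of independent per-dimension terms.
    Each group is a list of equal-length rows (one row per data point).
    """
    dims = len(group_a[0])
    distance = 0.0
    for d in range(dims):
        col_a = [row[d] for row in group_a]
        col_b = [row[d] for row in group_b]
        mean_a, mean_b = fmean(col_a), fmean(col_b)
        # eps keeps the logarithm finite when a group has zero variance.
        var_a = pvariance(col_a, mean_a) + eps
        var_b = pvariance(col_b, mean_b) + eps
        var_avg = 0.5 * (var_a + var_b)
        # Term 1: separation of the means, scaled by the averaged variance.
        distance += 0.125 * (mean_a - mean_b) ** 2 / var_avg
        # Term 2: mismatch between the two spreads (zero when they agree).
        distance += 0.5 * math.log(var_avg / math.sqrt(var_a * var_b))
    return distance


if __name__ == "__main__":
    narrow = [[0.0], [2.0]]   # mean 1, variance 1
    wide = [[-2.0], [4.0]]    # mean 1, variance 9
    # Same mean, different spread: the distance is still non-zero.
    print(bhattacharyya_distance(narrow, wide))
```

Two groups with identical means but different variances still get a positive distance here, which a purely mean-based Euclidean comparison would not show.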
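The `size` agglomeration described above can be sketched as a simple recursion. This is one reading of the description, not the code in `TreeAgglomerator`: the `Node` class and the function `agglomerate_by_size` are hypothetical stand-ins, and the sketch assumes that a sub-tree is collapsed into a single cluster as soon as it holds fewer than `size_threshold` genes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a gene name, inner nodes carry children."""
    gene: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

    def genes(self) -> List[str]:
        """Return all gene names in this sub-tree, left to right."""
        if not self.children:
            return [self.gene]
        collected: List[str] = []
        for child in self.children:
            collected.extend(child.genes())
        return collected


def agglomerate_by_size(node: Node, size_threshold: int) -> List[List[str]]:
    """Walk down from the root and collapse each sub-tree that is small enough.

    Assumed rule: a sub-tree with fewer than size_threshold genes (or a
    single leaf) becomes one cluster; otherwise recurse into its children.
    """
    members = node.genes()
    if not node.children or len(members) < size_threshold:
        return [members]
    clusters: List[List[str]] = []
    for child in node.children:
        clusters.extend(agglomerate_by_size(child, size_threshold))
    return clusters


if __name__ == "__main__":
    # Tree shape: ((g1, g2), (g3, (g4, g5)))
    tree = Node(children=[
        Node(children=[Node("g1"), Node("g2")]),
        Node(children=[Node("g3"), Node(children=[Node("g4"), Node("g5")])]),
    ])
    print(agglomerate_by_size(tree, size_threshold=3))
```

Under this reading, a larger `size_threshold` yields fewer, larger clusters, which is presumably what lets the `clusterNumber` method steer the partition toward K clusters.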
Imported modules | ||
---|---|---|

- `from compClust.mlx.ML_Algorithm import ML_Algorithm`
- `from compClust.mlx.XClustTree import XClustTree`
- `from compClust.mlx.labelings import Labeling`
- `from compClust.mlx.models import DistanceFromMean`
- `import compClust.mlx.wrapper`
- `from compClust.mlx.wrapper.TreeAgglomerator import TreeAgglomerator`
- `from compClust.util import Verify, WrapperUtil`
- `from compClust.util.TimeStampedPrintStream import TimeStampedPrintStream`
- `import os`
- `import string`
- `import sys`
- `import tempfile`

Classes | ||
|