Table of Contents

Class: ConfusionMatrix compClust/score/ConfusionMatrix.py

The main purpose of the ConfusionMatrix class is to construct a confusion matrix between two (or more) labelings of a dataset and to perform analysis on that confusion matrix.

Methods   
NMI
__countHelper
__init__
averageNMI
createConfusionHypercubeFromLabeling
createConfusionMatrixFromFile
createConfusionMatrixFromLabeling
findCellCoordinates
getAdjacencyList
getAdjacencyMatrix
getAgreementList
getConfusionHypercubeCell
getCounts
getDimensionalLabeling
getHypercubeCounts
getInverseStarburst
getNumElements
getStarburst
linearAssignment
printCounts
projectConfusionHypercube
removeCellFromHypercube
removeIndexFromHypercube
transposeNMI
  NMI 
NMI ( self )

Returns the NMI score of the confusion matrix.
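
The normalization used for the score is not stated here. The following is a minimal sketch, assuming the score is the mutual information between the row and column labelings divided by the entropy of the row labeling. That assumption would explain why transposeNMI and averageNMI exist as separate methods, but it is not confirmed by this page. The function names and the plain list-of-lists count table are hypothetical and are not part of the class.

```python
import math

def nmi_sketch(counts):
    """I(rows; cols) / H(rows) for a 2D table of counts (assumed normalization)."""
    total = float(sum(sum(row) for row in counts))
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]

    # Mutual information between the row and column labelings.
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:
                mi += (n / total) * math.log(n * total / (row_sums[i] * col_sums[j]))

    # Entropy of the row labeling, used here as the normalizer (an assumption).
    h_rows = -sum((r / total) * math.log(r / total) for r in row_sums if r > 0)
    return mi / h_rows if h_rows > 0 else 0.0

def transpose_nmi_sketch(counts):
    """The same score computed on the transposed table."""
    return nmi_sketch([list(col) for col in zip(*counts)])

def average_nmi_sketch(counts):
    """Mean of the score and the score of the transpose."""
    return 0.5 * (nmi_sketch(counts) + transpose_nmi_sketch(counts))
```

Under this normalization the score is asymmetric whenever the two labelings have different entropies, so averaging it with the transposed score gives a symmetric measure.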

  __countHelper 
__countHelper (
        self,
        dims,
        partialKey,
        )

Recursive subroutine which traverses the hypercube along each dimension and creates a nested list structure of the counts of the cells. This is an O(n^d) algorithm, where d is the number of dimensions of the hypercube and n is the magnitude of each dimension. Thus, this routine should be called sparingly.
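
The recursion can be illustrated with a minimal sketch, assuming the hypercube is a dictionary keyed by comma-separated coordinate strings (as described for self.hypercube) and that dims lists the magnitude of each dimension. The standalone function and its argument names are hypothetical; the actual method is private to the class and may differ.

```python
def count_helper_sketch(hypercube, dims, partial_key=()):
    """Return nested lists of cell counts for a dict-based hypercube."""
    if not dims:
        # All coordinates are fixed: look up the cell (missing cells count as 0).
        key = ",".join(str(c) for c in partial_key)
        return len(hypercube.get(key, []))
    # Fix the next coordinate and recurse over the remaining dimensions.
    return [count_helper_sketch(hypercube, dims[1:], partial_key + (i,))
            for i in range(dims[0])]

# Hypothetical 2 x 2 hypercube holding three element indices.
hypercube = {"0,0": [0, 1], "1,1": [2]}
counts = count_helper_sketch(hypercube, [2, 2])
```

Every coordinate combination is visited, including empty cells, which is where the exponential cost comes from.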

  __init__ 
__init__ ( self )

self.dimensions:
    A list containing the magnitude along each dimension. This corresponds to the number of classes in a given Labeling.

self.hypercube:
    A dictionary containing the confusion hypercube. Cells are stored by index. The coordinates of a cell in N-dimensional space are represented by a comma-separated list, i.e. hypercube[1][2][3] <=> hypercube['1,2,3'].

self.numElements:
    The total number of elements (genes) stored in the confusion matrix.

self.dimensionLabeling:
    A Labeling for the axes of the hypercube. Operations on the confusion hypercube should make use of axis labels, not indexes.

self.index2cell:
    Indexing this list with the index of a label in the labeling returns the string key of its cell in the hypercube.

self.rowClassNames & self.colClassNames:
    Two-way dictionaries which map the class numbers to their names and vice versa. Aliases of self.classNames[0] and self.classNames[1].

self.classNames:
    A list of class name <-> class number dictionaries, one for each dimension.

  averageNMI 
averageNMI ( self )

Returns the average NMI score between the confusion matrix and its transpose.

  createConfusionHypercubeFromLabeling 
createConfusionHypercubeFromLabeling ( self,  labelings )

A Confusion Hypercube is a generalization of the confusion matrix which allows for any number of labelings to be analyzed at the same time. This is a realization of the full Cartesian product of the classes defined in the labelings.

The ability to ask questions about more than two datasets at a time is a valuable tool and can be used for sophisticated analysis.
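
As an illustration of the idea, here is a minimal sketch that groups element indices by their tuple of class indices. It assumes each labeling is a plain list of labels, all of equal length and in the same element order; the real method takes Labeling objects, whose interface is not shown on this page. The function name and return values are hypothetical.

```python
def build_hypercube_sketch(labelings):
    """Group element indices by their tuple of class indices."""
    # One sorted class list per labeling defines the coordinates on that axis.
    class_names = [sorted(set(labels)) for labels in labelings]
    dimensions = [len(names) for names in class_names]

    hypercube = {}
    index2cell = []
    for index, labels in enumerate(zip(*labelings)):
        coords = [class_names[d].index(label) for d, label in enumerate(labels)]
        # Cells are keyed by a comma-separated coordinate string, e.g. '1,2,3'.
        key = ",".join(str(c) for c in coords)
        hypercube.setdefault(key, []).append(index)
        index2cell.append(key)
    return dimensions, hypercube, index2cell

# Three hypothetical labelings of the same three elements.
dims, cube, index2cell = build_hypercube_sketch(
    [["a", "a", "b"], ["x", "y", "y"], ["p", "p", "p"]]
)
```

Only cells that actually contain elements are stored, so the dictionary stays small even though the full Cartesian product of the classes can be large.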

  createConfusionMatrixFromFile 
createConfusionMatrixFromFile (
        self,
        clusteringFile1,
        clusteringFile2,
        )

A confusion matrix is constructed from the two clustered files clusteringFile1 and clusteringFile2. The clusters of clusteringFile1 are arranged across the rows of the confusion matrix and the clusters of clusteringFile2 across the columns. We assume each file contains a list of cluster labels, one per line. The confusion matrix will only be constructed for data which is shared between the two clusterings.

The constructed confusion matrix, instead of storing straight numeric counts, stores lists of elements shared between each pair of clusters. This adds quite a bit of exploratory power to the confusion matrix.

  createConfusionMatrixFromLabeling 
createConfusionMatrixFromLabeling (
        self,
        labeling1,
        labeling2,
        )

A confusion matrix is constructed from the two labelings labeling1 and labeling2.

  findCellCoordinates 
findCellCoordinates ( self,  index )

Finds and returns the coordinates of the cell which contains the index in question. The coordinates are returned as a list of integers suitable to be passed to getConfusionHypercubeCell().

  getAdjacencyList 
getAdjacencyList ( self )

Returns a list of tuples indicating which classes correspond with each other.

  getAdjacencyMatrix 
getAdjacencyMatrix ( self )

Returns a numeric array in which an entry is 1 when the two clusters given by that entry's indices correspond to each other, and 0 everywhere else.

  getAgreementList 
getAgreementList ( self )

Returns a list of length numElements in which each entry is either 1 or 0, depending on whether or not the dimensions of the hypercube agree on the point's classification. This method is only valid for 2D confusion matrices.

A perfect agreement would return a list of all 1's.

  getConfusionHypercubeCell 
getConfusionHypercubeCell ( self,  cellCoordinates )

Returns the list of indices held in a node of the hypercube. If the node does not exist, an empty list is returned.

The cellCoordinates value is a tuple of labels.

  getCounts 
getCounts ( self )

  getDimensionalLabeling 
getDimensionalLabeling ( self )

Returns a list containing the labels for each dimension along the hypercube.

  getHypercubeCounts 
getHypercubeCounts ( self )

  getInverseStarburst 
getInverseStarburst ( self,  index )

Similar to getStarburst(), but returns a list of the data in cells that do not lie along any dimension axis through the cell containing index. This is not a proper inverse, since this set also does not contain the cell to which index belongs.

This operation tells what data is unrelated to index.

  getNumElements 
getNumElements ( self )

Returns the total number of elements stored in the confusion hypercube.

  getStarburst 
getStarburst ( self,  index )

Returns a list of lists containing all the elements in the starburst centered on the cell containing index. The starburst can be conceptualized as a series of rays expanding along each dimension of the hypercube from the center point. For each cell these rays touch, if it contains data, that data is appended to a list.

This operation is useful to determine what data, while not perfectly associated with index, is considered related to some degree.
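
The ray picture can be made concrete with a minimal sketch over a dictionary keyed by comma-separated coordinate strings. It assumes a cell lies on a ray when it differs from the center cell in exactly one coordinate, and that the center cell itself is excluded; both points are interpretations of the description rather than confirmed behavior. It also takes the center cell's key directly instead of an element index. The complementary set described for getInverseStarburst() is sketched alongside for comparison. All function names are hypothetical.

```python
def _differing_coords(key, center_key):
    """Number of coordinates in which two cell keys differ."""
    return sum(1 for a, b in zip(key.split(","), center_key.split(",")) if a != b)

def starburst_sketch(hypercube, center_key):
    """Data from non-empty cells that differ from the center in one coordinate."""
    return [list(members)
            for key, members in sorted(hypercube.items())
            if members and _differing_coords(key, center_key) == 1]

def inverse_starburst_sketch(hypercube, center_key):
    """Data from non-empty cells that differ from the center in two or more coordinates."""
    return [list(members)
            for key, members in sorted(hypercube.items())
            if members and _differing_coords(key, center_key) >= 2]

# Hypothetical 3 x 2 hypercube; the cell '2,1' is present but empty.
cube = {"0,0": [0, 1], "0,1": [2], "1,0": [3], "1,1": [4], "2,1": []}
rays = starburst_sketch(cube, "0,0")
unrelated = inverse_starburst_sketch(cube, "0,0")
```

In the 2D case this reduces to the other cells in the same row or column as the center for the starburst, and the cells in a different row and a different column for the inverse.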

  linearAssignment 
linearAssignment ( self )

Returns the linear assignment score for a given matrix.
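
How the score is defined is not stated here. The following is a minimal sketch, assuming the score is the largest number of elements that can be matched by a one-to-one pairing of row classes with column classes, expressed as a fraction of all elements. It uses a brute-force search over permutations, which is only practical for small matrices and is a stand-in for whatever assignment algorithm the class actually uses. The function name is hypothetical.

```python
from itertools import permutations

def linear_assignment_sketch(counts):
    """Best one-to-one row/column matching, as a fraction of all elements."""
    # Work on the orientation with no more rows than columns.
    if len(counts) > len(counts[0]):
        counts = [list(col) for col in zip(*counts)]
    n_rows, n_cols = len(counts), len(counts[0])
    total = sum(sum(row) for row in counts)

    best = 0
    # Brute force: try every assignment of distinct columns to the rows.
    for cols in permutations(range(n_cols), n_rows):
        matched = sum(counts[r][c] for r, c in enumerate(cols))
        best = max(best, matched)
    return best / total if total else 0.0
```

For the table [[5, 1], [2, 4]], pairing row 0 with column 0 and row 1 with column 1 matches 9 of the 12 elements, so this sketch returns 0.75.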

  printCounts 
printCounts (
        self,
        labels=0,
        outputStream=sys.stdout,
        )

Pretty-prints the confusion matrix, optionally labeling the axes. If you pass in labels=0, then no labels will be printed. outputStream allows the output of the function to be redirected to any open stream.

  projectConfusionHypercube 
projectConfusionHypercube ( self,  labels )

Returns another hypercube built from the dimensions of the original confusion hypercube specified by labels. Equivalent to projecting the hypercube onto the dimensions passed.
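
A minimal sketch of such a projection over a dictionary keyed by comma-separated coordinate strings is shown below. It selects dimensions by position, whereas the documented method takes axis labels; that simplification, along with the function and parameter names, is an assumption made for illustration.

```python
def project_hypercube_sketch(hypercube, keep_dims):
    """Merge cells that share coordinates on the kept dimensions."""
    projected = {}
    for key, members in hypercube.items():
        coords = key.split(",")
        # Drop the coordinates of every dimension that is not kept.
        new_key = ",".join(coords[d] for d in keep_dims)
        projected.setdefault(new_key, []).extend(members)
    return projected

# Hypothetical 3D hypercube projected onto its first and third dimensions.
cube = {"0,0,0": [0], "0,1,0": [1], "1,1,0": [2]}
projected = project_hypercube_sketch(cube, [0, 2])
```

Cells that differ only in a dropped dimension collapse into one cell, so the total number of elements is unchanged by the projection.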

  removeCellFromHypercube 
removeCellFromHypercube ( self,  cellCoordinates )

Given a cell's coordinates, removes that cell from the hypercube.

  removeIndexFromHypercube 
removeIndexFromHypercube ( self,  index )

  transposeNMI 
transposeNMI ( self )

Returns the NMI score of the transposed confusion matrix.


This document was automatically generated on Wed Aug 27 14:25:10 2003 by HappyDoc version 2.1