Class: ConfusionMatrix

Class: ConfusionMatrix compClust/score/ConfusionMatrix.py

The main purpose of the ConfusionMatrix class is to construct a confusion matrix between two (or more) labelings of a dataset and to perform analysis on that confusion matrix.

Methods

NMI
__countHelper
__init__
averageNMI
createConfusionHypercubeFromLabeling
createConfusionMatrixFromFile
createConfusionMatrixFromLabeling
findCellCoordinates

getAdjacencyList
getAdjacencyMatrix
getAgreementList
getConfusionHypercubeCell
getCounts
getDimensionalLabeling
getHypercubeCounts
getInverseStarburst

getNumElements
getStarburst
linearAssignment
printCounts
projectConfusionHypercube
removeCellFromHypercube
removeIndexFromHypercube
transposeNMI

NMI

NMI ( self )

Returns the NMI score of the confusion matrix.

__countHelper

__countHelper (
        self,
        dims,
        partialKey,
        )

Recursive subroutine which traverses the hypercube along each dimension and creates a nested list structure of the counts of the cells. This is an O(n^d) algorithm, where d is the number of dimensions of the hypercube and n is the magnitude of each dimension. Thus, this routine should be called sparingly.
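The recursion described above can be sketched roughly as follows. This is an illustration only, not the class's actual code: the function name count_cells and its arguments are hypothetical, and the hypercube is assumed to be a plain dictionary keyed by comma-separated coordinate strings.

```python
def count_cells(hypercube, dims, partial_key=()):
    """Recursively build nested lists of cell counts, one level per dimension."""
    depth = len(partial_key)
    if depth == len(dims):
        # All coordinates are fixed: look the cell up and count its members.
        key = ",".join(str(c) for c in partial_key)
        return len(hypercube.get(key, []))
    # Otherwise iterate over every position along the next dimension.
    return [count_cells(hypercube, dims, partial_key + (i,))
            for i in range(dims[depth])]


# Hypothetical 2x2 hypercube; the cell '1,0' is empty and therefore absent.
hypercube = {"0,0": [0, 1], "0,1": [2], "1,1": [3, 4, 5]}
counts = count_cells(hypercube, dims=[2, 2])  # [[2, 1], [0, 3]]
```

Every cell is visited, empty or not, which is where the O(n^d) cost comes from.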

__init__

__init__ ( self )

self.dimensions:
        a list containing the magnitude along each dimension. This corresponds to the number of classes in a given Labeling

self.hypercube:
        a dictionary containing the confusion hypercube. Cells are stored by index. The coordinates of the cell in N-dimensional space are represented by a comma-separated list, i.e. hypercube[1][2][3] <=> hypercube['1,2,3']

self.numElements:
        The total number of elements (genes) stored in the confusion matrix

self.dimensionLabeling:
        a Labeling for the axis of the dimension of the Hypercube. Operations on the confusion hypercube should make use of axis labels, not indexes

self.index2cell:
        The index of a label in the labeling into this list will return the string key of its cell in the hypercube

self.rowClassNames & self.colClassNames:
        A 2-way dictionary which maps the class numbers to their names and vice-versa. Aliases of self.classNames[0] and self.classNames[1]

self.classNames:
        A list of class name <-> class number dictionaries, one for each dimension
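As a rough illustration of the string-keyed storage scheme, the sketch below keeps cells in a plain dictionary. The helper names coords_to_key and key_to_coords are hypothetical and not part of ConfusionMatrix, and index2cell is modelled as a dictionary here for brevity, although the description above calls it a list.

```python
def coords_to_key(coords):
    """Turn cell coordinates such as (1, 2, 3) into the string key '1,2,3'."""
    return ",".join(str(c) for c in coords)


def key_to_coords(key):
    """Turn a string key such as '1,2,3' back into the tuple (1, 2, 3)."""
    return tuple(int(part) for part in key.split(","))


# A sparse hypercube: only non-empty cells are present in the dictionary.
hypercube = {}
hypercube.setdefault(coords_to_key((1, 2, 3)), []).append(0)  # element 0
hypercube.setdefault(coords_to_key((1, 2, 3)), []).append(4)  # element 4
hypercube.setdefault(coords_to_key((0, 1, 1)), []).append(2)  # element 2

# index2cell maps an element's index to the key of the cell holding it.
index2cell = {}
for key, members in hypercube.items():
    for element in members:
        index2cell[element] = key
```

With this layout, looking up the cell of an element is a single dictionary access, and empty cells cost no storage.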

averageNMI

averageNMI ( self )

Returns the average NMI score between the confusion matrix and its transpose.
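This page does not state the NMI formula the class uses. The sketch below assumes one common asymmetric definition, the mutual information between the row and column labelings divided by the entropy of the row labeling, which would explain why the matrix and its transpose can score differently. All function names are hypothetical, and counts is assumed to be a nested list of cell counts.

```python
import math


def entropy(totals, n):
    """Shannon entropy (in bits) of a labeling given its class totals."""
    return -sum((t / n) * math.log2(t / n) for t in totals if t > 0)


def mutual_information(counts):
    """Mutual information (in bits) between the row and column labelings."""
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:
                mi += (c / n) * math.log2(c * n / (row_totals[i] * col_totals[j]))
    return mi


def nmi(counts):
    """ASSUMED normalization: divide by the entropy of the row labeling."""
    n = sum(sum(row) for row in counts)
    h_rows = entropy([sum(row) for row in counts], n)
    return mutual_information(counts) / h_rows if h_rows > 0 else 0.0


def transpose_nmi(counts):
    """NMI of the transposed matrix, i.e. normalized by the column entropy."""
    return nmi([list(col) for col in zip(*counts)])


def average_nmi(counts):
    """Mean of the NMI of the matrix and the NMI of its transpose."""
    return (nmi(counts) + transpose_nmi(counts)) / 2.0
```

Under this assumption a perfectly diagonal matrix scores 1 in both directions, while a matrix whose rows and columns are independent scores 0.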

createConfusionHypercubeFromLabeling

createConfusionHypercubeFromLabeling ( self,  labelings )

A Confusion Hypercube is a generalization of the confusion matrix which allows for any number of labelings to be analyzed at the same time. This is a realization of the full Cartesian product of the classes defined in the labelings.

The ability to ask questions about more than two datasets at a time is a valuable tool and can be used for sophisticated analysis.
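A minimal sketch of the idea, assuming each labeling is simply a list of class numbers with one entry per element. The function build_hypercube is hypothetical and does not use the Labeling class. Only non-empty cells of the Cartesian product are stored.

```python
from collections import defaultdict


def build_hypercube(labelings):
    """Group element indices by the tuple of class labels they receive."""
    lengths = {len(labeling) for labeling in labelings}
    if len(lengths) != 1:
        raise ValueError("all labelings must cover the same elements")
    hypercube = defaultdict(list)
    # zip(*labelings) yields, per element, one label from each labeling.
    for index, labels in enumerate(zip(*labelings)):
        key = ",".join(str(label) for label in labels)
        hypercube[key].append(index)
    return dict(hypercube)


labelings = [
    [0, 0, 1, 1],  # labeling A
    [0, 1, 1, 1],  # labeling B
    [2, 2, 2, 0],  # labeling C
]
cube = build_hypercube(labelings)
# cube == {'0,0,2': [0], '0,1,2': [1], '1,1,2': [2], '1,1,0': [3]}
```

With two labelings this reduces to an ordinary confusion matrix whose cells hold element indices.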

createConfusionMatrixFromFile

createConfusionMatrixFromFile (
        self,
        clusteringFile1,
        clusteringFile2,
        )

A confusion matrix is constructed from the two clustered files clusteringFile1 and clusteringFile2. The clusters of clusteringFile1 are arranged across the rows and the clusters of clusteringFile2 are arranged across the columns of the confusion matrix. We assume each file contains a list of cluster labels, one per line. The confusion matrix will only be constructed for data which is shared between the two clusterings.

The constructed confusion matrix, instead of storing straight numeric counts, stores lists of elements shared between each pair of clusters. This adds quite a bit of exploratory power to the confusion matrix.
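The following sketch illustrates storing element lists rather than counts. It is not the class's implementation: confusion_from_streams is a hypothetical name, in-memory streams stand in for the two files, and elements are identified by line number. Truncating to the shorter input is a simplification made by this sketch, not the documented handling of shared data.

```python
import io
from collections import defaultdict


def read_labels(stream):
    """Read one cluster label per line."""
    return [line.strip() for line in stream]


def confusion_from_streams(stream1, stream2):
    """Map (row_label, col_label) to the list of line indices sharing both."""
    labels1 = read_labels(stream1)
    labels2 = read_labels(stream2)
    cells = defaultdict(list)
    # zip stops at the shorter input (a simplification of this sketch).
    for index, pair in enumerate(zip(labels1, labels2)):
        cells[pair].append(index)
    return dict(cells)


file1 = io.StringIO("a\na\nb\nb\n")
file2 = io.StringIO("x\ny\ny\ny\n")
cells = confusion_from_streams(file1, file2)

# Plain counts can always be recovered from the stored lists.
counts = {pair: len(members) for pair, members in cells.items()}
```

Keeping the member lists makes it possible to ask which elements two clusters share, not just how many.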

createConfusionMatrixFromLabeling

createConfusionMatrixFromLabeling (
        self,
        labeling1,
        labeling2,
        )

A confusion matrix is constructed from the two labelings labeling1 and labeling2.

findCellCoordinates

findCellCoordinates ( self,  index )

Finds and returns the coordinates of the cell which contains the index in question. The coordinates are returned as a list of integers suitable to be passed to getConfusionHypercubeCell().

getAdjacencyList

getAdjacencyList ( self )

Returns a list of tuples indicating which classes correspond with each other.

getAdjacencyMatrix

getAdjacencyMatrix ( self )

Returns a numeric array containing a 1 wherever the clusters given by an element's row and column indices are corresponding clusters, and zero everywhere else.

getAgreementList

getAgreementList ( self )

Returns a list of length numElements in which each entry is either 1 or 0, depending on whether or not the dimensions of the hypercube agree on the point's classification. This method is only valid for 2D confusion matrices.

A perfect agreement would return a list of all 1's.
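How the class decides that two classes "agree" is not spelled out here. The sketch below assumes the correspondence is supplied as a list of (row class, column class) tuples, such as getAdjacencyList() is described as returning. The function name and arguments are hypothetical.

```python
def agreement_list(labeling1, labeling2, adjacency):
    """Return 1 for each element whose two labels form a corresponding pair."""
    pairs = set(adjacency)
    return [1 if (a, b) in pairs else 0
            for a, b in zip(labeling1, labeling2)]


# Assumed correspondence: row class 0 <-> column class 1, and 1 <-> 0.
adjacency = [(0, 1), (1, 0)]
agreement = agreement_list([0, 0, 1, 1], [1, 1, 0, 1], adjacency)
# agreement == [1, 1, 1, 0]
```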

getConfusionHypercubeCell

getConfusionHypercubeCell ( self,  cellCoordinates )

Returns the list of indices held in a node of the hypercube. If the node does not exist, an empty list is returned.

The cellCoordinates value is a tuple of labels.
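The lookup behaviour of findCellCoordinates() and getConfusionHypercubeCell() can be sketched as below, assuming a dictionary keyed by comma-separated integer coordinates. Both function names are hypothetical stand-ins, and returning None for an unknown index is this sketch's choice rather than documented behaviour.

```python
def find_cell_coordinates(hypercube, index):
    """Return the coordinates of the cell containing index, or None."""
    for key, members in hypercube.items():
        if index in members:
            return [int(part) for part in key.split(",")]
    return None


def get_cell(hypercube, cell_coordinates):
    """Return the members of a cell, or an empty list if it does not exist."""
    key = ",".join(str(c) for c in cell_coordinates)
    return hypercube.get(key, [])


hypercube = {"0,0": [0, 1], "0,1": [2], "1,1": [3, 4, 5]}
coords = find_cell_coordinates(hypercube, 4)  # [1, 1]
neighbours = get_cell(hypercube, coords)      # [3, 4, 5]
missing = get_cell(hypercube, (1, 0))         # []
```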

getCounts

getCounts ( self )

getDimensionalLabeling

getDimensionalLabeling ( self )

Returns a list containing the labels for each dimension along the hypercube.

getHypercubeCounts

getHypercubeCounts ( self )

getInverseStarburst

getInverseStarburst ( self,  index )

Similar to getStarburst(), but returns a list of cell data not along the dimension axis from the cell containing index. This is not a proper inverse since this set also does not contain the cell to which index belongs.

This operation tells what data is unrelated to index.

getNumElements

getNumElements ( self )

Returns the total number of elements stored in the confusion hypercube.

getStarburst

getStarburst ( self,  index )

Returns a list of lists containing all the elements in the starburst centered on the cell containing index. The starburst can be conceptualized as a series of rays expanding along each dimension of the hypercube from the center point. For each cell these rays touch, if it contains data, that data is appended to a list.

This operation is useful to determine what data, while not perfectly associated with index, is considered related to some degree.
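One way to picture the starburst and its inverse: a cell lies on a ray if its coordinates differ from the center along exactly one axis, and it belongs to the inverse starburst if it differs along more than one. The sketch below rests on that reading. It uses tuple keys for brevity, groups the starburst into one list per axis, and excludes the center cell; all of these are assumptions, as are the function names.

```python
def differing_axes(center, other):
    """Return the axes along which two coordinate tuples differ."""
    return [axis for axis, (a, b) in enumerate(zip(center, other)) if a != b]


def starburst(hypercube, center):
    """Collect, per axis, the data of cells differing from center on that axis only."""
    rays = [[] for _ in center]
    for coords, members in sorted(hypercube.items()):
        axes = differing_axes(center, coords)
        if len(axes) == 1:
            rays[axes[0]].extend(members)
    return rays


def inverse_starburst(hypercube, center):
    """Collect the data of cells that lie on no ray and are not the center."""
    return [members for coords, members in sorted(hypercube.items())
            if len(differing_axes(center, coords)) > 1]


hypercube = {(0, 0): [0, 1], (0, 1): [2], (1, 0): [3], (1, 1): [4, 5]}
related = starburst(hypercube, (0, 0))            # [[3], [2]]
unrelated = inverse_starburst(hypercube, (0, 0))  # [[4, 5]]
```

In the 2D case the starburst is the row and column through the center cell, and the inverse is everything else except the center itself.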

linearAssignment

linearAssignment ( self )

Returns the linear assignment score for a given matrix.
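The exact definition of the score is not given on this page. A common interpretation, assumed here, is the largest number of elements that can be covered by a one-to-one matching of row classes to column classes, expressed as a fraction of all elements. The brute-force sketch below uses a hypothetical function name and is only practical for small matrices.

```python
from itertools import permutations


def linear_assignment_score(counts):
    """Best one-to-one row/column matching, as a fraction of all elements."""
    n_rows, n_cols = len(counts), len(counts[0])
    if n_rows > n_cols:
        # Transpose so rows are the shorter side; the score is unchanged.
        counts = [list(col) for col in zip(*counts)]
        n_rows, n_cols = n_cols, n_rows
    total = sum(sum(row) for row in counts)
    # Try every way of giving each row its own distinct column.
    best = max(
        sum(counts[r][c] for r, c in enumerate(cols))
        for cols in permutations(range(n_cols), n_rows)
    )
    return best / total if total else 0.0


score = linear_assignment_score([[5, 1], [2, 4]])  # (5 + 4) / 12 = 0.75
```

Enumerating permutations grows factorially with the number of classes; for larger matrices a polynomial-time solver such as the Hungarian algorithm would be the usual choice.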

printCounts

printCounts (
        self,
        labels=0,
        outputStream=sys.stdout,
        )

Pretty-prints the confusion matrix, optionally labeling the axes. If you pass in labels=0, then no labels will be printed. outputStream allows the output of the function to be redirected to any open stream.

projectConfusionHypercube

projectConfusionHypercube ( self,  labels )

Returns another hypercube built from the dimensions of the original confusion hypercube specified by labels. Equivalent to projecting the hypercube onto the dimensions passed.
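Projection can be pictured as dropping the unwanted coordinates from every cell key and merging the cells that collide. The sketch below selects dimensions by axis position for simplicity, whereas the method itself takes axis labels. The function name project is hypothetical.

```python
from collections import defaultdict


def project(hypercube, keep_axes):
    """Merge cells that agree on the kept axes, dropping all other axes."""
    projected = defaultdict(list)
    for key, members in hypercube.items():
        coords = key.split(",")
        new_key = ",".join(coords[axis] for axis in keep_axes)
        projected[new_key].extend(members)
    return {key: sorted(members) for key, members in projected.items()}


cube = {"0,0,2": [0], "0,1,2": [1], "1,1,2": [2], "1,1,0": [3]}
first_two = project(cube, [0, 1])  # {'0,0': [0], '0,1': [1], '1,1': [2, 3]}
last_only = project(cube, [2])     # {'2': [0, 1, 2], '0': [3]}
```

Projecting a three-way hypercube onto two of its dimensions yields the ordinary confusion matrix between those two labelings.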

removeCellFromHypercube

removeCellFromHypercube ( self,  cellCoordinates )

Given a cell's coordinates, removes that cell from the hypercube.

removeIndexFromHypercube

removeIndexFromHypercube ( self,  index )

transposeNMI

transposeNMI ( self )

Returns the NMI score of the transposed confusion matrix.

This document was automatically generated on Wed Aug 27 14:25:10 2003 by HappyDoc version 2.1