The main purpose of the ConfusionMatrix class is to construct a confusion
matrix between two (or more) labelings of a dataset and to perform analysis
on that confusion matrix.
Methods
|
|
|
|
NMI
|
NMI ( self )
Returns the NMI score of the confusion matrix.
|
|
__countHelper
|
__countHelper (
self,
dims,
partialKey,
)
Recursive subroutine which traverses the hypercube along each
dimension and creates a nested list structure of the counts of the
cells. This is an O(n^d) algorithm, where d is the number of dimensions
of the hypercube and n is the magnitude of each dimension. Thus, this
routine should be called sparingly.
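The recursion described above can be sketched as follows. This is a minimal illustration, not the class's own code: the standalone function count_cells, its argument names, and the sample data are assumptions. Only the comma-separated key format is taken from the self.hypercube description.

```python
def count_cells(hypercube, dims, partial_key=()):
    """Return a nested list of cell counts for a dict-based hypercube.

    hypercube:   dict mapping keys such as '1,2,3' to lists of element indices
    dims:        list with the magnitude of each dimension
    partial_key: coordinates fixed so far (hypothetical helper argument)
    """
    depth = len(partial_key)
    if depth == len(dims):
        # All coordinates are fixed: look up the cell; a missing cell counts 0.
        key = ",".join(str(c) for c in partial_key)
        return len(hypercube.get(key, []))
    # Fix one more coordinate and recurse. Every one of the n^d cells is
    # visited, populated or not, which is where the O(n^d) cost comes from.
    return [count_cells(hypercube, dims, partial_key + (i,))
            for i in range(dims[depth])]


cube = {"0,0": [0, 3], "0,1": [1], "1,1": [2, 4, 5]}
counts = count_cells(cube, [2, 2])
print(counts)  # [[2, 1], [0, 3]]
```

Because the traversal touches empty cells as well, its cost depends on the size of the full hypercube rather than on the number of stored elements.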
|
|
__init__
|
__init__ ( self )
#
# self.dimensions:
# a list containing the magnitude along each dimension. This
# corresponds to the number of classes in a given Labeling
#
# self.hypercube:
# a dictionary containing the confusion hypercube. Cells are stored
# by index. The coordinates of a cell in N-dimensional space are
# represented by a comma-separated list, i.e.
# hypercube[1][2][3] <=> hypercube['1,2,3']
#
# self.numElements:
# The total number of elements (genes) stored in the confusion
# matrix
#
# self.dimensionLabeling:
# a Labeling for the axis of the dimension of the Hypercube.
# Operations on the confusion hypercube should make use of axis
# labels, not indexes
#
# self.index2cell:
# The index of a label in the labeling into this list will return
# the string key of its cell in the hypercube
#
# self.rowClassNames & self.colClassNames:
# A 2-way dictionary which maps the class numbers to their names
# and vice-versa. Aliases of self.classNames[0] and
# self.classNames[1]
#
# self.classNames
# A list of class name <-> class number dictionaries, one for
# each dimension
#
|
|
averageNMI
|
averageNMI ( self )
Returns the average NMI score between the confusion matrix and its
transpose.
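The relationship between NMI, transposeNMI and averageNMI can be sketched on a plain matrix of counts. The documentation does not state which normalization the class uses, so the one below (mutual information divided by the entropy of the row labeling) is an assumption chosen because it makes the score of a matrix differ from that of its transpose. All function names are hypothetical.

```python
import math


def entropy(counts):
    """Shannon entropy (natural log) of a list of non-negative counts."""
    total = sum(counts)
    return -sum(c / total * math.log(c / total) for c in counts if c > 0)


def mutual_information(matrix):
    """Mutual information between the row and column labelings."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    mi = 0.0
    for r, row in enumerate(matrix):
        for c, n in enumerate(row):
            if n > 0:
                mi += n / total * math.log(n * total / (row_sums[r] * col_sums[c]))
    return mi


def nmi(matrix):
    # Assumed normalization: divide by the entropy of the row labeling.
    # Undefined (division by zero) when all elements fall in a single row.
    return mutual_information(matrix) / entropy([sum(row) for row in matrix])


def average_nmi(matrix):
    # Score the matrix and its transpose, then take the mean of the two.
    transposed = [list(col) for col in zip(*matrix)]
    return (nmi(matrix) + nmi(transposed)) / 2


perfect = [[2, 0], [0, 2]]
print(round(average_nmi(perfect), 6))  # 1.0
```

With a symmetric normalization the matrix and its transpose would score the same and the average would add nothing, which is why an asymmetric one is assumed here.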
|
|
createConfusionHypercubeFromLabeling
|
createConfusionHypercubeFromLabeling ( self, labelings )
A Confusion Hypercube is a generalization of the confusion matrix which
allows for any number of labelings to be analyzed at the same time.
This is a realization of the full Cartesian product of the classes
defined in the labelings.
The ability to ask questions about more than two datasets at a time
is a valuable tool and can be used for sophisticated analysis.
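A minimal sketch of the idea, assuming each labeling is a plain sequence of class numbers aligned by element index. The function build_hypercube and the sample data are hypothetical; the comma-separated cell keys follow the self.hypercube description in __init__.

```python
from collections import defaultdict


def build_hypercube(labelings):
    """Build a dict-based confusion hypercube from parallel label lists.

    labelings: list of equal-length sequences; labelings[d][i] is the class
    that labeling d assigns to element i.
    """
    if len({len(lab) for lab in labelings}) > 1:
        raise ValueError("all labelings must cover the same elements")
    cube = defaultdict(list)
    # zip(*labelings) yields, for each element, its class in every labeling.
    for index, classes in enumerate(zip(*labelings)):
        key = ",".join(str(c) for c in classes)
        cube[key].append(index)
    return dict(cube)


labelings = [
    [0, 0, 1, 1],  # labeling A
    [0, 1, 1, 1],  # labeling B
    [2, 2, 2, 0],  # labeling C
]
cube = build_hypercube(labelings)
print(cube)  # {'0,0,2': [0], '0,1,2': [1], '1,1,2': [2], '1,1,0': [3]}
```

In this sketch only populated cells are stored, so the dictionary holds at most one entry per element even though the full Cartesian product may contain many more cells.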
|
|
createConfusionMatrixFromFile
|
createConfusionMatrixFromFile (
self,
clusteringFile1,
clusteringFile2,
)
A confusion matrix is constructed from the two clustered
files clusteringFile1 and clusteringFile2. clusteringFile1
clusters are arranged across the rows and clusteringFile2
clusters are arranged across the columns of the confusion
matrix. We assume each file contains a list of cluster labels,
one per line. The confusion matrix will only be constructed for
data which is shared between the two clusterings.
The constructed confusion matrix, instead of storing straight
numeric counts, stores lists of elements shared between each
pair of clusters. This adds quite a bit of exploratory power
to the confusion matrix.
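The following sketch shows a matrix whose cells hold element lists rather than counts. It does not parse files: the two dictionaries mapping an element id to a cluster label are a hypothetical in-memory stand-in for clusteringFile1 and clusteringFile2, and the function name is an assumption.

```python
def confusion_matrix_of_elements(clustering1, clustering2):
    """Return (rows, cols, matrix); matrix[r][c] lists the shared elements.

    clustering1, clustering2: dicts mapping an element id to its cluster
    label (hypothetical stand-ins for the two clustering files).
    """
    # Only elements present in both clusterings take part.
    shared = sorted(set(clustering1) & set(clustering2))
    rows = sorted({clustering1[e] for e in shared})
    cols = sorted({clustering2[e] for e in shared})
    matrix = [[[] for _ in cols] for _ in rows]
    for element in shared:
        r = rows.index(clustering1[element])
        c = cols.index(clustering2[element])
        matrix[r][c].append(element)
    return rows, cols, matrix


first = {"g1": "A", "g2": "A", "g3": "B", "g4": "B"}
second = {"g1": "X", "g2": "Y", "g3": "Y", "g5": "X"}
rows, cols, matrix = confusion_matrix_of_elements(first, second)
print(matrix)  # [[['g1'], ['g2']], [[], ['g3']]]

# Plain counts remain available by taking the length of each cell.
counts = [[len(cell) for cell in row] for row in matrix]
print(counts)  # [[1, 1], [0, 1]]
```

Keeping the element lists means one can ask which elements two clusters have in common, not just how many.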
|
|
createConfusionMatrixFromLabeling
|
createConfusionMatrixFromLabeling (
self,
labeling1,
labeling2,
)
A confusion matrix is constructed from the two labelings
labeling1 and labeling2.
|
|
findCellCoordinates
|
findCellCoordinates ( self, index )
Finds and returns the coordinates of the cell which contains the index
in question. The coordinates are returned as a list of integers
suitable to be passed to getConfusionHypercubeCell().
|
|
getAdjacencyList
|
getAdjacencyList ( self )
Returns a list of tuples indicating which classes correspond with
each other.
|
|
getAdjacencyMatrix
|
getAdjacencyMatrix ( self )
Returns a numeric array in which an entry is 1 where the clusters
given by that entry's indices are corresponding clusters, and zero
everywhere else.
|
|
getAgreementList
|
getAgreementList ( self )
Returns a list of numElements of which each entry contains either
a 1 or 0 depending on whether or not the dimensions of the hypercube
agree on the point's classification. This method is only valid
for 2D confusion matrices.
A perfect agreement would return a list of all 1's.
|
|
getConfusionHypercubeCell
|
getConfusionHypercubeCell ( self, cellCoordinates )
Returns the list of indices held in a node of the hypercube. If the
node does not exist, an empty list is returned.
The cellCoordinates value is a tuple of labels.
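A short sketch of how a cell lookup and findCellCoordinates() could fit together on a dictionary keyed by comma-separated coordinates. Both functions are hypothetical stand-ins for the methods, and integer class coordinates are assumed.

```python
def find_cell_coordinates(hypercube, index):
    """Return the coordinates of the cell holding index as a list of ints."""
    for key, members in hypercube.items():
        if index in members:
            return [int(c) for c in key.split(",")]
    return None  # index is not stored anywhere (assumed behavior)


def get_cell(hypercube, cell_coordinates):
    """Return the indices stored in a cell, or [] if the cell is absent."""
    key = ",".join(str(c) for c in cell_coordinates)
    return list(hypercube.get(key, []))


cube = {"0,0": [0, 3], "0,1": [1], "1,1": [2, 4, 5]}
coords = find_cell_coordinates(cube, 4)
print(coords)                  # [1, 1]
print(get_cell(cube, coords))  # [2, 4, 5]
print(get_cell(cube, (1, 0)))  # []
```

Returning a copy of the stored list keeps callers from modifying the hypercube by accident; whether the class does the same is not stated.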
|
|
getCounts
|
getCounts ( self )
|
|
getDimensionalLabeling
|
getDimensionalLabeling ( self )
Returns a list containing the labels for each dimension along the
hypercube.
|
|
getHypercubeCounts
|
getHypercubeCounts ( self )
|
|
getInverseStarburst
|
getInverseStarburst ( self, index )
Similar to getStarburst(), but returns a list of cell data not along
the dimension axes from the cell containing index. This is not a
proper inverse since this set also does not contain the cell to which
index belongs. This operation tells what data is unrelated to index.
|
|
getNumElements
|
getNumElements ( self )
Returns the total number of elements stored in the confusion hypercube.
|
|
getStarburst
|
getStarburst ( self, index )
Returns a list of lists containing all the elements in the starburst
centered on the cell containing index. The starburst can be
conceptualized as a series of rays expanding along each dimension of
the hypercube from the center point. For each cell these rays touch,
if it contains data, that data is appended to a list.
This operation is useful to determine what data, while not perfectly
associated with index, is considered related to some degree.
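The starburst can be sketched on a dictionary keyed by comma-separated coordinates. A cell lies on a ray when it differs from the center cell in exactly one coordinate. The function name is hypothetical, and leaving the center cell itself out of the result is an assumption based on the getInverseStarburst() description.

```python
def starburst(hypercube, index):
    """Return the data of every populated cell on a ray through index's cell.

    hypercube: dict mapping keys such as '1,2,3' to lists of element indices.
    Raises StopIteration if index is not stored in any cell.
    """
    center = next(key for key, members in hypercube.items() if index in members)
    center_coords = center.split(",")
    related = []
    # Scanning the populated cells gives the same result as walking each ray,
    # because empty cells contribute nothing.
    for key, members in hypercube.items():
        coords = key.split(",")
        differing = sum(a != b for a, b in zip(coords, center_coords))
        if differing == 1 and members:
            related.append(list(members))
    return related


cube = {"0,0": [0, 3], "0,1": [1], "1,0": [2], "1,1": [4, 5]}
print(starburst(cube, 0))  # [[1], [2]]
```

Cell '1,1' is excluded because it differs from the center '0,0' in both coordinates, so no single ray reaches it.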
|
|
linearAssignment
|
linearAssignment ( self )
Returns the linear assignment score for a given matrix.
|
|
printCounts
|
printCounts (
self,
labels=0,
outputStream=sys.stdout,
)
Pretty-prints the confusion matrix, including labels for the
axes. If you pass in labels=0, then no labels will be printed.
outputStream allows the output of the function to be redirected to
any open stream.
|
|
projectConfusionHypercube
|
projectConfusionHypercube ( self, labels )
Returns another hypercube built from the dimensions of the original
confusion hypercube specified by labels. Equivalent to projecting
the hypercube onto the dimensions passed.
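Projection can be sketched as merging all cells that differ only in the dropped dimensions. The method itself takes axis labels; the hypothetical function below takes dimension positions instead to keep the example short, and the sample data is invented.

```python
from collections import defaultdict


def project_hypercube(hypercube, keep):
    """Project a dict-based hypercube onto the dimension positions in keep.

    hypercube: dict mapping keys such as '1,2,3' to lists of element indices
    keep:      positions of the dimensions to retain, e.g. [0, 2]
    """
    projected = defaultdict(list)
    for key, members in hypercube.items():
        coords = key.split(",")
        new_key = ",".join(coords[d] for d in keep)
        # Cells that differ only in dropped dimensions merge into one cell.
        projected[new_key].extend(members)
    return dict(projected)


cube = {"0,0,2": [0], "0,1,2": [1], "1,1,2": [2], "1,1,0": [3]}
print(project_hypercube(cube, [0, 2]))
# {'0,2': [0, 1], '1,2': [2], '1,0': [3]}
```

Projecting a three-dimensional hypercube onto two of its dimensions this way yields an ordinary confusion matrix between those two labelings.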
|
|
removeCellFromHypercube
|
removeCellFromHypercube ( self, cellCoordinates )
Given a cell's coordinates, removes that cell from the hypercube.
|
|
removeIndexFromHypercube
|
removeIndexFromHypercube ( self, index )
|
|
transposeNMI
|
transposeNMI ( self )
Returns the NMI score of the transposed confusion matrix.
|