View outlier list

getOutputForPCNOutliers takes the number of the principal component you want to view and a list of the labeling names to include in the report. These labelings must have row labels, such as those created by labeling.labelRows at the start of the tutorial.

pcaginsu.getOutputForPCNOutliers(1, ['cho_clustering', 'em', 'names'])

The result of that command should contain the same information as the following table.

PC-1 10 High/Low  PC-1 Value      em  cho_clustering  names
high              7.66496608144   4   M               WSC4
high              4.17705992859   4   M               YOL019W
high              3.87001547533   4   M               HOF1
high              3.84837129339   5   Early G1        SUR1
high              3.35847421047   4   M               BUB3
high              3.34419451262   4   M               CDC5
high              3.34037443779   4   M               YML034W
high              3.27043501501   4   M               COT1
high              2.82125356759   4   M               HDR1
high              2.80269836671   4   G2              YIL158W
low               -3.26930983481  2   Late G1         HO
low               -3.39173779979  2   Late G1         RNR1
low               -3.44848612811  2   Late G1         YLR183C
low               -3.56081559301  2   Late G1         CDC54
low               -3.59901879069  2   Late G1         YOR144C
low               -3.60057680134  2   Late G1         HST3
low               -3.64414373489  2   Late G1         TOF1
low               -3.79017028645  2   Late G1         SPH1
low               -3.86578252915  2   Late G1         YPL264C
low               -3.93550058084  2   Late G1         HHO1

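Interestingly, principal component one appears to help differentiate between M phase and the Late G1 phase of the cell cycle: eight of the ten high outliers are labeled M in cho_clustering, and all ten low outliers are labeled Late G1.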
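The split between the two ends of the component can be checked by tallying the cho_clustering label on each side. The sketch below is not part of the tutorial's API. It assumes the outlier rows are available as lists of strings in the column order of the table above, and tally_phase is a hypothetical helper name used only for illustration.

```python
from collections import Counter

# Rows copied from the table above:
# [high/low, PC-1 value, em, cho_clustering, name].
# Assumption: the real output uses this column order; verify before reuse.
rows = [
    ["high", "7.66496608144", "4", "M", "WSC4"],
    ["high", "4.17705992859", "4", "M", "YOL019W"],
    ["high", "3.87001547533", "4", "M", "HOF1"],
    ["high", "3.84837129339", "5", "Early G1", "SUR1"],
    ["high", "3.35847421047", "4", "M", "BUB3"],
    ["high", "3.34419451262", "4", "M", "CDC5"],
    ["high", "3.34037443779", "4", "M", "YML034W"],
    ["high", "3.27043501501", "4", "M", "COT1"],
    ["high", "2.82125356759", "4", "M", "HDR1"],
    ["high", "2.80269836671", "4", "G2", "YIL158W"],
    ["low", "-3.26930983481", "2", "Late G1", "HO"],
    ["low", "-3.39173779979", "2", "Late G1", "RNR1"],
    ["low", "-3.44848612811", "2", "Late G1", "YLR183C"],
    ["low", "-3.56081559301", "2", "Late G1", "CDC54"],
    ["low", "-3.59901879069", "2", "Late G1", "YOR144C"],
    ["low", "-3.60057680134", "2", "Late G1", "HST3"],
    ["low", "-3.64414373489", "2", "Late G1", "TOF1"],
    ["low", "-3.79017028645", "2", "Late G1", "SPH1"],
    ["low", "-3.86578252915", "2", "Late G1", "YPL264C"],
    ["low", "-3.93550058084", "2", "Late G1", "HHO1"],
]


def tally_phase(rows, side, phase_column=3):
    """Count cho_clustering labels among the rows on one side ('high' or 'low')."""
    return Counter(row[phase_column] for row in rows if row[0] == side)


high_counts = tally_phase(rows, "high")
low_counts = tally_phase(rows, "low")
print("high:", dict(high_counts))
print("low:", dict(low_counts))
```

The same tally on column 2 (the em cluster) would show whether the em labeling separates the two ends in a similar way.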

A convenience function, pcaGinsu.write2DStringArrayToFile, is provided for saving the output of the getOutput commands. It takes the result of a getOutput command, a file name, and an optional delimiter (which defaults to tab).

outliers = pcaginsu.getOutputForPCNOutliers(1, ['cho_clustering', 'em', 'names'])
pcaGinsu.write2DStringArrayToFile(outliers, 'pca-outliers-1.txt')
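
To check what was saved, the file can be read back with Python's standard csv module. This is only a sketch under assumptions: it supposes the saved file is plain text with one row per line and tab-separated fields, as the default delimiter suggests. The write_rows helper is a hypothetical stand-in for write2DStringArrayToFile so that the example runs without the library; it is not the library's actual implementation.

```python
import csv
import os
import tempfile


def write_rows(rows, path, delimiter="\t"):
    """Hypothetical stand-in: write a 2D list of strings as delimited text."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
        writer.writerows(rows)


def read_rows(path, delimiter="\t"):
    """Read a delimited text file back into a 2D list of strings."""
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter=delimiter)]


# Two rows copied from the outlier table, used as sample data.
sample = [
    ["high", "7.66496608144", "4", "M", "WSC4"],
    ["low", "-3.93550058084", "2", "Late G1", "HHO1"],
]

# Write to a temporary directory so the example leaves the working directory alone.
path = os.path.join(tempfile.mkdtemp(), "pca-outliers-1.txt")
write_rows(sample, path)
restored = read_rows(path)
print(restored == sample)
```

Labels such as "Late G1" contain a space, so tab is a safer delimiter than a space when the file will be parsed again later.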

Brandon King 2005-07-29