To get a report of the high and low extreme point sets for a given principal component, you can call the pcaGinzu.getOutputForPCNOutliers() function. This function takes as input the principal component number (in 1-origin indexing) that you want to view, plus a list of row labeling names that you want to see included in the report. These labelings need to be row labelings, such as those created by the labeling.labelRows function (described in the data loading section of this tutorial).
pc1egs = pcaGinzu.getOutputForPCNOutliers(1, ['cho clusters', 'diagem clusters', 'common name'])
The list of lists returned by that command should contain the same information as the following table:
PC-1 10 High/Low | PC-1 Value | diagem clusters | cho clusters | common name |
high | 7.66496608144 | 4 | M | WSC4 |
high | 4.17705992859 | 4 | M | YOL019W |
high | 3.87001547533 | 4 | M | HOF1 |
high | 3.84837129339 | 5 | Early G1 | SUR1 |
high | 3.35847421047 | 4 | M | BUB3 |
high | 3.34419451262 | 4 | M | CDC5 |
high | 3.34037443779 | 4 | M | YML034W |
high | 3.27043501501 | 4 | M | COT1 |
high | 2.82125356759 | 4 | M | HDR1 |
high | 2.80269836671 | 4 | G2 | YIL158W |
low | -3.26930983481 | 2 | Late G1 | HO |
low | -3.39173779979 | 2 | Late G1 | RNR1 |
low | -3.44848612811 | 2 | Late G1 | YLR183C |
low | -3.56081559301 | 2 | Late G1 | CDC54 |
low | -3.59901879069 | 2 | Late G1 | YOR144C |
low | -3.60057680134 | 2 | Late G1 | HST3 |
low | -3.64414373489 | 2 | Late G1 | TOF1 |
low | -3.79017028645 | 2 | Late G1 | SPH1 |
low | -3.86578252915 | 2 | Late G1 | YPL264C |
low | -3.93550058084 | 2 | Late G1 | HHO1 |
Interestingly, principal component 1 appears to help differentiate between the M phase and the Late G1 phase of the cell cycle: nine of the ten high extreme genes fall in the M or G2 cho clusters, while all ten low extreme genes fall in Late G1. A similar report for principal component 2 suggests that PC2 discriminates between Early G1 and S/G2 phase (which corresponds well with the PC2 extreme gene trajectory plots generated earlier).
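The observation above can be checked by tallying the cell cycle labels on each side of the report. The following is a minimal sketch, not part of pcaGinzu: it assumes each row of the report is laid out in the same column order as the table (high/low, PC value, diagem cluster, cho cluster, common name) and that all cells are strings. The sample rows are copied from the table; the helper name tally_phases is hypothetical.

```python
from collections import Counter

# A few rows copied from the PC-1 table above. The column order
# [high/low, PC-1 value, diagem cluster, cho cluster, common name]
# is an assumption about how the returned list of lists is laid out.
sample_rows = [
    ['high', '7.66496608144', '4', 'M', 'WSC4'],
    ['high', '3.84837129339', '5', 'Early G1', 'SUR1'],
    ['high', '2.80269836671', '4', 'G2', 'YIL158W'],
    ['low', '-3.26930983481', '2', 'Late G1', 'HO'],
    ['low', '-3.93550058084', '2', 'Late G1', 'HHO1'],
]

def tally_phases(rows, side, phase_col=3):
    """Count the cho cluster labels among rows marked 'high' or 'low'."""
    return Counter(row[phase_col] for row in rows if row[0] == side)

high_counts = tally_phases(sample_rows, 'high')
low_counts = tally_phases(sample_rows, 'low')
print(high_counts)
print(low_counts)
```

Passing the full list returned by getOutputForPCNOutliers() in place of sample_rows should give the complete tally, provided the layout assumption holds. A header row, if present, is ignored because its first cell is neither 'high' nor 'low'.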
The getOutputForPCNOutliers() function returns the above result as a Python list of lists. We provide a convenience function called pcaGinzu.write2DStringArrayToFile() if you want to write this result to a file. This function takes the result of the getOutput command, a file name, and an optional delimiter (which defaults to tab). Here is an example showing how to call this output function:
pcaGinzu.write2DStringArrayToFile(pc1egs, 'pc1-extreme-genes.txt')
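For readers who want to see what writing a list of lists to a delimited file involves, here is a minimal sketch using only the Python standard library. It is not the pcaGinzu implementation; the function name write_rows and its behavior (one line per row, cells converted to strings, tab as the default delimiter) are assumptions chosen to mirror the description above.

```python
import csv

def write_rows(rows, filename, delimiter='\t'):
    """Write a list of lists to a text file, one row per line.

    Hypothetical stand-in for illustration only; the real
    write2DStringArrayToFile() may differ in its details.
    """
    with open(filename, 'w', newline='') as handle:
        writer = csv.writer(handle, delimiter=delimiter, lineterminator='\n')
        for row in rows:
            # Convert every cell to a string so numeric values are accepted too.
            writer.writerow([str(cell) for cell in row])
```

Calling write_rows(pc1egs, 'pc1-extreme-genes.txt') would then produce a tab-delimited file that opens directly in a spreadsheet; passing delimiter=',' would produce a comma-separated file instead.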
Later we will show how to generate the extreme point lists for all principal components in a batch operation mode.
Joe Roden 2005-12-13