The scoreColumnLabelingsForPCN() function attempts to identify any column labelings that correlate well with a particular principal component's condition partitioning.

Unfortunately, the Cho yeast cell cycling dataset has only one covariate, time (the "time points" column labeling), so it is not a good example of this feature. Furthermore, time was loaded as a discrete covariate, so we cannot properly analyze what is naturally a continuous covariate unless we explicitly define a new column labeling containing time as a numeric covariate. Nonetheless, we can try it as a discrete variable to see what happens.

The function that computes covariate scores for a specific principal component is called as follows:

pcaginzu.scoreColumnLabelingsForPCN(1)

This returns a ColumnScore object holding one discrete covariate correlation score and three continuous covariate correlation scores.

We can determine which principal component best correlates with time by computing the score for every principal component. For example:

for i in range(1, pcaginzu.rowPCAView.numCols+1):
    print i, pcaginzu.scoreColumnLabelingsForPCN(i)[0].scores

The above commands result in the following output:

1 0.682976972108
2 0.686202158131
3 0.625067714467
4 0.657751569535
5 0.582238520955
6 0.539481092126
7 0.602061519391
8 0.563922061919
9 0.0
10 0.539481092126
11 0.539481092126
12 0.539481092126
13 0.0
14 0.0
15 0.539481092126
16 0.0
17 0.539481092126

Principal component 2 has the highest score (0.686), so it appears to generate the condition partitioning that best explains time in this dataset, although principal component 1 (0.683) is a close second.
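Picking the best component by eye gets tedious as the number of components grows. The following is a minimal sketch of how the comparison could be automated, assuming the scores have already been collected into a plain dictionary keyed by principal component number. Here the values are copied by hand from the output listed above; rank_components is a hypothetical helper written for this illustration and is not part of pcaginzu.

```python
# Scores copied by hand from the output listed above,
# keyed by principal component number (assumed layout, not a pcaginzu structure).
scores = {
    1: 0.682976972108, 2: 0.686202158131, 3: 0.625067714467,
    4: 0.657751569535, 5: 0.582238520955, 6: 0.539481092126,
    7: 0.602061519391, 8: 0.563922061919, 9: 0.0,
    10: 0.539481092126, 11: 0.539481092126, 12: 0.539481092126,
    13: 0.0, 14: 0.0, 15: 0.539481092126, 16: 0.0,
    17: 0.539481092126,
}


def rank_components(component_scores):
    """Hypothetical helper: return (component, score) pairs sorted by
    descending score, breaking ties by the lower component number."""
    return sorted(component_scores.items(),
                  key=lambda item: (-item[1], item[0]))


ranked = rank_components(scores)
best_pc, best_score = ranked[0]
runner_up_pc, runner_up_score = ranked[1]

# The margin shows how decisively the top component wins.
margin = best_score - runner_up_score

print("best: PC %d (%.4f)" % (best_pc, best_score))
print("runner-up: PC %d (%.4f)" % (runner_up_pc, runner_up_score))
print("margin: %.4f" % margin)
```

Reporting the margin alongside the winner is a deliberate choice: when the top two scores are nearly equal, as they are here, the ranking alone can overstate how much better one component explains the covariate.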

Joe Roden 2005-12-13