Now that we have identified the points (e.g. genes) that occupy the extremes of a particular principal

There are two extreme point trajectory plots. The first shows the extreme point trajectories in the dataset's original column order. The second shows the same trajectories with the columns (the x-axis) reordered to emphasize the dimensions (conditions) in which the high and low extreme points are most separated.

The first trajectory plot keeps the original column order. Remember that this IPlot-based API uses 0-based indexing, so the second principal component is index 1. The following call plots the extreme point trajectories for the second principal component in original order:

ipcaginzu.plotPCNOutlierRowsInOriginalColumnOrder(1)

In the resulting figure, it's clear that the high genes at one end of principal component 2 are co-expressed in a specific phase of the cell cycle, and the low genes at the other end of the PC2 axis are co-expressed in the opposite phase of the cell cycle.

These plots are interactive: clicking a vertex on any single trajectory shows the identity of that specific gene.

The second trajectory plot orders the columns by the difference between the mean of the "high" vectors (red) and the mean of the "low" vectors (blue), that is, mean(high) - mean(low), in order to emphasize the conditions that most affect that principal component. The following call creates a plot of the extreme point trajectories in that order for the second principal component:

ipcaginzu.plotPCNOutlierRowsInSigGroupOrder(1)

This plot becomes much more meaningful once you know the identities of the reordered conditions. This particular plot's x-axis does not (yet) have the ability to show those identities, but the ordering is available in the PCA interpretation text output, as well as in the equivalent matplotlib plotting function shown later in the tutorial.
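
To make the mean(high) - mean(low) ordering concrete, here is a minimal sketch in plain Python. It is not the ipcaginzu implementation: the function names, the toy data, and the condition labels are all hypothetical, and sorting from largest to smallest difference is an assumption, since the text above does not state the sort direction.

```python
from statistics import fmean


def column_means(rows):
    """Mean of each column across a list of equal-length rows."""
    return [fmean(col) for col in zip(*rows)]


def sig_group_column_order(high_rows, low_rows):
    """Return 0-based column indices sorted by mean(high) - mean(low).

    Assumption: largest difference first (the tutorial does not say which
    direction the real plot uses).
    """
    high_mean = column_means(high_rows)
    low_mean = column_means(low_rows)
    diff = [h - l for h, l in zip(high_mean, low_mean)]
    return sorted(range(len(diff)), key=lambda j: diff[j], reverse=True)


# Hypothetical toy data: rows are extreme genes, columns are conditions.
conditions = ["cond_a", "cond_b", "cond_c"]
high = [[1.0, 5.0, 2.0], [1.0, 7.0, 2.0]]  # "high" extreme points
low = [[0.0, 1.0, 4.0], [2.0, 1.0, 4.0]]   # "low" extreme points

order = sig_group_column_order(high, low)

# Apply the same column order to the data and to the condition labels.
reordered_conditions = [conditions[j] for j in order]
reordered_high = [[row[j] for j in order] for row in high]
reordered_low = [[row[j] for j in order] for row in low]
```

Applying the same index list to the condition labels recovers the identities of the reordered conditions, which is the information the interactive plot's x-axis does not yet display.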

Joe Roden 2005-12-13