Load a dataset

For this tutorial, we will start with the Cho yeast cell-cycle dataset ([Cho et al., 1998]) because it is fairly small and easy to work with.

We have added new utility functions to make it very easy to load a number of example datasets into CompClust. These example loading functions download the original data and annotation files from remote web locations, load them into memory, attach row and column labels (e.g. gene names, condition covariates, etc.), and return a Dataset object that can be used for subsequent analyses. The example loading functions save a local copy of the resulting dataset in your home directory to speed subsequent dataset loading.
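The download-once, reuse-the-local-copy behaviour described above can be pictured with a short, generic sketch. This is not CompClust's implementation: the function name load_cached, the build callback, and the toy dataset are hypothetical, and the code uses only the Python 3 standard library.

```python
import os
import pickle
import tempfile


def load_cached(name, build, cache_dir):
    """Return the object cached under `name`, building and saving it on first use.

    `build` is a hypothetical zero-argument callable standing in for the
    expensive step (downloading files, parsing them, attaching labels).
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name + ".pkl")
    if os.path.exists(path):
        # Fast path: reuse the local copy saved by an earlier call.
        with open(path, "rb") as handle:
            return pickle.load(handle)
    # Slow path: build the object, then save a local copy for next time.
    data = build()
    with open(path, "wb") as handle:
        pickle.dump(data, handle)
    return data


calls = []  # records how many times the expensive step actually runs


def build_toy_dataset():
    """Stand-in for an expensive download-and-parse step (toy values)."""
    calls.append(1)
    return {"rows": ["gene_a", "gene_b"], "values": [[1.0, 2.0], [3.0, 4.0]]}


cache_dir = tempfile.mkdtemp()  # a real loader would use a fixed directory
first = load_cached("toy", build_toy_dataset, cache_dir)
second = load_cached("toy", build_toy_dataset, cache_dir)
```

The second call returns the same data without running the build step again, which is why repeated loads are faster than the first.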

We can use one of CompClust's example loading commands to get the Cho example data into memory as a Dataset object:

from compClust.util import LoadExample
cho = LoadExample.LoadCho()

To keep dataset loading easy for now, we recommend simply executing the above commands and then skipping to Section 3.

If you are interested in understanding how to load your own dataset, the following sections describe in more detail how to manually load the Cho example dataset and its associated labelings from tab-delimited text files. Alternatively, you can study the LoadCho function in the compClust.util.LoadExample module (LoadExample.py), or refer to the other CompClust tutorials described in Section 9 for additional examples and details.
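As a rough picture of what manual loading involves, here is a generic sketch that reads a tab-delimited table into row labels, column labels, and a numeric matrix. It uses only the Python 3 standard library rather than CompClust's own Dataset and labeling classes. The function name read_labeled_table, the assumed file layout (a header row, with a label in the first column of each data row), and the toy values are all assumptions made for illustration.

```python
import csv
import io


def read_labeled_table(handle):
    """Parse tab-delimited text into (row_labels, column_labels, values).

    Assumes a header row and a label in the first column of every data row.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    column_labels = header[1:]  # the first header cell names the label column
    row_labels = []
    values = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        row_labels.append(record[0])
        values.append([float(cell) for cell in record[1:]])
    return row_labels, column_labels, values


# Toy input standing in for a real expression file.
toy_text = "gene\tt0\tt10\ng1\t0.5\t1.5\ng2\t-0.25\t2.0\n"
row_labels, column_labels, values = read_labeled_table(io.StringIO(toy_text))
```

Keeping the labels separate from the numeric values mirrors the idea of attaching row and column labels to a dataset after the raw numbers are loaded.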

Joe Roden 2005-12-13