CompClust Labelings

Adding a new labeling to any data set is fairly easy. All you need to do is make a tab-delimited text file with either one row or one column, depending on which type of labeling is appropriate. The only restriction is that the labeling must match the dimensions of its data set: a row labeling needs one label per row of the data set, and a column labeling needs one label per column.

For example, to add a 'Gene Name' labeling to the data set in section 2.2, you would need a row labeling, i.e. one column with three labels to match the three rows in the data set. Below is an example of this labeling.

Gene Name 1
Gene Name 2
Gene Name 3
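
As a rough illustration, the labeling file above could be generated with a few lines of plain Python. This sketch does not use any CompClust API; the helper name write_row_labeling and the file name gene_names.txt are made up for this example, and the length check simply mirrors the restriction that a row labeling must have one label per data set row.

```python
import os
import tempfile


def write_row_labeling(path, labels, n_rows):
    """Write one label per line, after checking the label count.

    Hypothetical helper for illustration; not part of CompClust.
    """
    if len(labels) != n_rows:
        raise ValueError(
            "expected %d labels, got %d" % (n_rows, len(labels))
        )
    with open(path, "w") as handle:
        for label in labels:
            handle.write(label + "\n")


# The example data set has three rows, so the labeling needs three labels.
gene_names = ["Gene Name 1", "Gene Name 2", "Gene Name 3"]
labeling_path = os.path.join(tempfile.mkdtemp(), "gene_names.txt")
write_row_labeling(labeling_path, gene_names, n_rows=3)
```

Checking the count before writing catches a mismatched labeling early, rather than when the file is later attached to the data set.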

If you wanted to make your own cluster labeling (group labeling), you would reuse the same label on every row that belongs to the same group. For example, to create a cluster labeling that puts Gene 1 and Gene 2 in one group and Gene 3 in another, you would create the following row labeling.

Cluster 1
Cluster 1
Cluster 2
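
To see how repeated labels define groups, the following sketch collects the row indices that share each label. It is plain Python written for this example, not CompClust code, and the function name group_rows_by_label is hypothetical.

```python
from collections import defaultdict


def group_rows_by_label(labels):
    """Map each distinct label to the list of row indices that carry it."""
    groups = defaultdict(list)
    for row_index, label in enumerate(labels):
        groups[label].append(row_index)
    return dict(groups)


# Rows 0 and 1 (Gene 1 and Gene 2) share a label; row 2 (Gene 3) has its own.
cluster_labels = ["Cluster 1", "Cluster 1", "Cluster 2"]
groups = group_rows_by_label(cluster_labels)
print(groups)  # {'Cluster 1': [0, 1], 'Cluster 2': [2]}
```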

You may also wish to keep the time point hours as a column labeling. To do this, create a tab-delimited text file with one row, as shown in the column labeling below.

Hour 1   Hour 2  

In practice, a labeling file can be in either row form, with one label per row, or column form, with one label per tab-separated column.
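
A reader that accepts both forms might look like the sketch below. This is an assumption about how such a file could be parsed, not a description of how CompClust itself reads labelings; parse_labeling is a hypothetical name. It treats a file with a single non-empty line as column form and splits it on tabs, and treats anything else as row form.

```python
def parse_labeling(text):
    """Return the list of labels from row-form or column-form text.

    Hypothetical helper for illustration; not part of CompClust.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) == 1:
        # Column form: one line, labels separated by tabs.
        # A single label with no tabs also lands here and yields one label.
        return [label.strip() for label in lines[0].split("\t") if label.strip()]
    # Row form: one label per line.
    return [line.strip() for line in lines]


column_form = parse_labeling("Hour 1\tHour 2\n")
row_form = parse_labeling("Cluster 1\nCluster 1\nCluster 2\n")
```

Either way the result is a flat list of labels, so the same length check against the data set's rows or columns applies to both forms.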

One of the beauties of CompClust is that you can attach as many labelings as you can think of. In CompClustTk you will see dialogs asking you to select cluster labelings, which are row labelings that separate your data into groups. You will also be asked for primary and secondary labelings, which are arbitrary row labelings that you may wish to attach. For example, when viewing gene expression data in a plot, you may wish to attach a primary labeling of gene names and a secondary labeling of descriptions.

Column labelings can currently only be taken advantage of by using CompClust from Python, but in the future these features may be exposed in CompClustTk.

Brandon King 2005-05-27