compClust (pyMLX and IPlot) installation instructions

Welcome to pyMLX and IPlot. This file should make it reasonably clear
how to install the two components.

== Prerequisites ==

pyMLX and IPlot are written in Python and require a variety of Python
modules. This file includes instructions for a Debian source
installation and a "manual install".

* Debian source Install *

+ Base interpreter installation

To build and run compclust you'll need the following packages.

  apt-get install python python-numeric python-numeric-ext ipython \
          python-pmw python-scientific \
          python-pyrex python-stats python-tk \
          python-dev python-profiler python-imaging \
          python-imaging-tk gcc g++ libc6-dev tk8.4-dev

NOTE: matplotlib Debian packages (python-matplotlib) can be installed
by adding the following two lines to your /etc/apt/sources.list:

  deb http://anakonda.altervista.org/debian packages/
  deb-src http://anakonda.altervista.org/debian sources/

+ Web interface

  apt-get install python-simpletal python-twisted

(Alas, Debian only has Quixote 1, so you'll need to install Quixote 2
manually.)

* Manual Install *

+ Python

You will need Python 2.3; we haven't started testing with 2.4 yet.

  http://python.org/download/

+ Numeric

Next you will need Python Numeric and all of its extensions. Download
it from

  http://sourceforge.net/projects/numpy

(Numarray is an upcoming version of the numeric routines for Python,
which we don't currently support.)

+ Statistic modules

You will need stats.py and pstat.py to run the PCA analysis package.
They came from

  http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/python.html

However, as a convenience they are now included.

+ Compiled Components

To build the link to some of the underlying C code you will also need
Pyrex.

  http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/

+ Matplotlib

The most current plot library uses matplotlib, which is available from

  http://matplotlib.sf.net

Versions from the late 0.7x to early 0.8x should work. (They keep
changing their APIs.)

+ PIL (Python Imaging Library)

If you want to be able to download the plots in the web version in
formats other than PNG, you'll need to install PIL.

  http://www.pythonware.com/products/pil/

+ Web Interface

To run the web interface you'll need to install the following packages.

  Quixote    http://www.mems-exchange.org/software/quixote/   >= 2.0
  SimpleTAL  http://www.owlfish.com/software/simpleTAL/       >= 3.12

Quixote currently supports several different web servers. compclust web
has driver scripts for apache/scgi and for the simple Quixote server.

compclustweb-local.py serves using the simple web server built into
Quixote 2.

Alternatively, one can use the scgi interface to Apache. This requires
Apache,

  http://httpd.apache.org

and SCGI,

  http://www.mems-exchange.org/software/scgi/

Once you have installed scgi you can either install mod_scgi, following
their instructions, or use the cgi2scgi gateway script. For the gateway
you will need to compile the example cgi2scgi code provided with the
scgi package. You may need to change the PORT setting in that file to
match the port on which you plan to launch the scgi server application.

+ Optional Components

The following components, Scientific and IPython, are optional.
Scientific is used to provide a histogram plot in IPlot, and we just
find IPython to be a convenient work environment.

Scientific is used by IPlot for its histogram functionality and by
compClust.visualize.SummaryViews for least squares.

  http://starship.python.net/~hinsen/ScientificPython/

IPython can be downloaded from

  http://ipython.scipy.org/

* Historical install *

We've gone through a couple of different plotting backends; the first
version of IPlot used Pmw and BLT.

Python megawidgets (Pmw) wraps the core BLT graph widget for IPlot. It
can be found at

  http://pmw.sourceforge.net/

For Debian:

  apt-get install blt

For manual installations, Tk should have come with your installation of
Python; however, you may also need to add BLT for its graph widget.

  http://incrtcl.sourceforge.net/blt/

Also, for completeness: some archaic pyMLX code that predates IPlot uses
gracePlot, in case you find yourself interested in that code. gracePlot
is available from

  http://graceplot.sourceforge.net/

== Algorithm Installation ==

+ For collaborators

For some people we can distribute the underlying C code. If you have
compclust/src, you're in luck and have the full source tree.

We hope to clean up our licensing issues and break our dependency on
the non-free Numerical Recipes code so that we can redistribute the C
source code as well, but we can't do it yet.

Currently the build does not autodetect the platform type for the C
code. You MUST use a version of GNU make; other makes will not work.

Some of the underlying algorithm code can use MPI; however, the Python
code isn't currently taking advantage of this. If you would like to
build a parallel version you will need to install MPI and update the
variable MPICH_INCLUDE to point to it. The current C make system will
skip MPI if it's not available.

You may also need to change BINEXT, OBJEXT, LIBEXT, LIBOBJEXT, and
SHLIBEXT if your platform is not a straightforward Linux-like Unix
variant.

Once you've made these changes you should be able to run

  $ python setup.py install

(which should also build all of the binary code). If that fails, you
should be able to just type make in the compclust/src directory.

+ For the general public

Because of licensing issues we can't distribute the source to the
command-line C code that we use. We provide binaries of our EM and
KMeans clustering algorithms, for the platforms that we use, at our web
page:

  http://woldlab.caltech.edu/compClust/

There are two packages that we can't redistribute. We used the
MATLAB-dependent SOM Toolbox from the Helsinki University of Technology
for our SOM implementation.

  http://www.cis.hut.fi/projects/somtoolbox/

We also used a modified version of XCluster, which was based on Gavin
Sherlock's implementation.

  http://genetics.stanford.edu/~sherlock/cluster.html

For the binaries, find a convenient location to store them. For
XCluster you'll need to follow their build instructions.

This is being obsoleted: currently the kmeans and diagem algorithms
expect to find the kmeans and diagem binaries in either the source tree
or a properly installed distribution of compclust.

The Python code needs to know the location of these components, so we
use environment variables to specify the paths to external programs.
For instance, the variable DIAGEM_COMMAND is used by the DiagEM wrapper
to find the diagem executable.

  Wrapper     Required environment variable
  DiagEM      DIAGEM_COMMAND=
  HutSOM      SOM_TOOLBOX_HOME=
  KMeans      KMEANS_COMMAND=
  KMedians    KMEDIANS_COMMAND=
  WMatch      WMATCH_COMMAND=
  XClust      XCLUST_COMMAND=
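
To check the environment before running a wrapper, something like the
following sketch may help. It only illustrates the lookup described
above: the names WRAPPER_ENV_VARS, find_external and missing_variables
are hypothetical and are not part of compClust, and the wrappers
themselves may read these variables differently.

```python
import os

# Wrapper name -> environment variable, copied from the table above.
WRAPPER_ENV_VARS = {
    "DiagEM": "DIAGEM_COMMAND",
    "HutSOM": "SOM_TOOLBOX_HOME",
    "KMeans": "KMEANS_COMMAND",
    "KMedians": "KMEDIANS_COMMAND",
    "WMatch": "WMATCH_COMMAND",
    "XClust": "XCLUST_COMMAND",
}


def find_external(wrapper, environ=None):
    """Return the path configured for `wrapper`, or None if unset or empty.

    Hypothetical helper: it only reads the variable named in the table.
    """
    if environ is None:
        environ = os.environ
    value = environ.get(WRAPPER_ENV_VARS[wrapper], "")
    return value or None


def missing_variables(environ=None):
    """Return a sorted list of the table's variables that are not set."""
    if environ is None:
        environ = os.environ
    names = [name for name in WRAPPER_ENV_VARS.values()
             if not environ.get(name)]
    names.sort()
    return names
```

Calling missing_variables() with no argument inspects the real
environment; passing a plain dictionary instead makes it easy to try
the check without changing your shell settings.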