From: Brandon King
Date: Sat, 25 Mar 2006 00:08:28 +0000 (+0000)
Subject: First (incomplete) rough draft of the Mussagl Manual
X-Git-Url: http://woldlab.caltech.edu/gitweb/?p=mussa.git;a=commitdiff_plain;h=18f69f0aaf8a2025c4e84bbc82109def10c8bcd0

First (incomplete) rough draft of the Mussagl Manual
---

diff --git a/.boring b/.boring
index 61997f1..7ef83f1 100644
--- a/.boring
+++ b/.boring
@@ -1,6 +1,6 @@
 # Boring file regexps:
 # doxygen output
-(^|/)doc($|/)
+#(^|/)doc($|/)
 # depends files
 \.d$
 # qt
diff --git a/doc/manual/Makefile b/doc/manual/Makefile
new file mode 100644
index 0000000..bf22463
--- /dev/null
+++ b/doc/manual/Makefile
@@ -0,0 +1,11 @@
+all: html pdf
+
+html:
+	rst2html mussagl_manual.rst mussagl_manual.html
+
+pdf:
+	rst2latex mussagl_manual.rst mussagl_manual.tex
+	rubber --pdf mussagl_manual
+
+clean:
+	rm mussagl_manual.aux mussagl_manual.log mussagl_manual.out mussagl_manual.tex
\ No newline at end of file
diff --git a/doc/manual/images/define_analysis.png b/doc/manual/images/define_analysis.png
new file mode 100644
index 0000000..fe9a2b0
Binary files /dev/null and b/doc/manual/images/define_analysis.png differ
diff --git a/doc/manual/images/define_analysis_step1.png b/doc/manual/images/define_analysis_step1.png
new file mode 100644
index 0000000..5344edb
Binary files /dev/null and b/doc/manual/images/define_analysis_step1.png differ
diff --git a/doc/manual/images/define_analysis_step1a.png b/doc/manual/images/define_analysis_step1a.png
new file mode 100644
index 0000000..8d9edef
Binary files /dev/null and b/doc/manual/images/define_analysis_step1a.png differ
diff --git a/doc/manual/images/define_analysis_step2.png b/doc/manual/images/define_analysis_step2.png
new file mode 100644
index 0000000..a5f5bfe
Binary files /dev/null and b/doc/manual/images/define_analysis_step2.png differ
diff --git a/doc/manual/images/demo.png b/doc/manual/images/demo.png
new file mode 100644
index 0000000..8822a3c
Binary files /dev/null and
b/doc/manual/images/demo.png differ
diff --git a/doc/manual/images/load_analysis_menu.png b/doc/manual/images/load_analysis_menu.png
new file mode 100644
index 0000000..f0cc52c
Binary files /dev/null and b/doc/manual/images/load_analysis_menu.png differ
diff --git a/doc/manual/images/load_mck_example.png b/doc/manual/images/load_mck_example.png
new file mode 100644
index 0000000..e5aec3f
Binary files /dev/null and b/doc/manual/images/load_mck_example.png differ
diff --git a/doc/manual/images/load_mupa_dialog.png b/doc/manual/images/load_mupa_dialog.png
new file mode 100644
index 0000000..283cf7f
Binary files /dev/null and b/doc/manual/images/load_mupa_dialog.png differ
diff --git a/doc/manual/images/load_mupa_menu.png b/doc/manual/images/load_mupa_menu.png
new file mode 100644
index 0000000..ef09a35
Binary files /dev/null and b/doc/manual/images/load_mupa_menu.png differ
diff --git a/doc/manual/images/opened.png b/doc/manual/images/opened.png
new file mode 100644
index 0000000..0e64743
Binary files /dev/null and b/doc/manual/images/opened.png differ
diff --git a/doc/manual/mussagl_manual.rst b/doc/manual/mussagl_manual.rst
new file mode 100644
index 0000000..1a5fb6a
--- /dev/null
+++ b/doc/manual/mussagl_manual.rst
@@ -0,0 +1,367 @@
+==============
+Mussagl Manual
+==============
+------------------
+By Brandon W. King
+------------------
+
+Last updated: March 23rd, 2006
+
+Updated to Mussagl build: 141
+
+
+.. contents::
+
+Introduction
+============
+
+
+What is Mussagl?
+----------------
+
+
+Short History of Mussa
+----------------------
+
+
+Mussa Python/PMW Prototype
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+
+Mussa C++/FLTK
+~~~~~~~~~~~~~~
+
+
+Mussagl C++/Qt/OpenGL
+~~~~~~~~~~~~~~~~~~~~~
+
+
+Getting Mussagl
+===============
+
+License
+-------
+
+Mussagl has been released open source under the `GPL v2
+license`__.
+
+__ GPL_
+
+Platforms
+---------
+
+You have the option of building from source or downloading prebuilt
+binaries.
Most people will want the prebuilt versions.
+
+Supported Platforms:
+
+  * Mac OS X (binary or source)
+  * Windows XP (binary or source)
+  * Linux (source)
+
+Download
+--------
+
+Mussagl can be downloaded from http://mussa.caltech.edu/.
+
+Install
+-------
+
+Mac OS X
+~~~~~~~~
+Once you have downloaded the .dmg file, double click on it and follow
+the install instructions.
+
+FIXME: Mention how to launch the program.
+
+
+Windows XP
+~~~~~~~~~~
+Once you have downloaded the Mussagl installer, double click on the
+installer and follow the install instructions.
+
+To start Mussagl, launch the program from Start > Programs > Mussagl >
+Mussagl.
+
+
+Linux
+~~~~~
+Currently we do not have a binary installer for Linux. You will have
+to build from source. See the 'build from source' section below.
+
+
+Build from Source
+~~~~~~~~~~~~~~~~~
+
+Instructions for building from source can be found on the `build page
+`_ of the
+`Mussa wiki`__.
+
+__ wiki_
+
+
+Using Mussagl
+=============
+
+
+Launch Mussagl
+--------------
+Launch Mussagl. It should look similar to the screen shot below.
+
+.. image:: images/opened.png
+   :alt: Launch Mussa
+   :align: center
+
+
+
+Create/Load Analysis
+----------------------
+
+Currently there are three ways to load a Mussa experiment.
+
+  1. `Create a new analysis`_
+  2. `Load a mussa parameter file`_ (.mupa)
+  3. `Load an analysis`_
+
+.. _createnew:
+
+Create a new analysis
+~~~~~~~~~~~~~~~~~~~~~
+
+To create a new analysis, select 'Define analysis' from the 'File'
+menu. You should see a dialog box similar to the one below. For this
+demo we will use the example sequences that come with Mussagl.
+
+.. image:: images/define_analysis.png
+   :alt: Define Analysis
+   :align: center
+
+Instructions:
+
+  1. **Give the experiment a name**. For this demo, we'll use
+     'demo_w30_t20'. Mussa will create a folder with this name to store
+     the analysis files in once it has been run.
+
+  2. Choose a `window size`_. For this demo **choose 30**.
+
+  3.
Choose a threshold_. For this demo **choose 20**. See the
+     Threshold_ section for more detailed information.
+
+  4. Choose the number of sequences_ you would like. For this demo
+     **choose 3**.
+
+.. image:: images/define_analysis_step1a.png
+   :alt: Steps 1-4
+   :align: center
+
+Now click on the 'Browse' button next to the sequence input box and
+then select the /examples/seq/human_mck_pro.fa file. Do the same in the
+next two sequence input boxes, selecting mouse_mck_pro.fa and
+rabbit_mck_pro.fa as shown below.
+
+.. image:: images/define_analysis_step2.png
+   :alt: Choose sequences
+   :align: center
+
+Click the **create** button and in a few moments you should see
+something similar to the following screen shot.
+
+.. image:: images/demo.png
+   :alt: Mussagl Demo
+   :align: center
+
+This analysis is now saved in a directory called **demo_w30_t20** in
+the current working directory. If you close and reopen Mussagl, you
+can reload the saved analysis. See the `Load an analysis`_ section
+below for details.
+
+
+Load a mussa parameter file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you prefer, you can define your Mussa analysis using a Mussa
+parameter file. See the `Parameter File Format`_ section for details
+on creating a .mupa file.
+
+Once you have created a .mupa file, launch Mussagl and select the
+**File > Load Mussa Parameters** menu option. Select the .mupa file and
+click open.
+
+.. image:: images/load_mupa_menu.png
+   :alt: Load Mussa Parameters
+   :align: center
+
+If you would like to see an example, you can load the
+**mck3test.mupa** file in the examples directory that comes with
+Mussagl.
+
+.. image:: images/load_mupa_dialog.png
+   :alt: Load Mussa Parameters Dialog
+   :align: center
+
+
+Load an analysis
+~~~~~~~~~~~~~~~~
+
+To load a previously run analysis, open Mussagl and select the **File >
+Load Analysis** menu option. Select an analysis **directory** and
+click open.
+
+..
image:: images/load_analysis_menu.png
+   :alt: Load Analysis Menu
+   :align: center
+
+
+Detailed Info
+-------------
+
+Threshold
+~~~~~~~~~
+
+The threshold of an analysis is the minimum number of base pair
+matches that must be met within a window in order for it to be kept as
+a match. Note that you can vary the threshold from within Mussagl. For
+example, if you choose a `window size`_ of **30** and a **threshold**
+of **20**, the mussa nway transitive algorithm will store all matches
+that are 20 out of 30 bp matches or better and pass them on to
+Mussagl. Mussagl will then allow you to dynamically choose a threshold
+from 10 to 30 base pairs. A threshold of 30 bps would only show 30 out
+of 30 bp matches. A threshold of 20 bps would show all matches of 20
+out of 30 bps or better. Choosing a threshold below 20 in this case
+won't have an effect [*]_ because the mussa algorithm didn't report
+any matches below this threshold.
+
+.. [*] In the future, Mussagl will automatically detect the minimum
+   threshold which was used when defining an analysis and not allow
+   you to select a threshold below the minimum. See `ticket #52
+   `_ for more
+   info.
+
+Window Size
+~~~~~~~~~~~
+
+The typical sizes people tend to choose are between 20 and 30. Feel
+free to experiment with this setting depending on your needs.
+
+
+Sequences
+~~~~~~~~~
+
+Mussa reads in sequences which are formatted in the fasta_
+format. Mussa may take a long time to run (>10 minutes) if the total
+bp length is near 280 Kb. Once Mussa has run, you can reload
+previously run analyses.
+
+
+Mussa File Formats
+------------------
+
+..
_param:
+
+Parameter File Format
+~~~~~~~~~~~~~~~~~~~~~
+
+**File Format (.mupa):**
+
+::
+
+  # name of analysis directory and stem for associated files
+  ANA_NAME
+
+  # if APPEND vars true, a _wXX and/or _tYY added to analysis name
+  # where XX = WINDOW and YY = THRESHOLD
+  # Highly recommended with use of command line override of WINDOW or THRESHOLD
+  APPEND_WIN
+  APPEND_THRES
+
+  # how many sequences are being analyzed
+  SEQUENCE_NUM
+
+  # first sequence info
+  SEQUENCE
+  ANNOTATION
+  SEQ_START
+
+  # the second sequence info
+  SEQUENCE
+  # ANNOTATION
+  SEQ_START
+  # SEQ_END
+
+  # third sequence info
+  SEQUENCE
+  # ANNOTATION
+
+  # analysis parameters: command line args -w -t will override these
+  WINDOW
+  THRESHOLD
+
+.. csv-table:: Parameter File Options:
+   :header: "Option Name", "Value", "Default", "Required", "Description"
+   :widths: 30 30 30 30 60
+
+   "ANA_NAME", "string", "N/A", "true", "Name of analysis (also the
+   name of the directory where the analysis will be saved)."
+   "APPEND_WIN", "true or false", "?", "?", "Appends _w## to ANA_NAME"
+   "APPEND_THRES", "true or false", "?", "?", "Appends _t## to ANA_NAME"
+   "SEQUENCE_NUM", "integer", "N/A", "true", "The number of sequences
+   to analyze"
+   "SEQUENCE", "/fasta/filepath.fa", "N/A", "true", "One SEQUENCE entry
+   must be defined for each sequence counted in SEQUENCE_NUM."
+   "ANNOTATION", "/annotation/filepath.txt", "N/A", "false", "Optional
+   annotation file. See `annotation file format`_ section for more
+   information."
+   "SEQ_START", "integer", "1", "false", "Optional index into fasta file"
+   "SEQ_END", "integer", "1", "false", "Optional index into fasta file"
+   "WINDOW", "integer", "N/A", "true", "`Window Size`_"
+   "THRESHOLD", "integer", "N/A", "true", "`Threshold`_"
+
+.. _annot:
+
+Annotation File Format
+~~~~~~~~~~~~~~~~~~~~~~
+
+The first line in the file is the sequence name. Each line thereafter
+is a **space**-separated annotation.
+
+Format:
+
+::
+
+
+
+
+
+
+
+  ...
+
+Example:
+
+::
+
+  Mouse
+  251 500 Glorp Glorptype
+  751 1000 Glorp Glorptype
+  1251 1500 Glorp Glorptype
+  1751 2000 Glorp Glorptype
+
+
+.. _motif:
+
+Motif File Format
+~~~~~~~~~~~~~~~~~
+
+Format:
+
+
+  GGCC 0.0 1 1
+
+
+.. Define links below
+   ------------------
+
+.. _GPL: http://www.opensource.org/licenses/gpl-license.php
+.. _wiki: http://mussa.caltech.edu
+.. _build: http://woldlab.caltech.edu/cgi-bin/mussa/wiki/MussaglBuild
+.. _fasta: http://en.wikipedia.org/wiki/FASTA_format
+
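As an illustration of how the window size and threshold settings from the Threshold section interact, here is a minimal Python sketch. It is a naive, ungapped, pairwise comparison written only for this example; it is not Mussa's nway transitive algorithm, and the function names ``window_matches`` and ``apply_display_threshold`` are hypothetical. Bases are compared as plain characters, so case differences count as mismatches.

```python
def window_matches(seq_a, seq_b, window, threshold):
    """Compare every window of seq_a against every window of seq_b.

    Returns a list of (start_a, start_b, score) tuples, where score is the
    number of identical positions in the two windows. Only windows with
    score >= threshold are kept, mirroring "20 out of 30 bp or better".
    This brute-force loop is an assumption made for illustration only.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            score = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if score >= threshold:
                hits.append((i, j, score))
    return hits


def apply_display_threshold(hits, threshold):
    """Filter already-stored hits by a (possibly different) threshold.

    Raising the threshold hides weaker matches; lowering it below the value
    used when the hits were computed cannot reveal anything new.
    """
    return [hit for hit in hits if hit[2] >= threshold]


# Tiny example: a window of 5 with an analysis threshold of 4.
stored = window_matches("ACGTA", "ACGAA", window=5, threshold=4)
stricter = apply_display_threshold(stored, 5)  # only perfect 5/5 windows
looser = apply_display_threshold(stored, 3)    # same as stored, nothing new
```

In this sketch ``looser`` is identical to ``stored``, which is the same effect the manual describes for choosing a threshold below the one used when the analysis was defined.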