By Brandon W. King
------------------
-Last updated: May 15th, 2006
+Last updated: May 18th, 2006
-Updated to Mussagl build: 141 (Update to 193 in progress)
+Updated to Mussagl build: 141 (Update to 200 in progress)
.. contents::
Download
--------
-Mussagl can be downloaded from http://mussa.caltech.edu/.
+Mussagl can be downloaded from http://mussa.caltech.edu/ as a binary
+for OS X or Windows, or as source code.
Install
-------
Now click on the 'Browse' button next to the sequence input box and
then select /examples/seq/human_mck_pro.fa file. Do the same in the
next two sequence input boxes selecting mouse_mck_pro.fa and
-rabbit_mck_pro.fa as shown below.
+rabbit_mck_pro.fa as shown below. Note that you can create annotation
+files using the mussa `Annotation File Format`_ to add annotations to
+your sequence.
.. image:: images/define_analysis_step2.png
:alt: Choose sequences
~~~~~~~~
.. Screenshot with numbers showing features.
+.. image:: images/window_overview.png
+ :alt: Mussa Window
+ :align: center
+
+Legend:
+
+ 1. `DNA Sequence (Black bars)`_
+
+ 2. Annotation_
+
+ 3. Motif_
+
+ 4. `Conservation tracks`_
+
+ 5. `Motif Toggle`_
+
+ 6. `Zoom Factor`_ (Base pairs per pixel)
+
+ 7. `Dynamic Threshold`_
+
+ 8. `Sequence Information Bar`_
+
+ 9. `Sequence Scroll Bar`_
+
+
+DNA Sequence (black bars)
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. image:: images/sequence_bar.png
+ :alt: Sequence Bar
+ :align: center
+
+Each of the black bars represents one of the loaded sequences, in this
+case the sequence around the gene 'MCK' in human, mouse, and rabbit.
+
+FIXME: Should I mention the repeats here?
+
+
+Annotation
+~~~~~~~~~~
+
+.. figure:: images/annotation.png
+ :alt: Annotation
+ :align: center
+
+ Annotation shown in green on sequence bar.
+
+
+Annotations can be included on any of the sequences using the `Load a
+mussa parameter file`_ method of loading your sequences. You can
+define annotations by location or using an exact subsequence and you
+may also choose any color for display of the annotation; see the
+`Annotation File Format`_ section for details.
+
+Note: Currently there is no way to add annotations using the GUI (only
+via the .mupa file). We plan to add this feature in the future, but it
+likely will not make it into the first release.
+
+
+Motif
+~~~~~
+
+.. figure:: images/motif.png
+ :alt: Motif
+ :align: center
+
+ Motif shown in light blue on sequence bar.
+
+The only real difference between an annotation and motif in mussagl is
+that you can define motifs from within the GUI. See the `Motifs`_
+section for more information.
+
+
+Conservation tracks
+~~~~~~~~~~~~~~~~~~~
+
+.. figure:: images/conservation_tracks.png
+ :alt: Conservation Tracks
+ :align: center
+
+ Conservation tracks shown as red and blue lines between sequence
+ bars.
+
+The **red lines** between the sequence bars represent conservation
+between the sequences and **blue lines** represent **reverse
+complement** conservation. The amount of sequence conservation shown
+will depend on the relatedness of your sequences and the `dynamic
+threshold`_ you are using. Sequences with lots of repeats will cause
+major slowdowns in calculating the matches.
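
To make "reverse complement conservation" concrete, here is a minimal
Python sketch. It is not taken from the Mussagl source; the function
names are hypothetical and it only illustrates comparing one window
against another in both orientations.

```python
# Illustrative only: not Mussagl's implementation.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def count_matches(a, b):
    """Count positions where two equal-length windows agree."""
    if len(a) != len(b):
        raise ValueError("windows must be the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)

window_a = "GATTACA"
window_b = "TGTAATC"  # the reverse complement of window_a

# A "red line" style comparison: both windows read in the same direction.
forward = count_matches(window_a, window_b)
# A "blue line" style comparison: one window is reverse complemented first.
reverse = count_matches(window_a, reverse_complement(window_b))
```

Here the two windows share few bases when read in the same direction,
but match at every position once one of them is reverse complemented.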
+
+
+Motif Toggle
+~~~~~~~~~~~~
+
+.. image:: images/motif_toggle.png
+ :alt: Motif Toggle
+ :align: center
+
+Toggles motifs on and off. This will not turn on and off annotations.
+
+Note: As of the current build (#200), this feature hasn't been
+implemented.
+
+
+Zoom Factor
+~~~~~~~~~~~
+
+.. image:: images/zoom_factor.png
+ :alt: Zoom Factor
+ :align: center
+
+The zoom factor is the number of base pairs represented per
+pixel. When you zoom in far enough, the display switches from
+a black bar representing the sequence to the actual sequence
+(well, an ASCII representation of the sequence).
+
+
Dynamic Threshold
~~~~~~~~~~~~~~~~~
-Zoom
-~~~~
+.. image:: images/dynamic_threshold.png
+ :alt: Dynamic Threshold
+ :align: center
-Sequence Info
-~~~~~~~~~~~~~
+You can dynamically change the threshold for how strong a match you
+consider the conservation to be with one of two options:
+
+ 1. Number of base pair matches out of window size.
+
+ 2. Percent base pair conservation.
+
+See the Threshold_ section for more information.
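
The two threshold options describe the same quantity in different
units. The following Python sketch shows the arithmetic; the function
names are hypothetical, and rounding up when converting a percentage
to a match count is an assumption, not documented Mussagl behavior.

```python
import math

def matches_to_percent(matches, window_size):
    """Express 'matches out of window size' as a percentage."""
    return 100.0 * matches / window_size

def percent_to_matches(percent, window_size):
    """Smallest match count that reaches the given percentage.

    Rounding up is an assumption made for this sketch.
    """
    return math.ceil(percent * window_size / 100.0)

# Example: 21 matches in a 30 base pair window is 70 percent conservation.
example_percent = matches_to_percent(21, 30)
example_matches = percent_to_matches(70, 30)
```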
-Scroll Bar
-~~~~~~~~~~
-.. Moving sequence
+
+Sequence Information Bar
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. image:: images/seq_info_bar.png
+ :alt: Sequence Information Bar
+ :align: center
+
+The sequence information bars can be found on the left and right sides
+of mussagl. Next to each sequence you will find the following
+information:
+
+ 1. Species (If it has been defined)
+ 2. Total Size of Sequence
+ 3. Current base pair position
+
+
+Sequence Scroll Bar
+~~~~~~~~~~~~~~~~~~~
+
+.. image:: images/scroll_bar.png
+ :alt: Sequence Scroll Bar
+ :align: center
+
+The scroll bar allows you to scroll through the sequence, which is
+useful when you have zoomed in using the `zoom factor`_.
Annotations / Motifs
Annotations
~~~~~~~~~~~
+Currently annotations can be added to a sequence using the mussa
+`annotation file format`_ and can be loaded by selecting the
+annotation file when defining a new analysis (see `Create a new
+analysis`_ section) or by defining a .mupa file pointing to your
+annotation file (see `Load a mussa parameter file`_ section).
+
Motifs
~~~~~~
Load Motifs from File
*********************
+It is possible to load motifs from a file which was saved from a
+previous run or by defining your own motif file. See the `Motif File
+Format`_ section for details.
+
+To load a motif file, select the **Load Motif List** item from the
+**File** menu and select a motif list file.
+
+.. image:: images/load_motif.png
+ :alt: Load Motif List
+ :align: center
+
+
+Save Motifs to File
+*******************
+
+Note: Currently not implemented
+
+
Motif Dialog
************
+Mussa has the ability to find lab motifs using the `IUPAC Nucleotide
+Code`_ for defining a motif. To define a motif, select **View > Edit
+Motifs** menu item as shown below.
+
+.. image:: images/view_edit_motifs.png
+ :alt: "View > Edit Motifs" Menu
+ :align: center
+
+You will see a dialog box appear with a "set motifs" button and 10
+rows for defining motifs and the color that will be displayed on the
+sequence. By default all 10 motifs start off with white as the color.
+
+.. image:: images/motif_dialog_start.png
+ :alt: Motif Dialog
+ :align: center
+
Detailed Info
-------------
~~~~~~~~~~~~~~~~~~~~~~
The first line in the file is the sequence name. Each line there after
-is a **space** seperated annotation.
+is a **space** separated annotation.
+
+New as of build 198:
+
+ * The annotation format now supports fasta sequences embedded in the
+ annotation file as shown in the format example below. Mussagl will
+ take this sequence and look for an exact match of it in
+ your sequences. If a match is found, it will label it with the name
+ from the fasta header.
Format:
<start> <stop> <annotation_name> <annotation_type>
<start> <stop> <annotation_name> <annotation_type>
<start> <stop> <annotation_name> <annotation_type>
+ >Fasta Header
+ ACTGACTGACGTACGTAGCTAGCTAGCTAGCACG
+ ACGTACGTACGTACGTAGCTGTCATACGCTAGCA
+ TGCGTAGAGGATCTCGGATGCTAGCGCTATCGAT
+ ACGTACGGCAGTACGCGGTCAGA
+ <start> <stop> <annotation_name> <annotation_type>
...
Example:
251 500 Glorp Glorptype
751 1000 Glorp Glorptype
1251 1500 Glorp Glorptype
+ >My favorite DNA sequence
+ GATTACA
1751 2000 Glorp Glorptype
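
For readers who want to generate or check annotation files with a
script, the following Python sketch reads the format described above.
It is not Mussagl's parser. It assumes that an annotation line starts
with an integer, that a line starting with ``>`` opens an embedded
fasta record, and that the record continues until the next annotation
line or header. The sequence name ``example_sequence`` is made up for
the example.

```python
def parse_annotations(text):
    """Parse annotation text into (name, intervals, fasta_records)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty annotation file")
    name = lines[0]          # first line is the sequence name
    intervals = []           # (start, stop, annotation_name, annotation_type)
    fasta = []               # (header, sequence)
    header, chunks = None, []

    def flush():
        # Close the fasta record currently being collected, if any.
        nonlocal header, chunks
        if header is not None:
            fasta.append((header, "".join(chunks)))
        header, chunks = None, []

    for ln in lines[1:]:
        if ln.startswith(">"):
            flush()
            header = ln[1:].strip()
        elif ln.split()[0].isdigit():
            flush()
            start, stop, ann_name, ann_type = ln.split(None, 3)
            intervals.append((int(start), int(stop), ann_name, ann_type))
        elif header is not None:
            chunks.append(ln)   # sequence line inside a fasta record
        else:
            raise ValueError("unrecognized line: " + ln)
    flush()
    return name, intervals, fasta

EXAMPLE = """example_sequence
251 500 Glorp Glorptype
751 1000 Glorp Glorptype
1251 1500 Glorp Glorptype
>My favorite DNA sequence
GATTACA
1751 2000 Glorp Glorptype
"""

name, intervals, fasta = parse_annotations(EXAMPLE)
```

After the call, ``intervals`` holds the four location-based
annotations and ``fasta`` holds the single embedded sequence with its
header.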
-.. _motif:
+.. _motif_file_format:
Motif File Format
~~~~~~~~~~~~~~~~~
GGCC 0.0 1 1
+
+IUPAC Nucleotide Code
+~~~~~~~~~~~~~~~~~~~~~
+
+For your convenience, below is a table of the IUPAC Nucleotide Code.
+
+The following table is table 1 from "Nomenclature for Incompletely
+Specified Bases in Nucleic Acid Sequences" which can be found at
+http://www.chem.qmul.ac.uk/iubmb/misc/naseq.html.
+
+====== ================= ===================================
+Symbol Meaning           Origin of designation
+====== ================= ===================================
+G      G                 Guanine
+A      A                 Adenine
+T      T                 Thymine
+C      C                 Cytosine
+R      G or A            puRine
+Y      T or C            pYrimidine
+M      A or C            aMino
+K      G or T            Keto
+S      G or C            Strong interaction (3 H bonds)
+W      A or T            Weak interaction (2 H bonds)
+H      A or C or T       not-G, H follows G in the alphabet
+B      G or T or C       not-A, B follows A
+V      G or C or A       not-T (not-U), V follows U
+D      G or A or T       not-C, D follows C
+N      G or A or T or C  aNy
+====== ================= ===================================
+
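As an illustration of how the codes in the table expand, the following
Python sketch turns an IUPAC motif into a regular expression and lists
where it occurs in a sequence. This is not how Mussagl searches for
motifs; the function names are hypothetical, and the sketch only
checks the forward strand.

```python
import re

# Symbol -> allowed bases, copied from the table above.
IUPAC = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT",
    "S": "GC", "W": "AT", "H": "ACT", "B": "GTC",
    "V": "GCA", "D": "GAT", "N": "GATC",
}

def motif_to_regex(motif):
    """Translate an IUPAC motif into a regular expression string."""
    parts = []
    for symbol in motif.upper():
        bases = IUPAC[symbol]  # raises KeyError for symbols not in the table
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)

def find_motif(motif, sequence):
    """Return 0-based start positions of the motif, overlaps included."""
    # The lookahead lets overlapping occurrences be reported separately.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

pattern_text = motif_to_regex("RTTW")
positions = find_motif("RTTW", "GATTACA")
```

To also report reverse complement hits, the same search could be run
on the reverse complement of the sequence.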
+
.. Define links below
------------------
.. _wiki: http://mussa.caltech.edu
.. _build: http://woldlab.caltech.edu/cgi-bin/mussa/wiki/MussaglBuild
.. _fasta: http://en.wikipedia.org/wiki/FASTA_format
+.. _wpDnaMotif: http://en.wikipedia.org/wiki/DNA_motif