.. _`Family Relations`: http://cartwheel.caltech.edu/
+Motivation
+----------
+
+.. class:: small
+
+ The hope is that conservation will highlight elements that are important.
+ However, it (by definition) only shows elements in common.
+
+ For instance, though a two-sequence comparison between a Human and Fugu
+ muscle gene might show important elements of muscle, it would lose any
+ mammal-specific elements.
+
+ But a two sequence comparison between Mouse and Human might have too
+ much in common to be useful.
+
+
+Motivation: Human vs. Fugu
+--------------------------
+
+.. class:: small
+
+ .. image:: HuFu.png
+
+Motivation: Human vs. Mouse
+---------------------------
+
+.. class:: small
+
+ .. image:: HuMo.png
+
+Motivation
+----------
+
+.. class:: small
+
+ The hope is that by requiring conservation in multiple more closely related
+ species one can achieve the purification of the long distance comparison
+ while still allowing elements that are important to those more closely
+ related species to remain.
+
+Motivation: Mammals
+-------------------
+
+.. class:: small
+
+ .. image:: HuCoDoMoRa.png
+
Algorithm
---------
.. class:: small
- To compute a result Mussa conceptually uses these modules
-
- * Seqcomp
- * Test Transitivity
- * "Refinement"
+ To compute a result Mussa uses these algorithms to perform the N-way
+ filtering.
+
+ * Seqcomp (determines the pairwise list of "matches")
+ * Transitivity Test (filters the matches)
Seqcomp
-------
match = 0
for i in range(W):
if S[0][x+i] == S[1][y+i]:
- increment match
- if match > threshold:
- save indicies
+ match = match + 1
+ if match >= threshold:
+ save_indicies(x,y)
- The actual algorithm only needs to compare the base that
+ The algorithm actually being used only needs to compare the base that
"slid in" into window, and account for the base that "slid out"
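The incremental update described above can be sketched in Python. This is a minimal illustration, not Mussa's actual code: the function name ``window_matches`` and its parameters are hypothetical, and it assumes a match is recorded whenever a window has at least ``threshold`` identical bases.

```python
def window_matches(a, b, window, threshold):
    """Return (x, y) window starts with at least `threshold` identical bases."""
    hits = []
    # Each diagonal is a fixed offset d = y - x between the two sequences.
    for d in range(-(len(a) - window), len(b) - window + 1):
        x, y = (-d, 0) if d < 0 else (0, d)
        n = min(len(a) - x, len(b) - y)  # number of aligned bases on this diagonal
        if n < window:
            continue
        # Score the first window on the diagonal in full.
        match = sum(a[x + i] == b[y + i] for i in range(window))
        if match >= threshold:
            hits.append((x, y))
        # Slide one base at a time: add the base that slid in,
        # subtract the base that slid out.
        for k in range(1, n - window + 1):
            match += a[x + k + window - 1] == b[y + k + window - 1]
            match -= a[x + k - 1] == b[y + k - 1]
            if match >= threshold:
                hits.append((x + k, y + k))
    return hits
```

Walking each diagonal separately means every window after the first costs two base comparisons instead of ``window`` comparisons.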
Seqcomp
.. image:: 4bp_window_no_match.png
- In this case there are none.
+ In this case there is only one.
Seqcomp
-------
.. image:: 4bp_window_match.png
- However, now that we slid over one position there are now 4
+ However, now that we have slid over one position there are 3
and so we would record 0, 5
Seqcomp
When extending to more than two sequences, mussa needs to compare
- (N * (N-1)) / 2 sequences
+ (N * (N-1)) sequences
+
+Transitivity Test
+-----------------
+
+.. class:: small
+
+ There are several algorithms for comparing multiple sequences.
+
+ * Require transitivity, e.g. if A = B, and B = C, then A = C
+ * "Radial" only tests matches between any number of query sequences
+ and a single reference sequence. A = B, A = C, but B ?= C
+ * "Entropy" (an experimental comparison that Tristan was working on)
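The difference between the transitive and radial tests can be sketched as follows. This is only an illustration under stated assumptions, not Mussa's implementation: the name ``nway_paths`` and the ``pairwise`` dictionary layout are hypothetical, and sequence 0 is assumed to be the radial reference.

```python
from itertools import combinations, product

def nway_paths(pairwise, n, mode="transitive"):
    """pairwise[(i, j)] with i < j is a set of (pos_in_i, pos_in_j) matches."""
    # Candidate positions for each sequence come from its matches against sequence 0.
    candidates = [sorted({p for p, _ in pairwise[(0, 1)]})]
    for j in range(1, n):
        candidates.append(sorted({q for _, q in pairwise[(0, j)]}))
    if mode == "radial":
        # Only the reference-to-query pairs must match.
        pairs = [(0, j) for j in range(1, n)]
    else:
        # Every pair of sequences along the path must match.
        pairs = list(combinations(range(n), 2))
    # Brute force: test every combination of candidate positions.
    return [path for path in product(*candidates)
            if all((path[i], path[j]) in pairwise[(i, j)] for i, j in pairs)]
```

The brute-force enumeration over every candidate path is deliberate here; it keeps the sketch short and shows why the number of paths to test grows quickly with more sequences and more matches.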
Test Transitivity
-----------------
-Refinement
-----------
+.. class:: small
+ .. image:: 4way_trans.png
+
+
Limits
------
- describe the difference between a long distance comparison
- and multiple closer comparisons. (should use some pictures for that)
+.. class:: small
- paircomp/seqcomp
+ One of the weaknesses with the current implementation is that the
+ transitivity filtering step involves a combinatorial explosion as it
+ compares every possible path.
- transitivity filter
+ The number of matches found is influenced by whether the sequence is
+ repeat masked, how closely related the two sequences are, the length
+ of the sequence, and the stringency of the seqcomp threshold.
-How To Use
-----------
+Limits
+------
- Should this include pulling things from the tutorial?
- cover sucking things out of UCSC?
+.. class:: small
+
+ Additionally, the types of elements found are influenced by the
+ window size and base-pair threshold.
+
+ For instance, a 6 base pair binding site won't be detected when using
+ a 30 base pair window size.
+
+Usage
+-----
+
+.. class:: small
+
+ Currently I have two classes of target user for mussa.
+
+ * Computationally savvy user (AKA me)
+ * The "typical" biologist (AKA my PI)
+
+Tutorial
+--------
+
+ Brandon has been working on a tutorial for the GUI
+ which includes a section on how we extract sequence out of UCSC.
+
+
+Command-Line Features
+---------------------
+
+.. class:: small
+
+ * Command line::
+
+ $ mussagl --help
+ --run-analysis arg run an analysis
+ defined by the mussa
+ parameter file
+ --view-analysis arg load a previously run
+ analysis
+ --no-gui terminate without viewing
+ an analysis
+
+Command-Line Features
+---------------------
+
+.. class:: small
+
+ * Parameter file::
+
+ ANA_NAME mck3test
+ APPEND_WIN true
+ APPEND_THRES true
+
+ SEQUENCE seq/mouse_mck_pro.fa
+ ANNOTATION mm_mck3test.annot
+
+Command-Line Features
+---------------------
+
+.. class:: small
+
+ * Annotation File::
+
+ [Seq name]
+ start stop name type
+ >name
+ AGCGAAA
+
+ * [Seq name] is an optional name specifier.
+ * The "alignment" algorithm used for sequence specified annotations
+ is currently just using the motif search, so it only accepts
+ IUPAC codes and doesn't handle in-dels.
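To illustrate what an IUPAC-only search without in-dels means, here is a minimal Python sketch. It is not Mussa's motif search; the function name ``find_motif`` is hypothetical, the table is the standard IUPAC nucleotide code, and reported positions are assumed to be 0-based.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def find_motif(motif, sequence):
    """Return 0-based start positions where the IUPAC motif matches."""
    # Each motif letter becomes a character class; the length is fixed,
    # so insertions and deletions can never match.
    pattern = "".join("[" + IUPAC[c] + "]" for c in motif.upper())
    # A lookahead is used so that overlapping hits are all reported.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", sequence.upper())]
```

For example, ``find_motif("AGCGAAA", "TTAGCGAAATT")`` finds the annotation sequence shown above at position 2, while a copy with a single inserted or deleted base would not be found.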
+
+GUI Features
+------------
+
+.. class:: small
+
+ * The Create Analysis menu option provides the same options
+ as the parameter file.
+
+ .. image:: ../manual/images/define_analysis.png
+
+GUI Features
+------------
+
+.. class:: small
+
+ However, there isn't yet a GUI for describing large annotations.
+ (The motif editor can be used this way, but there are issues.)
+
+
+GUI Features
+------------
+
+.. class:: small
+
+ The Mussa GUI can:
+
+ * Display sequence with highlighted annotation regions
+ * Search for motifs in these sequences
+ * Show a base-pair alignment of a seqcomp "match"
+ * Copy sequence regions
+ * Create a new analysis using a subselection of one analysis
+ and different parameters.
+
+GUI
+---
+
+.. class:: small
+
+ <demo>
+
+Finish
+------
+
+.. class:: small
+
+Mussa has been developed by:
+
+ * Tristan DeBuysscher
+ * Diane Trout
+ * Brandon King
+ * Nora Mullaney
+And has been influenced by:
+
+ * C. Titus Brown
+ * Erich Schwars
+ * and Barbara Wold
+ :tiny:`and as I stepped in fairly late in Mussa's life, there could easily
+ be others.`