From: Diane Trout
Date: Sat, 8 Jul 2006 01:33:02 +0000 (+0000)
Subject: start of presentation for bioinformatics journal club 2006 july 10
X-Git-Url: http://woldlab.caltech.edu/gitweb/?p=mussa.git;a=commitdiff_plain;h=58adf40f9b6e009f0d3d5fa2a277ad25ea63530d

start of presentation for bioinformatics journal club 2006 july 10
---

diff --git a/doc/bioinfo_jc/4bp_window_match.png b/doc/bioinfo_jc/4bp_window_match.png
new file mode 100644
index 0000000..18c9d96
Binary files /dev/null and b/doc/bioinfo_jc/4bp_window_match.png differ
diff --git a/doc/bioinfo_jc/4bp_window_match.svg b/doc/bioinfo_jc/4bp_window_match.svg
new file mode 100644
index 0000000..5f0141d
--- /dev/null
+++ b/doc/bioinfo_jc/4bp_window_match.svg
@@ -0,0 +1,118 @@
+ image/svg+xml
+ AGGCATAGCGTGCAT
+ ACCGTAGTCGTTGAC
diff --git a/doc/bioinfo_jc/4bp_window_no_match.png b/doc/bioinfo_jc/4bp_window_no_match.png
new file mode 100644
index 0000000..269f2a3
Binary files /dev/null and b/doc/bioinfo_jc/4bp_window_no_match.png differ
diff --git a/doc/bioinfo_jc/4bp_window_no_match.svg b/doc/bioinfo_jc/4bp_window_no_match.svg
new file mode 100644
index 0000000..49d3ad5
--- /dev/null
+++ b/doc/bioinfo_jc/4bp_window_no_match.svg
@@ -0,0 +1,115 @@
+ image/svg+xml
+ AGGCATAGCGTGCAT
+ ACCGTAGTCGTTGAC
diff --git a/doc/bioinfo_jc/bioinfo-presentation.rst b/doc/bioinfo_jc/bioinfo-presentation.rst
new file mode 100644
index 0000000..3437935
--- /dev/null
+++ b/doc/bioinfo_jc/bioinfo-presentation.rst
@@ -0,0 +1,137 @@
+.. include::
+
+=====
+Mussa
+=====
+
+:Authors: Diane Trout
+
+.. The contents of this directory contain the source
+   for a presentation for the Caltech Bioinformatics Journal club.
+
+.. footer:: Caltech Bioinformatics Journal Club
+
+What is Mussa
+-------------
+
+.. class:: small
+
+  Mussa is a tool to search for conserved regions between several
+  sequences.
+  Hopefully regions detected as conserved will
+  highlight potentially important DNA sequence features such as
+  cis-regulatory modules, microRNA genes, and exons.
+
+  Mussa extends previous 2-way sequence comparison to N sequences.
+
+Family Tree
+-----------
+
+.. class:: small
+
+  Family Relations and Mussa started using the same sequence
+  comparison algorithm but developed in different directions.
+
+  .. image:: familytree.png
+     :alt: Gratuitous software family tree
+
+  `Family Relations`_ focused on providing a robust, usable piece
+  of software.
+
+  Mussa focused on the N-way algorithm.
+
+  .. _`Family Relations`: http://cartwheel.caltech.edu/
+
+Algorithm
+---------
+
+.. class:: small
+
+  To compute a result, Mussa conceptually uses these modules:
+
+  * Seqcomp
+  * Test Transitivity
+  * "Refinement"
+
+Seqcomp
+-------
+
+.. class:: small
+
+  The original seqcomp comparison uses a refinement of a fairly simple
+  algorithm to compare two sequences.
+
+  Given a window of size W and sequences S[0] and S[1]::
+
+    for x in range(len(S[0])-W+1):
+      for y in range(len(S[1])-W+1):
+        match = 0
+        for i in range(W):
+          if S[0][x+i] == S[1][y+i]:
+            match += 1
+        if match >= threshold:
+          save indices (x, y)
+
+  The actual algorithm only needs to compare the base that
+  "slid into" the window and account for the base that "slid out".
+
+Seqcomp
+-------
+
+.. class:: small
+
+  Assume that in this case we need 3 matches out of 4
+
+  .. image:: 4bp_window_no_match.png
+
+  In this case there are none.
+
+Seqcomp
+-------
+
+.. class:: small
+
+  Assume that in this case we need 3 matches out of 4
+
+  .. image:: 4bp_window_match.png
+
+  However, now that we have slid over one position there are 4,
+  and so we would record 0, 5
+
+Seqcomp
+-------
+
+.. class:: small
+
+
+  Once one pass is complete, one of the sequences is reverse complemented
+  and the process is repeated.
+
+  ..
container:: incremental
+
+     When extending to more than two sequences, Mussa needs to compare
+
+     (N * (N-1)) / 2 pairs of sequences
+
+Test Transitivity
+-----------------
+
+Refinement
+----------
+
+Limits
+------
+
+  describe the difference between a long distance comparison
+  and multiple closer comparisons. (should use some pictures for that)
+
+  paircomp/seqcomp
+
+  transitivity filter
+
+How To Use
+----------
+
+  Should this include pulling things from the tutorial?
+  cover sucking things out of UCSC?
+
+
diff --git a/doc/bioinfo_jc/familytree.png b/doc/bioinfo_jc/familytree.png
new file mode 100644
index 0000000..f489c44
Binary files /dev/null and b/doc/bioinfo_jc/familytree.png differ
diff --git a/doc/bioinfo_jc/familytree.svg b/doc/bioinfo_jc/familytree.svg
new file mode 100644
index 0000000..ba108a5
--- /dev/null
+++ b/doc/bioinfo_jc/familytree.svg
@@ -0,0 +1,374 @@
+ image/svg+xml
+ Seqcomp
+ Family Relations
+ Family Relations II paircomp
+ Python MUSSA
+ FLTK Mussa
+ MussaGL
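
The Seqcomp slides in the presentation describe a windowed comparison that only accounts for the base that slid into the window and the base that slid out. Below is a minimal Python sketch of that idea. It is not Mussa's actual code: the function name `seqcomp`, its parameters, and the choice to treat `threshold` as the minimum number of matching bases are assumptions made for illustration. The reverse-complement pass is left out.

```python
def seqcomp(s0, s1, window, threshold):
    """Return sorted (x, y) window starts sharing at least `threshold` bases.

    Hypothetical helper for illustration; not taken from the Mussa sources.
    """
    hits = []
    if window <= 0 or len(s0) < window or len(s1) < window:
        return hits
    # Walk each diagonal (fixed offset between the two start positions) so
    # the score can be updated incrementally instead of recounted.
    for d in range(-(len(s0) - window), len(s1) - window + 1):
        x, y = (-d, 0) if d < 0 else (0, d)
        # Full count only for the first window on this diagonal.
        score = sum(s0[x + i] == s1[y + i] for i in range(window))
        while True:
            if score >= threshold:
                hits.append((x, y))
            if x + window >= len(s0) or y + window >= len(s1):
                break
            # Account for the base that slid out and the base that slid in.
            score -= s0[x] == s1[y]
            score += s0[x + window] == s1[y + window]
            x += 1
            y += 1
    return sorted(hits)


# The two sequences from the slide figures, with 3 matches required out of 4.
hits = seqcomp("AGGCATAGCGTGCAT", "ACCGTAGTCGTTGAC", window=4, threshold=3)
print((0, 5) in hits)  # True: the window pair at (0, 5) meets the threshold
```

Walking diagonals keeps the per-window work constant regardless of W, whereas the nested-loop form on the slide recounts all W positions for every pair of start positions.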