|
ChIPSeq Peak Finder - Description coming soon.
BioHub - The BioHub is a relational database
and Python API developed at Caltech that manages associations between
numerous genomic-sequence-based and transcript-based datasets in
order to provide centralized query services and uniform data
access. BioHub was designed to permit biologists to draw on and
combine many disparate data sources for integrative analyses such as
gene network modeling. The central feature of BioHub's design is the
Sequence Registry, which relates diverse data and annotations to
individual genomic sequence features, usually genes.
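As a rough illustration of the registry idea, the sketch below relates annotations from different sources to a single sequence feature using an in-memory SQLite database. The table names, column names, and helper functions are all hypothetical and are not taken from the actual BioHub schema or API.

```python
import sqlite3


def build_registry():
    """Create a toy registry: one table of features, one of annotations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sequence_feature (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE annotation (
            id         INTEGER PRIMARY KEY,
            feature_id INTEGER NOT NULL REFERENCES sequence_feature(id),
            source     TEXT NOT NULL,
            value      TEXT NOT NULL
        );
        """
    )
    return conn


def annotate(conn, gene, source, value):
    """Attach one annotation to a gene, registering the gene if needed."""
    conn.execute(
        "INSERT OR IGNORE INTO sequence_feature (name) VALUES (?)", (gene,)
    )
    (feature_id,) = conn.execute(
        "SELECT id FROM sequence_feature WHERE name = ?", (gene,)
    ).fetchone()
    conn.execute(
        "INSERT INTO annotation (feature_id, source, value) VALUES (?, ?, ?)",
        (feature_id, source, value),
    )


def annotations_for(conn, gene):
    """Return (source, value) pairs for a gene, in insertion order."""
    rows = conn.execute(
        "SELECT a.source, a.value FROM annotation a "
        "JOIN sequence_feature f ON f.id = a.feature_id "
        "WHERE f.name = ? ORDER BY a.id",
        (gene,),
    )
    return rows.fetchall()
```

Because every annotation points at one registered feature, a single query can gather data from otherwise unrelated sources for the same gene.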
Cistematic - The core of
Cistematic is a Python package with a rich set of APIs that
simplify the collection and analysis of candidate cis-elements from a
number of different motif-finding and phylogenetic footprinting
programs such as MEME, AlignACE, Co-Bind, and FootPrinter. Cistematic
assesses the significance of each motif by comparing it to its
prevalence genome-wide. A web front-end using Python Server Pages is
built on top of the Webware application server, which allows for an
interactive setup and exploration of the results.
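One common way to compare a motif's occurrence in a gene set against its genome-wide prevalence is a hypergeometric enrichment test. The sketch below is only an illustration of that general idea; the description above does not say which statistic Cistematic uses, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_in_set, set_size, hits_in_genome, genome_size):
    """Probability of seeing at least `hits_in_set` motif-bearing genes
    in a random draw of `set_size` genes, given that `hits_in_genome`
    of the `genome_size` genes carry the motif (hypergeometric tail)."""
    upper = min(set_size, hits_in_genome)
    total = comb(genome_size, set_size)
    tail = sum(
        comb(hits_in_genome, k) * comb(genome_size - hits_in_genome, set_size - k)
        for k in range(hits_in_set, upper + 1)
    )
    return tail / total
```

A small p-value suggests the motif is more frequent in the gene set than its genome-wide frequency would predict by chance.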
CompClust - CompClust is a
Python package written using the pyMLX and IPlot APIs. It provides
software tools to explore and quantify relationships between
clustering results. It was developed largely around the
needs of microarray data analysis but could easily be used in other
domains. Briefly, pyMLX provides for efficient and
convenient execution of many clustering algorithms using an extensible
library of algorithms. It also provides many-to-many linkages between
data features and annotations (such as cluster labels, gene names,
gene ontology information, etc.). These linkages are persistent
through data manipulations. IPlot provides an abstraction of the
plotting process in which any arbitrary feature or derived feature of
the data can be projected onto any feature of the plot, including the
X,Y coordinates of points, marker symbol, marker size, marker/line
color, etc. These plots are intrinsically linked to the dataset, the
View and the Labeling classes found within pyMLX.
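To make "quantifying relationships between clustering results" concrete, the sketch below builds a contingency table between two labelings of the same items and computes the Rand index, a standard pair-agreement score. It uses only the Python standard library; it does not use the pyMLX or IPlot APIs, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def contingency_table(labels_a, labels_b):
    """Count how many items fall into each (cluster in A, cluster in B) cell."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    return Counter(zip(labels_a, labels_b))


def rand_index(labels_a, labels_b):
    """Fraction of item pairs that both clusterings treat the same way
    (together in both, or apart in both)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

A Rand index of 1.0 means the two clusterings partition the items identically, even if the cluster labels themselves differ.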
MAD: Motif Analysis and Detection - MAD is a Python package that provides tools for locating and
analyzing candidate regulatory motifs (factor binding sites) using
PWMs. MAD provides tools for the visualization of motifs in genomic
sequences (intergenic or otherwise) with their appropriate
significance. It also provides algorithms for optimizing (refining)
motif PWMs based on clustering results (coexpression), phylogenetic
information (orthology across genomes), and cooperativity with other
motifs.
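For readers unfamiliar with PWM-based motif location, the following is a minimal sketch of the general technique: convert base counts into log-odds scores, then slide the matrix along a sequence and report windows above a threshold. It is not MAD's API; the function names, the uniform background, and the pseudocount are assumptions made for illustration, and the sequence is assumed to contain only uppercase A, C, G, and T.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores
    against a uniform background (an assumption of this sketch)."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append(
            {
                b: math.log2(((column[b] + pseudocount) / total) / background)
                for b in BASES
            }
        )
    return pwm


def scan(sequence, pwm, threshold):
    """Return (offset, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((offset, score))
    return hits
```

In practice the threshold would be chosen from a significance estimate rather than fixed by hand.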
Mussa - Mussa is an N-way
version of the FamilyRelations/secomp 2-way
comparative sequence analysis programs. Given DNA sequences from N
species, Mussa uses all possible pairwise comparisons to derive an
N-way comparison. For example, given sequences 1, 2, 3, and 4, Mussa
makes six 2-way comparisons: 1vs2, 1vs3, 1vs4, 2vs3, 2vs4, and 3vs4.
It then compares all the links between these comparisons, saving
those that satisfy a transitivity
requirement. The saved paths are then displayed in an interactive viewer.
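The pairwise enumeration and the transitivity filter can be sketched in a few lines of Python. This is an illustration of the idea only, not Mussa's implementation: the representation of links as sets of matched position pairs, and both function names, are hypothetical.

```python
from itertools import combinations


def pairwise_comparisons(n_sequences):
    """Return every unordered pair of 1-based sequence indices."""
    return list(combinations(range(1, n_sequences + 1), 2))


def transitive_paths(links, n_sequences):
    """Keep only N-way paths whose positions are linked in every pair.

    `links[(i, j)]` (with i < j) is a set of (pos_i, pos_j) matches
    found by the 2-way comparison of sequences i and j.
    """
    # Seed the paths with the matches between sequences 1 and 2.
    paths = [tuple(match) for match in sorted(links[(1, 2)])]
    for k in range(3, n_sequences + 1):
        extended = []
        for path in paths:
            # Positions in sequence k that are linked from sequence 1.
            candidates = sorted(
                pk for (p1, pk) in links[(1, k)] if p1 == path[0]
            )
            for pk in candidates:
                # Transitivity: every earlier sequence must also link to pk.
                if all((path[i - 1], pk) in links[(i, k)] for i in range(2, k)):
                    extended.append(path + (pk,))
        paths = extended
    return paths
```

For four sequences, `pairwise_comparisons(4)` yields the six pairs listed above; a path survives `transitive_paths` only if all of its pairwise links are present.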
Pymerase - Pymerase is
a tool intended to generate a Python object model, relational
database, and an object-relational model connecting the two. However,
it has been extended to also output web pages, GUI widgets,
tab-delimited text parsers, etc. It can be easily extended to output
whatever else you might like. We are currently using Pymerase for BioHub development and other projects.
Sigmoid - The Sigmoid
project is intended to produce a database of cellular signaling
pathways and models thereof, to marshal the major forms of data and
knowledge required as input to cellular modeling software and also to
organize the outputs. Such cellular signaling and regulatory pathways
are commonly hand-drawn in biological literature as an aid to
intuitive understanding. Pathway databases can provide the same
assistance in the context of attempts to achieve a quantitative
understanding of cellular processes by numerical simulation. They can
also serve as an aid to capturing and querying both expert knowledge
and heterogeneous data sets pertaining to pathways. Cell model
databases are a subject of current research. SIGMOID works at the
interface of these two areas.
|