woldlab.caltech.edu Git - htsworkflow.git/log

Merge in the library list, detail, and results downloading feature from
the Caltech live site.

There are several components in the frontend tree to render the pages.
In addition, this adds some helper functions in pipelines.eland
to simplify computing summary statistics for an eland lane.

I also needed to merge in the generator-based makebed code for
returning the files to the Django database.

To use this, the settings file in this branch will need a variable
RESULT_HOME_DIR to be set.

Diane Trout [Fri, 13 Feb 2009 01:42:06 +0000 (01:42 +0000)]

use the compression handling auto-opener for our eland files

Diane Trout [Thu, 12 Feb 2009 22:38:09 +0000 (22:38 +0000)]

make our API docstrings more epydoc friendly

Diane Trout [Thu, 12 Feb 2009 22:37:08 +0000 (22:37 +0000)]

Add load_pipeline_run_xml, a little function that feeds the xml file into
ElementTree and grabs the useful root

Diane Trout [Thu, 5 Feb 2009 00:06:39 +0000 (00:06 +0000)]

Drop 'using %s as cwd' down to just debug level.
It was getting too annoying watching it scroll by constantly.

Diane Trout [Fri, 30 Jan 2009 20:47:35 +0000 (20:47 +0000)]

Extend command line configuration parsing and add config file parsing
for finding the location of our database and sequence archive directories.
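
A minimal sketch of how command line options might be layered over a config file, using Python 3's argparse and configparser. The section name "htsworkflow", the option names "database" and "archive", and the default file name are hypothetical; the log does not show the real ones.

```python
import argparse
import configparser
import os


def load_settings(argv, config_path="~/.htsworkflow.ini"):
    """Return settings from a config file, overridden by command line options."""
    # Hypothetical setting names; the real script's names are not in the log.
    settings = {"database": None, "archive": None}

    parser = configparser.ConfigParser()
    # read() silently skips files that do not exist.
    parser.read([os.path.expanduser(config_path)])
    if parser.has_section("htsworkflow"):
        for key in settings:
            if parser.has_option("htsworkflow", key):
                settings[key] = parser.get("htsworkflow", key)

    cli = argparse.ArgumentParser()
    cli.add_argument("--database")
    cli.add_argument("--archive")
    args = cli.parse_args(argv)
    # Anything given on the command line wins over the config file.
    for key in settings:
        value = getattr(args, key)
        if value is not None:
            settings[key] = value
    return settings
```

Reading the file first and applying the command line second keeps the precedence rule in one place.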

Diane Trout [Fri, 30 Jan 2009 02:15:57 +0000 (02:15 +0000)]

Try to make runfolder results extraction more robust.
If an IPAR or firecrest directory is missing some of the important
matrix files, that implies there isn't actually a valid run present.
This patch will then (hopefully) issue a warning and skip that analysis
run.

I also added an option to scripts/runfolder to allow a user to specify
where the extracted results should go.

One questionable thing is that for one analysis some of the lanes
were run as sequence and not as an eland analysis, so where I expected
all the lanes to have an eland genome, these lanes don't.
I hope that the code doesn't lose the index after serializing and
deserializing that chunk example.

Diane Trout [Fri, 30 Jan 2009 01:51:50 +0000 (01:51 +0000)]

Update to not hard code the config file name and the error message
for when we don't find it

Diane Trout [Sat, 24 Jan 2009 00:24:18 +0000 (00:24 +0000)]

insert code to do ~ home directory expansion
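
For reference, the standard library call for this is os.path.expanduser. The wrapper name below is hypothetical and only shows the call; it is not the code from the commit.

```python
import os


def expand_path(path):
    """Expand a leading ~ or ~user to the corresponding home directory."""
    # Paths that do not start with ~ are returned unchanged.
    return os.path.expanduser(path)
```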

Diane Trout [Fri, 23 Jan 2009 02:23:21 +0000 (02:23 +0000)]

Add in Rami's report template, and adjust the paths to use "reports" instead
of "htsw_reports"

Diane Trout [Fri, 23 Jan 2009 02:21:09 +0000 (02:21 +0000)]

Add id as an AutoField(primary_key=True) field and remove the pk from
library_id.

Stanford decided to use library_id as a text field so they could use
library IDs like "SL100". Caltech just used the raw sql id, so the
foreign key reference in experiments_flowcells was expecting a numeric
id, but since the model had the text field as the primary key things
didn't work.

Diane Trout [Wed, 21 Jan 2009 02:50:22 +0000 (02:50 +0000)]

Merge in Rami's changes from last Friday.

Most of the admin pages work, though there's a WSGI error with the reports.
I'll try to figure it out tomorrow.

The biggest difference between the Caltech trunk and Stanford schemas right now
is that Caltech is using made_for as a foreign key, and Stanford is using it
as a text field.

Diane Trout [Wed, 14 Jan 2009 01:18:42 +0000 (01:18 +0000)]

add some testing code for the runner daemon

Diane Trout [Wed, 14 Jan 2009 01:17:16 +0000 (01:17 +0000)]

add empty admin.py for eland_config app

Diane Trout [Wed, 14 Jan 2009 01:12:47 +0000 (01:12 +0000)]

Merged much of the Stanford htsworkflow frontend into trunk.
Updated to be compatible with Django 1.0.

A big change for the 1.0 compatibility is that the Admin class that was
attached to models was moved into a separate file, admin.py.

I probably munged some of the fieldset formatting in the conversion process.

Diane Trout [Thu, 8 Jan 2009 20:12:03 +0000 (20:12 +0000)]

This is a partial merge of the Stanford branch with the Caltech branch of
the web application. It doesn't work correctly yet: the libraries admin page
is broken and lacks the ability to browse the 'made_for' column.

This is based on a merge that I started a few months ago but hadn't finished.
I'll need to check for more updates from their branch soon.

During the process I decided it would be a good idea to update to Django 1.0,
which is going to make things even more unstable, so I thought I should
check this work in progress in before continuing.

Diane Trout [Tue, 6 Jan 2009 02:05:10 +0000 (02:05 +0000)]

Look in Temp directories for some of the files we have historically
used for our summary reports.

Version 1.1rc1 of the gapipeline started moving some of the files
into /Temp subdirectories of bustard and gerald.

Diane Trout [Wed, 24 Dec 2008 23:39:31 +0000 (23:39 +0000)]

Handle paired-end eland files.

This required changing the ELAND class to hold a list of dictionaries
instead of its previous implementation, where it was exporting an internal
dictionary of the lanes.

I decided to directly show the internal list and to remove the previous
dictionary methods to make it more obvious when code was expecting
the previous behavior.

Also a saved runfolder will now have eland files of the form
s_<lane id>_<end id>.

Internally the end is 0 or 1; I tried to make the display show 1 or 2 for
the user's benefit though.

Diane Trout [Wed, 24 Dec 2008 23:34:23 +0000 (23:34 +0000)]

remove a debug print statement

Diane Trout [Wed, 24 Dec 2008 23:33:51 +0000 (23:33 +0000)]

Add test cases for alphanum sort

Diane Trout [Wed, 24 Dec 2008 23:33:14 +0000 (23:33 +0000)]

Support sorting numbers along with the alphanumeric strings

Also, I cleaned up the indentation a bit.
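
One common way to get this behaviour is a sort key that splits each value into text and integer chunks. This is a sketch, not necessarily the implementation in this commit; the function name and sample values are illustrative.

```python
import re


def alphanum_key(value):
    """Split a value into text and integer chunks so digits compare numerically."""
    # str() lets plain numbers be sorted alongside strings.
    parts = re.split(r"(\d+)", str(value))
    # With a capturing group, odd positions always hold the digit runs, so
    # text is only ever compared with text and numbers with numbers.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


names = ["s_10", "s_2", 3, "s_1"]
ordered = sorted(names, key=alphanum_key)
```

Here a plain string sort would put "s_10" before "s_2"; with the key, "s_2" comes first.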

Diane Trout [Tue, 23 Dec 2008 02:06:27 +0000 (02:06 +0000)]

change from hand coded formatting functions to the built in python
C-style printf formatting

Diane Trout [Tue, 23 Dec 2008 02:05:35 +0000 (02:05 +0000)]

Use the right URLError attribute names for error messages

Diane Trout [Mon, 22 Dec 2008 22:50:46 +0000 (22:50 +0000)]

update make-library-tree script with new default location

Diane Trout [Mon, 22 Dec 2008 20:44:15 +0000 (20:44 +0000)]

fix the multi-eland parser to strip off extensions and not the last 3
characters of the filename.
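
The difference can be illustrated with os.path.splitext, which removes the final extension whatever its length. The helper name and the file names are made up for the example.

```python
import os


def strip_extension(filename):
    """Drop the final extension, however long it is."""
    root, _ext = os.path.splitext(filename)
    return root


# Slicing off three characters only works for two-letter extensions:
# ".txt" is four characters long, so a stray "." is left behind.
sliced = "s_1_eland_multi.txt"[:-3]
stripped = strip_extension("s_1_eland_multi.txt")
```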

Diane Trout [Mon, 22 Dec 2008 20:43:32 +0000 (20:43 +0000)]

clean up the logic for deciding the output filename when using stdin
as the input

Diane Trout [Fri, 19 Dec 2008 00:54:06 +0000 (00:54 +0000)]

Add command to report path to make it easier to figure out which goat_pipeline is running

Diane Trout [Thu, 18 Dec 2008 23:43:38 +0000 (23:43 +0000)]

rename config file to something that doesn't include the read length
since that has been changing.

also a minor code clean up.

Diane Trout [Wed, 10 Dec 2008 01:00:25 +0000 (01:00 +0000)]

The summary parsing code now seems to handle paired end runs.
This required changing how the lane_results were being stored:
previously it was a dictionary indexed by lane, now it is a list
of dictionaries, where the list index indicates which "end" of
a paired end run it is (0 is the first, 1 is the second).

Also, I got tired of being forced to use strings for the lane index
by ElementTree and modified the code so it converts the strings
required by ElementTree to integers for our internal dictionaries.

Diane Trout [Tue, 9 Dec 2008 01:19:23 +0000 (01:19 +0000)]

Test 1.1rc1 style runs, which unfortunately require a hack for parsing
the summary.htm files since Illumina's HTML is invalid.
They forgot to use &lt; when writing <=. Most web browsers will ignore
it, but ElementTree is pickier.

Also as of this commit the summary parsing code still doesn't understand
paired end runs so the paired end summary file parsing tests still fail.
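
A sketch of the kind of hack described, assuming the only problem in the markup is a bare "<=". The function name and sample markup are illustrative, and the real Summary.htm may need more repair than this.

```python
from xml.etree import ElementTree


def parse_summary(text):
    """Parse summary markup, escaping a bare '<=' first."""
    # A '<' directly followed by '=' can never start a valid tag, so it is
    # safe to rewrite it as an escaped less-than sign.
    repaired = text.replace("<=", "&lt;=")
    return ElementTree.fromstring(repaired)


bad = "<table><tr><td>% Error Rate <= 2</td></tr></table>"
try:
    ElementTree.fromstring(bad)
    strict_parse_failed = False
except ElementTree.ParseError:
    strict_parse_failed = True

tree = parse_summary(bad)
```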

Diane Trout [Wed, 3 Dec 2008 22:25:26 +0000 (22:25 +0000)]

make-library-tree is a tool to maintain caltech's version of our solexa
results archive.

Diane Trout [Wed, 3 Dec 2008 22:24:29 +0000 (22:24 +0000)]

Add test code to see if runfolder can handle something that looks like a
paired end run.

Diane Trout [Wed, 3 Dec 2008 22:22:31 +0000 (22:22 +0000)]

Add code to create a paired end Summary.htm file

Diane Trout [Wed, 3 Dec 2008 22:21:16 +0000 (22:21 +0000)]

Store the bustard pathname when searching for run folders
This was needed so the srf file can use the same runfolder scanning
code as the --extract-results feature.

Diane Trout [Fri, 21 Nov 2008 01:15:27 +0000 (01:15 +0000)]

Use the get_runs from htsworkflow.pipelines.runfolder.
On the plus side, this means it'll handle IPAR files. On the downside,
it means that the srf program will crash if there's something wrong with
the summary.htm file or if there's an IPAR directory that doesn't have
a run in it.
(I really need to add some code to get_runs to skip over IPAR directories that
are being ignored.)

Diane Trout [Fri, 14 Nov 2008 19:04:59 +0000 (19:04 +0000)]

Forgot to change an import of htsworkflow.pipeline to htsworkflow.pipelines.

Diane Trout [Thu, 6 Nov 2008 22:49:40 +0000 (22:49 +0000)]

Updated ipar_100 test case to deal with using U0/1/2 vs R0/1/2
(my first implementation was to just dump all of the multi reads into
U0/1/2).

Diane Trout [Thu, 6 Nov 2008 22:39:24 +0000 (22:39 +0000)]

Process eland extended (or multi) read files.

This also updates the report tools to be compatible with 1.0.
For multi reads I mapped 0/1/2 mismatch reads to U0/U1/U2 if the number of
reads equaled 1 (for each category separately) and I mapped read counts > 1 and < 255
to R0/R1/R2.

Unfortunately 1.1rc1 changed the summary file, so this patch is not
compatible with it yet.
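
The mapping above might look roughly like this. Two things here are assumptions rather than anything stated in the commit: that the lowest mismatch category with any hits decides the code, and that a count of 255 or more is left unclassified. The function name is hypothetical.

```python
def classify_multi_read(match_counts):
    """Map (0-, 1-, 2-mismatch) hit counts to a U0/U1/U2 or R0/R1/R2 code.

    Returns None when the read cannot be classified.
    """
    for mismatches, count in enumerate(match_counts):
        if count == 0:
            continue  # no hits at this mismatch level, try the next one
        if count == 1:
            return "U%d" % mismatches  # unique hit
        if count < 255:
            return "R%d" % mismatches  # repeat hit
        return None  # assumption: 255 or more hits stays unclassified
    return None
```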

Diane Trout [Thu, 30 Oct 2008 22:28:01 +0000 (22:28 +0000)]

The htsworkflow.pipelines.gerald module was getting too large,
so I broke the portions that analyzed the Summary.htm file and
the eland_result files into separate modules in anticipation
of extending the eland code to handle some of the newer eland
result file types.

Diane Trout [Thu, 30 Oct 2008 22:03:12 +0000 (22:03 +0000)]

Add support for scanning for results in the IPAR directory.

The field that was the firecrest class in PipelineRun is now the
"image_analysis" field and can be either firecrest or ipar.

I also extracted some of the common functions out of the runfolder test
modules and added them to a separate "simulate_runfolder" module.

Diane Trout [Thu, 30 Oct 2008 21:59:56 +0000 (21:59 +0000)]

Add "_slow" to the end of the queuecommand test functions
this allows "nosetests --exclude=slow" to skip them.

Diane Trout [Tue, 28 Oct 2008 21:25:00 +0000 (21:25 +0000)]

update setup.py for some package renames and some missing scripts

Diane Trout [Tue, 21 Oct 2008 19:44:25 +0000 (19:44 +0000)]

Merge in new modules from htsworkflow branch.

However I renamed things to simpler names.

analys_track -> analysis
exp_track -> experiments
fctracker -> samples
htsw_reports -> reports

As a result this check-in probably won't work, as I haven't finished
updating all the imports.

Diane Trout [Tue, 21 Oct 2008 19:39:50 +0000 (19:39 +0000)]

Merge in model changes to fctracker from htsworkflow branch

Diane Trout [Tue, 21 Oct 2008 19:02:49 +0000 (19:02 +0000)]

update scripts for the pipeline to pipelines module rename

Diane Trout [Wed, 15 Oct 2008 19:49:34 +0000 (19:49 +0000)]

rename pipeline to pipelines to imply that we can process more than just illumina.

Diane Trout [Wed, 15 Oct 2008 18:59:34 +0000 (18:59 +0000)]

Rename trunk from gaworkflow to htsworkflow (and update all of the imports)
Fix the queuecommands test script to deal with the 1 sec delay hack

Diane Trout [Thu, 25 Sep 2008 00:04:19 +0000 (00:04 +0000)]

solexa2srf likes to produce output, so my trick of watching the
sockets to block until the process ends didn't work.

This patch inserts a simple sleep(1) (one second) into the code that
waits for the jobs to finish, to prevent the queue manager from rapidly
spinning.

It should probably be fixed with a better way of monitoring for when
a process finishes.
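
A rough sketch of the polling approach: start at most a fixed number of subprocesses, check them with poll(), and sleep between checks so the loop doesn't spin. The function name and parameters are hypothetical and this is not the QueueCommands code itself.

```python
import subprocess
import sys
import time


def run_queue(commands, max_jobs=2, poll_interval=1.0):
    """Run argument lists as subprocesses, at most max_jobs at a time.

    Returns the exit codes in the order the commands were given.
    """
    pending = list(enumerate(commands))
    running = {}
    exit_codes = [None] * len(commands)
    while pending or running:
        # Start new jobs while there is a free slot.
        while pending and len(running) < max_jobs:
            index, args = pending.pop(0)
            running[index] = subprocess.Popen(args)
        # poll() returns None while a process is still running.
        for index, proc in list(running.items()):
            code = proc.poll()
            if code is not None:
                exit_codes[index] = code
                del running[index]
        if running:
            time.sleep(poll_interval)  # pause instead of spinning
    return exit_codes


if __name__ == "__main__":
    # Two tiny jobs that just exit with different codes.
    jobs = [[sys.executable, "-c", "raise SystemExit(%d)" % n] for n in (0, 3)]
    print(run_queue(jobs, max_jobs=1, poll_interval=0.1))
```

The sleep trades a little latency (up to one interval per finished job) for not burning a CPU while waiting.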

Diane Trout [Thu, 25 Sep 2008 00:02:25 +0000 (00:02 +0000)]

Use _ as the field separator in srf file names. (Using a uniform separator
makes it easier to process the files later. Not to mention that avoiding
characters that are "special", like :, is a good idea for multi-platform
compatibility.)

Diane Trout [Thu, 18 Sep 2008 23:28:58 +0000 (23:28 +0000)]

Use queuecommands.run not queuecommands.start_job to actually
wait to launch additional processes

Diane Trout [Thu, 18 Sep 2008 22:53:26 +0000 (22:53 +0000)]

Be a little more informative about how many processes are left to run
and what the exit code was in queuecommands.py

Diane Trout [Fri, 5 Sep 2008 21:56:38 +0000 (21:56 +0000)]

extract status field out of flowcell name.

For gaworkflow we abused the schema and stored the flow cell status
in the flow cell name field. This patch updates my sqlite interface
to the fctracker db to split that field.

Diane Trout [Fri, 29 Aug 2008 16:51:24 +0000 (16:51 +0000)]

Add support for converting multi-eland files from pipeline 0.3 to
bedfiles

Diane Trout [Fri, 29 Aug 2008 16:51:23 +0000 (16:51 +0000)]

insert stub clean_runs function to list roughly what I think I can
delete before compressing the runfolder

Diane Trout [Fri, 15 Aug 2008 22:46:46 +0000 (22:46 +0000)]

Improve code to extract runfolder name from the path to the runfolder.

This version will actually convert relative paths into an absolute path
before extracting the runfolder name, as well as grabbing the right name
if there's a trailing /
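
The idea can be sketched with os.path.abspath followed by os.path.basename. The function name and the runfolder name in the comment are made up for illustration.

```python
import os


def runfolder_name(path):
    """Return the final directory name of a runfolder path."""
    # abspath resolves relative paths against the current directory and also
    # drops any trailing separator, so basename does not return ''
    # for a path such as "/data/080815_HWI-EAS229_0001/".
    return os.path.basename(os.path.abspath(path))
```

Calling os.path.basename directly on a path with a trailing / returns an empty string, which is the case the commit fixes.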

Diane Trout [Thu, 14 Aug 2008 20:58:15 +0000 (20:58 +0000)]

In trying to get scripts/srf to work I needed to set subprocess.Popen to
shell=True. The end result is that, at least on Linux hosts,
passing a list of arguments to Popen doesn't work very well; Popen
needs a string.

Perhaps a better solution would be for queuecommand to take a
shell parameter and, if that's true, join the arguments into a string.

For the moment, though, I just converted my test case to pass a string
instead of a list.
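
The "better solution" floated above could look something like this. It is only a sketch: the function name launch is hypothetical, and plain joining does no quoting, so arguments containing spaces would need extra care.

```python
import subprocess


def launch(args, shell=False):
    """Start a subprocess, joining list arguments when a shell is requested."""
    if shell and not isinstance(args, str):
        # With shell=True on POSIX only the first list item is used as the
        # command line, so hand the shell one string instead. Joining with
        # spaces leaves globs such as *.txt for the shell to expand.
        args = " ".join(args)
    return subprocess.Popen(args, shell=shell, stdout=subprocess.PIPE)


proc = launch(["echo", "hello", "world"], shell=True)
output, _ = proc.communicate()
```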

Diane Trout [Thu, 14 Aug 2008 20:58:15 +0000 (20:58 +0000)]

refactor code to make a runfolder out of the UnitTest class.
I did it so I could more easily make a mini-runfolder for developing
code that needed to scan the runfolder.

Diane Trout [Thu, 14 Aug 2008 00:09:39 +0000 (00:09 +0000)]

we might as well automatically save the Summary.htm file as well

Diane Trout [Thu, 14 Aug 2008 00:09:09 +0000 (00:09 +0000)]

Utility to create srf files from a bustard directory

This version works, as long as you launch it in the bustard directory
in question. There seems to be some messiness in how file globs are
expanded when a list of arguments is passed to Popen with shell=True.

I had to switch from passing a list of arguments to Popen to passing a string,
and I'm still not sure if any of the code that tries to change the directory
to the bustard directory actually worked correctly
(which is why it only works when launching from the bustard directory).

Diane Trout [Mon, 11 Aug 2008 23:22:15 +0000 (23:22 +0000)]

A bit of refactoring toward making the run progress report code work
by walking the directory instead of just watching via pyinotify.

Mostly this involved moving the report formatting code to
someplace a little more shared, and moving the thread that watches
the directory tree.

Diane Trout [Wed, 16 Jul 2008 00:46:45 +0000 (00:46 +0000)]

The older pipeline runs had a Phi-X control lane which we didn't
run eland against, so the total number of eland entries in
the GERALD config.xml file was less than 8. So relax testing
that constraint.

Diane Trout [Tue, 15 Jul 2008 00:26:48 +0000 (00:26 +0000)]

Provide cross referencing information to the libraries to help find
which lanes provide supporting information

Diane Trout [Mon, 7 Jul 2008 22:19:51 +0000 (22:19 +0000)]

Finish updating the Summary parsing file to handle the new 0.3 format.
In addition, I split test_runfolder into one that tests 0.2.6 files and
one that tests 0.3 files.

Diane Trout [Thu, 3 Jul 2008 00:16:50 +0000 (00:16 +0000)]

Partially handle the changed Summary.htm file from the 0.3 version of the
GAPipeline.

This update is incomplete as I'm pretty sure the xml serialization code
for the run xml file will break. However it does generate the summary
report for both the old summary file and the new post 0.3 file.

I also need to add unit tests for parsing and serializing the 0.3
file format.

Diane Trout [Tue, 24 Jun 2008 00:36:17 +0000 (00:36 +0000)]

Detect if our watch is on a mount point.

If we're on something that is unmounted, keep watching until there's a
new mount. Once something has been remounted, restart the watch.
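
The waiting half of this can be sketched with os.path.ismount. The function name, the polling interval, and the max_checks escape hatch are assumptions for the example; restarting the actual watch is left out.

```python
import os
import time


def wait_for_mount(path, poll_interval=5.0, max_checks=None):
    """Block until path is a mount point; return False if checks run out."""
    checks = 0
    while not os.path.ismount(path):
        checks += 1
        if max_checks is not None and checks >= max_checks:
            return False
        time.sleep(poll_interval)
    # The caller would restart its directory watch at this point.
    return True
```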

Diane Trout [Tue, 17 Jun 2008 00:25:03 +0000 (00:25 +0000)]

Add QueueCommands, a class that allows controlling how many
processes to run simultaneously.

I still need to add a driver script to handle getting jobs from the
user.

It's mostly in so I can control launching the solexa2srf commands for
submitting stuff to the SRA.

Diane Trout [Fri, 6 Jun 2008 21:02:36 +0000 (21:02 +0000)]

don't use os.path.normpath when pathname is null in PipelineRun

Diane Trout [Thu, 5 Jun 2008 22:24:22 +0000 (22:24 +0000)]

add --extract-results to scripts/runfolder
this will build a directory tree with <flowcellID>/<cycle count>/
with the various eland result files, run_*.xml files, etc.

Diane Trout [Thu, 29 May 2008 00:01:50 +0000 (00:01 +0000)]

Some of the older flow cells used a default genome for eland instead
of specifying the genome path for each lane.

This patch will look up in the chipwidedefaults for the eland_genome if
it isn't found in the lane specific parameters
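
The fallback amounts to a two-step lookup. Representing the lane parameters and chip-wide defaults as plain dictionaries is an assumption made for this sketch, as are the function name and the sample paths.

```python
def eland_genome(lane_params, chip_wide_defaults):
    """Return the lane's eland_genome, else the chip-wide default, else None."""
    if "eland_genome" in lane_params:
        return lane_params["eland_genome"]
    return chip_wide_defaults.get("eland_genome")


defaults = {"eland_genome": "/genomes/default"}  # hypothetical paths
per_lane = eland_genome({"eland_genome": "/genomes/lane"}, defaults)
fallback = eland_genome({}, defaults)
```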

Diane Trout [Wed, 28 May 2008 00:42:19 +0000 (00:42 +0000)]

Compute all the details needed to create our 25bp rerun given just
a runfolder.
(This assumes more than the --gerald/-o version that I first
implemented, which is still available).

Now you can give rerun_eland a runfolder name, and it will (if there's
only 1 run found by pipeline.runfolder) extract the bases from that
into a new Data/C1-<length+1> directory and should launch eland.

Diane Trout [Tue, 27 May 2008 22:52:13 +0000 (22:52 +0000)]

ignore more *.py[co~] files in some of our test directories

Diane Trout [Fri, 23 May 2008 21:37:05 +0000 (21:37 +0000)]

add --run-xml to runfolder so you can generate summary reports from a
previously analyzed runfolder

Diane Trout [Fri, 23 May 2008 21:33:07 +0000 (21:33 +0000)]

Update pipeline.gerald to handle eland_result files that have been bzipped.
Also I added my opener module which will try to guess the right
compression utility for a file.
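
An opener like this can guess the format from the first few bytes of the file. Whether the real opener module uses magic bytes or file extensions isn't stated in the log, so the approach below is an assumption, and it only handles reading.

```python
import bz2
import gzip


def autoopen(filename, mode="rt"):
    """Open filename for reading, choosing a decompressor from its magic bytes."""
    with open(filename, "rb") as stream:
        magic = stream.read(3)
    if magic.startswith(b"\x1f\x8b"):  # gzip signature
        return gzip.open(filename, mode)
    if magic == b"BZh":  # bzip2 signature
        return bz2.open(filename, mode)
    return open(filename, mode)
```

Callers can then treat compressed and uncompressed eland_result files the same way.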

Brandon King [Mon, 19 May 2008 22:49:09 +0000 (22:49 +0000)]

Beginning of consolidation with trunk/stanford variants of the database.

Diane Trout [Thu, 15 May 2008 00:32:47 +0000 (00:32 +0000)]

add rerun_eland.py which extracts sub-sequences from eland files and runs
eland on them with a new sequence length.

The script also helpfully uses the gerald config file to figure out the
correct genome path.

Diane Trout [Wed, 14 May 2008 23:00:47 +0000 (23:00 +0000)]

separate computing the sample/lane_id names from calculating read counts

the read count computation takes a long time, and if we just want to
quickly access some information from the gerald directory it was really
annoying to wait for it to finish.

Brandon King [Wed, 14 May 2008 00:01:27 +0000 (00:01 +0000)]

v0.2.0 progress
* Commented out eland_result table as it is not being used by either site and Stanford has implemented something that is probably more useful, so we will likely import that.
* Person has been renamed to UserProfile and has been integrated with the user profiles feature of Django (http://www.djangobook.com/en/1.0/chapter12/#cn222), which allows you to get access to the "profile" information by using user.get_profile().
* Added Lab, which just contains a name. This will be used to implement user/lab level access to Flowcell/Library information.

Diane Trout [Tue, 13 May 2008 16:36:55 +0000 (16:36 +0000)]

add additional debugging logging to retrieve_config and configure_pipeline
to help figure out why it was failing. (which turned out to originally be
because of user error)

Diane Trout [Tue, 13 May 2008 16:17:11 +0000 (16:17 +0000)]

logging.basicConfig should only be in top level scripts.
Using basicConfig in a module causes problems because it's likely
to override the user's logging.basicConfig (from some other
top level script that's using logging correctly).
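
The usual pattern looks like this: modules only ask for a named logger, and the script that owns the process configures logging once. The logger name and function names are placeholders.

```python
import logging

# Library code: get a named logger and never call logging.basicConfig here.
log = logging.getLogger("htsworkflow.example")  # hypothetical logger name


def retrieve(name):
    """Library function that logs but does not configure logging."""
    log.debug("retrieving %s", name)
    return name


def main():
    # Only the top level script decides how messages are formatted and filtered.
    logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s %(name)s: %(message)s")
    log.info("starting")
    return retrieve("config")


if __name__ == "__main__":
    main()
```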

Diane Trout [Sat, 10 May 2008 04:32:25 +0000 (04:32 +0000)]

make it possible to include all alignments, not just the ones that match
chromosomes.

Diane Trout [Sat, 10 May 2008 00:18:41 +0000 (00:18 +0000)]

makebed is a script too

Diane Trout [Sat, 10 May 2008 00:18:24 +0000 (00:18 +0000)]

Keep track of sample_name and lane_id computed from the eland
filename.

Perhaps I should have more code checking to make sure it's of the form
s_?_eland_result.txt
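
Such a check could be done with a regular expression. This is a sketch: the function name is hypothetical, and returning the lane as an integer and restricting it to a single digit are choices made for the example.

```python
import os
import re

# Matches names of the form s_?_eland_result.txt, e.g. s_4_eland_result.txt
ELAND_RESULT = re.compile(r"^(?P<sample>[^_]+)_(?P<lane>\d)_eland_result\.txt$")


def parse_eland_filename(pathname):
    """Return (sample_name, lane_id) or raise ValueError for unexpected names."""
    name = os.path.basename(pathname)
    match = ELAND_RESULT.match(name)
    if match is None:
        raise ValueError("not an eland result file: %r" % (name,))
    return match.group("sample"), int(match.group("lane"))
```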

Diane Trout [Fri, 9 May 2008 03:51:30 +0000 (03:51 +0000)]

Make the runfolder splitting patch a bit more Python 2.4 compatible.
Python 2.4 doesn't have datetime.strptime, nor does it have
a built-in copy of ElementTree in the xml.etree namespace.
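
The usual workarounds are an import fallback for ElementTree and building the datetime from time.strptime. The sketch below runs on current Python as well; the helper name parse_date is made up.

```python
import time
from datetime import datetime

try:
    from xml.etree import ElementTree  # in the standard library since Python 2.5
except ImportError:
    from elementtree import ElementTree  # separately installed package on 2.4


def parse_date(text, fmt):
    """Equivalent of datetime.strptime that avoids the classmethod."""
    # time.strptime returns a struct_time; its first six fields are
    # year, month, day, hour, minute, second.
    return datetime(*time.strptime(text, fmt)[:6])
```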

Diane Trout [Fri, 9 May 2008 03:22:39 +0000 (03:22 +0000)]

update the setup.py file to the new name for the runfolder script

Diane Trout [Fri, 9 May 2008 03:21:16 +0000 (03:21 +0000)]

Move runfolder analysis classes out of scripts/runfolder.py into separate files.
Also rename runfolder.py to runfolder.

This was a really annoying patch to make. I wanted to do two major things:
be able to construct the runfolder configuration extracting classes
from the xml file I was creating, and make unit tests to make sure
all the code was at least somewhat correct.

Writing all of the xml serialization code was really annoying and dull.
There was probably some nifty metaprogramming way of solving it, but
I didn't feel like figuring it out, as I really need to move on to
more important parts of the project.

I wanted to rename runfolder.py to runfolder as the solexa pipeline code
has a runfolder.py (and if anyone has a better name for the script that's
supposed to dump the runfolder xml file, let me know).

Also in working on the xml serialization code, I extended the serialization
for the eland files, this version now dumps the genome_map and the
eland statistics, like reads, match counts and the like. It does
mean that the --archive mode will take longer, but it also means
I'll have enough information to generate the run statistics later.

Now I might have to redo this if we figure out if we should be handling
the realign files instead.

Diane Trout [Thu, 24 Apr 2008 00:25:42 +0000 (00:25 +0000)]

Add a script that takes a set of eland_result files and makes bedfiles.
It'll also look up the lane descriptions in the flowcell database.

Diane Trout [Thu, 24 Apr 2008 00:24:26 +0000 (00:24 +0000)]

Report cluster results with the rest of the lane summary information.
This involved breaking names like "s_1" into their sample and lane identifiers
and then exclusively using the lane identifiers.

One complexity is that I still had to treat the lane IDs as keys into
a dictionary instead of offsets into a list, because the lanes
were labeled in the range 1..8, but python's list indexes would have
been 0..7.

I also changed the report code to return a string instead of printing
stuff to stdout, to make it easier for me to integrate it into code
to email the summary report.

Diane Trout [Tue, 22 Apr 2008 21:58:04 +0000 (21:58 +0000)]

Also, since nothing is currently using the pipelineFinished message
from runner, remove it.

Diane Trout [Tue, 22 Apr 2008 00:26:58 +0000 (00:26 +0000)]

oops forgot to remove some debugging statements from the previous patch

Diane Trout [Tue, 22 Apr 2008 00:02:16 +0000 (00:02 +0000)]

Extend makebed to look up metadata out of a copy of the fctracker database

Diane Trout [Mon, 21 Apr 2008 23:34:51 +0000 (23:34 +0000)]

split the library script into a reusable database/reporting layer
and command line script.