Diane Trout [Sat, 14 Feb 2009 02:09:43 +0000 (02:09 +0000)]
Tag version that has paired end flag
Diane Trout [Sat, 14 Feb 2009 01:45:25 +0000 (01:45 +0000)]
Add a 'paired_end' flag to a flowcell.
Then use this flag in the eland_config module to specify ANALYSIS eland_pair
Also change the default analysis to eland_extended
use the following to add the new field
alter table fctracker_flowcell add column paired_end bool not null default false;
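A minimal sketch of how such a flag might select the analysis type. The function name `analysis_line` is hypothetical, and the rest of the eland config format is not reproduced here; only the `ANALYSIS eland_pair` and `eland_extended` values come from the message above.

```python
def analysis_line(paired_end, default="eland_extended"):
    """Return the ANALYSIS line for a flowcell (hypothetical helper)."""
    # Paired-end flowcells get eland_pair; everything else uses the default.
    analysis = "eland_pair" if paired_end else default
    return "ANALYSIS %s" % analysis
```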
Diane Trout [Thu, 5 Feb 2009 20:04:25 +0000 (20:04 +0000)]
Add a function to instantiate a PipelineRun object from a run xml file.
Diane Trout [Thu, 29 Jan 2009 23:05:40 +0000 (23:05 +0000)]
add commas to numbers on the library summary statistics page
Diane Trout [Thu, 29 Jan 2009 22:45:41 +0000 (22:45 +0000)]
ignore files in this upload directory for the moment. Makes svn status a lot easier to read.
Diane Trout [Thu, 29 Jan 2009 22:42:42 +0000 (22:42 +0000)]
1.0 and later versions of illumina pipeline don't need READ_LENGTH
Diane Trout [Thu, 29 Jan 2009 22:29:56 +0000 (22:29 +0000)]
Minor tweaks to library summary page.
Move the totals to after the 0,1,2-mismatch column.
Suppress the None in the headings for the blocks that list what
species/spike in the reads matched against.
Add "lane" and "end" (if needed) to the header for that summary.
Diane Trout [Thu, 29 Jan 2009 03:16:05 +0000 (03:16 +0000)]
I forgot to update a variable name in an error message
Diane Trout [Thu, 29 Jan 2009 01:42:41 +0000 (01:42 +0000)]
Update the library listing and summary reports to be prettier and easier to use.
This also involved updating the pipelines.eland class to have some helpful
read-only properties to do some math for our reporting.
Also I added a library.html template instead of just dumping raw
html to the django response object.
Diane Trout [Thu, 29 Jan 2009 00:26:09 +0000 (00:26 +0000)]
Merge updated changes forced by updating to trunk's pipeline classes.
I needed to do this because the old live site couldn't read the newer
pipeline run xml files I was generating.
Diane Trout [Sat, 24 Jan 2009 01:28:29 +0000 (01:28 +0000)]
merge copying/running automation code from trunk into live site
Brandon King [Sat, 7 Jun 2008 00:59:13 +0000 (00:59 +0000)]
Now displays Diane's summary report if available.
Brandon King [Sat, 7 Jun 2008 00:15:34 +0000 (00:15 +0000)]
Bug fix try #3! (I'm glad I am taking a vacation next week.)
Brandon King [Sat, 7 Jun 2008 00:13:14 +0000 (00:13 +0000)]
Err, missed a spot.
Brandon King [Sat, 7 Jun 2008 00:09:38 +0000 (00:09 +0000)]
Minor bug fix... destroying your own dictionary in the middle of a loop
is not a good idea. Sleep helps prevent these problems in the first place.
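A minimal sketch of the general class of bug, assuming it involved a dictionary being modified while it was iterated; the names `prune_finished` and `runs` are hypothetical and not taken from the project.

```python
def prune_finished(runs):
    """Delete finished entries from runs and return the removed keys."""
    removed = []
    # Iterate over a snapshot of the keys. Deleting from `runs` while
    # iterating over it directly raises RuntimeError in Python 3.
    for name in list(runs):
        if runs[name] == "finished":
            del runs[name]
            removed.append(name)
    return removed
```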
Brandon King [Fri, 6 Jun 2008 22:04:32 +0000 (22:04 +0000)]
Ultra-raw output of run_xml data implemented. The sorting of columns
is currently semi-random, but the data is there. I will work on a
better sorting in the near future. Keeping it simple at first.
Brandon King [Fri, 6 Jun 2008 21:21:45 +0000 (21:21 +0000)]
Back porting Diane's ethelp.py from the trunk to v0.1.x branch
Brandon King [Thu, 5 Jun 2008 23:41:26 +0000 (23:41 +0000)]
Back porting Diane's runfolder.py from the trunk to v0.1.x branch
Brandon King [Thu, 5 Jun 2008 23:40:54 +0000 (23:40 +0000)]
Back porting Diane's gerald.py from the trunk to v0.1.x branch
Brandon King [Thu, 5 Jun 2008 23:40:14 +0000 (23:40 +0000)]
Back porting Diane's firecrest.py from the trunk to v0.1.x branch
Brandon King [Thu, 5 Jun 2008 23:39:31 +0000 (23:39 +0000)]
Back porting Diane's bustard.py from the trunk to v0.1.x branch
Brandon King [Wed, 4 Jun 2008 22:14:45 +0000 (22:14 +0000)]
Added link to Summary.htm files if they are available.
Brandon King [Wed, 28 May 2008 22:16:12 +0000 (22:16 +0000)]
* Added option to use default return type by adding /ucsc/ at the end
of the bedfile url (good for debugging, and possibly for displaying
on ucsc genome browser, but may have been another bug that was
preventing the default return method from working with UCSC).
* Used Diane's auto open feature for reading the bz2 eland_result files.
Brandon King [Wed, 28 May 2008 21:15:48 +0000 (21:15 +0000)]
Updated mime types of large files to prevent killing people's
web browsers. Not so fun the 2nd time it happens.
Brandon King [Wed, 28 May 2008 19:19:47 +0000 (19:19 +0000)]
Bed file generation is here!
* Added a new makebed function which returns a generator rather
than writing to outstream.
* Updated the make_description function in makebed.py to be able
to handle cases like '<flowcell_id> (deleted)'.
* Added a bedfile generation view which uses the makebed generator
function to prevent memory issues on the webserver.
* Visit <url>/library/ for trying out the new features!
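A rough sketch of the generator approach described above. It is not the project's makebed code: the function name `bed_lines`, the input tuple layout, and the fixed name and score columns are assumptions made for illustration.

```python
def bed_lines(hits, description):
    """Yield BED text one line at a time instead of building it in memory."""
    # Track line first, then one BED row per hit.
    yield 'track name="%s"\n' % description
    for chrom, start, length, strand in hits:
        # BED columns: chrom, start, end, name, score, strand.
        yield "%s\t%d\t%d\tread\t0\t%s\n" % (
            chrom, start, start + length, strand)
```

Because the rows are produced lazily, a web view can stream them to the client as they are generated; in current Django versions a generator like this would typically be handed to `StreamingHttpResponse`.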
Brandon King [Tue, 27 May 2008 17:58:36 +0000 (17:58 +0000)]
Backporting makebed.py to v0.1.x branch
Brandon King [Tue, 27 May 2008 17:57:34 +0000 (17:57 +0000)]
Backporting fctracker.py to v0.1.x branch
Brandon King [Sat, 24 May 2008 00:18:32 +0000 (00:18 +0000)]
library view: eland_result files accessible now
* Added links in the library view to get access to "accessable" data.
If the data is not "accessable", no link will be provided.
Brandon King [Fri, 23 May 2008 21:43:56 +0000 (21:43 +0000)]
Back porting Diane's opener.py from trunk to v0.1.x branch
Brandon King [Fri, 23 May 2008 00:12:02 +0000 (00:12 +0000)]
Adding slightly more meaningful display of library/flowcell information.
Brandon King [Wed, 21 May 2008 17:22:54 +0000 (17:22 +0000)]
Added custom view to display libraries, when a library is selected it says what flowcells and lanes the library was placed on.
* url: /library/ (index)
* url: /library/<library_id>/ (flowcell/lane output)
Lorian Schaeffer [Sat, 19 Apr 2008 02:25:30 +0000 (02:25 +0000)]
Blatant misappropriation of a mostly-blank field to hold unrelated information. Only mechanical effect is letting successful_pM be displayed in the admin table.
Lorian Schaeffer [Sat, 19 Apr 2008 00:12:34 +0000 (00:12 +0000)]
Added ability to filter for RNAseq libraries, now that I know which they are
Lorian Schaeffer [Fri, 18 Apr 2008 21:56:11 +0000 (21:56 +0000)]
Minor display changes
Lorian Schaeffer [Fri, 18 Apr 2008 19:40:45 +0000 (19:40 +0000)]
Added options to stopping_point dropdown. Also changed how admin is displayed.
Brandon King [Thu, 17 Apr 2008 20:29:07 +0000 (20:29 +0000)]
Making a branch for v0.1.x as Lorian needs a bug to be fixed before
the trunk (soon to be v0.2.0) will be stable.
Brandon King [Wed, 20 Feb 2008 21:30:57 +0000 (21:30 +0000)]
Tagging working version before major database changes. v0.1.0.
Diane Trout [Tue, 29 Jan 2008 02:13:59 +0000 (02:13 +0000)]
DummyOptions didn't define a genome_dir member before trying to access it
Diane Trout [Wed, 23 Jan 2008 01:52:40 +0000 (01:52 +0000)]
return most recent genome build for the pipeline config file.
Brandon's original pipeline customization code replaced things
like %(genome|build)s with the path to the ELAND genome files.
What I did is made it possible to substitute keys like %(genome)s in
addition to %(genome|build)s. The idea is that most config files
will be set to use whatever is the "most recent" build, but hopefully
at some point we'll provide some way of specifying which build.
The way I defined "most recent" genome build was to use the
alphanum sort, that sorts mixed alpha/numeric strings in the
'natural' order instead of ASCII order, thus "mm10" > "mm8".
For the genomes that we have installed right now this works
for everything but Arabidopsis, which appears to be using a version
number of MMDDYYYY. If we changed it to YYYYMMDD everything
should work correctly.
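The "natural" ordering described above can be sketched as follows. This is an illustration in modern Python, not the project's alphanum implementation; `alphanum_key` and `latest_build` are hypothetical names.

```python
import re

def alphanum_key(text):
    """Split text into alternating non-digit and digit runs."""
    # re.split with a capturing group keeps the digit runs at odd indexes,
    # so those can be compared as integers rather than as strings.
    parts = re.split(r"(\d+)", text)
    return [int(part) if index % 2 else part
            for index, part in enumerate(parts)]

def latest_build(builds):
    """Return the build that sorts last in natural order."""
    return max(builds, key=alphanum_key)
```

With this key, `latest_build(["mm8", "mm9", "mm10"])` picks `"mm10"`, while a plain string `max` would pick `"mm9"`. It also shows the MMDDYYYY problem: `"12312003"` sorts after `"01222004"`, whereas the YYYYMMDD forms `"20031231"` and `"20040122"` order correctly.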
Brandon King [Wed, 23 Jan 2008 00:17:14 +0000 (00:17 +0000)]
Latest genome build, eland_config file temp. solution patch.
* As per Diane's suggestion, the config file being generated will only
contain the species name, which will mean "use latest build" for
this species. Later, once the web UI has been updated to allow
overriding this default and selecting a specific genome build, the
species_name|build_num convention will return. See
ticket:50 for more information.
Brandon King [Thu, 17 Jan 2008 23:40:58 +0000 (23:40 +0000)]
Added a new get_cycles(recipe_xml_filepath) function in
gaworkflow.pipeline.recipe_parser.
Brandon King [Wed, 16 Jan 2008 02:13:59 +0000 (02:13 +0000)]
Spoolwatch function for getting cycle number from Recipe*.xml file.
* Work towards ticket:28.
Brandon King [Wed, 16 Jan 2008 01:07:05 +0000 (01:07 +0000)]
Attempt to support case insensitive config file names.
* ticket:43
Brandon King [Wed, 16 Jan 2008 00:44:17 +0000 (00:44 +0000)]
Fix for tickets #45 & #47
* retrieve_config needed the genome_dir substitution implemented (ticket:45)
* configure_pipeline has a name clash, now fixed (ticket:47)
Diane Trout [Tue, 15 Jan 2008 23:51:22 +0000 (23:51 +0000)]
Oops, forgot to provide a null body to my stub of WatcherEvent.
Roughly I'm trying to keep a way of scheduling different types
of events at specified times in the future.
I think it needs to be some kind of queue, that the event loop
scans for events whose time has come.
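One way such a queue could look, sketched with the standard library `heapq` module. This is only an illustration of the idea, not the WatcherEvent design; the class and method names are hypothetical.

```python
import heapq
import itertools

class EventQueue:
    """Time-ordered queue that an event loop can poll for due events."""

    def __init__(self):
        self._heap = []
        # Counter breaks ties so events themselves are never compared.
        self._counter = itertools.count()

    def schedule(self, when, event):
        """Register event to fire at time `when` (any comparable value)."""
        heapq.heappush(self._heap, (when, next(self._counter), event))

    def pop_due(self, now):
        """Remove and return every event whose time is <= now, in order."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

The event loop would call `pop_due(time.time())` on each pass and dispatch whatever comes back.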
Diane Trout [Tue, 15 Jan 2008 23:46:04 +0000 (23:46 +0000)]
pyinotify rm_watch takes a list, not a dictionary.
Diane Trout [Tue, 15 Jan 2008 02:04:30 +0000 (02:04 +0000)]
Update test_copier to reflect copier's move to gaworkflow.automation
Diane Trout [Tue, 15 Jan 2008 01:07:48 +0000 (01:07 +0000)]
move the automation scripts into gaworkflow.automation
the scripts that were providing the tools to automate running
the solexa pipeline were unfairly "privileged" compared to the
components that wrapped talking to the pipeline commands and
providing the website, in that the other components were in
sub-packages while the automation was just in the gaworkflow
package.
So I moved them into the somewhat clearer "gaworkflow.automation".
The intent is that gaworkflow.automation contains modules
that make things happen without human intervention.
Diane Trout [Fri, 11 Jan 2008 23:43:41 +0000 (23:43 +0000)]
Add the xmi file from umbrello documenting our basic pipeline usecase
Diane Trout [Wed, 9 Jan 2008 02:29:52 +0000 (02:29 +0000)]
always trigger a copy when we receive the sequencingFinished message
hopefully then we'll only send sequencingFinished off to runner
when the rsyncing is well and truly finished.
Diane Trout [Tue, 8 Jan 2008 02:39:21 +0000 (02:39 +0000)]
tell subversion to ignore *.py[co~] files
Diane Trout [Tue, 8 Jan 2008 02:38:00 +0000 (02:38 +0000)]
make sequencingFinished return something instead of none
to make xmlrpclib happy.
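The underlying constraint can be shown with Python 3's `xmlrpc.client`, the successor to `xmlrpclib`: by default it refuses to marshal `None`. The helper name `can_marshal` is hypothetical and exists only for this illustration.

```python
import xmlrpc.client

def can_marshal(value):
    """Report whether value can be sent as an XML-RPC method response."""
    try:
        # A method response must be a one-element tuple.
        xmlrpc.client.dumps((value,), methodresponse=True)
        return True
    except TypeError:
        # Raised for None unless allow_none=True is passed.
        return False
```

Returning any marshallable value, such as `True`, from the RPC method avoids the error without relying on the non-standard `allow_none` extension.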
Diane Trout [Tue, 8 Jan 2008 02:11:48 +0000 (02:11 +0000)]
Remove unnecessary code from the runner.py module
Diane Trout [Tue, 8 Jan 2008 01:32:53 +0000 (01:32 +0000)]
Add script to make it easier to launch runner
Diane Trout [Fri, 4 Jan 2008 23:50:12 +0000 (23:50 +0000)]
use \w for alphanumeric flow cell IDs
Brandon King [Fri, 4 Jan 2008 23:09:58 +0000 (23:09 +0000)]
Runner now reports status when user sends status request!
* just send 'status' to the bot to get status information now.
Brandon King [Fri, 4 Jan 2008 22:41:50 +0000 (22:41 +0000)]
Runner now notifies users about success or failure of the following steps:
* Retrieve Config
* Configure Pipeline
* Run Pipeline
Brandon King [Fri, 4 Jan 2008 22:04:44 +0000 (22:04 +0000)]
Runner seems to work with running the pipeline when launched from within
any directory.
* stdout/stderr output is saved in top level of the analysis directory.
* Requires a base_analysis_dir to be set in bot config file (stating
where all the analysis directories are stored).
Brandon King [Fri, 4 Jan 2008 00:21:22 +0000 (00:21 +0000)]
Work towards getting runner to work properly.
* Runner can now run one run if launched in the directory
of the analysis it should run.
* TODO: Allow for running the pipeline without changing directories.
Brandon King [Fri, 4 Jan 2008 00:04:23 +0000 (00:04 +0000)]
Ok, one more option required.
Brandon King [Fri, 4 Jan 2008 00:03:03 +0000 (00:03 +0000)]
Forgot an option in the DummyOptions object.
Brandon King [Thu, 3 Jan 2008 23:19:22 +0000 (23:19 +0000)]
Work towards disabling command-line parsing outside of use
in scripts.
Brandon King [Wed, 2 Jan 2008 22:42:14 +0000 (22:42 +0000)]
Updated regexs to support new and "improved" flow cell numbers!
(ticket:2)
* Thank heavens for regex!
* URL for valid flow cell number is any alphanumeric character now
rather than FC#####.
* Getting the flowcell number from the directory path should now work
again.
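A sketch of the kind of pattern change described above. The exact expressions used in the project are not shown in this log, so both patterns and the `flowcell_from_path` helper are assumptions, as is the run folder naming where the flow cell ID follows the last underscore.

```python
import re

# Assumed old form: "FC" followed by digits only.
OLD_FLOWCELL = re.compile(r"^FC\d+$")
# Assumed new form: any run of word characters.
NEW_FLOWCELL = re.compile(r"^\w+$")

def flowcell_from_path(path):
    """Return the text after the last underscore of the final path component."""
    name = path.rstrip("/").split("/")[-1]
    return name.rsplit("_", 1)[-1]
```

Note that `\w` also matches the underscore, so it is slightly broader than strictly alphanumeric.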
Diane Trout [Sun, 30 Dec 2007 23:47:11 +0000 (23:47 +0000)]
direntry parser wasn't eating a trailing newline
This patch assumes that there always will be a single trailing whitespace
character to handle, on the off chance that someone will make a
horrific filename with trailing whitespace.
e.g. filename = "oh why do you do this.. "
Diane Trout [Sun, 30 Dec 2007 23:30:30 +0000 (23:30 +0000)]
forgot to rename the caller and called when refactoring the dirlist parsing
Diane Trout [Sun, 30 Dec 2007 23:28:17 +0000 (23:28 +0000)]
split rsync directory listing correctly
ticket:5
My previous code was generating a "too many values to unpack" exception
if there was a filename with spaces in it.
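A minimal sketch of splitting a listing line so that spaces in the filename survive. It assumes each line has four whitespace-separated fields (permissions, size, date, time) before the name and that the size is plain digits; `parse_direntry` is a hypothetical name, not the project's parser.

```python
def parse_direntry(line):
    """Split one listing line into (permissions, size, date, time, name)."""
    # Strip only the trailing newline, so a filename that really ends in
    # whitespace keeps it. maxsplit=4 stops after the first four fields,
    # leaving the rest of the line as the name.
    fields = line.rstrip("\n").split(None, 4)
    permissions, size, date, time, name = fields
    return permissions, int(size), date, time, name
```

A plain `line.split()` on the same input yields more than five items whenever the name contains spaces, which is what triggers the unpacking error.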
Diane Trout [Sat, 22 Dec 2007 02:04:25 +0000 (02:04 +0000)]
add a , to make customizing list of template dirs easier
Brandon King [Fri, 21 Dec 2007 02:13:47 +0000 (02:13 +0000)]
Work towards a working runner bot... still needs a bit of work.
Brandon King [Fri, 14 Dec 2007 20:56:30 +0000 (20:56 +0000)]
[project @ gaworkflow.runner progress]
* Work towards getting runner to launch jobs in a separate thread.
* Stores status in a dictionary.
Diane Trout [Tue, 11 Dec 2007 22:35:09 +0000 (22:35 +0000)]
[project @ add skeleton for runner]
this is the basic outline of the runner bot. Important functions like
sequencingFinished and pipelineFinish still need to be filled in.
Diane Trout [Tue, 11 Dec 2007 22:32:18 +0000 (22:32 +0000)]
[project @ send messages to notify_runner not self.runner]
oops I forgot to update my parameter when I made a loop
Brandon King [Tue, 11 Dec 2007 21:43:19 +0000 (21:43 +0000)]
[project @ Bypass pipe lock mystery bug for configure step]
* There was a weird bug where certain failures of the pipeline could
leave the configure_run.py configure code waiting for output from the
configuration pipe, but where the program has already finished... leaving
the configure step stuck in a permanent state of waiting.
* This patch bypasses this problem by passing subprocess.Popen file
descriptors instead of subprocess.PIPE and then processing the
output after the program has terminated. I have never seen the
mystery bug when using this approach. This is how the run pipeline
step already handles running the pipeline and is also where I first
encountered this problem.
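The workaround described above can be sketched like this: give the child process real files for stdout and stderr, wait for it to exit, then read the files. The function name `run_and_collect` and the use of temporary files are assumptions for illustration, not the configure_run.py code.

```python
import subprocess
import sys
import tempfile

def run_and_collect(argv):
    """Run argv with output sent to temp files; read them after it exits."""
    with tempfile.TemporaryFile() as out, tempfile.TemporaryFile() as err:
        # File objects instead of subprocess.PIPE: the child can never
        # block on a full pipe, and the parent never waits on a pipe
        # that nobody will write to again.
        process = subprocess.Popen(argv, stdout=out, stderr=err)
        returncode = process.wait()
        # Rewind and read only after the program has terminated.
        out.seek(0)
        err.seek(0)
        return returncode, out.read(), err.read()
```

For example, `run_and_collect([sys.executable, "-c", "print('configured')"])` returns the exit status together with the captured output as bytes.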
Diane Trout [Tue, 11 Dec 2007 08:38:49 +0000 (08:38 +0000)]
[project @ add a parser to spoolwatcher]
also simplify the "I don't understand your command" message so it
doesn't throw an exception about formatting the string.
Brandon King [Tue, 4 Dec 2007 19:59:33 +0000 (19:59 +0000)]
[project @ Config Step: Detect missing cycles patch]
* Now detects error case of missing cycles and reports a more
meaningful error message in the error log. It also records a copy
of the original error message and then suppresses the remaining
error message of this type.
Diane Trout [Mon, 10 Dec 2007 21:16:47 +0000 (21:16 +0000)]
[project @ rename runFinished to sequencingFinished]
I decided it would be a bit more clear to say that spool watcher is
detecting when the sequencing is finished, since the whole run can now
include finishing running through the pipeline.
Diane Trout [Mon, 10 Dec 2007 21:12:20 +0000 (21:12 +0000)]
[project @ make spoolwatcher a benderjab bot]
this was a pretty significant update to spool watcher, changing
the main event loop from being driven by inotify to BenderJab, and
changing the start copying and run finished messages from being
chat messages to xml-rpc messages.
Diane Trout [Mon, 10 Dec 2007 21:06:55 +0000 (21:06 +0000)]
[project @ use XmlRpcBot.rpc_send]
that way I can put all the logging code and error checking code
in one place, and I don't have to pass around the client connection.
Diane Trout [Mon, 10 Dec 2007 21:05:44 +0000 (21:05 +0000)]
[project @ Require a resource for JIDs that are used for XML-RPC messages]
obviously this requires a version of benderjab that has that
feature added
Diane Trout [Mon, 10 Dec 2007 21:03:51 +0000 (21:03 +0000)]
[project @ moved check_option into benderjab]
Diane Trout [Sat, 8 Dec 2007 00:40:11 +0000 (00:40 +0000)]
[project @ update CopierBot to new logging daemonizable XML-RPC BenderJab Bot]
this version reads all of the parameters out of the .benderjab config
file, will report what it's currently copying, and when it gets a
"runFinished" message, will wait until it's finished copying before
forwarding that on.
If it dies before finishing copying, but after getting the runFinished
message, that message might get lost.
Brandon King [Fri, 30 Nov 2007 23:37:26 +0000 (23:37 +0000)]
[project @ Allow user to provide FC##### to download or a path to a config file.]
Brandon King [Fri, 30 Nov 2007 22:49:14 +0000 (22:49 +0000)]
[project @ Small documentation fix for retrieve_config.]
Brandon King [Fri, 30 Nov 2007 20:56:33 +0000 (20:56 +0000)]
[project @ Scientific name is no longer a unique field.]
Brandon King [Tue, 27 Nov 2007 22:55:42 +0000 (22:55 +0000)]
[project @ Updating setup.py]
* Added pipeline, frontend subpackages.
* Fixed spelling error with script name.
Brandon King [Tue, 27 Nov 2007 22:53:23 +0000 (22:53 +0000)]
[project @ Missing __init__.py file in pipeline subpackage.]
Brandon King [Mon, 26 Nov 2007 21:14:16 +0000 (21:14 +0000)]
[project @ run_status gerald correction (over estimated expected files)]
* Corrected the over estimate of files by not expecting *.tmp files.
* Because of the way the update feature works, when a *.tmp file
is processed, it will increment completed count as well as total
count, thereby allowing a 100% status report when all of the expected
+ unexpected files have been accounted for.
Brandon King [Mon, 26 Nov 2007 20:43:41 +0000 (20:43 +0000)]
[project @ run_status correct firecrest expected file estimate]
* The estimate for firecrest files was way off (total was
much higher than reality)
* Fixed by finding out two of the patterns didn't actually
use cycles; i.e. they were just hard coded numbers.
Brandon King [Thu, 22 Nov 2007 00:05:14 +0000 (00:05 +0000)]
[project @ Run default changed to full run]
* Previously was only running with lane 4 tiles
100 through 104.
Brandon King [Thu, 22 Nov 2007 00:03:22 +0000 (00:03 +0000)]
[project @ Shortened the s_generating regex; minor bug fix]
* For some reason the very end of the line was being
auto-wrapped... a simple fix was to search for
the beginning half of the line. Simple and effective.
Brandon King [Wed, 21 Nov 2007 20:53:23 +0000 (20:53 +0000)]
[project @ Monitor status implementation + config_pipeline cmdline args]
* configure_pipeline now takes an optional command line
argument of an eland config file to use. (Overrides
automatic download).
* Added monitors.py which contains methods providing a way
of triggering some sort of threaded monitor of pipeline
progress.
* startCmdLineStatusMonitor(conf_info) prints status
to stdout
* Updated configure_pipeline script to use the
startCmdLineStatusMonitor function.
* ConfigInfo object now holds a status variable
(GARunStatus object)
* requires calling conf_info.createStatusObject() after
_cfg_filepath has been set (currently handled by
run_pipeline function)
Brandon King [Tue, 20 Nov 2007 23:01:53 +0000 (23:01 +0000)]
[project @ Added GARunStatus class for tracking percent complete through each step of the run and/or entire run!]
Brandon King [Tue, 20 Nov 2007 18:48:21 +0000 (18:48 +0000)]
[project @ Moved ga_frontend to gaworkflow.frontend package.]
* All the modules in the front end have been updated to be
located in gaworkflow.frontend.
* Requires that PYTHONPATH include top level directory or
for the package to be installed as gaworkflow/frontend/manage.py
only adds frontend/ to the python path, and therefore it will fail.
* Changed the hard coded paths to be more like
os.path.abspath('../../fctracker.db') so, the code should just work
if the main package is available on the python path. (Good defaults
are nice!)
Diane Trout [Tue, 20 Nov 2007 09:50:58 +0000 (09:50 +0000)]
[project @ move brandon's pipeline handling code into gaworkflow.pipeline]
the code that was in the if __name__ == "__main__" got moved into
similarly named scripts in the scripts directory. Those import everything from
their corresponding gaworkflow.pipeline module.
I still wish the names were shorter, and yet still descriptive.
Other refactoring ideas: break configure_run up, make a single module to hold
all the exceptions from all the various parts of the pipeline.
And:
I (still) find our lack of tests disturbing.
Diane Trout [Tue, 20 Nov 2007 01:04:19 +0000 (01:04 +0000)]
[project @ rename python module to gaworkflow from uashelper]
Brandon King [Mon, 19 Nov 2007 20:16:47 +0000 (20:16 +0000)]
[project @ changed config file section from 'server_info' to 'config_file_server']
Brandon King [Mon, 19 Nov 2007 20:13:37 +0000 (20:13 +0000)]
[project @ Removing bin/config_pipeline.py (use config_pipeline2.py instead)]
Brandon King [Sat, 17 Nov 2007 03:13:07 +0000 (03:13 +0000)]
[project @ Download Cfg, Use genome mapper, configure, run and monitor pipeline! (Proof of concept!)]
* Now downloads config file from fctracker db
* Requires ~/.ga_frontend.conf or /etc/ga_frontend/ga_frontend.conf
to have [server_info]\nbase_host_url: http://host:port
* FIXME: flowcell and genome dir are hard coded for testing
* Uses genome mapper to update config file with local available genomes
* Requires each genome dir to have a file called _metainfo_ with:
* species|build
* Then uses that config file to configure the pipeline
* Runs the pipeline monitoring the status
* TODOs:
* Allow for specifying config file from commandline
(skipping download config step).
* Need non-hardcoded way of getting flowcell and genome base directory
* Incorporate into run daemon that listens for copy complete command
* Add feature to notify users of success and failures.
Brandon King [Sat, 17 Nov 2007 02:41:33 +0000 (02:41 +0000)]
[project @ retrieve_eland_config.py catches two new errors]
* Handles 404 - Not found (throws exception)
* Handles Flowcell not in DB (throws exception)
Brandon King [Sat, 17 Nov 2007 02:18:22 +0000 (02:18 +0000)]
[project @ Minor cleanup patch]
Brandon King [Sat, 17 Nov 2007 02:09:29 +0000 (02:09 +0000)]
[project @ Species and build to valid genome dir mapper!]
* Config generator returns genome dir with %(species|build)s
* This module contains a dictionary generator which, given
a genome base dir, will generate a dictionary whose key is
species|build and whose value is the valid genome dir
* requires a file in each genome dir called _metainfo_
* containing: species|build
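A minimal sketch of the mapping described above, assuming each genome directory sits directly under the base directory and its `_metainfo_` file holds a single `species|build` line. The function name `genome_map` is hypothetical.

```python
import os

def genome_map(genome_base_dir):
    """Map each 'species|build' key to the genome directory declaring it."""
    mapping = {}
    for entry in sorted(os.listdir(genome_base_dir)):
        genome_dir = os.path.join(genome_base_dir, entry)
        metainfo = os.path.join(genome_dir, "_metainfo_")
        # Skip anything that does not carry a _metainfo_ file.
        if not os.path.isfile(metainfo):
            continue
        with open(metainfo) as handle:
            key = handle.read().strip()
        mapping[key] = genome_dir
    return mapping
```

A dictionary like this can be applied directly with Python's `%` operator, so a template containing `%(species|build)s` style keys is filled in with the matching directory.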