htsworkflow.git
Diane Trout [Wed, 9 Apr 2008 20:53:35 +0000 (20:53 +0000)]
Add library.py, a script to extract a useful description of a flowcell
from the gaworkflow database.
Basically it nicely aggregates the library description with the flow
cell lanes.

Diane Trout [Tue, 1 Apr 2008 21:30:40 +0000 (21:30 +0000)]
Don't die if Config/FlowcellId.xml is missing; warn the user and continue

Diane Trout [Tue, 1 Apr 2008 00:18:07 +0000 (00:18 +0000)]
report statistics on the various eland_result sequence status codes
(E.g. how many times does QC/NM/U[012]/R[012] show up)
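
A minimal sketch of this kind of tally, assuming the eland_result file is tab-separated with the status code in the third field. The column position and the name count_status_codes are assumptions for illustration, not taken from the script itself.

```python
import re
from collections import Counter

# The status codes named in the commit message: QC, NM, U0-U2, R0-R2.
STATUS_CODE = re.compile(r"^(QC|NM|U[012]|R[012])$")


def count_status_codes(lines, column=2):
    """Count how often each status code appears.

    Assumption: fields are tab-separated and the code sits in `column`
    (zero-based). Lines without a recognizable code are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > column and STATUS_CODE.match(fields[column]):
            counts[fields[column]] += 1
    return counts


# Made-up sample lines, only to show the shape of the input.
sample = [
    ">read1\tACGT\tU0\t1\t0\t0\tchr1.fa\t100\tF\n",
    ">read2\tACGT\tNM\n",
    ">read3\tACGT\tQC\n",
    ">read4\tACGT\tU0\t1\t0\t0\tchr2.fa\t200\tR\n",
]
for code, total in sorted(count_status_codes(sample).items()):
    print(code, total)
```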

Diane Trout [Fri, 28 Mar 2008 23:32:12 +0000 (23:32 +0000)]
Don't die if Summary.htm isn't present

Diane Trout [Fri, 28 Mar 2008 22:37:13 +0000 (22:37 +0000)]
rename the summary report to summary_report to distinguish it from Summary
also moved the summarize_mapped_reads from the summary_report function to
the top level of the script

Diane Trout [Fri, 28 Mar 2008 22:12:19 +0000 (22:12 +0000)]
make the mapped reads summary report more robust
It's really useful for the summary report to report everything that
mapped to the genome as a single entry, and everything that mapped
to the contamination or spike-ins as separate entries.

This version tags files that aren't symlinked as 'last_dir_element/filename'
and files that are symlinked into the genome directory as 'filename'.
Since I have the spike-ins symlinked in, and the last element of the path
name is the genome name it makes it easy to use the 'last_dir_element'
as the name to group all the per chromosome reads to.

Diane Trout [Thu, 27 Mar 2008 01:13:15 +0000 (01:13 +0000)]
Add runfolder.py to the list of scripts

Diane Trout [Wed, 26 Mar 2008 23:11:50 +0000 (23:11 +0000)]
sort the sample mapped reads output
also use the fasta name from the eland_result file even if there isn't a
corresponding file on disk any more.

Diane Trout [Tue, 25 Mar 2008 21:54:03 +0000 (21:54 +0000)]
Add documentation about what runfolder.py does

Lorian Schaeffer [Tue, 25 Mar 2008 18:31:09 +0000 (18:31 +0000)]
Put ten_nM_dilution field back in library table, and added the cluster_estimate fields back to the flowcell view.

Diane Trout [Sat, 22 Mar 2008 00:58:36 +0000 (00:58 +0000)]
Store the firecrest/bustard/gerald tuple in a new class PipelineRun
This was because I found the <runfolder>/Config directory which contained
a useful file containing the real flowcell id.

I still need to get runfolder to copy the "important" files and to write
the output xml file somewhere other than the current directory.

Diane Trout [Fri, 21 Mar 2008 23:09:09 +0000 (23:09 +0000)]
Summarize information from the runfolder.

This is the start of a tool for archiving the "important" parts of the run
folder, or providing some summary information.

Lorian Schaeffer [Fri, 21 Mar 2008 21:04:41 +0000 (21:04 +0000)]
Hopefully undid accidental changes to views.py and urls.py

Lorian Schaeffer [Fri, 21 Mar 2008 20:54:19 +0000 (20:54 +0000)]
Added a dropdown status field to flowcell table

Lorian Schaeffer [Thu, 20 Mar 2008 19:17:21 +0000 (19:17 +0000)]
Created table Person, with required fields "name" and "lab", and optional field "email"
Changed "made_for" in table Library to choose from contents of table Person

Lorian Schaeffer [Thu, 20 Mar 2008 18:46:22 +0000 (18:46 +0000)]
Flowcell patch:
Table:
Added four "kit_#" fields to hold the lot #s for each kit
Added "cluster_station_id" field
Added "sequencer_id" field
Changed "lane_x_pM" to decimal instead of int
Changed "lane_x_cluster_estimate" to char instead of int

Display:
Search now works! Correct format for searching ForeignKeys is (name of ForeignKey in current model)__(name of linked field you want to search in ForeignKey's home model)
Set "save_as" option to be true
Reversed sort order for admin grid
Added flowcell_id to search function
Changed display of fields in detailed admin view
Removed "lane_x_cluster_estimate" from all views. Didn't remove the fields themselves in case the other group was planning to use them.
Changed order of fields in admin grid view
Made all fields in admin grid view link to correct page

Lorian Schaeffer [Thu, 20 Mar 2008 17:38:12 +0000 (17:38 +0000)]
Removed fields kit_#_lot from library table (they were supposed to go in flowcell table), changed display of undiluted_concentration to include units.

Lorian Schaeffer [Wed, 19 Mar 2008 23:57:35 +0000 (23:57 +0000)]
Library patch:
Library table:
removed "ten_nM_dilution" field from table
added integer field "library_size" to table
added four integer fields "kit_[#]_lot" to table
changed library_id to be text instead of int
changed successful_pM to decimal instead of int
added option "Completed" to the PROTOCOL_END_POINTS set of options

Library display:
changed which fields are displayed on library admin grid
added "library_id" field to search
changed the display of fields in the detailed admin page

Diane Trout [Thu, 6 Mar 2008 22:58:34 +0000 (22:58 +0000)]
convert a single-match eland result file into a bed file readable by UCSC

Diane Trout [Thu, 6 Mar 2008 22:57:39 +0000 (22:57 +0000)]
Add script to extract some subset of sequence from an eland result file.

Diane Trout [Tue, 29 Jan 2008 02:13:59 +0000 (02:13 +0000)]
DummyOptions didn't define a genome_dir member before trying to access it

Diane Trout [Wed, 23 Jan 2008 01:52:40 +0000 (01:52 +0000)]
return most recent genome build for the pipeline config file.

Brandon's original pipeline customization code replaced things
like %(genome|build)s with the path to the ELAND genome files.

What I did is made it possible to substitute keys like %(genome)s in
addition to %(genome|build)s. The idea is that most config files
will be set to use whatever is the "most recent" build, but hopefully
at some point we'll provide some way of specifying which build.

The way I defined "most recent" genome build was to use the
alphanum sort, that sorts mixed alpha/numeric strings in the
'natural' order instead of ASCII order, thus "mm10" > "mm8".

For the genomes that we had installed right now this would work
for everything but Arabidopsis--which appears to be using a version
number of MMDDYYYY. Though if we changed it to YYYYMMDD everything
should work correctly.
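
The "most recent build" rule described above can be sketched as follows. This is a generic natural-order sort written for illustration, not the alphanum implementation the commit uses; the names alphanum_key and most_recent_build, and the date strings, are made up for the example.

```python
import re


def alphanum_key(text):
    """Split text into digit and non-digit chunks.

    Digit runs become integers, so "mm10" compares as ("mm", 10) and
    sorts after "mm8", unlike a plain ASCII comparison.
    """
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", text)]


def most_recent_build(builds):
    """Pick the build that sorts last in natural order."""
    return max(builds, key=alphanum_key)


print(most_recent_build(["mm8", "mm9", "mm10"]))   # natural order
print(max(["mm8", "mm9", "mm10"]))                 # plain ASCII order, for contrast

# MMDDYYYY version strings defeat the rule, YYYYMMDD ones do not.
print(most_recent_build(["01222004", "12012003"]))
print(most_recent_build(["20040122", "20031201"]))
```

With the hypothetical MMDDYYYY strings the December 2003 entry wins over January 2004, which is the failure mode the message describes; the YYYYMMDD forms of the same dates sort correctly.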

Brandon King [Wed, 23 Jan 2008 00:17:14 +0000 (00:17 +0000)]
Latest genome build, eland_config file temp. solution patch.
 * As per Diane's suggestion, the config file being generated will only
   contain the species name, which will mean "use latest build" for
   this species. At a later time, when the web UI has been updated to allow
   overriding this default and selecting a specific genome build to
   use, the species_name|build_num convention will return. See
   ticket:50 for more information.

Brandon King [Thu, 17 Jan 2008 23:40:58 +0000 (23:40 +0000)]
Added a new get_cycles(recipe_xml_filepath) function in
gaworkflow.pipeline.recipe_parser.

Brandon King [Wed, 16 Jan 2008 02:13:59 +0000 (02:13 +0000)]
Spoolwatch function for getting cycle number from Recipe*.xml file.
 * Work towards ticket:28.

Brandon King [Wed, 16 Jan 2008 01:07:05 +0000 (01:07 +0000)]
Attempt to support case insensitive config file names.
 * ticket:43

Brandon King [Wed, 16 Jan 2008 00:44:17 +0000 (00:44 +0000)]
Fix for tickets #45 & #47
 * retrieve_config needed the genome_dir substitution implemented (ticket:45)
 * configure_pipeline has a name clash, now fixed (ticket:47)

Diane Trout [Tue, 15 Jan 2008 23:51:22 +0000 (23:51 +0000)]
Oops, forgot to provide a null body to my stub of WatcherEvent.
Roughly I'm trying to keep a way of scheduling different types
of events at specified times in the future.

I think it needs to be some kind of queue, that the event loop
scans for events whose time has come.
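
One possible shape for such a queue, sketched with the standard heapq module. The class name EventQueue and its methods are hypothetical; the commit only stubs out WatcherEvent and does not settle on a design.

```python
import heapq
import itertools


class EventQueue:
    """Keep events ordered by due time so an event loop can poll for them."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so two events with the same time never compare directly.
        self._sequence = itertools.count()

    def schedule(self, when, event):
        """Register `event` to fire at time `when` (any comparable number)."""
        heapq.heappush(self._heap, (when, next(self._sequence), event))

    def pop_due(self, now):
        """Remove and return every event whose time has come, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, event = heapq.heappop(self._heap)
            due.append(event)
        return due


queue = EventQueue()
queue.schedule(10, "start copy")
queue.schedule(5, "check spool")
queue.schedule(20, "report status")
print(queue.pop_due(now=12))
```

The event loop would call pop_due with the current time on each pass; anything still in the heap stays there until a later pass.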

Diane Trout [Tue, 15 Jan 2008 23:46:04 +0000 (23:46 +0000)]
pyinotify rm_watch takes a list, not a dictionary.

Diane Trout [Tue, 15 Jan 2008 02:04:30 +0000 (02:04 +0000)]
Update test_copier to reflect copier's move to gaworkflow.automation

Diane Trout [Tue, 15 Jan 2008 01:07:48 +0000 (01:07 +0000)]
move the automation scripts into gaworkflow.automation
the scripts that were providing the tools to automate running
the solexa pipeline were unfairly "privileged" compared to the
components that wrapped talking to the pipeline commands and
providing the website, in that the other components were in
sub-packages while the automation was just in the gaworkflow
package.

So I moved them into the somewhat clearer "gaworkflow.automation".

The intent is that gaworkflow.automation contains modules
that make things happen without human intervention.

Diane Trout [Fri, 11 Jan 2008 23:43:41 +0000 (23:43 +0000)]
Add the xmi file from umbrello documenting our basic pipeline usecase

Diane Trout [Wed, 9 Jan 2008 02:29:52 +0000 (02:29 +0000)]
always trigger a copy when we receive the sequencingFinished message
hopefully then we'll only send sequencingFinished off to runner
when the rsyncing is well and truly finished.

Diane Trout [Tue, 8 Jan 2008 02:39:21 +0000 (02:39 +0000)]
tell subversion to ignore *.py[co~] files

Diane Trout [Tue, 8 Jan 2008 02:38:00 +0000 (02:38 +0000)]
make sequencingFinished return something instead of none
to make xmlrpclib happy.

Diane Trout [Tue, 8 Jan 2008 02:11:48 +0000 (02:11 +0000)]
Remove unnecessary code from the runner.py module

Diane Trout [Tue, 8 Jan 2008 01:32:53 +0000 (01:32 +0000)]
Add script to make it easier to launch runner

Diane Trout [Fri, 4 Jan 2008 23:50:12 +0000 (23:50 +0000)]
use \w for alphanumeric flow cell IDs

Brandon King [Fri, 4 Jan 2008 23:09:58 +0000 (23:09 +0000)]
Runner now reports status when user sends status request!
 * just send 'status' to the bot to get status information now.

Brandon King [Fri, 4 Jan 2008 22:41:50 +0000 (22:41 +0000)]
Runner now notifies users about success or failure of the following steps:
 * Retrieve Config
 * Configure Pipeline
 * Run Pipeline

Brandon King [Fri, 4 Jan 2008 22:04:44 +0000 (22:04 +0000)]
Runner seems to work with running the pipeline when launched from within
any directory.
 * stdout/stderr output is saved in top level of the analysis directory.
 * Requires a base_analysis_dir to be set in bot config file (stating
   where all the analysis directories are stored).

Brandon King [Fri, 4 Jan 2008 00:21:22 +0000 (00:21 +0000)]
Work towards getting runner to work properly.
 * Runner can now run one run if launched in the directory
   of the analysis it should run.
 * TODO: Allow for running the pipeline without changing directories.

Brandon King [Fri, 4 Jan 2008 00:04:23 +0000 (00:04 +0000)]
Ok, one more option required.

Brandon King [Fri, 4 Jan 2008 00:03:03 +0000 (00:03 +0000)]
Forgot an option in the DummyOptions object.

Brandon King [Thu, 3 Jan 2008 23:19:22 +0000 (23:19 +0000)]
Work towards disabling command-line parsing outside of use
in scripts.

Brandon King [Wed, 2 Jan 2008 22:42:14 +0000 (22:42 +0000)]
Updated regexs to support new and "improved" flow cell numbers!
(ticket:2)
 * Thank heavens for regex!
 * URL for valid flow cell number is any alphanumeric character now
   rather than FC#####.
 * Getting the flowcell number from the directory path should now work
   again.
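
A rough illustration of the change from FC##### to alphanumeric IDs. The patterns below are reconstructions for the example, not the regexes from the patch, and the run folder naming (flowcell ID as the last underscore-separated part of the directory name) is an assumption, as is the helper name flowcell_from_path.

```python
import os
import re

# Old form described in the message: "FC" followed by five digits.
OLD_FLOWCELL_ID = re.compile(r"^FC\d{5}$")
# New form: any run of \w characters (letters, digits and underscore).
NEW_FLOWCELL_ID = re.compile(r"^\w+$")

# Assumption: the run folder name ends in "_<flowcell id>".
# [^\W_] means "a \w character that is not an underscore".
RUNFOLDER_FLOWCELL = re.compile(r"_(?P<flowcell>[^\W_]+)$")


def flowcell_from_path(path):
    """Return the flowcell ID from a run folder path, or None if absent."""
    name = os.path.basename(path.rstrip("/"))
    match = RUNFOLDER_FLOWCELL.search(name)
    return match.group("flowcell") if match else None


# Made-up run folder names, one per ID style.
for folder in ("/data/071220_HWI-EAS229_0001_FC12345",
               "/data/080104_HWI-EAS229_0002_20AAAAAXX/"):
    flowcell = flowcell_from_path(folder)
    print(flowcell,
          bool(OLD_FLOWCELL_ID.match(flowcell)),
          bool(NEW_FLOWCELL_ID.match(flowcell)))
```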

Diane Trout [Sun, 30 Dec 2007 23:47:11 +0000 (23:47 +0000)]
direntry parser wasn't eating a trailing newline
This patch assumes that there always will be a single trailing whitespace
character to handle, on the off chance that someone will make a
horrific filename with trailing whitespace.

e.g. filename = "oh why do you do this..   "

Diane Trout [Sun, 30 Dec 2007 23:30:30 +0000 (23:30 +0000)]
forgot to rename the caller and called when refactoring the dirlist parsing

Diane Trout [Sun, 30 Dec 2007 23:28:17 +0000 (23:28 +0000)]
split rsync directory listing correctly
ticket:5
My previous code was generating a too many values to unpack exception
if there was a filename with spaces in it.
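
The failure and one way around it can be sketched like this. The five-column layout (permissions, size, date, time, name) is an assumption about the rsync listing being parsed, and parse_direntry is a hypothetical name, not the function from the patch.

```python
def parse_direntry(line):
    """Split one directory-listing line into its five fields.

    Limiting the split to four cuts keeps everything after the time
    column, spaces included, in the filename.
    """
    # Drop exactly one trailing newline; other trailing whitespace may
    # legitimately belong to the filename.
    if line.endswith("\n"):
        line = line[:-1]
    permissions, size, date, time, name = line.split(None, 4)
    # Some rsync versions print sizes with thousands separators.
    return permissions, int(size.replace(",", "")), date, time, name


entry = "-rw-r--r--        1234 2007/12/30 23:28:17 my file name.txt\n"

# A plain split() produces seven fields here, hence "too many values to unpack".
print(len(entry.split()))
print(parse_direntry(entry))
```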

Diane Trout [Sat, 22 Dec 2007 02:04:25 +0000 (02:04 +0000)]
add a , to make customizing list of template dirs easier

Brandon King [Fri, 21 Dec 2007 02:13:47 +0000 (02:13 +0000)]
Work towards a working runner bot... still needs a bit of work.

Brandon King [Fri, 14 Dec 2007 20:56:30 +0000 (20:56 +0000)]
[project @ gaworkflow.runner progress]
 * Work towards getting runner to launch jobs in a separate thread.
 * Stores status in a dictionary.

Diane Trout [Tue, 11 Dec 2007 22:35:09 +0000 (22:35 +0000)]
[project @ add skeleton for runner]
this is the basic outline of the runner bot. Important functions like
sequencingFinished and pipelineFinish still need to be filled in.

Diane Trout [Tue, 11 Dec 2007 22:32:18 +0000 (22:32 +0000)]
[project @ send messages to notify_runner not self.runner]
oops I forgot to update my parameter when I made a loop

Brandon King [Tue, 11 Dec 2007 21:43:19 +0000 (21:43 +0000)]
[project @ Bypass pipe lock mystery bug for configure step]
 * There was a weird bug where certain failures of the pipeline could
   leave the configure_run.py configure code waiting for output from the
   configuration pipe, but where the program has already finished... leaving
   the configure step stuck in a permanent state of waiting.
 * This patch bypasses this problem by passing subprocess.Popen file
   descriptors instead of subprocess.PIPE and then processing the
   output after the program has terminated. I have never seen the
   mystery bug when using this approach. This is how the run pipeline
   step already handles running the pipeline and is also where I first
   encountered this problem.
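
A minimal sketch of the approach described above: hand Popen real file objects rather than subprocess.PIPE, wait for the program to exit, then read what it wrote. The helper name run_and_collect and the use of temporary files are choices made for this example; the log does not say where the patch keeps its output, and the example does not try to explain the original hang.

```python
import subprocess
import sys
import tempfile


def run_and_collect(command):
    """Run `command`, sending stdout and stderr to files, then read them back.

    Nothing reads from a pipe while the child runs, so the parent cannot
    end up blocked waiting on pipe output.
    """
    with tempfile.TemporaryFile() as out, tempfile.TemporaryFile() as err:
        process = subprocess.Popen(command, stdout=out, stderr=err)
        returncode = process.wait()
        # The child wrote through its own copies of the descriptors;
        # rewind ours before reading.
        out.seek(0)
        err.seek(0)
        return returncode, out.read().decode(), err.read().decode()


# Stand-in for the configure step: a child that writes to both streams.
child = [sys.executable, "-c",
         "import sys; print('configured'); sys.stderr.write('warning')"]
code, stdout_text, stderr_text = run_and_collect(child)
print(code, stdout_text.strip(), stderr_text)
```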

Diane Trout [Tue, 11 Dec 2007 08:38:49 +0000 (08:38 +0000)]
[project @ add a parser to spoolwatcher]
also simplify the "I don't understand your command" message so it
doesn't throw an exception about formatting the string.

Brandon King [Tue, 4 Dec 2007 19:59:33 +0000 (19:59 +0000)]
[project @ Config Step: Detect missing cycles patch]
 * Now detects error case of missing cycles and reports a more
   meaningful error message in the error log. It also records a copy
   of the original error message and then suppresses the remaining
   error messages of this type.

Diane Trout [Mon, 10 Dec 2007 21:16:47 +0000 (21:16 +0000)]
[project @ rename runFinished to sequencingFinished]
I decided it would be a bit more clear to say that spool watcher is
detecting when the sequencing is finished, since the whole run can now
include finishing running through the pipeline.

Diane Trout [Mon, 10 Dec 2007 21:12:20 +0000 (21:12 +0000)]
[project @ make spoolwatcher a benderjab bot]
this was a pretty significant update to spool watcher, changing
the main event loop from being driven by inotify to BenderJab, and
changing the start copying and run finished messages from being
chat messages to xml-rpc messages.

Diane Trout [Mon, 10 Dec 2007 21:06:55 +0000 (21:06 +0000)]
[project @ use XmlRpcBot.rpc_send]
that way I can put all the logging code and error checking code
in one place, and I don't have to pass around the client connection.

Diane Trout [Mon, 10 Dec 2007 21:05:44 +0000 (21:05 +0000)]
[project @ Require a resource for JIDs that are used for XML-RPC messages]
obviously this requires a version of benderjab that has that
feature added

Diane Trout [Mon, 10 Dec 2007 21:03:51 +0000 (21:03 +0000)]
[project @ moved check_option into benderjab]

Diane Trout [Sat, 8 Dec 2007 00:40:11 +0000 (00:40 +0000)]
[project @ update CopierBot to new logging daemonizable XML-RPC BenderJab Bot]
this version reads all of the parameters out of the .benderjab config
file, will report what it's currently copying, and when it gets a
"runFinished" message, will wait until it's finished copying before
forwarding that on.

If it dies before finishing copying, but after getting the runFinished
message, that message might get lost.

Brandon King [Fri, 30 Nov 2007 23:37:26 +0000 (23:37 +0000)]
[project @ Allow user to provide FC##### to download or a path to a config file.]

Brandon King [Fri, 30 Nov 2007 22:49:14 +0000 (22:49 +0000)]
[project @ Small documentation fix for retrieve_config.]

Brandon King [Fri, 30 Nov 2007 20:56:33 +0000 (20:56 +0000)]
[project @ Scientific name is no longer a unique field.]

Brandon King [Tue, 27 Nov 2007 22:55:42 +0000 (22:55 +0000)]
[project @ Updating setup.py]
 * Added pipeline, frontend subpackages.
 * Fixed spelling error with script name.

Brandon King [Tue, 27 Nov 2007 22:53:23 +0000 (22:53 +0000)]
[project @ Missing __init__.py file in pipeline subpackage.]

Brandon King [Mon, 26 Nov 2007 21:14:16 +0000 (21:14 +0000)]
[project @ run_status gerald correction (over estimated expected files)]
 * Corrected the over estimate of files by not expecting *.tmp files.
 * Because of the way the update feature works, when a *.tmp file
   is processed, it will increment completed count as well as total
   count, thereby allowing 100% status report when all of the expected
   + unexpected files have been accounted for.

Brandon King [Mon, 26 Nov 2007 20:43:41 +0000 (20:43 +0000)]
[project @ run_status correct firecrest expected file estimate]
 * The estimate for firecrest files was way off (total was
   much higher than reality)
 * Fixed by finding out two of the patterns didn't actually
   use cycles; i.e. they were just hard coded numbers.

Brandon King [Thu, 22 Nov 2007 00:05:14 +0000 (00:05 +0000)]
[project @ Run default changed to full run]
 * Previously was only running with lane 4 tiles
   100 through 104.

Brandon King [Thu, 22 Nov 2007 00:03:22 +0000 (00:03 +0000)]
[project @ Shortened the s_generating regex; minor bug fix]
  * For some reason the very end of the line was being
    auto-wrapped... a simple fix was to search for
    the beginning half of the line. Simple and effective.

Brandon King [Wed, 21 Nov 2007 20:53:23 +0000 (20:53 +0000)]
[project @ Monitor status implementation + config_pipeline cmdline args]
  * configure_pipeline now takes an optional command line
    argument of an eland config file to use. (Overrides
    automatic download).
  * Added monitors.py which contains methods providing a way
    of triggering some sort of threaded monitor of pipeline
    progress.
    *  startCmdLineStatusMonitor(conf_info) prints status
       to stdout
  * Updated configure_pipeline script to use the
    startCmdLineStatusMonitor function.
  * ConfigInfo object now holds a status variable
    (GARunStatus object)
    * requires calling conf_info.createStatusObject() after
      _cfg_filepath has been set (currently handled by
      run_pipeline function)

Brandon King [Tue, 20 Nov 2007 23:01:53 +0000 (23:01 +0000)]
[project @ Added GARunStatus class for tracking percent complete through each step of the run and/or entire run!]

Brandon King [Tue, 20 Nov 2007 18:48:21 +0000 (18:48 +0000)]
[project @ Moved ga_frontend to gaworkflow.frontend package.]
 * All the modules in the front end have been updated to be
   located in gaworkflow.frontend.
 * Requires that PYTHONPATH include top level directory or
   for the package to be installed, as gaworkflow/frontend/manage.py
   only adds frontend/ to the python path, and therefore it will fail.
 * Changed the hard coded paths to be more like
   os.path.abspath('../../fctracker.db') so, the code should just work
   if the main package is available on the python path. (Good defaults
   are nice!)

Diane Trout [Tue, 20 Nov 2007 09:50:58 +0000 (09:50 +0000)]
[project @ move brandon's pipeline handling code into gaworkflow.pipeline]
the code that was in the if __name__ == "__main__" got moved into
similarly named scripts in the scripts directory. Those import everything from
their corresponding gaworkflow.pipeline module.

I still wish the names were shorter, and yet still descriptive.

Other refactoring ideas, break configure_run up, make a single module to hold
all the exceptions from all the various parts of the pipeline.

And:

I (still) find our lack of tests disturbing.

Diane Trout [Tue, 20 Nov 2007 01:04:19 +0000 (01:04 +0000)]
[project @ rename python module to gaworkflow from uashelper]

Brandon King [Mon, 19 Nov 2007 20:16:47 +0000 (20:16 +0000)]
[project @ changed config file section from 'server_info' to 'config_file_server']

Brandon King [Mon, 19 Nov 2007 20:13:37 +0000 (20:13 +0000)]
[project @ Removing bin/config_pipeline.py (use config_pipeline2.py instead)]

Brandon King [Sat, 17 Nov 2007 03:13:07 +0000 (03:13 +0000)]
[project @ Download Cfg, Use genome mapper, configure, run and monitor pipeline! (Proof of concept!)]
 * Now downloads config file from fctracker db
   * Requires ~/.ga_frontend.conf or /etc/ga_frontend/ga_frontend.conf
     to have [server_info]\nbase_host_url: http://host:port
   * FIXME: flowcell and genome dir are hard coded for testing
 * Uses genome mapper to update config file with local available genomes
   * Requires each genome dir to have a file called _metainfo_ with:
     * species|build
 * Then uses that config file to configure the pipeline
 * Runs the pipeline monitoring the status
 * TODOs:
   * Allow for specifying config file from commandline
     (skipping download config step).
   * Need non-hardcoded way of getting flowcell and genome base directory
   * Incorporate into run daemon that listens for copy complete command
   * Add feature to notify users of success and failures.

Brandon King [Sat, 17 Nov 2007 02:41:33 +0000 (02:41 +0000)]
[project @ retrieve_eland_config.py catches two new errors]
 * Handles 404 - Not found (throws exception)
 * Handles Flowcell not in DB (throws exception)

Brandon King [Sat, 17 Nov 2007 02:18:22 +0000 (02:18 +0000)]
[project @ Minor cleanup patch]

Brandon King [Sat, 17 Nov 2007 02:09:29 +0000 (02:09 +0000)]
[project @ Species and build to valid genome dir mapper!]
 * Config generator returns genome dir with %(species|build)s
 * This module contains a dictionary generator which given
   a genome base dir, will generate a dictionary whose key is
   species|build and whose value is the valid genome dir
 * requires a file in each genome dir called _metainfo_
   * containing: species|build
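
The mapper described in the list above might look roughly like this. It is a sketch under stated assumptions: the function name build_genome_map, the one-line _metainfo_ content, and the template text are illustrative and not copied from the module.

```python
import os


def build_genome_map(genome_base_dir):
    """Map "species|build" keys to genome directories.

    Assumption: each usable genome directory holds a _metainfo_ file
    whose content is a single "species|build" string.
    """
    genome_map = {}
    for entry in sorted(os.listdir(genome_base_dir)):
        genome_dir = os.path.join(genome_base_dir, entry)
        metainfo = os.path.join(genome_dir, "_metainfo_")
        if os.path.isfile(metainfo):
            with open(metainfo) as stream:
                key = stream.read().strip()
            genome_map[key] = genome_dir
    return genome_map


def fill_template(template, genome_map):
    """Replace %(species|build)s placeholders using Python's % formatting.

    A missing genome raises KeyError, which a caller could turn into a
    "genome not available" report.
    """
    return template % genome_map
```

A hypothetical call would be fill_template("genome_dir = %(Homo sapiens|hg18)s", build_genome_map("/path/to/genomes")), where the path and the template line are placeholders. Python's % operator accepts spaces and "|" inside the parenthesized key, which is what lets the species|build string serve directly as the placeholder name.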

Brandon King [Sat, 17 Nov 2007 01:28:51 +0000 (01:28 +0000)]
[project @ Use genome build patch]
 * WARNING: Changed DB model! (adds use_genome_build to species)
 * Update to django model to support proper generation of eland config files
 * Config generator now generates config file with the following pattern:
   * %(Homo sapiens|hg18)s for the run daemon to decide if that genome
     is available or not.

Brandon King [Sat, 17 Nov 2007 01:17:59 +0000 (01:17 +0000)]
[project @ Proof of concept!!!]
 * Now uses inotify to confirm gerald, bustard, and firecrest
   "finished.txt" files have been created! if not, fail!
 * Also goes back and reads the stderr log for known errors and
   reports failure if one is detected here.
 * Just needs to be generalized and hooked up to config generator
   as well as inserted into the run daemon!

Brandon King [Fri, 16 Nov 2007 22:13:21 +0000 (22:13 +0000)]
[project @ Method 2 for monitoring pipeline run (based on config_pipeline.py)]
 * Since the first version stalled in a mysterious way,
   I am trying a new method.
 * Now writes out standard out and standard error to file using
   file descriptors.
 * Also uses inotify to detect "finished" files.
   * FIXME: Need to do something about all other files (currently
     printed to stdout).
 * FIXME: Double check the run_dir exists in configure step.
 * FIXME: Upon run completion, read and process stderr output.

Brandon King [Fri, 16 Nov 2007 18:46:22 +0000 (18:46 +0000)]
[project @ Handling of missing app errors with running pipeline]
 WARNING: Still not fully functional, making progress though.
 * Prints out success or failure of run
   * FIXME: There is a weird BUG where towards the end of a run when
     the summary files are being generated that the pipeline code just
     stops processing. It's just sitting in memory not doing anything.
     The config_pipeline.py gets stuck on line = pipe.stdout.readline()
     at this point.
 * Checks for lack of ghostscript.

Brandon King [Thu, 15 Nov 2007 21:16:47 +0000 (21:16 +0000)]
[project @ Work toward pipeline monitor code (WARNING: testing version)]
 * Warning, this code is a stepping stone to monitoring a running pipeline.
   * It only runs with tiles=s_4_0100,s_4_0101,s_4_0102,s_4_0103,s_4_0104
   * There are also important FIXMEs that should be looked at before using this version of the code.
 * This patch is needed for all future patches to work.

Brandon King [Thu, 15 Nov 2007 04:04:07 +0000 (04:04 +0000)]
[project @ Runs pipeline on successful config + traceback]
 * Now runs the pipeline upon successful configuration step.
   * Currently no monitoring of output, just writing to files so
     I can figure out how to code the monitoring.
 * Now detects if goat crashes and displays a traceback.
 * Figured out another goat_pipeline.py call that triggers a
   traceback (included as commented out code) -> Test case for future.
 * FIXME: pipeline monitoring code needs to be written.

Brandon King [Wed, 14 Nov 2007 21:19:49 +0000 (21:19 +0000)]
[project @ Now handles stderr output.]

Brandon King [Wed, 14 Nov 2007 05:09:00 +0000 (05:09 +0000)]
[project @ Better handling of configuring pipeline w/ clear success or failure. Loads run_path info ConfigInfo object as well.]

Brandon King [Wed, 14 Nov 2007 03:33:04 +0000 (03:33 +0000)]
[project @ First attempt at script for configuring pipeline automatically.]
 * More of a proof of concept.
 * FIXME: hardcoded config script path.
 * Uses logging module
   * Info for everything going fine.
   * Error when it breaks in a bad way.

Brandon King [Tue, 13 Nov 2007 02:14:12 +0000 (02:14 +0000)]
[project @ Updated eland_config app to generate config file from fctracker app.]
 * WARNING DB CHANGE: Added read_length, and advanced_run to flowcell
   model in fctracker app.
 * Updated eland_config app to generate config files from db instead of form.
   * Currently regenerates the config file on every request. Simple flag will
     cause it to re-read saved copy on disk.
 * Changed URL to http://<host>/eland_config/.

Brandon King [Mon, 12 Nov 2007 22:43:35 +0000 (22:43 +0000)]
[project @ Url should be <host>/eland_config/ in this case.]

Brandon King [Mon, 12 Nov 2007 22:36:44 +0000 (22:36 +0000)]
[project @ The rest of the rename fix.]

Lorian Schaeffer [Mon, 12 Nov 2007 22:17:34 +0000 (22:17 +0000)]
[project @ Small fixes to changed directory name]

Brandon King [Mon, 12 Nov 2007 22:16:09 +0000 (22:16 +0000)]
[project @ Renamed main django project dir to ga_frontend + updated settings.]

Lorian Schaeffer [Fri, 9 Nov 2007 23:20:37 +0000 (23:20 +0000)]
[project @ Massive change to DB structure; complete library table]
Primary change to the DB is the library table and supporting changes to the flowcell table.
They should both be properly linked now; you'll have to pull species information from the
linked library field. In addition, I added a common name to the Species table. Most interface
changes are via the meta and admin classes in each model, and are fairly straightforward. I
also added databrowse support; go to /databrowse instead of /admin to play with it.

Brandon King [Sat, 10 Nov 2007 01:27:44 +0000 (01:27 +0000)]
[project @ Finished statement + additional error message]
 * Now says where it wrote the config file to upon success
 * Now puts more meaningful error message when user enters invalid domain/ip address.

Brandon King [Sat, 10 Nov 2007 01:20:19 +0000 (01:20 +0000)]
[project @ Updated to give more user friendly error:]
 * Gives a more user friendly error when connection is refused.
 * FIXME: Should include other errors as well, such as host lookup errors.