htsworkflow.git
10 years agoInitial port to python3 python3-django1.5
Diane Trout [Tue, 13 Aug 2013 23:41:28 +0000 (16:41 -0700)]
Initial port to python3

10 years agoUpdate test data for renamed library type.
Diane Trout [Mon, 12 Aug 2013 20:47:04 +0000 (13:47 -0700)]
Update test data for renamed library type.

10 years agoAdd sortOrder and filtering parameters to the trackhub.
Diane Trout [Sat, 10 Aug 2013 00:06:16 +0000 (17:06 -0700)]
Add sortOrder and filtering parameters to the trackhub.

The order of trackhub parameters is still fixed,
however it should skip missing options.

10 years agoMove version finding code into the util directory.
Diane Trout [Mon, 22 Jul 2013 20:53:42 +0000 (13:53 -0700)]
Move version finding code into the util directory.

Mostly done because I already have test code in
htsworkflow.util.test

10 years agoTest changes to submission code.
Diane Trout [Mon, 22 Jul 2013 19:53:39 +0000 (12:53 -0700)]
Test changes to submission code.

Also there's some commonality in simulating a submission directory
so refactor that code out to a common module.

10 years agoGenerate manifest.txt files for submitting to ENCODE3.
Diane Trout [Mon, 22 Jul 2013 19:48:53 +0000 (12:48 -0700)]
Generate manifest.txt files for submitting to ENCODE3.

Change trackhub generation from my previous template version to use
the Daler trackhub code.

This includes a feature to complain if you offer a submission set
name that doesn't exist.

Also the samples query returns all the submission components in
a single query instead of one at a time (which is a much faster
way of doing things).

10 years agoturtle writing improvements.
Diane Trout [Mon, 22 Jul 2013 19:20:19 +0000 (12:20 -0700)]
turtle writing improvements.

  * Update namespaces added to default writer.
  * Add a function to generate a default turtle prefix header.
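
For illustration, a prefix header of the kind described in the second bullet could be built as below. This is a minimal sketch, not the code from this commit: the function name `turtle_prefix_header` and the example namespace table are assumptions.

```python
def turtle_prefix_header(namespaces):
    """Return Turtle @prefix lines for a {prefix: namespace URI} mapping."""
    lines = []
    for prefix, uri in sorted(namespaces.items()):
        lines.append("@prefix {0}: <{1}> .".format(prefix, uri))
    return "\n".join(lines) + "\n"


# Hypothetical namespace table; the real default writer registers its own set.
EXAMPLE_NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

header = turtle_prefix_header(EXAMPLE_NAMESPACES)
```

Sorting the prefixes keeps the generated header stable between runs, which makes it easier to compare output in tests.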

10 years agoAdd a generator that returns analysis nodes from a result map.
Diane Trout [Mon, 22 Jul 2013 19:16:58 +0000 (12:16 -0700)]
Add a generator that returns analysis nodes from a result map.

It iterates over the submission directories and returns
the fully qualified RDF node for them.

10 years agoAdd python namespace for encode3 RDF namespace
Diane Trout [Sat, 20 Jul 2013 05:50:06 +0000 (22:50 -0700)]
Add python namespace for encode3 RDF namespace

10 years agoAdd function to list the names for submissions from the RDF model.
Diane Trout [Wed, 17 Jul 2013 22:28:16 +0000 (15:28 -0700)]
Add function to list the names for submissions from the RDF model.

Currently the model is ill-specified and the name entries
are just pointing at the list of per-library directory names.

Make sure the end of the submission name doesn't have URL separator
characters.

10 years agoAdd function to parse scp / ssh style URLs.
Diane Trout [Wed, 17 Jul 2013 00:02:30 +0000 (17:02 -0700)]
Add function to parse scp / ssh style URLs.

Also move some tests around from htsworkflow.util.url
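
A rough sketch of what parsing an scp-style location can look like follows. It is not the function added in this commit: the name `parse_ssh_url`, the `SshUrl` tuple, and the example host are all hypothetical, and it only handles the `user@host:path` form, not `ssh://` URLs.

```python
import re
from collections import namedtuple

# Hypothetical result type; the real htsworkflow helper may return something else.
SshUrl = namedtuple("SshUrl", ["user", "host", "path"])

# Optional "user@", then a host without "/" or ":", then ":" and the path.
_SCP_STYLE = re.compile(
    r"^(?:(?P<user>[^@/:]+)@)?(?P<host>[^/:]+):(?P<path>.*)$"
)


def parse_ssh_url(url):
    """Split an scp-style 'user@host:path' location into its parts."""
    match = _SCP_STYLE.match(url)
    if match is None:
        raise ValueError("not an scp/ssh style URL: %r" % (url,))
    return SshUrl(match.group("user"), match.group("host"), match.group("path"))
```

A plain local path such as `/var/data` has no `host:` part, so the sketch rejects it with `ValueError` rather than guessing.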

10 years agoDeleting commented out code.
Diane Trout [Tue, 16 Jul 2013 22:04:52 +0000 (15:04 -0700)]
Deleting commented out code.

Came up with a better library filter strategy

10 years agoUpdate test to work with species -> species_name rename.
Diane Trout [Mon, 8 Jul 2013 22:02:42 +0000 (15:02 -0700)]
Update test to work with species -> species_name rename.

10 years agoGenerate manifest files for ENCODE3
Diane Trout [Wed, 3 Jul 2013 17:57:10 +0000 (10:57 -0700)]
Generate manifest files for ENCODE3

I added a new option to the trackhub generation script.
There were some changes to the model generation to capture
relative path names and add the library URI to files to
make some queries faster.

10 years agoMerge branch 'django1.4' of mus.cacr.caltech.edu:htsworkflow into django1.4
Diane Trout [Mon, 1 Jul 2013 23:02:46 +0000 (16:02 -0700)]
Merge branch 'django1.4' of mus.cacr.caltech.edu:htsworkflow into django1.4

10 years agoAdd the option to copy tree in addition to making a symlink tree from elsewhere.
Diane Trout [Mon, 1 Jul 2013 22:59:10 +0000 (15:59 -0700)]
Add the option to copy tree in addition to making a symlink tree from elsewhere.

Also don't copy subdirectories in an analysis directory tree being copied
from elsewhere.

10 years agoInitial attempt to start generating trackHubs and manifest files.
Diane Trout [Mon, 1 Jul 2013 22:57:12 +0000 (15:57 -0700)]
Initial attempt to start generating trackHubs and manifest files.

10 years agotype checking more detailed than Literal doesn't work well
Diane Trout [Mon, 1 Jul 2013 22:53:53 +0000 (15:53 -0700)]
type checking more detailed than Literal doesn't work well

10 years agoAdd option to copy source files for a submission.
Diane Trout [Fri, 28 Jun 2013 18:36:31 +0000 (11:36 -0700)]
Add option to copy source files for a submission.

Sometimes it may be worthwhile to keep a copy of the
files being submitted.

10 years agoFurther improve reliability of make_tree_from.
Diane Trout [Tue, 18 Jun 2013 18:59:13 +0000 (11:59 -0700)]
Further improve reliability of make_tree_from.

The previous update assumed that it was going to be running
in the target directory, which wasn't true in the test cases.

I also updated the test case to handle both a base filename and
an absolute pathname for the result map.

10 years agocopy_tree_from wasn't actually making any symlinks.
Diane Trout [Tue, 18 Jun 2013 00:19:22 +0000 (17:19 -0700)]
copy_tree_from wasn't actually making any symlinks.

It turns out at some point the results class switched to using
full paths. This meant my os.path.join calls didn't work, as join
leaves fully qualified paths unchanged.

This patch converts the result lib path to a relative path based
on the destination so it can compute source paths more easily.
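
The path behaviour behind this bug can be reproduced in a few lines. This is a sketch only: the directory names and the helper `source_path_for` are made up, and `posixpath` is used so the result does not depend on the platform.

```python
import posixpath


def source_path_for(source_root, dest_root, lib_path):
    """Map a fully qualified path under dest_root onto source_root."""
    # Convert the absolute path to one relative to the destination tree...
    relative = posixpath.relpath(lib_path, dest_root)
    # ...so that join() actually prepends the source root.
    return posixpath.join(source_root, relative)


# Hypothetical locations, for illustration only.
dest_root = "/tmp/submission"
lib_path = "/tmp/submission/11154"
source_root = "/data/analysis"

# join() discards earlier components when a later one is absolute.
naive = posixpath.join(source_root, lib_path)
fixed = source_path_for(source_root, dest_root, lib_path)
```

Here `naive` is still the destination path, which is why no usable source path was produced, while `fixed` points into the source tree.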

10 years agoUpdate tests for new version of redland rdf lib.
Diane Trout [Tue, 18 Jun 2013 00:18:00 +0000 (17:18 -0700)]
Update tests for new version of redland rdf lib.

Also change from failUnless to assert

10 years agoRemove deprecated adminmedia template tag from loader.
Diane Trout [Tue, 18 Jun 2013 00:09:29 +0000 (17:09 -0700)]
Remove deprecated adminmedia template tag from loader.

Since these templates didn't actually use the feature
I didn't bother replacing adminmedia with staticfiles

10 years agoReplace deprecated django.contrib adminmedia with staticfiles.
Diane Trout [Tue, 18 Jun 2013 00:06:13 +0000 (17:06 -0700)]
Replace deprecated django.contrib adminmedia with staticfiles.

Also update url template tag for django 1.5 syntax.

10 years agoUse proper User model import.
Diane Trout [Tue, 18 Jun 2013 00:00:40 +0000 (17:00 -0700)]
Use proper User model import.

The previous location was an unsupported alias.

11 years agoPreliminary implementation of trackDb generation.
Diane Trout [Fri, 1 Feb 2013 01:04:58 +0000 (17:04 -0800)]
Preliminary implementation of trackDb generation.

This is super preliminary; important report parts are
hard coded instead of being detected properly.

11 years agoForgot to import an exception I used
Diane Trout [Fri, 1 Feb 2013 01:03:58 +0000 (17:03 -0800)]
Forgot to import an exception I used

11 years agoMake a sample key list to go along with our lane list
Diane Trout [Fri, 1 Feb 2013 01:01:02 +0000 (17:01 -0800)]
Make a sample key list to go along with our lane list

At some point I had to add the sample key which could link
lane, library and index together. Some code now expects that
class so I needed to create a "standard" list and pass it in.

11 years agoSupport scanning HiSeq runs with multiple analyses.
Diane Trout [Wed, 16 Jan 2013 01:00:22 +0000 (17:00 -0800)]
Support scanning HiSeq runs with multiple analyses.

It extends the previous C1-100 directory name with the concept of
a suffix extension to that name. The suffix is gathered from whatever the
user came up with for their own Aligned/Unaligned directory names.

I discovered I'd previously been calling the
"run_{flowcell id}_{timestamp}.xml" filename the runfolder name, which is
dumb, as that's a filename, not a name. So this patch renamed it
(since I needed to clean up some of the names to implement the above
"run_dirname" functionality).

Since I was testing the run.name in the test_runfolders* I needed
to fix those. (And I'm really regretting my cut-n-paste programming.)

11 years agoThrow an exception if we can't parse base calling directory.
Diane Trout [Wed, 16 Jan 2013 00:55:20 +0000 (16:55 -0800)]
Throw an exception if we can't parse base calling directory.

While working on making the alignment parsing optional, I made a mistake
copying some files around that led to an unparseable base call directory.

I thought I should try to catch and report that error condition.

11 years agoStart making documentation for htsworkflow.
Diane Trout [Tue, 15 Jan 2013 23:09:56 +0000 (15:09 -0800)]
Start making documentation for htsworkflow.

This is a tiny start toward creating documentation for
htsworkflow, mostly I was getting confused about some of my
classes and thought I should try to get some API documentation
going.

Since I wanted to make sure I had the docstring syntax right I needed
some way to actually build the documentation, and so I might as well
commit what little documentation I created.

11 years agoA better resolution to a possible circular dependency.
Diane Trout [Wed, 9 Jan 2013 00:11:13 +0000 (16:11 -0800)]
A better resolution to a possible circular dependency.

The runfolder subdirectory processing tools (firecrest, bustard, gerald, etc.)
were importing runfolder for some common constants, however runfolder
imported them to actually build the runfolder structure.

My previous solution was to only include the imports for the
sub-directory processing in the function that used them.
However that led to needing nested functions, which seemed confusing.

What I did was move the common constants into pipelines.__init__
and just imported them from there.

11 years agoDon't accidentally transform an object into a tuple.
Diane Trout [Wed, 9 Jan 2013 00:03:34 +0000 (16:03 -0800)]
Don't accidentally transform an object into a tuple.

I had an extra ',' that was turning a simple assignment
into assignment of a tuple containing the variable I was expecting.
Needless to say, this caused trouble.
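
The effect of a stray trailing comma is easy to show in isolation. The function below is a stand-in invented for this example, not code from the repository.

```python
def load_settings():
    """Stand-in for any call that returns a single object."""
    return {"lanes": 8}


broken = load_settings(),   # trailing comma: this binds a 1-tuple
fixed = load_settings()     # plain assignment: this binds the dict itself
```

Code that then treats `broken` as a dictionary fails with a `TypeError`, because a tuple is indexed by position rather than by key.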

11 years agoMerge branch 'django1.4' of mus.cacr.caltech.edu:htsworkflow into django1.4
Diane Trout [Tue, 8 Jan 2013 01:51:28 +0000 (17:51 -0800)]
Merge branch 'django1.4' of mus.cacr.caltech.edu:htsworkflow into django1.4

I fixed one of the time-stamp formats on both development machines.

Conflicts:
htsworkflow/pipelines/gerald.py

11 years agoTry to make Aligned result directories optional in hiseq runs.
Diane Trout [Tue, 8 Jan 2013 01:46:40 +0000 (17:46 -0800)]
Try to make Aligned result directories optional in hiseq runs.

The previous implementation tried to match Aligned & Unaligned
directories by parsing the Aligned directory's config file for
its unaligned raw sequence directory.

Needless to say that didn't work if there wasn't an Aligned
directory.

This version tries to match them by comparing the suffix in
Aligned<Suffix> and Unaligned<Suffix>. Then the runfolder generation
code will still generate a runfolder if there's no aligned directory.
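
Suffix matching of this kind might look roughly like the sketch below. The function name `pair_by_suffix` and the example directory names are assumptions; the real runfolder code works on directories on disk rather than a list of names.

```python
import re

_RESULT_DIR = re.compile(r"^(?P<kind>Aligned|Unaligned)(?P<suffix>.*)$")


def pair_by_suffix(dirnames):
    """Pair Unaligned<Suffix> with Aligned<Suffix>; Aligned may be missing."""
    aligned = {}
    unaligned = {}
    for name in dirnames:
        match = _RESULT_DIR.match(name)
        if match is None:
            continue  # not a result directory
        target = aligned if match.group("kind") == "Aligned" else unaligned
        target[match.group("suffix")] = name
    # Every Unaligned directory yields an entry, with None when no
    # Aligned directory shares its suffix.
    return [
        (suffix, unaligned[suffix], aligned.get(suffix))
        for suffix in sorted(unaligned)
    ]


pairs = pair_by_suffix(["Unaligned", "Aligned", "Unaligned_6mm", "Data"])
```

Driving the pairing from the Unaligned side is what lets a run without any Aligned directory still produce a result.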

11 years agoAdd unaligned stats files to hiseq test case
Diane Trout [Tue, 8 Jan 2013 01:46:06 +0000 (17:46 -0800)]
Add unaligned stats files to hiseq test case

11 years agoTweak timestamp format.
Diane Trout [Tue, 8 Jan 2013 01:30:54 +0000 (17:30 -0800)]
Tweak timestamp format.

11 years agoGerald's time-stamp format was inconsistent.
Diane Trout [Fri, 14 Dec 2012 01:16:20 +0000 (17:16 -0800)]
Gerald's time-stamp format was inconsistent.

The different os / python versions had different defaults for
'%c'. I'd previously changed the read function, but not the
generation function. Also it didn't look quite like some
of my timestamps in my files.

So now both creating the time stamp and parsing the time
stamp are using the same date string.
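
Sharing one explicit format between the writer and the reader can be sketched as follows. The format string and function names here are assumptions; the commit does not say which format htsworkflow settled on.

```python
from datetime import datetime

# Assumed format string; the one used by htsworkflow may differ.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def format_timestamp(moment):
    """Write a timestamp using the shared, locale-independent format."""
    return moment.strftime(TIMESTAMP_FORMAT)


def parse_timestamp(text):
    """Read a timestamp written by format_timestamp()."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


stamp = format_timestamp(datetime(2012, 12, 14, 1, 16, 20))
```

Because both functions refer to the same constant, the two directions cannot drift apart the way a locale-dependent '%c' can.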

11 years agoMake my ChangeList sub-class compatible with Django 1.3
Diane Trout [Fri, 14 Dec 2012 00:58:56 +0000 (16:58 -0800)]
Make my ChangeList sub-class compatible with Django 1.3

Django 1.3's django.contrib.admin.views.main.ChangeList
class takes one fewer parameter than the 1.4 version, as
does the get_query_set function.

I solved this by testing the django.VERSION and adding
the extra parameter to a dictionary and calling with
**kwarg expansion.

Yes it is dirty.
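
The version-gated keyword argument trick can be sketched without Django installed. Everything named here is hypothetical: `build_kwargs` and the two stand-in constructors only imitate two signatures that differ by one parameter, and the version tuple would come from `django.VERSION` in real code.

```python
def build_kwargs(version, common, newer_only):
    """Return keyword arguments suited to the given version tuple."""
    kwargs = dict(common)
    if tuple(version[:2]) >= (1, 4):
        # Only the newer signature accepts the extra parameter.
        kwargs.update(newer_only)
    return kwargs


# Stand-ins for two constructor signatures that differ by one parameter.
def older_constructor(request, model):
    return ("older", request, model)


def newer_constructor(request, model, extra):
    return ("newer", request, model, extra)


common = {"request": "req", "model": "Library"}
old_result = older_constructor(**build_kwargs((1, 3, 0), common, {"extra": 200}))
new_result = newer_constructor(**build_kwargs((1, 4, 2), common, {"extra": 200}))
```

Building the dictionary first keeps the version test in one place instead of duplicating the whole call in each branch.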

11 years agoMake the inventory pages work with the new HTSChangeList.
Diane Trout [Thu, 13 Dec 2012 22:21:52 +0000 (14:21 -0800)]
Make the inventory pages work with the new HTSChangeList.

There were a few problems. I was calling get_result_set in the template,
which requires the HTTP request, which isn't available there. I needed to
be using result_list instead.

I also changed the name in the context and failed to use the right model
for one of the indexes.

11 years agoDjango 1.4 requires the csrf token for posts.
Diane Trout [Thu, 13 Dec 2012 22:20:51 +0000 (14:20 -0800)]
Django 1.4 requires the csrf token for posts.

The old login form copied from django 1.1's admin directory was missing it.

11 years agoUse specific time formatting instead of locale '%c'
Diane Trout [Thu, 13 Dec 2012 19:40:10 +0000 (11:40 -0800)]
Use specific time formatting instead of locale '%c'

For some reason under my current configuration the
locale formatter didn't work. Since it's probably better
to actually set a known format for the xml files, I
manually coded the appropriate string parser format.

11 years agoMerge branch 'master' of mus.cacr.caltech.edu:htsworkflow into django1.4
Diane Trout [Thu, 13 Dec 2012 18:21:36 +0000 (10:21 -0800)]
Merge branch 'master' of mus.cacr.caltech.edu:htsworkflow into django1.4

11 years agoSimplify linking fastq files to their library id.
Diane Trout [Wed, 12 Dec 2012 23:39:00 +0000 (15:39 -0800)]
Simplify linking fastq files to their library id.

Unlike my previous effort which required the fastq generation
script to generate dc:source entries to match fastqs to libraries,
this version just parses the generated fastq filename.

This does mean that a manually generated file might not work.

I accomplished this by writing a class to generate the
fastq (for submission) filenames and to parse them, so at least
all that code is in one place.

Also after attaching the fastq metadata to the file node,
I discovered the website's use of language tags on strings
made my query fail. So I changed the toTypedNode to take an optional
language tag. (Defaults to "en").

11 years agoAdd a FastqName class to create and parse standardized fastq names.
Diane Trout [Sat, 8 Dec 2012 01:40:39 +0000 (17:40 -0800)]
Add a FastqName class to create and parse standardized fastq names.

I had a pretty standard naming convention for the fastq file names,
instead of duplicating the code for creating & parsing them,
I thought I should try to localize the code.

So I just added htsworkflow.submission.fastqname
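
The idea of keeping name creation and parsing in one class can be sketched as below. This is not the FastqName class from htsworkflow.submission.fastqname: the class name, the filename layout, and the example values are all invented for illustration.

```python
import re


class FastqNameSketch(object):
    """Create and parse fastq filenames from a single shared convention."""

    # Hypothetical naming convention; the real one is not given in the log.
    TEMPLATE = "{lib_id}_{flowcell}_c{cycle}_l{lane}_r{read}.fastq"
    PATTERN = re.compile(
        r"^(?P<lib_id>[^_]+)_(?P<flowcell>[^_]+)_c(?P<cycle>\d+)"
        r"_l(?P<lane>[^_]+)_r(?P<read>\d+)\.fastq$"
    )

    def __init__(self, **attributes):
        self.attributes = attributes

    @property
    def filename(self):
        return self.TEMPLATE.format(**self.attributes)

    @classmethod
    def parse(cls, filename):
        match = cls.PATTERN.match(filename)
        if match is None:
            raise ValueError("unrecognized fastq name: %r" % (filename,))
        return cls(**match.groupdict())


# Made-up identifiers, for illustration only.
name = FastqNameSketch(lib_id="12345", flowcell="42JUYAAXX",
                       cycle="76", lane="1", read="1")
round_trip = FastqNameSketch.parse(name.filename)
```

Keeping the template and the pattern next to each other makes it harder for the writer and the parser to disagree about the layout.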

11 years agoChange my copied version of admin.changelist to a subclass.
Diane Trout [Fri, 30 Nov 2012 01:13:44 +0000 (17:13 -0800)]
Change my copied version of admin.changelist to a subclass.

Why repeat myself when I can subclass for customization purposes.
It works for the samples pages, but the inventory lists
don't work with this new code.

11 years agoThe formatting for the model name changed.
Diane Trout [Fri, 30 Nov 2012 01:01:14 +0000 (17:01 -0800)]
The formatting for the model name changed.

It dropped the <FileType: > stuff for the unicode call.
I wonder why.

11 years agoAdd a few more records for django db initial_data.json.
Diane Trout [Thu, 29 Nov 2012 03:06:33 +0000 (19:06 -0800)]
Add a few more records for django db initial_data.json.

Django 1.4's test fixture loader wanted to verify referential
integrity, so I needed to add in some of the supporting records
for sample data that had slipped into test cases.

11 years agoUse unittest2's module hooks for setting up the django environment.
Diane Trout [Thu, 29 Nov 2012 00:45:33 +0000 (16:45 -0800)]
Use unittest2's module hooks for setting up the django environment.

It'll set up and destroy the test environment as well as
configure the email handler per module test.

I only added the code to the modules that were using
django.test.TestCase

11 years agoChange csrf imports and database settings for django 1.4
Diane Trout [Thu, 29 Nov 2012 00:39:43 +0000 (16:39 -0800)]
Change csrf imports and database settings for django 1.4

They moved the csrf protection code from django 1.1. I needed to change
some imports in both a few modules and the settings file.

Additionally the way to specify the database changed, the
old 1.1 version is still in there as of this patch.

11 years agoRemove prints that were being called in test code.
Diane Trout [Thu, 29 Nov 2012 00:01:04 +0000 (16:01 -0800)]
Remove prints that were being called in test code.

It messes up my dots. Unfortunately the fastq validator is still
ugly. But at least its messages are a lot shorter.

There was one debugging function ls_tree in simulate_runfolder
that I renamed to print_ls_tree because I'll occasionally remember
to grep for print to find things I shouldn't commit.

11 years agoMerge branch 'master' of mus.cacr.caltech.edu:htsworkflow
Diane Trout [Wed, 28 Nov 2012 19:37:34 +0000 (11:37 -0800)]
Merge branch 'master' of mus.cacr.caltech.edu:htsworkflow

11 years agoChanged name= to verbose_name= for LibraryType.
Diane Trout [Wed, 28 Nov 2012 19:34:40 +0000 (11:34 -0800)]
Changed name= to verbose_name= for LibraryType.

Strangely Django <1.4 didn't notice the error when importing
the fixtures. It was trying to use the name "Adapter Type"
as the database name, instead of the actual column name "name"
for the LibraryType table.

11 years agoMerge branch 'master' of mus.cacr.caltech.edu:htsworkflow
Diane Trout [Wed, 28 Nov 2012 19:22:19 +0000 (11:22 -0800)]
Merge branch 'master' of mus.cacr.caltech.edu:htsworkflow

11 years agoMake extra sure Django's setup test environment is run.
Diane Trout [Wed, 28 Nov 2012 19:19:58 +0000 (11:19 -0800)]
Make extra sure Django's setup test environment is run.

When running under unit2 discover, the mail test was using
a real mail server as it didn't know to run Django's
setup_test_environment. This rather heavy handedly runs
the django setup/teardown functions for the TestEmailNotify module.

11 years agoConvert to unittest2
Diane Trout [Wed, 28 Nov 2012 00:37:55 +0000 (16:37 -0800)]
Convert to unittest2

Test cases inherit from either unittest2 or django.test.TestCase
I should be able to use skip tests in the future.

I learned inheriting from django.test.TestCase will properly set up
the database for django tests. (Well at least mostly, I'm having
some possible errors on 1.4)

11 years agoMerge branch 'master' of mus.cacr.caltech.edu:htsworkflow
Diane Trout [Tue, 27 Nov 2012 22:28:08 +0000 (14:28 -0800)]
Merge branch 'master' of mus.cacr.caltech.edu:htsworkflow

11 years agoIgnoring the missing type RDF error for the page under testing's url.
Diane Trout [Tue, 27 Nov 2012 22:22:33 +0000 (14:22 -0800)]
Ignoring the missing type RDF error for the page under testing's url.

When testing, the stylesheet gets attached to the page's url. All
the meaningful information about the sample or experiment still
gets added to the right <host>/<category>/<id> pages,
unfortunately that means the page url doesn't have a type which
caused _validate_types to toss an error.

I'd previously fixed it by testing for the error message and
filtering it out from the test code, but that didn't work on
ubuntu 10.04 as the error message changes slightly with the
older version of redland rdf.

This version changes the sparql query to ignore the case
where the predicate is a stylesheet and there's no type.

11 years agoAdd dependencies to the setup.py
Diane Trout [Tue, 27 Nov 2012 18:30:36 +0000 (10:30 -0800)]
Add dependencies to the setup.py

I do need to make benderjab public before other people
could install this. Or maybe figure out how to use the optional
dependency mode.

11 years agoTest presence of species & species name on library index page.
Diane Trout [Tue, 20 Nov 2012 22:37:49 +0000 (14:37 -0800)]
Test presence of species & species name on library index page.

11 years agoCorrectly implement merging Notification & Manager sets
Diane Trout [Tue, 20 Nov 2012 22:36:49 +0000 (14:36 -0800)]
Correctly implement merging Notification & Manager sets

Apparently I was off imagining functions that don't exist.

11 years agoMerge branch 'master' of mus.cacr.caltech.edu:htsworkflow
Diane Trout [Tue, 20 Nov 2012 22:31:05 +0000 (14:31 -0800)]
Merge branch 'master' of mus.cacr.caltech.edu:htsworkflow

11 years agoRenamed various django tests.py to test_module.py
Diane Trout [Tue, 20 Nov 2012 22:29:20 +0000 (14:29 -0800)]
Renamed various django tests.py to test_module.py

It appears by default py.test was looking for packages named test_,
since it wasn't finding the tests.py modules. This seemed like
a reasonable alternative convention.

11 years agoIgnore .tox directory
Diane Trout [Tue, 20 Nov 2012 20:51:39 +0000 (12:51 -0800)]
Ignore .tox directory

I was experimenting with the virtualenv testing tool tox.testrun.org
and I might as well ignore its directory

11 years agoFix RDF schema problems with lane_number and species.
Diane Trout [Tue, 20 Nov 2012 20:46:41 +0000 (12:46 -0800)]
Fix RDF schema problems with lane_number and species.

My RDF schema was using the term "species" both for the species
name and a species class -- which doesn't make sense. This
version of the schema introduces a species_name which can
attach either to the Library object or the Species object.

It's still a little inconsistent as I'm using it for both
"common name" and "scientific name". But hey, it's an improvement.

Also there's a tweak to the library_number type on the library detail
page setting the type to string instead of number, as I decided
it should be treated internally as an opaque identifier.
In theory someone might start naming lanes A,B,C,D or 1T, 1B
(for the top and bottom of a flowcell slide).

Finally I decided that the gel_cut should be of type integer.
Yes, decimal is the "more general type", but I'm using integer in
my sql schema so it's only going to return integers.

11 years agoRun the library detail page through RDF validation.
Diane Trout [Tue, 20 Nov 2012 20:41:57 +0000 (12:41 -0800)]
Run the library detail page through RDF validation.

Also ignore the missing type error message for http://localhost/
as that resource really shouldn't have a type.

This improved test does catch a few new model inconsistencies
which I'll fix in my next patch.

11 years agoChange add_default_schema to use pkg_resources feature to find schemas.
Diane Trout [Tue, 20 Nov 2012 06:16:08 +0000 (22:16 -0800)]
Change add_default_schema to use pkg_resources feature to find schemas.

I was trying to get py.test to work and it really wants to install
things, and my previous method to find the schema files wasn't working
very well with the egg distribution.

11 years agoAdd dependency information to the setup.py script
Diane Trout [Tue, 20 Nov 2012 01:09:47 +0000 (17:09 -0800)]
Add dependency information to the setup.py script

Though it's still missing a bit as I don't have benderjab
hosted and librdf needs to be installed separately.

11 years agordf:Resource can be either a resource or a blank node.
Diane Trout [Tue, 20 Nov 2012 01:04:12 +0000 (17:04 -0800)]
rdf:Resource can be either a resource or a blank node.

Thus we should only toss an error in the case of a node being
a literal.

11 years agoUniquely merge BCC and Manager lists for sending notification email.
Diane Trout [Tue, 20 Nov 2012 00:50:49 +0000 (16:50 -0800)]
Uniquely merge BCC and Manager lists for sending notification email.

This uses a set so that each email address only receives one copy of a notification.
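
Merging two recipient lists through a set can be sketched as follows. The function name and addresses are made up, and plain address strings are assumed for both lists.

```python
def merge_recipients(bcc, managers):
    """Combine two address lists so each address appears exactly once."""
    # Sorting is only there to make the result predictable.
    return sorted(set(bcc) | set(managers))


# Hypothetical addresses, for illustration only.
recipients = merge_recipients(
    ["lab@example.org", "boss@example.org"],
    ["boss@example.org", "admin@example.org"],
)
```

The set union drops the address that appears in both lists, so that person gets a single message.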

11 years agoFurther attempts to validate RDF models.
Diane Trout [Fri, 16 Nov 2012 00:01:04 +0000 (16:01 -0800)]
Further attempts to validate RDF models.

I had a bug caused by lane numbers being language tagged strings,
and thus not being found by my sparql query.

I found a solution to filter based on just the contents of a string
ignoring the language tag. However I thought not only should I
make it easier to run my RDF model validation code, I should also
double check the literal types.

Previously I just tagged any literal as rdfs:Literal. For ones
that should have a known type, I've changed it to the xmlschema
types.

This patch doesn't actually fix the bug. Just introduces the
diagnostic tool.

11 years agoAdd error message for typoing a result map filename.
Diane Trout [Wed, 14 Nov 2012 19:30:35 +0000 (11:30 -0800)]
Add error message for typoing a result map filename.

As a gift to me when trying to do something while sleepy.

11 years agoMerge branch 'master' of mus.cacr.caltech.edu:htsworkflow
Diane Trout [Wed, 14 Nov 2012 00:38:20 +0000 (16:38 -0800)]
Merge branch 'master' of mus.cacr.caltech.edu:htsworkflow

11 years agoCatch another lookup error.
Diane Trout [Thu, 11 Oct 2012 23:32:27 +0000 (16:32 -0700)]
Catch another lookup error.

Apparently one of gerald_summary[read][lane] was a dictionary
and so threw a KeyError instead of an IndexError. I might as well
catch both.
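
Catching both lookup failures might look like the sketch below. The helper name and sample data are invented; only the shape `summary[read][lane]` comes from the commit message.

```python
def lookup_lane(summary, read, lane):
    """Return summary[read][lane], or None if either lookup fails."""
    try:
        return summary[read][lane]
    except (IndexError, KeyError):
        # Lists raise IndexError and dictionaries raise KeyError.
        return None


# Hypothetical data mixing a dictionary and a list at the lane level.
summary = [{1: "lane one"}, ["a", "b"]]
```

Both exceptions derive from `LookupError`, so catching that single base class would be an equivalent alternative.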

11 years agoolder librdf wanted to include type information when showing query strings.
Diane Trout [Thu, 27 Sep 2012 18:46:56 +0000 (11:46 -0700)]
older librdf wanted to include type information when showing query strings.
So I ran everything through fromTypedNode to convert the nodes
to simple python types.

However fromTypedNode wasn't intended to handle non-literal nodes
so I had to adjust it a bit to return resource nodes safely.

11 years agoWhen collecting files for a geo submission, group on library id
Diane Trout [Thu, 27 Sep 2012 17:37:56 +0000 (10:37 -0700)]
When collecting files for a geo submission, group on library id
instead of the previous grouping on lane.

11 years agoThis might actually generate soft file with raw & supplemental data.
Diane Trout [Tue, 25 Sep 2012 23:18:42 +0000 (16:18 -0700)]
This might actually generate soft file with raw & supplemental data.

To make working with the development server easier, I changed
the submission class to take a host which it will use to generate
the base library url.

When constructing URLs for files, I'm now using the actual path names
instead of synthesizing something based on the submission name.
This is to limit the amount of knowledge that needs to be passed
between the fastq generation code.

For fastq files it looks at the source file to find the flowcell
information. For supplemental files it looks at the submission
class for that analysis directory and grabs the library id
from there.

11 years agoMerge changing lane_number to string and sequence finding code changes
Diane Trout [Mon, 24 Sep 2012 23:43:33 +0000 (16:43 -0700)]
Merge changing lane_number to string and sequence finding code changes
I started using actual file paths instead of synthetic submission
paths for naming where my sequence files are.

This one still won't generate geo submissions correctly as I'm
pretty sure not all of the queries have been updated yet.

11 years agoAdd a log message for debugging
Diane Trout [Mon, 24 Sep 2012 23:37:30 +0000 (16:37 -0700)]
Add a log message for debugging

11 years agoDefine XHTML_RDF_DTD as None when we can't load the DTD
Diane Trout [Mon, 24 Sep 2012 23:34:44 +0000 (16:34 -0700)]
Define XHTML_RDF_DTD as None when we can't load the DTD

11 years agoMake the public html pages valid xhtml, and validate more RDFa cases.
Diane Trout [Mon, 24 Sep 2012 22:28:10 +0000 (15:28 -0700)]
Make the public html pages valid xhtml, and validate more RDFa cases.

Also after I spent time playing with the w3c online validator,
I decided it was best to try and add modest validation to my
unit tests.

So now there's a validate_xhtml function in ethelp.

The one really weird thing is I tried to load the DTD
in the test case, however it looks like librdf clobbered the
XML catalog resolver at some point so the DTD resolver can't
find anything.

11 years agoremove some dead commented out code.
Diane Trout [Mon, 24 Sep 2012 22:26:14 +0000 (15:26 -0700)]
remove some dead commented out code.

11 years agoFix (some) missing closing tags.
Diane Trout [Thu, 20 Sep 2012 22:17:53 +0000 (15:17 -0700)]
Fix (some) missing closing tags.

11 years agoMake a validation error message between different ages of librdf.
Diane Trout [Thu, 20 Sep 2012 21:35:51 +0000 (14:35 -0700)]
Make a validation error message between different ages of librdf.
result.uri vs result again.

11 years agoAlso make the library index page conform to htsworkflow ontology.
Diane Trout [Thu, 20 Sep 2012 21:26:46 +0000 (14:26 -0700)]
Also make the library index page conform to htsworkflow ontology.

11 years agoMinor tweaks to deal with the older version of librdf on ubuntu 10.04
Diane Trout [Thu, 20 Sep 2012 00:08:43 +0000 (17:08 -0700)]
Minor tweaks to deal with the older version of librdf on ubuntu 10.04
things like utf-8 escaping a string, using str(node.uri) instead
of str(node).

11 years agoUse htsworkflow ontology to validate various RDF using components.
Diane Trout [Wed, 19 Sep 2012 23:10:57 +0000 (16:10 -0700)]
Use htsworkflow ontology to validate various RDF using components.
Of course to use the ontology I had to make one first.
Unsurprisingly implementing it touched a bunch of code & templates.

I tried to be more consistent with using mixed-case names for
classes and lower_case names for properties.

There are some inconsistencies, like I use the terms notes & comments
in different areas. Also, should I be using my own terms or
do better at reusing more standard ontologies?

11 years agoRefactor property type validator to support multiple classes for domain/range.
Diane Trout [Wed, 19 Sep 2012 23:04:35 +0000 (16:04 -0700)]
Refactor property type validator to support multiple classes for domain/range.
Also test to make sure we can have more than one domain/range statement.

11 years agoAdd stub xhtml vocab ontology, to make model validation quieter.
Diane Trout [Wed, 19 Sep 2012 23:03:07 +0000 (16:03 -0700)]
Add stub xhtml vocab ontology, to make model validation quieter.
(the stylesheets got attached as a property of the library or flowcell)

11 years agoMerge ssh://jumpgate.caltech.edu/var/htsworkflow/htsworkflow
Diane Trout [Tue, 18 Sep 2012 23:34:27 +0000 (16:34 -0700)]
Merge ssh://jumpgate.caltech.edu/var/htsworkflow/htsworkflow

11 years agoemail.bcc should be a list, not a nested list.
Diane Trout [Tue, 18 Sep 2012 23:33:01 +0000 (16:33 -0700)]
email.bcc should be a list, not a nested list.
NOTIFICATION_BCC was already a list of options.
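
The difference between the nested and the flat form is shown below as a small sketch. The addresses are made up; only the setting name NOTIFICATION_BCC comes from the commit message.

```python
# Hypothetical setting value; it is already a list of addresses.
NOTIFICATION_BCC = ["lab@example.org", "admin@example.org"]

nested_bcc = [NOTIFICATION_BCC]      # wrong: a list containing one list
flat_bcc = list(NOTIFICATION_BCC)    # right: a list of address strings
```

Copying with `list()` also avoids handing the message a reference to the shared settings object.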

11 years agoMerge branch 'master' of mus.cacr.caltech.edu:htsworkflow
Diane Trout [Tue, 18 Sep 2012 18:36:16 +0000 (11:36 -0700)]
Merge branch 'master' of mus.cacr.caltech.edu:htsworkflow

11 years agoImprovements to rdfinfer.
Diane Trout [Tue, 18 Sep 2012 18:34:25 +0000 (11:34 -0700)]
Improvements to rdfinfer.

Add rule to infer class and subClassOf memberships,
add testing for the class case.

Add code to run all the validation rules.

11 years agoChange rdfhelp.dump_model so you can specify a destination stream.
Diane Trout [Tue, 18 Sep 2012 18:31:49 +0000 (11:31 -0700)]
Change rdfhelp.dump_model so you can specify a destination stream.

11 years agoProgress using rdf model to link fastqs with flowcell/lib metadata.
Diane Trout [Tue, 18 Sep 2012 18:20:26 +0000 (11:20 -0700)]
Progress using rdf model to link fastqs with flowcell/lib metadata.

I changed how I was using rdf:type -- the most raw data is now
a 'sequencer_result' and now there's a separate file_type
attribute to indicate what kind of result file it is.

I renamed find_missing_targets to update_fastq_targets as
in addition to finding what fastqs we need to generate it'll
also download missing flowcell information.

I'm still having trouble fishing out the fastq files so this isn't
ready yet.

Finally minor tweaks to the soft file formatting to try
and get it to render everything without spurious spaces.

11 years agoMerge branch 'master' of mus.cacr.caltech.edu:htsworkflow
Diane Trout [Tue, 18 Sep 2012 17:55:36 +0000 (10:55 -0700)]
Merge branch 'master' of mus.cacr.caltech.edu:htsworkflow

11 years agoStart implementing inferring triples.
Diane Trout [Sat, 15 Sep 2012 05:49:29 +0000 (22:49 -0700)]
Start implementing inferring triples.

This includes utilities to import the common schemas, and
a bit of functionality for validating models, in
addition to the rule to compute inverseOf.
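
An inverseOf rule can be sketched over plain tuples, without librdf. This is an illustration of the general rule, not the rdfinfer code: the function name, the prefixed-name strings, and the sample triples are assumptions, and each property is assumed to have at most one declared inverse.

```python
OWL_INVERSE_OF = "owl:inverseOf"


def infer_inverse(triples):
    """Return the new triples implied by owl:inverseOf declarations."""
    triples = set(triples)
    # Record each declared pair in both directions.
    inverses = {}
    for subject, predicate, obj in triples:
        if predicate == OWL_INVERSE_OF:
            inverses[subject] = obj
            inverses[obj] = subject
    inferred = set()
    for subject, predicate, obj in triples:
        if predicate in inverses:
            candidate = (obj, inverses[predicate], subject)
            if candidate not in triples:
                inferred.add(candidate)
    return inferred


# Hypothetical model, for illustration only.
model = {
    ("ex:has_lane", OWL_INVERSE_OF, "ex:flowcell"),
    ("ex:flowcell1", "ex:has_lane", "ex:lane1"),
}
new_triples = infer_inverse(model)
```

Returning only the triples that were not already present makes it easy to tell when repeated application has stopped producing anything new.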

11 years agoBe more defensive if the database is missing some data instead of
Diane Trout [Wed, 12 Sep 2012 18:35:37 +0000 (11:35 -0700)]
Be more defensive if the database is missing some data instead of
crashing on trying to access an empty list.

11 years agofix a wrong variable name
Diane Trout [Wed, 12 Sep 2012 18:34:54 +0000 (11:34 -0700)]
fix a wrong variable name