runfolder.py can generate an XML file capturing all the 'interesting'
parameters from a finished pipeline run (using the -a option).
The information currently being captured includes:

  * start/stop cycle numbers
  * Firecrest, Bustard, and GERALD version numbers
  * ELAND analysis types, and everything in the ELAND configuration file
  * cluster numbers and other values from the Summary.htm
    LaneSpecificParameters table
  * how many reads mapped to a genome, counted from an ELAND file
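
To illustrate roughly what such an archive could look like, here is a
minimal sketch using xml.etree.ElementTree. The element names (Run,
Cycles, Software, Lanes, Lane), the build_run_xml helper, and every value
are hypothetical placeholders; the real layout is whatever
runfolder.extract_run_parameters writes.

```python
import xml.etree.ElementTree as ET


def build_run_xml(start_cycle, stop_cycle, versions, lanes):
    """Return an Element describing one run (hypothetical schema)."""
    run = ET.Element('Run')

    # start/stop cycle numbers stored as attributes
    cycles = ET.SubElement(run, 'Cycles')
    cycles.set('start', str(start_cycle))
    cycles.set('stop', str(stop_cycle))

    # one child element per pipeline component version
    software = ET.SubElement(run, 'Software')
    for name in sorted(versions):
        ET.SubElement(software, name).text = versions[name]

    # per-lane values such as cluster and mapped read counts
    lanes_el = ET.SubElement(run, 'Lanes')
    for lane_id in sorted(lanes):
        lane = ET.SubElement(lanes_el, 'Lane', id=str(lane_id))
        for key in sorted(lanes[lane_id]):
            ET.SubElement(lane, key).text = str(lanes[lane_id][key])
    return run


# Placeholder values for illustration only, not taken from a real run.
example = build_run_xml(
    1, 36,
    {'Firecrest': '0.0', 'Bustard': '0.0', 'GERALD': '0.0'},
    {1: {'Clusters': 100, 'MappedReads': 80}},
)
xml_bytes = ET.tostring(example)
```

Keeping one Lane element per lane means a later report can look values
up by lane id without re-reading the run folder.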

The ELAND "mapped reads" counter will also check for ELAND squashed files
that were symlinked from another directory. This is so I can track how
many reads landed on the genome of interest and how many on the spike-ins.

Basically, my subdirectories look something like:

  genomes/hg18/chr*.2bpb <- files for hg18 genome
  genomes/hg18/VATG.fa.2bp <- symlink to genomes/spikeins
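
One way to tell the symlinked spike-in files apart from the genome's own
files is to resolve each path and group by the directory it really lives
in. This is only a sketch under that assumption: group_squashed_files and
its suffixes argument are hypothetical names, not part of the runfolder
module.

```python
import os


def group_squashed_files(genome_dir, suffixes=('.2bpb', '.2bp')):
    """Map each real source directory to the squashed files found there."""
    groups = {}
    for name in sorted(os.listdir(genome_dir)):
        if not name.endswith(suffixes):
            continue
        path = os.path.join(genome_dir, name)
        # realpath follows symlinks, so a linked spike-in file reports the
        # directory it actually lives in rather than genome_dir.
        source_dir = os.path.dirname(os.path.realpath(path))
        groups.setdefault(source_dir, []).append(name)
    return groups
```

With the layout above, calling group_squashed_files('genomes/hg18')
should give one entry for the hg18 directory and one for the spike-in
directory, so reads can be tallied separately for each.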

runfolder.py can also produce a simple summary report (-s option)
that contains the per-lane post-filter cluster numbers and the mapped
read counts. (The report isn't currently very pretty.)
import logging
import optparse
import sys

from gaworkflow.pipeline import runfolder

def make_parser():
    usage = 'usage: %prog [options] runfolder_root_dir'
    parser = optparse.OptionParser(usage)
    parser.add_option('-v', '--verbose', dest='verbose', action='store_true',
                      help='turn on verbose mode')
    parser.add_option('-s', '--summary', dest='summary', action='store_true',
                      help='produce summary report')
    parser.add_option('-a', '--archive', dest='archive', action='store_true',
                      help='generate run configuration archive')
    return parser

def main(cmdlist=None):
    parser = make_parser()
    opt, args = parser.parse_args(cmdlist)
    if len(args) == 0:
        parser.error('need path to a runfolder')

    logging.basicConfig()
    if opt.verbose:
        root_log = logging.getLogger()
        root_log.setLevel(logging.INFO)

    # Process each runfolder root given on the command line.
    for run_dir in args:
        runs = runfolder.get_runs(run_dir)
        if opt.summary:
            print(runfolder.summary_report(runs))
        if opt.archive:
            runfolder.extract_run_parameters(runs)
    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))