X-Git-Url: http://woldlab.caltech.edu/gitweb/?p=erange.git;a=blobdiff_plain;f=docs%2FREADME.build-rds;fp=docs%2FREADME.build-rds;h=1c08a27a90f72a7c984660d4bae14c5c4b0f534d;hp=ef668d2bd184a73643dae51dea1bf7128bc799af;hb=0d3e3112fd04c2e6b44a25cacef1d591658ad181;hpb=5e4ae21098dba3d1edcf11e7279da0d84c3422e4
diff --git a/docs/README.build-rds b/docs/README.build-rds
index ef668d2..1c08a27 100644
--- a/docs/README.build-rds
+++ b/docs/README.build-rds
@@ -137,9 +137,9 @@ single file e.g. test.comb.eland2
 The makerdsfromeland2.py script is used to import the reads into RDS:
-python makerdsfromeland2.py label infilename outrdsfile [-append] [-RNA ucscGeneModels]
-[propertyName::propertyValue] [-index] [-paired 1 or 2] [-extended] [-verbose]
-[-olddelimiter] [-maxlines num] [-cache numPages]
+python makerdsfromeland2.py label infilename outrdsfile [--append] [--RNA ucscGeneModels]
+[propertyName::propertyValue] [--index] [--paired 1 or 2] [--extended] [--verbose]
+[--olddelimiter] [--maxlines num] [--cache numPages]
 The first 3 arguments are required:
 - label is any label that you wish (a combination flowcell+lane#
@@ -149,18 +149,18 @@ is a good choice)
 - outdbname is the name of the rds file, e.g. test.rds
 If the reads are from paired-end runs, enter each eland_multi
-(or extended) file separately with the "-paired 1" or "-paired 2"
+(or extended) file separately with the "--paired 1" or "--paired 2"
 flag, as appropriate.
-If entering more than one lane, use -append for all subsequent
-lanes. Upon entering the last lane, use -index to build a read
+If entering more than one lane, use --append for all subsequent
+lanes. Upon entering the last lane, use --index to build a read
 index.
 Refer to MANIPULATING RDS METADATA AND CACHING for information
 on the optional property::value pairs and caching.
 For RNA-seq, you must in addition specify the path to knownGene.txt
-using the -RNA flag, e.g.
+using the --RNA flag, e.g.
-python $ERANGEPATH/makerdsfromeland2.py myRNAlabel myRNA.eland_multi.txt rnatest.rds -RNA ../mm9/knownGene.txt [more options]
+python $ERANGEPATH/makerdsfromeland2.py myRNAlabel myRNA.eland_multi.txt rnatest.rds --RNA ../mm9/knownGene.txt [more options]
 6. MAPPING READS WITH BOWTIE
@@ -187,13 +187,13 @@ python $ERANGEPATH/makerdsfrombowtie.py testLabel s1.mm9.bowtie.txt bowtietest.r
 The options for the script are:
 python makerdsfrombowtie.py label infilename outrdsfile
-[-RNA ucscGeneModels] [-append] [-index] [propertyName::propertyValue]
-[-rawreadID] [-verbose] [-cache numPages]
+[--RNA ucscGeneModels] [--append] [--index] [propertyName::propertyValue]
+[--rawreadID] [--verbose] [--cache numPages]
 Refer to "MAPPING READS WITH ELAND" for a description of label,
-infilename, outdbname, '-append', '-index', and '-cache'.
+infilename, outdbname, '--append', '--index', and '--cache'.
-****REMEMBER TO USE -index WHEN LOADING THE LAST LANE OF YOUR
+****REMEMBER TO USE --index WHEN LOADING THE LAST LANE OF YOUR
 DATASET.****
 The script assumes that the read ID are from Illumina, i.e. that
@@ -210,9 +210,9 @@ throw_away:uniqueid if unpaired
 throw_away:uniqueid/1 and throw_away:uniqueid/2 for paired-ends.
 For RNA-seq, you must in addition specify the path to knownGene.txt
-using the -RNA flag, e.g.
+using the --RNA flag, e.g.
-python $ERANGEPATH/makerdsfrombowtie.py myRNAlabel myRNA.bowtie.txt rnatest.rds -RNA ../mm9/knownGene.txt [more options]
+python $ERANGEPATH/makerdsfrombowtie.py myRNAlabel myRNA.bowtie.txt rnatest.rds --RNA ../mm9/knownGene.txt [more options]
 7. MAPPING READS WITH BLAT
@@ -239,20 +239,20 @@ Once the reads have been filtered, the makerdsfromblat.py script is
 used to import the mapped reads (in the example above s3_1.hg18.blatbetter)
 into RDS:
-python makerdsfromblat.py label infilename outrdsfile [-append] [-index] [propertyName::propertyValue]
-[-rawreadID] [-forceRNA] [-flag] [-strict minSpliceLen] [-spliceonly] [-verbose] [-cache numPages]
+python makerdsfromblat.py label infilename outrdsfile [--append] [--index] [propertyName::propertyValue]
+[--rawreadID] [--forceRNA] [--flag] [--strict minSpliceLen] [--spliceonly] [--verbose] [--cache numPages]
 If you are using BLAT for RNA-seq, please be sure to use
--forceRNA in order to import spliced reads and consider
-using -strict to require a minimum length of bases on
+--forceRNA in order to import spliced reads and consider
+using --strict to require a minimum length of bases on
 each side of the splice.
 You can combine BOWTIE and BLAT by mapping reads with BOWTIE
 first, and then using BLAT to map the unmapped reads. In that
 case, you may want to only load the spliced reads
-using the -spliceonly flag. To track those reads in the RDS
-file, use -flag ; you can then retrieve those reads using
-the options "-flag blat -flagLike" with the makebedfromrds.py
+using the --spliceonly flag. To track those reads in the RDS
+file, use --flag ; you can then retrieve those reads using
+the options "--flag blat --flagLike" with the makebedfromrds.py
 script.
@@ -266,7 +266,7 @@ have neither the multireads nor the spliced reads.
 The command line options are similar to those for other scripts
 described in part 5-7:
-python makerdsfrombed.py label bedfile outrdsfile [-append] [-index] [propertyName::propertyValue] [-cache numPages]
+python makerdsfrombed.py label bedfile outrdsfile [--append] [--index] [propertyName::propertyValue] [--cache numPages]
 9. COMBINING RDS FILES
@@ -277,7 +277,7 @@ of importing all tables or specific ones (e.g. uniqs, splices).
 The combinerds.py command options are:
-python combinerds.py destinationRDS inputrds1 [inputrds2 ....] [-table table_name] [-init] [-initrna] [-index] [-cache pages]
+python combinerds.py destinationRDS inputrds1 [inputrds2 ....] [--table table_name] [--init] [--initrna] [--index] [--cache pages]
 10. MANIPULATING RDS METADATA AND CACHING
@@ -320,6 +320,7 @@ basis for mammalian genomes.
 RELEASE HISTORY
+version 3.3 November 2010 - updated command line options
 version 3.2 October 2009 - added combinerds.py
 version 3.01 February 2009 - bug fixes
 version 3.0 January 2009 - added logging to buildrdsfrom*
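
Note for anyone with wrapper scripts written against version 3.2: every change to the usage text in this diff is the same rewrite, a single-dash long option becoming a double-dash one. The sketch below shows one way such an argument list could be upgraded mechanically. It is not part of ERANGE; the helper name upgrade_args and the RENAMED_FLAGS set are hypothetical, and the set holds only the option names that appear on the changed lines of this diff, so options of scripts not touched here are left alone.

```python
# Hypothetical migration helper, not part of ERANGE.
# Option names collected from the removed/added line pairs in this diff.
RENAMED_FLAGS = {
    "append", "RNA", "index", "paired", "extended", "verbose",
    "olddelimiter", "maxlines", "cache", "rawreadID", "forceRNA",
    "flag", "flagLike", "strict", "spliceonly", "table", "init", "initrna",
}


def upgrade_args(args):
    """Return a copy of args with old '-name' options rewritten as '--name'.

    Only exact matches against RENAMED_FLAGS are rewritten, so file names,
    negative numbers and property::value pairs pass through unchanged.
    """
    upgraded = []
    for arg in args:
        is_old_style = arg.startswith("-") and not arg.startswith("--")
        if is_old_style and arg[1:] in RENAMED_FLAGS:
            upgraded.append("-" + arg)  # prepend the second dash
        else:
            upgraded.append(arg)
    return upgraded


if __name__ == "__main__":
    old = ["makerdsfromeland2.py", "myRNAlabel", "myRNA.eland_multi.txt",
           "rnatest.rds", "-RNA", "../mm9/knownGene.txt", "-index"]
    print(" ".join(upgrade_args(old)))
```

Matching whole arguments against a fixed set, rather than prefixing every token that starts with a dash, keeps values such as a relative path or a negative number from being rewritten by accident.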