Differences between revisions 7 and 8
Revision 7 as of 2010-09-03 22:35:10
Size: 2750
Editor: hamrhein
Comment:
Revision 8 as of 2010-09-16 18:19:21
Size: 4096
Editor: hamrhein
Comment:

A Quick Word on Files

Condor can work with files that live on the NFS server (castor, loxcyc, rattus) as well as files local to the execute host. If you plan to work with a ton of small files or a handful of large files, feel free to use the NFS server as the source for your files. If you have a bunch of large files to process, you'll likely be better off telling Condor to transfer the files to the execute host before executing your job. Not only will you get better performance, but everyone else will still be able to use the NFS server, so you save face at the same time. Trust me, I speak from experience.

To transfer files to the execute host, use the following directives:

should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = /full/path/to/infile1,/full/path/to/infile2,...

This can be done globally by placing these directives at the top of the recipe, or on a per-job basis by placing them before each "queue" directive.
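
As a rough illustration of the two placements, here is a small Python sketch that writes the directives either once at the top of a recipe or before each "queue". The function names (transfer_block, render_recipe) and the example paths are made up for this sketch; they are not part of Condor.

```python
def transfer_block(input_files):
    """Return the three transfer directives for a list of input file paths."""
    return "\n".join([
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "transfer_input_files = " + ",".join(input_files),
    ])


def render_recipe(header, jobs, global_inputs=None):
    """Build the text of a recipe (submit description).

    header        -- lines shared by every job (universe, executable, ...)
    jobs          -- list of (arguments, input_files) tuples, one per job
    global_inputs -- if given, the directives are written once at the top
                     and the per-job input_files are ignored
    """
    lines = list(header)
    if global_inputs is not None:
        # Global placement: one block before the first "queue".
        lines.append(transfer_block(global_inputs))
    for arguments, input_files in jobs:
        lines.append("")
        if global_inputs is None:
            # Per-job placement: one block before each "queue".
            lines.append(transfer_block(input_files))
        lines.append("arguments = " + arguments)
        lines.append("queue")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical header and jobs, for illustration only.
    header = ["universe = vanilla", "executable = /usr/bin/python"]
    jobs = [
        ("script.py a.fastq", ["/full/path/to/a.fastq"]),
        ("script.py b.fastq", ["/full/path/to/b.fastq"]),
    ]
    print(render_recipe(header, jobs))
```

With global_inputs set, the directives appear once, before the first "queue"; without it, each job gets its own block naming only the files that job needs.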

Bowtie Template

universe=vanilla

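# $(OUTDIR) in "arguments" is a submit-file macro, so it has to be defined here;
# the "environment" line only sets variables for the running job.
OUTDIR=/full/path/to/output
environment="BOWTIE_INDEXES=/proj/genome/programs/bowtie-0.12.1/indexes OUTDIR=/full/path/to/output"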

executable=/proj/genome/programs/bowtie-0.12.1/bowtie
arguments=hg19sp75spike -v 2 -k 11 -m 10 --best --strata -p 4 -q $(OUTDIR)/1184_1_1.fastq --un $(OUTDIR)/1184_1_1.unmapped.fa --max $(OUTDIR)/1184_1_1.repeat.fa $(OUTDIR)/1184_1_1.bowtie.txt

log=bowtie.$(Process).log
output=bowtie.$(Process).out
error=bowtie.$(Process).err

request_cpus = 4
request_memory = 8000
request_disk = 0

queue

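It's important to set "request_cpus" to match the -p option passed to bowtie, so that Condor reserves as many cores as bowtie will actually use. It's also probably a good idea to set "request_memory" to a more realistic value for your job: the value is in megabytes, so 8000 is almost 8 gigs.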
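
A quick sanity check along these lines can be scripted. The sketch below is a hypothetical helper, not a Condor tool: it reads a single-job recipe, compares "request_cpus" with the number after -p in "arguments", and converts "request_memory" from megabytes to gigabytes (8000 / 1024 is about 7.8). It assumes one "key = value" setting per line and keeps only the last value seen for each key, so it will not handle recipes with several "queue" sections.

```python
import re


def parse_submit(text):
    """Collect 'key = value' lines of a recipe into a dict (keys lowercased)."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip().lower()] = value.strip()
    return settings


def threads_from_arguments(arguments):
    """Return the integer following '-p' in an arguments string, or None."""
    match = re.search(r"(?:^|\s)-p\s+(\d+)", arguments)
    return int(match.group(1)) if match else None


def check_resources(text):
    """Compare request_cpus with -p and express request_memory in gigabytes."""
    settings = parse_submit(text)
    # Condor requests 1 CPU when request_cpus is not given.
    cpus = int(settings.get("request_cpus", 1))
    threads = threads_from_arguments(settings.get("arguments", ""))
    if threads is None:
        # bowtie runs a single thread when -p is not given.
        threads = 1
    memory_mb = settings.get("request_memory")
    memory_gib = int(memory_mb) / 1024.0 if memory_mb is not None else None
    return {"cpus_match": threads == cpus, "memory_gib": memory_gib}


if __name__ == "__main__":
    # Shortened, made-up recipe for illustration.
    recipe = "\n".join([
        "executable=/proj/genome/programs/bowtie-0.12.1/bowtie",
        "arguments=hg19sp75spike -v 2 --best --strata -p 4 -q in.fastq out.txt",
        "request_cpus = 4",
        "request_memory = 8000",
        "queue",
    ])
    print(check_resources(recipe))
```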

ERANGE Template

universe=vanilla

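# $(ERANGEPATH), $(INDIR) and $(OUTDIR) in "arguments" are submit-file macros, so define them here;
# the "environment" line only sets variables for the running job.
ERANGEPATH=/path/to/erange/commoncode
INDIR=/full/path/to/input
OUTDIR=/full/path/to/output
environment="ERANGEPATH=/path/to/erange/commoncode INDIR=/full/path/to/input OUTDIR=/full/path/to/output"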

executable = /usr/bin/python
arguments = $(ERANGEPATH)/makerdsfrombowtie.py 1184_1_1 $(INDIR)/1184_1_1.bowtie.txt $(OUTDIR)/1184_1_1.rds -RNA $(INDIR)/hg19-knownGene.txt

log=makerdsfrombowtie.$(Process).log
output=makerdsfrombowtie.$(Process).out
error=makerdsfrombowtie.$(Process).err

queue

Tophat Template

(Note: you will need to change the paths.)

universe=vanilla
environment="PATH=$PATH:/woldlab/glusterfs/data/bowtie-0.12.5:/woldlab/glusterfs/data/tophat-1.0.14/bin BOWTIE_INDEXES=/woldlab/glusterfs/data/bowtie-0.12.5/indexes/"
executable=/usr/bin/python

log=tophat.$(Process).log
output=tophat.$(Process).out
error=tophat.$(Process).err

request_cpus = 4
request_memory = 8000
request_disk = 0

arguments="/woldlab/glusterfs/data/tophat-1.0.14/bin/tophat -o /woldlab/glusterfs/data/hamrhein/condor/20100916/HUVEC-WC-PolyA-010WC+-r147-std54 -p 4 -r 147 --mate-std-dev 54 hg19-male /woldlab/glusterfs/data/ENCODE_CSHL/010WC+/LID8464_FC61LTAAAXX_2_1.txt.75mers.fastq,/woldlab/glusterfs/data/ENCODE_CSHL/010WC+/LID8464_FC61LTAAAXX_3_1.txt.75mers.fastq,/woldlab/glusterfs/data/ENCODE_CSHL/010WC+/LID8464_FC61LTAAAXX_4_1.txt.75mers.fastq /woldlab/glusterfs/data/ENCODE_CSHL/010WC+/LID8464_FC61LTAAAXX_2_2.txt.75mers.fastq,/woldlab/glusterfs/data/ENCODE_CSHL/010WC+/LID8464_FC61LTAAAXX_3_2.txt.75mers.fastq,/woldlab/glusterfs/data/ENCODE_CSHL/010WC+/LID8464_FC61LTAAAXX_4_2.txt.75mers.fastq"
queue

arguments="/woldlab/glusterfs/data/tophat-1.0.14/bin/tophat -o /woldlab/glusterfs/data/hamrhein/condor/20100916/HeLaS3-WC-PolyA-011WC+-r41-std92 -p 4 -r 41 --mate-std-dev 92 hg19-female /woldlab/glusterfs/data/ENCODE_CSHL/011WC+/LID16633_FC61U2UAAXX_4_1.txt.75mers.fastq,/woldlab/glusterfs/data/ENCODE_CSHL/011WC+/LID16633_FC61U2UAAXX_5_1.txt.75mers.fastq,/woldlab/glusterfs/data/ENCODE_CSHL/011WC+/LID16633_FC61U2UAAXX_6_1.txt.75mers.fastq /woldlab/glusterfs/data/ENCODE_CSHL/011WC+/LID16633_FC61U2UAAXX_4_2.txt.75mers.fastq,/woldlab/glusterfs/data/ENCODE_CSHL/011WC+/LID16633_FC61U2UAAXX_5_2.txt.75mers.fastq,/woldlab/glusterfs/data/ENCODE_CSHL/011WC+/LID16633_FC61U2UAAXX_6_2.txt.75mers.fastq"
queue
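
The "arguments" lines above are long and easy to get wrong by hand, so it can help to build them from the pieces that change between samples. The Python sketch below is one possible way to do that; lane_files and tophat_arguments are hypothetical helpers, and the file-name pattern is simply the one used in the paths above. It uses only the tophat options that already appear in the template (-o, -p, -r, --mate-std-dev).

```python
def lane_files(directory, library, lanes, mate):
    """List the fastq files for one mate (1 or 2) across the given lanes."""
    return ["%s/%s_%d_%d.txt.75mers.fastq" % (directory, library, lane, mate)
            for lane in lanes]


def tophat_arguments(tophat, outdir, index, left_reads, right_reads,
                     threads=4, inner_dist=None, std_dev=None):
    """Return an 'arguments="..."' line for one tophat job."""
    parts = [tophat, "-o", outdir, "-p", str(threads)]
    if inner_dist is not None:
        parts += ["-r", str(inner_dist)]
    if std_dev is not None:
        parts += ["--mate-std-dev", str(std_dev)]
    # Reads for each mate are passed as one comma-separated list.
    parts += [index, ",".join(left_reads), ",".join(right_reads)]
    return 'arguments="' + " ".join(parts) + '"'


if __name__ == "__main__":
    data = "/woldlab/glusterfs/data/ENCODE_CSHL/010WC+"
    library = "LID8464_FC61LTAAAXX"
    lanes = [2, 3, 4]
    print(tophat_arguments(
        "/woldlab/glusterfs/data/tophat-1.0.14/bin/tophat",
        "/woldlab/glusterfs/data/hamrhein/condor/20100916/HUVEC-WC-PolyA-010WC+-r147-std54",
        "hg19-male",
        lane_files(data, library, lanes, 1),
        lane_files(data, library, lanes, 2),
        threads=4, inner_dist=147, std_dev=54))
    print("queue")
```

Keep the threads value in step with "request_cpus" in the recipe.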

WoldlabWiki: Condor/Templates (last edited 2015-04-13 22:38:12 by hamrhein)