Quick Condor Notes
We're experimenting with using Condor as a queuing system. The first pass has pongo.cacr.caltech.edu configured as the submit host, and myogenin.cacr.caltech.edu and mondom.cacr.caltech.edu configured as the execute hosts.
Basically that means you should run condor_submit on pongo, but your jobs will actually run on myogenin & mondom. For example, to run an arbitrary Python script:
{{{
# this is a hackish way to make a file containing the lines between the EOFs
$ cat >myscript.condor <<EOF
universe=vanilla
executable=/usr/bin/python
output=script.output
error=script.output
log=script.status
arguments=script.py --do_that_thing
queue
EOF
$ condor_submit myscript.condor
}}}
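If you want several copies of the same job, Condor's submit language lets you queue N instances and distinguish them with the $(Process) macro (0 through N-1). A minimal sketch, assuming the same script.py as above takes a numeric argument:
{{{
universe=vanilla
executable=/usr/bin/python
output=script.$(Process).output
error=script.$(Process).output
log=script.status
arguments=script.py --do_that_thing --chunk $(Process)
queue 4
}}}
This submits four jobs in one cluster, each with its own output file.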
The condor user documentation is at http://www.cs.wisc.edu/condor/manual/v7.4/2_Users_Manual.html
A tutorial presentation (.ppt) and videos are available from the 2008 Condor Week presentations.
One difficulty with a queuing system is that it wants to view a single executable as taking only one CPU, which isn't true for either multi-threaded apps or applications that start sub-processes. I'm attempting to resolve that by using condor's Dynamic Slots feature.
Instead of one job slot per CPU (each with its share of the machine's RAM), this method creates a single slot containing all the CPUs. As each process is allocated to the slot, the remaining resources are used to create a new slot. However, if you want to run a job that uses multiple CPUs for a single executable, you'll need to add a "request_cpus=N" line to the condor submit script. (Think, for example, bowtie, tophat, or make -j.)
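As a concrete sketch of request_cpus, here is a hypothetical submit file for a parallel make; the executable path and -j value are assumptions, but the pattern is that the -j count and request_cpus should match so Condor carves a slot with enough CPUs:
{{{
universe=vanilla
executable=/usr/bin/make
arguments=-j 4
request_cpus=4
output=build.output
error=build.output
log=build.status
queue
}}}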
I do have an example condor submit script, with a simple Python process that uses multiple CPUs, in multicpu.
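For flavor, a minimal sketch of what such a multi-CPU Python process might look like (this is an illustration, not the actual multicpu example): it fans work out across subprocesses with the standard multiprocessing module, so under the dynamic-slots setup the submitting job should carry a matching request_cpus value.
{{{
# Hypothetical multi-CPU worker: uses multiprocessing.Pool to run
# per-item work in several subprocesses at once.
from multiprocessing import Pool

def square(n):
    # stand-in for real per-item work
    return n * n

def run(workers=4):
    # Pool spawns `workers` subprocesses; under Condor this job
    # should also set request_cpus to the same number.
    with Pool(workers) as pool:
        return pool.map(square, range(8))

if __name__ == "__main__":
    print(run())
}}}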
Distributed computing in practice: the Condor experience is a paper describing the history and goals of the Condor project.