
Quick Condor Notes

We're experimenting with using Condor as a queuing system. The first pass has pongo.cacr.caltech.edu configured as the submit host, and myogenin.cacr.caltech.edu and mondom.cacr.caltech.edu configured as the execute hosts.

Basically, that means you should run condor_submit on pongo, but your jobs will actually run on myogenin and mondom.

The Condor user documentation is at http://www.cs.wisc.edu/condor/manual/v7.4/2_Users_Manual.html

There is also a tutorial presentation (.ppt) and videos from the 2008 Condor Week presentations.

One difficulty with queuing systems is that they assume a single executable uses only one CPU, which isn't true for multi-threaded applications or for applications that start sub-processes. I'm attempting to resolve that by using Condor's Dynamic Slots feature.

Instead of a job slot for each CPU, each with memory/CPU of the RAM available, this method creates a single slot containing all the CPUs. Then, as each process gets allocated to the slot, the remaining resources are used to create a new slot. However, if a single executable in your job uses multiple CPUs, you'll need to add a "request_cpus=N" variable to the Condor submit script (think, for example, of bowtie, tophat, or make -j).
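As a rough illustration of where "request_cpus" goes, here is a minimal Python sketch that writes out a submit description. This is not the script we actually use: the file names, the "multicpu.py" executable, and the helper function are all hypothetical, and the other settings are just the usual vanilla-universe basics.

```python
# Sketch: build a Condor submit description that asks for several CPUs.
# The job name, file names and "multicpu.py" executable are hypothetical.

SUBMIT_TEMPLATE = """\
universe     = vanilla
executable   = {executable}
arguments    = {arguments}
output       = {name}.out
error        = {name}.err
log          = {name}.log
request_cpus = {cpus}
queue
"""


def make_submit_description(name, executable, arguments="", cpus=1):
    """Return the text of a submit description requesting `cpus` CPUs."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return SUBMIT_TEMPLATE.format(
        name=name, executable=executable, arguments=arguments, cpus=cpus
    )


if __name__ == "__main__":
    # Ask for 4 CPUs and pass the same number to the program as its argument.
    text = make_submit_description("multicpu", "multicpu.py", "4", cpus=4)
    with open("multicpu.condor", "w") as handle:
        handle.write(text)
    print(text)
```

Running it leaves a multicpu.condor file that you would then hand to condor_submit on pongo. The point is only that the CPU count the program will really use has to appear in request_cpus.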

I do have an example Condor submit script, with a simple Python process that uses multiple CPUs, in multicpu.
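For anyone who just wants the general shape of such a process, here is a small stand-in (not the multicpu script itself) that spreads a CPU-bound calculation over several worker processes using the standard multiprocessing module. The prime-counting workload and the function names are made up for illustration.

```python
#!/usr/bin/env python
"""Sketch of a CPU-bound job that spreads work over several processes."""
import multiprocessing
import sys


def count_primes(bounds):
    """Count primes in the half-open range [start, stop)."""
    start, stop = bounds
    total = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            total += 1
    return total


def split_range(stop, parts):
    """Split [0, stop) into at most `parts` contiguous chunks (stop > 0)."""
    step = -(-stop // parts)  # ceiling division
    return [(lo, min(lo + step, stop)) for lo in range(0, stop, step)]


def main(workers, limit=200000):
    # Keep `workers` equal to request_cpus so the job only uses what it asked for.
    pool = multiprocessing.Pool(processes=workers)
    try:
        counts = pool.map(count_primes, split_range(limit, workers))
    finally:
        pool.close()
        pool.join()
    print("primes below %d: %d" % (limit, sum(counts)))


if __name__ == "__main__":
    # Worker count comes from the first command-line argument, default 2.
    main(int(sys.argv[1]) if len(sys.argv) > 1 else 2)
```

The worker count is taken from the command line so the same number can be given both to the program and to request_cpus in the submit script.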

/Troubleshooting
/Templates


"Distributed computing in practice: the Condor experience" is a paper describing the history and goals of the Condor project.

WoldlabWiki: Condor (last edited 2014-07-15 21:33:54 by diane)