SCIENTIFIC-LINUX-USERS Archives

September 2013

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Joseph Areeda <[log in to unmask]>
Reply To:
Joseph Areeda <[log in to unmask]>
Date:
Mon, 30 Sep 2013 08:59:46 -0700
Content-Type:
text/plain
Parts/Attachments:
text/plain (99 lines)
Hi Eve,

Evidently, I haven't posted enough to that mailing list and had to wait
for my emails to be approved, but I did get one response back so far:
> guess the first question is do you really need the parallel universe? 
> If your MPI jobs are small enough that they can run on a single
> multi-core machine, you can run them in the vanilla universe, simply
> by requesting as many cores as you need.  This can be a lot easier to
> debug, as everything is running locally, and should give more
> consistent performance, as you aren't worrying about external
> networks, you can just set it up to run over a shared memory transport.
>
> Now, if the jobs are large enough that they need to cross machines,
> then you will, indeed, need to run in the parallel universe. 

I haven't run MPI jobs in Condor myself; I'm just the middle man here.  I
know our Condor team has been doing a lot of work on dynamic slots so
jobs can request as many cores as they need.  From what I gather from
reading that list, it was a BIG job, but the payoff was huge.  Before
that we had some slots with 8, 16, and 32 cores reserved for these jobs.
The problem was that when those jobs weren't running, the slots would
get assigned to single-core jobs, leaving the other cores idle.
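The dynamic-slot approach mentioned above is configured on the execute nodes with partitionable slots. A minimal condor_config sketch (assuming a whole-machine slot; adjust to taste):

```
# condor_config on each execute node:
# one partitionable slot owning all cores, split on demand
NUM_SLOTS                 = 1
NUM_SLOTS_TYPE_1          = 1
SLOT_TYPE_1               = 100%
SLOT_TYPE_1_PARTITIONABLE = TRUE
```

Jobs then claim only what they request via `request_cpus`, which avoids the idle-core problem of statically sized 8/16/32-core slots.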

Joe
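
For the cross-machine case Eve asks about below, a parallel-universe submit file looks roughly like this (a sketch per the v7.8 manual pages she links; the wrapper script name is an assumption, modeled on the manual's MPI wrapper examples):

```
# Parallel universe: MPI across machines.  Requires the
# dedicated parallel scheduler to be configured on the
# execute nodes first (see the Config-Dedicated-Jobs section).
universe      = parallel
executable    = openmpiscript     # wrapper script, as in the manual's examples
arguments     = my_mpi_app        # hypothetical MPI binary
machine_count = 4                 # number of machines to claim
log           = mpi.log
output        = mpi.out.$(NODE)
error         = mpi.err.$(NODE)
queue
```

The main gotcha is that parallel jobs only match machines that advertise a `DedicatedScheduler` attribute pointing at the submit host; without that execute-node configuration, the jobs sit idle in the queue, which matches the symptom Eve describes.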


On 09/27/2013 04:11 PM, Eve V. E. Kovacs wrote:
> Hi Joe,
> We have looked at the web pages for the condor manual
> http://research.cs.wisc.edu/htcondor/manual/
> and
> http://research.cs.wisc.edu/htcondor/manual/v7.8/2_9Parallel_Applications.html
> and a few others.
>
> Any tips/gotchas for setting up the parallel universe and the
> dedicated parallel scheduler would be appreciated.
>
> If we can just follow the instructions in
> http://research.cs.wisc.edu/htcondor/manual/v7.8/3_12Setting_Up.html#sec:Config-Dedicated-Jobs
>
> that would be useful to know.
>
> Thanks
> Eve
>
>
>
> On Fri, 27 Sep 2013, Joseph Areeda wrote:
>
>> Date: Fri, 27 Sep 2013 16:06:28 -0500
>> From: Joseph Areeda <[log in to unmask]>
>> To: Eve V. E. Kovacs <[log in to unmask]>
>> Cc: [log in to unmask]
>> Subject: Re: condor and the parallel universe
>>
>> Hi Eve,
>>
>> Our group (http://ligo.org) has done this: dynamic slots that can run
>> MPI using one, some, or all cores.  It was definitely not me, but I
>> can ask specific questions of the Condor gurus.
>>
>> Sounds like you are looking for a "where to start" web page.
>> Anything more specific I should ask?
>>
>> Joe
>>
>> On 09/27/2013 12:17 PM, Eve V. E. Kovacs wrote:
>>> Has anyone out there set up condor with a parallel universe environment?
>>> If so, could you please point me to some documentation/notes/anything
>>> that would help me get this going?
>>> We'd like to run MPI jobs in our condor environment, but with our
>>> present (vanilla) setup they just sit in the queue forever.
>>> Thanks
>>> Eve
>>>
>>> ***************************************************************
>>> Eve Kovacs
>>> Argonne National Laboratory,
>>> Room L-177, Bldg. 360, HEP
>>> 9700 S. Cass Ave.
>>> Argonne, IL 60439 USA
>>> Phone: (630)-252-6208
>>> Fax:   (630)-252-5047
>>> email: [log in to unmask]
>>> ***************************************************************
>>
>>
>
