SCIENTIFIC-LINUX-USERS Archives

January 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From: Steven Timm <[log in to unmask]>
Reply To: Steven Timm <[log in to unmask]>
Date: Wed, 11 Jan 2012 15:56:21 -0600
Content-Type: TEXT/PLAIN

This smells like a glibc version problem. The lx24 build presumes a
particular kernel version, a particular glibc version, or both.
Do you have the appropriate compatibility glibc libraries installed?
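
One way to check this is sketched below in Python. It is only a sketch under
stated assumptions: it prints the C library version reported by
platform.libc_ver() and runs ldd on one SGE binary to list any shared
libraries that do not resolve. The binary path is hypothetical (substitute
the real location under your own $SGE_ROOT), and the helper names are made up
for illustration. An empty list does not prove the glibc version is right; it
only rules out missing libraries.

```python
import os
import platform
import shutil
import subprocess


def missing_shared_libs(ldd_output):
    """Return the names of libraries that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # An unresolved dependency is printed as "libname.so.N => not found".
        if "=>" in line and "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run ldd on a binary and list its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_shared_libs(result.stdout)


if __name__ == "__main__":
    # C library name and version the Python interpreter is linked against.
    print("libc:", platform.libc_ver())

    # Hypothetical install path, an assumption for illustration only.
    binary = "/opt/sge/bin/lx24-amd64/qsub"
    if shutil.which("ldd") and os.path.exists(binary):
        print("missing:", check_binary(binary))
    else:
        print("ldd or", binary, "not available; nothing checked")
```

If the list comes back non-empty, the named libraries are the ones to look
for in the distribution's compatibility packages.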

Steve Timm



On Wed, 11 Jan 2012, Wil Irwin wrote:

> Hi-
>
> It is 64-bit on 64-bit. The exact version is from
> 'ge-6.2-bin-lx24-amd64.tar.gz' and 'ge-6.2-common.tar.gz'. So I can rule
> out that issue.
>
> As for the problems, I can provide more detail, but in brief (sort of):
>
> 1. The installation is w/o incident and I have used all the suggested
> defaults. Out of frustration, I've also installed it a couple of dozen
> times, changing some of the more flexible defaults one at a time.
>
> 2. The "simple" job runs as it should.
>
> 3. There are 3 nodes (with the master also serving as an executor). All are
> talking to each other in terms of the SGE ports and NFS.
>
> 4. My inquiry was intended to be general in terms of some possible
> incompatibility between SGE and SL 6.1. The comments which follow
> unfortunately involve the added factor of submitting jobs using an analysis
> application. The script which this application uses is a bit convoluted,
> but I studied it pretty well and, if there is some problem, I don't see it.
> I have not received any negative feedback from other users of this application.
> Unfortunately, it really isn't possible to submit the job from this
> application w/o using the accompanying script. So, of course, there is a
> bit of black-box factor.
>
> 5. One particular job is very large (~20K commands). After the commands are
> generated and submitted, SGE returns the rather confusing error message of
> "Unable to run job: job rejected: You try to submit a job with more than
> 75000 tasks. Exiting." 75000 is the configured limit, but I can readily see
> the command lines being generated and there are exactly 16900 of them. I
> would say in general, this is the most perplexing problem.
>
> 6. #5 is accompanied by "failure" email messages, but not 16900 of them (I
> would say many hundred). I can't explain this behavior either. It could
> actually be an email server issue and not related to SGE, per se.
>
> 7. Another example is, or will appear to be, very specific to the analysis
> application I am using as opposed to a general SGE issue. For this
> application, there is an explicit user variable to set the queue, and I
> have set it to 'verylong.q'. When I submit a much smaller job (~200
> commands) to try to figure out what is going wrong, the 'verylong.q' is
> ignored and 'short.q' is selected. But more curious and more SGE-related is
> the job will run, but it runs the commands in series and only uses 1
> processor on the master node (each node has 6 x 2 cores).
>
> That's a flavor of what is causing my sanity to slowly drift away.
>
> Regards,
> Wil
>
> On Wed, Jan 11, 2012 at 1:00 PM, Keith Chadwick <[log in to unmask]> wrote:
>
>> Are you trying to run either:
>>
>>        1. A 32 bit version of SGE 6.2 on a 64 bit SL 6.1 system?
>>
>> or
>>
>>        2. A 64 bit version of SGE 6.2 on a 32 bit SL 6.1 system?
>>
>> In the case #1, you should be able to get SGE to run once you install
>> the necessary 32 bit compatibility libraries, or (recommended) switch
>> to a 64 bit version of SGE 6.2.
>>
>> In the case #2, you are going to be out of luck...
>>
>> -Keith.
>>
>>
>> At 12:43 PM -0800 1/11/12, Wil Irwin wrote:
>>
>>> Hello-
>>>
>>> I am having unparalleled (no pun intended) problems getting SGE 6.2 to
>>> run under SL 6.1. I have consulted with others who have quite a bit of
>>> experience using SGE on an earlier version of SL, and we cannot determine
>>> why it won't run.
>>>
>>> Before I list the nature of the problems, I thought I would start by
>>> asking if anyone has had a successful experience with SGE 6.2 on SL 6.1.
>>>
>>> I'm running kernel:  2.6.32-220.2.1.el6.x86_64 #1 SMP Thu Dec 22 11:15:52
>>> CST 2011 x86_64
>>>
>>> Thanks for any help.
>>>
>>> -Wil
>>>
>>
>>
>

-- 
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
[log in to unmask]  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Group Leader.
Lead of FermiCloud project.
