SCIENTIFIC-LINUX-USERS Archives

March 2015

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From: Andreas Haupt <[log in to unmask]>
Reply To: Andreas Haupt <[log in to unmask]>
Date: Mon, 2 Mar 2015 09:55:32 +0100
Content-Type: text/plain

Hi Arnau,

On Monday, 02.03.2015 at 09:34 +0100, Arnau Bria wrote:
> Hi Andreas,
>  
> > Over the weekend we managed to provoke identical behaviour. Jobs
> > crash during the epilog phase when the job's cgroup gets removed.
> 
> So this is UGE 8.2.1 with kernel 2.6.32-504.8.1?

Yes.

> What kernel are you running in your production cluster? Did you see
> this problem there, too? I can't upgrade to a newer kernel because many
> nodes reboot and we lose many, many jobs...

Our production cluster is still on UGE 8.0.1 without cgroup support, so
it is unaffected. In a "safe environment" you might want to consider a
kernel downgrade. In our case this is clearly not an option, so this
issue just pushes production use of UGE 8.2.x to some unspecified
time in the future :-(

Cheers,
Andreas
-- 
| Andreas Haupt            | E-Mail: [log in to unmask]
|  DESY Zeuthen            | WWW:    http://www-zeuthen.desy.de/~ahaupt
|  Platanenallee 6         | Phone:  +49/33762/7-7359
|  D-15738 Zeuthen         | Fax:    +49/33762/7-7216
