SCIENTIFIC-LINUX-USERS Archives

July 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From: Orion Poplawski <[log in to unmask]>
Reply To: Orion Poplawski <[log in to unmask]>
Date: Tue, 17 Jul 2012 13:38:50 -0600
Content-Type: text/plain

On 07/17/2012 12:21 PM, Steven J. Yellin wrote:
>      If your atop service is on, you should be able to see something about
> what was happening shortly before the crash by viewing the appropriate
> /var/log/atop/<file> with the atop -r <file> command.
>      You could just try increasing your swap space; you don't have very much
> compared with your ram.  Simple 'top' and 'atop' commands show, among other
> things, current swap usage.  I'd get nervous if most of it gets used up.

I installed and started atop to see what it shows.  I didn't know about that
one before, thanks.  I am running sa/sar, which showed:

12:00:02 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
08:50:01 AM   1989664    107480      5.13     25616     23.83

before the last crash, and:

02:10:01 AM   2097144         0      0.00         0      0.00

for the previous one.  So I don't particularly suspect a lack of swap.  The
machine should have far more RAM than it needs, so most of it is just buffer
cache.  I did bump swap up to 8GB just for fun, though.
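
As a rough cross-check of the sar figures above, here is a minimal Python
sketch that recomputes the two percentage columns from the raw kB columns.
The function name swap_percentages is hypothetical, and it assumes the
12-hour timestamp layout shown above and that %swpcad is cached swap relative
to used swap, as I understand the sar man page to describe it.

```python
def swap_percentages(line):
    """Parse one `sar -S` data row and return (%swpused, %swpcad).

    Assumed column order (12-hour clock):
    time, AM/PM, kbswpfree, kbswpused, %swpused, kbswpcad, %swpcad
    """
    fields = line.split()
    kbswpfree = int(fields[2])
    kbswpused = int(fields[3])
    kbswpcad = int(fields[5])

    total = kbswpfree + kbswpused
    # Guard against division by zero when no swap exists or none is used.
    pct_used = 100.0 * kbswpused / total if total else 0.0
    pct_cad = 100.0 * kbswpcad / kbswpused if kbswpused else 0.0
    return round(pct_used, 2), round(pct_cad, 2)


row = "08:50:01 AM   1989664    107480      5.13     25616     23.83"
print(swap_percentages(row))  # (5.13, 23.83)
```

Both rows add up to the same 2097144 kB total (the 2GB of swap), and only
about 5% of it was in use before the last crash.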

> On Tue, 17 Jul 2012, Orion Poplawski wrote:
>
>> Our SL6.2 KVM and nfs/backup server has been crashing frequently recently
>> (starting around Fri 13th - yikes!) with Kernel panic - Out of memory and no
>> killable processes.  The server has 48GB ram, 2GB swap, only about 15GB
>> dedicated to VM guests.  I've tried bumping up vm.min_free_kbytes to 262144
>> to no avail.  Nothing strange is getting written to the logs before the crash.
>>
>> Happening with both 2.6.32-220.23.1 and 2.6.32-279.1.1.
>>
>> Anyone else seeing this?  Any other ideas?  I've set a serial console log to
>> try to catch more information the next time it happens.
>>
>> --
>> Orion Poplawski
>> Technical Manager                     303-415-9701 x222
>> NWRA, Boulder Office                  FAX: 303-415-9702
>> 3380 Mitchell Lane                       [log in to unmask]
>> Boulder, CO 80301                   http://www.nwra.com
>>


-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder Office                  FAX: 303-415-9702
3380 Mitchell Lane                       [log in to unmask]
Boulder, CO 80301                   http://www.nwra.com
