Hi Orion Poplawski!

On 2012.07.17 at 13:38:50 -0600, Orion Poplawski wrote:

> >     If your atop service is on, you should be able to see something about
> >what was happening shortly before the crash by viewing the appropriate
> >/var/log/atop/<file> with the atop -r <file> command.
> >     You could just try increasing your swap space; you don't have very much
> >compared with your ram.  Simple 'top' and 'atop' commands show, among other
> >things, current swap usage.  I'd get nervous if most of it gets used up.
> 
> I installed and started atop to see what that shows.  Didn't know
> about that one before, thanks.  I am running sa/sar and that showed:
> 
> 12:00:02 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
> 08:50:01 AM   1989664    107480      5.13     25616     23.83
> 
> before the last crash.  and:
> 
> 02:10:01 AM   2097144         0      0.00         0      0.00
> 
> for the previous one.  So I don't particularly suspect lack of swap.
> The machine should have way more RAM than it needs, so it's mainly
> just buffer cache.  I did bump it up to 8GB just for fun though.

For this configuration (lots of RAM, and no real plan to use swap) I
suggest lowering vm.swappiness to a very low value, so that the buffer
cache does not grow to the point where it pushes something else out
into swap.

vm.swappiness = 1 should be a fine value for a 48 GB KVM server.
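
To check what a host is actually running with, something like the
following rough Python sketch could be used. The kernel exposes the live
value in /proc/sys/vm/swappiness; the function names and the limit of 1
are only illustrative assumptions. The setting itself is normally made
persistent in /etc/sysctl.conf.

```python
from pathlib import Path

# The kernel exposes the live setting here on Linux.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_swappiness(text):
    """Turn the file contents (e.g. "60\n") into an integer."""
    return int(text.strip())


def is_low_enough(value, limit=1):
    """True if swappiness is at or below the (assumed) limit."""
    return 0 <= value <= limit


def current_swappiness(path=SWAPPINESS_PATH):
    """Read the running value; only works on a Linux host."""
    return parse_swappiness(Path(path).read_text())
```

On the host, is_low_enough(current_swappiness()) then says whether the
value still needs lowering.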

Not that you actually need any buffer cache on a pure virtualization
host: setting the cache mode to "none" for all storage devices in the
guests gives higher performance and lower latency, and prevents nasty
problems if guest storage is accessed from the host (with libguestfs or
by manual mounting). In this configuration each guest maintains its own
buffer cache, and the data is not buffered a second time on the host.
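
Assuming the guests are managed through libvirt (an assumption on my
part, it was not stated), the cache mode is the cache attribute on each
disk's <driver> element in the domain XML. A minimal Python sketch that
lists disks not set to "none" could look like this; the guest name, the
image paths and the function name are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML, shaped like the output of "virsh dumpxml".
EXAMPLE_XML = """
<domain type='kvm'>
  <name>guest1</name>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/lib/libvirt/images/guest1.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writethrough'/>
      <source file='/var/lib/libvirt/images/guest1-data.qcow2'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def disks_without_cache_none(domain_xml):
    """Return (target, cache) pairs for disks whose cache mode is not 'none'."""
    root = ET.fromstring(domain_xml)
    offenders = []
    for disk in root.findall("./devices/disk"):
        # libvirt treats a missing device attribute as 'disk';
        # cdrom and floppy devices are skipped.
        if disk.get("device", "disk") != "disk":
            continue
        driver = disk.find("driver")
        cache = driver.get("cache") if driver is not None else None
        if cache != "none":
            target = disk.find("target")
            name = target.get("dev") if target is not None else "?"
            offenders.append((name, cache))
    return offenders


print(disks_without_cache_none(EXAMPLE_XML))
```

A disk with no cache attribute at all is reported with cache None, which
means it is using the hypervisor default rather than "none".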


Also, if you are using memory overcommitting with KSM, TUV recommends
(for some very good reasons!) having enough swap to ensure that RAM+swap
is larger than the amount of RAM allocated to all guests.
When running lots of Linux guests I typically see a lot of memory
saved by KSM (>10 GB on 48 GB servers), and you have to increase swap by
that amount even if it's not used, because it might suddenly be required
depending on guest activity.
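
The arithmetic behind that rule, as a rough Python sketch. The function
name and the guest sizes are made up for illustration, and the host's
own memory needs are ignored, so treat the result as a lower bound.

```python
def min_swap_gb(host_ram_gb, guest_ram_gb):
    """Smallest swap size (GB) so that RAM + swap covers all guest RAM.

    host_ram_gb  -- physical RAM in the host
    guest_ram_gb -- list of RAM sizes allocated to each guest
    """
    total_allocated = sum(guest_ram_gb)
    # No extra swap is needed when the guests already fit in RAM.
    return max(0, total_allocated - host_ram_gb)


# Hypothetical example: a 48 GB host overcommitted with twelve 5 GB guests.
print(min_swap_gb(48, [5] * 12))
```

With these example numbers the guests are allocated 60 GB in total, so
the host needs at least 12 GB of swap; in practice add some margin on
top.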

-- 

Vladimir