SCIENTIFIC-LINUX-USERS Archives, July 2012 (SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV)

Subject:
From: Orion Poplawski <[log in to unmask]>
Reply-To: Orion Poplawski <[log in to unmask]>
Date: Tue, 17 Jul 2012 14:00:21 -0600
Content-Type: text/plain
On 07/17/2012 11:46 AM, Stephan Wiesand wrote:
> On Jul 17, 2012, at 19:22 , Orion Poplawski wrote:
>
>> Our SL6.2 KVM and nfs/backup server has been crashing frequently recently
>> (starting around Fri 13th - yikes!) with "Kernel panic - Out of memory and
>> no killable processes".  The server has 48GB RAM, 2GB swap, and only about
>> 15GB dedicated to VM guests.  I've tried bumping vm.min_free_kbytes up to
>> 262144 to no avail.  Nothing strange is getting written to the logs before
>> the crash.

Hmm, I suppose bumping up min_free_kbytes might be making things worse?
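
(For anyone comparing notes: the current value and the per-zone watermarks it
drives are readable straight out of procfs; the kernel computes the boot-time
default from RAM size, so your numbers will differ.)

  # current setting
  cat /proc/sys/vm/min_free_kbytes

  # per-zone min/low/high watermarks derived from it
  grep -A3 'pages free' /proc/zoneinfo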

>> Happening with both 2.6.32-220.23.1 and 2.6.32-279.1.1.
>>
>> Anyone else seeing this?
>
> Not on our KVM servers (though they don't have any other duties), which have
> been running -220.23.1 for three weeks.
>
>>   Any other ideas?
>
> Is swap space sufficient?

It was 2GB, but barely used.  The system should have way more RAM than needed.
I've upped it to 8GB.
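
(If it saves anyone a repartition: a plain swap file works for this; a generic
sketch, /swap2 is just an example path:)

  dd if=/dev/zero of=/swap2 bs=1M count=6144   # 6GB file
  chmod 600 /swap2
  mkswap /swap2
  swapon /swap2
  # persist across reboots
  echo '/swap2  swap  swap  defaults  0 0' >> /etc/fstab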
>
> Have you modified vm.overcommit_*? Doing so may help turn the panics into
> allocation failures that can be handled.
>

Haven't modified them:

vm.overcommit_memory = 0
vm.overcommit_ratio = 50
vm.nr_overcommit_hugepages = 0

I suppose:

vm.overcommit_memory = 2
vm.overcommit_ratio = 80

would cap the committed address space at about 46.4GB (8GB swap + 80% of 48GB
RAM), which should be safe.  I might try that next.
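
Something like this should apply it on the fly and persist it (standard
sysctl usage, stock SL6 paths):

  # runtime
  sysctl -w vm.overcommit_memory=2
  sysctl -w vm.overcommit_ratio=80

  # persistent across reboots
  echo 'vm.overcommit_memory = 2' >> /etc/sysctl.conf
  echo 'vm.overcommit_ratio = 80' >> /etc/sysctl.conf
  sysctl -p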

> Do any slab pools keep growing, to an unusual size?
>

Here's what I have shortly after reboot.  I'll keep watching it.

  Active / Total Objects (% used)    : 1500116 / 1526912 (98.2%)
  Active / Total Slabs (% used)      : 37344 / 37481 (99.6%)
  Active / Total Caches (% used)     : 134 / 204 (65.7%)
  Active / Total Size (% used)       : 147024.04K / 152389.66K (96.5%)
  Minimum / Average / Maximum Object : 0.02K / 0.10K / 4096.00K

   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
123540 123283  99%    0.19K   6177       20     24708K size-192
236059 235736  99%    0.06K   4001       59     16004K ksm_rmap_item
484128 484094  99%    0.02K   3362      144     13448K avtab_node
    203    203 100%   32.12K    203        1     12992K kmem_cache
341936 341360  99%    0.03K   3053      112     12212K size-32
  14136  14109  99%    0.58K   2356        6      9424K inode_cache
  71595  71595 100%    0.10K   1935       37      7740K buffer_head
  31500  31078  98%    0.19K   1575       20      6300K dentry
  10857  10857 100%    0.55K   1551        7      6204K radix_tree_node
   5140   4839  94%    1.00K   1285        4      5140K size-1024
   4480   4462  99%    1.00K   1120        4      4480K ext4_inode_cache
   5772   5684  98%    0.62K    962        6      3848K proc_inode_cache
  24300  24269  99%    0.14K    900       27      3600K sysfs_dir_cache
   1558   1348  86%    2.00K    779        2      3116K size-2048
  13794  13421  97%    0.20K    726       19      2904K vm_area_struct
   1074   1050  97%    2.59K    358        3      2864K task_struct
    699    699 100%    4.00K    699        1      2796K size-4096
    975    951  97%    2.06K    325        3      2600K sighand_cache
   4536   3262  71%    0.50K    567        8      2268K size-512
     17     17 100%  128.00K     17        1      2176K size-131072
  27401  27229  99%    0.07K    517       53      2068K selinux_inode_security
   2255   2232  98%    0.78K    451        5      1804K shmem_inode_cache
  22007  21365  97%    0.06K    373       59      1492K size-64
  10950   9234  84%    0.12K    365       30      1460K size-128
     22     22 100%   64.00K     22        1      1408K size-65536
    326    326 100%    4.00K    326        1      1304K biovec-256
   5720   4125  72%    0.19K    286       20      1144K filp
   1020    985  96%    1.00K    255        4      1020K signal_cache

After some disk activity I'm at:

  Active / Total Objects (% used)    : 4829537 / 4855308 (99.5%)
  Active / Total Slabs (% used)      : 163899 / 163988 (99.9%)
  Active / Total Caches (% used)     : 132 / 204 (64.7%)
  Active / Total Size (% used)       : 630344.49K / 634988.70K (99.3%)
  Minimum / Average / Maximum Object : 0.02K / 0.13K / 4096.00K

   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
2918893 2918736  99%    0.10K  78889       37    315556K buffer_head
112616 112599  99%    1.00K  28154        4    112616K ext4_inode_cache
  98637  98563  99%    0.55K  14091        7     56364K radix_tree_node
165540 165060  99%    0.19K   8277       20     33108K dentry
123520 123363  99%    0.19K   6176       20     24704K size-192
236059 235736  99%    0.06K   4001       59     16004K ksm_rmap_item
484128 484094  99%    0.02K   3362      144     13448K avtab_node
    203    203 100%   32.12K    203        1     12992K kmem_cache
342384 341570  99%    0.03K   3057      112     12228K size-32
139019 138470  99%    0.07K   2623       53     10492K selinux_inode_security

Still watching it...
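
To watch it without babysitting a terminal, I'll probably just log periodic
snapshots, something like the following (slabtop -o prints once and exits,
-s c sorts by cache size; the log path is arbitrary):

  # snapshot the top slab caches every 10 minutes
  while :; do
      date >> /var/log/slabtop.log
      slabtop -o -s c | head -40 >> /var/log/slabtop.log
      sleep 600
  done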

-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder Office                  FAX: 303-415-9702
3380 Mitchell Lane                       [log in to unmask]
Boulder, CO 80301                   http://www.nwra.com
