Subject: | |
From: | |
Reply To: | |
Date: | Mon, 30 Aug 2010 21:59:11 -0500 |
Content-Type: | TEXT/PLAIN |
Parts/Attachments: |
|
|
Hi Doug--I have seen the same message on some of our machines but
so far it hasn't caused any real performance problems up until now.
It's not so much if you are running SL5.5 but just as long as you
are running some of the latest errata kernels.. we only
saw it show up on SL5.3 but with the latest errata kernel.
Steve
Steve
On Mon, 30 Aug 2010, Doug Johnson wrote:
> Greetings,
>
> I am seeing the following messsage when an SL5.5 (all of the most recent
> updates are installed) is under load writing data to an NFS disk:
>
> NOTE: It occurs for other processes than kswapd0, so I don't think that
> has anything to do with the issue.
>
> Aug 30 18:25:21 se kernel: INFO: task kswapd0:220 blocked for more than 120 seconds.
> Aug 30 18:25:21 se kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 30 18:25:21 se kernel: kswapd0 D ffff810003336420 0 220 36 221 219 (L-TLB)
> Aug 30 18:25:21 se kernel: ffff810003be19e0 0000000000000046 ffff810037c9c200 ffff8100ae3c4000
> Aug 30 18:25:21 se kernel: 0000000000000003 000000000000000a ffff810037f2a860 ffffffff80308b60
> Aug 30 18:25:21 se kernel: 00000a919f5c3fe1 00000000002d7d53 ffff810037f2aa48 00000000c770f5f8
> Aug 30 18:25:21 se kernel: Call Trace:
> Aug 30 18:25:21 se kernel: [<ffffffff8006e1db>] do_gettimeofday+0x40/0x90
> Aug 30 18:25:21 se kernel: [<ffffffff886646e5>] :nfs:nfs_wait_bit_uninterruptible+0x0/0xd
> Aug 30 18:25:21 se kernel: [<ffffffff800637ea>] io_schedule+0x3f/0x67
> Aug 30 18:25:21 se kernel: [<ffffffff886646ee>] :nfs:nfs_wait_bit_uninterruptible+0x9/0xd
> Aug 30 18:25:21 se kernel: [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Aug 30 18:25:21 se kernel: [<ffffffff886646e5>] :nfs:nfs_wait_bit_uninterruptible+0x0/0xd
> Aug 30 18:25:21 se kernel: [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Aug 30 18:25:21 se kernel: [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
> Aug 30 18:25:21 se kernel: [<ffffffff88668106>] :nfs:nfs_wait_on_requests_locked+0x70/0xca
> Aug 30 18:25:21 se kernel: [<ffffffff88669146>] :nfs:nfs_sync_inode_wait+0x60/0x1db
> Aug 30 18:25:21 se kernel: [<ffffffff8865f234>] :nfs:nfs_release_page+0x2c/0x4d
> Aug 30 18:25:21 se kernel: [<ffffffff800caea8>] shrink_inactive_list+0x511/0x8d8
> Aug 30 18:25:21 se kernel: [<ffffffff800ca39b>] isolate_lru_pages+0x98/0xbf
> Aug 30 18:25:21 se kernel: [<ffffffff80047e98>] __pagevec_release+0x19/0x22
> Aug 30 18:25:21 se kernel: [<ffffffff800ca876>] shrink_active_list+0x4b4/0x4c4
> Aug 30 18:25:21 se kernel: [<ffffffff800130f5>] shrink_zone+0x127/0x18d
> Aug 30 18:25:21 se kernel: [<ffffffff80057b94>] kswapd+0x323/0x46c
> Aug 30 18:25:21 se kernel: [<ffffffff800a09d8>] autoremove_wake_function+0x0/0x2e
> Aug 30 18:25:21 se kernel: [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
> Aug 30 18:25:21 se kernel: [<ffffffff80057871>] kswapd+0x0/0x46c
> Aug 30 18:25:21 se kernel: [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
> Aug 30 18:25:21 se kernel: [<ffffffff8003287b>] kthread+0xfe/0x132
> Aug 30 18:25:21 se kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
> Aug 30 18:25:21 se kernel: [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
> Aug 30 18:25:21 se kernel: [<ffffffff8003277d>] kthread+0x0/0x132
> Aug 30 18:25:21 se kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
>
> I have seen this error with both an Intel Pro1000 and a Realtek Ethernet
> card.
>
> I am doing work with 2 other different Universities (completely
> different hardware) and they have all seen this message. Prior to 5.5,
> this would result in the machine locking up. Now with 5.5 it appears
> that the load level on the machine slowly rises (I assume due to D wait
> state blocked processes), but the machine is somewhat responsive. Also
> once these messages occur, ps will hang and that session becomes
> unusable.
>
> I don't what this means, but a similarly configured machine with
> identical hardware running SL4.7 does not produce these errors and the
> NFS throughput is pretty darn good.
>
> Any help or pointers in some direction will be appreciated,
> Thanks,
> doug
>
> ----------------------------------------------------------------------------
> Doug Johnson email: [log in to unmask]
> B390, Duane Physics (303)-492-4506 Office
> Boulder, CO 80309 (303)-492-5119 FAX
> http://www.aaccchildren.org
> Tully, baby. Look around. It's a cage with golden bars.
> ----------------------------------------------------------------------------
>
--
------------------------------------------------------------------
Steven C. Timm, Ph.D (630) 840-8525
[log in to unmask] http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.
|
|
|