SCIENTIFIC-LINUX-USERS Archives

August 2010

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From: Doug Johnson <[log in to unmask]>
Reply To: Doug Johnson <[log in to unmask]>
Date: Mon, 30 Aug 2010 20:55:54 -0600
Content-Type: text/plain
Parts/Attachments: text/plain (69 lines)

Greetings,

I am seeing the following message when an SL5.5 machine (with all of the
most recent updates installed) is under load writing data to an NFS disk:

NOTE: It also occurs for processes other than kswapd0, so I don't think
kswapd0 itself has anything to do with the issue.

Aug 30 18:25:21 se kernel: INFO: task kswapd0:220 blocked for more than 120 seconds.
Aug 30 18:25:21 se kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 30 18:25:21 se kernel: kswapd0       D ffff810003336420     0   220     36           221   219 (L-TLB)
Aug 30 18:25:21 se kernel:  ffff810003be19e0 0000000000000046 ffff810037c9c200 ffff8100ae3c4000
Aug 30 18:25:21 se kernel:  0000000000000003 000000000000000a ffff810037f2a860 ffffffff80308b60
Aug 30 18:25:21 se kernel:  00000a919f5c3fe1 00000000002d7d53 ffff810037f2aa48 00000000c770f5f8
Aug 30 18:25:21 se kernel: Call Trace:
Aug 30 18:25:21 se kernel:  [<ffffffff8006e1db>] do_gettimeofday+0x40/0x90
Aug 30 18:25:21 se kernel:  [<ffffffff886646e5>] :nfs:nfs_wait_bit_uninterruptible+0x0/0xd
Aug 30 18:25:21 se kernel:  [<ffffffff800637ea>] io_schedule+0x3f/0x67
Aug 30 18:25:21 se kernel:  [<ffffffff886646ee>] :nfs:nfs_wait_bit_uninterruptible+0x9/0xd
Aug 30 18:25:21 se kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Aug 30 18:25:21 se kernel:  [<ffffffff886646e5>] :nfs:nfs_wait_bit_uninterruptible+0x0/0xd
Aug 30 18:25:21 se kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Aug 30 18:25:21 se kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
Aug 30 18:25:21 se kernel:  [<ffffffff88668106>] :nfs:nfs_wait_on_requests_locked+0x70/0xca
Aug 30 18:25:21 se kernel:  [<ffffffff88669146>] :nfs:nfs_sync_inode_wait+0x60/0x1db
Aug 30 18:25:21 se kernel:  [<ffffffff8865f234>] :nfs:nfs_release_page+0x2c/0x4d
Aug 30 18:25:21 se kernel:  [<ffffffff800caea8>] shrink_inactive_list+0x511/0x8d8
Aug 30 18:25:21 se kernel:  [<ffffffff800ca39b>] isolate_lru_pages+0x98/0xbf
Aug 30 18:25:21 se kernel:  [<ffffffff80047e98>] __pagevec_release+0x19/0x22
Aug 30 18:25:21 se kernel:  [<ffffffff800ca876>] shrink_active_list+0x4b4/0x4c4
Aug 30 18:25:21 se kernel:  [<ffffffff800130f5>] shrink_zone+0x127/0x18d
Aug 30 18:25:21 se kernel:  [<ffffffff80057b94>] kswapd+0x323/0x46c
Aug 30 18:25:21 se kernel:  [<ffffffff800a09d8>] autoremove_wake_function+0x0/0x2e
Aug 30 18:25:21 se kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Aug 30 18:25:21 se kernel:  [<ffffffff80057871>] kswapd+0x0/0x46c
Aug 30 18:25:21 se kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Aug 30 18:25:21 se kernel:  [<ffffffff8003287b>] kthread+0xfe/0x132
Aug 30 18:25:21 se kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Aug 30 18:25:21 se kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Aug 30 18:25:21 se kernel:  [<ffffffff8003277d>] kthread+0x0/0x132
Aug 30 18:25:21 se kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
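
Since the message is reported for tasks other than kswapd0 as well, one way
to see which tasks are affected is to tally the "blocked for more than N
seconds" lines from syslog. The following is only a minimal sketch, assuming
the message format shown in the log above; the helper name
summarize_hung_tasks is hypothetical and not part of the original report.

```python
import re
from collections import Counter

# Matches the hung-task header line as it appears in the log above, e.g.
# "... kernel: INFO: task kswapd0:220 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(lines):
    """Count hung-task reports per (task name, pid).

    Hypothetical helper: `lines` is any iterable of syslog lines.
    """
    counts = Counter()
    for line in lines:
        m = HUNG_RE.search(line)
        if m:
            counts[(m.group("comm"), int(m.group("pid")))] += 1
    return counts
```

Any iterable of lines works as input, so an open syslog file can be passed
directly; the stack-trace lines are simply skipped.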

I have seen this error with both an Intel Pro1000 and a Realtek Ethernet
card.

I am working with two other universities (on completely different
hardware), and both have seen this message as well. Prior to 5.5, this
would result in the machine locking up. Now with 5.5 it appears that the
load level on the machine slowly rises (I assume due to processes blocked
in the D wait state), but the machine remains somewhat responsive. Also,
once these messages occur, ps will hang and that session becomes
unusable.
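
To check the assumption that the rising load comes from processes blocked
in the D state, one could count them directly from /proc rather than through
ps. This is a minimal sketch, assuming the standard Linux /proc/<pid>/stat
layout ("pid (comm) state ..."); the helper names are hypothetical, and
whether reading these files avoids the hang seen with ps on an affected
machine is untested here.

```python
import glob


def parse_stat(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen])
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def d_state_tasks():
    """List (pid, comm) for processes currently in uninterruptible sleep."""
    blocked = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line could not be parsed
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)
```

Calling d_state_tasks() repeatedly while the NFS write load is running would
show whether the number of D-state processes grows along with the load
average.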

I don't know what this means, but a similarly configured machine with
identical hardware running SL4.7 does not produce these errors, and its
NFS throughput is pretty darn good.

	Any help or pointers in some direction will be appreciated,
	Thanks,
	doug

---------------------------------------------------------------------------- 
   Doug Johnson                    email: [log in to unmask]        
   B390, Duane Physics             (303)-492-4506 Office                     
   Boulder, CO 80309               (303)-492-5119 FAX                        
                                   http://www.aaccchildren.org               
   Tully, baby. Look around. It's a cage with golden bars.
----------------------------------------------------------------------------
