Date: Fri, 19 Aug 2011 14:01:00 -0700
I'm taking a total stab in the dark, hoping that someone else has seen
similar issues before and can save me a bunch of time.
I have a few SL6 web servers running MediaWiki off a shared NetApp NFS
export. One of the hosts has been exhibiting periodic delays while
loading pages. I finally tracked this down to lock contention on
the sessions. The NetApp shows the lock requests stacking up, and strace
shows some pretty suspect numbers.
======== NLM host wiki.a.foo
39583 0x00030c66:0x56082409 0:0 1 GWAITING (0x43316bd8)
39581 0x00030c66:0x56082409 0:0 1 GWAITING (0x245aa408)
39579 0x00030c66:0x56082409 0:0 1 GWAITING (0x64ea6ef8)
39578 0x00030c66:0x56082409 0:0 1 GRANTED (0x7e719a48)
...
http-trace.30965:13:27:01.709708 flock(11, LOCK_EX) = 0 <0.000185>
http-trace.30970:13:27:01.732142 flock(11, LOCK_EX) = 0 <0.016265>
http-trace.30963:13:27:01.747946 flock(11, LOCK_EX) = 0 <30.041287>
http-trace.30962:13:27:01.754564 flock(11, LOCK_EX) = 0 <60.085116>
http-trace.30961:13:28:17.877626 flock(11, LOCK_EX) = 0 <60.040963>
http-trace.30963:13:28:17.872813 flock(11, LOCK_EX) = 0 <0.044700>
http-trace.30962:13:28:17.873601 flock(11, LOCK_EX) = 0 <90.047198>
http-trace.30967:13:28:17.708213 flock(11, LOCK_EX) = 0 <0.000467>
http-trace.30967:13:28:17.849123 flock(11, LOCK_EX) = 0 <0.000251>
http-trace.30968:13:28:17.863273 flock(11, LOCK_EX) = 0 <30.056047>
...
So, while some lock requests get through just fine, others hang for no
apparent reason. As best I can tell, the locks are all being released
promptly, so this looks more like a case where, if one LOCK_EX is held
when another LOCK_EX is queued, the queued request isn't actually
granted until some timer expires and the request is tried again.
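To make the timer pattern easier to see, here is a rough sketch in Python that picks the slow flock() calls out of strace lines in the format shown above and checks how close each duration is to a whole multiple of 30 seconds. The 30-second period is only what the numbers above suggest, not a confirmed timer value, and the function name, tolerance and threshold are arbitrary choices for illustration.

```python
import re

# Matches lines like:
#   http-trace.30965:13:27:01.709708 flock(11, LOCK_EX) = 0 <0.000185>
# i.e. "<file>.<pid>:<timestamp> flock(<fd>, <op>) = <ret> <<seconds>>"
LINE_RE = re.compile(
    r"^\S+\.(?P<pid>\d+):(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"flock\((?P<fd>\d+),\s*(?P<op>\w+)\)\s*=\s*(?P<ret>-?\d+)\s*"
    r"<(?P<dur>\d+\.\d+)>"
)

def slow_flocks(lines, period=30.0, tolerance=0.5, threshold=1.0):
    """Return (pid, duration, multiple) for each flock call slower than
    `threshold` seconds. `multiple` is the nearest whole number of
    `period`s if the duration is within `tolerance` seconds of it,
    otherwise None. `period` is an assumption read off the trace."""
    results = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a flock line in the expected format
        dur = float(m.group("dur"))
        if dur < threshold:
            continue  # fast call, lock was granted promptly
        multiple = round(dur / period)
        near = multiple >= 1 and abs(dur - multiple * period) <= tolerance
        results.append((int(m.group("pid")), dur, multiple if near else None))
    return results

# The ten trace lines quoted in this message.
SAMPLE = """\
http-trace.30965:13:27:01.709708 flock(11, LOCK_EX) = 0 <0.000185>
http-trace.30970:13:27:01.732142 flock(11, LOCK_EX) = 0 <0.016265>
http-trace.30963:13:27:01.747946 flock(11, LOCK_EX) = 0 <30.041287>
http-trace.30962:13:27:01.754564 flock(11, LOCK_EX) = 0 <60.085116>
http-trace.30961:13:28:17.877626 flock(11, LOCK_EX) = 0 <60.040963>
http-trace.30963:13:28:17.872813 flock(11, LOCK_EX) = 0 <0.044700>
http-trace.30962:13:28:17.873601 flock(11, LOCK_EX) = 0 <90.047198>
http-trace.30967:13:28:17.708213 flock(11, LOCK_EX) = 0 <0.000467>
http-trace.30967:13:28:17.849123 flock(11, LOCK_EX) = 0 <0.000251>
http-trace.30968:13:28:17.863273 flock(11, LOCK_EX) = 0 <30.056047>
"""

results = slow_flocks(SAMPLE.splitlines())
for pid, dur, multiple in results:
    label = f"~{multiple} x 30s" if multiple else "no obvious multiple"
    print(f"pid {pid}: {dur:.3f}s ({label})")
```

On the lines quoted here, every call that takes longer than a second lands within a tenth of a second of 30, 60 or 90 seconds, which is why I suspect a retry timer rather than the lock actually being held that long.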
Any ideas where to look?
--
Kelsey Cummings          sonic.net, inc.
System Architect         2260 Apollo Way
707.522.1000             Santa Rosa, CA 95407