SCIENTIFIC-LINUX-USERS Archives

June 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Devin Bougie <[log in to unmask]>
Reply To:
Devin Bougie <[log in to unmask]>
Date:
Fri, 22 Jun 2012 16:33:05 +0000
Hi, All.  We see periodic file system hangs when a Firefox profile is stored in an NFSv4 directory.  Both client and server are fully updated SL6.2.

We see this reliably when running Firefox with the profile stored in an NFSv4 file system, and we do not see it when the client switches to NFSv3.

To reproduce this, we simply run Firefox with the profile stored in an NFSv4 share (for example, mount your home directory using NFSv4).  Eventually the NFSv4 file system will wedge, and all access to that file system from that client will block.  When this happens, running "umount -f /file/system" un-wedges the file system, even though the umount itself fails with "device is busy", and everything continues where it left off:

[root@cesr3601 ~]# umount -f /home/rf_ctl 
umount2: Device or resource busy
umount: /home/rf_ctl: device is busy.
       (In some cases useful info about processes that use
        the device is found by lsof(8) or fuser(1))
umount2: Device or resource busy
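
For anyone trying to catch the moment the file system wedges, here is a minimal sketch of a probe that checks a mount point without hanging the shell that runs it.  The function name probe_fs and the 5-second default are made up for illustration.  It assumes GNU coreutils "timeout" is available, and that a stat blocked on a hung NFS mount can be killed with SIGKILL, which may not hold for every kernel or mount option.  Since attributes can be served from the client cache, a successful stat is not proof that the mount is healthy.

```shell
#!/bin/sh
# probe_fs DIR [SECONDS]
# Return 0 if stat on DIR completes within the time limit, 1 otherwise.
# (probe_fs is a hypothetical helper, not part of any package.)
probe_fs() {
    dir=$1
    limit=${2:-5}
    # SIGKILL is used because a process blocked on NFS I/O is assumed
    # to ignore everything except fatal signals.
    if timeout -s KILL "$limit" stat "$dir" >/dev/null 2>&1; then
        echo "responsive: $dir"
        return 0
    fi
    echo "no answer within ${limit}s (or stat failed): $dir"
    return 1
}
```

Called from cron or a loop as "probe_fs /home/rf_ctl 5", a non-zero return would be the cue to collect strace output before issuing the "umount -f".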

Firefox opens a number of sqlite files, and sqlite uses flock to mediate
access, so this looks consistent with a problem with flock over NFSv4.
Running strace on a hung firefox and then issuing the 'umount -f' to
unhang it shows the process sitting in a futex wait, which (presumably)
gets woken by the umount attempt:

futex(0x2b5710c9eab0, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
futex(0x2b5710c9eab0, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
futex(0x2b5710c9eab0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x2b5710d8ca4c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x2b570f606238, 89100) = 1
futex(0x2b5710c9eab0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x2b56fdd0c040, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x2b56fdd0c040, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x2b57083f630c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2b57083f6308, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
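
If the locking theory is right, a lock request on the affected mount should stall while the file system is wedged.  The following is a rough sketch of a lock probe, assuming the flock(1) utility from util-linux and GNU coreutils "timeout" are installed.  The function name probe_lock and the scratch file name .lock-probe.PID are invented for this example.  It takes a lock on a fresh file, so it exercises the locking path on that mount rather than contending with Firefox's own sqlite files.

```shell
#!/bin/sh
# probe_lock DIR [SECONDS]
# Return 0 if an exclusive lock on a scratch file in DIR can be taken
# within the time limit, 1 otherwise.
# (probe_lock is a hypothetical helper, not part of any package.)
probe_lock() {
    dir=$1
    limit=${2:-5}
    lockfile="$dir/.lock-probe.$$"
    # flock -w gives up if the lock is not granted within $limit seconds.
    # The outer timeout is a backstop in case the open or lock call
    # itself never returns on a wedged mount.
    if timeout -s KILL "$((limit + 5))" flock -x -w "$limit" "$lockfile" true; then
        rm -f "$lockfile"
        echo "lock ok: $dir"
        return 0
    fi
    echo "lock probe failed or timed out: $dir"
    return 1
}
```

Running "probe_lock /home/rf_ctl" alongside a plain stat of the same directory might help separate a locking stall from a hang that affects all access.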

We have found many reports of problems with NFSv4 home directories and Firefox on Fedora 16 and Ubuntu:
https://bugzilla.redhat.com/show_bug.cgi?id=732748
https://bugzilla.redhat.com/show_bug.cgi?id=811138
http://thread.gmane.org/gmane.linux.nfs/48690/focus=48705
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/974664

We have now opened a report for RHEL6, but it appears it won't get much traction until it is confirmed by someone with a Red Hat support contract.
https://bugzilla.redhat.com/show_bug.cgi?id=828521

Has anyone else experienced this, or does anyone have any suggestions?

Thanks in advance,
Devin
