SCIENTIFIC-LINUX-USERS Archives

November 2006

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Takashi Ichihara <[log in to unmask]>
Reply To:
Takashi Ichihara <[log in to unmask]>
Date:
Fri, 17 Nov 2006 12:09:55 +0900
Content-Type:
text/plain
Parts/Attachments:
text/plain (104 lines)
  Hi

  This spring we experimentally ran the XFS file system on our anonymous
ftp server with Scientific Linux 4.2 (i386). The kernel was built from
the original kernel source 2.6.14.3 (the latest at that time) from
kernel.org with the XFS options enabled. We also tried the centosplus
kernel, which supports XFS, JFS, and ReiserFS.

  At the beginning, we noticed that the XFS file system was faster
than ext3. But after a few days, we experienced frequent kernel
panics with "kernel stack overflow."

  Then the XFS file systems started to become corrupted.

Mar 29 16:10:43 fff kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO 
at line 1714 of file fs/xfs/xfs_alloc.c.  Caller 0xf8a229f7
Mar 29 16:37:00 fff kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO 
at line 1714 of file fs/xfs/xfs_alloc.c.  Caller 0xf8a229f7
Mar 29 23:25:01 fff kernel: XFS internal error XFS_WANT_CORRUPTED_RETURN 
at line 310 of file fs/xfs/xfs_alloc.c.  Caller 0xf8a20d1e
Mar 30 00:06:43 fff kernel: XFS internal error 
XFS_WANT_CORet_pages_tag+0x28/0x64
Mar 30 00:06:43 fff kernel: XFS internal error XFS_WANT_CORRUPTED_RETURN 
at line 310 of file fs/xfs/xfs_alloc.c.  Caller 0xf8a20d1e
Mar 30 00:12:44 fff kernel: XFS internal error XFS_WANT_CORRUPTED_RETURN 
at line 310 of file fs/xfs/xfs_alloc.c.  Caller 0xf8a2
Mar 30 00:12:44 fff kernel: XFS internal error XFS_WANT_CORRUPTED_RETURN 
at line 310 of file fs/xfs/xfs_alloc.c.  Caller 0xf8a20d1e
Mar 30 09:24:46 fff kernel: Filesystem "sdb1": XFS internal error 
xfs_da_do_buf(1) at line 2176 of file fs/xfs/xfs_da_btree.c.  Caller 
0xf8a3d1dc
Mar 30 09:24:47 fff kernel: Filesystem "sdb1": XFS internal error 
xfs_da_do_buf(1) at line 2176 of file fs/xfs/xfs_da_btree.c.  Caller 
0xf8a3d1dc
Mar 30 09:24:47 fff kernel: Filesystem "sdb1": XFS internal error 
xfs_da_do_buf(1) at line 2176 of file fs/xfs/xfs_da_btree.c.  Caller 
0xf8a3d1dc
  :
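
To see which error sites dominate an excerpt like the one above, the
records can be tallied by error tag, source file, and line. The
following is only a minimal sketch: the function name tally_xfs_errors
is hypothetical, and the pattern assumes the message wording shown in
the log above. Records that the kernel log itself garbled simply do not
match and are skipped.

```python
import re
from collections import Counter

# Message shape taken from the excerpt above, e.g.
# "XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1714 of file fs/xfs/xfs_alloc.c."
PATTERN = re.compile(r"XFS internal error (\S+) at line (\d+) of file (\S+)")


def tally_xfs_errors(text):
    """Count XFS internal errors by (tag, source file, line number)."""
    # Collapse all whitespace first so that records wrapped across
    # lines by the mailer still match as a single record.
    flat = " ".join(text.split())
    counts = Counter()
    for tag, line, path in PATTERN.findall(flat):
        # The file name is followed by a sentence-ending period in the log.
        counts[(tag, path.rstrip("."), int(line))] += 1
    return counts
```

Calling tally_xfs_errors() on the contents of a syslog file returns a
Counter whose most_common() method lists the most frequent error sites
first.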

For example, the size of x051_BPA.mf is reported as about 4.4 TB, which
cannot be right.
|$ ls -l
|total 68
|-rw-rw-r--  1 archive archive           347 Jul 14  2005 CAN_trigorgon.mf
|-rw-rw-r--  1 archive archive           198 Aug 16  2005 x050_diastoli.mf
|-rw-rw-r--  1 archive archive 4398046511287 Aug 17  2005 x051_BPA.mf
 :
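
The bogus size in the listing above is 4398046511287 bytes, which is
exactly 4 TiB plus 183 bytes. A directory tree can be checked for
similar entries with a short script. This is a rough sketch only: the
function name oversized_files and the limit_bytes threshold are
hypothetical, and a sensible limit (for instance the size of the
volume) has to be chosen for each site. Large sparse files would also
be reported, so a hit is a hint, not proof of corruption.

```python
import os


def oversized_files(directory, limit_bytes):
    """Return sorted (name, size) pairs for regular files larger than limit_bytes."""
    hits = []
    for entry in os.scandir(directory):
        # Look only at regular files and do not follow symbolic links.
        if entry.is_file(follow_symlinks=False):
            size = entry.stat(follow_symlinks=False).st_size
            if size > limit_bytes:
                hits.append((entry.name, size))
    return sorted(hits)


# The size reported for x051_BPA.mf, split into whole TiB and a remainder.
tib, remainder = divmod(4398046511287, 2**40)
print(tib, remainder)  # prints "4 183"
```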

  Therefore, we decided to switch all the file systems on our ftp
server (i386) back from XFS to ext3, using the standard Scientific
Linux distribution kernel. Since then, the ftp server has been stable
again.

  On the other hand, we have been using XFS file systems on several
x86_64 nodes with Scientific Linux (x86_64) for more than two years,
and these seem to be very stable. My personal impression is that
XFS on the i386 architecture is not stable.

Takashi Ichihara  (RIKEN)

KELEMEN Peter wrote:
> * Urs Beyerle ([log in to unmask]) [20061115 11:13]:
>
>   
>> As written in
>> ftp://ftp.scientificlinux.org/linux/scientific/44/i386/contrib/RPMS/xfs/README,
>> XFS is not recommended on SL4 32bit kernels.
>>     
>
> Correct.
>
>   
>> What kind of problems were observed?
>>     
>
> Starting with RHEL4, the 4KSTACKS option is enabled when the
> kernel is compiled.  This limits each process' kernel stack to
> 4K with separate stack for interrupts.  XFS can have deep call
> chains (it's a complex filesystem doing complex stuff) and the
> codebase included in SL4 has not been updated to take this reduced
> stackspace into consideration (it's effectively the XFS codebase
> from the 2.6.9 times).  As a result, it is possible to load the
> machine so that XFS overflows its stack and then the game is
> over.  It can be easily triggered by stacking several software
> layers (SCSI+LVM/MD+XFS+NFS) on top of each other but it has
> been demonstrated that the stack overflow can be triggered with
> sufficient load on plain SCSI+XFS systems as well.
>
>   
>> Is somebody working on a solution, which does not require to
>> recompile the standard SL4 kernel?
>>     
>
> The proper solution would be to backport stack-usage reducing XFS
> patches from newer vanilla kernels to the SL4 kernel.  AFAICT
> nobody is working on it actively as we speak.  There are some
> garage projects here and there (hint-hint ;) but no ETA and
> certainly no commitment.  Your best bet is to move to 64-bit
> platform (practically any CPU you can buy nowadays would qualify).
>
> Peter
>
>   
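
Whether a given i386 kernel was built with the 4KSTACKS option
mentioned in the quoted reply can be read from its build configuration,
where the option appears as CONFIG_4KSTACKS. A minimal sketch, assuming
the configuration text is already available as a string; the function
name has_4k_stacks is hypothetical:

```python
def has_4k_stacks(config_text):
    """Return True if the kernel config text enables CONFIG_4KSTACKS."""
    for line in config_text.splitlines():
        # Enabled options appear as "NAME=y"; disabled ones appear as
        # a comment of the form "# NAME is not set".
        if line.strip() == "CONFIG_4KSTACKS=y":
            return True
    return False
```

On Red Hat style systems the configuration of the running kernel is
conventionally installed as /boot/config-<release>, where <release> is
the output of "uname -r". That location is an assumption and may differ
for a self-built kernel.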
