On Wednesday morning we updated most of our sl53 machines to the current 2.6.18-164.6.1.el5 kernel.

Since then, two machines (both x86_64, of course) that use xfs have reported xfs corruption problems, e.g. on one of the machines today:

Nov 27 11:57:40 yotei kernel: Filesystem "md0": corrupt dinode 1073743441, (btree extents). Unmount and run xfs_repair.
Nov 27 11:57:40 yotei kernel: Filesystem "md0": XFS internal error xfs_bmap_read_extents(1) at line 4560 of file fs/xfs/xfs_bmap.c. Caller 0xffffffff88f21b9a
Nov 27 11:57:40 yotei kernel:
Nov 27 11:57:40 yotei kernel: Call Trace:
Nov 27 11:57:40 yotei kernel: [<ffffffff88f03156>] :xfs:xfs_bmap_read_extents+0x361/0x384
Nov 27 11:57:40 yotei kernel: [<ffffffff88f21b9a>] :xfs:xfs_iread_extents+0xac/0xc8
Nov 27 11:57:40 yotei kernel: [<ffffffff88f088bb>] :xfs:xfs_bmapi+0x226/0xe79
Nov 27 11:57:40 yotei kernel: [<ffffffff80062fc8>] thread_return+0x62/0xfe
Nov 27 11:57:40 yotei kernel: [<ffffffff8001c03d>] generic_make_request+0x211/0x228
Nov 27 11:57:40 yotei kernel: [<ffffffff88d8507b>] :raid456:handle_stripe+0x21aa/0x2301
Nov 27 11:57:40 yotei kernel: [<ffffffff88f25a1b>] :xfs:xfs_iomap+0x144/0x2a5
Nov 27 11:57:40 yotei kernel: [<ffffffff88d80f5b>] :raid456:get_active_stripe+0x3e1/0x4b7
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ac24>] :xfs:__xfs_get_blocks+0x7a/0x1bf
Nov 27 11:57:40 yotei kernel: [<ffffffff88d85f08>] :raid456:make_request+0x486/0x4d6
Nov 27 11:57:40 yotei kernel: [<ffffffff8009fc08>] autoremove_wake_function+0x0/0x2e
Nov 27 11:57:40 yotei kernel: [<ffffffff80028353>] do_mpage_readpage+0x167/0x4a3
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ad7a>] :xfs:xfs_get_blocks+0x0/0xe
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ad7a>] :xfs:xfs_get_blocks+0x0/0xe
Nov 27 11:57:40 yotei kernel: [<ffffffff80038fd0>] mpage_readpages+0x91/0xd9
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ad7a>] :xfs:xfs_get_blocks+0x0/0xe
Nov 27 11:57:40 yotei kernel: [<ffffffff8000f2a7>] __alloc_pages+0x65/0x2ce
Nov 27 11:57:40 yotei kernel: [<ffffffff80012e69>] __do_page_cache_readahead+0xfc/0x179
Nov 27 11:57:40 yotei kernel: [<ffffffff8003244e>] blockable_page_cache_readahead+0x53/0xb2
Nov 27 11:57:40 yotei kernel: [<ffffffff80013f01>] page_cache_readahead+0xd6/0x1af
Nov 27 11:57:40 yotei kernel: [<ffffffff8000c189>] do_generic_mapping_read+0xc6/0x354
Nov 27 11:57:40 yotei kernel: [<ffffffff8000d0b6>] file_read_actor+0x0/0x159
Nov 27 11:57:40 yotei kernel: [<ffffffff8000c563>] __generic_file_aio_read+0x14c/0x198
Nov 27 11:57:40 yotei kernel: f088bb>] :xfs:xfs_bmapi+0x226/0xe79
Nov 27 11:57:40 yotei kernel: [<ffffffff88f21ba8>] :xfs:xfs_iread_extents+0xba/0xc8
Nov 27 11:57:40 yotei kernel: [<ffffffff88f088bb>] :xfs:xfs_bmapi+0x226/0xe79
Nov 27 11:57:40 yotei kernel: [<ffffffff80030348>] __up_write+0x27/0xf2
Nov 27 11:57:40 yotei kernel: [<ffffffff88f25a1b>] :xfs:xfs_iomap+0x144/0x2a5
Nov 27 11:57:40 yotei kernel: [<ffffffff80030348>] __up_write+0x27/0xf2
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ac24>] :xfs:__xfs_get_blocks+0x7a/0x1bf
Nov 27 11:57:40 yotei kernel: [<ffffffff80022d4d>] alloc_buffer_head+0x31/0x36
Nov 27 11:57:40 yotei kernel: [<ffffffff8002e731>] alloc_page_buffers+0x81/0xd3
Nov 27 11:57:40 yotei kernel: [<ffffffff800e0e74>] block_read_full_page+0x112/0x276
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ad7a>] :xfs:xfs_get_blocks+0x0/0xe
Nov 27 11:57:40 yotei kernel: [<ffffffff88d85f08>] :raid456:make_request+0x486/0x4d6
Nov 27 11:57:40 yotei kernel: [<ffffffff8009fc08>] autoremove_wake_function+0x0/0x2e
Nov 27 11:57:40 yotei kernel: [<ffffffff8002866e>] do_mpage_readpage+0x482/0x4a3
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ad7a>] :xfs:xfs_get_blocks+0x0/0xe
Nov 27 11:57:40 yotei kernel: [<ffffffff8014f55a>] radix_tree_node_alloc+0x18/0x57
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ad7a>] :xfs:xfs_get_blocks+0x0/0xe
Nov 27 11:57:40 yotei kernel: [<ffffffff8000c6dd>] add_to_page_cache+0xaa/0xc1
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ad7a>] :xfs:xfs_get_blocks+0x0/0xe
Nov 27 11:57:40 yotei kernel: [<ffffffff80038fd0>] mpage_readpages+0x91/0xd9
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3ad7a>] :xfs:xfs_get_blocks+0x0/0xe
Nov 27 11:57:40 yotei kernel: [<ffffffff8000f2a7>] __alloc_pages+0x65/0x2ce
Nov 27 11:57:40 yotei kernel: [<ffffffff80012e69>] __do_page_cache_readahead+0xfc/0x179
Nov 27 11:57:40 yotei kernel: [<ffffffff8003244e>] blockable_page_cache_readahead+0x53/0xb2
Nov 27 11:57:40 yotei kernel: [<ffffffff80013f01>] page_cache_readahead+0xd6/0x1af
Nov 27 11:57:40 yotei kernel: [<ffffffff8000c189>] do_generic_mapping_read+0xc6/0x354
Nov 27 11:57:40 yotei kernel: [<ffffffff8000d0b6>] file_read_actor+0x0/0x159
Nov 27 11:57:40 yotei kernel: [<ffffffff8000c563>] __generic_file_aio_read+0x14c/0x198
Nov 27 11:57:40 yotei kernel: [<ffffffff88f4191c>] :xfs:xfs_read+0x187/0x209
Nov 27 11:57:40 yotei kernel: [<ffffffff88f3e5f0>] :xfs:xfs_file_aio_read+0x63/0x6b
Nov 27 11:57:40 yotei kernel: [<ffffffff8000cddf>] do_sync_read+0xc7/0x104
Nov 27 11:57:40 yotei kernel: [<ffffffff8009fc08>] autoremove_wake_function+0x0/0x2e
Nov 27 11:57:40 yotei kernel: [<ffffffff8000b695>] vfs_read+0xcb/0x171
Nov 27 11:57:40 yotei kernel: [<ffffffff80011b72>] sys_read+0x45/0x6e
Nov 27 11:57:40 yotei kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Nov 27 11:57:40 yotei kernel:
<A similar set of messages (with different traceback) is repeated many times until...>
...
Nov 27 11:57:49 yotei kernel: xfs_force_shutdown(md0,0x8) called from line 1165 of file fs/xfs/xfs_trans.c. Return address = 0xffffffff88f32b48
Nov 27 11:57:49 yotei kernel: Filesystem "md0": Corruption of in-memory data detected. Shutting down filesystem: md0
Nov 27 11:57:49 yotei kernel: Please umount the filesystem, and rectify the problem(s)
Nov 27 11:57:50 yotei kernel: Filesystem "md0": xfs_log_force: error 5 returned.
Nov 27 11:57:57 yotei kernel: xfs_imap_to_bp: xfs_trans_read_buf()returned an error 5 on md0. Returning error.
Nov 27 11:58:00 yotei kernel: Filesystem "md0": xfs_log_force: error 5 returned.
Nov 27 11:58:11 yotei kernel: nfsd: non-standard errno: 5
Nov 27 11:58:11 yotei last message repeated 4 times
Nov 27 11:58:11 yotei kernel: xfs_imap_to_bp: xfs_trans_read_buf()returned an error 5 on md0. Returning error.
Nov 27 11:58:11 yotei last message repeated 5 times
...

Running xfs_repair(*) shows no obvious problems, and the fs then appears to be OK, for a while at least. On one of the machines the problem came back after a few more hours, but it hasn't happened again since - yet.

Is anyone else seeing this?

So far we have only noticed this on two machines, both of which also have the xfs volume used quite heavily over NFS, but that may be coincidence.

Until the new kernel, these machines were apparently working OK with the previous sl53 kernel, so maybe this is caused by how TUV happen to have built their xfs modules, as compared to the ones which SL made before...

(*) xfs_repair (and xfs_check) refused to touch the file-system, claiming it was mounted even though we had unmounted it. We ended up needing to reboot with the fs commented out of fstab to get xfs_repair to run. I've never needed to run xfs_repair before, so I don't know if that is normal or not, but it seems odd - though probably not related to the real problem.

Is there a way we can do any tests with xfs built like it was for the older sl kernels?

 -- Jon

-- 
/--------------------------------------------------------------------\
| "Computers are different from telephones.  Computers do not ring." |
|       -- A. Tanenbaum, "Computer Networks", p. 32                  |
---------------------------------------------------------------------|
| Jon Peatfield, _Computer_ Officer, DAMTP, University of Cambridge  |
| Web:  http://www.damtp.cam.ac.uk/                                  |
\--------------------------------------------------------------------/