SCIENTIFIC-LINUX-USERS Archives

June 2007

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
"Brent L. Bates" <[log in to unmask]>
Reply To:
Brent L. Bates
Date:
Tue, 5 Jun 2007 15:35:17 -0400
     I've been doing a LOT of testing since my last post.  To recap, we're
running SL 3.0.5 with XFS file systems and the kernel is 2.4.21-37.EL.XFSsmp.
 We have 4 SATA drives on 2 controllers.  Each drive has 3 partitions.  One
partition on each drive is part of a mirrored software RAID for /boot.  A second
partition on each drive is a swap partition.  The final partition on each drive
is part of one large software RAID stripe, and root (/) is on this file system.
We also have a Sony AIT-3 tape drive on a SCSI controller, all internal.

     Tape backups suddenly started taking forever, or never finishing because
we reached the end of the tape, or so it said.  We have 27GB to back up and the
tape will take 100GB with no compression.  Someone thought we might need to
clean the tape drive, but AIT drives don't need cleaning as they are
self-cleaning.  When I use `xfsdump' to dump to a file on the hard drive, the
process goes lightning fast.  Only when I try to back up to the tape drive are
there problems.

     In my last post, I thought it was a hardware problem with the tape
drive.  I do not think it is that any more.  When I thought it was a hardware
problem, I decided to copy everything from the problem system to a
subdirectory on another almost identical system using `rsync'.  I figured I
could use that system's working tape drive to do backups for both machines.
 Well the problem moved to that system.

     I tried all sorts of timing tests.  Some people thought I might have a
bad file that was causing backups to fail, so I started deleting different
whole directory trees in the subdirectory on the second system, trying to find
the bad file.  I've not found a single file whose removal fixes things.  I've
found that when I delete some directories, things speed up.  The speed-up
doesn't seem to depend on the disk space used by the directory, but on the
number of files/directories under that directory.
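
     To put numbers on that comparison, a small script can report the entry
count and the byte total for each subdirectory side by side.  The following
is only a minimal sketch, not something that was run on these systems; the
function names `summarize_tree' and `summarize_children' are made up for
illustration.

```python
import os


def summarize_tree(root):
    """Return (n_files, n_dirs, total_bytes) for everything under root."""
    n_files = n_dirs = total_bytes = 0
    # os.walk does not follow symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        n_dirs += len(dirnames)
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: measure a symlink itself, not the file it points to.
                total_bytes += os.lstat(path).st_size
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            n_files += 1
    return n_files, n_dirs, total_bytes


def summarize_children(parent):
    """Map each immediate subdirectory of parent to its summary tuple."""
    result = {}
    for entry in sorted(os.listdir(parent)):
        path = os.path.join(parent, entry)
        if os.path.isdir(path) and not os.path.islink(path):
            result[entry] = summarize_tree(path)
    return result
```

Calling `summarize_children' on the directory holding the copied tree and
sorting the result by file count rather than by bytes would show which
subtrees contribute the most entries.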

     On the test system, I recently deleted all the files and directories
copied from the primary system and ran a backup.  Everything worked fine.  I
then backed up this test system to a file and then extracted the whole backup
to a subdirectory of that system, so I basically have the entire file system
on the drives twice.  I then did a backup to tape and that was dog slow.  To me this
says the problem is tied to the number of files and directories that need to
be backed up.  The problem ONLY occurs when backing up to tape and not to a
file on the disk drive.
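
     One way to reason about "tied to the number of files" is a simple cost
model: time spent streaming bytes plus a fixed cost per file or directory.
This is only a hypothetical model for comparing timings, not a measurement
and not a description of how `xfsdump' works internally.  The function name
and every number in the usage lines are placeholders.

```python
def estimated_seconds(total_bytes, n_entries, bytes_per_second, seconds_per_entry):
    """Hypothetical model: streaming time plus a fixed cost per entry."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    streaming = total_bytes / bytes_per_second
    per_entry = n_entries * seconds_per_entry
    return streaming + per_entry


# Placeholder inputs, NOT measured values from these systems.
one_copy = estimated_seconds(27e9, 1_000_000, 12e6, 0.01)
two_copies = estimated_seconds(54e9, 2_000_000, 12e6, 0.01)
print(one_copy, two_copies)
```

If the per-entry term dominates, the model predicts that backup time follows
the entry count rather than the byte count, which matches what I see on tape
but not on disk.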

     Anyone seen anything like this?  Anyone have any ideas on how to fix this
problem?  I find this whole thing very weird.  Any and all ideas welcome as
I'm not sure where to go from here.  Thanks.

-- 

  Brent L. Bates (UNIX Sys. Admin.)
  M.S. 912				Phone:(757) 865-1400, x204
  NASA Langley Research Center	  	  FAX:(757) 865-8177
  Hampton, Virginia  23681-0001
  Email: [log in to unmask]	http://www.vigyan.com/~blbates/
