SCIENTIFIC-LINUX-USERS Archives

August 2007

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Daniel Widyono <[log in to unmask]>
Reply To:
Daniel Widyono <[log in to unmask]>
Date:
Thu, 30 Aug 2007 22:26:44 -0400
Content-Type:
text/plain
Parts/Attachments:
text/plain (39 lines)
> Volume of data isn't the only measure of "large" - the number of files 
> is important too.

I started having problems at about 45K - 50K files per directory, using ext3
(pre-hashed-b-trees).  I forget the total number of files under the top
level.  This particular issue wasn't rsync-specific, however.

> It also used an enormous amount of RAM:

> talks to the other while this is happening. I think this was taking an 
> hour or so, but this _was_ a few years ago.

Same experience, and these _were_ rsync-specific issues.  See below for my
"solution".

> The rsync gurus opined that it was better to backup this way than to 
> backup a single file, but my experience suggests otherwise; I now create 
> a filtered filesystem image and use rsync to update that.

It depends on your usage.  If you only update one or two files between
backups, rsync is the better way to go (but see below for how to do it
better).


So, basically, instead of rsync'ing the entire top level, which forces rsync
to build its file list for the whole tree up front, just go into each
sub-directory and rsync it individually.  This works better the more balanced
your trees are, of course.  It works well for, e.g., home directories which
are hashed (/home/a-g/, /home/h-m/, /home/n-t/, /home/u-z/ for example).

The script I use has a subroutine which does an rsync on a specified
directory path.  The main routine just loops through all subdirectories in
the top-level, and calls the subroutine for each entry.  This breaks the
problem down into manageable chunks.  You can modify this to go two levels
down, etc. or recursively call itself down to a user-specified depth.  I got
lazy and it Worked Enough For Me. :)
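
For anyone who wants a starting point, here is a minimal sketch of that loop
in Python.  It is not the script I actually use: the function names, the
plain "-a" option, and the assumption that the destination is a local path
(so the parent directories can be created with os.makedirs) are all
illustrative choices.

```python
import os
import subprocess


def subdirs_at_depth(top, depth=1):
    """Return paths (relative to top) of directories exactly `depth` levels down."""
    current = [""]
    for _ in range(depth):
        deeper = []
        for rel in current:
            with os.scandir(os.path.join(top, rel)) as entries:
                for entry in entries:
                    # Skip plain files and symlinks; only real directories are chunks.
                    if entry.is_dir(follow_symlinks=False):
                        deeper.append(os.path.join(rel, entry.name))
        current = sorted(deeper)
    return current


def rsync_command(top, dest, rel):
    """Build the rsync argv for one chunk (hypothetical helper)."""
    # Trailing slashes make rsync copy the directory's contents into dest/rel/.
    src = os.path.join(top, rel) + "/"
    dst = os.path.join(dest, rel) + "/"
    return ["rsync", "-a", src, dst]


def sync_in_chunks(top, dest, depth=1, run=subprocess.check_call):
    """Run one rsync per sub-directory instead of one rsync for the whole tree."""
    for rel in subdirs_at_depth(top, depth):
        # Assumption: dest is local, so the target directory can be created here.
        os.makedirs(os.path.join(dest, rel), exist_ok=True)
        run(rsync_command(top, dest, rel))
```

Something like sync_in_chunks("/home", "/backup/home", depth=2) would then
issue one rsync per directory two levels down.  Note that files sitting above
the chosen depth (directly in the top level, say) are not picked up by this
loop and would need a separate, non-recursive pass.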

HTH,
Dan W.
