SCIENTIFIC-LINUX-USERS Archives

July 2014

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Paul Robert Marino <[log in to unmask]>
Reply To:
Paul Robert Marino <[log in to unmask]>
Date:
Mon, 14 Jul 2014 10:27:35 -0400
Content-Type:
text/plain
Wow, I wish I had gotten into this thread earlier; I could have
explained a lot.
I've worked with rsync at a low level for many years and have even
debated writing a C library and possibly a multicast transport
layer for it, so I know it quite well.
I've seen a lot of misinformation and guessing in this thread, so I'm
going to clear a few things up.

First of all, rsync does not use any encryption!
Someone misread the spec for rsync and jumped to this conclusion. I
can easily understand how someone could make this mistake, so I'm going
to clarify what this person got confused about. rsync uses a rolling
checksum, backed by a stronger hash (MD4 in older versions), to find
differences so it can send just the blocks that differ rather than the
whole file. Those are checksums, not encryption. The only thing close
to encryption support in rsync comes from the network transport layer,
if that layer supports it. In fact, rsync originally used rsh as its
network transport layer, and for years to get it to use ssh you had to
specify "--rsh=/usr/bin/ssh"; now it's the default.
When you run it locally it simply uses a UNIX socket, which to rsync
appears to act the same way as an rsh session.
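
To make the "rolling" part concrete, here is a simplified sketch in
Python of the kind of weak checksum described in the rsync algorithm
write-up. It is an illustration only, not rsync's actual code; the
function names weak_checksum and roll are made up for this example.
The point is that sliding the window by one byte costs a couple of
additions instead of a re-read of the whole block.

```python
M = 1 << 16  # both halves of the checksum are kept modulo 2^16


def weak_checksum(block: bytes):
    """Compute the (a, b) pair for one block from scratch."""
    n = len(block)
    a = sum(block) % M
    # Earlier bytes get a larger weight, so byte order matters.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, block_len: int):
    """Slide the window one byte: drop out_byte, append in_byte."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b


if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog"
    n = 8
    a, b = weak_checksum(data[:n])
    # Move the window one byte to the right without rescanning it.
    a, b = roll(a, b, data[0], data[n], n)
    print((a, b) == weak_checksum(data[1:n + 1]))
```

A match on the weak checksum is only a hint; a stronger hash is then
used to confirm that the block really is the same.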

Second of all, files on differing filesystem formats are like
snowflakes: no two are exactly alike, even if they came from the same
source file. Since you are copying the files from one filesystem
format to a different filesystem format, the sizes will never exactly
match. Each filesystem has its own overhead, and as a result the same
file on ext4, XFS, FAT32, and NTFS will each take up a slightly
different number of bytes. The same goes for two different devices
formatted with the same filesystem but different block or inode
sizes.
Therefore rsync will have to resort to checksumming every file.
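
If you want to look at that overhead yourself, a rough sketch like the
following compares a file's logical length with the space the
filesystem has actually allocated for it. The helper name
logical_and_allocated_size is made up for this example, and it assumes
a Unix-like system where st_blocks is reported in 512-byte units (as
on Linux); on platforms without st_blocks it returns None for the
allocated size.

```python
import os
import sys


def logical_and_allocated_size(path: str):
    """Return (logical bytes, allocated bytes or None) for a file."""
    st = os.stat(path)
    # st_blocks is not available everywhere (e.g. Windows).
    blocks = getattr(st, "st_blocks", None)
    # Assumption: st_blocks counts 512-byte units, as it does on Linux.
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


if __name__ == "__main__":
    # Usage: python sizes.py FILE [FILE ...]
    for name in sys.argv[1:]:
        logical, allocated = logical_and_allocated_size(name)
        print(name, "logical:", logical, "allocated:", allocated)
```

Running it against the same file on the laptop and on the flash drive
shows how much the allocated figure moves with the filesystem and its
block size.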

That said, you need to know what rsync does under the hood to understand its behaviour.

If it does find a difference, it makes a hidden copy of the file on
the target, constructs the new file with the corrections, and then
moves it into place. This behavior can be bypassed by using the
--inplace option. The reason it does this is to prevent leaving a
corrupted file if the rsync write process is terminated for any reason.
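
The difference between the two strategies can be sketched in Python
roughly as follows. This is an illustration of the general
"write a hidden temporary file, then rename" pattern versus
overwriting the existing file, not rsync's own implementation; both
function names are hypothetical.

```python
import os
import tempfile


def replace_via_temp_copy(path: str, new_data: bytes) -> None:
    """Build the new file under a hidden temp name, then move it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    base = os.path.basename(path)
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(prefix="." + base + ".", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # If we die before this line, the original file is untouched.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def update_in_place(path: str, new_data: bytes) -> None:
    """Overwrite the existing file directly, with no second copy on disk."""
    with open(path, "r+b") as f:
        f.write(new_data)
        # Cut off any leftover bytes from a longer old version.
        f.truncate()
```

The first form needs room for a second full copy of the file and
writes every byte of it again, which is the cost the in-place form
avoids, at the price of a half-written file if the process is killed
partway through.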

I highly suggest you use the --inplace option in the scenario you
described. It will save you time if there are any differences in any
of the files.


Honestly, your best bet is to try to use the same format on the source
and target filesystems. This may mean ext4 on your flash drive or
putting a FAT32 or NTFS partition on your laptop. Also remember, if you
need a virtual disk in a hurry, the "-o loop" option of the mount
command is very useful for mounting ad hoc disk images.


On Sun, Jul 13, 2014 at 12:26 AM, ToddAndMargo <[log in to unmask]> wrote:
> On 07/12/2014 06:20 PM, Patrick J. LoPresti wrote:
>>
>> On Fri, Jul 11, 2014 at 10:24 PM, ToddAndMargo <[log in to unmask]>
>> wrote:
>>>
>>>
>>> Hi Pat,
>>>
>>> --modify-window=1
>>>        3 hr - 9 sec
>>>
>>> --modify-window=10
>>>        3 hr - 8 sec
>>>
>>> Rats!  I really thought this sounded right
>>
>>
>> Oh, well...
>>
>>> Any way to turn off the checksum testing?
>>
>>
>> Well, there is the "--whole-file" option. But -- and this was news to
>> me -- the man page says it is already the default for local copies:
>>
>>   -W, --whole-file
>>          With this option rsync’s delta-transfer algorithm is not used
>>          and the whole file is sent as-is instead.  The transfer may be
>>          faster if this option is used when the bandwidth between the
>>          source and destination machines is higher than the bandwidth
>>          to disk (especially when the "disk" is actually a networked
>>          filesystem).  This is the default when both the source and
>>          destination are specified as local paths, but only if no
>>          batch-writing option is in effect.
>>
>> Are you doing incremental copies here, or are you generally copying an
>> entire tree "fresh"?
>
>
> Hi Pat,
>
> The changes can be random and anywhere.  It is basically
> 19 years of intellectual property I take with me.  I
> write everything down I trouble shoot at a customer's
> site.  There is way too much stuff to remember between
> Windows, Linux, and Mac.
>
>
>>
>> Have you tested the speed of a simple "cp -a" or tar/untar?
>
>
> Not yet.  I was going to test a find-the-missing-file
> subroutine the next chance I had, to see how fast
> I could find the removals, since cp won't remove
> defunct stuff.
>
>
>>
>>   - Pat
>>
>
>
> --
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Computers are like air conditioners.
> They malfunction when you open windows
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
