SCIENTIFIC-LINUX-USERS Archives

February 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
Re: coreutils for 64 bit
From:
"Stephen J. Gowdy" <[log in to unmask]>
Reply To:
Stephen J. Gowdy
Date:
Mon, 6 Feb 2012 21:47:11 +0100
Content-Type:
TEXT/PLAIN
Parts/Attachments:
TEXT/PLAIN (449 lines)
Hi Chris,
 	When I read and write to the same disk the 2GB bs helps (a lot), 
but if I just write to a normal disk the 2GB block size doesn't. I get 
about 52MB/s with 2 or 8MB block size but only 44MB/s with a 2GB block 
size. This is just a standard disk, no RAID.
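For anyone who wants to repeat this kind of comparison, a minimal sketch (the output path is a placeholder, and writing from /dev/zero measures mostly the write side, so page-cache effects make the numbers rough):

```shell
OUT=/tmp/ddtest.$$

# 64 MiB written as 2048 x 32 KiB blocks
small=$(dd if=/dev/zero of="$OUT" bs=32K count=2048 2>&1 | tail -n 1)

# the same 64 MiB written as 8 x 8 MiB blocks
large=$(dd if=/dev/zero of="$OUT" bs=8M count=8 2>&1 | tail -n 1)

echo "bs=32K: $small"
echo "bs=8M : $large"
rm -f "$OUT"
```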

 						regards,

 						Stephen.

On Mon, 6 Feb 2012, Chris Schanzle wrote:

> Hi Stephen,
>
> Most of my comments were in the context of reducing disk seeks.  Using an SSD 
> kinda eliminates that penalty.  :-)
>
> It is unclear why your SSD writes big blocks slower than small blocks.  SSDs 
> are very complex little devices.  Their write performance depends heavily on 
> the firmware's ability to have pre-erased, ready-to-write sectors (since 
> erasing is slow), as well as writing on proper boundaries (like 'advanced 
> format' 4k sector drives) to avoid read/modify/write cycles.  TRIM/discard 
> support is vital to maintaining performance over time.  In one EL5 system, I 
> have a md RAID0 (thus no TRIM) that has the *worst* random write performance 
> of any system (including 'spinning rust' hard drives) now that it's aged and 
> the firmware basically doesn't have any free pre-erased blocks.
>
> Thanks for showing your results.  It's always good to test.
>
> Using similar dd commands to yours, on a traditional hard drive, I get about 
> 37 MB/sec (with a lot of disk seeking noise) with 32K or 8M blocks; with 
> bs=2G I get about 48 MB/s.  Not as much difference as I would have expected, 
> but in my case my output file might not have been very far from the beginning 
> of the disk (hard to tell with LVM), so seeks might not have been very 
> distant.  There was essentially no disk seeking with bs=2G.  Throw 
> 'iflag=direct oflag=direct' with 32KB blocks and I drop to 30 MB/s.
>
> If you're reading/writing to different spindles, then you want reasonably 
> small block sizes to increase parallelism between the reading and writing. 
> I.e., you don't want to wait for a 2GB read to complete before starting a 
> write.  In that case, letting the VM system handle writing in the background 
> in parallel works fine.  Optimize your read/write sizes for your device. 
> E.g., RAID devices typically have 64 KB to 256 KB stripes, so you want to be 
> at least that big or some multiple thereof.
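A hedged sketch of that read/write overlap, using two dd processes and a stripe-sized block (the paths are hypothetical stand-ins for files on different spindles):

```shell
SRC=/tmp/src.$$ DST=/tmp/dst.$$

# make 4 MiB of input data (placeholder for a file on one spindle)
dd if=/dev/zero of="$SRC" bs=256K count=16 2>/dev/null

# reader and writer run concurrently, connected by a pipe; each side uses a
# 256 KiB block (a typical RAID stripe size), so the writer starts long
# before the whole file has been read
dd if="$SRC" bs=256K 2>/dev/null | dd of="$DST" bs=256K 2>/dev/null

ok=no
cmp -s "$SRC" "$DST" && ok=yes
echo "copy intact: $ok"
rm -f "$SRC" "$DST"
```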
>
> Regards,
> Chris
>
>
> On 02/06/2012 12:47 PM, Stephen J. Gowdy wrote:
>> Hi Chris,
>>          I understand that using a larger than 32kB block size can help
>> throughput, but I'd doubt you'd get an advantage with a 2GB block size over
>> an 8MB block size for most devices. It may also be due to my laptop only
>> having 4GB of RAM, but it is much better to use 8MB rather than 2GB for my
>> SSD drive;
>> 
>> [root@antonia ~]# time dd if=/dev/sda of=/scratch/gowdy/test bs=8MB count=256
>> 256+0 records in
>> 256+0 records out
>> 2048000000 bytes (2.0 GB) copied, 36.1101 s, 56.7 MB/s
>> 
>> real    0m36.125s
>> user    0m0.002s
>> sys     0m2.420s
>> [root@antonia ~]# time dd if=/dev/sda of=/scratch/gowdy/test bs=2GB count=1
>> 1+0 records in
>> 1+0 records out
>> 2000000000 bytes (2.0 GB) copied, 56.1444 s, 35.6 MB/s
>> 
>> real    0m56.738s
>> user    0m0.001s
>> sys     0m14.715s
>> 
>> (oops, and I should have said 8M and 2G bs I guess). 2MB buffer isn't much
>> slower;
>> 
>> [root@antonia ~]# time dd if=/dev/sda of=/scratch/gowdy/test bs=2MB count=1024
>> 1024+0 records in
>> 1024+0 records out
>> 2048000000 bytes (2.0 GB) copied, 38.4204 s, 53.3 MB/s
>> 
>> real    0m38.781s
>> user    0m0.004s
>> sys     0m2.410s
>>
>>                                                          regards,
>>
>>                                                          Stephen.
>> 
>> 
>> On Mon, 6 Feb 2012, Chris Schanzle wrote:
>> 
>>> It's a shame the original question didn't explain what he was trying to do
>>> with these large blocks, and why.
>>> 
>>> Huge block sizes are useful if you have lots of ram and are copying very
>>> large files on the same set of spindles.  This minimizes disk seeking 
>>> caused
>>> by head repositioning for reads and writes and is vastly more efficient 
>>> than
>>> say, "cp", which often uses at most 32 KB reads/writes and relies on the VM
>>> system to flush the buffered writes (dirtied memory pages) as it deems
>>> appropriate (tunables in /proc/sys/vm/dirty*).
>>> 
>>> Anyway, let's look at what system calls 'dd' does:
>>> 
>>> $ strace dd if=/dev/zero of=/dev/shm/deleteme bs=12G count=1
>>> ...
>>> open("/dev/shm/deleteme", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
>>> dup2(3, 1)                              = 1
>>> close(3)                                = 0
>>> mmap(NULL, 12884914176, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2af98c7a0000
>>> read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 12884901888) = 2147479552
>>> write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2147479552) = 2147479552
>>> close(0)                                = 0
>>> close(1)                                = 0
>>> ...
>>> 
>>> (count=2 is also interesting)
>>> 
>>> Things to notice:
>>> 
>>> 1.  strace shows dd is issuing a 12GB read from the input descriptor
>>> (/dev/zero) but is getting a 'short read' from the kernel of 2GB.  Short
>>> reads are not an error.
>>> 
>>> 2.  The "count=" option in the dd man page specifies that it limits the
>>> number of INPUT blocks.  So it writes what it read (2GB) and quits.
>>> 
>>> So it seems to be working as designed, though perhaps not as you want.
>>> 
>>> Adding 'iflag=fullblock' will cause dd to perform multiple reads to fill 
>>> the
>>> input block size.
>>> 
>>> mmap(NULL, 12884914176, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b2d8735e000
>>> read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 12884901888) = 2147479552
>>> read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 10737422336) = 2147479552
>>> read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8589942784) = 2147479552
>>> read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 6442463232) = 2147479552
>>> read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4294983680) = 2147479552
>>> read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2147504128) = 2147479552
>>> read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 24576) = 24576
>>> write(1, "", 12884901888)               = 2147479552
>>> write(1, "", 10737422336)               = 2147479552
>>> write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8589942784) = 2147479552
>>> write(1, "", 6442463232)                = 2147479552
>>> write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4294983680) = 2147479552
>>> 
>>> Notice how the writes empty the input 2GB at a time.
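The same short-read behaviour is easy to reproduce without a 12GB buffer: a pipe delivers at most its buffer's worth (typically 64 KiB on Linux) per read(), so bs=1M count=1 copies far less than 1 MiB unless iflag=fullblock is added. A small sketch (sizes are illustrative):

```shell
# without fullblock: the single read() is short, so far less than 1 MiB lands
head -c 5242880 /dev/zero | dd of=/tmp/short.$$ bs=1M count=1 2>/dev/null
short_bytes=$(wc -c < /tmp/short.$$)

# with fullblock: dd re-reads until the full 1 MiB block is assembled
head -c 5242880 /dev/zero | dd of=/tmp/full.$$ bs=1M count=1 iflag=fullblock 2>/dev/null
full_bytes=$(wc -c < /tmp/full.$$)

echo "without iflag=fullblock: $short_bytes bytes"
echo "with    iflag=fullblock: $full_bytes bytes"
rm -f /tmp/short.$$ /tmp/full.$$
```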
>>> 
>>> Of course, all this reading/writing goes through typical VM buffering, so 
>>> you
>>> might want to consider direct i/o:  iflag=direct and oflag=direct.
>>> 
>>> Which raises the question: how do you encourage the kernel to allow larger
>>> read/write file buffers?  Couldn't find that answer easily.  Anyone?
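For what it's worth, the knobs that do exist control writeback of dirty pages rather than a per-call buffer size; a read-only sketch of inspecting them (Linux-specific paths):

```shell
# print the VM writeback tunables, if present; these decide when dirty pages
# are flushed, not how large a single read()/write() may be
for f in /proc/sys/vm/dirty_ratio \
         /proc/sys/vm/dirty_background_ratio \
         /proc/sys/vm/dirty_expire_centisecs \
         /proc/sys/vm/dirty_writeback_centisecs; do
    [ -r "$f" ] && printf '%s = %s\n' "$f" "$(cat "$f")"
done
true  # the loop's last [ -r ] test may fail harmlessly if a file is absent
```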
>>> 
>>> -c
>>> 
>>> On 02/02/2012 12:32 PM, Stephen J. Gowdy wrote:
>>>> Hi Andrey,
>>>>           Why would you want a block size in GB? I don't know what the
>>>> actual limit for dd itself is, although it does seem to be exactly 2GiB.
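A side note on that limit (an assumption worth checking against the kernel source): Linux caps a single read() or write() at MAX_RW_COUNT, which is INT_MAX rounded down to a page boundary, and with 4 KiB pages that arithmetic reproduces the 2147479552 figure in the dd output above:

```shell
# INT_MAX rounded down to a whole number of 4 KiB pages
max_rw=$(( (2147483647 / 4096) * 4096 ))
echo "$max_rw"   # 2147479552, i.e. 2 GiB minus one 4 KiB page
```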
>>>>
>>>>                                                   regards,
>>>>
>>>>                                                   Stephen.
>>>> 
>>>> On Thu, 2 Feb 2012, Andrey Y. Shevel wrote:
>>>> 
>>>>> 
>>>>> Hi Stephen,
>>>>> 
>>>>> thank you for your reply.
>>>>> 
>>>>> ======
>>>>> [root@pcfarm-10 ~]# rpm -qa --queryformat "%{NAME}-%{VERSION}.%{ARCH}\n" | grep coreutils
>>>>> policycoreutils-1.33.12.x86_64
>>>>> policycoreutils-newrole-1.33.12.x86_64
>>>>> coreutils-5.97.x86_64
>>>>> policycoreutils-gui-1.33.12.x86_64
>>>>> =====
>>>>> 
>>>>> And obviously
>>>>> 
>>>>> ================
>>>>> [root@pcfarm-10 ~]# arch
>>>>> x86_64
>>>>> ===============
>>>>> 
>>>>> 
>>>>> The result is pretty much the same as I showed earlier.
>>>>> 
>>>>> And the same I see at CERN
>>>>> 
>>>>> =======================
>>>>> [lxplus427] /afs/cern.ch/user/s/shevel>   dd if=/dev/zero of=/tmp/testx64 bs=3GB count=1
>>>>> 0+1 records in
>>>>> 0+1 records out
>>>>> 2147479552 bytes (2.1 GB) copied, 5.91242 seconds, 363 MB/s
>>>>> [lxplus427] /afs/cern.ch/user/s/shevel>   rpm -q --file /bin/dd
>>>>> coreutils-5.97-34.el5
>>>>> [lxplus427] /afs/cern.ch/user/s/shevel>   rpm -qa --queryformat "%{NAME}-%{VERSION}.%{ARCH}\n" | grep coreutil
>>>>> policycoreutils-1.33.12.x86_64
>>>>> coreutils-5.97.x86_64
>>>>> policycoreutils-gui-1.33.12.x86_64
>>>>> ===========================
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> As far as I understand, the main question is "is there a 64-bit dd
>>>>> version in SL that can handle a BS value of more than 2GB?"
>>>>> 
>>>>> Any answer (yes or no) is good to know.
>>>>> 
>>>>> Many thanks,
>>>>> 
>>>>> Andrey
>>>>> 
>>>>> 
>>>>> On Wed, 1 Feb 2012, Stephen J. Gowdy wrote:
>>>>> 
>>>>>> Date: Wed, 1 Feb 2012 19:10:14 +0100 (CET)
>>>>>> From: Stephen J. Gowdy<[log in to unmask]>
>>>>>> To: Andrey Y. Shevel<[log in to unmask]>
>>>>>> Cc: [log in to unmask]
>>>>>> Subject: Re: coreutils for 64 bit
>>>>>> 
>>>>>> Exactly.... if you type "man rpm" it will show you how to get it to
>>>>>> print the arch string (usually i686 or x86_64). Since you seem unable
>>>>>> to read a man page, what you want to type is;
>>>>>> 
>>>>>> rpm -qa --queryformat "%{NAME}-%{VERSION}.%{ARCH}\n" | grep coreutils
>>>>>> 
>>>>>> (or leave out the VERSION if you want to see something similar to yum)
>>>>>> 
>>>>>> On Wed, 1 Feb 2012, Andrey Y. Shevel wrote:
>>>>>> 
>>>>>>>
>>>>>>>    Hi Stephen,
>>>>>>>
>>>>>>>    thanks for the reply.
>>>>>>>
>>>>>>>    I am not sure that I do understand you (sorry for my stupidity).
>>>>>>>
>>>>>>>    I have
>>>>>>>    =======================================
>>>>>>>    [root@pcfarm-10 ~]# yum list | grep coreutil
>>>>>>>    Failed to set locale, defaulting to C
>>>>>>>    coreutils.x86_64                         5.97-34.el5 installed
>>>>>>>    policycoreutils.x86_64                   1.33.12-14.8.el5 installed
>>>>>>>    policycoreutils-gui.x86_64               1.33.12-14.8.el5 installed
>>>>>>>    policycoreutils-newrole.x86_64           1.33.12-14.8.el5 installed
>>>>>>>    [root@pcfarm-10 ~]# rpm -q --file /bin/dd
>>>>>>>    coreutils-5.97-34.el5
>>>>>>>    =============================================
>>>>>>>
>>>>>>>    Presumably all packages are appropriate (they have suffix x86_64) 
>>>>>>> as
>>>>>>> shown
>>>>>>>    by yum.
>>>>>>>
>>>>>>>    At the same time rpm does show packages without above suffixes
>>>>>>>
>>>>>>>    =========================
>>>>>>>    [root@pcfarm-10 ~]# rpm -qa | grep coreutil
>>>>>>>    policycoreutils-1.33.12-14.8.el5
>>>>>>>    policycoreutils-newrole-1.33.12-14.8.el5
>>>>>>>    coreutils-5.97-34.el5
>>>>>>>    policycoreutils-gui-1.33.12-14.8.el5
>>>>>>>    =========================
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>
>>>>>>>    On Wed, 1 Feb 2012, Stephen J. Gowdy wrote:
>>>>>>>
>>>>>>>>    Date: Wed, 1 Feb 2012 11:32:40 +0100 (CET)
>>>>>>>>    From: Stephen J. Gowdy<[log in to unmask]>
>>>>>>>>    To: Andrey Y Shevel<[log in to unmask]>
>>>>>>>>    Cc: [log in to unmask]
>>>>>>>>    Subject: Re: coreutils for 64 bit
>>>>>>>>
>>>>>>>> It says it only copied 2.1GB. You are running a 64-bit OS. You
>>>>>>>> reinstalled the same coreutils package. You need to change the format
>>>>>>>> of the package names from "rpm -qa" if you want to see the
>>>>>>>> architecture ("man rpm" should help you figure out how).
>>>>>>>>
>>>>>>>> On Wed, 1 Feb 2012, Andrey Y Shevel wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I just noticed that the utility 'dd' uses just 2 GB even when I use a
>>>>>>>>> greater block size (BS). For example
>>>>>>>>>
>>>>>>>>> =====
>>>>>>>>> [root@pcfarm-10 ~]# dd if=/dev/zero of=/mnt/sdb/TestFile-S1 bs=12GB count=1
>>>>>>>>> 0+1 records in
>>>>>>>>> 0+1 records out
>>>>>>>>> 2147479552 bytes (2.1 GB) copied, 15.8235 seconds, 136 MB/s
>>>>>>>>> ============
>>>>>>>>>
>>>>>>>>> BTW,
>>>>>>>>>
>>>>>>>>> [root@pcfarm-10 ~]# uname -a
>>>>>>>>> Linux pcfarm-10.pnpi.spb.ru 2.6.18-274.17.1.el5xen #1 SMP Tue Jan 10
>>>>>>>>> 16:41:16 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
>>>>>>>>> [root@pcfarm-10 ~]# cat /etc/issue
>>>>>>>>> Scientific Linux SL release 5.7 (Boron)
>>>>>>>>> Kernel \r on an \m
>>>>>>>>>
>>>>>>>>> I decided to reinstall coreutils:
>>>>>>>>>
>>>>>>>>> [root@pcfarm-10 ~]# yum reinstall coreutils.x86_64
>>>>>>>>> Failed to set locale, defaulting to C
>>>>>>>>> Loaded plugins: kernel-module
>>>>>>>>> Setting up Reinstall Process
>>>>>>>>> Resolving Dependencies
>>>>>>>>> --> Running transaction check
>>>>>>>>> ---> Package coreutils.x86_64 0:5.97-34.el5 set to be updated
>>>>>>>>> --> Finished Dependency Resolution
>>>>>>>>> Beginning Kernel Module Plugin
>>>>>>>>> Finished Kernel Module Plugin
>>>>>>>>>
>>>>>>>>> Dependencies Resolved
>>>>>>>>>
>>>>>>>>> ==========================================================================
>>>>>>>>>  Package        Arch        Version             Repository         Size
>>>>>>>>> ==========================================================================
>>>>>>>>> Reinstalling:
>>>>>>>>>  coreutils      x86_64      5.97-34.el5         sl-base           3.6 M
>>>>>>>>>
>>>>>>>>> Transaction Summary
>>>>>>>>> ==========================================================================
>>>>>>>>> Remove        0 Package(s)
>>>>>>>>> Reinstall     1 Package(s)
>>>>>>>>> Downgrade     0 Package(s)
>>>>>>>>>
>>>>>>>>> Total download size: 3.6 M
>>>>>>>>> Is this ok [y/N]: y
>>>>>>>>> Downloading Packages:
>>>>>>>>> coreutils-5.97-34.el5.x86_64.rpm                      | 3.6 MB    00:05
>>>>>>>>> Running rpm_check_debug
>>>>>>>>> Running Transaction Test
>>>>>>>>> Finished Transaction Test
>>>>>>>>> Transaction Test Succeeded
>>>>>>>>> Running Transaction
>>>>>>>>>   Installing     : coreutils                                        1/1
>>>>>>>>>
>>>>>>>>> Installed:
>>>>>>>>>   coreutils.x86_64 0:5.97-34.el5
>>>>>>>>>
>>>>>>>>> Complete!
>>>>>>>>> =========================
>>>>>>>>>
>>>>>>>>> However after that I see
>>>>>>>>>
>>>>>>>>> [root@pcfarm-10 ~]# ls -l /bin/dd
>>>>>>>>> -rwxr-xr-x 1 root root 41464 Jul 26  2011 /bin/dd
>>>>>>>>> [root@pcfarm-10 ~]# rpm -q --file /bin/dd
>>>>>>>>> coreutils-5.97-34.el5
>>>>>>>>>
>>>>>>>>> [root@pcfarm-10 ~]# rpm -qa | grep coreutils
>>>>>>>>> policycoreutils-1.33.12-14.8.el5
>>>>>>>>> policycoreutils-newrole-1.33.12-14.8.el5
>>>>>>>>> coreutils-5.97-34.el5
>>>>>>>>> policycoreutils-gui-1.33.12-14.8.el5
>>>>>>>>>
>>>>>>>>> i.e. no package with name coreutils.x86_64
>>>>>>>>>
>>>>>>>>> I failed to find anything on the topic in the scientific linux
>>>>>>>>> mailing list.
>>>>>>>>>
>>>>>>>>> Does somebody know about dd for 64 bit?
>>>>>>>>>
>>>>>>>>> Many thanks in advance,
>>>>>>>>>
>>>>>>>>> Andrey
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>

-- 
  /------------------------------------+-------------------------\
|Stephen J. Gowdy                     | CERN       Office: 8-1-11|
|http://cern.ch/gowdy/                | CH-1211 Geneva 23        |
|                                     | Switzerland              |
|EMail: [log in to unmask]                 | Tel: +41 76 487 2215     |
  \------------------------------------+-------------------------/
