SCIENTIFIC-LINUX-USERS Archives

May 2007

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Connie Sieh <[log in to unmask]>
Reply To:
Connie Sieh <[log in to unmask]>
Date:
Tue, 22 May 2007 13:23:00 -0500
On Tue, 22 May 2007, Connie Sieh wrote:

> On Tue, 22 May 2007, rochelle lauer wrote:
>
>> Hello,
>>
>> I am trying to understand some weird performance characteristics
>> on a newly purchased blade (see statistics below).
>>
>> The hardware is an HP BL465 with 2 dual core AMD HE 2216 processors.
>> This is the first AMD and first 64-bit processor we have bought.
>
> Is this a NUMA-based system?

NUMA can be involved in performance issues.  If you have NUMA and it is on,
try turning it off and rerunning your tests.
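
One way to check whether the kernel currently sees more than one NUMA node is sketched below. This is only an illustration: it assumes a Linux kernel that exposes per-node directories under /sys/devices/system/node (some kernels or configurations may not), it requires Python 3, and the function name numa_nodes is made up for this example.

```python
import glob
import os


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node ids found as nodeN directories under sysfs_root.

    Assumption: the kernel exposes one directory per node (node0, node1, ...).
    An empty list means the directory is missing or holds no node entries.
    """
    ids = []
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        suffix = os.path.basename(path)[len("node"):]
        # Keep only real nodeN directories; skip files and odd names.
        if suffix.isdigit() and os.path.isdir(path):
            ids.append(int(suffix))
    return sorted(ids)


if __name__ == "__main__":
    nodes = numa_nodes()
    if len(nodes) > 1:
        print("NUMA appears to be active, nodes:", nodes)
    else:
        print("Single node or no NUMA information exposed:", nodes)
```

If more than one node is listed, NUMA is likely enabled and the tests are worth repeating with it turned off.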

-Connie Sieh
>
>>
>> I installed SL44 x86_64 and we did some performance tests.
>>
>> When running a single job (compute-bound Monte Carlo with HBOOK output),
>> the performance was about twice as slow as on our
>> Intel-based blade. Although this difference
>> could be attributed to the difference in
>> processors, running several single
>
> How does the amount of memory compare to the Intel-based tests?
>
>> jobs in a row produced rather erratic results...
>> 200-300 seconds different on a 900 second job.
>> Some were comparable to the 32 bit processor, some were not.
>>
>> Also, running 4 of the same jobs in parallel
>> produced results which were almost twice as fast!
>>
>> I then (for fun) installed SL43 x86_64. This produced results
>> quite different from those on SL44 and more consistent with
>> our 32 bit blades.
>>
>> Below is a sample of the CPU statistics.
>>
>> We first ran the existing 32 bit executable.
>>
>> We then recompiled and ran the 64 bit executable.
>>
>> Many of our jobs cannot be recompiled (won't compile on gcc 3.4 or have
>> missing libraries), so we would really like to understand this performance
>> discrepancy on 32 bit executables and SL44.
>>
>> 32 bit executable single job
>>
>>           SL44        SL43
>>           906 sec     556 sec
>>
>> 32 bit executable 4 jobs in parallel
>>
>>           SL44        SL43
>>
>> job 1     452 sec     446 sec
>> job 2     446 sec     442 sec
>> job 3     445 sec     444 sec
>> job 4     448 sec     446 sec
>>
>>
>> 64 bit executable single job
>>
>>           510 sec     497 sec
>>
>> The 64 bit executable seems to be a little more predictable.
>>
>>
>> So, does anyone have any idea
>>
>>  1. Why there is such a difference in performance between SL44 and SL43 (why
>>     does SL44 produce much slower results on a single job)?
>
> Not enough info to determine this.
> The biggest difference between SL43 and SL44 is that the kernel has
> changed.
>
>>
>>  2. Why running 4 jobs in parallel produces faster results than
>>     a single job? One would think jobs running in parallel
>>     would produce slightly slower performance.
>
> Depends on what they are doing.
>
>>
>>  3. Why running 4 jobs in parallel on SL44 produces much
>>     faster results (900 sec vs 452 sec).
>>
>
> I suggest you try some of the performance tools, such as oprofile and
> vmstat, to help determine what is going on.
>
>>  4. Should we not be running our 32 bit executables with an
>>     SLxx x86_64 installed?
>>     I have not yet tried installing SL44(43) x86 to check the
>>     performance. Should I?
>>
>
> Most see a performance improvement with 32-bit on a 64-bit OS.  This has been
> seen quite a bit with AMD 64-bit Opteron CPUs because the memory bandwidth
> is higher on them.
>
>>
>>
>> Thanks for any insight or help
>>
>> Regards
>> Rochelle Lauer
>> Yale University Physics
>>
>>
> -Connie Sieh
>
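
The run-to-run variation reported in the quoted message (200-300 seconds on a 900 second job) can be put into numbers by timing the same job several times in a row. The following is a minimal sketch, not a replacement for oprofile or vmstat. It assumes Python 3, that wall-clock time is the quantity of interest, and that the job can be started as a single command. The names time_runs, summarize and ./myjob are hypothetical.

```python
import statistics
import subprocess
import time


def time_runs(cmd, repeats=5):
    """Run cmd sequentially `repeats` times; return wall-clock seconds per run."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # raises if the job exits non-zero
        times.append(time.perf_counter() - start)
    return times


def summarize(times):
    """Return min, max, mean and spread (max - min) of a list of timings."""
    return {
        "min": min(times),
        "max": max(times),
        "mean": statistics.mean(times),
        "spread": max(times) - min(times),
    }


# Example use (./myjob is a placeholder for the real executable):
#   stats = summarize(time_runs(["./myjob"], repeats=5))
#   print(stats)
```

A large spread relative to the mean on one release but not the other would support the idea that the kernel change, rather than the executable, is behind the erratic timings.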
