Hi Pete,
On Apr 28, 2010, at 19:22 , Peter Elmer wrote:
> Hi,
>
> On Apr 28, 2010, at 18:58, Stephan Wiesand <[log in to unmask]> wrote:
>> On Apr 27, 2010, at 00:15 , Brett Viren wrote:
>>> We recently started running our C++ analysis code on 64-bit SL5.3 and
>>> have been surprised to find that memory usage is about 2x what we are
>>> used to when running on 32 bits. Comparing a few basic applications
>>> like sleep(1) shows a similar increase. Others, like sshd, show only a
>>> 30% size increase (though that may be due to configuration differences
>>> between the two hosts).
>>>
>>> I understand that pointers must double in size but the bulk of our
>>> objects are made of ints and floats and these are 32/64 bit-invariant.
>>> I found[1] that poorly defined structs containing pointers can bloat
>>> even on non-pointer data members due to the padding needed to keep
>>> everything properly aligned. It would kind of surprise me if this is
>>> what is behind what we see.
>>>
>>> Does anyone have experience in understanding or maybe even combating
>>> this increase in a program's memory footprint when going to 64 bits?
>>
>> Is it real or virtual memory usage that's increasing beyond expectations?
>>
>> Example: glibc's locale handling code behaves quite differently in the 64-bit case. In 32-bit mode, even virtual address space is a scarce resource, while in 64-bit mode it isn't. So in the latter case, glibc simply mmaps the whole file providing the info for the locale in use, while in the former it uses a small address window that it slides to the appropriate position. The 64-bit case is simpler and thus probably less code, more robust and easier to maintain. And it's probably faster.
>>
>> The 32-bit case uses less *virtual* memory - but *real* memory usage is about the same, since only those pages actually read will ever be paged in. This has a dramatic effect on the VSZ of "hello world in python". It does not affect anything that really matters - in particular, checking the memory footprints of sleep & co. is not very useful, because they're really small compared to typical HEP analysis apps anyway.
>
> You can work around the locale thing for any batch application (for which it usually should
> not matter) by setting the LANG envvar to "C". For a single process the difference will only be about 50MB, though.
Yes, this is what I meant by "not anything that really matters".
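For illustration, the locale effect is directly visible in VmSize; a minimal sketch, assuming a glibc system with en_US.UTF-8 installed (any program that calls setlocale() will do - grep reading its own status works):

```shell
# grep calls setlocale(), so its own VmSize reflects the locale mapping:
# under a UTF-8 locale glibc mmaps the (large) locale data, under
# LC_ALL=C it does not. The difference is purely virtual - those pages
# are never all paged in.
LC_ALL=en_US.UTF-8 grep VmSize /proc/self/status
LC_ALL=C           grep VmSize /proc/self/status
```

The absolute numbers vary with the installed locale data, so only the difference between the two lines is meaningful.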
> The big difference most of us saw was due to the linker forcing shared libraries text/data to align to 2MB, while we have very many very small (<<2MB) libraries.
Ah. I really didn't know about this one yet.
> You should see this
> explicitly if you do a 'pmap' of your process once it is running and has loaded all
> libraries. You'll see memory sections with no permissions next to those corresponding to
> libraries. Assuming you aren't using huge memory pages in your application, there is a
> linker option (I don't recall the name off the top of my head) in SL5 binutils ld which allows
> you to reduce this.
>
> But what both of these things say is that VSIZE for 64bit is not a very good measure of
> how much memory an app really needs.
Right. Unfortunately, it's the only value that can actually be attributed to a single process on a system running multiple jobs.
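Where the kernel exposes the private/shared breakdown in /proc/<pid>/smaps (newer than what many SL5 nodes run, so this is a sketch rather than a recommendation), one can at least sum the pages private to a process:

```shell
# Sum the pages private to one process (here: the current shell, $$).
# Unlike VSZ this excludes mappings never touched; unlike RSS it
# excludes pages shared with other processes.
awk '/^Private_(Clean|Dirty):/ { kb += $2 }
     END { print kb " kB private" }' /proc/$$/smaps
```

This still misses a fair share of shared pages, so it's a lower bound rather than a full accounting.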
> Taking out fake accounting things like the two
> above, our estimate is that our (CMS) applications typically only need 20-25% more memory
> at 64bit relative to 32bit. (From the small code size increase, data type increases for ptr's
> and whatnot, and some increase from overhead/alignment for live objects in the heap...)
Yes, this seems reasonable.
> We are actually preparing some proposals/recommendations about measuring memory use,
> as in addition to this VSIZE/64bit confusion the introduction of "multicore" applications which
> share memory also misleads people...
Really looking forward to those. This is a serious problem.
- Stephan
--
Stephan Wiesand
DESY -DV-
Platanenallee 6
15738 Zeuthen, Germany