Thanks
This one, with which I have no experience, is often advertised in the
developer trade rags:
> Does there exist an API to replace the default memory manager in MSVCRT?
Well, C++ allows you to override new operators and Win32 allows for a few
ways of allocating memory. So, if you knew an awful lot about how your
application uses memory you might be able to do better by coming up with
your own sub-allocation scheme assuming you are willing to handle the thread
synchronization issues on your own.
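To make the "override new operators" route concrete, here is a minimal
sketch, not a recommendation. It replaces the global operator new/delete for
the module it is linked into and simply forwards to malloc()/free(); the
marked line is where a custom sub-allocation scheme would plug in. The
counters g_new_calls and g_delete_calls are made-up names, present only so
the effect can be observed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counters, for illustration only.
static std::atomic<std::size_t> g_new_calls{0};
static std::atomic<std::size_t> g_delete_calls{0};

// Replacement for the global operator new of this module.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;          // operator new must return a unique pointer
    void* p = std::malloc(size);      // <- a custom sub-allocator would go here
    if (p == nullptr) throw std::bad_alloc();
    g_new_calls.fetch_add(1, std::memory_order_relaxed);
    return p;
}

// Matching replacement for the global operator delete.
void operator delete(void* p) noexcept {
    if (p == nullptr) return;
    g_delete_calls.fetch_add(1, std::memory_order_relaxed);
    std::free(p);
}

// The remaining forms forward to the two functions above.
void operator delete(void* p, std::size_t) noexcept { operator delete(p); }
void* operator new[](std::size_t size) { return operator new(size); }
void operator delete[](void* p) noexcept { operator delete(p); }
void operator delete[](void* p, std::size_t) noexcept { operator delete(p); }
```

Code that calls malloc() directly, or that lives in a separately built DLL
with its own operator new, is not affected by this replacement.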
> Any tricks?
Take a look at HeapCreate(). It creates heaps in addition to the standard
process heap. HeapAlloc() has a flag (HEAP_NO_SERIALIZE) which can be used
to turn off its synchronization. Of course, if you do that it is up to you
to protect any heap which is shared among threads. Note that the docs
specifically warn against doing that on the process heap.
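A rough sketch of the private-heap idea follows. It assumes the heap is
owned by exactly one thread, which is the only case where passing
HEAP_NO_SERIALIZE is safe without adding your own locking. The class name
PrivateHeap is invented for this example, and the non-Windows branch is only
a stand-in so the sketch compiles elsewhere; it says nothing about how
Win32 heaps behave.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#else
#include <cstdlib>   // stand-in path for non-Windows builds
#endif

// Hypothetical wrapper around one private, unserialized Win32 heap.
// Intended to be used by a single thread only.
class PrivateHeap {
public:
    PrivateHeap() {
#ifdef _WIN32
        // Initial size 0 and maximum size 0: a growable heap.
        heap_ = HeapCreate(HEAP_NO_SERIALIZE, 0, 0);
#endif
    }
    ~PrivateHeap() {
#ifdef _WIN32
        if (heap_ != nullptr) HeapDestroy(heap_);  // frees all blocks at once
#endif
    }
    PrivateHeap(const PrivateHeap&) = delete;
    PrivateHeap& operator=(const PrivateHeap&) = delete;

    void* allocate(std::size_t bytes) {
        assert(bytes > 0);
#ifdef _WIN32
        return heap_ != nullptr ? HeapAlloc(heap_, 0, bytes) : nullptr;
#else
        return std::malloc(bytes);
#endif
    }

    void deallocate(void* p) {
        if (p == nullptr) return;
#ifdef _WIN32
        if (heap_ != nullptr) HeapFree(heap_, 0, p);
#else
        std::free(p);
#endif
    }

private:
#ifdef _WIN32
    HANDLE heap_ = nullptr;
#endif
};
```

One such object per worker thread gives each thread its own heap, at the
price that a block must be freed through the same heap it came from.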
Regards,
Will
Taking into account that I use 3rd party libraries, and they allocate memory
on their own, it looks impossible or at least very difficult.
Some of these 3rd party libs such as STLport are available in source code
(not the most intuitive C++ code I must say) so in principle I can modify
them to use alternative heap manager. But for other libraries I don't even
have source code - however all of them seem to be based on MSVCRT runtime
library.
> > Any tricks?
> Take a look at HeapCreate(). It creates heaps in addition to the standard
> process heap.
thanks for the info
If you rely on malloc()/free(), then there is no general replacement
facility.
[actually, I've seen commercial libraries replace the first 5 bytes
of msvcrt!malloc to jump to a replacement allocator,
but that is not a recommended or supportable approach.]
Depending on the problems your application is suffering from the most
(not expressed below), you might consider the Low Fragmentation Heap,
a pool of private heaps,
or writing a custom std::allocator<> replacement for your SC++L containers.
--
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of any included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm
"Zilsch" <z...@ztop.not> wrote in message
news:c%CFc.197954$207.2...@news20.bellglobal.com...
Like I said in a reply to William DePalo, we use 3rd party libraries, and
they allocate memory on their own.
Just replacing operators new/new[]/delete/delete[] with custom versions
will not solve the problem for the DLLs.
> If you relay on malloc()/free(), then there is no general replacement
> facility.
> [actually, I've seen commercial libraries to replace the first 5 bytes
> of msvcrt!malloc to jump to a replacement allocator,
> but that is not a recomended and supportable approach.]
Looks like there is no other easy way.
I am not going to replace the system MSVCRT,
just put a hacked DLL next to my application.
> Depending on the problems your application is suffering the most
> (not expressed below), you might consider LowFragmentation heap,
> a pool of private heaps,
> or writing custom std::Allocator<> for your SC++L containers.
This is all good, but to me it looks like too much unnecessary development
effort to compensate for a lousy heap implementation in MSVCRT. I am just
wondering how it is possible that we still have a heap manager that is
basically designed for single-threaded apps in an operating system that
claims to natively support both multithreading and SMP (Win2k).
Is there a heap implementation from Microsoft that scales?
Should I try Advanced Server or Win2K Enterprise version?
Do all of them have this problem?
FYI: they ask 5 to 25K USD depending on license terms.
Way too expensive and doesn't make any sense.
- Keith MacDonald
"Zilsch" <z...@ztop.not> wrote in message
news:61sGc.17761$WM5.8...@news20.bellglobal.com...
I've read the book, but I guess it will not solve the memory contention issue
when multiple threads simultaneously access the heap/allocator. Another issue
is that such small-object allocators tend to never release the memory they
once allocated, which is not desirable in my case. Their internal storage can
only grow.
In addition I need to allocate variable length memory blocks (dynamic
arrays) for which the trick with a linked list constant time allocation
doesn't work.
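For readers who have not seen the linked-list trick: the book is not named
in the surviving text, so the following is only a generic sketch of a
fixed-size free-list pool, with invented names (FixedPool, Node). Allocation
and release are constant time because every block has the same size, which
is exactly why the scheme does not carry over to variable-length arrays.
This version is not thread-safe and never returns its storage until the
pool itself is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size block pool built on an intrusive free list.
class FixedPool {
public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)),
          storage_(block_size_ * block_count) {
        // Thread every block onto the free list, last block first, so the
        // first allocate() hands out the lowest address.
        for (std::size_t i = block_count; i-- > 0;) {
            deallocate(storage_.data() + i * block_size_);
        }
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // O(1): pop the head of the free list; nullptr when the pool is empty.
    void* allocate() {
        if (free_list_ == nullptr) return nullptr;
        Node* n = free_list_;
        free_list_ = n->next;
        return n;
    }

    // O(1): push the block back; the link is stored inside the block itself.
    void deallocate(void* p) {
        assert(p != nullptr);
        free_list_ = new (p) Node{free_list_};
    }

private:
    struct Node { Node* next; };

    // Blocks must hold a Node and keep every block suitably aligned.
    // Assumes the vector's buffer is aligned for std::max_align_t.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + a - 1) / a * a;
    }

    std::size_t block_size_;
    std::vector<unsigned char> storage_;
    Node* free_list_ = nullptr;
};
```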
> Another free one is Hoard
> (http://sourceforge.net/projects/libhoard/).
Good stuff. I will check it out.
"Zilsch" <z...@ztop.not> wrote in message
news:61sGc.17761$WM5.8...@news20.bellglobal.com...
We have tried the WinHeap library (http://www.winheap.com/) and our
application started to run about 50% faster. Other libraries such as
SmartHeap SMP and Hoard can also deliver great speedups. There is nothing
especially esoteric in the application - just ordinary C++ code in a
multithreaded environment on a dual-CPU machine (yes, we use STL, but we
don't use any home-grown allocators of our own). You shouldn't blame us,
poor victims of Microsoft's less-than-perfect heap implementation.
"Zilsch" <z...@ztop.not> wrote in message
news:10891306...@news.pubnix.net...
Well, if heap access is serial, then, apparently, the single-threaded code
has been "ported" to an MT environment by simply putting a global lock on top
of it, instead of implementing a true MT heap which must allow parallel
allocations.
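The difference between the two designs can be sketched in a few lines. This
is a toy with invented names (locked_alloc, cached_alloc, ThreadCache), not
the MSVCRT code and not how any of the commercial allocators work: one
global mutex guards the underlying heap, and a per-thread free list lets
repeated allocations of the same size skip that lock entirely. A single
block size is assumed to keep it short.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <new>
#include <thread>
#include <vector>

constexpr std::size_t kBlockSize = 64;        // one illustrative size class

std::mutex g_heap_lock;                       // stands in for the single heap lock
std::atomic<std::size_t> g_locked_allocs{0};  // how often the slow path ran

// Slow path: every caller serializes on the one global lock.
void* locked_alloc() {
    std::lock_guard<std::mutex> guard(g_heap_lock);
    g_locked_allocs.fetch_add(1, std::memory_order_relaxed);
    return std::malloc(kBlockSize);
}

void locked_free(void* p) {
    std::lock_guard<std::mutex> guard(g_heap_lock);
    std::free(p);
}

struct FreeNode { FreeNode* next; };

// Per-thread list of freed blocks, handed back to the shared heap at thread exit.
struct ThreadCache {
    FreeNode* head = nullptr;
    ~ThreadCache() {
        while (head != nullptr) {
            FreeNode* n = head;
            head = n->next;
            locked_free(n);
        }
    }
};

thread_local ThreadCache t_cache;

// Fast path: reuse a block this thread freed earlier, no lock taken.
void* cached_alloc() {
    if (FreeNode* n = t_cache.head) {
        t_cache.head = n->next;
        return n;
    }
    return locked_alloc();
}

void cached_free(void* p) {
    assert(p != nullptr);
    t_cache.head = new (p) FreeNode{t_cache.head};
}

// Runs `threads` workers doing alloc/free pairs; returns the slow-path count.
std::size_t run_workers(unsigned threads, std::size_t iterations) {
    g_locked_allocs.store(0);
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([iterations] {
            for (std::size_t i = 0; i < iterations; ++i) {
                void* p = cached_alloc();
                if (p == nullptr) return;
                cached_free(p);
            }
        });
    }
    for (std::thread& th : pool) th.join();
    return g_locked_allocs.load();
}
```

In this toy each worker takes the global lock once for its first block and
then recycles it locally. The cache also shows the downside raised elsewhere
in the thread: memory parked in a thread's cache is not returned until that
thread exits.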
"Zilsch" <z...@ztop.not> wrote in message
news:10891319...@news.pubnix.net...
Can I configure the Visual C++ runtime library to use "Lookaside" and/or
the "low fragmentation heap"?
"Zilsch" <z...@ztop.not> wrote in message
news:10891357...@news.pubnix.net...
This means we were already using the LookAside and in our performance tests
it didn't do well. We have not run into serious memory fragmentation
troubles yet so I don't see how LowFrag can be better.
"Zilsch" <z...@ztop.not> wrote in message
news:10891411...@news.pubnix.net...
Check this one too:
http://www.winheap.com/winheap_info/benchmarks.php
You can get a trial version of WinHeap for free.
I haven't checked all the results myself but the charts are a bit alarming,
aren't they?
The output of `!heap -p -all` and/or `!heap -s` and/or `!heap -a`
does not contain any sensitive information (or at least it's not supposed
to; in any case it would be plain-text information that is pretty much
a gigantic list of hex numbers).
It might simply confirm that your case would not get any benefit from
the allocators available in the OS, or it might not.
So far, nothing that is actionable has been provided
(besides catalog-listing the options in the OS,
in the runtime and in the ISV implementations).
"Zilsch" <z...@ztop.not> wrote in message
news:10891482...@news.pubnix.net...
It automatically creates per-thread heaps if the main heap is locked
when another thread requires memory so should perform well in
multi-threaded/processor environments. I haven't tried it myself in
VC++ though
Cheers
Russell
hth,
Brand
"Zilsch" <z...@ztop.not> wrote in message news:<c%CFc.197954$207.2...@news20.bellglobal.com>...
> Like I said in a reply to William DePalo, we use 3rd party libraries, and
> they allocate memory on their own.
> Just replacing custom operators new/new[]/delete/delete[] will not solve the
> problem
> for the DLLs.
malloc() and free() in Microsoft Visual C++ version 6 are
generally delegated to the WinAPI routines HeapAlloc() and
HeapFree(). Therefore try LeapHeap (http://www.leapheap.com)
which intercepts these calls from the executable and (just
about) all loaded DLLs, and vectors them to a lock-free memory
allocator.
Chris
(do not reply to the email address given; it is spammed out)
>Taking into account that I use 3rd party libraries, and they allocate memory
>on their own, it looks impossible or at least very difficult.
>Some of these 3rd party libs such as STLport are available in source code
>(not the most intuitive C++ code I must say) so in principle I can modify
>them to use alternative heap manager.
First, what is called "STL" is now an official part of the language (more
precisely: part of the C++ Standard Library). Unless you have very
specific situations, you do not need STLport or other STL libraries.
Second, you do not need to tweak the source code to use a different
memory manager. All functionality that needs to allocate/free memory
provides a template argument that specifies a so called Allocator.
Write your own Allocator class and instantiate the templates with
that, and you're done. (Yes: the difficult part is to write an
allocator that is substantially better for your algorithm and operating
system and whatever than the standard allocator.)
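As a hedged illustration of the Allocator template argument: the sketch
below uses the short allocator interface of C++11 and later (compilers of
the VC7 era needed the longer C++98 form with rebind and friends). The
allocator just forwards to malloc()/free() and counts calls; the names
MyAllocator, IntVector and g_my_allocator_calls are placeholders, and a real
allocator would put its own pooling or per-thread logic where malloc() is
called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <new>
#include <vector>

static std::size_t g_my_allocator_calls = 0;  // illustration only, not thread-safe

// Minimal C++11-style allocator. Not suitable for over-aligned types,
// because malloc() only guarantees fundamental alignment.
template <class T>
struct MyAllocator {
    using value_type = T;

    MyAllocator() noexcept = default;
    template <class U>
    MyAllocator(const MyAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) {
            throw std::bad_alloc();           // n * sizeof(T) would overflow
        }
        void* p = std::malloc(n ? n * sizeof(T) : 1);  // <- custom heap goes here
        if (p == nullptr) throw std::bad_alloc();
        ++g_my_allocator_calls;
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

// Stateless allocators always compare equal.
template <class T, class U>
bool operator==(const MyAllocator<T>&, const MyAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const MyAllocator<T>&, const MyAllocator<U>&) noexcept { return false; }

// An alias keeps the allocator out of the code that uses the container.
using IntVector = std::vector<int, MyAllocator<int>>;
```

Note that IntVector is a different type from std::vector<int>, so existing
interfaces that take std::vector<int> would still have to change.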
BTW, Stroustrup reports dramatic performance gains when using special
allocators for special situations. For the standard Windows application
this gain will be less dramatic, but for compute-intensive work it may
very well be dramatic.
------------------------------------------------
Martin Aupperle
------------------------------------------------
It has been an official part of the language for about 5 years or more.
So what?
>Unless you have very specific situations, you do not need STLPort or other
>STL libraries.
STLport provides some facilities not available in other implementations such
as "debug mode".
http://www.stlport.org/doc/debug_mode.html
In reality it can be simpler to deal with a single STL implementation, which
is portable, than with a bunch of slightly differing STL libs from various
vendors.
> Second, you do not need to tweak the source code to use a different
> memory manager. All functionality that needs to allocate/free memory
> [....]
Aha, and I have to modify tons of existing and working code (BTW, other
people's code) replacing std::vector<int> with unreadable std::vector<int,
MyAllocator> and so forth....
> Yes: the difficult part is to write an allocator that is substantialy
> better
> for your algorithm and operating
> system and whatever than the standard-allocator
We are talking about multi-threaded apps on an SMP platform (SMP/MT). In an
SMP/MT environment, outperforming MSVCRT is not difficult at all. The Visual
C++ runtime library is not really designed for SMP/MT - it has been tweaked
(in a simplistic way) to run on SMP/MT - but as a result we have pathetic
performance.
You can check MSVCRT source code to convince yourself.
Search for
_mlock( _HEAP_LOCK );
You may well have issues with the multi-threaded performance of the SC++L
usage in your application, but that is not a relevant argument here.
For example, if you set a breakpoint in ntdll!RtlpLowFragHeapAlloc,
there will be no locks whatsoever taken in the heap manager code path.
"Zilsch" <z...@ztop.not> wrote in message
news:10897411...@news.pubnix.net...
I don't know if it is outdated or not, but the thing can still be found in
Microsoft Visual Studio .NET 2003\Vc7\crt\src\
and benchmark results are really bad for both Release and Debug.
Benchmarking the debug version of the C runtime is not that interesting.
If you benchmark an application with FullPageHeap enabled and not enabled,
the results can be radically different.
Again, I second the point that some benchmarks
on certain SC++L usages can give bad results.
"Zilsch" <z...@ztop.not> wrote in message
news:10897584...@news.pubnix.net...
We tried all of those as well but found that the fastest memory
manager came from Cherrystone Software Labs in the form of a product
called ESA. It was faster than all of them: 2x faster than smartheap
on multi-processor systems. I'd be hard pressed to believe that 50%
speedup with winheap, because in all of the benchmarking that we've
done we found winheap to be the slowest of the 4 (ESA, smartheap,
hoard, winheap ... in that order).
But this problem isn't just with Microsoft, it's with all of
the OS vendors. Linux too. Linux is poor in the memory management area
in terms of application memory management via malloc/free. And STL
certainly doesn't make things faster either.
That's a sorry state of affairs. The big OS vendors such as Sun and
Microsoft are selling products that have proud names like "Advanced Server",
"Enterprise", "SMP", "N-way" etc and they still didn't bother themselves to
implement a decent malloc()/free()... I guess they were busy doing more
important things - crafting cute GUIs, bloating application suites, etc.
This can also happen if the benchmark is started with a debugger
attached (yes, that's also possible in release builds). In this case,
HeapAlloc() automatically fills all allocated memory with 0xbaadf00d
which obviously takes some time...
So I hope all benchmarks have been run without a debugger ;)
"Tobias Güntner" <fat...@web.de> wrote in message
news:cdjqbk$k30$06$1...@news.t-online.com...
It only doesn't make sense if one of the following is true:
1) You don't care about the performance of your application.
2) You're developing freeware.
If you do care about the performance of your application, or perhaps
you don't but your customers do, then it makes sense to do whatever it
takes to increase the performance of said application. Whether it
be ESA from cherrystone, smartheap from microquill, winheap from
whoever, or something else, by all means do it. Not doing it is not
serving the best interests of your company and your customers.
We settled on ESA and increased our transactions from 600 transactions
per second to almost 2000 per second (we're a financial house). So who
gains? Ultimately the customer does, and by extension, the company
developing the software, and by further extension, you, the engineer,
for choosing to do something about performance.
Look into the following memory managers:
ESA http://www.cherrystonesoftware.com (Cherrystone Software Labs)
smartheap http://www.microquill.com (Microquill)
winheap http://winheap.com (bevan tech?)
We found ESA to be faster and more scalable than the others (by a long
shot, and cheaper than smartheap as well), but YMMV. Do your own
benchmarks and see which is best for your application.
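For anyone who wants a starting point for such a benchmark, here is a very
small harness, offered as a sketch only. It hammers malloc()/free() from
several threads and reports the number of completed pairs and the elapsed
wall-clock time. The names BenchResult and bench_malloc are invented here,
and a tight alloc/free loop of one size is an assumption that will not match
the allocation pattern of a real application, so treat the numbers as a
rough indication at best.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

struct BenchResult {
    std::size_t operations;  // completed malloc/free pairs across all threads
    double seconds;          // wall-clock time for the whole run
};

// Hypothetical micro-benchmark: each thread does `pairs_per_thread`
// malloc/free pairs of `block_size` bytes.
BenchResult bench_malloc(unsigned thread_count,
                         std::size_t pairs_per_thread,
                         std::size_t block_size) {
    assert(block_size > 0);
    std::vector<std::size_t> completed(thread_count, 0);
    std::vector<std::thread> workers;

    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < thread_count; ++t) {
        workers.emplace_back([&completed, t, pairs_per_thread, block_size] {
            std::size_t done = 0;
            for (std::size_t i = 0; i < pairs_per_thread; ++i) {
                void* p = std::malloc(block_size);
                if (p == nullptr) break;
                // Touch the block through a volatile pointer so the
                // compiler cannot drop the malloc/free pair.
                *static_cast<volatile unsigned char*>(p) = 1;
                std::free(p);
                ++done;
            }
            completed[t] = done;  // one write per thread, after the hot loop
        });
    }
    for (std::thread& w : workers) w.join();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    std::size_t total = 0;
    for (std::size_t c : completed) total += c;
    return BenchResult{total, elapsed.count()};
}
```

Running it with 1, 2 and 4 threads, in a release build and without a
debugger attached, and comparing operations per second gives a first idea of
how a given heap scales; relinking against a candidate allocator and
repeating the run gives the comparison.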
But you have to care about the performance of your application to have
the motivation to do it.
Jack Dao
joe
PS You can also look at www.hoard.org for an efficient free multi-threaded
memory allocator. It gives a nice scalable improvement, but also suffers
from the inability to easily return memory to the OS.
"Jack Dao" <d...@snakebrook.com> wrote in message
news:cff1ea7c.04072...@posting.google.com...
May be worth a look if that is an issue for you.
http://gee.cs.oswego.edu/dl/html/malloc.html
Cheers
Russell
ESA from Cherrystone Software does not suffer from this
problem. I can sit there and watch the memory usage of the process go
up and down as memory is mapped in and out.
> joe
>
> PS You can also look at www.hoard.org for an efficient free multi-threaded
> memory allocator. It gives a nice scalable improvement, but also suffers
> from the inability to easily return memory to the OS.
Hoard also suffers from a speed problem. In the ESA vs
hoard benchmarking tests we've run, hoard was just plain not fast
enough.
John