Modern systems use on-demand allocation; i.e., you can allocate, say, a 32 MB
SHM chunk, but the actual resource usage (RAM) will correspond to what you
actually use. For example:
0 1M 32M
[****|..................]
| |
| +- unused part
|
+- used part of the SHM segment
As long as your program does not touch the unused part (neither reads nor
writes it), the actual physical memory usage will be 1M plus a small amount
for page tables (worst case: 4kB of page tables per 4MB of virtual address
space). This is at least how SYSV SHM works on Solaris 10 (look up DISM --
dynamic intimate shared memory); I would expect recent Linux kernels to behave
the same way. I'm not sufficiently acquainted with the NT kernel to be able to
comment on it.
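For instance, here is a minimal Boost.Interprocess sketch of the idea (the
segment name and sizes are made up for illustration); on such a kernel only
the ~1 MB touched below should end up resident:

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/shared_memory_object.hpp>
#include <cstring>

namespace bip = boost::interprocess;

int main()
{
    // Reserve a 32 MB segment up front ("demo_segment" is an arbitrary name).
    bip::managed_shared_memory segment(bip::create_only,
                                       "demo_segment", 32 * 1024 * 1024);

    // Touch roughly 1 MB of it; the remaining ~31 MB is never read or
    // written, so on an on-demand kernel it should not consume RAM.
    void *p = segment.allocate(1024 * 1024);
    std::memset(p, 0, 1024 * 1024);

    segment.deallocate(p);
    bip::shared_memory_object::remove("demo_segment");
    return 0;
}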
Bottom line: allocate as few, and as large, chunks as possible; modern VM
subsystems should be able to handle it gracefully.
===
If you don't know how much memory you will need in total, how do you handle out of
memory situations?
Alternatively, why not use files instead?
If you mean the overhead added by a named allocation and the indexes, use the
managed_shared_memory::get_free_memory() function to see how many bytes you
have left after creating the managed segment, and again after creating an
empty vector.
Keep in mind that if you fill those vectors alternately, the reallocations
the vectors perform might not be able to use all of the free memory (one
vector can end up in the middle of the segment, and the other vector then
cannot use both the memory before and the memory after it). If you need to
minimize the shared memory required, pre-calculate all the data and then dump
it into shared memory.
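A minimal sketch of that measurement (the segment name, size and element type
below are made up for illustration):

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/containers/vector.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <cstddef>
#include <iostream>

namespace bip = boost::interprocess;

typedef bip::allocator<double, bip::managed_shared_memory::segment_manager> ShmAlloc;
typedef bip::vector<double, ShmAlloc> ShmVector;

int main()
{
    bip::shared_memory_object::remove("probe_segment");
    bip::managed_shared_memory segment(bip::create_only, "probe_segment",
                                       1024 * 1024);

    std::size_t before = segment.get_free_memory();

    // Construct one empty named vector; the drop in free memory is the
    // overhead of the named allocation plus the index bookkeeping.
    ShmVector *v = segment.construct<ShmVector>("data")
                       (ShmAlloc(segment.get_segment_manager()));

    std::size_t after = segment.get_free_memory();
    std::cout << "overhead of an empty named vector: "
              << (before - after) << " bytes\n";

    segment.destroy_ptr(v);
    bip::shared_memory_object::remove("probe_segment");
    return 0;
}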
Regards,
Ion
Is there a separate column for "committed memory"? Virtual is, well, just
reserved; committed is actually allocated; working set is what is currently
in RAM (usually less than committed).
Again, I'm not an NT expert -- please cross-check the above paragraph(s) with
other sources.
>
> Sounds good, but the 6th one of these failed and I got a warning saying my
> system was low on virtual memory. So it sounds like there is a 4GB total
>
I'd rather say that you're low on swap. Each SHM segment needs a corresponding
amount of swap space which can be used as backing store, should you decide to
really use all of the reserved memory. I.e. when the total working set size
of all programs exceeds the total amount of physical memory (minus kernel
memory), some of the pages need to be swapped out to backing store -- in this
case swap.
Note that the swap space is also just _reserved_ -- the kernel needs to
ensure that it's there before it hands the SHM segment out to you, but it
will not be used unless you run short on physical memory. I.e. a mere swap
space _reservation_ will not slow down your system or program.
Try increasing the amount of swap space (so that it's [64MB * # of programs]
larger[*] than the [SHM segment size * # of programs]), repeat the experiment
and see what happens. 6 programs x 512MB = 3GB, so you should be safe at 3GB
+ the amount of physical RAM + an extra ~1GB for everything else on the system.
[*] Rule of thumb. Every process needs additional VM for stack, data, code,
etc.
>
> which seems silly since each process should have its own address space and
>
it does.
>
> the per-process limit is), I should be able to reserve an unlimited amount
> of total address space. No can do. :(
>
What do you mean by "total address space"? Total address space == RAM + swap
(and that is, I guess, what NT calls "virtual memory"), so it is not unlimited.
It is very reasonable that the kernel refuses to overcommit memory (i.e. does
not allow you to reserve more than the "total address space"); simulating
truly unlimited memory quickly leads to nasty situations (read about Linux's
out-of-memory killer).
>
> So strategy #1 of being profligate in choosing shared memory segment size
> fails on WinXP; there's a significant resource cost even if you don't
> actually allocate any memory. Drat.
>
Well, the only resource cost that I can see is disk space reserved for swap.
Given today's disks, I don't see that being a problem if it buys you a simpler
programming model. (And to make it clear, just in case: this is my comment on
your particular application; I do *not* recommend this approach as a general
programming practice!)
This would at least solve your swap problem because the file itself *is*
the backing store for its own mapping (well, if the mapping is shared so
that modifications don't create anonymous COW pages which in turn need
swap) - no additional swap needed.
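e.g., a minimal sketch using Boost.Interprocess's managed_mapped_file (the
file name and size are illustrative):

#include <boost/interprocess/managed_mapped_file.hpp>

namespace bip = boost::interprocess;

int main()
{
    // Creates (or opens) "log.bin" and maps it shared: dirty pages are
    // written back to the file itself, so no anonymous swap is reserved
    // for the mapping.
    bip::managed_mapped_file mfile(bip::open_or_create, "log.bin",
                                   32 * 1024 * 1024);

    void *chunk = mfile.allocate(1024 * 1024);
    // ... fill the chunk with data ...
    mfile.deallocate(chunk);
    return 0;
}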
I haven't followed this whole thread, but I seem to recall that HDF5 supports
MPI with Parallel HDF5.
http://www.hdfgroup.org/HDF5/PHDF5/
Or does that not solve your requirements?
Alas, Parallel HDF5 != concurrent file access. As I understand it,
parallel HDF5 = cooperating threads within a process writing in
parallel, and I need one process to write & others to monitor/display
the data.
>Could you maybe use a raw memory-mapped file instead,
>and convert it to HDF5 off-line?
well, technically yes, but for robustness reasons I want to decouple
the HDF5 logging from the shared memory logging. I'm very happy with
the file format's storage efficiency and robustness + have not had to
worry about file corruption (though oddly enough, the "official" HDF5
editor from the HDF5 maintainers has caused corruption in a few logs
when I added some attributes after the fact), so would like to
maintain independent paths: the HDF5 file as a (possibly) permanent
record, and my shared memory structure, which could possibly become
corrupt if I have one of those impossible-to-reproduce bugs -- but I
don't care since I have the log file.
I'm also dealing with a very wide range of storage situations; most
are going to be consecutive packets of data that are written to the
file + left there, but in some cases I may actually delete portions of
previously-written data that has been deemed discardable, in order to
make room for a long test run... more complicated than a vector that
grows with time, or a circular buffer. I've defined structures within
the HDF5 file which handle this fine; in the shared memory I was going
to do essentially the same thing & have a boost::interprocess::list<>
or map<> of moderately-sized data chunks (64K-256K) that I can
keep/discard.
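Roughly something like the following sketch (the Chunk layout, the 64K size
and the segment/object names are placeholders, not settled code):

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/containers/list.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <cstddef>

namespace bip = boost::interprocess;

struct Chunk
{
    enum { capacity = 64 * 1024 };   // 64K per chunk; could be up to 256K
    std::size_t used;                // bytes actually filled in
    unsigned char data[capacity];
};

typedef bip::allocator<Chunk, bip::managed_shared_memory::segment_manager> ChunkAlloc;
typedef bip::list<Chunk, ChunkAlloc> ChunkList;

int main()
{
    bip::managed_shared_memory segment(bip::open_or_create, "log_segment",
                                       16 * 1024 * 1024);

    ChunkList *chunks = segment.find_or_construct<ChunkList>("chunks")
                            (ChunkAlloc(segment.get_segment_manager()));

    // Append a chunk when more room is needed...
    chunks->push_back(Chunk());

    // ...and drop chunks that have been deemed discardable.
    if (!chunks->empty())
        chunks->pop_front();
    return 0;
}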
But back to the topic at hand -- let me restate my problem:
Suppose you have N processes where each process i=0,1,...N-1 is going
to need a pool of related memory with a maximum usage of sz[i] bytes.
This size sz[i] is not known beforehand but is guaranteed to be less than
some maximum M; it has a mean expected value of m where m is much
smaller than M. From a programmer's standpoint, the best way to handle
this would be to reserve a single shared memory segment and ask
Boost::interprocess to make the segment size equal to M. If I do this
then my resource usage in the page file (or on disk if I use a
memory-mapped file) is N*M which is much higher than I need. (I
figured out the source of this: windows_shared_memory pre-commits
space in the page file equal to the requested size)
So what's a reasonable way to architect shared memory use to support
this kind of demand? I guess maybe I could use a vector of shared
memory segments, starting with something like 256KB and increasing
this number as I need to add additional segments. It just seems like a
pain to have to maintain separate memory segments and have to remember
which items live where.
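For what it's worth, a rough sketch of that multi-segment bookkeeping (the
growth policy, naming scheme and lack of cleanup are deliberate
simplifications):

#include <boost/interprocess/managed_shared_memory.hpp>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

namespace bip = boost::interprocess;

// A pool that starts with one 256KB segment and adds a larger segment
// whenever an allocation no longer fits in the existing ones.
class GrowingShmPool
{
public:
    explicit GrowingShmPool(const std::string &base)
        : base_(base), next_size_(256 * 1024) {}

    void *allocate(std::size_t bytes)
    {
        // Try the segments we already have.
        for (std::size_t i = 0; i < segments_.size(); ++i)
        {
            void *p = segments_[i]->allocate(bytes, std::nothrow);
            if (p) return p;
        }
        // Nothing fit: create a new segment big enough for this request.
        add_segment(bytes);
        return segments_.back()->allocate(bytes);
    }

private:
    void add_segment(std::size_t min_bytes)
    {
        while (next_size_ < min_bytes + 4096)
            next_size_ *= 2;
        // Works for fewer than 10 segments; a real implementation would use
        // a proper naming scheme and remove the segments on shutdown.
        std::string name = base_ + char('0' + segments_.size());
        segments_.push_back(new bip::managed_shared_memory(
            bip::create_only, name.c_str(), next_size_));
        next_size_ *= 2;   // each new segment is twice as large
    }

    std::string base_;
    std::size_t next_size_;
    std::vector<bip::managed_shared_memory*> segments_;
};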
Just for numbers, I may have an occasional log going on that needs to
be in the 512MB range (though most of the time it will be in
the 50-500K range, occasionally several megabytes), and I can have 4-6
of these going on at once (though usually just one or two). On my own
computer I have increased my max swap file size from 3GB to 7GB (so
the hard limit is somewhat adjustable), though it didn't take effect
until I restarted my PC. I'm going to be using my programs on several
computers + it seems silly to have to go to this extent.
>
> Boost::interprocess to make the segment size equal to M. If I do this
> then my resource usage in the page file (or on disk if I use a
> memory-mapped file) is N*M which is much higher than I need. (I
>
Which is much higher than you need on the average. I fail to see why
having a 10GB, or even a 20GB, swap-file is a problem for you. Too much
of a hassle to configure it on all workstations?
>
> of these going on at once (though usually just one or two). On my own
> computer I have increased my max swap file size from 3GB to 7GB (so
> the hard limit is somewhat adjustable), though it didn't take effect
> until I restarted my PC. I'm going to be using my programs on several
> computers + it seems silly to have to go to this extent.
>
Ok, and you'll run your job on a machine with e.g. 1GB of swap[*], and this
particular instance will need 4GB of swap. What will happen when the
allocation fails? Note that growing the SHM segment in small chunks will
not help you with insufficient virtual memory, so you might as well allocate
M*N at once and exit immediately if the memory is not available.
[*] I'm using "swap" somewhat imprecisely to refer to total virtual memory
(RAM + swap).
Next-best solution: use binary search to find the maximum size you can
allocate and use that instead of M*N.
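For example, a rough sketch of such a probe (the segment name, search bounds
and 1 MB resolution are arbitrary choices):

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/exceptions.hpp>
#include <cstddef>

namespace bip = boost::interprocess;

// Binary-search for the largest segment the OS will currently let us create.
std::size_t largest_creatable_segment(const char *name,
                                      std::size_t lo,   // known-good size
                                      std::size_t hi)   // known-too-big size
{
    while (hi - lo > 1024 * 1024)                       // 1 MB resolution
    {
        std::size_t mid = lo + (hi - lo) / 2;
        bool ok = false;
        try
        {
            bip::managed_shared_memory probe(bip::create_only, name, mid);
            ok = true;
        }
        catch (const bip::interprocess_exception &) { ok = false; }
        bip::shared_memory_object::remove(name);        // clean up the probe
        if (ok) lo = mid; else hi = mid;
    }
    return lo;
}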
Neither way is particularly friendly towards other processes on the machine
(I assumed that you were running the jobs on dedicated machines), but it is
the least painful. Which is less expensive: your time spent developing multi-
chunk SHM management or just allocating a big chunk and reconfiguring all
computers *once*?
(I'm sorry, I'm very pragmatic, and I don't seem to have enough info to really
understand why you're making such a fuss over the swap size issue. I'm afraid
I can't offer you any further suggestions, since I consider this a non-problem
unless you have further constraints.)
I wouldn't allocate M*N at once. Each process could start/stop at
random times (this is triggered by users other than me who would start
multiple logs as necessary) so N changes as a function of time.
What I will probably do is just use one memory segment of size M[i]
memory per process #i, where M[i] has a default value M0, say 64MB,
that I can preset to a larger value if I know I'm going to have a long
duration log.
It's not a huge deal to increase the swap file (even on an old
computer, which most of our lab PCs are, I could add a 2nd hard drive
if I needed to), & it is almost certainly the most expedient solution for
the time being.
>(I'm sorry, I'm very pragmatic, and I don't seem to have enough info to really
>understand why you're making such a fuss over the swap size issue. I'm afraid
>I can't offer you any further suggestions, since I consider this a non-problem
>unless you have further constraints.)
Not a fuss, just trying to be aware of all the problems. This
discussion has been helpful. I have a career where my resources are
spread thinly among a wide range of things, & it's much more expensive
for me to design quickly for 90% success + then refactor 1-2yrs later
when absolutely necessary, than it is to spend the extra effort up
front to design for 99% success, understand where the 1% failure lies,
and move on to other things knowing I'm far less likely to have to
revisit. Especially when 90% success rates have a tendency to be
overestimated as there are customers who forget to mention certain
design requirements ;)
See ticket #1975: http://svn.boost.org/trac/boost/ticket/1975
Coincidentally, on the HDF5 mailing list today there were a few possibly
related comments (one user indicates that they do simultaneous writes, which
may or may not be similar to what is needed here):
================
Hello;
I've got an implementation which uses the HL API, and I run multiple writers
and possibly one reader. The writers go to the same OS file but different HDF
files.
In the use-case scenario, the reader and writer are operational on the same
HDF asset at the same time. The reader is also written in a manner that if it
reaches EOF, it'll wait some time and then proceed reading.
All this is for win32/vc++... not sure if the same applies to *nix. And it
works fine.
The only thing I needed to do was to enable multi-threaded building of
HDF5 and HL. I think there is a link on how to do that... I believe one
need only define the symbol "H5_HAVE_THREADSAFE" and uncomment some
commented-out lines in H5pubconf.h.
Not sure that answers your questions... and... hope it helps.
regards,
Sheshadri
Jason Sachs wrote:
> I was wondering where I could find some more technical details about
> concurrent reading/writing.
>
> The FAQ discusses it briefly
> (http://www.hdfgroup.org/hdf5-quest.html#grdwt):
>
> <excerpt>
> It is possible for multiple processes to read an HDF5 file when it is
> being written to, and still read correct data. (The following steps
> should be followed, EVEN IF the dataset that is being written to is
> different than the datasets that are read.)
>
> Here's what needs to be done:
>
> * Call H5Fflush() from the writing process.
>
> * The writing process _must_ wait until either a copy of the file
> is made for the reading process, or the reading process is done
> accessing the file (so that more data isn't written to the file,
> giving the reader an inconsistent view of the file's state).
>
> * The reading process _must_ open the file (it cannot have the
> file open before the writing process flushes its information, or it
> runs the risk of having its data cached in memory being incorrect with
> respect to the state of the file) and read whatever information it wants.
>
> * The reading process must close the file.
>
> * The writing process may now proceed to write more data to the
> file.
>
> There must also be some mechanism for the writing process to signal
> the reading process that the file is ready for reading and some way
> for the reading process to signal the writing process that the file
> may be written to again.
> </excerpt>
>
> Could someone elaborate in a more technical manner? e.g. SWMR
> (single-writer multiple-reader) can occur if the following is true
> (not sure if I have this correct; I use "process" rather than
> "threads" here & am not sure if HDF5 in-memory caches have thread
> affinity):
>
> 1. At all times the file is in one of the following states:
> (a) unmodified
> (b) modified (written to, but not flushed)
>
> 2. In the unmodified state, zero or more processes may have the file
> open. No process may write to the data.
>
> 3. In the modified state, exactly one process may have the file open.
> This is the process that can write to it.
>
> 4. A successful transition from the unmodified state -> modified state
> takes place when exactly one process has the file open and begins
> writing to it.
>
> 5. A successful transition from the modified state -> unmodified state
> takes place when the process that has written to the file completes a
> successful call to H5Fflush().
>
> The facilities to ensure that only one process has the file open for
> (4) above are not provided by the HDF5 library and must be provided by
> OS-specific facilities e.g. mutexes/semaphores/messaging/etc.
>
==================
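For what it's worth, the handshake the FAQ excerpt calls for could be
sketched with Boost.Interprocess named synchronization primitives along these
lines (this is only an illustration of the signaling, not anything HDF5
provides; error handling is omitted and the shared "file_ready" flag would
have to live in a small shared segment):

#include <boost/interprocess/sync/named_mutex.hpp>
#include <boost/interprocess/sync/named_condition.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>

namespace bip = boost::interprocess;

// Writer side: flush, announce a consistent file, wait for the reader.
void writer_publishes_snapshot(bool &file_ready /* lives in shared memory */)
{
    bip::named_mutex     mtx (bip::open_or_create, "hdf5_handshake_mtx");
    bip::named_condition cond(bip::open_or_create, "hdf5_handshake_cond");

    bip::scoped_lock<bip::named_mutex> lock(mtx);
    // ... H5Fflush(file_id, H5F_SCOPE_GLOBAL) here ...
    file_ready = true;
    cond.notify_all();         // file is now consistent, a reader may open it
    while (file_ready)         // block until a reader has read and closed it
        cond.wait(lock);
    // ... safe to write more data to the file now ...
}

// Reader side: wait for a consistent file, read it, hand control back.
void reader_consumes_snapshot(bool &file_ready /* lives in shared memory */)
{
    bip::named_mutex     mtx (bip::open_or_create, "hdf5_handshake_mtx");
    bip::named_condition cond(bip::open_or_create, "hdf5_handshake_cond");

    bip::scoped_lock<bip::named_mutex> lock(mtx);
    while (!file_ready)
        cond.wait(lock);
    // ... open the HDF5 file, read what is needed, close it ...
    file_ready = false;
    cond.notify_all();         // tell the writer it may write again
}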