This loop is where we get stuck, in
managed_open_or_create_impl::priv_open_or_create:
    while(value == InitializingSegment || value == UninitializedSegment){
        detail::thread_yield();
        value = detail::atomic_read32(patomic_word);
    }
At this point I have opened the file in open_only mode, and *patomic_word
is 0 (UninitializedSegment). The code appears to be waiting for some
other process or thread to initialize the segment, but in fact no such
process exists.
Some possible solutions:
If, at this point, we have opened a file and not created it, why wait for
an UninitializedSegment to change state? If the segment is uninitialized
here, then simply throw an error. Make it the caller's responsibility to
ensure the segment is created/initialized before it is opened in a
read-only mode.
Perhaps, if you want to allow multiple processes to do open and create
simultaneously without any additional synchronization mechanism, you
could accomplish that by adding a count of open mappings into the shared
segment. If the reference count is 1 at this point, don't attempt this
spinlock because the state of the file is never going to change. In
this case throw if the *patomic_word is != InitializedSegment.
--
KEVIN ARUNSKI
> Some possible solutions:
>
> If, at this point, we have opened a file and not created it, why wait for
> an UninitializedSegment to change state? If the segment is uninitialized
> here, then simply throw an error. Make it the caller's responsibility to
> ensure the segment is created/initialized before it is opened in a
> read-only mode.
The reason is to support simultaneous open and create, as you indicate
below.
> Perhaps, if you want to allow multiple processes to do open and create
> simultaneously without any additional synchronization mechanism, you
> could accomplish that by adding a count of open mappings into the shared
> segment. If the reference count is 1 at this point, don't attempt this
> spinlock because the state of the file is never going to change. In this
> case throw if the *patomic_word is != InitializedSegment.
A count does not work, because if a process dies, then you have a wrong
count. If you need to commit the first page to avoid power errors, call
flush() just after creating the managed segment.
Anyway, trying to use a mapped file after a hard shutdown has no
sensible recovery: you don't know which parts of the file the OS has
committed, and the internal data structure might be absolutely corrupted.
Best,
Ion
Understood. I have been using flush() to commit managed file segments
to save them, and indeed that does work fine. The problem comes when
the crash occurs between opening the file and calling flush(). I could
move the flush() earlier in the process, though, to reduce the
chance of this situation happening. But even if this allows the open
to proceed, how much can I tell about the file, since changes were made
after the flush? If, for example, I wanted to set a dirty flag within
the segment itself, wouldn't I run the risk of the allocation
structures within the segment being corrupt, leaving me unable to find
the offset of my flag?
>
> Anyway, trying to use a mapped file after a hard shutdown has no
> sensible recovery: you don't know which parts of the file the OS has
> committed, and the internal data structure might be absolutely corrupted.
>
Indeed, I do not want to use the corrupted file at all, but I have no
way to tell if the file is corrupted or OK. If I try to open the
segment read only and examine it, I get stuck in the loop with no way
to detect the failure. This is the problem I am seeking a solution
to. From looking at the code it appears that if, for whatever reason,
the first 32 bits of the file are 0, and the file is opened read-only,
then I am stuck.
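For anyone who cannot patch the header, the stuck condition can be tested from the caller's side before the segment is opened. The following is only a sketch: first_word_is_zero is a hypothetical helper, not part of Boost.Interprocess, and it assumes (as observed above) that the state word occupies the first 32 bits of the file and that UninitializedSegment is 0.

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper, not part of Boost.Interprocess.
// Returns true if the first 4 bytes of the file can be read and are all
// zero, i.e. the state word would read as UninitializedSegment (0).
// Assumption: the state word sits at offset 0 of the mapped file.
bool first_word_is_zero(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    char bytes[sizeof(std::uint32_t)];
    if (!in.read(bytes, sizeof bytes)) {
        // Missing or shorter than 4 bytes: not the condition tested here.
        return false;
    }
    std::uint32_t word = 0;
    std::memcpy(&word, bytes, sizeof word);
    return word == 0;
}
```

If this returns true, an open_only attempt would spin in the loop above, so the caller can treat the file as unusable instead. The check is only safe when no other process can be creating the file at the same moment, which is the situation I am in.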
I was able to solve the issue for my purposes with this change:
diff -r boostb/interprocess/detail/managed_open_or_create_impl.hpp boosta/interprocess/detail/managed_open_or_create_impl.hpp
353c353,358
< while(value == InitializingSegment || value == UninitializedSegment){
---
> if (value == UninitializedSegment)
> {
> throw interprocess_exception(error_info(corrupted_error));
> }
>
> while(value == InitializingSegment){
But, as you can see, if the user intends to use simultaneous open and
create as a synchronization mechanism, it will fail. This is OK for me
because I already have synchronization elsewhere in my code that
prevents that scenario.
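To make the changed behavior easier to see outside the library, here is a standalone model of it using only the standard library. It is a sketch, not the Boost code: wait_open_only is a hypothetical name, std::atomic stands in for the mapped state word, and only UninitializedSegment == 0 comes from what I observed; the other two enum values are assumed for illustration.

```cpp
#include <atomic>
#include <cstdint>
#include <stdexcept>
#include <thread>

// Stand-ins for the library's state values. Only UninitializedSegment == 0
// is taken from this thread; the other two values are assumptions.
enum SegmentState : std::uint32_t {
    UninitializedSegment = 0,
    InitializingSegment  = 1,  // assumed value
    InitializedSegment   = 2   // assumed value
};

// Hypothetical model of the open_only path after the change:
// fail at once on an uninitialized word, and spin only while some
// creator is visibly in the middle of initializing.
std::uint32_t wait_open_only(const std::atomic<std::uint32_t>& word)
{
    std::uint32_t value = word.load();
    if (value == UninitializedSegment) {
        throw std::runtime_error("segment was never initialized");
    }
    while (value == InitializingSegment) {
        std::this_thread::yield();
        value = word.load();
    }
    return value;  // the state observed once the wait ends
}
```

The early throw is exactly what breaks simultaneous open and create: an opener that arrives before the creator has written InitializingSegment gets an error instead of waiting.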
Perhaps, rather than spinning indefinitely, there could be a timeout or
some other limit on how long the open function will wait for the file to
become initialized? I assume from the fact that you chose a spin lock
that you didn't intend for the user to wait indefinitely.
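Roughly what I have in mind is sketched below, again as a standalone model rather than a patch. wait_initialized and its timeout parameter are hypothetical, std::atomic stands in for the mapped state word, and the enum values other than UninitializedSegment == 0 are assumed.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Stand-ins for the library's state values. Only UninitializedSegment == 0
// is taken from this thread; the other two values are assumptions.
enum SegmentState : std::uint32_t {
    UninitializedSegment = 0,
    InitializingSegment  = 1,  // assumed value
    InitializedSegment   = 2   // assumed value
};

// Hypothetical bounded version of the wait loop. It keeps the original
// loop condition, so simultaneous open and create still works, but gives
// up once the deadline passes. Returns true only if the word ends up
// reading InitializedSegment.
bool wait_initialized(const std::atomic<std::uint32_t>& word,
                      std::chrono::milliseconds timeout)
{
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    std::uint32_t value = word.load();
    while (value == InitializingSegment || value == UninitializedSegment) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // nobody finished initializing in time
        }
        std::this_thread::yield();
        value = word.load();
    }
    return value == InitializedSegment;
}
```

With something like this, a file whose first word is stuck at 0 produces a failure after the timeout instead of a hang, and the caller chooses how long is reasonable to wait.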
KEVIN ARUNSKI