I am wondering if anything can be done about the COW (copy-on-write)
problem when forking a Python process. I have found several
discussions of this problem, but I have seen no proposed solutions or
workarounds. My understanding of the problem is that an object's
reference count is stored in the "ob_refcnt" field of the PyObject
structure itself. When a process forks, its memory is initially not
copied. However, if any reference to an object is created or destroyed
in the child process, the page containing the object's "ob_refcnt"
field will be copied.
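To make the mechanism above concrete, here is a minimal sketch, assuming CPython, of a reference count changing when an object is merely referenced from somewhere new. The helper name refcount_delta_when_referenced is made up for illustration; sys.getrefcount is the standard-library call.

```python
import sys


def refcount_delta_when_referenced(obj):
    """Return how much obj's reference count changes when one new
    reference to it is stored in a list (CPython-specific)."""
    before = sys.getrefcount(obj)
    holder = [obj]              # storing a reference writes to obj's ob_refcnt
    after = sys.getrefcount(obj)
    del holder                  # dropping the list writes to ob_refcnt again
    return after - before


class Foo:
    pass


delta = refcount_delta_when_referenced(Foo())
print(delta)
```

The object's own data is never modified here, yet its header is written twice, which is what would dirty a copy-on-write page in a forked child.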
My first thought was the obvious one: make the ob_refcnt field a
pointer into an array of all object refcounts stored elsewhere.
However, I do not think that there would be a way of doing this
without adding a lot of complexity. So my current thinking is that it
should be possible to disable refcounting for an object. This could
be done by adding a field to PyObject named "ob_optout". If ob_optout
is true then Py_INCREF and Py_DECREF will have no effect on the
object:
from refcount import optin, optout
class Foo: pass
mylist = [Foo() for _ in range(10)]
optout(mylist) # Sets ob_optout to true
for element in mylist:
    optout(element) # Sets ob_optout to true
Fork_and_block_while_doing_stuff(mylist)
optin(mylist) # Sets ob_optout to false
for element in mylist:
    optin(element) # Sets ob_optout to false
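The intended semantics of the proposed flag can be modelled in pure Python. This is only a toy sketch: ToyObject, incref and decref are made-up stand-ins for the PyObject header and the Py_INCREF / Py_DECREF macros, and ob_optout is the hypothetical field from the proposal, not something CPython provides.

```python
class ToyObject:
    """Pure-Python stand-in for a PyObject header (illustration only)."""

    def __init__(self):
        self.ob_refcnt = 1      # one owner holds the object
        self.ob_optout = False  # hypothetical field from the proposal


def incref(obj):
    # Models Py_INCREF: a no-op while the object has opted out.
    if not obj.ob_optout:
        obj.ob_refcnt += 1


def decref(obj):
    # Models Py_DECREF (deallocation at zero is left out of this model).
    if not obj.ob_optout:
        obj.ob_refcnt -= 1


obj = ToyObject()
obj.ob_optout = True
incref(obj)                             # a second reference, not recorded
count_while_opted_out = obj.ob_refcnt   # header untouched
obj.ob_optout = False
decref(obj)                             # second reference dropped after opt-in
count_after = obj.ob_refcnt             # too low: the first owner still exists
```

While opted out the header is never written, which is the point of the proposal. The model also shows one thing to watch for: a reference created while opted out and released after opting back in leaves the count one too low.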
Has anyone else looked into the COW problem? Are there workarounds
and/or other plans to fix it? Does the solution I am proposing sound
reasonable, or does it seem like overkill? Does anyone foresee any
problems with it?
Thanks,
--jac
Why'd you need a "fix" like this for something that isn't broken? COW
doesn't just refer to the object reference-count, but to the object
itself, too. _All_ memory of the parent (and, as such, all objects, too)
becomes unrelated to memory in the child once the fork is complete.
The initial object reference-count state of the child is guaranteed to
be sound for all objects (because the parent's final reference-count
state was, before the process image got cloned [remember, COW is just an
optimization for a complete clone, and it's up to the operating system to
make sure that you don't notice different semantics from a complete
copy]), and what you're proposing (opting in/out of reference counting)
breaks that.
--
--- Heiko.
I disagree with your statement that COW is an optimization for a
complete clone. It is an optimization that works at the memory page
level, not at the memory image level. In other words, if I write to a
copy-on-write page, only that page is copied into my process's address
space, not the entire parent image. To the best of my knowledge, by
preventing the child process from altering an object's reference count
you can prevent the page holding the object from being copied (assuming
the object is not altered explicitly, of course).
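The page-level granularity can be illustrated with a little arithmetic. The sketch below assumes a made-up layout (1000 objects spaced 64 bytes apart, starting on a page boundary) and 4096-byte pages; pages_touched is a helper invented for this example.

```python
import mmap


def pages_touched(addresses, page_size=mmap.PAGESIZE):
    """Return the set of page numbers containing the given byte addresses."""
    return {addr // page_size for addr in addresses}


# Hypothetical layout: 1000 small objects, 64 bytes apart, starting at a
# page boundary; 4096-byte pages are assumed for the arithmetic.
addresses = [0x10000 + 64 * i for i in range(1000)]
touched = pages_touched(addresses, page_size=4096)
print(len(touched))  # 64000 bytes of objects span 16 pages
```

Under these assumptions, writing the refcount of every object would copy 16 pages in the child, while writing the refcount of a single object would copy only the one page that holds it.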
Hopefully this clarifies my previous post,
--jac
As I said before: COW for "sharing" a process's forked memory is simply
an implementation detail, and an _optimization_ (and of course a
sensible one at that) for fork; there is no provision in the semantics
of fork that an operating system should use COW memory pages for
implementing the copying (and early UNIXes didn't do that; they
explicitly copied the complete process image for the child). The only
semantic that is specified for fork is that the parent and the child
have independent process images, which are equivalent copies (except for
some details) immediately after the fork call has returned successfully
(see SUSv4).
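The independence of the two process images can be observed directly. The following is a minimal sketch for POSIX systems; child_view_after_fork is a name invented for this example, and the pipe is only there so the child can report what it sees.

```python
import os


def child_view_after_fork(data):
    """Fork, let the child append to its copy of data, and return
    (parent_view, child_view) once the child has exited. POSIX only."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: modify its own copy and report it through the pipe.
        try:
            os.close(read_fd)
            data.append("child")
            os.write(write_fd, ",".join(data).encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # leave without running the parent's cleanup handlers
    # Parent: read the child's report until EOF, then reap the child.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    child_view = b"".join(chunks).decode().split(",")
    return list(data), child_view


parent_view, child_view = child_view_after_fork(["a", "b"])
print(parent_view, child_view)
```

The child's append is visible only in the child; the parent's list is unchanged, whether or not the kernel implements the copy with COW pages.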
What you're thinking of (and what's generally useful in the context
you're describing) is shared memory; Python supports putting objects
into shared memory using e.g. POSH (an extension that allows
you to place Python objects in shared memory, using the SysV
IPC feature set that most UNIXes implement today).
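As a rough illustration of the shared-memory idea, here is a sketch that uses the standard library's multiprocessing.shared_memory module (Python 3.8 and later) rather than POSH, whose API is not shown in this thread. It shares raw bytes, not Python objects, and attaches from the same process only to keep the example short.

```python
from multiprocessing import shared_memory

# Create a named block; any process that knows the name can attach to it.
block = shared_memory.SharedMemory(create=True, size=16)
try:
    block.buf[:5] = b"hello"

    # Attaching by name maps the same memory rather than a copy.
    view = shared_memory.SharedMemory(name=block.name)
    try:
        seen = bytes(view.buf[:5])
        view.buf[0] = ord("J")                  # a write through one mapping...
        seen_by_creator = bytes(block.buf[:5])  # ...is visible through the other
    finally:
        view.close()
finally:
    block.close()
    block.unlink()  # release the block once no process needs it
```

Unlike memory inherited through fork, writes made through either handle are seen by the other, so this is sharing in the real sense rather than a private copy.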
--
--- Heiko.