On Mon, Oct 20, 2014 at 11:24 PM, Hunter Blanks <
hun...@napofearth.com> wrote:
> Daniel,
>
> Thanks for writing! Indeed, most of the deps are OpenStack. The ones that
> aren't are basically gevent, but I think others have already touched on
> the long-term goal of using multiprocessing as an alternative.
>
> So far as limiting backend deps, I'd agree that wal-e-s3, etc. packages are
> probably the way to go, though it would take a little care to make those
> packages work out of the same repo.
>
> As for python-daemon, your reckoning may differ, but my own list of
> preferences would be:
>
> If you don't have the requirements in
>
http://legacy.python.org/dev/peps/pep-3143/#correct-daemon-behaviour, then
> just use subprocess to farm out the fetches. My own reckoning, though, is
> that you must need them or else you wouldn't have gone to the trouble.
> (Maybe to prevent shared network FD's, maybe to prevent polluting stdout /
> stderr; the rationale is fairly clear, but I'm either too casual or slow a
> reader to say.)
Ah. I'll itemize the rationale:
Postgres does a system() call to run the archive fetch command, and
blocks until it is complete, then applies the fetched WAL. If one
only parallelizes the downloads, the result is that, say, 8 WAL
segments will be downloaded in parallel and then Postgres will apply
them...something that can take some time. And in that time, no
downloads are happening, which is a big loss.

So, to get around this synchronous API, it's necessary to detach from
the parent process and keep downloading even after the parent invoked
by Postgres returns with the WAL segment in place. A small cost
exacted is that each new parent process first looks aside at the
prefetch directory to see if it can promote a segment downloaded in
the background in this way.
Do something like "watch find pg_xlog/.wal-e" to see the directories
backing this dance in action while a database catches up in
"wal-fetch".
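To make the look-aside step concrete, here is a minimal sketch of the pattern described above. The function and argument names are illustrative, not WAL-E's actual internals:

```python
# A hypothetical sketch of the look-aside prefetch pattern; fetch_segment
# and its arguments are illustrative names, not WAL-E's actual code.
import os
import shutil


def fetch_segment(segment_name, destination, prefetch_dir, download):
    """Place one WAL segment at destination, preferring a prefetched copy."""
    prefetched = os.path.join(prefetch_dir, segment_name)
    if os.path.exists(prefetched):
        # Promote a segment that a detached background process already fetched.
        shutil.move(prefetched, destination)
        return "promoted"
    # Otherwise fall back to a direct, synchronous fetch while Postgres waits.
    download(segment_name, destination)
    return "downloaded"
```

The point is that the synchronous path only pays for a download when the background workers haven't already gotten ahead of Postgres.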
> If you do require everything that is "daemonization", and you're willing to
> maintain it yourself, Alex Martelli's daemonization example is pretty
> straightforward and where I usually end up. On the one hand, it is sad that
> this stuff has never made it into the standard library. On the other,
> daemonization has just enough differences of opinion that the "one way to do
> it" may never make it in. (For a hint of all that, see the
> in-depth comments and example Alex refers to at
>
http://code.activestate.com/recipes/278731/. Mr. Finney also has quite a
> good discussion in his PEP from 2009.) In the rare cases where I had to do
> such a thing, I've just worked off of Alex's example, taking the parts I
> needed to take.
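For reference, the core of the classic Stevens-style double fork that recipes like Alex's build on looks roughly like this. This is a minimal illustration under the usual POSIX assumptions, not WAL-E's or python-daemon's actual code, and a full recipe would also redirect stdin/stdout/stderr to /dev/null:

```python
# Minimal sketch of the classic double-fork daemonization pattern.
import os


def daemonize():
    """Detach from the controlling terminal via the Stevens double fork."""
    if os.fork() > 0:
        os._exit(0)   # first parent exits; the caller sees completion
    os.setsid()       # child becomes session leader, sheds controlling tty
    if os.fork() > 0:
        os._exit(0)   # session leader exits; the grandchild can never
                      # reacquire a controlling terminal
    os.chdir("/")     # don't pin a mount point
    os.umask(0)       # reset the file-creation mask
```

The second fork is the subtle part: it guarantees the surviving process is not a session leader, so opening a tty can't accidentally make it the controlling terminal.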
>
> If you still require daemonization and don't want to write it yourself,
> daemonize seems to be a fairly similar implementation that lacks
> dependencies.
> Else, you could talk to Ben Finney about altering his install_requires and
> maybe removing the lockfile dependency.
>
> Well, sorry for the long story there. Please let me know which of those you
> find amenable. None of them are particularly hard, and I'm happy to do a
> little legwork on any of them.
Sure, pick your favorite given that you know my basic requirement, as
above: background processes that can keep running after the parent exits.
I'm a bit hesitant to do this during the release-candidate part of the
release cycle, but we can get on it pronto for 0.9dev and apply the
patch first, before more interesting changes go in. I'd recommend
doing it that way.
Finally, I think I did things this way because one can do "apt-get
install python-daemon" and get the needed dependency on Ubuntu.