Sage modules and forking


Pavel Panchekha

Oct 26, 2012, 3:33:25 PM
to sage-...@googlegroups.com
I have a Sage package/module that does parallelism by forking and communicating over pipes.  This of course forks the whole Sage executable.
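Concretely, the pattern looks roughly like this (a hypothetical sketch with made-up names, not the actual package code):

    import os, pickle

    def run_in_child(task, arg):
        # Hypothetical helper, not the actual package code.
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:                    # child
            os.close(r)
            with os.fdopen(w, 'wb') as out:
                pickle.dump(task(arg), out)
            os._exit(0)                 # skip the parent's exit handlers
        else:                           # parent
            os.close(w)
            with os.fdopen(r, 'rb') as inp:
                result = pickle.load(inp)
            os.waitpid(pid, 0)
            return result

    print(run_in_child(lambda n: n * n, 7))   # -> 49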

This occasionally (though rarely) results in a variety of problems: double frees, corrupted internal glibc structures, hangs, null pointers in absurd places, and so on.

Is this a known bug?  Is this something that should work?  Is forking Sage just impossible to support?

William Stein

Oct 26, 2012, 3:39:51 PM
to sage-...@googlegroups.com
I'd love to see an example of any of the above, since I've caused Sage
to fork millions of times in various situations (even in the last 24
hours!) and never seen these problems. There are caveats though,
e.g., make sure that all pseudotty interfaces are closed after forking
-- there are @parallel and @fork decorators that do this.
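For reference, this is roughly how those decorators are used (from memory -- check the reference manual for the exact semantics):

    from sage.all import parallel, fork

    @parallel(ncpus=2)
    def square(n):
        return n * n

    # Calling a @parallel-decorated function on a list of inputs yields
    # ((args, kwargs), value) pairs, one per forked worker.
    sorted(value for (_, value) in square(range(8)))

    @fork
    def isolated():
        # Runs in a forked child; per the note above, the decorator takes
        # care of the pseudo-tty interfaces, and only the return value
        # comes back to the parent.
        return 2**100 + 1

    isolated()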

Also, I think I've maybe seen situations where causing errors in
maxima via libecl leads to corruption of Maxima for the parent process
(though maybe I was just confused and this really involved
pseudotty's).

Another possibility is that all of your problems (double frees,
corrupted internal glibc structures, etc.) are the result of bugs in
your code, and by running it in parallel you're simply exercising it a lot more.

-- William


--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

Volker Braun

Oct 26, 2012, 4:22:11 PM
to sage-...@googlegroups.com
The forked process is pretty much independent. In fact, this is probably the main drawback of the fork() multiprocessing model: it's impossible for the processes to influence each other's address space after the fork() has happened, even if you want to pass data around. 

Things to look out for:
  * both processes will try to run exit handlers (e.g. delete temporary files)
  * you need to be careful with open file handles; ideally have the child close everything so that it detaches completely (see the sketch below). Libraries that have fds open might be unhappy about this, though.
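
A minimal sketch of both points (a hypothetical example, not the package in question):

    import os, atexit, tempfile

    scratch = tempfile.NamedTemporaryFile(delete=False)
    atexit.register(os.unlink, scratch.name)    # parent's cleanup handler

    pid = os.fork()
    if pid == 0:
        # Child: close inherited descriptors we don't need (this is the
        # part that can upset libraries holding fds open), ...
        os.closerange(3, 1024)
        # ... and leave via os._exit() so the atexit handler above is not
        # run a second time, deleting the parent's temporary file.
        os._exit(0)
    else:
        os.waitpid(pid, 0)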

Florent Hivert

Oct 26, 2012, 4:43:36 PM
to sage-...@googlegroups.com
Hi There,

> > I have a Sage package/module that does parallelism by forking and
> > communicating over pipes. This of course forks the whole Sage executable.
> >
> > This seems to sometimes (but rarely) result in any of a large number of
> > problems: double-frees, corrupted internal glibc structures, hangs, null
> > pointers in absurd places, and so on.
> >
> > Is this a known bug? Is this something that should work? Is forking Sage
> > just impossible to support?
>
> I'd love to see an example of any of the above, since I've caused Sage
> to fork millions of times in various situations (even in the last 24
> hours!) and never seen these problems. There are caveats though,
> e.g., make sure that all pseudotty interfaces are closed after forking
> -- there are @parallel and @fork decorators that do this.
>
> Also, I think I've maybe seen situations where causing errors in
> maxima via libecl leads to corruption of Maxima for the parent process
> (though maybe I was just confused and this really involved
> pseudotty's).
>
> Another possibility is that all of your problems: double frees,
> corrupted internal glibc structures, etc., are the results of bugs in
> code, and by running code in parallel you're exercising it a lot more.

I'm also in the process of finalizing a patch which does parallel and even
distributed map-reduce on recursively enumerated sets (currently badly named
SearchForest; I'll change the name in my patch, ticket #13580, patch on the
Sage-Combinat queue [1]). Aside from William's caveats, I would add "don't mix
fork and threads". A more or less accurate description of what can happen is
that forking is not atomic with respect to the other threads, so the child may
inherit thread state you didn't intend to fork. Before being aware of that I
had some very nasty behavior, such as a lock being taken twice by the same
thread. I can imagine that this could lead to double frees...
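
A small self-contained illustration of the kind of thing I mean (a toy example, not from my patch):

    import os, sys, threading, time

    lock = threading.Lock()

    def worker():
        # Background thread that holds the lock most of the time.
        while True:
            with lock:
                time.sleep(0.01)

    t = threading.Thread(target=worker)
    t.daemon = True
    t.start()
    time.sleep(0.1)

    pid = os.fork()
    if pid == 0:
        # Only the forking thread exists in the child, but the lock keeps
        # whatever state it had at the instant of the fork.  If the worker
        # happened to hold it just then, a blocking acquire() would hang
        # forever, so probe non-blockingly instead.
        if not lock.acquire(False):
            sys.stderr.write("child: inherited lock is stuck locked\n")
        os._exit(0)
    else:
        os.waitpid(pid, 0)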

I hope this helps.

Cheers,

Florent

[1] http://combinat.sagemath.org/patches/file/7e81f6e12973/trac_13580-map_reduce-fh.patch

Jason Grout

Oct 26, 2012, 4:54:38 PM
to sage-...@googlegroups.com
On 10/26/12 3:43 PM, Florent Hivert wrote:

> Aside from William's caveats, I would add "don't mix
> fork and threads". A more or less accurate description of what can happen is
> that forking is not atomic with respect to the other threads, so the child may
> inherit thread state you didn't intend to fork. Before being aware of that I
> had some very nasty behavior, such as a lock being taken twice by the same
> thread. I can imagine that this could lead to double frees...
>

I thought this article title was great:
http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them

"Threads and fork(): think twice before mixing them"

The article itself is an interesting read.

Thanks,

Jason

