(cy)PARI in Parallel, Heisenbugs, and Merging Policy


Travis Scrimshaw

May 14, 2022, 11:06:18 PM
to sage-devel
Hi everyone,
   On ticket #30423, Dan, Willie, and I have been working on a parallel-computation-based implementation for computing the F-matrices that are used in mathematical physics. However, we have been seeing sporadic doctest failures that involve segfaults and linked-list corruption from (cy)PARI. Here are the logs from testing; the first and the last have full tracebacks.


The first question is whether anyone has an idea about what is causing this. I have the impression that PARI is thread-safe, but I am wondering if cypari is also thread/parallel-safe or if there are any specific things that we should be careful about. (We’ve already had to work around a pickling issue with polynomials IIRC.)

The second question: because this is a Heisenbug and I suspect it is something upstream (so far, nobody has been able to reproduce it in an interactive Sage session), I was wondering what the policy would be for merging the ticket. I recall that in the past we have merged tickets with Heisenbugs, with follow-up tickets noting the behavior, but I am not 100% sure about that (and I wouldn’t necessarily know how to find any explicit examples). Could we merge the ticket in an early beta version so that many people/systems can test it to see if it becomes more reproducible? Of course, this assumes that the build bots do not reproduce it consistently. Should we just mark any offending test(s) as “# known bug”, and is there some general policy about this?

Thanks,
Travis


dwb...@gmail.com

May 15, 2022, 1:15:24 AM
to sage-devel

A bit more information: as far as we know, there are problems only on Linux. The logs badlog, badlog1, badlog2 and badlog3 were produced on one machine (a Xeon box running Ubuntu 18.04), and badlog-match on another machine (an i7 also running Ubuntu 18.04).
In all the logs except badlog, there is a segmentation fault.
In badlog3, gdb attaches to the running process and produces a backtrace.
We are currently not seeing crashes on macOS.

Daniel Bump

Vincent Delecroix

May 15, 2022, 3:14:31 AM
to sage-...@googlegroups.com
Probably related to https://github.com/sagemath/cypari2/issues/107 ?

On 15/05/2022 at 05:06, 'Travis Scrimshaw' via sage-devel wrote:
> Hi everyone,
> On ticket #30423 <https://trac.sagemath.org/ticket/30423>, Dan, Willie,

Travis Scrimshaw

May 16, 2022, 12:42:42 AM
to sage-devel
That sure seems like it. So what should we do about the ticket? Would there be opposition to merging this piece of code, since there doesn't seem to be a fix coming anytime soon for the likely underlying cypari bug?

Best,
Travis

Vincent Delecroix

May 16, 2022, 3:03:15 AM
to sage-...@googlegroups.com
I would say that code with parallel computations + cypari2 should not
be merged (as cypari2 does not support it).

If you need parallel + PARI then use the C library directly with the
appropriate thread locks.

If the problem comes from somewhere else, then it had better be sorted
out.

Best
Vincent

Dima Pasechnik

May 16, 2022, 6:36:04 AM
to sage-devel
On Mon, May 16, 2022 at 8:03 AM Vincent Delecroix
<20100.d...@gmail.com> wrote:
>
> I would say that code with parallel computations + cypari2 should not
> be merged (as cypari2 does not support it).

is it parallel multiprocessing, or parallel multithreading?

Vincent Delecroix

May 19, 2022, 8:42:45 AM
to sage-...@googlegroups.com


On 16/05/2022 at 12:35, Dima Pasechnik wrote:
> On Mon, May 16, 2022 at 8:03 AM Vincent Delecroix
> <20100.d...@gmail.com> wrote:
>>
>> I would say that code with parallel computations + cypari2 should not
>> be merged (as cypari2 does not support it).
>
> is it parallel multiprocessing, or parallel multithreading?

True: multiprocessing is not a problem with cypari2. Only
multithreading must be avoided.
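
To make the distinction concrete, here is a minimal sketch of the process-based pattern, not the code from the ticket. Everything in it is an assumption for illustration: `count_divisors` is a hypothetical pure-Python stand-in for a computation that would call into cypari2, `run_in_processes` is a made-up helper name, and the "fork" start method assumes a Unix-like system. The point it shows is only that each task runs in a separate worker process with its own memory, rather than in threads sharing one interpreter.

```python
import multiprocessing
import os


def count_divisors(n):
    """Hypothetical stand-in for a PARI-backed computation.

    In real code this body would call into cypari2; pure Python is used
    here so that the sketch runs without Sage or cypari2 installed.
    """
    divisors = sum(1 for d in range(1, n + 1) if n % d == 0)
    # Return the worker's PID as well, to show which process did the work.
    return os.getpid(), divisors


def run_in_processes(inputs, workers=2):
    """Map count_divisors over inputs using worker processes, not threads."""
    # Assumption: a Unix-like system where the "fork" start method exists.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        return pool.map(count_divisors, inputs)


if __name__ == "__main__":
    for pid, value in run_in_processes([12, 28, 36, 97]):
        print(f"worker {pid}: {value} divisors")
```

Swapping the pool for `threading.Thread` workers sharing one cypari2 instance would be the multithreaded pattern that the reply above says must be avoided.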