On 5/31/24 00:38, Waldek Hebisch wrote:
> On Thu, May 30, 2024 at 07:43:30PM +0800, Qian Yun wrote:
>>
>>
>> On 5/29/24 22:51, Waldek Hebisch wrote:
>>>
>>> first build went fine, but a few later failed. Actually, it looks
>>> worse than previous version where probability of success looked
>>> higher.
>>>
>>> Yes, the problem is that some .tex files are truncated. In one run it
>>> was 'ug10.tex', in a few cases it was 'SEGBIND.tex'. In other cases
>>> I did not check the files but the LaTeX error message indicated truncation.
>>
>> Can you try my patch from yesterday and see if this one helps.
>
> I could, but it takes time. And such tweaking is risky: while it
> may solve the problem on my machine, we risk breaking others. I would
> prefer to get closer to the reasons so that we can be confident that
> the book build really works.
>
I'll explain more. The output is truncated because some subprocesses
of sman are killed before their socket buffers have been written out.
The direction of IO goes like this:
FRICASsys <=> (forked child) sman <=> session <=> spadclient <=> stdio
When FRICASsys quits, the SpadServer socket and pty close,
sman detects that and quits, causing SessionIOServer to be closed,
session detects that and quits, causing SessionServer to be closed,
spadclient detects that and quits.
(I ignored hypertex in this picture, it should also quit properly.)
Each process quits only after processing all of its IO, so the output
will not be truncated.
The core idea is to detect socket shutdown, from "man recv":
When a stream socket peer has performed an orderly shutdown,
the return value will be 0.
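
To make the idea concrete, here is a minimal sketch in Python (not the
actual C code of the patch). The names drain_until_eof and demo are
made up for illustration, and a socketpair stands in for the real
sman sockets. The reader keeps calling recv() until it gets an empty
result, so everything the peer wrote before closing is still delivered:

```python
import socket


def drain_until_eof(sock, bufsize=4096):
    """Read from sock until the peer performs an orderly shutdown.

    Hypothetical helper, for illustration only.
    """
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:
            # recv() returned 0 bytes: the peer closed its end and
            # all data sent before the close has been consumed.
            break
        chunks.append(data)
    return b"".join(chunks)


def demo():
    # A connected pair of stream sockets stands in for two processes.
    writer, reader = socket.socketpair()
    writer.sendall(b"line 1\nline 2\n")
    writer.close()  # orderly shutdown; buffered data is still delivered
    out = drain_until_eof(reader)
    reader.close()
    return out


if __name__ == "__main__":
    print(demo())
```

The point is that the reader decides when to stop based on end-of-file,
not on an external signal, so nothing in the buffer is lost.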
I removed my previous workaround and applied this patch,
and I no longer have this truncation issue.
If this patch works on your side as well, I can improve the
details of this patch and upstream it.
>
> Concerning book, I did a few trials with version in the trunk, and
> it worked fine on each trial. That is too little to be sure,
> but is strong indication that trouble is due to recent changes.
>
The trunk version uses "FRICASsys" in a pipe, which is fine.
My version uses "sman with FRICASsys" in a pipe, which triggers the problem,
but I think the problem existed in the past as well.
>> The "sman -paste" invocation of "hypertex" does not have this problem
>> because it uses only socket IO to FRICASsys, it's purely sequential.
>
> Yes, socket I/O is free from worst races. Pure use of stdio also
> should be good.
>
>> While the "usage of sman/FRICASsys in pipe" is more complex, it involves
>> both socket IO and stdio, so the race happens there.
>
> Well, clearly we should try to limit simultaneous use of socket IO
> and stdio.
>
A minor correction: socket IO and stdio are sequential, but the signals
that kill the processes are delivered in parallel, which causes the race.
As I explained, this patch makes each process exit normally, instead
of all of them being killed by sman simultaneously.
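
As a rough illustration of the cascading exit (again a Python sketch,
not the patch itself), the code below chains a few relay threads over
socketpairs. The relays stand in for sman, session and spadclient; the
function names, the thread-based setup and the one-directional data
flow are all simplifying assumptions. Each relay forwards data until it
sees end-of-file upstream and only then closes its downstream socket:

```python
import socket
import threading


def source(sock, payload):
    """Stand-in for FRICASsys: write everything, then close."""
    sock.sendall(payload)
    sock.close()


def relay(upstream, downstream, bufsize=4096):
    """Forward bytes until upstream reports EOF, then close downstream."""
    while True:
        data = upstream.recv(bufsize)
        if not data:
            break  # upstream closed in an orderly way
        downstream.sendall(data)
    upstream.close()
    downstream.close()  # propagates the EOF to the next stage


def run_chain(payload, stages=3):
    """Push payload through `stages` relays and return what arrives."""
    src, link = socket.socketpair()
    threads = [threading.Thread(target=source, args=(src, payload))]
    for _ in range(stages):
        out_w, out_r = socket.socketpair()
        threads.append(threading.Thread(target=relay, args=(link, out_w)))
        link = out_r
    for t in threads:
        t.start()
    received = []
    while True:
        data = link.recv(4096)
        if not data:
            break  # the last relay has closed its end
        received.append(data)
    for t in threads:
        t.join()
    link.close()
    return b"".join(received)


if __name__ == "__main__":
    data = b"x" * 200000
    print(run_chain(data) == data)
```

Because every stage exits only after draining its input, the last reader
receives the whole payload regardless of how the stages are scheduled.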
- Qian