Hi,
Stefano Babic wrote on Fri, Apr 09, 2021 at 12:29:31PM +0200:
> > I did not name any thread.
> > You said the only thread started but I see four clone() system calls
> > before first and only fork in a strace, so there are five threads in
> > play (four if you exclude the main forking thread)
> >
>
> You are mixing two concepts. There are processes and threads. A fork creates
> a new processes, that is with a copy of the parent's data space and all file
> descriptors are duplicated in the child. The thread shares the same data
> space and the same file descriptors as the caller (the caller of
> pthread_create). So even if the kernel treats them at the same level because
> fork will call clone() and the kernel just sees processes, there is a big
> difference in how they are used.
>
> When SWUpdate starts (main process), it creates a *thread*, the network thread.
> After that, it creates "subprocesses" , they are processes. The number of
> processes depends on your configuration and how SWUpdate is called, but they
> usually do not create threads. Each interface has a separate process:
> Webserver, downloader, suricatta.
I'm not sure what makes you think I misunderstand what threads and
processes are?
> So the "five threads" make no sense, it is not clear enough. I am also not
> sure how musl handle this
I'm not sure why it doesn't make sense to you either. Threads are real,
identifiable entities, so they can be counted?
swupdate starts, creates the following four threads that I will identify
with the function they start:
- notifier_thread
- network_initializer
- network_thread
- progress_bar_thread
(so there are five in total with the "main" thread), then it proceeds to
fork its other helpers.
There are other threads in the code base, but these are all started
*before* the fork (or can be: network_thread is started from
network_initializer, so its timing can vary -- but the other three are
always there before).
As a matter of fact, on Linux, threads and processes are very
similar -- they're both tasks in the scheduler, and the main difference
is that threads happen to share the same memory mappings. glibc actually
uses the same system call (clone) for both on x86_64; I was surprised to
see fork() in strace the other day...
While we are doing trivia, a possible enhancement to make debugging
easier would be to call pthread_setname_np() in each of these functions
to clearly identify each thread.
Doing that would clearly show which thread is which in gdb or other
debugging tools; it would have saved me a few minutes on Friday, and I
would not have been as vague, since threads would really "have names".
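For illustration, here is a minimal sketch of the idea. It uses prctl(PR_SET_NAME), which is what pthread_setname_np(pthread_self(), name) boils down to for the calling thread on Linux; the helper name is made up:

```c
#include <sys/prctl.h>

/* Hypothetical helper: each thread entry point (network_thread,
 * progress_bar_thread, ...) would call this on itself first thing.
 * pthread_setname_np(pthread_self(), name) is equivalent for the
 * calling thread; prctl needs no feature-test macro. */
static int name_current_thread(const char *name)
{
    /* Linux limits thread names to 15 characters plus NUL */
    return prctl(PR_SET_NAME, name, 0, 0, 0);
}
```

The name then shows up in /proc/&lt;pid&gt;/task/&lt;tid&gt;/comm, in `top -H`, and in gdb's `info threads`.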
> > The problem is quite clear to me, what can I do to convince you it is as
> > I say?
>
> The problem (lock on a resource) is clear, the cause not, and even not if
> this is really an issue inside SWUpdate.
The problem is the child blocking.
The cause is a *parent thread* holding the lock when the fork happens in
parallel.
I'm not sure what other cause you could be looking for.
> Let's start : FILE are buffered and their handling is resolved by libc. It
> is duty of the library to resolve locking if the locks are not done by the
> single processes, as in this case.
>
> In other words: a program could also SEGV, exit or whatever - this should
> not lead to a deadlock or to a crash of the whole system.
Locks are per-process; I don't understand why you keep repeating that
other processes can impact the rest of the system.
I'm only referring to the state of the parent at the moment it is
forked, which is when its memory is copied (well, there is copy-on-write,
so lazily copied) by the fork.
> Which is the musl version you are using ? At least 1.2.2 ?
Yes.
> There is a list of bugs in MUSL changelog that can be the cause of problem,
> for example:
>
> "soft deadlock regression in stdio FILE locks with >2 threads contending"
>
> and several others.
>
> We should first detect if there is something in SWUpdate that is not
> correct, and if yes, then fix it, or the cause is in another component (in
> this case, MUSL).
You can argue that musl should register atfork handlers to clear the
locks when forks happen so children can't block like this, as glibc
does, but this is hardly a new problem, so I'm not sure they'd be
interested...
Well, it can't hurt to ask if you think that's the way forward.
> - FILE are handled by the libc and SWUpdate does not set any mutex / lock,
> and in any case on stdio / stdout / stderr.
> - if a task is started and fd (or FILE) are available, it is perfectly
> correct that it make use of them. Cleanup and resource manging is solved by
> the caller, that is inside fork() / pthread_create().
Yes, the child can use the fd or FILE if the parent left them in a
usable state.
The problem is that this is not the case here, due to concurrent access
with fork().
> > Looking at traces the accept() also happens after
> > the fork(), along with unlink/bind on /tmp/sockinstctrl in that order
> > (sorry, didn't think of taking timestamps for strace, but the order
> > should be correct)
> >
> > 1489 fork( <unfinished ...>
> > 1491 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0 <unfinished ...>
> > 1492 <... socket resumed>) = 5
> > 1492 unlink("/tmp/swupdateprog" <unfinished ...>
> > 1492 bind(5, {sa_family=AF_UNIX, sun_path="/tmp/swupdateprog"}, 110
> > <unfinished ...>
> > 1492 accept(5, <unfinished ...>
> > 1493 socket(AF_UNIX, SOCK_STREAM, 0) = 6
> > 1493 unlink("/tmp/sockinstctrl") = 0
> > 1493 bind(6, {sa_family=AF_UNIX, sun_path="/tmp/sockinstctrl"}, 110) = 0
> > 1493 accept(6, <unfinished ...>
> >
>
> Yes, it seems correct and it is expected.
That is very surprising to me... but I see below
>
> > (bonus: swupdateprog has the same "problem" if children are expected to
> > be able to connect to it -- I don't know if this is the case, but code
> > path can definitely print messages until that thread is in accept state)
> >
> >
> > So nothing prevents the children from trying to connect to swupdateprog
> > before the network thread has done its job, and in highly loaded
> > environment where the linux scheduler does weird things I'm sure I could
> > reproduce a child process that cannot connect once in a while -- for
> > example by making sure the filesystem is very slow by having a hundred
> > other process create/write/destroy files in /tmp in parallel.
>
> But this is not your case here. This is also the reason why the network
> thread is called *earlier*. The clients have a mechanism to retry a
> connection if first time fails (and if not can be added). This avoids to
> have a sync at startup that seems overkilling.
Ok, so yes, there is an ugly loop of 3 attempts with a sleep(1) -- that
explains why downloads sometimes felt sluggish during my tests; I was
wondering why the logs stalled from time to time when running swupdate
in a loop...
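The shape of that retry, as I understand it (a sketch, not the actual swupdate client code):

```c
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Roughly the client-side retry described above: each failed attempt
 * costs a full second, so a slow server start shows up as a 1-3
 * second stall on the client side. */
int connect_with_retry(const struct sockaddr_un *addr)
{
    for (int attempt = 0; attempt < 3; attempt++) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (connect(fd, (const struct sockaddr *)addr, sizeof *addr) == 0)
            return fd;
        close(fd);
        sleep(1);   /* blocks the caller, even after the last failure */
    }
    return -1;
}
```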
Well, now that I know there's a retry, there's no harm in simply trying
(I'm not suggesting this as a real fix) to move all the thread creation
after the forks, to illustrate that the problem no longer occurs?
That seems obvious to me, so I would normally not do it, but if it helps
convince you that the problem is other threads accessing resources
during fork, I can spend the time to do it.
> > (Curiosity also got me to look at the other two threads, one is the
> > progress_bar_thread which creates /tmp/swupdateprog and has the same
> > race: this one would either need synchronisation if required for forks
> > or could be moved after forks.
> > The last one is the binds on @"NotifyServer so would be the
> > notifier_thread started in notify_init which also has no synchronisation
> > so in theory could mean logs from children process can get lost)
>
> They should be queued, until UDS socket is full.
I'm not sure what UDS is, but for reference I did try:
- a process does bind on an address + recvfrom
- a process does sendto on same address
with various timings.
The result is:
- messages sent before the bind are just plain lost, as I would have
expected.
- messages sent after bind, before recvfrom are queued properly. If the
queue gets full the sendto blocks.
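Boiled down, the experiment looks like this (the socket path is made up for the demo):

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define DEMO_PATH "/tmp/uds_dgram_demo"   /* made-up path for the demo */

/* Returns the length of the datagram received after bind(), or a
 * negative code.  Shows both behaviours observed:
 *  - sendto() before the receiver binds fails outright: the message
 *    is simply lost, there is no socket to queue it on yet;
 *  - sendto() after bind() but before recvfrom() is queued and
 *    delivered once the receiver finally reads. */
int uds_dgram_demo(char *out, size_t outlen)
{
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, DEMO_PATH, sizeof addr.sun_path - 1);
    unlink(DEMO_PATH);

    int tx = socket(AF_UNIX, SOCK_DGRAM, 0);

    /* 1. nobody has bound the path yet: the send fails, message lost */
    if (sendto(tx, "early", 5, 0,
               (struct sockaddr *)&addr, sizeof addr) >= 0)
        return -1;

    int rx = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) < 0)
        return -2;

    /* 2. bound but not yet reading: the message is queued */
    if (sendto(tx, "queued", 6, 0,
               (struct sockaddr *)&addr, sizeof addr) < 0)
        return -3;

    ssize_t n = recvfrom(rx, out, outlen, 0, NULL, NULL);
    close(tx);
    close(rx);
    unlink(DEMO_PATH);
    return (int)n;
}
```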
While doing that I noticed you also bind a socket on the client (with a
different name); I tried that too, with no difference -- I don't think it
would be useful unless the server sent replies acknowledging logs,
perhaps that was done in the past? It could be removed now.
There is also a loop where the server tries different names if
the socket is not available.
With that in mind I see at least two problems:
- the child wins the race and tries to send messages before the parent
process has bound the socket, in which case the messages are lost until
the parent has done its job.
- the child starts with one socket name, then the parent does bind, the
socket name is not free for one reason or another, and the name changes.
Messages from the child go to the original socket name. Depending on
the reason, it could be something like:
one preexisting swupdate process; the new one forks
-> messages go to the older swupdate
at which point the old one exits
-> messages start getting lost.
I am sorry, but I really do think synchronisation between threads before
forking is useful.
> Let's go back: you find the problem (block on a resource), you do not find
> the cause (issue in SWUpdate or in MUSL), you get the symptoms, and you want
> try to work around the symptoms.
>
> Again: it should be not forbidden by a client to log / do something. This
> should not lead to a deadlock. If deadlock happens, it is caused by the one
> who handles the lock, and for stdio / stdout / stderr this is not SWUpdate.
Again, the client can use the file if and only if the lock was not held
by swupdate while the fork happened.
You can argue it's libc doing the actual locking, so it's libc's job to
clean up the lock on fork, and it's certainly possible to do that
cleanup, but for me it is a clear case of "there are certain functions
that can be called while fork happens and others not" from the article I
linked.
--
Dominique