Handling SIGTERM (see issue #229)


Basile Starynkevitch

Jun 30, 2018, 04:26:41
To: onion-dev, Basile Starynkevitch
Hello All,

I opened the issue #229 (see https://github.com/davidmoreno/onion/issues/229 for more)

I believe (but am not entirely sure) that libonion is designed to deal with exactly one instance of onion (as obtained from onion_new). Perhaps that should be documented.

I believe it would be useful to handle SIGTERM to gracefully stop an onion_listen loop.

In my bismon program, I want to handle both SIGTERM and SIGQUIT, and in both cases return gracefully from onion_listen.

My understanding is that libonion provides its own event loop (but I don't understand all the details exactly). I could start another thread to handle SIGTERM and SIGQUIT in some other event loop, but I believe that most programs should have only one event loop (calling some basic OS multiplexing primitive like poll).

For example, a GTK or Qt application on Linux has its own event loop (provided by gtk_main and QApplication::exec respectively). It would be nice if libonion could reuse that event loop; this is certainly possible, but I don't know how.

In general, I believe that threads are quite heavy, so I might not be satisfied with the idea of running libonion (that is, onion_listen) in its own thread.

Cheers.

--
Basile Starynkevitch   - http://starynkevitch.net/Basile
Bourg La Reine, France

Basile Starynkevitch

Jun 30, 2018, 10:37:50
To: onion-dev, bas...@starynkevitch.net


On Saturday, June 30, 2018 at 10:26:41 AM UTC+2, Basile Starynkevitch wrote:
Hello All,

I opened the issue #229 (see https://github.com/davidmoreno/onion/issues/229 for more)


I am reading the source code (if that matters, using commit 9b02a052993069 ...), and I believe that I have found a bug related to that issue.

There is an internal function called shutdown_server (at line 347 of src/onion/onion.c). That function is (sometimes) installed as a SIGINT and SIGTERM signal handler in onion_new, at line 248. BTW, it should be installed using sigaction(2) instead of signal(2), whose man page explicitly states:
       The effects of signal() in a multithreaded process are unspecified.

However, that signal handler is not async-signal-safe (as every signal handler should be; see the signal-safety(7) man page on Linux): shutdown_server calls onion_listen_stop at line 350, and onion_listen_stop calls (indirectly, through onion_low_pthread_join, at line 573) the POSIX pthread_join(3) function. pthread_join is not listed in signal-safety(7) as an async-signal-safe function, so we should assume that it is not async-signal-safe.

Therefore I believe that in unlikely but possible circumstances we have some undefined behavior in libonion, which might crash in rare cases. BTW, such a crash is not easily reproducible; it might only happen when the kernel delivers a signal inside a tiny range of machine instructions. These kinds of heisenbugs are really difficult to correct.

So, is my analysis correct?

My suggestion would then be to follow Qt's practices on Unix signals. Create a pipe(7) to self at initialization time in onion_new. Install a SIGTERM (or SIGINT, or SIGQUIT) handler which just write(2)-s one or a few bytes to that pipe (e.g. it could write a byte containing the signal number: for those signals, that number fits in a byte). Then configure the polling machinery (with which I am not familiar) so that it polls that pipe's read descriptor and calls the appropriate routines when there is something on it.

BTW, it would be nice if we could handle the SIGTERM, SIGQUIT and SIGINT signals slightly differently.
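
For concreteness, here is a minimal, self-contained sketch of that pipe-to-self approach in plain POSIX C (the handler writes the signal number, so the three signals can be told apart; this is not libonion code, and the names sig_pipe_fds and signal_to_pipe are made up for illustration):

#include <poll.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write end is used by the signal handler, read end is polled by the event loop. */
static int sig_pipe_fds[2];

/* Async-signal-safe handler: it only calls write(2), which is on the
   signal-safety(7) list. The byte written is the signal number, so the
   event loop can tell SIGTERM, SIGINT and SIGQUIT apart. */
static void signal_to_pipe(int signum) {
  unsigned char b = (unsigned char) signum;
  ssize_t r = write(sig_pipe_fds[1], &b, 1);
  (void) r;  /* nothing sensible to do on error inside a handler */
}

int main(void) {
  if (pipe(sig_pipe_fds) < 0)
    return 1;

  struct sigaction sa;
  memset(&sa, 0, sizeof sa);
  sa.sa_handler = signal_to_pipe;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;        /* installed with sigaction(2), not signal(2) */
  sigaction(SIGTERM, &sa, NULL);
  sigaction(SIGINT, &sa, NULL);
  sigaction(SIGQUIT, &sa, NULL);

  /* Event loop: in libonion this polling would be done by the epoller;
     here a plain poll(2) on the read end stands in for it. */
  for (;;) {
    struct pollfd pfd = { .fd = sig_pipe_fds[0], .events = POLLIN };
    if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
      unsigned char b;
      if (read(sig_pipe_fds[0], &b, 1) == 1) {
        printf("got signal %d, shutting down gracefully\n", (int) b);
        break;                     /* here one would call onion_listen_stop */
      }
    }
  }
  return 0;
}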

Any guidance on how to add such polling inside libonion? I am not familiar enough with epoll(7).

Regards.

--
Bourg La Reine, France

David Moreno Montero

Jul 3, 2018, 07:33:27
To: Basile Starynkevitch, onion-dev
Hi,

you are right.

I have to add that I myself "switched" programming languages and have embraced the Zen of Erlang: let it crash on heisenbugs. They should be solved, but the best course of action is to have some supervisor so that if the program fails, because of this bug or any other unknown one, it is restarted and, if possible, the user doesn't even notice. In the case of HTTP this can work because there is another server in front (nginx as a reverse proxy, for example) that takes care of retries on connection errors, and a proper systemd / supervisord setup that restarts the onion-powered server if it fails.

For your use case I don't know whether such a drastic approach is possible, but if there is some way to engineer this strategy, I highly recommend it. Not because of onion, but in general.

But being more pragmatic, I like the strategy you suggest; a similar one is actually used for the timers on the epoller (using timerfd_create). Again, it should just be a matter of adding the proper slot to manage that.

Can somebody volunteer for the development?

Regards,
David.





Zachary Grafton

Jul 3, 2018, 07:50:50
To: David Moreno Montero, Basile Starynkevitch, onion-dev
I like that strategy as well. I think it might be a bit more challenging to implement in the case of multiple onion instances. I think the correct approach would be to shut down (or call a specified handler on) every instance of onion when the pipe receives the bytes written out by the signal handler. Any ideas on how to share the pipe amongst multiple onion instances?

Thanks,
Zack


Basile Starynkevitch

Jul 3, 2018, 08:11:56
To: Zachary Grafton, David Moreno Montero, onion-dev
On 2018-07-03 13:50, Zachary Grafton wrote:
> I like that strategy as well. I think it might be a bit more
> challenging to implement in the case of multiple onion instances. I
> think the correct approach would be to shut down(or call a specified
> handler) every instance of onion when the pipe receives the bytes
> written out by the signal handler. Any ideas on how to share the pipe
> amongst multiple onion instances?


Is that really needed? I would instead think that we could simply
document that there should be only one onion instance.
(exactly like Qt requires only one QApplication per process, or GTK
handles only one X11 display)

Is there an actual need to handle several onion instances (result of
onion_new)? Who is doing that, for what reasons, exactly?

My current understanding was that onion_new is in practice always called
once. If we document that, would that not be enough?

Otherwise, please explain how several onion_new-s would be handled. Why? What is
the actual use case?


Explicitly requiring onion_new to be called exactly once seems pragmatic
to me. (My opinion is that more complex software won't use onion in that
weird case.)

Cheers

--
Basile Starynkevitch http://starynkevitch.net/Basile
France (opinions are only mine)

David Moreno Montero

Jul 3, 2018, 09:10:34
To: Basile Starynkevitch, Zachary Grafton, onion-dev
If there are several onion_new-s, the developer should write a custom SIGTERM handler that calls onion_listen_stop for each onion*. To help a bit, the onion_listen_stop function should be made signal friendly, so that it can just be called from a signal handler.

The use case... I don't really know, but the onion architecture allows it. Maybe serving one webapp on one port and a completely different one on another, without relying on the Host header?
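
Coming back to the custom handler, a rough sketch could look like this (purely hypothetical: it assumes onion_listen_stop is made signal friendly as described above, which it is not today, and the two-server setup is invented just for illustration):

/* Hypothetical: a custom process-wide SIGTERM handler for two independent
 * onion instances. Safe ONLY if onion_listen_stop becomes async-signal-safe
 * (e.g. by just writing to the instance's pipe/eventfd), as proposed here. */
#include <onion/onion.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static onion *servers[2];                 /* filled once, before any onion_listen */

static void stop_all(int signum) {
  (void) signum;
  for (size_t i = 0; i < sizeof servers / sizeof servers[0]; i++)
    if (servers[i])
      onion_listen_stop(servers[i]);      /* must be signal friendly for this */
}

int main(void) {
  servers[0] = onion_new(O_NO_SIGTERM);   /* disable the built-in handler */
  servers[1] = onion_new(O_NO_SIGTERM);
  onion_set_port(servers[0], "8080");
  onion_set_port(servers[1], "8081");

  struct sigaction sa;
  memset(&sa, 0, sizeof sa);
  sa.sa_handler = stop_all;
  sigaction(SIGTERM, &sa, NULL);

  /* For brevity only the first instance is listened on here; in a real
   * program each onion_listen would run in its own thread. */
  onion_listen(servers[0]);

  onion_free(servers[0]);
  onion_free(servers[1]);
  return 0;
}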

David Moreno Montero

Jul 3, 2018, 10:08:00
To: Basile Starynkevitch, onion-dev
Yes to the pipe trick (eventfd might be better, but it is the same idea). But no to limiting it to one onion_new. It's not difficult to maintain, and it keeps the architecture cleaner (singleton-free) and easier to reason about later up the stack regarding ownership and access to data, which is especially useful for multithreading.

When I said to make onion_listen_stop signal friendly, I meant that onion_listen_stop should only write the event to the pipe/eventfd, and then a poll slot handler does everything necessary to stop it (the current onion_listen_stop code). It will not ensure an immediate stop, but an eventual one.

To make it work it needs:

1. A new field in onion_t (types_internal.h) with the handler fd (eventfd is a good name).
2. In onion_new, register the stop handler (after creating the poller).
3. Move the current onion_listen_stop code to that handler.
4. Make onion_listen_stop just write something to the handler fd.
5. Test a lot for corner cases.

This way the current SIGTERM/SIGINT handler will keep working, and if the user has several onion*-s, or specific SIG* handlers, they can just call onion_listen_stop.
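
A rough sketch of what steps 1 to 4 could look like (every name here, onion_like, stop_event_fd, onion_stop_slot_handler, is made up for illustration; nothing like this exists in the tree yet, and error handling is omitted):

#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

struct onion_like {                 /* stand-in for the real onion_t */
  int stop_event_fd;                /* step 1: new field in onion_t */
  /* ... poller, listen points, flags, etc. ... */
};

/* Step 3: the code that really stops the server (the current body of
 * onion_listen_stop) moves into this poller slot callback. */
int onion_stop_slot_handler(void *data) {
  struct onion_like *o = data;
  uint64_t v;
  ssize_t r = read(o->stop_event_fd, &v, sizeof v);   /* drain the eventfd */
  (void) r;
  /* ... join listen threads, stop the poller, close listen points ... */
  return 0;
}

/* Step 4: the public onion_listen_stop becomes a one-line, signal-friendly
 * write, so it may safely be called from a signal handler. */
void onion_listen_stop_sketch(struct onion_like *o) {
  uint64_t one = 1;
  ssize_t r = write(o->stop_event_fd, &one, sizeof one);
  (void) r;
}

/* Step 2 would happen in onion_new, after the poller is created:
 *   o->stop_event_fd = eventfd(0, EFD_CLOEXEC);
 *   then register onion_stop_slot_handler on that fd with the poller. */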

Regards,
David.


On 3 July 2018 at 15:47, Basile Starynkevitch <bas...@starynkevitch.net> wrote:
On 2018-07-03 15:10, David Moreno Montero wrote:
If there are several onion_new the developer should write a custom
sigterm handler that does cannot each onion_stop_listen for each
onion*. To help a bit, onion_stop_listen function should be signal
friendly, so that it can just be called from a signal handler.


But that makes a huge difference.

My feeling is to use the pipe-to-self trick (explained in http://doc.qt.io/qt-5/unix-signals.html for Qt, but we can adapt that to Onion).

Then the signal handler just writes into a pipe (this is OK and async-signal-safe) and the Onion event handler handles that pipe appropriately. So onion_listen_stop is not called from the signal handler (but from the event loop), which avoids the async-signal-safety problem.

If we have several event loops (that is several onion_new) that scheme won't work reliably.

I believe that we should not care about several onion_new, and document that only one instance of onion_new is expected.

This is the practice in GTK (only one X11 display handled) and in Qt (only one QApplication instance).

But we need to decide if we can afford requiring only one instance of onion_new.

My vote is of course yes!


Cheers
--
Basile Starynkevitch         http://starynkevitch.net/Basile
France                       (opinions are only mine)



--
David Moreno Montero 

Basile Starynkevitch

Jul 3, 2018, 10:12:38
To: onio...@coralbits.com


On 2018-07-03 15:20, Zachary Grafton wrote:
> I'm not sure if it's really needed, but I could see a use case. Say
> for instance, you have a web service listening on a public IP address,
> but want to implement an admin interface with a different set of
> handlers listening on an internal IP address.


My feeling is that such (very rare) cases are outside the scope of
libonion. My belief is that people requiring that feature
should not use libonion (or should accept having several processes and do
their own inter-process communication).

Likewise, you cannot easily use GTK or Qt (in a single process) in the
rare cases where you need several X11 displays. Because they are not
designed for that weird case.

But we need to decide if we do care about several event loops (running
in several threads). I don't need that at all.

So it is a philosophical issue. I won't decide alone, but whatever we
decide should be documented.

My wish is to add somewhere in the documentation a sentence like:

libonion expects the user application to have a single
instance of onion_new in its process. It has a single master
event loop.

What do you think?

Basile Starynkevitch

Jul 3, 2018, 10:18:13
To: David Moreno Montero, onion-dev

How would that work with several unrelated onion_new instances running
in different threads (started by the same application) of the same
process?

Signal handling is global to the process, not to individual threads.

We cannot install a signal handler only in one thread.

Otherwise, we could use the Linux-specific signalfd(2) (and eventfd(2)
is also Linux-specific),
but there is no equivalent on MacOSX and several other POSIX systems.

Do we care (like Qt does) about portability to other Unixes? In
particular MacOSX (which I don't know and don't have).

David Moreno Montero

Jul 3, 2018, 10:24:30
To: Basile Starynkevitch, onion-dev
I thought eventfd was POSIX... so pipe it is.

IIRC the signal will be delivered to only one thread, but even if it's delivered to all of them, there should be no problem, or at least we can surely think of a workaround for multiple dispatch of the event. I would prefer to stick to signals, not signalfd.

About how this helps:

1. The simple, most common case of one onion_new: use the provided handlers, be happy.
2. The complex case, more than one: write your custom handler and call onion_listen_stop one by one on each onion*.

There is no big portability promise as in Qt, but it's nice to be a good POSIX citizen.

Regards,
David.


Basile Starynkevitch

Jul 3, 2018, 12:16:50
To: David Moreno Montero, onion-dev


On 07/03/2018 04:24 PM, David Moreno Montero wrote:
> I thought eventfd was posix... so pipe it is.
>
> IIRC the signal will be delivered only to one thread, but even if its
> delivered to all of them, there should be no problem, or for sure we
> can think a workaround around multiple dispatch of the event. I would
> prefer to stick to signal, not signalfd.
>
> About how this helps, is:
>
> 1. Simple most common case one onion_new: use the provided handlers,
> be happy.

Do we want to detect that case automatically? I hope not. The devil
is in the details.

The point is that pthread functions (notably mutex-related ones) are not
async-signal-safe. In other words, it is widely known that Unix signals
are not thread friendly and were invented before threads. So there
is no way to circumvent that (think of ugly improbable cases like a
SIGTERM happening during some onion_new in another thread...).

My feeling is that we should document that libonion is not a universally
usable library; in particular it is not really C10K friendly. See
https://en.wikipedia.org/wiki/C10k_problem for more. To put it another
way, if competing HTTP implementations (such as lighttpd or apache or
nginx) are significantly larger than libonion (in terms of lines of
source code), there is a reason for that.

I would really prefer that we state explicitly that libonion expects one
onion_new. And if we don't, we should at the very least state that
SIGTERM support expects only one instance of onion_new and won't
reliably work otherwise. Or at least that the set of onion_new-s is known
early in the program, before SIGTERM could be sent (and we might need
some stronger hypothesis, I don't exactly know which one).

> 2. Complex case, more than one: Do your custom handler and call one by
> one the onion_listen_stop on each onion*.
>
> There is no big portability promise as in Qt, but its nice to be a
> good POSIX citizen.

I could easily use #if linux to special-case code using eventfd &
signalfd. But I am unable to configure cmake to be nice with that. I
don't know cmake at all, and don't want to dive into it (I dislike it).

Cheers.

--
Basile STARYNKEVITCH == http://starynkevitch.net/Basile
opinions are mine only - les opinions sont seulement miennes
Bourg La Reine, France

Zachary Grafton

Jul 3, 2018, 13:18:35
To: Basile Starynkevitch, David Moreno Montero, onion-dev
I would imagine this might be a path to resolving the issue:

1. Remove signal handling code from the onion object
2. Create a signal handling object, where only 1 instance is expected
3. Create a method for adding onion objects to the signal handling object

It might add some additional code to a small application that wants the default signal handling, but it would be easier to manage in larger applications that might not want signal handling functionality at all. I imagine the process working like so:

1. A signal is fired; the event handling object writes to an internal pipe and returns from the signal handler.
2. The event handling object receives its own message, then signals each onion instance registered with it to shut down.
3. Each onion object shuts down gracefully and releases its resources.

Any thoughts?


Basile Starynkevitch

Jul 3, 2018, 13:34:52
To: Zachary Grafton, David Moreno Montero, onion-dev


On 07/03/2018 07:18 PM, Zachary Grafton wrote:
> I would imagine this might be a path to resolving the issue:
>
> 1. Remove signal handling code from the onion object
> 2. Create a signal handling object, where only 1 instance is expected

We don't need to create that. It can be a static (or global, probably
with visibility hidden) variable.

My preference is still to consider onion_new as giving only one instance
(and to document that fact). This is still the very common case.
(Otherwise, please show me some free software project which uses more
than one onion_new, or name a proprietary software for which it is sure
there is more than one onion_new, and explain why it is so.)

If we still want to handle the case of several onion_new (and I really
don't care about that case which I believe is improbable) we could just
have some way to disable SIGTERM (the current API has that with
O_NO_SIGTERM, we could keep that but document that it is needed if we
have more than one instance of onion_new) and document some way to write
a SIGTERM handler.

So my current thinking is to implement a SIGTERM for the common case
where there is one onion_new instance, document that this should usually
be the case (and for those rare cases where that is false, require them
to use O_NO_SIGTERM and explain how SIGTERM could be handled wisely, in
some application specific way).

I strongly believe that having several instances of onion_new does not
happen in practice. Please raise your voice if you have a concrete case
but explain what it is (not simply a "we should keep the remote
possibility of several onion_new-s instances")

Sorry folks, I think that those few who are using onion_new several
times would need to handle SIGTERM themselves in some application-specific
way. We just need to give some instructions on how to do that
wisely, not implement it ourselves. These instructions could be as
simple as: collect all your instances of onion_new, but be aware that
onion_listen_stop is not async-signal-safe and cannot be called from a
signal handler.

Cheers

Basile Starynkevitch

Jul 3, 2018, 14:28:24
To: Zachary Grafton, David Moreno Montero, onion-dev



I have started to work on that issue. See my (new) onion branch. Notice the improved README.md on https://github.com/bstarynk/onion/blob/master/README.md and its new Caveats section.

I don't really care about C10K, or about multiple instances of onion_new. In the future, those who really care will submit significant patches on these issues (but my current belief is that they should not use onion in such cases). I am not working on these right now. However, I will work to provide more robust handling of SIGTERM for the very common case of a single instance of onion_new.

Cheers.


--
Basile STARYNKEVITCH   == http://starynkevitch.net/Basile
opinions are mine only - les opinions sont seulement miennes
Bourg La Reine, France

David Moreno Montero

Jul 3, 2018, 15:35:46
To: Basile Starynkevitch, Zachary Grafton, onion-dev
Hi,

I think we all share the same basic ideas, so it's perfect if you can work on it, Basile. Thanks!

I did read the Caveats section, and it's OK, but I'm only wondering why you say onion is not C10K ready. Not that that was an explicit goal, but load tests are quite good: 138k req/s using keep-alive and 20 concurrent connections, 20k without keep-alive (on an above-average i7).

Regards,
David.

Basile Starynkevitch

Jul 3, 2018, 17:08:56
To: David Moreno Montero, onion-dev



On 07/03/2018 09:35 PM, David Moreno Montero wrote:
Hi,

I think we all share the same basic ideas, so its perfect if you can work on it Basile. Thanks!

I started to work. Don't expect miracles. I hope to finish in a week (and I cannot afford a lot more work).


I did read the caveats section, and it's ok, but I'm only wondering why you say onion is not c10k ready, not that that was a explicit goal, but load tests are quite good: 138k req/s using keep alive and 20 concurrent connections, 20k no keep alive (on an above average i7).

Perhaps (because of the performance increase in current hardware) what was called the C10K problem last century should today be called the C100K problem. But I think (not sure of that) that "the C10K problem" is a widely used locution (and that is why it has its Wikipedia page), so the name has stuck (again, I could be wrong: suggest a better locution in that case; remember that we both are not native English speakers, so be sure to give some external references). I am not sure people would understand "C100K problem". I could be wrong (however, even today there is no C100K problem wikipage; I just checked at the start of July 2018). What do you think? My point is that above some threshold of connections, web servers should be designed very carefully to deal with that. My feeling is that libonion cares more about simplicity (and robustness) than about absolute performance. My understanding of the C10K problem is that it deals with TCP connections (not HTTP requests/second). So what matters is the connect(2) system call done in browsers (or other HTTP clients). On the server side inside libonion, you care about accept(2). Are you really doing that accept at a frequency of 100KHz (in a single thread)?

In my particular use case (bismon), I don't have any C100K problem (or C10K one). The program I am working on (bismon) is expected to have at most a dozen human users simultaneously (and today I am very far from that: I still have no bismon users except myself). Humans cannot really do 10K HTTP requests per second each. Basically, I don't care that much about performance. But I do care a lot about robustness (so a lack of data races, and having few bugs). If onion crashes when my bismon program is using it, it could happen that a team of ten users lose several hours of work, and they won't be happy. (bismon would probably be a single-page application, so lots of AJAX.) Actually, I hope to work to make that delay smaller, but it still would be several dozens of minutes (I probably cannot afford persisting an entire bismon heap every second; since my persistence dump would write many dozens of megabytes, this might happen in 2019). As an analogy (assuming you use emacs), you won't dump emacs every second either. (And bismon has no journal mechanism; its persistency machinery is in principle similar to SBCL's save-lisp-and-die, or to emacs dumps, and the format used for persistence is textual, not binary, to be git friendly. Look into my store2.bmon file for example.)

A request is not supposed to be the same as an active connection. In other words, most HTTP clients and servers wisely use the Connection: request header, usually with Connection: keep-alive (and that is why the Content-Length: field is important in practice), and try to maintain an HTTP persistent connection. I have to admit that my knowledge of these persistent connections is very theoretical (I don't really know what the figures are in practice; my understanding is that most recent browsers use persistent connections extensively).

BTW, I don't know much about HTTP. If you have some guidelines about it (such as concrete figures on how many HTTP connections a modern web browser like a recent Firefox or Chrome makes), I would be very happy to know them. Any clues about web development are welcome. BTW, in my eyes, Ocsigen is very inspirational (as an approach to web development), much more than e.g. old PHP.

BTW, I would be delighted to have some concrete details about your usage of libonion, and also about the usage by other readers of onion-dev (e.g. the actual use case, hardware, number of requests per second, number of connections per second, data volume, common HTTP requests, ...). If you can share them (perhaps on some blog) I would be extremely happy! (even if your applications using libonion are not free software).

Regards.

Basile Starynkevitch

Jul 4, 2018, 13:34:25
To: David Moreno Montero, onion-dev


On 07/03/2018 11:08 PM, Basile Starynkevitch wrote:

BTW, I don't know much about HTTP. If you have some guidelines about it (such as concrete figures about how many HTTP connections a modern web browser like a recent firefox or chrome is making), I would be very happy to know them. Any clues about web development is welcome. BTW, in my eyes, ocsigen is very inspirational (as an approach to web development), much more than e.g. old PHP.

BTW I would be delighted to have some concrete things about your usage and also usage by other readers of onion-dev of libonion. (e.g. the actual use case, hardware, number of requests per second, number of connections per second, data volume, common HTTP requests, ...). If you can share them (perhaps on some blog) I would be extremely happy! (even if your applications using libonion are not free software).

That wish above still holds. If people are allowed to explain how they use libonion on this onion-dev list, I would be delighted to know more.


Back to the subject of SIGTERM. I really believe it is a special case (even if it could be a very common one).

First, SIGTERM handling only makes sense with a single onion object (that is, a single call to onion_new happening at initialization time). I really believe there is no way to handle SIGTERM safely and reliably in an arbitrary case of several onion instances created and managed in independent threads. BTW, so far, nobody has given any use case for several onion instances. So that case is really uncommon (and honestly, I don't think that libonion has been well tested in that general case). BTW, for the particular case of a single onion, I suggest also having some

That makes me think that we might want an incompatible API change. Instead of having an O_NO_SIGTERM flag to onion_new (BTW, negative flags are harder to understand than positive ones), I recommend having some O_WITH_SIGTERM flag (and also some O_WITH_SIGINT and O_WITH_SIGQUIT). So people would explicitly pass the O_WITH_SIGTERM flag only when they have a single instance of onion_new. And then SIGTERM would stop the listen loop, and perhaps call through some global variable, provided by libonion, declared as void (*onion_sigterm_processor) (void); if that global function pointer is not null, it would be called during SIGTERM processing. I feel it does not need any additional data argument.

Lastly, I need to use onion_poller_slot_new, but I feel that its function argument (the second one, the function pointer f) is not well documented. It seems to be (for read polling) some kind of reading function, but I am surprised it does not (also) get the file descriptor to read (so I would expect that argument to be declared int (*f) (int fd, void*clientdata), not int (*f) (void*)). Why is it so? Also, I am guessing that it returns the number of bytes read on success, and -1 on read failure. Is my understanding correct?

Comments are welcome. In particular, confirm my understanding of onion_poller_slot_new please.

Cheers

--
Basile STARYNKEVITCH   == http://starynkevitch.net/Basile
opinions are mine only - les opinions sont seulement miennes
Bourg La Reine, France (email: <bas...@starynkevitch.net>)

Basile Starynkevitch

Jul 4, 2018, 13:39:45
To: David Moreno Montero, onion-dev



On 07/04/2018 07:34 PM, Basile Starynkevitch wrote:

First, SIGTERM handling makes only sense with one single onion object (that is a single call to onion_new happening at initialization time). I really believe there is no way to handle SIGTERM safely and reliably in an arbitrary case of several onion instances created and managed in independent threads. BTW, so far, nobody gave any use case of several onion instances. So that case is really uncommon (and honestly, I don't think that libonion has been well tested in that general case). BTW, for the particular case of a single onion, I suggest also having some

(sorry, I missed some words; I meant:)


BTW, for the particular case of a single onion, I suggest also having some
  
/**
 * @short Gives the only single instance of onion. 
 * Usable only if you are sure to call `onion_new` once.  Don't use that to check that you do use `onion_new` once.
 * \b{undefined} when you call `onion_new` several times.
 * @ingroup onion
 */
onion* onion_single_instance (void);


function which (when there is only one instance of onion; that is the very common case) returns it. Otherwise (with several onion-s), the behavior of onion_single_instance is undefined.
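
A possible implementation sketch, with a hidden static variable set from inside onion_new (all names here are illustrative only; none of this exists in libonion today):

/* Hypothetical sketch for onion.c: remember the single onion instance. */
#include <stddef.h>

typedef struct onion onion;            /* opaque, as in the public headers */

static onion *the_single_onion = NULL; /* file scope, hidden from users */
static int onion_new_count = 0;

/* To be called from inside onion_new, once the instance is built. */
void record_new_onion(onion *o) {
  onion_new_count++;
  /* With more than one instance the notion of "the single instance" is
   * meaningless, so forget it; callers were warned the behavior is then
   * undefined. */
  the_single_onion = (onion_new_count == 1) ? o : NULL;
}

onion *onion_single_instance(void) {
  return the_single_onion;
}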

Basile Starynkevitch

Jul 4, 2018, 14:09:49
To: David Moreno Montero, onion-dev


Hello All,

I just committed on my branch some (incomplete) code regarding SIGTERM (and the removal of HAVE_PTHREADS). My branch is at https://github.com/bstarynk/onion; more precisely, I am speaking of my commit b92169515eabb8f9e,
which is at https://github.com/bstarynk/onion/commit/b92169515eabb8f9e894a9abc8756a59fd9eaaab
I have a linking issue, probably related to cmake (which I don't understand, which I hate, and was never able to learn; for me it is an evil black box). It looks like the removal of HAVE_PTHREADS has somehow changed its behavior (but as far as I understand, I did not touch any cmake-related stuff).

Here is the link failure:

 % make    
[ 14%] Built target onion_static
[ 28%] Built target onion
[ 29%] Built target onion_extras
[ 31%] Built target onioncpp_static
[ 34%] Built target onioncpp
[ 38%] Built target opack
[ 38%] Built target crl
[ 38%] Linking C executable otemplate
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/log.c.o: in function `onion_log_stderr':
log.c:(.text+0x7f): undefined reference to `pthread_once'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/dict.c.o: in function `onion_dict_new':
dict.c:(.text+0x13a): undefined reference to `pthread_rwlock_init'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/dict.c.o: in function `onion_dict_free':
dict.c:(.text+0x1ed): undefined reference to `pthread_rwlock_destroy'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/dict.c.o: in function `onion_dict_lock_read':
dict.c:(.text+0xc35): undefined reference to `pthread_rwlock_rdlock'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/dict.c.o: in function `onion_dict_lock_write':
dict.c:(.text+0xc45): undefined reference to `pthread_rwlock_wrlock'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/dict.c.o: in function `onion_dict_unlock':
dict.c:(.text+0xc55): undefined reference to `pthread_rwlock_unlock'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/low.c.o:(.data.rel+0x0): undefined reference to `pthread_sigmask'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/low.c.o:(.data.rel+0x10): undefined reference to `pthread_detach'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/low.c.o:(.data.rel+0x18): undefined reference to `pthread_cancel'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/low.c.o:(.data.rel+0x20): undefined reference to `pthread_join'
/usr/bin/ld: CMakeFiles/otemplate.dir/__/__/src/onion/low.c.o:(.data.rel+0x28): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
make[2]: *** [tools/otemplate/CMakeFiles/otemplate.dir/build.make:279: tools/otemplate/otemplate] Error 1
make[1]: *** [CMakeFiles/Makefile2:559: tools/otemplate/CMakeFiles/otemplate.dir/all] Error 2
make: *** [Makefile:152: all] Error 2

Can someone reproduce that? It looks like -lpthread is no longer linked. Why? How can that be fixed?

Cheers.

David Moreno Montero

Jul 4, 2018, 17:20:34
To: Basile Starynkevitch, onion-dev
About the cmake error, I think you can apply the attached patch.

Basically the main problem is that pthread is now required, even for otemplate, which did not require it before.

About the code, looks good, but I would change some things.

1. The signal fds should be stored in the onion struct, and the poller handler always installed. This would allow stopping it even from other, non-signal scenarios, using this pipe. It should be closed in onion_free as well.
2. So install_sigterm_processing_onion (better: onion_install_sigterm) would only add the signals, and not deal with anything pipe-related.
3. The function comments should follow the convention of the other functions: /// if only one line, "/** @short ..." and so on if several.
4. If the functions are not to be exported, mark them "static" and they will not pollute the global scope, which in C is quite easy to pollute. And they may be inlined and optimized (byte_to_signum).
5. Function names I would prefer (but you are the boss):
  signal_poll_pipe_routine -> onion_manage_event
  install_sigterm_processing_onion -> onion_sigterm_install
6. In another email you asked about the private data for epoll handlers; you can pass onion itself as the data for this poller, so that it can grab the pipe read fd from it and do whatever is necessary to stop onion.

Thanks for all the effort.

Regards,
David
0001-Fix-pthread-is-now-always-required.patch

Basile Starynkevitch

Jul 5, 2018, 00:01:32
To: David Moreno Montero, onion-dev



On 07/04/2018 11:20 PM, David Moreno Montero wrote:
About the cmake error, I think you can apply the attached patch.
Thanks


Basically the main problem is that pthread is now required, even for otempleate, which did not require it before.

About the code, looks good, but I would change some things.

1. The signal fds, should be stored at the onion struct, and the poller handler always installed. This would allow to stop it even from other non signal scenarios, using these pipe. It should be closed at onion_free as well.
But SIGTERM won't be handled reliably in the multi-onion case...



I still don't understand how onion_poller_slot_new works; to repeat my question:


Lastly, I need to use onion_poller_slot_new, but I feel that its function argument (the second one, the function pointer f) is not well documented. It seems to be (for read polling) some kind of reading function, but I am surprised it does not (also) get the file descriptor to read (so I would expect that argument to be declared int (*f) (int fd, void*clientdata), not int (*f) (void*)). Why is it so? Also, I am guessing that it returns the number of bytes read on success, and -1 on read failure. Is my understanding correct?

I am blocked on this issue. Please explain.

Basile Starynkevitch

Jul 5, 2018, 00:46:09
To: David Moreno Montero, onion-dev
On 07/04/2018 11:20 PM, David Moreno Montero wrote:

1. The signal fds, should be stored at the onion struct, and the poller handler always installed. This would allow to stop it even from other non signal scenarios, using these pipe. It should be closed at onion_free as well.
2. So the install_sigterm_processing_onion (better onion_install_sigterm) would only add the signals, not deal with any pipe related.

Are you aware that it won't work as nicely as you want it to? In particular, it won't work with several onion-s (e.g. created and used in independent application threads). Signal handling is a global property of the entire process (not of threads).

If SIGTERM is sent precisely during the call to, e.g., the second onion_new, disaster happens, because whatever I code in onion_new to manage a hypothetical set (or list) of all onion-s is not async-signal-safe and cannot be made so (e.g. because pthread_mutex_lock is not async-signal-safe).

So I can add the pipes inside the onion (and I was tempted to do that at first), but that doesn't change the fact that only one onion can reliably exist when handling SIGTERM. Remember that onion functions are not async-signal-safe, and cannot be transformed to be so (e.g. because you are forbidden to call malloc, even very indirectly, inside signal handlers). The set of async-signal-safe functions is tiny, and is completely documented in signal-safety(7). Any other libc function is not async-signal-safe and should not be assumed to be so.

So making these pipes a field, e.g. signal_write_fd, inside onion is the same as making them static variables. The signal handler would access last_onion->signal_write_fd instead of my static signal_pipe_wr, but that doesn't make much of a difference.

But I will do it as you want. However, be aware that it does not make any real difference.

Cheers.

Basile Starynkevitch

Jul 5, 2018, 01:08:58
To: David Moreno Montero, onion-dev



On 07/04/2018 11:20 PM, David Moreno Montero wrote:
About the cmake error, I think you can apply the attached patch.

I am not able to apply that patch cleanly. See the attached CMakeLists.txt.rej file.

Could you please give me a full CMakeLists.txt file for my https://github.com/bstarynk/onion

Thanks!
CMakeLists.txt.rej

David Moreno Montero

Jul 5, 2018, 04:09:46
To: Basile Starynkevitch, onion-dev
Hi,

the default handler will not work in the multi-onion case, but a custom handler may know all the onion* handles and send the required signals through the pipe to all the onion*-s.

onion_poller_slot is a struct that stores all the data for the handler: callback, destructor and timeout. It must be created before being added to the poller, as the struct must be fully populated before inserting it. When the appropriate event arrives (read only), it calls the callback with the private data. This private data must contain the fd, the handler, and so on.

Why is the fd not passed as the first parameter? As onion uses opaque pointers (opaque to the library users) for all internal data, there is no direct use for the fd, as the callback will always call a function that does the real read from one of these pointers. For example, at listen_point.c:109 it stores the request itself. This abstracts away whether the protocol is HTTP or HTTPS; each requires different read functions.

So it could be changed (an API break, no big deal), but it would not add much from my point of view, as you will almost always need some private data, which can contain this fd. The actual read is normally not done directly from the fd, except in some basic cases; and if you don't need private data, just use a pointer to the fd as the private data if you can (&onion->listenfd), with no free. Given that you will never call the handler after onion is freed, there is no need to free it.

In the worst case, reserve memory just for it: "int *fdp = malloc(sizeof(fd)); *fdp = fd;" and use "free" as the destructor.
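
For example, a slot for the signal pipe could look roughly like this (a sketch only: it assumes onion_poller_slot_new(fd, callback, data) and onion_poller_add as described above, so check poller.h for the exact signatures; how to obtain the poller from the onion instance is left out):

#include <onion/poller.h>
#include <stdlib.h>
#include <unistd.h>

/* The callback signature is int (*f)(void*): no fd argument, only the
 * private data, which here is simply a malloc'ed int holding the fd. */
static int read_signal_pipe(void *data) {
  int fd = *(int *) data;
  unsigned char b;
  if (read(fd, &b, 1) == 1) {
    /* b is the signal number written by the handler; react here, e.g.
     * by stopping the listen loop. */
  }
  return 0;  /* the exact return-value semantics are Basile's open
              * question above; check how poller.c interprets it */
}

/* 'poller' is the poller owned by the onion instance; 'pipe_read_fd' is
 * the read end of the signal pipe. */
void add_signal_slot(onion_poller *poller, int pipe_read_fd) {
  int *fdp = malloc(sizeof *fdp);      /* "reserve memory just for it" */
  *fdp = pipe_read_fd;
  onion_poller_slot *slot = onion_poller_slot_new(pipe_read_fd,
                                                  read_signal_pipe, fdp);
  /* A destructor (free) can be attached to the slot so that fdp is
   * released when the slot is removed; see the slot setters in poller.h. */
  onion_poller_add(poller, slot);
}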

Regards,
David.









David Moreno Montero

Jul 5, 2018, 04:15:52
To: Basile Starynkevitch, onion-dev
Hi,

you are right that it may not make a big difference, but I think it prepares us for unknown uses, and even your scenario can be solved in different ways (lock-free queues, atomic variables (check-and-set)...). And most importantly, singletons are evil and limit you later. And in this case it is quite easy to do without them.

David Moreno Montero

Jul 5, 2018, 04:23:20
To: Basile Starynkevitch, onion-dev
Can you try again to apply it with "git am FILENAME.patch"? I just tried it on your latest master and it applied cleanly.

Basile Starynkevitch

Jul 5, 2018, 14:57:03
To: David Moreno Montero, onion-dev



On 07/05/2018 10:22 AM, David Moreno Montero wrote:
Can you try again to merge it with "git am FILENAME.patch". I just tried on your latest master and it applied cleanly


Thanks. That worked, and I pushed it.

(I might not have time to work tomorrow on libonion, so please be patient)

Cheers

Basile Starynkevitch

Jul 6, 2018, 01:46:27
To: David Moreno Montero, onion-dev



On 07/05/2018 10:09 AM, David Moreno Montero wrote:
Hi,

the default handler will not work on the multi onion case, but a custom handler may know all the onion * handlers and send all the required signals through the pipe to all the onion*.

onion_poller_slot is an struct that stores all the data for the handler: callback, destructor and timeout. It must be created beforehand adding to the poller, as the struct must be fully populated before inserting it. When the appropriate event arrives (only read) it calls the callback with the private data. This private data must contain the fd handler and so on.





I am not very happy with the current API regarding SIGTERM. Remember that signal(7) processing is whole-process-wide, not thread-specific (even if, to complicate things, some internal signals like SIGSEGV or SIGBUS or SIGSYS or SIGPIPE are sent to a specific thread; SIGIO, SIGTERM, SIGINT, SIGQUIT, SIGALRM etc. are sent to the entire process). This is a sad fact (and that is why signals and pthreads(7) don't play well together) mandated by the POSIX API. BTW, some people believe that future versions of POSIX might deprecate such asynchronous signals in favor of something inspired by the Linux-specific signalfd(2), but that might not happen before my retirement.

First, I believe that SIGTERM handling should be optional and explicit. Installing a signal handler at onion_new time is IMHO a bad approach, and very unfriendly to multiple onion-s.

If we want to handle SIGTERM with multiple onion-s later (perhaps not now; there certainly are subtle corner cases that I have not thought of), we need to make SIGTERM enabling separate and explicit.

My suggestion is to break the current libonion API, so:

remove the O_NO_SIGTERM flag to onion_new.

by default libonion no longer handles any SIGTERM signal implicitly, so the default Unix behavior of abruptly terminating the process is restored. This is an incompatible change.

we add a signature for signal-related notification callbacks

    typedef void onion_signal_callback_sigt (onion*o, int signum, void*clientdata);

and additional functions to enable signal handling to be called after onion_new, but before onion_listen:

     // install signal processing; should be done before onion_listen
     void onion_handle_sigterm (onion*o, onion_signal_callback_sigt*sigcallback, void*clientdata);

     void onion_handle_sigint (onion*o, onion_signal_callback_sigt*sigcallback, void*clientdata);
     void onion_handle_sigquit (onion*o, onion_signal_callback_sigt*sigcallback, void*clientdata);

with the important convention that these onion_handle_sigXXX should be called before any onion_listen, even on some other onion in some other thread.

So we document the convention that all onion-s should be created early at initialization time by onion_new in the main thread, then some onion_handle_sigXXX could be called, then onion_listen can be called (even in some other thread). However, calling any onion_handle_sigXXX function while some onion event loop is running (in some onion_listen perhaps on some other onion in another thread...) is undefined behavior.
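
To make the proposal concrete, an application could then look like this (remember that onion_signal_callback_sigt and the onion_handle_sigXXX functions are only proposed here; they do not exist in libonion yet):

/* Usage sketch of the proposed API above; it will not compile against the
 * current libonion, since onion_handle_sigterm and friends are proposals. */
#include <onion/onion.h>
#include <stdio.h>

typedef void onion_signal_callback_sigt(onion *o, int signum, void *clientdata);
void onion_handle_sigterm(onion *o, onion_signal_callback_sigt *cb, void *clientdata);
void onion_handle_sigquit(onion *o, onion_signal_callback_sigt *cb, void *clientdata);

static void on_signal(onion *o, int signum, void *clientdata) {
  /* Runs from the event loop, not from the raw signal handler, so it is
   * free to call non-async-signal-safe code, e.g. to flush application
   * state before the listen loop returns. */
  fprintf(stderr, "server %p stopping on signal %d (%s)\n",
          (void *) o, signum, (const char *) clientdata);
}

int main(void) {
  onion *o = onion_new(0);              /* no implicit signal handling */
  onion_handle_sigterm(o, on_signal, "sigterm");
  onion_handle_sigquit(o, on_signal, "sigquit");
  onion_listen(o);                      /* returns once a signal stops the loop */
  onion_free(o);
  return 0;
}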

BTW, I really don't understand (i.e. I dislike a bit) the onion_listen function. It looks like it has different, unrelated roles: running some event loop forever (until someone calls onion_listen_stop), and running the body of that loop once (with the O_ONE and O_ONE_LOOP flags, which are under-documented).
I prefer the GTK design: it provides the gtk_main function to run the event loop, gtk_events_pending to test if some event is pending (so it is polling with a 0 delay) and gtk_main_iteration and gtk_main_iteration_do to run a single iteration of that loop.

IMHO, mixing both the event loop and its body inside an onion_listen function is confusing. And since onion_listen is usually some event loop, it should be named as such (so onion_loop ....); if we change names we would obviously provide old names for compatibility.

I still don't understand clearly what having several instances of onion would mean (with several calls to onion_new, perhaps in different threads), and so far nobody on this list has come up with a convincing example (so I believe nobody needs that in practice, and libonion was not really designed for that obscure case).

BTW, we could want to handle several onion-s in the same event loop.

More importantly, making the event loop explicit, like GTK did, allows one to cleanly write a single-threaded onion application (the current documentation is so confusing that I am not sure I understand how to do that, even if I feel it is possible). This could be interesting on cheap devices (perhaps IoT devices running Linux, even single-core) which handle only a few HTTP requests (think of my consumer wifi box or of some domotic processor) mixed with other events (so far, I don't understand how to do that in libonion: a single-threaded thing handling both HTTP requests and RPCJSON); that seems possible with the current libonion API, but I don't understand how (because of the lack of documentation and meaningful comments in the header files). And that might even become a new interesting example (I don't care about RPCJSON particularly; it could also be some simplistic raw X11 application, in the 1980s style, using only libX11 without any toolkit, drawing a black square).

Cheers

David Moreno Montero

Jul 6, 2018, 04:25:13
To: Basile Starynkevitch, onion-dev
Hi,

The philosophy I tried to use when designing the API was 'make simple things easy, complex possible' (https://en.m.wikiquote.org/wiki/Alan_Kay). And KISS.

So the benchmark is: if it's possible to make the hello world example simpler, the change is in. If the example does not change, the change is possible; if it becomes more complex, sorry, no. The goal is easier onboarding. As you need more functionality it may become harder and more complex, but the base case stays super easy.

The change you propose fits (in my mind) in the complex scenario. Your use case is embedding onion into an already existing application, and you care about signal handling in your specific case. The simple case is that Ctrl-C just stops the program; as a nicety, it waits to handle in-flight requests. This case is what should be optimized for simplicity. So removing O_NO_SIGTERM is actually what we do NOT want to do, as it would require an extra flag or function call.

For me it is just better to make onion_listen_stop signal friendly, and by default attach it for the case of only one onion. In my mind I only see problems with several onions if you want automatic signal handling, so that case may even show a runtime warning (or error and abort) and be properly documented.

As several onions is not the standard case, we should not optimize for it, but we should make it possible. I see no need to create five new functions. If the user needs this complex case, it should go in their code, not in onion. But onion allows it.

About the loop: yes, it could be better, and I should double-check it. But accepting several loop modes (one, one loop, threaded, pool) makes it quite complex. I will check again how it works, as I don't remember it properly. Anyway, indeed it could be split into several simpler functions.

A hypothetical example of the several-instances case: a DNS-over-HTTP server that allows REST updates on one port and HTTP DNS queries on another. You would never want to mix the ports, as you cannot trust the Host header. And each port may need different bindings (one to localhost, the other to 0.0.0.0). One way to do it is to have two onion instances. This is what PowerDNS does, although it does not use onion (and there is no DNS-over-HTTP support yet). OK, security needs a bit more thought, but it should work.

Regards,
David.

Basile Starynkevitch

Jul 6, 2018, 12:38:27
To: David Moreno Montero, onion-dev


On 07/06/2018 10:25 AM, David Moreno Montero wrote:
> Hi,
>
> The philosophy I tried to use when designing the API was 'make simple
> things easy, complex possible'
> (https://en.m.wikiquote.org/wiki/Alan_Kay). And KISS.

Then why bother about the multi-onion case? I did not. But sadly, the
documentation does not mention that there should be only one onion.

I am ok with requiring multiple onion-s to all use O_NO_SIGTERM