
requests.{get,post} timeout


Skip Montanaro

Aug 22, 2017, 10:02:47 AM
I'm using the requests module with timeouts to fetch URLs, for example:

response = requests.get("http://www.google.com/", timeout=10)

I understand the timeout value in this case applies both to creating the
connection and fetching the remote content. Can the server dribble out the
content (say, one byte every few seconds) to avoid triggering the timeout,
or must the request be completed within ten seconds after the connection is
successfully opened? My reading of the documentation here is inconclusive:

http://docs.python-requests.org/en/master/user/advanced/#timeouts

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read
timeouts.

Does "read timeout" imply the timeout applied to an individual read from
the underlying socket? A quick glance at the code suggests that might be
the case, but I got a bit lost in the urllib3 code which underpins the
requests module.
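
(For what it's worth, the same page documents passing a (connect, read)
tuple instead of a single number, which at least lets the two timeouts
be set independently; a minimal sketch:)

========================================================================
import requests

# A single number applies to both the connect and the read timeout;
# a (connect, read) tuple sets them independently.
response = requests.get("http://www.google.com/", timeout=(3.05, 10))
========================================================================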

Thx,

Skip

Chris Angelico

Aug 22, 2017, 12:53:15 PM
"""
Once your client has connected to the server and sent the HTTP
request, the read timeout is the number of seconds the client will
wait for the server to send a response. (Specifically, it's the number
of seconds that the client will wait between bytes sent from the
server. In 99.9% of cases, this is the time before the server sends
the first byte).
"""

"Between bytes" implies that you could have a long request, as long as
there's a keep-alive transmission every few seconds.
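
(To make the "between bytes" behaviour concrete, here is a hypothetical
local demo - it assumes port 8080 is free and requests is installed. A
server that dribbles one byte every two seconds never trips a
five-second read timeout, because the timer resets on every byte
received:)

========================================================================
import socket
import threading
import time

import requests

def dribble_server(port=8080):
    # Minimal HTTP server that sends a 30-byte body one byte at a time.
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(1)
    conn, _ = srv.accept()
    conn.recv(4096)  # read (and ignore) the HTTP request
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 30\r\n\r\n")
    for _ in range(30):
        conn.sendall(b"x")  # one byte...
        time.sleep(2)       # ...every two seconds
    conn.close()

threading.Thread(target=dribble_server, daemon=True).start()
time.sleep(0.5)  # let the server start listening

# timeout=5 is applied between reads, so this succeeds after roughly a
# minute even though each individual wait is well under five seconds.
print(requests.get("http://127.0.0.1:8080/", timeout=5).text)
========================================================================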

ChrisA

Jon Ribbens

Aug 22, 2017, 1:02:53 PM
On 2017-08-22, Skip Montanaro <skip.mo...@gmail.com> wrote:
> I'm using the requests module with timeouts to fetch URLs, for example:
>
> response = requests.get("http://www.google.com/", timeout=10)
>
> I understand the timeout value in this case applies both to creating the
> connection and fetching the remote content. Can the server dribble out the
> content (say, one byte every few seconds) to avoid triggering the timeout,

Yes. There is no timeout feature that can be used to limit the total
time a 'requests' request takes. Some people might think that this is
a serious flaw in the requests library that would need urgent
rectification in order to make the library safe and useful to use in
almost any situation, but the 'requests' developers are apparently not
among those people.

Chris Angelico

Aug 22, 2017, 1:08:59 PM
I'm not either. The idea of a timeout is to detect when something's
completely not working, not to limit the overall processing time. If
you want that, you can do it locally, maybe with signal.alarm or a
thread or something.
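
(A minimal sketch of the signal.alarm approach - Unix-only, main
thread only; the 30-second limit is an arbitrary illustration:)

========================================================================
import signal

import requests

def on_alarm(signum, frame):
    raise TimeoutError("whole request exceeded 30 seconds")

signal.signal(signal.SIGALRM, on_alarm)
signal.alarm(30)     # SIGALRM fires after 30 seconds of wall time
try:
    response = requests.get("http://www.google.com/", timeout=10)
finally:
    signal.alarm(0)  # cancel the pending alarm either way
========================================================================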

ChrisA

Skip Montanaro

Aug 22, 2017, 1:09:07 PM
> """
> Once your client has connected to the server and sent the HTTP
> request, the read timeout is the number of seconds the client will
> wait for the server to send a response. (Specifically, it's the number
> of seconds that the client will wait between bytes sent from the
> server. In 99.9% of cases, this is the time before the server sends
> the first byte).
> """
>
> "Between bytes" implies that you could have a long request, as long as
> there's a keep-alive transmission every few seconds.

Thanks, Chris. That appears to be what's going on.

S

Jon Ribbens

Aug 22, 2017, 2:19:40 PM
We appear to have different understandings of the word "timeout".
I think it means a time which, if it runs out, will stop the operation.

I am somewhat surprised that anyone might have a different definition
- not least because, from a human being's point of view, they care
about the overall time something takes to happen and telling them that
nothing's wrong because technically we are still "successfully" receiving
the expected 10 kilobytes of data 3 hours later is unlikely to make
them happy.

Grant Edwards

Aug 22, 2017, 2:32:12 PM
On 2017-08-22, Chris Angelico <ros...@gmail.com> wrote:

> """
Except a keep-alive transmission doesn't contain any bytes, so it
shouldn't reset the timer.

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! It's some people inside the wall! This is better than mopping!

Chris Angelico

Aug 22, 2017, 2:43:46 PM
You start downloading a file from a web page. It stalls out.

Is it merely slow, and continuing to wait will get you a result?

Or has it actually stalled out and you should give up?

The low-level timeout will distinguish between those. If you want a
high-level timeout across the entire job, you can do that too, but
then you have to figure out exactly how long is "too long". Let's say
you set a thirty-second timeout. Great! Now someone uses your program
on a midrange connection to download a 100MB file, or on a poor
connection to download a 5MB file, or on dial-up to download a 10KB
file. Data is constantly flowing, but at some point, the connection
just dies, because it's hit your timeout. This is EXTREMELY
frustrating.

You can always add in the overall timeout separately. If the low-level
timeout were implemented that way, there would be no way to externally
add the other form of timeout. Therefore the only sane way to
implement the request timeout is a between-byte limit.

ChrisA

Chris Angelico

Aug 22, 2017, 2:45:17 PM
On Wed, Aug 23, 2017 at 4:31 AM, Grant Edwards
<grant.b...@gmail.com> wrote:
> On 2017-08-22, Chris Angelico <ros...@gmail.com> wrote:
>
>> """
>> Once your client has connected to the server and sent the HTTP
>> request, the read timeout is the number of seconds the client will
>> wait for the server to send a response. (Specifically, it's the number
>> of seconds that the client will wait between bytes sent from the
>> server. In 99.9% of cases, this is the time before the server sends
>> the first byte).
>> """
>>
>> "Between bytes" implies that you could have a long request, as long as
>> there's a keep-alive transmission every few seconds.
>
> Except a keep-alive transmission doesn't contain any bytes, so it
> shouldn't reset the timer.

If it's a TCP keep-alive, yes. But if you're looking at a long-poll
HTTP server, or a websocket, or you're proxying a different type of
connection, you can use a connection-level keep-alive to reset it.
I've often worked with TELNET, using an IAC GA or similar as a
keep-alive to get past stupid routers that drop connections after five
minutes of idleness.

ChrisA

MRAB

Aug 22, 2017, 3:10:48 PM
On 2017-08-22 19:43, Chris Angelico wrote:
> You start downloading a file from a web page. It stalls out.
>
> Is it merely slow, and continuing to wait will get you a result?
>
> Or has it actually stalled out and you should give up?
>
> The low-level timeout will distinguish between those. If you want a
> high-level timeout across the entire job, you can do that too, but
> then you have to figure out exactly how long is "too long". Let's say
> you set a thirty-second timeout. Great! Now someone uses your program
> on a midrange connection to download a 100MB file, or on a poor
> connection to download a 5MB file, or on dial-up to download a 10KB
> file. Data is constantly flowing, but at some point, the connection
> just dies, because it's hit your timeout. This is EXTREMELY
> frustrating.
>
> You can always add in the overall timeout separately. If the low-level
> timeout were implemented that way, there would be no way to externally
> add the other form of timeout. Therefore the only sane way to
> implement the request timeout is a between-byte limit.
>
You might want to have a way of setting the minimum data rate in order
to defend against a slowloris attack.
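
(One client-side sketch of that idea, using requests' documented
streaming API; the helper name and rate threshold are illustrative,
not part of any library:)

========================================================================
import time

import requests

def get_with_min_rate(url, min_bytes_per_sec=1024, read_timeout=10):
    # Stream the body and abort if the average transfer rate drops too
    # low; this bounds total time for a given size, unlike a per-read
    # timeout, which a slow trickle never trips.
    start = time.monotonic()
    received = 0
    chunks = []
    with requests.get(url, timeout=read_timeout, stream=True) as resp:
        for chunk in resp.iter_content(chunk_size=4096):
            received += len(chunk)
            chunks.append(chunk)
            elapsed = time.monotonic() - start
            # Give the transfer a five-second grace period first.
            if elapsed > 5 and received / elapsed < min_bytes_per_sec:
                raise IOError("transfer rate below minimum, aborting")
    return b"".join(chunks)
========================================================================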

Jon Ribbens

Aug 22, 2017, 3:10:48 PM
On 2017-08-22, Chris Angelico <ros...@gmail.com> wrote:
> The low-level timeout will distinguish between those. If you want a
> high-level timeout across the entire job, you can do that too, but
> then you have to figure out exactly how long is "too long". Let's say
> you set a thirty-second timeout. Great! Now someone uses your program
> on a midrange connection to download a 100MB file, or on a poor
> connection to download a 5MB file, or on dial-up to download a 10KB
> file. Data is constantly flowing, but at some point, the connection
> just dies, because it's hit your timeout. This is EXTREMELY
> frustrating.

Sure, the right timeout to use depends on what your application is and
what it's doing.

> You can always add in the overall timeout separately. If the low-level
> timeout were implemented that way, there would be no way to externally
> add the other form of timeout. Therefore the only sane way to
> implement the request timeout is a between-byte limit.

I have no idea what you mean here. The only sane way to implement the
request timeout is to provide both types of timeout.

Chris Angelico

Aug 22, 2017, 3:21:31 PM
On Wed, Aug 23, 2017 at 5:10 AM, MRAB <pyt...@mrabarnett.plus.com> wrote:
> On 2017-08-22 19:43, Chris Angelico wrote:
>>
>> You start downloading a file from a web page. It stalls out.
>>
>> Is it merely slow, and continuing to wait will get you a result?
>>
>> Or has it actually stalled out and you should give up?
>>
>> The low-level timeout will distinguish between those. If you want a
>> high-level timeout across the entire job, you can do that too, but
>> then you have to figure out exactly how long is "too long". Let's say
>> you set a thirty-second timeout. Great! Now someone uses your program
>> on a midrange connection to download a 100MB file, or on a poor
>> connection to download a 5MB file, or on dial-up to download a 10KB
>> file. Data is constantly flowing, but at some point, the connection
>> just dies, because it's hit your timeout. This is EXTREMELY
>> frustrating.
>>
>> You can always add in the overall timeout separately. If the low-level
>> timeout were implemented that way, there would be no way to externally
>> add the other form of timeout. Therefore the only sane way to
>> implement the request timeout is a between-byte limit.
>>
> You might want to have a way of setting the minimum data rate in order to
> defend against a slowloris attack.

That assumes that that's an attack - it often isn't. But if that's
what you want, then add that as a separate feature - it's distinct
from a timeout.

ChrisA

Chris Angelico

Aug 22, 2017, 3:22:12 PM
On Wed, Aug 23, 2017 at 5:06 AM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
>> You can always add in the overall timeout separately. If the low-level
>> timeout were implemented that way, there would be no way to externally
>> add the other form of timeout. Therefore the only sane way to
>> implement the request timeout is a between-byte limit.
>
> I have no idea what you mean here. The only sane way to implement the
> request timeout is to provide both types of timeout.

You could provide both, but since one of them can be handled
externally (with a thread, with a SIGALRM, or with some other sort of
time limiting), the other one MUST be provided by the request.

ChrisA

Skip Montanaro

Aug 22, 2017, 4:15:42 PM
> You could provide both, but since one of them can be handled
> externally (with a thread, with a SIGALRM, or with some other sort of
> time limiting), the other one MUST be provided by the request.

Given the semantics of timeouts which percolate up from the socket
level, I agree with Chris. It has a particular meaning: that
implemented by the underlying socket layer. Unfortunately, the word
"timeout" can take on related (but different) meanings, depending on
context. While we can discuss how to implement the timeout which means
"the maximum amount of time it should take to transfer a chunk of
content from one end of the connection to the other", it's difficult
to say exactly where detecting such timeouts should live in the
application's network stack. That it might be tedious to implement
correctly (I suspect, given their druthers, most people would prefer
to let sleeping threading and signaling dogs lie) is kind of beside
the point.

Now that I have a firmer grasp of what timeout I do have (the socket
level per-read-or-write call timeout), I can decide how important it
is for me to implement the other.

Skip

dieter

Aug 23, 2017, 3:02:22 AM
Skip Montanaro <skip.mo...@gmail.com> writes:
> ...
> Given the semantics of timeouts which percolate up from the socket
> level, I agree with Chris. It has a particular meaning, that
> implemented by the underlying socket layer. Unfortunately, the word
> "timeout" can take on related (but different) meanings, depending on
> context.

That's why the documentation (which you cited in your original post)
clearly distinguishes between the "connect" and "read" timeouts - and
thereby explains that only those types of timeouts are supported.

As you explained, those timeouts directly derive from the socket layer
timeouts.

Jon Ribbens

Aug 23, 2017, 7:15:06 AM
On 2017-08-22, Chris Angelico <ros...@gmail.com> wrote:
I am interested to learn what you mean by "with a thread". How would
one execute a requests, er, request in a thread with a proper timeout?

Chris Angelico

Aug 23, 2017, 8:34:34 AM
Assuming that by "proper timeout" you mean "limit the entire
download's wall time": Use one thread to do the request, and another
thread to monitor it. Generally, the monitoring thread is your UI
thread (most commonly, the main thread of the program), though it
doesn't have to be. If the monitoring thread decides that the
requesting thread has taken too long, it can cut it off and report
failure to the user.
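
(A minimal sketch of that monitor pattern; on timeout the worker is
simply abandoned rather than cancelled - actually cancelling it is
what the rest of this thread is about:)

========================================================================
import threading

import requests

result = {}

def fetch(url):
    # Worker thread: the response (or the error) lands in a shared dict.
    try:
        result["response"] = requests.get(url, timeout=10)
    except Exception as exc:
        result["error"] = exc

t = threading.Thread(target=fetch, args=("http://www.google.com/",),
                     daemon=True)
t.start()
t.join(30)           # the monitor waits at most 30 seconds of wall time
if t.is_alive():
    print("request took too long; giving up on it")
else:
    print(result)
========================================================================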

ChrisA

Jon Ribbens

Aug 23, 2017, 8:57:06 AM
Yes, what I was interested to learn was how the monitoring thread can
"cut off" the requesting thread.

Chris Angelico

Aug 23, 2017, 10:04:30 AM
Ah, I see. That partly depends on your definition of "cut off", and
how it needs to interact with other things. I'm not sure about the
requests module specifically, but one very effective method of
terminating a network query is to close the socket (since resources
are owned by processes, not threads, any thread can close the
underlying socket connection); it won't instantly terminate the
thread, but it will result in an end-of-connection read shortly
afterwards. You'd have to do some sort of high-level "hey, I'm
cancelling this request" for the user's benefit, too - or maybe the
user initiated it in the first place. For example, in a web browser,
you can hit Esc to cancel the current page download; that can
immediately close the socket, and it probably has to tell the cache
subsystem not to hold that data, and maybe some logging and stuff, but
in terms of aborting the download, closing the socket is usually
sufficient.

How this interacts with the actual specific details of the requests
module I'm not sure, especially since a lot of the work usually
happens inside requests.get() or one of its friends; but if you
explicitly construct a Request object before sending it [1], it would
be conceivable to have a "req.close()" or "req.abort()" method that
closes the underlying socket. (Or possibly that goes on the
PreparedRequest. Or maybe the Session. I don't know; normally, when I
use requests, I use the higher level interfaces.) It would need to be
a feature provided by requests ("abort this request"), as it would
potentially interact with connection pooling and such. But at the
simplest level, closing the socket WILL abort the connection.

[1] http://docs.python-requests.org/en/master/user/advanced/#prepared-requests
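
(A bare-socket sketch of that mechanism, with no requests involved -
example.com here is just a host that will hold the connection open
while we send nothing:)

========================================================================
import socket
import threading
import time

# Connect but send no request, so the server waits and recv() blocks.
sock = socket.create_connection(("example.com", 80))

def reader():
    try:
        data = sock.recv(4096)   # blocks until bytes arrive or EOF
        print("read returned %d bytes" % len(data))
    except OSError as exc:
        print("read aborted:", exc)

t = threading.Thread(target=reader)
t.start()
time.sleep(2)                    # pretend we've run out of patience
sock.shutdown(socket.SHUT_RDWR)  # wakes the blocked recv() reliably
sock.close()
t.join()
========================================================================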

ChrisA

Marko Rauhamaa

Aug 23, 2017, 10:07:55 AM
Jon Ribbens <jon+u...@unequivocal.eu>:

> Yes, what I was interested to learn was how the monitoring thread can
> "cut off" the requesting thread.

In general, that cannot be done. Often, you resort to a dirty trick
whereby the monitoring thread closes the I/O object the requesting
thread is waiting on, triggering an immediate I/O exception in the
requesting thread.

The fact that threads cannot be terminated at will is one of the big
drawbacks of the multithreaded programming model. Note that coroutines
can always be interrupted at await.
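
(To illustrate that last point, a sketch using asyncio - Python 3.7+
for asyncio.run. wait_for cancels the coroutine at whatever await it
is suspended on, which gives a true whole-request timeout:)

========================================================================
import asyncio

async def fetch(host):
    # Bare-bones HTTP GET over asyncio streams.
    reader, writer = await asyncio.open_connection(host, 80)
    writer.write(b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
    await writer.drain()
    body = await reader.read()   # read until the server closes
    writer.close()
    return body

async def main():
    try:
        body = await asyncio.wait_for(fetch("example.com"), timeout=10)
        print(len(body), "bytes")
    except asyncio.TimeoutError:
        print("whole-request deadline exceeded")

asyncio.run(main())
========================================================================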


Marko

Jon Ribbens

Aug 23, 2017, 10:57:14 AM
OK cool, so circling back to where you were - which is the same place
that the 'requests' developers are - the claim that requests does not
need to provide an "overall timeout" feature because you can cancel
the request yourself is untrue since, as you explain above, you cannot
in fact cancel the request yourself without some sort of support for
this in the module itself. Sure, requests *could* provide a "cancel"
feature, just the same as it *could* provide an "overall timeout"
feature, but it doesn't actually provide either, and this is a
problem.

Chris Angelico

Aug 23, 2017, 12:09:30 PM
On Thu, Aug 24, 2017 at 12:52 AM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
> OK cool, so circling back to where you were - which is the same place
> that the 'requests' developers are - which is the claim that requests
> does not need to provide an "overall timeout" feature because you
> can cancel the request yourself is untrue since, as you explain above,
> you cannot in fact cancel the request yourself without some sort of
> support for this in the module itself. Sure, requests *could* provide
> a "cancel" feature, just the same as it *could* provide an "overall
> timeout" feature, but it doesn't actually provide either, and this
> is a problem.

Yes and no. If requests provided a 'cancel query' feature, it would
play nicely with everything else, but (a) the entire concept here is
that the request has stalled, so you COULD just ignore the pending
query and pretend it's failed without actually cancelling it; and (b)
you could just close the underlying socket without help, but it might
mess up future queries that end up getting put onto the same socket.
It's not that you CAN'T do this without help (which is the case for a
"time between bytes" timeout), but that having help would allow
requests *itself* to benefit.

But also, this honestly isn't as big an issue as you might think. If
the user thinks a program has been running for too long, s/he can hit
Ctrl-C. Voila! Signal is sent, which aborts a socket read, and thus
the request. And if your top-level code is doing something else (so
cancelling one request shouldn't terminate the whole program), Python
already lets you catch KeyboardInterrupt. This is ONLY a problem when
you need to have a program decide *by itself* that a request has taken
too long.
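
(A sketch of that pattern - the human supplies the "too long"
judgement, and the program just catches the interrupt and carries on:)

========================================================================
import requests

try:
    response = requests.get("http://www.google.com/", timeout=10)
except KeyboardInterrupt:
    # The user decided it was taking too long; note the failure and
    # keep the rest of the program running.
    response = None
========================================================================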

ChrisA

Marko Rauhamaa

Aug 23, 2017, 12:59:58 PM
Chris Angelico <ros...@gmail.com>:

> But also, this honestly isn't as big an issue as you might think. If
> the user thinks a program has been running for too long, s/he can hit
> Ctrl-C. Voila! Signal is sent, which aborts a socket read,

Well, no, it doesn't. First run:

========================================================================
nc -l -p 12345
========================================================================

in one window. Then, execute this program in another one:

========================================================================
import threading, socket

def f():
    s = socket.socket()
    try:
        s.connect(("localhost4", 12345))
        s.recv(1000)
    finally:
        s.close()

t = threading.Thread(target=f)
t.start()
t.join()
========================================================================

After you hit Ctrl-C once (under Linux), you get this trace:

========================================================================
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    t.join()
  File "/usr/lib64/python3.5/threading.py", line 1054, in join
    self._wait_for_tstate_lock()
  File "/usr/lib64/python3.5/threading.py", line 1070, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
========================================================================

The program hangs, though, and "nc" doesn't terminate, indicating that
the socket hasn't been closed.

Then, press Ctrl-C again to get:

========================================================================
Exception ignored in: <module 'threading' from '/usr/lib64/python3.5/threading.py'>
Traceback (most recent call last):
  File "/usr/lib64/python3.5/threading.py", line 1288, in _shutdown
    t.join()
  File "/usr/lib64/python3.5/threading.py", line 1054, in join
    self._wait_for_tstate_lock()
  File "/usr/lib64/python3.5/threading.py", line 1070, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
========================================================================

and the program terminates.


Marko

Chris Angelico

Aug 23, 2017, 1:29:43 PM
On Thu, Aug 24, 2017 at 2:59 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> Chris Angelico <ros...@gmail.com>:
>
>> But also, this honestly isn't as big an issue as you might think. If
>> the user thinks a program has been running for too long, s/he can hit
>> Ctrl-C. Voila! Signal is sent, which aborts a socket read,
>
> Well, no, it doesn't. First run:
>
> ========================================================================
> nc -l -p 12345
> ========================================================================
>
> in one window. Then, execute this program in another one:
>
> ========================================================================
> import threading, socket
>
> def f():
>     s = socket.socket()
>     try:
>         s.connect(("localhost4", 12345))
>         s.recv(1000)
>     finally:
>         s.close()
>
> t = threading.Thread(target=f)
> t.start()
> t.join()
> ========================================================================
>
> After you hit Ctrl-C once (under Linux), you get this trace:

[chomp]

What I said was that you don't need threading or alarms because most
of the time you can let the user use SIGINT. And without the (utterly
totally useless) threading that you have here, it works flawlessly:
Ctrl-C instantly breaks the recv call.

All you've demonstrated is that Ctrl-C halts a long-running request
*in the main thread*, which in this case is your join(). And when I
tested it interactively, it left the subthread running and halted the
join. The reason you see the "hit Ctrl-C again" phenomenon is that the
program wants to join all threads on termination. Solution 1: Keep the
program running but halt the request. Solution 2: Daemonize the
thread. Just run "t.daemon = True" before starting the thread, and the
program terminates cleanly after one Ctrl-C. I'd prefer solution 1,
myself, but you can take your pick.

ChrisA

Marko Rauhamaa

Aug 23, 2017, 2:00:49 PM
Chris Angelico <ros...@gmail.com>:

> What I said was that you don't need threading or alarms because most
> of the time you can let the user use SIGINT. And without the (utterly
> totally useless) threading that you have here, it works flawlessly:
> Ctrl-C instantly breaks the recv call.

Oh, if you give up threading (which I commend you for), problems
evaporate.

So just use async if that's your cup of tea, or nonblocking I/O and
select.epoll(), which is my favorite.


Marko

Jon Ribbens

Aug 23, 2017, 6:59:12 PM
On 2017-08-23, Chris Angelico <ros...@gmail.com> wrote:
> Yes and no. If requests provided a 'cancel query' feature, it would
> play nicely with everything else, but (a) the entire concept here is
> that the request has stalled, so you COULD just ignore the pending
> query and pretend it's failed without actually cancelling it; and (b)
> you could just close the underlying socket without help, but it might
> mess up future queries that end up getting put onto the same socket.
> It's not that you CAN'T do this without help (which is the case for a
> "time between bytes" timeout), but that having help would allow
> requests *itself* to benefit.

I don't understand - in the above paragraph you first explain how
it cannot be done without help from requests, then you state that it
can be done without help from requests. Was your first explanation
wrong?

> But also, this honestly isn't as big an issue as you might think. If
> the user thinks a program has been running for too long, s/he can hit
> Ctrl-C. Voila! Signal is sent, which aborts a socket read, and thus
> the request. And if your top-level code is doing something else (so
> cancelling one request shouldn't terminate the whole program), Python
> already lets you catch KeyboardInterrupt. This is ONLY a problem when
> you need to have a program decide *by itself* that a request has taken
> too long.

Yes. Which is a very common situation - indeed, I wouldn't be
surprised if it is the most common situation in which requests is
used. It is certainly the situation I was trying to use requests in
when I came up against this problem.

Chris Angelico

Aug 23, 2017, 7:19:38 PM
On Thu, Aug 24, 2017 at 8:54 AM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
> On 2017-08-23, Chris Angelico <ros...@gmail.com> wrote:
>> Yes and no. If requests provided a 'cancel query' feature, it would
>> play nicely with everything else, but (a) the entire concept here is
>> that the request has stalled, so you COULD just ignore the pending
>> query and pretend it's failed without actually cancelling it; and (b)
>> you could just close the underlying socket without help, but it might
>> mess up future queries that end up getting put onto the same socket.
>> It's not that you CAN'T do this without help (which is the case for a
>> "time between bytes" timeout), but that having help would allow
>> requests *itself* to benefit.
>
> I don't understand - in the above paragraph you first explain how
> it cannot be done without help from requests, then you state that it
> can be done without help from requests. Was your first explanation
> wrong?

Not quite. I first explain that it can be done WITH help, and then
state that it can be done WITHOUT help. That it can be done with help
does not imply that it cannot be done without help.

Help is nice but it mainly helps for *subsequent* requests; an
external abort might leave internal state somewhat messed up, which
would result in resources not being released until the next query, or
perhaps a query failing and getting retried. But even without help, it
would all work.

ChrisA

Jon Ribbens

Aug 24, 2017, 7:48:17 AM
On 2017-08-23, Chris Angelico <ros...@gmail.com> wrote:
> On Thu, Aug 24, 2017 at 8:54 AM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
>> On 2017-08-23, Chris Angelico <ros...@gmail.com> wrote:
>>> Yes and no. If requests provided a 'cancel query' feature, it would
>>> play nicely with everything else, but (a) the entire concept here is
>>> that the request has stalled, so you COULD just ignore the pending
>>> query and pretend it's failed without actually cancelling it; and (b)
>>> you could just close the underlying socket without help, but it might
>>> mess up future queries that end up getting put onto the same socket.
>>> It's not that you CAN'T do this without help (which is the case for a
>>> "time between bytes" timeout), but that having help would allow
>>> requests *itself* to benefit.
>>
>> I don't understand - in the above paragraph you first explain how
>> it cannot be done without help from requests, then you state that it
>> can be done without help from requests. Was your first explanation
>> wrong?
>
> Not quite. I first explain that it can be done WITH help, and then
> state that it can be done WITHOUT help. That it can be done with help
> does not imply that it cannot be done without help.

Where did you explain how it can be done without help? As far as I'm
aware, you can't close the socket without help since you can't get
access to it, and as you mentioned even if you were to do so the
effect it would have on requests is completely undefined.

> Help is nice but it mainly helps for *subsequent* requests; an
> external abort might leave internal state somewhat messed up, which
> would result in resources not being released until the next query, or
> perhaps a query failing and getting retried. But even without help, it
> would all work.

We now appear to have different understandings of the word "work",
which you are defining to include things that are clearly not working.

Even in a situation where there is a user constantly watching over the
operation, there is still no solution as, while the user might well
indicate their running out of patience by pressing 'cancel' or
something, the process has no way to cancel the request in response to
the user's command.

Basically the only way to use requests safely is to fork an individual
process for every request, which is of course spectacularly inefficient.

Chris Angelico

Aug 24, 2017, 10:01:33 AM
On Thu, Aug 24, 2017 at 9:43 PM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
> On 2017-08-23, Chris Angelico <ros...@gmail.com> wrote:
>> On Thu, Aug 24, 2017 at 8:54 AM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
>>> On 2017-08-23, Chris Angelico <ros...@gmail.com> wrote:
>>>> Yes and no. If requests provided a 'cancel query' feature, it would
>>>> play nicely with everything else, but (a) the entire concept here is
>>>> that the request has stalled, so you COULD just ignore the pending
>>>> query and pretend it's failed without actually cancelling it; and (b)
>>>> you could just close the underlying socket without help, but it might
>>>> mess up future queries that end up getting put onto the same socket.
>>>> It's not that you CAN'T do this without help (which is the case for a
>>>> "time between bytes" timeout), but that having help would allow
>>>> requests *itself* to benefit.
>>>
>>> I don't understand - in the above paragraph you first explain how
>>> it cannot be done without help from requests, then you state that it
>>> can be done without help from requests. Was your first explanation
>>> wrong?
>>
>> Not quite. I first explain that it can be done WITH help, and then
>> state that it can be done WITHOUT help. That it can be done with help
>> does not imply that it cannot be done without help.
>
> Where did you explain how it can be done without help? As far as I'm
> aware, you can't close the socket without help since you can't get
> access to it, and as you mentioned even if you were to do so the
> effect it would have on requests is completely undefined.

In a single-threaded program, just hit Ctrl-C. Job done. No need to
know ANYTHING about the internals of the requests module, beyond that
it has correct handling of signals. Want that on a clock? SIGALRM. You
only need the more complicated solutions if you can't do this (eg if
you want multithreading for other reasons, or if SIGALRM doesn't work
on Windows - which is probably the case).

>> Help is nice but it mainly helps for *subsequent* requests; an
>> external abort might leave internal state somewhat messed up, which
>> would result in resources not being released until the next query, or
>> perhaps a query failing and getting retried. But even without help, it
>> would all work.
>
> We now appear to have different understandings of the word "work",
> which you are defining to include things that are clearly not working.
>
> Even in a situation where there is a user constantly watching over the
> operation, there is still no solution as, while the user might well
> indicate their running out of patience by pressing 'cancel' or
> something, the process has no way to cancel the request in response to
> the user's command.
>

Since any process-level signal will have the effect I describe, the
requests module HAS to cope with it. Hence, it WILL all work. It might
not be quite as efficient ("resources not released until next query"),
but it will be fully functional. I don't know what you mean by "not
working". Basically, you have the potential for a dangling socket that
isn't being used for anything but is still an open file. Unless you're
creating a billion of them in quick succession, that's not going to
break your program.

> Basically the only way to use requests safely is to fork an individual
> process for every request, which is of course spectacularly inefficient.

Yes. Spectacularly inefficient and almost certainly unnecessary.

ChrisA

Jon Ribbens

Aug 24, 2017, 10:22:11 AM
On 2017-08-24, Chris Angelico <ros...@gmail.com> wrote:
> On Thu, Aug 24, 2017 at 9:43 PM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
>> Where did you explain how it can be done without help? As far as I'm
>> aware, you can't close the socket without help since you can't get
>> access to it, and as you mentioned even if you were to do so the
>> effect it would have on requests is completely undefined.
>
> In a single-threaded program, just hit Ctrl-C.

By that, do you mean "kill the process"? That's obviously not
a sensible answer in general, especially given we were including
processes which have no terminal or user sitting there watching
them.

> Job done. No need to know ANYTHING about the internals of the
> requests module, beyond that it has correct handling of signals.
> Want that on a clock? SIGALRM.

Doesn't work with threading.

>> Even in a situation where there is a user constantly watching over the
>> operation, there is still no solution as, while the user might well
>> indicate their running out of patience by pressing 'cancel' or
>> something, the process has no way to cancel the request in response to
>> the user's command.
>
> Since any process-level signal will have the effect I describe, the
> requests module HAS to cope with it.

Receiving a signal is the same as closing the socket? What?
(And as I already mentioned, you can't close the socket anyway.)

> Hence, it WILL all work. It might not be quite as efficient
> ("resources not released until next query"), but it will be fully
> functional. I don't know what you mean by "not working".

Resources not released, subsequent operations failing, the library
possibly left in a state from which it cannot recover. This is
pretty obviously stuff that "is not working". Although even then
you still haven't explained how we can abort the operation (even
with all these side-effects) in the first place.

> Basically, you have the potential for a dangling socket that isn't
> being used for anything but is still an open file. Unless you're
> creating a billion of them in quick succession, that's not going to
> break your program.

It would take many orders of magnitude fewer than a billion to break
the program. This is not a responsible or sensible way to write a
computer program - to deliberately let it leak stuff and invoke
undefined behaviour all over the place and hope that somehow it'll
muddle along regardless.

>> Basically the only way to use requests safely is to fork an individual
>> process for every request, which is of course spectacularly inefficient.
>
> Yes. Spectacularly inefficient and almost certainly unnecessary.

You haven't suggested any workable alternative so far.

Chris Angelico

Aug 24, 2017, 12:02:12 PM
On Fri, Aug 25, 2017 at 12:17 AM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
> On 2017-08-24, Chris Angelico <ros...@gmail.com> wrote:
>> On Thu, Aug 24, 2017 at 9:43 PM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
>>> Where did you explain how it can be done without help? As far as I'm
>>> aware, you can't close the socket without help since you can't get
>>> access to it, and as you mentioned even if you were to do so the
>>> effect it would have on requests is completely undefined.
>>
>> In a single-threaded program, just hit Ctrl-C.
>
> By that, do you mean "kill the process"? That's obviously not
> a sensible answer in general, especially given we were including
> processes which have no terminal or user sitting there watching
> them.

Only in the sense that "kill" is the Unix term for "send signal".
Python catches the signal, the system call terminates with EINTR, and
the exception is raised. Give it a try.

(Caveat: I have no idea how this works on Windows. I do expect,
though, that it will abort the connection without terminating the
process, just like it does on Unix.)

>> Job done. No need to know ANYTHING about the internals of the
>> requests module, beyond that it has correct handling of signals.
>> Want that on a clock? SIGALRM.
>
> Doesn't work with threading.

How many of your programs have threads in them? Did you not read my
point where I said that the bulk of programs can use these simple
techniques?

>>> Even in a situation where there is a user constantly watching over the
>>> operation, there is still no solution as, while the user might well
>>> indicate their running out of patience by pressing 'cancel' or
>>> something, the process has no way to cancel the request in response to
>>> the user's command.
>>
>> Since any process-level signal will have the effect I describe, the
>> requests module HAS to cope with it.
>
> Receiving a signal is the same as closing the socket? What?
> (And as I already mentioned, you can't close the socket anyway.)

Not as such, but try it and see what actually happens. The signal
aborts the syscall; the exception causes the stack to be unwound. TRY
IT. It works.

>> Hence, it WILL all work. It might not be quite as efficient
>> ("resources not released until next query"), but it will be fully
>> functional. I don't know what you mean by "not working".
>
> Resources not released, subsequent operations failing, the library
> possibly left in a state from which it cannot recover. This is
> pretty obviously stuff that "is not working". Although even then
> you still haven't explained how we can abort the operation (even
> with all these side-effects) in the first place.

Not released UNTIL NEXT QUERY. Everything is recoverable. TRY IT. It works.

>> Basically, you have the potential for a dangling socket that isn't
>> being used for anything but is still an open file. Unless you're
>> creating a billion of them in quick succession, that's not going to
>> break your program.
>
> It would take many orders of magnitude fewer than a billion to break
> the program. This is not a responsible or sensible way to write a
> computer program - to deliberately let it leak stuff and invoke
> undefined behaviour all over the place and hope that somehow it'll
> muddle along regardless.

Leaking until the next call? A billion.

>>> Basically the only way to use requests safely is to fork an individual
>>> process for every request, which is of course spectacularly inefficient.
>>
>> Yes. Spectacularly inefficient and almost certainly unnecessary.
>
> You haven't suggested any workable alternative so far.

Have you tried any of what I said?

ChrisA

Marko Rauhamaa

Aug 24, 2017, 1:40:31 PM
Chris Angelico <ros...@gmail.com>:

> On Fri, Aug 25, 2017 at 12:17 AM, Jon Ribbens
> <jon+u...@unequivocal.eu> wrote:
>> By that, do you mean "kill the process"? That's obviously not a
>> sensible answer in general, especially given we were including
>> processes which have no terminal or user sitting there watching them.
>
> Only in the sense that "kill" is the Unix term for "send signal".
> Python catches the signal, the system call terminates with EINTR, and
> the exception is raised. Give it a try.

Signals are an arcane Unix communication method. I strongly recommend
against using signals for anything but terminating a process, and even
then you have to be extra careful.

I have seen code that uses signals for runtime communication, but none
of it was free from race conditions.

>>>> Basically the only way to use requests safely is to fork an
>>>> individual process for every request, which is of course
>>>> spectacularly inefficient.
>>>
>>> Yes. Spectacularly inefficient and almost certainly unnecessary.

Processes are a nice way to exercise multiple CPUs and also a way to
deal with obnoxious synchronous function calls.

However, you don't want to fork a process (or a thread, for that matter)
for each context. Rather, you should have a pool of processes in the
order of, say, 2 * CPU count, and the processes should fetch work from a
queue.

And if a process should get stuck, killing it is trivial.
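
(A sketch of that pool arrangement - the overall wall-clock limit
lives in the parent, and a stuck worker is killed when the pool is
torn down:)

========================================================================
import multiprocessing

import requests

def fetch(url):
    # Worker process: an ordinary blocking request.
    return requests.get(url, timeout=10).status_code

if __name__ == "__main__":
    with multiprocessing.Pool(processes=8) as pool:  # ~2 * CPU count
        job = pool.apply_async(fetch, ("http://www.google.com/",))
        try:
            print(job.get(timeout=30))  # hard limit on the whole job
        except multiprocessing.TimeoutError:
            print("request took too long")
    # Leaving the with-block calls pool.terminate(), which kills any
    # worker that is still stuck in the request.
========================================================================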


Marko

Chris Angelico

Aug 24, 2017, 2:02:37 PM
On Fri, Aug 25, 2017 at 3:40 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> Chris Angelico <ros...@gmail.com>:
>
>> On Fri, Aug 25, 2017 at 12:17 AM, Jon Ribbens
>> <jon+u...@unequivocal.eu> wrote:
>>> By that, do you mean "kill the process"? That's obviously not a
>>> sensible answer in general, especially given we were including
>>> processes which have no terminal or user sitting there watching them.
>>
>> Only in the sense that "kill" is the Unix term for "send signal".
>> Python catches the signal, the system call terminates with EINTR, and
>> the exception is raised. Give it a try.
>
> Signals are an arcane Unix communication method. I strongly recommend
> against using signals for anything but terminating a process, and even
> then you have to be extra careful.
>
> I have seen code that uses signals for runtime communication, but none
> of it was free from race conditions.

Strongly disagree. Signals exist so that they can be used. Sending
SIGHUP to a daemon to tell it to reload its configs is well-supported
by the ecosystem; use of SIGCHLD and SIGWINCH for non-termination
conditions is also vital. How else should an operating system or
desktop environment inform a process of something important?

Ctrl-C sends SIGINT meaning "interrupt". There are many, MANY
situations in which "interrupting" a process doesn't terminate it.
Here's one very simple example:

$ python3
Python 3.7.0a0 (heads/master:3913bad495, Jul 21 2017, 20:53:52)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> (
...
KeyboardInterrupt
>>>

You hit Ctrl-C in the middle of typing a multi-line command (maybe you
didn't expect it to be multi-line?), and the current input is
cancelled. Easy. No termination needed. And no race condition. I don't
know why you'd get those, but they're not hard to eliminate.

>>>>> Basically the only way to use requests safely is to fork an
>>>>> individual process for every request, which is of course
>>>>> spectacularly inefficient.
>>>>
>>>> Yes. Spectacularly inefficient and almost certainly unnecessary.
>
> Processes are a nice way to exercise multiple CPUs and also a way to
> deal with obnoxious synchronous function calls.
>
> However, you don't want to fork a process (or a thread, for that matter)
> for each context. Rather, you should have a pool of processes in the
> order of, say, 2 * CPU count, and the processes should fetch work from a
> queue.
>
> And if a process should get stuck, killing it is trivial.

Yeah, if you DO need to split them out, a pool is good. I didn't
bother mentioning it as a single process was enough for the scenario
in question (which definitely sounded like an I/O-bound app), but even
if you do fork, there's not a lot of point forking exactly one per
request.

Interestingly, the 2*CPUs figure isn't always optimal. I've messed
around with -j options on repeatable tasks, and sometimes the best
figure is lower than that, and other times it's insanely higher. On my
four-core-hyperthreading CPU, I've had times when 12 is right, I've
had times when 4 is right, and I even had one situation when I was
debating between -j64 and -j128 on a task that was theoretically
CPU-bound (ray-tracing). There are a lot of weird things happening
with caching and such, and the only way to truly know what's best is
to try it!

ChrisA

bream...@gmail.com

Aug 24, 2017, 2:27:33 PM
On Thursday, August 24, 2017 at 5:02:12 PM UTC+1, Chris Angelico wrote:
>
> (Caveat: I have no idea how this works on Windows. I do expect,
> though, that it will abort the connection without terminating the
> process, just like it does on Unix.)
>
> ChrisA

There was a big thread "cross platform alternative for signal.SIGALRM?" https://mail.python.org/pipermail/python-list/2015-November/698968.html which you might find interesting.

Kindest regards.

Mark Lawrence.

Marko Rauhamaa

Aug 24, 2017, 3:07:33 PM
Chris Angelico <ros...@gmail.com>:

> On Fri, Aug 25, 2017 at 3:40 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
>> Signals are an arcane Unix communication method. I strongly recommend
>> against using signals for anything but terminating a process, and even
>> then you have to be extra careful.
>>
>> I have seen code that uses signals for runtime communication, but none
>> of it was free from race conditions.
>
> Strongly disagree. Signals exist so that they can be used. Sending
> SIGHUP to a daemon to tell it to reload its configs is well-supported
> by the ecosystem;

The ancient SIGHUP reload antipattern is infamous:

/bin/kill -HUP $MAINPID

Note however that reloading a daemon by sending a signal (as with the
example line above) is usually not a good choice, because this is an
asynchronous operation and hence not suitable to order reloads of
multiple services against each other. It is strongly recommended to
set ExecReload= to a command that not only triggers a configuration
reload of the daemon, but also synchronously waits for it to
complete.

<URL: https://www.freedesktop.org/software/systemd/man/systemd.service.html>

The SIGHUP practice makes automation painful. I want to reconfigure but
can't be sure when the new configuration has taken effect.

> use of SIGCHLD and SIGWINCH for non-termination conditions is also
> vital. How else should an operating system or desktop environment
> inform a process of something important?

I never use SIGCHLD. Instead I leave a pipe open between the child and
the parent and notice an EOF on the pipe as the child exits. The pipe
(or socketpair) is handy for other IPC as well.

The signalfd mechanism in newer Linux kernels might make signals
borderline usable. However, code that relies on signals had better
meticulously call sigprocmask(2) and understand the precise points where
signals should be let through.

Another thing: this is a C programming issue, but functions like
fprintf() should never be used together with signals:

<URL: http://unix.derkeiler.com/Newsgroups/comp.unix.programmer/2006-05/msg00356.html>


Marko

Chris Angelico

Aug 24, 2017, 3:16:09 PM
On Fri, Aug 25, 2017 at 5:07 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> Chris Angelico <ros...@gmail.com>:
>
>> On Fri, Aug 25, 2017 at 3:40 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
>>> Signals are an arcane Unix communication method. I strongly recommend
>>> against using signals for anything but terminating a process, and even
>>> then you have to be extra careful.
>>>
>>> I have seen code that uses signals for runtime communication, but none
>>> of it was free from race conditions.
>>
>> Strongly disagree. Signals exist so that they can be used. Sending
>> SIGHUP to a daemon to tell it to reload its configs is well-supported
>> by the ecosystem;
>
> The ancient SIGHUP reload antipattern is infamous:
>
> /bin/kill -HUP $MAINPID
>
> Note however that reloading a daemon by sending a signal (as with the
> example line above) is usually not a good choice, because this is an
> asynchronous operation and hence not suitable to order reloads of
> multiple services against each other. It is strongly recommended to
> set ExecReload= to a command that not only triggers a configuration
> reload of the daemon, but also synchronously waits for it to
> complete.
>
> <URL: https://www.freedesktop.org/software/systemd/man/systemd.service.html>
>
> The SIGHUP practice makes automation painful. I want to reconfigure but
> can't be sure when the new configuration has taken effect.

And yet, despite you calling it an antipattern, it's still very well
supported. There are limitations to it - as you say, it's asynchronous
- but it is the default for many services, and SystemD supports it
completely.

ChrisA

Jon Ribbens

Aug 24, 2017, 8:47:17 PM
On 2017-08-24, Chris Angelico <ros...@gmail.com> wrote:
> On Fri, Aug 25, 2017 at 12:17 AM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
>> On 2017-08-24, Chris Angelico <ros...@gmail.com> wrote:
>>> On Thu, Aug 24, 2017 at 9:43 PM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
>>>> Where did you explain how it can be done without help? As far as I'm
>>>> aware, you can't close the socket without help since you can't get
>>>> access to it, and as you mentioned even if you were to do so the
>>>> effect it would have on requests is completely undefined.
>>>
>>> In a single-threaded program, just hit Ctrl-C.
>>
>> By that, do you mean "kill the process"? That's obviously not
>> a sensible answer in general, especially given we were including
>> processes which have no terminal or user sitting there watching
>> them.
>
> Only in the sense that "kill" is the Unix term for "send signal".
> Python catches the signal, the system call terminates with EINTR, and
> the exception is raised. Give it a try.

Give what a try? Pressing ctrl-c? That'll kill the process.
Obviously we all agree that killing the entire process will
terminate the request and all resources associated with it.

>>> Job done. No need to know ANYTHING about the internals of the
>>> requests module, beyond that it has correct handling of signals.
>>> Want that on a clock? SIGALRM.
>>
>> Doesn't work with threading.
>
> How many of your programs have threads in them?

Um, basically all of them?

> Did you not read my point where I said that the bulk of programs can
> use these simple techniques?

I'm not sure I did but even so your point would appear to be wrong.

>> Receiving a signal is the same as closing the socket? What?
>> (And as I already mentioned, you can't close the socket anyway.)
>
> Not as such, but try it and see what actually happens. The signal
> aborts the syscall; the exception causes the stack to be unwound. TRY
> IT. It works.

Try *what*?

>> Resources not released, subsequent operations failing, the library
>> possibly left in a state from which it cannot recover. This is
>> pretty obviously stuff that "is not working". Although even then
>> you still haven't explained how we can abort the operation (even
>> with all these side-effects) in the first place.
>
> Not released UNTIL NEXT QUERY. Everything is recoverable. TRY IT. It works.

Try what?

>> It would take many orders of magnitude fewer than a billion to break
>> the program. This is not a responsible or sensible way to write a
>> computer program - to deliberately let it leak stuff and invoke
>> undefined behaviour all over the place and hope that somehow it'll
>> muddle along regardless.
>
> Leaking until the next call? A billion.

I don't believe you about it only leaking "until the next call",
whatever that means.

>>>> Basically the only way to use requests safely is to fork an individual
>>>> process for every request, which is of course spectacularly inefficient.
>>>
>>> Yes. Spectacularly inefficient and almost certainly unnecessary.
>>
>> You haven't suggested any workable alternative so far.
>
> Have you tried any of what I said?

What have you actually suggested to try?

Chris Angelico

Aug 24, 2017, 8:54:57 PM
On Fri, Aug 25, 2017 at 10:42 AM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
> On 2017-08-24, Chris Angelico <ros...@gmail.com> wrote:
>> Only in the sense that "kill" is the Unix term for "send signal".
>> Python catches the signal, the system call terminates with EINTR, and
>> the exception is raised. Give it a try.
>
> Give what a try? Pressing ctrl-c? That'll kill the process.
> Obviously we all agree that killing the entire process will
> terminate the request and all resources associated with it.

>>> import requests
>>> requests.get("http://192.168.0.1/")
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 423, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/requests/packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.7/site-packages/requests/packages/urllib3/connectionpool.py", line 356, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1230, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1276, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1225, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1017, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 955, in send
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/requests/packages/urllib3/connection.py", line 166, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/requests/packages/urllib3/connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/usr/local/lib/python3.7/site-packages/requests/packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
KeyboardInterrupt
>>>

That looks like an exception to me. Not a "process is now terminated".
That's what happened when I pressed Ctrl-C (the IP address was
deliberately picked as one that doesn't currently exist on my network,
so it took time).

ChrisA

dieter

Aug 25, 2017, 2:52:50 AM
Jon Ribbens <jon+u...@unequivocal.eu> writes:

> On 2017-08-24, Chris Angelico <ros...@gmail.com> wrote:
>> On Thu, Aug 24, 2017 at 9:43 PM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
>>> Where did you explain how it can be done without help? As far as I'm
>>> aware, you can't close the socket without help since you can't get
>>> access to it, and as you mentioned even if you were to do so the
>>> effect it would have on requests is completely undefined.
>>
>> In a single-threaded program, just hit Ctrl-C.
>
> By that, do you mean "kill the process"? That's obviously not
> a sensible answer in general, especially given we were including
> processes which have no terminal or user sitting there watching
> them.

In Python 2, there is "PyThreadState_SetAsyncExc" (defined in "pystate.c"),
documented as follows:

Asynchronously raise an exception in a thread.
Requested by Just van Rossum and Alex Martelli.
To prevent naive misuse, you must write your own extension
to call this, or use ctypes. Must be called with the GIL held.
Returns the number of tstates modified (normally 1, but 0 if `id` didn't
match any known thread id). Can be called with exc=NULL to clear an
existing async exception. This raises no exceptions

int PyThreadState_SetAsyncExc(long id, PyObject *exc);

Together with a "read timeout", you can implement a total
timeout for your requests: you perform your request in a separate
thread; you set up a "connect/read timeout" (relatively small compared
to the total timeout; these timeouts ensure that the thread does not
stall inside the C runtime, where "PyThreadState_SetAsyncExc" has no
effect); you monitor the request runtime (maybe in a different thread)
and send it an exception when your global timeout is exceeded.


I do not know whether a similar API function is available for
Python 3 (but I suppose so).
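
(The function does still exist in CPython 3; from 3.7 the thread id
argument is unsigned long. A minimal ctypes sketch - with the caveat
above that the exception is only delivered once the target thread is
executing Python bytecode again:)

========================================================================
import ctypes

def async_raise(tid, exc_type):
    # Ask CPython to raise exc_type asynchronously in the thread whose
    # ident is tid (as given by threading.Thread.ident). CPython-only.
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(tid), ctypes.py_object(exc_type))
    if res == 0:
        raise ValueError("invalid thread id")
    if res > 1:
        # Should never happen; undo so multiple threads aren't affected.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_ulong(tid), None)
        raise SystemError("PyThreadState_SetAsyncExc affected %d threads"
                          % res)
========================================================================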

dieter

Aug 25, 2017, 3:06:50 AM
Chris Angelico <ros...@gmail.com> writes:
> ...
> That looks like an exception to me. Not a "process is now terminated".
> That's what happened when I pressed Ctrl-C (the IP address was
> deliberately picked as one that doesn't currently exist on my network,
> so it took time).

What Jon argues about: signals are delivered to Python's main thread;
if a thread is informed (e.g. via a signal-induced exception) that
a request (running in a different thread) should terminate, he needs
a way to make that different thread do so.


You may have argued before that in case of a signal, the request
fails anyway due to an EINTR exception from the IO library.

This may no longer work. Long ago, I was often plagued
by such EINTR exceptions, and I wished heavily that in those
cases the IO operation would be automatically resumed. In recent
times, I have not seen such exceptions - and I concluded that my wish
has been fulfilled (at least for many signal types).

Jon Ribbens

Aug 25, 2017, 11:52:01 AM
On 2017-08-25, Chris Angelico <ros...@gmail.com> wrote:
> That looks like an exception to me. Not a "process is now terminated".
> That's what happened when I pressed Ctrl-C (the IP address was
> deliberately picked as one that doesn't currently exist on my network,
> so it took time).

Ok yes, so ctrl-C is sending SIGINT which interrupts the system call
and is then caught as a Python exception, so this is very similar to
the SIGALRM idea you already suggested, in that it doesn't work with
threads, except it also relies on there being a person there to press
ctrl-C. So we still don't have any workable solution to the problem.

Jon Ribbens

Aug 25, 2017, 11:54:15 AM
On 2017-08-25, dieter <die...@handshake.de> wrote:
> This may no longer work. Long ago, I have often been plagued
> by such EINTR exceptions, and I have wished heavily that in those
> cases the IO operation should be automatically resumed. In recent time,
> I have no longer seen such exceptions - and I concluded that my wish
> has been fulfilled (at least for many signal types).

You are correct, this was addressed in PEP 475 which was implemented
in Python 3.5 and it was indeed a good thing.

Chris Angelico

Aug 25, 2017, 1:32:10 PM
The two complement each other. Want something on a specified clock?
SIGALRM. Want to handle that fuzzy notion of "it's been too long"? Let
the user hit Ctrl-C. They work basically the same way, from different
causes.

ChrisA

Jon Ribbens

Aug 25, 2017, 3:45:01 PM
Neither works with threads. Threads, neither of them work with.
With threads, neither of them works. Works, threads with, neither
of them does. Of them, working with threads, does neither. Threads!
Them work with! Does not!

Chris Angelico

Aug 25, 2017, 3:54:25 PM
So why are you using multiple threads? You never said that part.

ChrisA

Jon Ribbens

Aug 25, 2017, 4:21:37 PM
I said it in the majority of the posts I've made in this thread.
I said it in the post you were responding to just now. I'm using
threads. Now I've said it again.

Chris Angelico

Aug 25, 2017, 4:32:46 PM
You said WHY you are using multiple threads? I can't find it.

But if you're using threads, then you can use other techniques, like
reaching into the request and closing its socket. You get what you pay
for.

ChrisA

Jon Ribbens

Aug 25, 2017, 4:41:21 PM
On 2017-08-25, Chris Angelico <ros...@gmail.com> wrote:
> On Sat, Aug 26, 2017 at 6:16 AM, Jon Ribbens <jon+u...@unequivocal.eu> wrote:
>> I said it in the majority of the posts I've made in this thread.
>> I said it in the post you were responding to just now. I'm using
>> threads. Now I've said it again.
>
> You said WHY you are using multiple threads? I can't find it.

What's that got to do with anything? Because my program is doing
multiple things at once.

> But if you're using threads, then you can use other techniques, like
> reaching into the request and closing its socket.

There's no documented way of doing that that I'm aware of.

> You get what you pay for.

Are you saying people should expect free software to be rubbish
because it's free? If not, what are you saying?