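In the earlier post "The impact of async/await and the problems with asyncio", Nathaniel J. Smith (njs below), the author of Trio, gave the Python community a lot to think about. Now he is back with another piece well worth reading: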
Timeouts and cancellation for humans
https://vorpus.org/blog/timeouts-and-cancellation-for-humans/

In it, njs singles out another well-known figure in the Python community, Kenneth Reitz, the author of requests, for the poor design of the `timeout=` argument in the requests API, and says:
> I don't mean to pick on requests here – this problem is everywhere in Python APIs. I'm using requests as the example because Kenneth Reitz is famous for his obsession with making its API as obvious and intuitive as possible, and this is one of the rare places where he's failed. I think this is the only part of the requests API that gets a big box in the documentation warning you that it's counterintuitive. So like... if even Kenneth Reitz can't get this right, I think we can conclude that "just slap a timeout= argument on it" does not lead to APIs fit for human consumption.
He first tries replacing `timeout=` with `deadline=` to solve the problem:
> Absolute deadlines are composable (but kinda annoying to use)

Although `deadline=` is a somewhat better design than `timeout=`, it still does not solve the problem at its root.
> But this approach also has a downside: it succeeds in moving the annoying bit out of the library internals, and instead puts it on the person using the API. At the outermost level where timeout policy is being set, your library's users probably want to say something like "give up after 10 seconds", and if all you take is a `deadline=` argument then they have to do the conversion by hand every time. Or you could have every function take both `timeout=` and `deadline=` arguments, but then you need some boilerplate in every function to normalize them, raise an error if both are specified, and so forth. Deadlines are an improvement over raw timeouts, but it feels like there's still some missing abstraction here.
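
To make the difference concrete, here is a minimal sketch of why a relative timeout does not compose across several blocking steps while an absolute deadline does. Everything in it (`FakeClock`, `slow_step`, `naive`, `with_deadline`) is a hypothetical name invented for illustration, not part of requests or Trio, and a manual clock stands in for real time so the example runs instantly.

```python
class FakeClock:
    """Hypothetical manual clock, so no real sleeping is needed."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def slow_step(clock, cost, timeout):
    """Stand-in for one blocking call that needs `cost` seconds."""
    if cost > timeout:
        clock.advance(timeout)  # we waited the full timeout, then gave up
        raise TimeoutError(f"step exceeded {timeout:.1f}s")
    clock.advance(cost)


def naive(clock, costs, timeout):
    # The same relative timeout is handed to every step, so the total
    # running time can reach len(costs) * timeout.
    for cost in costs:
        slow_step(clock, cost, timeout)


def with_deadline(clock, costs, timeout):
    # Convert once to an absolute deadline, then convert back to a
    # relative timeout by hand before every step (the annoying bit).
    deadline = clock() + timeout
    for cost in costs:
        slow_step(clock, cost, max(0.0, deadline - clock()))


if __name__ == "__main__":
    clock = FakeClock()
    naive(clock, [4, 4, 4], timeout=5)
    print(clock.now)  # 12.0, although the caller asked for "5 seconds"

    clock = FakeClock()
    try:
        with_deadline(clock, [4, 4, 4], timeout=5)
    except TimeoutError:
        print(clock.now)  # 5.0, the overall budget is respected
```

The deadline version keeps the total bounded, but only because every layer remembers to do the `deadline - clock()` conversion, which is the boilerplate the quote above complains about.
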
After critiquing the various API designs found across the Python community, njs argues that the root cause of the problem is:
> Cancel tokens are unreliable in practice because humans are lazy

The cancellation idea itself draws its inspiration from the designs of C# and Go:
> As far as I know, it originally comes from Joe Duffy's cancellation tokens work in C#, and Go context objects are essentially the same idea.
> ...
> I don't mean to make fun. This stuff is hard. But C# and Go are huge projects maintained by teams of highly-skilled full-time developers and backed by Fortune 50 companies. If they can't get it right, who can? Not me. I'm one human trying to reinvent I/O in Python. I can't afford to make things that complicated.
> ...
> Remember way back at the beginning of this post, we noted that Python socket methods don't take individual timeout arguments, but instead let you set the timeout once on the socket so it's implicitly passed to every method you call? And in the section just above, we noticed that C# and Go do pretty much the same thing? I think they're on to something. Maybe we should accept that when you have some data that has to be passed through to every function you call, that's something the computer should handle, rather than making flaky humans do the work – but in a general way that supports complex abstractions, not just sockets.
That said, njs is quite happy with Python's `with` statement as the vehicle for this design:
> But since this post is about the underlying design, we'll focus on the primitive version. (Credit: the idea of using `with` blocks for timeouts is something I first saw in Dave Beazley's Curio, though I changed a bunch. I'll hide the details in a footnote: [4].)
> You should think of `with open_cancel_scope()` as creating a cancel token, but it doesn't actually expose any `CancelToken` object publically. Instead, the cancel token is pushed onto an invisible internal stack, and automatically applied to any blocking operations called inside the `with` block. So `requests` doesn't have to do anything to pass this through – when it eventually sends and receives data over the network, those primitive calls will automatically have the deadline applied.
> ...
> A way to delimit the boundaries of a cancel scope. Python's `with` blocks work great; other options would include dedicated syntax, or restricting cancel scopes to individual function calls like `with_timeout(10, some_fn, arg1, arg2)` (though this could force awkward factorings, and you'd need to figure out some way to expose the cancel scope object).
> ...
> If you're working in another language, I'd love to hear how the cancel scope idea adapts – if at all. For example, it'll definitely need some adjustment for languages that don't use exceptions, or that are missing the kind of user-extensible syntax that Python's `with` blocks provide.
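For the concurrent case (multithreading or other concurrency libraries, for example), njs once again pitches the nursery design from Trio, which is admittedly better than not handling the problem at all: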
> This system has many advantages, but the relevant one here is that it preserves the key assumptions that cancel scopes rely on. Any given nursery is either inside or outside the cancel scope – we can tell by checking whether the `with open_cancel_scope` block encloses the `async with open_nursery` block. And then it's straightforward to say that if a nursery is inside a cancel scope, then that scope should apply to all children in that nursery. This means that if we apply a timeout to a function, it can't "escape" by spawning a child task – the timeout applies to the child task too. (The exception is if you pass an outside nursery into the function, then it can spawn tasks into that nursery, which can escape the timeout. But then this is obvious to the caller, because they have to provide the nursery – the point is to make it clear what's going on, not to make it impossible to spawn background tasks.)
njs says he has already started trying to get requests running on top of Trio:
> Returning to our initial example: I've been doing some initial work on porting requests to run on Trio (you can help!), and so far it looks like the Trio version will not only handle timeouts better than the traditional synchronous version, but that it will be able to do this using zero lines of code – all the places where you'd want to check for cancellation are the ones where Trio does so automatically, and all the places where you need special care to handle the resulting exceptions are places where requests is prepared to handle arbitrary exceptions for other reasons.
njs suggests that requests might one day no longer have to apologize for its `timeout=` argument, haha:
> Our original motivating examples involved `requests`, an ordinary synchronous library. And pretty much everything above applies equally to synchronous or concurrent code. So I think it's interesting to explore the idea of using these in classic synchronous Python. Maybe we can fix `requests` so it doesn't have to apologize for its `timeout` argument!
Finally, njs explains one of the motivations for writing the post, which was an asyncio-related effort:
> One of the original motivations for this blog post was talking to Yury about whether we could retrofit any of Trio's improvements back into asyncio.
Bringing Trio's strengths into asyncio is genuinely hard, though, given how much historical baggage asyncio carries. Still, it is not entirely hopeless:
> Unfortunately asyncio's in a bit of a tricky position, because it's built on an architecture derived from the previous decade of experience with async I/O in Python... and then after that architecture was locked in, it added new syntax to Python that invalidated all that experience. But hopefully it's still possible to adapt some of these lessons – at least with some compromises.
Yury Selivanov, another heavyweight in the Python community, was clearly excited after this exchange with njs, and says he thinks he knows how to improve asyncio on this front.
https://twitter.com/1st1/status/951399181599559680

> A very insightful blog post by Nathaniel about timeouts and cancellations. Highly recommended to read. Btw, I think I know how to improve it in asyncio, will share more details later.
Other references:
* https://github.com/njsmith/deadline-scopes