[Python-Dev] PEP 567 -- Context Variables


Yury Selivanov

Dec 12, 2017, 12:35:31 PM
to Python-Dev
Hi,

This is a new proposal to implement context storage in Python.

It's a successor of PEP 550 and builds on some of its API ideas and
datastructures. Contrary to PEP 550 though, this proposal only focuses
on adding new APIs and implementing support for it in asyncio. There
are no changes to the interpreter or to the behaviour of generator or
coroutine objects.


PEP: 567
Title: Context Variables
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yu...@magic.io>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Dec-2017
Python-Version: 3.7
Post-History: 12-Dec-2017


Abstract
========

This PEP proposes the new ``contextvars`` module and a set of new
CPython C APIs to support context variables. This concept is
similar to thread-local variables but, unlike TLS, it allows
correctly keeping track of values per asynchronous task, e.g.
``asyncio.Task``.

This proposal builds directly upon concepts originally introduced
in :pep:`550`. The key difference is that this PEP is only concerned
with solving the case for asynchronous tasks, and not generators.
There are no proposed modifications to any built-in types or to the
interpreter.


Rationale
=========

Thread-local variables are insufficient for asynchronous tasks which
execute concurrently in the same OS thread. Any context manager that
needs to save and restore a context value and uses
``threading.local()``, will have its context values bleed to other
code unexpectedly when used in async/await code.
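
The bleed described above can be reproduced in a few lines. This is a minimal sketch, not part of the proposal: the names ``handle`` and ``seen`` are made up for illustration, and ``asyncio.run()`` assumes Python 3.7 or later.

```python
import asyncio
import threading

# A "per-request" value kept in thread-local storage.
_local = threading.local()
seen = {}


async def handle(user):
    _local.user = user        # store the value for "this" request
    await asyncio.sleep(0)    # yield; other tasks run on the same thread
    seen[user] = _local.user  # read it back after the suspension point


async def main():
    # Both tasks run concurrently in one OS thread.
    await asyncio.gather(handle("alice"), handle("bob"))


asyncio.run(main())
print(seen)
```

Both tasks share a single thread-local slot, so the first task reads back the value written by the second one after it resumes.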

A few examples where having a working context local storage for
asynchronous code is desired:

* Context managers like decimal contexts and ``numpy.errstate``.

* Request-related data, such as security tokens and request
  data in web applications, language context for ``gettext`` etc.

* Profiling, tracing, and logging in large code bases.


Introduction
============

The PEP proposes a new mechanism for managing context variables.
The key classes involved in this mechanism are ``contextvars.Context``
and ``contextvars.ContextVar``. The PEP also proposes some policies
for using the mechanism around asynchronous tasks.

The proposed mechanism for accessing context variables uses the
``ContextVar`` class. A module (such as decimal) that wishes to
store a context variable should:

* declare a module-global variable holding a ``ContextVar`` to
  serve as a "key";

* access the current value via the ``get()`` method on the
  key variable;

* modify the current value via the ``set()`` method on the
  key variable.

The notion of "current value" deserves special consideration:
different asynchronous tasks that exist and execute concurrently
may have different values. This idea is well-known from thread-local
storage but in this case the locality of the value is not always
necessarily to a thread. Instead, there is the notion of the
"current ``Context``" which is stored in thread-local storage, and
is accessed via ``contextvars.get_context()`` function.
Manipulation of the current ``Context`` is the responsibility of the
task framework, e.g. asyncio.

A ``Context`` is conceptually a mapping, implemented using an
immutable dictionary. The ``ContextVar.get()`` method does a
lookup in the current ``Context`` with ``self`` as a key, raising a
``LookupError`` or returning a default value specified in
the constructor.

The ``ContextVar.set(value)`` method clones the current ``Context``,
assigns the ``value`` to it with ``self`` as a key, and sets the
new ``Context`` as a new current. Because ``Context`` uses an
immutable dictionary, cloning it is O(1).
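
The lookup rules can be sketched as follows. The sketch assumes the ``contextvars`` module as it later shipped in Python 3.7, where ``ContextVar.get()`` also accepts a fallback argument; the variable names ``precision`` and ``locale`` are hypothetical.

```python
import contextvars

# Module-global "keys"; only the first one has a constructor default.
precision = contextvars.ContextVar("precision", default=28)
locale = contextvars.ContextVar("locale")

# No value set yet: the constructor default is returned.
initial_precision = precision.get()

# No value and no default: get() raises LookupError.
try:
    locale.get()
    locale_missing = False
except LookupError:
    locale_missing = True

# A fallback passed to get() takes precedence over raising.
fallback_locale = locale.get("en")

# set() changes the value seen by subsequent get() calls in this context.
precision.set(10)
updated_precision = precision.get()
```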


Specification
=============

A new standard library module ``contextvars`` is added with the
following APIs:

1. ``get_context() -> Context`` function is used to get the current
   ``Context`` object for the current OS thread.

2. ``ContextVar`` class to declare and access context variables.

3. ``Context`` class encapsulates context state.  Every OS thread
   stores a reference to its current ``Context`` instance.
   It is not possible to control that reference manually.
   Instead, the ``Context.run(callable, *args)`` method is used to run
   Python code in another context.


contextvars.ContextVar
----------------------

The ``ContextVar`` class has the following constructor signature:
``ContextVar(name, *, default=no_default)``. The ``name`` parameter
is used only for introspection and debug purposes. The ``default``
parameter is optional. Example::

    # Declare a context variable 'var' with the default value 42.
    var = ContextVar('var', default=42)

``ContextVar.get()`` returns the value of the context variable in the
current ``Context``::

    # Get the value of `var`.
    var.get()

``ContextVar.set(value) -> Token`` is used to set a new value for
the context variable in the current ``Context``::

    # Set the variable 'var' to 1 in the current context.
    var.set(1)

``contextvars.Token`` is an opaque object that should be used to
restore the ``ContextVar`` to its previous value, or remove it from
the context if it was not set before. The ``ContextVar.reset(Token)``
is used for that::

    old = var.set(1)
    try:
        ...
    finally:
        var.reset(old)
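
The set/reset pairing lends itself to a small context manager. The following is only a sketch: the helper name ``temporarily`` is hypothetical and not part of the proposed API, and the code assumes the ``contextvars`` module as shipped in Python 3.7.

```python
import contextlib
import contextvars

var = contextvars.ContextVar("var", default=0)


@contextlib.contextmanager
def temporarily(context_var, value):
    """Set `context_var` to `value` for the duration of a with-block."""
    token = context_var.set(value)  # set() returns a Token
    try:
        yield
    finally:
        context_var.reset(token)    # restore whatever was there before


with temporarily(var, 1):
    inside = var.get()
outside = var.get()
```

Because ``reset()`` takes the token rather than a value, the same helper also works when the variable had no value before the ``with`` block.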

The ``Token`` API exists to make the current proposal forward
compatible with :pep:`550`, in case there is demand to support
context variables in generators and asynchronous generators in the
future.

``ContextVar`` design allows for a fast implementation of
``ContextVar.get()``, which is particularly important for modules
like ``decimal`` and ``numpy``.


contextvars.Context
-------------------

``Context`` objects are mappings of ``ContextVar`` to values.

To get the current ``Context`` for the current OS thread, use the
``contextvars.get_context()`` function::

    ctx = contextvars.get_context()

To run Python code in some ``Context``, use the ``Context.run()``
method::

    ctx.run(function)

Any changes to any context variables that ``function`` causes, will
be contained in the ``ctx`` context::

    var = ContextVar('var')
    var.set('spam')

    def function():
        assert var.get() == 'spam'

        var.set('ham')
        assert var.get() == 'ham'

    ctx = get_context()
    ctx.run(function)

    assert var.get('spam')

Any changes to the context will be contained and persisted in the
``Context`` object on which ``run()`` is called on.
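
A runnable sketch of this containment, under one assumption worth stating loudly: in the module that eventually shipped in Python 3.7 the snapshot function is named ``copy_context()``, not the ``get_context()`` of this draft, and the example uses the shipped name.

```python
import contextvars

var = contextvars.ContextVar("var")
var.set("spam")


def function():
    before = var.get()   # value inherited from the snapshot
    var.set("ham")       # change is made inside 'ctx' only
    return before, var.get()


# Shipped name; this draft calls the function get_context().
ctx = contextvars.copy_context()
result = ctx.run(function)

value_in_ctx = ctx[var]       # the change lives in 'ctx'
value_in_caller = var.get()   # the caller's context is untouched
```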

``Context`` objects implement the ``collections.abc.Mapping`` ABC.
This can be used to introspect context objects::

    ctx = contextvars.get_context()

    # Print all context variables and their values in 'ctx':
    print(ctx.items())

    # Print the value of 'some_variable' in context 'ctx':
    print(ctx[some_variable])
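
A slightly fuller introspection sketch, again using the shipped ``copy_context()`` name as an assumption in place of the draft's ``get_context()``. The variables ``request_id`` and ``user`` are hypothetical.

```python
import contextvars

request_id = contextvars.ContextVar("request_id")
user = contextvars.ContextVar("user")
request_id.set(17)
user.set("alice")

ctx = contextvars.copy_context()

# Context implements the Mapping ABC: keys are ContextVar objects.
snapshot = {variable.name: value for variable, value in ctx.items()}
has_user = user in ctx
count = len(ctx)
```

Only variables that have been set appear in the mapping; a variable that merely has a constructor default is not listed.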


asyncio
-------

``asyncio`` uses ``Loop.call_soon()``, ``Loop.call_later()``,
and ``Loop.call_at()`` to schedule the asynchronous execution of a
function. ``asyncio.Task`` uses ``call_soon()`` to run the
wrapped coroutine.

We modify ``Loop.call_{at,later,soon}`` to accept the new
optional *context* keyword-only argument, which defaults to
the current context::

    def call_soon(self, callback, *args, context=None):
        if context is None:
            context = contextvars.get_context()

        # ... some time later
        context.run(callback, *args)

Tasks in asyncio need to maintain their own isolated context.
``asyncio.Task`` is modified as follows::

    class Task:
        def __init__(self, coro):
            ...
            # Get the current context snapshot.
            self._context = contextvars.get_context()
            self._loop.call_soon(self._step, context=self._context)

        def _step(self, exc=None):
            ...
            # Every advance of the wrapped coroutine is done in
            # the task's context.
            self._loop.call_soon(self._step, context=self._context)
            ...
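
The user-visible effect of these changes can be sketched as follows. This assumes asyncio as shipped in Python 3.7 or later; ``worker`` and ``results`` are illustrative names only.

```python
import asyncio
import contextvars

request_id = contextvars.ContextVar("request_id", default=None)
results = {}


async def worker(name, value):
    request_id.set(value)     # affects only this task's context
    await asyncio.sleep(0)    # let the other task run in between
    results[name] = request_id.get()


async def main():
    # gather() wraps each coroutine in a Task with its own context.
    await asyncio.gather(worker("a", 1), worker("b", 2))
    results["main"] = request_id.get()  # unchanged by the workers


asyncio.run(main())
```

Each task reads back its own value after the suspension point, and the parent coroutine still sees the default.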


CPython C API
-------------

TBD


Implementation
==============

This section explains high-level implementation details in
pseudo-code. Some optimizations are omitted to keep this section
short and clear.

The internal immutable dictionary for ``Context`` is implemented
using Hash Array Mapped Tries (HAMT). They allow for O(log N) ``set``
operation, and for O(1) ``get_context()`` function. For the purposes
of this section, we implement an immutable dictionary using
``dict.copy()``::

    class _ContextData:

        def __init__(self):
            self.__mapping = dict()

        def get(self, key):
            return self.__mapping[key]

        def set(self, key, value):
            copy = _ContextData()
            copy.__mapping = self.__mapping.copy()
            copy.__mapping[key] = value
            return copy

        def delete(self, key):
            copy = _ContextData()
            copy.__mapping = self.__mapping.copy()
            del copy.__mapping[key]
            return copy
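
The copy-on-write behaviour is easy to check with a standalone variant of the class above. This is an illustrative sketch with a hypothetical name (``ImmutableMapping``), not the reference implementation, which uses a HAMT instead of ``dict`` copies.

```python
class ImmutableMapping:
    """Copy-on-write mapping: every update returns a new object."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def get(self, key):
        return self._data[key]

    def set(self, key, value):
        new = dict(self._data)   # O(N) copy; a HAMT avoids this cost
        new[key] = value
        return ImmutableMapping(new)

    def delete(self, key):
        new = dict(self._data)
        del new[key]
        return ImmutableMapping(new)


empty = ImmutableMapping()
one = empty.set("x", 1)
two = one.set("x", 2)
removed = two.delete("x")
```

Older objects are never modified, which is what allows a context snapshot to be shared safely between tasks.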

Every OS thread has a reference to the current ``_ContextData``.
``PyThreadState`` is updated with a new ``context_data`` field that
points to a ``_ContextData`` object::

    PyThreadState:
        context_data : _ContextData

``contextvars.get_context()`` is implemented as follows::

    def get_context():
        ts : PyThreadState = PyThreadState_Get()

        if ts.context_data is None:
            ts.context_data = _ContextData()

        ctx = Context()
        ctx.__data = ts.context_data
        return ctx

``contextvars.Context`` is a wrapper around ``_ContextData``::

    class Context(collections.abc.Mapping):

        def __init__(self):
            self.__data = _ContextData()

        def run(self, callable, *args):
            ts : PyThreadState = PyThreadState_Get()
            saved_data : _ContextData = ts.context_data

            try:
                ts.context_data = self.__data
                return callable(*args)
            finally:
                self.__data = ts.context_data
                ts.context_data = saved_data

        # Mapping API methods are implemented by delegating
        # `get()` and other Mapping calls to `self.__data`.

``contextvars.ContextVar`` interacts with
``PyThreadState.context_data`` directly::

    class ContextVar:

        def __init__(self, name, *, default=NO_DEFAULT):
            self.__name = name
            self.__default = default

        @property
        def name(self):
            return self.__name

        def get(self, default=NO_DEFAULT):
            ts : PyThreadState = PyThreadState_Get()
            data : _ContextData = ts.context_data

            try:
                return data.get(self)
            except KeyError:
                pass

            if default is not NO_DEFAULT:
                return default

            if self.__default is not NO_DEFAULT:
                return self.__default

            raise LookupError

        def set(self, value):
            ts : PyThreadState = PyThreadState_Get()
            data : _ContextData = ts.context_data

            try:
                old_value = data.get(self)
            except KeyError:
                old_value = NO_VALUE

            ts.context_data = data.set(self, value)
            return Token(self, old_value)

        def reset(self, token):
            if token.__used:
                return

            ts : PyThreadState = PyThreadState_Get()
            data : _ContextData = ts.context_data

            if token.__old_value is NO_VALUE:
                ts.context_data = data.delete(token.__var)
            else:
                ts.context_data = data.set(token.__var,
                                           token.__old_value)

            token.__used = True


    class Token:

        def __init__(self, var, old_value):
            self.__var = var
            self.__old_value = old_value
            self.__used = False


Backwards Compatibility
=======================

This proposal preserves 100% backwards compatibility.

Libraries that use ``threading.local()`` to store context-related
values, currently work correctly only for synchronous code. Switching
them to use the proposed API will keep their behavior for synchronous
code unmodified, but will automatically enable support for
asynchronous code.
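
For synchronous, multi-threaded code the behaviour matches what ``threading.local()`` users expect. A minimal sketch, assuming the ``contextvars`` module as shipped in Python 3.7 and using illustrative names (``worker``, ``seen``):

```python
import contextvars
import threading

var = contextvars.ContextVar("var", default="unset")
var.set("main")
seen = {}


def worker(name):
    var.set(name)           # visible only in this thread's context
    seen[name] = var.get()


threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

main_value = var.get()      # not affected by the worker threads
```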


Appendix: HAMT Performance Analysis
===================================

.. figure:: pep-0550-hamt_vs_dict-v2.png
   :align: center
   :width: 100%

   Figure 1.  Benchmark code can be found here: [1]_.

The above chart demonstrates that:

* HAMT displays near O(1) performance for all benchmarked
  dictionary sizes.

* ``dict.copy()`` becomes very slow around 100 items.

.. figure:: pep-0550-lookup_hamt.png
   :align: center
   :width: 100%

   Figure 2.  Benchmark code can be found here: [2]_.

Figure 2 compares the lookup costs of ``dict`` versus a HAMT-based
immutable mapping. HAMT lookup time is 30-40% slower than Python dict
lookups on average, which is a very good result, considering that the
latter is very well optimized.

The reference implementation of HAMT for CPython can be found here:
[3]_.


References
==========

.. [1] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd

.. [2] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e

.. [3] https://github.com/1st1/cpython/tree/hamt


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

_______________________________________________
Python-Dev mailing list
Pytho...@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/dev-python%2Bgarchive-30976%40googlegroups.com

Victor Stinner

Dec 12, 2017, 6:52:19 PM
to Yury Selivanov, Python-Dev
Hi Yury,

I like the overall idea and I prefer this PEP over PEP 550 since it's
shorter and easier to read :-)

Question: Is there an API to list all context variables?

Would it be possible to have a very short summary of the changes in the PEP?
I propose:

"""
* Added contextvars module with ContextVar, Context and Token classes,
  and a get_context() function
* asyncio: Added keyword-only context parameter to call_at(),
  call_later(), call_soon() methods of event loops and
  Future.add_done_callback(); Tasks are modified internally to maintain
  their own isolated context.
"""

Each get_context() call returns a new Context object. It may be worth
mentioning that. I understand why, but it's surprising that "assert
get_context() is get_context()" fails. Maybe it's a naming issue?
Maybe rename it to contextvars.context()?


> Abstract: ... This concept is similar to thread-local variables but, unlike TLS, ...

nitpick: please write "Thread Local Storage (TLS)". When I read TLS, I
understand HTTPS (Transport Layer Security) :-)

Your PEP seems to be written for asyncio. Maybe it would help to
understand it to make it more explicit in the abstract... even if I
understand perfectly that it's not strictly specific to asyncio ;-)


> # Declare a context variable 'var' with the default value 42.
> var = ContextVar('var', default=42)

nitpick: I suggest using 'name' rather than 'var', to make it obvious
that the first parameter is the variable name.

> ``contextvars.Token`` is an opaque object that should be used to
> restore the ``ContextVar`` to its previous value, or remove it from
> the context if it was not set before. The ``ContextVar.reset(Token)``
> is used for that::
>
>     old = var.set(1)
>     try:
>         ...
>     finally:
>         var.reset(old)

I don't see where the token is in this example. Does set() return a
token object? Yes, according to the ContextVar pseudo-code below.

When I read "old", I understand that set() returns the old value, not
an opaque token. Maybe rename "old" to "token"?


> The ``Token`` API exists to make the current proposal forward
> compatible with :pep:`550`, in case there is demand to support
> context variables in generators and asynchronous generators in the
> future.

Cool. I like the idea of starting with something simple in Python 3.7.
Then extend it in Python 3.8 or later (support generators), if it
becomes popular, once the first simple (but "incomplete", without
generators) implementation is battle-tested.


> Any changes to any context variables that ``function`` causes, will
> be contained in the ``ctx`` context::
>
>     var = ContextVar('var')
>     var.set('spam')
>
>     def function():
>         assert var.get() == 'spam'
>
>         var.set('ham')
>         assert var.get() == 'ham'
>
>     ctx = get_context()
>     ctx.run(function)
>
>     assert var.get('spam')

Should I read assert var.get() == 'spam' here?

On first read, I understood that ctx.run() creates a new
temporary context which is removed once ctx.run() returns.

Now I understand that context variable values are restored to their
previous values once run() completes. Am I right?

Maybe add a short comment to explain that?

# Call function() in the context ctx
# and then restores context variables of ctx to their previous values
ctx.run(function)


> Backwards Compatibility
> =======================
>
> This proposal preserves 100% backwards compatibility.

Ok.

> Libraries that use ``threading.local()`` to store context-related
> values, currently work correctly only for synchronous code. Switching
> them to use the proposed API will keep their behavior for synchronous
> code unmodified, but will automatically enable support for
> asynchronous code.

I'm confused by this sentence. I suggest to remove it :-)

Converting code to contextvars makes it immediately backward
incompatible, so I'm not sure that it's a good idea to suggest it in
this section.

Victor

Yury Selivanov

Dec 12, 2017, 8:36:56 PM
to Victor Stinner, Python-Dev
Hi Victor,

On Tue, Dec 12, 2017 at 6:49 PM, Victor Stinner
<victor....@gmail.com> wrote:
> Hi Yury,
>
> I like the overall idea and I prefer this PEP over PEP 550 since it's
> shorter and easier to read :-)
>
> Question: Is there an API to list all context variables?

Context implements abc.Mapping, so 'get_context().keys()' will give
you a list of all ContextVars in the current context.

>
> Would it be possible to have a very short summary of the changes in the PEP?
> I propose:
>
> """
> * Added contextvars module with ContextVar, Context and Token classes,
>   and a get_context() function
> * asyncio: Added keyword-only context parameter to call_at(),
>   call_later(), call_soon() methods of event loops and
>   Future.add_done_callback(); Tasks are modified internally to maintain
>   their own isolated context.
> """

Added.

>
> Each get_context() call returns a new Context object. It may be worth
> mentioning that. I understand why, but it's surprising that "assert
> get_context() is get_context()" fails. Maybe it's a naming issue?
> Maybe rename it to contextvars.context()?

I think the name is fine. While get_context() will return a new instance
every time you call it, those instances will have the same context
variables/values in them, so I don't think it's a problem.
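
To make that concrete, here is a small sketch. It uses the ``copy_context()`` name under which this function eventually shipped in Python 3.7 (an assumption relative to this draft, which calls it ``get_context()``), and it compares contents through plain dicts because nothing about ``Context.__eq__`` is specified here.

```python
import contextvars

var = contextvars.ContextVar("var")
var.set("spam")

first = contextvars.copy_context()
second = contextvars.copy_context()

# Two calls give two distinct Context objects...
distinct = first is not second

# ...that nevertheless hold the same variables and values.
same_contents = dict(first.items()) == dict(second.items())
```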

>
>
>> Abstract: ... This concept is similar to thread-local variables but, unlike TLS, ...
>
> nitpick: please write "Thread Local Storage (TLS)". When I read TLS, I
> understand HTTPS (Transport Layer Security) :-)

Fixed.

[..]
>> ``contextvars.Token`` is an opaque object that should be used to
>> restore the ``ContextVar`` to its previous value, or remove it from
>> the context if it was not set before. The ``ContextVar.reset(Token)``
>> is used for that::
>>
>>     old = var.set(1)
>>     try:
>>         ...
>>     finally:
>>         var.reset(old)
>
> I don't see where is the token in this example. Does set() return a
> token object? Yes according to ContextVar pseudo-code below.
>
> When I read "old", I understand that set() returns the old value, not
> an opaque token. Maybe rename "old" to "token"?

Fixed.

>
>
>> The ``Token`` API exists to make the current proposal forward
>> compatible with :pep:`550`, in case there is demand to support
>> context variables in generators and asynchronous generators in the
>> future.
>
> Cool. I like the idea of starting with something simple in Python 3.7.
> Then extend it in Python 3.8 or later (support generators), if it
> becomes popular, once the first simple (but "incomplete", without
> generators) implementation is battle-tested.
>
>
>> Any changes to any context variables that ``function`` causes, will
>> be contained in the ``ctx`` context::
>>
>>     var = ContextVar('var')
>>     var.set('spam')
>>
>>     def function():
>>         assert var.get() == 'spam'
>>
>>         var.set('ham')
>>         assert var.get() == 'ham'
>>
>>     ctx = get_context()
>>     ctx.run(function)
>>
>>     assert var.get('spam')
>
> Should I read assert var.get() == 'spam' here?

Yes, fixed.

>
> On first read, I understood that ctx.run() creates a new
> temporary context which is removed once ctx.run() returns.
>
> Now I understand that context variable values are restored to their
> previous values once run() completes. Am I right?

ctx.run(func) runs 'func' in the 'ctx' context. Any changes to
ContextVars that func makes will stay isolated to the 'ctx' context.

>
> Maybe add a short comment to explain that?

Added.

>
> # Call function() in the context ctx
> # and then restores context variables of ctx to their previous values
> ctx.run(function)
>
>
>> Backwards Compatibility
>> =======================
>>
>> This proposal preserves 100% backwards compatibility.
>
> Ok.
>
>> Libraries that use ``threading.local()`` to store context-related
>> values, currently work correctly only for synchronous code. Switching
>> them to use the proposed API will keep their behavior for synchronous
>> code unmodified, but will automatically enable support for
>> asynchronous code.
>
> I'm confused by this sentence. I suggest to remove it :-)
>
> Converting code to contextvars makes it immediately backward
> incompatible, so I'm not sure that it's a good idea to suggest it in
> this section.

If we update decimal to use ContextVars internally, decimal will stay
100% backwards compatible.

Yury

Guido van Rossum

Dec 12, 2017, 9:58:13 PM
to Yury Selivanov, Python-Dev
On Tue, Dec 12, 2017 at 5:35 PM, Yury Selivanov <yseliv...@gmail.com> wrote:
> On Tue, Dec 12, 2017 at 6:49 PM, Victor Stinner
> <victor....@gmail.com> wrote:
>> I like the overall idea and I prefer this PEP over PEP 550 since it's
>> shorter and easier to read :-)
>>
>> Question: Is there an API to list all context variables?
>
> Context implements abc.Mapping, so 'get_context().keys()' will give
> you a list of all ContextVars in the current context.

This was hinted at in the PEP, but maybe an explicit example would be nice.

>> Each get_context() call returns a new Context object. It may be worth
>> mentioning that. I understand why, but it's surprising that "assert
>> get_context() is get_context()" fails. Maybe it's a naming issue?
>> Maybe rename it to contextvars.context()?
>
> I think the name is fine.  While get_context() will return a new instance
> every time you call it, those instances will have the same context
> variables/values in them, so I don't think it's a problem.

I'm fine with this, but perhaps == should be supported so that those two are guaranteed to be considered equal? (Otherwise an awkward idiom to compare contexts using expensive dict() copies would be needed to properly compare two contexts for equality.)

>> On first read, I understood that ctx.run() creates a new
>> temporary context which is removed once ctx.run() returns.
>>
>> Now I understand that context variable values are restored to their
>> previous values once run() completes. Am I right?
>
> ctx.run(func) runs 'func' in the 'ctx' context.  Any changes to
> ContextVars that func makes will stay isolated to the 'ctx' context.
>
>> Maybe add a short comment to explain that?
>
> Added.

The PEP still contains the following paragraph:

> Any changes to the context will be contained and persisted in the
> ``Context`` object on which ``run()`` is called on.

This phrase is confusing; it could be read as implying that context changes made by the function *will* get propagated back to the caller of run(), contradicting what was said earlier. Maybe it's best to just delete it? Otherwise if you intend it to add something it needs to be rephrased. Maybe "persisted" is the key word causing confusion?

--
--Guido van Rossum (python.org/~guido)

Yury Selivanov

Dec 12, 2017, 10:14:39 PM
to Guido van Rossum, Python-Dev
On Tue, Dec 12, 2017 at 9:55 PM, Guido van Rossum <gu...@python.org> wrote:
> On Tue, Dec 12, 2017 at 5:35 PM, Yury Selivanov <yseliv...@gmail.com>
> wrote:
>>
>> On Tue, Dec 12, 2017 at 6:49 PM, Victor Stinner
>> <victor....@gmail.com> wrote:
>> > I like the overall idea and I prefer this PEP over PEP 550 since it's
>> > shorter and easier to read :-)
>> >
>> > Question: Is there an API to list all context variables?
>>
>> Context implements abc.Mapping, so 'get_context().keys()' will give
>> you a list of all ContextVars in the current context.
>
>
> This was hinted at in the PEP, but maybe an explicit example would be nice.

Sure.

>
>>
>> > Each get_context() call returns a new Context object. It may be worth
>> > mentioning that. I understand why, but it's surprising that "assert
>> > get_context() is get_context()" fails. Maybe it's a naming issue?
>> > Maybe rename it to contextvars.context()?
>>
>> I think the name is fine. While get_context() will return a new instance
>> every time you call it, those instances will have the same context
>> variables/values in them, so I don't think it's a problem.
>
>
> I'm fine with this, but perhaps == should be supported so that those two are
> guaranteed to be considered equal? (Otherwise an awkward idiom to compare
> contexts using expensive dict() copies would be needed to properly compare
> two contexts for equality.)

I've no problem with implementing 'Context.__eq__'. I think
abc.Mapping also implements it.

>
>>
>> > On first read, I understood that ctx.run() creates a new
>> > temporary context which is removed once ctx.run() returns.
>> >
>> > Now I understand that context variable values are restored to their
>> > previous values once run() completes. Am I right?
>>
>> ctx.run(func) runs 'func' in the 'ctx' context. Any changes to
>> ContextVars that func makes will stay isolated to the 'ctx' context.
>>
>> >
>> > Maybe add a short comment to explain that?
>>
>> Added.
>
>
> The PEP still contains the following paragraph:
>
>> Any changes to the context will be contained and persisted in the
>> ``Context`` object on which ``run()`` is called on.
>
> This phrase is confusing; it could be read as implying that context changes
> made by the function *will* get propagated back to the caller of run(),
> contradicting what was said earlier. Maybe it's best to just delete it?
> Otherwise if you intend it to add something it needs to be rephrased. Maybe
> "persisted" is the key word causing confusion?

I'll remove "persisted" now, I agree it adds more confusion than
clarity. Victor is also confused with how 'Context.run()' is
currently explained, I'll try to make it clearer.

Thank you,

Guido van Rossum

Dec 12, 2017, 10:39:14 PM
to Yury Selivanov, Python-Dev
Some more feedback:


> This proposal builds directly upon concepts originally introduced
> in :pep:`550`. 

The phrase "builds upon" typically implies that the other resource must be read and understood first. I don't think that we should require PEP 550 for understanding of PEP 567. Maybe "This proposal is a simplified version of :pep:`550`." ?


> The notion of "current value" deserves special consideration:
> different asynchronous tasks that exist and execute concurrently
> may have different values.  This idea is well-known from thread-local
> storage but in this case the locality of the value is not always
> necessarily to a thread.  Instead, there is the notion of the
> "current ``Context``" which is stored in thread-local storage, and
> is accessed via ``contextvars.get_context()`` function.
> Manipulation of the current ``Context`` is the responsibility of the
> task framework, e.g. asyncio.

This begs two (related) questions:
- If it's stored in TLS, why isn't it equivalent to TLS?
- If it's read-only (as mentioned in the next paragraph) how can the framework modify it?

I realize the answers are clear, but at this point in the exposition you haven't given the reader enough information to answer them, so this paragraph may confuse readers.

> Specification
> =============
> [points 1, 2, 3]

Shouldn't this also list Token? (It must be a class defined here so users can declare the type of variables/arguments in their code representing these tokens.)

> The ``ContextVar`` class has the following constructor signature:
> ``ContextVar(name, *, default=no_default)``.

I think a word or two about the provenance of `no_default` would be good. (I think it's an internal singleton right?) Ditto for NO_DEFAULT in the C implementation sketch.

>     class Task:
>         def __init__(self, coro):

Do we need a keyword arg 'context=None' here too? (I'm not sure what would be the use case, but somehow it stands out in comparison to call_later() etc.)

> CPython C API
> -------------
> TBD

Yeah, what about it? :-)

> The internal immutable dictionary for ``Context`` is implemented
> using Hash Array Mapped Tries (HAMT).  They allow for O(log N) ``set``
> operation, and for O(1) ``get_context()`` function.  [...]

I wonder if we can keep the HAMT out of the discussion at this point. I have nothing against it, but given that you already say you're leaving out optimizations and nothing in the pseudo code given here depends on them I wonder if they shouldn't be mentioned later. (Also the appendix with the perf analysis is the one thing that I think we can safely leave out, just reference PEP 550 for this.)

> class _ContextData

Since this isn't a real class anyway I think the __mapping attribute might as well be named _mapping. Ditto for other __variables later.

Yury Selivanov

Dec 12, 2017, 11:22:33 PM
to Guido van Rossum, Python-Dev
On Tue, Dec 12, 2017 at 10:36 PM, Guido van Rossum <gu...@python.org> wrote:
> Some more feedback:
>
>> This proposal builds directly upon concepts originally introduced
>> in :pep:`550`.
>
> The phrase "builds upon" typically implies that the other resource must be
> read and understood first. I don't think that we should require PEP 550 for
> understanding of PEP 567. Maybe "This proposal is a simplified version of
> :pep:`550`." ?

I agree, "simplified version" is better.

>
>> The notion of "current value" deserves special consideration:
>> different asynchronous tasks that exist and execute concurrently
>> may have different values. This idea is well-known from thread-local
>> storage but in this case the locality of the value is not always
>> necessarily to a thread. Instead, there is the notion of the
>> "current ``Context``" which is stored in thread-local storage, and
>> is accessed via ``contextvars.get_context()`` function.
>> Manipulation of the current ``Context`` is the responsibility of the
>> task framework, e.g. asyncio.
>
> This begs two (related) questions:
> - If it's stored in TLS, why isn't it equivalent to TLS?
> - If it's read-only (as mentioned in the next paragraph) how can the
> framework modify it?
>
> I realize the answers are clear, but at this point in the exposition you
> haven't given the reader enough information to answer them, so this
> paragraph may confuse readers.

I'll think how to rephrase it.

>
>> Specification
>> =============
>> [points 1, 2, 3]
>
> Shouldn't this also list Token? (It must be a class defined here so users
> can declare the type of variables/arguments in their code representing these
> tokens.)
>
>> The ``ContextVar`` class has the following constructor signature:
>> ``ContextVar(name, *, default=no_default)``.
>
> I think a word or two about the provenance of `no_default` would be good. (I
> think it's an internal singleton right?) Ditto for NO_DEFAULT in the C
> implementation sketch.

Fixed.

>
>>     class Task:
>>         def __init__(self, coro):
>
> Do we need a keyword arg 'context=None' here too? (I'm not sure what would
> be the use case, but somehow it stands out in comparison to call_later()
> etc.)

call_later() is low-level and it needs the 'context' argument as Task
and Future use it in their implementation.

It would be easy to add 'context' parameter to Task and
loop.create_task(), but I don't know about any concrete use-case for
that just yet.

>
>> CPython C API
>> -------------
>> TBD
>
> Yeah, what about it? :-)

I've added it: https://github.com/python/peps/pull/508/files

I didn't want to get into too much detail about the C API until I have
a working PR. Although I feel that the one I describe in the PEP now
is very close to what we'll have.

>
>> The internal immutable dictionary for ``Context`` is implemented
>> using Hash Array Mapped Tries (HAMT). They allow for O(log N) ``set``
>> operation, and for O(1) ``get_context()`` function. [...]
>
> I wonder if we can keep the HAMT out of the discussion at this point. I have
> nothing against it, but given that you already say you're leaving out
> optimizations and nothing in the pseudo code given here depends on them I
> wonder if they shouldn't be mentioned later. (Also the appendix with the
> perf analysis is the one thing that I think we can safely leave out, just
> reference PEP 550 for this.)

I've added a new section "Implementation Notes" that mentions HAMT and
ContextVar.get() cache. Both refer to PEP 550's lengthy explanations.

>
>> class _ContextData
>
> Since this isn't a real class anyway I think the __mapping attribute might
> as well be named _mapping. Ditto for other __variables later.

Done.

Dima Tisnek

Dec 13, 2017, 1:41:51 AM12/13/17
to Yury Selivanov, Python-Dev
My 2c:
TL;DR PEP specifies implementation in some detail, but doesn't show
how proposed change can or should be used.



get()/set(value)/delete() methods: Python provides syntax sugar for
these, let's use it.
(dict: d["k"]/d["k"] = value/del d["k"]; attrs: obj.k/obj.k = value/del
obj.k; inheriting threading.local)



This PEP and 550 describe why TLS is inadequate, but don't seem to
specify how proposed context behaves in async world. I'd be most
interested in how it appears to work to the user of the new library.

Consider a case of asynchronous cache:

async def actual_lookup(name):
    ...

def cached_lookup(name, cache={}):
    if name not in cache:
        cache[name] = shield(ensure_future(actual_lookup(name)))
    return cache[name]

Unrelated (or related) asynchronous processes end up waiting on the same future:

async def called_with_user_context():
    ...
    await cached_lookup(...)
    ...

Which context is propagated to actual_lookup()?
The PEP doesn't seem to state that clearly.
It appears to be first caller's context.
Is it a copy or a reference?
If first caller is cancelled, the context remains alive.



token is fragile, I believe PEP should propose a working context
manager instead.
Btw., isn't a token really a reference to
state-of-context-before-it's-cloned-and-modified?

Nathaniel Smith

Dec 13, 2017, 5:26:02 AM12/13/17
to Dima Tisnek, Python-Dev
On Tue, Dec 12, 2017 at 10:39 PM, Dima Tisnek <dim...@gmail.com> wrote:
> My 2c:
> TL;DR PEP specifies implementation in some detail, but doesn't show
> how proposed change can or should be used.
>
>
>
> get()/set(value)/delete() methods: Python provides syntax sugar for
> these, let's use it.
> (dict: d["k"]/d["k"] = value/del d["k"]; attrs: obj.k/obj.k = value/del
> obj.k; inheriting threading.local)

This was already discussed to death in the PEP 550 threads... what
most users want is a single value, and routing get/set through a
ContextVar object allows for important optimizations and a simpler
implementation. Also, remember that 99% of users will never use these
objects directly; it's a low-level API mostly useful to framework
implementers.

> This PEP and 550 describe why TLS is inadequate, but don't seem to
> specify how proposed context behaves in async world. I'd be most
> interested in how it appears to work to the user of the new library.
>
> Consider a case of asynchronous cache:
>
> async def actual_lookup(name):
>     ...
>
> def cached_lookup(name, cache={}):
>     if name not in cache:
>         cache[name] = shield(ensure_future(actual_lookup(name)))
>     return cache[name]
>
> Unrelated (or related) asynchronous processes end up waiting on the same future:
>
> async def called_with_user_context():
>     ...
>     await cached_lookup(...)
>     ...
>
> Which context is propagated to actual_lookup()?
> The PEP doesn't seem to state that clearly.
> It appears to be first caller's context.

Yes.

> Is it a copy or a reference?

It's a copy, as returned by get_context().
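
To make the cached-lookup case concrete, here is a minimal sketch (not from the original message). It assumes the API as it eventually shipped in Python 3.7, where a task snapshots the current context when it is created; the ``user`` variable and the ``caller()`` helper are hypothetical names added for illustration.

```python
import asyncio
import contextvars

# Hypothetical context variable standing in for "user context".
user = contextvars.ContextVar("user", default="nobody")

async def actual_lookup(name):
    # Runs inside the task created by the first caller, so it sees the
    # snapshot of that caller's context taken when the task was created.
    await asyncio.sleep(0)
    return f"{name} looked up for {user.get()}"

def cached_lookup(name, cache={}):
    if name not in cache:
        cache[name] = asyncio.shield(asyncio.ensure_future(actual_lookup(name)))
    return cache[name]

async def caller(who):
    user.set(who)  # only affects this caller's own task context
    return await cached_lookup("example.org")

async def main():
    first = asyncio.ensure_future(caller("alice"))
    second = asyncio.ensure_future(caller("bob"))
    return await asyncio.gather(first, second)

results = asyncio.run(main())
print(results)
```

Both callers receive the result computed under the first caller's context, because the second caller only awaits the already-created future.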

> If first caller is cancelled, the context remains alive.
>
>
>
> token is fragile, I believe PEP should propose a working context
> manager instead.
> Btw., isn't a token really a reference to
> state-of-context-before-it's-cloned-and-modified?

No, a Token only represents the value of one ContextVar, not the whole
Context. This could maybe be clearer in the PEP, but it has to be this
way or you'd get weird behavior from code like:

with decimal.localcontext(...):  # sets and then restores
    numpy.seterr(...)  # sets without any plan to restore
# after the 'with' block, the decimal ContextVar gets restored
# but this shouldn't affect the numpy.seterr ContextVar
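
The same point can be shown as a runnable sketch with plain context variables. The names ``precision`` and ``errmode`` are hypothetical stand-ins for whatever variables decimal and numpy would use internally; nothing here reflects their actual implementations.

```python
import contextvars

# Hypothetical stand-ins for the library-internal variables.
precision = contextvars.ContextVar("precision", default=28)
errmode = contextvars.ContextVar("errmode", default="warn")

token = precision.set(10)       # set with a plan to restore
try:
    errmode.set("raise")        # set without any plan to restore
finally:
    precision.reset(token)      # restores only 'precision'

after = (precision.get(), errmode.get())
print(after)
```

Resetting the token undoes the one ``set()`` call that produced it and leaves every other variable in the context alone.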

-n

--
Nathaniel J. Smith -- https://vorpus.org

Victor Stinner

Dec 13, 2017, 5:50:27 AM12/13/17
to Dima Tisnek, Python-Dev
Hi Dima,

2017-12-13 7:39 GMT+01:00 Dima Tisnek <dim...@gmail.com>:
> get()/set(value)/delete() methods: Python provides syntax sugar for
> these, let's use it.
> (dict: d["k"]/d["k"] = value/del d["k"]; attrs: obj.k/obj.k = value/del
> obj.k; inheriting threading.local)

I was trapped by Context being described as "a mapping". Usually,
when I read "mapping", I associate it with a mutable dictionary. But in
fact Context is a *read-only* mapping. Yury changed the Introduction
to add "read-only", but not the Context section:
https://www.python.org/dev/peps/pep-0567/#contextvars-context

Only a single ContextVar variable can be modified. This object is a
container for a *single* value, not a mapping: you cannot write "var =
value", you have to write "var.set(value)", and "var['key'] = value"
doesn't make sense.
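
A short sketch of that distinction, assuming the API as released in Python 3.7, where the draft's ``get_context()`` became ``contextvars.copy_context()``:

```python
import contextvars

var = contextvars.ContextVar("var")
var.set("spam")

ctx = contextvars.copy_context()    # snapshot of the current context

value_in_snapshot = ctx[var]        # reading through the mapping works
error = None
try:
    ctx[var] = "ham"                # Context does not support item assignment
except TypeError as exc:
    error = type(exc).__name__

var.set("ham")                      # the only way to change a value
print(value_in_snapshot, error, var.get(), ctx[var])
```

The snapshot keeps the old value after the later ``var.set("ham")``, which is what "read-only mapping" means in practice.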


> This PEP and 550 describe why TLS is inadequate, but don't seem to
> specify how proposed context behaves in async world. I'd be most
> interested in how it appears to work to the user of the new library.

In short, context is inherited automatically, so you have nothing to do
:-) Put anything you want into a context, and it will transparently
follow your asynchronous code.

The answer is in the sentence: "Tasks in asyncio need to maintain
their own context that they inherit from the point they were created
at. "

You may want to use a task context to pass data from an HTTP request:
user name, cookie, IP address, etc. If you save data into the "current
context", in practice, the context is inherited by tasks and
callbacks, and so even if your code is made of multiple tasks, you
still "inherit" the context as expected.

Only tasks have to manually "save/restore" the context, since only
tasks use "await" in their code, not callbacks called by call_soon()
and the like.
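
As an illustration of that inheritance, here is a minimal sketch assuming the behaviour that shipped in Python 3.7. The ``request_id`` variable and the handler names are hypothetical.

```python
import asyncio
import contextvars

# Hypothetical per-request variable.
request_id = contextvars.ContextVar("request_id", default=None)

async def child():
    seen = request_id.get()         # inherited from the creating task
    request_id.set("child-only")    # stays inside the child's own context
    return seen

async def handle_request(rid):
    request_id.set(rid)
    seen_by_child = await asyncio.ensure_future(child())
    # The child's set() did not leak back into this task.
    return seen_by_child, request_id.get()

async def main():
    return await asyncio.gather(handle_request("r1"), handle_request("r2"))

results = asyncio.run(main())
print(results)
```

Each request task sees its own value, the child task starts with a copy of it, and changes made in the child stay there.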


> token is fragile, I believe PEP should propose a working context
> manager instead.

Why is it fragile? In asyncio, you cannot use a context manager
because of the design of tasks.

Victor

Eric Snow

Dec 13, 2017, 4:03:52 PM12/13/17
to Yury Selivanov, Python-Dev
Overall, I like this PEP. It's definitely easier to follow
conceptually than PEP 550. Thanks for taking the time to re-think the
idea. I have a few comments in-line below.

-eric

On Tue, Dec 12, 2017 at 10:33 AM, Yury Selivanov
<yseliv...@gmail.com> wrote:
> This is a new proposal to implement context storage in Python.

+1

This is something I've had on my back burner for years. Getting this
right is non-trivial, so having a stdlib implementation will help open
up clean solutions in a number of use cases that are currently
addressed in more error-prone ways.

>
> It's a successor of PEP 550 and builds on some of its API ideas and
> datastructures. Contrary to PEP 550 though, this proposal only focuses
> on adding new APIs and implementing support for it in asyncio. There
> are no changes to the interpreter or to the behaviour of generator or
> coroutine objects.

Do you have any plans to revisit extension of the concept to
generators and coroutine objects? I agree they can be addressed
separately, if necessary. TBH, I'd expect this PEP to provide an
approach that allows such applications of the concept to effectively
be implementation details that can be supported later.

> Abstract
> ========
>
> This PEP proposes the new ``contextvars`` module and a set of new
> CPython C APIs to support context variables. This concept is
> similar to thread-local variables but, unlike TLS, it allows

s/it allows/it also allows/

> correctly keeping track of values per asynchronous task, e.g.
> ``asyncio.Task``.
>
> [snip]
>
> Rationale
> =========
>
> Thread-local variables are insufficient for asynchronous tasks which
> execute concurrently in the same OS thread. Any context manager that
> needs to save and restore a context value and uses
> ``threading.local()``, will have its context values bleed to other
> code unexpectedly when used in async/await code.

FWIW, I'd consider the concept to extend to all execution contexts in
the interpreter, of which threads and async/await are the only kinds
we have currently. That said, I don't see us adding any new kinds of
execution context so what you've said is entirely satisfactory. :)

>
> [snip]
>
> Introduction
> ============
>
> [snip]
>
> Specification
> =============
>
> A new standard library module ``contextvars`` is added

Why not add this to contextlib instead of adding a new module? IIRC
this was discussed relative to PEP 550, but I don't remember the
reason. Regardless, it would be worth mentioning somewhere in the
PEP.

> with the
> following APIs:
>
> 1. ``get_context() -> Context`` function is used to get the current
> ``Context`` object for the current OS thread.
>
> 2. ``ContextVar`` class to declare and access context variables.

It may be worth explaining somewhere in the PEP the reason why you've
chosen to add ContextVar instead of adding a new keyword (e.g.
"context", a la global and nonlocal) to do roughly the same thing.
Consider that execution contexts are very much a language-level
concept, a close sibling to scope. Driving that via a keyword would be a
reasonable approach, particularly since it introduces less coupling
between a language-level feature and a stdlib module. (Making it a
builtin would sort of help with that too, but a keyword would seem
like a better fit.) A keyword would obviate the need for explicitly
calling .get() and .set().

FWIW, I agree with not adding a new keyword. To me context variables
are a low-level tool for library authors to implement their high-level
APIs. ContextVar, with its explicit .get() and .set() methods is a
good fit for that and better communicates the conceptual intent of the
feature. However, it would still be worth explicitly mentioning the
alternate keyword-based approach in the PEP.

>
> 3. ``Context`` class encapsulates context state. Every OS thread
> stores a reference to its current ``Context`` instance.
> It is not possible to control that reference manually.
> Instead, the ``Context.run(callable, *args)`` method is used to run
> Python code in another context.

I'd call that "Context.call()" since it's for callables. Did you have
a specific reason for calling it "run" instead?

>
>

FWIW, I think there are some helpers you could add that library
authors would appreciate. However, they aren't critical so I'll hold
off and maybe post about them later. :)

> contextvars.ContextVar
> ----------------------
>
> The ``ContextVar`` class has the following constructor signature:
> ``ContextVar(name, *, default=no_default)``. The ``name`` parameter
> is used only for introspection and debug purposes.

It doesn't need to be required then, right?

> [snip]
>
> ``ContextVar.set(value) -> Token`` is used to set a new value for
> the context variable in the current ``Context``::
>
>     # Set the variable 'var' to 1 in the current context.
>     var.set(1)
>
> ``contextvars.Token`` is an opaque object that should be used to
> restore the ``ContextVar`` to its previous value, or remove it from
> the context if it was not set before. The ``ContextVar.reset(Token)``
> is used for that::
>
>     old = var.set(1)
>     try:
>         ...
>     finally:
>         var.reset(old)
>
> The ``Token`` API exists to make the current proposal forward
> compatible with :pep:`550`, in case there is demand to support
> context variables in generators and asynchronous generators in the
> future.

The "restoring values" focus is valuable on its own. It emphasizes a
specific usage pattern to users (though a context manager would
achieve the same). The token + reset() approach means that users
don't need to think about "not set" when restoring values. That said,
is there otherwise any value to the "not set" concept? If so,
"is_set()" (not strictly necessary) and "unset()" methods may be
warranted.

Also, there's a strong context manager vibe here. Some sort of
context manager support would be nice. However, with the token coming
out of .set() and with no alternative (e.g. "get_token()"), I'm not
sure what an intuitive CM interface would be here.

> [snip]
>
> contextvars.Context
> -------------------
>
> [snip]
>
> Any changes to any context variables that ``function`` causes, will
> be contained in the ``ctx`` context::
>
>     var = ContextVar('var')
>     var.set('spam')
>
>     def function():
>         assert var.get() == 'spam'
>
>         var.set('ham')
>         assert var.get() == 'ham'
>
>     ctx = get_context()
>     ctx.run(function)
>
>     assert var.get('spam')

Shouldn't this be "assert var.get() == 'spam'"?

>
> Any changes to the context will be contained and persisted in the
> ``Context`` object on which ``run()`` is called.

For me this would be more clear if it could be spelled like this:

with ctx:
    function()

Also, let's say I want to run a function under a custom context,
whether a fresh one or an adaptation of an existing one. How can I
compose such a Context? AFAICS, the only way to modify a context is
by using ContextVar.set() (and reset()), which modifies the current
context. It might be useful if there were a more direct way, like a
"Context.add(*var) -> Context" and "Context.remove(*var) -> Context"
and maybe even a "Context.set(var, value) -> Context" and
"Context.unset(var) -> Context".

Eric Snow

Dec 13, 2017, 4:05:54 PM12/13/17
to Victor Stinner, Python-Dev
On Tue, Dec 12, 2017 at 4:49 PM, Victor Stinner
<victor....@gmail.com> wrote:
>> The ``Token`` API exists to make the current proposal forward
>> compatible with :pep:`550`, in case there is demand to support
>> context variables in generators and asynchronous generators in the
>> future.
>
> Cool. I like the idea of starting with something simple in Python 3.7.
> Then extend it in Python 3.8 or later (support generators), if it
> becomes popular, once the first simple (but "incomplete", without
> generators) implementation is battle-tested.

+1 for starting with a basic API and building on that.

-eric

Yury Selivanov

Dec 13, 2017, 4:38:11 PM12/13/17
to Eric Snow, Python-Dev
Hi Eric,

Thanks for a detailed review!

On Wed, Dec 13, 2017 at 3:59 PM, Eric Snow <ericsnow...@gmail.com> wrote:
> Overall, I like this PEP. It's definitely easier to follow
> conceptually than PEP 550. Thanks for taking the time to re-think the
> idea. I have a few comments in-line below.
>
> -eric
>
> On Tue, Dec 12, 2017 at 10:33 AM, Yury Selivanov
> <yseliv...@gmail.com> wrote:
>> This is a new proposal to implement context storage in Python.
>
> +1
>
> This is something I've had on my back burner for years. Getting this
> right is non-trivial, so having a stdlib implementation will help open
> up clean solutions in a number of use cases that are currently
> addressed in more error-prone ways.

Right!

>
>>
>> It's a successor of PEP 550 and builds on some of its API ideas and
>> datastructures. Contrary to PEP 550 though, this proposal only focuses
>> on adding new APIs and implementing support for it in asyncio. There
>> are no changes to the interpreter or to the behaviour of generator or
>> coroutine objects.
>
> Do you have any plans to revisit extension of the concept to
> generators and coroutine objects? I agree they can be addressed
> separately, if necessary. TBH, I'd expect this PEP to provide an
> approach that allows such applications of the concept to effectively
> be implementation details that can be supported later.

Maybe we'll extend the concept to work for generators in Python 3.8,
but that's a pretty remote topic to discuss (and we'll need a new PEP
for that). In case we decide to do that, PEP 550 provides a good
implementation plan, and PEP 567 is forward-compatible with it.

>
>> Abstract
>> ========
>>
>> This PEP proposes the new ``contextvars`` module and a set of new
>> CPython C APIs to support context variables. This concept is
>> similar to thread-local variables but, unlike TLS, it allows
>
> s/it allows/it also allows/

Will fix it.

[..]
>> A new standard library module ``contextvars`` is added
>
> Why not add this to contextlib instead of adding a new module? IIRC
> this was discussed relative to PEP 550, but I don't remember the
> reason. Regardless, it would be worth mentioning somewhere in the
> PEP.
>

The mechanism is generic and isn't directly related to context
managers. Context managers can (and in many cases should) use the new
APIs to store global state, but the contextvars APIs do not depend on
context managers or require them.

I also feel that contextlib is a big module already, so having the new
APIs in their separate module and having a separate documentation page
makes it more approachable.

>> with the
>> following APIs:
>>
>> 1. ``get_context() -> Context`` function is used to get the current
>> ``Context`` object for the current OS thread.
>>
>> 2. ``ContextVar`` class to declare and access context variables.
>
> It may be worth explaining somewhere in the PEP the reason why you've
> chosen to add ContextVar instead of adding a new keyword (e.g.
> "context", a la global and nonlocal) to do roughly the same thing.
> Consider that execution contexts are very much a language-level
> concept, a close sibling to scope. Driving that via a keyword would be a
> reasonable approach, particularly since it introduces less coupling
> between a language-level feature and a stdlib module. (Making it a
> builtin would sort of help with that too, but a keyword would seem
> like a better fit.) A keyword would obviate the need for explicitly
> calling .get() and .set().
>
> FWIW, I agree with not adding a new keyword. To me context variables
> are a low-level tool for library authors to implement their high-level
> APIs. ContextVar, with its explicit .get() and .set() methods is a
> good fit for that and better communicates the conceptual intent of the
> feature. However, it would still be worth explicitly mentioning the
> alternate keyword-based approach in the PEP.

Yeah, adding keywords is way harder than adding a new module. It
would require a change in Grammar, new opcodes, changes to frameobject
etc. I also don't think that ContextVars will be popular enough to warrant
their own syntax -- how many threadlocals do you see every day?

For PEP 567/550 a keyword isn't really needed, we can implement the
concept with a ContextVar class.

>>
>> 3. ``Context`` class encapsulates context state. Every OS thread
>> stores a reference to its current ``Context`` instance.
>> It is not possible to control that reference manually.
>> Instead, the ``Context.run(callable, *args)`` method is used to run
>> Python code in another context.
>
> I'd call that "Context.call()" since it's for callables. Did you have
> a specific reason for calling it "run" instead?

We have a bunch of run() methods in asyncio, and as I'm actively
working on its codebase I might be biased here, but ".run()" reads
better for me personally than ".call()".


> FWIW, I think there are some helpers you could add that library
> authors would appreciate. However, they aren't critical so I'll hold
> off and maybe post about them later. :)

My goal with this PEP is to keep the API to its bare minimum, but if
you have some ideas please share!

>
>> contextvars.ContextVar
>> ----------------------
>>
>> The ``ContextVar`` class has the following constructor signature:
>> ``ContextVar(name, *, default=no_default)``. The ``name`` parameter
>> is used only for introspection and debug purposes.
>
> It doesn't need to be required then, right?

If it's not required then people won't use it. And then when you want
to introspect the context, you'll see a bunch of anonymous variables.

So as with namedtuple(), I think there's no harm in requiring the name
parameter.
"unset()" would be incompatible with PEP 550, which has a chained
execution context model. When you have a chain of contexts, unset()
becomes ambiguous.

"is_set()" is trivially implemented via "var.get(default=marker) is
marker", but I don't think we need to add this method
now.
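
A sketch of that helper follows. One assumption to flag: in the ``contextvars`` module as released, the default is passed to ``get()`` positionally, so the sketch uses ``var.get(marker)`` instead of the keyword form written above. The helper name ``is_set`` and the variables are illustrative only.

```python
import contextvars

_MISSING = object()   # private sentinel, never stored in any variable

def is_set(var):
    """Return True if 'var' has an explicit value in the current context."""
    return var.get(_MISSING) is not _MISSING

plain = contextvars.ContextVar("plain")
with_default = contextvars.ContextVar("with_default", default=0)

before = is_set(plain)
plain.set(None)                      # None is still an explicit value
after = is_set(plain)
default_only = is_set(with_default)  # a constructor default is not "set"
print(before, after, default_only)
```

The argument passed to ``get()`` takes precedence over the variable's own default, which is why a variable that only has a constructor default still reports as not set.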

[..]
>> Any changes to the context will be contained and persisted in the
> ``Context`` object on which ``run()`` is called.
>
> For me this would be more clear if it could be spelled like this:
>
> with ctx:
>     function()

But we would still need "run()" to use in asyncio. Context managers
are slower than a single method call.

Also, context management like this is a *very* low-level API intended
to be used by framework/library authors in very few places. Again,
I'd really prefer to keep the API to the minimum in 3.7.

> Also, let's say I want to run a function under a custom context,
> whether a fresh one or an adaptation of an existing one. How can I
> compose such a Context? AFAICS, the only way to modify a context is
> by using ContextVar.set() (and reset()), which modifies the current
> context. It might be useful if there were a more direct way, like a
> "Context.add(*var) -> Context" and "Context.remove(*var) -> Context"
> and maybe even a "Context.set(var, value) -> Context" and
> "Context.unset(var) -> Context".

Again this would be a shortcut for a very limited number of use-cases.
I just can't come up with a good real-world example where you want to
add many context variables to the context and run something in it.
But even if you want that, you can always just wrap your function:

def set_and_call(var, val, func):
    var.set(val)
    return func()

context.run(set_and_call, var, val, func)
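
A self-contained version of the wrapper, assuming the released Python 3.7 API (``contextvars.copy_context()`` in place of the draft's ``get_context()``). The variable and its default are made up for the example.

```python
import contextvars

var = contextvars.ContextVar("var", default="unset")

def set_and_call(var, val, func):
    var.set(val)          # modifies only the context we are running in
    return func()

context = contextvars.copy_context()
inside = context.run(set_and_call, var, "spam", var.get)
outside = var.get()       # the caller's context is untouched
persisted = context[var]  # the change is kept in 'context'
print(inside, outside, persisted)
```

The value set inside ``run()`` persists in the ``context`` object and can be observed on a later ``context.run()`` call, while the surrounding code still sees the default.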

Yury

Ben Darnell

Dec 17, 2017, 11:54:32 AM12/17/17
to Yury Selivanov, Python-Dev
On Tue, Dec 12, 2017 at 12:34 PM Yury Selivanov <yseliv...@gmail.com> wrote:
Hi,

This is a new proposal to implement context storage in Python.

It's a successor of PEP 550 and builds on some of its API ideas and
datastructures.  Contrary to PEP 550 though, this proposal only focuses
on adding new APIs and implementing support for it in asyncio.  There
are no changes to the interpreter or to the behaviour of generator or
coroutine objects.

I like this proposal. Tornado has a more general implementation of a similar idea (https://github.com/tornadoweb/tornado/blob/branch4.5/tornado/stack_context.py), but it also tried to solve the problem of exception handling of callback-based code so it had a significant performance cost (to interpose try/except blocks all over the place). Limiting the interface to coroutine-local variables should keep the performance impact minimal.

If the contextvars package were published on pypi (and backported to older pythons), I'd deprecate Tornado's stack_context and use it instead (even if there's not an official backport, I'll probably move towards whatever interface is defined in this PEP if it is accepted).

One caveat based on Tornado's experience with stack_context: There are times when the automatic propagation of contexts won't do the right thing (for example, a database client with a connection pool may end up hanging on to the context from the request that created the connection instead of picking up a new context for each query). Compatibility with this feature will require testing and possible fixes with many libraries in the asyncio ecosystem before it can be relied upon. 

-Ben
 

Yury Selivanov

Dec 17, 2017, 3:04:39 PM12/17/17
to Ben Darnell, Python-Dev
Hi Ben,

On Sun, Dec 17, 2017 at 10:38 AM, Ben Darnell <b...@bendarnell.com> wrote:
> On Tue, Dec 12, 2017 at 12:34 PM Yury Selivanov <yseliv...@gmail.com>
> wrote:
>>
>> Hi,
>>
>> This is a new proposal to implement context storage in Python.
>>
>> It's a successor of PEP 550 and builds on some of its API ideas and
>> datastructures. Contrary to PEP 550 though, this proposal only focuses
>> on adding new APIs and implementing support for it in asyncio. There
>> are no changes to the interpreter or to the behaviour of generator or
>> coroutine objects.
>
>
> I like this proposal. Tornado has a more general implementation of a similar
> idea
> (https://github.com/tornadoweb/tornado/blob/branch4.5/tornado/stack_context.py),
> but it also tried to solve the problem of exception handling of
> callback-based code so it had a significant performance cost (to interpose
> try/except blocks all over the place). Limiting the interface to
> coroutine-local variables should keep the performance impact minimal.

Thank you, Ben!

Yes, task local API of PEP 567 has no impact on generators/coroutines.
Impact on asyncio should be well within 1-2% slowdown, visible only in
microbenchmarks (and asyncio will be 3-6% faster in 3.7 at least due
to some asyncio.Task C optimizations).

[..]
> One caveat based on Tornado's experience with stack_context: There are times
> when the automatic propagation of contexts won't do the right thing (for
> example, a database client with a connection pool may end up hanging on to
> the context from the request that created the connection instead of picking
> up a new context for each query).

I can see two scenarios that could lead to that:

1. The connection pool explicitly captures the context with 'get_context()' at
the point where it was created. It later schedules all of its code within the
captured context with Context.run().

2. The connection pool calls ContextVar.get() once and _caches_ it.

Both (1) and (2) are anti-patterns. The documentation of the asyncio and
contextvars modules will explain that users are supposed to simply call
ContextVar.get() whenever they need to get a context value (i.e. there's
no need to cache/persist it) and that they should not manage the context
manually (just trust asyncio to do that for you).
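
Scenario (2) can be sketched as follows. The pool classes and the ``current_user`` variable are hypothetical, and ``copy_context().run()`` is used only to simulate a separate request without bringing in asyncio.

```python
import contextvars

# Hypothetical per-request variable.
current_user = contextvars.ContextVar("current_user", default="anonymous")

class CachingPool:
    def __init__(self):
        # Anti-pattern: reads the variable once and keeps the result.
        self._user = current_user.get()

    def query(self, sql):
        return f"{self._user}: {sql}"

class Pool:
    def query(self, sql):
        # Preferred: look the value up at the point of use.
        return f"{current_user.get()}: {sql}"

def handle(pool, user):
    current_user.set(user)
    return pool.query("SELECT 1")

current_user.set("alice")
caching, fresh = CachingPool(), Pool()

# Simulate a later request from another user in its own context.
stale = contextvars.copy_context().run(handle, caching, "bob")
correct = contextvars.copy_context().run(handle, fresh, "bob")
print(stale, "|", correct)
```

The caching pool keeps reporting the user who happened to be current when the pool was created, while the second pool picks up each request's value.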

Thank you,
Yury

Ivan Levkivskyi

Dec 18, 2017, 6:02:08 PM12/18/17
to Yury Selivanov, Python-Dev
On 13 December 2017 at 22:35, Yury Selivanov <yseliv...@gmail.com> wrote:
[..]
>> A new standard library module ``contextvars`` is added
>
> Why not add this to contextlib instead of adding a new module?  IIRC
> this was discussed relative to PEP 550, but I don't remember the
> reason.  Regardless, it would be worth mentioning somewhere in the
> PEP.
>

The mechanism is generic and isn't directly related to context
managers.  Context managers can (and in many cases should) use the new
APIs to store global state, but the contextvars APIs do not depend on
context managers or require them.


This was the main point of confusion for me when reading the PEP.
Concept of TLS is independent of context managers, but using word "context"
everywhere leads to doubts like "Am I getting everything right?" I think just adding the
two quoted sentences will clarify the intent.

Otherwise the PEP is easy to read, the proposed API looks simple, and this
definitely will be a useful feature.

--
Ivan


Yury Selivanov

Dec 18, 2017, 8:39:28 PM12/18/17
to Ben Darnell, Python-Dev
> 3. The connection pool has a queue, and creates a task for each connection to serve requests from that queue. Naively, each task could inherit the context of the request that caused it to be created, but the task would outlive the request and go on to serve other requests. The connection pool would need to specifically suppress the caller's context when creating its worker tasks.

I haven't used this pattern myself, but it looks like a good case for
adding a keyword-only 'context' argument to `loop.create_task()`. This
way the pool can capture the context when some API method is called
and pass it down to the queue along with the request. The queue task
can then run connection code in that context.
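
One possible shape for this, sketched under assumptions: the worker task is started inside an empty ``contextvars.Context()`` so that it does not inherit any caller's context, and each request travels through the queue together with a snapshot of the submitting task's context. (Python 3.11 and later also accept a keyword-only ``context`` argument on ``loop.create_task()``; the sketch avoids it so that it runs on 3.7+.) All names here are hypothetical.

```python
import asyncio
import contextvars

# Hypothetical per-request variable.
request_id = contextvars.ContextVar("request_id", default="none")

def run_query(sql):
    # Stand-in for connection code that reads the request's context.
    return f"[{request_id.get()}] {sql}"

async def worker(queue):
    while True:
        ctx, sql, reply = await queue.get()
        # Run the connection code in the context captured by the caller.
        reply.set_result(ctx.run(run_query, sql))

async def submit(queue, rid, sql):
    request_id.set(rid)
    reply = asyncio.get_running_loop().create_future()
    await queue.put((contextvars.copy_context(), sql, reply))
    return await reply

async def main():
    queue = asyncio.Queue()
    # Create the worker task from within an empty context so it does not
    # hang on to the context of whichever request triggered its creation.
    task = contextvars.Context().run(asyncio.ensure_future, worker(queue))
    try:
        return await asyncio.gather(
            submit(queue, "r1", "SELECT 1"),
            submit(queue, "r2", "SELECT 2"),
        )
    finally:
        task.cancel()

results = asyncio.run(main())
print(results)
```

The long-lived worker serves both requests, yet each query runs with the context of the request that submitted it.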

Yury Selivanov

Dec 18, 2017, 8:42:19 PM12/18/17
to Ivan Levkivskyi, Python-Dev
On Mon, Dec 18, 2017 at 6:00 PM, Ivan Levkivskyi <levki...@gmail.com> wrote:
> On 13 December 2017 at 22:35, Yury Selivanov <yseliv...@gmail.com>
> wrote:
>>
>> [..]
>> >> A new standard library module ``contextvars`` is added
>> >
>> > Why not add this to contextlib instead of adding a new module? IIRC
>> > this was discussed relative to PEP 550, but I don't remember the
>> > reason. Regardless, it would be worth mentioning somewhere in the
>> > PEP.
>> >
>>
>> The mechanism is generic and isn't directly related to context
>> managers. Context managers can (and in many cases should) use the new
>> APIs to store global state, but the contextvars APIs do not depend on
>> context managers or require them.
>>
>
> This was the main point of confusion for me when reading the PEP.
> Concept of TLS is independent of context managers, but using word "context"
> everywhere leads to doubts like "Am I getting everything right?" I think
> just adding the
> two quoted sentences will clarify the intent.

I'll try to clarify this in the Abstract section.

>
> Otherwise the PEP is easy to read, the proposed API looks simple, and this
> definitely will be a useful feature.

Thanks, Ivan!

Ben Darnell

Dec 18, 2017, 8:54:43 PM12/18/17
to Yury Selivanov, Python-Dev
On Sun, Dec 17, 2017 at 2:49 PM Yury Selivanov <yseliv...@gmail.com> wrote:
> One caveat based on Tornado's experience with stack_context: There are times
> when the automatic propagation of contexts won't do the right thing (for
> example, a database client with a connection pool may end up hanging on to
> the context from the request that created the connection instead of picking
> up a new context for each query).

I can see two scenarios that could lead to that:

1. The connection pool explicitly captures the context with 'get_context()' at
the point where it was created. It later schedules all of its code within the
captured context with Context.run().

2. The connection pool calls ContextVar.get() once and _caches_ it.



3. The connection pool has a queue, and creates a task for each connection to serve requests from that queue. Naively, each task could inherit the context of the request that caused it to be created, but the task would outlive the request and go on to serve other requests. The connection pool would need to specifically suppress the caller's context when creating its worker tasks. 
 
The situation was more complicated for Tornado since we were trying to support callback-based workflows as well. Limiting this to coroutines closes off a lot of the potential issues - most of the specific examples I can think of would not be possible in a coroutine-only world. 

-Ben

Ben Darnell

Dec 18, 2017, 10:41:45 PM12/18/17
to Yury Selivanov, Python-Dev
On Mon, Dec 18, 2017 at 8:37 PM Yury Selivanov <yseliv...@gmail.com> wrote:
> 3. The connection pool has a queue, and creates a task for each connection to serve requests from that queue. Naively, each task could inherit the context of the request that caused it to be created, but the task would outlive the request and go on to serve other requests. The connection pool would need to specifically suppress the caller's context when creating its worker tasks.

I haven't used this pattern myself, but it looks like a good case for
adding a keyword-only 'context' argument to `loop.create_task()`.  This
way the pool can capture the context when some API method is called
and pass it down to the queue along with the request.  The queue task
can then run connection code in that context.


Yes, that would be useful.

-Ben 