
[Python-ideas] Proposal: Use mypy syntax for function annotations


Guido van Rossum

Aug 13, 2014, 3:45:51 PM
to Python-Ideas, Jukka Lehtosalo, Bob Ippolito
[There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with some motivations for adding type annotations at the end.]

Yesterday afternoon I had an inspiring conversation with Bob Ippolito (man of many trades, author of simplejson) and Jukka Lehtosalo (author of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about what Python can learn from Haskell (and other languages); yesterday he gave the same talk at Dropbox. The talk is online (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad strokes comes down to three suggestions:

  (a) Python should adopt mypy's syntax for function annotations
  (b) Python's use of mutable containers by default is wrong
  (c) Python should adopt some kind of Abstract Data Types

Proposals (b) and (c) don't feel particularly actionable (if you disagree please start a new thread, I'd be happy to discuss these further if there's interest) but proposal (a) feels right to me.

So what is mypy?  It is a static type checker for Python written by Jukka for his Ph.D. thesis. The basic idea is that you add type annotations to your program using some custom syntax, and when running your program using the mypy interpreter, type errors will be found during compilation (i.e., before the program starts running).

The clever thing here is that the custom syntax is actually valid Python 3, using (mostly) function annotations: your annotated program will still run with the regular Python 3 interpreter. In the latter case there will be no type checking, and no runtime overhead, except to evaluate the function annotations (which are evaluated at function definition time but don't have any effect when the function is called).
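
As a minimal sketch of that point (the function name greet is made up for illustration), the regular interpreter records the annotations on the function object and does not consult them when the function is called:

```python
def greet(name: str) -> str:
    # The annotations are recorded on the function object;
    # they are not consulted when the function is called.
    return "Hello, " + str(name)


# The interpreter exposes the recorded annotations as a plain dict.
annotations = greet.__annotations__

# Passing an int where the annotation says str is not an error at run time.
result = greet(42)
```

Only a separate checker such as mypy would be expected to complain about the greet(42) call.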

In fact, it is probably more useful to think of mypy as a heavy-duty linter than as a compiler or interpreter; leave the type checking to mypy, and the execution to Python. It is easy to integrate mypy into a continuous integration setup, for example.

To read up on mypy's annotation syntax, please see the mypy-lang.org website. Here's just one complete example, to give a flavor:

  from typing import List, Dict

  def word_count(input: List[str]) -> Dict[str, int]:
      result = {}  #type: Dict[str, int]
      for line in input:
          for word in line.split():
              result[word] = result.get(word, 0) + 1
      return result


Note that the #type: comment is part of the mypy syntax; mypy uses comments to declare types in situations where no syntax is available -- although this particular line could also be written as follows:

    result = Dict[str, int]()

Either way the entire function is syntactically valid Python 3, and a suitable implementation of typing.py (containing class definitions for List and Dict, for example) can be written to make the program run correctly. One is provided as part of the mypy project.
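
To give a rough idea of what such a typing.py class could look like, here is a hypothetical stand-in, not mypy's actual code. It assumes Python 3.7 or later for `__class_getitem__`; the class body is an illustration only:

```python
class Dict(dict):
    """Hypothetical stand-in for a typing.py class; not mypy's actual code."""

    def __class_getitem__(cls, params):
        # Dict[str, int] ignores its parameters at run time and
        # simply hands back the class itself.
        return cls


# With this shim, the alternative spelling from the text runs as plain Python:
result = Dict[str, int]()
result["spam"] = result.get("spam", 0) + 1
```

The subscript carries meaning only for the static checker; at run time the object behaves like an ordinary dict.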

I should add that many of mypy's syntactic choices aren't actually new. The basis of many of its ideas goes back at least a decade: I blogged about this topic in 2004 (http://www.artima.com/weblogs/viewpost.jsp?thread=85551 -- see also the two followup posts linked from the top there).

I'll emphasize once more that mypy's type checking happens in a separate pass: no type checking happens at run time (other than what the interpreter already does, like raising TypeError on expressions like 1+"1").

There's a lot to this proposal, but I think it's possible to get a PEP written, accepted and implemented in time for Python 3.5, if people are supportive. I'll go briefly over some of the action items.

(1) A change of direction for function annotations

PEP 3107, which introduced function annotations, is intentionally non-committal about how function annotations should be used. It lists a number of use cases, including but not limited to type checking. It also mentions some rejected proposals that would have standardized either a syntax for indicating types and/or a way for multiple frameworks to attach different annotations to the same function. AFAIK in practice there is little use of function annotations in mainstream code, and I propose a conscious change of course here by stating that annotations should be used to indicate types and to propose a standard notation for them.

(We may have to have some backwards compatibility provision to avoid breaking code that currently uses annotations for some other purpose. Fortunately the only issue, at least initially, will be that when running mypy to type check such code it will produce complaints about the annotations; it will not affect how such code is executed by the Python interpreter. Nevertheless, it would be good to deprecate such alternative uses of annotations.)

(2) A specification for what to add to Python 3.5

There needs to be at least a rough consensus on the syntax for annotations, and the syntax must cover a large enough set of use cases to be useful. Mypy is still under development, and some of its features are still evolving (e.g. unions were only added a few weeks ago). It would be possible to argue endlessly about details of the notation, e.g. whether to use 'list' or 'List', what either of those means (is a duck-typed list-like type acceptable?) or how to declare and use type variables, and what to do with functions that have no annotations at all (mypy currently skips those completely).

I am proposing that we adopt whatever mypy uses here, keeping discussion of the details (mostly) out of the PEP. The goal is to make it possible to add type checking annotations to 3rd party modules (and even to the stdlib) while allowing unaltered execution of the program by the (unmodified) Python 3.5 interpreter. The actual type checker will not be integrated with the Python interpreter, and it will not be checked into the CPython repository. The only thing that needs to be added to the stdlib is a copy of mypy's typing.py module. This module defines several dozen new classes (and a few decorators and other helpers) that can be used in expressing argument types. If you want to type-check your code you have to download and install mypy and run it separately.

The curious thing here is that while standardizing a syntax for type annotations, we technically still won't be adopting standard rules for type checking. This is intentional. First of all, fully specifying all the type checking rules would make for a really long and boring PEP (a much better specification would probably be the mypy source code). Second, I think it's fine if the type checking algorithm evolves over time, or if variations emerge. The worst that can happen is that you consider your code correct but mypy disagrees; your code will still run.

That said, I don't want to completely leave out any specification. I want the contents of the typing.py module to be specified in the PEP, so that it can be used with confidence. But whether mypy will complain about your particular form of duck typing doesn't have to be specified by the PEP. Perhaps as mypy evolves it will take options to tell it how to handle certain edge cases. Forks of mypy (or entirely different implementations of type checking based on the same annotation syntax) are also a possibility. Maybe in the distant future a version of Python will take a different stance, once we have more experience with how this works out in practice, but for Python 3.5 I want to restrict the scope of the upheaval.

Appendix -- Why Add Type Annotations?

The argument between proponents of static typing and dynamic typing has been going on for many decades. Neither side is all wrong or all right. Python has traditionally fallen in the camp of extremely dynamic typing, and this has worked well for most users, but there are definitely some areas where adding type annotations would help.

- Editors (IDEs) can benefit from type annotations; they can call out obvious mistakes (like misspelled method names or inapplicable operations) and suggest possible method names. Anyone who has used IntelliJ or Xcode will recognize how powerful these features are, and type annotations will make such features more useful when editing Python source code.

- Linters are an important tool for teams developing software. A linter doesn't replace a unittest, but can find certain types of errors better or quicker. The kind of type checking offered by mypy works much like a linter, and has similar benefits; but it can find problems that are beyond the capabilities of most linters.

- Type annotations are useful for the human reader as well! Take the above word_count() example. How long would it have taken you to figure out the types of the argument and return value without annotations? Currently most people put the types in their docstrings; developing a standard notation for type annotations will reduce the amount of documentation that needs to be written, and running the type checker might find bugs in the documentation, too. Once a standard type annotation syntax is introduced, it should be simple to add support for this notation to documentation generators like Sphinx.

- Refactoring. Bob's talk has a convincing example of how type annotations help in (manually) refactoring code. I also expect that certain automatic refactorings will benefit from type annotations -- imagine a tool like 2to3 (but used for some other transformation) augmented by type annotations, so it will know whether e.g. x.keys() is referring to the keys of a dictionary or not.

- Optimizers. I believe this is actually the least important application, certainly initially. Optimizers like PyPy or Pyston wouldn't be able to fully trust the type annotations, and they are better off using their current strategy of optimizing code based on the types actually observed at run time. But it's certainly feasible to imagine a future optimizer also taking type annotations into account.

--
--Guido "I need a new hobby" van Rossum (python.org/~guido)

Ethan Furman

Aug 13, 2014, 4:00:33 PM
to python...@python.org
On 08/13/2014 12:44 PM, Guido van Rossum wrote:
>
> [There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with
> some motivations for adding type annotations at the end.]

+0 on the proposal as a whole. It is not something I'm likely to use, but I'm not opposed to it, so long as it stays
optional.


> Nevertheless, it would be good to deprecate such alternative uses of annotations.

-1 on deprecating alternative uses of annotations.

--
~Ethan~
_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Guido van Rossum

Aug 13, 2014, 4:20:35 PM
to Ethan Furman, Python-Ideas
On Wed, Aug 13, 2014 at 12:59 PM, Ethan Furman <et...@stoneleaf.us> wrote:
> On 08/13/2014 12:44 PM, Guido van Rossum wrote:
>> [There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with
>> some motivations for adding type annotations at the end.]
>
> +0 on the proposal as a whole.  It is not something I'm likely to use, but I'm not opposed to it, so long as it stays optional.
>
>> Nevertheless, it would be good to deprecate such alternative uses of annotations.
>
> -1 on deprecating alternative uses of annotations.

Do you have a favorite alternative annotation use that you actually use (or are likely to)?

--
--Guido van Rossum (python.org/~guido)

Alex Gaynor

Aug 13, 2014, 4:30:31 PM
to python...@python.org
I'm strongly opposed to this, for a few reasons.

First, I think that standardizing on a syntax, without a semantics is
incredibly confusing, and I can't imagine how having *multiple* competing
implementations would be a boon for anyone.

This proposal seems to be built around the idea that we should have a syntax,
and then people can write third party tools, but Python itself won't really do
anything with them.

Fundamentally, this seems like a very confusing approach. How we write a type,
and what we do with that information are fundamentally connected. Can I cast a
``List[str]`` to a ``List[object]`` in any way? If yes, what happens when I go
to put an ``int`` in it? There's no runtime checking, so the type system is
unsound; on the other hand, disallowing this prevents many types of successes.

Both solutions have merit, but the idea of some implementations of the type
checker having covariance and some contravariance is fairly disturbing.
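
The concern can be made concrete with a small sketch. The function names add_number and shout are invented for illustration, and the example makes no claim about how any particular checker treats the call; it only shows what happens at run time if a ``List[str]`` is allowed to pass as a ``List[object]``:

```python
from typing import List


def add_number(items: List[object]) -> None:
    # Fine for a List[object]: an int is an object.
    items.append(1)


def shout(words: List[str]) -> List[str]:
    # Relies on every element really being a str.
    return [w.upper() for w in words]


words = ["spam", "eggs"]
add_number(words)  # a list of str treated as a List[object]

try:
    shout(words)
    failed = False
except AttributeError:
    # int has no .upper(); the problem only surfaces here, at run time.
    failed = True
```

Whether a checker accepts or rejects the add_number(words) call is exactly the variance decision under discussion.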

Another concern I have is that analysis based on these types is making some
pretty strong assumptions about static-ness of Python programs that aren't
valid. While existing checkers like ``flake8`` also do this, their assumptions
are basically constrained to the symbol table, while this is far deeper. For
example, can I annotate something as ``six.text_type``? What about
``django.db.models.sql.Query`` (keep in mind that this class is redefined based
on what database you're using (not actually true, but it used to be))?

Python's type system isn't very good. It lacks many features of more powerful
systems such as algebraic data types, interfaces, and parametric polymorphism.
Despite this, it works pretty well because of Python's dynamic typing. I
strongly believe that attempting to enforce the existing type system would be a
real shame.

Alex

PS: You're right. None of this would provide *any* value for PyPy.

Christian Heimes

Aug 13, 2014, 4:31:14 PM
to python...@python.org
On 13.08.2014 21:44, Guido van Rossum wrote:
> Yesterday afternoon I had an inspiring conversation with Bob Ippolito
> (man of many trades, author of simplejson) and Jukka Lehtosalo (author
> of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about
> what Python can learn from Haskell (and other languages); yesterday he
> gave the same talk at Dropbox. The talk is online
> (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad
> strokes comes down to three suggestions:
>
> (a) Python should adopt mypy's syntax for function annotations
> (b) Python's use of mutable containers by default is wrong
> (c) Python should adopt some kind of Abstract Data Types

I was at Bob's talk during EP14 and really liked the idea. A couple of
colleagues and other attendees also said it's a good and useful
proposal. I also like your proposal to standardize the type annotations
first without a full integration of mypy.

In general I'm +1 but I'd like to discuss two aspects:

1) I'm not keen on the naming of mypy's typing classes. The visual
distinction between e.g. dict() and Dict() is too small and IMHO
confusing for newcomers. How about an additional 'T' prefix to make
clear that the objects are referring to typing objects?

from typing import TList, TDict

def word_count(input: TList[str]) -> TDict[str, int]:
    ...

2) PEP 3107 only specifies arguments and return values but not
exceptions that can be raised by a function. Java has the "throws"
syntax to list possible exceptions:

public void readFile() throws IOException {}

May I suggest that we also standardize a way to annotate the exceptions
that can be raised by a function? It's a very useful piece of
information and is commonly requested on the Python user mailing list.
It doesn't have to be a new syntax element; a decorator in the typing
module would suffice, too. For example:

from typing import TList, TDict, raises

@raises(RuntimeError, (ValueError, "is raised when input is empty"))
def word_count(input: TList[str]) -> TDict[str, int]:
    ...
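
One way such a decorator might be written, as a rough sketch only: the `__raises__` attribute name and the simplified word_count body are assumptions made for illustration, and nothing like this exists in typing today.

```python
def raises(*specs):
    """Hypothetical decorator: record the exceptions a function may raise."""
    def decorator(func):
        recorded = []
        for spec in specs:
            # Accept either ExcType or (ExcType, "description").
            if isinstance(spec, tuple):
                exc_type, description = spec
            else:
                exc_type, description = spec, ""
            recorded.append((exc_type, description))
        # Attach metadata only; the function's behavior is unchanged.
        func.__raises__ = tuple(recorded)
        return func
    return decorator


@raises(RuntimeError, (ValueError, "is raised when input is empty"))
def word_count(input):
    if not input:
        raise ValueError("input is empty")
    return len(input)
```

A checker or documentation generator could then read word_count.__raises__ without calling the function.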

Regards,
Christian

Ethan Furman

Aug 13, 2014, 4:50:43 PM
to Python-Ideas
On 08/13/2014 01:19 PM, Guido van Rossum wrote:
> On Wed, Aug 13, 2014 at 12:59 PM, Ethan Furman wrote:
>>
>> -1 on deprecating alternative uses of annotations.
>
> Do you have a favorite alternative annotation use that you actually use (or are likely to)?

My script argument parser [1] uses annotations to figure out how to parse the cli parameters and cast them to
appropriate values (copied the idea from one of Michele Simionato's projects... plac [2], I believe).

I could store the info in some other structure besides 'annotations', but it's there and it fits the bill conceptually.
Amusingly, it's a form of type info, but instead of saying what it has to already be, says what it will become.
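
For readers unfamiliar with the technique, here is a minimal sketch of the general idea. It is not scription's or plac's actual API; the helper call_with_cli_args and the example function repeat are invented for illustration:

```python
import inspect


def call_with_cli_args(func, raw_args):
    """Convert string arguments using each parameter's annotation, then call func."""
    params = inspect.signature(func).parameters.values()
    converted = []
    for param, raw in zip(params, raw_args):
        if param.annotation is inspect.Parameter.empty:
            converted.append(raw)  # no annotation: keep the string
        else:
            converted.append(param.annotation(raw))  # annotation acts as a converter
    return func(*converted)


def repeat(word: str, times: int, sep: str = " "):
    return sep.join([word] * times)


# "3" arrives as a string from the command line and becomes an int.
output = call_with_cli_args(repeat, ["spam", "3"])
```

Here the annotation says what the argument will become, not what it must already be.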

--
~Ethan~


[1] https://pypi.python.org/pypi/scription (due for an overhaul now I've used it for awhile ;)
[2] https://pypi.python.org/pypi/plac/0.9.1

Donald Stufft

Aug 13, 2014, 4:54:17 PM
to Alex Gaynor, python...@python.org
I agree with Alex that I think leaving the actual semantics of what these things
mean up to a third party, which can possibly be swapped out by individual end
users, is terribly confusing. I don’t think I agree though that this is a bad
idea in general, I think that we should just add it for real and skip the
indirection.

IOW I'm not sure I see the benefit of defining the syntax but not the semantics
when it seems this is already completely possible given the fact that mypy
exists.

The only real benefits I can see from doing it are that the stdlib can use it,
and the ``import typing`` aspect. I don't believe that the stdlib benefits are
great enough to get the possible confusion of multiple different implementations
and I think that the typing import could easily be provided as a project on PyPI
that people can depend on if they want to use this in their code.

So my vote would be to add mypy semantics to the language itself.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

Andrey Vlasovskikh

Aug 13, 2014, 5:09:19 PM
to gu...@python.org, Python-Ideas
2014-08-14, 0:19, Guido van Rossum <gu...@python.org> wrote:

> Yesterday afternoon I had an inspiring conversation with Bob Ippolito (man of many trades, author of simplejson) and Jukka Lehtosalo (author of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about what Python can learn from Haskell (and other languages); yesterday he gave the same talk at Dropbox. The talk is online (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad strokes comes down to three suggestions:
>
> (a) Python should adopt mypy's syntax for function annotations


+1. I'm a developer of the code analysis engine of PyCharm. I have discussed this idea with Jukka Lehtosalo and recently with Dave Halter, the author of Jedi code completion library. Standardized type annotations would be very useful for code analysis tools and IDEs such as PyCharm, Jedi and pylint. Type annotations would be especially great for third-party libraries. The idea is that most Python programmers don't have to write annotations in order to benefit from them. Annotated libraries are often enough for good code analysis.

We (PyCharm) and Jukka have made some initial steps in this direction, including thoughts on semantics of annotations (https://github.com/pytypes/pytypes). Feedback is welcome.

Here are slides from my talk about optional typing in Python, that show how Mypy types can be used in both static and dynamic type checking (http://blog.pirx.ru/media/files/2013/python-optional-typing/), Mypy-related part starts from slide 14.

We are interested in getting type annotations standardized and we would like to help develop and test type annotation proposals.

--
Andrey Vlasovskikh
Web: http://pirx.ru/

Antoine Pitrou

Aug 13, 2014, 5:13:51 PM
to python...@python.org

Hello,

First, as a disclaimer, I am currently working on Numba for Continuum
Analytics. Numba has its own type inference system which it applies to
functions decorated with the @jit decorator. Due to Numba's objectives,
the type inference system is heavily geared towards numerical computing,
but it is conceptually (and a bit concretely) able to represent more
generic information, such as "an enumerate() over an iterator of a
complex128 numpy array".

There are two sides to type inference:

1) first the (optional) annotations
(I'm saying "optional" because in the most basic usage, a JIT compiler
is normally able to defer compilation until the first function or method
call, and to deduce input types from that)

2) second the inference engine properly, which walks the code (in
whatever form the tool's developer has chosen: bytecode, AST, IR) and
deduces types for any intermediate values

Now only #1 is implied by this PEP proposal, but it also sounds like we
should take into account the desired properties of #2 (for example,
being able to express "an iterator of three-tuples" can be important for
a JIT compiler - or not, perhaps, depending on the JIT compiler :-)).
What #2 wants to do will differ depending on the use case: e.g. a code
checker may need less type granularity than a JIT compiler.


Therefore, regardless of mypy's typesystem's completeness and
granularity, one requirement is for it to be easily extensible. By
extensible I mean not only being able to define new type descriptions,
but being able to do so for existing third-party libraries you don't
want to modify.

I'm saying that because I'm looking at
http://mypy-lang.org/tutorial.html#genericclasses , and it's not clear
from this example whether the typing code has to be interwoven with the
collection's implementation, or can be written as a separate code module
entirely (*). Ideally both should probably be possible (in the same vein
as being able to subclass an ABC, or register an existing class with
it). This also includes being able to type-declare functions and types from C
extension modules.
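
The ABC analogy can be sketched as follows. Stack and ThirdPartyStack are hypothetical names; the point is only that registration happens outside the registered class's own source:

```python
from abc import ABC


class Stack(ABC):
    """Hypothetical abstract description, kept apart from any implementation."""


class ThirdPartyStack:
    """Stands in for a class from a library we do not want to modify."""

    def __init__(self):
        self._items = []


# Registration happens outside ThirdPartyStack's own source code.
Stack.register(ThirdPartyStack)

is_stack = isinstance(ThirdPartyStack(), Stack)
```

A typing system with the same property would let type descriptions for existing libraries live in a separate module.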

In Numba, this would be typically required to write typing descriptions
for Numpy arrays and functions; but also to derive descriptions for
fixed-width integers, single-precision floats, etc. (this also means
some form of subclassing for type descriptions themselves).

(*) (actually, I'm a bit worried when I see that "List[int]()"
instantiates an actual list; calling a type description class should
give you a parameterized type description, not an object; the [] notation
is in general not powerful enough if you want several type parameters,
possibly keyword-only)


At some point, it will be even better if the typing system is powerful
enough to remember properties of the *values* (for example not only "a
string", but "a one-character string", or even "one of the 'Y', 'M', 'D'
strings"). Think about type-checking / type-inferring calls to the struct
module.


I may come back with more comments once I've read the mypy docs and/or
code in detail.

Regards

Antoine.

Guido van Rossum

Aug 13, 2014, 5:47:46 PM
to Alex Gaynor, Python-Ideas
On Wed, Aug 13, 2014 at 1:29 PM, Alex Gaynor <alex....@gmail.com> wrote:
> I'm strongly opposed to this, for a few reasons.
>
> First, I think that standardizing on a syntax, without a semantics is
> incredibly confusing, and I can't imagine how having *multiple* competing
> implementations would be a boon for anyone.

That part was probably overly vague in my original message. I actually do want to standardize on semantics, but I think the semantics will prove controversial (they already have :-) and I think it's better to standardize the syntax and *some* semantics first rather than having to wait another decade for the debate over the semantics to settle. I mostly want to leave the door open for mypy to become smarter. But it might make sense to have a "weaker" interpretation in some cases too (e.g. an IDE might use a weaker type system in order to avoid overwhelming users with warnings).
 
> This proposal seems to be built around the idea that we should have a syntax,
> and then people can write third party tools, but Python itself won't really do
> anything with them.

Right.
 
> Fundamentally, this seems like a very confusing approach. How we write a type,
> and what we do with that information are fundamentally connected. Can I cast a
> ``List[str]`` to a ``List[object]`` in any way? If yes, what happens when I go
> to put an ``int`` in it? There's no runtime checking, so the type system is
> unsound; on the other hand, disallowing this prevents many types of successes.

Mypy has a cast() operator that you can use to shut it up when you (think you) know the conversion is safe.
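
A small sketch of what that looks like, using the cast function as found in today's standard typing module (the mypy version discussed here may differ in detail):

```python
from typing import List, cast

names = ["spam", "eggs"]

# cast() returns its argument unchanged; it only tells the checker
# to treat the value as the given type.
as_objects = cast(List[object], names)
```

Nothing is converted or checked at run time, so the responsibility for the cast being safe stays with the programmer.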
 
> Both solutions have merit, but the idea of some implementations of the type
> checker having covariance and some contravariance is fairly disturbing.

Yeah, that wouldn't be good. ;-)
 
> Another concern I have is that analysis based on these types is making some
> pretty strong assumptions about static-ness of Python programs that aren't
> valid. While existing checkers like ``flake8`` also do this, their assumptions
> are basically constrained to the symbol table, while this is far deeper. For
> example, can I annotate something as ``six.text_type``? What about
> ``django.db.models.sql.Query`` (keep in mind that this class is redefined based
> on what database you're using (not actually true, but it used to be))?

Time will have to tell. Stubs can help. I encourage you to try annotating a medium-sized module. It's likely that you'll find a few things: maybe a bug in mypy, maybe a missing mypy feature, maybe a bug in your code, maybe a shady coding practice in your code or a poorly documented function (I know I found several of each during my own experiments so far).
 
> Python's type system isn't very good. It lacks many features of more powerful
> systems such as algebraic data types, interfaces, and parametric polymorphism.
> Despite this, it works pretty well because of Python's dynamic typing. I
> strongly believe that attempting to enforce the existing type system would be a
> real shame.

Mypy shines in those areas of Python programs that are mostly statically typed. There are many such areas in most large systems. There are usually also some areas where mypy's type system is inadequate. It's easy to shut it up for those cases (in fact, mypy is silent unless you use at least one annotation for a function). But that's the case with most type systems. Even Haskell sometimes calls out to C.

Guido van Rossum

Aug 13, 2014, 6:01:47 PM
to Ethan Furman, Python-Ideas
On Wed, Aug 13, 2014 at 1:50 PM, Ethan Furman <et...@stoneleaf.us> wrote:
> On 08/13/2014 01:19 PM, Guido van Rossum wrote:
>> On Wed, Aug 13, 2014 at 12:59 PM, Ethan Furman wrote:
>>> -1 on deprecating alternative uses of annotations.
>>
>> Do you have a favorite alternative annotation use that you actually use (or are likely to)?
>
> My script argument parser [1] uses annotations to figure out how to parse the cli parameters and cast them to appropriate values (copied the idea from one of Michele Simionato's projects... plac [2], I believe).
>
> I could store the info in some other structure besides 'annotations', but it's there and it fits the bill conceptually.  Amusingly, it's a form of type info, but instead of saying what it has to already be, says what it will become.

I couldn't find any docs for scription (the tarball contains just the source code, not even an example), although I did find some for plac. I expect adding type annotations to the source of scription.py might actually make it easier to grok what it does. :-)

But really, I'm sure that in Python 3.5, scription and mypy can coexist. If the mypy idea takes off you might eventually be convinced to use a different convention. But you'd get plenty of warning.
 
> [1] https://pypi.python.org/pypi/scription  (due for an overhaul now I've used it for awhile ;)
> [2] https://pypi.python.org/pypi/plac/0.9.1

Guido van Rossum

Aug 13, 2014, 6:07:21 PM
to Donald Stufft, Python-Ideas, Alex Gaynor
On Wed, Aug 13, 2014 at 1:53 PM, Donald Stufft <don...@stufft.io> wrote:
> I agree with Alex that I think leaving the actual semantics of what these things
> mean up to a third party, which can possibly be swapped out by individual end
> users, is terribly confusing. I don’t think I agree though that this is a bad
> idea in general, I think that we should just add it for real and skip the
> indirection.

Yeah, I probably overstated the option of alternative interpretations. I just don't want to have to write a PEP that specifies every little detail of mypy's type checking algorithm, and I don't think anyone would want to have to read such a PEP either. But maybe we can compromise on something that sketches broad strokes and leaves the details up to the team that maintains mypy (after all that tactic has worked pretty well for Python itself :-).
 
> IOW I'm not sure I see the benefit of defining the syntax but not the semantics
> when it seems this is already completely possible given the fact that mypy
> exists.
>
> The only real benefits I can see from doing it are that the stdlib can use it,
> and the ``import typing`` aspect. I don't believe that the stdlib benefits are
> great enough to get the possible confusion of multiple different implementations
> and I think that the typing import could easily be provided as a project on PyPI
> that people can depend on if they want to use this in their code.
>
> So my vote would be to add mypy semantics to the language itself.

What exactly would that mean? I don't think the Python interpreter should reject programs that fail the type check -- in fact, separating the type check from run time is the most crucial point of my proposal.

I'm fine to have a discussion on things like covariance vs. contravariance, or what forms of duck typing are acceptable, etc.
 

Guido van Rossum

Aug 13, 2014, 6:08:48 PM
to Andrey Vlasovskikh, Python-Ideas
Wow. Awesome. I will make time to study what you have already done!






Juancarlo Añez

Aug 13, 2014, 6:22:46 PM
to Guido van Rossum, Jukka Lehtosalo, Python-Ideas

On Wed, Aug 13, 2014 at 3:14 PM, Guido van Rossum <gu...@python.org> wrote:
> I am proposing that we adopt whatever mypy uses here, keeping discussion of the details (mostly) out of the PEP. The goal is to make it possible to add type checking annotations to 3rd party modules (and even to the stdlib) while allowing unaltered execution of the program by the (unmodified) Python 3.5 interpreter.

I'll comment later on the core subject.

For now, I think this deserves some thought:

Function annotations are not available in Python 2.7, so promoting widespread use of annotations in 3.5 would be promoting code that is compatible only with 3.x, when the current situation is that much effort is being spent on writing code that works on both 2.7 and 3.4 (most libraries?).

Independently of its core merits, this proposal should fail unless annotations are added to Python 2.8.

Cheers,

--
Juancarlo Añez

Todd

Aug 13, 2014, 6:29:00 PM8/13/14
to python-ideas


On Aug 13, 2014 9:45 PM, "Guido van Rossum" <gu...@python.org> wrote:
> (1) A change of direction for function annotations
>
> PEP 3107, which introduced function annotations, is intentionally non-committal about how function annotations should be used. It lists a number of use cases, including but not limited to type checking. It also mentions some rejected proposals that would have standardized either a syntax for indicating types and/or a way for multiple frameworks to attach different annotations to the same function. AFAIK in practice there is little use of function annotations in mainstream code, and I propose a conscious change of course here by stating that annotations should be used to indicate types and to propose a standard notation for them.
>
> (We may have to have some backwards compatibility provision to avoid breaking code that currently uses annotations for some other purpose. Fortunately the only issue, at least initially, will be that when running mypy to type check such code it will produce complaints about the annotations; it will not affect how such code is executed by the Python interpreter. Nevertheless, it would be good to deprecate such alternative uses of annotations.)

I watched the original talk and read your proposal.  I think type annotations could be very useful in certain contexts.

However, I still don't get this bit. Why would allowing type annotations automatically imply that no other annotations would be possible?  Couldn't we formalize what would be considered a type annotation while still allowing annotations that don't fit these criteria to be used for other things?

Manuel Cerón

Aug 13, 2014, 6:29:01 PM8/13/14
to python...@python.org
On Wed, Aug 13, 2014 at 9:44 PM, Guido van Rossum <gu...@python.org> wrote:
[There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with some motivations for adding type annotations at the end.]

This is a very interesting idea. I played a bit with function annotations (https://github.com/ceronman/typeannotations) and I gave a talk about them at EuroPython 2013. Certainly static type analysis is probably the best use case. 

The curious thing here is that while standardizing a syntax for type annotations, we technically still won't be adopting standard rules for type checking. This is intentional. First of all, fully specifying all the type checking rules would make for a really long and boring PEP (a much better specification would probably be the mypy source code). Second, I think it's fine if the type checking algorithm evolves over time, or if variations emerge. The worst that can happen is that you consider your code correct but mypy disagrees; your code will still run.

That said, I don't want to completely leave out any specification. I want the contents of the typing.py module to be specified in the PEP, so that it can be used with confidence. But whether mypy will complain about your particular form of duck typing doesn't have to be specified by the PEP. Perhaps as mypy evolves it will take options to tell it how to handle certain edge cases. Forks of mypy (or entirely different implementations of type checking based on the same annotation syntax) are also a possibility. Maybe in the distant future a version of Python will take a different stance, once we have more experience with how this works out in practice, but for Python 3.5 I want to restrict the scope of the upheaval.

The type checking algorithm might evolve over time, but by including typing.py in the stdlib, the syntax for annotations would be almost frozen, and that will be a limitation. In other projects such as TypeScript (http://www.typescriptlang.org/), the syntax usually evolves alongside the algorithms.

Is the syntax specified in typing.py mature enough to put it in the stdlib and expect users to start annotating their projects without worrying too much about future changes?

Is there enough feedback from users using mypy in their projects?

I think that rushing typing.py into 3.5 is not a good idea. However, it'd be nice to add some notes to PEP 8, encourage its use as an external library, and let some projects and tools (e.g. PyCharm) use it. It's not that bad if mypy lives 100% outside the Python distribution for a while, just as TypeScript does relative to JavaScript. After getting some user base, part of it (typing.py) could be moved to the stdlib.

Manuel.

Ryan Gonzalez

Aug 13, 2014, 6:35:11 PM8/13/14
to Christian Heimes, python-ideas
On Wed, Aug 13, 2014 at 3:29 PM, Christian Heimes <chri...@python.org> wrote:
On 13.08.2014 21:44, Guido van Rossum wrote:
> Yesterday afternoon I had an inspiring conversation with Bob Ippolito
> (man of many trades, author of simplejson) and Jukka Lehtosalo (author
> of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about
> what Python can learn from Haskell (and other languages); yesterday he
> gave the same talk at Dropbox. The talk is online
> (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad
> strokes comes down to three suggestions:
>
>   (a) Python should adopt mypy's syntax for function annotations
>   (b) Python's use of mutable containers by default is wrong
>   (c) Python should adopt some kind of Abstract Data Types

I was at Bob's talk during EP14 and really liked the idea. A couple of
colleagues and other attendees also said it's a good and useful
proposal. I also like your proposal to standardize the type annotations
first without a full integration of mypy.

In general I'm +1 but I'd like to discuss two aspects:

1) I'm not keen on the naming of mypy's typing classes. The visual
distinction between e.g. dict() and Dict() is too small and IMHO
confusing for newcomers. How about an additional 'T' prefix to make
clear that the objects are referring to typing objects?

  from typing import TList, TDict

  def word_count(input: TList[str]) -> TDict[str, int]:
      ...

Eeewwwww. That's way too Pascal-ish.


2) PEP 3107 only specifies arguments and return values but not
exceptions that can be raised by a function. Java has the "throws"
syntax to list possible exceptions:

 public void readFile() throws IOException {}

May I suggest that we also standardize a way to annotate the exceptions
that can be raised by a function? It's a very useful piece of
information and is commonly requested on the Python user
mailing list. It doesn't have to be a new syntax element; a decorator in
the typing module would suffice, too. For example:

  from typing import TList, TDict, raises

  @raises(RuntimeError, (ValueError, "is raised when input is empty"))
  def word_count(input: TList[str]) -> TDict[str, int]:
      ...

That was a disaster in C++. It's confusing, especially since Python uses exceptions more than most other languages do.
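
For concreteness, the decorator idea quoted above could be sketched roughly as follows. This is only an illustration: `raises` and the `__raises__` attribute are hypothetical names (nothing like them exists in the `typing` module), and the sketch merely records the declared exceptions as metadata without enforcing anything.

```python
from typing import Dict, List


def raises(*specs):
    """Hypothetical decorator: record declared exceptions as metadata only."""
    declared = []
    for spec in specs:
        # Accept either an exception class or an (exception class, description) pair.
        exc, doc = spec if isinstance(spec, tuple) else (spec, "")
        if not (isinstance(exc, type) and issubclass(exc, BaseException)):
            raise TypeError("expected an exception class, got %r" % (exc,))
        declared.append((exc, doc))

    def decorator(func):
        # The attribute name is an assumption made for this sketch.
        func.__raises__ = tuple(declared)
        return func

    return decorator


@raises(RuntimeError, (ValueError, "is raised when input is empty"))
def word_count(input: List[str]) -> Dict[str, int]:
    if not input:
        raise ValueError("input is empty")
    result = {}
    for line in input:
        for word in line.split():
            result[word] = result.get(word, 0) + 1
    return result
```

A tool could then read `word_count.__raises__` to list the documented exceptions; nothing stops the function from raising something else, which is part of the objection raised here.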
 

Regards,
Christian




--
Ryan
If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated."

Donald Stufft

Aug 13, 2014, 6:44:59 PM8/13/14
to gu...@python.org, Python-Ideas, Alex Gaynor
I don’t know exactly :)

Some ideas:

1) Raise a warning when the type check fails, but allow it to happen. This would
   have the benefit of possibly catching bugs, but it's still opt in in the
   sense that you have to write the annotations for anything to happen. This
   would also enable people to turn on enforced type checking by raising the
   warning level to an exception.

   Even if this was off by default it would make it easy to enable it during
   test runs and also enable easier/better quickcheck like functionality.

2) Simply add a flag to the interpreter that turns on type checking.

3) Add a stdlib module that would run the program under type checking, like
   ``python -m typing myprog`` instead of ``python -m myprog``.
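
Idea 1 can be sketched with the standard `warnings` module. The names `TypeCheckWarning` and `check_argument` are made up for this illustration, and the check is a plain `isinstance` test rather than anything mypy does; the point is only that a warning lets the call proceed by default while a filter can escalate it to an exception.

```python
import warnings


class TypeCheckWarning(UserWarning):
    """Hypothetical warning category for failed annotation checks."""


def check_argument(name, value, expected):
    # Warn instead of failing, so the call is still allowed to happen.
    if not isinstance(value, expected):
        warnings.warn(
            "%s: expected %s, got %s"
            % (name, expected.__name__, type(value).__name__),
            TypeCheckWarning,
            stacklevel=2,
        )
    return value


# Default behaviour: the mismatch is reported but execution continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_argument("count", "3", int)
assert caught[0].category is TypeCheckWarning

# Opting in to enforcement: escalate the warning category to an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", TypeCheckWarning)
    try:
        check_argument("count", "3", int)
    except TypeCheckWarning as exc:
        print("enforced:", exc)
```

A test runner could install the "error" filter once at start-up, which is roughly the "enable it during test runs" scenario described here.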

Really I think a lot of the benefit is likely to come in the form of linting
and during test runs. However if I have to run a separate Python interpreter
to actually do the run then I risk getting bad results through varying things
like interpreter differences, language level differences, etc.

Although I wouldn't complain if it meant that Python had actual type checking
at the run time if a function had type annotations :)


I'm fine to have a discussion on things like covariance vs. contravariance, or what forms of duck typing are acceptable, etc.

I’m not particularly knowledgeable about the actual workings of a type system and
covariance vs contravariance and the like. My main concern there is having a
single reality. The meaning of something shouldn't change because I used a
different interpreter/linter/whatever. Beyond that I don't know enough to have
an opinion on the actual semantics.

Guido van Rossum

Aug 13, 2014, 7:13:17 PM8/13/14
to Juancarlo Añez, Jukka Lehtosalo, Python-Ideas
Actually, mypy already has a solution. There's a codec (https://github.com/JukkaL/mypy/tree/master/mypy/codec) that you can use which transforms Python-2-with-annotations into vanilla Python 2. It's not an ideal solution, but it can work in cases where you absolutely have to have state of the art Python 3.5 type checking *and* backwards compatibility with Python 2.

Guido van Rossum

Aug 13, 2014, 7:26:15 PM8/13/14
to Manuel Cerón, Python-Ideas
On Wed, Aug 13, 2014 at 3:26 PM, Manuel Cerón <cero...@gmail.com> wrote:
The type checking algorithm might evolve over time, but by including typing.py in the stdlib, the syntax for annotations would be almost frozen, and that will be a limitation. In other projects such as TypeScript (http://www.typescriptlang.org/), the syntax usually evolves alongside the algorithms.

What kind of evolution did TypeScript experience?
 
Is the syntax specified in typing.py mature enough to put it in the stdlib and expect users to start annotating their projects without worrying too much about future changes?

This is a good question. I do think it is good enough as a starting point for future evolution. Perhaps the biggest question is how fast will the annotation syntax need to evolve? If it needs to evolve significantly faster than Python 3 feature releases come out (every 18 months, approximately) then it may be better to hold off and aim for inclusion in the 3.6 standard library. That would allow more time to reach agreement (though I'm not sure that's a good thing :-), and in the mean time typing.py could be distributed as a 3rd party module on PyPI.
 
Is there enough feedback from users using mypy in their projects?

I think that rushing typing.py into 3.5 is not a good idea. However, it'd be nice to add some notes to PEP 8, encourage its use as an external library, and let some projects and tools (e.g. PyCharm) use it. It's not that bad if mypy lives 100% outside the Python distribution for a while, just as TypeScript does relative to JavaScript.

Well, JavaScript's evolution is tied up forever in a standards body, so TypeScript realistically had no choice in the matter. But are there actually people writing TypeScript? I haven't heard from them yet (people at Dropbox seem to rather like CoffeeScript). Anyway, the situation isn't quite the same -- you wouldn't make any friends in the Python world if you wrote your code in an incompatible dialect that could only be executed after a translation step, but in the JavaScript world that's how all alternative languages work (and they even manage to interoperate).
 
After getting some user base, part of it (typing.py) could be moved to the stdlib.

I'm still hopeful that we can get a sufficient user base and agreement on mypy's features for inclusion in 3.5 (extrapolating the 3.4 release schedule by 18 months, 3.5 alpha 1 would go out around February 2015; the feature freeze cut-off date, beta 1, would be around May thereafter).

Ben Finney

Aug 13, 2014, 7:28:16 PM8/13/14
to python...@python.org
Christian Heimes <chri...@python.org>
writes:

> 1) I'm not keen with the naming of mypy's typing classes. The visual
> distinction between e.g. dict() and Dict() is too small and IMHO
> confusing for newcomers. How about an additional 'T' prefix to make
> clear that the objects are referring to typing objects?

To this reader, ‘dict’ and ‘list’ *are* “typing objects” — they are
objects that are types. Seeing code that referred to something else as
“typing objects” would be an invitation to confusion, IMO.

You could argue “that's because you don't know the special meaning of
“typing object” being discussed here”. To which my response would be,
for a proposal to add something else as meaningful Python syntax, the
jargon is poorly chosen and needlessly confusing with established terms
in Python.

If there's going to be a distinction between the types (‘dict’, ‘list’,
etc.) and something else, I'd prefer it to be based on a clearer
terminology distinction.

--
\ “Simplicity and elegance are unpopular because they require |
`\ hard work and discipline to achieve and education to be |
_o__) appreciated.” —Edsger W. Dijkstra |
Ben Finney

Guido van Rossum

Aug 13, 2014, 7:31:43 PM8/13/14
to Todd, python-ideas
On Wed, Aug 13, 2014 at 3:28 PM, Todd <todd...@gmail.com> wrote:
However, I still don't get this bit. Why would allowing type annotations automatically imply that no other annotations would be possible?  Couldn't we formalize what would be considered a type annotation while still allowing annotations that don't fit this criteria to be used for other things?

We certainly *could* do that. However, I haven't seen sufficient other uses of annotations. If there is only one use for annotations (going forward), annotations would be unambiguous. If we allow different types of annotations, there would have to be a way to tell whether a particular annotation is intended as a type annotation or not. Currently mypy ignores all modules that don't import typing.py (using any form of import statement), and we could continue this convention. But it would mean that something like this would still require the typing import in order to be checked by mypy:

import typing

def gcd(a: int, b: int) -> int:
    ...  # <tralala>

The (necessary) import would be flagged as unused by every linter in the world... :-(
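
To make the "imports typing.py using any form of import statement" convention concrete, here is a rough sketch of how a tool might detect it with the standard `ast` module. This is not mypy's actual implementation; `imports_typing` is a hypothetical helper written for this illustration.

```python
import ast


def imports_typing(source):
    """Return True if the module source imports `typing` in any form."""
    # ast.walk visits every node, so imports nested inside functions count too.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Covers `import typing` and `import typing as t`.
            if any(alias.name.split(".")[0] == "typing" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # Covers `from typing import List`; level 0 excludes relative imports.
            if node.level == 0 and node.module and node.module.split(".")[0] == "typing":
                return True
    return False


print(imports_typing("import typing\n\ndef gcd(a: int, b: int) -> int: ...\n"))
print(imports_typing("def gcd(a: int, b: int) -> int: ...\n"))
```

Under this convention the second module would be skipped by the checker even though its annotations look like types, which is exactly why the otherwise unused import becomes necessary.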

Guido van Rossum

Aug 13, 2014, 7:45:23 PM8/13/14
to Donald Stufft, Python-Ideas, Alex Gaynor
On Wed, Aug 13, 2014 at 3:44 PM, Donald Stufft <don...@stufft.io> wrote:
On Aug 13, 2014, at 6:05 PM, Guido van Rossum <gu...@python.org> wrote:

On Wed, Aug 13, 2014 at 1:53 PM, Donald Stufft <don...@stufft.io> wrote:

So my vote would be to add mypy semantics to the language itself.

What exactly would that mean? I don't think the Python interpreter should reject programs that fail the type check -- in fact, separating the type check from run time is the most crucial point of my proposal.

I don’t know exactly :)

Some ideas:

1) Raise a warning when the type check fails, but allow it to happen. This would
   have the benefit of possibly catching bugs, but it's still opt in in the
   sense that you have to write the annotations for anything to happen. This
   would also enable people to turn on enforced type checking by raising the
   warning level to an exception.

I don't think that's going to happen. It would require the entire mypy implementation to be checked into the stdlib. It would also require all sorts of hacks in that implementation to deal with dynamic (or just delayed) imports. Mypy currently doesn't handle any of that -- it must be able to find all imported modules before it starts executing even one line of code.
 
   Even if this was off by default it would make it easy to enable it during
   test runs and also enable easier/better quickcheck like functionality.

It would *have* to be off by default -- it's way too slow to be on by default (note that some people are already fretting today about a 25 msec process start-up time).
 
2) Simply add a flag to the interpreter that turns on type checking.

3) Add a stdlib module that would run the program under type checking, like
   ``python -m typing myprog`` instead of ``python -m myprog``.

Really I think a lot of the benefit is likely to come in the form of linting
and during test runs. However if I have to run a separate Python interpreter
to actually do the run then I risk getting bad results through varying things
like interpreter differences, language level differences, etc.

Yeah, but I just don't think it's realistic to do anything about that for 3.5 (or 3.6 for that matter). In a decade... Who knows! :-)
 
Although I wouldn't complain if it meant that Python had actual type checking
at the run time if a function had type annotations :)

It's probably possible to write a decorator that translates annotations into assertions that are invoked when a function is called. But in most cases it would be way too slow to turn on everywhere.
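
A minimal sketch of such a decorator, assuming only plain classes are checked (generic annotations such as List[str] are skipped) and using the standard `inspect` module; `check_annotations` is a hypothetical name, not part of mypy or typing.py:

```python
import functools
import inspect


def _is_checkable(annotation):
    # Unannotated slots hold the sentinel class inspect.Signature.empty;
    # skip it, and skip anything that is not a plain class.
    return isinstance(annotation, type) and annotation is not inspect.Signature.empty


def check_annotations(func):
    """Sketch: assert that annotated arguments and the return value match."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = signature.parameters[name]
            # *args and **kwargs collect several values; left unchecked here.
            if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
                continue
            if _is_checkable(param.annotation):
                assert isinstance(value, param.annotation), (
                    "%s must be %s" % (name, param.annotation.__name__))
        result = func(*args, **kwargs)
        expected = signature.return_annotation
        if _is_checkable(expected):
            assert isinstance(result, expected), (
                "return value must be %s" % expected.__name__)
        return result

    return wrapper


@check_annotations
def gcd(a: int, b: int) -> int:
    while b:
        a, b = b, a % b
    return a
```

Every call now pays for a signature bind plus one `isinstance` per argument, which illustrates the speed concern; because the checks are `assert` statements, running Python with -O removes them.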
I'm fine to have a discussion on things like covariance vs. contravariance, or what forms of duck typing are acceptable, etc.

I’m not particularly knowledgeable about the actual workings of a type system and
covariance vs contravariance and the like. My main concern there is having a
single reality. The meaning of something shouldn't change because I used a
different interpreter/linter/whatever. Beyond that I don't know enough to have
an opinion on the actual semantics.

Yeah, I regret writing it so vaguely already. Having Alex Gaynor open with "I'm strongly opposed [to] this" is a great joy killer. :-)

I just really don't want to have to redundantly write up a specification for all the details of mypy's type checking rules in PEP-worthy English. But I'm fine with discussing whether List[str] is a subclass or a superclass of List[object] and how to tell the difference.

Still, different linters exist and I don't hear people complain about that. I would also be okay if PyCharm's interpretation of the finer points of the type checking syntax was subtly different from mypy's. In fact I would be surprised if they weren't sometimes in disagreement. Heck, PyPy doesn't give *every* Python program the same meaning as CPython, and that's a feature. :-)

Donald Stufft

Aug 13, 2014, 7:59:40 PM8/13/14
to gu...@python.org, Python-Ideas, Alex Gaynor
Understood! And really the most important thing I'm worried about isn’t that
there is some sort of code in the stdlib or in the interpreter just that there
is an authoritative source of what stuff means.


Still, different linters exist and I don't hear people complain about that. I would also be okay if PyCharm's interpretation of the finer points of the type checking syntax was subtly different from mypy's. In fact I would be surprised if they weren't sometimes in disagreement. Heck, PyPy doesn't give *every* Python program the same meaning as CPython, and that's a feature. :-)


Depends on what is meant by "meaning" I suppose. Generally in those linters or
PyPy itself if there is a different *meaningful* result (for instance if
print was defaulting to sys.stderr) then CPython (incl docs) acts as the
authoritative source of what ``print()`` means (in this case writing to
sys.stdout).

I'm also generally OK with deferring possible code/interpreter changes to add
actual type checking until a later point in time. If there's a defined semantics
to what those annotations mean then third parties can experiment and do things
with it and those different things can be looked at adding/incorporating into
Python proper in 3.6 (or 3.7, or whatever).

Honestly I think that probably the things I was worried about are sufficiently
allayed given that it appears I was reading more into the vagueness and the
optionally different interpretations than what was meant and I don't want to
keep harping on it :) As long as there's some single source of what List[str]
or what have you means then I'm pretty OK with it all.

Chris Angelico

Aug 13, 2014, 8:32:42 PM8/13/14
to Python-Ideas
On Thu, Aug 14, 2014 at 5:44 AM, Guido van Rossum <gu...@python.org> wrote:
> from typing import List, Dict
>
> def word_count(input: List[str]) -> Dict[str, int]:
>     result = {}  #type: Dict[str, int]
>     for line in input:
>         for word in line.split():
>             result[word] = result.get(word, 0) + 1
>     return result

I strongly support the concept of standardized typing information.
There'll be endless bikeshedding on names, though - personally, I
don't like the idea of "from typing import ..." as there's already a
"types" module and I think it'd be confusing. (Also, "mypy" sounds
like someone's toy reimplementation of Python, which it does seem to
be :) but that's not really well named for "type checker using stdlib
annotations".) But I think the idea is excellent, and it deserves
stdlib support.

The cast notation sounds to me like it's what Pike calls a "soft cast"
- it doesn't actually *change* anything (contrast a C or C++ type
cast, where (float)42 is 42.0), it just says to the compiler/type
checker "this thing is actually now this type". If the naming is clear
on this point, it leaves open the possibility of actual recursive
casting - where casting a List[str] to List[int] is equivalent to
[int(x) for x in lst]. Whether or not that's a feature worth adding
can be decided in the distant future :)
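
The difference between the two kinds of cast can be shown in a few lines. The sketch below uses `cast` as it behaves in today's `typing` module (it returns its argument unchanged at run time); `convert_elements` is a hypothetical helper standing in for the "recursive casting" idea and is not part of any proposal.

```python
from typing import List, cast


def convert_elements(items, element_type):
    """Hypothetical converting cast: builds a new list of converted values."""
    return [element_type(x) for x in items]


raw = ["1", "2", "3"]

# typing.cast only informs the type checker; at run time it returns its
# argument unchanged, so the elements are still strings.
soft = cast(List[int], raw)
assert soft is raw
assert soft == ["1", "2", "3"]

# The converting variant actually changes the values.
hard = convert_elements(raw, int)
assert hard == [1, 2, 3]
```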

+1 on the broad proposal. +0.5 on defining the notation while leaving
the actual type checking to an external program.

ChrisA

Haoyi Li

Aug 13, 2014, 8:54:25 PM8/13/14
to Chris Angelico, Python-Ideas
Both solutions have merit, but the idea of some implementations of the type checker having covariance and some contravariance is fairly disturbing.

Why can't we have both? That's the only way to properly type things, since immutable-get-style APIs are always going to be covariant, set-only style APIs (e.g. a function that takes 1 arg and returns None) are going to be contravariant and mutable get-set APIs (like most python collections) should really be invariant.
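
A small run-time illustration of the three cases, written with names from today's `typing` module (`Sequence`, `List`, `Callable`). The annotations are not enforced by the interpreter, so the sketch simply shows what goes wrong if a mutable list is treated as covariant; the function names are invented for the example.

```python
from typing import Callable, List, Sequence


def total_length(items: Sequence[object]) -> int:
    # Get-only use: passing a list of str where objects are expected is safe,
    # which is why a read-only container can be covariant.
    return sum(len(str(item)) for item in items)


def append_number(items: List[object]) -> None:
    # Get-set use: an int is a perfectly valid element of a List[object].
    items.append(42)


def shout_all(names: List[str], sink: Callable[[str], None]) -> None:
    # Set-only use: a sink that accepts any object also accepts every str,
    # which is why parameter types can be contravariant.
    for name in names:
        sink(name.upper())


names = ["alice", "bob"]  # intended as List[str]
seen = []  # collects whatever the sink receives

print(total_length(names))  # safe
shout_all(names, seen.append)  # safe: seen.append accepts any object

# If List[str] were accepted as List[object], this call would type-check...
append_number(names)

# ...and code relying on "names holds only str" would then fail at run time.
try:
    shout_all(names, seen.append)
except AttributeError as exc:
    print("str-only assumption broken:", exc)
```

Treating the mutable list as invariant makes a checker reject the `append_number(names)` call, while the read-only and write-only uses stay legal.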

Łukasz Langa

Aug 13, 2014, 9:01:40 PM8/13/14
to guido@python.org van Rossum, Python-Ideas
It’s great to see this finally happening!
I did some research on existing optional-typing approaches [1]. What I learned in the process was that linting is the most important use case for optional typing; runtime checks are too little, too late.

That being said, having optional runtime checks available *is* also important. Used in staging environments and during unit testing, they can cover cases obscured by meta-programming. Implementations like “obiwan” and “pytypedecl” show that providing a runtime type checker is absolutely feasible.

The function annotation syntax currently supported in Python 3.4 is not well-suited for typing. This is because users expect to be able to operate on the types they know. This is currently not feasible because:
1. forward references are impossible
2. generics are impossible without custom syntax (which is the reason Mypy’s Dict exists)
3. optional types are clumsy to express (Optional[int] is very verbose for a use case this common)
4. union types are clumsy to express

All those problems are elegantly solved by Google’s pytypedecl via moving type information to a separate file. Because for our use case that would not be an acceptable approach, my intuition would be to:

1. Provide support for generics (understood as an answer to the question: “what does this collection contain?”) in Abstract Base Classes. That would be a PEP in itself.
2. Change the function annotation syntax so that it’s not executed at import time but rather treated as strings. This solves forward references and enables us to…
3. Extend the function annotation syntax with first-class generics support (most languages use something like “list<str>”)
4. Extend the function annotation syntax with first-class union type support. pytypedecl simply uses “int or None”, which I find very elegant.
5. Speaking of None, possibly further extend the function annotation syntax with first-class optionality support. In the Facebook codebase in Hack we have tens of thousands of optional ints (never mind other optional types!), this is a case that’s going to be used all the time. Hack uses ?int, that’s the most succinct style you can get. Yes, it’s special but None is a special type, too.
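
For comparison, here is how forward references, unions and optional types can be spelled without new syntax, using names from today's `typing` module rather than the `?int` or `int or None` forms suggested above. The `Node` class and `parse_port` function are invented for this sketch.

```python
from typing import Optional, Union


class Node:
    # Forward reference: the name Node is not bound yet while the class body
    # runs, so the annotation is written as a string and resolved later.
    def __init__(self, value: int, next: "Optional[Node]" = None) -> None:
        self.value = value
        self.next = next


def parse_port(text: Union[int, str]) -> Optional[int]:
    # Union[int, str] accepts either type; Optional[int] means int or None.
    if isinstance(text, int):
        return text
    return int(text) if text.isdigit() else None


# The string annotation is stored as-is; nothing evaluates it at definition time.
assert Node.__init__.__annotations__["next"] == "Optional[Node]"

# Optional[X] is shorthand for a union with None.
assert Optional[int] == Union[int, None]

print(parse_port("8080"), parse_port("http"), parse_port(22))
```

Tools that need the real type can resolve the string afterwards, for example with `typing.get_type_hints`; the verbosity of `Optional[...]` compared with `?int` is the cost being pointed out in item 5.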

All in all, I believe Mypy has the highest chance of becoming our typing linter, which is great! I just hope we can improve on the syntax, which is currently lacking. Also, reusing our existing ABCs where applicable would be nice. With Mypy’s typing module I feel like we’re going to get a new, orthogonal set of ABCs, which will confuse users to no end. Finally, the runtime type checker would make the ecosystem complete.

This is just the beginning of the open issues I was juggling with and the reason my own try at the PEP was coming up slower than I’d like.

[1] You can find a summary of examples I looked at here: http://lukasz.langa.pl/typehinting/

-- 
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

On Aug 13, 2014, at 12:44 PM, Guido van Rossum <gu...@python.org> wrote:

[There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with some motivations for adding type annotations at the end.]
Yesterday afternoon I had an inspiring conversation with Bob Ippolito (man of many trades, author of simplejson) and Jukka Lehtosalo (author of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about what Python can learn from Haskell (and other languages); yesterday he gave the same talk at Dropbox. The talk is online (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad strokes comes down to three suggestions:

  (a) Python should adopt mypy's syntax for function annotations
  (b) Python's use of mutable containers by default is wrong
  (c) Python should adopt some kind of Abstract Data Types

Proposals (b) and (c) don't feel particularly actionable (if you disagree please start a new thread, I'd be happy to discuss these further if there's interest) but proposal (a) feels right to me.

So what is mypy?  It is a static type checker for Python written by Jukka for his Ph.D. thesis. The basic idea is that you add type annotations to your program using some custom syntax, and when running your program using the mypy interpreter, type errors will be found during compilation (i.e., before the program starts running).

The clever thing here is that the custom syntax is actually valid Python 3, using (mostly) function annotations: your annotated program will still run with the regular Python 3 interpreter. In the latter case there will be no type checking, and no runtime overhead, except to evaluate the function annotations (which are evaluated at function definition time but don't have any effect when the function is called).

In fact, it is probably more useful to think of mypy as a heavy-duty linter than as a compiler or interpreter; leave the type checking to mypy, and the execution to Python. It is easy to integrate mypy into a continuous integration setup, for example.

To read up on mypy's annotation syntax, please see the mypy-lang.org website. Here's just one complete example, to give a flavor:


  from typing import List, Dict

  def word_count(input: List[str]) -> Dict[str, int]:
      result = {}  #type: Dict[str, int]
      for line in input:
          for word in line.split():
              result[word] = result.get(word, 0) + 1
      return result


Note that the #type: comment is part of the mypy syntax; mypy uses comments to declare types in situations where no syntax is available -- although this particular line could also be written as follows:

    result = Dict[str, int]()

Either way the entire function is syntactically valid Python 3, and a suitable implementation of typing.py (containing class definitions for List and Dict, for example) can be written to make the program run correctly. One is provided as part of the mypy project.

I should add that many of mypy's syntactic choices aren't actually new. The basis of many of its ideas go back at least a decade: I blogged about this topic in 2004 (http://www.artima.com/weblogs/viewpost.jsp?thread=85551 -- see also the two followup posts linked from the top there).

I'll emphasize once more that mypy's type checking happens in a separate pass: no type checking happens at run time (other than what the interpreter already does, like raising TypeError on expressions like 1+"1").

There's a lot to this proposal, but I think it's possible to get a PEP written, accepted and implemented in time for Python 3.5, if people are supportive. I'll go briefly over some of the action items.

(1) A change of direction for function annotations

PEP 3107, which introduced function annotations, is intentionally non-committal about how function annotations should be used. It lists a number of use cases, including but not limited to type checking. It also mentions some rejected proposals that would have standardized either a syntax for indicating types and/or a way for multiple frameworks to attach different annotations to the same function. AFAIK in practice there is little use of function annotations in mainstream code, and I propose a conscious change of course here by stating that annotations should be used to indicate types and to propose a standard notation for them.

(We may have to have some backwards compatibility provision to avoid breaking code that currently uses annotations for some other purpose. Fortunately the only issue, at least initially, will be that when running mypy to type check such code it will produce complaints about the annotations; it will not affect how such code is executed by the Python interpreter. Nevertheless, it would be good to deprecate such alternative uses of annotations.)

(2) A specification for what to add to Python 3.5

There needs to be at least a rough consensus on the syntax for annotations, and the syntax must cover a large enough set of use cases to be useful. Mypy is still under development, and some of its features are still evolving (e.g. unions were only added a few weeks ago). It would be possible to argue endlessly about details of the notation, e.g. whether to use 'list' or 'List', what either of those means (is a duck-typed list-like type acceptable?) or how to declare and use type variables, and what to do with functions that have no annotations at all (mypy currently skips those completely).

I am proposing that we adopt whatever mypy uses here, keeping discussion of the details (mostly) out of the PEP. The goal is to make it possible to add type checking annotations to 3rd party modules (and even to the stdlib) while allowing unaltered execution of the program by the (unmodified) Python 3.5 interpreter. The actual type checker will not be integrated with the Python interpreter, and it will not be checked into the CPython repository. The only thing that needs to be added to the stdlib is a copy of mypy's typing.py module. This module defines several dozen new classes (and a few decorators and other helpers) that can be used in expressing argument types. If you want to type-check your code you have to download and install mypy and run it separately.

The curious thing here is that while standardizing a syntax for type annotations, we technically still won't be adopting standard rules for type checking. This is intentional. First of all, fully specifying all the type checking rules would make for a really long and boring PEP (a much better specification would probably be the mypy source code). Second, I think it's fine if the type checking algorithm evolves over time, or if variations emerge. The worst that can happen is that you consider your code correct but mypy disagrees; your code will still run.

That said, I don't want to completely leave out any specification. I want the contents of the typing.py module to be specified in the PEP, so that it can be used with confidence. But whether mypy will complain about your particular form of duck typing doesn't have to be specified by the PEP. Perhaps as mypy evolves it will take options to tell it how to handle certain edge cases. Forks of mypy (or entirely different implementations of type checking based on the same annotation syntax) are also a possibility. Maybe in the distant future a version of Python will take a different stance, once we have more experience with how this works out in practice, but for Python 3.5 I want to restrict the scope of the upheaval.

Appendix -- Why Add Type Annotations?

The argument between proponents of static typing and dynamic typing has been going on for many decades. Neither side is all wrong or all right. Python has traditionally fallen in the camp of extremely dynamic typing, and this has worked well for most users, but there are definitely some areas where adding type annotations would help.

- Editors (IDEs) can benefit from type annotations; they can call out obvious mistakes (like misspelled method names or inapplicable operations) and suggest possible method names. Anyone who has used IntelliJ or Xcode will recognize how powerful these features are, and type annotations will make such features more useful when editing Python source code.

- Linters are an important tool for teams developing software. A linter doesn't replace a unittest, but can find certain types of errors better or quicker. The kind of type checking offered by mypy works much like a linter, and has similar benefits; but it can find problems that are beyond the capabilities of most linters.

- Type annotations are useful for the human reader as well! Take the above word_count() example. How long would it have taken you to figure out the types of the argument and return value without annotations? Currently most people put the types in their docstrings; developing a standard notation for type annotations will reduce the amount of documentation that needs to be written, and running the type checker might find bugs in the documentation, too. Once a standard type annotation syntax is introduced, it should be simple to add support for this notation to documentation generators like Sphinx.

- Refactoring. Bob's talk has a convincing example of how type annotations help in (manually) refactoring code. I also expect that certain automatic refactorings will benefit from type annotations -- imagine a tool like 2to3 (but used for some other transformation) augmented by type annotations, so it will know whether e.g. x.keys() is referring to the keys of a dictionary or not.

- Optimizers. I believe this is actually the least important application, certainly initially. Optimizers like PyPy or Pyston wouldn't be able to fully trust the type annotations, and they are better off using their current strategy of optimizing code based on the types actually observed at run time. But it's certainly feasible to imagine a future optimizer also taking type annotations into account.

--
--Guido "I need a new hobby" van Rossum (python.org/~guido)

Gregory P. Smith

Aug 13, 2014, 9:10:24 PM
to Guido van Rossum, Jukka Lehtosalo, Python-Ideas

First, I am really happy that you are interested in this and that your point (2) of what you want to see done is very limited and acknowledges that it isn't going to specify everything!  Because that isn't possible. :)

Unfortunately I feel that adding syntax like this to the language itself is not useful without enforcement, because it leads to code being written with unintentionally incorrect annotations. That code winds up deployed in libraries and later becomes a problem as soon as an actual analysis tool runs over something that uses it, in a place where the incorrect annotations cannot easily be updated (like the standard library).

At the summit in Montreal earlier this year Łukasz Langa (cc'd) volunteered to lead writing the PEP on Python type hinting based on the many existing implementations of such things (including mypy, cython, numba and pytypedecl). I believe he has an initial draft he intends to send out soon. I'll let him speak to that.

Looks like Łukasz already responded, I'll stop writing now and go read that. :)

Personal opinion from experience trying: You can't express the depth of types for an interface within the Python language syntax itself (assuming hacks such as specially formatted comments, strings or docstrings do not count). Forward references to things that haven't even been defined yet are common. You often want an ability to specify a duck type interface rather than a specific type.  I think he has those points covered better than I do.
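
To illustrate the forward-reference problem, here is a small sketch (the `Node` class is hypothetical). Inside a class body the class's own name is not yet bound, so on interpreters that evaluate annotations eagerly an unquoted self-reference fails at definition time, and the usual workaround is exactly the string form mentioned above:

```python
class Node:
    """Singly linked list node that refers to its own type."""

    def __init__(self, value: int, next_node: "Node" = None) -> None:
        self.value = value
        self.next_node = next_node

    # Writing `-> Node` here would raise NameError on interpreters that
    # evaluate annotations eagerly, because `Node` is not bound until the
    # class body finishes. The string is left for a checker to resolve.
    def prepend(self, value: int) -> "Node":
        return Node(value, self)


head = Node(2).prepend(1)

# At run time the annotation is just the string, not the class.
return_annotation = Node.prepend.__annotations__["return"]
```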

-gps

PS If anyone wants to see a run time type checker make code run at half speed, look at the one pytypedecl offers. I'm sure it could be sped up, but run-time checkers in an interpreter are always likely to be slow.

Greg Ewing

Aug 13, 2014, 9:28:56 PM
to python...@python.org
On 08/14/2014 12:32 PM, Chris Angelico wrote:
> I don't like the idea of "from typing import ..." as there's already a
> "types" module and I think it'd be confusing.

Maybe

from __statictyping__ import ...

More explicit, and being a dunder name suggests that it's
something special that linters should ignore if they don't
understand it.

--
Greg

Andrew Barnert

Aug 13, 2014, 9:30:53 PM
to Alex Gaynor, python...@python.org
On Wednesday, August 13, 2014 1:30 PM, Alex Gaynor <alex....@gmail.com> wrote:


>I'm strongly opposed to this, for a few reasons.


[...]

>Python's type system isn't very good. It lacks many features of more powerful
>systems such as algebraic data types, interfaces, and parametric polymorphism.
>Despite this, it works pretty well because of Python's dynamic typing. I
>strongly believe that attempting to enforce the existing type system would be a
>real shame.

This is my main concern, but I'd phrase it very differently.


First, Python's type system _is_ powerful, but only because it's dynamic. Duck typing simulates parametric polymorphism perfectly, disjunction types as long as they don't include themselves recursively, algebraic data types in some but not all cases, etc. Simple (Java-style) generics, of the kind that Guido seems to be proposing, are not nearly as flexible. That's the problem.
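
A rough sketch of the contrast, assuming `TypeVar` and `Sequence` as they later shipped in the standard-library `typing` module; `first`, `shout`, and `FakeFile` are hypothetical names. The annotated function states one generic relationship, while the duck-typed one accepts anything with the right method, whether or not the classes are related:

```python
import io
from typing import Sequence, TypeVar

T = TypeVar("T")


def first(items: Sequence[T]) -> T:
    """Declared generic: a sequence of T goes in, a T comes out."""
    return items[0]


def shout(source):
    """Duck-typed: works for any object that has a read() method."""
    return source.read().upper()


class FakeFile:
    """Unrelated to io.StringIO by inheritance, but it quacks the same."""

    def read(self) -> str:
        return "hello"


first_int = first([1, 2, 3])
first_char = first("abc")
from_fake = shout(FakeFile())
from_real = shout(io.StringIO("world"))
```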

On the other hand, even though these types only cover a small portion of the space of Python's implicit type system, a lot of useful functions fall within that small portion. As long as you can just leave the rest of the program untyped, and there are no boundary problems, there's no real risk.

On the third hand, what worries me is this:

> Mypy has a cast() operator that you can use to shut it up when you (think you) know the conversion is safe.

Why do we need casts? You shouldn't be trying to enforce static typing in a part of the program whose static type isn't sound. Languages like Java and C++ have no choice; Python does, so why not take advantage of it?
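
For reference, a sketch of what a cast amounts to, assuming the `cast()` helper as it later shipped in the standard-library `typing` module (mypy's version at the time may differ); `load_settings` is a hypothetical function. The cast does no conversion and no checking, which is why it can hide an unsound claim:

```python
from typing import Dict, cast


def load_settings() -> object:
    """Hypothetical loader whose declared return type is deliberately vague."""
    return {"retries": 3}


raw = load_settings()

# cast() returns its argument unchanged; it only tells the checker what to assume.
settings = cast(Dict[str, int], raw)
same_object = settings is raw

# Nothing verifies the claim: this "int" is still a str at run time.
bogus = cast(int, "not a number")
```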

The standard JSON example seems appropriate here. What's the return type of json.loads? In Haskell, you write a pretty trivial JSONThing ADT, and you return a JSONThing that's an Object (which means its value maps String to JSONThing). In Python today, you return a dict, and use it exactly the same as in Haskell, except that you can't verify its soundness at compile time. In Java or C++, it's… what? The sound option is a special JSONThing that has separate getObjectMemberString and getArrayMemberString and getObjectMemberInt, which is incredibly painful to use. A plain old Dict[String, Object] looks simple, but it means you have to downcast all over the place to do anything, making it completely unsound, and still unpleasant. The official Java json.org library gives you a hybrid between the two that manages to be neither sound nor user-friendly. And of course there are libraries for many poor static languages (especially C++) that try to fake duck typing as far as possible for their JSON objects, which is of course nowhere near as far as Python gets for free.
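
A small sketch of the Python side of that comparison. The `JSONThing` alias below is hypothetical and only approximates the Haskell ADT (nested values are left as `Any`, since it is not recursive), and `count_strings` is an illustrative helper. The dict that json.loads returns is used directly, dispatching on the run-time type instead of downcasting:

```python
import json
from typing import Any, Dict, List, Union

# Hypothetical alias: a loose, non-recursive stand-in for a JSONThing ADT.
JSONThing = Union[None, bool, int, float, str, List[Any], Dict[str, Any]]


def count_strings(node: JSONThing) -> int:
    """Count string values (not keys) anywhere in a decoded JSON document."""
    if isinstance(node, str):
        return 1
    if isinstance(node, list):
        return sum(count_strings(item) for item in node)
    if isinstance(node, dict):
        return sum(count_strings(value) for value in node.values())
    return 0  # None, bool, int and float contain no strings


doc = json.loads('{"name": "spam", "tags": ["a", "b"], "size": 3}')
total = count_strings(doc)
first_tag = doc["tags"][0]  # plain indexing, no casts or typed accessors
```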

Andrew Barnert

Aug 13, 2014, 9:42:40 PM