[Python-ideas] Proposal: Use mypy syntax for function annotations


Guido van Rossum

Aug 13, 2014, 3:45:51 PM8/13/14
to Python-Ideas, Jukka Lehtosalo, Bob Ippolito
[There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with some motivations for adding type annotations at the end.]

Yesterday afternoon I had an inspiring conversation with Bob Ippolito (man of many trades, author of simplejson) and Jukka Lehtosalo (author of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about what Python can learn from Haskell (and other languages); yesterday he gave the same talk at Dropbox. The talk is online (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad strokes comes down to three suggestions:

  (a) Python should adopt mypy's syntax for function annotations
  (b) Python's use of mutable containers by default is wrong
  (c) Python should adopt some kind of Abstract Data Types

Proposals (b) and (c) don't feel particularly actionable (if you disagree please start a new thread, I'd be happy to discuss these further if there's interest) but proposal (a) feels right to me.

So what is mypy?  It is a static type checker for Python written by Jukka for his Ph.D. thesis. The basic idea is that you add type annotations to your program using some custom syntax, and when running your program using the mypy interpreter, type errors will be found during compilation (i.e., before the program starts running).

The clever thing here is that the custom syntax is actually valid Python 3, using (mostly) function annotations: your annotated program will still run with the regular Python 3 interpreter. In the latter case there will be no type checking, and no runtime overhead, except to evaluate the function annotations (which are evaluated at function definition time but don't have any effect when the function is called).
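A concrete demonstration of that last point (a small example added for illustration; any Python 3 interpreter behaves this way):

```python
def greet(name: str) -> str:  # annotations are evaluated here, at def time
    return 'Hello, ' + name

# They are simply stored on the function object for tools to inspect...
assert greet.__annotations__ == {'name': str, 'return': str}
# ...and are never consulted when the function is actually called.
assert greet('world') == 'Hello, world'
```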

In fact, it is probably more useful to think of mypy as a heavy-duty linter than as a compiler or interpreter; leave the type checking to mypy, and the execution to Python. It is easy to integrate mypy into a continuous integration setup, for example.

To read up on mypy's annotation syntax, please see the mypy-lang.org website. Here's just one complete example, to give a flavor:

  from typing import List, Dict

  def word_count(input: List[str]) -> Dict[str, int]:
      result = {}  # type: Dict[str, int]
      for line in input:
          for word in line.split():
              result[word] = result.get(word, 0) + 1
      return result


Note that the # type: comment is part of the mypy syntax; mypy uses comments to declare types in situations where no syntax is available -- although this particular line could also be written as follows:

    result = Dict[str, int]()

Either way the entire function is syntactically valid Python 3, and a suitable implementation of typing.py (containing class definitions for List and Dict, for example) can be written to make the program run correctly. One is provided as part of the mypy project.
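To make that claim concrete, here is a deliberately tiny sketch of such a typing.py (my own illustration, not mypy's actual implementation): indexing with [] is a no-op at run time, and calling the alias builds the plain container, so both spellings above execute correctly.

```python
class _TypeAlias:
    """Runtime stand-in for a generic type like List or Dict (sketch)."""
    def __init__(self, name, origin):
        self._name = name      # e.g. 'Dict'
        self._origin = origin  # the real container type, e.g. dict

    def __getitem__(self, params):
        return self  # type parameters matter to the checker, not at run time

    def __call__(self, *args, **kwargs):
        return self._origin(*args, **kwargs)  # Dict[str, int]() -> {}

List = _TypeAlias('List', list)
Dict = _TypeAlias('Dict', dict)
```

With this in place, `result = Dict[str, int]()` simply produces an empty dict.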

I should add that many of mypy's syntactic choices aren't actually new. Many of its ideas go back at least a decade: I blogged about this topic in 2004 (http://www.artima.com/weblogs/viewpost.jsp?thread=85551 -- see also the two followup posts linked from the top there).

I'll emphasize once more that mypy's type checking happens in a separate pass: no type checking happens at run time (other than what the interpreter already does, like raising TypeError on expressions like 1+"1").

There's a lot to this proposal, but I think it's possible to get a PEP written, accepted and implemented in time for Python 3.5, if people are supportive. I'll go briefly over some of the action items.

(1) A change of direction for function annotations

PEP 3107, which introduced function annotations, is intentionally non-committal about how function annotations should be used. It lists a number of use cases, including but not limited to type checking. It also mentions some rejected proposals that would have standardized either a syntax for indicating types or a way for multiple frameworks to attach different annotations to the same function. AFAIK in practice there is little use of function annotations in mainstream code, and I propose a conscious change of course here by stating that annotations should be used to indicate types and by proposing a standard notation for them.

(We may have to have some backwards compatibility provision to avoid breaking code that currently uses annotations for some other purpose. Fortunately the only issue, at least initially, will be that when running mypy to type check such code it will produce complaints about the annotations; it will not affect how such code is executed by the Python interpreter. Nevertheless, it would be good to deprecate such alternative uses of annotations.)

(2) A specification for what to add to Python 3.5

There needs to be at least a rough consensus on the syntax for annotations, and the syntax must cover a large enough set of use cases to be useful. Mypy is still under development, and some of its features are still evolving (e.g. unions were only added a few weeks ago). It would be possible to argue endlessly about details of the notation, e.g. whether to use 'list' or 'List', what either of those means (is a duck-typed list-like type acceptable?) or how to declare and use type variables, and what to do with functions that have no annotations at all (mypy currently skips those completely).
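For readers who haven't seen these features, here is roughly what unions and type variables look like (using the spellings of today's typing module; mypy's exact syntax for type variables has varied as it evolves):

```python
from typing import List, TypeVar, Union

T = TypeVar('T')  # a type variable

def first(items: List[T]) -> T:
    # the checker can infer, e.g., that first(['a']) is a str
    return items[0]

def to_int(value: Union[int, str]) -> int:
    # a union: the argument may be either an int or a str
    return value if isinstance(value, int) else int(value)
```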

I am proposing that we adopt whatever mypy uses here, keeping discussion of the details (mostly) out of the PEP. The goal is to make it possible to add type checking annotations to 3rd party modules (and even to the stdlib) while allowing unaltered execution of the program by the (unmodified) Python 3.5 interpreter. The actual type checker will not be integrated with the Python interpreter, and it will not be checked into the CPython repository. The only thing that needs to be added to the stdlib is a copy of mypy's typing.py module. This module defines several dozen new classes (and a few decorators and other helpers) that can be used in expressing argument types. If you want to type-check your code you have to download and install mypy and run it separately.

The curious thing here is that while standardizing a syntax for type annotations, we technically still won't be adopting standard rules for type checking. This is intentional. First of all, fully specifying all the type checking rules would make for a really long and boring PEP (a much better specification would probably be the mypy source code). Second, I think it's fine if the type checking algorithm evolves over time, or if variations emerge. The worst that can happen is that you consider your code correct but mypy disagrees; your code will still run.

That said, I don't want to completely leave out any specification. I want the contents of the typing.py module to be specified in the PEP, so that it can be used with confidence. But whether mypy will complain about your particular form of duck typing doesn't have to be specified by the PEP. Perhaps as mypy evolves it will take options to tell it how to handle certain edge cases. Forks of mypy (or entirely different implementations of type checking based on the same annotation syntax) are also a possibility. Maybe in the distant future a version of Python will take a different stance, once we have more experience with how this works out in practice, but for Python 3.5 I want to restrict the scope of the upheaval.

Appendix -- Why Add Type Annotations?

The argument between proponents of static typing and dynamic typing has been going on for many decades. Neither side is all wrong or all right. Python has traditionally fallen in the camp of extremely dynamic typing, and this has worked well for most users, but there are definitely some areas where adding type annotations would help.

- Editors (IDEs) can benefit from type annotations; they can call out obvious mistakes (like misspelled method names or inapplicable operations) and suggest possible method names. Anyone who has used IntelliJ or Xcode will recognize how powerful these features are, and type annotations will make such features more useful when editing Python source code.

- Linters are an important tool for teams developing software. A linter doesn't replace a unittest, but can find certain types of errors better or quicker. The kind of type checking offered by mypy works much like a linter, and has similar benefits; but it can find problems that are beyond the capabilities of most linters.

- Type annotations are useful for the human reader as well! Take the above word_count() example. How long would it have taken you to figure out the types of the argument and return value without annotations? Currently most people put the types in their docstrings; developing a standard notation for type annotations will reduce the amount of documentation that needs to be written, and running the type checker might find bugs in the documentation, too. Once a standard type annotation syntax is introduced, it should be simple to add support for this notation to documentation generators like Sphinx.
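For comparison, here is the docstring convention the annotations would replace (the docstring wording is my own illustration):

```python
def word_count(input):
    """Count word occurrences.

    :param input: list of str -- the lines to scan
    :returns: dict mapping str to int -- word -> occurrence count
    """
    result = {}
    for line in input:
        for word in line.split():
            result[word] = result.get(word, 0) + 1
    return result
```

The annotated version earlier in this message carries the same information, but in a form a type checker can verify.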

- Refactoring. Bob's talk has a convincing example of how type annotations help in (manually) refactoring code. I also expect that certain automatic refactorings will benefit from type annotations -- imagine a tool like 2to3 (but used for some other transformation) augmented by type annotations, so it will know whether e.g. x.keys() is referring to the keys of a dictionary or not.

- Optimizers. I believe this is actually the least important application, certainly initially. Optimizers like PyPy or Pyston wouldn't be able to fully trust the type annotations, and they are better off using their current strategy of optimizing code based on the types actually observed at run time. But it's certainly feasible to imagine a future optimizer also taking type annotations into account.

--
--Guido "I need a new hobby" van Rossum (python.org/~guido)

Ethan Furman

Aug 13, 2014, 4:00:33 PM8/13/14
to python...@python.org
On 08/13/2014 12:44 PM, Guido van Rossum wrote:
>
> [There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with
> some motivations for adding type annotations at the end.]

+0 on the proposal as a whole. It is not something I'm likely to use, but I'm not opposed to it, so long as it stays
optional.


> Nevertheless, it would be good to deprecate such alternative uses of annotations.

-1 on deprecating alternative uses of annotations.

--
~Ethan~
_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Guido van Rossum

Aug 13, 2014, 4:20:35 PM8/13/14
to Ethan Furman, Python-Ideas
On Wed, Aug 13, 2014 at 12:59 PM, Ethan Furman <et...@stoneleaf.us> wrote:
On 08/13/2014 12:44 PM, Guido van Rossum wrote:

[There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with
some motivations for adding type annotations at the end.]

+0 on the proposal as a whole.  It is not something I'm likely to use, but I'm not opposed to it, so long as it stays optional.



Nevertheless, it would be good to deprecate such alternative uses of annotations.

-1 on deprecating alternative uses of annotations.

Do you have a favorite alternative annotation use that you actually use (or are likely to)?

--
--Guido van Rossum (python.org/~guido)

Alex Gaynor

Aug 13, 2014, 4:30:31 PM8/13/14
to python...@python.org
I'm strongly opposed to this, for a few reasons.

First, I think that standardizing on a syntax without semantics is
incredibly confusing, and I can't imagine how having *multiple* competing
implementations would be a boon for anyone.

This proposal seems to be built around the idea that we should have a syntax,
and then people can write third party tools, but Python itself won't really do
anything with them.

Fundamentally, this seems like a very confusing approach. How we write a type,
and what we do with that information are fundamentally connected. Can I cast a
``List[str]`` to a ``List[object]`` in any way? If yes, what happens when I go
to put an ``int`` in it? There's no runtime checking, so the type system is
unsound; on the other hand, disallowing this rules out many programs that
would in fact succeed.

Both solutions have merit, but the idea of some implementations of the type
checker having covariance and some contravariance is fairly disturbing.
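A runtime sketch of the hazard (my own example; this is exactly why a checker must treat a mutable List as invariant or accept unsoundness):

```python
from typing import List

def append_int(xs):
    # type: (List[object]) -> None
    xs.append(42)  # perfectly fine for a List[object]...

names = ['a', 'b']  # type: List[str]
# If List[str] were silently convertible to List[object] (covariance),
# the call below would type-check -- yet it plants an int in a List[str],
# and nothing at run time will ever complain:
append_int(names)
assert names == ['a', 'b', 42]
```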

Another concern I have is that analysis based on these types is making some
pretty strong assumptions about static-ness of Python programs that aren't
valid. While existing checkers like ``flake8`` also do this, their assumptions
are basically constrained to the symbol table, while this is far deeper. For
example, can I annotate something as ``six.text_type``? What about
``django.db.models.sql.Query`` (keep in mind that this class is redefined based
on what database you're using (not actually true, but it used to be))?

Python's type system isn't very good. It lacks many features of more powerful
systems such as algebraic data types, interfaces, and parametric polymorphism.
Despite this, it works pretty well because of Python's dynamic typing. I
strongly believe that attempting to enforce the existing type system would be a
real shame.

Alex

PS: You're right. None of this would provide *any* value for PyPy.

Christian Heimes

Aug 13, 2014, 4:31:14 PM8/13/14
to python...@python.org
On 13.08.2014 21:44, Guido van Rossum wrote:
> Yesterday afternoon I had an inspiring conversation with Bob Ippolito
> (man of many trades, author of simplejson) and Jukka Lehtosalo (author
> of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about
> what Python can learn from Haskell (and other languages); yesterday he
> gave the same talk at Dropbox. The talk is online
> (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad
> strokes comes down to three suggestions:
>
> (a) Python should adopt mypy's syntax for function annotations
> (b) Python's use of mutable containers by default is wrong
> (c) Python should adopt some kind of Abstract Data Types

I was at Bob's talk during EP14 and really liked the idea. A couple of
colleagues and other attendees also said it's a good and useful
proposal. I also like your proposal to standardize the type annotations
first without a full integration of mypy.

In general I'm +1 but I like to discuss two aspects:

1) I'm not keen on the naming of mypy's typing classes. The visual
distinction between e.g. dict() and Dict() is too small and IMHO
confusing for newcomers. How about an additional 'T' prefix to make it
clear that the objects are referring to typing objects?

from typing import TList, TDict

def word_count(input: TList[str]) -> TDict[str, int]:
...

2) PEP 3107 only specifies arguments and return values but not
exceptions that can be raised by a function. Java has the "throws"
syntax to list possible exceptions:

public void readFile() throws IOException {}

May I suggest that we also standardize a way to annotate the exceptions
that can be raised by a function? It's a very useful piece of
information and a common request on the Python users mailing list. It
doesn't have to be a new syntax element; a decorator in the typing
module would suffice. For example:

from typing import TList, TDict, raises

@raises(RuntimeError, (ValueError, "is raised when input is empty"))
def word_count(input: TList[str]) -> TDict[str, int]:
...
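One hypothetical way such a decorator could be implemented (my sketch of the proposal, not an existing API): it merely records the declared exceptions on the function for tools to introspect, enforcing nothing at run time.

```python
def raises(*declared):
    """Record the exceptions a function may raise (proposal sketch)."""
    def decorate(func):
        func.__raises__ = declared  # available to checkers and doc tools
        return func
    return decorate

@raises(RuntimeError, (ValueError, "is raised when input is empty"))
def word_count(input):
    if not input:
        raise ValueError("input is empty")
    result = {}
    for word in input.split():
        result[word] = result.get(word, 0) + 1
    return result
```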

Regards,
Christian

Ethan Furman

Aug 13, 2014, 4:50:43 PM8/13/14
to Python-Ideas
On 08/13/2014 01:19 PM, Guido van Rossum wrote:
> On Wed, Aug 13, 2014 at 12:59 PM, Ethan Furman wrote:
>>
>> -1 on deprecating alternative uses of annotations.
>
> Do you have a favorite alternative annotation use that you actually use (or are likely to)?

My script argument parser [1] uses annotations to figure out how to parse the cli parameters and cast them to
appropriate values (copied the idea from one of Michele Simionato's projects... plac [2], I believe).

I could store the info in some other structure besides 'annotations', but it's there and it fits the bill conceptually.
Amusingly, it's a form of type info, but instead of saying what it has to already be, says what it will become.
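The pattern Ethan describes can be sketched in a few lines (a toy of my own, not scription's or plac's actual code): read each parameter's annotation and use it to cast the incoming command-line string.

```python
import inspect

def call_with_casts(func, argv):
    """Cast each CLI string using the parameter's annotation (toy sketch)."""
    sig = inspect.signature(func)
    args = []
    for param, raw in zip(sig.parameters.values(), argv):
        cast = param.annotation
        if cast is inspect.Parameter.empty:
            cast = str  # unannotated parameters stay strings
        args.append(cast(raw))
    return func(*args)

def repeat(word: str, times: int):
    return word * times
```

So `call_with_casts(repeat, ['ab', '3'])` returns `'ababab'`: the annotation says what the string argument will become.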

--
~Ethan~


[1] https://pypi.python.org/pypi/scription (due for an overhaul now I've used it for awhile ;)
[2] https://pypi.python.org/pypi/plac/0.9.1

Donald Stufft

Aug 13, 2014, 4:54:17 PM8/13/14
to Alex Gaynor, python...@python.org
I agree with Alex that I think leaving the actual semantics of what these things
mean up to a third party, which can possibly be swapped out by individual end
users, is terribly confusing. I don’t think I agree though that this is a bad
idea in general; I think that we should just add it for real and skip the
indirection.

IOW I'm not sure I see the benefit of defining the syntax but not the semantics
when it seems this is already completely possible given the fact that mypy
exists.

The only real benefits I can see from doing it are that the stdlib can use it,
and the ``import typing`` aspect. I don't believe that the stdlib benefits are
great enough to get the possible confusion of multiple different implementations
and I think that the typing import could easily be provided as a project on PyPI
that people can depend on if they want to use this in their code.

So my vote would be to add mypy semantics to the language itself.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

Andrey Vlasovskikh

Aug 13, 2014, 5:09:19 PM8/13/14
to gu...@python.org, Python-Ideas
2014-08-14, 0:19, Guido van Rossum <gu...@python.org> wrote:

> Yesterday afternoon I had an inspiring conversation with Bob Ippolito (man of many trades, author of simplejson) and Jukka Lehtosalo (author of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about what Python can learn from Haskell (and other languages); yesterday he gave the same talk at Dropbox. The talk is online (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad strokes comes down to three suggestions:
>
> (a) Python should adopt mypy's syntax for function annotations


+1. I'm a developer of the code analysis engine of PyCharm. I have discussed this idea with Jukka Lehtosalo and recently with Dave Halter, the author of Jedi code completion library. Standardized type annotations would be very useful for code analysis tools and IDEs such as PyCharm, Jedi and pylint. Type annotations would be especially great for third-party libraries. The idea is that most Python programmers don't have to write annotations in order to benefit from them. Annotated libraries are often enough for good code analysis.

We (PyCharm) and Jukka have made some initial steps in this direction, including thoughts on semantics of annotations (https://github.com/pytypes/pytypes). Feedback is welcome.

Here are slides from my talk about optional typing in Python, that show how Mypy types can be used in both static and dynamic type checking (http://blog.pirx.ru/media/files/2013/python-optional-typing/), Mypy-related part starts from slide 14.

We are interested in getting type annotations standardized and we would like to help developing and testing type annotations proposals.

--
Andrey Vlasovskikh
Web: http://pirx.ru/

Antoine Pitrou

Aug 13, 2014, 5:13:51 PM8/13/14
to python...@python.org

Hello,

First, as a disclaimer, I am currently working on Numba for Continuum
Analytics. Numba has its own type inference system which it applies to
functions decorated with the @jit decorator. Due to Numba's objectives,
the type inference system is heavily geared towards numerical computing,
but it is conceptually (and a bit concretely) able to represent more
generic information, such as "an enumerate() over an iterator of a
complex128 numpy array".

There are two sides to type inference:

1) first the (optional) annotations
(I'm saying "optional" because in the most basic usage, a JIT compiler
is normally able to defer compilation until the first function or method
call, and to deduce input types from that)

2) second the inference engine properly, which walks the code (in
whatever form the tool's developer has chosen: bytecode, AST, IR) and
deduces types for any intermediate values

Now only #1 is implied by this PEP proposal, but it also sounds like we
should take into account the desired properties of #2 (for example,
being able to express "an iterator of three-tuples" can be important for
a JIT compiler - or not, perhaps, depending on the JIT compiler :-)).
What #2 wants to do will differ depending on the use case: e.g. a code
checker may need less type granularity than a JIT compiler.


Therefore, regardless of mypy's typesystem's completeness and
granularity, one requirement is for it to be easily extensible. By
extensible I mean not only being able to define new type descriptions,
but being able to do so for existing third-party libraries you don't
want to modify.

I'm saying that because I'm looking at
http://mypy-lang.org/tutorial.html#genericclasses , and it's not clear
from this example whether the typing code has to be interwoven with the
collection's implementation, or can be written as a separate code module
entirely (*). Ideally both should probably be possible (in the same vein
as being able to subclass an ABC, or register an existing class with
it). This also includes being able to type-declare functions and types from C
extension modules.

In Numba, this would be typically required to write typing descriptions
for Numpy arrays and functions; but also to derive descriptions for
fixed-width integers, single-precision floats, etc. (this also means
some form of subclassing for type descriptions themselves).

(*) (actually, I'm a bit worried when I see that "List[int]()"
instantiates an actual list; calling a type description class should
give you a parametered type description, not an object; the [] notation
is in general not powerful enough if you want several type parameters,
possibly keyword-only)


At some point, it will be even better if the typing system is powerful
enough to remember properties of the *values* (for example not only "a
string", but "a one-character string", or even "one of the 'Y', 'M', 'D'
strings"). Think about type-checking / type-inferring calls to the struct
module.


I may come back with more comments once I've read the mypy docs and/or
code in detail.

Regards

Antoine.

Guido van Rossum

Aug 13, 2014, 5:47:46 PM8/13/14
to Alex Gaynor, Python-Ideas
On Wed, Aug 13, 2014 at 1:29 PM, Alex Gaynor <alex....@gmail.com> wrote:
I'm strongly opposed this, for a few reasons.

First, I think that standardizing on a syntax, without a semantics is
incredibly confusing, and I can't imagine how having *multiple* competing
implementations would be a boon for anyone.

That part was probably overly vague in my original message. I actually do want to standardize on semantics, but I think the semantics will prove controversial (they already have :-) and I think it's better to standardize the syntax and *some* semantics first rather than having to wait another decade for the debate over the semantics to settle. I mostly want to leave the door open for mypy to become smarter. But it might make sense to have a "weaker" interpretation in some cases too (e.g. an IDE might use a weaker type system in order to avoid overwhelming users with warnings).
 
This proposal seems to be built around the idea that we should have a syntax,
and then people can write third party tools, but Python itself won't really do
anything with them.

Right.
 
Fundamentally, this seems like a very confusing approach. How we write a type,
and what we do with that information are fundamentally connected. Can I cast a
``List[str]`` to a ``List[object]`` in any way? If yes, what happens when I go
to put an ``int`` in it? There's no runtime checking, so the type system is
unsound, on the other hand, disallowing this prevents many types of successes.

Mypy has a cast() operator that you can use to shut it up when you (think you) know the conversion is safe.
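For example (typing.cast, as it exists today, returns its argument unchanged, so it costs nothing at run time):

```python
from typing import List, cast

def lengths(objs: List[object]) -> List[int]:
    # The programmer knows these are really strings; cast() tells the
    # checker so, without any runtime effect.
    strs = cast(List[str], objs)
    return [len(s) for s in strs]
```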
 
Both solutions have merit, but the idea of some implementations of the type
checker having covariance and some contravariance is fairly disturbing.

Yeah, that wouldn't be good. ;-)
 
Another concern I have is that analysis based on these types is making some
pretty strong assumptions about static-ness of Python programs that aren't
valid. While existing checkers like ``flake8`` also do this, their assumptions
are basically constrained to the symbol table, while this is far deeper. For
example, can I annotate something as ``six.text_type``? What about

``django.db.models.sql.Query`` (keep in mind that this class is redefined based
on what database you're using (not actually true, but it used to be))?

Time will have to tell. Stubs can help. I encourage you to try annotating a medium-sized module. It's likely that you'll find a few things: maybe a bug in mypy, maybe a missing mypy feature, maybe a bug in your code, maybe a shady coding practice in your code or a poorly documented function (I know I found several of each during my own experiments so far).
 
Python's type system isn't very good. It lacks many features of more powerful
systems such as algebraic data types, interfaces, and parametric polymorphism.
Despite this, it works pretty well because of Python's dynamic typing. I
strongly believe that attempting to enforce the existing type system would be a
real shame.

Mypy shines in those areas of Python programs that are mostly statically typed. There are many such areas in most large systems. There are usually also some areas where mypy's type system is inadequate. It's easy to shut it up for those cases (in fact, mypy is silent unless you use at least one annotation for a function). But that's the case with most type systems. Even Haskell sometimes calls out to C.

Guido van Rossum

Aug 13, 2014, 6:01:47 PM8/13/14
to Ethan Furman, Python-Ideas
On Wed, Aug 13, 2014 at 1:50 PM, Ethan Furman <et...@stoneleaf.us> wrote:
On 08/13/2014 01:19 PM, Guido van Rossum wrote:

On Wed, Aug 13, 2014 at 12:59 PM, Ethan Furman wrote:

-1 on deprecating alternative uses of annotations.

Do you have a favorite alternative annotation use that you actually use (or are likely to)?

My script argument parser [1] uses annotations to figure out how to parse the cli parameters and cast them to appropriate values (copied the idea from one of Michele Simionato's projects... plac [2], I believe).

I could store the info in some other structure besides 'annotations', but it's there and it fits the bill conceptually.  Amusingly, it's a form of type info, but instead of saying what it has to already be, says what it will become.

I couldn't find any docs for scription (the tarball contains just the source code, not even an example), although I did find some for plac. I expect adding type annotations to the source of scription.py might actually make it easier to grok what it does. :-)

But really, I'm sure that in Python 3.5, scription and mypy can coexist. If the mypy idea takes off you might eventually be convinced to use a different convention. But you'd get plenty of warning.
 
[1] https://pypi.python.org/pypi/scription  (due for an overhaul now I've used it for awhile ;)
[2] https://pypi.python.org/pypi/plac/0.9.1

Guido van Rossum

Aug 13, 2014, 6:07:21 PM8/13/14
to Donald Stufft, Python-Ideas, Alex Gaynor
On Wed, Aug 13, 2014 at 1:53 PM, Donald Stufft <don...@stufft.io> wrote:
I agree with Alex that I think leaving the actual semantics of what these things
mean up to a third party, which can possibly be swapped out by individual end
users, is terribly confusing. I don’t think I agree though that this is a bad
idea in general, I think that we should just add it for real and skip the
indirection.

Yeah, I probably overstated the option of alternative interpretations. I just don't want to have to write a PEP that specifies every little detail of mypy's type checking algorithm, and I don't think anyone would want to have to read such a PEP either. But maybe we can compromise on something that sketches broad strokes and leaves the details up to the team that maintains mypy (after all that tactic has worked pretty well for Python itself :-).
 
IOW I'm not sure I see the benefit of defining the syntax but not the semantics
when it seems this is already completely possible given the fact that mypy
exists.

The only real benefits I can see from doing it are that the stdlib can use it,
and the ``import typing`` aspect. I don't believe that the stdlib benefits are
great enough to get the possible confusion of multiple different implementations
and I think that the typing import could easily be provided as a project on PyPI
that people can depend on if they want to use this in their code.

So my vote would be to add mypy semantics to the language itself.

What exactly would that mean? I don't think the Python interpreter should reject programs that fail the type check -- in fact, separating the type check from run time is the most crucial point of my proposal.

I'm fine to have a discussion on things like covariance vs. contravariance, or what form of duck typing are acceptable, etc.
 

Guido van Rossum

Aug 13, 2014, 6:08:48 PM8/13/14
to Andrey Vlasovskikh, Python-Ideas
Wow. Awesome. I will make time to study what you have already done!


On Wed, Aug 13, 2014 at 2:08 PM, Andrey Vlasovskikh <andrey.vl...@gmail.com> wrote:
2014-08-14, 0:19, Guido van Rossum <gu...@python.org> wrote:

> Yesterday afternoon I had an inspiring conversation with Bob Ippolito (man of many trades, author of simplejson) and Jukka Lehtosalo (author of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about what Python can learn from Haskell (and other languages); yesterday he gave the same talk at Dropbox. The talk is online (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad strokes comes down to three suggestions:
>
>  (a) Python should adopt mypy's syntax for function annotations


+1. I'm a developer of the code analysis engine of PyCharm. I have discussed this idea with Jukka Lehtosalo and recently with Dave Halter, the author of Jedi code completion library. Standardized type annotations would be very useful for code analysis tools and IDEs such as PyCharm, Jedi and pylint. Type annotations would be especially great for third-party libraries. The idea is that most Python programmers don't have to write annotations in order to benefit from them. Annotated libraries are often enough for good code analysis.

We (PyCharm) and Jukka have made some initial steps in this direction, including thoughts on semantics of annotations (https://github.com/pytypes/pytypes). Feedback is welcome.

Here are slides from my talk about optional typing in Python, that show how Mypy types can be used in both static and dynamic type checking (http://blog.pirx.ru/media/files/2013/python-optional-typing/), Mypy-related part starts from slide 14.

We are interested in getting type annotations standardized and we would like to help developing and testing type annotations proposals.

--
Andrey Vlasovskikh
Web: http://pirx.ru/




Juancarlo Añez

Aug 13, 2014, 6:22:46 PM8/13/14
to Guido van Rossum, Jukka Lehtosalo, Python-Ideas

On Wed, Aug 13, 2014 at 3:14 PM, Guido van Rossum <gu...@python.org> wrote:
I am proposing that we adopt whatever mypy uses here, keeping discussion of the details (mostly) out of the PEP. The goal is to make it possible to add type checking annotations to 3rd party modules (and even to the stdlib) while allowing unaltered execution of the program by the (unmodified) Python 3.5 interpreter.

I'll comment later on the core subject.

For now, I think this deserves some thought:

Function annotations are not available in Python 2.7, so promoting widespread use of annotations in 3.5 would be promoting code that is compatible only with 3.x, when the current situation is that much effort is being spent on writing code that works on both 2.7 and 3.4 (most libraries?).

Independently of its core merits, this proposal should fail unless annotations are added to Python 2.8.

Cheers,

--
Juancarlo Añez

Todd

unread,
Aug 13, 2014, 6:29:00 PM8/13/14
to python-ideas


On Aug 13, 2014 9:45 PM, "Guido van Rossum" <gu...@python.org> wrote:
> (1) A change of direction for function annotations
>
> PEP 3107, which introduced function annotations, is intentionally non-committal about how function annotations should be used. It lists a number of use cases, including but not limited to type checking. It also mentions some rejected proposals that would have standardized either a syntax for indicating types and/or a way for multiple frameworks to attach different annotations to the same function. AFAIK in practice there is little use of function annotations in mainstream code, and I propose a conscious change of course here by stating that annotations should be used to indicate types and to propose a standard notation for them.
>
> (We may have to have some backwards compatibility provision to avoid breaking code that currently uses annotations for some other purpose. Fortunately the only issue, at least initially, will be that when running mypy to type check such code it will produce complaints about the annotations; it will not affect how such code is executed by the Python interpreter. Nevertheless, it would be good to deprecate such alternative uses of annotations.)

I watched the original talk and read your proposal.  I think type annotations could be very useful in certain contexts.

However, I still don't get this bit. Why would allowing type annotations automatically imply that no other annotations would be possible?  Couldn't we formalize what would be considered a type annotation while still allowing annotations that don't fit these criteria to be used for other things?
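One conceivable mechanism, sketched below purely as an illustration (the decorator name and attribute are hypothetical here, not part of any actual proposal), would be an opt-out marker that tells a type checker to skip functions whose annotations mean something else:

```python
# Hypothetical opt-out marker: a type checker would skip any function
# carrying this flag, leaving its annotations free for other uses.
# (The name and the attribute are illustrative only.)
def no_type_check(func):
    func.__no_type_check__ = True
    return func

@no_type_check
def greet(name: "the person to greet"):  # annotation as documentation, not a type
    return "Hello, " + name

# The interpreter ignores the flag completely; the function runs as usual.
print(greet("world"))
```

The annotation is still evaluated and stored in `__annotations__` as PEP 3107 specifies; only the (hypothetical) checker's behavior changes.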

Manuel Cerón

unread,
Aug 13, 2014, 6:29:01 PM8/13/14
to python...@python.org
On Wed, Aug 13, 2014 at 9:44 PM, Guido van Rossum <gu...@python.org> wrote:
[There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with some motivations for adding type annotations at the end.]

This is a very interesting idea. I played a bit with function annotations (https://github.com/ceronman/typeannotations) and I gave a talk about them at EuroPython 2013. Static type analysis is probably the best use case.

The curious thing here is that while standardizing a syntax for type annotations, we technically still won't be adopting standard rules for type checking. This is intentional. First of all, fully specifying all the type checking rules would make for a really long and boring PEP (a much better specification would probably be the mypy source code). Second, I think it's fine if the type checking algorithm evolves over time, or if variations emerge. The worst that can happen is that you consider your code correct but mypy disagrees; your code will still run.

That said, I don't want to completely leave out any specification. I want the contents of the typing.py module to be specified in the PEP, so that it can be used with confidence. But whether mypy will complain about your particular form of duck typing doesn't have to be specified by the PEP. Perhaps as mypy evolves it will take options to tell it how to handle certain edge cases. Forks of mypy (or entirely different implementations of type checking based on the same annotation syntax) are also a possibility. Maybe in the distant future a version of Python will take a different stance, once we have more experience with how this works out in practice, but for Python 3.5 I want to restrict the scope of the upheaval.

The type checking algorithm might evolve over the time, but by including typing.py in the stdlib, the syntax for annotations would be almost frozen and that will be a limitation. In other projects such as TypeScript (http://www.typescriptlang.org/), the syntax usually evolves alongside the algorithms.

Is the syntax specified in typing.py mature enough to put it in the stdlib and expect users to start annotating their projects without worrying too much about future changes?

Is there enough feedback from users using mypy in their projects?

I think that rushing typing.py into 3.5 is not a good idea. However, it'd be nice to add some notes in PEP 8, encourage its use as an external library, and let some projects and tools (e.g. PyCharm) use it. It's not that bad if mypy lives 100% outside the Python distribution for a while, just like TypeScript to JavaScript. After getting some user base, part of it (typing.py) could be moved to the stdlib.

Manuel.

Ryan Gonzalez

unread,
Aug 13, 2014, 6:35:11 PM8/13/14
to Christian Heimes, python-ideas
On Wed, Aug 13, 2014 at 3:29 PM, Christian Heimes <chri...@python.org> wrote:
On 13.08.2014 21:44, Guido van Rossum wrote:
> Yesterday afternoon I had an inspiring conversation with Bob Ippolito
> (man of many trades, author of simplejson) and Jukka Lehtosalo (author
> of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about
> what Python can learn from Haskell (and other languages); yesterday he
> gave the same talk at Dropbox. The talk is online
> (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad
> strokes comes down to three suggestions:
>
>   (a) Python should adopt mypy's syntax for function annotations
>   (b) Python's use of mutable containers by default is wrong
>   (c) Python should adopt some kind of Abstract Data Types

I was at Bob's talk during EP14 and really liked the idea. A couple of
colleagues and other attendees also said it's a good and useful
proposal. I also like your proposal to standardize the type annotations
first without a full integration of mypy.

In general I'm +1 but I like to discuss two aspects:

1) I'm not keen with the naming of mypy's typing classes. The visual
distinction between e.g. dict() and Dict() is too small and IMHO
confusing for newcomers. How about an additional 'T' prefix to make
clear that the objects are referring to typing objects?

  from typing import TList, TDict

  def word_count(input: TList[str]) -> TDict[str, int]:
      ...

Eeewwwww. That's way too Pascal-ish.


2) PEP 3107 only specifies arguments and return values but not
exceptions that can be raised by a function. Java has the "throws"
syntax to list possible exceptions:

 public void readFile() throws IOException {}

May I suggest that we also standardize a way to annotate the exceptions
that can be raised by a function? It's a very useful piece of
information and commonly requested information on the Python user
mailing list. It doesn't have to be a new syntax element, a decorator in
the typing module would suffice, too. For example:

  from typing import TList, TDict, raises

  @raises(RuntimeError, (ValueError, "is raised when input is empty"))
  def word_count(input: TList[str]) -> TDict[str, int]:
      ...

That was a disaster in C++. It's confusing, especially since Python uses exceptions more than most other languages do.
 

Regards,
Christian

_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/



--
Ryan
If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated."

Donald Stufft

unread,
Aug 13, 2014, 6:44:59 PM8/13/14
to gu...@python.org, Python-Ideas, Alex Gaynor
I don’t know exactly :)

Some ideas:

1) Raise a warning when the type check fails, but allow it happen. This would
   have the benefit of possibly catching bugs, but it's still opt in in the
   sense that you have to write the annotations for anything to happen. This
   would also enable people to turn on enforced type checking by raising the
   warning level to an exception.

   Even if this was off by default it would make it easy to enable it during
   test runs and also enable easier/better quickcheck like functionality.

2) Simply add a flag to the interpreter that turns on type checking.

3) Add a stdlib module that would run the program under type checking, like
   ``python -m typing myprog`` instead of ``python -m myprog``.

Really I think a lot of the benefit is likely to come in the form of linting
and during test runs. However if I have to run a separate Python interpreter
to actually do the run then I risk getting bad results through varying things
like interpreter differences, language level differences, etc.

Although I wouldn't complain if it meant that Python had actual type checking
at the run time if a function had type annotations :)


I'm fine to have a discussion on things like covariance vs. contravariance, or what form of duck typing are acceptable, etc.

I’m not particularly knowledgable about the actual workings of a type system and
covariance vs contravariance and the like. My main concern there is having a
single reality. The meaning of something shouldn't change because I used a
different interpreter/linter/whatever. Beyond that I don't know enough to have
an opinion on the actual semantics.

Guido van Rossum

unread,
Aug 13, 2014, 7:13:17 PM8/13/14
to Juancarlo Añez, Jukka Lehtosalo, Python-Ideas
Actually, mypy already has a solution. There's a codec (https://github.com/JukkaL/mypy/tree/master/mypy/codec) that you can use which transforms Python-2-with-annotations into vanilla Python 2. It's not an ideal solution, but it can work in cases where you absolutely have to have state of the art Python 3.5 type checking *and* backwards compatibility with Python 2.

Guido van Rossum

unread,
Aug 13, 2014, 7:26:15 PM8/13/14
to Manuel Cerón, Python-Ideas
On Wed, Aug 13, 2014 at 3:26 PM, Manuel Cerón <cero...@gmail.com> wrote:
The type checking algorithm might evolve over the time, but by including typing.py in the stdlib, the syntax for annotations would be almost frozen and that will be a limitation. In other projects such as TypeScript (http://www.typescriptlang.org/), the syntax usually evolves alongside the algorithms.

What kind of evolution did TypeScript experience?
 
Is the syntax specified in typing.py mature enough to put it in the stdlib and expect users to start annotating their projects without worrying too much about future changes?

This is a good question. I do think it is good enough as a starting point for future evolution. Perhaps the biggest question is how fast will the annotation syntax need to evolve? If it needs to evolve significantly faster than Python 3 feature releases come out (every 18 months, approximately) then it may be better to hold off and aim for inclusion in the 3.6 standard library. That would allow more time to reach agreement (though I'm not sure that's a good thing :-), and in the mean time typing.py could be distributed as a 3rd party module on PyPI.
 
Is there enough feedback from users using mypy in their projects?

I think that rushing typing.py into 3.5 is not a good idea. However, it'd be nice to add some notes in PEP 8, encourage its use as an external library, and let some projects and tools (e.g. PyCharm) use it. It's not that bad if mypy lives 100% outside the Python distribution for a while, just like TypeScript to JavaScript.

Well, JavaScript's evolution is tied up forever in a standards body, so TypeScript realistically had no choice in the matter. But are there actually people writing TypeScript? I haven't heard from them yet (people at Dropbox seem to rather like CoffeeScript). Anyway, the situation isn't quite the same -- you wouldn't make any friends in the Python world if you wrote your code in an incompatible dialect that could only be executed after a translation step, but in the JavaScript world that's how all alternative languages work (and they even manage to interoperate).
 
After getting some user base, part of it (typing.py) could be moved to the stdlib.

I'm still hopeful that we can get a sufficient user base and agreement on mypy's features for inclusion in 3.5 (extrapolating the 3.4 release schedule by 18 months, 3.5 alpha 1 would go out around February 2015; the feature freeze cut-off date, beta 1, would be around May thereafter).

Ben Finney

unread,
Aug 13, 2014, 7:28:16 PM8/13/14
to python...@python.org
Christian Heimes <chri...@python.org>
writes:

> 1) I'm not keen with the naming of mypy's typing classes. The visual
> distinction between e.g. dict() and Dict() is too small and IMHO
> confusing for newcomers. How about an additional 'T' prefix to make
> clear that the objects are referring to typing objects?

To this reader, ‘dict’ and ‘list’ *are* “typing objects” — they are
objects that are types. Seeing code that referred to something else as
“typing objects” would be an invitation to confusion, IMO.

You could argue “that's because you don't know the special meaning of
“typing object” being discussed here”. To which my response would be,
for a proposal to add something else as meaningful Python syntax, the
jargon is poorly chosen and needlessly confusing with established terms
in Python.

If there's going to be a distinction between the types (‘dict’, ‘list’,
etc.) and something else, I'd prefer it to be based on a clearer
terminology distinction.

--
\ “Simplicity and elegance are unpopular because they require |
`\ hard work and discipline to achieve and education to be |
_o__) appreciated.” —Edsger W. Dijkstra |
Ben Finney

Guido van Rossum

unread,
Aug 13, 2014, 7:31:43 PM8/13/14
to Todd, python-ideas
On Wed, Aug 13, 2014 at 3:28 PM, Todd <todd...@gmail.com> wrote:
However, I still don't get this bit. Why would allowing type annotations automatically imply that no other annotations would be possible?  Couldn't we formalize what would be considered a type annotation while still allowing annotations that don't fit this criteria to be used for other things?

We certainly *could* do that. However, I haven't seen sufficient other uses of annotations. If there is only one use for annotations (going forward), annotations would be unambiguous. If we allow different types of annotations, there would have to be a way to tell whether a particular annotation is intended as a type annotation or not. Currently mypy ignores all modules that don't import typing.py (using any form of import statement), and we could continue this convention. But it would mean that something like this would still require the typing import in order to be checked by mypy:

import typing

def gcd(a: int, b: int) -> int:
    <tralala>

The (necessary) import would be flagged as unused by every linter in the world... :-(

Guido van Rossum

unread,
Aug 13, 2014, 7:45:23 PM8/13/14
to Donald Stufft, Python-Ideas, Alex Gaynor
On Wed, Aug 13, 2014 at 3:44 PM, Donald Stufft <don...@stufft.io> wrote:
On Aug 13, 2014, at 6:05 PM, Guido van Rossum <gu...@python.org> wrote:

On Wed, Aug 13, 2014 at 1:53 PM, Donald Stufft <don...@stufft.io> wrote:

So my vote would be to add mypy semantics to the language itself.

What exactly would that mean? I don't think the Python interpreter should reject programs that fail the type check -- in fact, separating the type check from run time is the most crucial point of my proposal.

I don’t know exactly :)

Some ideas:

1) Raise a warning when the type check fails, but allow it happen. This would
   have the benefit of possibly catching bugs, but it's still opt in in the
   sense that you have to write the annotations for anything to happen. This
   would also enable people to turn on enforced type checking by raising the
   warning level to an exception.

I don't think that's going to happen. It would require the entire mypy implementation to be checked into the stdlib. It would also require all sorts of hacks in that implementation to deal with dynamic (or just delayed) imports. Mypy currently doesn't handle any of that -- it must be able to find all imported modules before it starts executing even one line of code.
 
   Even if this was off by default it would make it easy to enable it during
   test runs and also enable easier/better quickcheck like functionality.

It would *have* to be off by default -- it's way too slow to be on by default (note that some people are already fretting out today about a 25 msec process start-up time).
 
2) Simply add a flag to the interpreter that turns on type checking.

3) Add a stdlib module that would run the program under type checking, like
   ``python -m typing myprog`` instead of ``python -m myprog``.

Really I think a lot of the benefit is likely to come in the form of linting
and during test runs. However if I have to run a separate Python interpreter
to actually do the run then I risk getting bad results through varying things
like interpreter differences, language level differences, etc.

Yeah, but I just don't think it's realistic to do anything about that for 3.5 (or 3.6 for that matter). In a decade... Who knows! :-)
 
Although I wouldn't complain if it meant that Python had actual type checking
at the run time if a function had type annotations :)

It's probably possible to write a decorator that translates annotations into assertions that are invoked when a function is called. But in most cases it would be way too slow to turn on everywhere.
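Such a decorator might look roughly like this: a sketch that only handles plain classes (int, str, ...) as annotations, not generics like List[str]. It is illustrative, not part of mypy, and as noted above, far too slow to enable everywhere.

```python
import functools
import inspect

def type_checked(func):
    # Sketch: turn simple class annotations into call-time assertions.
    # Only plain classes are checked; anything else is ignored.
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            ann = sig.parameters[name].annotation
            if ann is not inspect.Parameter.empty and isinstance(ann, type):
                assert isinstance(value, ann), (
                    "%s: expected %s, got %r" % (name, ann.__name__, value))
        result = func(*args, **kwargs)
        ret = sig.return_annotation
        if ret is not inspect.Signature.empty and isinstance(ret, type):
            assert isinstance(result, ret), "bad return type: %r" % (result,)
        return result
    return wrapper

@type_checked
def double(x: int) -> int:
    return x * 2

double(21)      # fine
# double("ab")  # would raise AssertionError at call time
```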
I'm fine to have a discussion on things like covariance vs. contravariance, or what form of duck typing are acceptable, etc.
I’m not particularly knowledgable about the actual workings of a type system and
covariance vs contravariance and the like. My main concern there is having a
single reality. The meaning of something shouldn't change because I used a
different interpreter/linter/whatever. Beyond that I don't know enough to have
an opinion on the actual semantics.

Yeah, I regret writing it so vaguely already. Having Alex Gaynor open with "I'm strongly opposed [to] this" is a great joy killer. :-)

I just really don't want to have to redundantly write up a specification for all the details of mypy's type checking rules in PEP-worthy English. But I'm fine with discussing whether List[str] is a subclass or a superclass of List[object] and how to tell the difference.

Still, different linters exist and I don't hear people complain about that. I would also be okay if PyCharm's interpretation of the finer points of the type checking syntax was subtly different from mypy's. In fact I would be surprised if they weren't sometimes in disagreement. Heck, PyPy doesn't give *every* Python program the same meaning as CPython, and that's a feature. :-)

Donald Stufft

unread,
Aug 13, 2014, 7:59:40 PM8/13/14
to gu...@python.org, Python-Ideas, Alex Gaynor
Understood! And really the most important thing I'm worried about isn’t that
there is some sort of code in the stdlib or in the interpreter just that there
is an authoritative source of what stuff means.


Still, different linters exist and I don't hear people complain about that. I would also be okay if PyCharm's interpretation of the finer points of the type checking syntax was subtly different from mypy's. In fact I would be surprised if they weren't sometimes in disagreement. Heck, PyPy doesn't give *every* Python program the same meaning as CPython, and that's a feature. :-)


Depends on what is meant by "meaning" I suppose. Generally in those linters or
PyPy itself if there is a different *meaningful* result (for instance if
print was defaulting to sys.stderr) then CPython (incl docs) acts as the
authoritative source of what ``print()`` means (in this case writing to
sys.stdout).

I'm also generally OK with deferring possible code/interpreter changes to add
actual type checking until a later point in time. If there's a defined semantics
to what those annotations mean then third parties can experiment and do things
with it, and those different things can be considered for incorporation into
Python proper in 3.6 (or 3.7, or whatever).

Honestly, I think the things I was worried about are sufficiently allayed,
given that it appears I was reading more into the vagueness and the optionally
different interpretations than what was meant, and I don't want to keep
harping on it :) As long as there's some single source of what List[str]
or what have you means, then I'm pretty OK with it all.

Chris Angelico

unread,
Aug 13, 2014, 8:32:42 PM8/13/14
to Python-Ideas
On Thu, Aug 14, 2014 at 5:44 AM, Guido van Rossum <gu...@python.org> wrote:
> from typing import List, Dict
>
> def word_count(input: List[str]) -> Dict[str, int]:
>     result = {}  #type: Dict[str, int]
>     for line in input:
>         for word in line.split():
>             result[word] = result.get(word, 0) + 1
>     return result

I strongly support the concept of standardized typing information.
There'll be endless bikeshedding on names, though - personally, I
don't like the idea of "from typing import ..." as there's already a
"types" module and I think it'd be confusing. (Also, "mypy" sounds
like someone's toy reimplementation of Python, which it does seem to
be :) but that's not really well named for "type checker using stdlib
annotations".) But I think the idea is excellent, and it deserves
stdlib support.

The cast notation sounds to me like it's what Pike calls a "soft cast"
- it doesn't actually *change* anything (contrast a C or C++ type
cast, where (float)42 is 42.0), it just says to the compiler/type
checker "this thing is actually now this type". If the naming is clear
on this point, it leaves open the possibility of actual recursive
casting - where casting a List[str] to List[int] is equivalent to
[int(x) for x in lst]. Whether or not that's a feature worth adding
can be decided in the distant future :)
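A soft cast in this sense is a runtime no-op; a minimal sketch of the behavior described above (not mypy's actual implementation) is just an identity function:

```python
# A "soft cast": it only changes what a static checker believes about
# the value; at run time nothing is converted or checked.
def cast(typ, value):
    return value  # identity; the type argument is for the checker only

x = cast(float, 42)
assert x == 42 and type(x) is int  # contrast C's (float)42, which is 42.0
```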

+1 on the broad proposal. +0.5 on defining the notation while leaving
the actual type checking to an external program.

ChrisA

Haoyi Li

unread,
Aug 13, 2014, 8:54:25 PM8/13/14
to Chris Angelico, Python-Ideas
Both solutions have merit, but the idea of some implementations of the type checker having covariance and some contravariance is fairly disturbing.

Why can't we have both? That's the only way to properly type things, since immutable-get-style APIs are always going to be covariant, set-only style APIs (e.g. a function that takes 1 arg and returns None) are going to be contravariant, and mutable get-set APIs (like most Python collections) should really be invariant.
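The classic reason mutable get-set collections must be invariant can be shown concretely. If List[str] were treated as a subtype of List[object], a checker would accept the call below, yet it corrupts the list's element type (annotations appear only in comments, since no typing module is assumed here):

```python
# If List[str] <: List[object] (covariance), this type-checks...
def add_number(objects):        # imagine: objects: List[object] -> None
    objects.append(42)          # perfectly fine for a list of objects...

words = ["a", "b"]              # imagine: words is a List[str]
add_number(words)               # ...but after this call, the "list of
assert 42 in words              # str" contains an int
```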

Łukasz Langa

unread,
Aug 13, 2014, 9:01:40 PM8/13/14
to Guido van Rossum, Python-Ideas
It’s great to see this finally happening!

I did some research on existing optional-typing approaches [1]. What I learned in the process was that linting is the most important use case for optional typing; runtime checks are too little, too late.

That being said, having optional runtime checks available *is* also important. Used in staging environments and during unit testing, runtime checks can cover cases obscured by meta-programming. Implementations like “obiwan” and “pytypedecl” show that providing a runtime type checker is absolutely feasible.

The function annotation syntax currently supported in Python 3.4 is not well-suited for typing. This is because users expect to be able to operate on the types they know. This is currently not feasible because:
1. forward references are impossible
2. generics are impossible without custom syntax (which is the reason Mypy’s Dict exists)
3. optional types are clumsy to express (Optional[int] is very verbose for a use case this common)
4. union types are clumsy to express

All those problems are elegantly solved by Google’s pytypedecl via moving type information to a separate file. Because for our use case that would not be an acceptable approach, my intuition would be to:

1. Provide support for generics (understood as an answer to the question: “what does this collection contain?”) in Abstract Base Classes. That would be a PEP in itself.
2. Change the function annotation syntax so that it’s not executed at import time but rather treated as strings. This solves forward references and enables us to…
3. Extend the function annotation syntax with first-class generics support (most languages use something like "list<str>")
4. Extend the function annotation syntax with first-class union type support. pytypedecl simply uses “int or None”, which I find very elegant.
5. Speaking of None, possibly further extend the function annotation syntax with first-class optionality support. In the Facebook codebase in Hack we have tens of thousands of optional ints (nevermind other optional types!), this is a case that’s going to be used all the time. Hack uses ?int, that’s the most succinct style you can get. Yes, it’s special but None is a special type, too.

All in all, I believe Mypy has the highest chance of becoming our typing linter, which is great! I just hope we can improve on the syntax, which is currently lacking. Also, reusing our existing ABCs where applicable would be nice. With Mypy’s typing module I feel like we’re going to get a new, orthogonal set of ABCs, which will confuse users to no end. Finally, the runtime type checker would make the ecosystem complete.

This is just the beginning of the open issues I was juggling with, and the reason my own attempt at the PEP was coming up slower than I’d like.

[1] You can find a summary of examples I looked at here: http://lukasz.langa.pl/typehinting/

-- 
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

On Aug 13, 2014, at 12:44 PM, Guido van Rossum <gu...@python.org> wrote:

[There is no TL;DR other than the subject line. Please read the whole thing before replying. I do have an appendix with some motivations for adding type annotations at the end.]
Yesterday afternoon I had an inspiring conversation with Bob Ippolito (man of many trades, author of simplejson) and Jukka Lehtosalo (author of mypy: http://mypy-lang.org/). Bob gave a talk at EuroPython about what Python can learn from Haskell (and other languages); yesterday he gave the same talk at Dropbox. The talk is online (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad strokes comes down to three suggestions:

  (a) Python should adopt mypy's syntax for function annotations
  (b) Python's use of mutable containers by default is wrong
  (c) Python should adopt some kind of Abstract Data Types

Proposals (b) and (c) don't feel particularly actionable (if you disagree please start a new thread, I'd be happy to discuss these further if there's interest) but proposal (a) feels right to me.

So what is mypy?  It is a static type checker for Python written by Jukka for his Ph.D. thesis. The basic idea is that you add type annotations to your program using some custom syntax, and when running your program using the mypy interpreter, type errors will be found during compilation (i.e., before the program starts running).

The clever thing here is that the custom syntax is actually valid Python 3, using (mostly) function annotations: your annotated program will still run with the regular Python 3 interpreter. In the latter case there will be no type checking, and no runtime overhead, except to evaluate the function annotations (which are evaluated at function definition time but don't have any effect when the function is called).

In fact, it is probably more useful to think of mypy as a heavy-duty linter than as a compiler or interpreter; leave the type checking to mypy, and the execution to Python. It is easy to integrate mypy into a continuous integration setup, for example.

To read up on mypy's annotation syntax, please see the mypy-lang.org website. Here's just one complete example, to give a flavor:


  from typing import List, Dict

  def word_count(input: List[str]) -> Dict[str, int]:
      result = {}  #type: Dict[str, int]
      for line in input:
          for word in line.split():
              result[word] = result.get(word, 0) + 1
      return result


Note that the #type: comment is part of the mypy syntax; mypy uses comments to declare types in situations where no syntax is available -- although this particular line could also be written as follows:

    result = Dict[str, int]()

Either way the entire function is syntactically valid Python 3, and a suitable implementation of typing.py (containing class definitions for List and Dict, for example) can be written to make the program run correctly. One is provided as part of the mypy project.
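To illustrate what "a suitable implementation" might involve, here is a deliberately minimal sketch (not mypy's actual typing.py, which is considerably richer): subscripting is a runtime no-op, and the subscripted form is callable so that `Dict[str, int]()` builds a plain dict.

```python
# Minimal stand-ins for typing.List / typing.Dict: subscripting returns
# the alias unchanged, and calling it constructs the plain container.
# (Sketch only; mypy's real typing.py does much more.)
class _GenericAlias:
    def __init__(self, origin):
        self.origin = origin
    def __getitem__(self, params):
        return self  # type parameters are ignored at run time
    def __call__(self, *args, **kwargs):
        return self.origin(*args, **kwargs)

List = _GenericAlias(list)
Dict = _GenericAlias(dict)

def word_count(input: List[str]) -> Dict[str, int]:
    result = Dict[str, int]()  # just builds an empty dict at run time
    for line in input:
        for word in line.split():
            result[word] = result.get(word, 0) + 1
    return result
```

With these definitions the annotated example runs unmodified under the plain Python 3 interpreter, which is the whole point: the annotations evaluate to harmless objects and impose no behavior.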

I should add that many of mypy's syntactic choices aren't actually new. The basis of many of its ideas go back at least a decade: I blogged about this topic in 2004 (http://www.artima.com/weblogs/viewpost.jsp?thread=85551 -- see also the two followup posts linked from the top there).

I'll emphasize once more that mypy's type checking happens in a separate pass: no type checking happens at run time (other than what the interpreter already does, like raising TypeError on expressions like 1+"1").

There's a lot to this proposal, but I think it's possible to get a PEP written, accepted and implemented in time for Python 3.5, if people are supportive. I'll go briefly over some of the action items.

(1) A change of direction for function annotations

PEP 3107, which introduced function annotations, is intentionally non-committal about how function annotations should be used. It lists a number of use cases, including but not limited to type checking. It also mentions some rejected proposals that would have standardized either a syntax for indicating types and/or a way for multiple frameworks to attach different annotations to the same function. AFAIK in practice there is little use of function annotations in mainstream code, and I propose a conscious change of course here by stating that annotations should be used to indicate types and to propose a standard notation for them.

(We may have to have some backwards compatibility provision to avoid breaking code that currently uses annotations for some other purpose. Fortunately the only issue, at least initially, will be that when running mypy to type check such code it will produce complaints about the annotations; it will not affect how such code is executed by the Python interpreter. Nevertheless, it would be good to deprecate such alternative uses of annotations.)

(2) A specification for what to add to Python 3.5

There needs to be at least a rough consensus on the syntax for annotations, and the syntax must cover a large enough set of use cases to be useful. Mypy is still under development, and some of its features are still evolving (e.g. unions were only added a few weeks ago). It would be possible to argue endlessly about details of the notation, e.g. whether to use 'list' or 'List', what either of those means (is a duck-typed list-like type acceptable?) or how to declare and use type variables, and what to do with functions that have no annotations at all (mypy currently skips those completely).

I am proposing that we adopt whatever mypy uses here, keeping discussion of the details (mostly) out of the PEP. The goal is to make it possible to add type checking annotations to 3rd party modules (and even to the stdlib) while allowing unaltered execution of the program by the (unmodified) Python 3.5 interpreter. The actual type checker will not be integrated with the Python interpreter, and it will not be checked into the CPython repository. The only thing that needs to be added to the stdlib is a copy of mypy's typing.py module. This module defines several dozen new classes (and a few decorators and other helpers) that can be used in expressing argument types. If you want to type-check your code you have to download and install mypy and run it separately.

The curious thing here is that while standardizing a syntax for type annotations, we technically still won't be adopting standard rules for type checking. This is intentional. First of all, fully specifying all the type checking rules would make for a really long and boring PEP (a much better specification would probably be the mypy source code). Second, I think it's fine if the type checking algorithm evolves over time, or if variations emerge. The worst that can happen is that you consider your code correct but mypy disagrees; your code will still run.

That said, I don't want to completely leave out any specification. I want the contents of the typing.py module to be specified in the PEP, so that it can be used with confidence. But whether mypy will complain about your particular form of duck typing doesn't have to be specified by the PEP. Perhaps as mypy evolves it will take options to tell it how to handle certain edge cases. Forks of mypy (or entirely different implementations of type checking based on the same annotation syntax) are also a possibility. Maybe in the distant future a version of Python will take a different stance, once we have more experience with how this works out in practice, but for Python 3.5 I want to restrict the scope of the upheaval.

Appendix -- Why Add Type Annotations?

The argument between proponents of static typing and dynamic typing has been going on for many decades. Neither side is all wrong or all right. Python has traditionally fallen in the camp of extremely dynamic typing, and this has worked well for most users, but there are definitely some areas where adding type annotations would help.

- Editors (IDEs) can benefit from type annotations; they can call out obvious mistakes (like misspelled method names or inapplicable operations) and suggest possible method names. Anyone who has used IntelliJ or Xcode will recognize how powerful these features are, and type annotations will make such features more useful when editing Python source code.

- Linters are an important tool for teams developing software. A linter doesn't replace a unittest, but can find certain types of errors better or quicker. The kind of type checking offered by mypy works much like a linter, and has similar benefits; but it can find problems that are beyond the capabilities of most linters.

- Type annotations are useful for the human reader as well! Take the above word_count() example. How long would it have taken you to figure out the types of the argument and return value without annotations? Currently most people put the types in their docstrings; developing a standard notation for type annotations will reduce the amount of documentation that needs to be written, and running the type checker might find bugs in the documentation, too. Once a standard type annotation syntax is introduced, it should be simple to add support for this notation to documentation generators like Sphinx.

- Refactoring. Bob's talk has a convincing example of how type annotations help in (manually) refactoring code. I also expect that certain automatic refactorings will benefit from type annotations -- imagine a tool like 2to3 (but used for some other transformation) augmented by type annotations, so it will know whether e.g. x.keys() is referring to the keys of a dictionary or not.

- Optimizers. I believe this is actually the least important application, certainly initially. Optimizers like PyPy or Pyston wouldn't be able to fully trust the type annotations, and they are better off using their current strategy of optimizing code based on the types actually observed at run time. But it's certainly feasible to imagine a future optimizer also taking type annotations into account.

--
--Guido "I need a new hobby" van Rossum (python.org/~guido)

Gregory P. Smith

Aug 13, 2014, 9:10:24 PM
to Guido van Rossum, Jukka Lehtosalo, Python-Ideas

First, I am really happy that you are interested in this and that your point (2) of what you want to see done is very limited and acknowledges that it isn't going to specify everything!  Because that isn't possible. :)

Unfortunately I feel that adding syntax like this to the language itself is not useful without enforcement, because it leads to code being written with unintentionally incorrect annotations. That code winds up deployed in libraries, and becomes a problem as soon as an actual analysis tool attempts to run over something that uses the incorrectly specified code in a place where it cannot be easily updated (like the standard library).

At the summit in Montreal earlier this year Łukasz Langa (cc'd) volunteered to lead writing the PEP on Python type hinting based on the many existing implementations of such things (including mypy, cython, numba and pytypedecl). I believe he has an initial draft he intends to send out soon. I'll let him speak to that.

Looks like Łukasz already responded, I'll stop writing now and go read that. :)

Personal opinion from experience trying: You can't express the depth of types for an interface within the Python language syntax itself (assuming hacks such as specially formatted comments, strings or docstrings do not count). Forward references to things that haven't even been defined yet are common. You often want an ability to specify a duck type interface rather than a specific type.  I think he has those points covered better than I do.
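For what it's worth, mypy's notation does have an answer to the forward-reference problem: spell the not-yet-defined type as a string literal. A minimal sketch (the Node class here is hypothetical):

```python
class Node:
    # 'Node' is not yet defined when this annotation is evaluated, so
    # the mypy-style notation writes it as a string literal; a checker
    # resolves the name later, and the interpreter just stores the
    # string in __annotations__.
    def __init__(self, value: int, parent: 'Node' = None) -> None:
        self.value = value
        self.parent = parent

root = Node(1)
child = Node(2, root)
```

Whether this covers the duck-type-interface cases Łukasz has in mind is a separate question.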

-gps

PS If anyone wants to see a run time type checker make code run at half speed, look at the one pytypedecl offers. I'm sure it could be sped up, but run-time checkers in an interpreter are always likely to be slow.

Greg Ewing

Aug 13, 2014, 9:28:56 PM
to python...@python.org
On 08/14/2014 12:32 PM, Chris Angelico wrote:
> I don't like the idea of "from typing import ..." as there's already a
> "types" module and I think it'd be confusing.

Maybe

from __statictyping__ import ...

More explicit, and being a dunder name suggests that it's
something special that linters should ignore if they don't
understand it.

--
Greg

Andrew Barnert

Aug 13, 2014, 9:30:53 PM
to Alex Gaynor, python...@python.org
On Wednesday, August 13, 2014 1:30 PM, Alex Gaynor <alex....@gmail.com> wrote:


>I'm strongly opposed to this, for a few reasons.


[...]

>Python's type system isn't very good. It lacks many features of more powerful
>systems such as algebraic data types, interfaces, and parametric polymorphism.
>Despite this, it works pretty well because of Python's dynamic typing. I
>strongly believe that attempting to enforce the existing type system would be a
>real shame.

This is my main concern, but I'd phrase it very differently.


First, Python's type system _is_ powerful, but only because it's dynamic. Duck typing simulates parametric polymorphism perfectly, disjunction types as long as they don't include themselves recursively, algebraic data types in some but not all cases, etc. Simple (Java-style) generics, of the kind that Guido seems to be proposing, are not nearly as flexible. That's the problem.

On the other hand, even though these types only cover a small portion of the space of Python's implicit type system, a lot of useful functions fall within that small portion. As long as you can just leave the rest of the program untyped, and there are no boundary problems, there's no real risk.

On the third hand, what worries me is this:

> Mypy has a cast() operator that you can use to shut it up when you (think you) know the conversion is safe.

Why do we need casts? You shouldn't be trying to enforce static typing in a part of the program whose static type isn't sound. Languages like Java and C++ have no choice; Python does, so why not take advantage of it?

The standard JSON example seems appropriate here. What's the return type of json.loads? In Haskell, you write a pretty trivial JSONThing ADT, and you return a JSONThing that's an Object (which means its value maps String to JSONThing). In Python today, you return a dict, and use it exactly the same as in Haskell, except that you can't verify its soundness at compile time. In Java or C++, it's… what? The sound option is a special JSONThing that has separate getObjectMemberString and getArrayMemberString and getObjectMemberInt, which is incredibly painful to use. A plain old Dict[String, Object] looks simple, but it means you have to downcast all over the place to do anything, making it completely unsound, and still unpleasant. The official Java json.org library gives you a hybrid between the two that manages to be neither sound nor user-friendly. And of course there are libraries for many poor static languages (especially C++) that try to fake duck
typing as far as possible for their JSON objects, which is of course nowhere near as far as Python gets for free.
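The Python side of that comparison, for reference: json.loads returns plain dicts and lists, and duck typing makes the access code trivial, at the cost of any compile-time guarantee (a minimal sketch; the document shape is made up):

```python
import json

doc = json.loads('{"users": [{"name": "ann", "age": 30}]}')
# Duck-typed access: no casts, no JSONThing wrapper. A typo or an
# unexpected document shape surfaces only at run time, as a
# KeyError/TypeError/IndexError.
name = doc["users"][0]["name"]
```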

Andrew Barnert

Aug 13, 2014, 9:42:40 PM
to gu...@python.org, Python-Ideas, Jukka Lehtosalo
On Wednesday, August 13, 2014 12:45 PM, Guido van Rossum <gu...@python.org> wrote:


>  def word_count(input: List[str]) -> Dict[str, int]:
>      result = {}  #type: Dict[str, int]
>      for line in input:
>          for word in line.split():
>              result[word] = result.get(word, 0) + 1
>      return result


I just realized why this bothers me.

This function really, really ought to be taking an Iterable[String] (except that we don't have a String ABC). If you hadn't statically typed it, it would work just fine with, say, a text file—or, for that matter, a binary file. By restricting it to List[str], you've made it a lot less usable, for no visible benefit.

And, while this is less serious, I don't think it should be guaranteeing that the result is a Dict rather than just some kind of Mapping. If you want to change the implementation tomorrow to return some kind of proxy or a tree-based sorted mapping, you can't do so without breaking all the code that uses your function.

And if even Guido, in the motivating example for this feature, is needlessly restricting the usability and future flexibility of a function, I suspect it may be a much bigger problem in practice.


This example also shows exactly what's wrong with simple generics: if this function takes an Iterable[String], it doesn't just return a Mapping[String, int], it returns a Mapping of _the same String type_. If your annotations can't express that, any value that passes through this function loses type information. 

And not being able to tell whether the keys in word_count(f) are str or bytes *even if you know that f was a text file* seems like a pretty major loss.
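The "same String type" relationship Andrew is asking for is expressible with a constrained type variable, at least in later versions of the typing notation; a hedged sketch (whether mypy supported this at the time is not clear from the thread):

```python
from typing import Dict, Iterable, TypeVar

S = TypeVar('S', str, bytes)  # constrained: S is str or bytes, consistently

def word_count(lines: Iterable[S]) -> Dict[S, int]:
    # Accepts any iterable of strings (a list, a text file, a binary
    # file) and promises keys of the *same* string type as the input.
    result: Dict[S, int] = {}
    for line in lines:
        for word in line.split():
            result[word] = result.get(word, 0) + 1
    return result
```

With this signature, word_count over a text file is known to yield str keys and over a binary file bytes keys, addressing the type-information loss Andrew describes.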

Guido van Rossum

Aug 13, 2014, 9:44:10 PM
to Haoyi Li, Python-Ideas
On Wed, Aug 13, 2014 at 5:53 PM, Haoyi Li <haoy...@gmail.com> wrote:
Both solutions have merit, but the idea of some implementations of the type checker having covariance and some contravariance is fairly disturbing.

Why can't we have both? That's the only way to properly type things, since immutable-get-style APIs are always going to be covariant, set-only style APIs (e.g. a function that takes 1 arg and returns None) are going to be contravariant, and mutable get-set APIs (like most python collections) should really be invariant.
 
That makes sense. Can you put something in the mypy tracker about this? (Or send a pull request. :-)
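A runtime illustration of the distinction Haoyi is drawing (Animal/Dog are hypothetical names; Python enforces none of this, which is exactly why the checker's variance policy matters):

```python
class Animal:
    pass

class Dog(Animal):
    pass

# Read-only ("get-style") use: a list of Dogs is safely usable
# anywhere a list of Animals is only read from -- covariance.
def first(animals):
    return animals[0]

# Write-only ("set-style") use: code that appends Animals is only
# safe given a list of Animals; handing it a list of Dogs pollutes
# the list -- which is why set-style APIs want contravariance.
def add_stray(animals):
    animals.append(Animal())

dogs = [Dog()]
pet = first(dogs)    # fine under covariance
add_stray(dogs)      # type-unsafe: dogs now contains a bare Animal
intruders = [a for a in dogs if not isinstance(a, Dog)]
```

A checker that picked a single variance for all of List would wrongly accept or wrongly reject one of these two uses.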

Juancarlo Añez

Aug 13, 2014, 9:44:54 PM
to Guido van Rossum, Jukka Lehtosalo, Python-Ideas

On Wed, Aug 13, 2014 at 6:41 PM, Guido van Rossum <gu...@python.org> wrote:
Actually, mypy already has a solution. There's a codec (https://github.com/JukkaL/mypy/tree/master/mypy/codec) that you can use which transforms Python-2-with-annotations into vanilla Python 2. It's not an ideal solution, but it can work in cases where you absolutely have to have state of the art Python 3.5 type checking *and* backwards compatibility with Python 2.

It can't be a solution because it's a hack...

Cheers,

--
Juancarlo Añez

Greg Ewing

Aug 13, 2014, 9:45:04 PM
to python...@python.org
On 08/14/2014 01:26 PM, Andrew Barnert wrote:

> In Java or C++, it's… what? The sound option is a special JSONThing that
> has separate getObjectMemberString and getArrayMemberString and
> getObjectMemberInt, which is incredibly painful to use.

That's mainly because Java doesn't let you define your own
types that use convenient syntax such as [] for indexing.

Python doesn't have that problem, so a decent static type
system for Python should let you define a JSONThing class
that's fully type-safe while having a standard mapping
interface.

--
Greg

Łukasz Langa

Aug 13, 2014, 9:58:09 PM
to Andrew Barnert, Jukka Lehtosalo, Python-Ideas
On Aug 13, 2014, at 6:39 PM, Andrew Barnert <abar...@yahoo.com.dmarc.invalid> wrote:

On Wednesday, August 13, 2014 12:45 PM, Guido van Rossum <gu...@python.org> wrote:

  def word_count(input: List[str]) -> Dict[str, int]:
      result = {}  #type: Dict[str, int]
      for line in input:
          for word in line.split():
              result[word] = result.get(word, 0) + 1
      return result

I just realized why this bothers me.

This function really, really ought to be taking an Iterable[String]

You do realize String also happens to be an Iterable[String], right? One of my big dreams about Python is that one day we'll drop support for strings being iterable. Nothing of value would be lost and that would enable us to use isinstance(x, Iterable) and more importantly isinstance(x, Sequence). Funny that this surfaces now, too.
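The pitfall Łukasz is pointing at, in a two-line runtime sketch:

```python
from collections.abc import Iterable

# A str is itself an iterable of 1-character strs, so neither a
# checker nor an isinstance() test can tell "one string" apart from
# "a collection of strings".
flat = list("abc")
str_is_iterable = isinstance("abc", Iterable)
```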

Terry Reedy

Aug 13, 2014, 10:28:39 PM
to python...@python.org
Guido, as requested, I read your whole post before replying. Please do
the same. This response is both critical and supportive.

On 8/13/2014 3:44 PM, Guido van Rossum wrote:

> Yesterday afternoon I had an inspiring conversation with Bob Ippolito
> (man of many trades, author of simplejson) and Jukka Lehtosalo (author
> of mypy: http://mypy-lang.org/).

My main concern with static typing is that it tends to be
anti-duck-typing, while I consider duck-typing to be a major *feature*
of Python. The example in the page above is "def fib(n: int):". Fib
should get an count (non-negative integer) value, but it need not be an
int, and 'half' the ints do not qualify. Reading the tutorial, I could
not tell if it supports numbers.Number (which should approximate the
domain from above.)

Now consider an extended version (after Lucas).

def fib(n, a, b):
    i = 0
    while i <= n:
        print(i, a)
        i += 1
        a, b = b, a+b

The only requirement of a, b is that they be addable. Any numbers should
be allowed, as in fib(10, 1, 1+1j), but so should fib(5, '0', '1').
Addable would be approximated from below by Union(Number, str).
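Terry's fib really does run with any "addable" pair; restated as a generator (a hypothetical variant, so the duck-typed behavior is easy to demonstrate):

```python
def fib_pairs(n, a, b):
    # Duck-typed: a and b only need to support +, so ints, complex
    # values, and strings all work unchanged.
    i = 0
    while i <= n:
        yield a
        i += 1
        a, b = b, a + b
```

list(fib_pairs(3, '0', '1')) concatenates strings just as fib(5, '0', '1') would print them.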

> Bob gave a talk at EuroPython about
> what Python can learn from Haskell (and other languages); yesterday he
> gave the same talk at Dropbox. The talk is online
> (https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad
> strokes comes down to three suggestions:
>
> (a) Python should adopt mypy's syntax for function annotations

-+ Syntax with no meaning is a bit strange. On the other hand, syntax
not bound to semantics, or at least not bound to just one meaning is
quite pythonic. '+' has two standard meanings, plus custom meanings
embodied in .__add__ methods.

+ The current semantics of annotations is that they are added to
functions objects as .__annotations__ (for whatever use) *and* used as
part of inspect.signature and included in help(ob) responses. In other
words, annotations are already used in the stdlib.

>>> def f(i:int) -> float: pass

>>> from inspect import signature as sig
>>> str(sig(f))
'(i:int) -> float'
>>> help(f)
Help on function f in module __main__:

f(i:int) -> float

Idle calltips include them also. An appropriately flexible standardized
notation would enhance this usage and many others.

+-+ I see the point of "The goal is to make it possible to add type
checking annotations to 3rd party modules (and even to the stdlib) while
allowing unaltered execution of the program by the (unmodified) Python
3.5 interpreter." On the other hand, "pip install mypytyping" is not a
huge burden. On the third hand, inclusion in the stdlib allows use by the stdlib itself.

> (b) Python's use of mutabe [mutable] containers by default is wrong

The premise of this is partly wrong and partly obsolete. As far as I can
remember, Python *syntax* only uses tuples, not lists: "except (ex1,
ex2):", "s % (val1, val2)", etc. The use of lists as the common format
for data interchange between functions has largely been replaced by
iterators. This fact makes Python code much more generic, and
anti-generic static typing more wrong.

In remaining cases, 'wrong' is as much a philosophical opinion as a fact.

> (c) Python should adopt some kind of Abstract Data Types

I would have to look at the talk to know what Jukka means.

> Proposals (b) and (c) don't feel particularly actionable (if you
> disagree please start a new thread, I'd be happy to discuss these
> further if there's interest) but proposal (a) feels right to me.

> So what is mypy? It is a static type checker for Python written by
> Jukka for his Ph.D. thesis. The basic idea is that you add type
> annotations to your program using some custom syntax, and when running
> your program using the mypy interpreter, type errors will be found
> during compilation (i.e., before the program starts running).
>
> The clever thing here is that the custom syntax is actually valid Python
> 3, using (mostly) function annotations: your annotated program will
> still run with the regular Python 3 interpreter. In the latter case
> there will be no type checking, and no runtime overhead, except to
> evaluate the function annotations (which are evaluated at function
> definition time but don't have any effect when the function is called).
>
> In fact, it is probably more useful to think of mypy as a heavy-duty
> linter than as a compiler or interpreter; leave the type checking to
> mypy, and the execution to Python. It is easy to integrate mypy into a
> continuous integration setup, for example.
>
> To read up on mypy's annotation syntax, please see the mypy-lang.org
> <http://mypy-lang.org> website.

I did not see a 'reference' page, but the tutorial comes pretty close.
http://mypy-lang.org/tutorial.html
Beyond that, typings.py would be definitive,
https://github.com/JukkaL/mypy/blob/master/lib-typing/3.2/typing.py

> Here's just one complete example, to give a flavor:

> from typing import List, Dict
>
> def word_count(input: List[str]) -> Dict[str, int]:

The input annotation should be Iterable[str], which mypy does have.

> result = {} #type: Dict[str, int]
> for line in input:
> for word in line.split():
> result[word] = result.get(word, 0) + 1
> return result

The information that input is an Iterable[str] can be used either within
the definition of word_count or at places where word_count is called. A
type aware checker, either in the editor or compiler, could check that
the only uses of 'input' within the function is as input to functions
declared to accept an Iterable or in for statements.

Checking that the input to word_count is specifically Iterable[str] as
opposed to any other Iterable may not be possible. But I think what can
be done, including enhancing help information, might be worth it.

For instance, the parameter to s.join is named 'iterable'. Something
more specific, either 'iterable_of_strings' or 'strings: Iterable[str]'
would be more helpful. Indeed, there have been people posting on python
list who thought that 'iterable' means iterable and that .join would
call str() on each object. I think there are other cases where a
parameter is given a bland under-informative type name instead of a
context-specific semantic name just because there was no type annotation
available. There are places where the opposite problem occurs, too
specific instead of too general, where iterable parameters are still
called 'list'.

> Note that the #type: comment is part of the mypy syntax; mypy uses
> comments to declare types in situations where no syntax is available --
> although this particular line could also be written as follows:
>
> result = Dict[str, int]()
>
> Either way the entire function is syntactically valid Python 3, and a
> suitable implementation of typing.py (containing class definitions for
> List and Dict, for example) can be written to make the program run
> correctly. One is provided as part of the mypy project.
>
> I should add that many of mypy's syntactic choices aren't actually new.
> The basis of many of its ideas go back at least a decade: I blogged
> about this topic in 2004
> (http://www.artima.com/weblogs/viewpost.jsp?thread=85551 -- see also the
> two followup posts linked from the top there).
>
> I'll emphasize once more that mypy's type checking happens in a separate
> pass: no type checking happens at run time (other than what the
> interpreter already does, like raising TypeError on expressions like 1+"1").
>
> There's a lot to this proposal, but I think it's possible to get a PEP
> written, accepted and implemented in time for Python 3.5, if people are
> supportive. I'll go briefly over some of the action items.
>
> *(1) A change of direction for function annotations*
>
> PEP 3107 <http://legacy.python.org/dev/peps/pep-3107/>, which introduced
> function annotations, is intentional non-committal about how function
> annotations should be used. It lists a number of use cases, including
> but not limited to type checking. It also mentions some rejected
> proposals that would have standardized either a syntax for indicating
> types and/or a way for multiple frameworks to attach different
> annotations to the same function. AFAIK in practice there is little use
> of function annotations in mainstream code, and I propose a conscious
> change of course here by stating that annotations should be used to
> indicate types and to propose a standard notation for them.

There are many uses for type information and I think Python should
remain neutral among them.

> (We may have to have some backwards compatibility provision to avoid
> breaking code that currently uses annotations for some other purpose.
> Fortunately the only issue, at least initially, will be that when
> running mypy to type check such code it will produce complaints about
> the annotations; it will not affect how such code is executed by the
> Python interpreter. Nevertheless, it would be good to deprecate such
> alternative uses of annotations.)

I can imagine that people who have used annotations might feel a bit
betrayed by deprecation of a new-in-py3 feature. But I do not think it
necessary to do so. Tools that work with mypy annotations, including
mypy itself, should only assume mypy typing if typing is imported. No
'import typing', no 'Warning: annotation does not follow typing rules'.
If 'typing' were a package with a 'mypy' module, the door would be
left open to other 'blessed' typing modules.

> *(2) A specification for what to add to Python 3.5*
>
> There needs to be at least a rough consensus on the syntax for
> annotations, and the syntax must cover a large enough set of use cases
> to be useful. Mypy is still under development, and some of its features
> are still evolving (e.g. unions were only added a few weeks ago). It
> would be possible to argue endlessly about details of the notation, e.g.
> whether to use 'list' or 'List', what either of those means (is a
> duck-typed list-like type acceptable?) or how to declare and use type
> variables, and what to do with functions that have no annotations at all
> (mypy currently skips those completely).
>
> I am proposing that we adopt whatever mypy uses here, keeping discussion
> of the details (mostly) out of the PEP. The goal is to make it possible
> to add type checking annotations to 3rd party modules (and even to the
> stdlib) while allowing unaltered execution of the program by the
> (unmodified) Python 3.5 interpreter. The actual type checker will not be
> integrated with the Python interpreter, and it will not be checked into
> the CPython repository. The only thing that needs to be added to the
> stdlib is a copy of mypy's typing.py module. This module defines several
> dozen new classes (and a few decorators and other helpers) that can be
> used in expressing argument types. If you want to type-check your code
> you have to download and install mypy and run it separately.
>
> The curious thing here is that while standardizing a syntax for type
> annotations, we technically still won't be adopting standard rules for
> type checking.

Fine with me, as that is not the only use. And even for type checking,
there is the choice between accept unless clearly wrong, versus reject
unless clearly right.

> This is intentional. First of all, fully specifying all
> the type checking rules would make for a really long and boring PEP (a
> much better specification would probably be the mypy source code).
> Second, I think it's fine if the type checking algorithm evolves over
> time, or if variations emerge.

As in the choice between accept unless clearly wrong, versus reject
unless clearly right.

> The worst that can happen is that you
> consider your code correct but mypy disagrees; your code will still run.
>
> That said, I don't want to /completely/ leave out any specification. I
> want the contents of the typing.py module to be specified in the PEP, so
> that it can be used with confidence. But whether mypy will complain
> about your particular form of duck typing doesn't have to be specified
> by the PEP. Perhaps as mypy evolves it will take options to tell it how
> to handle certain edge cases. Forks of mypy (or entirely different
> implementations of type checking based on the same annotation syntax)
> are also a possibility. Maybe in the distant future a version of Python
> will take a different stance, once we have more experience with how this
> works out in practice, but for Python 3.5 I want to restrict the scope
> of the upheaval.

As usual, we should review the code before acceptance. It is not clear
to me how much of the tutorial is implemented, as it says "Some of these
features might never see the light of day." ???

> *Appendix -- Why Add Type Annotations?
> *
> The argument between proponents of static typing and dynamic typing has
> been going on for many decades. Neither side is all wrong or all right.
> Python has traditionally fallen in the camp of extremely dynamic typing,
> and this has worked well for most users, but there are definitely some
> areas where adding type annotations would help.

The answer to why on the mypy page is 'easier to find bugs', 'easier
maintenance'. I find this under-convincing as sufficient justification
in itself. I don't think there are many bugs on the tracker due to
calling functions with the wrong type of object. Logic errors, ignored
corner cases, and system idiosyncrasies are much more of a problem.

Your broader list is more convincing.

> - Editors (IDEs) can benefit from type annotations; they can call out
> obvious mistakes (like misspelled method names or inapplicable
> operations) and suggest possible method names. Anyone who has used
> IntelliJ or Xcode will recognize how powerful these features are, and
> type annotations will make such features more useful when editing Python
> source code.
>
> - Linters are an important tool for teams developing software. A linter
> doesn't replace a unittest, but can find certain types of errors better
> or quicker. The kind of type checking offered by mypy works much like a
> linter, and has similar benefits; but it can find problems that are
> beyond the capabilities of most linters.

Currently, Python linters do not have standard type annotations to work
with. I suspect that programs other than mypy would use them if available.

> - Type annotations are useful for the human reader as well! Take the
> above word_count() example. How long would it have taken you to figure
> out the types of the argument and return value without annotations?

Under a minute, including the fact that the annotation was overly
restrictive. But then I already know that only a mutation method can
require a list.

> Currently most people put the types in their docstrings; developing a
> standard notation for type annotations will reduce the amount of
> documentation that needs to be written, and running the type checker
> might find bugs in the documentation, too. Once a standard type
> annotation syntax is introduced, it should be simple to add support for
> this notation to documentation generators like Sphinx.
>
> - Refactoring. Bob's talk has a convincing example of how type
> annotations help in (manually) refactoring code. I also expect that
> certain automatic refactorings will benefit from type annotations --
> imagine a tool like 2to3 (but used for some other transformation)
> augmented by type annotations, so it will know whether e.g. x.keys() is
> referring to the keys of a dictionary or not.
>
> - Optimizers. I believe this is actually the least important
> application, certainly initially. Optimizers like PyPy or Pyston
> <https://github.com/dropbox/pyston> wouldn't be able to fully trust the
> type annotations, and they are better off using their current strategy
> of optimizing code based on the types actually observed at run time. But
> it's certainly feasible to imagine a future optimizer also taking type
> annotations into account.

--
Terry Jan Reedy

Andrew Barnert

Aug 13, 2014, 11:02:58 PM
to Greg Ewing, python...@python.org
On Aug 13, 2014, at 18:44, Greg Ewing <greg....@canterbury.ac.nz> wrote:

On 08/14/2014 01:26 PM, Andrew Barnert wrote:

In Java or C++, it's… what? The sound option is a special JSONThing that
has separate getObjectMemberString and getArrayMemberString and
getObjectMemberInt, which is incredibly painful to use.

That's mainly because Java doesn't let you define your own
types that use convenient syntax such as [] for indexing.

No it's not, or other languages like C++ (which has operator methods and overloading) wouldn't have the exact same problem, but they do. Look at JsonCpp, for example:


Python doesn't have that problem, so a decent static type
system for Python should let you define a JSONThing class
that's fully type-safe while having a standard mapping
interface.

How?

If you go with a single JSONThing type that represents an object, array, number, bool, string, or null, then it can't have a standard mapping interface, because it also needs to have a standard sequence interface, and they conflict. Likewise for number vs. string. The only fully type-safe interface it can have is as_string, as_number, etc. methods (which of course can only check at runtime, so it's no better than using isinstance from Python, and you're forced to do it for every single access.)

What if you go the other way and have separate JSONObject, JSONArray, etc. types? Then all of those problems go away; you can define an unambiguous __getitem__. But what is its return value? The only possibility is a union of all the various types mentioned above, and such a union type has no interface at all. It's only useful if people subvert the type safety by casting.  (I guess you could argue that returning a union type makes your JSON library type safe, it's only every program that ever uses it for anything that's unsafe. But where does that get you?) The only usable type safe interface is separate get_string, get_number, etc. methods in place of __getitem__.

Or you can merge the two together and have a single JSONThing that has both as methods and, for convenience, combined as_object+get, or even as_object+get+as_str.

Also, look at the mutation interfaces for these libraries. They're only marginally tolerable because all variable have obligatory types that you can overload on, which wouldn't be the case in Python.

The alternative is, of course, to come up with a way to avoid type safety. In Swift, you parse a JSON object into a Cocoa NSDictionary, which is a dynamically-typed heterogeneous collection just like a Python dict. There are C++ libraries with a bunch of types that effectively act like Python dict, list, float, str, etc. and try to magically cast when they come into contact with native types. That's the best solution anyone has to dealing with even a dead-simple algebraic data type like JSON in a static language whose type system isn't powerful enough: to try to fake being a duck typed language.

Carlo Pires

unread,
Aug 13, 2014, 11:07:39 PM8/13/14
to Python-Ideas
I'm very happy to see this happening. "Optional" type checking for python would be a great addition to the language. For large codebases, use of type checking can really help.

I also like the idea of using annotations instead of decorators (or something like that). I'm already using this for Python 3 [1] in a non-intrusive way, using assert so the checks can be disabled in production.

[1] https://pypi.python.org/pypi/optypecheck
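The idea, roughly (an illustrative sketch, not code from the package linked above): write the checks as asserts, which `python -O` strips, so there is no production overhead:

```python
def mean(values: list) -> float:
    # Stripped entirely when Python runs with -O, so the runtime
    # check costs nothing in production.
    assert all(isinstance(v, (int, float)) for v in values), "expected numbers"
    return sum(values) / len(values)
```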
--
  Carlo Pires

Andrew Barnert

unread,
Aug 13, 2014, 11:15:04 PM8/13/14
to Łukasz Langa, Python-Ideas
On Aug 13, 2014, at 18:56, Łukasz Langa <luk...@langa.pl> wrote:

On Aug 13, 2014, at 6:39 PM, Andrew Barnert <abar...@yahoo.com.dmarc.invalid> wrote:

On Wednesday, August 13, 2014 12:45 PM, Guido van Rossum <gu...@python.org> wrote:

  def word_count(input: List[str]) -> Dict[str, int]:
      result = {}  #type: Dict[str, int]
      for line in input:
          for word in line.split():
              result[word] = result.get(word, 0) + 1
      return result

I just realized why this bothers me.

This function really, really ought to be taking an Iterable[String]

You do realize String also happens to be an Iterable[String], right?

Of course, but that's not a new problem, so I didn't want to bring it up. The fact that the static type checker couldn't reject word_count(f.read()) is annoying, but that's not the fault of the static type checking proposal.

One of my big dreams about Python is that one day we'll drop support for strings being iterable. Nothing of value would be lost and that would enable us to use isinstance(x, Iterable) and more importantly isinstance(x, Sequence). Funny that this surfaces now, too.

IIRC, str doesn't implement Container, and therefore doesn't implement Sequence, because its __contains__ method is substring match instead of containment. So if you really want to treat sequences of strings separately from strings, you can. If only that really _were_ more important than Iterable, but I think the opposite is true.

But anyway, this is probably off topic, so I'll stop here.

Łukasz Langa

unread,
Aug 13, 2014, 11:16:50 PM8/13/14
to Andrew Barnert, Python-Ideas
str and bytes objects respond True to both isinstance(x, Container) and isinstance(x, Sequence).
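This is easy to check directly against the collections.abc ABCs (str is explicitly registered as a Sequence, so a Sequence check cannot distinguish a string from a list of strings):

```python
from collections.abc import Container, Iterable, Sequence

# str and bytes both register as Sequence -- and hence Container
# and Iterable -- in collections.abc.
assert isinstance("abc", Container)
assert isinstance("abc", Sequence)
assert isinstance(b"abc", Sequence)
assert isinstance("abc", Iterable)
```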

But you’re right, off topic.

David Mertz

unread,
Aug 13, 2014, 11:29:04 PM8/13/14
to Łukasz Langa, Python-Ideas
A long while back I posted a recipe for using annotations for type checking.  I'm certainly not the first person to do this, and what I did was deliberately simple:


The approach I used was to use per-function decorators to say that a given function should be type checked.  The type system I enforce in that recipe is much less than what mypy allows, but I can't see a real reason that it couldn't be extended to cover exactly the same range of type specifiers.
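A simplified sketch of such a per-function decorator (not the referenced recipe itself; this version only handles annotations that are plain classes):

```python
import functools
import inspect

def typechecked(func):
    """Illustrative per-function opt-in type checking: verify annotated
    arguments and the return value against plain classes at call time."""
    sig = inspect.signature(func)
    hints = func.__annotations__

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        result = func(*args, **kwargs)
        expected = hints.get('return')
        if isinstance(expected, type) and not isinstance(result, expected):
            raise TypeError(f"return value must be {expected.__name__}")
        return result
    return wrapper

@typechecked
def double(x: int) -> int:
    return x * 2
```

Undecorated functions are entirely unaffected, which is what makes the approach purely optional per function.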

The advantage I perceive in this approach is that it is purely optional, per module and per function.  As well, it doesn't actually require making ANY change to Python 3.5 to implement it.  Or as a minimal change, an extra decorator could simply be available in functools or elsewhere in the standard library, which implemented the full semantics of mypy.

Now admittedly, this would be type checking, but not *static* type checking.  There may not be an easy way to make a pre-runtime "lint" tool do the checking there.  On the other hand, as a number of posters have noted, there's also no way to statically enforce, e.g., 'Iterable[String]' either.

I'm not the BDFL of course, but I do not really get what advantage there is to the pre-runtime check that can catch a fairly small subset of type constraints rather than check at runtime everything that is available then (as the decorator approach could get you).


_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/



--
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

Guido van Rossum

unread,
Aug 13, 2014, 11:43:05 PM8/13/14
to Andrew Barnert, Python-Ideas
On Wed, Aug 13, 2014 at 6:39 PM, Andrew Barnert <abar...@yahoo.com.dmarc.invalid> wrote:
On Wednesday, August 13, 2014 12:45 PM, Guido van Rossum <gu...@python.org> wrote:

>  def word_count(input: List[str]) -> Dict[str, int]:
>      result = {}  #type: Dict[str, int]
>      for line in input:
>          for word in line.split():
>              result[word] = result.get(word, 0) + 1
>      return result

I just realized why this bothers me.

This function really, really ought to be taking an Iterable[String] (except that we don't have a String ABC). If you hadn't statically typed it, it would work just fine with, say, a text file—or, for that matter, a binary file. By restricting it to List[str], you've made it a lot less usable, for no visible benefit.

Heh. :-) I had wanted to write an additional paragraph explaining that it's easy to change this to use typing.Iterable instead of typing.List, but I forgot to add that.
 
And, while this is less serious, I don't think it should be guaranteeing that the result is a Dict rather than just some kind of Mapping. If you want to change the implementation tomorrow to return some kind of proxy or a tree-based sorted mapping, you can't do so without breaking all the code that uses your function.

Yeah, there's a typing.Mapping for that.
 
And if even Guido, in the motivating example for this feature, is needlessly restricting the usability and future flexibility of a function, I suspect it may be a much bigger problem in practice.

Well, so it was actually semi-intentional. :-)
 
This example also shows exactly what's wrong with simple generics: if this function takes an Iterable[String], it doesn't just return a Mapping[String, int], it returns a Mapping of _the same String type_. If your annotations can't express that, any value that passes through this function loses type information.

In most cases it really doesn't matter though -- some types are better left concrete, especially strings and numbers. If you read the mypy docs you'll find that there are generic types, so that it's possible to define a function as taking an Iterable[T] and returning a Mapping[T, int]. What's not currently possible is expressing additional constraints on T such as that it must be a String. When I last talked to Jukka he explained that he was going to add something for that too (@Jukka: structured types?).
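With the typing module that later grew out of this effort, the generic signature Guido describes can be sketched with a TypeVar (shown here with a hypothetical count_items helper):

```python
from typing import Dict, Iterable, TypeVar

T = TypeVar('T')

def count_items(items: Iterable[T]) -> Dict[T, int]:
    # The key type of the result matches the element type of the input:
    # Iterable[str] yields Dict[str, int], Iterable[bytes] yields
    # Dict[bytes, int], and so on.
    result: Dict[T, int] = {}
    for item in items:
        result[item] = result.get(item, 0) + 1
    return result
```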
 
And not being able to tell whether the keys in word_count(f) are str or bytes *even if you know that f was a text file* seems like a pretty major loss.

On this point one of us must be confused. Let's assume it's me. :-) Mypy has a few different IO types that can express the difference between text and binary files. I think there's some work that needs to be done (and of course the built-in open() function has a terribly ambiguous return type :-( ), but it should be possible to say that a text file is an Iterable[str] and a binary file is an Iterable[bytes]. So together with the structured (?) types it should be possible to specify the signature of word_count() just as you want it. However, in most cases it's overkill, and you wouldn't want to do that for most code.

Also, it probably wouldn't work for more realistic examples -- as soon as you replace the split() method call with something that takes punctuation into account, you're probably going to write it in a way that works only for text strings anyway, and very few people will want or need to write the polymorphic version. (But if they do, mypy has a handy @overload decorator that they can use. :-)

Anyway, I agree it would be good to make sure that some of these more advanced things can actually be spelled before we freeze our commitment to a specific syntax, but let's not assume that just because you can't spell every possible generic use case it's no good.

Jukka Lehtosalo

unread,
Aug 14, 2014, 12:07:25 AM8/14/14
to Andrew Barnert, Python-Ideas
On Wed, Aug 13, 2014 at 6:39 PM, Andrew Barnert <abar...@yahoo.com> wrote:
On Wednesday, August 13, 2014 12:45 PM, Guido van Rossum <gu...@python.org> wrote:


>  def word_count(input: List[str]) -> Dict[str, int]:
>      result = {}  #type: Dict[str, int]
>      for line in input:
>          for word in line.split():
>              result[word] = result.get(word, 0) + 1
>      return result


I just realized why this bothers me.

This function really, really ought to be taking an Iterable[String] (except that we don't have a String ABC). If you hadn't statically typed it, it would work just fine with, say, a text file—or, for that matter, a binary file. By restricting it to List[str], you've made it a lot less usable, for no visible benefit.

And, while this is less serious, I don't think it should be guaranteeing that the result is a Dict rather than just some kind of Mapping. If you want to change the implementation tomorrow to return some kind of proxy or a tree-based sorted mapping, you can't do so without breaking all the code that uses your function.

I see this is a matter of programming style. In a library module, I'd usually use types about as general as feasible (without making them overly complex). However, if we have just a simple utility function that's only used within a single program, declaring everything using abstract types buys you little, IMHO, but may make things much more complicated. You can always refactor the code to use more general types if the need arises. Using simple, concrete types seems to decrease the cognitive load, but that's just my experience.

Also, programmers don't always read documentation/annotations and can abuse the knowledge of the concrete return type of any function (they can figure this out easily by using repr()/type()). In general, as long as dynamically typed programs may call your function, changing the concrete return type of a library function risks breaking code that makes too many assumptions. Thus I'd rather use concrete types for function return types -- but of course everybody is free to not follow this convention.


And if even Guido, in the motivating example for this feature, is needlessly restricting the usability and future flexibility of a function, I suspect it may be a much bigger problem in practice.


This example also shows exactly what's wrong with simple generics: if this function takes an Iterable[String], it doesn't just return a Mapping[String, int], it returns a Mapping of _the same String type_. If your annotations can't express that, any value that passes through this function loses type information. 

If I define a subclass X of str, split() still returns a List[str] rather than List[X], unless I override something, so this wouldn't work with the above example:

>>> class X(str): pass
...
>>> type(X('x y').split()[0])
<class 'str'>


And not being able to tell whether the keys in word_count(f) are str or bytes *even if you know that f was a text file* seems like a pretty major loss.

Mypy considers bytes incompatible with str, and vice versa. The annotation Iterable[str] says that Iterable[bytes] (such as a binary file) would not be a valid argument. Text files and binary files have different types, though the return type of open(...) is not inferred correctly right now. It would be easy to fix this for the most common cases, though.

You could use AnyStr to make the example work with bytes as well:

  def word_count(input: Iterable[AnyStr]) -> Dict[AnyStr, int]:
      result = {}  #type: Dict[AnyStr, int]

      for line in input:
          for word in line.split():
              result[word] = result.get(word, 0) + 1
      return result

Again, if this is just a simple utility function that you use once or twice, I see no reason to spend a lot of effort in coming up with the most general signature. Types are an abstraction and they can't express everything precisely -- there will always be a lot of cases where you can't express the most general type. However, I think that relatively simple types work well enough most of the time, and give the most bang for the buck.

Jukka

Terry Reedy

unread,
Aug 14, 2014, 12:22:37 AM8/14/14
to python...@python.org
On 8/13/2014 5:08 PM, Andrey Vlasovskikh wrote:

> Here are slides from my talk about optional typing in Python, that
> show how Mypy types can be used in both static and dynamic type
> checking
> (http://blog.pirx.ru/media/files/2013/python-optional-typing/),
I tried this on Windows 7 in both Firefox and Internet Explorer and I
cannot find any way to advance other than changing the page number in
the URL bar.


--
Terry Jan Reedy

Jukka Lehtosalo

unread,
Aug 14, 2014, 12:34:35 AM8/14/14
to Guido van Rossum, Python-Ideas
On Wed, Aug 13, 2014 at 8:41 PM, Guido van Rossum <gu...@python.org> wrote:
On Wed, Aug 13, 2014 at 6:39 PM, Andrew Barnert <abar...@yahoo.com.dmarc.invalid> wrote:
This example also shows exactly what's wrong with simple generics: if this function takes an Iterable[String], it doesn't just return a Mapping[String, int], it returns a Mapping of _the same String type_. If your annotations can't express that, any value that passes through this function loses type information.

In most cases it really doesn't matter though -- some types are better left concrete, especially strings and numbers. If you read the mypy docs you'll find that there are generic types, so that it's possible to define a function as taking an Iterable[T] and returning a Mapping[T, int]. What's not currently possible is expressing additional constraints on T such as that it must be a String. When I last talked to Jukka he explained that he was going to add something for that too (@Jukka: structured types?).

I wrote another message where I touched this. Mypy is likely to support something like this in the future, but I doubt it's usually worth the complexity. If a type signature is very general, at some point it describes the implementation in sufficient detail that you can't modify the code without changing the type. For example, we could plausibly allow anything that just supports split(), but if we change the implementation to use something other than split(), the signature would have to change. If we use more specific types (such as str), we leave us the freedom to modify the implementation within the bounds of the str interface. Standard library functions often only accept concrete str objects, so the moment you start using an abstract string type you lose access to much of the stdlib.

 
And not being able to tell whether the keys in word_count(f) are str or bytes *even if you know that f was a text file* seems like a pretty major loss.

On this point one of us must be confused. Let's assume it's me. :-) Mypy has a few different IO types that can express the difference between text and binary files. I think there's some work that needs to be done (and of course the built-in open() function has a terribly ambiguous return type :-( ), but it should be possible to say that a text file is an Interable[str] and a binary file is an Iterable[bytes]. So together with the structured (?) types it should be possible to specify the signature of word_count() just as you want it. However, in most cases it's overkill, and you wouldn't want to do that for most code.

See my other message where I show that you can do this right now, except for the problem with open().
 

Also, it probably wouldn't work for more realistic examples -- as soon as you replace the split() method call with something that takes punctuation into account, you're probably going to write it in a way that works only for text strings anyway, and very few people will want or need to write the polymorphic version. (But if they do, mypy has a handy @overload decorator that they can use. :-)

Anyway, I agree it would be good to make sure that some of these more advanced things can actually be spelled before we freeze our commitment to a specific syntax, but let's not assume that just because you can't spell every possible generic use case it's no good.

It's always easy to come up with interesting corner cases where a type system would break down, but luckily, these are often almost non-existent in the wild :-) I've learned that examples should be motivated by patterns in existing, 'real' code, as otherwise you'll waste your time on things that happen maybe once a million lines (or maybe only in code that *you* write).
 
Jukka

Guido van Rossum

unread,
Aug 14, 2014, 12:56:47 AM8/14/14
to Łukasz Langa, Python-Ideas
On Wed, Aug 13, 2014 at 6:00 PM, Łukasz Langa <luk...@langa.pl> wrote:
It’s great to see this finally happening!

Yes. :-)
 
I did some research on existing optional-typing approaches [1]. What I learned in the process was that linting is the most important use case for optional typing; runtime checks is too little, too late.

That being said, having optional runtime checks available *is* also important. Used in staging environments and during unit testing, this case is able to cover cases obscured by meta-programming. Implementations like “obiwan” and “pytypedecl” show that providing a runtime type checker is absolutely feasible.

Yes. And the proposal here might well enable such applications (by providing a standard way to spell complex types). But I think it's going to be less important than good support for linting, so that's what I want to focus on first.
 
The function annotation syntax currently supported in Python 3.4 is not well-suited for typing. This is because users expect to be able to operate on the types they know. This is currently not feasible because:
1. forward references are impossible

(Mypy's hack for this is that a string literal can be used as a forward reference.)

2. generics are impossible without custom syntax (which is the reason Mypy’s Dict exists)
3. optional types are clumsy to express (Optional[int] is very verbose for a use case this common)

So define an alias 'oint'. :-)
 
4. union types are clumsy to express

Aliasing can help.
 
All those problems are elegantly solved by Google’s pytypedecl via moving type information to a separate file.

Mypy supports this too using stub files, but I think it is actually a strength that it doesn't require new syntax (although if the idea becomes popular we could certainly add syntax to support those things where mypy currently requires magic comments).

Honestly I'm not sure what to do about mypy vs. pytypedecl. Should they compete, collaborate, converge? Do we need a bake-off or a joint hackathon? Food for thought.
 
Because for our use case that would not be an acceptable approach, my intuition would be to:

1. Provide support for generics (understood as an answer to the question: “what does this collection contain?”) in Abstract Base Classes. That would be a PEP in itself.
2. Change the function annotation syntax so that it’s not executed at import time but rather treated as strings. This solves forward references and enables us to…
3. Extend the function annotation syntax with first-class generics support (most languages like "list<str>”)
4. Extend the function annotation syntax with first-class union type support. pytypedecl simply uses “int or None”, which I find very elegant.
5. Speaking of None, possibly further extend the function annotation syntax with first-class optionality support. In the Facebook codebase in Hack we have tens of thousands of optional ints (nevermind other optional types!), this is a case that’s going to be used all the time. Hack uses ?int, that’s the most succinct style you can get. Yes, it’s special but None is a special type, too.

Hm. I think that selling such (IMO) substantial changes to Python's syntax is going to be much harder than just the idea of a standard typing syntax implemented as a new stdlib module. While mypy's syntax is perhaps not as concise or elegant as would be possible if we were to design the syntax from the ground up, it's actually pretty darn readable, and it is compatible with Python 3.2. It has decent ways to spell generics, forward references, unions and optional types already. And while I want to eventually phase out other uses of function annotations, your change #2 would break all existing packages that use them for other purposes (like Ethan Furman's scription).
 
All in all, I believe Mypy has the highest chance of becoming our typing linter, which is great! I just hope we can improve on the syntax, which is currently lacking. Also, reusing our existing ABCs where applicable would be nice. With Mypy’s typing module I feel like we’re going to get a new, orthogonal set of ABCs, which will confuse users to no end. Finally, the runtime type checker would make the ecosystem complete.

We can discuss these things separately. Language evolution is an exercise in compromise. We may be able to reuse the existing ABCs, and mypy could still support Python 3.2 (or, with the codec hack, 2.7) by having the typing module export aliases to those ABCs. I won't stop you from implementing a runtime type checker, but I think it should be a separate project.
 
This is just the beginning of the open issues I was juggling with and the reason my own try at the PEP was coming up slower than I’d like.

Hopefully I've motivated you to speed up!
 
[1] You can find a summary of examples I looked at here: http://lukasz.langa.pl/typehinting/
 
--
--Guido van Rossum (python.org/~guido)

Guido van Rossum

unread,
Aug 14, 2014, 1:12:49 AM8/14/14
to Gregory P. Smith, Python-Ideas
On Wed, Aug 13, 2014 at 6:09 PM, Gregory P. Smith <gr...@krypto.org> wrote:
First, I am really happy that you are interested in this and that your point (2) of what you want to see done is very limited and acknowledges that it isn't going to specify everything!  Because that isn't possible. :)

What a shame. :-)
 
Unfortunately I feel that adding syntax like this to the language itself is not useful without enforcement, because that leads to code being written with unintentionally incorrect annotations, which winds up deployed in libraries and later becomes a problem as soon as an actual analysis tool attempts to run over something that uses that unknowingly incorrectly specified code in a place where it cannot be easily updated (like the standard library).

We could refrain from using type annotations in the stdlib (similar to how we refrain from using Unicode identifiers). Mypy's stubs mechanism makes it possible to ship the type declarations for stdlib modules with mypy instead of baking them into the stdlib.
 
At the summit in Montreal earlier this year Łukasz Langa (cc'd) volunteered to lead writing the PEP on Python type hinting based on the many existing implementations of such things (including mypy, cython, numba and pytypedecl). I believe he has an initial draft he intends to send out soon. I'll let him speak to that.

Mypy has a lot more than an initial draft. Don't be mistaken by its status as "one person's Ph.D. project" -- Jukka has been thinking about this topic for a decade, and mypy works remarkably well already. It also has some very active contributors already.
 
Looks like Łukasz already responded, I'll stop writing now and go read that. :)

Personal opinion from experience trying: You can't express the depth of types for an interface within the Python language syntax itself (assuming hacks such as specially formatted comments, strings or docstrings do not count). Forward references to things that haven't even been defined yet are common. You often want an ability to specify a duck type interface rather than a specific type.  I think he has those points covered better than I do.

I think mypy has solutions for the syntactic issues, and the rest can be addressed by introducing a few more magic helper functions. It's remarkably readable.

Guido van Rossum

unread,
Aug 14, 2014, 1:25:39 AM8/14/14
to Jukka Lehtosalo, Python-Ideas
On Wed, Aug 13, 2014 at 9:06 PM, Jukka Lehtosalo <jleht...@gmail.com> wrote:

You could use AnyStr to make the example work with bytes as well:

  def word_count(input: Iterable[AnyStr]) -> Dict[AnyStr, int]:
      result = {}  #type: Dict[AnyStr, int]

      for line in input:
          for word in line.split():
              result[word] = result.get(word, 0) + 1
      return result

Again, if this is just a simple utility function that you use once or twice, I see no reason to spend a lot of effort in coming up with the most general signature. Types are an abstraction and they can't express everything precisely -- there will always be a lot of cases where you can't express the most general type. However, I think that relatively simple types work well enough most of the time, and give the most bang for the buck.

I heartily agree. But just for the type theorists amongst us, if I really wanted to write the most general type, how would I express that the AnyStr in the return type matches the one in the argument? (I think pytypedecl would use something like T <= AnyStr.)
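For the record, the typing module that eventually emerged answers this with a constrained type variable; typing.AnyStr is defined essentially as below, which forces the AnyStr in the return type to be the same type as the one in the argument:

```python
from typing import Dict, Iterable, TypeVar

# A constrained TypeVar: each use of word_count binds AnyStr to either
# bytes or str, consistently across the argument and the return type.
AnyStr = TypeVar('AnyStr', bytes, str)

def word_count(input: Iterable[AnyStr]) -> Dict[AnyStr, int]:
    result: Dict[AnyStr, int] = {}
    for line in input:
        for word in line.split():
            result[word] = result.get(word, 0) + 1
    return result
```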

Haoyi Li

unread,
Aug 14, 2014, 1:40:19 AM8/14/14
to Guido van Rossum, Python-Ideas
I heartily agree. But just for the type theorists amongst us, if I really wanted to write the most general type, how would I express that the AnyStr in the return type matches the one in the argument? (I think pytypedecl would use something like T <= AnyStr.)

To borrow Scala syntax, it would look something like

def word_count[AnyStr <: String](input: Iterable[AnyStr]): Dict[AnyStr, Int]
Where word_count is a generic function on the type AnyStr, which is not just *any* type but a type bound by the restriction it is a subclass of String. Thus you can force that the AnyStr going in and the AnyStr going out are the same one.

I'm not sure if mypy allows for type-bounds, but that's the way you achieve what you want if it does, or will, in future =P


Guido van Rossum

unread,
Aug 14, 2014, 1:45:46 AM8/14/14
to Haoyi Li, Python-Ideas
I'd be more interested in Jukka's specific proposal. (Note that in mypy, AnyStr is the type that you call String here -- it's either bytes or str; and it seems that you're introducing AnyStr here as a type variable -- mypy conventionally uses T for this, but you have to define it first.)

Greg Ewing

unread,
Aug 14, 2014, 2:49:03 AM8/14/14
to python...@python.org
Andrew Barnert wrote:
> If you go with a single JSONThing type that represents an object, array,
> number, bool, string, or null, then it can't have a standard mapping
> interface, because it also needs to have a standard sequence interface,

I didn't mean that. The most Pythonic way would probably be to
have something like

JSONThing = Union[JSONDict, JSONList, Int, Float, Str, Bool]
JSONDict = Mapping[str, JSONThing]
JSONList = Sequence[JSONThing]

If you're concerned about type safety, at some point you
need to introspect on what you've got to figure out what
to do with it. This is inevitable, since a JSON object
is inherently a dynamically-typed thing. This is true
even in Haskell, you just don't notice it so much because
you do it with pattern matching.

> The only
> possibility is a union of all the various types mentioned above, and
> such a union type has no interface at all. It's only useful if people
> subvert the type safety by casting.

I don't know what mypy does with union types, but if I were
designing a type system like this I would say that, e.g. if
i is know to be of type Int and d of type JSONDict, then

i = d['foo']

should be allowed, on the grounds that the return value
*could* be an Int, with a run-time type check to make sure
that it actually is. An implicit cast, in other words.

As long as mypy is just a linter it's obviously not in
a position to insert the runtime check, but it could be
argued that it should allow the assignment anyway,
since Python will end up raising a TypeError at some
point if it's wrong.
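A sketch of such a runtime-checked "implicit cast", written as an explicit helper (the cast_checked name and the dict literal standing in for parsed JSON are hypothetical):

```python
from typing import Any, Type, TypeVar

T = TypeVar('T')

def cast_checked(typ: Type[T], value: Any) -> T:
    """The run-time check Greg describes, made explicit: raise TypeError
    if the value is not of the expected type, otherwise return it with
    the static type narrowed to T."""
    if not isinstance(value, typ):
        raise TypeError(f"expected {typ.__name__}, got {type(value).__name__}")
    return value

d = {"foo": 1}  # imagine this came from json.loads(...)
i = cast_checked(int, d["foo"])
```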

I've forgotten what the original point of all this was.
If the point was that there's no benefit in trying to
make JSON type-safe in Python, and you should just leave
it all dynamically typed, maybe you're right.

Andrew Barnert

unread,
Aug 14, 2014, 2:51:29 AM8/14/14
to Jukka Lehtosalo, Python-Ideas
On Aug 13, 2014, at 21:06, Jukka Lehtosalo <jleht...@gmail.com> wrote:

> You could use AnyStr to make the example work with bytes as well:
>
> def word_count(input: Iterable[AnyStr]) -> Dict[AnyStr, int]:
> result = {} #type: Dict[AnyStr, int]
> for line in input:
> for word in line.split():
> result[word] = result.get(word, 0) + 1
> return result

Defining a function as taking an Iterable[AnyStr] and returning a Dict[AnyStr, int], without any way to declare that the two AnyStr are the same type, is exactly what I meant by losing critical information. I pass in something that I know is a text file, and I get out something that are either strings or bytes and I don't know which; what am I going to do with that? If you can't propagate types, static typing doesn't help.

My point is that the depth you seem to be reaching for is not actually a sweet spot for power vs. simplicity. You could make it simpler by just dropping generics in favor of a handful of special-purpose modifiers (optional, iterable-of), or you could make it more useful by having real parametric types, or at least the equivalent of C++ template templates or Swift arbitrary constraints. What makes Java-style simple generics with subclass constraints the right level of complexity?

Andrew Barnert

unread,
Aug 14, 2014, 3:40:12 AM8/14/14
to Greg Ewing, python...@python.org
On Aug 13, 2014, at 23:48, Greg Ewing <greg....@canterbury.ac.nz> wrote:

> I've forgotten what the original point of all this was.
> If the point was that there's no benefit in trying to
> make JSON type-safe in Python, and you should just leave
> it all dynamically typed, maybe you're right.

Well, that was my starting point, which I assumed people would take for granted, so I could use it to make my real point, which is that "just downcast it", which came up a few times in the first couple hours of this discussion, is a bad answer in general.

If you start from the assumption that everything can and should be statically typed, but don't design a type system powerful enough to do that, you end up putting downcasts from void* or Object or whatever all over the place, and it's impossible to tell what parts of the program are actually safe, which defeats the entire purpose of static typing.

If you start from the assumption that there will be some self-contained regions of your code that can be typed by your type system, and other parts just have to be dynamically typed and that's OK, you don't subvert the type system all over the place, and you know which parts of the code are or aren't known to be safe.

Anyway, I think this is basically what you just said--you started explaining how we could effectively add C++ style implicit casts to hide the ugliness and unsafety of dealing with JSON, but then said that maybe it would be better to just leave the values dynamically typed. As long as the type checker can (not necessarily in the initial version, but at least reasonably plausibly) make it easy to tell which areas of your code have been "infected" by dynamic types, I think defaulting to "too hard, leave it dynamic" is the simplest, and probably right, thing to do. (Unless we actually want to design a sufficiently powerful type system, which I don't think we do.)

There's a great paper on how this works in practice in an ML variant, but unfortunately googling any subset of the only keywords I can remember (ML type duck infect dynamic) just brings up papers about avian virology (even if I leave out "duck"). I think there's also a (much more recent and less technical) blog post by one of the Swift guys about a similar idea--making it easy to stay dynamic means you end up with cleanly walled off regions of safe and unsafe code. (Or you would, if Swift had a halfway-decent stdlib instead of forcing you to bridge to ObjC for almost anything, but that's not important here.)

Nick Coghlan

Aug 14, 2014, 4:14:16 AM
to Guido van Rossum, Python-Ideas, Fernando Perez
On 14 August 2014 15:11, Guido van Rossum <gu...@python.org> wrote:
> On Wed, Aug 13, 2014 at 6:09 PM, Gregory P. Smith <gr...@krypto.org> wrote:
>>
>> At the summit in Montreal earlier this year Łukasz Langa (cc'd)
>> volunteered to lead writing the PEP on Python type hinting based on the many
>> existing implementations of such things (including mypy, cython, numba and
>> pytypedecl). I believe he has an initial draft he intends to send out soon.
>> I'll let him speak to that.
>
>
> Mypy has a lot more than an initial draft. Don't be mistaken by its status
> as "one person's Ph.D. project" -- Jukka has been thinking about this topic
> for a decade, and mypy works remarkably well already. It also has some very
> active contributors already.

FWIW, I was strongly encouraging folks at SciPy that were interested
in static typing to look at mypy as an example of a potentially
acceptable syntax for standardised optional static typing.

Aside from being +1 on the general idea of picking *something* as
"good enough" and iterating from there, I don't have a strong personal
opinion, though.

(And yes, my main interest is in improving the ability to do effective
linting for larger projects. "pylint -E" is a lot better than nothing,
but it has its limits in the absence of additional static hints)

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Manuel Cerón

Aug 14, 2014, 6:29:25 AM
to Guido van Rossum, Python-Ideas
On Thu, Aug 14, 2014 at 1:24 AM, Guido van Rossum <gu...@python.org> wrote:
On Wed, Aug 13, 2014 at 3:26 PM, Manuel Cerón <cero...@gmail.com> wrote:
The type checking algorithm might evolve over the time, but by including typing.py in the stdlib, the syntax for annotations would be almost frozen and that will be a limitation. In other projects such as TypeScript (http://www.typescriptlang.org/), that the syntax usually evolves alongside the algorithms.

What kind of evolution did TypeScript experience?


Most of these changes are in the way the type checking engine works, but there have been some syntax changes too. While the basic types have remained stable, some special constructs have not, for example generics, optional and default arguments, and interfaces.

I think it'd be valuable to learn from TypeScript as much as possible; it's the only project I know of that is trying to bring static type analysis to a widely used dynamically typed language.

One interesting feature of TypeScript is that it allows you to annotate existing code without modifying it, by using external definition files. In the JavaScript world, many people have contributed TypeScript annotation files for popular JS libraries (http://definitelytyped.org/).

I think this is possible in Python as well doing something like this:

@annotate('math.ceil')
def ceil(x: float) -> int:
    pass

I think this should be the way to go for annotating the stdlib. It has the advantage that if the type syntax changes, it's possible to provide new type annotations without changing the libraries at all, and even supporting older versions. In this way the code and type annotations can evolve separately.
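The external-annotation idea above can be sketched with a side table, so the annotated library itself is never touched (this `annotate` decorator and the registry are hypothetical, not mypy's or any existing API):

```python
from typing import Callable, Dict

# Hypothetical registry: stubs record their annotations keyed by the
# dotted path of the function they describe.
_STUBS: Dict[str, dict] = {}

def annotate(target_path: str) -> Callable:
    def decorator(stub: Callable) -> Callable:
        _STUBS[target_path] = dict(stub.__annotations__)
        return stub
    return decorator

@annotate('math.ceil')
def ceil(x: float) -> int:
    pass

def annotations_for(target_path: str) -> dict:
    # What a checker or IDE would consult instead of the real function.
    return _STUBS.get(target_path, {})
```

A real tool would have the checker read this table when it encounters a call to the target, exactly like TypeScript consulting a .d.ts file.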

Does mypy support something like this?

Manuel.

willv...@gmail.com

Aug 14, 2014, 7:16:54 AM
to python...@python.org
I fully support formalizing Python 3's annotations for type checking.

I wrote - and use daily - my own type checker called obiwan
https://pypi.python.org/pypi/obiwan

It's a runtime type checker: if enabled, it will check and enforce
types on every call.

I support the wider adoption and standardization of static type
checkers, but runtime checkers are still wanted for very dynamic code
and for checking external data e.g. I use obiwan for validating JSON.

One small detail is that I feel the obiwan annotations are more
pythonic than the mypy examples given.

E.g. instead of:

from typing import List, Dict

def word_count(input: List[str]) -> Dict[str, int]:
...

It would be:

def word_count(input: [str]) -> {str: int}:
...

Obiwan does not check types within functions; I was unwilling to try
to overload comments! You can invoke obiwan to check things
explicitly, but those calls act more like assertions.
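A minimal runtime checker in the spirit of obiwan (a sketch, not obiwan's actual API) can be built from `inspect.signature` in a few lines:

```python
import functools
import inspect

def typechecked(func):
    """Hypothetical decorator: enforce plain-class annotations on
    every call, raising TypeError on mismatch."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = func.__annotations__.get(name)
            # Only check annotations that are plain classes.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError("%s must be %s, got %s" % (
                    name, expected.__name__, type(value).__name__))
        return func(*args, **kwargs)
    return wrapper

@typechecked
def double(x: int) -> int:
    return x * 2
```

Here `double(21)` returns 42, while `double("spam")` raises TypeError before the body runs, which is the runtime-checking trade-off: per-call overhead in exchange for catching bad external data that a static checker never sees.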

Anyway, when I look at the mypy in-function annotations (where
comments are overloaded) I am cautious. It would be far nicer if we
had annotations as part of the language instead, e.g. instead of:

result = {} #type: Dict[str, int]

It would be:

result = {} -> {str: int}

where we use the -> arrow again. I can see pros and cons for any
implementation (as we'd want the annotation to be both to declare a
type and to check a type, and want the annotation to be attached to
the variable forever etc) so this would need a full PEP treatment and
possibly be configurable as asserts are.

But proper annotation support in the language rather than overloading
comments would definitely be my preference.

/Will

Andrey Vlasovskikh

Aug 14, 2014, 7:37:05 AM
to Manuel Cerón, Python-Ideas
On Thu, Aug 14, 2014 at 2:28 PM, Manuel Cerón <cero...@gmail.com> wrote:

> One interesting feature of TypeScript is that it allows you to annotate
> existing code without modifying it, by using external definition files. In
> the JavaScript world, many people have contributed TypeScript annotation
> files for popular JS libraries (http://definitelytyped.org/).
>
> I think this is possible in Python as well doing something like this:
>
> @annotate('math.ceil')
> def ceil(x: float) -> int:
> pass
>
> I think this should be the way to go for annotating the stdlib. It has the
> advantage that if the type syntax changes, it's possible to provide new type
> annotations without changing the libraries at all, and even supporting older
> versions. In this way the code and type annotations can evolve separately.
>
> Does mypy support something like this?

We use something quite similar to TypeScript's repository of
annotations in PyCharm. Here is our external annotations proposal and
stubs for some stdlib modules
(https://github.com/JetBrains/python-skeletons) that uses a custom
type syntax in docstrings due to lack of a better standard, see
README. We state in our proposal that we would like the standard to
emerge. The idea being discussed here about using Mypy's type system
is not new to us. As I've mentioned in the original thread, we have
discussed it with Jukka Lehtosalo, the author of Mypy. Some initial
ideas are listed here (https://github.com/pytypes/pytypes).

--
Andrey Vlasovskikh
Web: http://pirx.ru/

Yann Kaiser

Aug 14, 2014, 10:00:44 AM
to Andrey Vlasovskikh, Python-Ideas
I'd like to offer my perspective (and objections) on this:

I author and maintain a reflective argument parser for Python, clize[1], possibly in the same vein as Ethan's. While it originally did not support any Python 3 features, it has come to accept keyword-only parameters and annotations for alternate parameter names and conversion functions (which may look like type annotations to a reader looking for them). The upcoming version will completely deprecate the old "pass a bunch of unsightly arguments to a decorator[2]" API and use annotations exclusively[3], using backports[4][5] to make them and keyword-only parameters available to Python 2 users. Clize has a handful of users, with 100 stars on GitHub and ~75 downloads per day according to PyPI[1].

Needless to say, the idea of deprecating uses of parameter and return annotations that aren't this particular way of typechecking is upsetting to me.

On to the most formal of formal complaints: PEP 3107[6] rejected standardizing typing-related annotations. What has changed since then, or what has otherwise invalidated that conclusion, that it should be backtracked on and changed? GvR says he knows of little practical use of function annotations in mainstream code. Here's the (unfortunate) reason: Mainstream code is either incompatible with Python 3 or maintaining compatibility between both Python 2 and Python 3. Standard library code which had no such concern for the most part stuck with what stood in PEP 3107, which was not to teach annotations as typing info. Short of resorting to tools like sigtools.modifiers.annotate[4], *mainstream code can't use function annotations*.

You will argue that people are already coming up with type-checking libraries en masse, and that few alternative uses have come up. That's because type-checking is an idea that people have seen and had in mind well before now, PEP 3107, or Python itself. It is even one of the two examples in the PEP, the other being fairly irrelevant given docstrings, IMO. Mainstream code can't use annotations. New brains are mostly taught to use Python 2[7] or keep compatibility with it, so no new ideas there (save for the endless stream of "I've used Python for a day and it needs static typing" coming from statically-typed languages.) IMO there simply hasn't been enough time for new uses to show up. There's the argument parsers me and a couple others have mentioned, and although the idea can be ported to other interfaces (web forms, GUI forms?), this is only one use yet, but I think it is a good one, one that is also fairly unique to Python. Do we want to close this window of opportunity by trying to imitate the 80's?

Even if you don't deprecate other uses, putting this in the standard library sends a message to the mob: "Annotations are meant to express types and other uses should conform." It's already slightly against-current to propose reflective tools because "reflection is weird and unreliable"; I'd hate to see where this would take us.

Worse than shutting down a potential area of innovation, promoting type-checking in such a way would alienate what many experienced Python programmers view as a core tenet of Python: duck-typing.

Ironically, GvR's example illustrates it perfectly:

  from typing import List, Dict

  def word_count(input: List[str]) -> Dict[str, int]:
      result = {}  #type: Dict[str, int]
      for line in input:
          for word in line.split():
              result[word] = result.get(word, 0) + 1
      return result

First, there's the obvious mistake others have noted: this thing does not need a list, it needs an iterable, and that iterable merely needs to yield objects that have a split method, which yields... hashable objects. That's it. This function could work just as well if line.split yielded integers. It's what's awesome about things like collections.Counter. You can give it whatever iterable you like, whether it yields characters, strings, integers, the types of objects tracked by the garbage collector, you name it. This is awesome. Declaring and casting between type templates is confusing, verbose, scary and not awesome in general. It just makes me think of C++ (sorry).
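The duck-typed spirit of the example can largely be kept while still saying something useful to a checker: Iterable[str] admits lists, tuples, files and generators alike (the truly minimal contract -- any iterable of things with a .split() method yielding hashables -- was beyond what mypy's 2014 type system could express). A sketch:

```python
from collections import Counter
from typing import Dict, Iterable

def word_count(lines: Iterable[str]) -> Dict[str, int]:
    # Iterable, not List: any source of lines will do, including
    # generators and open files, with no cast at the call site.
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return dict(counts)

# A generator expression works just as well as a list:
result = word_count(line for line in ["to be", "or not to be"])
```

Passing a generator here is exactly the call that the List[str] signature would have rejected.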

Let's say that this function needs to operate on strings, because at some point in the future it will replace line.split() with something that accounts for punctuation and whatnot, and ignore that this would really mean it actually became a different function. (Do you expect str.split to begin handling punctuation overnight?) Now the only mistake is that it pretends it is specific to lists. This mistake is easily realized, so what happens when you write your fancy library that fully declares its typing, declaring generic interfaces where appropriate, and one of your dependencies finally makes the switch over and gets it wrong? Now your typechecker says your functions are making incorrect calls to that API because you passed the result of a generator function to something that expects a list. How often do you think this will happen? How many people eager to type-check everything will not know how duck-typing matters in their library or program and specify overly restrictive interfaces for their functions? How many will assume checking declared types is a substitute for unit testing? This is speculative, but being a regular participant of #python on FreeNode, I already dread this.

Finally, if I abstract this proposal, it boils down to: The proposal wants to take what has been so far a fully free-for-all attribute of functions, and make it free-for-all-but-only-for-this-purpose. That's a bit... weird? Maybe the mypy project could split its declaration and checking components, sort of like I split clize(CLI argument parsing) and sigtools(high-level signature operations and reflection improvements)? Isn't the way those libraries declare types-to-be-checked the main way they will distinguish each other for developers? Is there a problem preventing IDEs from implementing a runner for mypy, much like eg. the way you can use pyflakes as your compiler in vim?

Sorry for the long rant.

[4] Inspect.signature-compatible backport of annotations: http://sigtools.readthedocs.org/en/latest/#sigtools.modifiers.annotate
[5] Backport of inspect.signature: https://pypi.python.org/pypi/funcsigs/0.4
[6] Rejected proposals from Function Annotations PEP: http://legacy.python.org/dev/peps/pep-3107/#rejected-proposals
[7] Warnings for Beginners - Learn Python the Hard Way: http://learnpythonthehardway.org/book/ex0.html#warnings-for-beginners

-Yann Kaiser

Antoine Pitrou

Aug 14, 2014, 10:22:04 AM
to python...@python.org

Hi,

So a couple more comments:

- there seems to be no detailed description of mypy's typing.py. There
are some docstrings in the module but they don't really explain the
intended semantics of using those types (and, most importantly,
combining and extending them). Or is the tutorial the description?

- mypy is a young module; is it actually proven on real-world codebases?
If a PEP gets written, typing.py should be proposed on a "provisional"
basis as described in PEP 411

- for the two reasons above I would expect a PEP to spell out and
somehow motivate the typing semantics (by "typing" I do not mean the
actual algorithm of a type checker, but the semantics of combining types
and the mechanisms for extension), rather than defer to third-party docs
and source code

- many discussion topics seem to revolve around the syntax for
declarations; besides syntactic sugar for declarations, though, there
should probably be a hook to describe typing in a more programmatic way,
especially when parametric types ("generic"?) are involved. I can't
think of an example, so I realize this comment may not be very helpful
(!), but I'm sure we'll hit the limits of what declarative syntax can
give us, especially once people try it out on non-trivial code bases. Do
such hooks already exist?

- the typing module apparently has its own brand of generic functions
(with multiple dispatch?). Are they actually necessary for typing
declarations and type inference? If so, it would be nice if this could
be unified with functools.singledispatch, or at least put in the
functools namespace, and sufficiently similar API-wise that it doesn't
stand out.

- it would be nice if the PEP could discuss at least a little bit the
performance expectations of using the type system, and if there are
design decisions that can have a (positive or negative) performance
impact on the algorithms using it. I realize this is not top-priority
for the linting use case, but it's still important (especially for
linting large code bases, which is one of the proclaimed use cases).

(this reminds me how annoyed I am at the slowness of Sphinx on non-small
reST docs)

Regards

Antoine.

Ryan Gonzalez

Aug 14, 2014, 10:39:01 AM
to Terry Reedy, python-ideas
On Wed, Aug 13, 2014 at 9:27 PM, Terry Reedy <tjr...@udel.edu> wrote:
Guido, as requested, I read your whole post before replying. Please do the same. This response is both critical and supportive.


On 8/13/2014 3:44 PM, Guido van Rossum wrote:

Yesterday afternoon I had an inspiring conversation with Bob Ippolito
(man of many trades, author of simplejson) and Jukka Lehtosalo (author
of mypy: http://mypy-lang.org/).

My main concern with static typing is that it tends to be anti-duck-typing, while I consider duck-typing to be a major *feature* of Python.  The example in the page above is "def fib(n: int):". Fib should get a count (non-negative integer) value, but it need not be an int, and 'half' the ints do not qualify. Reading the tutorial, I could not tell if it supports numbers.Number (which should approximate the domain from above).

Very true; this is one of my fears. There are plenty of people who adore static typing and will make everything in all their libraries static with static this and static that. Maybe there could be a way to disable type checks?
 

Now consider an extended version (after Lucas).

def fib(n, a, b):
    i = 0
    while i <= n:
        print(i,a)
        i += 1
        a, b = b, a+b

The only requirement of a, b is that they be addable. Any numbers should be allowed, as in fib(10, 1, 1+1j), but so should fib(5, '0', '1'). Addable would be approximated from below by Union(Number, str).


Unless MyPy added some sort of type classes...
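Terry's claim is easy to check empirically: with no annotations at all, the same body works for ints, complex numbers and strings. A variant of the quoted fib that returns the terms instead of printing them:

```python
def fib(n, a, b):
    # The only requirement on a and b is that they support +;
    # nothing else about their type is assumed.
    out = []
    i = 0
    while i <= n:
        out.append(a)
        i += 1
        a, b = b, a + b
    return out
```

fib(4, 1, 1) gives the usual integer sequence, while fib(2, '0', '1') happily concatenates strings; a Union(Number, str) annotation would admit both of those but still shut out NumPy arrays and every other addable type.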



--
Ryan
If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated."

Mark Lawrence

Aug 14, 2014, 10:52:50 AM
to python...@python.org
On 14/08/2014 14:59, Yann Kaiser wrote:

> New brains are mostly taught to use python 2[7] or keep compatibility with it,

> [7] Warnings for Beginners - Learn Python the Hard Way:
> http://learnpythonthehardway.org/book/ex0.html#warnings-for-beginners
>
> -Yann Kaiser

This is the second time in a few days that I've seen a reference to that
and IMHO it's simply plain wrong and should be ignored.

Just my £0.02p worth.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

Petr Viktorin

Aug 14, 2014, 10:53:38 AM
to python-ideas
It seems to me that adding a module which is so far used by only one
project to the standard library is a bit premature.

I support optional typing, but why push this to the stdlib now?
Wouldn't it be better to wait until most IDEs/linters agree on this
syntax before freezing it in the stdlib? So far typing seems to be a
part of mypy; shouldn't it spend some time on PyPI first?

I'm also not sure about there being no other uses of annotations --
clize aside, there are not many widely used Python3-only 3rd party
libraries, so it's no surprise that nothing big is built around Python
3 features.

Maybe the way from PEP 3107's "here's a feature, use it for whatever
you like" to "annotations are for typing declarations, using
mypy/typing syntax" should include a step of "if you use annotations
for typing, use mypy/typing syntax for it". (And perhaps it should end
there.)

Juancarlo Añez

Aug 14, 2014, 11:00:54 AM
to Guido van Rossum, Python-Ideas

On Wed, Aug 13, 2014 at 3:14 PM, Guido van Rossum <gu...@python.org> wrote:
I am proposing that we adopt whatever mypy uses here, keeping discussion of the details (mostly) out of the PEP. The goal is to make it possible to add type checking annotations to 3rd party modules (and even to the stdlib) while allowing unaltered execution of the program by the (unmodified) Python 3.5 interpreter.

To the bottom of things...

About the second time I wrote about Python ("Why not Python", 2007) I dismissed it as a serious software development environment because the lack of static type checking hindered the creation of proper software development environments.


So, why do I now have doubts about adding support for static type checking?

I've been programming in almost-only Python for several years now, and this discussion had me think hard about "Why?".

The answer is simple: I never was as productive as I've been since I've centered on Python.

But, again, why?

Despite what my '07 article says, the IDE I use is pythonized-VIM and the command line. Where does the productivity come from?
  1. Readability with the right amount of succinctness. Python programs are very small, but understandable.
  2. The breadth and design consistency of the standard library. Some 70%? of what I need is there, and the design consistency makes it easy (intuitive) to use.
  3. PyPi covers another 28%.
  4. The Zen of Python (import this) permeates all of the above, including most third-party packages. The ecosystem is consistent too. It's a culture.

What do I fear? I think it is that Python be transformed into a programming language different from the one that now makes me so productive.

I studied Ruby, and I don't like it. I've been studying Go, and I don't like it. One must like the concepts and the power, sure, but the syntax required for some day-to-day stuff stinks like trouble;  simple stuff is so complicated to express and so easy to get wrong...

I hate "List[str]" and "Dict[str, int]". Where did those come from? Shouldn't they (as others have proposed) be "[str]" and "{str:int}"? What about tuples?

Why not write a similar, but different programming language that targets the Cython runtime and includes all the desired features?

All said, this is my proposal.

The PSF could support (even fund) MyPy and similar projects, promoting their maturity and their convergence. The changes in 3.5 would be limited but enough to enable those efforts, and those of the several IDE tool-smiths (changes in annotations, and maybe in ABCs). Basically, treat MyPy as PyPy or NumPy (which got '::'). It's in Python's history to enable third-party developments and then adopt what's mature or become the de-facto standard.

Then, on a separate line of work, it would be good to think about how to enable different programming languages to target the CPython environment (because of #2, #3, and #4 above), maybe by improving AST creation and AST-to-bytecode? There could be other languages targeting the CPython runtime, which is the relationship that Scala, Jython, IronPython, and others have to their own runtimes.

-1 for standardizing static type checking in 3.5

Cheers,

--
Juancarlo Añez

Andrew Barnert

Aug 14, 2014, 11:24:53 AM
to Ryan Gonzalez, python-ideas, Terry Reedy
On Aug 14, 2014, at 7:37, Ryan Gonzalez <rym...@gmail.com> wrote:

On 8/13/2014 3:44 PM, Guido van Rossum wrote:

Now consider an extended version (after Lucas).

def fib(n, a, b):
    i = 0
    while i <= n:
        print(i,a)
        i += 1
        a, b = b, a+b

The only requirement of a, b is that they be addable. Any numbers should be allowed, as in fib(10, 1, 1+1j), but so should fib(5, '0', '1'). Addable would be approximated from below by Union(Number, str).


Unless MyPy added some sort of type classes...

By "type classes", do you mean this in the Haskell sense, or do you mean classes used just for typing--whether more granular ABCs (like an Addable which both Number and AnyStr would probably inherit) or typing.py type specifiers (like an Addable defined as Union(Number, AnyStr))?

It's also worth noting that the idea that this function should take a Number or a str seems way off. It's questionable whether it should accept str, but if it does, shouldn't it also accept bytes, bytearray, and other string-like types? What about sequences? And meanwhile, whether or not it accepts str, it should probably accept np.ndarray and other types of element-wise adding types. If you create an Addable type, it has to define, globally, which of those counts as addable, but different functions will have different definitions that make sense.

In fact, look at the other discussion going on. People want to ensure that sum only works on numbers or number-like types (and does that include NumPy arrays or not?), while others want to change it to work on all sequences, or only mutable sequences with += plus str because it effectively has magical += handling under the covers, etc. If we can't even decide what Addable means for one specific function that everyone has experience with...

On the other hand, if sum could have been annotated to tell us the author's intention (or, rather, the consensus of the dev list), then all of these recurring arguments about summing str would go away. Until someone defines a number-like class (maybe even one that meets the ABC, but without calling register on it) and sum won't work for him.
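For what it's worth, the structural reading of "Addable" did eventually become expressible: typing.Protocol (PEP 544, Python 3.8, long after this thread) lets each function state its own minimal contract instead of everyone agreeing on one global Addable. A sketch:

```python
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class Addable(Protocol):
    # Structural: anything defining __add__ qualifies, with no
    # registration and no shared base class.
    def __add__(self, other: Any) -> Any: ...

# int, str, bytes and (if installed) np.ndarray all satisfy this
# implicitly; a bare object() does not.
```

Because the protocol is per-module, two libraries can disagree about what "addable" means without either one subverting the other's checks.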

Brett Cannon

Aug 14, 2014, 12:02:16 PM
to Petr Viktorin, python-ideas
On Thu Aug 14 2014 at 10:53:37 AM Petr Viktorin <enc...@gmail.com> wrote:
It seems to me that rather than adding a module which is only used by
one project so far to the standard library is a bit premature.

I support optional typing, but why push this to stdlib now? Wouldn't
it be better to wait until most IDEs/linters all agree on this syntax,
until freezing it in stdlib? So far typing seems to be a part of mypy,
shouldn't it spend some time on PyPI first?

Because as you have noticed in this thread there are already a ton of competing solutions and no consensus has been reached. Sometimes Guido and/or python-dev have to step in and simply say "there is obvious need and the community is not reaching consensus, so we will make the decision ourselves".
 

I'm also sure about there not being other uses of annotations -- clize
aside, there are not many widely used Python3-only 3rd party
libraries, so it's no surprise that nothing big is built around Python
3 features.

Maybe the way from PEP 3107's "here's a feature, use it for whatever
you like" to "annotations are for typing declarations, using
mypy/typing syntax" should include a step of "if you use annotations
for typing, use mypy/typing syntax for it". (And perhaps it should end
there.)

That's a possibility. Another thing to support this approach is that if something like List[str] is used over `[str]`, then the returned object can subclass some common superclass which can be typechecked for, to know that the annotation is from typing.py and not clize/scription, and continue to function. That way you can avoid any decorators adding some attribute on functions to know that types have been specified, while allowing function annotations to be used for anything. Otherwise a @typing.ignore decorator could also exist for alternative formats to use (since typing has been the most common use case, and decorating your single main() function with @typing.ignore is not exactly heavy-handed).
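The common-superclass dispatch Brett describes can be sketched with a hypothetical marker base (typing.py does not actually work this way; every name here is made up for illustration):

```python
class TypeAnnotation:
    """Hypothetical common base for all typing.py constructs."""
    def __init__(self, *args):
        self.args = args

class ListOf(TypeAnnotation):
    pass  # stand-in for something like typing.List[...]

def uses_typing(func):
    # A tool could dispatch on the marker base to tell typing
    # annotations from clize/scription-style ones, with no extra
    # decorator or attribute on the function.
    return any(isinstance(a, TypeAnnotation)
               for a in func.__annotations__.values())

def typed(words: ListOf(str)) -> None:
    pass

def cli(verbose: 'v'):  # clize-style alias annotation
    pass
```

Here `uses_typing(typed)` is true and `uses_typing(cli)` is false, so both styles could coexist in one codebase without stepping on each other.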

Sunjay Varma

Aug 14, 2014, 12:03:11 PM
to Juancarlo Añez, Python-Ideas
I am strongly opposed to this entire proposal. As Juancarlo points out, Python programs are small, but very understandable. I think this syntax detracts from that. I'll suggest an alternative further down in my reply.

One benefit of Python that makes it so attractive for new programmers and even old programmers alike is that you can usually pick out any piece of Python code and begin to understand it immediately. Even if you come from a different programming language, Python is written in English, explicitly using words like "and" and "or". Those constructs, as opposed to "&&" or "||", make the language less scary for new developers and in general easier to read as well. It's also easier to type regular English words (no need to use the shift key). Using the annotation syntax this heavily will detract very much from the readability of Python and from the overall usability as well. Programs are read more times than they are written.

Several years ago, before I had any programming experience in any language at all, I needed to edit some Python code to make something I was doing work. Without any experience at all, I was able to look through the (small) program I was editing and figure out exactly what I needed to adjust. Without Python being such a clean, almost English language, that would have been impossible.

Though the annotation syntax is already present in Python 3, I would argue that using this for type annotations will get very messy very quickly. If I'm understanding the syntax correctly, writing any function using a large library with many nested subpackages could result in code like this:

    import twisted.protocols.mice.mouseman

    def process_mouseman(inputMouseMan: twisted.protocols.mice.mouseman.MouseMan) -> twisted.protocols.mice.mouseman.MouseMan:
        pass

That function definition is 122 characters long. Far more than what PEP8 recommends. Though this example was crafted to illustrate my point (I don't think most people would really write code like this), it is easy to see that this kind of code is possible and may sometimes be written by some less experienced programmers. It demonstrates how messy things can get even with just one parameter. 

It is also easy to see that it is very difficult to parse out what is going on in that function. Adding type annotations inline makes it very difficult to quickly get an idea of what arguments a function takes and in what order. It detracts from the overall readability of a program and can also lead to very poorly formatted programs that break the guidelines in PEP8. Though I have only demonstrated this for function declarations, the example could also be extended to inline statement comments as well. Things get too messy too quickly.

My Alternative Proposal:
As an alternative, I would like to propose a syntax that Pycharm already supports: http://www.jetbrains.com/pycharm/webhelp/using-docstrings-to-specify-types.html

Since this type information isn't going to be used at runtime in the regular Python interpreter anyway, why not have it in the function docstring instead? This provides both readability and type checking. Standardizing that syntax or at least adding it as an optional way to check your program would in my opinion be a much better addition to the language. This approach needs no new syntax, keeps readability and allows the programmer to add additional documentation without going over the 80 character limit.

Additionally, this approach can be used by documentation generators as well and removes any duplication from the function declaration and the docstring.

Here's a taste of what that looks like:
    class SimpleEquation(object):
    
        def demo(self, a, b, c):
            """
            This function returns the product of a, b and c
            :type self: SimpleEquation
            :param a: int - The first number
            :param b: int
            :param c: int - The third number should not be zero and should also
                only be -1 if you enjoy carrots (this comment spans 2 lines)
            :return: int
            """
            return a * b * c

Overall, I think overloading function declarations and inline comments is a bad idea. It promotes writing code with poor readability and in general adds a lot of extra bits to the language that (from the sounds of your proposal) aren't even going to be used by the main interpreter.

On the original proposal:
These changes really do seem to be overestimating the staticness of Python programs as well. What about functions that don't care about the type? What about functions that only want you to pass in an object that implements __iter__? Python should not become a language where developers are required to add hundreds of odd cast() calls every time they choose to pass a different, but still compatible type, to a function. This syntax makes too many assumptions about what developers know about their code. What if I develop a similar, but different class that is compatible with an existing function? If that function doesn't specify that my class can be used, my perfectly valid code will be rejected.

-1 to adding mypy annotations to Python 3.

Sunjay

_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/



--
Sunjay Varma
Python Programmer & Web Developer

Manuel Cerón

unread,
Aug 14, 2014, 12:35:48 PM8/14/14
to Terry Reedy, Python-Ideas
On Thu, Aug 14, 2014 at 4:27 AM, Terry Reedy <tjr...@udel.edu> wrote:
My main concern with static typing is that it tends to be anti-duck-typing, while I consider duck-typing to be a major *feature* of Python. The example in the page above is "def fib(n: int):". Fib should get a count (non-negative integer) value, but it need not be an int, and 'half' the ints do not qualify. Reading the tutorial, I could not tell if it supports numbers.Number (which should approximate the domain from above).

This is a good point. But I think that static typing and duck typing are not mutually exclusive. TypeScript does this very nicely by defining structural interfaces (http://www.typescriptlang.org/Handbook#interfaces). With them, it is possible to define a given behaviour, and any object capable of providing that behaviour is accepted, without having to be part of any specific type hierarchy or having to explicitly register as an implementation of a certain specification. That's basically what duck typing means.

For example:

interface Named {
    name: string;
    say(): string;
}

function doSomething(x: Named) {
    console.log(x.name);
}

doSomething({name: "hello", say: function() { return this.name }}); // OK
doSomething({something: "hello"}); // ERROR

I think something like this is a must have for mypy. In Python, I've been playing with something similar (https://github.com/ceronman/typeannotations) but for runtime checking only:

>>> class Person(Interface):
...     name = str
...     age = int
...     def say_hello(name: str) -> str:
...         pass

Any object defining the name, age and say_hello() members is a valid implementation of that interface. For example:

>>> class Developer:
...     def __init__(self, name, age):
...             self.name = name
...             self.age = age
...     def say_hello(self, name: str) -> str:
...             return 'hello ' + name
...
>>> isinstance(Developer('bill', 20), Person)
True

Are there any plans for adding something like this to mypy?

Manuel.

Juancarlo Añez

unread,
Aug 14, 2014, 1:18:56 PM8/14/14
to Manuel Cerón, Python-Ideas, Terry Reedy

On Thu, Aug 14, 2014 at 12:04 PM, Manuel Cerón <cero...@gmail.com> wrote:
For example:

interface Named {
    name: string;
    say(): string;
}

function doSomething(x: Named) {
    console.log(x.name);
}

doSomething({name: "hello", say: function() { return this.name }}); // OK
doSomething({something: "hello"}); // ERROR

That is so Java....!


--
Juancarlo Añez

Ethan Furman

unread,
Aug 14, 2014, 1:29:47 PM8/14/14
to python...@python.org
On 08/14/2014 09:01 AM, Sunjay Varma wrote:
>
> Additionally, this approach can be used by documentation generators as well and removes any duplication from the
> function declaration and the docstring.
>
> Here's a taste of what that looks like:
>     class SimpleEquation(object):
>
>         def demo(self, a, b, c):
>             """
>             This function returns the product of a, b and c
>             @type self: SimpleEquation
>             :param a: int - The first number
>             :param b: int
>             :param c: int - The third number should not be zero and should also
>                 only be -1 if you enjoy carrots (this comment spans 2 lines)
>             :return: int
>             """
>             return a * b * c


+1 I like this much more.

--
~Ethan~

Steven D'Aprano

unread,
Aug 14, 2014, 1:32:03 PM8/14/14
to python...@python.org
As requested, I've read the whole post, and the whole thread, before
responding :-)

On Wed, Aug 13, 2014 at 12:44:21PM -0700, Guido van Rossum wrote:

> (a) Python should adopt mypy's syntax for function annotations

[...]

I'm very excited to see function annotations being treated seriously.
I think the introduction of static typing, even optional, has the
potential to radically change the nature of the Python language, and I'm
not sure whether that will be good or bad :-) but it is reassuring to
hear that the intention is that it will be treated more like an optional
linter than as a core part of the language.

On the other hand, are you aware of Cobra, which explicitly was modelled
on Python but with optional static typing?

http://cobra-language.com/


[...]


> *(1) A change of direction for function annotations*

> [...] I propose a conscious change of course here by stating


> that annotations should be used to indicate types and to propose a standard
> notation for them.

And in a later email, Guido also stated:

> I want to eventually phase out other uses of function annotations

That disappoints me and I hope you will reconsider.

I've spent some time thinking about using annotations for purposes other
than type checking, but because most of my code has to run on Python 2,
there's nothing concrete. One example is that I started exploring ways
to use annotations as documentation for the statistics module in 3.4,
except that annotations are banned from the standard library. (Naturally
I haven't spent a lot of time on something that I knew was going to be
rejected.) I came up with ideas like this:

def mean(data) -> 'μ = ∑(x)/n':

def pvariance(data) -> 'σ² = ∑(x - μ)² ÷ n':

which might have been a solution to this request:

http://bugs.python.org/issue21046

had annotations been allowed in the stdlib. Regardless of whether this
specific idea is a good one or not, I will be disappointed if
annotations are limited to one and only one use. I don't mind if there
is a standard, default, set of semantics so long as there is a way to
opt-out and use something else:

@use_spam_annotations
def frobnicate(x: spam, y: eggs) -> breakfast:
    ...

for example. Whatever the mechanism, I think Python should not prohibit
or deprecate other annotation semantics.
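Steven's opt-out could be as simple as a marker decorator that a cooperating checker looks for (a sketch only; `use_spam_annotations` and the `__skip_type_check__` flag are hypothetical names, not part of mypy or the proposal):

```python
def use_spam_annotations(func):
    """Mark a function's annotations as non-type metadata.

    A cooperating type checker would see this flag and skip the
    function entirely; at run time it changes nothing.
    """
    func.__skip_type_check__ = True
    return func

@use_spam_annotations
def mean(data) -> 'μ = ∑(x)/n':
    return sum(data) / len(data)

print(mean.__skip_type_check__)  # True
print(mean.__annotations__)      # {'return': 'μ = ∑(x)/n'}
```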


> *(2) A specification for what to add to Python 3.5*
>
> There needs to be at least a rough consensus on the syntax for annotations,
> and the syntax must cover a large enough set of use cases to be useful.
> Mypy is still under development, and some of its features are still
> evolving (e.g. unions were only added a few weeks ago). It would be
> possible to argue endlessly about details of the notation, e.g. whether to
> use 'list' or 'List', what either of those means (is a duck-typed list-like
> type acceptable?) or how to declare and use type variables, and what to do
> with functions that have no annotations at all (mypy currently skips those
> completely).

It doesn't sound to me like the mypy syntax is mature enough to bless,
let alone to start using it in the standard library.


> I am proposing that we adopt whatever mypy uses here, keeping discussion of
> the details (mostly) out of the PEP. The goal is to make it possible to add
> type checking annotations to 3rd party modules (and even to the stdlib)
> while allowing unaltered execution of the program by the (unmodified)

> Python 3.5 interpreter. The actual type checker will not be integrated with
> the Python interpreter, and it will not be checked into the CPython
> repository. The only thing that needs to be added to the stdlib is a copy
> of mypy's typing.py module.

What happens when the typing.py module in the standard library gets out
of sync with the typing.py module in mypy?


[...]


> *Appendix -- Why Add Type Annotations?*
> The argument between proponents of static typing and dynamic typing has
> been going on for many decades. Neither side is all wrong or all right.
> Python has traditionally fallen in the camp of extremely dynamic typing,
> and this has worked well for most users, but there are definitely some
> areas where adding type annotations would help.

Some people have probably already seen this, but I have found this
article to be very useful for understanding why static and dynamic type
checking can be complementary rather than opposed:

http://cdsmith.wordpress.com/2011/01/09/an-old-article-i-wrote/

--
Steven

Neal Becker

unread,
Aug 14, 2014, 1:35:50 PM8/14/14
to python...@python.org
Does mypy support annotation of functions implemented in C code?

If I extend cpython via C-API, does mypy provide a mechanism to annotate those
functions?

Steven D'Aprano

unread,
Aug 14, 2014, 1:36:52 PM8/14/14
to python...@python.org
On Thu, Aug 14, 2014 at 06:34:43PM +0200, Manuel Cerón wrote:
> On Thu, Aug 14, 2014 at 4:27 AM, Terry Reedy <tjr...@udel.edu> wrote:
> >
> > My main concern with static typing is that it tends to be
> > anti-duck-typing, while I consider duck-typing to be a major *feature* of
> > Python. The example in the page above is "def fib(n: int):". Fib should
> > get a count (non-negative integer) value, but it need not be an int, and
> > 'half' the ints do not qualify. Reading the tutorial, I could not tell if
> > it supports numbers.Number (which should approximate the domain from above.)
>
>
> This is a good point. But I think that static typing and duck typing are
> not mutually exclusive.
[...]
> Are there any plans for adding something like this to mypy?

The mypy FAQs claim to be focusing on nominative typing, but haven't
ruled out structural typing in the future:

http://www.mypy-lang.org/faq.html



--
Steven

Stefan Behnel

unread,
Aug 14, 2014, 1:38:47 PM8/14/14
to python...@python.org
Guido van Rossum schrieb am 13.08.2014 um 21:44:
> Yesterday afternoon I had an inspiring conversation with Bob Ippolito (man
> of many trades, author of simplejson) and Jukka Lehtosalo (author of mypy:
> http://mypy-lang.org/). Bob gave a talk at EuroPython about what Python can
> learn from Haskell (and other languages); yesterday he gave the same talk
> at Dropbox. The talk is online (
> https://ep2014.europython.eu/en/schedule/sessions/121/) and in broad
> strokes comes down to three suggestions:
>
> (a) Python should adopt mypy's syntax for function annotations
> [...] proposal (a) feels right to me.

FWIW, Cython currently understands these forms of function argument
annotations in Python code:

x: dict
x: {"type": dict}
x: {"type": "dict"}
x: {"ctype": "long double"}

The "ctype" values that are usable here obviously only include those that
can be converted from and to Python types, e.g. no arbitrary pointers.

It'd be nice to have a way to declare container item types, but that's
never really been a priority in the Cython project so far. Declaring
protocols, OTOH, is pretty useless for a compiler, as it's obvious from the
code which protocols are being used on a given value (iteration, item
access, etc.).

It's clear that restricting everything to one kind of annotation isn't
enough, as there are use cases for a mix of different type systems, Python
itself plus at least C/C++ in CPython and Cython, Java in Jython, C# in
IronPython, plus others that people might want to interface with. C/C++ are
generally interesting, for example, also for .NET or JVM users.

Although I wasn't very impressed by Bob Ippolito's talk at EuroPython, I'm
generally not opposed to type annotations to provide 1) additional contract
information for libraries, 2) documentation, 3) better static analysis or
4) type hints for compilers. But these are actually quite different use
cases that may each suggest a different strictness in the type
declarations. For 3) and 4), function signatures aren't enough and should
be accompanied by declarations for local variables. 4) should also support
additional type systems for language integration. But 1) and 2) aren't
completely overlapping either. 1) would need declarations that can be used
for hard type checking, whereas 2) can be much more relaxed, generic and
incomplete. Trying to get all of these under one umbrella might not be a
good idea, but letting people add three different annotations for each
function argument definitely isn't either.

Stefan

Steven D'Aprano

unread,
Aug 14, 2014, 1:46:18 PM8/14/14
to python...@python.org
On Wed, Aug 13, 2014 at 05:51:40PM -0430, Juancarlo Añez wrote:

> Function annotations are not available in Python 2.7, so promoting
> widespread use of annotations in 3.5 would be promoting code that is
> compatible only with 3.x,

Yes. You say that as if it were a bad thing. It is not. Python 3 is
here to stay and we should be promoting Python 3 only code. There is
absolutely no need to apologise for that fact. If people are happy with
Python the way it is in 2.7, or 1.5 for that matter, that's great, they
can stay on it for ever, but all new features are aimed at 3.x and not
2.x or 1.x.


> when the current situation is that much effort is
> being spent on writing code that works on both 2.7 and 3.4 (most
> libraries?).

There's no reason why all new code should be aimed at 2.x and 3.x. But
even for code which is, the nice thing about this proposal is that it's
optional, so you can run your type-check using mypy under Python 3.x and
still get the benefit of it when running under 2.x.


> Independently of its core merits, this proposal should fail unless
> annotations are added to Python 2.8.

There will be no Python 2.8, and no Python 2.9 either. New features go
into 3.x.



--
Steven

Gregory P. Smith

unread,
Aug 14, 2014, 1:47:47 PM8/14/14
to Brett Cannon, Guido van Rossum, python-ideas
On Thu, Aug 14, 2014 at 9:01 AM, Brett Cannon <br...@python.org> wrote:

On Thu Aug 14 2014 at 10:53:37 AM Petr Viktorin <enc...@gmail.com> wrote:
It seems to me that rather than adding a module which is only used by
one project so far to the standard library is a bit premature.

I support optional typing, but why push this to stdlib now? Wouldn't
it be better to wait until most IDEs/linters all agree on this syntax,
until freezing it in stdlib? So far typing seems to be a part of mypy,
shouldn't it spend some time on PyPI first?

Because as you have noticed in this thread there are already a ton of competing solutions and no consensus has been reached. Sometimes Guido and/or python-dev have to step in and simply say "there is obvious need and the community is not reaching consensus, so we will make the decision ourselves".

My overarching concern with the entire proposal is that adding this would just be yet more syntax added to the language with not much use that doesn't go far enough.

We'd ultimately need pytd or something else regardless when it comes to full scale Python static analysis.

But that isn't necessarily a bad thing. Specifying an actual basic annotation syntax that can do some subset of what you want to annotate in the language should in theory still be useful to a real code analysis tool. If it isn't, it will simply ignore it. If it is, it can use it and build on it even though it needs the ability to specify on the side.

If you do add a module for this, at least consider hiding it behind a "from __future__ import some_module_full_of_annotation_related_things" instead of making it a new no-op top level module.

-gps
 
 

I'm also sure about there not being other uses of annotations -- clize
aside, there are not many widely used Python3-only 3rd party
libraries, so it's no surprise that nothing big is built around Python
3 features.

Maybe the way from PEP 3107's "here's a feature, use it for whatever
you like" to "annotations are for typing declarations, using
mypy/typing syntax" should include a step of "if you use annotations
for typing, use mypy/typing syntax for it". (And perhaps it should end
there.)

That's a possibility. Another thing to support this approach is that if something like List[str] is used over `[str]`  then the returned object can subclass some common superclass which can be typechecked for to know that the annotation is from typing.py and not clize/scription and continue to function. That way you can avoid any decorators adding some attribute on functions to know that types have been specified while allowing function annotations to be used for anything. Otherwise a @typing.ignore decorator could also exist for alternative formats to use (since typing has been the most common use case and decorating your single main() function with @typing.ignore is not exactly heavy-handed).

Stefan Behnel

unread,
Aug 14, 2014, 1:54:58 PM8/14/14
to python...@python.org
Neal Becker schrieb am 14.08.2014 um 19:33:
> Does mypy support annotation of functions implemented in C code?
>
> If I extend cpython via C-API, does mypy provide a mechanism to annotate those
> functions?

No, mypy isn't about C or even CPython.

However, you can already do that, although not easily, I think. What you'd
need is an __annotations__ dict on the function object and a bit of
trickery to make CPython believe it's a function. Cython gives you that for
free (by simply providing the normal Python semantics), but you can get the
same thing with some additional work when writing your own C code. There's
also the argument clinic, but IIRC it doesn't support signature annotations
for some reason, guess it wasn't considered relevant (yet).
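One low-tech workaround, pending real support, is a thin Python wrapper that carries the annotations on behalf of the C-implemented function (a sketch, not a feature of mypy or Cython):

```python
import math

def floor(x: float) -> int:
    """Typed facade over the C-implemented math.floor."""
    return math.floor(x)

# The Python wrapper exposes the __annotations__ dict that the
# C function itself cannot carry:
print(floor.__annotations__)  # {'x': <class 'float'>, 'return': <class 'int'>}
```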

Stefan

Ryan

unread,
Aug 14, 2014, 2:06:39 PM8/14/14
to Andrew Barnert, python-ideas, Terry Reedy


Andrew Barnert <abar...@yahoo.com> wrote:
>On Aug 14, 2014, at 7:37, Ryan Gonzalez <rym...@gmail.com> wrote:
>
>>> On 8/13/2014 3:44 PM, Guido van Rossum wrote:
>
>>> Now consider an extended version (after Lucas).
>>>
>>> def fib(n, a, b):
>>> i = 0
>>> while i <= n:
>>> print(i,a)
>>> i += 1
>>> a, b = b, a+b
>>>
>>> The only requirement of a, b is that they be addable. Any numbers
>should be allowed, as in fib(10, 1, 1+1j), but so should fib(5, '0',
>'1'). Addable would be approximated from below by Union(Number, str).
>>
>> Unless MyPy added some sort of type classes...
>
>By "type classes", do you mean this in the Haskell sense, or do you
>mean classes used just for typing--whether more granular ABCs (like an
>Addable which both Number and AnyStr and probably inherit) or typing.py
>type specifiers (like an Addable defined as Union(Number, AnyStr)?

The Haskell way. Having ABCs and type classes can get confusing, but, when I can, I use type classes for the more unrelated concepts (such as Addable) and ABCs for the more parent-child concepts (such as Node in an AST or Generator in a set of generators).

The fine line might actually make it a bad choice to add to Python/mypy, though.

>
>It's also worth noting that the idea that this function should take a
>Number or a str seems way off. It's questionable whether it should
>accept str, but if it does, shouldn't it also accept bytes, bytearray,
>and other string-like types? What about sequences? And meanwhile,
>whether or not it accepts str, it should probably accept np.ndarray and
>other types of element-wise adding types. If you create an Addable
>type, it has to define, globally, which of those counts as addable, but
>different functions will have different definitions that make sense.
>
>In fact, look at the other discussion going on. People want to ensure
>that sum only works on numbers or number-like types (and does that
>include NumPy arrays or not?), while others want to change it to work
>on all sequences, or only mutable sequences with += plus str because it
>effectively has magical += handling under the covers, etc. If we can't
>even decide what Addable means for one specific function that everyone
>has experience with...
>
>On the other hand, if sum could have been annotated to tell us the
>author's intention (or, rather, the consensus of the dev list), then
>all of these recurring arguments about summing str would go away. Until
>someone defined a number-like class (maybe even one that meets the ABC,
>but by calling register on it) and sum won't work for him.

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Steven D'Aprano

unread,
Aug 14, 2014, 2:16:48 PM8/14/14
to python...@python.org
On Thu, Aug 14, 2014 at 12:01:37PM -0400, Sunjay Varma wrote:

> Though the annotation syntax is already present in Python 3, I would argue
> that using this for type annotations will get very messy very quickly. If
> I'm understanding the syntax correctly, writing any function using a large
> library with many nested subpackages could result in code like this:
>
> import twisted.protocols.mice.mouseman
>
> def process_mouseman(inputMouseMan: twisted.protocols.mice.mouseman.MouseMan) -> twisted.protocols.mice.mouseman.MouseMan:
>     pass

I would write that like this:

from twisted.protocols.mice.mouseman import MouseMan

def process_mouseman(inputMouseMan: MouseMan) -> MouseMan:
    pass


> That function definition is 122 characters long.

Or 58.


> It is also easy to see that it is very difficult to parse out what is going
> on in that function.

Only because I have no idea what MouseMan means :-)


> As an alternative, I would like to propose a syntax that Pycharm already
> supports:
> http://www.jetbrains.com/pycharm/webhelp/using-docstrings-to-specify-types.html
[...]
> Here's a taste of what that looks like:
> class SimpleEquation(object):
>
>     def demo(self, a, b, c):
>         """
>         This function returns the product of a, b and c
>         @type self: SimpleEquation
>         :param a: int - The first number
>         :param b: int
>         :param c: int - The third number should not be zero and should also
>             only be -1 if you enjoy carrots (this comment spans 2 lines)
>         :return: int
>         """
>         return a * b * c

I really dislike that syntax. I dislike adding cruft like "@type" and
":param" into docstrings, which should be written for human readers, not
linters. I dislike that you have documented that self is a
SimpleEquation. (What else could it be?) I dislike that the syntax
clashes with ReST syntax. I dislike that it isn't obvious to me why the
first parameter uses @type while the second parameter uses :param.


> Overall, I think overloading function declarations and inline comments is a
> bad idea. It promotes writing code with poor readability

I like the annotation syntax. I'm not completely convinced that the mypy
syntax is mature enough to bless, but the basic idea of type annotations
is pretty common in dozens of languages. I think you are in a tiny
minority if you think that putting the type declaration right next to
the parameter makes it *less* clear than putting the type declaration in
a completely different part of the code.

# the type is together with the parameter
def frobinate(x: Spam, y: Egg)->Breakfast:

# the type declaration and parameter are distantly apart
def frobinate(x, y):
    """Return the frobinated x and y.

    Some more text goes here. Perhaps lots of text.

    :param x: Spam
    :param y: Eggs
    :return: Breakfast
    """


> On the original proposal:
> These changes really do seem to be overestimating the staticness of Python
> programs as well. What about functions that don't care about the type?

They can declare that they are object. Or not declare a type at all.

> What about functions that only want you to pass in an object that implements
> __iter__?


I would expect this should work:

from typing import Iterable

def func(it: Iterable):
    ...


> Python should not become a language where developers are required
> to add hundreds of odd cast() calls every time they choose to pass a
> different, but still compatible type, to a function.

I'm not sure how you go from *optional* static typing to developers
being *required* to cast values.

As I see it, one HUGE advantage of this proposal is that people who want
strict static typing currently might write code like this:

def make_sandwich(filling):
    if not isinstance(filling, Ham):
        raise TypeError
    ...


With the new proposal, they will probably write this:

def make_sandwich(filling: Ham):
    ...

and allow the static type check to occur at compile time. That means
that if I want to pass a Spam instance instead of a Ham instance, all I
need do is disable the compile-time type check, and make_sandwich will
happily accept anything that has the same duck-type interface as Ham,
like Spam. If I pass an int instead, I'll get the same run-time error
that I would have got if make_sandwich did not include an explicit type
check.

So, I think this proposal might actually lead to *more* duck typing
rather than less, since you can always turn off the type checking.


--
Steven

Steven D'Aprano

unread,
Aug 14, 2014, 2:31:42 PM8/14/14
to python...@python.org
On Thu, Aug 14, 2014 at 10:46:43AM -0700, Gregory P. Smith wrote:

> My overarching concern with the entire proposal is that adding this would
> just be yet more syntax added to the language with not much use that
> doesn't go far enough.

This isn't new syntax. Function annotations have been in Python since
Python 3.0. What's new here is blessing one specific meaning for that
syntax as the One Official use for annotations.

I'd rather:

- bless function annotations for static type checking as the default
meaning, but allowing code to opt-out and use annotations for
something else;

- encourage the various type checkers and linters to come up with a
standard syntax for type annotations.

I'm not convinced that mypy syntax is yet mature enough to be that
standard. But, perhaps if the typing module is given provisional status,
maybe it could be a good start.


> If you do add a module for this, at least consider hiding it behind a "from
> __future__ import some_module_full_of_annotation_related_things" instead of
> making it a new no-op top level module.

I'm not sure what this comment means. Did you read Guido's first post in
this thread? I thought he was clear that to get type checking, you would
do this:

from typing import Dict

def function(d: Dict[str, int]) -> int:
    ...

I bet you can guess what that does, but in case you can't, it declares
that argument d is a Dict with str keys and int values, and the return
result is an int. I'm not sure where you get the idea of a no-op top
level module from.

"from __future__ import ..." is inappropriate too, since that is
intended for changes to compiler semantics (e.g. new syntax). This is
existing syntax, and the actual type checking itself will be delegated
to a separate product, mypy, which as Guido stated will behave as a
linter. That means that the type checks won't do anything unless you
have mypy installed, and you can still run your code under any Python
compiler you like. As I understand it, CPython itself requires no
changes to make this work, just the addition of typing.py from the mypy
project and a policy change to the use of annotations.



--
Steven

Cory Benfield

unread,
Aug 14, 2014, 2:31:43 PM8/14/14
to Steven D'Aprano, python-ideas
On 14 August 2014 19:15, Steven D'Aprano <st...@pearwood.info> wrote:
> I really dislike that syntax. I dislike adding cruft like "@type" and
> ":param" into docstrings, which should be written for human readers, not
> linters.

That ship has long-since sailed. Sphinx uses exactly this :param: and
:return: syntax for its docstring parsing. It is by now a common
convention (at least, I see it all over the place in open source
code), and should not be considered a surprise. I appreciate that it
doesn't lead to clean docstrings, but I've found it leads to
docstrings that are genuinely written to be read (because they're part
of your documentation).

> So, I think this proposal might actually lead to *more* duck typing
> rather than less, since you can always turn off the type checking.

I found this conclusion impossible to understand: have I missed
something, Steven? To my eyes, if much duck-typed code fails when run by
a user who knows nothing about the static type checker, that will
clearly not lead to more duck typing. It will lead either to a) less
duck typing because of all the bug reports ("your code breaks whenever I
try to run it!"), or b) everyone turning the static type checker off.

That objection assumes the static checker would be on by default. If
it were off by default but available, both of these problems go away
but we're back in the situation we're in right now. In that case, I
don't see why we'd add this to CPython.

Ethan Furman

unread,
Aug 14, 2014, 2:34:30 PM8/14/14
to python...@python.org
On 08/14/2014 11:15 AM, Steven D'Aprano wrote:
>
> I like the annotation syntax. I'm not completely convinced that the mypy
> syntax is mature enough to bless, but the basic idea of type annotations
> is pretty common in dozens of languages. I think you are in a tiny
> minority if you think that putting the type declaration right next to
> the parameter make it *less* clear that putting the type declaration in
> a completely different part of the code.
>
> # the type is together with the parameter
> def frobinate(x: Spam, y: Egg)->Breakfast:
>
> # the type declaration and parameter are distantly apart
> def frobinate(x, y):
> """Return the frobinated x and y.
>
> Some more text goes here. Perhaps lots of text.
>
> :param x: Spam
> :param y: Eggs
> :return: Breakfast
> """

Sure, keeping that info in the annotations makes more sense, but I'd rather see it in the doc string instead of ruling
out all other possible uses of annotations -- particularly for something that's supposed to be /optional/.

--
~Ethan~

Steven D'Aprano

unread,
Aug 14, 2014, 2:57:19 PM8/14/14
to python...@python.org
On Thu, Aug 14, 2014 at 07:25:04PM +0100, Cory Benfield wrote:
> On 14 August 2014 19:15, Steven D'Aprano <st...@pearwood.info> wrote:
> > I really dislike that syntax. I dislike adding cruft like "@type" and
> > ":param" into docstrings, which should be written for human readers, not
> > linters.
>
> That ship has long-since sailed. Sphinx uses exactly this :param: and
> :return: syntax for its docstring parsing. It is by now a common
> convention (at least, I see it all over the place in open source
> code), and should not be considered a surprise.

I've seen it too, but not in docstrings written in vanilla ReST. It's
disappointing to hear that Sphinx uses it, because I think it is
hideously ugly :-(


> > So, I think this proposal might actually lead to *more* duck typing
> > rather than less, since you can always turn off the type checking.
>
> I found this conclusion impossible to understand: have I missed
> something Steven? To my eyes, the fact that when run by a user who
> knows nothing about the static type checker much duck typing will fail
> will clearly not lead to more duck typing. It will lead either to a)
> less duck typing because of all the bug reports (your code breaks
> whenever I try to run it!), or b) everyone turning the static type
> checker off.

Let me explain my reasoning.

Back in the Old Days, before Python 2.2, there was no isinstance(). We
were strongly discouraged from doing type checks, instead we were
encouraged to rely on duck-typing and that functions would fail loudly
if passed the wrong argument. With the introduction of isinstance,
Python code has slowly, gradually, begun using more and more run-time
explicit type checks with isinstance. Some people do this more than
others. Let's consider Fred, who is a Java programmer at heart and so
writes code like this:

def foo(x):
    if not isinstance(x, float):
        raise TypeError("Why doesn't python check this for me?")
    return (x+1)/2

I want to pass a Decimal to foo(), but can't, because of the explicit
type check. I am sad.

But with this proposal, Fred may write his function like this:

def foo(x: float) -> float:
    return (x+1)/2

and rely on mypy to check the types at compile time. Fred is happy: he
has static type checks, Python does it automatically for him (once he
has set up his build system to call mypy), and he is now convinced that
foo() is type-safe and an isinstance check at run-time would be a waste
of cycles.

I want to pass a Decimal to foo(). All I have to do is *not* install
mypy, or disable it, and lo and behold, like magic, the type checking
doesn't happen, and foo() operates by duck-typing just like in the glory
days of Python 1.5. Both Fred and I are now happy, and with the explicit
isinstance check removed, the only type checking that occurs when I run
Fred's library are the run-time duck-typing checks.
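Concretely, with the isinstance check gone and no checker installed, the annotated foo() happily duck-types (a sketch of the scenario above):

```python
from decimal import Decimal

def foo(x: float) -> float:
    return (x + 1) / 2

# Without mypy running, the annotation is inert at run time,
# so any type that supports + and / works:
print(foo(2.0))           # 1.5
print(foo(Decimal("3")))  # Decimal('2')
```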

--
Steven

Steven D'Aprano

unread,
Aug 14, 2014, 3:02:19 PM8/14/14
to python...@python.org
On Thu, Aug 14, 2014 at 12:28:26PM +0200, Manuel Cerón wrote:

> One interesting feature of TypeScript is that it allows you to annotate
> existing code without modifying it, by using external definition files. In
> the JavaScript world, many people have contributed TypeScript annotation
> files for popular JS libraries (http://definitelytyped.org/).
>
> I think this is possible in Python as well doing something like this:
>
> @annotate('math.ceil')
> def ceil(x: float) -> int:
>     pass

I'm afraid I don't understand what the annotate decorator is doing here.
Can you explain please?

Steven D'Aprano

unread,
Aug 14, 2014, 3:03:37 PM8/14/14
to python...@python.org
On Wed, Aug 13, 2014 at 10:29:48PM +0200, Christian Heimes wrote:

> 1) I'm not keen with the naming of mypy's typing classes. The visual
> distinction between e.g. dict() and Dict() is too small and IMHO
> confusing for newcomers. How about an additional 'T' prefix to make
> clear that the objects are referring to typing objects?
>
> from typing import TList, TDict
>
> def word_count(input: TList[str]) -> TDict[str, int]:
>     ...

Would it be possible, and desirable, to modify the built-in types so
that we could re-use them in the type annotations?

def word_count(input: list[str]) -> dict[str, int]:


Since types are otherwise unlikely to be indexable like that, I think
that might work.
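For what it's worth, this is mechanically feasible: a class can intercept subscription itself (in later Pythons via `__class_getitem__`) and hand back a specifier object rather than a real subclass. A minimal sketch of the idea, not mypy's actual machinery:

```python
class TypedList:
    """Toy stand-in showing how list[str]-style subscription could work."""
    def __class_getitem__(cls, item):
        # Hand back a lightweight description instead of a real subclass;
        # a checker could interpret this, the runtime just stores it.
        return f"TypedList[{getattr(item, '__name__', item)}]"

def word_count(input: TypedList[str]) -> dict:
    ...

print(word_count.__annotations__['input'])  # TypedList[str]
```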



> 2) PEP 3107 only specifies arguments and return values but not
> exceptions that can be raised by a function. Java has the "throws"
> syntax to list possible exceptions:
>
> public void readFile() throws IOException {}

I understand that this is called a "checked exception" in Java. I also
understand that they are hated and derided as useless or even
counter-productive:

http://literatejava.com/exceptions/checked-exceptions-javas-biggest-mistake/


> May I suggest that we also standardize a way to annotate the exceptions
> that can be raised by a function? It's a very useful piece of
> information and commonly requested information on the Python user
> mailing list.

And as frequently explained on the python-list, it's almost impossible
to answer. Or rather, the answer is usually no more specific than
"raises Exception".

There are very few guarantees you can reliably make about what
exceptions *cannot* be raised by a function. To put it simply, given
almost any operation in your function, say, x+1, there's no limit on
what x.__add__ might raise. Even if you know x is a subclass of int, it
could do anything in its __add__ method. Only if you know x is *exactly*
a builtin int can you be confident that it won't raise (say)
ImportError.

Perhaps with a few years of experience, we might be able to extend this
to exceptions without making the same mistakes as Java's checked
exceptions, but I wouldn't rush into it.

Sunjay Varma

unread,
Aug 14, 2014, 3:22:39 PM8/14/14
to Nathaniel Smith, Python-Ideas

Frankly, I'd just really like to get all of this noise out of the function declaration. Any reasonable, readable and consistent documentation format is fine with me. I chose the Sphinx format because it is already well supported in PyCharm and was mentioned in the first few responses.

I actually don't like the colon syntax very much (it's awkward and unnatural to type), so if anyone has a different suggestion I'd be very open to that.

Mainly I want to ensure that Python doesn't sacrifice readability and line length (which is part of readability) just because annotations are already built in.

I suggest we decide on a standard format that can be used in documentation strings and also used with type checking.

Let's enhance our documentation with types, not obfuscate function declarations.

Sunjay

On Aug 14, 2014 3:14 PM, "Nathaniel Smith" <n...@pobox.com> wrote:

On 14 Aug 2014 17:02, "Sunjay Varma" <varma....@gmail.com> wrote:
> Here's a taste of what that looks like:
>     class SimpleEquation(object):
>     
>         def demo(self, a, b, c):
>             """
>             This function returns the product of a, b and c
>             @type self: SimpleEquation
>             :param a: int - The first number
>             :param b: int
>             :param c: int - The third number should not be zero and should also
>                 only be -1 if you enjoy carrots (this comment spans 2 lines)
>             :return: int
>             """
>             return a * b * c

There are at least three existing, popular, standardized syntaxes for these kinds of docstring annotations in use: plain ReST, Google's docstring standard, and numpy's docstring standard. All are supported by Sphinx out of the box. (The latter two require enabling the "napoleon" extension, but this is literally a one line config file switch.)

Would you suggest that python-dev should pick one of these and declare it to be the official standard, or...?

-n

Nathaniel Smith

unread,
Aug 14, 2014, 3:23:25 PM8/14/14
to Sunjay Varma, python...@python.org

On 14 Aug 2014 17:02, "Sunjay Varma" <varma....@gmail.com> wrote:

> Here's a taste of what that looks like:
>     class SimpleEquation(object):
>     
>         def demo(self, a, b, c):
>             """
>             This function returns the product of a, b and c
>             @type self: SimpleEquation
>             :param a: int - The first number
>             :param b: int
>             :param c: int - The third number should not be zero and should also
>                 only be -1 if you enjoy carrots (this comment spans 2 lines)
>             :return: int
>             """
>             return a * b * c

There are at least three existing, popular, standardized syntaxes for these kinds of docstring annotations in use: plain ReST, Google's docstring standard, and numpy's docstring standard. All are supported by Sphinx out of the box. (The latter two require enabling the "napoleon" extension, but this is literally a one line config file switch.)

Dennis Brakhane

unread,
Aug 14, 2014, 3:28:55 PM8/14/14
to gu...@python.org, Ethan Furman, Python-Ideas
Allow me to chime in.

Am 13.08.2014 22:19, schrieb Guido van Rossum:
> On Wed, Aug 13, 2014 at 12:59 PM, Ethan Furman <et...@stoneleaf.us
> <mailto:et...@stoneleaf.us>> wrote:
>
> -1 on deprecating alternative uses of annotations.
>
>
> Do you have a favorite alternative annotation use that you actually
> use (or are likely to)?
>
I would be very sad to see annotations being limited to conveying type
information. But I think I have a solution that will make everyone happy
(see the end of this mail).


I've programmed Java web applications for many years now (hoping to
finally switch to Python), and "method parameter annotations", as they
are called in Java, would be one thing I'd really miss, as they can be
very useful.

Let me give you two examples:

1. Annotations can be used to communicate additional restrictions on
values that must be checked on run time

Let's assume a simple Web service that is called via HTTP to register a
user, and the underlying framework decodes the request and finally calls
a simple controller function. It could look like this

(Java code, @ signifies an annotation)

public Response register(@Range(18,100) int age, @ValidEmail String
email) { ... }

The framework would check the range of the age parameter and the
validity of the email and, if there are validation errors, refuse the
request with a suitable error message without ever calling our function.

Even if we assume that mypy's type system will be incredibly complex and
allow specifying all kinds of restrictions on a type, it won't help,
because those checks have to be done at run time and are not optional.

Of course those checks could be hard coded into the function, but using
annotations also provides simple and immediate documentation about the
allowed values, and avoids boilerplate (I do not have to write
"if (emailNotValid(email)) throw new ValidationError("Field email is not
a valid email");" in every method that uses an email).


2. They can give meta information to a framework

An admittedly contrived example, let's expand our register function:

public Response register( int age, @Inject @Scope("session") UserService
userService, @Authenticated User user) ....

Here I can tell the dependency injection framework that I want an
instance of the UserService injected, but one instance
that has session scope instead of the normal "singleton" scope.
I also ask the framework to inject me the currently authenticated user
object (let's assume if I'd write
"@Authenticated String user" I could get the login name as string etc.)


The flexibility annotations give in Java makes programming much less
frustrating. It also took quite some time before annotations were widely
used (they were only available starting in Java 5) and people started
finding more uses for them.
I think this will also be true for Python, it will take time before
people find useful ways for them. Redefining them now to
be "not much more" than static type information feels wrong to me.


My proposed solution:

If an annotation is a tuple, mypy will take a look at each item and do
its usual thing. If it doesn't recognise an item, it is skipped.

Every framework that uses annotations should also iterate over entries
of a tuple to find the ones it is interested in.

This also allows more than one annotation at a time, and is completely
backwards compatible (as far as Python itself is concerned)


for example, my first example could be written as

def register(age: (int, range(18, 100)), email: (str, ValidEmail)):

also, it will allow me to add annotations to existing "typed" functions,

def foo(bar: int) -> int:

could become

def foo(bar: (int, MyAnnotation)) -> (int, AnotherOfMyAnnotations):

I'm not sure what should happen if two conflicting types are given, like
(int, str); I think it should be treated as a union
type (either int or str).
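Dennis's "each consumer scans the tuple" convention can be sketched in a few lines (hypothetical helper names; each framework filters for the marker types it knows about):

```python
def find_markers(func, kind):
    """Yield (param_name, marker) for annotation entries of the given kind.

    Implements the proposed convention: a tuple annotation carries several
    independent markers, and each framework picks out only its own.
    """
    for name, ann in func.__annotations__.items():
        entries = ann if isinstance(ann, tuple) else (ann,)
        for entry in entries:
            if isinstance(entry, kind):
                yield name, entry

class ValidEmail:
    """Hypothetical validation marker."""

def register(age: (int, range(18, 100)), email: (str, ValidEmail())):
    pass

print(list(find_markers(register, range)))  # [('age', range(18, 100))]
```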

--
Dennis



Skip Montanaro

unread,
Aug 14, 2014, 3:35:40 PM8/14/14
to Dennis Brakhane, Python-Ideas
On Thu, Aug 14, 2014 at 2:28 PM, Dennis Brakhane
<brak...@googlemail.com> wrote:
> public Response register(@Range(18,100) int age, @ValidEmail String
> email) { ... }
>
> The framework would check the range of the age parameter and the
> validity of the email and if there are validation errors,
> refusing the request with a suitable error message without ever calling
> our function.

Couldn't you do that today in Python with a suitably sophisticated
function decorator? The range/type checking would happen before the
user's actual function is called.
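A minimal sketch of what Skip describes, with hypothetical decorator names (not an existing library), using inspect.signature to locate the named parameter:

```python
import functools
import inspect

def check_range(param, lo, hi):
    """Reject calls whose named parameter falls outside [lo, hi]."""
    def deco(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            value = bound.arguments[param]
            if not lo <= value <= hi:
                raise ValueError(f"{param}={value!r} outside [{lo}, {hi}]")
            return func(*args, **kwargs)
        return wrapper
    return deco

@check_range("age", 18, 100)
def register(age: int, email: str):
    return "registered"

print(register(30, "a@example.com"))  # registered
```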

Skip

Łukasz Langa

unread,
Aug 14, 2014, 3:36:20 PM8/14/14
to Steven D'Aprano, python...@python.org
On Aug 14, 2014, at 12:02 PM, Steven D'Aprano <st...@pearwood.info> wrote:

Would it be possible, and desirable, to modify the built-in types so
that we could re-use them in the type annotations?

   def word_count(input: list[str]) -> dict[str, int]:


Since types are otherwise unlikely to be indexable like that, I think
that might work.

-1 on that idea. Actually, -1 on List, Dict and friends as well.

Square brackets are for lookup (indexing, key-based, or slicing). Saying here that you’re “looking up” a subtype of list that holds strings is a far stretch.

-- 
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

Oleg Broytman

unread,
Aug 14, 2014, 3:38:44 PM8/14/14
to python...@python.org
On Fri, Aug 15, 2014 at 04:52:45AM +1000, Steven D'Aprano <st...@pearwood.info> wrote:
> But with this proposal, Fred may write his function like this:
>
> def foo(x:float)->float:
> return (x+1)/2
>
> and rely on mypy to check the types at compile time. Fred is happy: he
> has static type checks, Python does it automatically for him (once he
> has set up his build system to call mypy), and he is now convinced that
> foo() is type-safe and an isinstance check at run-time would be a waste
> of cycles.
>
> I want to pass a Decimal to foo(). All I have to do is *not* install
> mypy, or disable it, and lo and behold, like magic, the type checking
> doesn't happen, and foo() operates by duck-typing just like in the glory
> days of Python 1.5. Both Fred and I are now happy, and with the explicit
> isinstance check removed, the only type checking that occurs when I run
> Fred's library are the run-time duck-typing checks.

Well, that's funny. Static type checking as a way to subvert type
checking! (-:

Oleg.
--
Oleg Broytman http://phdru.name/ p...@phdru.name
Programmers don't die, they just GOSUB without RETURN.

Ethan Furman

unread,
Aug 14, 2014, 3:44:03 PM8/14/14
to Python-Ideas
On 08/14/2014 11:55 AM, Steven D'Aprano wrote:
> On Thu, Aug 14, 2014 at 12:28:26PM +0200, Manuel Cerón wrote:
>
>> One interesting feature of TypeScript is that it allows you to annotate
>> existing code without modifying it, by using external definition files. In
>> the JavaScript world, many people have contributed TypeScript annotation
>> files for popular JS libraries (http://definitelytyped.org/).
>>
>> I think this is possible in Python as well doing something like this:
>>
>> @annotate('math.ceil')
>> def ceil(x: float) -> int:
>>     pass
>
> I'm afraid I don't understand what the annotate decorator is doing here.
> Can you explain please?

My understanding is it's using the 'math.ceil' name in order to understand
how it should treat the annotations of 'float' and 'int'.

--
~Ethan~

Dennis Brakhane

unread,
Aug 14, 2014, 3:46:21 PM8/14/14
to Skip Montanaro, Python-Ideas
Am 14.08.2014 21:34, schrieb Skip Montanaro:

> Couldn't you do that today in Python with a suitably sophisticated
> function decorator? The range/type checking would happen before the
> user's actual function is called.

I suppose so, but I'd have to repeat myself and it would look ugly,
because I have to tell the decorator which parameter I'm talking about.

Something like

@do_range_check('age', 18, 100)
@do_email_check('email')
def register(age: int, email: str):
    ...

looks not nearly as nice.

Furthermore, my proposal allows multiple uses of annotations without
restricting them to a single one.

If you only use mypy, you can keep using annotations as type
declarations; when you use some other framework that uses annotations
for a different purpose, you can still use them. Only once you want to
use both *and* you have a method that needs both kinds of annotations
are you forced to use the tuple notation.

Sunjay Varma

unread,
Aug 14, 2014, 3:48:50 PM8/14/14
to Dennis Brakhane, Python-Ideas

See responses inline

On Aug 14, 2014 3:28 PM, "Dennis Brakhane" <brak...@googlemail.com> wrote:

> 1. Annotations can be used to communicate additional restrictions on
> values that must be checked on run time
>
> Let's assume a simple Web service that is called via HTTP to register a
> user, and the underlying framework decodes
> the request and finally calls a simple controller function, it could
> look like this
>
> (Java code, @ signifies an annotation)
>
>   public Response register(@Range(18,100) int age, @ValidEmail String
> email) { ... }
>
> The framework would check the range of the age parameter and the
> validity of the email and if there are validation errors,
> refusing the request with a suitable error message without ever calling
> our function.
>

This is exactly the kind of syntax I want to avoid. Python should not attempt to just blindly become like Java or any other language. Though Dennis was just illustrating his point (this was not a suggestion of an alternate syntax), I feel like Python is moving further and further towards code like this. Python programmers should not be forcing as much as possible into each line. Explicit is better than implicit.

> Even if we assume that mypy's type system will be incredibly complex and
> allowing to specify all kinds of restrictions on a type,
> it won't help because those checks have to be done at run time, and are
> not optional.

This is a great point. Regardless of the nature of the annotation, we can't let this become a way out for lazy programmers. Having annotations is no excuse for poor error checking.

> 2. They can give meta information to a framework
>
> An admittedly contrived example, let's expand our register function:
>
> public Response register( int age, @Inject @Scope("session") UserService
> userService, @Authenticated User user) ....

This is too much to put on one line in Python. We should be aiming to make code cleaner and promote proper error checking and documentation.

> The flexibility annotations give in Java makes programming much less
> frustrating. It also took quite some time before
> annotations were widely used (they were only available starting in Java
> 5) and people started finding more uses for them.
> I think this will also be true for Python, it will take time before
> people find useful ways for them. Redefining them now to
> be "not much more" than static type information feels wrong to me.

I agree that annotations can be useful, but I don't think they should be used for type checking at this scale.

Sunjay

Ben Finney

unread,
Aug 14, 2014, 3:49:43 PM8/14/14
to python...@python.org
Steven D'Aprano <st...@pearwood.info> writes:

> On Wed, Aug 13, 2014 at 10:29:48PM +0200, Christian Heimes wrote:
>
> > 1) I'm not keen with the naming of mypy's typing classes. The visual
> > distinction between e.g. dict() and Dict() is too small and IMHO
> > confusing for newcomers. […]
>
> Would it be possible, and desirable, to modify the built-in types so
> that we could re-use them in the type annotations?

That would address my concern. With that change, when the programmer who
reads the code encounters mention of a type, it means precisely the same
type object as it appears to be and not some special beast.

--
\ “Science embraces facts and debates opinion; religion embraces |
`\ opinion and debates the facts.” —Tom Heehler, _The Well-Spoken |
_o__) Thesaurus_ |
Ben Finney

Terry Reedy

unread,
Aug 14, 2014, 4:01:00 PM8/14/14
to python...@python.org
On 8/14/2014 11:21 AM, Andrew Barnert wrote:
> On Aug 14, 2014, at 7:37, Ryan Gonzalez
> <rym...@gmail.com
> <mailto:rym...@gmail.com>> wrote:
>
>> On 8/13/2014 3:44 PM, Guido van Rossum wrote:
>
>> Now consider an extended version (after Lucas).
>>
>> def fib(n, a, b):
>>     i = 0
>>     while i <= n:
>>         print(i, a)
>>         i += 1
>>         a, b = b, a+b
>>
>> The only requirement of a, b is that they be addable. Any numbers
>> should be allowed, as in fib(10, 1, 1+1j), but so should fib(5,
>> '0', '1'). Addable would be approximated from below by
>> Union(Number, str).
>>
>>
>> Unless MyPy added some sort of type classes...
>
> By "type classes", do you mean this in the Haskell sense, or do you mean
> classes used just for typing--whether more granular ABCs (like an
> Addable which both Number and AnyStr would probably inherit) or typing.py
> type specifiers (like an Addable defined as Union(Number, AnyStr))?

What I like is the idea of protocol based types, as in
Andrey Vlasovskikh's slide 26
http://blog.pirx.ru/media/files/2013/python-optional-typing/#26

class Addable(Protocol):
    @abstractmethod
    def __add__(self, other):
        pass

Even this does not capture the actual requirement that a and b be
addable together, in that order. Addable_together is inherently a pair
concept. Syntactically, that could be handled by passing a pair.

def fib(n: Comparable_to_int, pair: Addable_together) -> (type resulting
from pair):

However, the actual Python rules for Addable_together are rather
complex. A static type checker for this would be extremely difficult to
write. The best dynamic type checker is to try and let Python say no by
raising.
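The single-argument half of this did eventually become easy to express: in Python 3.8+ the typing module's Protocol supports structural checks, and runtime_checkable even makes isinstance() work. A minimal sketch of the slide's Addable (it still does not capture Terry's "addable together, in that order" pair requirement):

```python
from abc import abstractmethod
from typing import Protocol, runtime_checkable

@runtime_checkable
class Addable(Protocol):
    @abstractmethod
    def __add__(self, other): ...

# Structural: anything defining __add__ matches, no registration needed.
print(isinstance(3, Addable))         # True
print(isinstance("x", Addable))       # True
print(isinstance(object(), Addable))  # False
```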

> It's also worth noting that the idea that this function should take a
> Number or a str seems way off.

As written, it *does* take any pair that can be repeatedly added together.

> It's questionable whether it should accept str,

I recently read Douglas Hofstadter's Fluid Concepts and Creative
Analogies: Computer Models of the Fundamental Mechanisms of Thought. In
the first chapter he discussed puzzles like: A series begins 0, 1, 01,
101, ..., what is the next item. He started with number sequences and
moved on to digit string sequences like the above, where the digit
strings are *not* interpreted as number. Generic functions and
duck-typing encourages this sort of creative thinking.

> but if it does, shouldn't it also accept bytes,
> bytearray, and other string-like types?

Of course. A descriptive type should not invalidate anything that is
allowed.

> What about sequences? And meanwhile,
> whether or not it accepts str, it should probably accept np.ndarray and
> other types of element-wise adding types.

The function above already does accept such.

> If you create an Addable type,
> it has to define, globally, which of those counts as addable, but
> different functions will have different definitions that make sense.

A single arg having .__(r)add__ is trivial. The real difficultly is
expressing *addable to each other* and relating the result type to the
types of the members of the pair.




--
Terry Jan Reedy

Dennis Brakhane

unread,
Aug 14, 2014, 4:06:37 PM8/14/14
to gu...@python.org, Alex Gaynor, Python-Ideas
Am 13.08.2014 23:46, schrieb Guido van Rossum:
>
>
> Mypy has a cast() operator that you can use to shut it up when you
> (think you) know the conversion is safe.
>
Does Mypy provide a way to "fix/monkeypatch" incorrect type declarations
in function signatures? For example, by modifying __annotations__?

My pet peeve of static languages is that programmers are often so
fixated on their particular problem that they don't think about
alternative uses for their code and make the type declarations
unnecessarily complex.

For example, nearly every Java method that deals with character strings
uses "String" as the parameter type when it should have used
"CharSequence". Having to read an entire input stream and store it in a
String just to be able to use a method is not fun (String is final in
Java).

I'm worried that in Python we will have utility functions that declare
they require a List[int], when in fact they actually only require a
Sequence[int] or
Sequence[Number].

While mypy's cast is nice in that I won't have to wrap my tuple of
integers in a list-like object, having to cast it every time I use a
particular broken utility function feels very ugly to me; and defining a
wrapper function with the correct type information feels like
unnecessary run-time overhead for no gain.

Don't get me wrong, I'm not entirely against some kind of type checking,
but I fear that there must exist possible workarounds for badly written
code.
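At run time, at least, the escape hatch Dennis asks about already exists: __annotations__ is an ordinary mutable dict, so an over-strict List[int] can be patched to Sequence[int] after the fact (whether a given static checker honours the patched dict is a separate question). A minimal sketch:

```python
from typing import List, Sequence

def total(xs: List[int]) -> int:
    # Declared too strictly: the body only needs a Sequence.
    return sum(xs)

# Monkeypatch the declaration to the type actually required.
total.__annotations__["xs"] = Sequence[int]

print(total((1, 2, 3)))  # 6 -- a tuple was always fine at run time
```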

Sunjay Varma

unread,
Aug 14, 2014, 4:16:59 PM8/14/14
to Dennis Brakhane, Python-Ideas, Alex Gaynor

+1 This is definitely something to consider.

One of the many benefits of Python is that you can use objects with equivalent interfaces in functions that may not have expected that type while they were being written.

One thing to note: if we try to make this syntax too verbose, it may lose its purpose altogether.

For example: one (bad) solution to support what I described above is to define some sort of convoluted syntax for specifying the exact interface a function supports. At that point, any addition to the language would do nothing but hinder it. Anything that verbose is too complex to be valuable.

We have to be careful with this. If we do accept it (or any of the many alternatives suggested so far), then we should choose a few use cases and focus on solving them as best as possible.

Manuel Cerón

unread,
Aug 14, 2014, 4:30:31 PM8/14/14
to Steven D'Aprano, Python-Ideas
On Thu, Aug 14, 2014 at 8:55 PM, Steven D'Aprano <st...@pearwood.info> wrote:
On Thu, Aug 14, 2014 at 12:28:26PM +0200, Manuel Cerón wrote:

> One interesting feature of TypeScript is that it allows you to annotate
> existing code without modifying it, by using external definition files. In
> the JavaScript world, many people have contributed TypeScript annotation
> files for popular JS libraries (http://definitelytyped.org/).
>
> I think this is possible in Python as well doing something like this:
>
> @annotate('math.ceil')
> def ceil(x: float) -> int:
>     pass

I'm afraid I don't understand what the annotate decorator is doing here.
Can you explain please?

The idea is to add type annotations to modules without modifying them. For example, in this case, the stdlib math module is defined and implemented in C, but you still want to have annotations for it so that if you write math.ceil('foo'), the static type analyzer gives you an error or warning. By defining a new module, for example math_annotations.py, with empty functions carrying annotated signatures, you can let the static analyzer know what the annotations for another module are. In this example, the annotate decorator is just a way of telling the static analyzer that these annotations apply to the math.ceil function, not math_annotations.ceil.

This is what TypeScript does to annotate popular libraries written in plain JavaScript with zero type information. 
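A registry-based sketch of how such a decorator could work (the `annotate` name and mechanism are hypothetical; the Python ecosystem later standardised on separate .pyi stub files for the same job):

```python
ANNOTATION_REGISTRY = {}

def annotate(target):
    """Record a stub's annotations under the dotted name of the real API."""
    def deco(stub):
        ANNOTATION_REGISTRY[target] = dict(stub.__annotations__)
        return stub
    return deco

@annotate('math.ceil')
def ceil(x: float) -> int:
    pass  # stub body is never executed; only the signature matters

print(ANNOTATION_REGISTRY['math.ceil'])
# {'x': <class 'float'>, 'return': <class 'int'>}
```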

Manuel.

Cory Benfield

unread,
Aug 14, 2014, 4:33:44 PM8/14/14
to Steven D'Aprano, python-ideas
On 14 August 2014 19:52, Steven D'Aprano <st...@pearwood.info> wrote:
> I want to pass a Decimal to foo(). All I have to do is *not* install
> mypy, or disable it, and lo and behold, like magic, the type checking
> doesn't happen, and foo() operates by duck-typing just like in the glory
> days of Python 1.5. Both Fred and I are now happy, and with the explicit
> isinstance check removed, the only type checking that occurs when I run
> Fred's library are the run-time duck-typing checks.

Thanks for explaining Steven, that's a lot clearer. I understand where
you're coming from now.

I still don't agree, however. I suspect what's more likely to happen
is that Fred writes his code, a user goes to run it with duck typing,
and it breaks. Assuming the static checker is in CPython and on by
default, there are a number of options here, most of which are bad:

1. The user doesn't know about the type checker and Googles the
problem. They find there's a flag they can pass to make the problem go
away, so they do. They have now learned a bad habit: to silence these
errors, pass this flag. They can no longer gain any benefits from the
type checker: it may as well have been not there (or off by default).

2. The user doesn't know about the type checker and blames Fred's
library, opening a bug report. In extreme cases, for popular
libraries, this will happen so often that Fred will either relent and
remove the annotations or get increasingly frustrated and take it out
on the users. (I'm speaking from experience in this regard.)

3. The user knows about the type checker and isn't using it. They turn
it off. Fine, this is ok.

4. The user knows about the type checker but is using it for their own
code in the same program. They're between a rock and a hard place:
either they turn off the checker and lose the benefit in their own
code, or they stop duck typing. This is actually the worst of these
cases.

Basically, my objection is to the following (admittedly extreme) case:
a static type checker that is a) present in the core distribution, b)
on by default, and c) with the only available scope being per-program.
I think that such an implementation is a recipe for having everyone
learn to turn the checker off, wasting all the effort associated with
it.

I am much happier if (b) goes away. Off by default is fine. Not in the
core distribution at all is also fine (because it's effectively
off-by-default). Allowing refined scopes is also a good idea, but
doesn't solve the core problem: people will continue to just turn it
off.

I am not averse to having static checking be an option for Python and
for annotations to be the mechanism by which such typing is done. I
just think we should be really cautious about ever including it in
CPython.

Juancarlo Añez

unread,
Aug 14, 2014, 4:39:11 PM8/14/14
to Steven D'Aprano, Python-Ideas

On Thu, Aug 14, 2014 at 1:15 PM, Steven D'Aprano <st...@pearwood.info> wrote:
Yes. You say that as if it were a bad thing. It is not. Python 3 is
here to stay and we should be promoting Python 3 only code. There is
absolutely no need to apologise for that fact. If people are happy with
Python the way it is in 2.7, or 1.5 for that matter, that's great, they
can stay on it for ever, but all new features are aimed at 3.x and not
2.x or 1.x.

That's reasonable..., in theory.

Reality is that most of the people most supportive of the migration towards Python 3 are currently writing code that is compatible with both 3.x and 2.[67].

You don't leave your people behind (not without a lifeboat).

Since it's been decided there will not be a 2.8, the right thing to do is to delay decisions about static typing (or restrictions on annotations) until 4.0. It would be "a bad thing" to break or deprecate existing 3.x code with 3.5.

Cheers,
--
Juancarlo Añez