[Python-ideas] Why is design-by-contracts not widely adopted?


Marko Ristin-Kaufmann

Sep 23, 2018, 1:10:53 AM
to Python-Ideas
Hi,

(I'd like to fork from a previous thread, "Pre-conditions and post-conditions", since it got long and we started discussing a couple of different things. Let's put the general discussion related to design-by-contract in this thread and I'll spawn another thread for the discussion about the concrete implementation of a design-by-contract library in Python.)

After the discussion we had on the list and after browsing the internet a bit, I'm still puzzled why design-by-contract was not more widely adopted and why so few languages support it. Please have a look at these articles and answers:

I did see that there are a lot of misconceptions about it ("simple asserts", "developer overhead", "needs upfront design", "same as unit testing"). This is probably the case with any novel concept that people are not familiar with. However, what does puzzle me is that once the misconceptions are rectified ("it's not simple asserts", "the development is actually faster", "no need for upfront design", "not orthogonal, but dbc + unit testing is better than just unit testing"), the concept is still discarded.

After properly reading about design-by-contract and getting deeper into the topic, there is no rational argument against it and the benefits are obvious. And still, people just wave their hand and continue without formalizing the contracts in the code and keep on writing them in the descriptions.

Why is that so? I'm completely at a loss about that -- especially about the historical reasons (some mentioned that design-by-contract did not take off since Bertrand Meyer holds the trademark on the term and because of his character. Is that the reason?).

One explanation that seems plausible to me is that many programmers are actually having a hard time with formalization and logic rules (e.g., implication, quantifiers), maybe due to missing education (e.g. many programmers are people who came to programming from other less-formal fields). It's hence easier for them to write in human text and takes substantial cognitive load to formalize these thoughts in code. Does that explain it?

What do you think? What is the missing part of the puzzle?

Cheers,
Marko

David Mertz

Sep 23, 2018, 3:16:39 AM
to Marko Ristin-Kaufmann, python-ideas
On Sun, Sep 23, 2018, 1:10 AM Marko Ristin-Kaufmann <marko....@gmail.com> wrote:
One explanation that seems plausible to me is that many programmers are actually having a hard time with formalization and logic rules (e.g., implication, quantifiers), maybe due to missing education (e.g. many programmers are people who came to programming from other less-formal fields). It's hence easier for them to write in human text and takes substantial cognitive load to formalize these thoughts in code. Does that explain it?

I've tried to explain my own reasons for not being that interested in DbC in other threads. I've been familiar with DbC libraries in Python for close to 20 years, and it never struck me as worth the effort of using.

I'm not alone in this. A large majority of folks formally educated in computer science and related fields have been aware of DbC for decades but deliberately decided not to use it in their own code. Maybe you and Bertrand Meyer are simply better than that 99% of programmers... Or maybe the benefit is not so self-evident and compelling as it feels to you.

To me, as I've said, DbC imposes a very large cost for both writers and readers of code. While it's possible to split hairs about the edge cases where assertions and unit tests cannot cover identical ground, the reality is that the benefits are extremely close between the different techniques. However, it's vastly easier to take a more incremental and as-needed approach using assertions and unit tests than it is with DbC. 

Moreover, unit tests have the giant advantage of living *elsewhere* than in the main code itself... This probably doesn't matter so much to writers, but it's a huge win for readers. Even with doctests (which I'm somewhat unusual in actually liking), where the tests live in the same file and function/class as the operational code, it still feels relatively easy to separate the concerns visually when reading such code. I just cannot get that with DbC.

I know you can insist I'm wrong about all this, and that my code would be better and faster if I would accept this niche orthodoxy. But I just do not see DbC becoming non-niche in any plausible future, neither in Python nor in any other mainstream language. Opinions are free to differ, and I could be wrong.

Angus Hollands

Sep 23, 2018, 6:15:12 AM
to python...@python.org

Hi Marko,

I think there are several ways to approach this problem, though I am not weighing in on whether DbC is a good thing in Python. I wrote a simple implementation of DbC which is currently a run-time checker. You could, with the appropriate tooling, validate statically too (as with all approaches). In my approach, I use a “proxy” object to allow the contract code to be defined at function definition time. It does mean that some things are not as pretty as one would like (anything that cannot be hooked into with magic methods, e.g. isinstance), but I think this is acceptable as it makes features like old easier. Also, one hopes that it encourages simpler contract checks as a side-effect. Feel free to take a look - https://github.com/agoose77/pyffel
It is by no means well written, but a fun PoC nonetheless.
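
To make the proxy idea concrete, here is a minimal sketch of how magic methods can record a condition at definition time and evaluate it later. It is not pyffel's actual code; the names Proxy and check are hypothetical and only a few operators are covered.

```python
import operator


class Proxy:
    """Hypothetical placeholder for a future argument; records operations as a closure."""

    def __init__(self, evaluate=lambda value: value):
        # `evaluate` maps the real value to the result of the recorded expression.
        self._evaluate = evaluate

    def _binary(self, op, other):
        # Build a new proxy that applies `op` once the real value is known.
        return Proxy(lambda value: op(self._evaluate(value), other))

    def __gt__(self, other):
        return self._binary(operator.gt, other)

    def __lt__(self, other):
        return self._binary(operator.lt, other)

    def __add__(self, other):
        return self._binary(operator.add, other)

    def check(self, value):
        """Evaluate the recorded expression against a concrete value."""
        return self._evaluate(value)


x = Proxy()
condition = x + 1 > 10          # nothing is evaluated yet, only recorded

print(condition.check(10))      # True
print(condition.check(3))       # False
```

A call such as isinstance(x, int) is different: it is evaluated immediately against the proxy object itself and returns a plain bool, so there is nothing left to record. That is the kind of check that has to be wrapped separately in a proxy-based design.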
Regards,
Angus

Hugh Fisher

Sep 23, 2018, 6:34:47 AM
to python...@python.org
> Date: Sun, 23 Sep 2018 07:09:37 +0200
> From: Marko Ristin-Kaufmann <marko....@gmail.com>
> To: Python-Ideas <python...@python.org>
> Subject: [Python-ideas] Why is design-by-contracts not widely adopted?

[ munch ]

> After properly reading about design-by-contract and getting deeper into
> the topic, there is no rational argument against it and the benefits are
> obvious. And still, people just wave their hand and continue without
> formalizing the contracts in the code and keep on writing them in the
> descriptions.

Firstly, I see a difference between rational arguments against Design By
Contract (DbC) and against DbC in Python. Rejecting DbC for Python is
not the same as rejecting DbC entirely.

Programming languages are different, obviously. Python is not the same
as C is not the same as Lisp... To me this also means that different
languages are used for different problem domains, and in different styles
of development. I wouldn't use DbC in programming C or assembler
because it's not really helpful for the kind of low level close to the machine
stuff I use C or assembler for. And I wouldn't use DbC for Python because
I wouldn't find it helpful for the kind of dynamic, exploratory development
I do in Python. I don't write strict contracts for Python code because in a
dynamically typed, and duck typed, programming language they just don't
make sense to me. Which is not to say I think Design by Contract is bad,
just that it isn't good for Python.

Secondly, these "obvious" benefits. If they're obvious, I want to know why
aren't you using Eiffel? It's a programming language designed around DbC
concepts. It's been around for three decades, at least as long as Python or
longer. There's an existing base of compilers and support tools and libraries
and textbooks and experienced programmers to work with.

Could it be that Python has better libraries, is faster to develop for, attracts
more programmers? If so, I suggest it's worth considering that this might
be *because* Python doesn't have DbC.

Or is this an updated version of the old saying "real programmers write
FORTRAN in any language" ? If you are accustomed to Design by Contract,
think of your time in the Python world as a trip to another country. Relax
and try to program like the locals do. You might enjoy it.

--

cheers,
Hugh Fisher
_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Marko Ristin-Kaufmann

Sep 24, 2018, 3:47:37 AM
to Python-Ideas
Hi,

Thank you for your replies, Hugh and David! Please let me address the points in order.

Obvious benefits
You both seem to misunderstand contracts. The goal of design-by-contract is not reduced to testing the correctness of the code, as I have already reiterated a couple of times in the previous thread. The contracts document formally what the caller and the callee expect and need to satisfy when using a method, a function or a class. This is meant for a module that is used by multiple people who are not necessarily familiar with the code. They are not a niche. There are 150K projects on pypi.org. Each one of them would benefit if annotated with the contracts.

Please do re-read my previous messages on the topic a bit more attentively. These two pages I also found informative and they are quite fast to read (<15 min):
https://www.win.tue.nl/~wstomv/edu/2ip30/references/design-by-contract/index.html
https://gleichmann.wordpress.com/2007/12/09/test-driven-development-and-design-by-contract-friend-or-foe/

Here is a quick summary of the argument.

When you are documenting a method you have the following options:
1) Write preconditions and postconditions formally and include them automatically in the documentation (e.g., by using icontract library).
2) Write preconditions and postconditions in the docstring of the method as human text.
3) Write doctests in the docstring of the method.
4) Expect the user to read the actual implementation.
5) Expect the user to read the testing code.

Here is what seems obvious to me. Please do point me to what is not obvious to you because that is the piece of puzzle that I am missing (i.e. why this is not obvious and what are the intricacies). I enumerated the statements for easier reference:

a) Using 1) is the only option when you have to deal with inheritance. Other approaches cannot do that without much repetition in practice.

b) If you write contracts in text, they will become stale over time (i.e. disconnected from the implementation and plain wrong and misleading). It is a common problem that the comments rot over time and I hope it does not need further argument (please let me know if you really think that the comments do not rot).

c) Using 3), doctests, means that you need mocking as soon as your method depends on non-trivial data structures. Moreover, if the output of the function is not trivial and/or long, you actually need to write the contract (as postcondition) just after the call in the doctest. Documenting preconditions includes writing down the error that will be thrown. Additionally, you need to write that what you are documenting actually also holds for all other examples, not just for this particular test case (e.g., in human text as a separate sentence before/after the doctest in the docstring).

d) Reading other people's code in 4) and 5) is not trivial in most cases and requires a lot of attention as soon as the method includes calls to submethods and functions. This is impractical in most situations since most code is non-trivial to read and is subject to frequent changes.

e) Most of the time, 5) is not even a viable option as the testing code is not even shipped with the package and includes going to github (if the package is open-sourced) and searching through the directory tree to find the test. This forces every user of a library to get familiar with the testing code of the library.

f) 4) and 5) are obviously a waste of time for the user -- please do explain why this might not be the case. Whenever I use the library, I do not expect to be forced to look into its test code and its implementation. I expect to read the documentation and just use the library if I'm confident about its quality. I have rarely read the implementation of the standard libraries (notable exceptions in my case are ast and subprocess module) or any well-established third-party library I frequently use (numpy, opencv, sklearn, nltk, zmq, lmdb, sqlalchemy). If the documentation is not clear about the contracts, I use trial-and-error to figure out the contracts myself. This again is obviously a waste of time of the user and it's far easier to read the contracts directly than use trial-and-error.
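
As an illustration of point a), here is a minimal, self-contained sketch of how a precondition written once on a base-class method could be applied to overriding methods automatically. The helper require and the class ContractBase are hypothetical and are not the API of icontract or any other library; the sketch only inherits preconditions and does not model the weakening of preconditions in subclasses that full design-by-contract allows.

```python
import functools


def require(condition, description):
    """Attach a precondition to a method (hypothetical helper)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            if not condition(self, *args, **kwargs):
                raise ValueError(f"Precondition violated: {description}")
            return func(self, *args, **kwargs)
        # Remember every contract so that subclasses can inherit them.
        wrapper.__contracts__ = getattr(func, "__contracts__", []) + [(condition, description)]
        return wrapper
    return decorator


class ContractBase:
    """Re-applies the parent's preconditions to every overriding method."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for name, member in list(vars(cls).items()):
            if not callable(member) or hasattr(member, "__contracts__"):
                continue
            for parent in cls.__mro__[1:]:
                inherited = getattr(vars(parent).get(name), "__contracts__", None)
                if inherited:
                    for condition, description in inherited:
                        member = require(condition, description)(member)
                    setattr(cls, name, member)
                    break


class Account(ContractBase):
    def __init__(self, balance):
        self.balance = balance

    @require(lambda self, amount: amount > 0, "amount > 0")
    def withdraw(self, amount):
        self.balance -= amount


class AuditedAccount(Account):
    def withdraw(self, amount):      # the contract is not repeated here
        super().withdraw(amount)
```

With this in place, AuditedAccount(100).withdraw(-5) raises ValueError even though the subclass never mentions the condition. A docstring or a doctest on the base method offers no comparable mechanism.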

Contracts are difficult to read.
David wrote:
To me, as I've said, DbC imposes a very large cost for both writers and readers of code.

This is again something that eludes me and I would be really thankful if you could clarify. Please consider for an example, pypackagery (https://pypackagery.readthedocs.io/en/latest/packagery.html) and the documentation of its function resolve_initial_paths:
packagery.resolve_initial_paths(initial_paths)

Resolve the initial paths of the dependency graph by recursively adding *.py files beneath given directories.

Parameters:

initial_paths (List[Path]) – initial paths as absolute paths

Return type:

List[Path]

Returns:

list of initial files (i.e. no directories)

Requires:
  • all(pth.is_absolute() for pth in initial_paths)
Ensures:
  • len(result) >= len(initial_paths) if initial_paths else result == []
  • all(pth.is_absolute() for pth in result)
  • all(pth.is_file() for pth in result)

How is this difficult to read, unless the reader is not familiar with formalism and has a hard time parsing the quantifiers and logic rules? Mind that all these bits are deemed important by the writer -- and need to be included in the function description somehow -- you can choose between 1)-5). 1) seems obviously best to me. 1) will be tested at least at test time. If I have a bug in the implementation (e.g., I include a directory in the result), the testing framework will notify me again.

Here is what the reader would have needed to read without the formalism in the docstring as text (i.e., 2):
* All input paths must be absolute.
* If the initial paths are empty, the result is an empty list.
* All the paths in the result are also absolute.
* The resulting paths only include files.

and here is an example with doctest (3):
>>> packagery.resolve_initial_paths([])
[]

>>> with temppathlib.NamedTemporaryFile() as tmp1, \
...         temppathlib.NamedTemporaryFile() as tmp2:
...     _ = tmp1.path.write_text("some text")
...     _ = tmp2.path.write_text("another text")
...     result = packagery.resolve_initial_paths([tmp1.path, tmp2.path])
...     assert all(pth.is_absolute() for pth in result)
...     assert all(pth.is_file() for pth in result)

>>> with temppathlib.TemporaryDirectory() as tmp:
...     packagery.resolve_initial_paths([tmp.path])
Traceback (most recent call last):
...
ValueError: Unexpected directory in the paths
>>> with temppathlib.TemporaryDirectory() as tmp:
...     pth = tmp.path / "some-file.py"
...     _ = pth.write_text("some text")
...     packagery.resolve_initial_paths([pth.relative_to(tmp.path)])
Traceback (most recent call last):
...
ValueError: Unexpected relative path in the initial paths

Now, how can reading the text (2, code rot) or reading the doctests (3, longer, includes contracts) be easier and more maintainable compared to reading the contracts? I would be really thankful for the explanation -- I feel really stupid as for me this is totally obvious and, evidently, for other people it is not.

I hope we all agree that the arguments about this example (resolve_initial_paths) selected here are not particular to pypackagery, but that they generalize to most of the functions and methods out there.

Writing contracts is difficult.
David wrote:
To me, as I've said, DbC imposes a very large cost for both writers and readers of code.

The effort of writing contracts currently includes:
* include icontract (or any other design-by-contract library) to setup.py (or requirements.txt), one line one-off
* include sphinx-icontract to docs/source/conf.py and docs/source/requirements.txt, two lines, one-off
* write your contracts (usually one line per contract).

The contracts (1) in the above-mentioned function look like this (omitting the contracts run only at test time):
import pathlib
from typing import List

import icontract

@icontract.pre(lambda initial_paths: all(pth.is_absolute() for pth in initial_paths))
@icontract.post(lambda result: all(pth.is_file() for pth in result))
@icontract.post(lambda result: all(pth.is_absolute() for pth in result))
@icontract.post(lambda initial_paths, result: len(result) >= len(initial_paths) if initial_paths else result == [])
def resolve_initial_paths(initial_paths: List[pathlib.Path]) -> List[pathlib.Path]:
    ...
Putting aside how this code could be made more succinct (use "args" or "a" argument in the condition to represent the arguments, using from ... import ..., renaming "result" argument to "r", introducing a shortcut methods slowpre and slowpost to encapsulate the slow contracts not to be executed in the production), how is this difficult to write? It's 4 lines of code.

Writing text (2) is 4 lines. Writing doctests (3) is 23 lines and includes the contracts. Again, given that the writer is trained in writing formal expressions, the mental effort is the same for writing the text and writing the formal contract (in cases of non-native English speakers, I'd even argue that formal expressions are sometimes easier to write).
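
For readers who want to see what such decorators do under the hood, here is a minimal sketch using only the standard library. It is an assumption-laden illustration, not icontract's implementation: the names pre, post and _call_with_matching_args are made up here, and the only idea borrowed from the thread is that a condition receives the arguments it names, plus result for postconditions.

```python
import functools
import inspect


def _call_with_matching_args(condition, available):
    """Pass to `condition` only the arguments it names in its signature."""
    names = inspect.signature(condition).parameters
    return condition(**{name: available[name] for name in names})


def pre(condition):
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            if not _call_with_matching_args(condition, bound.arguments):
                raise ValueError(f"Precondition violated with arguments {dict(bound.arguments)}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


def post(condition):
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            result = func(*args, **kwargs)
            # Postconditions may refer to the arguments and to `result`.
            if not _call_with_matching_args(condition, {**bound.arguments, "result": result}):
                raise ValueError(f"Postcondition violated: result={result!r}")
            return result
        return wrapper
    return decorator


@pre(lambda values: len(values) > 0)
@post(lambda values, result: min(values) <= result <= max(values))
def mean(values):
    return sum(values) / len(values)


print(mean([1, 2, 3]))  # 2.0
```

Calling mean([]) raises ValueError from the precondition before the division is ever attempted, and the message includes the offending argument values.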

99% vs 1%
I'm not alone in this. A large majority of folks formally educated in computer science and related fields have been aware of DbC for decades but deliberately decided not to use it in their own code. Maybe you and Bertrand Meyer are simply better than that 99% of programmers... Or maybe the benefit is not so self-evident and compelling as it feels to you.

I think that ignorance plays a major role here. Many people have misconceptions about the design-by-contract. They just use 2) for more complex methods, or 3) for rather trivial methods. They are not aware that it's easy to use the contracts (1) and fear using them for non-rational reasons (e.g., habits).

This is also what Todd Plesel writes in https://www.win.tue.nl/~wstomv/edu/2ip30/references/design-by-contract/index.html#IfDesignByContractIsSoGreat:
The vast majority of those developing software - even that intended to be reused - are simply ignorant of the concept. As a result they produce application programmer interfaces (APIs) that are under-specified thus passing the burden to the application programmer to discover by trial and error, the 'acceptable boundaries' of the software interface (undocumented contract's terms). But such ad-hoc operational definitions of software interface discovered through reverse-engineering are subject to change upon the next release and so offers no stable way to ensure software correctness.

The fact that many people involved in writing software lack pertinent education (e.g., CS/CE degrees) and training (professional courses, read software engineering journals, attend conferences etc.) is not a reason they don't know about DBC since the concept is not covered adequately in such mediums anyway. That is, ignorance of DBC extends not just throughout practitioners but also throughout educators and many industry-experts.

He lists some more factors and misconceptions that hinder the adoption. I would definitely recommend you to read at least that section if not the whole piece.

The conclusion paragraph "Culture Shift: Too Soon or Too Late" was also telling:
The simplicity and obvious benefits of Design By Contract lead one to wonder why it has not become 'standard practice' in the software development industry. When the concept has been explained to various technical people (all non-programmers), they invariably agree that it is a sensible approach and some even express dismay that software components are not developed this way.

It is just another indicator of the immaturity of the software development industry. The failure to produce high-quality products is also blatantly obvious from the non-warranty license agreement of commercial software. Yet consumers continue to buy software they suspect and even expect to be of poor quality. Both quality and lack-of-quality have a price tag, but the difference is in who pays and when. As long as companies can continue to experience rising profits while selling poor-quality products, what incentive is there to change? Perhaps the fall-out of the "Year 2000" problem will focus enough external pressure on the industry to jolt it towards improved software development methods. There is talk of certifying programmers like other professionals. If and when that occurs, the benefits of Design By Contract just might begin to be appreciated.

But it is doubtful. Considering the typical 20 year rule for adopting superior technology, DBC as exemplified by Eiffel, has another decade to go. But if Java succeeds in becoming a widely-used language and JavaBeans become a widespread form of reuse then it would already be too late for DBC to have an impact. iContract will be a hardly-noticed event much like ANNA for Ada and A++ for C++. This is because the philosophy/mindset/culture is established by the initial publication of the language and its standard library.

(Emphasis mine; iContract refers to a Java design-by-contract library)

Hence the salient argument is the lack of tools for DbC. So far, none of the existing DbC libraries in Python really have the capabilities to be included in the code base. The programmer had to duplicate the contract, the messages did not include the values involved in the condition, one could not inherit the contracts and the contracts were not included in the documentation. Some libraries supported some of these features, but none up to icontract library supported them all. icontract finally supports all these features.

I have never seen a rational argument how writing contracts (1) is inferior to approaches 2-5), except that it's hard for programmers untrained in writing formal expressions and for the lack of tools. I would be really thankful if you could address these points and show me where I am wrong given that formalism and tools are not a problem. We can train the untrained, and we can develop tools (and put them into standard library). This will push adoption to far above 1%.

Finally, it is obvious to me that the documentation is important. I see lacking documentation as one of the major hits in the productivity of a programmer. If there is a tool that could easily improve the documentation (i.e. formal contracts with one line of code per contract) and automatically keep it in sync with the code (by checking the contracts during the testing), I don't see any rational reason why you would dispense with such a tool. Again, please do correct me and contradict -- I don't want to sound condescending or arrogant -- I literally can't wrap my head around why anybody would dispense with such an easy-to-use tool that gives you better documentation (i.e. superior to approaches 2-5) except for lack of formal skills and lack of a supporting library. If you think that the documentation is not important, then please, do explain that since it goes counter to all my previous experience and intuition (which, of course, can be wrong).

Why not Eiffel?
Hugh wrote:
Secondly, these "obvious" benefits. If they're obvious, I want to know why
aren't you using Eiffel? It's a programming language designed around DbC
concepts. It's been around for three decades, at least as long as Python or
longer. There's an existing base of compilers and support tools and libraries
and textbooks and experienced programmers to work with.

Could it be that Python has better libraries, is faster to develop for, attracts
more programmers? If so, I suggest it's worth considering that this might
be *because* Python doesn't have DbC.

Python is easier to write and read, and there are no libraries which are close in quality in Eiffel space (notably, Numpy, OpenCV, nltk and sklearn). I really don't see how the quality of these libraries has anything to do with lack (or presence) of the contracts. OpenCV and Numpy have contracts all over their code (written as assertions and not documented), albeit with very non-informative violation messages. And they are great libraries. Their users would hugely benefit from a more mature and standardized contracts library with informative violation messages.

Duck Typing
Hugh wrote:
And I wouldn't use DbC for Python because
I wouldn't find it helpful for the kind of dynamic, exploratory development
I do in Python. I don't write strict contracts for Python code because in a
dynamically typed, and duck typed, programming language they just don't
make sense to me. Which is not to say I think Design by Contract is bad,
just that it isn't good for Python.

I really don't see what DbC has to do with duck typing (unless you reduce it to mere isinstance conditions, which would simply be a straw-man argument) -- could you please clarify? As soon as you need to document your code, and this is what most modules have to do in teams of more than one person (especially so if you are developing a library for a wider audience), you need to write down the contracts. Please see above where I tried to explain that 2-5) are inferior approaches to documenting contracts compared to 1).

As I wrote above, I would be very, very thankful if you point me to other approaches (apart from 1-5) that are superior to contracts or state an argument why approaches 2-5) are superior to the contracts since that is what I fail to see.

Cheers,
Marko

Barry Scott

Sep 24, 2018, 1:58:14 PM
to Angus Hollands, python...@python.org
This is an interesting PoC, nice work! I like that it's easy to read the tests.

Given a library like this the need to build DbC into python seems unnecessary.

What do other people think?

Barry




Barry Scott

Sep 24, 2018, 2:23:11 PM
to python-ideas


On 23 Sep 2018, at 11:33, Hugh Fisher <hugo....@gmail.com> wrote:

Could it be that Python has better libraries, is faster to develop for, attracts
more programmers? If so, I suggest it's worth considering that this might
be *because* Python doesn't have DbC.

I'm not sure how you get from the lack of DbC being a feature to python's success.

I use DbC in my python code via the asserts and it's been very useful in my experience.

If there was a nice way to get better than the assert method I'd use it. Like Angus's PoC.

I assume that developers that are not interested in DbC would simply not use any
library/syntax that supported it.

Barry

Marko Ristin-Kaufmann

Sep 24, 2018, 3:10:48 PM
to Barry Scott, Angus Hollands, Python-Ideas
Hi Barry,
I think the main issue with pyffel is that it can not support function calls in general. If I understood it right, and Angus please correct me, you would need to wrap every function that you would call from within the contract.

But the syntax is much nicer than icontract or dpcontracts (see these packages on pypi). What if we renamed "args" argument and "old" argument in those libraries to just "a" and "o", respectively? Maybe that gives readable code without too much noise:

@requires(lambda self, a, o: self.sum == o.sum - a.amount)
def withdraw(self, amount: int) -> None:
    ...

There is this lambda keyword in front, but it's not too bad?
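
To show what the "a" and "o" arguments would have to carry, here is a minimal sketch with hypothetical names that is not taken from icontract or dpcontracts. Because the condition compares the state after the call with the state before it, the sketch treats it as a postcondition and calls the decorator ensures; it takes a shallow copy of self as the "old" snapshot and accepts keyword arguments only, to keep the code short.

```python
import copy
import functools
from types import SimpleNamespace


def ensures(condition):
    """Check `condition(self, a, o)` after the call; `o` is a snapshot of `self` taken before it."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, **kwargs):
            old = copy.copy(self)                 # shallow snapshot ("o")
            args = SimpleNamespace(**kwargs)      # named arguments ("a")
            result = method(self, **kwargs)
            if not condition(self, args, old):
                raise AssertionError(f"Postcondition violated in {method.__name__}")
            return result
        return wrapper
    return decorator


class Account:
    def __init__(self, total):
        self.sum = total

    @ensures(lambda self, a, o: self.sum == o.sum - a.amount)
    def withdraw(self, amount: int) -> None:
        self.sum -= amount


account = Account(100)
account.withdraw(amount=30)
print(account.sum)  # 70
```

A shallow copy is only adequate when the condition looks at immutable attributes such as numbers; snapshotting mutable state would need a deeper or more selective copy, which is one reason the choice of what "old" captures matters in a real library.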

I'll try to contact dpcontracts maintainers. Maybe it's possible to at least merge a couple of libraries into one and make it a de facto standard. @Angus, would you also like to join the effort?

Cheers,
Marko




Barry Scott

Sep 24, 2018, 4:04:55 PM
to Marko Ristin-Kaufmann, Angus Hollands, Python-Ideas

On 24 Sep 2018, at 20:09, Marko Ristin-Kaufmann <marko....@gmail.com> wrote:

Hi Barry,
I think the main issue with pyffel is that it can not support function calls in general. If I understood it right, and Angus please correct me, you would need to wrap every function that you would call from within the contract.

But the syntax is much nicer than icontract or dpcontracts (see these packages on pypi). What if we renamed "args" argument and "old" argument in those libraries to just "a" and "o", respectively? Maybe that gives readable code without too much noise:

The args and old are not noise; they are easier to read than a and o.
a and o as aliases for more descriptive names maybe, but not as the only name.


@requires(lambda self, a, o: self.sum == o.sum - a.amount)
def withdraw(self, amount: int) -> None:
    ...

There is this lambda keyword in front, but it's not too bad?

The lambda smells of internals that I should not have to care about being exposed.
So -1 on lambda being required.
Also being able to supply a list of conditions was a +1.

James Lu

Sep 24, 2018, 6:36:38 PM
to python...@python.org
Perhaps it’s because fewer Python functions involve transitioning between states. Web development and statistics don’t involve many state transitions. State transitions are where I think I would find it useful to write contracts out explicitly.

Stephen J. Turnbull

Sep 25, 2018, 12:57:41 AM
to Barry Scott, Python-Ideas
Barry Scott writes:

> > @requires(lambda self, a, o: self.sum == o.sum - a.amount)
> > def withdraw(self, amount: int) -> None:
> >     ...
> >
> > There is this lambda keyword in front, but it's not too bad?
>
> The lambda smells of internals that I should not have to care about
> being exposed.
> So -1 on lambda being required.

If you want to get rid of the lambda you can use strings and then
'eval' them in the condition. Adds overhead.
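
A minimal sketch of the string-based variant, assuming a hypothetical requires decorator that is not taken from any existing library. The expression is compiled once when the function is decorated, so only the evaluation remains as per-call cost:

```python
import functools
import inspect


def requires(expression):
    """Precondition given as a string; compiled once, evaluated on each call."""
    code = compile(expression, f"<contract: {expression}>", "eval")

    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            # Evaluate against the function's globals, with the arguments as locals.
            if not eval(code, func.__globals__, dict(bound.arguments)):
                raise ValueError(f"Precondition violated: {expression}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires("amount > 0 and amount <= balance")
def withdraw(balance, amount):
    return balance - amount


print(withdraw(100, 30))  # 70
```

The lambda disappears and the violation message can quote the condition verbatim, but the condition is now opaque to linters, type checkers and refactoring tools, which see only a string.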

If you want to avoid the extra runtime overhead of parsing
expressions, it might be nice to prototype with MacroPy. This should
also allow eliminating the lambda by folding it into the macro (I
haven't used MacroPy but it got really good reviews by fans of that
kind of thing). It would be possible to avoid decorator syntax if you
want to with this implementation.

I'm not sure that DbC is enough of a fit for Python that it's worth
changing syntax to enable nice syntax natively, but detailed reports
on a whole library (as long as it's not tiny) using DbC with a nice
syntax (MacroPy would be cleaner, but I think it would be easy to "see
through" the quoted conditions in an eval-based implementation) would
go a long way to making me sit up and take notice. (I'm not
influential enough to care about, but I suspect some committers would
be impressed too. YMMV)

Steve

Marko Ristin-Kaufmann

Sep 25, 2018, 2:20:03 AM
to turnbull....@u.tsukuba.ac.jp, Python-Ideas
Hi Steve,
Thanks a lot for pointing us to macropy -- I was not aware of the library, it looks very interesting!

Do you have any experience with how macropy fits with current IDEs and static linters (pylint, mypy)? I fired up pylint and mypy on the sample code from their web site, played a bit with it and it seems that they get along well.

I'm also a bit worried how macropy would work out in the libraries published to pypi -- imagine if many people start using contracts. Suddenly, all these libraries would not only depend on a contract library but on a macro library as well. Is that something we should care about? Potential dependency hell? (I already have a bad feeling about making icontract depend on asttokens and am considering inlining asttokens into icontract particularly for that reason).

I'm also worried about this one (from https://macropy3.readthedocs.io/en/latest/overview.html):
Note that this means you cannot use macros in a file that is run directly, as it will not be passed through the import hooks.

That would make contracts unusable in any stand-alone script, right?

Cheers,
Marko

Marko Ristin-Kaufmann

Sep 25, 2018, 3:29:27 AM
to turnbull....@u.tsukuba.ac.jp, Python-Ideas
Hi Steve and others,
After some thinking, I'm coming to the conclusion that it might be wrong to focus too much on how the contracts are written -- as long as they are formal, easily transformable to another representation and fairly maintainable.

Whether it's with a lambda, without, with "args" or "a", with "old" or "o" -- it does not matter that much as long as it is pragmatic and not something crazy complex. This would also mean that we should not add complexity (e.g., by adding macros) and limit the magic as much as possible.

It is actually much more important in which form they are presented to the end-user. I already gave an example with sphinx-icontract in an earlier message -- an improved version might use mathematical symbols (e.g., replace all() with ∀, replace len() with |.|, nicely use subscripts for ranges, use case distinction with a curly bracket "{" instead of if ... else ..., etc.). This would make them even shorter and easier to parse. Let me reiterate the example I already pasted earlier in the thread to highlight what I have in mind:
packagery.resolve_initial_paths(initial_paths)

Resolve the initial paths of the dependency graph by recursively adding *.py files beneath given directories.

Parameters:

initial_paths (List[Path]) – initial paths as absolute paths

Return type:

List[Path]

Returns:

list of initial files (i.e. no directories)

Requires:
  • all(pth.is_absolute() for pth in initial_paths)
Ensures:
  • all(pth in result for pth in initial_paths if pth.is_file()) (Initial files also in result)
  • len(result) >= len(initial_paths) if initial_paths else result == []
  • all(pth.is_absolute() for pth in result)
  • all(pth.is_file() for pth in result)

    The contracts need to extend __doc__ of the function accordingly (and the contracts in __doc__ also need to reflect the inheritance of the contracts!), so that we can use help().
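
A minimal sketch of the idea that contracts should extend `__doc__` so that `help()` shows them. The `require` decorator, its `description` parameter, and the `count_paths` function are hypothetical names made up for this illustration; this is not icontract's API or implementation, and inheritance of contracts is not handled.

```python
import functools


def require(condition, description):
    """Hypothetical precondition decorator; `description` is the text shown by help()."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Check the precondition before calling the wrapped function.
            if not condition(*args, **kwargs):
                raise ValueError("Precondition violated: " + description)
            return func(*args, **kwargs)

        # Append the contract to the docstring so that help() displays it.
        doc = func.__doc__ or ""
        wrapper.__doc__ = doc.rstrip() + "\n\nRequires:\n  * " + description
        return wrapper
    return decorator


@require(lambda paths: all(p.startswith("/") for p in paths),
         "all(p.startswith('/') for p in paths)")
def count_paths(paths):
    """Return the number of the given absolute paths."""
    return len(paths)
```

Calling `help(count_paths)` would then list the precondition under the original docstring. A real library would presumably derive the description from the condition itself rather than ask for it twice, which is one reason icontract depends on asttokens.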

    There should also be a plugin for PyCharm, PyDev, vim and emacs to show the contracts in an abbreviated and more readable form in the code and only show them in raw form when we want to edit them (i.e., when we move the cursor over them). I suppose inheritance of contracts needs to be reflected in quick-inspection windows, but not in the code view.

    Diffs and github/bitbucket/... code reviews might be a bit cumbersome since they enforce the raw form of the contracts, but as long as syntax is pragmatic, I don't expect this to be a blocker.

    Is this a sane focus?

    Cheers,
    Marko

    Angus Hollands

    Sep 25, 2018, 3:38:57 AM9/25/18
    to Marko Ristin, python...@python.org
    Hi Marko,
    yes I'd pass in some kind of 'old' object as a proxy to the old object state. 

    My demo can handle function calls, unless they themselves ultimately call something which can't be proxied, e.g. isinstance (which delegates to the test class, not the instance), or boolean evaluation of some expression (e.g. an if block). I don't think that this is awful - contracts should probably be fairly concise while expressive - but it is definitely non-ideal.
     
    I haven't really time to work on this at the moment; I admit, it was a specific problem of interest, rather than a domain I have much experience with. In fact, it was probably an excuse to overload all of the operators on an object! 

    Kind regards, 
    Angus 

    Robert Collins

    Sep 25, 2018, 4:02:42 AM9/25/18
    to marko....@gmail.com, Python-Ideas
    On Mon, 24 Sep 2018 at 19:47, Marko Ristin-Kaufmann
    <marko....@gmail.com> wrote:
    >
    > Hi,
    >
    > Thank you for your replies, Hugh and David! Please let me address the points in serial.
    >
    > Obvious benefits
    > You both seem to misconceive the contracts. The goal of the design-by-contract is not reduced to testing the correctness of the code, as I reiterated already a couple of times in the previous thread. The contracts document formally what the caller and the callee expect and need to satisfy when using a method, a function or a class. This is meant for a module that is used by multiple people which are not necessarily familiar with the code. They are not a niche. There are 150K projects on pypi.org. Each one of them would benefit if annotated with the contracts.

    You'll lose folks' attention very quickly when you try to tell folk
    what they do and don't understand.

    Claiming that DbC annotations will improve the documentation of every
    single library on PyPI is an extraordinary claim, and such claims
    require extraordinary proof.

    I can think of many libraries where necessary pre and post conditions
    (such as 'self is still locked') are going to be noisy, and at risk of
    reducing comprehension if the DbC checks are used to enhance/extend
    documentation.

    Some of the examples you've been giving would be better expressed with
    a more capable type system in my view (e.g. Rust's), but I have no
    good idea about adding that into Python :/.

    Anyhow, the thing I value most about python is its pithiness: it's
    extremely compact, allowing great developer efficiency, but the cost
    of testing is indeed excessive if the tests are not structured well.
    That said, it's possible to run test suites with tens of thousands of
    tests in only a few seconds, so there's plenty of headroom for most
    projects.

    -Rob

    Stephen J. Turnbull

    Sep 25, 2018, 6:09:09 AM9/25/18
    to Marko Ristin-Kaufmann, Python-Ideas
    Marko Ristin-Kaufmann writes:

    > Thanks a lot for pointing us to macropy -- I was not aware of the library,
    > it looks very interesting!
    >
    > Do you have any experience how macropy fit

    Sorry, no. I was speaking as someone who is familiar with macros from
    Lisp but doesn't miss them in Python, and who also has been watching
    python-dev and python-ideas for about two decades now, so I've heard
    of things like MacroPy and know how the core developers think to a
    great extent.

    > I'm also a bit worried how macropy would work out in the libraries
    > published to pypi -- imagine if many people start using contracts.
    > Suddenly, all these libraries would not only depend on a contract library
    > but on a macro library as well.

    That's right.

    > Is that something we should care about?

    Yes. Most Pythonistas (at least at present) don't much like macros.
    They fear turning every program into its own domain-specific language.
    I can't claim much experience with dependency hell, but I think that's
    much less important from your point of view (see below).

    My point is mainly that, as you probably are becoming painfully aware,
    getting syntax changes into Python is a fairly drawn-out process. For
    an example of the kind of presentation that motivates people to change
    their mind from the default state of "if it isn't in Python yet,
    YAGNI" to "yes, let's do *this* one", see
    https://www.python.org/dev/peps/pep-0572/#appendix-a-tim-peters-s-findings

    Warning: Tim Peters is legendary, though still active occasionally.
    All he has to do is post to get people to take notice. But this
    Appendix is an example of why he gets that kind of R-E-S-P-E-C-T.[1]

    So the whole thing is a secret plot ;-) to present the most beautiful
    syntax possible in your PEP (which will *not* be about DbC, but rather
    about a small set of enabling syntax changes, hopefully a singleton),
    along with an extended example, or a representative sample, of usage.
    Because you have a working implementation using MacroPy (or the less
    pretty[2] but fewer dependencies version based on condition strings
    and eval) people can actually try it on their own code and (you hope,
    they don't :-) they find a nestful of bugs by using it.

    > Potential dependency hell? (I already have a bad feeling about
    > making icontract depend on asttokens and considering inlining
    > asttokens into icontract particularly for that reason).

    I don't think so. First, inlining an existing library is almost
    always a bad idea. As for the main point, if the user sticks to one
    major revision, and only upgrades to compatible bugfixes in the
    Python+stdlib distribution, I don't see why two or three libraries
    would be a major issue for a feature that the developer/project uses
    extremely frequently. I've rarely experienced dependency hell, and in
    both cases it was web frameworks (Django and Zope, to be specific, and
    the dependencies involved were more or less internal to those
    frameworks). If you or people you trust have other experience, forget
    what I just said. :-)

    Of course it depends on the library, but as long as the library is
    pretty strict about backward compatibility, you can upgrade it and get
    new functionality for other callers in your code base (which are
    likely to appear, you know -- human beings cannot stand to leave a
    tool unused once they install it!)

    > > Note that this means *you cannot use macros in a file that is run
    > > directly*, as it will not be passed through the import hooks.
    >
    > That would make contracts unusable in any stand-alone script,
    > right?

    Yes, but really, no:

    # The run.py described by the MacroPy docs assumes a script that
    # runs by just importing it. I don't have time to work out
    # whether that makes more sense. This idiom of importing just a
    # couple of libraries, and then invoking a function with a
    # conventional name such as "run" or "process" is quite common.
    # If you have docutils installed, check out rstpep2html.py.

    import macropy.activate
    from my_contractful_library import main
    main()

    and away you go. 5 years from now that script will be a badge of
    honor among Pythonic DbCers, and you won't be willing to give it up!
    Just kidding, of course -- the ideal outcome is that the use case is
    sufficiently persuasive to justify a syntax change so you don't need
    MacroPy, or, perhaps some genius will come along and provide some
    obscure construct that is already legal syntax!

    HTH

    Footnotes:
    [1] R.I.P. Aretha!

    [2] Urk, I just realized there's another weakness to strings: you get
    no help on checking their syntax from the compiler. For a proof-of-
    concept that's OK, but if you end up using the DbC library in your
    codebase for a couple years while the needed syntax change gathers
    support, that would be really bad.
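
The weakness described in footnote [2] can be partly reduced by compiling each condition string when the decorator is applied, so that a malformed condition fails at import time rather than at the first call. The sketch below is only an illustration under that assumption: `require_str` and `half` are hypothetical names, not part of icontract or MacroPy, and misspelled variable names inside the string would still only surface when the contract is evaluated.

```python
import functools
import inspect


def require_str(condition):
    """Hypothetical decorator that takes a precondition as a string."""
    # compile() raises SyntaxError right here, when the decorator is applied,
    # instead of when the decorated function is first called.
    code = compile(condition, "<contract: " + condition + ">", "eval")

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind the call arguments to parameter names so that the
            # condition string can refer to them.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            if not eval(code, func.__globals__, dict(bound.arguments)):
                raise ValueError("Precondition violated: " + condition)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@require_str("x >= 0")
def half(x):
    return x / 2
```

Writing `@require_str("x >=")` would raise `SyntaxError` as soon as the module is imported, which restores some of the compiler's help for an eval-based proof of concept.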

    Stephen J. Turnbull

    Sep 25, 2018, 6:10:23 AM9/25/18
    to Angus Hollands, python...@python.org
    Angus Hollands writes:

    > yes I'd pass in some kind of 'old' object as a proxy to the old object
    > state.

    Mostly you shouldn't need to do this, you can copy the state:

    def method(self, args):
        import copy
        old = copy.deepcopy(self)

    This is easy but verbose to do with a decorator, and I imagine a bunch
    of issues about the 'old' object with multiple decorators, so I omit it
    here. You might want a variety of such decorators, i.e., using
    copy.copy vs copy.deepcopy vs a special-case copy for a particular
    class because there are large objects that are actually constant that
    you don't want to copy (an "is" test would be enough, so the copy
    would actually implement part of the contract). Or the copy function
    could be an argument to the decorator or a method on the object.
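
One possible shape of the decorator variant discussed above, with the copy function passed as an argument. This is a sketch only: `ensure_with_old`, its `snapshot` parameter, and the `Stack` classes are hypothetical names for illustration, and the interaction of several stacked decorators (each would take its own snapshot) is not addressed.

```python
import copy
import functools


def ensure_with_old(postcondition, snapshot=copy.deepcopy):
    """Hypothetical decorator: check postcondition(old, self, result) after the call."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            old = snapshot(self)  # state of the object before the call
            result = method(self, *args, **kwargs)
            if not postcondition(old, self, result):
                raise AssertionError("Postcondition violated in " + method.__name__)
            return result
        return wrapper
    return decorator


class Stack:
    def __init__(self):
        self.items = []

    # deepcopy is the default because a shallow copy would share self.items
    # with the snapshot, and the length comparison would then always fail.
    @ensure_with_old(lambda old, self, result: len(self.items) == len(old.items) + 1)
    def push(self, item):
        self.items.append(item)


class BrokenStack(Stack):
    @ensure_with_old(lambda old, self, result: len(self.items) == len(old.items) + 1)
    def push(self, item):
        pass  # forgets to store the item, so the postcondition fails
```

Passing `snapshot=copy.copy`, or a class-specific function that skips large constant members, covers the other cases mentioned above.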

    Hugh Fisher

    Sep 25, 2018, 7:13:58 AM9/25/18
    to python...@python.org
    > Date: Mon, 24 Sep 2018 09:46:16 +0200
    > From: Marko Ristin-Kaufmann <marko....@gmail.com>
    > To: Python-Ideas <python...@python.org>
    > Subject: Re: [Python-ideas] Why is design-by-contracts not widely
    > adopted?
    > Message-ID:
    > <CAGu4bVB4=Bou+DhoDzayx=i7eQ2Gr8KDDOFO...@mail.gmail.com>
    > Content-Type: text/plain; charset="utf-8"

    [munch]

    > Their users would hugely benefit from a more mature
    > and standardized contracts library with informative violation messages.

    Will respond in another message, because it's a big topic.

    > I really don't see how DbC has to do with duck typing (unless you reduce it
    > to mere isinstance conditions, which would simply be a straw-man argument)
    > -- could you please clarify?

    I argue that Design by Contract doesn't make sense for Python and other
    dynamically typed, duck typed languages because it's contrary to how the
    language, and the programmer, expects to work.

    In Python we can write something like:

    def foo(x):
        x.bar(y)

    What's the type of x? What's the type of y? What is the contract of bar?
    Don't know, don't care. x, or y, can be an instance, a class, a module, a
    proxy for a remote web service. The only "contract" is that object x will
    respond to message bar that takes one argument. Object x, do whatever
    you want with it.

    And that's a feature, not a bug, not bad design. It follows Postel's Law
    for Internet protocols of being liberal in what you accept. It follows the
    Agile principle of valuing working software over comprehensive doco.
    It allows software components to be glued together quickly and easily.

    It's a style of programming that has been successful for many years,
    not just in Python but also in Lisp and Smalltalk and Perl and JavaScript.
    It works.

    Not for everything. If I were writing the avionics control routines for a
    helicopter gas turbine, I'd use formal notation and static type checking
    and preconditions and whatnot. But I wouldn't be using Python either.

    > As soon as you need to document your code, and
    > this is what most modules have to do in teams of more than one person
    > (especially so if you are developing a library for a wider audience), you
    > need to write down the contracts. Please see above where I tried to
    > explained that 2-5) are inferior approaches to documenting contracts
    > compared to 1).

    You left off option 6), plain text. Comments. Docstrings. README files.
    Web pages. Books. In my experience, this is what most people consider
    documentation. A good book, a good blog post, can explain more about
    how a library works and what the implementation requirements and
    restrictions are than formal contract notation. In particular, contracts in
    Eiffel don't explain *why* they're there.

    As for 4) reading the code, why not? "Use the source, Luke" is now a
    programming cliche because it works. It's particularly appropriate for
    Python packages which are usually distributed in source form and, as
    you yourself noted, easy to read.

    --

    cheers,
    Hugh Fisher

    Marko Ristin-Kaufmann

    Sep 25, 2018, 1:20:06 PM9/25/18
    to rob...@robertcollins.net, Python-Ideas
    Hi Robert,

    You'll lose folks' attention very quickly when you try to tell folk
    what they do and don't understand.

    I apologize if I sounded offensive; that was definitely not my intention. I appreciate that you addressed that.

    I suppose it's a cultural/language issue and the wording was probably inappropriate. Please let me clarify what I meant: there was a misconception, as DbC was reduced to a tool for testing and, in a separate message, to type checks at runtime. These are clearly misconceptions, as DbC (as originally proposed by Hoare and later popularized by Meyer) includes other relevant aspects which are essential and hence cannot be overlooked or simply ignored. If we argue about DbC without these aspects, then we are simply falling prey to a straw-man fallacy.

    Claiming that DbC annotations will improve the documentation of every
    single library on PyPI is an extraordinary claim, and such claims
    require extraordinary proof.

    I don't know what you mean by "extraordinary" claim and "extraordinary" proof, respectively. I tried to show that DbC is a great tool and far superior to any other tools currently used to document contracts in a library, please see my message https://groups.google.com/d/msg/python-ideas/dmXz_7LH4GI/5A9jbpQ8CAAJ. Let me re-use the enumeration I used in the message and give you a short summary.

    The implicit or explicit contracts are there willy-nilly. When you use a module, either you need to figure them out using trial-and-error or looking at the implementation (4), looking at the test cases and hoping that they generalize (5), write them as doctests (3) or write them in docstrings as human text (2); or you write them formally as explicit contracts (1).

    I could not identify any other methods that can help you with expectations when you call a function or use a class (apart from formal methods and proofs, which I omitted as they seem too esoteric for the current discussion).

    Given that:
    * There is no other method for representing contracts,
    * people are trained and can read formal statements and
    * there is tooling available to write, maintain and represent contracts in a nice way

    I see formal contracts (1) as a superior tool. The deficiencies of other approaches are:
    2) Comments and docstrings inevitably rot and get disconnected from the implementation in my and many other people's experience and studies.
    3) Doctests are much longer and hence more tedious to read and maintain, they need extra text to signal the intent (is it a simple test or an example how boundary conditions are handled or ...). In any non-trivial case, they need to include even the contract itself.
    4) Looking at other people's code to figure out the contracts is tedious and usually difficult for any non-trivial function.
    5) Test cases can be difficult to read since they include much broader testing logic (mocking, set up). Most libraries do not ship with the test code. Identifying test cases which demonstrate the contracts can be difficult.

    Any function that is used by multiple developers which operates on the restricted range of input values and gives out structured output values benefits from contracts (1) since the user of the function needs to figure them out to properly call the function and handle its results correctly. I assume that every package on pypi is published to be used by wider audience, and not the developer herself. Hence every package on pypi would benefit from formal contracts.

    Some predicates are hard to formulate, and we will never be able to formally write down all the contracts. But that doesn't imply for me to not use contracts at all (analogously, some functionality is untestable, but that doesn't mean that we don't test what we can).

    I would be very grateful if you could point me where this exposition is wrong (maybe referring to my original message, https://groups.google.com/d/msg/python-ideas/dmXz_7LH4GI/5A9jbpQ8CAAJ, which I spent more thought on formulating).

    So far, I have neither been confronted with nor read on the internet a plausible argument against formal contracts (the only two exceptions being the lack of tools and that less-skilled programmers have a hard time reading formal statements as soon as they include boolean logic and quantifiers). I'm actively working on the former, and hope that the latter will improve with time as education in computer science improves.

    Another argument, which I did read often on the internet but don't really count, is that quality software is not a priority and most projects hence dispense with documentation or testing. This should, hopefully, not apply to public pypi packages and is highly impractical for any medium-size project with multiple developers (and very costly in the long run).

    I can think of many libraries where necessary pre and post conditions
    (such as 'self is still locked') are going to be noisy, and at risk of
    reducing comprehension if the DbC checks are used to enhance/extend
    documentation.

    It is up to the developer to decide which contracts are enforced during testing, enforced in production, or displayed in the documentation (you can pick any subset of the three; they are not mutually exclusive). This feature (the "enabled" argument to a contract) has already been implemented in the icontract library.
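
To make the idea of an "enabled" switch concrete, here is a self-contained sketch of how such a flag could work. It is not icontract's implementation: the `require` decorator below and the functions `first` and `last` are hypothetical, written only to show that a disabled contract can be skipped entirely when the decorator is applied.

```python
import functools


def require(condition, enabled=True):
    """Hypothetical precondition decorator with an on/off switch."""
    def decorator(func):
        if not enabled:
            # Return the function unchanged: no wrapper, so no runtime cost.
            return func

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not condition(*args, **kwargs):
                raise ValueError("Precondition violated in " + func.__name__)
            return func(*args, **kwargs)
        return wrapper
    return decorator


# An expensive check (sorting) that a project might switch off in production.
@require(lambda items: items == sorted(items), enabled=False)
def first(items):
    return items[0]


# A cheap check that stays on everywhere.
@require(lambda items: len(items) > 0)
def last(items):
    return items[-1]
```

In practice the value passed as `enabled` would likely come from a project-level setting, so that the same source enforces different subsets of contracts in testing and in production.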

    Some of the examples you've been giving would be better expressed with
    a more capable type system in my view (e.g. Rust's), but I have no
    good idea about adding that into Python  :/.
    I don't see how a type system would help, regardless of how strict it is. Unless each input and each output were represented by a special type, which would be super confusing as soon as you put them in containers and have to struggle with invariance, contravariance and covariance. Please see https://github.com/rust-lang/rfcs/issues/1077 for a discussion about introducing DbC to Rust. Unfortunately, the discussion about contracts in Rust is also based on misconceptions (e.g., see https://github.com/rust-lang/rfcs/issues/1077#issuecomment-94582917) -- there seems to be something wrong in the way proponents of DbC present contracts to the wider audience, and they fail to address these issues in a good way. So most people just react instinctively with "80% already covered with type systems" / "mere runtime type checks, use assert" and "that's only an extension to testing, so why bother" :(.

    I would now like to answer Hugh and withdraw from the discussion pro/contra formal contracts unless there is a rational, logical argument disputing the DbC in its entirety (not in one of its specific aspects or as a misconception/straw-man). A lot has been already said, many articles have been written (I linked some of the pages which I thought were short & good reads and I would gladly supply more reading material). I doubt I can find a better way to contribute to the discussion.

    Cheers,
    Marko
     

    Marko Ristin-Kaufmann

    Sep 25, 2018, 1:21:25 PM9/25/18
    to turnbull....@u.tsukuba.ac.jp, Python-Ideas
    Hi Steve,
    I'll give it a shot and implement a proof-of-concept icontract-macro library based on macropy and see if that works. I'll keep you posted.

    Cheers,
    Marko

    Marko Ristin-Kaufmann

    Sep 25, 2018, 2:12:42 PM9/25/18
    to hugo....@gmail.com, Python-Ideas
    Hi Hugh,

    > As soon as you need to document your code, and
    > this is what most modules have to do in teams of more than one person
    > (especially so if you are developing a library for a wider audience), you
    > need to write down the contracts. Please see above where I tried to
    > explained  that 2-5) are inferior approaches to documenting contracts
    > compared to 1).

    You left off option 6), plain text. Comments. Docstrings.

     That was actually the option 2):

    2) Write preconditions and postconditions in docstring of the method as human text.

    The problem with text is that it is not verifiable and hence starts to "rot". Noticing that text is wrong involves much more developer time & attention than automatically verifying the formal contracts.


    In Python we can write something like:

    def foo(x):
        x.bar(y)

    What's the type of x? What's the type of y? What is the contract of bar?
    Don't know, don't care. x, or y, can be an instance, a class, a module, a
    proxy for a remote web service. The only "contract" is that object x will
    respond to message bar that takes one argument. Object x, do whatever
    you want with it.
    I still don't see how this is connected to contracts or how contracts play a role there? If foo can accept any x and return any result then there is no contract. But hardly any function is like that. Most exercise a certain behavior on a subset of possible input values. The outputs also  satisfy certain contracts, i.e. they also live in a certain subset of possible outputs. (Please mind that I don't mean strictly numerical ranges here -- it can be any subset of structured data.) As I already mentioned, the contracts have nothing to do with typing. You can use them for runtime type checks -- but that's a reduction of the concept to a very particular use case. Usually contracts read like this (from the numpy example linked in another message, https://www.numpy.org/devdocs/reference/generated/numpy.ndarray.transpose.html#numpy.ndarray.transpose):

    ndarray.transpose(*axes)

    Returns a view of the array with axes transposed.

    For a 1-D array, this has no effect. (To change between column and row vectors, first cast the 1-D array into a matrix object.) For a 2-D array, this is the usual matrix transpose. For an n-D array, if axes are given, their order indicates how the axes are permuted (see Examples). If axes are not provided and a.shape = (i[0], i[1], ... i[n-2], i[n-1]), then a.transpose().shape = (i[n-1], i[n-2], ... i[1], i[0]).

    (emphasis mine)

    Mind the three postconditions (case 1D array, case 2D array, case N-D array).
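
The shape part of these postconditions can be written as a single predicate. The sketch below works on plain shape tuples so that it runs without numpy; the function name `check_transpose_shape` is made up for this illustration, and the handling of explicit `axes` (axis j of the result corresponds to axis `axes[j]` of the input) is my reading of the numpy documentation rather than something stated in the quoted text. It says nothing about the element values of the result.

```python
def check_transpose_shape(before, after, axes=None):
    """Postcondition on shapes (tuples of ints) for a transpose operation."""
    if axes is None:
        # No axes given: the shape is reversed. For a 1-D array this is a
        # no-op, for a 2-D array it is the usual matrix transpose.
        return after == before[::-1]
    # Axes given: axis j of the result is axis axes[j] of the input.
    return after == tuple(before[j] for j in axes)


# 1-D: no effect on the shape.
ok_1d = check_transpose_shape((5,), (5,))
# 2-D: rows and columns are swapped.
ok_2d = check_transpose_shape((2, 3), (3, 2))
# N-D with explicit axes.
ok_nd = check_transpose_shape((2, 3, 4), (3, 4, 2), axes=(1, 2, 0))
```

With numpy installed, the same predicate could be applied as `check_transpose_shape(a.shape, a.transpose().shape)`.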


    As for 4) reading the code, why not? "Use the source, Luke" is now a
    programming cliche because it works. It's particularly appropriate for
    Python packages which are usually distributed in source form and, as
    you yourself noted, easy to read.

    Because it is hard and costs a lot of time. The point of encapsulating a function is that I as a user don't have to know the details of its implementation or its wider dependencies. Looking at the code is the tool of last resort to figure out the contracts. Imagine if you had to look at the implementation of numpy.transpose() to figure out what happens when transposing an N-D array.

    Cheers,
    Marko

    Chris Angelico

    Sep 25, 2018, 3:43:44 PM9/25/18
    to python-ideas
    On Wed, Sep 26, 2018 at 3:19 AM Marko Ristin-Kaufmann
    <marko....@gmail.com> wrote:
    >> Claiming that DbC annotations will improve the documentation of every
    >> single library on PyPI is an extraordinary claim, and such claims
    >> require extraordinary proof.
    >
    >
    > I don't know what you mean by "extraordinary" claim and "extraordinary" proof, respectively. I tried to show that DbC is a great tool and far superior to any other tools currently used to document contracts in a library, please see my message https://groups.google.com/d/msg/python-ideas/dmXz_7LH4GI/5A9jbpQ8CAAJ. Let me re-use the enumeration I used in the message and give you a short summary.
    >

    An ordinary claim is like "DbC can be used to improve code and/or
    documentation", and requires about as much evidence as you can stuff
    into a single email. Simple claim, low burden of proof.

    An extraordinary claim is like "DbC can improve *every single project*
    on PyPI". That requires a TON of proof. Obviously we won't quibble if
    you can only demonstrate that 99.95% of them can be improved, but you
    have to at least show that the bulk of them can.

    > There are 150K projects on pypi.org. Each one of them would benefit if annotated with the contracts.

    This is the extraordinary claim. To justify it, you have to show that
    virtually ANY project would benefit from contracts. So far, I haven't
    seen any such proof.

    ChrisA

    Lee Braiden

    Sep 25, 2018, 4:10:59 PM9/25/18
    to Chris Angelico, python-ideas
    Eh. It's too easy to cry "show me the facts" in any argument. To do that too often is to reduce all discussion to pedantry.

    That verifying data against the contract of a function makes code more reliable should be self-evident to anyone with even the most rudimentary understanding of a function call, let alone a library or large application. It's the reason why type checking exists, and why bounds checking exists, and why unit checking exists too.

    Chris Angelico

    Sep 25, 2018, 4:40:18 PM9/25/18
    to python-ideas
    On Wed, Sep 26, 2018 at 6:09 AM Lee Braiden <leeb...@gmail.com> wrote:
    >
    > Eh. It's too easy to cry "show me the facts" in any argument. To do that too often is to reduce all discussion to pedantry.
    >
    > That verifying data against the contract of a function makes code more reliable should be self-evident to anyone with even the most rudimentary understanding of a function call, let alone a library or large application. It's the reason why type checking exists, and why bounds checking exists, and why unit checking exists too.
    >

    It's easy, but it's also often correct.

    From my reading of this thread, there HAS been evidence given that DbC
    can be beneficial in some cases. I do not believe there has been
    evidence enough to cite the number of projects on PyPI as "this is how
    many projects would benefit".

    Part of the trouble is finding a concise syntax for the contracts that
    is still sufficiently expressive.

    Kyle Lahnakoski

    Sep 25, 2018, 5:59:43 PM9/25/18
    to python...@python.org


    I use DbC occasionally to clarify my thoughts during a refactoring, and then only in the places that continue to make mistakes. In general, I am not in a domain that benefits from DbC.

    Contracts are code: more code means more bugs. Declarative contracts are succinct, but difficult to debug when wrong; I believe this is because debugger support for contracts is poor: there is no way to step through the logic and see the intermediate reasoning in complex contracts. A contract is an incomplete duplication of what the code already does: at some level of complexity I prefer to write a duplicate, independent implementation and compare inputs/outputs.

    Writing contracts costs time and money, and that cost should be weighed against the number and flexibility of the customers that use the code. A one-time script, a webapp for your team, an Android app for your startup, fraud software, and Facebook make different accounting decisions. I contend most code projects cannot justify DbC.



    On 2018-09-24 03:46, Marko Ristin-Kaufmann wrote:
    When you are documenting a method you have the following options:
    1) Write preconditions and postconditions formally and include them automatically in the documentation (e.g., by using icontract library).
    2) Write preconditions and postconditions in docstring of the method as human text.
    3) Write doctests in the docstring of the method.
    4) Expect the user to read the actual implementation.
    5) Expect the user to read the testing code.


    There are other ways to communicate how a method works.

    6) The name of the method
    7) How the method is called throughout the codebase
    8) observing input and output values during debugging
    9) observing input and output values in production
    10) relying on convention inside, and outside, the application
    11) Don't communicate - Sometimes <complexity>/<num_customers> is too high; code is not repaired, only replaced.


    This is again something that eludes me and I would be really thankful if you could clarify. Please consider for an example, pypackagery (https://pypackagery.readthedocs.io/en/latest/packagery.html) and the documentation of its function resolve_initial_paths:
    packagery.resolve_initial_paths(initial_paths)

    Resolve the initial paths of the dependency graph by recursively adding *.py files beneath given directories.

    Parameters:

    initial_paths (List[Path]) – initial paths as absolute paths

    Return type:

    List[Path]

    Returns:

    list of initial files (i.e. no directories)

    Requires:
    • all(pth.is_absolute() for pth in initial_paths)
    Ensures:
    • len(result) >= len(initial_paths) if initial_paths else result == []
    • all(pth.is_absolute() for pth in result)
    • all(pth.is_file() for pth in result)

    How is this difficult to read,[...]?

    This contract does not help me:
    Does it work on Windows?
    What is is_absolute()? Is "file:///" absolute?
    How does this code fail?
    What does a permission access problem look like?
    Can initial_paths be None?
    Can initial_paths be files? directories?
    What are the side effects?

    resolve_initial_paths() is a piece of code that is better understood by looking at the callers (#7), or by not exposing it publicly (#11). You can also use a different set of abstractions to make the code easier to read:
      
    UNION(file for p in initial_paths for file in p.leaves() if file.extension=="py")
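    For comparison, the expression above could be approximated in plain Python roughly as follows. This is only a sketch: leaves() and extension are not pathlib names, so Path.rglob and Path.suffix stand in for them, and the function name collect_py_files is made up for illustration.

```python
from pathlib import Path
from typing import Iterable, List


def collect_py_files(initial_paths: Iterable[Path]) -> List[Path]:
    """Union of all *.py files at or beneath the given paths (hypothetical helper)."""
    found = set()
    for p in initial_paths:
        if p.is_dir():
            # rglob walks the directory tree recursively.
            found.update(f for f in p.rglob("*.py") if f.is_file())
        elif p.is_file() and p.suffix == ".py":
            found.add(p)
    # A set removes duplicates (the "union"); sorting makes the order stable.
    return sorted(found)
```

    Whether this reads better than the contract-annotated signature is the question under discussion; the sketch only makes the alternative concrete.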

    At a high level, I can see the allure of DbC:  Programming can be a craft, and a person can derive deep personal satisfaction from perfecting the code they work on. DbC provides you with more decoration, more elaboration, more ornamentation, more control.  This is not bad, but I see all your arguments as personal aesthetic sense.  DbC is only appealing under certain accounting rules.  Please consider the possibility that "the best code" is: low $$$, buggy, full of tangles, and mostly gets the job done.   :)

    Chris Angelico

    Sep 25, 2018, 7:21:09 PM
    to python-ideas
    On Wed, Sep 26, 2018 at 7:59 AM Kyle Lahnakoski <klahn...@mozilla.com> wrote:
    > I use DbC occasionally to clarify my thoughts during a refactoring, and then only in the places that continue to make mistakes. In general, I am not in a domain that benefits from DbC.
    >
    > Contracts are code: More code means more bugs.

    Contracts are executable documentation. If you can lift them directly
    into user-readable documentation (and by "user" here I mean the user
    of a library), they can save you the work of keeping your
    documentation accurate.

    > This contract does not help me:
    >
    > What is_absolute()? is "file:///" absolute?

    I'd have to assume that is_absolute() is defined elsewhere. Which
    means that the value of this contract depends entirely on having other
    functions, probably ALSO contractually-defined, to explain it.

    > How does this code fail?
    > What does a permission access problem look like?

    Probably an exception. This is Python code, and I would generally
    assume that problems are reported as exceptions.

    > Can initial_paths be None?

    This can be answered from the type declaration. It doesn't say
    Optional, so no, it can't be None.
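    To make the point about the type declaration concrete, here is a minimal sketch. Both function names are hypothetical and exist only to contrast the two annotations; neither comes from the library under discussion.

```python
from pathlib import Path
from typing import List, Optional


def takes_list(initial_paths: List[Path]) -> List[Path]:
    # The annotation promises a list; None is not an accepted value.
    return list(initial_paths)


def takes_optional(initial_paths: Optional[List[Path]] = None) -> List[Path]:
    # Optional[...] explicitly admits None, so the body has to handle it.
    return list(initial_paths) if initial_paths is not None else []
```

    Annotations are not enforced at runtime, so this answer holds only as far as callers run a static checker such as mypy, which would reject takes_list(None).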

    > Can initial_paths be files? directories?

    Presumably not a question you'd get if you were actually using it; the
    point of the function is to "[r]esolve the initial paths of the
    dependency graph by recursively adding *.py files beneath given
    directories", so you'd call it because you have directories and want
    files back.

    > What are the side effects?

    Hopefully none, other than the normal implications of hitting the file system.

    It's easy to show beautiful examples that may actually depend on other
    things. Whether that's representative of all contracts is another
    question.

    ChrisA

    Marko Ristin-Kaufmann

    Sep 26, 2018, 12:50:19 AM
    to ros...@gmail.com, Python-Ideas
    Hi Chris,


    An extraordinary claim is like "DbC can improve *every single project*
    on PyPI". That requires a TON of proof. Obviously we won't quibble if
    you can only demonstrate that 99.95% of them can be improved, but you
    have to at least show that the bulk of them can.

    I tried to give the "proof" (not a formal one, though) in my previous message. The assumptions are that:
    * There are always contracts; they can be either implicit or explicit. You always need to figure them out before you call a function or use its result.
    * Figuring out contracts by trial-and-error and reading the code (the implementation or the test code) is time-consuming and hard.
    * There are tools for formal contracts.
    * The contracts written in documentation as human text inevitably rot and they are much harder to maintain than automatically verified formal contracts.
    * The reader is familiar with formal statements, and hence reading formal statements is faster than reading the code or trial-and-error.

    I then went on to show why I think, under these assumptions, that formal contracts are superior as a documentation tool and hence beneficial. Do you think that any of these assumptions are wrong? Is there a hole in my logical reasoning presented in my previous message? I would be very grateful for any pointers!

    If these assumptions hold and there is no mistake in my reasoning, wouldn't that qualify as a proof?

    Cheers,
    Marko

    Marko Ristin-Kaufmann

    Sep 26, 2018, 1:13:01 AM
    to klahn...@mozilla.com, Python-Ideas
    Hi Kyle,


    6) The name of the method
    7) How the method is called throughout the codebase
    10) relying on convention inside, and outside, the application

    Sorry, by formulating 2) as "docstring" I excluded names of the methods as well as variables. Please assume that 2) actually entails those as well. They are human text and hence not automatically verifiable, hence qualify as 2).


    8) observing input and output values during debugging
    9) observing input and output values in production

    Sorry, again I implicitly subsumed 8-9 under 4), reading the implementation code (including the trial-and-error). My assumption was that it is incomparably more costly to apply trial-and-error than to read the contracts, given that contracts can be formulated. Of course, not all contracts can be formulated all the time.


    11) Don't communicate - Sometimes <complexity>/<num_customers> is too high; code is not repaired, only replaced.

    I don't see this as an option for any publicly available, high-quality module on PyPI or in any organization. As I already noted in my message to Hugh, arguments in favor of undocumented and/or untested code are beside the point here. I assume we want maintainable and usable modules. I've never talked about undocumented throw-away exploratory code. Most Python features become futile in that case (type annotations and static type checking with mypy, to name only a few).


    Does it work on Windows?
    This is probably impossible to write as a contract, but needs to be tested (though maybe there is a way to check it and encapsulate the check in a separate function and put it into the contract).

    What is_absolute()?  is "file:///" absolute?
    Since the type is pathlib.Path (as written in the type annotation), it's pathlib.Path.is_absolute() method. Please see https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.is_absolute
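    A quick way to check what pathlib considers absolute is to ask the pure path classes, which do no I/O and can therefore be inspected for both flavours on any operating system. The sample paths below are only illustrative.

```python
from pathlib import PurePosixPath, PureWindowsPath

# A path is absolute when it has a root and, on Windows, also a drive.
checks = {
    "posix /usr/lib": PurePosixPath("/usr/lib").is_absolute(),
    "posix file:///": PurePosixPath("file:///").is_absolute(),
    "windows C:/Users": PureWindowsPath("C:/Users").is_absolute(),
    "windows /Users": PureWindowsPath("/Users").is_absolute(),
}

for label, value in checks.items():
    print(label, value)
```

    The string "file:///" is treated as an ordinary relative path whose first component is "file:", so it is not absolute; pathlib does not parse URIs here.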

    At a high level, I can see the allure of DbC:  Programming can be a craft, and a person can derive deep personal satisfaction from perfecting the code they work on. DbC provides you with more decoration, more elaboration, more ornamentation, more control.  This is not bad, but I see all your arguments as personal aesthetic sense.  DbC is only appealing under certain accounting rules.  Please consider the possibility that "the best code" is: low $$$, buggy, full of tangles, and mostly gets the job done.   :)
    Actually, this goes totally contrary to most of my experience. Bad code is unmaintainable and ends up being much more costly down the line. It's also what we were taught in software engineering lectures at university (some 10-15 years ago) and I always assumed that the studies presented there were correct.

    Saying that writing down contracts is costly is a straw man. It is costly if you need to examine an existing function and write them down afterwards. If you are writing the function and just keep adding the contracts as you go, there is very little overhead. You make an assumption about the input, and instead of just coding on, you scroll up, write it down formally, then go back to where you stopped and continue the implementation. Or you think for a minute about what contracts your function needs to expect/satisfy before you start writing it (or during the design). I don't see how this can be less efficient than trial-and-error and making possibly wrong assumptions based on the output that you see, without any documentation, by running the code of the module.
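    As an illustration of writing the contracts down as you go, here is a minimal sketch that states the documented conditions of resolve_initial_paths with plain assert statements. The function body is an assumption made for the example and is not the pypackagery implementation.

```python
from pathlib import Path
from typing import List


def resolve_initial_paths(initial_paths: List[Path]) -> List[Path]:
    # Precondition, written down at the moment the assumption is made.
    assert all(pth.is_absolute() for pth in initial_paths)

    # Assumed body: expand directories into the *.py files beneath them.
    result: List[Path] = []
    for pth in initial_paths:
        if pth.is_dir():
            result.extend(sorted(p for p in pth.rglob("*.py") if p.is_file()))
        else:
            result.append(pth)

    # Postconditions, copied from the documented "Ensures" list.
    assert len(result) >= len(initial_paths) if initial_paths else result == []
    assert all(pth.is_absolute() for pth in result)
    assert all(pth.is_file() for pth in result)
    return result
```

    Plain asserts are stripped when Python runs with -O and do not show up in generated documentation, which is part of why a dedicated contract library might attach the same conditions as decorators instead.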

    Cheers,
    Marko

    Marko Ristin-Kaufmann

    Sep 26, 2018, 1:26:01 AM