One explanation that seems plausible to me is that many programmers are actually having a hard time with formalization and logic rules (e.g., implication, quantifiers), maybe due to missing education (e.g., many programmers are people who came to programming from other less-formal fields). It's hence easier for them to write in human text, and it takes substantial cognitive load to formalize these thoughts in code. Does that explain it?
Hi Marko,
I think there are several ways to approach this problem, though I am not weighing in on whether DbC is a good thing in Python. I wrote a simple implementation of DbC which is currently a run-time checker. You could, with the appropriate tooling, validate statically too (as with all approaches). In my approach, I use a “proxy” object to allow the contract code to be defined at function definition time. It does mean that some things are not as pretty as one would like (anything that cannot be hooked into with magic methods, such as isinstance), but I think this is acceptable as it makes features like old easier. Also, one hopes that it encourages simpler contract checks as a side effect. Feel free to take a look: https://github.com/agoose77/pyffel
It is by no means well written, but a fun PoC nonetheless.
Regards,
Angus
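The proxy idea described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not pyffel's actual API: the names `Proxy`, `ARG` and `requires` are hypothetical. The proxy overloads a few magic methods so that an expression written at function definition time is recorded and only evaluated when the function is called.

```python
import functools
import inspect
import operator


class Proxy:
    """Hypothetical proxy: records an expression now, evaluates it at call time."""

    def __init__(self, evaluate, text):
        self._evaluate = evaluate  # callable: mapping of argument names -> value
        self._text = text          # readable form of the expression for messages

    def _binary(self, other, op, symbol):
        def evaluate(arguments):
            right = other._evaluate(arguments) if isinstance(other, Proxy) else other
            return op(self._evaluate(arguments), right)

        other_text = other._text if isinstance(other, Proxy) else repr(other)
        return Proxy(evaluate, f"({self._text} {symbol} {other_text})")

    # Each magic method returns a new Proxy instead of computing a value.
    def __gt__(self, other):
        return self._binary(other, operator.gt, ">")

    def __le__(self, other):
        return self._binary(other, operator.le, "<=")


class _Arguments:
    """Attribute access yields a Proxy that looks up the argument by name."""

    def __getattr__(self, name):
        return Proxy(lambda arguments: arguments[name], name)


ARG = _Arguments()


def requires(condition):
    """Hypothetical precondition decorator taking a recorded Proxy expression."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            if not condition._evaluate(bound.arguments):
                raise ValueError(f"Precondition violated: {condition._text}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


@requires(ARG.amount > 0)
@requires(ARG.amount <= ARG.balance)
def withdraw(balance, amount):
    return balance - amount
```

Because Python offers no magic method that intercepts a call such as `isinstance(ARG.amount, int)` and defers it, a type check of that kind cannot be recorded this way, which is the limitation the message mentions.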
To me, as I've said, DbC imposes a very large cost for both writers and readers of code.
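packagery.resolve_initial_paths(initial_paths)
Resolve the initial paths of the dependency graph by recursively adding *.py files beneath given directories.
Parameters: initial_paths (List[Path]) – initial paths as absolute paths
Return type: List[Path]
Returns: list of initial files (i.e. no directories)
Requires:
    all(pth.is_absolute() for pth in initial_paths)
Ensures:
    len(result) >= len(initial_paths) if initial_paths else result == []
    all(pth.is_absolute() for pth in result)
    all(pth.is_file() for pth in result)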
>>> packagery.resolve_initial_paths([])
[]
>>> with temppathlib.NamedTemporaryFile() as tmp1, \
...         temppathlib.NamedTemporaryFile() as tmp2:
...     _ = tmp1.path.write_text("some text")
...     _ = tmp2.path.write_text("another text")
...     result = packagery.resolve_initial_paths([tmp1.path, tmp2.path])
...     assert all(pth.is_absolute() for pth in result)
...     assert all(pth.is_file() for pth in result)
>>> with temppathlib.TemporaryDirectory() as tmp:
...     packagery.resolve_initial_paths([tmp.path])
Traceback (most recent call last):
...
ValueError: Unexpected directory in the paths
>>> with temppathlib.TemporaryDirectory() as tmp:
...     pth = tmp.path / "some-file.py"
...     _ = pth.write_text("some text")
...     packagery.resolve_initial_paths([pth.relative_to(tmp.path)])
Traceback (most recent call last):
...
ValueError: Unexpected relative path in the initial paths
import pathlib
from typing import List

import icontract


@icontract.pre(lambda initial_paths: all(pth.is_absolute() for pth in initial_paths))
@icontract.post(lambda result: all(pth.is_file() for pth in result))
@icontract.post(lambda result: all(pth.is_absolute() for pth in result))
@icontract.post(lambda initial_paths, result: len(result) >= len(initial_paths) if initial_paths else result == [])
def resolve_initial_paths(initial_paths: List[pathlib.Path]) -> List[pathlib.Path]:
    ...
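To make concrete what decorators like these check at run time, here is a minimal sketch using only the standard library. The helpers `pre` and `post` and the toy function `drop_duplicates` are hypothetical and written for illustration; this is not how icontract is implemented, and the postcondition here only receives the result.

```python
import functools
import inspect
import pathlib
from typing import Callable, List


def pre(condition: Callable[..., bool]):
    """Check `condition` against the named call arguments before the call."""
    def decorator(func):
        signature = inspect.signature(func)
        wanted = list(inspect.signature(condition).parameters)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            selected = {name: bound.arguments[name] for name in wanted}
            if not condition(**selected):
                raise ValueError(f"Precondition violated in {func.__name__}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


def post(condition: Callable[..., bool]):
    """Check `condition` against the return value after the call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not condition(result):
                raise ValueError(f"Postcondition violated in {func.__name__}")
            return result

        return wrapper

    return decorator


@pre(lambda initial_paths: all(pth.is_absolute() for pth in initial_paths))
@post(lambda result: all(pth.is_absolute() for pth in result))
def drop_duplicates(
    initial_paths: List[pathlib.PurePosixPath],
) -> List[pathlib.PurePosixPath]:
    """Toy stand-in for resolve_initial_paths: keep the first copy of each path."""
    return list(dict.fromkeys(initial_paths))
```

Calling `drop_duplicates` with a relative path raises `ValueError` before the body runs, which is the behaviour the `Requires:` line in the generated documentation describes.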
I'm not alone in this. A large majority of folks formally educated in computer science and related fields have been aware of DbC for decades but deliberately decided not to use it in their own code. Maybe you and Bertrand Meyer are simply better than that 99% of programmers... Or maybe the benefit is not as self-evident and compelling as it feels to you.
The vast majority of those developing software - even that intended to be reused - are simply ignorant of the concept. As a result they produce application programmer interfaces (APIs) that are under-specified, thus passing the burden to the application programmer to discover by trial and error the 'acceptable boundaries' of the software interface (the undocumented terms of the contract). But such ad-hoc operational definitions of a software interface, discovered through reverse-engineering, are subject to change upon the next release and so offer no stable way to ensure software correctness.
The fact that many people involved in writing software lack pertinent education (e.g., CS/CE degrees) and training (professional courses, reading software engineering journals, attending conferences, etc.) is not a reason they don't know about DBC, since the concept is not covered adequately in such mediums anyway. That is, ignorance of DBC extends not just throughout practitioners but also throughout educators and many industry experts.
The simplicity and obvious benefits of Design By Contract lead one to wonder why it has not become 'standard practice' in the software development industry. When the concept has been explained to various technical people (all non-programmers), they invariably agree that it is a sensible approach and some even express dismay that software components are not developed this way.
(Emphasis mine; iContract refers to a Java design-by-contract library.)
It is just another indicator of the immaturity of the software development industry. The failure to produce high-quality products is also blatantly obvious from the non-warranty license agreement of commercial software. Yet consumers continue to buy software they suspect and even expect to be of poor quality. Both quality and lack-of-quality have a price tag, but the difference is in who pays and when. As long as companies can continue to experience rising profits while selling poor-quality products, what incentive is there to change? Perhaps the fall-out of the "Year 2000" problem will focus enough external pressure on the industry to jolt it towards improved software development methods. There is talk of certifying programmers like other professionals. If and when that occurs, the benefits of Design By Contract just might begin to be appreciated.
But it is doubtful. Considering the typical 20 year rule for adopting superior technology, DBC as exemplified by Eiffel, has another decade to go. But if Java succeeds in becoming a widely-used language and JavaBeans become a widespread form of reuse then it would already be too late for DBC to have an impact. iContract will be a hardly-noticed event much like ANNA for Ada and A++ for C++. This is because the philosophy/mindset/culture is established by the initial publication of the language and its standard library.
Secondly, these "obvious" benefits. If they're obvious, I want to know why
you aren't using Eiffel. It's a programming language designed around DbC
concepts. It's been around for three decades, at least as long as Python or
longer. There's an existing base of compilers and support tools and libraries
and textbooks and experienced programmers to work with.
Could it be that Python has better libraries, is faster to develop for, attracts
more programmers? If so, I suggest it's worth considering that this might
be *because* Python doesn't have DbC.
And I wouldn't use DbC for Python because
I wouldn't find it helpful for the kind of dynamic, exploratory development
I do in Python. I don't write strict contracts for Python code because in a
dynamically typed, and duck typed, programming language they just don't
make sense to me. Which is not to say I think Design by Contract is bad,
just that it isn't good for Python.
Regards,
Angus
On 23 Sep 2018, at 11:33, Hugh Fisher <hugo....@gmail.com> wrote:
> Could it be that Python has better libraries, is faster to develop for, attracts
> more programmers? If so, I suggest it's worth considering that this might
> be *because* Python doesn't have DbC.
On 24 Sep 2018, at 20:09, Marko Ristin-Kaufmann <marko....@gmail.com> wrote:
Hi Barry,
I think the main issue with pyffel is that it cannot support function calls in general. If I understood it right, and Angus please correct me, you would need to wrap every function that you would call from within the contract.
But the syntax is much nicer than icontract or dpcontracts (see these packages on pypi). What if we renamed "args" argument and "old" argument in those libraries to just "a" and "o", respectively? Maybe that gives readable code without too much noise:
@requires(lambda self, a, o: self.sum == o.sum - a.amount)
def withdraw(amount: int) -> None:
    ...
There is this lambda keyword in front, but it's not too bad?
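One way the short `a` and `o` names could be wired up is sketched below. Everything here is an assumption made for illustration: the decorator is hypothetical (it is not the API of icontract or dpcontracts), it is named `ensures` because the condition is checked after the call, it expects the first parameter to be called `self`, and it takes the "old" state as a shallow copy of the instance.

```python
import copy
import functools
import inspect
from types import SimpleNamespace


def ensures(condition):
    """Hypothetical postcondition decorator: condition(self, a, o) -> bool."""
    def decorator(method):
        signature = inspect.signature(method)

        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            bound = signature.bind(self, *args, **kwargs)
            bound.apply_defaults()
            # `a` exposes the call arguments as attributes (a.amount, ...).
            a = SimpleNamespace(
                **{name: value for name, value in bound.arguments.items()
                   if name != "self"}
            )
            # `o` is a shallow snapshot of the instance before the call.
            o = copy.copy(self)
            result = method(self, *args, **kwargs)
            if not condition(self, a, o):
                raise AssertionError(f"Postcondition violated in {method.__name__}")
            return result

        return wrapper

    return decorator


class Account:
    def __init__(self, total: int) -> None:
        self.sum = total

    @ensures(lambda self, a, o: self.sum == o.sum - a.amount)
    def withdraw(self, amount: int) -> None:
        self.sum -= amount
```

A shallow copy is cheap but only protects attributes that are rebound, not mutable objects changed in place; a real library would need a more careful snapshot strategy.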
Note that this means you cannot use macros in a file that is run directly, as it will not be passed through the import hooks.
You'll lose folks' attention very quickly when you try to tell folks
what they do and don't understand.
Claiming that DbC annotations will improve the documentation of every
single library on PyPI is an extraordinary claim, and such claims
require extraordinary proof.
I can think of many libraries where necessary pre and post conditions
(such as 'self is still locked') are going to be noisy, and at risk of
reducing comprehension if the DbC checks are used to enhance/extend
documentation.
Some of the examples you've been giving would be better expressed with
a more capable type system in my view (e.g. Rust's), but I have no
good idea about adding that into Python :/.
> As soon as you need to document your code, and
> this is what most modules have to do in teams of more than one person
> (especially so if you are developing a library for a wider audience), you
> need to write down the contracts. Please see above where I tried to
> explain that 2-5) are inferior approaches to documenting contracts
> compared to 1).
You left off option 6), plain text. Comments. Docstrings.
2) Write preconditions and postconditions in docstring of the method as human text.
In Python we can write something like:
def foo(x):
    x.bar(y)
What's the type of x? What's the type of y? What is the contract of bar?
Don't know, don't care. x, or y, can be an instance, a class, a module, a
proxy for a remote web service. The only "contract" is that object x will
respond to message bar that takes one argument. Object x, do whatever
you want with it.
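The duck-typing point above can be made concrete with a small sketch. The classes `Greeter` and `Doubler` are hypothetical, and `y` is passed as a parameter here so that the example runs on its own; the only thing `foo` relies on is that `x` responds to `bar` with one argument.

```python
from types import SimpleNamespace


class Greeter:
    def bar(self, y):
        return f"hello {y}"


class Doubler:
    def bar(self, y):
        return y * 2


def foo(x, y):
    # No type is checked: any object with a one-argument `bar` will do.
    return x.bar(y)


# An instance, another unrelated instance, and an ad-hoc namespace whose
# `bar` attribute is simply the built-in `len`.
results = [
    foo(Greeter(), "world"),
    foo(Doubler(), 21),
    foo(SimpleNamespace(bar=len), "abc"),
]
```

An object without `bar` fails only at the point of the call, with an `AttributeError`, which is the trade-off being debated in this thread.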
ndarray.transpose(*axes)
Returns a view of the array with axes transposed.
For a 1-D array, this has no effect. (To change between column and row vectors, first cast the 1-D array into a matrix object.)
For a 2-D array, this is the usual matrix transpose.
For an n-D array, if axes are given, their order indicates how the axes are permuted (see Examples). If axes are not provided and a.shape = (i[0], i[1], ... i[n-2], i[n-1]), then a.transpose().shape = (i[n-1], i[n-2], ... i[1], i[0]).
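The shape rule in this docstring can be written out as a small pure-Python function. This is a sketch of the documented rule only, not NumPy's implementation: `transposed_shape` is a hypothetical helper, and unlike NumPy it does not accept negative axis indices.

```python
from typing import Optional, Sequence, Tuple


def transposed_shape(
    shape: Sequence[int], axes: Optional[Sequence[int]] = None
) -> Tuple[int, ...]:
    """Return the shape a transpose would produce, following the docstring."""
    if axes is None:
        # No axes given: the order of the dimensions is reversed.
        return tuple(reversed(shape))
    if sorted(axes) != list(range(len(shape))):
        raise ValueError("axes must be a permutation of range(len(shape))")
    # Axes given: position j of the result takes dimension axes[j] of the input.
    return tuple(shape[axis] for axis in axes)
```

For a shape of `(2, 3, 4)` this gives `(4, 3, 2)` with no axes and `(3, 2, 4)` for `axes=(1, 0, 2)`; a 1-D shape such as `(5,)` is unchanged, matching the "no effect" sentence.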
As for 4) reading the code, why not? "Use the source, Luke" is now a
programming cliche because it works. It's particularly appropriate for
Python packages which are usually distributed in source form and, as
you yourself noted, easy to read.
I use DbC occasionally to clarify my thoughts during a
refactoring, and then only in the places that continue to make
mistakes. In general, I am not in a domain that benefits from DbC.
Contracts are code: More code means more bugs. Declarative
contracts are succinct, but difficult to debug when wrong; I
believe this because the debugger support for contracts is poor:
there is no way to step through the logic and see the intermediate
reasoning in complex contracts. A contract is an incomplete
duplication of what the code already does: at some level of
complexity I prefer to use a duplicate independent implementation
and compare inputs/outputs.
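The alternative mentioned here, a duplicate independent implementation whose outputs are compared, might look like the sketch below. The two functions and the helper `compare_implementations` are hypothetical examples, not code from any project in this thread.

```python
import random
from typing import List, Optional


def fast_unique_sorted(items: List[int]) -> List[int]:
    """Implementation under test."""
    return sorted(set(items))


def reference_unique_sorted(items: List[int]) -> List[int]:
    """Slow, independent implementation that is easy to step through."""
    result: List[int] = []
    for item in sorted(items):
        if not result or result[-1] != item:
            result.append(item)
    return result


def compare_implementations(trials: int = 200, seed: int = 0) -> Optional[List[int]]:
    """Return the first input on which the two versions disagree, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-5, 5) for _ in range(rng.randint(0, 10))]
        if fast_unique_sorted(data) != reference_unique_sorted(data):
            return data
    return None
```

When the comparison fails, the returned input can be replayed in an ordinary debugger against either implementation, which addresses the debugging concern raised above.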
When you are documenting a method you have the following options:
1) Write preconditions and postconditions formally and include them automatically in the documentation (e.g., by using icontract library).
2) Write preconditions and postconditions in docstring of the method as human text.
3) Write doctests in the docstring of the method.
4) Expect the user to read the actual implementation.
5) Expect the user to read the testing code.
This is again something that eludes me, and I would be really thankful if you could clarify. Please consider, for example, pypackagery (https://pypackagery.readthedocs.io/en/latest/packagery.html) and the documentation of its function resolve_initial_paths:
packagery.resolve_initial_paths(initial_paths)
Resolve the initial paths of the dependency graph by recursively adding *.py files beneath given directories.
Parameters: initial_paths (List[Path]) – initial paths as absolute paths
Return type: List[Path]
Returns: list of initial files (i.e. no directories)
Requires:
    all(pth.is_absolute() for pth in initial_paths)
Ensures:
    len(result) >= len(initial_paths) if initial_paths else result == []
    all(pth.is_absolute() for pth in result)
    all(pth.is_file() for pth in result)
How is this difficult to read,[...]?
Does it work on Windows?
What is is_absolute()? Is "file:///" absolute?
How does this code fail?
What does a permission access problem look like?
Can initial_paths be None?
Can initial_paths be files? Directories?
What are the side effects?
resolve_initial_path() is a piece of code that is better understood by looking at the callers (#7), or by not exposing it publicly (#11). You can also use a different set of abstractions to make the code easier to read:
An extraordinary claim is like "DbC can improve *every single project*
on PyPI". That requires a TON of proof. Obviously we won't quibble if
you can only demonstrate that 99.95% of them can be improved, but you
have to at least show that the bulk of them can.
6) The name of the method
7) How the method is called throughout the codebase
8) observing input and output values during debugging
9) observing input and output values in production
10) relying on convention inside, and outside, the application
11) Don't communicate - Sometimes <complexity>/<num_customers> is too high; code is not repaired, only replaced.
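The questions about Windows and about "file:///" can be explored with the pure path classes in the standard library, which apply each platform's rules without touching the file system. The helper `describe` is a hypothetical name used only for this sketch.

```python
import pathlib
from typing import Dict


def describe(path_text: str) -> Dict[str, bool]:
    """Report is_absolute() under POSIX rules and under Windows rules."""
    return {
        "posix": pathlib.PurePosixPath(path_text).is_absolute(),
        "windows": pathlib.PureWindowsPath(path_text).is_absolute(),
    }


examples = {
    text: describe(text)
    for text in ("/usr/lib", "C:/Users", "file:///", "relative/file.py")
}
```

Under these rules "/usr/lib" is absolute only on POSIX, "C:/Users" is absolute only on Windows, and "file:///" is treated as an ordinary relative path on both, since pathlib does not interpret URL schemes. So a contract that says only `is_absolute()` means different things on different platforms.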
At a high level, I can see the allure of DbC: Programming can be a craft, and a person can derive deep personal satisfaction from perfecting the code they work on. DbC provides you with more decoration, more elaboration, more ornamentation, more control. This is not bad, but I see all your arguments as personal aesthetic sense. DbC is only appealing under certain accounting rules. Please consider the possibility that "the best code" is: low $$$, buggy, full of tangles, and mostly gets the job done. :)