Farewell to the static type checking project?

Edward K. Ream

Nov 29, 2012, 11:37:45 AM
to leo-e...@googlegroups.com
It's beginning to look as if the stc project will "succeed" in a different way than I originally imagined :-)  Indeed, it has been slowly dawning on me that Python already *has* two superb type checkers.  They are called pylint and rope!  That being so, the best way to move forward on the stc project may be to encourage as many people as possible to flattr (or otherwise contribute to) the developers of pylint and rope...

I'm not sure how much more time I will devote to the stc project.  This post may form the basis for a "farewell" post on the stc group.  But I'll probably continue to work on the stc project long enough to answer questions about how rope and pylint do what they do.  Make no mistake, I am still intensely curious about rope and pylint, perhaps more so now that I have done so much studying without gaining real understanding.

I might be a bit more embarrassed by these developments were it not that Guido himself seems not to have understood pylint and rope.  Anyway, I have enjoyed my work on stc. I have gained lots of experience with Python's excellent (if slightly flawed) parser tools.  The tricks in rope's ast.py module are truly clever. I have been able to study TypeScript, a great static analyzer for JavaScript.

Most importantly, I have developed several tools and techniques for analyzing other Python programs. The most important technique is the dictum,

    "the only way to understand a computer program is to run it."

This dictum has led directly to several new tools:

- An improved recursive import script in scripts.leo.  This script now has an option to create @edit nodes rather than @file nodes.  While not as clear as @file nodes, @edit nodes are a fast way of gaining access to files for the purposes, say, of inserting calls to pdb.set_trace.

- The g.SherlockTracer class.  It's amusing to see how much simpler it is to do in Python than in C.
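
The core idea behind a tracer of this kind can be sketched with Python's standard sys.settrace hook.  This is only an illustration, not the actual g.SherlockTracer code; the names trace_calls, helper and main are made up for the example:

```python
import sys

def trace_calls(func, *args, **kwargs):
    """Run func and return (result, names of the Python functions it called)."""
    calls = []

    def tracer(frame, event, arg):
        # The global trace function sees a 'call' event for each new scope.
        if event == "call":
            calls.append(frame.f_code.co_name)
        return None  # do not trace individual lines inside the new scope

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(old)  # always restore the previous trace function
    return result, calls

def helper(n):
    return n + 1

def main():
    return helper(1) + helper(2)

result, calls = trace_calls(main)
print(result, calls)
```

Running the program, rather than reading it, is what produces the list of calls; a real tracer would add filtering by module or function name.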

Having said all this, there is a sense of relief that I may soon be working on Leo "full time".  The list of good ideas worth doing seems endless. Here are some likely improvements:

1.  Integrate rope into Leo :-)  I have spent the last two weeks studying rope intensely.  I may not understand rope's type inference mechanism completely, but I do know that it works.  More importantly, I have a great deal of respect for rope's infrastructure; it provides what TypeScript calls the "harness" that allows an editor to interact with rope, and vice versa.  Most of rope's code consists of this harness, and for good reason:  the nitty gritty details are inherently complex.

I am convinced that it will be possible to drive rope from Leo.  Indeed, Leo's outline structure may make it possible to do some things to speed up what looks like an inherently time-consuming type inference process.  This is not a one-day, or even one-week project.  But it's worth doing.

2. More flexible unit testing.  Leo needs better ways of managing unit tests in external files, and of using custom subclasses of unittest.TestCase.  At present, the Leonine idiom that substitutes for subclassing is::

    exec(g.findTestScript(c,'@common <name of common code>'))

This sorta works for all-Leo situations, but it won't work when test cases are in separate files.  I have some ideas for improvement, and some mostly-failed experiments from the stc project.  I expect some good things to happen fairly soon.

3.  There have been some interesting requests lately for improvements to Leo.  Two that come to mind are a) easier ways to execute scripts in a separate process and b) a way to parameterize template nodes.  I'd also like easier ways to parameterize commands.  I am sure that Leo's users will continue to suggest improvements that I would never have imagined.

So that's it.  I expect a period of transition away from stc.  It may take weeks or even months, but I am eager to continue to improve Leo.

Edward

F.S.

Nov 29, 2012, 3:48:42 PM
to leo-e...@googlegroups.com
Edward,
The stc project was probably a Sisyphean task in the first place, esp as pointed out by Guido that to be useful it needs to do the hard parts well not just the easy parts on a restricted language subset. Any plan to apply the insights you've gained from studying the subject to creating a better semantic code browser/autocompleter for Python? Unlike the stc you can let the programmer deal with ambiguities in the case of autocompletion/code tips. By semantic I mean it is not just based on regex tag matching. I have not really used rope as I recall it may have performance issues? I use vim/cscope/pycscope combo to do code browsing for large code bases. It is very fast but unfortunately it is not complete and depends on tag matching that tends to produce too many answers or not enough at the same time. To see why, consider the following code snippet:

    x = MyClass()
    y = x.my_method()

The tag based approach will give you every method with the name my_method even though it is clear what you are looking for is MyClass.my_method. And it may not give you anything if you want to look up x (or give you all assignments to x whether in scope or not.)
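
As a rough illustration of what a "semantic" lookup could mean for this snippet, here is a minimal sketch using Python's standard ast module. It is not how rope or pycscope work, and it only handles module-level assignments of the form `var = Class()`. The function name resolve_method_calls and the second class Other are made up for the example:

```python
import ast

SOURCE = '''
class MyClass:
    def my_method(self):
        return 1

class Other:
    def my_method(self):
        return 2

x = MyClass()
y = x.my_method()
'''

def resolve_method_calls(source):
    """Map each `var.method()` call to 'Class.method' when var's class is known."""
    tree = ast.parse(source)
    # Methods defined directly in each class body.
    methods = {
        node.name: {n.name for n in node.body if isinstance(n, ast.FunctionDef)}
        for node in ast.walk(tree) if isinstance(node, ast.ClassDef)
    }
    var_types = {}  # variable name -> class name
    resolved = []
    for node in tree.body:  # module-level statements, in order
        if not isinstance(node, ast.Assign) or not isinstance(node.value, ast.Call):
            continue
        func = node.value.func
        if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            # `var.method()`: resolve through the class recorded for var.
            cls = var_types.get(func.value.id)
            if cls and func.attr in methods[cls]:
                resolved.append(f"{cls}.{func.attr}")
        elif isinstance(func, ast.Name) and func.id in methods:
            # `var = KnownClass()`: remember the class of var.
            for target in node.targets:
                if isinstance(target, ast.Name):
                    var_types[target.id] = func.id
    return resolved

print(resolve_method_calls(SOURCE))
```

A tag-based tool would report both MyClass.my_method and Other.my_method here; tracking the assignment to x narrows the answer to one.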

F.S.

Nov 29, 2012, 4:06:56 PM
to leo-e...@googlegroups.com

It seems to me the most profitable approach for an IDE for a dynamic language is to create more intuitive ways for programmers to interact with the code. For example the Light Table work: http://www.chris-granger.com/2012/04/12/light-table---a-new-ide-concept/

On Thursday, November 29, 2012 12:48:42 PM UTC-8, F.S. wrote:
Edward,
Any plan to apply the insights you've gained from studying the subject to creating a better semantic code browser/autocompleter for Python? Unlike the stc you can let the programmer deal with ambiguities in the case of autocompletion/code tips. By semantic I mean it is not just based on regex tag matching. ... I use vim/cscope/pycscope combo to do code browsing for large code bases. It is very fast but unfortunately it is not complete and depends on tag matching that tends to produce too many answers or not enough at the same time. 

Edward K. Ream

Dec 1, 2012, 5:56:54 AM
to leo-e...@googlegroups.com
On Thu, Nov 29, 2012 at 2:48 PM, F.S. <speec...@gmail.com> wrote:
Edward,
The stc project was probably a Sisyphean task in the first place, esp as pointed out by Guido that to be useful it needs to do the hard parts well not just the easy parts on a restricted language subset.

The question of goals for the stc is an important one, and hard to pin down.  Clearly, TypeScript, rope and pylint succeed in their particular goals.  I would be happy with a general framework that did one or more of the following::

- Fast, accurate completions and type annotations as in TypeScript.
- Refactorings, as in Rope.
- Fast type checking, as in Pylint, but possibly with helps or other type assertions.

All of these are clearly possible, so the task is not "Sisyphean".

Any plan to apply the insights you've gained from studying the subject to creating a better semantic code browser/autocompleter for Python?

Leo's autocompleter isn't all that bad, but yes, if I had a better type inference engine I would certainly use it.

Edward

Edward K. Ream

Dec 1, 2012, 6:11:17 AM
to leo-e...@googlegroups.com
On Thu, Nov 29, 2012 at 10:37 AM, Edward K. Ream <edre...@gmail.com> wrote:

I'm not sure how much more time I will devote to the stc project.  This post may form the basis for a "farewell" post on the stc group.

After writing this post I realized that it would be unbearable to stop now, without understanding *exactly* how rope works, and without more experimentation with other ideas. In particular, I want to investigate TypeScript's notion that some critical mass of cached (permanent) type info can make a type system "light up", to use a phrase that occurs in several TypeScript videos.

Indeed, it seems that caching is the heart of the efficiency problem.  Type computations aren't really that hard; it's just that there are so many of them.  TypeScript leads the way here because JavaScript scripts typically use a smallish set of large libraries, so pre-analyzing those libraries will pay off for many people.

Similarly, a tool could pre-analyze Python's libraries.  Note that ts allows "polymorphic" descriptions of js functions, even though js doesn't support polymorphism.  That is, ts allows descriptions of the form::

    f(x) returns a string if x is a string
    f(x) returns an int if x is an int.

This kind of description would likely suffice to describe all (or nearly all) parts of the Python library.  Such a description might allow very fast type checking of applications.
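
One way to picture such descriptions is as a lookup table from argument types to return types.  The following is a minimal sketch under that assumption; the table SIGNATURES and the function return_type are hypothetical and do not correspond to anything in TypeScript, rope or pylint:

```python
# Hypothetical signature table: (function name, argument types) -> return type.
SIGNATURES = {
    ("f", (str,)): str,  # f(x) returns a string if x is a string
    ("f", (int,)): int,  # f(x) returns an int if x is an int
}

def return_type(name, arg_types):
    """Return the described return type of a call, or None if nothing matches."""
    return SIGNATURES.get((name, tuple(arg_types)))

print(return_type("f", [str]))
print(return_type("f", [int]))
print(return_type("f", [float]))
```

With a table like this, checking a call is a dictionary lookup rather than an analysis of the function's body.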

To repeat, caching seems to be the key.  Pylint doesn't do caching; rope does caching in a difficult-to-understand way that doesn't use any permanent, pre-analyzed data.  Thus, there would seem to be room for improving both programs.
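
To make the idea of permanent, pre-analyzed data concrete, here is a minimal sketch of an on-disk cache keyed by a hash of the source text.  Everything in it is an assumption made for illustration: analyze is a trivial stand-in for a real (expensive) analysis, and the JSON file layout is invented; neither rope nor pylint is claimed to work this way:

```python
import hashlib
import json
import os
import tempfile

def analyze(source):
    """Stand-in for an expensive analysis: count lines that start a function."""
    return {"functions": sum(1 for line in source.splitlines()
                             if line.lstrip().startswith("def "))}

def cached_analyze(source, cache_path):
    """Return (result, hit), reusing a result stored on disk when available."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            cache = json.load(f)
    if key in cache:
        return cache[key], True  # cache hit: no analysis needed
    result = analyze(source)
    cache[key] = result
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(cache, f)
    return result, False  # cache miss: result computed and stored

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "type_cache.json")
    src = "def a():\n    pass\n\ndef b():\n    pass\n"
    first = cached_analyze(src, path)
    second = cached_analyze(src, path)
print(first, second)
```

Because the key depends only on the source text, an unchanged library file never needs to be analyzed twice, even across sessions.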

Finally, I would not be happy simply abandoning all the work I have done so far.  I might put it on the shelf while working on Leo, but I'm likely to keep working on it.

Edward