PEP x: Static module/package inspection


anatoly techtonik

Dec 27, 2011, 5:47:09 AM
to python...@googlegroups.com
(posting this from Google Group in hope it will get to Mailing List successfully)

Static module/package inspection

Abstract:
 - static: without execution (as opposed to dynamic)
 - module/package: .py or __init__.py file
 - inspection: get an overview of the contents

What should this do?

The proposal is to add a mechanism to the Python interpreter to get an outline of a module's or package's contents without importing or executing it. The outline includes the names of classes, functions, and variables. It should also contain values for variables that can be provided without sophisticated calculation (e.g. a string or an integer, but probably not expressions, as evaluating them may lead to security leaks).
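A minimal sketch of what such an outline could look like, built on the existing `ast` module. The function name `module_outline`, the `DYNAMIC` marker, and the shape of the returned dictionary are assumptions made for illustration, not part of any proposed API. Values are recovered with `ast.literal_eval`, which accepts only literals, so nothing from the inspected module is executed.

```python
import ast


class _Dynamic:
    """Marker for values that are not plain literals (hypothetical helper)."""

    def __repr__(self):
        return "<dynamic>"


DYNAMIC = _Dynamic()


def module_outline(source):
    """Return the top-level names found in `source` without executing it."""
    outline = {"classes": [], "functions": [], "variables": {}}
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            outline["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            outline["functions"].append(node.name)
        elif isinstance(node, ast.Assign):
            try:
                # literal_eval only accepts literals, so no code is run here
                value = ast.literal_eval(node.value)
            except (ValueError, TypeError):
                value = DYNAMIC  # an expression: record the name only
            for target in node.targets:
                if isinstance(target, ast.Name):
                    outline["variables"][target.id] = value
    return outline


SAMPLE = '''
__version__ = "1.0"
DEBUG = False
HOME = os.getcwd()

class Plugin:
    pass

def main():
    pass
'''

print(module_outline(SAMPLE))
```

Only module-level statements are inspected here; names defined inside `if` blocks or `try` blocks would need a deeper walk.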

Why?

user story PEPx.001:
As a Python package maintainer, I find it bothersome to repeatedly write boilerplate code (e.g. setup.py) to package my single-file module. The reason I have to write setup.py is to provide version and description info. This info is already available in my module's source code. So I need either to copy/paste the info from the module manually, or to import (and hence execute) my module during packaging and installation, which I don't want to do either, because modules are often installed with root privileges.

With this PEP, a packaging tool will be able to extract meta information from my module without executing it and without me manually copying version fields into some 'package configuration file'.

user story PEPx.002:
As a Python application developer, I find it really complicated to provide a plugin extension subsystem for my users. Users need a mechanism to switch between different versions of a plugin, and this mechanism is usually provided by an external tool such as setuptools, which manages and installs multiple versions of plugins in a local Python package repository. It is rather hard to create an alternative approach, because you are forced to maintain external meta-data about your plugin modules even when it is already available inside the module.

With this PEP, a Python application will be able to inspect meta-data embedded inside plugins before choosing which version to load. This will also provide a standard mechanism for applications to check modules returned by packaging tools without executing them. This will greatly simplify writing and debugging custom plugin loaders on different platforms.


Feedback goal
At this stage I'd like to get a community response to two separate questions:
1. Whether people feel this functionality will be useful for Python
2. Whether the solution is technically feasible

anatoly techtonik

Dec 28, 2011, 5:15:24 AM
to python...@googlegroups.com
(reposting this from Google Group once more as the previous post missed Mailing List, because I was not subscribed in Mailman)

Michael Foord

Dec 28, 2011, 10:28:44 AM
to Python-Ideas


On a simple level, all of this is already "obtainable" by using the ast module that can parse Python code. I would love to see a "python-object" layer on top of this that will take an ast for a module (or other object) and return something that represents the same object as the ast.

So all module-level objects will have corresponding objects. Where they are Python objects (builtin literals), they will be represented exactly. For classes and functions you'll get an object back that has the same attributes plus some metadata (e.g. for functions / methods, what arguments they take, etc.).

That is certainly doable and would make introspecting-without-executing a lot simpler.
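As a rough illustration of the function-metadata part of such a layer, the sketch below reads parameter names and docstrings straight from the AST. `describe_functions` and the dictionary layout are hypothetical names chosen for this example; a real "python-object" layer would presumably return richer objects.

```python
import ast


def describe_functions(source):
    """Map each top-level function name to metadata read from its AST node."""
    described = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            described[node.name] = {
                # regular positional-or-keyword parameters only
                "args": [arg.arg for arg in node.args.args],
                "vararg": node.args.vararg.arg if node.args.vararg else None,
                "kwarg": node.args.kwarg.arg if node.args.kwarg else None,
                "doc": ast.get_docstring(node),
            }
    return described


SAMPLE = '''
def load(path, mode="r", *extra, **opts):
    "Load a plugin."
'''

print(describe_functions(SAMPLE))
```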

I think your specific use cases are better served by adding functionality to the packaging (distutils2) package, however. I'd particularly like to see plugin support in packaging (a cut-down version of setuptools entry points).

All the best,

Michael
 

_______________________________________________
Python-ideas mailing list
Python...@python.org
http://mail.python.org/mailman/listinfo/python-ideas




--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

Nathan Rice

Dec 28, 2011, 4:18:38 PM
to python-ideas
On Wed, Dec 28, 2011 at 5:15 AM, anatoly techtonik <tech...@gmail.com> wrote:
> user story PEPx.001:
> As a Python package maintainer, I find it bothersome to repeatedly write
> boilerplate code (e.g. setup.py) to package my single file module. The
> reason I should write setup.py is to provide version and description info.
> This info is already available in my module source code. So I need to either
> copy/paste the info from the module manually, or to import (and hence
> execute) my module during packaging and installation, which I don't want
> either, because modules are often installed with root privileges.

I agree this is a pain. I also agree with Michael that this is more of
a packaging issue. Part of the problem is that I don't believe there
is a strong enough convention around writing modules so that they are
accessible to packaging tools. If there were a PEP on module metadata
for packaging tools to use for introspection, that might motivate
package tool authors to support automated packaging :) *HINT HINT*
Sphinx could take advantage of some of it too.

> With this PEP, a packaging tool will be able to extract meta information from my
> module without executing it or without me manually copying version fields
> into some 'package configuration file'.
>
> user story PEPx.002:
> As a Python Application developer, I find it really complicated to provide
> plugin extension subsystem for my users. Users need a mechanism to switch
> between different versions of the plugin, and this mechanism is usually
> provided by external tool such as setuptools to manage and install multiple
> versions of plugins in local Python package repository. It is rather hard to
> create an alternative approach, because you are forced to maintain external
> meta-data about your plugin modules even in case it is already available
> inside the module.

See above. Maintaining the same information twice is definitely a bad
thing, but we already have the ability to do everything required.
What is missing is good, strong conventions on module metadata
annotation that tool creators write to.

> With this PEP, Python Application will be able to inspect meta-data embedded
> inside of plugins before choosing which version to load. This will also
> provide a standard mechanism for applications to check modules returned by
> packaging tools without executing them. This will greatly simplify writing
> and debugging custom plugins loaders on different platforms.

Having more nuanced import behavior is something I can get behind.
Sure, I can wrap an import in a try/except, and check __version__
if it is defined (after determining if it is a string/tuple/etc., and
possibly parsing it), but more nuanced behavior would certainly be
nice. Being able to specify a version in the import line (and have
multiple versions installed), being able to get finer-grained exceptions
than ImportError (ParseError, anyone?), not having to worry that the
same file is being imported twice, that sort of stuff.
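For reference, the manual approach described above might look roughly like the sketch below. The helper name `import_with_min_version` and its version-parsing rules are assumptions for illustration; note that, unlike static inspection, this does import (and therefore execute) the module.

```python
import importlib


def import_with_min_version(name, minimum):
    """Import `name`; return it only if __version__ is at least `minimum`."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", None)
    if isinstance(version, str):
        # naive parse: keep the leading numeric parts, "1.2.3" -> (1, 2, 3)
        parts = []
        for piece in version.split("."):
            if not piece.isdigit():
                break
            parts.append(int(piece))
        version = tuple(parts)
    # reject a missing version or anything that is not a tuple of ints
    if not (isinstance(version, tuple)
            and all(isinstance(part, int) for part in version)):
        return None
    if version < tuple(minimum):
        return None
    return module
```

A caller would write something like `import_with_min_version("someplugin", (1, 2))`, where `someplugin` is a placeholder module name, and treat `None` as "not installed or too old".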

I'm +1 on getting a module-level metadata conventions PEP draft started.
I'm also +1 on taking a look at import behavior (though that is
tangential here).

Nathan

anatoly techtonik

Feb 2, 2012, 3:41:39 AM
to python-ideas
A rather user-friendly proof of concept using the `ast` module is ready.
http://pypi.python.org/pypi/astdump/

`astdump` contains a get_top_vars() function, which extracts sufficient information from a module's AST to generate setup.py for itself. This capability can already be reused for plugin version discovery mechanisms. ISTM a working library should motivate authors better than a PEP convention. =)

`astdump` doesn't provide complete module introspection capabilities. I've primarily focused on getting the output done, so for a proper API it would be nice to study use case examples first. `astdump` contains a tree walker that can filter by node type and level. What the "python-object" layer should expose, and how to make it convenient, is not completely clear to me.
--
anatoly t.

Nick Coghlan

Feb 2, 2012, 8:35:10 AM
to Michael Foord, Python-Ideas
On Thu, Dec 29, 2011 at 1:28 AM, Michael Foord <fuzz...@gmail.com> wrote:
> On a simple level, all of this is already "obtainable" by using the ast
> module that can parse Python code. I would love to see a "python-object"
> layer on top of this that will take an ast for a module (or other object)
> and return something that represents the same object as the ast.
>
> So all module level objects will have corresponding objects - where they are
> Python objects (builtin-literals) then they will be represented exactly. For
> classes and functions you'll get an object back that has the same attributes
> plus some metadata (e.g. for functions /  methods what arguments they take
> etc).
>
> That is certainly doable and would make introspecting-without-executing a
> lot simpler.

The existing 'pyclbr' (class browser) module in the stdlib also attempts
to play in this same space. I wouldn't say it does it particularly
*well* (since it's easy to confuse it with valid Python constructs), but
it tries.
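A small sketch of what the class browser reports, using `pyclbr.readmodule_ex` on a throwaway module written to a temporary directory. The module name `sample_plugin` and its contents are made up for the example. The module is scanned, not imported, and only classes and functions show up in the result.

```python
import os
import pyclbr
import tempfile

SOURCE = '''
__version__ = "1.0"

class Plugin:
    def run(self):
        pass

def main():
    pass
'''

with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "sample_plugin.py"), "w") as handle:
        handle.write(SOURCE)
    # readmodule_ex reads the source text; it does not import the module
    found = pyclbr.readmodule_ex("sample_plugin", [folder])

# maps names to pyclbr descriptor objects for classes and functions
print(sorted(found))
print(sorted(found["Plugin"].methods))
```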

Cheers,
Nick.

--
Nick Coghlan   |   ncog...@gmail.com   |   Brisbane, Australia

anatoly techtonik

Mar 29, 2012, 2:19:51 PM
to python...@googlegroups.com, Python-Ideas
On Thu, Feb 2, 2012 at 4:35 PM, Nick Coghlan <ncog...@gmail.com> wrote:
> On Thu, Dec 29, 2011 at 1:28 AM, Michael Foord <fuzz...@gmail.com> wrote:
>> On a simple level, all of this is already "obtainable" by using the ast
>> module that can parse Python code. I would love to see a "python-object"
>> layer on top of this that will take an ast for a module (or other object)
>> and return something that represents the same object as the ast.
>>
>> So all module level objects will have corresponding objects - where they are
>> Python objects (builtin-literals) then they will be represented exactly. For
>> classes and functions you'll get an object back that has the same attributes
>> plus some metadata (e.g. for functions /  methods what arguments they take
>> etc).
>>
>> That is certainly doable and would make introspecting-without-executing a
>> lot simpler.
>
> The existing 'pyclbr' (class browser) module in the stdlib also attempts
> to play in this same space. I wouldn't say it does it particularly
> *well* (since it's easy to confuse it with valid Python constructs), but
> it tries.

Unfortunately, pyclbr (http://docs.python.org/library/pyclbr.html)
provides no information about variables.

In the meantime I've patched my `astdump` module even further:
- the function to query top-level variables was renamed from
get_top_vars() to top_level_vars(), and it now accepts a filename as a
parameter. This makes it even more convenient to use for generating
`setup.py` for simple modules. A sample `setup.py` generator is included.
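To show the general idea of such a generator, here is a sketch written directly against the stdlib `ast` module. It is not `astdump`'s code or API: `top_level_literals` and `render_setup` are hypothetical names, and the choice of `__version__` plus the first docstring line as metadata is an assumption.

```python
import ast


def top_level_literals(source):
    """Collect simple `NAME = literal` assignments at module level."""
    found = {}
    for node in ast.parse(source).body:
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            try:
                found[node.targets[0].id] = ast.literal_eval(node.value)
            except (ValueError, TypeError):
                pass  # not a literal; skip rather than evaluate it
    return found


def render_setup(module_name, source):
    """Return setup.py text built from metadata found in `source`."""
    meta = top_level_literals(source)
    docstring = ast.get_docstring(ast.parse(source)) or ""
    summary = docstring.split("\n")[0]  # first line of the module docstring
    version = meta.get("__version__", "0.0")
    lines = [
        "from setuptools import setup",
        "",
        "setup(",
        f"    name={module_name!r},",
        f"    version={version!r},",
        f"    description={summary!r},",
        f"    py_modules=[{module_name!r}],",
        ")",
    ]
    return "\n".join(lines) + "\n"


SAMPLE = '"""Dump AST.\n\nLonger text."""\n__version__ = "1.0"\n'
print(render_setup("mymod", SAMPLE))
```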

http://pypi.python.org/pypi/astdump/1.0
--
anatoly t.
