
simple import hook


Andrea Crotti

Nov 10, 2011, 10:44:29 AM
to pytho...@python.org
So I would really like to accomplish the following:
run a program normally and keep track of all the imports that were
actually done.

I studied PEP 302, but I'm still a bit confused about how to do it.

I thought that instead of implementing everything I could just record
the request and then delegate to the "imp" module, so I did this:

class MyLoader(object):
    """
    Loader object
    """

    def __init__(self):
        self.loaded = set()

    def find_module(self, module_name, package=None):
        print("requesting %s" % module_name)
        self.loaded.add(module_name)
        return self

    def load_module(self, fullname):
        #XXX: the find_module is actually doing nothing, since
        # everything is delegated to the "imp" module
        fp, pathname, stuff = imp.find_module(fullname)
        imp.load_module(fullname, fp, pathname, stuff)

myl = MyLoader()
sys.meta_path.append(myl)
try:
    import random
    import os
    print(random.random())



Which doesn't work, and very strangely it doesn't even look deterministic!
Sometimes it stops at the first import, sometimes it's able to do a few
of them. How can that be?

And how could I solve my problem?

Andrea Crotti

Nov 10, 2011, 12:46:07 PM
to Eric Snow, pytho...@python.org
On 11/10/2011 05:02 PM, Eric Snow wrote:
>
> Yeah, I'm working on a reference for imports in Python. They're just
> a little too mysterious relative to the rest of the language. But
> it's not too helpful yet. In the meantime...
Yes, it's quite mysterious, and it's actually not as hard as it looks.
Anyway, I'm glad to say that I basically got what I wanted.

This script actually compiles the code and runs it, reporting the
imports done in the end.
Any suggestion is still welcome :)

"""
This script is used to analyse the imports which are actually being done
"""
import argparse
import os
import sys

class CollectImports(object):
    """
    Loader object
    """

    def __init__(self):
        self.loaded = set()

    def __str__(self):
        return str(self.loaded)

    def find_module(self, module_name, package=None):
        print("requesting %s" % module_name)
        self.loaded.add(module_name)
        # implicitly returns None, so the normal import machinery
        # goes on to do the actual import


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='analyse the imports made')
    parser.add_argument('script')
    parser.add_argument('arguments', nargs='*')

    ns = parser.parse_args()

    progname = ns.script
    sys.path.insert(0, os.path.dirname(progname))

    cl = CollectImports()
    # TODO: maybe we can create a with clause also for this thing
    sys.meta_path.append(cl)

    # TODO: catch the exit signal and present the output
    code = compile(open(progname).read(), progname, 'exec')
    exec(code)
    print("imports done: %s" % str(cl))
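The two TODOs in the script above (a with clause, and still reporting when the program exits) could be handled roughly as follows. This is only a sketch for Python 3, not the code that was actually run in this thread: the names ImportRecorder and recording_imports are made up for illustration, find_spec() takes the place of the older find_module(), and the finder is inserted at the front of sys.meta_path because Python 3 already keeps its default finders on that list (on 2.7 it started out empty, so append() was enough). The colorsys import is just a stand-in for the analysed script.

```python
import contextlib
import sys


class ImportRecorder:
    """Meta path finder that only records names and never loads anything."""

    def __init__(self):
        self.loaded = set()

    def find_spec(self, fullname, path=None, target=None):
        self.loaded.add(fullname)
        # Returning None hands the request to the next finder on
        # sys.meta_path, i.e. the regular import machinery.
        return None


@contextlib.contextmanager
def recording_imports():
    recorder = ImportRecorder()
    sys.meta_path.insert(0, recorder)
    try:
        yield recorder
    finally:
        # Runs even when the body raises, so the hook is always removed.
        sys.meta_path.remove(recorder)


with recording_imports() as rec:
    # Drop any cached copy so the import really goes through the finders.
    sys.modules.pop("colorsys", None)
    try:
        import colorsys
        raise SystemExit(0)  # stand-in for the script calling sys.exit()
    except SystemExit:
        pass  # swallow the exit so the report below is still printed

print("imports done: %s" % sorted(rec.loaded))
```

Modules that are already in sys.modules never reach the finders, so this only reports imports that actually had to be resolved while the block was running.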

> That _is_ pretty strange.
>
> After what I recommended above, are you still getting the weirdness?
> It could just be a side-effect of your use of the imp functions in
> load_module(), so getting rid of it would help.
>
> What version of Python are you using? If not 2.7 or 3.2, do you get
> the same problem when you run the code under one of those latest
> versions?
>
> Even when the number of imports is different, are they always in the
> same order? Highly unlikely, but if you say no then this is extra
> fishy.
>

It was Python 2.7 on Arch Linux, but thanks to your suggestions
everything was actually fixed.

DevPlayer

Nov 15, 2011, 2:39:51 AM
An alternative approach:
http://pastebin.com/z6pNqFYE

or:

# devp...@gmail.com
# 2011-Nov-15

# recordimports.py

# my Import Hook Hack in response to:
# http://groups.google.com/group/comp.lang.python/browse_thread/thread/5a5d5c724f142eb5?hl=en
# as an initial learning exercise

# This code needs to come before any imports you want recorded,
# usually just once in the initial (main) module.
# Of course you can exclude the if __name__ == '__main__': demo code.

# barely tested:
#   only tested with modules that had no errors on import
#   did not need/use/expect/try reload()
#   ran with Python 2.7 on Win32
#   create two fake modules moda.py and modb.py and stick some imports
#   in them

'''
Excerpt from PEP 302 -- New Import Hooks
...
Motivation:
    - __import__ gets called even for modules that are already in
      sys.modules, which is almost never what you want, unless you're
      writing some sort of monitoring tool.

Note the last two words.'''


# =======================================================================
# place to save Collected imports
imported = []

# save old __builtins__.__import__()
__builtins__.__dict__['__old_import__'] = __builtins__.__dict__['__import__']

# match __builtins__.__import__() function signature
def __myimport(name, globals={}, locals={}, fromlist=[], level=-1):
    global imported

    # I don't know why this works.
    __name__ = locals['__name__']
    __file__ = locals['__file__']
    __package__ = locals['__package__']
    __doc__ = locals['__doc__']

    # call original __import__
    module = __builtins__.__old_import__(name, globals, locals,
                                         fromlist, level)

    # save import module name into namespace
    __builtins__.__dict__[name] = module

    tag = (name, __name__, __file__, __package__, module)

    # do not append duplicates
    if tag not in imported:
        imported.append( tag )

    return module

# store the new __import__ into __builtins__
__builtins__.__dict__['__import__'] = __myimport

# erase unneeded func name
del __myimport
# =======================================================================


# demo
if __name__ == '__main__':
    # import some random packages/modules
    import sys
    import moda # a test module that does some other imports
    import modb # a test module that does some imports
    from pprint import pprint

    # imported has name, __name__, __file__, __package__
    # show each import
    for n, __n, __f, __p, m in imported:
        print n
        print ' ', __n
        print ' ', __f
        print ' ', __p
        print
    del n, __n, __f, __p, m

    print 'recordimports.py'
    pprint(dir(), None, 4, 1)
    print

    print 'moda.py'
    pprint(dir(moda), None, 4, 1)
    print

    print 'modb.py'
    pprint(dir(modb), None, 4, 1)

    # print imported
    print

DevPlayer

Nov 15, 2011, 12:56:03 PM
And for an insane amount of what REALLY gets imported:

python.exe -B -s -S -v -v -v -v your_module_here.py %1 %2 %3 %4 %5 %6 %7 %8 %9

or more appropriately:

python -v -v -v -v your_module_here.py


And for completeness, if you feel those options aren't telling the
whole truth, you can turn on your operating system's file auditing for
opens and reads; I forget if there are other places that can be
imported from besides *.py, *.pyo, *.pyc, *.pyd files too (maybe
zips?)
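On the question of what else can be imported from: on Python 3 the interpreter can report this itself. The sketch below just prints what the running interpreter is configured with; the exact lists vary by platform and version, so take it as a way to look things up rather than a definitive answer.

```python
import importlib.machinery
import sys

# File suffixes the default path-based finder recognises on this interpreter.
print("source:   ", importlib.machinery.SOURCE_SUFFIXES)
print("bytecode: ", importlib.machinery.BYTECODE_SUFFIXES)
print("extension:", importlib.machinery.EXTENSION_SUFFIXES)

# Path hooks decide how each sys.path entry is searched; zip archives are
# normally handled by the zipimport hook listed here.
for hook in sys.path_hooks:
    print("path hook:", hook)

# Built-in modules are compiled into the interpreter, not loaded from files,
# so file auditing will never see them.
print("built-in modules:", len(sys.builtin_module_names))
```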