
Trouble importing / using a custom text codec


Jurie Horneman

Feb 28, 2002, 4:52:09 AM
I'm having an irritating problem with a text codec I've written.

- The codec is in site-packages/MBCS_codecs/cp932.py.

- site-packages/MBCS_codecs contains an empty __init__.py.

- from MBCS_codecs import cp932 works fine.

- If I put cp932.py in the same place as a script that decodes a text
using encoding "cp932", there is no problem.

However, I want to have all my encodings somewhere else...

- With cp932.py in site-packages as described above, if I decode a
text using encoding "MBCS_codecs.cp932", I get the error message
"LookupError: unknown encoding". This happens even if I import cp932
by hand before decoding the text.

- The code that fails is in encodings/__init__.py :

    modname = encoding.replace('-', '_')
    modname = aliases.aliases.get(modname,modname)
    try:
        mod = __import__(modname,globals(),locals(),'*')
    except ImportError,why:
        # cache misses
        _cache[encoding] = None
        return None

- Executing the following code in one of my own scripts works:

    modname = "MBCS_codecs.cp932"
    modname = modname.replace('-', '_')
    try:
        mod = __import__(modname,globals(),locals(),'*')
    except ImportError,why:
        print "FAILED"

i.e. the module is imported.

- The aliases in encodings/__init__.py do not modify the module name.

In other words, it appears that the same code works or fails
depending on where it is executed.

I cannot explain this behavior. Any help is appreciated.

Jurie Horneman

Martin von Loewis

Feb 28, 2002, 10:17:06 AM
jhor...@pobox.com (Jurie Horneman) writes:

> - With cp932.py in site-packages as described above, if I decode a
> text using encoding "MBCS_codecs.cp932", I get the error message
> "LookupError: unknown encoding". This happens even if I import cp932
> by hand before decoding the text.

I recommend a different strategy. Aim for allowing "cp932" as an
encoding name. To achieve this, put the following (or something like
this) into MBCS_codecs/__init__.py

    import codecs

    def search_mbcs(encoding):
        # Only claim the names this package provides.
        if encoding == "cp932":
            import cp932
            return cp932.getregentry()
        return None

    codecs.register(search_mbcs)

If you want this to happen at startup time of Python, just add a
MBCS_codecs.pth file in site-packages, which reads

import MBCS_codecs

HTH,
Martin
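
A minimal sketch of the search-function approach described above, written in Python 3 syntax rather than the Python 2 of the thread. The alias "my_cp932" is hypothetical, and because the custom cp932 module from the thread is not available here, the sketch delegates to the cp932 codec that ships with Python 3 as a stand-in for cp932.getregentry().

```python
import codecs

# Hypothetical encoding name used only for this sketch; not from the thread.
ALIAS = "my_cp932"


def search_mbcs(encoding):
    """Return a CodecInfo for names this package handles, else None."""
    if encoding == ALIAS:
        # Stand-in for "import cp932; return cp932.getregentry()":
        # reuse the cp932 codec bundled with Python 3.
        return codecs.lookup("cp932")
    # Returning None lets other registered search functions try.
    return None


codecs.register(search_mbcs)

# Round-trip three Japanese characters (written as escapes to keep the
# source ASCII-only).
text = "\u65e5\u672c\u8a9e"
encoded = text.encode(ALIAS)
decoded = encoded.decode(ALIAS)
```

Once the search function is registered, the alias can be used anywhere an encoding name is accepted, without the caller knowing which package provides the codec.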

Jurie Horneman

Mar 1, 2002, 8:08:40 AM
Martin von Loewis <loe...@informatik.hu-berlin.de> wrote in message news:<j4r8n5f...@informatik.hu-berlin.de>...

> I recommend a different strategy. Aim for allowing "cp932" as an
> encoding name. To achieve this, put the following (or something like
> this) into MBCS_codecs/__init__.py

Thanks. I had to actually solve the problem so I took my code out of
the codecs and just called it directly, but at some point in the
future I will try this approach.

I'd still like to know why my approach didn't work, but oh well...

> If you want this to happen at startup time of Python, just add a
> MBCS_codecs.pth file in site-packages, which reads

What is a .pth file?

Jurie Horneman
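
On the open question of why "MBCS_codecs.cp932" could not be found: the thread never settles it. One possible factor is that the codecs machinery normalizes an encoding name before handing it to search functions, so a mixed-case package name may no longer match the module on disk. The sketch below, in Python 3 syntax, only records what a search function receives; recording_search is a hypothetical helper, and whether this was the actual cause in 2002 is an assumption.

```python
import codecs

seen = []  # names exactly as the codec machinery passes them in


def recording_search(encoding):
    """Hypothetical search function: records the name, never claims a codec."""
    seen.append(encoding)
    return None


codecs.register(recording_search)

requested = "MBCS_codecs.cp932"
try:
    codecs.lookup(requested)
except LookupError:
    # Expected: no search function provides this name.
    pass

# The name as the search function saw it, for comparison with `requested`.
received = seen[-1]
print(repr(requested), "->", repr(received))
```

If the received name differs in case from the package directory, an import based on that name would fail even though importing "MBCS_codecs.cp932" by hand succeeds.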

Martin von Loewis

Mar 1, 2002, 8:53:31 AM
jhor...@pobox.com (Jurie Horneman) writes:

> What is a .pth file?

See http://www.python.org/doc/current/lib/module-site.html

Regards,
Martin
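
In short, the site module reads *.pth files found in site-packages at startup; a line that begins with "import" is executed, and other lines are treated as directories to add to sys.path. A minimal sketch of that mechanism in Python 3 syntax follows. It uses a temporary directory in place of site-packages and a hypothetical package named demo_mbcs_codecs, so nothing is installed.

```python
import os
import site
import sys
import tempfile

with tempfile.TemporaryDirectory() as sitedir:
    # Hypothetical package standing in for MBCS_codecs.
    pkg = os.path.join(sitedir, "demo_mbcs_codecs")
    os.mkdir(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write("LOADED = True\n")

    # The .pth file: a single line starting with "import" is executed.
    with open(os.path.join(sitedir, "demo_mbcs_codecs.pth"), "w") as f:
        f.write("import demo_mbcs_codecs\n")

    # Process the directory the way site-packages is processed at startup.
    site.addsitedir(sitedir)
    imported_by_pth = "demo_mbcs_codecs" in sys.modules

    # Tidy up the path entry that addsitedir appended.
    if sitedir in sys.path:
        sys.path.remove(sitedir)
```

With a real MBCS_codecs.pth in site-packages, the import would run on every interpreter start, so a codecs.register call in the package's __init__.py would take effect before any script runs.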
