Learning Python now coming from Perl

Bertilo Wennergren

Dec 6, 2008, 8:00:22 AM

I'm planning to start learning Python now, using Python 3000.
I have no previous Python skills, but I know Perl pretty well.
I'm also well experienced with JavaScript.

Any pointers and tips how I should go about getting into
Python?

--
Bertilo Wennergren <http://bertilow.com>

Roy Smith

Dec 6, 2008, 8:50:20 AM

In article <ghdt15$6e7$1...@news.motzarella.org>,
Bertilo Wennergren <bert...@gmail.com> wrote:

> I'm planning to start learning Python now, using Python 3000.
> I have no previous Python skills, but I know Perl pretty well.
> I'm also well experienced with JavaScript.
>
> Any pointers and tips how I should go about getting into
> Python?

I assume you use Perl to solve real problems in whatever job you do. My
recommendation would be the next time some problem comes up that you would
normally solve with Perl, try doing it in Python. Having a real task that
you need to accomplish is a great way to focus the mind. For your first
project, pick something that's small enough that you think you could tackle
it in under 50 lines of Perl.

One of the very first things you'll probably discover that's different
between Perl and Python is how they handle string pattern matching. In
Perl, it's a built in part of the language syntax. In Python, you use the
re module. The regular expressions themselves are the same, but the
mechanism you use to apply them to input text is quite different.
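
To make that contrast concrete, here is a minimal sketch. The input line and the pattern are made up for illustration and do not come from this thread. Where Perl would write something like `if ($line =~ /(\w+)=(\d+)/)`, Python goes through the re module and hands back a match object:

```python
import re

# Hypothetical input line, chosen only for illustration.
line = "retries=3 timeout=30"

# Compile once, then apply; Perl would write: $line =~ /(\w+)=(\d+)/
pattern = re.compile(r"(\w+)=(\d+)")

match = pattern.search(line)      # returns a match object, or None
if match:
    key, value = match.group(1), match.group(2)

# findall returns every (key, value) pair, much like Perl's /g in list context.
pairs = pattern.findall(line)
```

The regular expression itself would work unchanged in Perl; only the way it is applied differs.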

Bertilo Wennergren

Dec 6, 2008, 9:01:29 AM

Roy Smith wrote:

> Bertilo Wennergren <bert...@gmail.com> wrote:

>> I'm planning to start learning Python now, using Python 3000.
>> I have no previous Python skills, but I know Perl pretty well.
>> I'm also well experienced with JavaScript.
>>
>> Any pointers and tips how I should go about getting into
>> Python?

> I assume you use Perl to solve real problems in whatever job you do. My
> recommendation would be the next time some problem comes up that you would
> normally solve with Perl, try doing it in Python. Having a real task that
> you need to accomplish is a great way to focus the mind. For your first
> project, pick something that's small enough that you think you could tackle
> it in under 50 lines of Perl.

Good advice.

> One of the very first things you'll probably discover that's different
> between Perl and Python is how they handle string pattern matching. In
> Perl, it's a built in part of the language syntax. In Python, you use the
> re module. The regular expressions themselves are the same, but the
> mechanism you use to apply them to input text is quite different.

Thanks.

I don't suppose there is any introductory material out there
that is based on Python 3000 and that is also geared at people
with a Perl background? Too early for that I guess..

Colin J. Williams

Dec 6, 2008, 10:14:04 AM

This is from a post within the last few
days: http://www.qtrac.eu/pyqtbook.html

Colin W.

Steven D'Aprano

Dec 6, 2008, 10:50:01 AM

On Sat, 06 Dec 2008 08:50:20 -0500, Roy Smith wrote:

> For your first
> project, pick something that's small enough that you think you could
> tackle it in under 50 lines of Perl.

Is there anything which *can't* be written in under 50 lines of Perl?

:-)

> One of the very first things you'll probably discover that's different
> between Perl and Python is how they handle string pattern matching. In
> Perl, it's a built in part of the language syntax. In Python, you use
> the re module. The regular expressions themselves are the same, but
> the mechanism you use to apply them to input text is quite different.

Also, Perl REs are faster than Python REs, or so I'm told. Between the
speed and the convenience, Perl programmers tend to use RE's for
everything they can. Python programmers tend to use REs only for problems
that *should* be solved with REs rather than *can* be solved with a RE.

Well, I say "tend", but in truth we get our fair share of questions like
"Hi, how do I factorize a 20 digit number with a regular expression?"
too.

Probably the biggest difference is that in Python, you always refer to
objects the same way, regardless of what sort of data they contain.
Regardless of whether x is a scalar or a vector, you always call it just
plain x.
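
A small sketch of that point, with values invented for the example. In Perl the sigil changes with the kind of data ($x, @x, %x); in Python the name stays the same:

```python
# The same plain name works whatever it refers to.
x = 42
kind_of_scalar = type(x).__name__

x = [1, 2, 3]
kind_of_list = type(x).__name__

x = {"one": 1}
kind_of_dict = type(x).__name__

# Indexing uses the same name too (Perl: $x[0] for an element of @x).
x = [10, 20, 30]
first = x[0]
```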

--
Steven

Aahz

Dec 6, 2008, 12:24:36 PM

In article <ghe0jo$3i1$1...@news.motzarella.org>,

Honestly, the differences between 2.x and 3.0 are small enough that it
doesn't much matter, as long as you're not the kind of person who gets
put off by little problems. Because so much material is for 2.x, you
may be better off just learning 2.x first and then moving to 3.x.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

News123

Dec 6, 2008, 1:23:38 PM

I fully agree with Roy's answer.

Coding small tasks is a good starting point. For quite some time you'll
of course be less efficient than with your previous language, but that's
part of the learning curve, isn't it?

I guess you'll learn the syntax rather quickly.
What's more painful is learning which functionality is in which library
and which libraries exist.

There's of course a lot of online documentation, but often you find
answers to trivial python questions fastest with Google:
for example: search for something like "python string reverse example"

And there's of course this newsgroup whenever you're stuck with a
'missing' feature, (though mostly the features aren't missing, but just
a little different)

bye

N

Roy Smith wrote:
> In article <ghdt15$6e7$1...@news.motzarella.org>,
> Bertilo Wennergren <bert...@gmail.com> wrote:
>
>> I'm planning to start learning Python now, using Python 3000.
>> I have no previous Python skills,

>> . . .

Roy Smith

Dec 6, 2008, 1:30:13 PM

In article <ghecgk$81m$1...@panix3.panix.com>, aa...@pythoncraft.com (Aahz)
wrote:

> In article <ghe0jo$3i1$1...@news.motzarella.org>,
> Bertilo Wennergren <bert...@gmail.com> wrote:
> >
> >I don't suppose there is any introductory material out there that is
> >based on Python 3000 and that is also geared at people with a Perl
> >background? Too early for that I guess..
>
> Honestly, the differences between 2.x and 3.0 are small enough that it
> doesn't much matter, as long as you're not the kind of person who gets
> put off by little problems. Because so much material is for 2.x, you
> may be better off just learning 2.x first and then moving to 3.x.

I'm not sure I agree. If you're starting out, you might as well learn the
new stuff. Then there's no need to unlearn the old way.

Using material meant for 2.x is likely to lead to confusion. If you don't
know either, you'll never know if something isn't working as described
because you're doing it wrong or if it's just not the same as it used to
be. When everything is new, what seem like little stumbling blocks to
experts become total blockers to people starting from zero.

Martin P. Hellwig

Dec 6, 2008, 1:46:47 PM

News123 wrote:
>
> What's more painful is to learn which functionality is in which library
> and which library exists.
>

<cut>
Yes and one mistake I still often find myself doing is, when confronted
with a particular problem, that I write some helper code to deal with
it. Of course later on I discover that there is a standard module or
built-in that does exactly what I want and better. :-)

Somehow in the heat of the moment my mind is not thinking 'there must be
something out there which does what I want' but rather 'hmmm I think I
can do it this way, clicker-di-click'.

In my opinion it is a positive attribute to the language that it makes
solving problems so easy that you forget about searching for solutions.
Maybe python should prompt every 10 lines of code a message saying 'Are
you sure this can not be done with a built-in?' Most of the time it will
be right anyway :-)

--
mph

Roy Smith

Dec 6, 2008, 2:15:28 PM

In article <014a96e0$0$20670$c3e...@news.astraweb.com>,

Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:

> On Sat, 06 Dec 2008 08:50:20 -0500, Roy Smith wrote:
>
> > For your first
> > project, pick something that's small enough that you think you could
> > tackle it in under 50 lines of Perl.
>
> Is there anything which *can't* be written in under 50 lines of Perl?

[...]

> Also, Perl REs are faster than Python REs, or so I'm told. Between the
> speed and the convenience, Perl programmers tend to use RE's for
> everything they can. Python programmers tend to use REs only for problems
> that *should* be solved with REs rather than *can* be solved with a RE.

Well, as an old-time unix hacker (who learned REs long before Perl
existed), my question to you would be, "Is there any problem which
*shouldn't* be solved with an RE?" :-)

It's easy to go nuts with REs, and create write-only code. On the other
hand, they are an extremely powerful tool. If you are wise in the ways of
RE-fu, they can not only be the most compact way to write something, but
also the most efficient and even the most comprehensible. Unfortunately,
REs seem to be regarded as some kind of monster these days and few people
take the time to master them fully. Which is a shame.

One really nice feature of REs in Python is the VERBOSE flag. It lets you
write some way-complicated REs in a way which is still easy for somebody to
read and understand. Python's raw strings, and triple-quoted strings, also
help reduce the sea of backslashes which often make REs seem much worse
than they really are.
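
As a rough illustration of the VERBOSE flag combined with a raw, triple-quoted string, here is a sketch. The date format and the group names are assumptions chosen for the example:

```python
import re

# Hypothetical pattern for dates like 2008-12-06; the format is an assumption.
date_re = re.compile(r"""
    (?P<year>  \d{4} )   # four-digit year
    -
    (?P<month> \d{2} )   # two-digit month
    -
    (?P<day>   \d{2} )   # two-digit day
""", re.VERBOSE)

m = date_re.search("posted 2008-12-06 08:00")
parts = m.groupdict() if m else {}
```

Under re.VERBOSE, whitespace in the pattern is ignored and `#` starts a comment, so the pattern can be laid out and annotated like ordinary code.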

One of the reasons REs don't get used in Python as much as in Perl is
because strings have useful methods like startswith(), endswith(), and
split(), and also the "in" operator. These combine to give you easy ways
to do many things you might otherwise do with REs.
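
A short sketch of those alternatives, using a made-up file name and record. The Perl snippets in the comments are only rough equivalents:

```python
# Hypothetical file name and record, for illustration only.
filename = "report_2008.txt"
record = "alice:x:1000:/home/alice"

is_text = filename.endswith(".txt")        # Perl: $filename =~ /\.txt$/
is_report = filename.startswith("report")  # Perl: $filename =~ /^report/
has_year = "2008" in filename              # Perl: $filename =~ /2008/
fields = record.split(":")                 # Perl: split /:/, $record
```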

Rainy

Dec 6, 2008, 3:28:36 PM

There's a lot of hoopla about Py3k being different, incompatible
with Py2.x. However, you have to keep in mind that this matters
most of all to old, large programs, which will need to be changed
if one would like to run them on Py3k. For learning the differences
don't matter much. If you learn to code in py2.x for half a year,
you will be able to pick up on most of the differences and transfer
to py3k in a few days. If you find good docs on py2.x go ahead and
use them and don't worry.

Carl Banks

Dec 6, 2008, 4:01:53 PM

On Dec 6, 12:30 pm, Roy Smith <r...@panix.com> wrote:
> In article <ghecgk$81...@panix3.panix.com>, a...@pythoncraft.com (Aahz)
> wrote:
>
> > In article <ghe0jo$3i...@news.motzarella.org>,

> > Bertilo Wennergren <berti...@gmail.com> wrote:
>
> > >I don't suppose there is any introductory material out there that is
> > >based on Python 3000 and that is also geared at people with a Perl
> > >background? Too early for that I guess..
>
> > Honestly, the differences between 2.x and 3.0 are small enough that it
> > doesn't much matter, as long as you're not the kind of person who gets
> > put off by little problems. Because so much material is for 2.x, you
> > may be better off just learning 2.x first and then moving to 3.x.
>
> I'm not sure I agree. If you're starting out, you might as well learn the
> new stuff. Then there's no need to unlearn the old way.

One disadvantage of learning Python 3 first is the availability of
third-party libraries (especially extension libraries), most of which
will not be updated for Python 3.x for quite a while.

Also, I don't think it's really advisable to be completely ignorant of
the 2.x difference even if one intends to start with 3.0. There is a
lot of code and material out there for 2.x, and until these start to
be widely available for 3.x, people will sometimes have to make do
with the 2.x stuff.

Carl Banks

Steven D'Aprano

Dec 6, 2008, 7:10:45 PM

On Sat, 06 Dec 2008 14:15:28 -0500, Roy Smith wrote:

> In article <014a96e0$0$20670$c3e...@news.astraweb.com>,
> Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
>
>> On Sat, 06 Dec 2008 08:50:20 -0500, Roy Smith wrote:
>>
>> > For your first
>> > project, pick something that's small enough that you think you could
>> > tackle it in under 50 lines of Perl.
>>
>> Is there anything which *can't* be written in under 50 lines of Perl?
> [...]
>> Also, Perl REs are faster than Python REs, or so I'm told. Between the
>> speed and the convenience, Perl programmers tend to use RE's for
>> everything they can. Python programmers tend to use REs only for
>> problems that *should* be solved with REs rather than *can* be solved
>> with a RE.
>
> Well, as an old-time unix hacker (who learned REs long before Perl
> existed), my question to you would be, "Is there any problem which
> *shouldn't* be solved with an RE?" :-)

I think you've answered your own question:

> One of the reasons REs don't get used in Python as much as in Perl is
> because strings have useful methods like startswith(), endswith(), and
> split(), and also the "in" operator. These combine to give you easy
> ways to do many things you might otherwise do with REs.

Also:

* splitting pathnames and file extensions

* dealing with arbitrarily nested parentheses

* any time you need a full-blown parser (e.g. parsing HTML or XML)

* sanitizing untrusted user input
("I bet I can think of *every* bad input and detect them all
with this regex!")

* validating email addresses
http://northernplanets.blogspot.com/2007/03/how-not-to-validate-email-addresses.html

* testing prime numbers
http://jtauber.com/blog/2007/03/18/python_primality_regex/

* doing maths
http://blog.stevenlevithan.com/archives/algebra-with-regexes
http://weblogs.asp.net/rosherove/archive/2004/11/08/253992.aspx

There's probably more.
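
For the first two items on that list, a regex-free version might look like the sketch below. The path is invented, and parens_balanced is a hypothetical helper written for this example, not a standard library function:

```python
import os.path

# Splitting a path without a regex (the path is a made-up example).
directory, filename = os.path.split("/var/log/app/server.log.gz")
stem, extension = os.path.splitext(filename)

def parens_balanced(text):
    """Return True if every '(' has a matching ')', at any nesting depth."""
    depth = 0
    for char in text:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:      # a ')' appeared before its '('
                return False
    return depth == 0
```

A simple counter handles arbitrary nesting depth, which is the part a classic regular expression cannot express.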

--
Steven

Python Nutter

Dec 6, 2008, 8:44:19 PM

to Bertilo Wennergren, pytho...@python.org

Perl Cookbook for Python Programmers:
http://pleac.sourceforge.net/pleac_python/index.html

P3K as starting point (slight cringe) as long as you know the caveats.

I'm of the same mind as Christopher with regard to how Python 3.0 has
been released on Python.org:

"""I don't think that Python 3.0 is a bad thing. But that it's
displayed so prominently on the Python web site, without any kind of
warning that it's not going to work with 99% of the Python code out
there, scares the hell out of me. People are going to download and
install 3.0 by default, and nothing's going to work. They're going to
complain, and many are going to simply walk away.
- Christopher Lenz"""

Python3 is beautiful, and I am totally with James Bennett
http://www.b-list.org/weblog/2008/dec/05/python-3000/

That said, I would suggest you learn the whole history of the Py3K
project and its 19+ years of 20/20 hindsight on what works well and
what doesn't. Then you can jump right into Python 3, if you want to,
fully informed about why it is the way it is, why you need to wait for
3rd party libraries to catch up and release Python 3 compatible
versions, and why the internal libraries are no worry, except that
some of them have disappeared in the cleanup or been merged into a
single library. Then you can make sense of all the 2.x books and posts
and know where to start when trying to apply them to Python 3.

Cheers,
PN
(Python 2.5.2, Stackless Python 2.5.2, Python 2.6 and Python 3.0 on my box)

P.S. Look into IPython as your new favourite shell and virtualenv to
help you keep all your projects straight and you'll be very productive
in no time =)

2008/12/7 Bertilo Wennergren <bert...@gmail.com>:

Python Nutter

Dec 6, 2008, 8:54:49 PM

to pytho...@python.org

> In article <014a96e0$0$20670$c3e...@news.astraweb.com>,
> Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
>
> Well, as an old-time unix hacker (who learned REs long before Perl
> existed), my question to you would be, "Is there any problem which
> *shouldn't* be solved with an RE?" :-)
>

> One of the reasons REs don't get used in Python as much as in Perl is
> because strings have useful methods like startswith(), endswith(), and
> split(), and also the "in" operator. These combine to give you easy ways
> to do many things you might otherwise do with REs.

I agree. I'm going through the new book Python for Unix and Linux
Administration now, and although in general I like what they say, they
take you through the built-in string functions, then introduce REs,
and end the chapter leaving the reader with the impression that REs
are the better solution. I agree only for the particular
problem/program they presented.

However, I have used the built-ins more effectively, working with the
indexes returned within the string, and I've built plenty of scripts
that did not need REs for the text/file processing they did. This
intermediate use of the string built-ins was missing between the first
string-function version and the RE version of the code. IMHO that
keeps readers from seeing that string functions are more powerful
than they are led to believe, and that REs belong more to edge cases
than the impression the chapter leaves, which is to use REs more.

At least if you push REs, tell the readers where to get an RE GUI
builder written in Python so they can build and *test* the complex and
unwieldy REs needed for anything beyond the basic pattern searches.

Cheers,
PN

Bertilo Wennergren

Dec 6, 2008, 9:05:15 PM

Aahz wrote:

> In article <ghe0jo$3i1$1...@news.motzarella.org>,

> Bertilo Wennergren <bert...@gmail.com> wrote:

>> I don't suppose there is any introductory material out there that is
>> based on Python 3000 and that is also geared at people with a Perl
>> background? Too early for that I guess..

> Honestly, the differences between 2.x and 3.0 are small enough that it
> doesn't much matter, as long as you're not the kind of person who gets
> put off by little problems. Because so much material is for 2.x, you
> may be better off just learning 2.x first and then moving to 3.x.

The main reason I waited until Python 3000 came out is
the new way Unicode is handled. The old way seemed really
broken to me. Much of what I do when I program consists
of juggling Unicode text (real Unicode text with lots of
actual characters outside of Latin 1). So in my case
learning version 2.x first might not be very convenient.
I'd just get bogged down with the strange way 2.x handles
such data. I'd rather skip that completely and just go
with the Unicode handling in 3.0.

MRAB

Dec 6, 2008, 9:26:40 PM

to Python List

I wouldn't have said it was broken, it's just that it was a later
addition to the language and backwards compatibility is important.
Tidying things which would break backwards compatibility in a big way
was deliberately left to a major version, Python 3.

Roy Smith

Dec 6, 2008, 11:07:10 PM

In article <mailman.5121.1228614...@python.org>,
"Python Nutter" <python...@gmail.com> wrote:

> At least if you push REs inform the readers where to get an RE GUI
> builder written in Python so they can build and *test* the complex and
> unwieldy REs to perform anything beyond the basic pattern searches.

Oh, my, I think my brain's about to explode. A RE GUI builder? Cough,
gasp, sputter. This is literally the first time I've ever heard of such a
thing, and it's leaving a bad taste in my mouth.

RE is the last bastion of Real Programmers. Real Programmers don't use GUI
builders. Using a GUI to build an RE is like trying to program by pushing
little UMLish things around with a mouse. It's Just Wrong.

Aahz

Dec 6, 2008, 11:18:35 PM

In article <ghfb0r$iae$1...@news.motzarella.org>,

Sounds like you have a good use-case for 3.0 -- go to it!

Nick Craig-Wood

Dec 8, 2008, 5:30:50 AM

Bertilo Wennergren <bert...@gmail.com> wrote:
> I'm planning to start learning Python now, using Python 3000.
> I have no previous Python skills, but I know Perl pretty well.
> I'm also well experienced with JavaScript.
>
> Any pointers and tips how I should go about getting into
> Python?

Read "Dive Into Python" while following along with your keyboard.

( http://www.diveintopython.org/ free online or paper edition from
your favourite bookseller )

My favourite mistake when I made the transition was calling methods
without parentheses. In perl it is common to call methods without
parentheses - in python this does absolutely nothing! pychecker does
warn about it though.

perl -> $object->method
python -> object.method()
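
A minimal sketch of the pitfall, using a hypothetical Greeter class invented for the example. Without parentheses the expression evaluates to the method object, which is silently discarded if nothing is done with it:

```python
class Greeter:
    """Hypothetical class, used only to illustrate the point."""
    def greet(self):
        return "hello"

g = Greeter()

no_call = g.greet      # no parentheses: this is the bound method object
result = g.greet()     # parentheses: the method actually runs

is_callable = callable(no_call)
later = no_call()      # the stored method can still be called afterwards
```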

--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick

Roy Smith

Dec 8, 2008, 10:28:36 AM

In article <slrngjps0...@irishsea.home.craig-wood.com>,

Nick Craig-Wood <ni...@craig-wood.com> wrote:

> My favourite mistake when I made the transition was calling methods
> without parentheses. In perl it is common to call methods without
> parentheses - in python this does absolutely nothing! pychecker does
> warn about it though.
>
> perl -> $object->method
> python -> object.method()

On the other hand, leaving out the parens returns the function itself,
which you can then call later. I've often used this to create data-driven
logic.

For example, I'm currently working on some code that marshals objects of
various types to a wire protocol. I've got something like:

encoders = {
    SM_INT: write_int,
    SM_SHORT: write_short,
    SM_FLOAT: write_float,
    # and so on
}

class AnyVal:
    def __init__(self, type, value):
        self.type = type
        self.value = value

def write_anyval(any):
    encoders[any.type](any.value)

The fact that functions are objects which can be assigned and stored in
containers makes this easy to do.
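
For anyone who wants to run the idea, here is a self-contained sketch of such a dispatch table. The type code values, the struct formats, and the choice to return bytes are all assumptions made for the example; the real wire protocol is not described in this thread:

```python
import struct

# Type codes and formats are assumptions for this sketch, not the real values.
SM_INT, SM_SHORT, SM_FLOAT = 1, 2, 3

def write_int(value):
    return struct.pack("!i", value)    # 4-byte big-endian signed int

def write_short(value):
    return struct.pack("!h", value)    # 2-byte big-endian signed short

def write_float(value):
    return struct.pack("!f", value)    # 4-byte big-endian float

# Functions are stored without parentheses, so the dict holds the functions.
encoders = {
    SM_INT: write_int,
    SM_SHORT: write_short,
    SM_FLOAT: write_float,
}

class AnyVal:
    def __init__(self, type, value):
        self.type = type
        self.value = value

def write_anyval(any_val):
    """Look up the encoder for this type code and call it on the value."""
    return encoders[any_val.type](any_val.value)

encoded = write_anyval(AnyVal(SM_SHORT, 258))
```

Adding a new type then means writing one function and adding one dict entry.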

Nick Craig-Wood

Dec 9, 2008, 7:30:52 AM

Roy Smith <r...@panix.com> wrote:
> In article <slrngjps0...@irishsea.home.craig-wood.com>,
> Nick Craig-Wood <ni...@craig-wood.com> wrote:
>
> > My favourite mistake when I made the transition was calling methods
> > without parentheses. In perl it is common to call methods without
> > parentheses - in python this does absolutely nothing! pychecker does
> > warn about it though.
> >
> > perl -> $object->method
> > python -> object.method()
>
> On the other hand, leaving out the parens returns the function itself,
> which you can then call later. I've often used this to create data-driven
> logic.

I didn't say it wasn't useful, just that if you came from Perl like I
did, it is an easy mistake to make ;-)

> For example, I'm currently working on some code that marshals objects of
> various types to a wire protocol. I've got something like:
>
> encoders = {
>     SM_INT: write_int,
>     SM_SHORT: write_short,
>     SM_FLOAT: write_float,
>     # and so on
> }
>
> class AnyVal:
>     def __init__(self, type, value):
>         self.type = type
>         self.value = value
>
> def write_anyval(any):
>     encoders[any.type](any.value)
>
> The fact that functions are objects which can be assigned and stored in
> containers makes this easy to do.

OO lore says whenever you see a type field in an instance you've gone
wrong - types should be represented by what sort of object you've got,
not by a type field.

Eg http://www.soberit.hut.fi/mmantyla/BadCodeSmellsTaxonomy.htm

"""The situation where switch statements or type codes are needed
should be handled by creating subclasses. """

Here is my first iteration (untested)

class AnyVal:
    def __init__(self, value):
        self.value = value
    def write(self):
        raise NotImplementedError()

class IntVal(AnyVal):
    def write(self):
        pass  # write_int code goes here

class ShortVal(AnyVal):
    def write(self):
        pass  # write_short code goes here

class FloatVal(AnyVal):
    def write(self):
        pass  # write_float code goes here

Then to write an AnyVal you just call any.write()

The initialisation of the AnyVals then becomes

    from AnyVal(SM_INT, int_expression)
    to   IntVal(int_expression)

However, if the types of the expressions aren't known until run time,
then use a dict of class types

AnyValRegistry = {
    SM_INT: IntVal,
    SM_SHORT: ShortVal,
    SM_FLOAT: FloatVal,
    # and so on
}

And initialise AnyVal objects thus

any = AnyValRegistry[type](value)

This smells of code duplication though and a real potential for a
mismatch between the AnyValRegistry and the actual classes.

I'd probably generalise this by putting the type code in the class and
use a __metaclass__ to autogenerate the AnyValRegistry dict which
would then become an attribute of AnyVal

Eg (slightly tested)

SM_INT=1
SM_SHORT=2
SM_FLOAT=3

class AnyVal(object):
    TYPE = None
    registry = {}
    class __metaclass__(type):
        def __init__(cls, name, bases, dict):
            cls.registry[cls.TYPE] = cls
    def __init__(self, value):
        self.value = value
    @classmethod
    def new(cls, type_code, value):
        """Factory function to generate the correct subclass of AnyVal by type code"""
        return cls.registry[type_code](value)
    def write(self):
        raise NotImplementedError()

class IntVal(AnyVal):
    TYPE = SM_INT
    def write(self):
        # write_int code
        print "int", self.value

class ShortVal(AnyVal):
    TYPE = SM_SHORT
    def write(self):
        # write_short code
        print "short", self.value

class FloatVal(AnyVal):
    TYPE = SM_FLOAT
    def write(self):
        # write_float code
        print "float", self.value

You then make new objects with any = AnyVal.new(type_code, value) and
write them with any.write()

Anyone can add a subclass of AnyVal and have it added to the
AnyVal.registry which is neat.

>>> any = AnyVal.new(SM_SHORT, 1)
>>> any
<__main__.ShortVal object at 0xb7e3776c>
>>> any.write()
short 1

>>> any = AnyVal.new(SM_FLOAT, 1.8)
>>> any
<__main__.FloatVal object at 0xb7e37a6c>
>>> any.write()
float 1.8

You could also override __new__ so you could write AnyVal(type_code,
value) to create the object of a new type. I personally don't think
it is worth it - a factory function is nice and obvious and shows
exactly what is going on.
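
The registry code in this post relies on Python 2 syntax (the nested __metaclass__ class and the print statement). Since the thread is about Python 3, here is a sketch of one possible Python 3 equivalent. It swaps the metaclass for the __init_subclass__ hook (available from Python 3.6), so it is a different technique with a similar effect. Two differences are deliberate: the base class is not registered under None, and write() returns a string instead of printing, so the result is easy to check:

```python
SM_INT, SM_SHORT, SM_FLOAT = 1, 2, 3

class AnyVal:
    TYPE = None
    registry = {}

    def __init_subclass__(cls, **kwargs):
        # Called once for each subclass as it is defined (Python 3.6+).
        super().__init_subclass__(**kwargs)
        cls.registry[cls.TYPE] = cls

    def __init__(self, value):
        self.value = value

    @classmethod
    def new(cls, type_code, value):
        """Factory: build the subclass registered for type_code."""
        return cls.registry[type_code](value)

    def write(self):
        raise NotImplementedError()

class IntVal(AnyVal):
    TYPE = SM_INT
    def write(self):
        return "int %s" % self.value

class ShortVal(AnyVal):
    TYPE = SM_SHORT
    def write(self):
        return "short %s" % self.value

class FloatVal(AnyVal):
    TYPE = SM_FLOAT
    def write(self):
        return "float %s" % self.value

any_val = AnyVal.new(SM_SHORT, 1)
```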

Roy Smith

Dec 9, 2008, 8:31:02 AM

Nick Craig-Wood <ni...@craig-wood.com> wrote:

> > On the other hand, leaving out the parens returns the function itself,
> > which you can then call later. I've often used this to create data-driven
> > logic.
>
> I didn't say it wasn't useful, just that if you came from Perl like I
> did, it is an easy mistake to make ;-)

Agreed.

> OO lore says whenever you see a type field in an instance you've gone
> wrong - types should be represented by what sort of object you've got,
> not by a type field.

OO lore lives in an ivory tower sometimes :-) I'm working with an existing
system, where objects are marshaled on the wire as type codes followed by a
type-specific number of bytes of data. Internally, it calls these AnyVals
and the concept is pervasive in the architecture. I could work within the
existing architecture, or I could try to fight it.

Yes, I could get rid of the dispatch table and create 20 or 30 classes to
represent all the possible types. I'd end up with several times as much
code, most of it boilerplate. Instead of having a dispatch table of
read/write functions, I'd have a dispatch table of classes, each of which
has a read method and a write method. It doesn't buy anything, and I'd
still have the type codes exposed because I need them to read and write
values to the wire.

J. Cliff Dyer

Dec 9, 2008, 10:01:07 AM

to Bertilo Wennergren, pytho...@python.org

I've actually found python's unicode handling quite strong. I will
grant that it is not intuitive, but once you have learned it, it is
clear, comprehensive, and sensible. It is by no means broken, or
half-heartedly supported. The drawback is that you have to explicitly
use it. It doesn't happen by default.

For starters, only use str objects for input and output. As soon as
you get them, decode them to unicode. And at the last minute, when
writing them, encode them to your favorite encoding. Better yet, use
codecs.open(file, encoding='utf-16') in place of open(file), pass an
encoding argument, and be done with it.

When you create strings in your code, always use u'stuff' rather than
'stuff' and ur'stu\ff' rather than r'stu\ff'.
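
The advice above is written for Python 2.x. As a rough sketch of the same decode-early, encode-late pattern in Python 3, where str is already Unicode and bytes is the encoded form, something like the following should work. The sample text, the temporary file, and the choice of encodings are assumptions made for the example:

```python
import os
import tempfile

# In Python 3, str is already Unicode and bytes is the raw, encoded form.
raw = "naïve ĉu 日本語".encode("utf-8")   # bytes, as read from a socket or file
text = raw.decode("utf-8")               # decode as soon as the bytes arrive
length = len(text)                       # counts characters, not bytes

# Passing encoding= to open() does the encode/decode at the file boundary.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-16") as handle:
    handle.write(text)
with open(path, encoding="utf-16") as handle:
    round_trip = handle.read()
```

In Python 3 the u'' prefix is no longer needed, since every plain string literal is Unicode.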

If you work with unicode on a daily basis, it shouldn't be hard to
master. There are several good tutorials on the web.

Cheers,
Cliff