I'm building a class with a uniform interface and I'm trying to come to
terms with polymorphism in python.
For instance I want the constructor to take either a filename, a file object
or a database connection object.
In C++ this would be done using three different constructors. (I'm not quite
sure about this, cause I don't know C++ :-)
But how should one do it in python.
My first idea was something like this:
class myclass:
def __init__:(self, arg):
if type(arg) == types.StringType:
#check if that's a filename and do the right initialization.
elif type(arg) == types.FileType:
#do the initialization in an other way
# and so on
Looking at resent posts I found a thread on type checking that contained
this unsettling post:
<pa...@prescod.net> wrote:
> Let me also point out that in many cases, this style of programming
> would be frowned upon by serious Python programmers. For instance, if
> you check that something is a string, your code will complain when it is
> handed a Unicode string, even though it would probably work fine. If you
> check that it is an open file, then your code probably will complain
> about stringIO file-like objects, even though it would probably work
> fine. If you check that it is an integer, your code will complain about
> integers, even though ti would probably work fine.
What are your experiences? Does the type(arg) checking really become
unmanageable as Paul's post implies? If so what are the alternative routes?
oivvio
> For instance I want the constructor to take either a filename, a file object
> or a database connection object.
><pa...@prescod.net> wrote:
>> Let me also point out that in many cases, this style of programming
>> would be frowned upon by serious Python programmers.
Polymorphism and method overloading aren't really the same thing at all.
What Paul was saying was that Python lacks (or doesn't want) a Java
like interface concept. So if you wanted a method to be able to write
to any stream, you can't specify that in the code. And attempting to
enumerate all possible streams would break polymorphism. Python
takes the approach that if it looks like a fish and smells like a fish, it
probably is a fish.
Now what you are doing is rather different. You have three
different things: there is nothing that you could say 'filename', 'file
object', and 'database connection' are all kinds of. Except 'object'
perhaps.
How to solve the problem? One approach would be to use a factory
style and have three functions: generateXFromFileName(), generateXFromFileObject(),
generateXFromDBConn().
Hmmm... I think you want "overloading" (methods that do different
things depending on their arguments' types) rather than "polymorphism"
(calling method upon an object, without caring which actual type that
object IS, as long as it provides methods that Do The Right Thing). A
subtle distinction, to be sure, but an important one.
> For instance I want the constructor to take either a filename, a file
object
> or a database connection object.
> In C++ this would be done using three different constructors. (I'm not
quite
> sure about this, cause I don't know C++ :-)
Yes: you would use three different overloads of the constructor.
> But how should one do it in python.
My instinctive approach (and it may be that I still haven't developed the
"right" Python instincts, because I haven't been using it long!) would be :
class myclass:
def __init__(filename=None, fileobject=None, dbconn=None):
if filename:
# initialize using the filename
elif fileobject:
# initialize using the fileobject
elif dbconn:
# initialize using the db connection
else:
# throw an exception, or whatever
Client-code would then use, e.g.,
myobj = myclass("foo.dat")
or
myobj = myclass(filename="foo.dat")
to initialize with a filename,
myobj = myclass(fileobject=open("foo.dat"))
to initialize with a file object, etc.
The alternative, given a single, "anodyne" argument to
__init__, would be for myclass to try to find out if it
"BEHAVES like a filename string", "BEHAVES like a
fileobject", etc, rather than trying to ascertain whether
it "IS" one of those. "The IS of identity", bemoaned by
Korzibsky, is just as generally inappropriate in Python as
he claimed it to be in terms of fuzziness of thinking; and
Wittgenstein before him pointed out how language-games,
and NOT mathematical-identity, determine applicability of
a term-in-context.
In other words, don't check whether it IS-a duck: check
whether it QUACKS-like-a duck, WALKS-like-a duck,
etc, etc, depending on exactly what subset of duck-like
behaviour you need to play your language-games with. If
the argument fails this specific-ducklyhood-subset-test, then
you can shrug, ask "why a duck?" (at least, you can if you're
a Marx Brothers fan and have memorized "Cocoanuts"' script;
Monty Python one-true-wayists will have to find their own
simile here), and move on to the next set of tests (why-a-no-
chicken immediately comes to mind, but then one would have
to ask why it crosses the road, so I think we'd better snip it).
On the other hand, this can be a considerable amount of work,
depending on how you go about it (actually, it need not be that
bad if you "just go ahead and try", of course catching the likely
exceptions if the try does not succeed; but still, greater than 0).
Besides, "explicit is better than implicit", goes one of Python's
mantras. Just let the client-code explicitly TELL you which kind
of argument they are passing you (and doing so through a named
argument is simple and readable), and your work drops to zero,
while removing no useful functionality whatever from the client.
As a little vig, you also avoid trouble in case what the client wants
to pass is some tricky object that behaves EITHER as a file
connection OR as a db connection (etc, etc) -- not all that likely,
but, who knows.
About typeswitching in general...:
> Looking at resent posts I found a thread on type checking that contained
> this unsettling post:
>
> <pa...@prescod.net> wrote:
> > Let me also point out that in many cases, this style of programming
> > would be frowned upon by serious Python programmers. For instance, if
> > you check that something is a string, your code will complain when it is
> > handed a Unicode string, even though it would probably work fine. If you
> > check that it is an open file, then your code probably will complain
> > about stringIO file-like objects, even though it would probably work
> > fine. If you check that it is an integer, your code will complain about
> > integers, even though ti would probably work fine.
Amen, hallelujah. You don't really care for IS-A -- you really only care
for BEHAVES-LIKE-A-(in-this-specific-context), so, if you do test, this
behaviour is what you should be testing for (although I assume Paul Prescod
meant "longs" where he wrote "integers" on the last line, or else I don't
know
what those two last lines mean).
> What are your experiences? Does the type(arg) checking really become
> unmanageable as Paul's post implies? If so what are the alternative
routes?
I have no direct experience of setting up such typeswitching *in Python*,
but
I *do* know it becomes unmanageable, for various reasons, in languages
where I painted myself into that corner. Stroustrup had the same experience
with typeswitching in Simula-67, which is why he so carefully kept it out of
C++ (it snuck back in as dynamic_cast, but that is _supposed_ to be used
as the closest C++ equivalent of does-it-support-this-functionality-set [by
dyncasting to a suitable pointer-to-abstract-interface] -- although of
course it
is just as easy to mis-use it as a typeswitch, sigh). However, the
languages
I've fallen into this particular pitfall in were not as dynamic/fluid as
Python is,
so I can't be "sure because of personal experience" that it is indeed a
danger;
still, all of my instincts ARE yelling "yes, yes, exactly!" in chorus with
Paul
Prescod (and some of my favourite thinkers, as above quoted, appear to
provide solid back-up general reasons:-).
The "royal-road" alternative route to overloading would, I think, be the use
of suitable named-arguments. A rockier road, perhaps preferable in some
cases, but more work for dubious benefit, would be the try/except approach
to see if an argument supplies the functionalities you require.
Alex
In my experience, using the type() builtin is more pain than it's worth.
I like the scheme proposed by Alex Martelli, except that I would have
used
def __init__(filename='', fileobject=None, dbconn=None):
instead (personal style I guess. I like to keep strings as strings,
even if they are "false" strings).
The critical point is that in Python, polymorphism depends solely on the
interface provided, not on any "is-a" relationships. Instead of
insisting that your argument must be a string (for example), your code
should work with any object that exposes the interfaces you need
(strings, unicode strings, perhaps even a custom class created by
somebody to emulate Java's StringBuffer, etc.). Similarly, you don't
need to verify that your argument is a file, only that it has a read()
method (or whatever method you use).
Alternately, to keep closer to the style of your original:
class myclass:
def __init__(self, arg):
try:
if os.path.isfile(arg):
# Do proper string initialization here
except TypeError: #not a string
try:
data = arg.read()
# File intialization
except (AttributeError, TypeError): #Not a file
# DB Connection initialization
--
Sean Blakey, sbl...@freei.com
Software Developer, FreeInternet.com
(253)796-6500x1025
Computers are unreliable, but humans are even more unreliable.
Any system which depends on human reliability is unreliable.
-- Gilb
Or you can simply expect a read() method. Don't verify it, don't
support anything else. The polymorphism is up to the caller. Examples
of this with the file interface can be found in the library, rfc822 comes
to mind. I think this is more common than checking for method attributes
and then supporting different interfaces.
In more general terms, you have to pick or create an interface that covers
the important bases. Often enough it's not worth the trouble, and people
do use alternate input parameters etc. But where it works out, it could
make the function more flexible and yet simpler, by pushing the support for
different inputs back on the input objects themselves. Polymorphism.
Donn Cave, do...@u.washington.edu
> Or you can simply expect a read() method. Don't verify it, don't
> support anything else. The polymorphism is up to the caller. Examples
> of this with the file interface can be found in the library, rfc822 comes
> to mind. I think this is more common than checking for method attributes
> and then supporting different interfaces.
According to all the Python lore/attitude/community feeling/etc that
I've absorbed, that's the way to do it.
This way you know you can interact with a file object, and any other
object that wants to act like a file object.
Interface does seem to be very important in the standard libraries, but
type does not seem to be anywhere close in importance.
Many of the standard libs support 'any object that has a "read" method
like a file object."
--
Shae Matijs Erisson - http://www.webwitches.com/~shae/
VirtualPairProgramming Wanted - Linux/Emacs/Python/Speak Freely
.fi: rakastan ohjelmointia - python kengittää aasia
It's not so much that it becomes unmanageable, but that it
needlessly constrains what kind of objects your methods can
operate on.
Rather than checking for specific types, a more flexible way
is to test whether they support the operations you need to do
on them. For example,
class myclass:
def __init__(self, arg):
if hasattr(arg, "read"):
# It seems to be a file or a file-like object
...
else:
# Assume it's a string or something that can be converted
to one
filename = str(arg)
...
--
Greg Ewing, Computer Science Department,
University of Canterbury, New Zealand
To get my email address, please visit my web page:
http://www.cosc.canterbury.ac.nz/~greg
I see now that giving the responsibility to the caller is the path of
righteous.
My problem, I should have pointed this out in the first post, is that the
caller doesn't want any responsibility at all.
We're building a workbench for computational molecular biology. The intended
audience is not one of programmers. It is one that uses Excel and
Powerpoint. If you want them to do any programming you have to trick them in
to it. In than context it's a great thing if a constructor can take just
about anything and make sense out of it that is a good thing. It will
flatten the learning curve.
Therefore I will depart from the narrow path and go of with the try / except
hoard.
After all this I still don't get the difference between polymorphism and
overloading. I've just ordered a book on OO design. So maybe that
distinction will become clearer to me, then again, maybe it won't :)
oivvio
An overloaded function is a function whose behaviour is dependent on the
type of information you give it. The types of information that the function
supports is limited by the implementation of the function.
A polymorphic function is a function whose behaviour is independent of the
type of information you give it. The types of information that the function
supports is limited to those types which support the set of operations that
the function requires.