
OO conventions


Daniel Nogradi

Feb 1, 2006, 5:40:37 PM
to pytho...@python.org
I'm relatively new to object oriented programming, so get confused
about its usage once in a while. Suppose there is a class Image that
has a number of methods, rotate, open, verify, read, close, etc. Then
to use this class my natural guess would be to have something like

image = Image( )
image.read( "myfile.jpg" )
image.rotate( )
image.close( )

But now it turns out that the PIL module uses this as

image = Image.open( "myfile.jpg" )
image.verify( )
image.rotate( )
image.close( )

Perhaps the real Image class of PIL doesn't have these methods
exactly, but doesn't matter, my point is the way it works. Is it
normal that instead of first creating an instance of a class, it
starts right away with one of its methods? I of course understand that
the PIL people simply made a choice that their module works this way,
period, but I'm just wondering if it wouldn't have been more "logical"
to do it along the way of my first example.

I guess it's just a matter of convention or how the programmer feels
like, but there are no conventions of this type? Which would be more
pythonic? Or I shouldn't worry and it's totally up to the developer?

Anthony Greene

Feb 1, 2006, 6:00:48 PM


You should take a look at the PIL source files; open() could just as well
be a class that takes a filename as an argument at construction time. If
you run across such code, there is really no need to worry: some things are
conventional and some are a matter of preference. In most cases diving
into the source can clear up a lot of speculation. Better hope the code is
transparent; I hate reading unreadable code. :)

--
A wise man knows he knows nothing.

Anthony Greene

Feb 1, 2006, 6:03:46 PM
On Wed, 01 Feb 2006 23:40:37 +0100, Daniel Nogradi wrote:

In this case, Image seems to be a Python module that defines an open
function; the Image you import from PIL is not a class.

Kirk McDonald

Feb 1, 2006, 6:07:01 PM


If I were coding a class like that, I'd probably do it like this:

image = Image("myfile.jpg")
image.rotate()
image.close()

And, in addition, there would be an image.open() method. In the above,
__init__ just calls it on its own when the filename is specified.
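
As a rough sketch of that design (the class below is hypothetical and is
not PIL's API; it only stores the raw bytes instead of decoding them):

```python
class Image:
    """Hypothetical image class, for illustration only; not PIL's API."""

    def __init__(self, filename=None):
        self.filename = None
        self.data = None
        # When a filename is given, the constructor opens it right away.
        if filename is not None:
            self.open(filename)

    def open(self, filename):
        # Inside the method body, open() is the builtin, not this method.
        with open(filename, "rb") as fh:
            self.data = fh.read()
        self.filename = filename

    def close(self):
        # Drop the loaded data; the object is "empty" again.
        self.data = None
        self.filename = None
```

With this layout both Image("myfile.jpg") and the two-step
Image() followed by image.open("myfile.jpg") end up in the same state.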

(Though, of course, I'm not at all familiar with the PIL module; there
might be more stuff going on behind the scenes with the way they do it.)

-Kirk McDonald

Claudio Grondi

Feb 1, 2006, 6:07:37 PM
to Daniel Nogradi
It's probably more up to the context than to the developer.

I can imagine that the reason for not using
image = Image()
is that 'image' would be in a badly defined state: it would claim to be an
Image(), but after image = Image() it would not (yet) actually be an image.

It makes sense to me, to have an actual image object with some pixel
width and height and some color scheme and not only undefined empty
container for an image. In another context the approach you describe is
maybe more appropriate, but in case of an Image() object I can't see any
advantage.

I don't know if there exist any conventions for how to handle such
matters. For me Pythonic means it should be as I expect it intuitively.
And my intuition tells me, that I need to define what image I want, when
I create an Image object.

Maybe it would even be more appropriate to pass the definition of the
image to the Image class using Image(fileObject) or Image(fileNameSpec)
or Image(imageFormatAndSizeSpec), but I can imagine, that using
different methods instead of detection of the type of parameter passed
is probably less error prone making a good reason supporting the
approach of using Image.open("bride.jpg") or Image.new(mode, size) for
creation of an Image object when it is defined in different ways.

As the .new() and .open() methods are there anyway in order to be
available for further image operations, why not use them directly in
the process of creating Image() objects?

Claudio

Daniel Nogradi

Feb 1, 2006, 6:11:32 PM
to pytho...@python.org
>
> In this case, Image seems to be a python module, with the open function
> defined, PIL's Image is not a class.
>

Thanks for the enlightening remarks, especially this last one, indeed,
it's not a class.

Terry Hancock

Feb 1, 2006, 8:09:03 PM
to pytho...@python.org
On Wed, 1 Feb 2006 23:40:37 +0100
Daniel Nogradi <nog...@gmail.com> wrote:
> I'm relatively new to object oriented programming, so get
> confused about its usage once in a while. Suppose there is
> a class Image that has a number of methods, rotate, open,
> verify, read, close, etc. Then to use this class my
> natural guess would be to have something like
>
> image = Image( )
> image.read( "myfile.jpg" )
> image.rotate( )
> image.close( )
>
> But now it turns out that the PIL module uses this as
>
> image = Image.open( "myfile.jpg" )
> image.verify( )
> image.rotate( )
> image.close( )

That's because that *isn't* object oriented. PIL provides
an "open" *function* that opens a file containing an image
and *returns* an Image object. The two "Image" usages are
not the same. I personally think he should've named the
module "Imaging" or something, to make this clearer, but you
can of course fix this in the import (I frequently have had
to call it PIL_Image or something, because of collisions
with other modules or objects named "Image").

What Lundh is modeling this on is not standard OOP thinking,
but rather the standard way of opening files in Python.

Or you could say he's using a "factory function".

I think it is actually not bad style, because in fact, he's
drawing attention to the fact that the image is NOT "an
object initialized by the string 'myfile.jpg'", but is
rather a function loading data from a file called
'myfile.jpg' and creating a PIL Image object from *that*.

Remember, programming is an art, not a science.
"Object-oriented style" is a subjective and fuzzy
aesthetic term, and what is "most clear" depends on what
you've seen before, and what you're trying to be clear
about.

Cheers,
Terry


--
Terry Hancock (han...@AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com

Fredrik Lundh

Feb 2, 2006, 2:24:01 AM
to pytho...@python.org
Terry Hancock wrote:

> What Lundh is modeling this on is not standard OOP thinking,
> but rather the standard way of opening files in Python.
>
> Or you could say he's using a "factory function".

"open" is a factory function, and it picks the right class based
on the file you're opening.

anyone arguing that this is bad OOP is deluded.

</F>

Steven D'Aprano

Feb 2, 2006, 6:08:33 AM
On Wed, 01 Feb 2006 23:40:37 +0100, Daniel Nogradi wrote:

> I'm relatively new to object oriented programming, so get confused
> about its usage once in a while. Suppose there is a class Image that
> has a number of methods, rotate, open, verify, read, close, etc. Then
> to use this class my natural guess would be to have something like
>
> image = Image( )

Now you have an "image" object. What is it?

Answer: it isn't an image at all, not in the plain English sense. (Or if
it is, it is an arbitrary "default image" picked by the class designer.)

> image.read( "myfile.jpg" )

And now, at long last, the image object actually is an image. So why make
this a two step process? Whatever the Image() initialization does, why
can't it be done automatically when you read the file?

[snip]

> Perhaps the real Image class of PIL doesn't have these methods exactly,
> but doesn't matter, my point is the way it works. Is it normal that
> instead of first creating an instance of a class, it starts right away
> with one of its methods? I of course understand that the PIL people simply
> made a choice that their module works this way, period, but I'm just
> wondering if it wouldn't have been more "logical" to do it along the way
> of my first example.

It seems to me that the choice depends on the data you are handling, plus
of course the class designer's choice. (Dare I say it, whim.)

If a class has a natural, obvious default state (e.g. a mutable string
class might start off empty, a mutable int class might start off as zero,
a binary tree might start off as an empty node with no children) then it
makes sense to initialise the class, then add your data.

But if the class has no natural default state, then it makes no sense to
create an "empty object" with no data, a "non-image image" so to speak.

In other words, if you find yourself writing methods like this:

class Klass:
    def foo(self):
        if self.data is None:
            raise KlassError("Can't foo an uninitialized Klass object.")
        else:
            # do something

then you are just doing pointless make-work to fit a convention that
doesn't make sense for your class.

--
Steven.

Tim N. van der Leeuw

Feb 2, 2006, 8:50:10 AM

Actually, this way of creating a class instance is good OO practice in
many places: The Image.open() method acts as a factory-function for
creating Image objects.
You don't know, until you inspect the return value, if the created
object is actually an instance of class Image or something which just
happens to behave like one...
It could for instance be a sub-class of Image, JpegImage, if that
better suits the class library designers.
In general, factory-functions allow greater uncoupling between
interface and implementation.

Regards,

--Tim

Alex Martelli

Feb 2, 2006, 9:52:39 AM
Steven D'Aprano <st...@REMOVETHIScyber.com.au> wrote:
> > image = Image( )
>
> Now you have an "image" object. What is it?
>
> Answer: it isn't an image at all, not in the plain English sense. (Or if
> it is, it is an arbitrary "default image" picked by the class designer.)

No doubt (presumably some kind of as-yet blank one).

> > image.read( "myfile.jpg" )
>
> And now, at long last, the image object actually is an image. So why make
> this a two step process? Whatever the Image() initialization does, why
> can't it be done automatically when you read the file?

"Two-step construct" ("2SC") is a reasonably well-known and widely
useful idiom, and it can serve several kinds of purposes, many but not
all of which are tied to persistence. For example, to some extent it's
used in Entity Enterprise Javabeans (EJBs): with CMP, instead of having
to create a separate object each time you need a new record from the
database, the container can juggle a pool of objects and reuse each of
them to hold different records at different times, so the number of
beans needed is the number of different records you need to have
_simultaneously_. The "thread pool" concept can be implemented in a
similar way, although this is less common. Another field where 2SC is
often used is GUIs; in that case the main motivation is "impedance
matching" between a GUI toolkit (often a cross-platform one) and a given
platform's underlying toolkit.

Python offers good opportunities to implement 2SC "under the covers"
thanks to the split between __new__ and __init__: indeed one can get
tricky, since __new__ need not necessarily perform the first step of 2SC
(it might well return a "scrubbed" object from the pool rather than the
new one). Unpickling may normally use __setstate__ instead of __init__
(after __new__, anyway) -- that's more flexible and often easier to
arrange than going through __getinitargs__ (which doesn't work for newstyle
classes anyway) or __getnewargs__ (which doesn't work for classic ones).
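
A minimal sketch of the __new__/__init__ split used for pooling. The class
name PooledRecord, its _pool list, and its release() method are made up for
this illustration, and the sketch ignores subclassing and thread safety:

```python
class PooledRecord:
    """Hypothetical record class that recycles instances via __new__."""

    _pool = []  # scrubbed instances waiting to be reused

    def __new__(cls, *args, **kwargs):
        # Step one: obtain an object, recycled from the pool if possible.
        if cls._pool:
            return cls._pool.pop()
        return super().__new__(cls)

    def __init__(self, key):
        # Step two: load state. Python calls __init__ on whatever
        # __new__ returned, as long as it is an instance of cls.
        self.key = key

    def release(self):
        # Scrub the state and hand the object back to the pool.
        self.key = None
        type(self)._pool.append(self)


a = PooledRecord("first")
a.release()
b = PooledRecord("second")
print(b is a)   # True: the scrubbed object was reused
print(b.key)    # second
```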

Of course, the way OOP is normally taught, 2SC sounds like a heresy, but
"out in the real world" it does have some advantages (even though
single-step construction remains the normal approach in most cases).

> But if the class has no natural default state, then it makes no sense to
> create an "empty object" with no data, a "non-image image" so to speak.

Hmmm, it might, actually; that's what __new__ normally does for
instances of mutable classes, so that __init__ or __setstate__ or other
different methods yet, as appropriate, can then make the object
"nonempty" in different ways.


> In other words, if you find yourself writing methods like this:
>
> class Klass:
> def foo(self):
> if self.data is None:
> raise KlassError("Can't foo an uninitialized Klass object.")
> else:
> # do something
>
> then you are just doing pointless make-work to fit a convention that
> doesn't make sense for your class.

It does appear to be a code smell, yes. But protocol semantics may
often require such constraints as "it does not make sense to call x.A
unless x.B has been previously called" or vice versa "it's forbidden to
call x.A if x.B has already been called", and such constraints are
generally implemented through something like the above idiom.

Consider a file object, for example: after you call f.close() any other
method call must raise an error. The simplest, most natural way to
implement this is very similar to what you just coded, using a
self.closed flag -- a case of "two-step destruction", separating
termination ("close") from finalization (destruction proper).

Reusable objects, that can be born "scrubbed", loaded with some data and
used for a while, then "scrubbed" again (with the data getting persisted
off and the object surviving and going into a reuse-pool), typically
need a flag to know if they're scrubbed or not (or, some other data
member may play double duty, e.g. by being None iff an object is
scrubbed). All normal methods must then raise if called on a scrubbed
object; code for that purpose may be injected by decorators or a custom
metaclass to reduce the boilerplate and code smells.
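
One way the decorator variant might look. All names here (ScrubbedError,
requires_data, Record) are invented for the sketch, and it assumes that
"scrubbed" is signalled by self.data being None:

```python
import functools


class ScrubbedError(Exception):
    """Raised when a method is called on a scrubbed object."""


def requires_data(method):
    """Wrap a method so it raises if the object holds no data."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self.data is None:
            raise ScrubbedError(
                f"cannot call {method.__name__}() on a scrubbed object")
        return method(self, *args, **kwargs)
    return wrapper


class Record:
    def __init__(self):
        self.data = None      # born scrubbed

    def load(self, data):
        self.data = data

    def scrub(self):
        self.data = None

    @requires_data
    def size(self):
        return len(self.data)
```

The check lives in one place, so the normal methods carry no repeated
"if self.data is None" boilerplate.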

Note that I'm not defending the OP's contention -- I've seen no reason
in his post making 2SC/2SD desirable. I'm just addressing the wider
issue... one I probably wouldn't even cover in "OOP 101", but hold for a
later course, e.g. one on the architecture of persistence frameworks.


Alex

Magnus Lycka

Feb 2, 2006, 10:29:15 AM
Alex Martelli wrote:
> Steven D'Aprano <st...@REMOVETHIScyber.com.au> wrote:
>>And now, at long last, the image object actually is an image. So why make
>>this a two step process? Whatever the Image() initialization does, why
>>can't it be done automatically when you read the file?
>
> "Two-step construct" ("2SC") is a reasonably well-known and widely
> useful idiom, and it can serve several kinds of purposes, many but not
> all of which are tied to persistence.

I've seen it used as a C++ coding standard when e.g. loading
instances from databases (Person p=Person(); p.init(id);)
since it was considered bad to have code in the constructor
which could throw exceptions. If you make a new Person on
the heap, and catch an exception from the constructor, are
you sure that you delete the object?

Alex Martelli

Feb 2, 2006, 11:03:27 AM
Magnus Lycka <ly...@carmen.se> wrote:

Yes, this is a language-specific issue that applies to C++. But I tried
to show that 2SC (and its counterpart, 2SD) can come in handy in many
other languages, too, including Python.


Alex

Daniel Nogradi

Feb 2, 2006, 12:04:34 PM
to pytho...@python.org

From all the enlightening replies it seems to me that my next task
should be to look into what 'factory functions' are. Whenever I've
seen the word 'factory' I just skipped it, knowing that if I really
need to know what it is, I'll come back to it later. Apparently it's
just about time, thanks all for the comments.

Blair P. Houghton

Feb 2, 2006, 8:00:26 PM
Image would be a superclass to JPGImage, BMPImage, PNGImage, etc...

But which to use could only be determined AFTER opening the file,
because "file.jpg" doesn't have type JPG, it has type string and
semantic value "maybe a jpeg file or maybe something misnamed as a jpeg
file".

So Image.open(filename) seems right as a factory function that opens
the file, figures out what it really is, constructs the appropriate
subclass object (likely by passing a string to the constructor, e.g.,
JPGImage(filename)), and returns the object via the superclass type.
The caller can then either check a flag in the superclass to see what
type the subclass is, or just assume it's the right type of image for
the filename extension (or does Python have RTTI? I don't recall if
I've seen it, yet...).

Though if the filename doesn't match the content, someone should raise
an exception...

But this means that Image.open(filename) is a static method of the
superclass, not requiring instantiation. Image(string) could easily
default to assuming string is a filename, doing the same work as
Image.open(filename), though it would at least partially construct an
Image instance each time it's called, which isn't what you want.
Image.open(filename) defined as a static method (wait...is that
possible in Python? I hate being the newbie) would not do any
constructing until after it knew what to construct.

--Blair

Steven D'Aprano

Feb 3, 2006, 7:33:22 AM
On Thu, 02 Feb 2006 17:00:26 -0800, Blair P. Houghton wrote:

> Image would be a superclass to JPGImage, BMPImage, PNGImage, etc...
>
> But which to use could only be determined AFTER opening the file,
> because "file.jpg" doesn't have type JPG, it has type string and
> semantic value "maybe a jpeg file or maybe something misnamed as a jpeg
> file".

Absolutely.



> So Image.open(filename) seems right as a factory function that opens
> the file, figures out what it really is, constructs the appropriate
> subclass object (likely by passing a string to the constructor, e.g.,
> JPGImage(filename)), and returns the object via the superclass type.
> The caller can then either check a flag in the superclass to see what
> type the subclass is, or just assume it's the right type of image for
> the filename extension (or does Python have RTTI? I don't recall if
> I've seen it, yet...).

RTTI = Run Time Type Information, yes?

Objects in Python are strongly typed, if that's what you mean. So if each
image kind (jpeg, tiff, png, etc.) is a subclass, then you can easily use
introspection to find out which subclass it is.
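
For instance, with a made-up hierarchy (these classes are placeholders, not
PIL's), the usual introspection tools look like this:

```python
class Image:
    pass


class JpegImage(Image):
    pass


class PngImage(Image):
    pass


img = JpegImage()

print(type(img).__name__)           # JpegImage
print(img.__class__ is JpegImage)   # True
print(isinstance(img, Image))       # True: a subclass instance is an Image
print(isinstance(img, PngImage))    # False
```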

> Though if the filename doesn't match the content, someone should raise
> an exception...

That depends on whether you believe DOS style filename extensions are
significant, or that they are a poor substitute for real metadata.

If you come from Windows land, you might think the file extension is the
file type. If you come from Linux land, you might think that file
extensions are just a label stuck on the file name as a convenience for
the user. If you come from Macintosh land, especially pre-OS X, you
probably think that file extensions are at best a useful redundancy and at
worst an abomination.

From a UI perspective, an application should never refuse to handle a file
because of a mismatch between the file type contents, the file type
metadata (if any), and the file extension, with one important proviso as
follows. In general, if your application handles (say) both JPEGs and
GIFs, and the user tries to open a JPEG named foo.gif, then the correct
action is to advise the user that the file is actually a JPEG, and
give them the opportunity to open the file anyway.

Whether that is best handled internally by making Image.read() raise an
exception or not is a question for the class designer.

The proviso is, if there are potential security implications of that
mismatch (e.g. an executable file masquerading as a .text file), and the
target audience is technically naive, then I believe it may be appropriate
for the application to refuse to handle the file. Don't let the user shoot
themselves in the foot if they aren't technically sophisticated enough to
realise they are shooting themselves in the foot.


--
Steven.

Nicola Musatti

Feb 3, 2006, 8:05:20 AM

Steven D'Aprano wrote:
[...]

> If a class has a natural, obvious default state (e.g. a mutable string
> class might start off empty, a mutable int class might start off as zero,
> a binary tree might start off as an empty node with no children) then it
> makes sense to initialise the class, then add your data.
>
> But if the class has no natural default state, then it makes no sense to
> create an "empty object" with no data, a "non-image image" so to speak.

I don't think this is all there is to it. Even though a class such as
Image might not have a sensible default, initial state it still might
not be reasonable to burden it with the ability to collect the
information needed to reach such an initial state. To put it another
way: a car is a car, it isn't a car factory.

> In other words, if you find yourself writing methods like this:
>
> class Klass:
>     def foo(self):
>         if self.data is None:
>             raise KlassError("Can't foo an uninitialized Klass object.")
>         else:
>             # do something

Factory functions (or classes) are there to solve this problem and
still allow a clean separation of concerns. Although instances of Klass
are created uninitialized, they only live in this state within their
factory and reach the outside world only when they are in a
usable state.

Cheers,
Nicola Musatti

bruno at modulix

Feb 3, 2006, 9:22:21 AM
Blair P. Houghton wrote:

(snip)

> So Image.open(filename) seems right as a factory function that opens
> the file, figures out what it really is, constructs the appropriate
> subclass object (likely by passing a string to the constructor, e.g.,
> JPGImage(filename)), and returns the object via the superclass type.

Why "via the superclass type" ? "returns the object" is enough.

> The caller can then either check a flag in the superclass to see what
> type the subclass is,

Why the h... ? We don't care what type it is, as long as it does what we
expect it to do.

> or just assume it's the right type of image

Yes

(snip)


> (or does Python have RTTI?

Much better than 'RTTI'.

> I don't recall if
> I've seen it, yet...).

obj.__class__ is a reference to the class (which is itself an object...)


> Though if the filename doesn't match the content, someone should raise
> an exception...

Why ? filenames and extensions are nothing more than conventions.
Image.open()'s responsibility is to create an object of the appropriate
class, period. If *your program* needs to ensure that the image type
matches the filename, it's your problem to check this.

> But this means that Image.open(filename) is a static method of the
> superclass, not requiring instantiation.

AFAIK, it's just a function in the Image module (which is itself an
object - instance of class module - but that's orthogonal).

> Image(string) could easily
> default to assuming string is a filename, doing the same work as
> Image.open(filename), though it would at least partially construct an
> Image instance each time it's called, which isn't what you want.
> Image.open(filename) defined as a static method (wait...is that
> possible in Python? I hate being the newbie)

It is (search for 'staticmethod' and 'classmethod'). But there's not
much use for 'static methods' in Python - we usually just use plain
functions ('classmethods' are another beast - much more useful than
staticmethods)
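
To make the distinction concrete, a small sketch (the Image and Thumbnail
classes and their methods are invented for the example and are unrelated to
PIL):

```python
class Image:
    def __init__(self, data):
        self.data = data

    @classmethod
    def from_bytes(cls, raw):
        # cls is whichever class the method was called on, so a
        # subclass gets an instance of itself.
        return cls(bytes(raw))

    @staticmethod
    def is_png(raw):
        # Receives neither self nor cls; it could just as well be a
        # plain module-level function.
        return raw.startswith(b"\x89PNG\r\n\x1a\n")


class Thumbnail(Image):
    pass


thumb = Thumbnail.from_bytes(b"\x89PNG\r\n\x1a\n")
print(type(thumb).__name__)         # Thumbnail
print(Image.is_png(thumb.data))     # True
```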


--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'on...@xiludom.gro'.split('@')])"

I V

Feb 3, 2006, 2:09:28 PM
Nicola Musatti wrote:
> I don't think this is all there is to it. Even though a class such as
> Image might not have a sensible default, initial state it still might
> not be reasonable to burden it with the ability to collect the
> information needed to reach such an initial state. To put it another
> way: a car is a car, it isn't a car factory.

What's the "burden", though? Surely the only burden is that the class
needs to take the relevant parameters to its __init__ method, which is
no more of a burden than providing some other initialization method
that takes the relevant parameters.

> Factory functions (or classes) are there to solve this problem and
> still allow a clean separation of concerns. Although instances of Klass
> are created uninitialized, they only live in this state within their
> factory and reach the outside world only when they are in a
> usable state.

This may be my limited imagination, but I can't think of a situation
when you would prefer something like:

def factory(info):
    k = Klass()
    data = get_initial_data(info)
    k.set_data(data)
    return k

to:

def factory(info):
    data = get_initial_data(info)
    return Klass(data)

What would be the value of doing the initialization in a separate
method, rather than the constructor?

Blair P. Houghton

Feb 4, 2006, 2:31:19 AM

bruno at modulix wrote:

> Blair P. Houghton wrote:
> > So Image.open(filename) seems right as a factory function that opens
> > the file, figures out what it really is, constructs the appropriate
> > subclass object (likely by passing a string to the constructor, e.g.,
> > JPGImage(filename)), and returns the object via the superclass type.
>
> Why "via the superclass type" ? "returns the object" is enough.

I'm just being pedantic; because we can't tell from the return
type what subclass type it is, we see it as an object of the
superclass type and then ask it what subclass type it is.

> > The caller can then either check a flag in the superclass to see what
> > type the subclass is,
>
> Why the h... ? We don't care what type it is, as long as it does what we
> expect it to do.

We might want to know what to expect. A function returning a Person
may return a Man or a Woman. It could make a difference to our
Libido, so it's in our best interest to find out what was returned, so
we can call the right methods specific to that subclass.

> > or just assume it's the right type of image
>
> Yes
>
> (snip)
> > (or does Python have RTTI?
>
> Much better than 'RTTI'.
>
> > I don't recall if
> > I've seen it, yet...).
>
> obj.__class__ is a reference to the class (which is itself an object...)

Shortly after I posted that I came across other posts mentioning
__class__ and I'm trying to grok it in fullness now.

> > Though if the filename doesn't match the content, someone should raise
> > an exception...
>
> Why ? filenames and extensions are nothing more than conventions.

Ostensibly. But they're also a means of deceiving people, so the
handling of a mismatch deserves care. Whether that care takes the form
of an exception or of defensive coding (which is kind of what exceptions
simplify) is up to you.

> > Image(string) could easily
> > default to assuming string is a filename, doing the same work as
> > Image.open(filename), though it would at least partially construct an
> > Image instance each time it's called, which isn't what you want.
> > Image.open(filename) defined as a static method (wait...is that
> > possible in Python? I hate being the newbie)
>
> It is (search for 'staticmethod' and 'classmethod'). But there's not
> much use for 'static methods' in Python - we usually just use plain
> functions ('classmethods' are another beast - much more useful than
> staticmethods)

Does it make any noticeable difference in efficiency, or does nobody
care much about efficiency in Python?

--Blair

Daniel Nogradi

Feb 4, 2006, 9:00:29 AM
to pytho...@python.org
> > Actually, this way of creating a class instance is good OO practice in
> > many places: The Image.open() method acts as a factory-function for
> > creating Image objects.
> > You don't know, until you inspect the return value, if the created
> > object is actually an instance of class Image or something which just
> > happens to behave like one...
> > It could for instance be a sub-class of Image, JpegImage, if that
> > better suits the class library designers.
> > In general, factory-functions allow greater uncoupling between
> > interface and implementation.
> >
>
> From all the enlightening replies it seems to me that my next task
> should be to look into what 'factory functions' are. Whenever I've
> seen the word 'factory' I just skipped it, knowing that if I really
> need to know what it is, I'll come back to it later. Apparently it's
> just about time, thanks all for the comments.
>


I've been looking around for what factory functions are, but honestly
I've been grepping and googling but couldn't find a clear definition
of the concept of 'factory', 'factory function' or 'class factory'. As
I've said I'm relatively new to OOP and the reason is that for too
long I thought that there is no need for it, ordinary C will do. But
once the concepts became clear, it started to become useful. That's
why I'm not particularly interested in examples and code snippets but
rather the general ideas and their definitions, but for these
'factories' I really couldn't find a clear explanation anywhere only
how to use them, example code and such. Perhaps because I was only
searching in the context of Python.

So after all, what is a 'factory' or 'factory function'?

Steve Holden

Feb 4, 2006, 10:09:05 AM
to pytho...@python.org
Daniel Nogradi wrote:
[...]

> So after all, what is a 'factory' or 'factory function'?

The name is intended to be indicative: it's a function that makes things
- usually instances of some class.

As has already been pointed out, Image is a module from PIL, so
Image.open() is a function in that module. When you call it, it creates
an instance of some suitable image class (which will depend on the type
of the image) and returns that.

The point here is that Image.open() returns a
JpegImagePlugin.JpegImageFile instance for a .jpg file, a
GifImagePlugin.GifImageFile instance for a .gif file, and so on. It
wouldn't make sense to use a single class to represent all these
different image types, so Image.open() just creates an instance of the
appropriate class (whatever that may be, which will vary from call to
call) and returns that.
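
A simplified sketch of such a factory. The classes and open_image() below
are stand-ins written for this example, not PIL's implementation, and the
function takes the raw bytes directly rather than a filename so the example
stays self-contained:

```python
class ImageFile:
    """Hypothetical common base class."""
    format = None

    def __init__(self, data):
        self.data = data


class JpegImageFile(ImageFile):
    format = "JPEG"


class GifImageFile(ImageFile):
    format = "GIF"


def open_image(data):
    """Return an instance of the class matching the leading bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return JpegImageFile(data)
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return GifImageFile(data)
    raise ValueError("unrecognized image data")


img = open_image(b"GIF89a" + b"\x00" * 10)
print(type(img).__name__, img.format)   # GifImageFile GIF
```

The caller never names a concrete class; the choice is made from the
content, which is why a misnamed file can still be opened correctly.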

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Alex Martelli

Feb 4, 2006, 10:33:14 AM
Daniel Nogradi <nog...@gmail.com> wrote:
...

> So after all, what is a 'factory' or 'factory function'?

A brief explanation in Python terms is at
http://www.aleax.it/ep03_pydp.pdf -- "pages" (slides) 37-44 (the rest of
the presentation is about an even more fundamental design pattern,
"template method"). A far more extensive essay can be found, for
example, at
http://gsraj.tripod.com/design/creational/factory/factory.html -- it
names the pattern "factory method", not "factory function" ("method" is
probably a more widespread name for it) and presents examples in Java
and Corba IDL.

Anyway, trying to summarize:

- "what": any function or method or other callable whose task it
is to build and return new objects can be called "a factory";

- sometimes a factory may get away with returning an existing
object for recycling "as if" it was a new one, saving some
resources, and this is one advantage;

- a factory may choose what exact type of object to build and return
based on arguments or other circumstances, and this is a second
advantage.

"Program to an interface, not to an implementation" -- the key
underlying principle of the Gang of Four's "Design Patterns" book.

If you build an object of a specific given type ("new Foo" in Java
or C++), you're hardwiring the choice of implementation (the
exact concrete type); delegating the building to a method or
function frees your application from such hardwiring and lets
it be concentrated (if needed at all) in one spot (the factory).
[[With the "registry" pattern you may even be able to remove
any hardwiring, but that's a pretty advanced and subtle idea]].
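
One possible reading of the "registry" idea, sketched with invented names
(register, make_image, and the two classes are all hypothetical): concrete
classes announce themselves under a key, and the factory only looks the key
up, so it never mentions a concrete type.

```python
_registry = {}   # maps a format name to the class that handles it


def register(kind):
    """Class decorator that records a class under the given name."""
    def decorator(cls):
        _registry[kind] = cls
        return cls
    return decorator


def make_image(kind, *args):
    """Factory: build whatever class is registered for `kind`."""
    try:
        cls = _registry[kind]
    except KeyError:
        raise ValueError(f"unknown kind: {kind!r}") from None
    return cls(*args)


@register("jpeg")
class JpegImage:
    def __init__(self, data):
        self.data = data


@register("gif")
class GifImage:
    def __init__(self, data):
        self.data = data


print(type(make_image("gif", b"")).__name__)   # GifImage
```

Supporting a new format then means adding a decorated class; make_image()
itself stays untouched.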


Alex

Alex Martelli

Feb 4, 2006, 10:40:21 AM
Blair P. Houghton <blair.h...@gmail.com> wrote:
...

> > It is (search for 'staticmethod' and 'classmethod'). But there's not
> > much use for 'static methods' in Python - we usually just use plain
> > functions ('classmethods' are another beast - much more useful than
> > staticmethods)
>
> Does it make any noticeable difference in efficiency, or does nobody
> care much about efficiency in Python?

Some of us do, at few but crucial moments; that's why we have timeit.py
to let you measure the performance of pieces of code you care about.

helen:~ alex$ python -mtimeit -s'class X(object):
> @staticmethod
> def f(): pass
> x=X()' 'x.f()'
1000000 loops, best of 3: 1.07 usec per loop
helen:~ alex$ python -mtimeit -s'def f(): pass' 'f()'
1000000 loops, best of 3: 0.727 usec per loop
helen:~ alex$

As you see, static methods have a small extra lookup cost (a couple
hundred nanoseconds on my oldish laptop); normally, one would use a
function instead. But if the body of the function/method did something
significant, the lookup overhead would then matter little; clarity and
organization and simplicity, OTOH, _always_ matter. "Premature
optimization is the root of all evil in programming", as Knuth wrote
quoting Hoare -- and anybody who's read Knuth's work knows he is
anything BUT indifferent to real optimization; the key is avoiding that
"premature" part!-)

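The same comparison can also be run from inside a script with the timeit module. This is a sketch only: the absolute numbers, and even the size of the gap, depend on the machine and the Python version, so no expected output is shown.

```python
import timeit


class X(object):
    @staticmethod
    def f():
        pass


def f():
    pass


x = X()

# Take the best of three runs of 100000 calls each, as the command line does.
static_time = min(timeit.repeat("x.f()", globals={"x": x},
                                number=100000, repeat=3))
plain_time = min(timeit.repeat("f()", globals={"f": f},
                               number=100000, repeat=3))

print("static method: %.3f usec per call" % (static_time / 100000 * 1e6))
print("plain function: %.3f usec per call" % (plain_time / 100000 * 1e6))
```

Measuring like this before changing any code is the practical way to avoid the "premature" part.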

Alex

Blair P. Houghton

Feb 5, 2006, 3:48:43 AM2/5/06
to

Alex Martelli wrote:
> As you see, static methods have a small extra lookup cost (a couple
> hundred nanoseconds on my oldish laptop);

I would've expected the opposite effect...I guess the runtime
considers instances more active than the static portion of
a class.

> "Premature
> optimization is the root of all evil in programming", as Knuth wrote
> quoting Hoare -- and anybody who's read Knuth's work knows he is
> anything BUT indifferent to real optimization; the key is avoiding that
> "premature" part!-)

Apropos of which, someone in a thread on Slashdot today mentioned
writing an entire program's code in Python first and then rewriting
portions in C or C++ only as performance warrants.

Seems like a good idea. I've noticed Python is a lot easier to
get up-and-running with, even if you're as unfamiliar with it as
I am, compared to the other two.

--Blair

Nicola Musatti

Feb 6, 2006, 5:28:46 AM2/6/06
to

I V wrote:
> Nicola Musatti wrote:
[...]

> > Factory functions (or classes) are there to solve this problem and
> > still allow a clean separation of concerns. Although instances of Klass
> > are created uninitialized, they only live in this state within their
> > factory and reach the outside world only when they are in a
> > usable state.
>
> This may be my limited imagination, but I can't think of a situation
> when you would prefer something like:
>
> def factory(info):
>     k = Klass()
>     data = get_initial_data(info)
>     k.set_data(data)
>     return k
>
> to:
>
> def factory(info):
>     data = get_initial_data(info)
>     return Klass(data)
>
> What would be the value of doing the initialization in a separate
> method, rather than the constructor?

I didn't express my intent clearly. I agree that in general your second
example is to be preferred to the first one. In fact the only reason I
can think of for using the first scheme is when the second would lead to
Klass's __init__ method having a large number of parameters.

What is important to me is to keep your get_initial_data() function
outside Klass if its task is nontrivial, e.g. it has to interact with
the OS or a DB.
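
A minimal sketch of that arrangement, under assumptions made only for this example: info is the path to a JSON file, and Klass has a total() method. Klass knows nothing about files; only get_initial_data() touches the OS.

```python
import json


class Klass:
    """Pure application logic: works on data it is handed."""
    def __init__(self, data):
        self.data = data

    def total(self):
        return sum(self.data.values())


def get_initial_data(info):
    """Outside-world interaction: here, reading a JSON file (assumed)."""
    with open(info) as source:
        return json.load(source)


def factory(info):
    """Glue: fetch the data, then build a fully initialized Klass."""
    data = get_initial_data(info)
    return Klass(data)
```

Swapping the JSON file for a database query would change get_initial_data() only; Klass and its callers stay as they are.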

Cheers,
Nicola Musatti

Fredrik Lundh

Feb 6, 2006, 5:36:41 AM2/6/06
to pytho...@python.org
Nicola Musatti wrote:

> > def factory(info):
> >     k = Klass()
> >     data = get_initial_data(info)
> >     k.set_data(data)
> >     return k
> >
> > to:
> >
> > def factory(info):
> >     data = get_initial_data(info)
> >     return Klass(data)
> >
> > What would be the value of doing the initialization in a separate
> > method, rather than the constructor?
>
> I didn't express my intent clearly. I agree that in general your second
> example is to be preferred to the first one. In fact the only reason I
> could think of using the first scheme is when the second would lead to
> Klass's __init__ method having a large number of parameters.
>
> What is important to me is to keep your get_initial_data() function
> outside Klass if its task is non trivial, e.g. it has to interact with
> the OS or a DB.

why ?

</F>

Daniel Nogradi

Feb 6, 2006, 10:49:19 AM2/6/06
to pytho...@python.org

Thank you very much, this helped a lot. Just one side note, the url
http://www.aleax.it/ep03_pydp.pdf should be
http://www.aleax.it/Python/ep03_pydp.pdf, I found your document there.
Thanks again.

Scott David Daniels

Feb 6, 2006, 11:26:58 AM2/6/06
to
Fredrik Lundh wrote:

> Nicola Musatti wrote:
>> What is important to me is to keep your get_initial_data() function
>> outside Klass if its task is non trivial, e.g. it has to interact with
>> the OS or a DB.
>
> why ?

In order to simplify testing.
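
For instance (a sketch with made-up names: largest(), fake_get_initial_data, and a factory that takes the loader as a parameter are assumptions for illustration), the class can be tested with plain in-memory data, and the factory with a stub in place of the real OS or DB access:

```python
class Klass:
    def __init__(self, data):
        self.data = data

    def largest(self):
        return max(self.data)


def factory(info, get_initial_data):
    """Build a Klass using whatever loader the caller supplies."""
    return Klass(get_initial_data(info))


def fake_get_initial_data(info):
    """Test stand-in: no file, no database, just fixed data."""
    return [3, 1, 4]


def test_largest():
    # Klass is exercised directly; nothing external is needed.
    assert Klass([3, 1, 4]).largest() == 4


def test_factory_with_stub():
    k = factory("ignored", fake_get_initial_data)
    assert k.data == [3, 1, 4]


test_largest()
test_factory_with_stub()
```

Neither test needs a file system or a database to be present.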

--Scott David Daniels
scott....@acm.org

Nicola Musatti

Feb 6, 2006, 11:51:56 AM2/6/06
to

Fredrik Lundh wrote:
> Nicola Musatti wrote:
[...]
> > What is important to me is to keep your get_initial_data() function
> > outside Klass if its task is non trivial, e.g. it has to interact with
> > the OS or a DB.
>
> why ?

Separating the internal logic of an application from its interactions
with the outside world, e.g. retrieval from a file or a database and
presentation to a GUI, makes it easier to reuse the logic in different
contexts and may help make it possible to use highly generic solutions
for those interactions. I'm thinking of code generation or
introspection, which may be applied with very limited knowledge of the
application logic.

In larger projects I find that this separation tends to match the
different areas of expertise that are needed: database experts, GUI
experts, domain experts, etc.

Cheers,
Nicola Musatti
