Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

object.enable() anti-pattern

130 views
Skip to first unread message

Steven D'Aprano

unread,
May 8, 2013, 4:52:12 AM5/8/13
to
I'm looking for some help in finding a term, it's not Python-specific but
does apply to some Python code.

This is an anti-pattern to avoid. The idea is that creating a resource
ought to be the same as "turning it on", or enabling it, or similar. For
example, we don't do this in Python:


f = file("some_file.txt")
f.open()
data = f.read()


because reading the file can fail if you forget to call open first.
Instead, Python uses a factory function that creates the file object and
opens it:

f = open("some_file.txt") # if this succeeds, f is ready to use
data = f.read()


Basically, any time you have two steps required for using an object, e.g.
like this:

obj = someobject(arg1, arg2)
obj.enable()

you should move the make-it-work functionality out of the enable method
and into __init__ so that creating the object creates it in a state ready
to work.

I read a blog some time ago (a year ago?) that discusses this anti-
pattern, and I'm pretty sure gave it a specific name, but I cannot find
it again. Can anyone find any references to this anti-pattern? My google-
fu is letting me down.



(For what it's worth, I'm refactoring some of my own code that falls into
this anti-pattern. Die, enable method, die die die!)


--
Steven

Christian Heimes

unread,
May 8, 2013, 5:51:47 AM5/8/13
to pytho...@python.org
Am 08.05.2013 10:52, schrieb Steven D'Aprano:
> Basically, any time you have two steps required for using an object, e.g.
> like this:
>
> obj = someobject(arg1, arg2)
> obj.enable()
>
> you should move the make-it-work functionality out of the enable method
> and into __init__ so that creating the object creates it in a state ready
> to work.

In general I agree that an object.enable() function is ugly. ;)

But it's not necessarily the best solution for all problems. If the
resource needs some kind of cleanup, the context api (__enter__() /
__exit__()) is perfectly fine way to enable and disable the object.

For example:

class MyFile:
def __init__(self, filename):
self.filename = filename
def __enter__(self):
self.open()
return self
def __exit__(self, exc_type, exc_val, exc_tb):
self.close()

I suggest that you mention the context API in your blog post, too.

Christian

Robert Kern

unread,
May 8, 2013, 6:13:33 AM5/8/13
to pytho...@python.org
On 2013-05-08 09:52, Steven D'Aprano wrote:
> I'm looking for some help in finding a term, it's not Python-specific but
> does apply to some Python code.
>
> This is an anti-pattern to avoid. The idea is that creating a resource
> ought to be the same as "turning it on", or enabling it, or similar.

I don't think the anti-pattern has a name, but it's opposite pattern is named:

http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Steven D'Aprano

unread,
May 8, 2013, 8:30:30 AM5/8/13
to
On Wed, 08 May 2013 11:13:33 +0100, Robert Kern wrote:

> On 2013-05-08 09:52, Steven D'Aprano wrote:
>> I'm looking for some help in finding a term, it's not Python-specific
>> but does apply to some Python code.
>>
>> This is an anti-pattern to avoid. The idea is that creating a resource
>> ought to be the same as "turning it on", or enabling it, or similar.
>
> I don't think the anti-pattern has a name, but it's opposite pattern is
> named:
>
> http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization

That sounds like it should be related, but it actually isn't since RAII
actually has little to do with *acquiring* the resource and everything to
do with *releasing* it. See, for example:

http://stackoverflow.com/questions/2321511/what-is-meant-by-resource-
acquisition-is-initialization-raii

where it is pointed out that the resource may be acquired outside of the
object constructor and passed in as an argument. RAII is actually about
deterministic destruction of objects and the release of their resources.

But thanks for the attempt :-)


--
Steven

Roy Smith

unread,
May 8, 2013, 9:17:17 AM5/8/13
to
In article <518a123c$0$11094$c3e...@news.astraweb.com>,
Steven D'Aprano <steve+comp....@pearwood.info> wrote:

> I'm looking for some help in finding a term, it's not Python-specific but
> does apply to some Python code.
>
> This is an anti-pattern to avoid. The idea is that creating a resource
> ought to be the same as "turning it on", or enabling it, or similar. For
> example, we don't do this in Python:
>
>
> f = file("some_file.txt")
> f.open()
> data = f.read()

I've worked with C++ code that did this. At one point in the evolution
of OOP group consciousness, there was a feeling that constructors must
never fail. I don't remember if it was a general language-agnostic
pattern, or a specific C++ reaction to poor exception handling code in
early compilers. What came out of that was the pattern you describe.
All the code that could fail was factored out of the constructor into an
"enable" method.

That being said, sometimes there are good reasons for doing this. One
example might be something like:

frobnicator = Frobnicator()
for file in my_file_list:
frobnicator.munch(file)
for line in frobnicator:
process(line)

If creating a Frobnicator instance is very expensive, it might pay to
create an instance once and keep reusing it on multiple files. Here,
munch() is your enable() method. But, that's not quite what you were
talking about.

Duncan Booth

unread,
May 8, 2013, 10:27:53 AM5/8/13
to
Steven D'Aprano <steve+comp....@pearwood.info> wrote:

> I'm looking for some help in finding a term, it's not Python-specific
> but does apply to some Python code.
>
> This is an anti-pattern to avoid. The idea is that creating a resource
> ought to be the same as "turning it on", or enabling it, or similar

I've come across this under the name 'two-phase construction', but as a
desirable(!?) pattern rathern than as an anti-pattern.

In particular Symbian used it throughout as originally their C++
implementation didn't support exceptions. Instead they had a separate
cleanup stack and objects that require cleanup were pushed onto that stack
after being fully constructed but before calling the initialisation that
required cleanup. See
http://www.developer.nokia.com/Community/Wiki/Two-phase_construction


--
Duncan Booth

Dan Sommers

unread,
May 8, 2013, 10:38:36 PM5/8/13
to
On Wed, 08 May 2013 14:27:53 +0000, Duncan Booth wrote:

> Steven D'Aprano <steve+comp....@pearwood.info> wrote:

>> I'm looking for some help in finding a term, it's not Python-specific
>> but does apply to some Python code.
>>
>> This is an anti-pattern to avoid. The idea is that creating a
>> resource ought to be the same as "turning it on", or enabling it, or
>> similar
>
> I've come across this under the name 'two-phase construction', but as
> a desirable(!?) pattern rathern than as an anti-pattern.

Think of spinning off a thread: initialize it synchronously, and then
let it execute asynchronously. We tend towards that pattern even when
we know that execution will be synchronous, because we also that it
could be asynchronous one day. Yes, we could wrap that up in a neat
bundle, but explicit is better than implicit.

Dan

Dan Sommers

unread,
May 8, 2013, 10:42:01 PM5/8/13
to
On Wed, 08 May 2013 08:52:12 +0000, Steven D'Aprano wrote:

> This is an anti-pattern to avoid. The idea is that creating a resource
> ought to be the same as "turning it on", or enabling it, or similar. For
> example, we don't do this in Python:
>
> f = file("some_file.txt")
> f.open()
> data = f.read()

So why don't we do this?

data = read("some_file.txt")

> because reading the file can fail if you forget to call open first.
> Instead, Python uses a factory function that creates the file object and
> opens it:
>
> f = open("some_file.txt") # if this succeeds, f is ready to use
> data = f.read()

That's just the "enable" paradigm in a socially acceptable and
traditional wrapper.

Opening and closing the file is an implementation detail, often defeated
by caching (application level or OS level) anyway.

Dan

Mark Janssen

unread,
May 8, 2013, 10:53:36 PM5/8/13
to Steven D'Aprano, pytho...@python.org
> I'm looking for some help in finding a term, it's not Python-specific but
> does apply to some Python code.
>
> This is an anti-pattern to avoid. The idea is that creating a resource
> ought to be the same as "turning it on", or enabling it, or similar. For
> example, we don't do this in Python:

I would call it "user-mediated resource allocation" as distinct from
"object-mediated" resource allocation.
--
MarkJ
Tacoma, Washington

Mark Janssen

unread,
May 8, 2013, 10:56:02 PM5/8/13
to Steven D'Aprano, pytho...@python.org
>> This is an anti-pattern to avoid. The idea is that creating a resource
>> ought to be the same as "turning it on", or enabling it, or similar. For
>> example, we don't do this in Python:
>
> I would call it "user-mediated resource allocation" as distinct from
> "object-mediated" resource allocation.

I should point out, though, it's not really an anti-pattern (coming
form the C community).

--
MarkJ
Tacoma, Washington

Steven D'Aprano

unread,
May 8, 2013, 11:12:09 PM5/8/13
to
On Wed, 08 May 2013 14:27:53 +0000, Duncan Booth wrote:

Thanks for the link. It's obviously not the blog post I was looking for,
but it is interesting.



--
Steven

Steven D'Aprano

unread,
May 9, 2013, 1:23:59 AM5/9/13
to
On Thu, 09 May 2013 02:42:01 +0000, Dan Sommers wrote:

> On Wed, 08 May 2013 08:52:12 +0000, Steven D'Aprano wrote:
>
>> This is an anti-pattern to avoid. The idea is that creating a resource
>> ought to be the same as "turning it on", or enabling it, or similar.
>> For example, we don't do this in Python:
>>
>> f = file("some_file.txt")
>> f.open()
>> data = f.read()
>
> So why don't we do this?
>
> data = read("some_file.txt")

Because there is a lot more that you might want to do to a file than just
read from it.

py> f = open('/tmp/x')
py> dir(f)
['_CHUNK_SIZE', '__class__', '__delattr__', '__dict__', '__dir__',
'__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__',
'__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__',
'__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__', '_checkClosed', '_checkReadable',
'_checkSeekable', '_checkWritable', 'buffer', 'close', 'closed',
'detach', 'encoding', 'errors', 'fileno', 'flush', 'isatty',
'line_buffering', 'mode', 'name', 'newlines', 'read', 'readable',
'readline', 'readlines', 'seek', 'seekable', 'tell', 'truncate',
'writable', 'write', 'writelines']


That's 24 public methods and attributes, excluding private and dunder
attributes.

There is no sensible use-case for creating a file without opening it.
What would be the point? Any subsequent calls to just about any method
will fail. Since you have to open the file after creating the file object
anyway, why make them two different calls?

Besides, this is not to denigrate the idea of a read() function that
takes a filename and returns its contents. But that is not an object
constructor. It may construct a file object internally, but it doesn't
return the file object, so it is completely unrelated to the scenario I
described.



>> because reading the file can fail if you forget to call open first.
>> Instead, Python uses a factory function that creates the file object
>> and opens it:
>>
>> f = open("some_file.txt") # if this succeeds, f is ready to use
>> data = f.read()
>
> That's just the "enable" paradigm in a socially acceptable and
> traditional wrapper.

Well duh. That's like saying that for loops, while loops and co-routines
are just GOTO in a socially acceptable and traditional wrapper. Of course
they are. That's the whole point.



--
Steven

Steven D'Aprano

unread,
May 9, 2013, 1:37:35 AM5/9/13
to
On Thu, 09 May 2013 02:38:36 +0000, Dan Sommers wrote:

> Think of spinning off a thread: initialize it synchronously, and then
> let it execute asynchronously. We tend towards that pattern even when
> we know that execution will be synchronous, because we also that it
> could be asynchronous one day.

Whether it is synchronous or asynchronous is irrelevant.

I can see use-cases for separating "make it go" from initialisation. It
all depends on what you might want to do to the object before making it
go. If the answer is "Nothing", then there is no reason not to have the
constructor make it go. If the answer is, "Well, sometimes we need to do
things to the object before making it go", then it makes sense to
separate the two:

blob = MyBlob(arg1, arg2, agr3)
blob.poke("with a stick")
blob.append(data)
blob.foo = "spam"
blob.make_it_go()


I'm not talking about this case. I'm talking about the case where there's
nothing you can do with the blob before making it go.


> Yes, we could wrap that up in a neat
> bundle, but explicit is better than implicit.

"And that is why we always write code like this:

n = int(int(len(something)) + int(1))

just to be sure that the result is explicitly an int and not just
implicitly an int. Suppose some Javascript programmer was reading the
code, and they thought that 1 was a floating point value. That would be
bad!"


--
Steven

Chris Angelico

unread,
May 9, 2013, 1:52:02 AM5/9/13
to pytho...@python.org
On Thu, May 9, 2013 at 3:37 PM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> I can see use-cases for separating "make it go" from initialisation. It
> all depends on what you might want to do to the object before making it
> go. If the answer is "Nothing", then there is no reason not to have the
> constructor make it go. If the answer is, "Well, sometimes we need to do
> things to the object before making it go", then it makes sense to
> separate the two:
>
> blob = MyBlob(arg1, arg2, agr3)
> blob.poke("with a stick")
> blob.append(data)
> blob.foo = "spam"
> blob.make_it_go()

Example use-case: Most GUI frameworks. You create a window, then
populate it, then show it. When you create the window object in
Python, you expect an actual window to exist, it should have its
handle etc. So it's still the same thing; the object is fully created
in its constructor.

ChrisA

Terry Jan Reedy

unread,
May 9, 2013, 2:41:56 AM5/9/13
to pytho...@python.org
On 5/9/2013 1:23 AM, Steven D'Aprano wrote:

> Besides, this is not to denigrate the idea of a read() function that
> takes a filename and returns its contents. But that is not an object
> constructor. It may construct a file object internally, but it doesn't
> return the file object, so it is completely unrelated to the scenario I
> described.

At least a few stdlib modules that define classes *also* have such
functions. They create an instance of the class, call one or more
methods, and return the result of the method, discarding the instance.
For instance, see the subprocess module, its POpen class, and module
functions; or the timeit module, its Timer class, and functions.


Gregory Ewing

unread,
May 9, 2013, 3:54:56 AM5/9/13
to
Steven D'Aprano wrote:
> There is no sensible use-case for creating a file without opening it.
> What would be the point?

Early unix systems often used this as a form of locking.

--
Greg

Cameron Simpson

unread,
May 9, 2013, 4:23:31 AM5/9/13
to Gregory Ewing, pytho...@python.org
Not just early systems: it's a nice lightweight method of making a
lockfile even today if you expect to work over NFS, where not that
many things are synchronous. You open a file with "0" modes, so
that it is _immediately_ not writable. Other attempts to make the
lock file thus fail because of the lack of write, even over NFS.

Cheers,
--
Cameron Simpson <c...@zip.com.au>

You can listen to what everybody says, but the fact remains that you've got
to get out there and do the thing yourself. - Joan Sutherland

Wayne Werner

unread,
May 9, 2013, 7:08:25 AM5/9/13
to Steven D'Aprano, pytho...@python.org


On Wed, 8 May 2013, Steven D'Aprano wrote:

> I'm looking for some help in finding a term, it's not Python-specific but
> does apply to some Python code.
>
> This is an anti-pattern to avoid. The idea is that creating a resource
> ought to be the same as "turning it on", or enabling it, or similar. For
> example, we don't do this in Python:

I'm not entirely sure what the name of it is, but the basic concept is
that you should never partially create, or create a class that can be in
an unstable state. Which isn't to say you should prevent invalid input,
only that with every valid input or single operation (including
construction) your class should be valid.


Ah, that's it - the problem is that it introduces /Temporal Coupling/ to
one's code: http://blog.ploeh.dk/2011/05/24/DesignSmellTemporalCoupling/

You don't ever want a class that has functions that need to be called in a
certain order to *not* crash. That's fine if you have to call them in a
certain sequence in order to get the correct data - that's what
programming is all about, after all. But if you provide me a class with a
constructor you better make sure that when I do this:

thing = YourSuperAwesomeClass()
thing.do_stuff()

that I don't get some horrid stack trace ending with

InvalidStateError: initialize() needs to be called before do_stuff()

Or something worse.


HTH,
Wayne

p.s. I'm interested in reading whatever is evenually written on the topic

Steven D'Aprano

unread,
May 9, 2013, 7:30:32 AM5/9/13
to
On Thu, 09 May 2013 18:23:31 +1000, Cameron Simpson wrote:

> On 09May2013 19:54, Greg Ewing <greg....@canterbury.ac.nz> wrote:
> | Steven D'Aprano wrote:
> | > There is no sensible use-case for creating a file WITHOUT OPENING
> | > it. What would be the point?
> |
> | Early unix systems often used this as a form of locking.
>
> Not just early systems: it's a nice lightweight method of making a
> lockfile even today if you expect to work over NFS, where not that many
> things are synchronous. You OPEN A FILE with "0" modes

[emphasis added]

This is all very well and good, but for the life of me, I cannot see how
opening a file is a good example of not opening a file. Perhaps it is a
Zen thing, like the sound no spoon makes when you don't tap it against a
glass that isn't there.


--
Steven

Steven D'Aprano

unread,
May 9, 2013, 7:51:22 AM5/9/13
to
On Thu, 09 May 2013 06:08:25 -0500, Wayne Werner wrote:

> Ah, that's it - the problem is that it introduces /Temporal Coupling/ to
> one's code: http://blog.ploeh.dk/2011/05/24/DesignSmellTemporalCoupling/

Good catch!

That's not the blog post I read, but that's the same concept. "Temporal
Coupling" -- yes, that is an excellent description of the problem.


> You don't ever want a class that has functions that need to be called in
> a certain order to *not* crash. That's fine if you have to call them in
> a certain sequence in order to get the correct data - that's what
> programming is all about, after all. But if you provide me a class with
> a constructor you better make sure that when I do this:
>
> thing = YourSuperAwesomeClass()
> thing.do_stuff()
>
> that I don't get some horrid stack trace ending with
>
> InvalidStateError: initialize() needs to be called before
> do_stuff()
>
> Or something worse.

Exactly.


Thank you Wayne, that's a great help.



--
Steven

Roy Smith

unread,
May 9, 2013, 9:07:42 AM5/9/13
to
In article <518b32ef$0$11120$c3e...@news.astraweb.com>,
Steven D'Aprano <steve+comp....@pearwood.info> wrote:

> There is no sensible use-case for creating a file without opening it.

Sure there is. Sometimes just creating the name in the file system is
all you want to do. That's why, for example, the unix "touch" command
exists.

Oscar Benjamin

unread,
May 9, 2013, 9:51:16 AM5/9/13
to Roy Smith, pytho...@python.org
Wouldn't the code that implements the touch command just look
something like this:

f = open(filename)
f.close()

Or is there some other way of creating the file that doesn't open it
(I mean in general not just in Python)?


Oscar

Steven D'Aprano

unread,
May 9, 2013, 2:21:37 PM5/9/13
to
Since I neglected to make it clear above that I was still talking about
file objects, rather than files on disk, I take responsibility for this
misunderstanding. I thought that since I kept talking about file
*objects* and *constructors*, people would understand that I was talking
about in-memory objects rather than on-disk files. Mea culpa.

So, let me rephrase that sentence, and hopefully clear up any further
misunderstandings.

There is no sensible use-case for creating a file OBJECT unless it
initially wraps an open file pointer.

This principle doesn't just apply to OOP languages. The standard C I/O
library doesn't support creating a file descriptor unless it is a file
descriptor to an open file. open() has the semantics:

"It shall create an open file description that refers to a file and a
file descriptor that refers to that open file description."

http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html

and there is no corresponding function to create a *closed* file
description. (Because such a thing would be pointless.)

Of course language designers are free to design their language to work
under whatever anti-patterns they desire. I quote from the Pascal
Language Reference:

"open associates the permanent file file [sic] with a file variable for
reading or writing. open does not actually open the file; you must call
reset or rewrite before reading or writing to that file."

http://www.amath.unc.edu/sysadmin/DOC4.0/pascal/lang_ref/
ref_builtin.doc.html


but since that's not a part of standard Pascal, other Pascals may behave
differently.



--
Steven

MRAB

unread,
May 9, 2013, 2:34:25 PM5/9/13
to pytho...@python.org
On 09/05/2013 19:21, Steven D'Aprano wrote:
> On Thu, 09 May 2013 09:07:42 -0400, Roy Smith wrote:
>
>> In article <518b32ef$0$11120$c3e...@news.astraweb.com>,
>> Steven D'Aprano <steve+comp....@pearwood.info> wrote:
>>
>>> There is no sensible use-case for creating a file without opening it.
>>
>> Sure there is. Sometimes just creating the name in the file system is
>> all you want to do. That's why, for example, the unix "touch" command
>> exists.
>
> Since I neglected to make it clear above that I was still talking about
> file objects, rather than files on disk, I take responsibility for this
> misunderstanding. I thought that since I kept talking about file
> *objects* and *constructors*, people would understand that I was talking
> about in-memory objects rather than on-disk files. Mea culpa.
>
> So, let me rephrase that sentence, and hopefully clear up any further
> misunderstandings.
>
> There is no sensible use-case for creating a file OBJECT unless it
> initially wraps an open file pointer.
>
You might want to do this:

f = File(path)
if f.exists():
...

This would be an alternative to:

if os.path.exists(path):
...

> This principle doesn't just apply to OOP languages. The standard C I/O
> library doesn't support creating a file descriptor unless it is a file
> descriptor to an open file. open() has the semantics:
>
> "It shall create an open file description that refers to a file and a
> file descriptor that refers to that open file description."
>
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html
>
> and there is no corresponding function to create a *closed* file
> description. (Because such a thing would be pointless.)
>
[snip]

Roy Smith

unread,
May 9, 2013, 2:59:26 PM5/9/13
to
In article <518be931$0$29997$c3e8da3$5496...@news.astraweb.com>,
Steven D'Aprano <steve+comp....@pearwood.info> wrote:
> There is no sensible use-case for creating a file OBJECT unless it
> initially wraps an open file pointer.

OK, I guess that's a fair statement. But mostly because a python file
object only exposes those subset of operations you can do on file
descriptors which deal with reading and writing the contents of a
file.

It would not be true if python file objects included methods for
querying and manipulating file metadata. It's not hard to imagine a
file class which could be used like:

f = file("/path/to/my/file")
f.delete()

That would be a totally different model from the current python file
object. And then there would be plenty of things you might want to do
to a file other than open it...

file("/path/to/my/directory").chdir()
file("/dev/sdf").mount("/var/lib/whatever")
file("/mnt/swapfile").swapon()

> The standard C I/O library doesn't support creating a file
> descriptor unless it is a file descriptor to an open file [...]
> there is no corresponding function to create a *closed* file
> description. (Because such a thing would be pointless.)

What about sockets? From the python standard library:

s = socket.socket()

Now what? You can't do much with your shiny new socket until you call
bind() or connect(), or a few other things. At least not for TCP.
This is essentially the two-phased construction pattern. Of course,
the python socket module just exposes the semantics of the underlying
OS sockets, so there's not a lot of choice there.

Chris Angelico

unread,
May 9, 2013, 5:55:49 PM5/9/13
to pytho...@python.org
On Fri, May 10, 2013 at 4:59 AM, Roy Smith <r...@panix.com> wrote:
> It's not hard to imagine a
> file class which could be used like:
>
> f = file("/path/to/my/file")
> f.delete()
>
> That would be a totally different model from the current python file
> object. And then there would be plenty of things you might want to do
> to a file other than open it...
>
> file("/path/to/my/directory").chdir()
> file("/dev/sdf").mount("/var/lib/whatever")
> file("/mnt/swapfile").swapon()

Sure, you can imagine it. But what does it do that can't be done with
a moduleful of flat functions accepting strings? This makes sense in
Java, I guess, but why do it in Python?

ChrisA

Greg Ewing

unread,
May 9, 2013, 6:56:45 PM5/9/13
to pytho...@python.org
Cameron Simpson wrote:
> You open a file with "0" modes, so
> that it is _immediately_ not writable. Other attempts to make the
> lock file thus fail because of the lack of write,

I don't think that's quite right. You open it with
O_CREAT+O_EXCL, which atomically fails if the file
already exists. The read/write modes don't really
come into it, as far as I know.

--
Greg

Cameron Simpson

unread,
May 9, 2013, 7:36:43 PM5/9/13
to Steven D'Aprano, pytho...@python.org
Because a file usually does not exist in isolation (yes sometimes
we want an isolated file). Files usually exist in the filesystem,
which is a namespace. And this is effectively a namespace operation,
not a data storage operation.

Of course, I can take this the other way: just because I opened it
with a 0 mode field doesn't mean _I_, the opener, cannot read/write
it. I've got an open file handle... A race free way to make a scratch
file in a shared area, for example.

The point is probably that a file isn't merely a feature free byte
storage container; in the real world they usually come with all
sorts of features like names and permissions. Those features will
always imply creative uses.

Anyway, this has little to do with your antipattern (about which
I'm not totally convinced anyway unless it is a rule of thumb or
code smell). It might apply to a Platonicly ideal file, but real
files have more than one use case.

Cheers,
--
Cameron Simpson <c...@zip.com.au>

I just kept it wide-open thinking it would correct itself.
Then I ran out of talent. - C. Fittipaldi

Gregory Ewing

unread,
May 9, 2013, 7:43:42 PM5/9/13
to
Wayne Werner wrote:
> You don't ever want a class that has functions that need to be called in
> a certain order to *not* crash.

That seems like an overly broad statement. What
do you think the following should do?

f = open("myfile.dat")
f.close()
data = f.read()

--
Greg

Cameron Simpson

unread,
May 9, 2013, 7:29:33 PM5/9/13
to Greg Ewing, pytho...@python.org
Interesting. That is more direct. My understanding that it was
enough to create the file with 0 modes. Maybe either will do. Must
check...

Cheers,
--
Cameron Simpson <c...@zip.com.au>

Piracy gets easier every day, but listening to legally purchased music gets harder by the day.
Firehed - http://politics.slashdot.org/comments.pl?sid=179175&cid=14846089

Michael Speer

unread,
May 9, 2013, 8:18:39 PM5/9/13
to Gregory Ewing, pytho...@python.org
By his reasoning it simply shouldn't exist. Instead you would access the information only like this:

with open("myfile.dat") as f:
  data = f.read()

Which is my preferred way to work with resources requiring cleanup in python anyways, as it ensures I have the least chance of messing things up, and that all of my resources are disposed of properly during the unwind. It's a hell of a lot cleaner than remembering to call the appropriate cleanup functions at every callsite, and the resultant values can never be in a bad state ( at least due to programmer function-ordering fault, underlying file i/o errors and things can still cause errors, but not due to API mis-usage ).

Python 3's multiple-with-statement-target syntax was backported to 2.7 as well, flattening the deep nests of with statements otherwise needed to implement this pattern

Roy Smith

unread,
May 9, 2013, 8:50:16 PM5/9/13
to
In article <mailman.1514.1368145...@python.org>,
Michael Speer <knom...@gmail.com> wrote:

> By his reasoning it simply shouldn't exist. Instead you would access the
> information only like this:
>
> with open("myfile.dat") as f:
> data = f.read()

The problem with things like file objects is they model external
real-world entities, which have externally-imposed real-world behaviors.

f.close() can fail, most commonly because some buffered output couldn't
be written when being flushed as part of the close(). Sometimes it's
important to find out about that.

Steven D'Aprano

unread,
May 9, 2013, 10:30:21 PM5/9/13
to
On Thu, 09 May 2013 19:34:25 +0100, MRAB wrote:

>> There is no sensible use-case for creating a file OBJECT unless it
>> initially wraps an open file pointer.
>>
> You might want to do this:
>
> f = File(path)
> if f.exists():
> ...
>
> This would be an alternative to:
>
> if os.path.exists(path):
> ...

Sure, but your made-up File object does not represent a file, it
represents a pathname which *may or may not* exist. Pathnames are not
files. Not all files have a pathname that refers to them, and not all
pathnames point to an actual file. Since it has an exists() method that
can return False, there are so-called "File" objects that aren't files
and the name is a misnomer. A much better name for the class would be
"Path", not "File".

I'm not saying that there is never a use-case for some objects to have an
"enable" or "start" or "open" method. That would clearly be a silly thing
to say. In the real world, we design many objects to have a start switch.
It would be pretty horrible if your car was running all the time. But
that's because there are actual use-cases for having cars *not* run, and
"make it stop" is the safe default behaviour.

Your fridge, on the other hand, doesn't have a "make it go" button. So
long as power is applied to it, your fridge automatically runs. Likewise,
your watch is an "always on" device, provided it hasn't wound down or
have a flat battery. Your fire alarm is "always on".

I must admit I am astonished at how controversial the opinion "if your
object is useless until you call 'start', you should automatically call
'start' when the object is created" has turned out to be.



--
Steven

Roy Smith

unread,
May 9, 2013, 11:09:55 PM5/9/13
to
In article <518c5bbc$0$29997$c3e8da3$5496...@news.astraweb.com>,
Steven D'Aprano <steve+comp....@pearwood.info> wrote:

> I must admit I am astonished at how controversial the opinion "if your
> object is useless until you call 'start', you should automatically call
> 'start' when the object is created" has turned out to be.

I'm sorry. I thought you were here for an argument.

I think where things went pear shaped is when you made the statement:

>> There is no sensible use-case for creating a file OBJECT unless it
>> initially wraps an open file pointer.

That's a pretty absolute point of view. Life is rarely so absolute.

Chris Angelico

unread,
May 9, 2013, 11:08:03 PM5/9/13
to pytho...@python.org
On Fri, May 10, 2013 at 12:30 PM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> I must admit I am astonished at how controversial the opinion "if your
> object is useless until you call 'start', you should automatically call
> 'start' when the object is created" has turned out to be.

I share your astonishment. This is a very simple point: If, after
constructing an object, the caller MUST call some method on it prior
to the object being of use, then better design is to embed that call
directly into the constructor. As always, it has its exceptions, but
that doesn't stop it being a useful rule.

The Path() equivalent would be:

p = Path()
p.set_path("/foo/bar")
if p.exists():
pass

Even if you have a set_path() method, it makes good sense to symlink
it to __init__ to avoid this anti-pattern.

C level APIs often have these sorts of initialization requirements.

fd_set selectme;
FD_ZERO(&selectme);

This is because declaring a variable in C cannot initialize it.
Anything that *has* constructors should be using them to set objects
up... that's what they're for.

Where's the controversy?

ChrisA

Mark Janssen

unread,
May 9, 2013, 11:19:05 PM5/9/13
to Roy Smith, pytho...@python.org
> I think where things went pear shaped is when you made the statement:
>
>>> There is no sensible use-case for creating a file OBJECT unless it
>>> initially wraps an open file pointer.
>
> That's a pretty absolute point of view. Life is rarely so absolute.

In the old days, it was useful to have fine-grained control over the
file object because you didn't know where it might fail, and the OS
didn't necessarily give you give good status codes. So being able to
step through the entire process was the job of the progammers.

Now, with languages so high like python and hardware so common, it
almost is never necessary, so he has some point. A closed file
pointer is useful from a OS-progamming point-of-view though because
you generally never want to leave files open where they'd block other
I/O.


--
MarkJ
Tacoma, Washington

Chris Angelico

unread,
May 9, 2013, 11:46:25 PM5/9/13
to pytho...@python.org
On Fri, May 10, 2013 at 1:19 PM, Mark Janssen <dreamin...@gmail.com> wrote:
>> I think where things went pear shaped is when you made the statement:
>>
>>>> There is no sensible use-case for creating a file OBJECT unless it
>>>> initially wraps an open file pointer.
>>
>> That's a pretty absolute point of view. Life is rarely so absolute.
>
> In the old days, it was useful to have fine-grained control over the
> file object because you didn't know where it might fail, and the OS
> didn't necessarily give you give good status codes. So being able to
> step through the entire process was the job of the progammers.

I don't know what you mean by the "old days", but a couple of decades
ago, there were no such things as "file objects". You call a function
to open a file, you get back a number. You explicitly close that by
calling another function and passing it that number. In fact, there is
no way to have a "file object" that doesn't have an open file
associated with it, because it's simply... a number.

> Now, with languages so high like python and hardware so common, it
> almost is never necessary, so he has some point. A closed file
> pointer is useful from a OS-progamming point-of-view though because
> you generally never want to leave files open where they'd block other
> I/O.

I'm beginning to wonder if you and Dihedral are swapping notes.
Dihedral's been sounding fairly coherent lately.

ChrisA

Steven D'Aprano

unread,
May 10, 2013, 1:00:53 AM5/10/13
to
But you are opening the file. Therefore, it cannot possibly be an example
of not opening the file.

Unlike Pascal, there is no way to create a C file descriptor in a closed
state. Such a thing does not exist. If you have a file descriptor, the
file is open. Once you close it, the file descriptor is no longer valid.

But even if C allowed you to do so, doesn't mean that it is a good idea.
At least some variants of Pascal force you to do the following:

# Pseudo-code.
f = open(filename)
really_open(f)
data = read(f) # or write, or *any other operation at all*

Surely we can agree that having to call both open() and really_open()
before you get an actually opened file that you can use is one call too
many? There is *nothing* you can do with f before calling really_open().
So why require it?

(For the record, "really_open" is spelled "reset" or "rewrite" depending
on whether you want to read or write to the file.)


> The point is probably that a file isn't merely a feature free byte
> storage container; in the real world they usually come with all sorts of
> features like names and permissions. Those features will always imply
> creative uses.

There are at least three related but separate things here.

* file objects, or their low-level equivalent, file descriptors;

* pathnames;

* files on disk, which are an abstraction for data in a particular kind
of data structure.

They are obviously related, but they are also obviously independent. You
can have a pathname that doesn't refer to any file on disk; you can have
a file on disk that has been lost, and therefore is no longer accessible
via a file name. (If you run fsck or equivalent, the file may be
recoverable, but the name will be lost.) You can open a file object, then
unlink it so it no longer points to a file on disk, but does still accept
read or write calls.

The counter-examples given so far apply to pathnames or files on disk.
They don't apply to file objects and file descriptors. I have tried
really hard to be clear that I am talking about file objects. To the
extent that I have failed, I am sorry.


--
Steven

Steven D'Aprano

unread,
May 10, 2013, 1:03:10 AM5/10/13
to
On Thu, 09 May 2013 23:09:55 -0400, Roy Smith wrote:

> In article <518c5bbc$0$29997$c3e8da3$5496...@news.astraweb.com>,
> Steven D'Aprano <steve+comp....@pearwood.info> wrote:
>
>> I must admit I am astonished at how controversial the opinion "if your
>> object is useless until you call 'start', you should automatically call
>> 'start' when the object is created" has turned out to be.
>
> I'm sorry. I thought you were here for an argument.

No, I'm here for the abuse.


> I think where things went pear shaped is when you made the statement:
>
>>> There is no sensible use-case for creating a file OBJECT unless it
>>> initially wraps an open file pointer.
>
> That's a pretty absolute point of view. Life is rarely so absolute.

So far the only counter-examples given aren't counter-examples. One
involves opening the file. The other involves something which isn't a
file, but a string instead. If there are any counter-examples, they are
impossible in Python and C: you cannot create a file object in Python
without opening it, and you cannot create a file descriptor in C without
opening it. But not in Pascal, which actually supports my claim that this
is an anti-pattern: while some Pascal implementations do allow you to
create a non-open file, you cannot do *anything* with it until you open
it, except generate bugs.



--
Steven

Roy Smith

unread,
May 10, 2013, 1:50:09 AM5/10/13
to
In article <518c7f05$0$29997$c3e8da3$5496...@news.astraweb.com>,
Steven D'Aprano <steve+comp....@pearwood.info> wrote:

> there is no way to create a C file descriptor in a closed state. Such
> a thing does not exist. If you have a file descriptor, the file is
> open. Once you close it, the file descriptor is no longer valid.

Of course there is.

int fd = 37;

I've just created a file descriptor. There is not enough information
given to know if it corresponds to an open file or not.

Before you protest that "it's just an int, not a file descriptor", I
should point out that they're the same thing. It's pretty common to do
something like:

for (int fd = 0; fd <= MAX_FD; fd++) {
close(fd)
}

before forking, to make sure all file descriptors are closed.

Dan Sommers

unread,
May 10, 2013, 2:22:31 AM5/10/13
to
On Fri, 10 May 2013 05:03:10 +0000, Steven D'Aprano wrote:

>>>> There is no sensible use-case for creating a file OBJECT unless it
>>>> initially wraps an open file pointer.

> So far the only counter-examples given aren't counter-examples ...

Well, sure, if you discount operations like "create this file" and
queries like "could I delete this file if I wanted to?" [0] as methods
of the file system rather than of a hypothetical file object.

What about a distributed system? Suppose I want to create a file object
in one place, and send that object to the another place for the file to
be read from or written to [1]? Suppose that for security reasons, I
have to do it that way, because the first place can only create the
objects, and the second place can only access the underly file contents
through an existing object?

I suppose that this case exists even in a non-distributed system that
allows whatever execution unit exists to change its own security
settings (POSIX has setuid and setgid functions), or exposes file ACLs
as methods of file objects rather than of the OS or file system.

What, exactly, does a "file object" represent?

And going back to your original comment (which was actually in response
to one of my posts), at least some operations on python file objects
*could* succeed without having to open the file. An OS could provide
truncate, or writeable, on un-opened files; and certainly Python could
provide encoding, or isatty, on un-opened files. Of these, truncate
might be the closest use case of creating a file object without any
intent to write to it, for some definition of "write to it."

Dan

[0] Yes, I understand that asking first instead of trying to delete the
file is just asking to lose any number of potential race conditions,
assuming that your system even supports race conditions.

[1] Think about a multi-threaded (or otherwise distributed) FTP or HTTP
server. No, don't think about the server vs. the client, but rather a
"core" server overseeing "sub" servers for different sessions.

Steven D'Aprano

unread,
May 10, 2013, 5:47:37 AM5/10/13
to
On Fri, 10 May 2013 01:50:09 -0400, Roy Smith wrote:

> In article <518c7f05$0$29997$c3e8da3$5496...@news.astraweb.com>,
> Steven D'Aprano <steve+comp....@pearwood.info> wrote:
>
>> there is no way to create a C file descriptor in a closed state. Such a
>> thing does not exist. If you have a file descriptor, the file is open.
>> Once you close it, the file descriptor is no longer valid.
>
> Of course there is.
>
> int fd = 37;
>
> I've just created a file descriptor. There is not enough information
> given to know if it corresponds to an open file or not.

No, you haven't created a file descriptor. You've made up a number which
C will allow you to use as an index into the file descriptor table,
because C is a high-level assembler with very little in the way of type
safety, and what little there is you can normally bypass. What you
haven't done is create the record in the file descriptor table. You can't
expect that read(fd) or write(fd) will work, although both should fail
safe rather than segfault if 37 happens to not be an actual file
descriptor.

What you've done is the moral equivalent of choosing an integer at
random, coercing it to a pointer, then dereferencing it to peek or poke
at some memory address. (Although fortunately much safer.)

It's a nice hack, but not one that takes away from what I'm saying.


--
Steven

Steven D'Aprano

unread,
May 10, 2013, 7:00:49 AM5/10/13
to
On Fri, 10 May 2013 06:22:31 +0000, Dan Sommers wrote:

> On Fri, 10 May 2013 05:03:10 +0000, Steven D'Aprano wrote:
>
>>>>> There is no sensible use-case for creating a file OBJECT unless it
>>>>> initially wraps an open file pointer.
>
>> So far the only counter-examples given aren't counter-examples ...
>
> Well, sure, if you discount operations like "create this file" and
> queries like "could I delete this file if I wanted to?" [0] as methods
> of the file system rather than of a hypothetical file object.
>
> What about a distributed system? Suppose I want to create a file object
> in one place, and send that object to the another place for the file to
> be read from or written to [1]? Suppose that for security reasons, I
> have to do it that way, because the first place can only create the
> objects, and the second place can only access the underly file contents
> through an existing object?

Unless you have re-implemented the file I/O system, it doesn't matter. If
your file objects are based on C I/O, then even if the first server
cannot read or write to the files it still creates file objects in an
open state, because that is how C works.

Or maybe the first server only creates some sort of proxy to the real
underlying file object. Or maybe you're re-implemented the I/O system,
and aren't following C's design. Since you're just making this up as a
thought experiment, anything is possible.

But either way, that's fine. You've found an object where it does make
sense to have an explicit "make it go" method: first one entity has
permission to construct the object, but not to open the underlying file.
Another entity has permission to open the underlying file, but not to
create the object. I have no idea whether this is a reasonable security
design or not, it actually sounds a bit rubbish to me but what do I know?
So let's treat it as a reasonable design.

As I've said, repeatedly, that's not what I'm talking about.

When you DON'T have useful things that can be done with the object before
calling "enable", then it is an anti-pattern to require a separate call
to "enable" method, and the enable functionality should be moved into the
object constructor. If you DO have useful things that can be done, like
pass the object to another entity, for security, then that's a whole
'nuther story.

Really, what I'm describing is *normal* behaviour for most objects. We
don't usually design APIs like this:

n = int("42")
n.enable()
m = n + 1
m.enable()
x = m*2 + n*3
print x - 1 # oops, raises because I forgot to call x.enable()

That's a rubbish API, and for simple data-like objects, we all agree it
is a rubbish API. So why accept the same rubbish API just because the
object is more complicated? If you don't have a good, reasonable, non-
contrived use-case for a separate "make it go" method, don't use one.


For my next controversial opinion, I'm going to argue that we should do
arithmetic using numbers rather than by inserting lists inside other
lists:

# Do this:

count = 0
count += 1

# Not this:

count = []
count.insert(0, [])


*wink*


--
Steven

Robert Kern

unread,
May 10, 2013, 8:19:05 AM5/10/13
to pytho...@python.org
On 2013-05-10 12:00, Steven D'Aprano wrote:

> But either way, that's fine. You've found an object where it does make
> sense to have an explicit "make it go" method: first one entity has
> permission to construct the object, but not to open the underlying file.
> Another entity has permission to open the underlying file, but not to
> create the object. I have no idea whether this is a reasonable security
> design or not, it actually sounds a bit rubbish to me but what do I know?
> So let's treat it as a reasonable design.
>
> As I've said, repeatedly, that's not what I'm talking about.
>
> When you DON'T have useful things that can be done with the object before
> calling "enable", then it is an anti-pattern to require a separate call
> to "enable" method, and the enable functionality should be moved into the
> object constructor. If you DO have useful things that can be done, like
> pass the object to another entity, for security, then that's a whole
> 'nuther story.

I'd be curious to see in-the-wild instances of the anti-pattern that you are
talking about, then. I think everyone agrees that entirely unmotivated "enable"
methods should be avoided, but I have my doubts that they come up very often. Do
programmers have a natural tendency to make an extra, completely unnecessary
method? I would think that they have a natural tendency to the opposite.

In my experience, everyone has a reason in mind when they follow a
pattern/anti-pattern. It is pretty rare that someone just does some specific,
nameable thing for no reason at all. There is no need to call out an
anti-pattern for which no one has a reason to do it. But there is a continuum of
reasons. Some reasons are better than others. Some reasons only apply in a small
set of circumstances but seem like they would apply more generally, at least to
novice programmers. Programmers can be wrong about what they think the
(anti-)pattern actually achieves. The whole point of naming an anti-pattern is
to discuss those reasons, show where they are misapplied, where YAGNI, why
novices overuse it, other patterns that should be used instead, and also the
circumstances where it is actually a good pattern instead.

To artificially limit the discussion of the anti-pattern to the trivial,
entirely unmotivated case forbids most of the interesting and instructive parts
of the conversation.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Roy Smith

unread,
May 10, 2013, 9:22:31 AM5/10/13
to
In article <518cc239$0$29997$c3e8da3$5496...@news.astraweb.com>,
Steven D'Aprano <steve+comp....@pearwood.info> wrote:

> > int fd = 37;
> >
> > I've just created a file descriptor. There is not enough information
> > given to know if it corresponds to an open file or not.
>
> No, you haven't created a file descriptor. You've made up a number which
> C will allow you to use as an index into the file descriptor table,
> because C is a high-level assembler with very little in the way of type
> safety, and what little there is you can normally bypass.

No, I've created a file descriptor, which is, by definition, an integer.
It has nothing to do with C. This is all defined by the POSIX
interface. For example, the getdtablesize(2) man page says:

"The entries in the descriptor table are numbered with small integers
starting at 0. The call getdtablesize() returns the size of this table."

So, I am now guaranteed that fds will be ints. I also know the
guaranteed minimum and maximum values.

The system even makes certain guarantees which let me predict what file
descriptor I'll get next in certain situations. For example, from the
dup(2) page on my OSX box:

"The new descriptor returned by the call is the lowest numbered
descriptor currently not in use by the process."

> What you haven't done is create the record in the file descriptor table.

That's correct. But, as described above, the system makes certain
guarantees which allow me to reason about the existence or non-existence
os such entries.

> You can't expect that read(fd) or write(fd) will work

I can expect that they will work if I have reasoned correctly about the
POSIX-guaranteed semantics. For example, POSIX says(*) that this C
program is guaranteed to print, "hello, fd world" (assuming the
assertion passes):

#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>

int main(int argc, char** argv) {
int max_files = getdtablesize();
assert(max_files >= 4);

for (int i = 3; i < max_files; ++i) {
close(i);
}

dup(2);
char* message = "hello, fd world\n";
write(3, message, strlen(message));
}

> What you've done is the moral equivalent of choosing an integer at
> random, coercing it to a pointer, then dereferencing it to peek or poke
> at some memory address. (Although fortunately much safer.)

No, what I've done is taken advantage of behaviors which are guaranteed
by POSIX.

But, we're going off into the weeds here. Where this started was you
said:

> There is no sensible use-case for creating a file WITHOUT OPENING
> it. What would be the point?

I agree with you, in general, that it is usually poor design to have
classes which require instances to be initialized after they are created.

The problem is, you chose as your example a particular domain where the
underlying objects being modeled have unusual semantics imposed by an
interface that's 40 years old. And then you made absolute statements
about there not possibly ever being certain use cases, when clearly
there are (for that domain).

-----------------------
(*) Technically, getdtablesize() isn't POSIX, but POSIX does define
other ways to get the same information.

Roy Smith

unread,
May 10, 2013, 10:01:14 AM5/10/13
to
In article <mailman.1527.1368188...@python.org>,
Robert Kern <rober...@gmail.com> wrote:

> I'd be curious to see in-the-wild instances of the anti-pattern that
> you are talking about, then. I think everyone agrees that entirely
> unmotivated "enable" methods should be avoided, but I have my doubts
> that they come up very often.

As I mentioned earlier in this thread, this was a common pattern in the
early days of C++, when exceptions were a new concept and handled poorly
by many compilers (and, for that matter, programmers).

There was a school of thought that constructors should never be able to
fail (because the only way for a constructor to fail is to throw an
exception). The pattern was to always have the constructor succeed, and
then either have a way to check to see if the newly-constructed object
was valid, or have a separate post-construction initialization step
which could fail.

See, for example, the isValid() and Exists() calls for RogueWave's
RWFile class (http://tinyurl.com/c8kv26g). And also,
http://tinyurl.com/cgs6clx.

Even today, there are C++ implementations which do not use exceptions.
Some are for use in embedded or real-time systems where things need to
be strictly time-bound and/or memory-bound. Others are for historical
reasons (http://tinyurl.com/6hn4zo).

Once people were used to writing "can't fail" constructors in C++, they
often continued using that pattern in other languages, where the
underlying reasons no longer made sense. Quite possibly, they never
even knew the underlying reasons; they were taught, "Constructors must
never fail", and assumed it was a universal rule.

This, BTW, is one of my biggest beefs with the classic Gang Of Four
pattern book. It presents a bunch of patterns as being universally
applicable, when in reality many (if not most) of them are highly C++
specific.

BTW, whenever I read things like, "I think everyone agrees", I
automatically assume what the writer really meant was, "I, and all the
people who agree with me, think".

Oscar Benjamin

unread,
May 10, 2013, 10:29:00 AM5/10/13
to Roy Smith, pytho...@python.org
On 10 May 2013 15:01, Roy Smith <r...@panix.com> wrote:
> In article <mailman.1527.1368188...@python.org>,
> Robert Kern <rober...@gmail.com> wrote:
>
>> I'd be curious to see in-the-wild instances of the anti-pattern that
>> you are talking about, then. I think everyone agrees that entirely
>> unmotivated "enable" methods should be avoided, but I have my doubts
>> that they come up very often.
>
> As I mentioned earlier in this thread, this was a common pattern in the
> early days of C++, when exceptions were a new concept and handled poorly
> by many compilers (and, for that matter, programmers).
>
> There was a school of thought that constructors should never be able to
> fail (because the only way for a constructor to fail is to throw an
> exception). The pattern was to always have the constructor succeed, and
> then either have a way to check to see if the newly-constructed object
> was valid, or have a separate post-construction initialization step
> which could fail.

It's not just because of exceptions. In C++ virtual method calls in a
constructor for a class A will always call the methods of class A even
if the object being constructed is actually of a subclass B because
the B part of the object isn't initialised when the A constructor is
called. There may be a better way to do this since I last used C++ but
as I remember it the two-phase pattern was a recommended way to
implement polymorphic behaviour during initialisation.


Oscar

Roy Smith

unread,
May 10, 2013, 10:37:37 AM5/10/13
to
In article <mailman.1530.1368196...@python.org>,
Oscar Benjamin <oscar.j....@gmail.com> wrote:

> It's not just because of exceptions. In C++ virtual method calls in a
> constructor for a class A will always call the methods of class A even
> if the object being constructed is actually of a subclass B because
> the B part of the object isn't initialised when the A constructor is
> called. There may be a better way to do this since I last used C++ but
> as I remember it the two-phase pattern was a recommended way to
> implement polymorphic behaviour during initialisation.

Mind. Blown.

One of the things I love (FSVO love) about C++ is that no matter how
much I learn, there's always whole new areas of wonderment to explore
behind doors I didn't even know existed.

Thank you.

I suppose, if I had a class like this, I would write a factory function
which called the constructor and post-construction initializer. And
then I would make the constructor protected.

Chris Angelico

unread,
May 10, 2013, 10:46:56 AM5/10/13
to pytho...@python.org
On Sat, May 11, 2013 at 12:37 AM, Roy Smith <r...@panix.com> wrote:
> I suppose, if I had a class like this, I would write a factory function
> which called the constructor and post-construction initializer. And
> then I would make the constructor protected.

That sounds like a reasonable plan, with the possible exception of
protected. Since meeting Python, I've stopped using private and
protected anywhere.

ChrisA

Roy Smith

unread,
May 10, 2013, 10:54:54 AM5/10/13
to
In article <mailman.1531.1368197...@python.org>,
Each language has its own set of best practices. Trying to write C++
code using Python patterns is as bad as trying to write Python code
using C++ patterns.

Chris Angelico

unread,
May 10, 2013, 11:09:04 AM5/10/13
to pytho...@python.org
Agreed, in generality. But what is actually gained by hiding data from
yourself? Compare:

class Foo
{
int asdf;
public:
Foo(int _asdf):asdf(_asdf) {}
int get_asdf() {return asdf;}
void set_asdf(int _asdf) {asdf=_asdf;}
void frob() {printf("Hi, I am %d\n",asdf);}
};

struct Foo
{
int asdf;
Foo(int _asdf):asdf(_asdf) {}
void frob() {printf("Hi, I am %d\n",asdf);}
};

Is there anything worse about the second one? Oh, and by logical
extension, here's something that doesn't (AFAIK) work in C++, but does
in another language that's syntactically similar:

class Foo(int asdf)
{
void frob() {write("Hi, I am %d\n",asdf);}
}

Now that's brevity. Why bother saying what's patently obvious? (Pike's
"class" keyword is like C++'s "struct", members are public by
default.)

ChrisA

Roy Smith

unread,
May 10, 2013, 11:21:33 AM5/10/13
to
In article <mailman.1532.1368198...@python.org>,
Chris Angelico <ros...@gmail.com> wrote:

> > Each language has its own set of best practices. Trying to write C++
> > code using Python patterns is as bad as trying to write Python code
> > using C++ patterns.
>
> Agreed, in generality. But what is actually gained by hiding data from
> yourself?

You're not hiding it from yourself. You're hiding it from the other
people who are using your code and may not understand all the subtle
gotchas as well as you do.

In the calling-virtual-methods-in-the-constructor case we're talking
about here, constructing an object by calling B() without immediately
following it up with a call to B::PostInit() is dangerous. If you
document that you shouldn't do that, but allow it anyway, people will do
it. Better to disallow it completely. People will bitch when their
code doesn't compile, but that's better than compiling and running and
producing the wrong results.

You solve this problem in Python by simply having the constructor do the
right thing.

Robert Kern

unread,
May 10, 2013, 11:33:09 AM5/10/13
to pytho...@python.org
Right, this is one of the "bad reasons" I talk about later in my message. The
authors had a reason in their mind for doing it (they thought it was a universal
rule); it was just a bad one. It's more useful to talk about why people thought
they should follow that pattern than to just say "there is no reason to do this".

> This, BTW, is one of my biggest beefs with the classic Gang Of Four
> pattern book. It presents a bunch of patterns as being universally
> applicable, when in reality many (if not most) of them are highly C++
> specific.
>
> BTW, whenever I read things like, "I think everyone agrees", I
> automatically assume what the writer really meant was, "I, and all the
> people who agree with me, think".

Hah! Fair enough. I actually meant it to emphasize that I thought that Steven
was overly reducing his statements to something that was trivially true,
sacrificing content for validity. I will suggest that your interpretation of
that phrase is more appropriate when the speaker is proposing something of their
own rather than (partially) conceding a point. The exaggeration is only
self-aggrandizing in the former case.

Chris Angelico

unread,
May 10, 2013, 11:44:31 AM5/10/13
to pytho...@python.org
On Sat, May 11, 2013 at 1:21 AM, Roy Smith <r...@panix.com> wrote:
> In article <mailman.1532.1368198...@python.org>,
> Chris Angelico <ros...@gmail.com> wrote:
>>
>> Agreed, in generality. But what is actually gained by hiding data from
>> yourself?
>
> You're not hiding it from yourself. You're hiding it from the other
> people who are using your code and may not understand all the subtle
> gotchas as well as you do.

True. And on looking over my code, I find that there are a few cases
where I've used private members: I have a buffer class that acts
pretty much like a non-refcounted string, and it declares a private
copy constructor to prevent accidental misuse. But it's the exception,
not the rule. My main point isn't about the cases where you actually
want to prevent access, but the all-too-common case where the member
itself is private and there are two public methods to get and set it.
Massive boilerplate. Completely unnecessary in 99%+ of cases.

ChrisA

Nobody

unread,
May 10, 2013, 12:59:26 PM5/10/13
to
On Thu, 09 May 2013 05:23:59 +0000, Steven D'Aprano wrote:

> There is no sensible use-case for creating a file without opening it.
> What would be the point? Any subsequent calls to just about any method
> will fail. Since you have to open the file after creating the file object
> anyway, why make them two different calls?

As a counterpoint, some OSes (e.g. Plan 9) allow you to get a "handle" to
a file without opening it. This can then be used for deleting, renaming or
stat()-type operations without either the risk of race conditions (if
another process renames files between operations, the operations may
be performed on different files) or the side-effects of actually opening
the file (particularly for device files, e.g. opening a tape drive may
rewind the tape).

Python's file model doesn't allow for this, so there isn't really anything
meaningful that you can do on a file object which isn't open (although
these actually exist; any file object on which the .close() method has
been called will be in this state).

However: there are situations where it is useful to be able to separate
the simple task of creating an object from more invasive actions such as
system calls. Particularly in multi-threaded or real-time code (although
the latter is a non-starter in Python for many other reasons).

Serhiy Storchaka

unread,
May 10, 2013, 11:44:35 AM5/10/13
to pytho...@python.org
10.05.13 15:19, Robert Kern написав(ла):
> I'd be curious to see in-the-wild instances of the anti-pattern that you
> are talking about, then.

Many (if not most) GUI frameworks use this pattern.

button = Button("text")
button.setForegroundColor(...)
button.setBackgoundColor(...)
button.setFont(...)
button.setRelief(...)
button.setBorder(...)
button.setWidth(...)
button.setAction(...)
button.setMouseListener(...)
button.place(...)

Another example is running a subprocess in Unix-like systems.

fork()
open/close file descriptors, set limits, etc
exec*()


Robert Kern

unread,
May 10, 2013, 1:20:34 PM5/10/13
to pytho...@python.org
According to Steven's criteria, neither of these are instances of the
anti-pattern because there are good reasons they are this way. He is reducing
the anti-pattern to just those cases where there is no reason for doing so. That
is why I asked for in-the-wild instances. I should have qualified my sentence to
include "according to your criteria" because people seem to be answering my
query out of that context.

Chris Angelico

unread,
May 10, 2013, 1:24:59 PM5/10/13
to pytho...@python.org
On Sat, May 11, 2013 at 1:44 AM, Serhiy Storchaka <stor...@gmail.com> wrote:
> 10.05.13 15:19, Robert Kern написав(ла):
>
>> I'd be curious to see in-the-wild instances of the anti-pattern that you
>> are talking about, then.
>
>
> Many (if not most) GUI frameworks use this pattern.
>
> button = Button("text")
> button.setForegroundColor(...)
> button.setBackgoundColor(...)
> button.setFont(...)
> button.setRelief(...)
> button.setBorder(...)
> button.setWidth(...)
> button.setAction(...)
> button.setMouseListener(...)
> button.place(...)

The button really exists, though. You could merge the creation and
placement (or in the case of a window/dialog, the creation and
showing), but it's often useful to not. However, in the specific case
you have there, there's an alternative: a mapping of attributes and
values passed to the constructor, and then you could pass the
constructed object directly to a place(). That would let you, if you
wished, effectively construct a Button with a parent right there,
which makes reasonable sense.

> Another example is running a subprocess in Unix-like systems.
>
> fork()
> open/close file descriptors, set limits, etc
> exec*()

Hrm. Not really a corresponding example. After you fork, you have two
actual processes. Following up with an exec is only one of the
possible options; I've done code that forks and execs, and code that
forks and keeps running, and neither of them feels "wrong" in any way.
There IS a function that's similar to what you're saying, and that's
vfork:

"""
(From POSIX.1) The vfork() function has the same effect as fork(2),
except that the behavior is undefined if the process created by
vfork() either modifies any data other than a variable of type pid_t
used to store the return value from vfork(), or returns from the
function in which vfork() was called, or calls any other function
before successfully calling _exit(2) or one of the exec(3) family of
functions.
"""

It's deprecated because it's so fragile (and because regular fork()
isn't that much less efficient now; AIUI, vfork was meant to be a
lightweight fork). I would say that the deprecation of vfork in favour
of fork is a strong indication that the object.enable() anti-pattern
can come up in kernel APIs too, and isn't liked there either.

ChrisA

Roy Smith

unread,
May 10, 2013, 1:32:50 PM5/10/13
to
In article <pan.2013.05.10....@nowhere.com>,
Nobody <nob...@nowhere.com> wrote:

> However: there are situations where it is useful to be able to separate
> the simple task of creating an object from more invasive actions such as
> system calls. Particularly in multi-threaded or real-time code (although
> the latter is a non-starter in Python for many other reasons).

Sure. I can serialize a path name. I can't serialize an open file
descriptor.

Cameron Simpson

unread,
May 10, 2013, 6:25:41 PM5/10/13
to Roy Smith, pytho...@python.org
On 10May2013 09:22, Roy Smith <r...@panix.com> wrote:
| In article <518cc239$0$29997$c3e8da3$5496...@news.astraweb.com>,
| Steven D'Aprano <steve+comp....@pearwood.info> wrote:
| > > int fd = 37;
| > >
| > > I've just created a file descriptor. There is not enough information
| > > given to know if it corresponds to an open file or not.
| >
| > No, you haven't created a file descriptor. You've made up a number which
| > C will allow you to use as an index into the file descriptor table,
| > because C is a high-level assembler with very little in the way of type
| > safety, and what little there is you can normally bypass.
|
| No, I've created a file descriptor, which is, by definition, an integer.
| It has nothing to do with C. This is all defined by the POSIX
| interface. For example, the getdtablesize(2) man page says:
|
| "The entries in the descriptor table are numbered with small integers
| starting at 0. The call getdtablesize() returns the size of this table."
[... snip ...]

I'm with Steven here.

You've made a number that can be used with calls that access the
OS file descriptor table. But it isn't a file descriptor. (Yes, the
in-program number is just a number either way.)

The descriptor table is an in-kernel data structure, filled with
file descriptors. All you have is a label that may or may not access
a file descriptor.

Anyway, we all know _what_ goes on. We're just having terminology issues.

Cheers,
--
Cameron Simpson <c...@zip.com.au>

My computer always does exactly what I tell it to do but sometimes I have
trouble finding out what it was that I told it to do.
- Dick Wexelblat <r...@ida.org>

Mark Janssen

unread,
May 10, 2013, 10:43:52 PM5/10/13
to Steven D'Aprano, pytho...@python.org
You're missing one other case: if there's useful things that can be
checked before calling enable(). Remember, on multi-user and/or
multi-processing systems, there could be contention for a slow
resource. If you automatically open a file for write, you're
preventing everyone else from writing and potentially reading it. So
there is something useful: did that file exist? Is that resource
available for writing?

Prior to such hi-level languages like Python and reliable hardware,
such fine-grained control was important and vital. Now it can
probably be relegated to special OS libraries.

Mark

> Really, what I'm describing is *normal* behaviour for most objects. We
> don't usually design APIs like this:
>
> n = int("42")
> n.enable()
> m = n + 1
> m.enable()
> x = m*2 + n*3
> print x - 1 # oops, raises because I forgot to call x.enable()

Again, you only do that for shared resources. In this case, memory
would have to be a (protected) shared resources, but the OS manages
memory allocation, so it's not an issue.

Mark

> That's a rubbish API, and for simple data-like objects, we all agree it
> is a rubbish API. So why accept the same rubbish API just because the
> object is more complicated?

I think I just told you, but let me know ..... :)

> For my next controversial opinion, I'm going to argue that we should do
> arithmetic using numbers rather than by inserting lists inside other
> lists:

Try arguing that we should have a common message-passing syntax.

> # Do this:
>
> count = 0
> count += 1
>
> # Not this:
>
> count = []
> count.insert(0, [])

That's actually what they do in "set theory", believe it or not.

MarkJ
Tacoma, Washington

Mark Janssen

unread,
May 10, 2013, 11:16:11 PM5/10/13
to Cameron Simpson, Python List, Roy Smith, steve+comp....@pearwood.info
> | No, I've created a file descriptor, which is, by definition, an integer.
> | It has nothing to do with C. This is all defined by the POSIX
> | interface. For example, the getdtablesize(2) man page says:
> |
> | "The entries in the descriptor table are numbered with small integers
> | starting at 0. The call getdtablesize() returns the size of this table."
> [... snip ...]
>
> I'm with Steven here.
>
> You've made a number that can be used with calls that access the
> OS file descriptor table. But it isn't a file descriptor. (Yes, the
> in-program number is just a number either way.)

Steven, don't be misled. POSIX is not the model to look to -- it does
not acknowledge that files are actual objects that reside on a piece
of hardware. It is not simply an integer.

Mark

Thomas Rachel

unread,
May 11, 2013, 12:21:44 AM5/11/13
to
Am 10.05.2013 15:22 schrieb Roy Smith:

> That's correct. But, as described above, the system makes certain
> guarantees which allow me to reason about the existence or non-existence
> os such entries.

Nevertheless, your 37 is not a FD yet.

Let's take your program:

> #include <unistd.h>
> #include <stdio.h>
> #include <string.h>
> #include <assert.h>
>
> int main(int argc, char** argv) {
> int max_files = getdtablesize();
> assert(max_files >= 4);

Until here, the numbers 3 toll max_files may or may not be FDs.

> for (int i = 3; i < max_files; ++i) {
> close(i);
> }

Now they are closed; they are definitely no longer FDs even if they
were. If you would use them in a file operation, you'd get a EBADF which
means "fd is not a valid file descriptor".

> dup(2);


From now on, 3 is a FD and you can use it as such.

> char* message = "hello, fd world\n";
> write(3, message, strlen(message));
> }
>

> No, what I've done is taken advantage of behaviors which are guaranteed
> by POSIX.

Maybe, but the integer numbers get or los their property as a file
descriptor with open() and close() and not by assigning them to an int.


Thomas

Steven D'Aprano

unread,
May 11, 2013, 3:05:00 AM5/11/13
to
On Fri, 10 May 2013 17:59:26 +0100, Nobody wrote:

> On Thu, 09 May 2013 05:23:59 +0000, Steven D'Aprano wrote:
>
>> There is no sensible use-case for creating a file without opening it.
>> What would be the point? Any subsequent calls to just about any method
>> will fail. Since you have to open the file after creating the file
>> object anyway, why make them two different calls?
>
> As a counterpoint, some OSes (e.g. Plan 9) allow you to get a "handle"
> to a file without opening it. This can then be used for deleting,
> renaming or stat()-type operations without either the risk of race
> conditions (if another process renames files between operations, the
> operations may be performed on different files) or the side-effects of
> actually opening the file (particularly for device files, e.g. opening a
> tape drive may rewind the tape).

Ah, now that's a fantastic counter-example. But I think that says more
about the primitiveness of the Unix file model than of the idea of
temporal coupling.


> Python's file model doesn't allow for this, so there isn't really
> anything meaningful that you can do on a file object which isn't open
> (although these actually exist; any file object on which the .close()
> method has been called will be in this state).

Absolutely correct, and I'm amazed it's taken this long for anyone to
point this out :-)



--
Steven

Steven D'Aprano

unread,
May 11, 2013, 3:51:53 AM5/11/13
to
On Fri, 10 May 2013 18:20:34 +0100, Robert Kern wrote:

> According to Steven's criteria, neither of these are instances of the
> anti-pattern because there are good reasons they are this way. He is
> reducing the anti-pattern to just those cases where there is no reason
> for doing so.

But isn't that the case for all anti-patterns?

We agree that GOTO is an anti-pattern. That doesn't mean that there
aren't valid reasons for using GOTO. Sometimes there are good use-cases
for GOTO that outweigh the harm. By calling it an anti-pattern, though,
we shift the onus onto the person wanting to use GOTO: justify why you
need it, or use something else.

Would you object less if I called it a "code smell" than an "anti-
pattern"? If so, I accept your criticism, and call it a code smell: the
less temporal coupling your API has, the better.


> That is why I asked for in-the-wild instances.

How about this?

http://legacy.thecodewhisperer.com/post/366626867/improving-clarity-by-removing-temporal-coupling


Another example of temporal coupling is json_decode in PHP: you must
follow it by a call to json_last_error, otherwise you have no way of
telling whether the json_decode function succeeded or not.

http://php.net/manual/en/function.json-last-error.php


> I should
> have qualified my sentence to include "according to your criteria"
> because people seem to be answering my query out of that context.

Now you know how I feel :-)

I started this thread asking for help tracking down a blog post
describing this code smell, or at least the name of such. Thanks again to
Wayne Werner, who came through with the name of the anti-pattern,
temporal coupling, and a blog post describing it, although not the one I
read all those many moons ago.

I never intended to give the impression that *any* use of a separate
"enable" method call was bad. I certainly didn't intend to be bogged
down into a long discussion about the minutia of file descriptors in
C, but it was educational :-)



--
Steven

Roy Smith

unread,
May 11, 2013, 9:31:12 AM5/11/13
to
In article <518df898$0$29997$c3e8da3$5496...@news.astraweb.com>,
Steven D'Aprano <steve+comp....@pearwood.info> wrote:

> I never intended to give the impression that *any* use of a separate
> "enable" method call was bad. I certainly didn't intend to be bogged
> down into a long discussion about the minutia of file descriptors in
> C, but it was educational :-)

Well, you did say you were here for abuse. I think you got your money's
worth. Can I interest you in a course of getting hit on the head
lessons?

And just to be clear to the studio audience and all of you who are
watching at home...

For all the effort I put into nit-picking, I do agree with Steven's
basic premise. Two-phase construction is usually not the right way to
be designing classes. Especially in languages like Python where
constructors raising exceptions is both inexpensive and universally
accepted as normal behavior.

Mark Janssen

unread,
May 9, 2013, 11:51:51 PM5/9/13
to Chris Angelico, pytho...@python.org
>> In the old days, it was useful to have fine-grained control over the
>> file object because you didn't know where it might fail, and the OS
>> didn't necessarily give you give good status codes. So being able to
>> step through the entire process was the job of the progammers.
>
> I don't know what you mean by the "old days", but a couple of decades
> ago, there were no such things as "file objects".

My apologies. I used the word "object" when I shouldn't have.

> I'm beginning to wonder if you and Dihedral are swapping notes.
> Dihedral's been sounding fairly coherent lately.

Dihedral... That's my dream-self. Where did you encounter him? heh

--
MarkJ
Tacoma, Washington

Mark Janssen

unread,
May 11, 2013, 12:00:02 AM5/11/13
to Cameron Simpson, Python List, Roy Smith, steve+comp....@pearwood.info
> Steven, don't be misled. POSIX is not the model to look to -- it does
> not acknowledge that files are actual objects that reside on a piece
> of hardware. It is not simply an integer.

Please disregard this (my own) flame bait.
--
MarkJ
Tacoma, Washington

André Malo

unread,
May 11, 2013, 11:33:51 AM5/11/13
to
* Serhiy Storchaka wrote:

> Another example is running a subprocess in Unix-like systems.
>
> fork()
> open/close file descriptors, set limits, etc
> exec*()

For running a subprocess, only fork() is needed. For starting another
executable, only exec() is needed. For running the new executable in a
subprocess fork() and exec() are needed. I think, that's a bad example.
These APIs are actually well-designed.

nd
--
Gefunden auf einer "Webdesigner"-Seite:
> Programmierung in HTML, XML, WML, CGI, FLASH <

# André Malo # http://pub.perlig.de/ #

Chris Angelico

unread,
May 11, 2013, 12:04:50 PM5/11/13
to pytho...@python.org
On Sun, May 12, 2013 at 1:33 AM, André Malo <ndpa...@gmail.com> wrote:
> * Serhiy Storchaka wrote:
>
>> Another example is running a subprocess in Unix-like systems.
>>
>> fork()
>> open/close file descriptors, set limits, etc
>> exec*()
>
> For running a subprocess, only fork() is needed. For starting another
> executable, only exec() is needed. For running the new executable in a
> subprocess fork() and exec() are needed. I think, that's a bad example.
> These APIs are actually well-designed.

That said, though, there's certainly plenty of room for one-call APIs
like popen. For the simple case where you want to start a process with
some other executable, wait for it to finish, and then work with its
return value, it makes sense to hide the details of fork/exec/wait -
especially since that simple API can be cross-platform, where fork()
itself is quite hard to implement on Windows.

ChrisA
Message has been deleted
Message has been deleted

Robert Kern

unread,
May 11, 2013, 3:55:49 PM5/11/13
to pytho...@python.org
On 2013-05-11 08:51, Steven D'Aprano wrote:
> On Fri, 10 May 2013 18:20:34 +0100, Robert Kern wrote:
>
>> According to Steven's criteria, neither of these are instances of the
>> anti-pattern because there are good reasons they are this way. He is
>> reducing the anti-pattern to just those cases where there is no reason
>> for doing so.
>
> But isn't that the case for all anti-patterns?
>
> We agree that GOTO is an anti-pattern. That doesn't mean that there
> aren't valid reasons for using GOTO. Sometimes there are good use-cases
> for GOTO that outweigh the harm. By calling it an anti-pattern, though,
> we shift the onus onto the person wanting to use GOTO: justify why you
> need it, or use something else.

Yes, that was the point I was making. You seemed to be defining away the
legitimate instances as not instances of the pattern at all because they were
legitimate, and that appeared to me to be defeating the purpose of having the
discussion.

On a related note, I *don't* think it's a good idea to phrase it as "justify why
you need it". I don't think that gives very good guidance to a novice when they
are given the task of designing something. People can come up with a
justification for just about anything, especially when they are only justifying
things to themselves. I think it's more important to just talk about the
situations where a pattern is useful, and the common situations where people,
for whatever reason, *think* that a pattern is useful, but isn't. Naming it a
Pattern or Anti-pattern is really just a measure of how bad the latter half of
that is compared to the first half, and is less interesting than the discussion
itself. That's why I had a bug up my ass about what looked like the exclusion of
the "good" uses. It's the good examples that give novices an idea of what a good
justification looks like, so they can tell if the justification they are giving
themselves measures up.

> Would you object less if I called it a "code smell" than an "anti-
> pattern"? If so, I accept your criticism, and call it a code smell: the
> less temporal coupling your API has, the better.

That was not really my objection. I was objecting to the way you appeared to be
defining the particular pattern in question. But as we appear to agree on the
important matters, I won't press it further.

>> That is why I asked for in-the-wild instances.
>
> How about this?
>
> http://legacy.thecodewhisperer.com/post/366626867/improving-clarity-by-removing-temporal-coupling

There's something about Java mixedCase that makes my eyes glaze, so I'll take
your word for it. :-)

> Another example of temporal coupling is json_decode in PHP: you must
> follow it by a call to json_last_error, otherwise you have no way of
> telling whether the json_decode function succeeded or not.
>
> http://php.net/manual/en/function.json-last-error.php

I suspect that the author might say something about error checking being
optional. But yeah, terrible API. :-)

Chris Angelico

unread,
May 11, 2013, 6:39:26 PM5/11/13
to pytho...@python.org
On Sun, May 12, 2013 at 5:55 AM, Robert Kern <rober...@gmail.com> wrote:
>> Another example of temporal coupling is json_decode in PHP: you must
>> follow it by a call to json_last_error, otherwise you have no way of
>> telling whether the json_decode function succeeded or not.
>>
>> http://php.net/manual/en/function.json-last-error.php
>
>
> I suspect that the author might say something about error checking being
> optional. But yeah, terrible API. :-)

The problem with that one isn't that error checking is optional, but
that errors are signalled with a perfectly legal return value (FALSE,
if I recall correctly - which is also returned if you json_decode a
boolean false). Better design would either throw an exception on
error, or have a unique sentinel representing the errorness of the
return value.

ChrisA

Wayne Werner

unread,
May 12, 2013, 12:48:58 PM5/12/13
to Robert Kern, pytho...@python.org
On Fri, 10 May 2013, Robert Kern wrote:

> On 2013-05-10 12:00, Steven D'Aprano wrote:
>
>> But either way, that's fine. You've found an object where it does make
>> sense to have an explicit "make it go" method: first one entity has
>> permission to construct the object, but not to open the underlying file.
>> Another entity has permission to open the underlying file, but not to
>> create the object. I have no idea whether this is a reasonable security
>> design or not, it actually sounds a bit rubbish to me but what do I know?
>> So let's treat it as a reasonable design.
>>
>> As I've said, repeatedly, that's not what I'm talking about.
>>
>> When you DON'T have useful things that can be done with the object before
>> calling "enable", then it is an anti-pattern to require a separate call
>> to "enable" method, and the enable functionality should be moved into the
>> object constructor. If you DO have useful things that can be done, like
>> pass the object to another entity, for security, then that's a whole
>> 'nuther story.
>
> I'd be curious to see in-the-wild instances of the anti-pattern that you are
> talking about, then. I think everyone agrees that entirely unmotivated
> "enable" methods should be avoided, but I have my doubts that they come up
> very often. Do programmers have a natural tendency to make an extra,
> completely unnecessary method? I would think that they have a natural
> tendency to the opposite.
>
> In my experience, everyone has a reason in mind when they follow a
> pattern/anti-pattern. It is pretty rare that someone just does some specific,
> nameable thing for no reason at all. There is no need to call out an
> anti-pattern for which no one has a reason to do it. But there is a continuum
> of reasons. Some reasons are better than others. Some reasons only apply in a
> small set of circumstances but seem like they would apply more generally, at
> least to novice programmers. Programmers can be wrong about what they think
> the (anti-)pattern actually achieves. The whole point of naming an
> anti-pattern is to discuss those reasons, show where they are misapplied,
> where YAGNI, why novices overuse it, other patterns that should be used
> instead, and also the circumstances where it is actually a good pattern
> instead.

I'll share the anti-pattern that I've seen many times (not actually in
Python)

class CoolPresenter:
def __init__(self):
self.view = None
self.some_property = None
self.other_property = None

def initialize(self):
self.view.disable()
data = self.load_data()
self.view.data = data
self.view.enable()


def reload(self):
if self.view is None:
raise NotInitializedError("Error: Please setup class")
self.view.disable()
data = self.load_data()
self.view.data = data
self.view.enable()



Then you would see code like this:

presenter = CoolPresenter()
presenter.view = CoolView()

This is just plain silly for a few reasons:

- It's ambiguous. I don't know what's required for the CoolPresenter
to function properly.

- The temporal coupling mentioned earlier. I can create an instance of
a class and then call a function (say `reload`) and then boom! My
program crashes. There is *no possible* use case of this class where
you can use it without a view.


The motivation behind this anti-pattern that I've seen is the desire to
not have a constructor that "does too much". So you end out with an empty
constructor and temporal coupling, and a terrible API that doesn't clearly
explain the requirements of the class. Your class constructor should
*require* everything that is necessary to have a stable state when the
class is created (i.e. you should be able to properly call any function,
set any property without an exception happening)

Why? Less bugs, easier to comprehend, change/update your code. Easier to
use the class.

-W

Wayne Werner

unread,
May 12, 2013, 1:14:33 PM5/12/13
to Gregory Ewing, pytho...@python.org
On Fri, 10 May 2013, Gregory Ewing wrote:

> Wayne Werner wrote:
>> You don't ever want a class that has functions that need to be called in a
>> certain order to *not* crash.
>
> That seems like an overly broad statement. What
> do you think the following should do?
>
> f = open("myfile.dat")
> f.close()
> data = f.read()

To clarify - you don't want a class that has functions that need to be
called in a certain order with *valid input* in order to not crash.

Exactly what does happen - a ValueError is raised because you're(*)
passing self into the file.read() function, and that input is invalid
input - specifically:

ValueError: I/O operation on closed file

*where you actually means python, because when you call
`your_instance.method()`, it works effectively like a call to
`YourClass.method(your_instance)`

-W

Terry Jan Reedy

unread,
May 12, 2013, 4:03:39 PM5/12/13
to pytho...@python.org
The new idiom
with open("myfile.dat") as f:
data = f.read()

partially encapsulates operations on the opened file before the
automatic close. It is true however, that after the with statement, f in
still uselessly bound to the closed file. (Well, you can check the mode,
encoding, and other attributes, so not totally useless.) To completely
encapsulate, one can end the block with 'del f'. I checked and this
works. Since the external 'as x' name binding is optional, an internal
reference to the context manager (the io.xxx instance) is kept in order
to call the __exit__ method, and that is not affected by deleting the
optional visible binding.

tjr


Terry Jan Reedy

unread,
May 12, 2013, 4:23:58 PM5/12/13
to pytho...@python.org
Thank you for this examples. It makes Steven's point clearer than the
'file object' example. The problem with the latter is that objectors
could could point to file path objects, which are used to do some
manipulations of files and directory entries on the storage device
*without looking at the specific file data*. Such file data, as opposed
to directory data, never enters the program data space and with smart
enough devices, need not enter the CPU and its RAM. They could then
confuse them with file access objects which are used to transfer
specific data *between* files and program data space. The confusion is
adied by the fact that file path objects are used to create file access
objects. They do separate jobs and it seems that they are well kept
separate, even if there are languages where file path objects can have
file access functions turned on and off.

> The motivation behind this anti-pattern that I've seen is the desire to
> not have a constructor that "does too much". So you end out with an
> empty constructor and temporal coupling, and a terrible API that doesn't
> clearly explain the requirements of the class. Your class constructor
> should *require* everything that is necessary to have a stable state
> when the class is created (i.e. you should be able to properly call any
> function, set any property without an exception happening)
>
> Why? Less bugs, easier to comprehend, change/update your code. Easier to
> use the class.


--
Terry Jan Reedy



Greg Ewing

unread,
May 12, 2013, 7:18:10 PM5/12/13
to pytho...@python.org
Wayne Werner wrote:
> On Fri, 10 May 2013, Gregory Ewing wrote:
>
>> f = open("myfile.dat")
>> f.close()
>> data = f.read()
>
> To clarify - you don't want a class that has functions that need to be
> called in a certain order with *valid input* in order to not crash.
>
> Exactly what does happen - a ValueError is raised because you're(*)
> passing self into the file.read() function, and that input is invalid

The same argument can be applied to:

foo = Foo()
foo.do_something()
foo.enable() # should have done this first

You're passing an invalid input to Foo.do_something,
namely a Foo that hasn't been enabled yet.

--
Greg

Fábio Santos

unread,
May 13, 2013, 2:32:36 AM5/13/13
to Greg Ewing, pytho...@python.org

> --
> http://mail.python.org/mailman/listinfo/python-list

I don't think you can really count that as invalid input in OOP terms. After all in most languages `self` / `this` / whatever is not an argument to every method.

Chris Angelico

unread,
May 13, 2013, 3:36:41 AM5/13/13
to pytho...@python.org
On Mon, May 13, 2013 at 4:32 PM, Fábio Santos <fabiosa...@gmail.com> wrote:
>
> On 13 May 2013 00:22, "Greg Ewing" <greg....@canterbury.ac.nz> wrote:
>> The same argument can be applied to:
>>
>> foo = Foo()
>> foo.do_something()
>> foo.enable() # should have done this first
>>
>> You're passing an invalid input to Foo.do_something,
>> namely a Foo that hasn't been enabled yet.
>
> I don't think you can really count that as invalid input in OOP terms. After
> all in most languages `self` / `this` / whatever is not an argument to every
> method.

Yes, it is; it's just often implicit. C++ lets you poke around with
the internals, and it's pretty clear that 'this' is an argument. (See
for instance what happens with the gcc 'format' attribute - I can't
find a convenient docs page, but it's been mentioned on SO [1] and can
be easily verified.) EMCAScript lets you call any function with any
'this' by using the .call() or .apply() methods - which, in my
extremely not-humble opinionated opinion, is bad design (closures work
implicitly, but the 'this' pointer doesn't??). Python turns an
attribute lookup on an instance into an attribute lookup on the class
plus a currying. One way or another, the bit-before-the-dot is an
argument to the function.

[1] http://stackoverflow.com/questions/11621043/how-should-i-properly-use-attribute-format-printf-x-y-inside-a-class

ChrisA

Fábio Santos

unread,
May 13, 2013, 4:09:37 AM5/13/13
to Chris Angelico, pytho...@python.org

> --
> http://mail.python.org/mailman/listinfo/python-list

I know ECMAScript does that. It would be nice to be able to pass an instance method as a callback argument without using `.bind(theInstance)`.

At any rate, exposed or not, that is all still internals. Exposing the ability to set the ` this ` as an argument is, I think, a functional feature (map(str.strip, file) is a good example)

That said, I didn't know c++ did that, but it makes sense with what I read somewhere about c++ starting out as a transcompiler-to-c-based language.

Wayne Werner

unread,
May 13, 2013, 9:15:07 AM5/13/13
to Greg Ewing, pytho...@python.org
On Mon, 13 May 2013, Greg Ewing wrote:

> Wayne Werner wrote:
>> On Fri, 10 May 2013, Gregory Ewing wrote:
>>
>>> f = open("myfile.dat")
>>> f.close()
>>> data = f.read()
>>
>> To clarify - you don't want a class that has functions that need to be
>> called in a certain order with *valid input* in order to not crash.
>>
>> Exactly what does happen - a ValueError is raised because you're(*) passing
>> self into the file.read() function, and that input is invalid
>
> The same argument can be applied to:
>
> foo = Foo()
> foo.do_something()
> foo.enable() # should have done this first
>
> You're passing an invalid input to Foo.do_something,
> namely a Foo that hasn't been enabled yet.

That is the crux of the argument - as designer of the class *you* need to
ensure that when your constructor is done, your class is in a stable
state. And that every other state transition (with valid input) results in
your class then being in a stable state.


If anything, the stronger argument is that `file.close()` is not a well
designed function because it leaves your object in an unstable state.

Which I would be inclined to agree with, but I couldn't give you the
answer for what makes it better. Because the answer is the best one you
can get in computer science: It depends.


The reason that it depends, is because it depends on what you want to do.
Do you want a program that seems purely functional? Do you want a program
that's easy to maintain? Do you want a program that more accurately models
the "real world"?

Personally, I think the file object API in Python is about as good as it
can get - but that's because it's working with "physical" things (i.e.
files - bits on a platter, or flash/SSD drive...) which necessarily have a
temporal nature. And it's much less badness to blow up on a call to `read`
than it is to remove the `read` function and die with a NameError when the
underlying file is in a closed state.


At least in my opinion ;)
-W
0 new messages