I need to create a class solely for the purpose of encapsulating
a large number of disparate data items. At the moment I have no
plans for any methods for this class other than the bazillion
accessors required to access these various instance variables.
(In case it matters, this class is meant to be a private helper
class internal to a module, and it won't be subclassed.)
What is "best practice" for implementing this sort of class
*succinctly* (i.e. without a lot of repetitive accessor code)?
Also, one more question concerning syntax. Suppose that i represents
an instance of this class. Is it possible to define the class to
support this syntax
val = i.field
i.field += 6
...rather than this one
val = i.get_field()
i.set_field(i.get_field() + 6)
?
TIA!
~K
If it's just a completely dumb struct-like class, you might consider
something like:
http://docs.python.org/library/collections.html#collections.namedtuple
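
A minimal sketch of what that could look like. The record type and field names here are made up for illustration; they are not from the original question.

```python
from collections import namedtuple

# Hypothetical record type; the field names are invented for this example.
Record = namedtuple("Record", ["name", "size", "kind"])

fields = ("spam.txt", 1024, "text")
rec = Record(*fields)        # positional initialisation, much like a C struct

print(rec.name, rec.size)    # attribute-style read access

# namedtuples are immutable, so "rec.size += 6" raises AttributeError.
# _replace() returns a new instance with the given fields changed.
bigger = rec._replace(size=rec.size + 6)
```

Note that immutability means the `i.field += 6` syntax from the question is not available with a namedtuple; a plain class is needed for that.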
> What is "best practice" for implementing this sort of class
> *succinctly* (i.e. without a lot of repetitive accessor code)?
Is there any good reason you can't just use straight instance
variables? Python ain't Java; vanilla, boilerplate accessor methods
should almost always be avoided.
> Also, one more question concerning syntax. Suppose that i represents
> an instance of this class. Is it possible to define the class to
> support this syntax
>
> val = i.field
> i.field += 6
>
> ...rather than this one
>
> val = i.get_field()
> i.set_field(i.get_field() + 6)
>
> ?
Yes, using the magic of the property() function:
http://docs.python.org/library/functions.html#property
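
As a rough sketch (the class name and the validation rule are assumptions made up for this example), a property keeps the plain attribute syntax while still running code on access:

```python
class Item(object):
    def __init__(self, field):
        self._field = field          # underlying storage

    @property
    def field(self):
        return self._field

    @field.setter
    def field(self, value):
        # Hypothetical validation, only here to show why one might bother.
        if value < 0:
            raise ValueError("field must be non-negative")
        self._field = value

i = Item(42)
val = i.field
i.field += 6       # calls the getter, then the setter
```

If no extra behaviour is needed on access, a plain instance attribute gives the same syntax with no property at all.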
Cheers,
Chris
--
http://blog.rebertia.com
> I need to create a class solely for the purpose of encapsulating a large
> number of disparate data items.
There's a built-in for that. It's called "dict". Syntax for item access
is a tiny bit different, but still very common:
data['foo']
instead of
data.foo
If you need to customize item access, you need to modify __getitem__,
__setitem__ and __delitem__ instead of __getattr__ etc., but otherwise
they are nearly identical. Ignoring a few complications due to slots and
inheritance, attribute access is built on top of item access, so you
won't notice any performance hit (and you might see a tiny performance
benefit).
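
For illustration only, here is a sketch of customized item access; the class name and the set of allowed keys are hypothetical:

```python
class StrictRecord(dict):
    """Hypothetical dict subclass that only accepts a fixed set of keys."""
    allowed = frozenset(["foo", "bar"])

    def __setitem__(self, key, value):
        if key not in self.allowed:
            raise KeyError("unknown field: %r" % (key,))
        dict.__setitem__(self, key, value)

data = StrictRecord()
data["foo"] = 1
data["foo"] += 6      # __getitem__ followed by __setitem__
```

One caveat: in CPython the dict constructor and `update()` do not go through an overridden `__setitem__`, so this only guards the `data[key] = value` form.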
> At the moment I have no plans for any
> methods for this class other than the bazillion accessors required to
> access these various instance variables.
Huh? If you have instance variables, why don't you refer to them by name?
x = MyClass() # create an instance
y = MyClass() # another variable bound to an instance
z = MyClass() # etc.
print x, y, z
> (In case it matters, this class
> is meant to be a private helper class internal to a module, and it won't
> be subclassed.)
>
> What is "best practice" for implementing this sort of class *succinctly*
> (i.e. without a lot of repetitive accessor code)?
Leave the repetitive accessor code out. Python isn't Java.
http://dirtsimple.org/2004/12/python-is-not-java.html
> Also, one more question concerning syntax. Suppose that i represents an
> instance of this class. Is it possible to define the class to support
> this syntax
>
> val = i.field
> i.field += 6
Classes already support that.
>>> class C(object):
...     pass
...
>>> i = C()
>>> i.field = 42
>>> val = i.field
>>> i.field += 6
>>> print (val, i.field)
42 48
> ...rather than this one
>
> val = i.get_field()
> i.set_field(i.get_field() + 6)
>
> ?
Good grief! No wonder Java coders are so unproductive :(
--
Steven
You don't. Python is not Java. So just use instance attributes, and if
you need behavior when accessing an attribute, introduce a property.
Diez
>You don't. Python is not Java. So just use instance attributes, and if
>you need behavior when accessing an attribute, introduce a property.
Just accessing attributes looks a bit dangerous to me, due to bugs
like typing
i.typo = 'foo'
when what you meant is
i.type = 'foo'
I tried fixing this by mucking with __setattr__, but I didn't hit
on a satisfactory solution (basically, I couldn't find a good,
self-maintaining, way to specify the attributes that were OK to
set from those that weren't). Is there anything built-in?
Regarding properties, is there a built-in way to memoize them? For
example, suppose that the value of a property is obtained by parsing
the contents of a file (specified in another instance attribute).
It would make no sense to do this parsing more than once. Is there
a standard idiom for memoizing the value once it is determined for
the first time?
Thanks!
~K
>On Sat, Mar 20, 2010 at 3:15 PM, kj <no.e...@please.post> wrote:
>> I need to create a class solely for the purpose of encapsulating
>> a large number of disparate data items. At the moment I have no
>> plans for any methods for this class other than the bazillion
>> accessors required to access these various instance variables.
>> (In case it matters, this class is meant to be a private helper
>> class internal to a module, and it won't be subclassed.)
>If it's just a completely dumb struct-like class, you might consider
>something like:
>http://docs.python.org/library/collections.html#collections.namedtuple
Very cool. Thanks! The class I have in mind is *almost* that
dumb, but performance is a consideration in this case, which may
rule out namedtuple. But I'm glad to learn about it; there are
many places where I can put them to good use.
~K
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
See PyCon Talks from Atlanta 2010 http://pycon.blip.tv/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS: http://holdenweb.eventbrite.com/
>On Sat, 20 Mar 2010 22:15:54 +0000, kj wrote:
>> I need to create a class solely for the purpose of encapsulating a large
>> number of disparate data items.
>There's a built-in for that. It's called "dict". Syntax for item access
>is a tiny bit different, but still very common:
>data['foo']
>instead of
>data.foo
I find the latter more readable than the former. All those extra
elements (the brackets and the quotes, vs the single dot) add
Perl-like visual noise to the code, IMHO.
And dicts are vulnerable to this sort of bug:
data['typo'] = type(foobar)
Also, AFAIK, initialization of a dictionary is never as simple as
i = myclass(*fields)
But in a sense you're right: aside from these objections,
*functionality-wise* what I'm looking for is not very different
from a dictionary, or a C struct.
>> At the moment I have no plans for any
>> methods for this class other than the bazillion accessors required to
>> access these various instance variables.
>Huh? If you have instance variables, why don't you refer to them by name?
I'm sorry, I used the wrong terminology. I see now that the correct
term is "(instance) attribute", not "instance variable".
>Leave the repetitive accessor code out. Python isn't Java.
>http://dirtsimple.org/2004/12/python-is-not-java.html
Thanks for the link! The bit about "Guido's time machine" is pretty
funny.
~K
> Just accessing attributes looks a bit dangerous to me, due to bugs like
> typing
>
> i.typo = 'foo'
>
> when what you meant is
>
> i.type = 'foo'
That's the price you pay for using a dynamic language like Python with no
declarations. But honestly, the price isn't very high, particularly if
you use an editor or IDE with auto-completion. I can't think of the last
time I had an error due to the above sort of mistake.
Besides, is that error really so much more likely than this?
i.type = 'fpo'
when you meant 'foo'? The compiler can't protect you from that error, not
in any language.
> I tried fixing this by mucking with __setattr__, but I didn't hit on a
> satisfactory solution (basically, I couldn't find a good,
> self-maintaining, way to specify the attributes that were OK to set from
> those that weren't). Is there anything built-in?
No.
You could abuse __slots__, but it really is abuse: __slots__ are a memory
optimization, not a typo-checker.
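
For the record, this is roughly what the __slots__ "abuse" looks like (the class and attribute names are made up for the example):

```python
class Point(object):
    __slots__ = ("x", "y")    # only these attribute names can be set

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
p.x += 6
try:
    p.z = 3                   # typo-style assignment to an undeclared name
except AttributeError as exc:
    error = str(exc)
```

It has side effects beyond typo-checking: instances get no __dict__, and subclasses that omit __slots__ get one back, which silently removes the check.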
In Python 3.x, you can (untested) replace the class __dict__ with a
custom type that has more smarts. At the cost of performance. This
doesn't work in 2.x though, as the class __dict__ is always a regular
dictionary.
Something like this might work, at some minor cost of performance:
# Untested
def __setattr__(self, name, value):
    if hasattr(self, name):
        super(MyClassName, self).__setattr__(name, value)
    else:
        raise TypeError('cannot create new attributes')
Then, in your __init__ method, to initialise an attribute use:
self.__dict__['attr'] = value
to bypass the setattr.
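
Put together, a sketch of the whole thing might look like this. The class name `Strict` and its attributes are placeholders, not anything from the original question:

```python
class Strict(object):
    def __init__(self, type_, size):
        # Write straight into __dict__ so __setattr__ is not consulted
        # while the permitted attributes are being created.
        self.__dict__["type"] = type_
        self.__dict__["size"] = size

    def __setattr__(self, name, value):
        if hasattr(self, name):
            super(Strict, self).__setattr__(name, value)
        else:
            raise TypeError("cannot create new attribute %r" % (name,))

i = Strict("foo", 1)
i.type = "bar"            # fine: the attribute already exists
try:
    i.typo = "baz"        # rejected: no such attribute
except TypeError as exc:
    message = str(exc)
```

Since hasattr() also finds methods and class attributes, those names can be rebound on the instance too; whether that matters depends on the class.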
Or you can use something like PyChecker or PyLint to analyse your code
and warn about likely typos.
But really, it's not a common form of error. YMMV.
> Regarding properties, is there a built-in way to memoize them? For
> example, suppose that the value of a property is obtained by parsing the
> contents of a file (specified in another instance attribute). It would
> make no sense to do this parsing more than once. Is there a standard
> idiom for memoizing the value once it is determined for the first time?
Google for "Python memoization cookbook". This will get you started:
http://code.activestate.com/recipes/52201/
Then just apply the memoize decorator to the property getter.
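
A minimal sketch of that idea, using a simple dictionary-based memoize decorator written for this example (it is not the code from the recipe linked above, and the `Report` class is hypothetical):

```python
import functools

def memoize(func):
    """Cache results per argument tuple (arguments must be hashable)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

class Report(object):
    parse_count = 0               # counts how often the "parse" really runs

    def __init__(self, path):
        self.path = path

    @property
    @memoize
    def contents(self):
        # Stand-in for an expensive parse of the file at self.path.
        Report.parse_count += 1
        return "parsed:" + self.path

r = Report("data.txt")
first = r.contents
second = r.contents               # served from the cache
```

Two things to keep in mind with this approach: the cache is keyed on the instance, so instances must be hashable, and the cache holds a reference to every instance it has seen, which keeps them alive.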
--
Steven
But namedtuple isn't, Steve. Namedtuple is a class generator that
creates fast and efficient classes.
If you *really* want static typing and validation for attributes in
Python, you might check out enthought traits: http://code.enthought.com/projects/traits/
Regards,
Pat
>Then, in your __init__ method, to initialise an attribute use:
> self.__dict__['attr'] = value
>to bypass the setattr.
Ah, that's the trick! Thanks!
~K
>On Sun, 21 Mar 2010 16:57:40 +0000 (UTC), kj <no.e...@please.post>
>declaimed the following in gmane.comp.python.general:
>> Regarding properties, is there a built-in way to memoize them? For
>> example, suppose that the value of a property is obtained by parsing
>> the contents of a file (specified in another instance attribute).
>> It would make no sense to do this parsing more than once. Is there
>> a standard idiom for memoizing the value once it is determined for
>> the first time?
>>
> Pickle, Shelve? Maybe in conjunction with SQLite3...
I was thinking of something less persistent; in-memory, that is.
Maybe something in the spirit of:
@property
def foo(self):
    # up for some "adaptive auto-redefinition"?
    self.foo = self._some_time_consuming_operation()
    return self.foo
...except that that assignment won't work! It bombs with "AttributeError:
can't set attribute".
~K
PS: BTW, this is not the first time that attempting to set an
attribute (in a class written by me even) blows up on me. It's
situations like these that rattle my grasp of attributes, hence my
original question about boring, plodding, verbose Java-oid accessors.
For me these Python attributes are still waaay too mysterious and
unpredictable to rely on. Sometimes one can set them, sometimes
not, and I can't quite tell the two situations apart. It's all
very confusing to the Noob. (I'm sure this is all documented
*somewhere*, but this does not make using attributes any more
intuitive or straightforward. I'm also sure that *eventually*,
with enough Python experience under one's belt, this all becomes
second nature. My point is that Python attributes are not as
transparent and natural to the uninitiated as some of you folks
seem to think.)
Since foo is a read-only property you can't assign to it.
But it doesn't matter: if it worked technically it wouldn't give you what you're
after, the once-only evaluation.
A simple way to do that, in the sense of copying code and having it work, is to
use a generator that, after evaluating the expensive op, loops forever yielding
the resulting value.
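
The generator idea could be sketched roughly like this; the helper name `once` and the call counter are assumptions added for the example:

```python
def once(func):
    """Generator: call func() on the first next(), then yield its result forever."""
    value = func()        # not executed until the generator is first advanced
    while True:
        yield value

class HiHo(object):
    calls = 0             # counts how often the expensive op really runs

    def __init__(self):
        self._foo_gen = once(self._expensive_op)

    def _expensive_op(self):
        HiHo.calls += 1
        return 42

    @property
    def foo(self):
        return next(self._foo_gen)

o = HiHo()
results = [o.foo for _ in range(3)]
```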
A probably more efficient way, and anyway one perhaps more easy to understand,
is as follows:
<code>
from __future__ import print_function

class LazyEval:
    def __init__( self, f ):
        self._f = f
        self._computed = False

    @property
    def value( self ):
        if not self._computed:
            self._value = self._f()
            self._computed = True
        return self._value

class HiHo:
    def _expensive_op( self ):
        print( "Expensive op!" )
        return 42

    def __init__( self ):
        self._foo = LazyEval( self._expensive_op )

    @property
    def foo( self ):
        return self._foo.value

o = HiHo()
for i in range( 5 ):
    print( o.foo )
</code>
Cheers & hth.,
- Alf
Somewhat simplified, here's what you have to know:
1/ there are instance attributes and class attributes. Instance
attributes live in the instance's __dict__, class attributes live in
the class's __dict__ or in a parent class's __dict__.
2/ when looking up an attribute on an instance, the rules are
* first, check if there's a key by that name in the instance's __dict__.
If yes, return the associated value
* else, check if there's a class or parent class attribute by that name.
* if yes
** if the attribute has a '__get__' method, call the __get__ method with
class and instance as arguments, and return the result (this is known as
the "descriptor protocol" and provides support for computed attributes,
including methods and properties)
** else return the attribute itself
* else (if nothing has been found yet), look for a __getattr__ method in
the class and its parents. If found, call this __getattr__ method with
the attribute name and return the result
* else, give up and raise an AttributeError
3/ When binding an attribute on an instance, the rules are:
* first, check if there's a class (or parent class) attribute by that
name that has a '__set__' method. If yes, call this class attribute's
__set__ method with instance and value as arguments. This is the second
part of the "descriptor protocol", as used by the property type.
* else, add the attribute's name and value in the instance's __dict__
As I said, this is a somewhat simplified description of the process - I
skipped the parts about __slots__, __getattribute__ and __setattr__, as
well as the part about how function class attributes become methods. But
this should be enough to get an idea of what's going on.
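
The rules above can be poked at with a small experiment (class and attribute names are made up). It also shows one of the details the simplified description leaves out: a class attribute that defines __set__, such as a property, is consulted before the instance's __dict__ on lookup.

```python
class Demo(object):
    plain = "class value"            # ordinary class attribute

    @property
    def computed(self):              # property: defines __get__ and __set__
        return "from property"

    def __getattr__(self, name):     # called only when normal lookup fails
        return "fallback for " + name

d = Demo()
d.plain = "instance value"                      # stored in d.__dict__
d.__dict__["computed"] = "from instance dict"   # sneaked past the property

results = (d.plain, Demo.plain, d.computed, d.missing)
```

Here `d.plain` finds the instance value, `Demo.plain` is unchanged, `d.computed` still goes through the property despite the entry in `d.__dict__`, and `d.missing` falls through to __getattr__.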
In your above case, you defined a "foo" property class attribute. The
property type implements both __get__ and __set__, but you only defined
a callback for the __get__ method (the function you decorated with
'property'), so when you try to rebind "foo", the default property
type's __set__ implementation is invoked, whose behaviour is to forbid
setting the attribute. If you want a settable property, you have to
provide a setter too.
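
A small sketch of the difference, with hypothetical class names and using the property(fget, fset) call form:

```python
class ReadOnly(object):
    @property
    def foo(self):                   # getter only, so no assignment allowed
        return 42

class Settable(object):
    def __init__(self):
        self._foo = 42

    def _get_foo(self):
        return self._foo

    def _set_foo(self, value):
        self._foo = value

    foo = property(_get_foo, _set_foo)   # getter and setter

try:
    ReadOnly().foo = 1
except AttributeError:
    read_only_failed = True
else:
    read_only_failed = False

s = Settable()
s.foo += 6
```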
Now if you want a "replaceable" property-like attribute, you could
define your own computed attribute (aka "descriptor") type _without_ a
__set__ method:
class replaceableprop(object):
    def __init__(self, fget):
        self._fget = fget
    def __get__(self, instance, cls):
        if instance is None:
            return self
        return self._fget(instance)

@replaceableprop
def foo(self):
    # will add 'foo' into self.__dict__, s
    self.foo = self._some_time_consuming_operation()
    return self.foo
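
For a complete picture, here is the same descriptor dropped into a runnable class. The `Parser` class, its call counter and the return value are assumptions added for the demonstration:

```python
class replaceableprop(object):
    """Descriptor with __get__ but no __set__ (same as above)."""
    def __init__(self, fget):
        self._fget = fget

    def __get__(self, instance, cls):
        if instance is None:
            return self
        return self._fget(instance)

class Parser(object):
    calls = 0                        # counts the expensive calls

    def _some_time_consuming_operation(self):
        Parser.calls += 1
        return 42

    @replaceableprop
    def foo(self):
        # No __set__ on the descriptor, so this lands in self.__dict__
        # and shadows the descriptor on every later lookup.
        self.foo = self._some_time_consuming_operation()
        return self.foo

p = Parser()
values = [p.foo, p.foo, p.foo]
```

After the first access `'foo'` sits in `p.__dict__`, so the expensive operation runs once per instance.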
Another (better IMHO) solution is to use a plain property, and store the
computed value as an implementation attribute :
@property
def foo(self):
    cached = self.__dict__.get('_foo_cache')
    if cached is None:
        self._foo_cache = cached = self._some_time_consuming_operation()
    return cached
> Sometimes one can set them, sometimes
> not, and I can't quite tell the two situations apart. It's all
> very confusing to the Noob. (I'm sure this is all documented
> *somewhere*, but this does not make using attributes any more
> intuitive or straightforward. I'm also sure that *eventually*,
> with enough Python experience under one's belt, this all becomes
> second nature. My point is that Python attributes are not as
> transparent and natural to the uninitiated as some of you folks
> seem to think.)
I agree that the introduction of the descriptor protocol added some more
complexity to an already somewhat unusual object model.
HTH.
<snip>
> Another (better IMHO) solution is to use a plain property, and store the
> computed value as an implementation attribute :
>
> @property
> def foo(self):
>     cached = self.__dict__.get('_foo_cache')
>     if cached is None:
>         self._foo_cache = cached = self._some_time_consuming_operation()
>     return cached
>
There's no need to access __dict__ directly. I believe this is
equivalent (and clearer):
@property
def foo(self):
    try:
        cached = self._foo_cache
    except AttributeError:
        self._foo_cache = cached = self._time_consuming_op()
    return cached
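
For readers on Python 3.8 or later (well after this thread was written), the standard library has functools.cached_property, which packages the same compute-once idea. A minimal sketch, with a hypothetical `Parser` class:

```python
from functools import cached_property   # available from Python 3.8

class Parser:
    calls = 0                            # counts the expensive calls

    def __init__(self, path):
        self.path = path

    @cached_property
    def foo(self):
        # Stand-in for an expensive parse of the file at self.path.
        Parser.calls += 1
        return "parsed:" + self.path

p = Parser("data.txt")
first = p.foo
second = p.foo       # read back from the instance's __dict__
```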
-John
>kj a écrit :
>> PS: BTW, this is not the first time that attempting to set an
>> attribute (in a class written by me even) blows up on me. It's
>> situations like these that rattle my grasp of attributes, hence my
>> original question about boring, plodding, verbose Java-oid accessors.
>> For me these Python attributes are still waaay too mysterious and
>> unpredictable to rely on.
>Somewhat simplified, here's what you have to know:
...
>As I said, this is a somewhat simplified description of the process - I
>skipped the parts about __slots__, __getattribute__ and __setattr__, as
>well as the part about how function class attributes become methods.
>this should be enough to get an idea of what's going on.
Thank you, sir! That was quite the education.
(Someday I really should read carefully the official documentation
for the stuff you described, assuming it exists.)
Thanks also for your code suggestions.
~K
Nope, indeed. I guess I wrote it that way to make clear that we were
looking for an instance attribute (as a sequel of my previous writing on
attribute lookup rules).
> I believe this is
> equivalent (and clearer):
>
> @property
> def foo(self):
>     try:
>         cached = self._foo_cache
>     except AttributeError:
>         self._foo_cache = cached = self._time_consuming_op()
>     return cached
>
This is functionally _almost_ equivalent - it won't work the same if
there's a class attribute "_foo_cache", which might or might not be a good
thing !-)
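
To make the difference concrete, here is a sketch with a deliberately clashing class attribute. The class names and the `expensive` helper are invented for the example:

```python
def expensive():
    return 42

class ViaDict(object):
    _foo_cache = "class-level value"     # hypothetical clashing class attribute

    @property
    def foo(self):
        # Looks only in the instance's own __dict__.
        cached = self.__dict__.get('_foo_cache')
        if cached is None:
            self._foo_cache = cached = expensive()
        return cached

class ViaTry(object):
    _foo_cache = "class-level value"

    @property
    def foo(self):
        # Normal attribute lookup: also finds the class attribute.
        try:
            cached = self._foo_cache
        except AttributeError:
            self._foo_cache = cached = expensive()
        return cached

results = (ViaDict().foo, ViaTry().foo)
```

The __dict__ version ignores the class attribute and computes the value, while the try/except version returns the class-level value and never calls `expensive()`.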
It will possibly be a bit faster after the first access - IIRC setting up
an error handler is by itself cheaper than doing a couple of attribute
accesses and a method call - but I'd timeit before worrying about it.
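
A rough way to run that timing, assuming the two variants are wrapped in throwaway classes (all names here are made up; the numbers will vary by machine and Python version, so none are quoted):

```python
import timeit

def expensive():
    return 42

class ViaDict(object):
    @property
    def foo(self):
        cached = self.__dict__.get('_foo_cache')
        if cached is None:
            self._foo_cache = cached = expensive()
        return cached

class ViaTry(object):
    @property
    def foo(self):
        try:
            cached = self._foo_cache
        except AttributeError:
            self._foo_cache = cached = expensive()
        return cached

a, b = ViaDict(), ViaTry()
a.foo
b.foo        # warm both caches so only cached reads are timed

t_dict = timeit.timeit(lambda: a.foo, number=100000)
t_try = timeit.timeit(lambda: b.foo, number=100000)
print("__dict__.get: %.4fs   try/except: %.4fs" % (t_dict, t_try))
```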