On Tue, Jul 17, 2012 at 4:39 PM, Tim McNamara <mcnamara.
...@gmail.com> wrote:
>> I haven't watched that one yet, but I'd challenge the idea of
>> subclassing for code reuse at all, its usually a mistake IME.
>> -Rob
> Wow, powerful statement. Especially as ah.. that's arguably the point
> of adding classes to a language. Otherwise they're just fancy
> dictionaries.
That's also a debatable position (especially considering that some
recent languages omit classes entirely :)). I think it's best answered
through a few things: other uses of classes than arranging for code
reuse, the lack of similarity to fancy dictionaries even in Python,
other more effective ways to reuse code, and the downsides of using a
class hierarchy to achieve code reuse.
On the fancy dictionaries side, you might be referring to __dict__,
which is a special case: it's not part of the object protocol -
instances of extension classes don't have it, and regular classes made
with __slots__ don't have it:
>>> class Foo(object):
...     __slots__ = ['a']
...
>>> Foo().__dict__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute '__dict__'
>>> class Bar(object):
...     pass
...
>>> Bar().__dict__
{}
You might also be referring to the way you can just assign attributes
and read them back, but this isn't object orientation so much as it is
plain old data storage. Classes are an object orientation and data
hiding tool: acting on someone else's data can lead to very vague
contracts, and when you're programming a large system that really,
really hurts. Or you might be referencing the way you end up with a
namespace to do things on, but this is rather specific to Python. If
you look at other languages like e.g. Smalltalk, C++, or Java (picking
three fairly different object-oriented languages), the namespace for
talking about data storage and the namespace for passing messages can
be handled very differently. It can be totally freeform, as it is
(mostly) in Python [see descriptors for where it is not]. It can be
both fixed and totally partitioned, like in Smalltalk: you cannot
accidentally refer to a member variable when you meant a method,
because the syntax is unambiguous (and there is perfect data hiding:
outside the class code only methods can be seen). Or it can be very
flexible, with fine grained rules over who can see what, and when, like
it is in C++ and Java. Any which way you look at it, these things are
much more than dictionaries when you have language support determining
how they can be accessed and which keys can be seen by which callers.
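To make the descriptor aside concrete, here is a minimal sketch (the
Celsius class and its attribute names are made up for illustration) of
a property, which is one kind of descriptor, taking over what looks
like plain attribute assignment:

```python
class Celsius(object):
    """Hypothetical class: 'degrees' looks like a plain attribute."""

    def __init__(self, degrees):
        self.degrees = degrees  # routed through the setter below

    @property
    def degrees(self):
        return self._degrees

    @degrees.setter
    def degrees(self, value):
        # The class, not the caller, decides what may be stored.
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._degrees = value


reading = Celsius(20)
reading.degrees = 25        # accepted
try:
    reading.degrees = -300  # rejected by the setter
except ValueError as error:
    print(error)
```

The caller still writes ordinary attribute syntax, but the "key" is no
longer a freeform dictionary slot.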
What other uses are there for objects other than code reuse? Well..
classes provide objects, and objects are extremely useful:
- in the absence of type based dispatch they give you polymorphism
without (developer managed) global state.
- they provide data hiding to separate out the plumbing and the taps
- and defensive barriers to ensure that invariants covering the
relationship between multiple atoms are preserved.
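As a rough illustration of the last point, assuming a hypothetical
Interval class whose invariant is start <= end, the two atoms can only
ever be changed together through the class:

```python
class Interval(object):
    """Hypothetical example: maintains the invariant start <= end."""

    def __init__(self, start, end):
        if start > end:
            raise ValueError("start must not be after end")
        self._start = start
        self._end = end

    def shift(self, amount):
        # Both atoms move together, so the invariant cannot be broken.
        return Interval(self._start + amount, self._end + amount)

    def length(self):
        return self._end - self._start


moved = Interval(1, 5).shift(3)
print(moved.length())
```

With a bare dictionary holding 'start' and 'end', nothing would stop a
caller from updating one key and forgetting the other.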
How else can we reuse code? There are a few fundamental ways: We can
compose objects - well known techniques like adapters are precisely
this. We can use traits (not available in Python, but available
elsewhere and -very- nice to work with). We can use functions [e.g.
where some required behaviour is not on an axis of freedom for
polymorphism, this prevents inappropriate variation vs using a
method]. Outside of class-based languages we get into fun stuff like
prototypes, and that's a mind bending thing all of its own.
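A small sketch of the composition option, assuming a made-up
UpperCaseWriter adapter: it reuses whatever object it is handed, as
long as that object has a write() method, without inheriting from it.

```python
import io


class UpperCaseWriter(object):
    """Hypothetical adapter: wraps any object that has write()."""

    def __init__(self, wrapped):
        self._wrapped = wrapped  # composed, not inherited

    def write(self, text):
        # Reuse the wrapped object's behaviour through its public API.
        return self._wrapped.write(text.upper())


buffer = io.StringIO()
UpperCaseWriter(buffer).write(u"hello")
print(buffer.getvalue())
```

The adapter depends only on the public write() contract, so it works
equally well with a file, a socket wrapper, or a test double.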
What about the downsides? Why should we treat subclassing as a last
resort rather than a first-chance tool? There are a few key reasons.
The first is that designing things to be subclassable is really rather
hard: subclasses get access to some of your class's internals (I know
that in Python this is more convention than anything :)) and so you
need to decide which bits of plumbing you trust third party code to
fiddle with safely. An argument can be made that this is waste: just
let subclasses do what they will and it will be fine. But when you've
got (say) 40 or 50 discrete code bases subclassing a class you wrote,
that exerts a fantastic evolutionary pressure to change it only
carefully. When that pressure is applied only to the public interface,
it's a lot easier to refactor and improve the implementation without
gymnastics.
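Here is a sketch of how a subclass ends up depending on plumbing. Bag
and CountingBag are hypothetical names; the point is that the subclass
only behaves as it does because add_all happens to call add.

```python
class Bag(object):
    """Hypothetical base class written without subclassing in mind."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def add_all(self, items):
        # Implementation detail: add_all happens to call add.
        for item in items:
            self.add(item)


class CountingBag(Bag):
    """Hypothetical third-party subclass that counts insertions."""

    def __init__(self):
        super(CountingBag, self).__init__()
        self.added = 0

    def add(self, item):
        self.added += 1
        super(CountingBag, self).add(item)

    def add_all(self, items):
        items = list(items)
        self.added += len(items)
        super(CountingBag, self).add_all(items)


bag = CountingBag()
bag.add_all(["a", "b"])
print(bag.added)  # 4, not 2: each item is counted twice
```

If the subclass author works around this by dropping the add_all
override, the count then changes silently the day Bag.add_all is
rewritten to extend the list directly. Either way the base class can
no longer change its internals freely.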
Secondly, there is the conflation of sub-type and sub-class; when you
subclass something to reuse its code, you are constrained to either a)
preserve its contract in total or b) not be a sub-type of the base
class. As a concrete example, the following two types have a subclass
relationship but not a subtype relationship.
>>> class Parent(object):
...     pass
...
>>> class Child(Parent):
...     def __init__(self, new_mandatory):
...         pass
...
>>> def uses_the_type(a_class):
...     return a_class()
...
>>> uses_the_type(Parent)
<__main__.Parent object at 0x1196ed0>
>>> uses_the_type(Child)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in uses_the_type
TypeError: __init__() takes exactly 2 arguments (1 given)
^ Note that the error given is entirely accurate: the subtype
relationship was not preserved - these two classes are not Liskov
substitutable.
Thirdly, it's extremely common to find new and unrelated uses for code
as time goes by; if you are a fan of DRY, you will want to reuse it.
Subclassing things without a sensible is-a relationship will tend to
break other code that uses isinstance (even though that's a bad smell),
violate guidelines like 'one concept per class' (because you end up
inheriting the concepts of the base class as well as the code), and
make you more vulnerable to action-at-a-distance bugs where changes in
the base class break other code without warning. Even when tests catch
that, it's still unpleasant, and keeping things tight, focused and
orthogonal is a good way to avoid it.
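To illustrate with made-up classes: ListStack below subclasses list
purely to borrow its storage, while Stack reuses the same list code by
holding one. Only the first drags in list's concepts and passes
isinstance checks meant for real lists.

```python
class ListStack(list):
    """Hypothetical: subclasses list only to reuse its storage."""

    def push(self, item):
        self.append(item)


class Stack(object):
    """Same reuse via composition: only stack operations are exposed."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


leaky = ListStack()
leaky.push(1)
leaky.push(2)
leaky.insert(0, 99)              # inherited concept, not a stack operation
print(isinstance(leaky, list))   # True: list-handling code will accept it

tight = Stack()
tight.push(1)
print(isinstance(tight, list))   # False
print(hasattr(tight, "insert"))  # False
```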
-Rob