Design by composition and persistence

Taras_96

Dec 29, 2012, 12:44:26 AM
to django...@googlegroups.com
Hi,

I'm stuck on the best way of implementing design by composition in Django, and was wondering if anyone had any suggestions/pointers/past experiences.

Say you're designing an event management program. Each event has a name, occurs in a suburb, and has a start time and end time. Say you want to model an event as follows.

class Event:
  string eventName
  string description
  Suburb suburb
  TimeWindow eventTimeWindow

class Suburb:
  string suburbName
  string state
  string postcode

class TimeWindow:
  datetime startTime
  datetime endTime
  def getDuration(self) # in seconds


The system contains a definitive list of Suburbs that can be chosen; that is, a Suburb object has reference semantics. Consequently it makes sense for Suburb to have its own table, and thus for Event to have a foreign key to a Suburb object.

However, a 'TimeWindow' object has value semantics. Two TimeWindow objects with the same attributes are not the same object (arguable, but for illustration let's assume this). Thus although TimeWindow could have its own table, that seems like overkill: you pay the cost of a join, and a TimeWindow object will only ever belong to exactly one Event (so there's no benefit in normalising this part of the design).
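
A minimal sketch of what "value semantics" means here, in plain Python rather than Django. It assumes a frozen dataclass is an acceptable stand-in for a value object; the snake_case names are illustrative and not part of the design above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimeWindow:
    """Value object: compared by its attributes, not by identity."""
    start_time: datetime
    end_time: datetime

    def get_duration(self) -> float:
        """Length of the window in seconds."""
        return (self.end_time - self.start_time).total_seconds()


a = TimeWindow(datetime(2012, 12, 29, 18, 0), datetime(2012, 12, 29, 21, 0))
b = TimeWindow(datetime(2012, 12, 29, 18, 0), datetime(2012, 12, 29, 21, 0))

print(a == b)            # True: equal by value
print(a is b)            # False: still two distinct objects
print(a.get_duration())  # 10800.0 (three hours, in seconds)
```

Because nothing else ever needs to point at "the same" window, there is no identity worth giving a primary key to, which is the argument against a separate table.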

This is analogous, in the extreme case, to the 'eventName' attribute. 'eventName' has value semantics; that is, two 'eventName's that have the same sequence of characters will still be two separate objects. So you wouldn't create a separate 'EventName' table and reference this from the 'Event' class.

So from the OO software design angle, the Event's TimeWindow object is its own object (with its own state & behaviour), but at the persistence layer, TimeWindow's attributes are lumped in with eventName (and description) into the same 'Event' table. Having a separate TimeWindow object is desirable because it follows the general rule of favouring design by composition: we should have small, coherent, abstracted models that are well defined.

My question is, is it possible to achieve this in Django? One possible solution is to declare TimeWindow as abstract and have Event inherit from it. But this doesn't accurately model the 'has a' relationship (and seems like an abuse of abstract inheritance): Event would inherit the 'getDuration' method as well as TimeWindow's attributes, which more or less makes sense in this case but not in the general case.

So in your business logic you would want to use the object as: event.eventTimeWindow.getDuration() (and by using abstract inheritance this wouldn't be possible)
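
To illustrate the objection, here is a rough plain-Python analogue of the abstract-inheritance approach. This is only a sketch: `TimeWindowMixin` is a hypothetical stand-in for an abstract Django model, and no Django is involved.

```python
from datetime import datetime


class TimeWindowMixin:
    """Stand-in for an abstract base: contributes fields and behaviour."""

    def __init__(self, start_time: datetime, end_time: datetime):
        self.start_time = start_time
        self.end_time = end_time

    def get_duration(self) -> float:
        return (self.end_time - self.start_time).total_seconds()


class Event(TimeWindowMixin):
    """An Event that 'is a' time window rather than 'has a' time window."""

    def __init__(self, name: str, start_time: datetime, end_time: datetime):
        super().__init__(start_time, end_time)
        self.name = name


event = Event(
    "John's Birthday",
    datetime(2012, 12, 29, 18, 0),
    datetime(2012, 12, 29, 21, 0),
)

# The method ends up directly on Event...
print(event.get_duration())
# ...and there is no separate window object to ask.
print(hasattr(event, "eventTimeWindow"))
```

The data lands in one place, as wanted, but the object model is flattened along with it.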

Another drawback is that you could only have one 'TimeWindow' object per Event. Let's say that we change the model slightly so that eventName and description are both EnglishString types:

class EnglishString:
  def isEnglish(self)
  def spellCheck(self)
  def titleCase(self) # capitalise the string as if for a header
  string contents

EnglishString is an object with value semantics: two instances with the same contents are not the same object. Thus you probably wouldn't want to store instances in their own table (to avoid join costs and overall complexity in how the object is persisted). However, you can't use abstract inheritance (ignoring the previous concerns raised), as there are two instances contained within an Event instance.

Another option is to separate how the data is persisted from how it is used in the business logic. 

class Event:
  EnglishString eventName
  EnglishString description
  Suburb suburb
  TimeWindow eventTimeWindow

class PersistedEvent: 
  string eventName
  string description
  Suburb suburb
  datetime startTime
  datetime endTime

persistedEvent = PersistedEvent.objects.get(eventName = "John's Birthday")
event = Event(persistedEvent)
print event.suburb
print event.eventTimeWindow.getDuration()

But this feels:
 - difficult to get right
 - a large piece of work, that I'm sure someone has tried to solve before

And also any Django functionality that is bound to a Django model (eg: Forms, QuerySets) would need a separate conversion step.

 - as above, you'd query against the 'persisted' model and then use that to construct the domain model
 - also, you'd construct a form from the 'persisted' model
   - verification logic would have to sit on the 'persisted' model, as that's what extends Django's Model class
   - however, you'd probably want verification in the 'domain' model, leading to possible duplication
   - also verification for both models would probably have to be done slightly differently, as the 'persisted' model can 'see' all of the attributes from composed objects, whereas this isn't necessarily the case for the 'domain' model.
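
The conversion step between the two models could look roughly like the sketch below. It is plain Python under stated assumptions: `PersistedEvent` is a dataclass standing in for the flat Django model, `from_persisted` and `to_persisted` are hypothetical helper names, and Suburb is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimeWindow:
    start_time: datetime
    end_time: datetime

    def get_duration(self) -> float:
        return (self.end_time - self.start_time).total_seconds()


@dataclass
class PersistedEvent:
    """Stand-in for the flat model: one row, no join."""
    event_name: str
    description: str
    start_time: datetime
    end_time: datetime


@dataclass
class Event:
    """Domain object used by the business logic."""
    event_name: str
    description: str
    event_time_window: TimeWindow

    @classmethod
    def from_persisted(cls, row: PersistedEvent) -> "Event":
        # Regroup the flat columns into the composed value object.
        window = TimeWindow(row.start_time, row.end_time)
        return cls(row.event_name, row.description, window)

    def to_persisted(self) -> PersistedEvent:
        # Flatten the composed object back into columns for saving.
        return PersistedEvent(
            self.event_name,
            self.description,
            self.event_time_window.start_time,
            self.event_time_window.end_time,
        )


row = PersistedEvent(
    "John's Birthday",
    "Party",
    datetime(2012, 12, 29, 18, 0),
    datetime(2012, 12, 29, 21, 0),
)
event = Event.from_persisted(row)
print(event.event_time_window.get_duration())
print(event.to_persisted() == row)  # round trip preserves the data
```

Even in this toy form, every composed attribute needs mapping code in both directions, which is where the duplication concern comes from.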

Is it even worth the effort?

Thanks

Taras

Mike Dewhirst

Dec 29, 2012, 2:53:05 AM
to django...@googlegroups.com
On 29/12/2012 4:44pm, Taras_96 wrote:
> Hi,
>
> I'm stuck on the best way of implementing design by composition in
> Django, and was wondering if anyone had any suggestions/pointers/past
> experiences.

Design by composition is what you do before thinking about Django. The
outcome would be, well, a design!

Only then do you convert the entities and relationships into Django
models. A model defines the object complete with data and methods.

Relationships between models are defined via SQL-style foreign keys and
constraints.

Have a look at the model layer here ...

https://docs.djangoproject.com/en/1.5/

and

https://docs.djangoproject.com/en/1.5/ref/models/fields/#module-django.db.models.fields.related


Cheers

Mike

Taras_96

Dec 29, 2012, 3:39:58 AM
to django...@googlegroups.com
Hey Mike,

Sorry I didn't make it very clear that I've been working with Django for about 8 months now, so I've got the hang of the basic concepts. I'm redesigning a part of the system that was hacked together into what I hope is a cleaner OO design.

I realised that my question applies outside of Django, to ORMs in general. I found http://stackoverflow.com/questions/4907518/is-there-an-orm-that-supports-composition-w-o-joins which explains the design issue I'm having, and it looks like JPA's 'embeddable' annotation is what I'm describing.

If we map software objects directly onto Django Models, then we'd end up with a TimeWindow table, where each row would only ever be referenced by exactly one Event. My understanding of the advantage of normalisation is that it prevents data duplication, and thus helps with data consistency. However, it doesn't really make much sense to incur the cost of this (through joins) if the object that is being modelled is a value object.

So does anyone have any experience with this problem/trade-off in the Django world?

Mike Dewhirst

Dec 29, 2012, 5:54:03 AM
to django...@googlegroups.com
On 29/12/2012 7:39pm, Taras_96 wrote:
> If we map software objects directly onto Django Models, then we'd end up
> with a TimeWindow table, where each row would only ever be referenced by
> exactly one Event. My understanding of the advantage of normalisation is
> that it prevents data duplication, and thus helps with data consistency.
> However, it doesn't really make much sense to incur the cost of this
> (through joins) if the object that is being modelled is a value object.
>

If there is a genuine one-to-one relationship you can be a bit relaxed
about making it part of the event table. That won't duplicate anything.

But there is nothing wrong with being rigorously 3rd normal form right
now and then optimising later - if necessary.

Mike

Javier Guerra Giraldez

Dec 29, 2012, 10:33:09 AM
to django...@googlegroups.com
On Sat, Dec 29, 2012 at 3:39 AM, Taras_96 <tara...@gmail.com> wrote:
> If we map software objects directly onto Django Models, then we'd end up
> with a TimeWindow table, where each row would only ever be referenced by
> exactly one Event. My understanding of the advantage of normalisation is
> that it prevents data duplication, and thus helps with data consistency.
> However, it doesn't really make much sense to incur the cost of this
> (through joins) if the object that is being modelled is a value object.

there's value in one-to-one relationships too. it's not about
performance, but about data organization. also, if there's a lot of
data that logically 'belongs' to a record but is seldom used, it's
easier to ignore when not needed if it's on a separate table.

but if it really belongs to the record, and you want to store on the
same table, and keep it conceptually as a separate object, then it
could be easier to create a new field class.

it could be either a full-blown field that stores its data on more
than one DB field, or it could be an extra abstraction used on top of
existing (and declared) fields, similar to the way generic
relationships work
(https://docs.djangoproject.com/en/1.4/ref/contrib/contenttypes/#generic-relations):


--------------------------
from django.db import models
from django.contrib.contenttypes.models import ContentType
from django.contrib.contenttypes import generic

class TaggedItem(models.Model):
    tag = models.SlugField()
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = generic.GenericForeignKey('content_type', 'object_id')

    def __unicode__(self):
        return self.tag
---------------------------

there you see the 'content_type' and 'object_id' fields store some
'low-level' data, and the 'content_object' field uses those to present
a different interface: the relationship behaviour.

in your case:

class Event(models.Model):
    name = models.CharField(.....)
    description = models.CharField(....)
    tw_start = models.DateTimeField(....)
    tw_finish = models.DateTimeField(....)
    timewindow = TimeWindowField(tw_start, tw_finish)

and getDuration() is a method of the TimeWindowField, so you can say
event.timewindow.getDuration()


--
Javier

Taras_96

Dec 29, 2012, 7:57:26 PM
to django...@googlegroups.com

> in your case:
>
> class Event(models.Model):
>     name = models.CharField(.....)
>     description = models.CharField(....)
>     tw_start = models.DateTimeField(....)
>     tw_finish = models.DateTimeField(....)
>     timewindow = TimeWindowField(tw_start, tw_finish)
>
> and getDuration() is a method of the TimeWindowField, so you can say
> event.timewindow.getDuration()
>
> --
> Javier

I was thinking of doing something similar to Javier's suggestion: store the underlying content 'flat' on the same table, then create an abstraction layer over it. I've tried out the sample code, and calling event.timewindow returns a TimeWindowField object (which you can call 'getDuration' on), but setting the timewindow attribute to some other value doesn't propagate the values onto the tw_start & tw_finish fields, and using the query API doesn't work either (not that I was expecting either to work :)). You could work around this by using getters/setters (via properties), but that feels like a long road to go down to come up with a generic solution (DRY) and make sure that it works in the general case. Also, once you've implemented setting/getting, what about the other parts of Django functionality that are coupled to the model, notably querying and model form generation? How would support for those be added?
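
For reference, the getter/setter workaround can be written once as a Python descriptor instead of one property per model. This is only a sketch under assumptions: `Event` is a plain class rather than a Django model, and this `TimeWindowField` takes attribute names as strings (as GenericForeignKey does) rather than the field objects from the sample above.

```python
from datetime import datetime


class TimeWindow:
    def __init__(self, start, finish):
        self.start = start
        self.finish = finish

    def getDuration(self):
        """Length of the window in seconds."""
        return (self.finish - self.start).total_seconds()


class TimeWindowField:
    """Descriptor presenting two flat attributes as one TimeWindow."""

    def __init__(self, start_attr, finish_attr):
        self.start_attr = start_attr
        self.finish_attr = finish_attr

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        return TimeWindow(
            getattr(instance, self.start_attr),
            getattr(instance, self.finish_attr),
        )

    def __set__(self, instance, window):
        # Propagate the composed value back onto the flat attributes.
        setattr(instance, self.start_attr, window.start)
        setattr(instance, self.finish_attr, window.finish)


class Event:
    timewindow = TimeWindowField("tw_start", "tw_finish")

    def __init__(self, name, tw_start, tw_finish):
        self.name = name
        self.tw_start = tw_start
        self.tw_finish = tw_finish


event = Event("Party", datetime(2012, 12, 29, 18, 0), datetime(2012, 12, 29, 21, 0))
event.timewindow = TimeWindow(datetime(2013, 1, 1, 9, 0), datetime(2013, 1, 1, 10, 0))

print(event.tw_start)                  # now reflects the assigned window
print(event.timewindow.getDuration())  # 3600.0
```

This covers getting and setting only. Each read builds a fresh TimeWindow, so mutating event.timewindow.start in place would not write through, and querying and model forms are still untouched.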