Thinking about EAV database model for flexibility. How (in)compatible is it with the django model?

Felipe Faraggi

Dec 15, 2014, 8:42:40 AM
to django...@googlegroups.com
Hello everybody.

I am very new to django and I come from a wordpress background (yes, yes I know) and I really like their wp_*meta model. 

After a little digging I discovered this is called the Entity-attribute-value model, or EAV. I am currently setting up a project in Django in which my data is very variable, because we will be opening up to many APIs and they all have their own table styles. So instead of having a rigid model, we've opted for an EAV model that stores everything as its own key-value pair.

So I would like to hear your general thoughts about this method (I've heard some good/bad things about it) and your specific experiences with it and django (if any).

I found this repository https://github.com/mvpdev/django-eav that I could use as reference.

Thanks!

Erik Cederstrand

Dec 15, 2014, 11:11:07 AM
to Django Users

> Den 15/12/2014 kl. 14.42 skrev Felipe Faraggi <felipe...@gmail.com>:
>
> After a little digging I discovered this is called the Entity-attribute-value model, or EAV. I am currently setting up a project in Django in which my data is very variable, because we will be opening up to many APIs and they all have their own table styles. So instead of having a rigid model, we've opted for an EAV model that stores everything as its own key-value pair.
>
> So I would like to hear your general thoughts about this method (I've heard some good/bad things about it) and your specific experiences with it and django (if any).

An EAV approach is almost never what you want. Even when you think you do, you probably don't. Since you are coercing all your values to strings (coming from a PHP background, that may not seem like a problem!), you lose everything that an RDBMS offers in terms of data validation and integrity, referential integrity, indexing, efficient disk storage etc. Often things you only find out you need when your project or data volume grows large.
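
A minimal sketch of the point about lost validation, using Python's built-in sqlite3 module rather than Django so that it runs standalone. The table and column names (`eav`, `product`, `price_cents`) and both helper functions are made up for illustration and are not from this thread:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# EAV layout: every value is stored as text, whatever it is supposed to mean.
conn.execute("CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT)")

# Conventional layout: one typed, constrained column per attribute.
conn.execute(
    """
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        price_cents INTEGER NOT NULL
            CHECK (typeof(price_cents) = 'integer' AND price_cents >= 0)
    )
    """
)


def insert_eav(entity_id, attribute, value):
    """Store any value as a string; the database has nothing to check."""
    conn.execute(
        "INSERT INTO eav VALUES (?, ?, ?)", (entity_id, attribute, str(value))
    )


def insert_product(name, price_cents):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO product (name, price_cents) VALUES (?, ?)",
            (name, price_cents),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# The EAV table accepts a nonsensical price without complaint.
insert_eav(1, "price_cents", "cheap")
eav_row_count = conn.execute("SELECT COUNT(*) FROM eav").fetchone()[0]

# The typed table accepts a valid row and rejects the same bad data.
good_accepted = insert_product("widget", 1999)
bad_accepted = insert_product("gadget", "cheap")
```

With the typed table, the bad price is refused by the database itself. With the EAV table, the only place left to catch it is application code.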

In my view, the only justification for an EAV is a sparse object model, for example when storing patient diagnoses. Thousands of diagnosis codes exist, but any single patient will only have values for very few of them.

If you have many data sources, I would suggest modeling each source truthfully in Django in its own app, and handling whatever abstraction you need in Python code instead, using e.g. abstract models, class inheritance, mixins, custom managers or queryset chaining. Tables are cheap, and with Django models and data migrations you don't need to care that you're creating hundreds or thousands of them.
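
A rough sketch of that idea, written with plain Python dataclasses instead of Django models so that it runs without a configured project. In Django the shared part would more likely be an abstract base model or a mixin on real models. The two sources, their fields and the `ListingMixin` name are all hypothetical:

```python
from dataclasses import dataclass
from decimal import Decimal


class ListingMixin:
    """Shared behaviour; each source class maps its own fields to a common view."""

    def summary(self):
        return (
            f"{self.source_name}: {self.display_title()} "
            f"({self.display_price_cents()} cents)"
        )


@dataclass
class ShopAItem(ListingMixin):
    """Mirrors hypothetical source A as it is: a title and an integer price."""

    source_name = "shop_a"  # plain class attribute, not a dataclass field
    title: str
    price_cents: int

    def display_title(self):
        return self.title

    def display_price_cents(self):
        return self.price_cents


@dataclass
class ShopBProduct(ListingMixin):
    """Mirrors hypothetical source B as it is: a label and a decimal string."""

    source_name = "shop_b"
    label: str
    price: str  # decimal string such as "12.50"
    currency: str

    def display_title(self):
        return self.label

    def display_price_cents(self):
        return int(Decimal(self.price) * 100)


listings = [ShopAItem("Lamp", 1999), ShopBProduct("Desk", "12.50", "EUR")]
summaries = [item.summary() for item in listings]
```

Each class keeps the shape of its own source, and the common view lives in Python code rather than in a generic key-value table.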


Erik

Jamie Lawrence

Dec 15, 2014, 8:18:12 PM
to django...@googlegroups.com
Just to add to Erik's very good advice on (not) using EAV, another thing to keep in mind is that the downsides of EAV tend to manifest after your app is hosting a substantial amount of data, at which point the exercise of sanitizing it in order to port it to a saner model can be *excruciating*. 

I've come in to projects like this as a consultant and the golden rule of any such contract is "no cap on data conversion fees, but please feel free to do it yourself once you realize how much it will cost to preserve that data."

More generally, when using a relational database, you want to think hard about temptations to store things in generic buckets. There is a data store optimized for generic buckets of bits; it is called a file system. RDBMSes, not being designed for that, tend to become expensive[1] when forced to act like one.

-j

[1] not necessarily in currency, but that can certainly happen as well. 


Felipe Faraggi

Dec 16, 2014, 5:24:12 AM
to django...@googlegroups.com
Thanks for your responses Jamie and Erik,

We've since reconsidered and will, in fact, be creating a standard relational structure.

Again, thanks for your input and feedback

Felipe Faraggi

Dec 22, 2014, 5:27:55 AM
to django...@googlegroups.com
I'd like to 're-open' this question to ask another (maybe) short one:

So, is Django not well suited to NoSQL databases such as MongoDB or CouchDB in general?
Or is the problem specifically using an RDBMS in a NoSQL manner?

Because with NoSQL, the whole model system would be obsolete. Am I wrong in this line of thinking?


thanks in advance!

Erik Cederstrand

Dec 22, 2014, 4:57:22 PM
to Django Users

> Den 22/12/2014 kl. 11.27 skrev Felipe Faraggi <felipe...@gmail.com>:
>
> I'd like to 're-open' this question to ask another (maybe) short one:
>
> So, is Django not well suited to NoSQL databases such as MongoDB or CouchDB in general?
> Or is the problem specifically using an RDBMS in a NoSQL manner?

I haven't used Django with NoSQL, so I can't answer that specific question. The Django wiki has a page on NoSQL: https://code.djangoproject.com/wiki/NoSqlSupport

Which "problem" are you referring to? If you mean using an EAV in Django, then the problem is not in Django as such, but rather that an EAV usually ends up being a miserable way of storing your data, regardless of which software you choose to implement it with.

Erik

Jamie Lawrence

Dec 22, 2014, 5:00:46 PM
to django...@googlegroups.com
Well, Django, in the role of an ORM, is necessarily pretty coupled to SQL. I know people have been toying with NoSQL databases with Django; I don't know much about those efforts.

EAV intentionally defeats the intended use of RDBMSes by ignoring normalization and data typing, thus (among other things) losing easy queryability, validation and performance optimizations designed with structure in mind. You can butter bread with lumber, but doing so is likely to be a little problematic.
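
To make the queryability point concrete, here is a small sketch using Python's built-in sqlite3 module. The schema and sample rows are invented for illustration. It runs the same two questions against a typed table and against an EAV table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (id INTEGER PRIMARY KEY, color TEXT, price_cents INTEGER);
    CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT);
    """
)

# The same three made-up products, stored both ways.
products = [(1, "red", 1500), (2, "red", 9000), (3, "blue", 500)]
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", products)
for pid, color, price in products:
    conn.executemany(
        "INSERT INTO eav VALUES (?, ?, ?)",
        [(pid, "color", color), (pid, "price_cents", str(price))],
    )


def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# Typed table: "red and under 2000 cents" is a single WHERE clause.
typed_red_cheap = ids(
    "SELECT id FROM product "
    "WHERE color = 'red' AND price_cents < 2000 ORDER BY id"
)

# EAV table: one self-join per attribute, plus a cast back from text.
eav_red_cheap = ids(
    """
    SELECT c.entity_id
    FROM eav AS c
    JOIN eav AS p ON p.entity_id = c.entity_id
    WHERE c.attribute = 'color' AND c.value = 'red'
      AND p.attribute = 'price_cents' AND CAST(p.value AS INTEGER) < 2000
    ORDER BY c.entity_id
    """
)

# "Under 2000 cents" alone: the typed column compares numbers.
typed_cheap = ids("SELECT id FROM product WHERE price_cents < 2000 ORDER BY id")

# Forgetting the cast compares strings, and '500' sorts after '2000'.
eav_cheap_no_cast = ids(
    "SELECT entity_id FROM eav "
    "WHERE attribute = 'price_cents' AND value < '2000' ORDER BY entity_id"
)
```

Every extra attribute in the filter adds another self-join on the EAV side, and the uncast comparison silently drops the 500-cent product because the values are compared as text.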

So, to answer your question, 'both'. 

I would imagine a fully NoSQL-capable Django would rethink the model system quite a bit, given that the capabilities and use cases of NoSQL systems are rather divergent from RDBMSes (and frequently from each other).

I know it is common to 'pick a side'. But NoSQL databases are different beasts, good at different things. Doing things like storing sessions in MySQL (say) on a heavily loaded site is almost always a bad idea in any case.

-j

Felipe Faraggi

Dec 23, 2014, 4:19:43 AM
to django...@googlegroups.com
You guys are really active on this board! Thanks so much for your time once again.
My question has been answered in full.

Collin Anderson

Dec 26, 2014, 3:34:55 PM
to django...@googlegroups.com
Hi All,

Since no one has mentioned it, I'd also like to draw attention to Django's PostgreSQL hstore support (hstore being a dict-like data structure), which is coming in Django 1.8:

Collin