[GSOC] A Library For Integrating Neo4j With Django's ORM


Saurabh Wahile

Mar 19, 2014, 4:12:35 PM
to django-d...@googlegroups.com
Hello,
I am a third-year Computer Engineering B.E. student at Mumbai University, India. I have 4 years of experience in C++/Java programming and 1 year in Python programming. Coming from a C++/Java background, I appreciate how quickly one can develop in Python and how well Django leverages that. Some of the projects that I've previously worked on are here: 

The Problem:
For the past year, I've been working on a social app for Android which requires a RESTful API. I implemented this using Django and the tastypie library. However, as all social apps require a 'friends'/'relationships' module, I was stuck using a foreign key relationship model. I found this quite counterintuitive and simply the wrong approach, as the join computation is too heavy on the database. SQL databases are quite good at handling large amounts of data but are not designed to handle relationships well. While researching further, I found out about NoSQL solutions, in particular Neo4j, which handles relationships quite well. However, digging through large chunks of data, particularly searching, is heavy on NoSQL databases, at least for now.

The Solution:
I found that if I assign storage of linear data to the relational databases behind the ORM and the relationships to a NoSQL solution, I would be able to maximize performance and create a 'best-of-both' scenario for developing Django applications that rely on heavy data and relationships.

The Plan:
I intend to build a library that can help to resolve this issue by using Django's existing model architecture and extending the foreign key relationships to be redirected onto the NoSQL solution.
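
A minimal sketch of the split described above, under loud assumptions: `FriendGraph` is a hypothetical in-memory stand-in for the graph store (it is not Neo4j's API and not part of Django), and SQLite from the Python standard library stands in for the ORM-backed relational store. It only illustrates the idea of answering relationship questions in one place and fetching row data in another.

```python
import sqlite3
from collections import defaultdict


class FriendGraph:
    """Hypothetical in-memory stand-in for a graph store such as Neo4j."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add_friendship(self, a, b):
        # Friendship is symmetric, so record the edge in both directions.
        self._edges[a].add(b)
        self._edges[b].add(a)

    def friends_of(self, user_id):
        return set(self._edges[user_id])

    def friends_of_friends(self, user_id):
        # Two-hop traversal, excluding the user and their direct friends.
        direct = self.friends_of(user_id)
        second = set()
        for friend in direct:
            second |= self._edges[friend]
        return second - direct - {user_id}


# "Linear" data lives in the relational store.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.executemany(
    "INSERT INTO user (id, name) VALUES (?, ?)",
    [(1, "asha"), (2, "ben"), (3, "chen"), (4, "dee")],
)

# Relationships live in the graph stand-in.
graph = FriendGraph()
graph.add_friendship(1, 2)
graph.add_friendship(2, 3)
graph.add_friendship(3, 4)


def names_for(ids):
    """Look up user names in the relational store for a set of ids."""
    ids = sorted(ids)
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    rows = db.execute(
        f"SELECT name FROM user WHERE id IN ({placeholders}) ORDER BY id", ids
    )
    return [name for (name,) in rows]


suggestions = names_for(graph.friends_of_friends(1))
print(suggestions)
```

Note that the two stores have to be kept in sync by the application in a design like this, which is one of the costs any such library would need to address.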

I would like to hear any suggestions, including whether, instead of a library, I could directly add model fields that support NoSQL functionality.

Awaiting feedback,
Saurabh

Kirby, Chaim [BSD] - PED

Mar 19, 2014, 4:27:41 PM
to django-d...@googlegroups.com
Hi Saurabh,

On 03/19/2014 03:12 PM, Saurabh Wahile wrote:
> The Problem:
> For the past year, I've been working on a social app for Android which requires a RESTful API. I implemented this using Django and the tastypie library. However, as all social apps require a 'friends'/'relationships' module, I was stuck using a foreign key relationship model. I found this quite counterintuitive and simply the wrong approach, as the join computation is too heavy on the database. SQL databases are quite good at handling large amounts of data but are not designed to handle relationships well.

"SQL" databases are actually RDBMSs (relational database management systems). Above anything else, they are built to handle relationships. I'm not sure why you find it counterintuitive, as a ForeignKey is basically the canonical example of a relationship. Do you have any statistics to back up "the join computation is too heavy on the database", or is this a premature optimization?

Regards,
Chaim Kirby


Russell Keith-Magee

Mar 20, 2014, 2:17:04 AM
to Django Developers
Hi Saurabh,

On Thu, Mar 20, 2014 at 4:12 AM, Saurabh Wahile <saurabh...@gmail.com> wrote:
> The Problem:
> For the past year, I've been working on a social app for Android which requires a RESTful API. I implemented this using Django and the tastypie library. However, as all social apps require a 'friends'/'relationships' module, I was stuck using a foreign key relationship model. I found this quite counterintuitive and simply the wrong approach, as the join computation is too heavy on the database. SQL databases are quite good at handling large amounts of data but are not designed to handle relationships well. While researching further, I found out about NoSQL solutions, in particular Neo4j, which handles relationships quite well. However, digging through large chunks of data, particularly searching, is heavy on NoSQL databases, at least for now.

I don't know if I'd completely agree with the way you've phrased this. SQL is a query language used to access RDBMSs. The "R" in RDBMS stands for "Relational", so saying SQL isn't good with relationships is patently untrue. Unless, of course, you're interpreting "relationship" in a particular way, which, if you're using Neo4j (a graph database), I suspect you probably are.

Also - if anyone says "Databases can't handle joins", make sure they're not using MySQL as their point of reference. MySQL is definitely awful at joins. However, real databases (by which I mean PostgreSQL, Oracle, etc) are a whole lot better because they actually know what indexes are, and have query planners that use them.
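
To make the point about indexes concrete, here is a minimal sketch of a 'friends' lookup done as an indexed join. It uses SQLite from the Python standard library purely so that it runs anywhere; that choice, along with the table, column, and index names, is an assumption for illustration and not something from the project under discussion. The same idea applies to PostgreSQL and similar databases, where `EXPLAIN` plays the role of `EXPLAIN QUERY PLAN`.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE friendship (
        from_user_id INTEGER NOT NULL REFERENCES user(id),
        to_user_id   INTEGER NOT NULL REFERENCES user(id)
    );
    -- Lets the planner find one user's friendships without a full scan.
    CREATE INDEX idx_friendship_from ON friendship (from_user_id);
""")
db.executemany(
    "INSERT INTO user VALUES (?, ?)", [(1, "asha"), (2, "ben"), (3, "chen")]
)
db.executemany(
    "INSERT INTO friendship VALUES (?, ?)", [(1, 2), (1, 3), (2, 3)]
)

FRIENDS_SQL = """
    SELECT u.name
    FROM friendship AS f
    JOIN user AS u ON u.id = f.to_user_id
    WHERE f.from_user_id = ?
    ORDER BY u.name
"""

# The join itself: names of everyone user 1 lists as a friend.
friends = [name for (name,) in db.execute(FRIENDS_SQL, (1,))]

# Ask the planner how it intends to run the query; the last column of
# each row is a human-readable description of one step.
plan = [row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + FRIENDS_SQL, (1,))]

print(friends)
for step in plan:
    print(step)
```

Reading the printed plan (and timing the query on realistic data volumes) is the kind of evidence that would support or refute a claim that joins are too heavy.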

> The Solution:
> I found that if I assign storage of linear data to the relational databases behind the ORM and the relationships to a NoSQL solution, I would be able to maximize performance and create a 'best-of-both' scenario for developing Django applications that rely on heavy data and relationships.
>
> The Plan:
> I intend to build a library that can help to resolve this issue by using Django's existing model architecture and extending the foreign key relationships to be redirected onto the NoSQL solution.
>
> I would like to hear any suggestions, including whether, instead of a library, I could directly add model fields that support NoSQL functionality.

This is a project that is potentially interesting, but is probably going to be rejected on the basis that it doesn't pay enough attention to prior and current attempts in the same space. 

A few years back, Alex Gaynor did a GSoC project that attempted to write a NoSQL database backend. He was using MongoDB, and he was able to identify and clean up a bunch of problems. However, he also hit a couple of brick walls in the design, and the project was abandoned before it got to trunk.

Fast forward to today, and there is a proposal for *this* GSoC that aims to produce a formal API and documentation to make it easier to write third-party model layers that are compatible with Django's internals. That project has been discussed at length and is quite a mature proposal at this point.

You've joined this discussion with less than 3 days remaining in the GSoC application process, and have provided little more than "I've got an idea". Even if there wasn't a competing (and much more mature) project, the detail you've provided here is nowhere near the level we'd expect of a successful Django GSoC applicant.

There's obviously still time, but there isn't much of it. If you want to be accepted for this GSoC, you're going to need to put together a full proposal very quickly. Unfortunately, you're not going to have time for a whole lot of community feedback. 

If you want to proceed, I would suggest starting by reading up on the prior art that I've mentioned. At the very least, this will give you an idea of what is possible, and what approaches to this problem have already been tried.

Yours,
Russ Magee %-)

Tilman Koschnick

Mar 20, 2014, 3:44:06 AM
to django-d...@googlegroups.com
Hi Saurabh,

On Wed, 2014-03-19 at 13:12 -0700, Saurabh Wahile wrote:

> The Plan:
> I intend to build a library that can help to resolve this issue by
> using Django's existing model architecture and extending the foreign
> key relationships to be redirected onto the NoSQL solution.
>
>
> I would like to hear any suggestions, including whether, instead of a
> library, I could directly add model fields that support NoSQL
> functionality.

Maybe https://github.com/scholrly/neo4django is worth a look.

Regards, Til


