I've listed all problems and required changes here:
http://code.djangoproject.com/wiki/NoSqlSupport
What we'd like to do is take Django-nonrel and clean it up (which also
means removing some of the modifications we did), so it's ready for a
merge into trunk. Why not start from Alex' branch? Practically all
modifications in Alex' branch are already implemented in Django-nonrel
(plus a few other changes).
In order to pull this off we'll need support from a Django core
developer. Ideally this would be someone with practical experience in
NoSQL databases and the Django ORM. Does any core developer have time
to help with this project?
Thanks!
Bye,
Waldemar Kornewald
--
Django on App Engine, MongoDB, ...? Browser-side Python? It's open-source:
http://www.allbuttonspressed.com/
I don't think you'll get much argument from the core team that, in
principle, having the infrastructure in place to support NoSQL data
stores would be a good thing. Waldemar et al have clearly put a lot of
effort into their branch. However, the devil is in the details.
Fundamentally, there are two problems standing in the way of this project.
The first is resources. I can't speak for any other members of the
core team, but looking at my calendar for the next couple of months, I
can tell that I'm not going to have as much time to dedicate to Django
as I have over the last couple of years.
The second is knowing the size of the job that is being proposed. At
the moment, this is a completely unknown quantity. I haven't used the
django-nonrel branch, and I'm not aware of anyone that I know and
trust that has. Django-nonrel has been developed completely
independently of django-trunk, with its own mailing lists, its own
development team, and so on, so Django's core team hasn't had any
exposure to the design and development process that has led to the
code that is there.
To be completely frank, from my perspective, the code is an unknown
quantity at this point. It *might* be fine -- but it might not, falling
anywhere on a scale from "needs minor work" to "needs to be
rebuilt". I simply don't know, and any process that will lead to me
knowing requires me to spend a non-trivial amount of time reviewing
the code and its branch. This is one area where the wiki page could
help -- providing a 1000ft view of how the branch does what it does.
The current wiki content is a good start, but it needs a lot more
detail -- at the moment, it contains a lot of brief feature
descriptions, but not a lot of detail on how or why those features work
the way they do.
So how do we move forward? The assertion has been made that what is
needed next is attention from the core. I'd like to propose something
different.
The core team is already a bottleneck in the whole Django process. The
proposed body of work is of unknown size and scope, and will require a
non-trivial amount of time to establish scope. This has the potential
to consume the limited resources of the core and exacerbate the
bottleneck that already exists.
From my perspective, what is needed next isn't attention from the core
-- it's attention from the *community*.
Personally, the best way to convince me that something is ready for
core is when there is broad community support saying it is ready for
core. Show me an active discussion on django-dev, involving people
that are known to the Django community, arguing the merits of your
patch. Show me the discussion that validates why your approach is
better than the alternatives (in particular, better than the approach
that has been proposed by one core developer and reviewed by another).
Once there's community consensus that the approach is good, *then* the
code will be ready for serious review from the core. And because the
community has already vouched for the code, there is a much lower risk
involved.
In reality, this is exactly what we ask of *any* proposal for trunk,
but on a slightly larger scale. It isn't the core team's
responsibility to review every patch submitted to Trac -- if it were,
we simply wouldn't be able to keep up. So, if you propose a small
patch, we ask that you get someone independent to review it. I don't
think it's too much of a stretch of the imagination to suggest that if
you are proposing a big patch, you need to get more independent
review. And, for the record, I've asked Waldemar for exactly this in
the past [1].
So -- certainly, let's try and get this into trunk. But the first step
isn't to monopolize the attention of a core developer for an unknown
period of time. Django is a community, not just a core team. That
community needs to be involved in the process, especially when we're
talking about a change as big as introducing support for
non-relational stores.
[1] http://groups.google.com/group/django-developers/browse_thread/thread/9208f63b2fb14acc
Yours,
Russ Magee %-)
For anyone who's interested, here's the complete diff of Django-nonrel
against Django 1.3: http://paste.pocoo.org/show/379546/
I think all those changes could fit into ~10 concrete Trac tickets.
(That doesn't mean discussions won't consume a lot of time for everybody
who's involved -- I just wanted to give people an idea of the kind and
quantity of the code changes.)
Jonas
The base64 url encoding and password resetting code is required for
MongoDB and other NoSQL DBs which have a string-based primary key. The
old code would only work with integers.
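As a rough illustration of why string keys need a different encoding,
here is a minimal sketch using only the Python standard library. It is
not the actual Django-nonrel code; encode_pk and decode_pk are
hypothetical names, and the 24-character hex string merely stands in
for a MongoDB-style key.

```python
import base64


def encode_pk(pk):
    """Encode any primary key (int or str) as a URL-safe token.

    Hypothetical helper: the key is turned into text first, so integer
    and string keys go through the same code path.
    """
    raw = str(pk).encode("utf-8")
    # Strip the "=" padding so the token is safe inside a URL path.
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_pk(token):
    """Reverse encode_pk(); always returns the key as a string."""
    # Restore the padding that encode_pk() removed.
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")


# Assumed example keys: an integer key and a string key.
int_token = encode_pk(42)
str_token = encode_pk("4d2b6b0f5c8e4a1f9b3c7d10")
```

Note that decode_pk() returns a string in both cases, so a backend with
integer keys would still have to convert the result itself.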
The file upload code is required to support App Engine's Blobstore.
That one indeed isn't exactly related to NoSQL support, but it's
needed by our users. I've already submitted a separate patch for this
change:
http://code.djangoproject.com/ticket/13721
Note that I never proposed to merge Django-nonrel directly. The
cleanup that I mentioned in my last mail would involve getting rid of
unrelated stuff (though I hope you'd still commit those changes in the
same release because they're needed to run Django on App Engine). I'd
also like to change select_related() and add a backwards-compatible
mode to AutoField as described on the wiki. Also, I'm not sure if
Model._entity_exists is acceptable because it might not be
backwards-compatible (it already breaks a few unit tests). Maybe
someone has an idea how to solve it differently?
As suggested by Russell, I'll try to explain the reasoning behind
every proposed change on the wiki page in the next few days.
Bye,
Waldemar
That would be a huge help. I'm trying to wrap my brain around the
megadiff Jonas posted, but I'm having trouble following what's going
on.
The big difference maker, for me, would be if you could separate out
the nonrel changes into a series of patches/commits that I could
review bit by bit. Seeing that sort of logical progression from low-
to high-level really helps me, personally. If you can take the time to
break stuff up in that manner I can certainly reciprocate and find the
time to review.
Jacob
Did you guys consider providing a Document class that is entirely
separate from models.Model?
Technically speaking teaching the ORM non-relational tricks is of
course possible but in reality the philosophy is entirely different
and you need to plan for NoSQL from the very beginning. Traditional
models are flat and have a schema; NoSQL documents can have extra
fields and each of them can hold a fairly complicated structure,
possibly involving numerous other (python-enforced) schemas at
different points in the tree.
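To make the contrast concrete, here is a small plain-Python sketch. It
uses no Django or MongoDB API; every name in it (the sample records,
AUTHOR_SCHEMA, COMMENT_SCHEMA, validate, validate_post) is made up for
illustration only.

```python
# Flat, schema-bound record: every row has exactly these columns.
flat_post = {"id": 1, "title": "Hello", "author_id": 7}

# Document: nested structures plus a field no schema ever declared.
document_post = {
    "_id": "4d2b6b0f5c8e4a1f9b3c7d10",
    "title": "Hello",
    "author": {"name": "Ann", "email": "ann@example.com"},
    "comments": [
        {"author": {"name": "Bob"}, "text": "Nice"},
    ],
    "legacy_rating": 4,  # extra field, present only on some documents
}

# Python-enforced schemas applied at different points in the tree.
AUTHOR_SCHEMA = {"name": str}
COMMENT_SCHEMA = {"text": str}


def validate(node, schema):
    """Check that required keys exist with the right type; extras are allowed."""
    return all(isinstance(node.get(key), expected)
               for key, expected in schema.items())


def validate_post(doc):
    """Validate the author subtree and every comment, including its author."""
    return validate(doc["author"], AUTHOR_SCHEMA) and all(
        validate(comment, COMMENT_SCHEMA)
        and validate(comment["author"], AUTHOR_SCHEMA)
        for comment in doc["comments"]
    )
```

The flat record maps directly onto a table row, while the document
needs a separate check for each nested level.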
In the end you won't be able to move models or logic between
traditional RDBMS and NoSQL engines anyway. What we get instead is
either a whole bunch of NotImplementedErrors or a heap of hacks to
simulate traditional relations in a world that does not need them.
Of course as much of the ORM API as makes sense should be supported
by the Document but I really feel these should be designed as separate
object types.
--
Patryk Zawadzki
I solve problems.
Please don't get me wrong. I have worked with RDBMS for more than a
decade but I also use django-nonrel with MongoDB on a daily basis. I
also think that the approach django-mongokit takes is much more
natural for NoSQL data than just reusing the ORM. The ORM has no way
to express complex structures and if such support is added, you will
always have to choose which subset to use. For relational tables you'd
get foreign keys and for non-relational you'd get structure semantics.
Then we have the ModelForms that would need to start producing
sub-formsets for certain structures. In the end you end up with one
swiss army knife instead of a fork and a knife. While possible, it's
not very convenient to dine using a swiss army knife.
Are EmbeddedModelField and DictField not enough to express complex structures?
Django-nonrel currently just doesn't allow running complex queries on
those fields, but that can be added.
Bye,
Waldemar