DBRef dereferencing?

Piotr Szotkowski

Aug 25, 2010, 10:45:38 AM

to candy...@googlegroups.com

(Kind-of following-up to the sinatrarb thread, but not really.)

Inspired by Candy’s revolutionary idea(l)s (and, to some extent,
Sinatra’s simplicity) I’m toying with the obviously wrong idea of
implementing something fundamentally relational – a CRM – on top
of a certain document database.

Thus, my question: does Candy have any DBRef dereferencing support?
(I’m thinking about the implicit DBRefs, not the deprecated BSON type.)

(After grepping the sources I’m pretty sure it doesn’t, so I’ll probably
bolt something to this effect on top of it once I’m sure what I’m do^W^W
I want to play with, but just checking.)

— Piotr Szotkowski
--
I consider chromatic to be Larry Wall’s Thomas Huxley.
[Jarkko Hietaniemi, hates-software]

Stephen Eley

Aug 25, 2010, 1:13:34 PM

to candy...@googlegroups.com

On Wed, Aug 25, 2010 at 10:45 AM, Piotr Szotkowski
<chas...@chastell.net> wrote:
>
> Inspired by Candy’s revolutionary idea(l)s (and, to some extent,
> Sinatra’s simplicity) I’m toying with the obviously wrong idea of
> implementing something fundamentally relational – a CRM – on top
> of a certain document database.

Nifty! That sounds like an interesting experiment. It might take some
reconceptualizing of what goes in the contact records to really make
it smooth, but I'll bet you could come up with something that'll turn
heads if you can do that. >8->

> Thus, my question: does Candy have any DBRef dereferencing support?
> (I’m thinking about the implicit DBRefs, not the deprecated BSON type.)

Alas, not at this time. It's on the "Make this happen" list, because
it's such an obviously good idea and because you couldn't really call
an ORM 'transparent' if you can't have pointers between objects. But
it's not there yet.

There are also some design issues to consider before the idea can be
considered fully baked. For instance:

* How do you tell Candy the difference between linking an object and
embedding an object? I'd like to keep it simple and have sensible
operators for each, but assignment can only be used once. Should
linking be a traditional method call instead? Should we go nuts and
override ** or ~ or some operator for something it was never intended
for?

* Should following the reference be eager or lazy? Eager loading --
querying for the referenced document as soon as we load the original
document -- would certainly be possible (it's one of the reasons for
Crunch, with its asynchronous goodness), but would also result in a
lot of extra database traffic. Lazy loading -- querying only when we
ask for that attribute's value -- would be more efficient but slower
from the user's perspective. We could make it an option and let the
app developer decide, but then that's one more thing to think about,
and it means trying to predict usage patterns in advance.
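
To make the eager/lazy trade-off concrete, here is a minimal sketch in
plain Ruby. It is not Candy code: FakeStore and LazyRef are hypothetical
names, and the "database" is just an in-memory hash that counts lookups.
It only shows where the queries happen under each strategy.

```ruby
# In-memory stand-in for a database; counts how many lookups are made.
# (Hypothetical class, not part of Candy or the MongoDB driver.)
class FakeStore
  attr_reader :queries

  def initialize(docs)
    @docs = docs
    @queries = 0
  end

  def find(collection, id)
    @queries += 1
    @docs.fetch(collection).fetch(id)
  end
end

# A reference that hits the store only the first time its target is requested.
class LazyRef
  def initialize(store, collection, id)
    @store = store
    @collection = collection
    @id = id
    @loaded = false
  end

  def target
    unless @loaded
      @target = @store.find(@collection, @id)
      @loaded = true
    end
    @target
  end
end

store = FakeStore.new(
  "people" => { 1 => { "name" => "Alice" }, 2 => { "name" => "Bob" } }
)

# Eager: resolve every reference as soon as the parent document is loaded.
eager = [1, 2].map { |id| store.find("people", id) }
eager_queries = store.queries

# Lazy: build the references up front, query only for the one actually used.
lazy = [1, 2].map { |id| LazyRef.new(store, "people", id) }
lazy.first.target
lazy.first.target # second access is served from the cached value
lazy_queries = store.queries - eager_queries
```

In this toy run the eager version pays for both lookups whether or not
the documents are used, while the lazy version pays for one, at the
moment the attribute is first read.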

These aren't unresolvable problems. They just require some thought.
I'd love to hear what you think. Plural "you," extended to everyone
on the list.

One thing I do feel intuitively is that DBRef linking, while it should
be made easy enough, probably shouldn't be *easier* than embedding.
I wouldn't want to encourage it for use cases where embedding makes
more sense. DBRef is treated like a red-headed stepchild in the
MongoDB docs for a reason: it's _always_ going to require extra
round-trips, and it'll _never_ be as efficient as a join in a SQL
database.

Any application where DBRefs have to be followed all the time would
likely be better off using a relational database instead. I know
that's not the trendy answer, but it's true. I'm even looking at
rewriting a meeting proposals app I wrote with MongoMapper and taking
it back to Postgres and ActiveRecord, for just that reason. (I hope
they don't take away my MongoDB coffee mug for saying that.)

--
Have Fun,
   Steve Eley (sfe...@gmail.com)
   ESCAPE POD - The Science Fiction Podcast Magazine
   http://www.escapepod.org

Piotr Szotkowski

Aug 26, 2010, 8:25:57 AM

to candy...@googlegroups.com

Stephen Eley:

> On Wed, Aug 25, 2010 at 10:45 AM, Piotr Szotkowski
> <chas...@chastell.net> wrote:

>> Inspired by Candy’s revolutionary idea(l)s (and, to some extent,
>> Sinatra’s simplicity) I’m toying with the obviously wrong idea of
>> implementing something fundamentally relational – a CRM – on top
>> of a certain document database.

> Nifty! That sounds like an interesting experiment. It might take some
> reconceptualizing of what goes in the contact records to really make
> it smooth, but I'll bet you could come up with something that'll turn
> heads if you can do that. >8->

Much, much more probably something that will spectacularly collapse
on top of contact0’s head; fortunately, my long-time goal is *not* to
have something that works, but rather a testbed for ideas that we can
incorporate at some point at my day job, an open source CRM for NGOs:
http://civicrm.org/ – so, obviously, doing *everything* differently
(and seeing what sticks) is the right way to go in this case. ;)

>> Thus, my question: does Candy have any DBRef dereferencing support?
>> (I’m thinking about the implicit DBRefs, not the deprecated BSON type.)

> Alas, not at this time. It's on the "Make this happen" list, because
> it's such an obviously good idea and because you couldn't really call
> an ORM 'transparent' if you can't have pointers between objects. But
> it's not there yet.

Phew, so my grepping powers are not *that* weak. :)

> There are also some design issues to consider before
> the idea can be considered fully baked. For instance:

> * How do you tell Candy the difference between
> linking an object and embedding an object?

My tentative idea is to embed things sans mongo id and link things with
one. But I’m not even sure whether it makes sense in a general case.
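
A rough sketch of that rule in plain Ruby, assuming documents are plain
hashes and that a saved document carries an "_id" key. The helper name
serialize_value is made up for illustration; the "$ref"/"$id" pair is
the MongoDB DBRef convention.

```ruby
# Hypothetical helper: decide how a value is stored inside its parent document.
# A hash that already carries an "_id" is treated as a saved document and is
# replaced by a DBRef-style pointer; anything else is embedded unchanged.
def serialize_value(value, collection)
  if value.is_a?(Hash) && value.key?("_id")
    { "$ref" => collection, "$id" => value["_id"] }
  else
    value
  end
end

organisation = { "_id" => 42, "name" => "Example NGO" }         # has an id: link
address      = { "street" => "1 Main St", "city" => "Lublin" }  # no id: embed

person = {
  "name"     => "Alice",
  "employer" => serialize_value(organisation, "organisations"),
  "address"  => serialize_value(address, "addresses")
}
```

Here person["employer"] ends up as a pointer into "organisations",
while person["address"] is stored inline.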

> * Should following the reference be eager or lazy?

Again – *in my case* – eager following would mean you’d probably
load the whole database every time, so lazy loading seems like
a good starting point.

> One thing I do feel intuitively is that DBRef linking, while it should
> be made easy enough, probably shouldn't be *easier* than embedding.

Agreed!

(Following are some not-really-Candy-related ramblings that
I just couldn’t resist writing about, feel free to skip.)

------- 8< ------- 8< -------

> Any application where DBRefs have to be followed all the time
> would likely be better off using a relational database instead.

But we already do have a CRM on top of MySQL. ;) Seriously, though:
the problem in our case (or at least my feeling, which may be heavily
influenced by my growing need to play with Ruby and Mongo instead of
PHP and MySQL) is that more and more use cases require storing custom
data (about contacts, events, cross-entity relations, etc.) and, in
general, juggling all kinds of such custom extensions seems to be much
easier in a schema-less way.

My vague idea of where I want to start is a system with only two kinds
of ‘things’: entities (people, groups, organisations, events, etc.)
and relations between them. Each entity type would have its own
class/collection, with a separate, common class/collection for
relations; each relation would store the two DBRefs it connects
and anything that makes sense in its case (relation type, the method
under which the relation should be exposed on either end,
maybe dates when it was active, etc.).
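
As a sketch only, a relation document along those lines might look like
the hash below. Every field name apart from "$ref" and "$id" (the
MongoDB DBRef convention) is an assumption made for illustration, as are
the helpers ref and related_refs; an array stands in for the shared
relations collection.

```ruby
require "date"

# Build a DBRef-style pointer to a document in the given collection.
def ref(collection, id)
  { "$ref" => collection, "$id" => id }
end

alice  = ref("people", 1)
picnic = ref("events", 10)

# The shared relations collection: two ends plus whatever metadata applies.
relations = [
  {
    "a"           => alice,
    "b"           => picnic,
    "type"        => "participation",
    "a_method"    => "events",        # name the relation is exposed under on end a
    "b_method"    => "participants",  # name the relation is exposed under on end b
    "active_from" => Date.new(2010, 8, 25)
  }
]

# Collect the refs on the far side of every relation that the given entity
# exposes under method_name.
def related_refs(relations, entity_ref, method_name)
  relations.flat_map do |rel|
    if rel["a"] == entity_ref && rel["a_method"] == method_name
      [rel["b"]]
    elsif rel["b"] == entity_ref && rel["b_method"] == method_name
      [rel["a"]]
    else
      []
    end
  end
end

participants = related_refs(relations, picnic, "participants")
events       = related_refs(relations, alice, "events")
```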

(The obvious first question is whether addresses should be embedded or
factored out to their own class; people from one family do live together
and people from a single organisation do share its address as their
work address – but then *most* addresses will be related to exactly
one entity, so they might as well be embedded…)

My vague idea about the API would be to have, say,
event.rels.participants returning an enumerable of
DBRefs, with event.participants returning an enumerable
with the actual (lazily dereferenced) objects.
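
A minimal sketch of how that two-level API could behave, in plain Ruby
with an in-memory hash instead of a real database. Entity, Rels, STORE,
FETCHED and dereference are all hypothetical names; nothing here is
Candy's API. The laziness comes from Ruby's Enumerator::Lazy.

```ruby
# In-memory stand-in for the database (hypothetical data).
STORE = {
  "people" => { 1 => { "name" => "Alice" }, 2 => { "name" => "Bob" } }
}
FETCHED = [] # records every dereference so the laziness is visible

def dereference(ref)
  FETCHED << ref
  STORE.fetch(ref["$ref"]).fetch(ref["$id"])
end

# Exposes the raw DBRefs, one method per relation name.
class Rels
  def initialize(refs_by_name)
    @refs_by_name = refs_by_name
  end

  def respond_to_missing?(name, include_private = false)
    @refs_by_name.key?(name.to_s) || super
  end

  def method_missing(name, *args)
    return super unless @refs_by_name.key?(name.to_s)
    @refs_by_name[name.to_s].each # an Enumerator over the refs
  end
end

# Exposes the referenced objects, dereferenced only when iterated.
class Entity
  attr_reader :rels

  def initialize(refs_by_name)
    @refs_by_name = refs_by_name
    @rels = Rels.new(refs_by_name)
  end

  def respond_to_missing?(name, include_private = false)
    @refs_by_name.key?(name.to_s) || super
  end

  def method_missing(name, *args)
    return super unless @refs_by_name.key?(name.to_s)
    @refs_by_name[name.to_s].lazy.map { |ref| dereference(ref) }
  end
end

event = Entity.new(
  "participants" => [
    { "$ref" => "people", "$id" => 1 },
    { "$ref" => "people", "$id" => 2 }
  ]
)

refs       = event.rels.participants.to_a # DBRefs only, nothing fetched
people     = event.participants           # lazy enumerable, still nothing fetched
first_name = people.first["name"]         # dereferences only the first ref
```

Calling people.first pulls a single document through dereference; the
second participant is never fetched unless something iterates further.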

My initial goal is for this to be trivially extensible; you’re Russian
and need to track patronymics? Sure thing, *just start using them*.
You’re running a Jewish organisation and need to track after/before
sunset features of your contacts’ birthdays? Go ahead, make them
first-class properties in *this* CRM… ;)

For such things it really does help if the underlying system is
schema-less – and yes, this is the exact moment where I try to turn
my head away from graph databases like Neo4j…

— Piotr Szotkowski
--
Progress! Bars!
[Levin Alexander]
