Persistent graph structures with Ruby/Rails

119 views
Skip to first unread message

Dmytrii Nagirniak

unread,
Nov 28, 2011, 1:52:46 AM11/28/11
to Ruby or Rails Oceania
Hi,

I wonder how you guys work with the persistent ordered graph structures.

What I need to be able to do is to propagate all the changes from a node to all its child nodes recursively.

A node is basically a permission (loaded with the information, such as subject, user, company etc) that is propagated through to the other nodes.

Doing it in SQL is probably possible but seems to be a huge pain.
Similar thing applies to Document DB (unless I stick the whole graph into a single document).

Could you please give some suggestions on this?

Andrew Newman

unread,
Nov 28, 2011, 2:03:53 AM11/28/11
to rails-...@googlegroups.com
Hopefully someone has a better solution but I've played with Gremlin
and there is a JRuby implementation called Pacer.

Pacer:
https://github.com/pangloss/pacer
http://ofallpossibleworlds.wordpress.com/2010/12/19/introducing-pacer/

Gremlin:
https://github.com/tinkerpop/gremlin/wiki

> --
> You received this message because you are subscribed to the Google Groups
> "Ruby or Rails Oceania" group.
> To post to this group, send email to rails-...@googlegroups.com.
> To unsubscribe from this group, send email to
> rails-oceani...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/rails-oceania?hl=en.
>

Chris Berkhout

unread,
Nov 28, 2011, 2:08:51 AM11/28/11
to rails-...@googlegroups.com
Not sure if it helps with your persistance needs, but there are graph
DBs out there.

Take a look at neo4j:
http://neo4j.org/
http://wiki.neo4j.org/content/Getting_Started_With_Ruby
https://github.com/andreasronge/neo4j

Cheers,
Chris

Dmytrii Nagirniak

unread,
Nov 28, 2011, 2:39:57 AM11/28/11
to rails-...@googlegroups.com

Yeah, thanks.
I know about neo4j but have been reluctant to try it due to the Java roots.

I also thought to try MagLev for a while, but couldn't get my hands onto it.

If possible, I would prefer not to diverge from SQL Db. So maybe I am just after a good way of denormalising data...

Or maybe just optimise relational DB for reads, and process the "propagations" as the backgrond job (It's ok to have a delay of a. couple of minutes)...

Dmytrii Nagirniak

unread,
Nov 28, 2011, 2:49:36 AM11/28/11
to rails-...@googlegroups.com

Thanks Andrew. I'll definitely keep that in mind.
But I'll try to avoid using Java libs if other solutions available.

Maybe I really just have to denormalise some of the data...

Nicholas Faiz

unread,
Nov 28, 2011, 2:53:08 AM11/28/11
to rails-...@googlegroups.com
OOC, have you tried doing it with a regular traversal in, say, Ruby code? Did it take very long? If you can multi-thread the tree structures it might not be that big a problem?

If it's a truly massive graph then you'd probably want something other than SQL (as stated above).

Cheers,
Nick

Dmytrii Nagirniak

unread,
Nov 28, 2011, 3:10:18 AM11/28/11
to rails-...@googlegroups.com

The graph itself is not massive (5000-10000 objects). But takes 10 seconds to traverse/update in some cases.

Probably I can multithread it. Just need to make sure DB is not in an exclusive lock :)

Or even offloading it into queue may do the job.

What do you think about it?

--
You received this message because you are subscribed to the Google Groups "Ruby or Rails Oceania" group.
To view this discussion on the web visit https://groups.google.com/d/msg/rails-oceania/-/FqGMliNa4I4J.

ben wiseley

unread,
Nov 28, 2011, 3:28:53 AM11/28/11
to rails-...@googlegroups.com
Have you considered Amazon's Map Reduce or moving the data to Redis (no-sql, in memory, ridiculously fast)?

-ben

Stuart Coyle

unread,
Nov 28, 2011, 3:32:16 AM11/28/11
to rails-...@googlegroups.com
If you don't mind using MariaDB you can use my gem: https://github.com/stuart/acts_as_oqgraph
--
Stuart Coyle
stuart dot coyle at gmail dot com



Dmytrii Nagirniak

unread,
Nov 28, 2011, 3:46:56 AM11/28/11
to rails-...@googlegroups.com
Hmm... Haven't really put much thought into it...
Not very obvious to me how to use key/value store here, apart from cache mechanism.

Amazon Map/reduce may work, but it's again Java and feels like unnecessary infrastructure.

Dmytrii Nagirniak

unread,
Nov 28, 2011, 3:53:15 AM11/28/11
to rails-...@googlegroups.com
I am already using a relational database (locally SQLite, deploy to Postgres probably). Haven't heard about MariaDB but it looks like it's just better MySQL and not sure how it solves the issues here.

Mike Williams

unread,
Nov 28, 2011, 7:13:36 AM11/28/11
to rails-...@googlegroups.com
On 28/11/2011, at 5:52 PM, Dmytrii Nagirniak wrote:

I wonder how you guys work with the persistent ordered graph structures.

What I need to be able to do is to propagate all the changes from a node to all its child nodes recursively.

A node is basically a permission (loaded with the information, such as subject, user, company etc) that is propagated through to the other nodes.

Is "propagating changes to all child node recursively" a strategy to improve query performance?

If so, you might look at an ActiveRecord extension such as my "Arboreal" (shameless plug), or DHH's "acts_as_nested_set", to allow efficient queries on a tree structure.


I did presentation last year, that included an example of using Arboreal to query down a hierarchy (e.g. of roles or groups) while joining with another table (e.g. permissions).  See page 10 of:


-- 
cheers, 
Mike Williams

Dmytrii Nagirniak

unread,
Nov 28, 2011, 4:26:28 PM11/28/11
to rails-...@googlegroups.com
A node is basically a permission (loaded with the information, such as subject, user, company etc) that is propagated through to the other nodes.
Is "propagating changes to all child node recursively" a strategy to improve query performance?

Pretty much. So instead of traversing the path "upwards", I can update it's attributes and query on it. It is not always possible/easy to do though.
For example:
Company1 includes User1.1, User1.2
Company2 includes User2
Company includes User 3

And given the permissions:
User1.1 -- Allow read X--> User 3
User1.2 -- Allow write X--> User 2
User2 -- Allow write X --> User3

In this case User3 should be able to read and write, but only when she is a part of Company 2 or 3 (both allow write), not Company 1 (only allowed reading).



If so, you might look at an ActiveRecord extension such as my "Arboreal" (shameless plug), or DHH's "acts_as_nested_set", to allow efficient queries on a tree structure.


I did presentation last year, that included an example of using Arboreal to query down a hierarchy (e.g. of roles or groups) while joining with another table (e.g. permissions).  See page 10 of:


I think I remember your talk. I did like it a lot.
I currently do something similar to fetch the subtrees using SQL LIKE and paths.
The problem there is that I need to query all the parents for a condition.
So that even if I retrieve the subtree, I will have to walk upchain on each node to verify the permissions.

If I will be able to propagate all the changes down, then it I can just query on the nodes themselves.
Although it gets much more complicated as a lot of denormilised data will be stored... Leading me back to graph DBs :)


In any case, thanks a lot to everybody for the suggestions. I am a little bit better aware of my options now.
I'll actually go and try neo4j... (and will report back if somebody is interested).

Cheers,
Dima.




Jason Nah

unread,
Nov 30, 2011, 5:39:31 AM11/30/11
to rails-...@googlegroups.com
Not sure if this is still useful...

We've used the ancestry gem (https://github.com/stefankroes/ancestry) with great success. It wasn't used in the matter that you described but might also be a suitable alternative.

Cheers,
Jason


Dmytrii Nagirniak

unread,
Nov 30, 2011, 8:42:48 AM11/30/11
to rails-...@googlegroups.com
Thanks Jason.

Yes, ancestry is definitely useful.

But I decided to play with neo4j for a bit.

Would be interesting to hear some experience from others with neo4j.

The challenge is the REST API. I can't find "decent" ActiveModel-ish gem for it.
I am not ready (scared?) to embed the database and use JRuby even though it seems to be the simplest and easiest way to go.

It's pretty interesting so far. The database size is ~400000 "entries" and is under constant "write" load from ~100 Ruby processes.
Most of the queries take ~3 ms, a bit more complex ones - ~12 ms.

I honestly can't imagine doing anything similar with SQL.


Pretty imposed so far if not taking into account that I'll have to pretty much throw away all the ActiveRecord code and tests :(

Tim Uckun

unread,
Nov 30, 2011, 3:36:47 PM11/30/11
to rails-...@googlegroups.com
Wow those are impressive figures. I'll have to give neo4j a go.

Dmytrii Nagirniak

unread,
Nov 30, 2011, 10:18:20 PM11/30/11
to rails-...@googlegroups.com
Yeah, really interesting.

Another figure.

Inserting ~1000 relationships (batched and transactional!) already having the DB with ~4 million entries.
It took ~450 ms. And it's over HTTP, not native bindings.

I am really, really now considering neo4j.
Even if I'll have to give up all the ActiveRecord goodness and all the current code.

Unfortunately the choice of tooling is a little bit disappointing:


Cheers.

Tom Adams

unread,
Dec 1, 2011, 2:56:13 AM12/1/11
to rails-...@googlegroups.com
If you're around Brisbane next week, some of the neo4j guys will be at
BroSQL [1]. They're also ding YOW in Melbourne from what I recall.

I used to work (2000-2004) for a company that built a graph DB, this
kind of performance was pretty standard back then, actually from
memory we were loading ~4-8K records (well triples) per second
linearly with DB sizes in the hundreds of millions of records [2].

It's really about having the right kind of structure for the data,
sounds like your data just isn't right for an SQL DB. For nostalgia, I
did some more digging & found this old whitepaper [3] which kind of
explains it (~p10), may be helpful.

Tom

[1] http://www.brosql.org/events/40496652/
[2] http://www.w3.org/2004/04/13-swdd/#rdfstore
[3] http://itee.uq.edu.au/~dwood/docs/new_type_of_data_management.pdf

--
tom adams
e:tomjadams<at>gmail.com

Mike Williams

unread,
Dec 1, 2011, 7:58:09 AM12/1/11
to rails-...@googlegroups.com
On 01/12/2011, at 2:18 PM, Dmytrii Nagirniak wrote:

Inserting ~1000 relationships (batched and transactional!) already having the DB with ~4 million entries.
It took ~450 ms. And it's over HTTP, not native bindings.

I am really, really now considering neo4j.
Even if I'll have to give up all the ActiveRecord goodness and all the current code.

Interesting!

Jim Webber presented on Neo4J at the YOW conference in Melbourne this morning, and make a pretty convincing case for using it to replace your RDBMS as a general-purpose data-store.  I'm interested to see how your experiments with it pan out.

-- 
cheers, 
Mike Williams

Reply all
Reply to author
Forward
0 new messages