Related – A Redis-backed high performance distributed graph database

609 views
Skip to first unread message

Niklas Holmgren

unread,
Jan 14, 2012, 9:05:02 AM1/14/12
to Redis DB
Hi,

I've been working on a project using Redis for a while that I think
can be useful for other people than me. So I would like to share it
and perhaps get some feedback on it. It's a distributed graph database
called Related implemented as a Ruby gem. It's not really production
ready yet and I'm still working on the "distributed" part and on
improving the API. But I feel it is kind of ready enough to be shared
with the general public.

Take a look at it if you like and let me know what you think.

http://github.com/sutajio/related

N

Mason Jones

unread,
Jan 31, 2012, 12:18:18 PM1/31/12
to redi...@googlegroups.com
This is a really interesting project, Niklas. I hope to get some time
to try it out soon, and I'll look forward to seeing your upcoming work
on it.

> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>

emiretsk

unread,
Feb 13, 2012, 4:11:54 PM2/13/12
to Redis DB
Hi Niklas,

I read the documentation and am a little bit confused about the exact
purpose/functionality of related.
1) Is it just an ActiveRecord like wrapper around redis with build in
"relationship? functionality?
2) Would you mind giving more details on how you store the
relationship data? Do you have a list of connections for each user(a
list of friends for each user, etc.)? Can you provide some info about
the under the hood implantation of the Follower module and
shortest_path_to?

Regards,
Eugene

Josiah Carlson

unread,
Feb 14, 2012, 12:03:06 AM2/14/12
to redi...@googlegroups.com
On Mon, Feb 13, 2012 at 1:11 PM, emiretsk <eugene....@gmail.com> wrote:
> Hi Niklas,
>
> I read the documentation and am a little bit confused about the exact
> purpose/functionality of related.
> 1) Is it just an ActiveRecord like wrapper around redis with build in
> "relationship? functionality?

Looks like it.

> 2) Would you mind giving more details on how you store the
> relationship data? Do you have a list of connections for each user(a
> list of friends for each user, etc.)? Can you provide  some info about
> the under the hood implantation of the Follower module and
> shortest_path_to?

I'm not the author, but the source suggests that it's using a zset to
store relationships, hashes for metadata.

Regards,

- Josiah

> On Jan 14, 9:05 am, Niklas Holmgren <nik...@sutajio.se> wrote:
>> Hi,
>>
>> I've been working on a project using Redis for a while that I think
>> can be useful for other people than me. So I would like to share it
>> and perhaps get some feedback on it. It's a distributed graph database
>> called Related implemented as a Ruby gem. It's not really production
>> ready yet and I'm still working on the "distributed" part and on
>> improving the API. But I feel it is kind of ready enough to be shared
>> with the general public.
>>
>> Take a look at it if you like and let me know what you think.
>>
>> http://github.com/sutajio/related
>>
>> N
>

Niklas Holmgren

unread,
Feb 14, 2012, 4:56:09 AM2/14/12
to Redis DB
Hi Eugene,

## What is Related?

The similarity to ActiveRecord is mostly superficial. It adheres to
the ActiveModel interface and gives the added benefit of a convenient
way of doing things like validations, callbacks and serialization. But
conceptually Related is actually closer to FlockDB (https://github.com/
twitter/flockdb). While FlockDB is a partitioned and replicated graph
interface on top of MySQL, Related is instead more or less a fairly
thin object oriented wrapper on top of Redis sets and zsets. You could
obviously use Redis directly for simple stuff like basic set
operations, in the same way that you could use MySQL joins directly to
implement the same functionality as FlockDB. The added benefit of
using Related is that you get a consistent way of interacting with
Redis as if it was a distributed graph database. Related also
implements some basic graph algorithms and convenience modules (like
the Follower module). My intention is to add more interesting graph
algorithms and query capability in the future. Right now it only has
#path_to and #shortest_path_to.

## How does Related work under the hood?

An edge in the graph is stored as an entry in two sorted sets and two
ordinary sets and a single KV. Like this:

Related.redis.zadd(r_key(:out), self.class.weight_for(self, :out),
self.id)
Related.redis.zadd(r_key(:in), self.class.weight_for(self, :in),
self.id)
Related.redis.sadd(n_key(:out), self.end_node_id)
Related.redis.sadd(n_key(:in), self.start_node_id)
Related.redis.set(dir_key, self.id)

The above is a copy/paste directly from the source. The sorted sets
are used to lookup the edges (in both) directions. The unsorted sets
are an optimization to quickly lookup nodes without going through the
corresponding edges first. They also make it possible to do efficient
set operations like intersect, diff and union, which are not possible
to do with zsets without storing the intermediate result. The last
line in the code above (dir_key) stores the ID of the edge with the
key "{start node id}:{edge type}:{end node id}" and is an optimization
to make deleting (or manipulating) a specific edge much faster and
scale O(1) instead of O(N). It is for example used in
Related::Follower#unfollow! to avoid having to iterate through all the
follow edges to find the one to delete.

## The Follower module

The Follower module is a simple convenience module that is pretty easy
to understand simply by looking at the code (https://github.com/
sutajio/related/blob/master/lib/related/follower.rb). The interesting
part is perhaps the #friends method that computes an intersect of the
incoming and outgoing follow edges. By default, when running a
distributed setup where the graph is distributed over several Redis
nodes, the sets that store the incoming and outgoing edges for a node
is collocated on the same Redis node. If you for some reason opt to
distribute the keys in a different way Related will transparently be
able to handle that and instead do the intersect on the client side,
which is not much slower in practice for most use cases but is bound
by network bandwidth.

## Shortest path to

The #shortest_path_to method uses the Dijkstra's algorithm to find the
shortest path in the graph. It uses the sismember and smembers
commands to check set membership and to fetch the set of edges from a
node (only the IDs). The first gut feeling I had was that doing it
like that would be too slow. But I have done a lot testing and it
turns out it is very fast in practice, since Redis is so fast. But
obviously you will be limited by network bandwidth and latency when
doing it for real on a large scale (instead of memory bandwidth and
memory latency when doing naive benchmarks).


N

by

unread,
Feb 15, 2012, 2:17:32 AM2/15/12
to Redis DB
Hi Niklas,
Any interests to provide a python clone?

On Feb 14, 1:56 am, Niklas Holmgren <nik...@sutajio.se> wrote:
> Hi Eugene,
>
> ## What isRelated?
>
> The similarity to ActiveRecord is mostly superficial. It adheres to
> the ActiveModel interface and gives the added benefit of a convenient
> way of doing things like validations, callbacks and serialization. But
> conceptuallyRelatedis actually closer to FlockDB (https://github.com/
> twitter/flockdb). While FlockDB is a partitioned and replicated graph
> interface on top of MySQL,Relatedis instead more or less a fairly
> thin object oriented wrapper on top of Redis sets and zsets. You could
> obviously use Redis directly for simple stuff like basic set
> operations, in the same way that you could use MySQL joins directly to
> implement the same functionality as FlockDB. The added benefit of
> usingRelatedis that you get a consistent way of interacting with
> Redis as if it was a distributed graph database.Relatedalso
> implements some basic graph algorithms and convenience modules (like
> the Follower module). My intention is to add more interesting graph
> algorithms and query capability in the future. Right now it only has
> #path_to and #shortest_path_to.
>
> ## How doesRelatedwork under the hood?
>
> An edge in the graph is stored as an entry in two sorted sets and two
> ordinary sets and a single KV. Like this:
>
>    Related.redis.zadd(r_key(:out), self.class.weight_for(self, :out),
> self.id)
>    Related.redis.zadd(r_key(:in), self.class.weight_for(self, :in),
> self.id)
>    Related.redis.sadd(n_key(:out), self.end_node_id)
>    Related.redis.sadd(n_key(:in), self.start_node_id)
>    Related.redis.set(dir_key, self.id)
>
> The above is a copy/paste directly from the source. The sorted sets
> are used to lookup the edges (in both) directions. The unsorted sets
> are an optimization to quickly lookup nodes without going through the
> corresponding edges first. They also make it possible to do efficient
> set operations like intersect, diff and union, which are not possible
> to do with zsets without storing the intermediate result. The last
> line in the code above (dir_key) stores the ID of the edge with the
> key "{start node id}:{edge type}:{end node id}" and is an optimization
> to make deleting (or manipulating) a specific edge much faster and
> scale O(1) instead of O(N). It is for example used inRelated::Follower#unfollow! to avoid having to iterate through all the
> follow edges to find the one to delete.
>
> ## The Follower module
>
> The Follower module is a simple convenience module that is pretty easy
> to understand simply by looking at the code (https://github.com/
> sutajio/related/blob/master/lib/related/follower.rb). The interesting
> part is perhaps the #friends method that computes an intersect of the
> incoming and outgoing follow edges. By default, when running a
> distributed setup where the graph is distributed over several Redis
> nodes, the sets that store the incoming and outgoing edges for a node
> is collocated on the same Redis node. If you for some reason opt to
> distribute the keys in a different wayRelatedwill transparently be
> > > calledRelatedimplemented as a Ruby gem. It's not really production

Niklas Holmgren

unread,
Feb 16, 2012, 4:45:08 AM2/16/12
to redi...@googlegroups.com
I'm not very fluent in Python so I would probably be the wrong person to do it. But I invite anyone that is interested to create clones in whatever language they like as long as you maintain some kind of notice back to the original Related project. I've been thinking of doing a Clojure clone myself to learn Clojure, but haven't got around to that.

N

emiretsk

unread,
Feb 17, 2012, 11:00:02 AM2/17/12
to Redis DB
Niklas, thanks for the detailed response!
A few more questions:
1) Do you have any idea how does the performance or Related compare to
FlockDB?
2) Do you have any performance/memory usage results? I am just trying
to estimate how well my application is going to perform using Related/
Redis.
3) How does storing a KV for each edge helps deletion? Don't you still
have to delete the edge from the other 4 sets?
4) What's the purpose of keeping the edge info (the 2 sorted sets) -
an edge is basically a connection between 2 users, so wouldn't it be
simpler to represent it as a list of nodes for each node(similarly to
the 2 unsorted sets).
5) What's the problem intersecting sorted sets - redis provides the
ZINTERSTORE command for that.

Regards,
Eugene

Dinos

unread,
Apr 6, 2012, 3:21:51 AM4/6/12
to redi...@googlegroups.com
Hello everybody,

Do you know if there is a Related clone available in PHP?
Reply all
Reply to author
Forward
0 new messages