Does Arango support bi-directional edges?

Michael Woytowitz

Oct 13, 2012, 6:58:05 PM
to aran...@googlegroups.com
I am evaluating ArangoDB for storing social network data.
I have some directional relations which would be modeled as directional edges.
Example of this would be a relation like reports-to (a person reports to another person)
 
But I also have bi-directional relations.
Examples:
Friend is the most obvious bi-directional relation. I would prefer not to model it as two separate edges.
Another example is co-author: a person is associated with another person as a co-author on an article. There is no direction, or it could be viewed as bi-directional.

Thank you in advance for your response.
ArangoDB looks like a very promising NoSQL database

Frank Celler

Oct 14, 2012, 6:43:13 AM
to aran...@googlegroups.com
You always create edges with a direction, but you can view them either directed or undirected:

Create three persons p1, p2, and p3:

arangosh> db._create("persons");
[ArangoCollection 1524135, "persons" (status loaded)]
arangosh> var p1 = db.persons.save({ name: "p1" });
arangosh> var p2 = db.persons.save({ name: "p2" });
arangosh> var p3 = db.persons.save({ name: "p3" });

Relate p1 to p2 and p2 to p3:

arangosh> edges.relations.save(p1, p2, { type: "knows" });
{ error : false, _id : "3162535/4604327", _rev : 4604327 }
arangosh> edges.relations.save(p2, p3, { type: "hates" });
{ error : false, _id : "3162535/4669863", _rev : 4669863 }

Then you can check the directed edges:

arangosh> edges.relations.inEdges(p2);
[{ _id : "3162535/4604327", _rev : 4604327, _from : "1524135/2965927", _to : "1524135/3031463", type : "knows" }]
arangosh> edges.relations.outEdges(p2);
[{ _id : "3162535/4669863", _rev : 4669863, _from : "1524135/3031463", _to : "1524135/3096999", type : "hates" }]

Or the undirected view:

arangosh> edges.relations.edges(p2);
[{ _id : "3162535/4604327", _rev : 4604327, _from : "1524135/2965927", _to : "1524135/3031463", type : "knows" }, { _id : "3162535/4669863", _rev : 4669863, _from : "1524135/3031463", _to : "1524135/3096999", type : "hates" }]
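
For readers without an arangosh session at hand, the three views can be mimicked in plain JavaScript. This is only a sketch: the in-memory `relations` array and the helper names `inEdges`, `outEdges` and `anyEdges` are stand-ins made up for illustration, not ArangoDB's implementation.

```javascript
// Hypothetical in-memory stand-in for the "relations" edge collection.
const relations = [
  { _from: "p1", _to: "p2", type: "knows" },
  { _from: "p2", _to: "p3", type: "hates" },
];

// Directed views: edges pointing at, or away from, a vertex.
const inEdges = (edges, v) => edges.filter((e) => e._to === v);
const outEdges = (edges, v) => edges.filter((e) => e._from === v);

// Undirected view: every edge that touches the vertex at either end.
const anyEdges = (edges, v) =>
  edges.filter((e) => e._from === v || e._to === v);

console.log(inEdges(relations, "p2").map((e) => e.type));  // [ 'knows' ]
console.log(outEdges(relations, "p2").map((e) => e.type)); // [ 'hates' ]
console.log(anyEdges(relations, "p2").map((e) => e.type)); // [ 'knows', 'hates' ]
```

The stored data is the same in all three cases; only the lookup decides whether direction matters.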

Does this answer your question? Or did you have something different in mind?

Kind Regards
   Frank

Michael Woytowitz

Oct 15, 2012, 9:35:11 AM
to aran...@googlegroups.com
Thank you for your quick reply.

I do have something different in mind. I am currently using an XML database to store and analyze large social networks. Having a JSON database would be an advantage for me since my user interface is based on ajax-json. I also like your concept of using mruby or JavaScript to create PL/SQL-like procedures in the database.

But I have found that modeling relations (edges) based on 'from' and 'to' has issues when it comes to indexing and getting all the edges for a given vertex.

My current database stores edges more like this (converted to JSON format):

[{ _id : "3162535/4604327", _rev : 4604327, _vertices : ["1524135/2965927", "1524135/3031463"], _directed : true, type : "knows-of" }]

The advantage is that it represents asymmetric and symmetric edges more accurately.

If _directed is true, then the 'from' is _vertices[0] and the 'to' is _vertices[1].
If _directed is false, there is no 'from' or 'to'.
With this approach, my current database requires only one index on the elements of the array; there is no need for separate indexes (or queries) on 'from' and 'to'.
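
To make the single-index idea concrete, here is a small plain-JavaScript sketch. It assumes the `_vertices`/`_directed` document shape shown above; the helper names `buildVertexIndex` and `describe` and the short ids are hypothetical, and nothing here reflects how any particular database builds its indexes.

```javascript
// Hypothetical edge documents in the "_vertices" style described above.
const edges = [
  { _id: "e1", _vertices: ["a", "b"], _directed: true, type: "knows-of" },
  { _id: "e2", _vertices: ["b", "c"], _directed: false, type: "co-author" },
];

// Build one index: vertex id -> ids of all edges that mention the vertex.
function buildVertexIndex(edgeList) {
  const index = new Map();
  for (const edge of edgeList) {
    // The Set keeps a self-loop from being indexed twice for one vertex.
    for (const vertex of new Set(edge._vertices)) {
      if (!index.has(vertex)) index.set(vertex, []);
      index.get(vertex).push(edge._id);
    }
  }
  return index;
}

// Direction is derived from the flag and the position in the array.
function describe(edge) {
  const [first, second] = edge._vertices;
  return edge._directed ? `${first} -> ${second}` : `${first} -- ${second}`;
}

const vertexIndex = buildVertexIndex(edges);
console.log(vertexIndex.get("b")); // [ 'e1', 'e2' ]
console.log(edges.map(describe));  // [ 'a -> b', 'b -- c' ]
```

One lookup in `vertexIndex` returns every edge of a vertex; whether an edge is incoming or outgoing is decided afterwards from `_directed` and the array position.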

What are your opinions on this approach?

Reference from Wikipedia:
The edges may be directed (asymmetric) or undirected (symmetric). For example, if the vertices represent people at a party, and there is an edge between two people if they shake hands, then this is an undirected graph, because if person A shook hands with person B, then person B also shook hands with person A. On the other hand, if the vertices represent people at a party, and there is an edge from person A to person B when person A knows of person B, then this graph is directed, because knowledge of someone is not necessarily a symmetric relation (that is, one person knowing another person does not necessarily imply the reverse; for example, many fans may know of a celebrity, but the celebrity is unlikely to know of all their fans). This latter type of graph is called a directed graph and the edges are called directed edges or arcs.

Jan Steemann

Oct 15, 2012, 11:11:02 AM
to aran...@googlegroups.com, Michael Woytowitz
Hi there,

Frank is currently out of the office so I'll respond on his behalf:

Edges in ArangoDB currently are not really directional. For example,
creating the following edge between vertices p1 and p2 will not mark it
as directed or undirected:

arangosh> edges.relations.save(p1, p2, { type: "knows" });

What's happening behind the scenes is that one edge document is created,
and that the id of the p1 document is stored in its "_from" attribute.
The id of the p2 document is stored in its "_to" attribute.

"_from" and "_to" attributes are both indexed in edge collections so
querying them should work efficiently.

As mentioned, the edge itself does not know whether it is uni- or
bidirectional. Only the relation between p1 and p2 is stored in the
edge, but the edge has no idea about the direction of the relation.

The attribute names "_from" and "_to" may therefore be somewhat
misleading. They do not necessarily indicate a direction. Somewhat
better names for them would be "vertex1" and "vertex2".

When querying edges for any vertex, you can decide what to query:
- outgoing edges can be queried using the .outEdges() method. This will
query the index on "_from"
- incoming edges can be queried using the .inEdges() method. This will
query the index on "_to"
- undirected relations can be queried using the .edges() method. This
will query the indexes on "_from" and "_to"

In the end that means the directional information currently only exists
implicitly during querying. The way you query defines which end of the
edges ("_from", "_to" or both) will be searched.

If some of your edges are meant to be one-directional only, the only way
to do this currently is store that in extra attributes of the edges,
e.g. inside a "type" attribute. Additionally, this "type" attribute must
be checked when processing the results of the edge queries so the
results can be restricted to the nodes of matching types/desired
directions. Using a "type" attribute is of course only an example, you
can use other attributes as you like.
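
As an illustration of that post-filtering step, here is a plain-JavaScript sketch. The boolean user attribute `directional`, the helper `neighbours` and the sample edges are assumptions chosen for the example; any other attribute would work the same way.

```javascript
// Hypothetical edges; "directional" is a user attribute, not a system one.
const edges = [
  { _from: "p1", _to: "p2", type: "reports-to", directional: true },
  { _from: "p3", _to: "p2", type: "friend", directional: false },
  { _from: "p2", _to: "p4", type: "friend", directional: false },
];

// Vertices reachable from v in one step: directional edges are followed
// only from _from to _to, all other edges from either end.
function neighbours(edgeList, v) {
  const result = [];
  for (const e of edgeList) {
    if (e._from === v) result.push(e._to);
    else if (e._to === v && !e.directional) result.push(e._from);
  }
  return result;
}

console.log(neighbours(edges, "p2")); // [ 'p3', 'p4' ]
console.log(neighbours(edges, "p1")); // [ 'p2' ]
```

Here p2 does not reach p1, because the "reports-to" edge is marked as directional and points the other way.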

I hope this is helpful and that you can still use this model to store
and query your graph with it.

Best regards
Jan

Michael Woytowitz

Oct 15, 2012, 2:05:04 PM
to aran...@googlegroups.com
Thank you for your reply and detailed explanation Jan,

Based on your statement:
The attribute names "_from" and "_to" may therefore be somewhat 
misleading. They do not necessarily indicate a direction. Somewhat 
better names for them would be "vertex1" and "vertex2". 

Any possibility of changing the internal edge property names to "_vertex1" and "_vertex2"?

I'm sure I can work around this and think "_vertex1" and "_vertex2" in my head when I see "_from" and "_to". But it would be nicer if they were simply named "_vertex1" and "_vertex2".

Could a boolean "_directional" property be added as an enhancement and passed on construction?

I'll work on building a little prototype to see if I encounter any other concerns.

Jan Steemann

Oct 15, 2012, 2:53:42 PM
to aran...@googlegroups.com, Michael Woytowitz
Hi Michael,

the problem with changing the "_from" and "_to" attribute names to
something else is that this would break compatibility with client APIs.
So I think we need to keep these names for a while, even though they
might be really misleading for bi-directional edges.

Re the boolean "_directional" property you mentioned:
You can already store arbitrary extra attributes when creating an edge.
Example:
edges.relations.save(f1, f2, {"directional":true});

There is a restriction for the attribute names though:
attribute names starting with an underscore are reserved for ArangoDB
internal usage (e.g. "_id", "_rev", "_from", "_to") and should not be
used by end users for their own purposes. That's the reason why I named
the attribute "directional" instead of "_directional" in the above example.
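
A client-side check along these lines could catch reserved names before an edge is saved. This is only a sketch: `reservedNames` is a hypothetical helper, and it says nothing about what the server itself does with such attributes.

```javascript
// Returns the attribute names that collide with the reserved "_" prefix.
function reservedNames(attributes) {
  return Object.keys(attributes).filter((name) => name.startsWith("_"));
}

console.log(reservedNames({ directional: true, type: "knows" }));  // []
console.log(reservedNames({ _directional: true, type: "knows" })); // [ '_directional' ]
```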

I am not sure whether storing extra user attributes in edges solves your
problem or if you're asking for a standard "_direction" attribute for
all edges.
If so, could you please open an issue for this in our Github issue
tracker (https://github.com/triAGENS/ArangoDB/issues)? That would be
really helpful.

Thank you and best regards
Jan

Frank Mayer

Oct 22, 2012, 8:46:19 AM
to aran...@googlegroups.com, Michael Woytowitz, j.ste...@triagens.de
Jan,
we could put in a flag in order to get the expected result for the APIs.

By default (without the flag, or with the flag set to 'unidirectional') the returned attributes would be _from _to. And with the flag set to 'bidirectional' it would be _node1, _node2 or whatever the naming should be. 
That way we don't break compatibility... and also leave it open for future versions to set the default to bidirectional.

Frank

Jan Steemann

Oct 22, 2012, 11:50:18 AM
to Frank Mayer, aran...@googlegroups.com, Michael Woytowitz
I think the current default behavior in ArangoDB is to treat all edges
as bi-directional edges and let the end user do the filtering by
specifying the query appropriately.
This may not be enough for traversals of more complex graphs, though.

Adding an optional attribute (e.g. "_bidirectional" or
"_unidirectional") to edges might solve this. This attribute could be
used automatically when querying edges.

The attribute name "_unidirectional" doesn't sound great but would be
the most downwards-compatible from my point of view:
- if omitted, edges would be treated as bi-directional (same as now).
- if false, edges would be treated as bi-directional (explicitly)
- if true, edges would be treated as directed from "_from" to "_to" (but
not vice versa)

Whereas with "_bidirectional" we might need to turn the behavior around,
making it not downwards-compatible.
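
The three cases for "_unidirectional" can be sketched in plain JavaScript as follows. The attribute is only a proposal at this point in the thread, and `canTraverse` is a hypothetical helper, not an existing API.

```javascript
// Can the edge be followed from vertex `from` to vertex `to`?
function canTraverse(edge, from, to) {
  const forward = edge._from === from && edge._to === to;
  const backward = edge._from === to && edge._to === from;
  // Omitted or false: bi-directional. Only an explicit true restricts.
  return edge._unidirectional === true ? forward : forward || backward;
}

const legacy = { _from: "a", _to: "b" }; // flag omitted
const explicitBi = { _from: "a", _to: "b", _unidirectional: false };
const directed = { _from: "a", _to: "b", _unidirectional: true };

console.log(canTraverse(legacy, "b", "a"));     // true
console.log(canTraverse(explicitBi, "b", "a")); // true
console.log(canTraverse(directed, "b", "a"));   // false
console.log(canTraverse(directed, "a", "b"));   // true
```

Edges stored before the attribute existed fall into the "omitted" case, so they keep behaving as they do today.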

What do you (and the others) think about it?

Best regards
Jan

Frank Mayer

Oct 22, 2012, 12:59:49 PM
to aran...@googlegroups.com, Frank Mayer, Michael Woytowitz, j.ste...@triagens.de
Hi Jan,


Yes, that sounds good. I forgot that the edges are bidirectional by default, so your proposal is correct (the other way around from mine). However, this should also somehow take the node attribute names into consideration (_from & _to in contrast to _node1 & _node2, or whatever they would be called). As it is now, _from & _to behave as bidirectional internally, but to the user they seem unidirectional, as pointed out in this thread. Maybe, building on what you proposed:
- if omitted, edges would be treated as bi-directional (same as now). ===> API would expect/return _from & _to just like now

- if false, edges would be treated as bi-directional (explicitly) ===> API would expect/return _node1 & _node2 instead of _from & _to

- if true, edges would be treated as directed from "_from" to "_to" (but
not vice versa) ===> API would expect/return _from & _to

Best regards,
Frank

Michael Woytowitz

Oct 23, 2012, 7:28:26 PM
to Frank Mayer, aran...@googlegroups.com, Frank Mayer, j.ste...@triagens.de
You could use the keywords _symmetric and _asymmetric

Or we just use _directional as a Boolean, as in our current model

-- Michael

Michael Woytowitz

Oct 23, 2012, 7:34:30 PM
to Frank Mayer, aran...@googlegroups.com, Frank Mayer, j.ste...@triagens.de
Sorry

I meant to type
 _directed Boolean 
is what we use in our current model


-- Michael

Jan Steemann

Oct 24, 2012, 2:35:59 AM
to Michael Woytowitz, Frank Mayer, aran...@googlegroups.com
Yes, I agree. "_directed" is better than "_unidirectional".
Thanks.



Michael Woytowitz

Oct 24, 2012, 7:01:05 AM
to aran...@googlegroups.com, Michael Woytowitz, Frank Mayer, j.ste...@triagens.de
Hi Jan, Frank,

I would suggest not using the _from and _to keywords at all if the keyword _directed is present.
This provides consistency: I always get two values, _node1 and _node2, and use the boolean value _directed to know whether the edge is undirected (symmetric) or directed (asymmetric). If _directed = true, then _node1 is considered the start point and _node2 the end point.

Could you please summarize how the new proposed edge API/data structure would look, including method calls and traversal?

Specifically:
_node1 (number) _node2 (number) _directed (boolean)
OR
_vertex1 (number) _vertex2 (number) _directed (boolean) (I am not sure whether your standard is to use the name "node" or "vertex" throughout your API)

And also please summarize the new behavior and in which version/build this update would be available.
This would provide a good conclusion to this thread.

Thank you in advance,
Michael

Frank Mayer

Oct 24, 2012, 9:45:01 AM
to aran...@googlegroups.com, Michael Woytowitz, Frank Mayer, j.ste...@triagens.de
Hi Michael,

That's not a bad idea, but _from and _to will still have to stay around for old installations, at least for the 1.x versions. It's not the best thing to remove features in minor version upgrades unless it's really necessary and unavoidable.

So this is why there has to be some time where the "old" and the "new" must coexist without breaking existing installations.

I guess the devs could deprecate it for the remaining 1.x versions and disable it completely in 2.0.

What do you think, Jan?

Best regards,
Frank

Jan Steemann

Oct 24, 2012, 9:53:23 AM
to Frank Mayer, aran...@googlegroups.com, Michael Woytowitz
Hi all,

I am currently in the middle of implementing some other things.
I will get back to you on the subject as soon as I can.

Best regards
Jan

Michael Woytowitz

Oct 24, 2012, 3:07:51 PM
to aran...@googlegroups.com, Michael Woytowitz, Frank Mayer, j.ste...@triagens.de
Hi Frank,

That was exactly what I was thinking: define the revised API and notify users that the old API will be deprecated in version 2.x.
I think both could co-exist since they use different property names. You could just have both sets of properties defined in the edge:

edge:  _from(number?)  _to(number?)  _vertex1(number?)  _vertex2(number?)  _directed(boolean)

The only issue is that the index might be bigger than desired. It would be nice if one could choose which style to set up on install.

~Michael

Frank Mayer

Oct 24, 2012, 3:23:35 PM
to aran...@googlegroups.com, Michael Woytowitz, Frank Mayer, j.ste...@triagens.de
Hi Michael,
That sounds good...

...about the index bloat: maybe Jan can do some magic to alias the attribute names, providing both sets in the API while internally using only the new set. That would not duplicate index data.

Let's see what he's got to say about this when he's back from whatever nice thing he's "cooking" into ArangoDB. :)

However, keep them ideas coming, Michael. It's a young but promising project. :)

Best regards,
Frank

Jan Steemann

Oct 25, 2012, 12:39:08 PM
to aran...@googlegroups.com, Frank Mayer, Michael Woytowitz
Hi all,

today I found some time to check in more detail how edges behave at the
moment.

When creating an edge, you can store arbitrary payload information along
with the edge, e.g. the "some-stuff" attribute for the following edge
between vertices v1 & v2:

edges.e.save(v1, v2, { "some-stuff" : true });

Edges are not explicitly directed or undirected in ArangoDB. There is no
standard way to tell the system an edge is directed or undirected. You
can use any non-standard attribute for this info, however, it's hard to
filter on that later.

When an edge gets created, it's currently inserted into an edges index
of the edge collection.
In the above case, there will be following inserts into the edges index:
- vertex: v1, direction: OUT
- vertex: v2, direction: IN

This is reasonable and looks like the edge indeed carries some direction
with it (v1 -> v2, or _from -> _to). You can later look up the incoming
and outgoing edges using the .inEdges(vertex) or .outEdges(vertex)
functions.

However, the following index entries are also made when creating an
edge, in addition to the two entries above:
- vertex: v1, direction: ANY
- vertex: v2, direction: ANY

The only use case for these additional index entries is to quickly look
up all edges connected to a vertex, regardless of whether the original
edges were in or out edges.

That means four index entries were made per edge instead of just two
(exception: only three entries were made for edges that had identical
_from and _to values, but that should have been rare cases).

That of course bloats the index, and still edge direction is only implicit.
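
The entry counts above can be reproduced with a small plain-JavaScript sketch. `legacyIndexEntries` is a hypothetical helper that only mirrors the description in this post; it is not the actual index code.

```javascript
// Index entries created for one edge under the scheme described above.
function legacyIndexEntries(edge) {
  const entries = [
    { vertex: edge._from, direction: "OUT" },
    { vertex: edge._to, direction: "IN" },
    { vertex: edge._from, direction: "ANY" },
  ];
  // A self-loop needs only one ANY entry, hence three entries in total.
  if (edge._to !== edge._from) {
    entries.push({ vertex: edge._to, direction: "ANY" });
  }
  return entries;
}

console.log(legacyIndexEntries({ _from: "v1", _to: "v2" }).length); // 4
console.log(legacyIndexEntries({ _from: "v1", _to: "v1" }).length); // 3
```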


After thinking a bit about this, the solution I have come up with is the
following:

Only create two index entries per edge:
- vertex: v1, direction: OUT
- vertex: v2, direction: IN
These should do it, and if you need an ANY query (i.e. .edges(vertex)),
we can merge the results of an OUT and an IN query.
This prevents index bloat.
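
A plain-JavaScript sketch of that merge, under the assumption that the index is a flat list of entries; `buildEdgeIndex`, `lookup` and `lookupAny` are hypothetical names used only for this example.

```javascript
// Two index entries per edge: one OUT, one IN.
function buildEdgeIndex(edges) {
  const index = [];
  for (const e of edges) {
    index.push({ vertex: e._from, direction: "OUT", edge: e });
    index.push({ vertex: e._to, direction: "IN", edge: e });
  }
  return index;
}

const lookup = (index, vertex, direction) =>
  index
    .filter((entry) => entry.vertex === vertex && entry.direction === direction)
    .map((entry) => entry.edge);

// ANY = OUT merged with IN; the Set drops the second copy of a self-loop.
const lookupAny = (index, vertex) => [
  ...new Set([...lookup(index, vertex, "OUT"), ...lookup(index, vertex, "IN")]),
];

const index = buildEdgeIndex([
  { _from: "v1", _to: "v2", what: "v1->v2" },
  { _from: "v2", _to: "v2", what: "v2->v2" },
]);
console.log(lookupAny(index, "v2").map((e) => e.what)); // [ 'v2->v2', 'v1->v2' ]
```

The self-loop shows up in both the OUT and the IN result, so the merge has to remove duplicates.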


Additionally, I think we should store per edge whether it is directed or
undirected. This info can be passed to ArangoDB on edge creation.
My suggestion is to use the following syntax for this:

edges.e.save(v1, v2, { "some-stuff" : true, "_bidirectional" : true });

The _bidirectional attribute would be a new system attribute available
for edges. It is stored internally, and if set to true, it will cause
two additional index entries to be made:
- vertex: v2, direction: OUT
- vertex: v1, direction: IN
That's the reverse of the above entries and seems plausible, as the edge
is bidirectional.

To summarize so far, if an edge is unidirectional, two index entries
will be made. If it's bidirectional, four entries will be made.
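
The same rule as a plain-JavaScript sketch; `proposedIndexEntries` is a hypothetical helper that follows the description above and is not taken from the devel branch.

```javascript
// Index entries for one edge under the proposal: two for a unidirectional
// edge, four when "_bidirectional" is true.
function proposedIndexEntries(edge) {
  const entries = [
    { vertex: edge._from, direction: "OUT" },
    { vertex: edge._to, direction: "IN" },
  ];
  if (edge._bidirectional === true) {
    // The reverse pair makes the edge visible from both ends.
    entries.push(
      { vertex: edge._to, direction: "OUT" },
      { vertex: edge._from, direction: "IN" }
    );
  }
  return entries;
}

const label = (entry) => `${entry.vertex}:${entry.direction}`;

console.log(proposedIndexEntries({ _from: "v1", _to: "v2" }).map(label));
// [ 'v1:OUT', 'v2:IN' ]
console.log(
  proposedIndexEntries({ _from: "v1", _to: "v2", _bidirectional: true }).map(label)
);
// [ 'v1:OUT', 'v2:IN', 'v2:OUT', 'v1:IN' ]
```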

The default behavior of ArangoDB should be to use unidirectional edges,
so when not specifying the "_bidirectional" attribute, everything is
fully downwards compatible.


When returning edges from a collection or a query, an edge currently has
_from and _to attributes.
This should be extended with the _bidirectional attribute. This will
always have a boolean value (with false being the default if nothing was
specified on creation).

When a unidirectional edge is returned, we should return the OUT vertex
in the _from attribute, and the IN vertex in the _to attribute as we did
before. This again is downwards-compatible.

When a bidirectional edge is returned, we can return the two vertices it
connects in an array named _vertices. This is more appropriate than
_from and _to because there is no order or hierarchy between the two
connected vertices, but this was suggested by the names _from and _to.
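
How the two return shapes could be produced is sketched below in plain JavaScript. It assumes, purely for illustration, that both end points are held internally as `_from` and `_to`; `presentEdge` is a hypothetical helper.

```javascript
// Shape an internally stored edge for output, as proposed above.
function presentEdge(stored) {
  const { _from, _to, _bidirectional = false, ...payload } = stored;
  return _bidirectional
    ? { _bidirectional: true, _vertices: [_from, _to], ...payload }
    : { _bidirectional: false, _from, _to, ...payload };
}

console.log(presentEdge({ _from: "d", _to: "a", what: "d->a" }));
// { _bidirectional: false, _from: 'd', _to: 'a', what: 'd->a' }
console.log(
  presentEdge({ _from: "f", _to: "d", _bidirectional: true, what: "f<->d" })
);
// { _bidirectional: true, _vertices: [ 'f', 'd' ], what: 'f<->d' }
```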


Some example graph follows, with a vertex collection v and an edge
collection e:

// clean up
db._drop("v");
edges._drop("e");

// create collections
db._create("v");
edges._create("e");

// save vertices
a = db.v.save({ "name" : "a" });
b = db.v.save({ "name" : "b" });
c = db.v.save({ "name" : "c" });
d = db.v.save({ "name" : "d" });
e = db.v.save({ "name" : "e" });
f = db.v.save({ "name" : "f" });

// save edges
edges.e.save(a, b, { "what": "a<->b", "_bidirectional" : true });
edges.e.save(a, b, { "what": "a->a", "_bidirectional" : false }); // connects a -> b, despite the "a->a" label
edges.e.save(a, c, { "what": "a->c", "_bidirectional" : false });
edges.e.save(d, a, { "what": "d->a", "_bidirectional" : false });
edges.e.save(c, d, { "what": "c->d", "_bidirectional" : false });
edges.e.save(f, d, { "what": "f<->d", "_bidirectional" : true });
edges.e.save(f, e, { "what": "f->e", "_bidirectional" : false });
edges.e.save(e, e, { "what": "e->e", "_bidirectional" : false });

As you can see I created both unidirectional and bidirectional edges in
the example. When querying all the edges, we'd get the following result:

arangod> edges.e.all().toArray();
[
  {
    "_bidirectional" : false,
    "_from" : "7588030/10537150",
    "_to" : "7588030/10275006",
    "what" : "d->a"
  },
  {
    "_bidirectional" : false,
    "_from" : "7588030/10275006",
    "_to" : "7588030/10471614",
    "what" : "a->c"
  },
  {
    "_bidirectional" : false,
    "_from" : "7588030/10668222",
    "_to" : "7588030/10602686",
    "what" : "f->e"
  },
  {
    "_bidirectional" : true,
    "_vertices" : ["7588030/10668222", "7588030/10537150"],
    "what" : "f<->d"
  },
  {
    "_bidirectional" : false,
    "_from" : "7588030/10275006",
    "_to" : "7588030/10406078",
    "what" : "a->a"
  },
  {
    "_bidirectional" : false,
    "_from" : "7588030/10602686",
    "_to" : "7588030/10602686",
    "what" : "e->e"
  },
  {
    "_bidirectional" : false,
    "_from" : "7588030/10471614",
    "_to" : "7588030/10537150",
    "what" : "c->d"
  },
  {
    "_bidirectional" : true,
    "_vertices" : ["7588030/10275006", "7588030/10406078"],
    "what" : "a<->b"
  }
]

As you can see from the above, we return _from and _to for
unidirectional edges, and _vertices for bidirectional edges.


I hope this suggestion is more intuitive than the current solution in
ArangoDB 1.0 and 1.1.
I also think it can be more efficient because for directed edges there
will be fewer index entries. Apart from that, edge direction is now
explicit.

If everybody is happy with this suggestion, you can already start playing
around with it. The above output is from the devel branch.
The change should not be made in ArangoDB 1.0 or 1.1, as it will require
some changes to the data file organisation.
ArangoDB 1.2 will introduce some changes for data file organisation
anyway due to other features (transactions), so integrating the change
into ArangoDB version 1.2 seems best from my point of view.

What do you think?

Best regards
Jan

Jan Steemann

Oct 25, 2012, 4:10:22 PM
to aran...@googlegroups.com, Frank Mayer, Michael Woytowitz
A quick update on the subject:

Obviously we can always get away with just two index entries, one for
the IN vertex and one for the OUT vertex.
I've slightly modified the edge creation procedure to only produce two
index entries. After adjusting the querying part a bit, the queries now
produce the results they should.

That means from now on, exactly two index entries will be created for each
edge and there are no exceptions to this rule. Two entries are still
necessary to do quick lookups for both ends of an edge.

The change is contained in the devel branch, as are the following
changes I described a few hours ago.
I think there are no objections to reducing the index bloat, but I am
not sure about the other features (_bidirectional attribute, returning
_from/_to or _vertices depending on edge type) which I last described.
Feedback on these issues is welcome.

Best regards
Jan

Frank Mayer

Oct 26, 2012, 1:25:48 AM
to aran...@googlegroups.com, Frank Mayer, Michael Woytowitz, j.ste...@triagens.de
Hi Jan,

I read your last two posts, and what you are proposing looks good. It keeps backwards compatibility and extends functionality.

I am wondering if in that matter it would be somehow possible to include the optional functionality that I was proposing here: https://github.com/triAGENS/ArangoDB/issues/219
I am especially referring to the last three posts (as of today) of that thread, where I propose some internal collection which would hold a document's edges or some other way to internally store the document's edge connection ids as part of the document's metadata for easier lookup/querying. It's only an idea on how this could be solvable, however you know the internals of ArangoDB better than me :). 

Thanks
Frank

Jan Steemann

Oct 26, 2012, 2:56:48 AM
to aran...@googlegroups.com, Frank Mayer, Michael Woytowitz
Hi Frank,

I agree it would be good to have some info about which edges connect to
which vertices.
Having the connection info stored only in the edges is a problem if you
care about consistency: deleting a vertex does not also delete the
edges connected to it.

I am not sure what the best solution would be:
- having the connected edges also stored inside each vertex, so one can
do a lookup from the vertex to the edges, and from the edges to the
vertices, and use that to clean up when modifying vertices
- storing all vertex-edge connection data in a separate xref collection
and using that to clean up when modifying vertices
- storing just the collection ids/names for connections so ArangoDB
would know that for example edge collection E is connected to vertex
collections A and B, and then probing these collections for connections
to clean up when modifying vertices.
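
To illustrate the second option, here is a plain-JavaScript sketch of an xref structure that is used to remove dangling edges when a vertex is deleted. The `Graph` class and all of its members are hypothetical and in-memory only; a real implementation would also need the transactions mentioned below to keep the collections consistent.

```javascript
// Hypothetical in-memory graph with a cross-reference (xref) map that
// records, per vertex, which edges touch it.
class Graph {
  constructor() {
    this.vertices = new Map(); // vertex id -> document
    this.edges = new Map();    // edge id -> { _from, _to }
    this.xref = new Map();     // vertex id -> Set of edge ids
  }

  addVertex(id, doc = {}) {
    this.vertices.set(id, doc);
    this.xref.set(id, new Set());
  }

  addEdge(id, from, to) {
    if (!this.vertices.has(from) || !this.vertices.has(to)) {
      throw new Error("both end points must exist");
    }
    this.edges.set(id, { _from: from, _to: to });
    this.xref.get(from).add(id);
    this.xref.get(to).add(id);
  }

  // Deleting a vertex also deletes every edge connected to it.
  removeVertex(id) {
    for (const edgeId of this.xref.get(id) ?? []) {
      const { _from, _to } = this.edges.get(edgeId);
      this.edges.delete(edgeId);
      // Unregister the edge at the other end point as well.
      const other = _from === id ? _to : _from;
      if (other !== id) this.xref.get(other).delete(edgeId);
    }
    this.xref.delete(id);
    this.vertices.delete(id);
  }
}

const g = new Graph();
["a", "b", "c"].forEach((v) => g.addVertex(v));
g.addEdge("e1", "a", "b");
g.addEdge("e2", "b", "c");
g.addEdge("e3", "a", "c");
g.removeVertex("b");
console.log([...g.edges.keys()]); // [ 'e3' ]
```

The price is the extra bookkeeping on every edge insert, which is the storage and runtime overhead discussed next.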

As all of the above has storage and runtime overhead, it might be good
to make the consistency things optional. At some point the deletion
logic may also be handled by triggers. To be able to have a consistent
view of the data, transactions are also required because we need to
touch multiple collections (edge collection and vertex collection(s))
for each data-modifying operation.

From my point of view, fully solving this issue seems to be a bigger
task, as it depends on other features (transactions, triggers) and we
obviously need to discuss a lot more about the potential implementations.
Frank is out of the office for the next two weeks, and I suggest we wait
for his return before going on with the discussion, at least on our side.

Best regards
Jan

Frank Mayer

Oct 26, 2012, 3:36:05 AM
to aran...@googlegroups.com, Frank Mayer, Michael Woytowitz, j.ste...@triagens.de
Hi Jan,

Yes, those solutions all seem fine. I guess it all comes down to which is the most performant and would also give the most flexibility in functionality (PATHS could benefit a lot from xref, for example).

Ok, then let's wait for Frank to return, to continue this discussion.

Best regards,
Frank

Frank Mayer

Nov 18, 2012, 8:14:03 PM
to aran...@googlegroups.com, Frank Mayer, Michael Woytowitz, j.ste...@triagens.de
Forget my last post, I deleted it. I found out that setting _bidirectional makes a vertices list instead of _from/_to. Looks good !! 
