Philosophy behind not supporting SET and LIST multiplicity values for edge property

Kushal Agrawal

Nov 18, 2019, 8:09:38 AM
to Gremlin-users
Hi,
The title says it all: I was curious why edge properties can't be sets or lists.

Stephen Mallette

Nov 18, 2019, 8:21:38 AM
to gremli...@googlegroups.com
That asymmetry in the API was the subject of intense discussion in the early days of TinkerPop 3.x. I'm sorry that I can't recall the details well enough to give the highlights, but I feel like the general consensus was that, given the use cases we had envisioned and the schema design best practices we'd espoused, multiproperties would most typically be defined on vertices and not so much on edges. From my current perspective, I don't think we should have ever made multi- or metaproperties first-class citizens in TinkerPop at all. They should have simply been features of the underlying graph system, and Gremlin should have just been flexible enough to query them nicely however the graph system implemented them.

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/d10f05e9-cb20-419e-8c9b-c598b012f1f8%40googlegroups.com.

Kushal Agrawal

Nov 19, 2019, 1:34:02 AM
to Gremlin-users
Thanks for the answer, Stephen. Since it's not a settled issue, is there an ongoing discussion around the matter? Is it possible that multi-properties will stop being first-class citizens on vertices as well?
As a side note, I wanted to ask what the point of cardinality is, given that the value of a property can be any arbitrary Java object? Is it to support indexing on the members of a set or list?


Kushal Agrawal

Nov 19, 2019, 1:43:04 AM
to Gremlin-users
P.S. Apologies for any confusion stemming from the error in the title, I meant to say cardinality, not multiplicity.

Stephen Mallette

Nov 19, 2019, 5:33:37 AM
to gremli...@googlegroups.com
> Since it's not a settled issue, is there an ongoing discussion around the matter? Is it possible that multi-properties will stop being first class citizens in vertices as well?

I mean... I think the way multi/metaproperties currently work is a settled issue for TinkerPop 3.x. I don't think 3.x could take the structural upheaval involved in changing this (one way or the other) unless someone came up with something really clever or the demand were massively overwhelming. So I'd say the feature is safe for 3.x. However, support among graph systems is spotty, and you reduce the portability of your code by using them.

> As a side note, I wanted to ask what the point of cardinality is, given that the value of a property can be any arbitrary Java object? Is it to support indexing on the members of a set or list?

The feature has its roots in Titan (JanusGraph), with the idea that Gremlin would know that properties could contain lists or metadata, so they could be traversed seamlessly and with graph provider optimizations (indices). In TinkerPop 2.x we didn't have the flexibility in the language to express such things well, so we felt we needed to make these features structurally explicit, which then made it pretty easy for Gremlin to use that structure. While a property can take an arbitrary object, there needs to be a way for the graph to know what that object represents:

g.V()......property('color',['red'])

Does that mean to append "red" to a list as a multiproperty, to append a List with the string "red" in it as a multiproperty, or just to overwrite the "color" property with a List? In TinkerPop 3.x the Cardinality argument makes that choice clear. I'm not sure how such things would be delegated to the underlying graph without Cardinality... perhaps a schema would help make that decision clear. That will need more thought.
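
The three readings of that call can be made concrete with a small model. This is only a sketch in plain Python, not the TinkerPop API: the `set_property` helper and the dict-of-lists vertex are assumptions made up for illustration. It follows the usual meaning of the three cardinalities (single overwrites, list always appends, set appends only values not already present).

```python
# Toy model only: plain Python, not the TinkerPop API.
# A vertex is modelled as a dict mapping a property key to a list of values.

def set_property(vertex, cardinality, key, value):
    """Store `value` under `key` according to the given cardinality."""
    values = vertex.setdefault(key, [])
    if cardinality == "single":
        # single: overwrite whatever was stored before
        vertex[key] = [value]
    elif cardinality == "list":
        # list: always append, duplicates allowed
        values.append(value)
    elif cardinality == "set":
        # set: append only if an equal value is not already present
        if value not in values:
            values.append(value)
    else:
        raise ValueError("unknown cardinality: " + cardinality)
    return vertex


def fresh():
    """A hypothetical vertex that already has color = 'blue'."""
    return {"color": ["blue"]}


# "overwrite the color property with a List"
overwritten = set_property(fresh(), "single", "color", ["red"])
# "append a List with the string 'red' in it as a multiproperty"
appended_list = set_property(fresh(), "list", "color", ["red"])
# "append 'red' to a list as a multiproperty"
appended_item = set_property(fresh(), "list", "color", "red")

print(overwritten)    # {'color': [['red']]}
print(appended_list)  # {'color': ['blue', ['red']]}
print(appended_item)  # {'color': ['blue', 'red']}
```

The same value argument produces three different stored states, which is the ambiguity that an explicit cardinality (or possibly a schema) has to resolve.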

Joshua Shinavier

Nov 19, 2019, 1:54:24 PM
to gremli...@googlegroups.com
FWIW I think we should probably extend support for set-valued properties into TP4, and if so, you should be able to have set-valued properties on both edges and vertices, depending on whether the underlying graph supports them. Formally, it is probably best to think of sets as lists for which we just don't care about the order. Setting a set-valued property means replacing the set value with another set value.

Note: I do not see set-valued properties and multi-properties as the same beast. A set-valued property, or a list- or map-valued property, is a property whose value is a collection. Supporting collection-valued properties does not require relaxing the constraint that the keys of an element's properties are distinct. Multi-properties do require relaxing that constraint (and I think in general this feature should be available, though many implementations may choose not to support it). The distinction becomes important if meta-properties are also supported, as the elements of a set of property values cannot be annotated with individual meta-properties, whereas a set of multi-properties can.
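
A rough way to picture that distinction is the following plain-Python sketch. It is not TinkerPop code: the `Property` class, the `name` key, and the sample values and meta-properties are all hypothetical, chosen only to show where the key-uniqueness constraint holds and where a meta-property can attach.

```python
# Toy model only: plain Python, not the TinkerPop API.
from dataclasses import dataclass, field


@dataclass
class Property:
    key: str
    value: object
    meta: dict = field(default_factory=dict)  # meta-properties annotate this property


# (a) Collection-valued property: one property per key (a dict enforces that),
#     and the value happens to be a set. Meta-properties can only describe
#     the property as a whole, not the individual members of the set.
collection_valued = {
    "name": Property("name", frozenset({"marko", "marko a."}), {"source": "import"}),
}

# (b) Multi-property: the same key occurs more than once, so each value is
#     its own property and can carry its own meta-properties.
multi_valued = [
    Property("name", "marko", {"since": 2010}),
    Property("name", "marko a.", {"since": 2015}),
]

keys_a = [p.key for p in collection_valued.values()]
keys_b = [p.key for p in multi_valued]

print(len(keys_a) == len(set(keys_a)))  # True: keys are still distinct
print(len(keys_b) == len(set(keys_b)))  # False: the distinct-key constraint is relaxed
print({p.value: p.meta for p in multi_valued})
```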

Josh


Stephen Mallette

Nov 19, 2019, 2:18:40 PM
to gremli...@googlegroups.com
> Setting a set-valued property means replacing the set value with another set value... A set-valued property, or a list- or map-valued property, is a property whose value is a collection.

Yes - agree with that thinking. No reason we can't allow such things and even make them easier to work with.

>  Multi-properties do require relaxing that constraint (and I think in general this feature should be available, though many implementations may choose not to support it). The distinction becomes important if meta-properties are also supported, as the elements of a set of property values cannot be annotated with individual meta-properties, whereas a set of multi-properties can.

right - if we keep multi/metaproperties for TP4 it will be curious to see how we go about implementing them. i'm not a big fan of them, however, because i really haven't seen the promise of meta/multiproperties pay off in production systems. they tend to muddy schema design choices for users (who are already overwhelmed by graph design choices) and even when used toward the use cases we've espoused, the solutions don't often scale well. perhaps there are folks out there who have had some success with them and will share their stories, but the general feedback i have heard over the years is that they are nice in theory but not terribly great in practice.

Kushal Agrawal

Nov 27, 2019, 8:52:13 AM
to Gremlin-users
I agree, having set-valued properties on edges, for example, would have provided a huge benefit to my use-case.
A toy version of my problem is as follows:

Vertex A is connected to vertices B and C.
The edges connecting A to B and A to C have a set-valued property called numbers.
I have an integer x provided to me at the start of my traversal.
I want to go from vertex A to B or C (or both), depending on whether x is present in that edge's numbers set.

As of now there is no way to implement this without making multiple edges between the vertex pairs A->B and A->C.
The number of these edges is as large as the size of the numbers sets, which can be quite large in my use-case.

I could have put the numbers sets in B and C themselves, but that would have prevented me from utilizing the vertex centric indices of JanusGraph (which is what I'm using).
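
The trade-off in this toy problem can be sketched in plain Python. This is a model only, not Gremlin or JanusGraph code: the edge dicts, the sample numbers, and both helper functions are hypothetical. It compares a set-valued `numbers` property on each edge with the workaround of one edge per number.

```python
# Toy model only: plain Python, not Gremlin or JanusGraph.

# Desired model: one edge per vertex pair, with a set-valued "numbers" property.
set_valued_edges = [
    {"out": "A", "in": "B", "numbers": {1, 2, 3}},
    {"out": "A", "in": "C", "numbers": {3, 4}},
]


def neighbours_with_set(edges, start, x):
    """Follow edges from `start` whose numbers set contains x."""
    return sorted(e["in"] for e in edges if e["out"] == start and x in e["numbers"])


# Workaround: one edge per (vertex pair, number), each with a single-valued property.
exploded_edges = [
    {"out": e["out"], "in": e["in"], "number": n}
    for e in set_valued_edges
    for n in sorted(e["numbers"])
]


def neighbours_exploded(edges, start, x):
    """Follow edges from `start` whose single-valued number equals x."""
    return sorted({e["in"] for e in edges if e["out"] == start and e["number"] == x})


print(neighbours_with_set(set_valued_edges, "A", 3))  # ['B', 'C']
print(neighbours_exploded(exploded_edges, "A", 3))    # ['B', 'C']
print(len(set_valued_edges), len(exploded_edges))     # 2 5
```

Both models answer the query identically, but the edge count of the workaround grows with the total size of the numbers sets.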

Alexandr

Sep 10, 2020, 7:25:37 AM
to Gremlin-users
In our case, SET or LIST edge properties in JanusGraph would be beneficial.
The use case we would really love to support is the following:
JanusGraph allows building indices on LIST or SET properties, so a specific vertex can be looked up by any one of a property's values.

For example:
g.addV().property(VertexProperty.Cardinality.set, "type", "foo")
        .property(VertexProperty.Cardinality.set, "type", "bar")
        .property(VertexProperty.Cardinality.set, "type", "foobar").iterate();
g.V().has("type", "foo").has("type", "bar").next();

Sometimes edges also have properties for which Cardinality SET or LIST would be very beneficial. In our use case, edges may have several custom identifiers, and our business logic requires looking up edges by any one of them.
Thus, it would be very convenient if we could use the above logic for edges. I.e.:
g.E().has("customIdentifiers", "foo").has("customIdentifiers", "bar").next();

Are there plans to add cardinalities to edge properties? Would the community be open to such a contribution?
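
What an index over a set-valued edge property buys can be illustrated with a generic inverted index. This is a plain-Python sketch, not a description of how JanusGraph implements its indices: the edge ids, the sample identifiers, and the `build_index` and `lookup` helpers are all hypothetical.

```python
# Toy model only: a generic inverted index in plain Python,
# not JanusGraph's actual index implementation.
from collections import defaultdict

# Hypothetical edges, each with a set-valued "customIdentifiers" property.
edges = {
    "e1": {"customIdentifiers": {"foo", "bar"}},
    "e2": {"customIdentifiers": {"foo"}},
    "e3": {"customIdentifiers": {"baz"}},
}


def build_index(edges, key):
    """Map every individual value of `key` to the ids of the edges carrying it."""
    index = defaultdict(set)
    for edge_id, props in edges.items():
        for value in props.get(key, ()):
            index[value].add(edge_id)
    return index


def lookup(index, *values):
    """Return ids of edges whose property contains every requested value."""
    hits = [index.get(v, set()) for v in values]
    return sorted(set.intersection(*hits)) if hits else []


index = build_index(edges, "customIdentifiers")
print(lookup(index, "foo"))         # ['e1', 'e2']
print(lookup(index, "foo", "bar"))  # ['e1']
```

The second lookup corresponds to chaining two `has` steps on the same key: each value is resolved through the index and the results are intersected, without scanning every edge.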

Stephen Mallette

Sep 10, 2020, 8:17:09 AM
to gremli...@googlegroups.com
>  Are there plans to add cardinalities to edge properties? Would the community be open for such contribution? 

I remember the discussion about why we didn't do meta-properties on edges, but I don't recall one for multi-properties. Anyone remember why we chose that direction? 

Personally, I tend to feel like we shouldn't change the graph structure APIs of TinkerPop 3 at this point as such changes seem major enough that they belong in TinkerPop 4. Without softening my position on that too much, I'd just say that cardinality on edge properties might not be as complex and breaking a change as trying to do meta-properties on edges. 

> g.E().has("customIdentifiers", "foo").has("customIdentifiers", "bar").next();

Interesting that you start traversals by edge. Could you talk more about the modelling decisions that led you to that sort of schema and query pattern?

Alexandr

Sep 11, 2020, 4:25:00 PM
to Gremlin-users
In our case we would really benefit from SET or LIST edge properties. I see your concern about breaking changes. I also think that edge meta-properties might be a major breaking change and could be harder to implement.
I haven't yet checked how much cardinality on edge properties would break. I don't think it would break too much, but I might be missing something.
I see your point that you believe it is better aligned with TinkerPop 4. Is there some ETA for TinkerPop 4?


> Interesting that you start traversals by edge. Could you talk more about the modelling decisions that led you to that sort of schema and query pattern?

Sure. All our elements (vertices and edges) have a "customIdentifiers" property, which is of cardinality SET. In our case a single element may have multiple identifiers. We are using JanusGraph as our main database, which supports indexes on vertex and edge properties.
Sometimes we need to return an edge by any value of its customIdentifiers. If the cardinality is SINGLE we can use something like this:

g.E().has("customIdentifiers", "foo").next();

which will query the index under the hood instead of traversing all edges of all vertices. But as I said above we would really benefit if we could store multiple identifiers in the edge property and use an index on that property.

Stephen Mallette

Sep 14, 2020, 8:12:56 AM
to gremli...@googlegroups.com
>  Is there some ETA for TinkerPop 4?

no - there are no plans on that yet, only early discussion.
