How can I find common relations between 200 vertex ids?

41 views
Skip to first unread message

Sara

unread,
Mar 20, 2018, 5:01:04 PM3/20/18
to Gremlin-users
How can I find common relations(paths) between 200 vertexes ids?

Kelvin Lawrence

unread,
Mar 20, 2018, 6:11:12 PM3/20/18
to Gremlin-users
Hi Sara, could you say a bit more about what you are trying to do and maybe provide some sample data or sample query? It's not clear from your question what you are trying to do.

Cheers
Kelvin

Daniel Kuppitz

unread,
Mar 20, 2018, 10:57:17 PM3/20/18
to gremli...@googlegroups.com
Kevin is right, it's not clear what you're trying to do, but here is my best bet:

Given a set of ids, find all paths between them

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> ids = [1,3,5]
==>1
==>3
==>5
gremlin> g.V(ids).repeat(both().simplePath()).emit(hasId(within(ids))).path().by(id)
==>[1,3]
==>[1,4,5]
==>[1,4,3]
==>[1,3,4,5]
==>[3,1]
==>[3,4,5]
==>[3,4,1]
==>[3,1,4,5]
==>[5,4,3]
==>[5,4,1]
==>[5,4,3,1]
==>[5,4,1,3]

...or find all paths that are not touching any other vertex (outside the specified set of ids):

gremlin> ids = [1,2,3,4]
==>1
==>2
==>3
==>4
gremlin> g.V(ids).repeat(both().hasId(within(ids)).simplePath()).emit().path().by(id)
==>[1,3]
==>[1,2]
==>[1,4]
==>[1,3,4]
==>[1,4,3]
==>[2,1]
==>[2,1,3]
==>[2,1,4]
==>[2,1,3,4]
==>[2,1,4,3]
==>[3,1]
==>[3,4]
==>[3,1,2]
==>[3,1,4]
==>[3,4,1]
==>[3,4,1,2]
==>[4,3]
==>[4,1]
==>[4,3,1]
==>[4,1,3]
==>[4,1,2]
==>[4,3,1,2]

Cheers,
Daniel



On Tue, Mar 20, 2018 at 2:01 PM, Sara <einav...@gmail.com> wrote:
How can I find common relations(paths) between 200 vertexes ids?

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/166fb490-acc2-47a7-a8c3-95319d29a12a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sara

unread,
Mar 21, 2018, 12:16:54 PM3/21/18
to Gremlin-users
Thanks. Does the same queries works for 200 ids without performance issues?.Im using latest dse 5.1

Daniel Kuppitz

unread,
Mar 21, 2018, 1:36:28 PM3/21/18
to gremli...@googlegroups.com
Path computations are generally pretty heavy operations as they require a lot of memory. The id filter shouldn't cause any issues though and 200 still seems like a reasonable number to me. Anyway, the performance will pretty much depend on your graph structure (edge degrees, length and number of paths).

As an aside: I meant Kelvin, not Kevin (sorry Kelvin for over-optimizing your name :))

Cheers
Daniel


On Wed, Mar 21, 2018 at 9:16 AM, Sara <einav...@gmail.com> wrote:
Thanks. Does the same queries works for 200 ids without performance issues?.Im using latest dse 5.1
--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.

Sara

unread,
Mar 21, 2018, 3:19:31 PM3/21/18
to Gremlin-users
What is the max recommended number of edges for a vertex in order to avoid performance issues using find paths queries?

Daniel Kuppitz

unread,
Mar 21, 2018, 3:41:23 PM3/21/18
to gremli...@googlegroups.com
It depends, a 2-hop path with 100 edges per vertex is as bad as a 4-hop path with only 10 edges per vertex. Furthermore, it depends on what you think is an issue - can you tolerate a 1-second query, 10 seconds?
I think it's best if you just try it out on a small sample graph.

Cheers,
Daniel


On Wed, Mar 21, 2018 at 12:19 PM, Sara <einav...@gmail.com> wrote:
What is the max recommended  number of edges for a vertex in order to avoid performance issues using find paths queries?
--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.

Sara

unread,
Mar 21, 2018, 4:20:52 PM3/21/18
to Gremlin-users
I already have a graph.the number of edges for a vertex is between 10 to 10000.and many of my vertexes has 5000 edges for each.the timeLimit I put in my query

Sara

unread,
Mar 21, 2018, 4:24:40 PM3/21/18
to Gremlin-users
The timeLimit in my find paths queries is 120000 millseconds and I do "paging" .I get no more than 50 paths for a "page".

Daniel Kuppitz

unread,
Mar 21, 2018, 4:31:26 PM3/21/18
to gremli...@googlegroups.com
Well, then you'll have to deal with combinatorial explosions. 3 hops alone will lead to approximately 125,000,000,000 (5k*5k*5k) paths. It takes time to traverse that many paths and if no valid path is found in the given time, well, then that's that - there's no magic that could make that faster. The best option, in this case, is usually a pre-computation of all paths in nightly OLAP jobs and store them somewhere else. But even that will take some time and a ton of memory.

I don't anything about the contents of your graph, but maybe you can rethink the model..?

Cheers,
Daniel

On Wed, Mar 21, 2018 at 1:20 PM, Sara <einav...@gmail.com> wrote:
I already have a graph.the  number of edges for a vertex is between 10 to 10000.and many of my vertexes has 5000 edges for each.the timeLimit I put in my query
--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.

Sara

unread,
Mar 21, 2018, 5:49:14 PM3/21/18
to Gremlin-users
Managing Logistical Relationships in a graphdb.

I already have read many articles on graphdb models but I still do not understand how to reduce the number of edges in some cases.

Sara

unread,
Mar 21, 2018, 5:57:54 PM3/21/18
to Gremlin-users
For example:
In the following article:

https://neo4j.com/blog/graph-database-specialized-etl/

There is a solution for the super node problem by adding "meta data" nodes.I do not understand what is the partition key of the meta data node in the example in that article.(person born in date)

Reply all
Reply to author
Forward
0 new messages