The query that never comes
From: Leward <pj82...@gmail.com>
Date: Wed, 11 Apr 2012 02:39:18 -0700 (PDT)
Local: Wed, Apr 11 2012 5:39 am
Subject: [Cypher] The query that never comes

Hi,

If you haven't noticed it: the title of this thread is dedicated to
Metallica (*singing in my head*). But I'm not here to talk about music :)

I'm trying to write a Cypher query to compare the ways people are linked
together.

Here is the query:

start
    n = node:names(email='test@test.com')
match
    p = n-[r:connect*1..5]->m,
    n-[r2?:connect]->m
where
    m <> n
return distinct
    m, min(length(p))
order by min(length(p))
skip 0 limit 5

So I open the console in the webadmin, copy and paste the query, then... I
wait while my computer is getting hotter and louder. After 10 minutes I end
up killing the process.

I think the issue comes from the line containing the question mark "?".
Indeed, if I execute the following query without it, it works (except that
the results do not fit my needs):
start
    n = node:names(email='t...@test.com')
match
    p = n-[r:connect*1..5]->m,
    n-[r2:connect]->m
where
    m <> n
return distinct
    m, min(length(p))
order by min(length(p))
skip 0 limit 5

So I wonder whether this is something that can be fixed on the Neo4j side,
or whether my query is simply badly designed.

Thanks in advance,
Leward.


 
Discussion subject changed to "[Cypher] The query that never comes" by Michael Hunger
From: Michael Hunger <michael.hun...@neotechnology.com>
Date: Wed, 11 Apr 2012 11:56:03 +0200
Local: Wed, Apr 11 2012 5:56 am
Subject: Re: [Neo4j] [Cypher] The query that never comes

#1 What version of Neo4j are you running?
#2 What does your dataset look like?
#3 Did you try returning just count(*) to see how many nodes are iterated through (for aggregation and ordering)?
#4 Could we get the dataset (off-line) to check the performance issue?

Thanks a lot

Michael

On 11.04.2012 at 11:39, Leward wrote:


 
From: Andres Taylor <andres.tay...@neotechnology.com>
Date: Wed, 11 Apr 2012 12:01:19 +0200
Local: Wed, Apr 11 2012 6:01 am
Subject: Re: [Neo4j] [Cypher] The query that never comes

Let's break this query down a bit. You want Neo4j to find all the paths
between one node and every other node in your graph that is connected by
up to 5 steps. The same node might appear multiple times, because you are
asking for all the paths to all the nodes up to 5 steps away. How connected
are your nodes? Take that number and raise it to the power of 5. If your
nodes each have ~10 connections, that is on the order of 100,000 paths, so
we're quickly building up a lot of stuff to look at.
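
As a rough back-of-the-envelope sketch of that growth, here is the
arithmetic in Python. It assumes, purely for illustration, that every node
has exactly the same number of outgoing connect relationships and that no
paths overlap; the function name count_paths is made up for this example.

```python
def count_paths(branching: int, max_depth: int) -> int:
    """Count paths of length 1..max_depth leaving a single start node.

    Assumes an idealised graph in which every node has exactly
    `branching` outgoing relationships (real graphs are less regular).
    """
    return sum(branching ** depth for depth in range(1, max_depth + 1))


# Show how quickly the number of candidate paths grows with depth
# when each node has 10 connections.
for depth in range(1, 6):
    print(depth, count_paths(10, depth))
```

Even under these simplified assumptions, each extra step multiplies the
amount of work by roughly the number of connections per node.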

Next, you use the aggregate function min(), which forces your query to be
eager, and not lazy. So all the stuff has to be kept in the heap until the
query is finished.
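
To illustrate the difference between lazy and eager evaluation, here is a
small Python analogy. It is not how Cypher is implemented; the row
generator, its made-up data, and the function names are all assumptions
chosen only to show how many rows each style has to consume.

```python
from itertools import islice


def rows(total, consumed):
    """Lazily yield (node, path_length) rows.

    `consumed` is a list that records every row actually pulled,
    so the caller can see how much work was done.
    """
    for i in range(total):
        consumed.append(i)
        yield (i % 7, 1 + i % 5)  # made-up node ids and path lengths


def first_five_lazy(source):
    """A plain limit can stop after pulling five rows."""
    return list(islice(source, 5))


def min_length_per_node(source):
    """An aggregate like min() must see every row before answering."""
    best = {}
    for node, length in source:
        if node not in best or length < best[node]:
            best[node] = length
    return best


lazy_seen, eager_seen = [], []
first_five_lazy(rows(1000, lazy_seen))
min_length_per_node(rows(1000, eager_seen))
print(len(lazy_seen), len(eager_seen))
```

The lazy version touches only the rows it returns, while the aggregating
version has to walk the whole input and keep one entry per node in memory.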

Cypher is doing what you asked it to do, and you asked for something that
takes a whole lot of work. No surprise there, right?

> I think the issue comes from the 4th line with the question mark "?".
> Indeed, if I try to execute the following query it works. (Except that the
> results does not fit my need) :
> start
>     n = node:names(email='t...@test.com')
> match
>     p = n-[r:connect*1..5]->m,
>     n-[r2:connect]->m
> where
>     m <> n return distinct m,
> min(length(p))
> order by min(length(p))
> skip 0 limit 5

What's happening here is that, since you removed the question mark, Cypher
finds all nodes that are between 1 and 5 relationships away and throws out
any that aren't directly connected to n. That leaves much less stuff to
keep in the heap, and so much less to aggregate on.

> So I wonder if it is something which can be fixed on the Neo4j side, or my
> query is simply bad designed ?

We're working on Cypher performance, but this is a heavy query, and it will
probably never be very fast. Do you really need to look five steps out?
There's a reason Facebook, LinkedIn and the like don't look so many
steps out...

Andrés


 
From: Leward <pj82...@gmail.com>
Date: Sat, 14 Apr 2012 07:06:33 -0700 (PDT)
Local: Sat, Apr 14 2012 10:06 am
Subject: Re: [Neo4j] [Cypher] The query that never comes

Hello,

I worked on this issue today and finally found a pretty nice workaround.
At the moment my graph DB is not very big: a few hundred nodes and
relationships. So a simple query with p = n-[r:connect*1..5]->m is
fast.

The fact is that my query was not very well designed. Here is the
rewritten query, which works pretty well:

start
    n = node:names(email='t...@test.com')
match
    p = n-[r:connect]->()-[r2?:connect*0..4]->m
where
    m <> n
return distinct
    m, min(length(p))
order by min(length(p))
skip 0 limit 5
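
For comparison outside Cypher, here is a minimal Python sketch of the
result the query is after: the five nearest other people and their minimum
distance. It uses breadth-first search over a hypothetical in-memory
adjacency dict. The graph layout and the function name nearest are
assumptions for illustration, and this is not what Neo4j does internally.

```python
from collections import deque


def nearest(graph, start, max_depth=5, limit=5):
    """Return up to `limit` (node, distance) pairs closest to `start`.

    `graph` maps each node to an iterable of nodes it points to.
    Breadth-first search visits each node once, at its minimum distance,
    instead of enumerating every path.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    # Drop the start node itself, mirroring "m <> n" in the query.
    others = [(n, d) for n, d in dist.items() if n != start]
    others.sort(key=lambda item: item[1])
    return others[:limit]


# Tiny made-up graph: a connects to b and c, both connect to d, and so on.
example = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": ["a"]}
print(nearest(example, "a"))
```

Because each node is recorded only the first time it is reached, the work
grows with the number of nodes and relationships within range rather than
with the number of distinct paths.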

Thanks for your answers; they were very instructive, by the way.

Regards, Leward.

On Wednesday, 11 April 2012 12:01:19 UTC+2, Andres Taylor wrote:


 
From: Peter Neubauer <neubauer.pe...@gmail.com>
Date: Sat, 14 Apr 2012 16:37:09 +0200
Local: Sat, Apr 14 2012 10:37 am
Subject: Re: [Neo4j] [Cypher] The query that never comes

That's cool. Could you write a short blog post and set this up with a
console.neo4.org link? It would be nice to see it visually :-)
On Apr 14, 2012 4:06 PM, "Leward" <pj82...@gmail.com> wrote:


 