Question about using multiple WITH statements in Cypher queries

1,799 views
Skip to first unread message

Chris Boyd

unread,
Aug 24, 2012, 6:44:13 PM8/24/12
to ne...@googlegroups.com
Trying to calculate the number of people of age 21 and age 22.

The following works:

START n = node(*)
WITH n, COUNT( n.age? ) AS age21  WHERE ID(n)<>0 AND n.age = 21
WITH n, COUNT( n.age? ) AS age22  WHERE ID(n)<>0 AND n.age = 22
RETURN age22

If I try to also show 'age21'  in RETURN there is a problem, because 'age21' is not visible in the context of RETURN, i.e. the following does *not* work:

RETURN age21, age22

Is there a way to pass 'age21' to RETURN?

Also, is there a limitation to the number of sequential WITH statements that can be used in the same Cypher query?

Thank you,
Chris

Peter Neubauer

unread,
Aug 25, 2012, 8:29:49 AM8/25/12
to ne...@googlegroups.com

Have you tried passing over age21 in the second WITH just as the 'n' so it is available in the return?

/peter

Send from mobile device.

--
 
 

Michael Hunger

unread,
Aug 25, 2012, 8:44:04 AM8/25/12
to ne...@googlegroups.com
There is no limit of with statements.

What do you want to do?

It does not make sense to aggregate over a value and then access the individual node again?

start n=node(*)
where id(n) <> 0
with n.age? as age, count(*) as cnt
where age = 21 or age = 22 
return cnt

if you want to pass information from a previous query part you have to mention it again in the next with.

MIchael

--
 
 

Andres Taylor

unread,
Aug 25, 2012, 1:01:08 PM8/25/12
to ne...@googlegroups.com
On Sat, Aug 25, 2012 at 2:44 PM, Michael Hunger <michael...@neotechnology.com> wrote:
There is no limit of with statements.

What do you want to do?

It does not make sense to aggregate over a value and then access the individual node again?

start n=node(*)
where id(n) <> 0
with n.age? as age, count(*) as cnt
where age = 21 or age = 22 
return cnt

Almost. This is my take on it: 

start n=node(*)
where id(n) <> 0 and (age = 21 or age = 22)
return age, count(*)

Who wins?

Andrés

Michael Hunger

unread,
Aug 25, 2012, 4:53:41 PM8/25/12
to ne...@googlegroups.com
The question is still open: what did chris want to achieve?

Sent from mobile device
--
 
 

Chris Boyd

unread,
Aug 25, 2012, 8:31:42 PM8/25/12
to ne...@googlegroups.com
>...The question is still open: what did chris want to achieve?...

Sorry, I wasn't clear on that...

In the graph there are X nodes with node.age = 21 and Y nodes with node.age = 22.
I want to issue a *single* cypher query that would return X and Y as *separate* numbers, so that I could show a message like "There are 123 users of age 21 and 456 users of age 22".

I was considering Peter's idea to pass age21 to the second WITH so that it could make it to RETURN, but I do not know how to do that. I will experiment with that because Peter encouraged me that there is a way to do it. Michael and Andres assume that I wanted to get the combined number of age21 + age 22, while I want each of these as a separate number. I apologize, I wasn't clear on that.

Cypher is a powerful concept!

Thank you for your help,
Chris

Chris Boyd

unread,
Aug 25, 2012, 9:23:30 PM8/25/12
to ne...@googlegroups.com
I tried to use http://console.neo4j.org/ to come up with a clear explanation of what I need to accomplish.

If the graph is:

Graph Setup:

start root=node(0)
create (Neo {name:'Neo'}), (Morpheus {name: 'Morpheus'}), (Trinity {name: 'Trinity'}),
 (Cypher {name: 'Cypher'}), (Smith {name: 'Agent Smith'}), (Architect {name:'The Architect'}),
 root-[:ROOT]->Neo, Neo-[:KNOWS]->Morpheus, Neo-[:LOVES]->Trinity, Morpheus-[:KNOWS]->Trinity,
 Morpheus-[:KNOWS]->Cypher, Cypher-[:KNOWS]->Smith, Smith-[:CODED_BY]->Architect
--------------------------------------------------------------------------------

I want to issue a *single* cypher query to get the number of nodes with names that start with 'T' and the number of nodes that start with 'N'. As a result I should get 2 ( "Trinity" and "The Architect" ) and 1 ("Neo").

I tried the following:

START n = node(*) WITH n, COLLECT(n.name) AS T WHERE n.name =~ /(?i)T.*/  WITH T, n, COLLECT(n.name) AS N WHERE n.name =~ /(?i)N.*/  RETURN COUNT(T), COUNT(N)

I hoped to get something like this:

{u'data': [[2, 1]], u'columns': [u'COUNT( CA )', u'COUNT( C )']}

, but I cannot make that work -:(

Andres Taylor

unread,
Aug 26, 2012, 1:18:50 AM8/26/12
to ne...@googlegroups.com
On Sun, Aug 26, 2012 at 2:31 AM, Chris Boyd <chris...@gmail.com> wrote:
 Michael and Andres assume that I wanted to get the combined number of age21 + age 22, while I want each of these as a separate number. I apologize, I wasn't clear on that.

My query suggestion does not assume you want the combined number. It will give you two rows back, one with 21 and the number of 21 year olds, and one row with 22 and the number for the 22 year olds. Did you try it?

start n=node(*)
where id(n) <> 0 and (age = 21 or age = 22)
return age, count(*)
Andrés

Chris Boyd

unread,
Aug 26, 2012, 12:48:14 PM8/26/12
to ne...@googlegroups.com
I did. Does not recognize 'age'. I guess it should be 'n.age'.

On  http://console.neo4j.org/ I tried the following:

start n=node(*) where id(n) <> 0 and ( n.name = "Trinity" or n.name = "Neo" ) return n.name, count(*)

Got:

+----------------------+
| n.name    | count(*) |
+----------------------+
| "Neo"     | 1        |
| "Trinity" | 1        |
+----------------------+

Yes, that is the expected result, thank you!

But then I wanted to get the number of node names starting with 'T' and with 'N' The following query failed with "Error: illegal value" (at 'n.name' of n.name =~ /(?i)N.*/):

start n=node(*) where id(n) <> 0 and ( n.name =~ /(?i)T.*/ or n.name =~ /(?i)N.*/) return n.name, count(*)

What am I doing wrong this time?

Chris

Peter Neubauer

unread,
Aug 26, 2012, 5:27:03 PM8/26/12
to ne...@googlegroups.com
Chris,
the console is on SNAPSHOT mode, and has now the normal strings for Regexp,
like

start n=node(*) where id(n) <> 0 and ( n.name =~ '(?i)T.*' or n.name
=~ '(?i)N.*') return n.name, count(*)

see http://docs.neo4j.org/chunked/snapshot/query-where.html#where-case-insensitive-regular-expressions

Cheers,

/peter neubauer

G: neubauer.peter
S: peter.neubauer
P: +46 704 106975
L: http://www.linkedin.com/in/neubauer
T: @peterneubauer

Wanna learn something new? Come to @graphconnect.
> --
>
>

Chris Boyd

unread,
Aug 26, 2012, 7:36:33 PM8/26/12
to ne...@googlegroups.com
That worked, thank you Peter!

Now I got the following:

+----------------------------+
| n.name          | count(*) |
+----------------------------+
| "Neo"           | 1        |
| "The Architect" | 1        |
| "Trinity"       | 1        |
+----------------------------+

Getting closer, bur still do not know how to get the number of nodes with a name starting with 'T and with 'N'. I could use the returned data shown above and do processing on the recipient side, but the returned data could be quite long.
I still hope that there is a way to get a result like this:

+----------------------------+ | n.name | count(*) | +----------------------------+ | '(?i)N.*' | 1 | | '(?i)T.*' | 2 | +----------------------------+

Chris

Peter Neubauer

unread,
Aug 27, 2012, 2:27:59 AM8/27/12
to ne...@googlegroups.com
Mmh,
what about this? http://console.neo4j.org/r/g14j3a

start n=node(*) where id(n) <> 0 and n.name =~ '(?i)T.*' with
collect(n) as Ts start n = node(*) where id(n) <> 0 and n.name =~
'(?i)N.*' return Ts,collect(n) as Ns

start n=node(*)
where id(n) <> 0 and n.name =~ '(?i)T.*'
with collect(n) as Ts
start n = node(*)
where id(n) <> 0 and n.name =~ '(?i)N.*'
return Ts,collect(n) as Ns
+---------------------------------------------------------------------------------+
| Ts | Ns
|
+---------------------------------------------------------------------------------+
| [Node[3]{name:"Trinity"},Node[6]{name:"The Architect"}] |
[Node[1]{name:"Neo"}] |
+---------------------------------------------------------------------------------+
1 row
0 ms

Cheers,

/peter neubauer

G: neubauer.peter
S: peter.neubauer
P: +46 704 106975
L: http://www.linkedin.com/in/neubauer
T: @peterneubauer

Wanna learn something new? Come to @graphconnect.


> --
>
>

Michael Hunger

unread,
Aug 27, 2012, 5:00:56 AM8/27/12
to ne...@googlegroups.com
Chris,

you would want to group by first letter unfortunately the needed string functions are still missing.

What you can do is this:
start n=node(*) where id(n) <> 0  return 
  length(filter(x in collect(n.name) : x =~ '(?i)T.*')) as T, 
  length(filter(x in collect(n.name) : x =~ '(?i)N.*')) as N

or do a with collect(n.name) as names 
first and then do the length(filter(...)) after that



--
 
 

Chris Boyd

unread,
Aug 27, 2012, 5:44:01 PM8/27/12
to ne...@googlegroups.com
Ye, this works and returns exactly what I was looking for! Thank you!

Peter Neubauer

unread,
Aug 27, 2012, 5:59:28 PM8/27/12
to ne...@googlegroups.com
I crave a blog from thread!

:)

Cheers,

/peter neubauer

G: neubauer.peter
S: peter.neubauer
P: +46 704 106975
L: http://www.linkedin.com/in/neubauer
T: @peterneubauer

Wanna learn something new? Come to @graphconnect.


> --
>
>

David Montag

unread,
Aug 27, 2012, 6:05:15 PM8/27/12
to ne...@googlegroups.com
On Mon, Aug 27, 2012 at 2:00 AM, Michael Hunger <michael...@neotechnology.com> wrote:
Chris,

you would want to group by first letter unfortunately the needed string functions are still missing.

What you can do is this:
start n=node(*) where id(n) <> 0  return 
  length(filter(x in collect(n.name) : x =~ '(?i)T.*')) as T, 
  length(filter(x in collect(n.name) : x =~ '(?i)N.*')) as N

Wouldn't you want an additional (n.name =~ '(?i)T.*' or n.name =~ '(?i)N.*') in the WHERE clause to reduce the potential in-memory size of the collect(n.name) result? Or is collect() lazily iterable?
 
--
 
 



--
Neo Technology, www.neotechnology.com
Skype: ddmontag

As a friend of Neo, register for GraphConnect and get a 20% discount. Code: GCFON

Michael Hunger

unread,
Aug 27, 2012, 6:19:00 PM8/27/12
to ne...@googlegroups.com
Good point David,

I think Chris already promised us a blog post on the whole discussion and what he's learned :)

Cheers

Michael

--
 
 

Chris Bolton

unread,
Aug 27, 2012, 6:35:39 PM8/27/12
to ne...@googlegroups.com
Michael,

Different Chris. =)

Chris(Bolton)

--- On Mon, 8/27/12, Michael Hunger <michael...@neotechnology.com> wrote:
--
 
 

Michael Hunger

unread,
Nov 22, 2012, 7:42:02 PM11/22/12
to ne...@googlegroups.com
What do you want to achieve?

Michael

Am 22.11.2012 um 18:56 schrieb Nghia Doan:

Hi Michael and Peter,

I am running into a similar problem, don't really know how to solve this.


The query is:

start n=node(*)
where id(n) <> 0 and n.name =~ '(?i)T.*'
with collect(n) as Ts start n = node(*)
where id(n) <> 0 and n.name =~ '(?i)N.*'
with Ts, collect(n) as Ns start n = node(*)
where id(n) <> 0 and n.name =~ '(?i)C.*'
with Ts,Ns, collect(n) as Cs start n = node(*)
where id(n) <> 0 and n.name =~ '(?i)A.*'
with Ts,Ns,Cs, collect(n) as As start n = node(*)
where id(n) <> 0 and n.name =~ '(?i)F.*'
return Ts,Ns,Cs,As, collect(n) as Fs

And the query returned:

+------------------------+
| Ts | Ns | Cs | As | Fs |
+------------------------+
+------------------------+
0 row 0 ms





Is it a bug ? Shouldn't it return the Ts,Ns,Cs,As and Fs should be just null ?

The following query is just fine:

start n=node(*)
where id(n) <> 0 and n.name =~ '(?i)T.*'
with collect(n) as Ts start n = node(*)
where id(n) <> 0 and n.name =~ '(?i)N.*'
with Ts, collect(n) as Ns start n = node(*)
where id(n) <> 0 and n.name =~ '(?i)C.*'
with Ts,Ns, collect(n) as Cs start n = node(*)
where id(n) <> 0 and n.name =~ '(?i)A.*'
return Ts,Ns,Cs,collect(n) as As

with correct result:

+--------------------------------------------------------------------------------------------------------------------------------------------+
| Ts                                                      | Ns                    | Cs                       | As                            |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| [Node[3]{name:"Trinity"},Node[6]{name:"The Architect"}] | [Node[1]{name:"Neo"}] | [Node[4]{name:"Cypher"}] | [Node[5]{name:"Agent Smith"}] |
+--------------------------------------------------------------------------------------------------------------------------------------------+
1 row
0 ms

In the mean time, do you know any work-around ?

Thank in advance,

Nghia Doan

--
 
 

Nghia Doan

unread,
Nov 22, 2012, 11:28:57 PM11/22/12
to ne...@googlegroups.com
Hi,

I want to get the Ts, Ns, Cs, As collection content event if Fs is null.

Nghia Doan

--
 
 

Michael Hunger

unread,
Nov 23, 2012, 2:26:40 AM11/23/12
to ne...@googlegroups.com
I would probably use filter for that:

But graph global queries are not the thing you want to do with your graph databases anyway, this is a typical sql query :)

You can easily test both in the console.

probably even better is to just group the values by the first letter

start n=node(*) 
where id(n) <> 0 
AND lower(left(n.name,1)) in ['t','n','c','a','f'] // this line is perhaps optional
return lower(left(n.name,1)), collect(n)


alternatively you can use filter

start n=node(*) 
where id(n) <> 0
with collect(n) as nodes
return filter(n in nodes : n.name =~ '(?i)T.*') as Ts,
filter(n in nodes : n.name =~ '(?i)N.*') as Ns,
filter(n in nodes : n.name =~ '(?i)C.*') as Cs,
filter(n in nodes : n.name =~ '(?i)A.*') as As,
filter(n in nodes : n.name =~ '(?i)F.*') as Fs


Michael




--
 
 

Nghia Doan

unread,
Nov 23, 2012, 2:35:49 AM11/23/12
to ne...@googlegroups.com
Michael, you make wonder happened!

Thanks a lot. You are right, using filter save my day!

Thanks again.

Nghia Doan

--
 
 

Reply all
Reply to author
Forward
0 new messages