How does query profiler work?


Ryan Velasco

Mar 16, 2015, 5:43:26 AM
to ne...@googlegroups.com, Michael Hunger
Hello,

If I run a query twice on its own, the second run is faster, but if I run it together with other queries, it is still slow the second time.

Best Regards,
Ryan
Run with many queries.png
Run the queries many times alone.png

Michael Hunger

Mar 16, 2015, 7:33:08 AM
to Ryan Velasco, ne...@googlegroups.com
Please always include the actual queries you run and information about the actual dataset.
Otherwise no one can help you.

You should also include the profiling info for your two queries.

Also try to measure query performance from the Neo4j shell, which minimizes the impact of drivers or of additional requests made by the Neo4j browser.
You can also use http://localhost:7474/webadmin/#/console/ for that query-time testing.

Michael


Ryan Velasco

Mar 17, 2015, 12:14:02 AM
to ne...@googlegroups.com, ry...@limesource.se
Thanks for the reply.
How do you do query profiling? I have observed that when SQL Server is eating up the memory, the Neo4j query runs slowly. Maybe it will be different in production, because we plan to dedicate a machine with only Neo4j installed on it. Can you recommend good specs for such a machine? We plan to use one with a Core i7 and 8GB of memory. Is there a feature that lets me view the history of queries made and the time each took to retrieve the data?

Thanks,
Ryan

Michael Hunger

Mar 17, 2015, 9:20:19 AM
to ne...@googlegroups.com
There is some more logging of queries in more recent versions.

In neo4j.properties:

dbms.querylog.enabled=true
# in ms
dbms.querylog.threshold=500
dbms.querylog.path=data/log/queries.log

In Neo4j 2.2 you can prepend EXPLAIN to a query, which shows you the query plan visually (in the browser) or textually (in the Neo4j shell) without running the query.

If you use PROFILE, it runs the query and also shows the db hits.
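As a small illustration of the difference, here is a minimal Python sketch that builds either form of a query string before it is sent to the shell or browser. The helper name `with_plan_prefix` is hypothetical and not part of any Neo4j driver; it only prepends the keyword described above.

```python
def with_plan_prefix(query: str, execute: bool = False) -> str:
    """Prefix a Cypher query with EXPLAIN (plan only) or PROFILE (run and show db hits)."""
    keyword = "PROFILE" if execute else "EXPLAIN"
    return f"{keyword} {query.strip()}"


if __name__ == "__main__":
    sample = "MATCH (n:Subscription) RETURN count(n)"
    print(with_plan_prefix(sample))                # plan only, query is not run
    print(with_plan_prefix(sample, execute=True))  # query is run and profiled
```

EXPLAIN is the safer first step on a slow query, since PROFILE has to execute it to collect the db hits.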

As you haven't shared your queries or data model, there is not much more I can help you with.

Cheers; Michael


--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ryan John Velasco

Mar 18, 2015, 1:13:58 AM
to ne...@googlegroups.com, Michael Hunger
I will send you the 2 slow queries with profile.

Best regards,
Ryan John Velasco

Ryan John Velasco

Mar 18, 2015, 5:22:19 AM
to ne...@googlegroups.com, Michael Hunger
Thanks. I tried adding PROFILE to some of our queries, and some of them have 3M db hits. A sample query is:
MATCH (n:Subscription)
WHERE (n.SubscriptionStartDate IS NULL or n.SubscriptionStartDate <= 635622637370000000) and (n.SubscriptionEndDate IS NULL or n.SubscriptionEndDate >= 635622637370000000)
RETURN count(n)

Best Regards,
Ryan John Velasco

plan.png

Michael Hunger

Mar 18, 2015, 5:30:35 AM
to ne...@googlegroups.com, Ryan Velasco
Right, you don't want to do that; there is currently no index support for range queries.
Your query pulls all nodes and their properties into memory and does the comparison there.

Usually you have some other criteria to limit the search first.
For this concrete use case it seems that you're looking at subscriptions outside of a certain time range; you can also tag them with an additional label.

I suggest that you either add something like a time tree to your graph to structure your subscriptions.

see: http://neo4j.com/docs/stable/cypher-cookbook-path-tree.html
or http://graphaware.com/neo4j/2014/08/20/graphaware-neo4j-timetree.html

Or you store a lower-resolution date property (e.g. down to the year, month or day level), index it

create index on :Subscription(start);

and do a lookup via
(s.start IN range(2000,2012))
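To make the lower-resolution idea concrete, here is a minimal Python sketch of how such a property could be derived at export time. It assumes the large integers in the sample query are .NET-style ticks (100-nanosecond intervals since 0001-01-01), which the thread does not confirm. The function names and the yyyymmdd encoding are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert ticks (assumed: 100 ns units since 0001-01-01) to a naive datetime."""
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // 10)


def day_key(ticks: int) -> int:
    """Day-resolution integer (yyyymmdd) that could be stored and indexed."""
    d = ticks_to_datetime(ticks)
    return d.year * 10000 + d.month * 100 + d.day


def candidate_years(first: int, last: int) -> list:
    """Inclusive list of years, mirroring an inclusive range used with IN."""
    return list(range(first, last + 1))


if __name__ == "__main__":
    sample = 635622637370000000  # value taken from the sample query
    print(ticks_to_datetime(sample), day_key(sample))
    print(candidate_years(2000, 2012))
```

A coarse key like this turns the range comparison into a lookup over a small list of exact values, which an exact-match index can serve.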

Ryan John Velasco

Mar 18, 2015, 5:33:10 AM
to Michael Hunger, ne...@googlegroups.com
Thanks. I also have one of my team members experimenting with an index. I will send you the results.



Ryan John Velasco

Mar 18, 2015, 5:35:53 AM
to Michael Hunger, ne...@googlegroups.com
Is there a plan to support range queries someday?
How does Neo4j store the properties of a node? Is each property linked separately, or is it like a document (all properties in a single store)?

Best Regards,
Ryan


Michael Hunger

Mar 18, 2015, 7:51:55 AM
to Ryan John Velasco, ne...@googlegroups.com
Properties are stored in a linked list of property records, each of which can contain up to 4 properties (depending on size).

So whenever you don't need a property, it's not loaded and the traversal is faster.
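The following Python sketch is a toy model of that layout, not Neo4j's actual store format. The class and function names are hypothetical, and the fixed limit of 4 properties per record ignores the size dependence mentioned above. It shows why reading a property means walking the chain, and why a traversal that never asks for a property never touches the chain at all.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Tuple

MAX_PROPS_PER_RECORD = 4  # simplification: real capacity depends on value sizes


@dataclass
class PropertyRecord:
    props: dict = field(default_factory=dict)
    next: Optional["PropertyRecord"] = None  # link to the next record in the chain


def build_chain(props: dict) -> Optional[PropertyRecord]:
    """Pack properties into a singly linked chain of records."""
    head = tail = None
    items = list(props.items())
    for i in range(0, len(items), MAX_PROPS_PER_RECORD):
        record = PropertyRecord(dict(items[i:i + MAX_PROPS_PER_RECORD]))
        if tail is None:
            head = record
        else:
            tail.next = record
        tail = record
    return head


def get_property(head: Optional[PropertyRecord], key: str) -> Tuple[Any, int]:
    """Walk the chain; return (value or None, number of records visited)."""
    visited = 0
    record = head
    while record is not None:
        visited += 1
        if key in record.props:
            return record.props[key], visited
        record = record.next
    return None, visited
```

In this model, a lookup of a missing property is the worst case: every record in the chain has to be visited before the answer is known.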

Range indexes are planned, but I can't give a timeline.

Michael

Ryan John Velasco

Mar 19, 2015, 3:22:03 AM
to Michael Hunger, ne...@googlegroups.com
Hello,

Here is an actual query involving subscriptions, for example:
PROFILE MATCH (T0:Company)-[T2:Owner]-(T3:OwnedItems)-[T4:OwnedItem]-(T6:Fleet)-[T7:FleetGroup]-(T8:FleetGrouping)-[T9:GroupedSite]-(T11:Vessel)-[T12:At]-(T13:LocatedAt)-[T14:Located]-(T16:Invoiceable)-[T17:Goods]-(T19:SoldGoods)-[T20:Sold]-(T22:Subscription)-[T24:Subscription]-(T26:SalesSettings)-[T27:Contract]-(T28:Contract)-[T29:PriceAgreement]-(T31:SalesRelation)-[T32:Buyer]-(T33:Company)
WHERE T0.ID in [
1,
2,
172076,
172079
] and (T4.StartDate IS NULL or T4.StartDate <= 635623428460000000) and (T4.EndDate IS NULL or T4.EndDate >= 635623428460000000) and (T9.StartDate IS NULL or T9.StartDate <= 635623428460000000) and (T9.EndDate IS NULL or T9.EndDate >= 635623428460000000) and (T14.StartDate IS NULL or T14.StartDate <= 635623428460000000) and (T14.EndDate IS NULL or T14.EndDate >= 635623428460000000) and (T17.StartDate IS NULL or T17.StartDate <= 635623428460000000) and (T17.EndDate IS NULL or T17.EndDate >= 635623428460000000) and (T20.StartDate IS NULL or T20.StartDate <= 635623428460000000) and (T20.EndDate IS NULL or T20.EndDate >= 635623428460000000) and (T22.SubscriptionStartDate IS NULL or T22.SubscriptionStartDate <= 635623428460000000) and (T22.SubscriptionEndDate IS NULL or T22.SubscriptionEndDate >= 635623428460000000) and (T24.StartDate IS NULL or T24.StartDate <= 635623428460000000) and (T24.EndDate IS NULL or T24.EndDate >= 635623428460000000) and (T29.StartDate IS NULL or T29.StartDate <= 635623428460000000) and (T29.EndDate IS NULL or T29.EndDate >= 635623428460000000)
RETURN distinct T33.ID


Would the time tree still be applicable?

Ryan John Velasco

Mar 24, 2015, 4:10:08 AM
to Michael Hunger, ne...@googlegroups.com
Hello,

I was able to speed up the queries by enabling
relationship_auto_indexing=true
relationship_keys_indexable=StartDate,EndDate

and also by adding USING INDEX to the queries that use IN and equality (=) predicates.
The other thing I did was to remove the possibility of null values, which also sped up the queries.

Thanks for the help.
Ryan

----- Original Message -----
From: "Michael Hunger" <michael...@neotechnology.com>
To: "Ryan John Velasco" <ry...@limesource.se>
Sent: Thursday, March 19, 2015 7:18:01 PM
Subject: Re: [Neo4j] How does query profiler work?

Indexes for relationships are under discussion, but no decision has been made yet.

Your point about adding a label for ":Current" or ":Active" nodes would make a big difference, as checking a label is really fast compared to checking properties.

As for the comparison expression: comparing something with null is always false, and NOT() of that is always true; that's why I thought it could help.


Michael





On 19.03.2015 at 11:25, Ryan John Velasco < ry...@limesource.se > wrote:


Please see my comments.

----- Original Message -----

From: "Michael Hunger" < michael...@neotechnology.com >
To: "Ryan John Velasco" < ry...@limesource.se >
Sent: Thursday, March 19, 2015 5:44:30 PM
Subject: Re: [Neo4j] How does query profiler work?

I wonder why it doesn't use the index on :Company.ID -> it says it uses a Label-Scan instead.

can you please run ":schema" in the browser

can you expand all operations?

and perhaps try this hint after the match

USING INDEX T0:Company(ID)

?
> On 19.03.2015 at 10:30, Ryan John Velasco < ry...@limesource.se > wrote:
>
> I put an index on all nodes. StartDate and EndDate are on relationships, but sadly relationships can't be indexed as well. Also, your index is like a hash table, right?
the index is a b-tree
Thanks for answering this. Is there a plan to also support range-query indexes for relationships? All our relationships have a start and end date.
> We decided not to build a time tree because our use case is range-based and it would complicate our queries. It is also difficult during export because a node is valid not for a single date but for a range of dates.
ok
> The solution I have in mind for now, to get a smaller result set, is to split a relationship type; for example, the Knows relationship would become KnowsPossiblyValidForTheMoment and KnowsEnded.
I don't understand?

For example, Subscription will have SubscriptionPossiblyValidForTheMoment (during export we will check whether the start and end dates are already past; if not, it could be valid in the future).
Most of our queries work with relationships that are valid at the moment, so instead of queries like (subscription)-[]-() we can use (SubscriptionPossiblyValidForTheMoment)-[]-() to get a smaller subset of nodes by using a label or relationship type.
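The export-time check described here could look roughly like the Python sketch below. It is only one reading of the description: anything whose end date has already passed is marked as ended, and everything else (including open-ended or future-dated items) stays in the "possibly valid" bucket. The name `SubscriptionEnded` is a hypothetical counterpart chosen by analogy with KnowsEnded, and the tick values are the ones used in the queries in this thread.

```python
from typing import Optional


def validity_label(end_ticks: Optional[int], now_ticks: int) -> str:
    """Pick a coarse label at export time; None means no end date (open-ended)."""
    if end_ticks is not None and end_ticks < now_ticks:
        return "SubscriptionEnded"
    # Not yet ended: valid now or possibly valid in the future.
    return "SubscriptionPossiblyValidForTheMoment"


if __name__ == "__main__":
    now = 635623428460000000
    print(validity_label(None, now))                # open-ended
    print(validity_label(635622637370000000, now))  # ended before "now"
```

The label only narrows the candidate set. The exact StartDate and EndDate comparison would still be needed in the query, because the label goes stale between exports.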

btw you can probably simplify your query by using

>> (T4.StartDate IS NULL or T4.StartDate <= 635623428460000000)
->
>> NOT(T4.StartDate > 635623428460000000)
since for NULL values the comparison evaluates to false, and so the total expression will be true

not sure if that helps

So if T4.StartDate is null it will be true, right? This will help. Will this make the query faster? So NOT(T4.EndDate < 635623428460000000) will also work?
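Whether the two forms really agree for null values depends on how the Cypher version in use treats null. The Python sketch below models three-valued (ternary) logic, where a comparison with null yields null rather than false, NOT(null) is null, and WHERE keeps only rows whose predicate is true. That is how the Cypher documentation describes null handling, but it is an assumption here for any particular Neo4j release, so it is worth running both forms against real data. All function names are hypothetical.

```python
from typing import Optional


def gt(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """a > b under ternary logic: null (None) if either side is null."""
    return None if a is None or b is None else a > b


def le(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """a <= b under ternary logic."""
    return None if a is None or b is None else a <= b


def not3(x: Optional[bool]) -> Optional[bool]:
    """NOT under ternary logic: NOT(null) stays null."""
    return None if x is None else not x


def or3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """OR under ternary logic: true wins, otherwise null is contagious."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def where_keeps(predicate: Optional[bool]) -> bool:
    """WHERE keeps a row only when the predicate is true (null is dropped)."""
    return predicate is True


def original_form(start: Optional[int], t: int) -> Optional[bool]:
    # (start IS NULL OR start <= t)
    return or3(start is None, le(start, t))


def rewritten_form(start: Optional[int], t: int) -> Optional[bool]:
    # NOT(start > t)
    return not3(gt(start, t))
```

Under these assumed semantics the two forms agree for non-null values, but a row with a null StartDate is kept by the original form and dropped by the rewritten one.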

>
> Best Regards,
> Ryan
>
> ----- Original Message -----

> From: "Michael Hunger" < michael...@neotechnology.com >
> To: "Ryan John Velasco" < ry...@limesource.se >
> Sent: Thursday, March 19, 2015 4:44:16 PM
> Subject: Re: [Neo4j] How does query profiler work?
>
> What is the profile/explain output?
>
> Depending on your data this can result in many billions of paths to be checked
>
> Add directions to your rels
>
> Is there an index/constraint on :Company.ID ? Add one if not
>
> In general all your _nodes_ can be attached to a time tree
>
>
> Sent from my iPhone