Logging Performance and Query Times (via console and SDN)

303 views
Skip to first unread message

BtySgtMajor

unread,
Sep 28, 2012, 11:26:27 AM9/28/12
to ne...@googlegroups.com
Hi all,

  So I'm getting into some good stuff here as I dig into Neo4j deeper.  At present, I'm trying to determine where bottlenecks lie as I develop some SDN code around a Neo4j server.  Obviously, I can time the Java code that makes the call out to the server (which is remote--on the same network, but sitting on a VM on another box), and I can see the time it takes my queries through the Neo4j console.

  My question is this: Is there a way in the server/logging config in Neo4j to document/log each query's runtime?  This would help me greatly in determining where my bottlenecks are.  (I'd imagine it's in the part where my SDN repository is calling out to the remote Neo4j server, but I'd like to confirm and then start tweaking things).

  Thanks in advance!  Exciting times!

Cheers,
Duncan

Jim Webber

unread,
Sep 28, 2012, 1:39:09 PM9/28/12
to ne...@googlegroups.com
Hi Duncan,

Would the CLF HTTP logging suit your needs? That is: http://docs.neo4j.org/chunked/stable/server-configuration.html#_http_logging_configuration

Jim

Michael Hunger

unread,
Sep 28, 2012, 2:01:22 PM9/28/12
to ne...@googlegroups.com
Duncan,

that is a good point actually.

I can add time logging to the repositories / queries in SDN. Due to the lazy nature that will be only the query parsing and initial execution though (might be the full query time though depending on if you have a streaming or non-streaming server interaction).

Can you share some of the code you're talking about?

An more or less easy way to do it for you might be putting a logging-proxy in front of the neo4j server


Alternatively if you're more adventurous adding a filter (via an unmanaged extension) to the neo4j server that logs the request times.
(see: https://github.com/neo4j/authentication-extension/blob/1.8/src/main/java/org/neo4j/server/extension/auth/AuthenticationExtensionInitializer.java, also make sure to have the META-INF/services/.... with your initializer class name) there.

HTH

Michael

--
 
 

BtySgtMajor

unread,
Sep 28, 2012, 3:49:31 PM9/28/12
to ne...@googlegroups.com
It might help with determining the length of the actual request handling.  Though that link doesn't actually take me to anywhere on the config page that contains explicit info about HTTP logging, I'll look deeper as I have a chance.  Thanks.

BtySgtMajor

unread,
Sep 28, 2012, 3:55:29 PM9/28/12
to ne...@googlegroups.com
Hi Michael,

  What you're proposing in terms of adding time logging for repos/queries is exactly what I'm looking for.  I don't have any eager loading in place at this point in time anyway (it's the last thing I'd want to add).

  I'll give the logging proxy bit a look as I have a chance.  As for the filter option, that might take me a bit longer, and will give that a shot if it comes to that.

  With respect to my code, there's really not all that much.

  Essentially, from within one of my Spring MVC controllers, I have a line like this:  ("Game" is part of my domain model as I'm doing something similar to IMDB.)

Collection<Game> allGames = IteratorUtil.asCollection(gameRepository.findAll());

  And, the "gameRepository" is declared like this:

public interface GameRepository extends GraphRepository<Game>, RelationshipOperationsRepository<Game> {
Game findByGameId(String id);
}

  That's about it.  Just keeping things simple. 

  The "Game" model does have a few relationships; however, all relationships/collections are left as lazily loaded so as to avoid a big mess.

  Currently, running the code above takes, on average, about 35s to complete the round trip.

  If I run the same query via Cypher on the Neo4j console, it takes a whopping 42ms.

  Also, if I just fetch a single game via the gameId via SDN, it takes like 600ms.

  So, like I said, early days, and I'm trying to really dig in and find out exactly where bottlenecks are occurring so I can rip it apart and really get familiar with it.  Like I said before, I have my hypotheses, but I need ways to test them out. :)

  Let me know if you want/need additional info/code and, if so, what it is.

  Thanks again, Michael!

Cheers,
Duncan

Michael Hunger

unread,
Sep 28, 2012, 4:25:51 PM9/28/12
to ne...@googlegroups.com
The main problem is that SDN is not yet optimized to work with Neo4j server (which means we have to rewrite the internals to use cypher throughout the place).

So the problem is not the findAll per se (which will be converted to a index lookup operation) but the hydrating of the objects.

Sorry for that, it's a people bandwith issue on our side right now.

You can either look into hydrating your objects via Cypher and @ResultMapping or 
moving your SDN code into a server plugin or unmanaged extension and expose domain level endpoints for your webapp to consume (talking domain language and objects and not nodes and rels on that boundary).

HTH

Michael
--
 
 

BtySgtMajor

unread,
Sep 28, 2012, 5:08:32 PM9/28/12
to ne...@googlegroups.com
Actually, something I just came across and hadn't anticipated.

The repository call is actually pretty quick; on the order of several seconds (like no more than 5-10 tops).

While it could be better (given that it's only 550 nodes being returned), it's sufficiently fast for now.

What's slowing things down is this: My conversion of findAll()'s return type (i.e. a ClosableIterable) into a Collection (via IteratorUtil.asCollection(...) is what's taking up the majority of the time.

So, this is either something I've forgotten about since my hardcore Java days, or I'm going to have to look into why the conversion is so horribly inefficient (I do get the differences between the Iterable concept and the Collection concept).  I'm using a "Collection" since I'm assigning it to a "select" box in my JSP via Spring MVC.

Thoughts are welcome (I've probably overlooked something obvious).

So, long story short: The timing code I need can likely wait until I have things straightened out a bit more.  I would love to see my query come down from ~5s to a lot less.

BTW, I found that the findAll() method was just as quick (if not quicker) than my use of the same Cypher query and the use of @MapResult, etc.

Thanks again guys!  Very excited going through all this!

BtySgtMajor

unread,
Oct 1, 2012, 12:24:23 PM10/1/12
to ne...@googlegroups.com
Just thought I'd try again here.  Concerning;

>> What's slowing things down is this: My conversion of findAll()'s return type (i.e. a ClosableIterable) into a Collection (via IteratorUtil.asCollection(...) is what's taking up the majority of the time.

Has anyone come across this before?  All my nodes/relationships are left at the default of being lazily loaded, so I'm hoping I'm doing something else somewhat obvious.  And, the size of the iterable/collection is only 550 nodes.

Thanks again!

Michael Hunger

unread,
Oct 1, 2012, 5:27:13 PM10/1/12
to ne...@googlegroups.com
It would really be great if you could share your project and data (privately) so we could have a look.

Thanks

Michael

--
 
 

Duncan Brown

unread,
Oct 1, 2012, 5:35:28 PM10/1/12
to ne...@googlegroups.com
Sure, how would you recommend?  I could create a github account if you'd like.  How would I privately get you the data?  I could see about whitelisting your IP against the Neo4j server, unless you'd just like me to zip up the data dir and Gmail to you or some such.

Also, please be forewarned that the code is in a serious state of flux and so it most definitely will not look/flow as nice as it should.

Let me know.

--
 
 

Peter Neubauer

unread,
Oct 2, 2012, 4:11:42 PM10/2/12
to ne...@googlegroups.com
Maybe zip up the installation and put it on dropbox, and share the
link off-list?

Cheers,

/peter neubauer

G: neubauer.peter
S: peter.neubauer
P: +46 704 106975
L: http://www.linkedin.com/in/neubauer
T: @peterneubauer

Wanna learn something new? Come to http://graphconnect.com
> --
>
>
Reply all
Reply to author
Forward
0 new messages