Graph Indexes: Lucene vs in-graph (node-based)
From: SimonH <simon.ha...@gmail.com>
Date: Fri, 1 Jun 2012 07:39:38 -0700 (PDT)
Local: Fri, Jun 1 2012 10:39 am
Subject: Graph Indexes: Lucene vs in-graph (node-based)

Hi, I've got a graph in which I want to index different Nodes (Entities and
Events) by properties such as time range, location, domain Ontology,
etc. The two obvious options for doing this are: 1) a Lucene
Index; or 2) an in-graph Index, where I'll use a Node to index the Nodes I
seek. One main advantage of the in-graph Index is its versatility: it
supports a multilevel index (as shown in
http://blog.neo4j.org/2012/02/modeling-multilevel-index-in-neoj4.html),
reverse index lookup, and other possibilities in traversals. However, it
is a bit more complex to maintain and it "pollutes" the graph with "system
nodes". Moreover, I'm not sure how the in-graph index compares to the
Lucene Index in terms of efficiency. More specifically, for time/date
indexing, how would the multilevel index above compare to a Lucene
"YYYYMMDD" String field index? The in-graph index seems to offer some
advantage when indexing a start and end date and searching for date
ranges.

I would appreciate any insights into these two indexing approaches.
Thanks!

Simon
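
To make the comparison in the question concrete, here is a minimal sketch of the two lookup styles in plain Python. It does not call Neo4j or Lucene; the class names FlatDateIndex and MultilevelDateIndex, the string node ids, and the nested-dict layout are all assumptions made for illustration. The first class stands in for a sorted "YYYYMMDD" string-field index, where a date range becomes a lexicographic range because the keys are zero-padded. The second stands in for a year -> month -> day tree like the one in the linked blog post, where a range query walks only the matching branches.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from datetime import date


def date_key(d):
    """Zero-padded YYYYMMDD string; lexicographic order equals date order."""
    return f"{d.year:04d}{d.month:02d}{d.day:02d}"


class FlatDateIndex:
    """Hypothetical stand-in for a sorted string-field index."""

    def __init__(self):
        self._entries = []  # sorted list of (key, node_id) tuples

    def add(self, d, node_id):
        entry = (date_key(d), node_id)
        self._entries.insert(bisect_left(self._entries, entry), entry)

    def range(self, start, end):
        # "" sorts before any node id; "\uffff" is assumed to sort after all ids.
        lo = bisect_left(self._entries, (date_key(start), ""))
        hi = bisect_right(self._entries, (date_key(end), "\uffff"))
        return [node_id for _, node_id in self._entries[lo:hi]]


class MultilevelDateIndex:
    """Hypothetical stand-in for an in-graph year -> month -> day tree."""

    def __init__(self):
        # Each dict level plays the role of index nodes linked by relationships.
        self._root = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

    def add(self, d, node_id):
        self._root[d.year][d.month][d.day].append(node_id)

    def range(self, start, end):
        result = []
        for year in sorted(self._root):
            if not start.year <= year <= end.year:
                continue  # prune the whole year branch
            for month in sorted(self._root[year]):
                if not (start.year, start.month) <= (year, month) <= (end.year, end.month):
                    continue  # prune the whole month branch
                for day in sorted(self._root[year][month]):
                    if start <= date(year, month, day) <= end:
                        result.extend(self._root[year][month][day])
        return result
```

Both classes answer the same range query; the difference is where the work happens. The flat index does two binary searches over sorted keys, while the tree prunes whole year and month branches during traversal and leaves the intermediate levels available to other traversals. Neither says anything about real Neo4j or Lucene performance.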


 
From: Niels Hoogeveen <nielshoog...@gmail.com>
Date: Fri, 1 Jun 2012 13:56:01 -0700 (PDT)
Local: Fri, Jun 1 2012 4:56 pm
Subject: Re: Graph Indexes: Lucene vs in-graph (node-based)
I have done some work on in-graph indexes in the past, and in my
experience they are not always worth the effort. It depends, however,
on the context. If, for example, you want to expose the index as
part of your application, an in-graph index is a great solution.

In my experience in-graph indexes become less attractive when indexing
large numbers of nodes. Rebalancing index trees can become
prohibitively slow when indexes grow large. In "normal" B-trees, for
example, the index consists of blocks that can be swapped in and out of
memory as a unit. In-graph indexes use relationships to build the tree,
but those relationships are not grouped together on disk, so rebalancing
an index tree may require disk reads from many different places in the
relationship file.

In my experience (running on my development machine, without any
additional tuning) an index of up to approximately 100,000 entries still
performs reasonably well; above that number of entries, performance
becomes progressively slower. Of course tuning can make the approach
work well for larger numbers of entries, but I have to assume the
basic pattern remains.
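
The locality argument above can be illustrated with a toy model that counts how many distinct disk pages are touched when reading the records of one index node. This is only a sketch under stated assumptions: PAGE_SIZE, the file size, and the helper names are hypothetical, and the model does not reflect Neo4j's actual store layout or any measured numbers.

```python
import random

PAGE_SIZE = 128  # assumed number of records per disk page (hypothetical)


def pages_touched(record_positions, page_size=PAGE_SIZE):
    """Count the distinct pages that hold the given record positions."""
    return len({pos // page_size for pos in record_positions})


def contiguous_block(start, fanout):
    """B-tree style: the entries of one index node sit next to each other."""
    return range(start, start + fanout)


def scattered_records(fanout, file_size, rng):
    """In-graph style: relationship records land anywhere in the file."""
    return rng.sample(range(file_size), fanout)


rng = random.Random(42)  # fixed seed so repeated runs behave the same
block_pages = pages_touched(contiguous_block(0, 100))
scattered_pages = pages_touched(scattered_records(100, 1_000_000, rng))
```

In this model the page-aligned block of 100 records fits in a single page, while 100 records spread over a large file almost always fall on many different pages, which is the pattern that makes rebalancing expensive once the index no longer fits in memory.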



 
From: SimonH <simon.ha...@gmail.com>
Date: Wed, 6 Jun 2012 05:50:47 -0700 (PDT)
Local: Wed, Jun 6 2012 8:50 am
Subject: Re: Graph Indexes: Lucene vs in-graph (node-based)

Thanks for your feedback, Niels!


 