Traversing all nodes of a certain type: use an index or use a category node?

Asfand Yar Qazi

Aug 12, 2012, 6:08:11 AM
to ne...@googlegroups.com
Hello,

I want to make sure that, down the line, when I need to access all nodes of type (for example) Car, I can do so easily, for things like migrating data or fiddling with properties.

* I can link each node I create to a category node (I think that's what it's called) with a property 'node_type: Car', which gives me an easy way of finding all Cars. I was told on this mailing list that this can create contention on the category node and so slow down writes (which I assume is the only downside).

* I can add the 'node_type' property to the Lucene index, so that a search on 'node_type: "Car"' returns all Cars. I guess this causes contention on the index too.

I don't need this in normal day-to-day operations; I am only thinking of the future, when I will need to run migration scripts that do something to all nodes because of an upgrade or a bug fix of some kind.

Is this the right way to be going about things, and which option is the best?

Thanks

Michael Hunger

Aug 12, 2012, 7:55:47 AM
to ne...@googlegroups.com
I think that if you don't need it on a day-to-day basis, it is a safe bet to put it into Lucene.

Alternatively, if you only need it for migrations, you can just put a property on the node and then do a full-db scan during your migration.

Michael
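
For illustration, the "property plus full scan" idea can be sketched as below. This is only a minimal model, not Neo4j code: nodes are plain Python dicts, and the names `nodes_of_type`, `migrate_cars` and the 'colour' property are made up for the example. The 'node_type' property is the one from the question.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical stand-in for a graph node: just a dict of properties.
Node = Dict[str, object]


def nodes_of_type(all_nodes: Iterable[Node], node_type: str) -> Iterator[Node]:
    """Lazily yield every node whose 'node_type' property matches."""
    for node in all_nodes:
        # Nodes without the property are simply skipped.
        if node.get("node_type") == node_type:
            yield node


def migrate_cars(all_nodes: Iterable[Node]) -> int:
    """Example migration (assumed): rename 'colour' to 'color' on every Car."""
    migrated = 0
    for car in nodes_of_type(all_nodes, "Car"):
        if "colour" in car:
            car["color"] = car.pop("colour")
            migrated += 1
    return migrated


# A tiny stand-in "database" of mixed node types.
graph = [
    {"node_type": "Car", "colour": "red"},
    {"node_type": "Person", "name": "Ann"},
    {"node_type": "Car", "colour": "blue"},
    {"name": "untyped"},
]
migrated_count = migrate_cars(graph)
```

The scan touches every node once, so its cost grows with the size of the whole database rather than with the number of Cars; that is the price paid for having no index or category node to maintain on writes.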

Asfand Yar Qazi

Aug 12, 2012, 8:15:57 AM
to ne...@googlegroups.com

On Aug 12, 2012 12:55 PM, "Michael Hunger" <michael...@neotechnology.com> wrote:
> Optionally if you only need it for migrations you can also just put a property on the node and then during your migration do a full-db scan.
>

I think this is the solution I'll go with then - I didn't know nodes could be found unless they were somehow related to a known one.

Won't loading every node into memory cause out-of-memory errors? Or is there a way to paginate over nodes using an iterator or something? I'm using Neo4j.rb, which embeds a Neo4j database within the Ruby process.

Thanks

Michael Hunger

Aug 12, 2012, 8:17:50 AM
to ne...@googlegroups.com, Andreas Ronge
You can get an iterator over all nodes with GlobalGraphOperations.at(gdb).getAllNodes(), or with Cypher: start n=node(*) ....

Neo4j.rb should also have a means of accessing the global iterator.

Processed nodes will be garbage-collected once they are no longer referenced.

Michael
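
To make the memory point concrete, here is a rough sketch of why iterating lazily does not hold the whole graph in memory. It is an analogy only: `FakeNode`, `all_nodes` and `scan` are hypothetical names, the code does not use the Neo4j API, and CPython's reference counting stands in for the JVM garbage collector, which reclaims objects on its own schedule.

```python
from typing import Iterator


class FakeNode:
    """Hypothetical stand-in for a node loaded from the store."""

    live = 0  # instances currently alive
    peak = 0  # highest value 'live' has reached

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        FakeNode.live += 1
        FakeNode.peak = max(FakeNode.peak, FakeNode.live)

    def __del__(self) -> None:
        FakeNode.live -= 1


def all_nodes(count: int) -> Iterator[FakeNode]:
    """Yield nodes one at a time instead of building a full list."""
    for node_id in range(count):
        yield FakeNode(node_id)


def scan(count: int) -> int:
    """Visit every node, keeping no reference once a node is processed."""
    processed = 0
    for node in all_nodes(count):
        processed += 1  # real per-node work would go here
    return processed


processed = scan(10_000)
```

After the scan, `FakeNode.peak` shows how many nodes were ever alive at the same time; because each node is dropped as soon as the loop moves on, it stays at a handful rather than growing with the node count. Collecting the nodes into a list first would defeat this.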

Asfand Yar Qazi

Aug 12, 2012, 8:20:09 AM
to ne...@googlegroups.com

On Aug 12, 2012 1:18 PM, "Michael Hunger" <michael...@neotechnology.com> wrote:
>
> There is an iterator with GlobalGraphOperations.at(gdb).getAllNodes() or with cypher start n=node(*) ....
>
> Neo4j.rb should also have means of accessing the global iterator.
>
> The processed nodes will be gc'ed if they are no longer referenced.
>
> Michael
>

That's brilliant, thank you.
