Multiple datastore copies in the same JVM: potential conflict?

Christophe Porté

Oct 8, 2012, 11:53:16 AM

to ne...@googlegroups.com

Hi,

Just a quick question to be fully sure that we're not doing dangerous things :)

Let's say we have a template Neo4j datastore somewhere on the file system, containing some nodes with properties, relationships, and indexes.
The folder containing this template DB is duplicated multiple times into other folders on the file system.
Then, using the Java API, we open some of the copied databases and work in them (doing different things in each; in fact there is one datastore per customer).
Is there any risk of conflict, for example based on an internal identification number (neostore UUID? index UUID?), or will the different GraphDatabaseService instances never conflict at all?
I was wondering because of the "neostore.id" file in the datastore folder.

Thanks !

Christophe.

Michael Hunger

Oct 8, 2012, 12:49:30 PM

to ne...@googlegroups.com

In general this will work. It depends on what you want to do with the databases later: e.g. if they should join an HA cluster, there might be cause for conflict; otherwise they are independent,
as long as you don't have concurrent access to the datastores. One thing to watch out for is running multiple Neo4j databases concurrently on a single machine as long-term stores:
each of them should be configured to use an appropriate amount of memory for mmio and Java heap.
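
To make the memory point concrete, here is a minimal sketch of splitting one mmio budget evenly across several embedded databases in the same JVM. The class name MappedMemoryBudget and the 20/50/30 split between stores are made up for illustration, not a recommendation; the three keys are the mapped-memory settings as I know them from a 1.x neo4j.properties, so check them against your version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: splits one memory-mapped I/O budget evenly across several databases. */
final class MappedMemoryBudget {

    /**
     * Returns the settings for ONE database, given the total mmio budget in MB
     * and the number of databases sharing it. The 20/50/30 split between node,
     * relationship and property stores is an arbitrary illustration.
     */
    static Map<String, String> settingsFor(int totalBudgetMb, int databaseCount) {
        if (totalBudgetMb <= 0 || databaseCount <= 0) {
            throw new IllegalArgumentException("budget and database count must be positive");
        }
        // Integer division: any remainder is simply left unused.
        int perDatabaseMb = totalBudgetMb / databaseCount;

        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("neostore.nodestore.db.mapped_memory", (perDatabaseMb * 20 / 100) + "M");
        settings.put("neostore.relationshipstore.db.mapped_memory", (perDatabaseMb * 50 / 100) + "M");
        settings.put("neostore.propertystore.db.mapped_memory", (perDatabaseMb * 30 / 100) + "M");
        return settings;
    }

    public static void main(String[] args) {
        // Example: 1000 MB of mmio shared by 10 customer databases.
        for (Map.Entry<String, String> entry : settingsFor(1000, 10).entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}
```

The resulting map would be handed to each embedded database as its configuration; the Java heap still has to be sized separately for the whole JVM.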

The store-id is mostly there for identification, e.g. in log files.

May I ask what the use-case for this approach is?

HTH

Michael

Christophe Porté

Oct 8, 2012, 4:44:03 PM

to ne...@googlegroups.com

Sure !

We are providing a SaaS solution for scientific modeling and computation of environmental impacts. We decided to have one database per customer for many reasons (security, scalability, easier custom development and customer-specific versions).
We already have a multi-datastore setup using Neo4j in pre-production, so care has been taken about memory and it should not be an issue for now :)

For now we are creating new databases and populating them using JAXB and XML files. Here comes the specific use case related to the question: for legacy reasons, we will have to migrate customers from a plain SQL version to this new one (an SQL/Neo4j hybrid). The migration code needs a new Neo4j database populated with initial data.
We don't want to maintain the migration code, but the initial populating part will certainly evolve (the science part is pretty new and we're changing data models more often than in typical software). As we already have an internal Java mechanism for database schema upgrades, we would like to freeze an initially populated database, use it as the starting point for the migration, then apply upgraders to it from that point on. That is much easier than maintaining an old Java init-populating mechanism.
And to duplicate a whole Neo4j database, "cp -r" sounds pretty easy :)
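
The same copy can also be done from Java with plain java.nio.file, with no Neo4j API involved. This is only a minimal sketch, assuming the template database has been shut down cleanly and is not open while it is copied; the class name TemplateStoreCopier and the paths passed on the command line are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper: the Java equivalent of "cp -r template.db customer.db". */
final class TemplateStoreCopier {

    /** Recursively copies the template store directory into a target directory that must not exist yet. */
    static void copyTemplate(final Path template, final Path target) throws IOException {
        if (Files.exists(target)) {
            // Refuse to overwrite an existing customer database.
            throw new IOException("Target already exists: " + target);
        }
        Files.walkFileTree(template, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
                // Recreate each directory (including the root) under the target.
                Files.createDirectories(target.resolve(template.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                // Copy every store and index file to the same relative location.
                Files.copy(file, target.resolve(template.relativize(file)));
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: TemplateStoreCopier <template-dir> <target-dir>");
            return;
        }
        copyTemplate(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Each customer then opens only its own copy, so no two GraphDatabaseService instances point at the same folder.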

Thanks for your answer !
What did you mean by "concurrent access to the datastores"?

Christophe

Michael Hunger

Oct 8, 2012, 4:55:41 PM

to ne...@googlegroups.com

Wow, that sounds great.

Looking forward to some blog posts from your side describing to the public what you do :)

#1 You can also look into Cypher for pre-populating your Neo4j database (as of 1.8 it can create and update data). You can treat Cypher scripts much like SQL scripts that you version-control and apply/migrate.

It is as easy as piping a Cypher script wrapped in begin/commit to the neo4j-shell.

#2 By concurrent access I meant multiple JVMs accessing the same Neo4j database.

E.g., for #1:

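# import.sh
# Generate init.cql once, unless it already exists.
if [ ! -f init.cql ]; then
  echo "Creating init.cql"
  ruby import.rb > init.cql
fi
# Start from an empty store, then pipe the script into neo4j-shell.
rm -rf cineasts.db
cat init.cql | neo4j/bin/neo4j-shell -path cineasts.db -config neo4j.properties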

# init.cql
begin
start root=node(0)
create unique
root-[:GENRES]->({type:'GENRES'}),
root-[:MOVIES]->({type:'MOVIES'}),
root-[:PEOPLE]->({type:'PEOPLE'});

{:actors=>[{:role=>"Jake Sully", :name=>"Sam Worthington", :id=>65731}, {:role=>"Neytiri", :name=>"Zoe Saldana", :id=>8691}, {:role=>"Dr. Grace Augustine", :name=>"Sigourney Weaver", :id=>10205}, {:role=>"Col. Quaritch", :name=>"Stephen Lang", :id=>32747}, {:role=>"Trudy Chacon", :name=>"Michelle Rodriguez", :id=>17647}, {:role=>"Norm Spellman", :name=>"Joel Moore", :id=>59231}, {:role=>"Selfridge", :name=>"Giovanni Ribisi", :id=>1771}, {:role=>"Moha", :name=>"CCH Pounder", :id=>30485}, {:role=>"TsuTey", :name=>"Laz Alonso", :id=>10964}, {:role=>"Eytukan", :name=>"Wes Studi", :id=>15853}, {:role=>"Dr. Max Patel", :name=>"Dileep Rao", :id=>95697}, {:role=>"Lyle Wainfleet", :name=>"Matt Gerald", :id=>98215}, {:role=>"Samson Pilot", :name=>"Dean Knowsley", :id=>98216}], :movie_id=>19995, :directors=>[{:name=>"James Cameron", :id=>2710}], :tagline=>"Enter the World of Pandora.", :title=>"Avatar", :released=>"2009-12-18", :genres=>["Action", "Adventure", "Fantasy", "Science Fiction"]}

start root=node(0)
match
root-[:GENRES]->genres,
........................
create unique movie<-[:ACTS_IN {role : 'Bilbo Baggins'}]-({id : '65', name: 'Ian Holm'})-[:PERSON]->people
create unique movie<-[:ACTS_IN {role : 'Galadriel'}]-({id : '112', name: 'Cate Blanchett'})-[:PERSON]->people
create unique movie<-[:ACTS_IN {role : 'Gothmog & Witchking of Angmar'}]-({id : '1365', name: 'Lawrence Makoare'})-[:PERSON]->people
create unique movie<-[:ACTS_IN {role : 'King of the Dead'}]-({id : '1382', name: 'Paul Norell'})-[:PERSON]->people
create unique movie<-[:ACTS_IN {role : 'Deagol'}]-({id : '1383', name: 'Thomas Robins'})-[:PERSON]->people
create unique movie<-[:ACTS_IN {role : 'Mercenary On Boat'}]-({id : '108', name: 'Peter Jackson'})-[:PERSON]->people
create unique movie<-[:ACTS_IN {role : 'Rosie Cotton'}]-({id : '965278', name: 'Sarah McLeod'})-[:PERSON]->people
create unique movie<-[:DIRECTS]-({id : '108', name: 'Peter Jackson'})-[:PERSON]->people
;
commit
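
Since your tooling is in Java, the wrapping in begin/commit can also be generated there instead of with a Ruby script. A minimal sketch, assuming each statement is terminated with a semicolon as in the script above; the class name CypherScriptWriter and the sample statements are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds a neo4j-shell script with all statements inside one begin/commit. */
final class CypherScriptWriter {

    /** Wraps the given Cypher statements in begin/commit, one statement per line. */
    static String wrapInTransaction(List<String> statements) {
        StringBuilder script = new StringBuilder("begin\n");
        for (String statement : statements) {
            String trimmed = statement.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank entries
            }
            script.append(trimmed);
            if (!trimmed.endsWith(";")) {
                script.append(';'); // terminate the statement if the caller did not
            }
            script.append('\n');
        }
        return script.append("commit\n").toString();
    }

    public static void main(String[] args) {
        // Print a script to stdout so it can be redirected to a file such as init.cql.
        List<String> statements = Arrays.asList(
                "start root=node(0) create unique root-[:GENRES]->({type:'GENRES'})",
                "start root=node(0) create unique root-[:MOVIES]->({type:'MOVIES'})");
        System.out.print(wrapInTransaction(statements));
    }
}
```

The printed script can be version-controlled and piped to neo4j-shell the same way import.sh does it.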

Christophe Porté

Oct 9, 2012, 4:04:35 PM

to ne...@googlegroups.com

Good news, these .cql files :)
I imagine the capability to dump a database into such a file is planned?

OK for the concurrent access, that should not happen in our case.

Thanks for all your answers !

Christophe

Peter Neubauer

Oct 12, 2012, 10:20:14 AM

to ne...@googlegroups.com

Christophe,
Yes, that is planned. You can already dump the contents of the DB
into CQL files; you can even construct them from a Cypher statement,
much like in SQL, e.g.

start n=node(1,2,3) match n-[r?]-other where ID(n)>ID(other) return
"CREATE ("+ id(n) + "({name:'"+n.name?+"'})"

http://tinyurl.com/95xcz9w

Note however that this is BAD behaviour and very hacky.

The main missing piece is that there is no good way to express the
contents of the indexes, since they can be filled manually in any way.
Export/import works if you enable the same auto-indexes in both
databases.

Cheers,

/peter neubauer

G: neubauer.peter
S: peter.neubauer
P: +46 704 106975
L: http://www.linkedin.com/in/neubauer
T: @peterneubauer

Neo4j 1.8 GA - http://www.dzone.com/links/neo4j_18_release_fluent_graph_literacy.html
