Accessing data on distributed database

Abhilash Panigrahi

Jun 18, 2015, 3:12:22 AM

to orient-...@googlegroups.com

I'm new to distributed databases. I set up a distributed OrientDB configuration on multiple EC2 instances; the nodes recognize each other and are communicating.

1) Suppose a particular database exists on server 1 but not on server 2. If I set autoDeployment=false, server 2 downloads only the distributed config for that database, not the entire database. If I then query server 2 for data from that database, it says it does not have access to it. Is there a way to configure OrientDB so that it can find out where the data lives in the distributed setup, letting me query any one server and get whatever data I need? Or does OrientDB support only complete replication?

2) Is there a way to set a replication count, i.e. the number of copies I want to maintain, in a 4-node cluster? For example, suppose I insert a document into server 1 and want one more copy of the data on any one of the other 3 nodes. How do I set this up?

Colin

Jun 19, 2015, 10:15:38 PM

to orient-...@googlegroups.com

Hi Abhilash,

1) Your question isn't entirely clear, but if the database only resides on Server 1, there's no way to ask an instance running on Server 2 to communicate with Server 1 for the unique database. You would need to open a separate connection to the instance with the database you need.

2) The distributed-config.json file is where you can edit which server nodes each cluster is replicated to. Each class you create has a default cluster with the same name, where all its records are stored. You can create multiple clusters for a class if you so choose, and you specify which cluster a record is inserted into. For example, you could have a class named MagicData that has two clusters: MagicDataA and MagicDataB. The MagicDataA cluster could be stored on nodes server1, server3, and server5, and the MagicDataB cluster could be stored on server2, server4, and server6.

If you looked in the hypothetical distributed-config.json file, you'd look for a cluster called MagicDataA, and you'd see that server1, server3, and server5 were specified for replication.
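As a rough illustration of the MagicData example above, the per-cluster server lists could be modeled like this in Python. This is only a sketch: the key names "clusters" and "servers" are assumptions about the layout of distributed-config.json and should be checked against the file shipped with your OrientDB version, and servers_for is a hypothetical helper, not an OrientDB API.

```python
import json

# Hypothetical model of the cluster section of distributed-config.json.
# The "clusters" and "servers" key names are assumptions; verify them
# against the config file of the OrientDB version in use.
config = {
    "clusters": {
        "MagicDataA": {"servers": ["server1", "server3", "server5"]},
        "MagicDataB": {"servers": ["server2", "server4", "server6"]},
    }
}


def servers_for(cluster_name, cfg=config):
    """Return the servers listed for a cluster, or [] if it is not listed."""
    return cfg["clusters"].get(cluster_name, {}).get("servers", [])


if __name__ == "__main__":
    # Show the structure as it might appear when serialized to JSON.
    print(json.dumps(config, indent=2))
    print(servers_for("MagicDataA"))
```

Looking up MagicDataA in this structure gives the three servers Colin mentions; a cluster that is not listed yields an empty list in this sketch.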

Best regards,

-Colin

Orient Technologies

The Company behind OrientDB

Abhilash Panigrahi

Jun 26, 2015, 4:26:08 AM

to orient-...@googlegroups.com

Thank you so much! Things are much clearer now.

Regarding sharding...

Suppose I have three nodes, viz. node1, node2 and node3, and two clusters, viz. zip_india and zip_usa.

Now I tried to set the servers up such that zip_india will be on node1 and node2 and zip_usa will be on node3.

I configured default-distributed-db-config.json before creating the database. After I create the database and connect the other nodes, a lot of other clusters are created automatically, viz. _studio, _studio_node2, _studio_node3 etc.

Now, when I connect to node1 and insert a record into the zip_india cluster, I can see the record replicated even on node3. Is the data actually being stored on node3 too? Or is Orient fetching the data from node1 when I query node3 for that particular record in the zip_india cluster?
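For reference, the intended placement described above can be written down as a small Python sketch. It only records the intent (zip_india on node1 and node2, zip_usa on node3); the layout dictionary and the clusters_on helper are hypothetical names for illustration and say nothing about what OrientDB actually stores on each node.

```python
# Intended placement from the post: cluster name -> nodes that should hold it.
# This is a plain illustration of the intent, not an OrientDB config file.
layout = {
    "zip_india": ["node1", "node2"],
    "zip_usa": ["node3"],
}


def clusters_on(node, placement):
    """Return the sorted cluster names that the placement assigns to a node."""
    return sorted(name for name, nodes in placement.items() if node in nodes)


if __name__ == "__main__":
    for node in ("node1", "node2", "node3"):
        print(node, clusters_on(node, layout))
```

Under this intended layout node3 is assigned only zip_usa, which is why seeing a zip_india record when querying node3 raises the question of whether it is stored there or fetched remotely.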
