Accessing data on a distributed database

Abhilash Panigrahi

Jun 18, 2015, 3:12:22 AM
to orient-...@googlegroups.com
I'm new to distributed databases. I set up a distributed OrientDB configuration on multiple EC2 instances, and the nodes recognize and communicate with each other.

1) Suppose I have a particular database on server 1 and not on server 2. If I set autoDeployment=false, server 2 downloads only the distributed config for that database, not the database itself. If I then query server 2 for data from that database, it says it does not have access to the database. Is there a way to configure OrientDB so that it can find out where the data lives in the distributed setup, letting me query any one server and get whatever data I need? Or does OrientDB support only complete replication?

2) Is there a way to set a replication count, i.e. the number of copies I want to maintain, in a 4-node cluster? For example, suppose I insert a document into server 1 and want one more copy of the data on any of the other 3 nodes. How do I set this up?

Colin

Jun 19, 2015, 10:15:38 PM
to orient-...@googlegroups.com
Hi Abhilash,

1) Your question isn't entirely clear, but if the database only resides on Server 1, there's no way to ask an instance running on Server 2 to communicate with Server 1 for the unique database.  You would need to open a separate connection to the instance with the database you need.

2) The distributed-config.json file is where you can edit which server nodes each cluster is replicated to. Each class you create has a default cluster with the same name where all its records are stored. You can create multiple clusters for every class, if you so choose, and you specify which cluster a record is inserted in. You could have a class named MagicData that has two clusters: MagicDataA and MagicDataB. The MagicDataA cluster could be stored on nodes server1, server3, and server5; and you could have the MagicDataB cluster stored on server2, server4, and server6.

If you looked in the hypothetical distributed-config.json file, you'd look for a cluster called MagicDataA, and you'd see that server1, server3, and server5 were specified for replication.
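As a rough illustration of the lookup described above, here is a minimal Python sketch that reads a made-up excerpt of such a file and reports which nodes are listed for a cluster. The "clusters" and "servers" key names are an assumption about the file layout, and the excerpt is invented for this example rather than copied from a real distributed-config.json, so check it against the file your own server generates.

```python
import json

# Hypothetical excerpt of a distributed-config.json. The "clusters" and
# "servers" key names are assumed; a real file contains further settings.
CONFIG_TEXT = """
{
  "clusters": {
    "MagicDataA": { "servers": ["server1", "server3", "server5"] },
    "MagicDataB": { "servers": ["server2", "server4", "server6"] }
  }
}
"""


def servers_for_cluster(config, cluster_name):
    """Return the server nodes listed for a cluster, or an empty list."""
    entry = config.get("clusters", {}).get(cluster_name, {})
    return list(entry.get("servers", []))


config = json.loads(CONFIG_TEXT)
print(servers_for_cluster(config, "MagicDataA"))
print(servers_for_cluster(config, "MagicDataB"))
```

The helper only inspects the configuration text; it does not contact any server or confirm where records are physically stored.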

Best regards,

-Colin

Orient Technologies

The Company behind OrientDB

Abhilash Panigrahi

Jun 26, 2015, 4:26:08 AM
to orient-...@googlegroups.com
Thank you so much! Things are much clearer now.

Regarding sharding...

Suppose I have three nodes, viz. node1, node2 and node3, and two clusters, viz. zip_india and zip_usa.


Now I tried to set the servers up such that zip_india will be on node1 and node2 and zip_usa will be on node3.
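For reference, the intended placement can be written down as a small Python sketch that generates the per-cluster section of the config. The "clusters" and "servers" key names are an assumption about the layout of default-distributed-db-config.json, and build_clusters_section is a hypothetical helper introduced only for this illustration.

```python
import json


def build_clusters_section(placement):
    """Turn {cluster: [nodes]} into a mapping of cluster -> {"servers": [...]}."""
    return {name: {"servers": list(nodes)} for name, nodes in placement.items()}


# Intended placement from the post: zip_india on node1 and node2, zip_usa on node3.
placement = {
    "zip_india": ["node1", "node2"],
    "zip_usa": ["node3"],
}

clusters_section = build_clusters_section(placement)
print(json.dumps({"clusters": clusters_section}, indent=2))
```

This covers only the two clusters named explicitly; any other cluster in the database is outside what the sketch describes.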

I configured default-distributed-db-config.json before creating the database. After I create the database and connect the other nodes, a lot of other clusters are formed automatically, viz. _studio, _studio_node2, _studio_node3, etc.

Now, if I connect to node1 and insert a record into the zip_india cluster, I can see the record even on node3, as if it had been replicated there. Is the data actually being stored on node3 too? Or is OrientDB fetching the data from node1 when I query node3 for that record in the zip_india cluster?


