I thought OrientDB already had HA capabilities :) since the main
page says "The transactional engine can run in distributed systems
supporting up to 9.223.372.036 Billions of records for the maximum
capacity of 19.807.040.628.566.084 Terabytes of data distributed on
multiple disks in multiple nodes."
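
As an aside, those two figures look like plain powers of two. The check below is only a quick sketch: reading them as 2^63 records and 2^94 bytes (2^63 records of up to 2^31 bytes each) is my own assumption, the page does not say so, and `dotted` is just a helper name I made up.

```python
# Assumption (not stated on the OrientDB page): the quoted limits are
# 2**63 records and 2**94 bytes, i.e. 2**63 records of up to 2**31 bytes each.
MAX_RECORDS = 2 ** 63
MAX_BYTES = MAX_RECORDS * 2 ** 31  # equals 2**94

billions_of_records = MAX_RECORDS // 10 ** 9
terabytes = MAX_BYTES // 10 ** 12  # decimal terabytes (10**12 bytes)


def dotted(n: int) -> str:
    """Format an integer with '.' as thousands separator, as in the quote."""
    return f"{n:,}".replace(",", ".")


print(dotted(billions_of_records))  # 9.223.372.036
print(dotted(terabytes))            # 19.807.040.628.566.084
```

If that reading is right, the numbers are theoretical addressing limits rather than a statement about clustering.
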
Thinking about Mongo and the concepts behind its architecture
(http://www.mongodb.org/display/DOCS/Sharding+Introduction),
would it be possible to merge what they call "mongos" and "config
servers"? I'm not sure I understand why they split the two, as:
- it seems more complex
- it surely multiplies the number of servers needed; it really calls for a
farm of VPSes :), I can't imagine doing it with dedicated servers :)
- it adds network traffic and latency
- so apparently three reasons that lead to a fourth: it's clearly
more expensive, particularly in a hosting context, whether cloud,
hybrid cloud or dedicated servers.
These config servers remind me of the metadata servers of several
distributed, parallel, fault-tolerant file systems. It's great in
some ways, but it also brings quite a few requirements at the expense of
simplicity. I prefer the approach of the increasingly popular
GlusterFS, a competitor with no metadata servers (maybe there are some
ideas to pick up here: distributed file systems can be a good teacher,
and GlusterFS a good inspiration since it's quite unique).
Regarding the "one Master and N Slaves", maybe an idea for OrientDB
(I may be misunderstanding something, and Mongo may already do
this): the possibility to run several servers per machine (VM or
dedicated), while of course avoiding a slave on the same machine as its
master. That would reduce the number of machines needed.
They call this "partitioned servers" in Lotus Notes environments.
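
To make the idea concrete, here is a minimal sketch of such a layout. It is purely hypothetical: `place_servers` and the `setN` names are mine, and nothing here is an existing OrientDB or Mongo feature. Each machine hosts the master of one group plus slaves of other groups, and a slave never lands on its own master's machine.

```python
from typing import Dict, List


def place_servers(machines: List[str], slaves_per_master: int) -> Dict[str, dict]:
    """Hypothetical layout: one master per machine, slaves on the next machines.

    Slaves are assigned round-robin starting from the machine after the
    master's, so a slave is never on the same machine as its master.
    """
    n = len(machines)
    if slaves_per_master >= n:
        raise ValueError("need more machines than slaves per master")
    layout = {}
    for i, master_host in enumerate(machines):
        # Offsets 1..slaves_per_master are all < n, so none wraps back to i.
        slave_hosts = [machines[(i + k) % n] for k in range(1, slaves_per_master + 1)]
        layout[f"set{i}"] = {"master": master_host, "slaves": slave_hosts}
    return layout


for name, group in place_servers(["m1", "m2", "m3"], slaves_per_master=1).items():
    print(name, group)
```

With three machines and one slave per master, this gives three master/slave groups on three machines instead of six.
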
Is auto-sharding in the works?
Will OrientDB keep its ACID properties in a clustered context (whether
it is used as a document or a graph database)?
Does an OrientDB database have to be compacted periodically because of MVCC?
If so, can that be done while the database is in use?
One other thing to keep from Mongo or Sedna is the idea of language bindings
over a binary protocol, as the REST approach is perfectly cool for
slow performance :)
So PHP client support would be welcome :)
On 16 Sep, 12:49, Luca Garulli <l.garu...@orientechnologies.com> wrote:
> It seems that the most missed feature in OrientDB is support for
> clustering, and therefore the high scalability, high availability and high
> volume of transactions that a single node can't handle. Over the last months I
> studied the different architectures of other NoSQL solutions for
> clustering and I can say that my preferred one so far is something similar
> to the MongoDB approach, with a Master/Slaves architecture.
>
> The current work-in-progress release 0.9.23 provides the first version of
> Replication in OrientDB. The features are:
>
> - Master-Slaves type, where there can be only one Master and N Slaves. If
> the Master crashes, a Slave is elected to be the new Master
> - IP multicast to discover cluster nodes
> - Configuration of nodes using TCP/IP, useful for clouds that don't allow
> IP multicast
> - Two sync modes: full, where the whole database is compressed and sent over
> the network, and partial, which sends only the changes made since the last
> sync
> - New database handled by the Master OrientDB Server instance to store
> all the pending records up to a configurable threshold. Above this threshold
> the logs are deleted and the node needs a full sync on startup
> - New console commands to display nodes and listen to clustering messages
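
Just to check that I read the two sync modes and the threshold correctly, here is how I picture it. This is a sketch based only on the description above: `PendingLog`, `record_change` and `sync_plan` are hypothetical names of mine, not OrientDB's actual classes or protocol.

```python
class PendingLog:
    """Hypothetical master-side log of changes a slave has not received yet."""

    def __init__(self, threshold: int):
        self.threshold = threshold  # configurable maximum of pending records
        self.pending = []
        self.overflowed = False

    def record_change(self, change) -> None:
        if self.overflowed:
            return  # logs already dropped; only a full sync can help now
        self.pending.append(change)
        if len(self.pending) > self.threshold:
            # Above the threshold the logs are deleted.
            self.pending.clear()
            self.overflowed = True

    def sync_plan(self):
        """Return ('partial', changes) or ('full', []) for a reconnecting slave."""
        if self.overflowed:
            return ("full", [])
        return ("partial", list(self.pending))


log = PendingLog(threshold=2)
log.record_change("update #1")
log.record_change("update #2")
print(log.sync_plan())  # ('partial', ['update #1', 'update #2'])
log.record_change("update #3")
print(log.sync_plan())  # ('full', [])
```

So a slave that was away for only a few changes gets a partial sync, while one that missed more than the threshold has to receive the whole compressed database again.
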