Hi all,
We're looking at a Tinkerpop-based stack to build our object graph
database on. It looks very cool. Our application will require high
concurrency - i.e. we need to support many clients accessing the
database at the same time. Since our workload will be read-heavy, I'm
less worried about locking for the moment than I am about ensuring that
the architecture can scale reasonably to handle concurrent requests.
Assume for the sake of discussion that we're building atop the
OrientDB platform and using the Blueprints API for OrientDB:
http://code.google.com/p/orient/wiki/GraphDatabaseTinkerpop#Work_with_vertexes_and_edges
We will have one server and multiple worker process clients (with
multiple connections per client) communicating via TCP/IP.
Performance is very important to us, so while REST via ReXster looks
like a great client API, it seems undesirable from a performance
perspective. I understand that Blueprints does not support connection
pooling, which I interpret to mean that it only supports a single
handle to the underlying datastore at a time.
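To make the pooling question concrete, here is a minimal sketch of the
kind of client-side pool I have in mind, assuming each handle can be
opened independently of the others. It uses only java.util.concurrent;
HandlePool and its methods are hypothetical names of mine, not part of
Blueprints or OrientDB, and the factory stands in for whatever opens one
remote handle.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Fixed-size pool of handles of any type T (hypothetical helper, not a Blueprints class). */
final class HandlePool<T> {
    private final BlockingQueue<T> idle;

    /** Opens 'size' handles up front using the supplied factory (size must be at least 1). */
    HandlePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Borrows a handle, waiting up to timeoutMillis; returns null on timeout or interrupt. */
    T acquire(long timeoutMillis) {
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    /** Returns a borrowed handle so another thread can use it. */
    void release(T handle) {
        idle.offer(handle);
    }

    /** Number of handles currently idle in the pool. */
    int available() {
        return idle.size();
    }
}
```

Each worker thread would call acquire, run its reads, and call release in
a finally block. Because the pool is generic over the handle type, it
does not depend on any one datastore.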
What is the preferred way to support high performance concurrent
requests from remote clients?
Thanks,
Dean
Hi Luca,
Thanks for your fast response!
That approach was certainly one that I had identified, but it seemed to
have the disadvantage that we'd become reliant on a feature exposed
by the underlying datastore. In other words, we'd lose a lot of the
benefit of Blueprints being datastore-agnostic.
This bit probably belongs in the OrientDB group, but I'll ask it
here given that it relates to my original question: In the OrientDB
remote connection model, it appears that OrientDB supports concurrent
connections. Can you suggest how many connections we should run in
our pool? For example, we've noticed that SQL databases typically do
best with a few dozen handles, while some of the other NoSQL platforms
can handle several thousand without issue.
Cheers,
Dean