Read Only requests in Cluster

Max Bridgewater

Nov 21, 2009, 11:44:47 AM
to h2-da...@googlegroups.com
Hi,

I have two nodes in a cluster. When I connect to it from the VM of the first node (62.165.121.14), everything is fine; but when I connect to it from the VM of the second node (211.245.193.235), I get the exception:

 Clustering error - database currently runs in cluster mode, server list: "'62.165.121.14:8803,211.245.193.235:8803'

The URL I use to connect from the first node is: tcp://62.165.121.14:8803,211.245.193.235:8803/test
The URL used to connect from the 2nd node is: tcp://211.245.193.235:8803,62.165.121.14:8803/test.

Essentially, the second node rearranges the order of servers in the URL so that its address is the first. This doesn't seem to be loved by H2.
I thought this rearranging is necessary because the documentation says "Read-only queries are only executed against the first cluster node, but all other statements are executed against all nodes."

If this statement is true, isn't the constraint that the same URL always be used to connect to the cluster (regardless of where the request is made) counter-productive?
It suggests to me that in a cluster of 10 nodes, for instance, 9 of them will be performing read-only queries on the same remote node, which I assume was not the desired behavior.

I'm sure I am missing something here. Please help me figure it out.

Thanks,
Merkel

Thomas Mueller

Nov 23, 2009, 5:44:16 PM
to h2-da...@googlegroups.com
Hi,

The clustering mechanism is mainly for high availability and not for
high performance. See
http://www.h2database.com/html/advanced.html#clustering

> Essentially, the second node rearranges the order of servers in the URL so
> that its address is the first. This doesn't seem to be loved by H2.

Yes, this is not supported. The problem is: if two clients use a
different order, then updates would be written in different order,
which could lead to deadlocks. Reading from a server would be
problematic as well in some cases where writes and reads are mixed.

> It suggests to me that in a cluster of 10 nodes for instance, 9 of them will
> be performing read-only queries on the same remote node, which I assume was
> not the desired behavior.

Not desired for a high performance cluster, I agree.

I will add a feature request to support reading from a random node.

Regards,
Thomas
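
As an illustration of the point above that every client must list the servers in the same order, here is a minimal sketch that builds the connection URL from one shared, fixed-order list instead of letting each node put its own address first. The class and method names are hypothetical and not part of H2; the addresses and database name come from the thread, and the `jdbc:h2:tcp://` prefix is the usual H2 server-mode form.

```java
import java.util.List;

public class ClusterUrl {
    // One fixed server order shared by every client (addresses from the thread).
    static final List<String> SERVERS =
            List.of("62.165.121.14:8803", "211.245.193.235:8803");

    // Builds the JDBC URL without reordering the list,
    // regardless of which node the client happens to run on.
    static String build(List<String> servers, String database) {
        return "jdbc:h2:tcp://" + String.join(",", servers) + "/" + database;
    }

    public static void main(String[] args) {
        System.out.println(build(SERVERS, "test"));
    }
}
```

Both nodes would then pass the same string to `DriverManager.getConnection`, so updates are applied to the servers in the same order from every client.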

Max Bridgewater

Nov 23, 2009, 7:57:43 PM
to h2-da...@googlegroups.com
I understand your point. The question is how we get a highly available system with acceptable performance. In my case, for instance, reading from the local node would be the most natural thing to do.

Also related to the performance of this read algorithm, I made a small test that produced troublesome results. Maybe I'm doing something wrong. In a cluster with five nodes, a select takes 50 ms, which is great. But when I kill the node from which the read is done, it jumps to an average of 3500ms. Does that match your results?
 
> Not desired for a high performance cluster, I agree.
>
> I will add a feature request to support reading from a random node.

That would be nice. What about adding a URL param that allows H2 users to specify which node to read from? Something like:

jdbc:h2:tcp://address1,address2,address3,address4;READ_FROM=3

I think this would be generic enough to meet different read algorithms. It would allow me to read from the local node, allow other people to read from a single node, and still allow those who want to read from random nodes to be able to do so.
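
To make the proposal above concrete, here is a rough sketch of how a client could interpret such a setting. `READ_FROM` is only the parameter suggested in this post, not an existing H2 setting, and the class, the method, and the 1-based indexing are assumptions made for illustration. When the parameter is absent the sketch falls back to the first node, matching the documented behavior quoted earlier in the thread.

```java
import java.util.Arrays;
import java.util.List;

public class ReadFromSketch {
    private static final String PREFIX = "jdbc:h2:tcp://";
    private static final String KEY = "READ_FROM="; // hypothetical setting from the post

    // Returns the server address that read-only queries would be sent to.
    static String readNode(String url) {
        if (!url.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a TCP cluster URL: " + url);
        }
        String[] parts = url.substring(PREFIX.length()).split(";");
        // Drop an optional "/database" suffix from the server list.
        String serverList = parts[0];
        int slash = serverList.indexOf('/');
        if (slash >= 0) {
            serverList = serverList.substring(0, slash);
        }
        List<String> servers = Arrays.asList(serverList.split(","));
        int index = 1; // default: first node
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].startsWith(KEY)) {
                index = Integer.parseInt(parts[i].substring(KEY.length()));
            }
        }
        if (index < 1 || index > servers.size()) {
            throw new IllegalArgumentException("READ_FROM out of range: " + index);
        }
        return servers.get(index - 1); // assumed 1-based
    }
}
```

With the URL from the post, `readNode("jdbc:h2:tcp://address1,address2,address3,address4;READ_FROM=3")` selects `address3` under the 1-based assumption.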

Thomas Mueller

Nov 27, 2009, 11:17:56 AM
to h2-da...@googlegroups.com
Hi,

> In my case, for instance, reading from
> the local node would be the most natural thing to do.

Sure. There is a feature request: "Support mixed clustering mode (one
embedded, others in server mode)".

> In a cluster with five nodes, a select takes 50 ms, which is great. But when I
> kill the node from which the read is done, it jumps to an average of 3500ms.
> Does that match your results?

No. Could you try to find out what is going on? (maybe debugging will
help, or use java -Xrunhprof:cpu=samples ...). If not, could you
create a simple test case so I can reproduce the result? Is it
reproducible on a single machine?
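
A simple test case of the kind asked for here could be as small as the following sketch, which averages the wall-clock time of a repeated query so the figures before and after stopping a node can be compared. The class and method names are hypothetical; only standard `java.sql` calls are used, and the connection is assumed to be opened elsewhere with the cluster URL.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SelectTiming {
    // Average wall-clock time in milliseconds of running the task `runs` times.
    static double averageMillis(Runnable task, int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000.0 / runs;
    }

    // Executes one query and reads every row, so fetch time is included.
    static void runSelect(Connection conn, String sql) {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                // Rows are discarded; only the elapsed time matters here.
            }
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `averageMillis(() -> runSelect(conn, sql), 100)` once with all nodes up and once after killing the first node in the list would show whether the jump from 50 ms to 3500 ms is reproducible.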

> That would be nice. What about adding a URL param that allows H2 users to
> specify which node to read from? Something like:
>
> jdbc:h2:tcp://address1,address2,address3,address4;READ_FROM=3

I think that detection should be automatic (so that you don't need to
know which one is the local node). That may actually be almost as
easy to implement as manual configuration.

Regards,
Thomas
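
One way the automatic detection mentioned above might work is to check which address in the server list is bound to a local network interface. This is only a sketch using standard `java.net` calls, not how H2 does or would implement it; the class and method names are hypothetical, and entries are assumed to be `host:port` with an IPv4 address or a host name.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.List;

public class LocalNodeDetector {
    // True if the host resolves to an address bound to an interface on this machine.
    static boolean isLocal(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            return addr.isLoopbackAddress()
                    || NetworkInterface.getByInetAddress(addr) != null;
        } catch (UnknownHostException | SocketException e) {
            return false; // unresolvable or unreadable: treat as remote
        }
    }

    // Index of the first local entry in a "host:port" list, or -1 if none is local.
    static int localIndex(List<String> servers) {
        for (int i = 0; i < servers.size(); i++) {
            String server = servers.get(i);
            int colon = server.lastIndexOf(':');
            String host = colon < 0 ? server : server.substring(0, colon);
            if (isLocal(host)) {
                return i;
            }
        }
        return -1;
    }
}
```

A client could use the returned index only to pick the node it reads from while still keeping the server order in the URL identical on every node.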