Failover over OneShotChannelAdaptor channels

damien.grandemange

Nov 20, 2012, 6:59:22 AM
to jpos-...@googlegroups.com
Hi,

Given the nature of OneShotChannelAdaptor, it seems inappropriate to use the MUXPool failover facility with OneShotChannelAdaptor channels.
Has anyone already faced this need and found a good strategy?

Victor Salaman

Nov 20, 2012, 8:06:27 AM
to jpos-...@googlegroups.com
Hi Damien:

You are right. Due to its nature, the OneShotChannelAdaptor as-is is not a good candidate for failover. The problem is that the MUXPool will "assume" that the underlying MUX (backed by the OneShotChannelAdaptor) is connected and operational. Depending on what you need, this behavior can be changed by extending OneShotChannelAdaptor and adding a "ready" indicator and an error threshold.

For example, if you exceed your error threshold of 3 (or reach some other condition that you define), you would clear the "ready" indicator. At this point, the MUXPool would stop sending messages to your sick MUX. It is then up to you to start a "recovery" thread that monitors the connection and keeps trying to reconnect, so that when the connection is re-established the "ready" indicator is set and the MUXPool starts sending messages your way again.

/V
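
A minimal sketch of the "ready" indicator and error threshold described above, in plain Java. The class ReadyIndicator and its method names are hypothetical and not part of jPOS; how an extended OneShotChannelAdaptor would publish this flag to the MUXPool is left out and would depend on the jPOS version in use.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not a jPOS class): counts consecutive failures and
// exposes a "ready" flag that a pool could consult before routing a message.
class ReadyIndicator {
    private final int errorThreshold;
    private final AtomicInteger consecutiveErrors = new AtomicInteger();
    private final AtomicBoolean ready = new AtomicBoolean(true);

    ReadyIndicator(int errorThreshold) {
        this.errorThreshold = errorThreshold;
    }

    // Call after every failed connection attempt or failed exchange.
    void recordFailure() {
        if (consecutiveErrors.incrementAndGet() > errorThreshold) {
            ready.set(false); // threshold exceeded: mark this endpoint as sick
        }
    }

    // Call after a successful exchange, or from a recovery thread once a
    // probe connection to the remote host succeeds again.
    void recordSuccess() {
        consecutiveErrors.set(0);
        ready.set(true);
    }

    boolean isReady() {
        return ready.get();
    }
}
```

With a threshold of 3, the flag is cleared on the fourth consecutive failure. A recovery thread would periodically attempt a connection and call recordSuccess() when one goes through.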
chillum

Nov 20, 2012, 9:09:30 AM
to jpos-...@googlegroups.com
Hi Damien,

Do read this
http://www.andyorrock.com/2012/01/one-shot-watchdog.html
-chhil



damien.grandemange

Nov 20, 2012, 10:34:14 AM
to jpos-...@googlegroups.com
Thank you for your help.

In the basic Channel implementation (designed for persistent connections), the channel pushes its "I am ready" information to a space. QMUX is tied to this same space so that it is notified of the readiness (or non-readiness) of its channels.
In the OneShotChannelAdaptor case, to provide something close to the "ready" indicator, the channel would have to make regular connection attempts to the remote host so that this availability state stays close to reality. That means successive open/close connections to the remote host.
That is what the watchdog does, if I have understood it correctly.
(By the way, how will a remote acquirer react to this series of open/close cycles with nothing sent in between? I can imagine that, for security reasons, access may simply be rejected in the worst case.)

Now, my naive approach:
As for a OneShotChannelAdaptor's availability, it seems to me that the best time to establish it is at the connection attempt itself.
Thinking this way means getting rid of the "ready" concept, which does not help in its current form. "Ready" then becomes an ephemeral state that lasts only for the duration of the connection attempt.
We should also get rid of the MUX concept, which does not seem very compatible with the "one shot" nature.
Back to the failover mechanism: it then becomes the originator of the connection attempt. When it has to choose a OneShotChannelAdaptor from a pool of them, it should take the first instance in the pool that connects successfully.
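
The "first instance that connects wins" idea above can be sketched with plain java.net sockets. This is only an illustration under stated assumptions: FirstConnectFailover and connectFirstAvailable are hypothetical names, and a real implementation would work with jPOS channels rather than raw sockets.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

// Hypothetical sketch (not a jPOS class): failover driven by the
// connection attempt itself instead of a long-lived "ready" flag.
class FirstConnectFailover {
    // Tries each endpoint in the given order and returns the first socket
    // that connects. Throws IOException if none of them can be reached.
    static Socket connectFirstAvailable(List<InetSocketAddress> endpoints,
                                        int connectTimeoutMillis) throws IOException {
        IOException last = null;
        for (InetSocketAddress endpoint : endpoints) {
            Socket socket = new Socket();
            try {
                socket.connect(endpoint, connectTimeoutMillis);
                return socket; // "ready" only means: this attempt succeeded
            } catch (IOException e) {
                last = e; // remember the failure and move on to the next endpoint
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing useful to do if closing a failed socket fails
                }
            }
        }
        throw new IOException("no endpoint reachable", last);
    }
}
```

The caller sends its request over the returned socket, reads the response, and closes it, which matches the one-shot usage pattern.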

chhil

Nov 20, 2012, 10:50:33 AM
to jpos-...@googlegroups.com
The one-shot approach that we have used is what you have described.
This was a client-driven requirement, and they were fine with the poll
to check whether the server is available.

You are correct that the MUX is overkill: since the connection is
available for a single request/response pair only, no matching is really
required.

-chhil

Alejandro Revilla

Nov 20, 2012, 10:53:07 AM
to jpos-...@googlegroups.com
Actually, OneShotChannelAdaptor could implement the MUX interface itself.

--
@apr

damien.grandemange

Nov 22, 2012, 9:52:56 AM
to jpos-...@googlegroups.com
Hi guys,

I've been working on a little something.
In a few words, it consists of:
* an EnhancedOneShotChannelAdaptor class: largely based on the original OneShotChannelAdaptor, it is configurable with a 'handle connection failures' mode,
* a OneShotChannelPool class: a channel pool implementation with two load distribution strategies largely inspired by MUXPool; it handles a pool of (OneShot)channels,
* a ConnectionFailureException class.

I got rid of the "ready" concept, as I said previously, but kept Alejandro's suggestion of keeping the MUX interface, though I expose it through the OneShotChannelPool. EnhancedOneShotChannelAdaptor remains unchanged on this point and keeps providing the Channel interface.
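
A rough sketch of how two load distribution strategies could decide the order in which pooled channels are tried. It assumes the two strategies are analogous to a primary/secondary mode and a round-robin mode; the post does not name them, and SelectionOrder is a hypothetical class that is not taken from the demo repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: computes the order in which a pool would attempt
// its channels for one request. Strategy names are assumptions.
class SelectionOrder {
    enum Strategy { PRIMARY_SECONDARY, ROUND_ROBIN }

    private final Strategy strategy;
    private final AtomicInteger counter = new AtomicInteger();

    SelectionOrder(Strategy strategy) {
        this.strategy = strategy;
    }

    // Returns the indexes 0..size-1 in the order they should be attempted.
    // PRIMARY_SECONDARY always starts at index 0; ROUND_ROBIN advances the
    // starting index by one on each call and wraps around.
    List<Integer> nextOrder(int size) {
        List<Integer> order = new ArrayList<>();
        if (size <= 0) {
            return order; // empty pool: nothing to try
        }
        int start = (strategy == Strategy.ROUND_ROBIN)
                ? Math.floorMod(counter.getAndIncrement(), size)
                : 0;
        for (int i = 0; i < size; i++) {
            order.add((start + i) % size);
        }
        return order;
    }
}
```

The pool would then walk this order and use the first channel whose connection attempt succeeds, falling through to the next index on a connection failure.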

Here is a little demo of it : https://github.com/dgrandemange/OneShotChannelPool-demo

Any feedback is welcome.

Alejandro Revilla

Nov 22, 2012, 10:03:42 AM
to jpos-...@googlegroups.com

Demo works like a charm. You may want to add an mkdir runtime/log to your README file.

Will certainly use it if the need arises!

--
@apr

damien.grandemange

Nov 22, 2012, 10:39:08 AM
to jpos-...@googlegroups.com
Thank you for your time.
If the need really arises someday, this enhanced version of the OneShotChannelAdaptor could easily be merged with the current implementation. In fact, with no <cnx-process-handling/> declaration in the EnhancedOneShotChannelAdaptor XML configuration, you should get the current OneShotChannelAdaptor behavior.

chhil

Nov 22, 2012, 12:33:48 PM
to jpos-...@googlegroups.com
Missed this. Will give it a shot tomorrow.

-chhil


chhil

Nov 23, 2012, 12:50:30 AM
to jpos-...@googlegroups.com
Hi Damien,
The demo works great! I like the choice of strategies for picking
the channels.

Maybe this is a demo thing, but I removed the deployed servers and
used netcat to listen instead, which does not respond. The clients don't
send any more messages once a response is not received. A disconnect
is also not triggered, as the connection to netcat stays connected.

-chhil

damien.grandemange

Nov 23, 2012, 5:54:51 AM
to jpos-...@googlegroups.com
Hi,
Thank you for your feedback.


> The clients dont send any more messages once a response is not received
As far as I understand netcat, it accepts only a single connection. So once a connection to netcat is established, it accepts no more clients. Is this right?

I ran some tests: after removing the two server configs, I launched netcat in two ways:
1) "nc -n -l -p 23456 -w 1" (listen on port 23456 and, once a connection has been accepted, disconnect after 1 second)
=> The pool and the one-shot channels work as expected even though no response has been received. DemoParticipant gets a null response from the channel pool.

2) "cat msg0110.xml | nc -n -l -p 23456 -w 1" (listen on port 23456 and, once a connection has been accepted, send the content of msg0110.xml as a response, then disconnect after 1 second)
=> The pool and the one-shot channels work as expected. DemoParticipant gets this response from the channel pool.

I then redeployed the two server configs, and the load was again distributed between them.

> A disconnect is also not triggered as the connection to netcat is always connected.
I didn't get this. A disconnect from which component?

chhil

Nov 23, 2012, 6:05:51 AM
to jpos-...@googlegroups.com
Hello,

I just tried netcat, and yes, it allows only one connection; I wasn't aware of this.
The behavior in your tests looks right. I got misled by the netcat behavior.

> I didn't get this. A disconnect from which component ?
When a request is sent and a response is not received in a specific
time the client should disconnect thus freeing it up. Cannot rely on a
server to initiate a disconnect in this scenario.

Will play around with the demo a bit more where the bsh script does
not respond back.

-chhil

damien.grandemange

Nov 23, 2012, 6:29:17 AM
to jpos-...@googlegroups.com
> When a request is sent and a response is not received in a specific time the client should disconnect thus freeing it up. Cannot rely on a server to initiate a disconnect in this scenario.
Totally agree with that.
Actually, I think this responsibility is left to the underlying channel.
In this demo, I am using an XMLChannel (child of BaseChannel, child of Denethor ...). XMLChannel internally uses BufferedReader.readLine() to read from the server. Unless you set a timeout (SO_TIMEOUT) on the channel's underlying socket, this readLine() call will block until the server closes the connection.
This timeout is not configured in my demo channels' XML configurations, but just add a <property name="timeout" value="2000" /> under the <channel/> element and it works just fine: the client (channel) cuts the connection after 2 seconds if netcat does not respond within that delay.
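
The blocking read and the effect of a socket timeout can be reproduced with plain java.net classes. This is a standalone sketch, not the XMLChannel code: ReadWithTimeout and readLineOrNull are hypothetical names, while Socket.setSoTimeout and BufferedReader.readLine are standard JDK calls.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: a line-oriented read that gives up on the client
// side instead of waiting for the server to close the connection.
class ReadWithTimeout {
    // Returns the first line sent by the peer. Returns null if the peer
    // closes the stream without sending a line, or if nothing arrives
    // within timeoutMillis (in which case the socket is closed here).
    static String readLineOrNull(Socket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis); // without this, readLine() may block indefinitely
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
        try {
            return reader.readLine();
        } catch (SocketTimeoutException e) {
            socket.close(); // client-initiated disconnect on timeout
            return null;
        }
    }
}
```

Against a listener that accepts the connection but never answers, the call returns null after roughly timeoutMillis and the socket is closed, so the client is free to move on to the next request.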

chhil

Nov 23, 2012, 6:35:38 AM
to jpos-...@googlegroups.com

Perfect!
