Fuse fabric scalability


Sudha Subramanian

Jun 8, 2012, 7:07:25 PM
to fusefabric
Hi,

I was looking for a way to load balance across OSGi services running
in different fabric containers. These services will be orchestrated
using Camel. Originally my idea was to use Camel's load balancing
strategy to load balance across a set of OSGi services (osgi:set).
For example, I would define something like

<osgi:set id="services"
        interface="service.myinterface"
        cardinality="0..N"/>

I would then use Camel to somehow load balance across these services.
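For reference, this is roughly the kind of route I had in mind (Spring DSL; the endpoint URIs are just placeholders), though it's not obvious to me how the explicit endpoints would map onto the members of the osgi:set, which is really what I'm asking:

```xml
<camelContext xmlns="http://camel.apache.org/schema/spring">
  <route>
    <from uri="direct:start"/>
    <!-- round-robin across a fixed set of endpoints; the open question
         is how to make this track the dynamic members of the osgi:set -->
    <loadBalance>
      <roundRobin/>
      <to uri="bean:serviceInstance1"/>
      <to uri="bean:serviceInstance2"/>
    </loadBalance>
  </route>
</camelContext>
```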

My question on this is:

1. If I have OSGi services installed in different containers, and I
define a set like the one above, will it contain all the services
running in the different containers?

2. In the product description of Fuse Fabric, it says 'The fabric
infrastructure automatically load balances requests across all
available containers.' Is this applicable to the case described
above?

3. Is there any other way?

Thanks,
Sudha

Ioannis Canellos

Jun 8, 2012, 8:13:33 PM
to fusef...@googlegroups.com
Fabric provides a remote OSGi service implementation, which allows you to export OSGi services and have remote containers register them as if they were local. On top of that, if you have more than one container exporting the service, Fabric will load balance invocations across the containers that export it.

You will not need Camel load balancers to leverage that feature (though you can leverage DOSGi from your Camel routes). Also, containers that import the remote service will get a single service, not a set of services, even if multiple containers export the service. To be more accurate, the importing container actually imports a proxy to that service, which load balances requests across all containers exporting it.

You can have a look at the fabric examples, which include a Camel + DOSGi example: https://github.com/fusesource/fuse/tree/master/fabric/fabric-examples/fabric-camel-dosgi
You may also want to have a look at the integration test that sets up containers for running this example: https://github.com/fusesource/fuse/blob/master/fabric/fabric-itests/fabric-pax-exam/src/test/java/org/fusesource/fabric/itests/paxexam/FabricDosgiCamelTest.java



--
Ioannis Canellos

Twitter: iocanel


Ioannis Canellos

Jun 8, 2012, 8:15:18 PM
to fusef...@googlegroups.com
Sorry for the terrible English. It's really late here and I don't know what I'm typing :-)

sudh...@gmail.com

Jun 9, 2012, 12:50:52 AM
to fusef...@googlegroups.com
Thanks Ioannis. It's clear to me now.

Your English is perfect, and thanks for taking the time to reply so late at night :)

ben.day...@gmail.com

Jul 8, 2012, 5:52:44 AM
to fusef...@googlegroups.com
Hi Ioannis,

I've been playing around with Fabric's DOSGi, with some success, but I don't seem to be able to get the load balancing behaviour you describe to work.

I've got three containers, as follows:

  • root - The root container, which contains a bundle that looks for a provider of test.api.IMyService
  • prov1 - A child container, which contains an implementation of test.api.IMyService
  • prov2 - A duplicate of the above.

The provider bundles in prov1 and prov2 are set to output a simple log message on invocation, but I only ever see one bundle being called when running the consumer. If I stop the provider bundle the consumer is using, the consumer then throws an IllegalStateException. Restarting the consumer bundle causes it to find the other provider, which is good, and calls are then made against the other provider.

So, a couple of questions really:
  • Would you expect to see the traffic balanced across both prov1 and prov2 in this scenario?
  • Should the current DOSGi implementation handle failover, i.e. if prov1 goes away, does prov2 step in to service the calls?
Cheers,

Ben

Ioannis Canellos

Jul 8, 2012, 10:16:49 AM
to fusef...@googlegroups.com
So, a couple of questions really:
  • Would you expect to see the traffic balanced across both prov1 and prov2 in this scenario?
  • Should the current DOSGi implementation handle failover, i.e. if prov1 goes away, does prov2 step in to service the calls?
Yes, I would expect both.

I'll investigate it more and log an issue, once I reproduce it.

ben.day...@gmail.com

Jul 8, 2012, 11:14:37 AM
to fusef...@googlegroups.com
Great thanks.

If you need anything more from me just shout.

sudh...@gmail.com

Jul 10, 2012, 7:13:21 PM
to fusef...@googlegroups.com

I tried the above scenario with the example DOSGi application. I had a root container and 2 child containers on the same host. I installed the distributed service on the 2 child containers and the camel consumer on the root container. I turned off one of the services (osgi:stop) and it failed over to the service running in my other container.

But instead of doing an osgi:stop, if I try shutting down the container itself, it does not work.

Ben Day

Jul 11, 2012, 3:12:20 AM
to fusef...@googlegroups.com
Hmm, interesting. I did try to use a Camel route on top of both a reference list and a single service reference in the root container, but still couldn't get it to load balance or fail over. I wonder if the issue is my configuration.

Any chance you could share your code?

Guillaume Nodet

Jul 11, 2012, 4:02:26 AM
to fusef...@googlegroups.com
The fabric DOSGi implementation does not really have any built-in failover or load balancing. The way it works (and I think that's kinda required by the spec) is that a proxy service is created (visible only to the importing bundle) for each matching remote service (the full filter syntax is supported).
So if you have multiple services exported, the client bundle will see several services. It's up to you to define a load-balancing / failover strategy, using Camel maybe, or a simple ServiceTracker, or a blueprint proxy.
Note that if you use blueprint, a failover proxy will be automatically created, as blueprint usually does.
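E.g. with a reference-list in blueprint, which tracks all matching services (the ids, class names, and interface name below are just placeholders):

```xml
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
  <!-- a live-updating list of every IMyService currently visible,
       including DOSGi proxies for remotely exported services -->
  <reference-list id="services"
                  interface="test.api.IMyService"
                  availability="optional"/>
  <bean id="consumer" class="my.app.Consumer">
    <argument ref="services"/>
  </bean>
</blueprint>
```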

For quick failover, there will certainly be a slight behavior difference between the service being cleanly unregistered (because the bundle has been stopped) and the process being abruptly halted. The services are registered in ZooKeeper with an ephemeral node, and when the bundle is stopped, the node is deleted. When the ZooKeeper connection is stopped (because the JVM halts), there is a delay after which the ephemeral node will be automatically deleted. During this delay, clients would still think the service is available...
--
------------------------
Guillaume Nodet
------------------------
Blog: http://gnodet.blogspot.com/
------------------------
FuseSource, Integration everywhere
http://fusesource.com

Ben Day

Jul 11, 2012, 5:56:43 AM
to fusef...@googlegroups.com
Thanks for that Guillaume, that explains where I got to.  

I ended up writing a consumer, in the root container, that used a blueprint reference list to get the DOSGi proxies. I then randomly selected a reference out of the list to do the actual work. This gave me load balancing and the ability to dynamically add and remove services. It also explains why I was seeing, under heavy load, calls blocking when I removed a service. I'm guessing these were cases where the client was getting a reference to the ephemeral node that was no longer backed. In this circumstance, is there any surefire check, other than trying to make a method call, to assess the validity of the reference?
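A stripped-down version of what I wrote looks like this (IMyService here is just a stand-in for our real interface, and the list would be injected from the blueprint reference-list):

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Stand-in for test.api.IMyService.
interface IMyService {
    String doWork(String input);
}

// Picks a random provider from the injected list for each call,
// giving naive load balancing across whichever DOSGi proxies are
// currently registered. The list shrinks/grows as containers come
// and go, so no extra bookkeeping is needed.
class RandomBalancingConsumer {
    private final List<IMyService> providers;

    RandomBalancingConsumer(List<IMyService> providers) {
        this.providers = providers;
    }

    String call(String input) {
        if (providers.isEmpty()) {
            throw new IllegalStateException("no providers available");
        }
        IMyService target = providers.get(
                ThreadLocalRandom.current().nextInt(providers.size()));
        return target.doWork(input);
    }
}
```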

The reason I was playing with this stuff is that I wanted to be able to load balance across a dynamic set of JAX-RS CXF endpoints. I did try the org.fusesource.fabric.cxf.FabricLoadBalancerFeature approach, but this didn't want to play ball with a JAX-RS client, only JAX-WS. The solution I landed on ended up being a bit convoluted.

  • Write a JAX-RS service that implements IMyService, and expose two instances, in prov1 and prov2, over DOSGi
  • Write another implementation of IMyService, but this implementation takes a reference list of implementations of IMyService, as above, and is exposed to the big wide world over HTTP
  • I then choose a random reference within the exposed IMyService implementation and use a JAX-RS proxy client to drive the service proper and return its response
  • If I need HA for the root container, I can create a duplicate and use Camel to load balance + fail over across the two exposed HTTP endpoints
Whilst this gives me the ability to spin up new service providers and automatically use them, load balance across them, and remove them if required, it seems like a complicated solution. I'm introducing an additional two layers, containers and new bundles, just to achieve this. Is there a better solution to what I'm trying to achieve?

Apologies for the rather wordy reply, I'm just keen to get to the optimal solution early on in our project.

Cheers,

Ben

Guillaume Nodet

Jul 11, 2012, 6:09:51 AM
to fusef...@googlegroups.com
On Wed, Jul 11, 2012 at 11:56 AM, Ben Day <ben.day...@gmail.com> wrote:
Thanks for that Guillaume, that explains where I got to.  

I ended up writing a consumer, in the root container, that used a blueprint reference list to get the DOSGi proxies. I then randomly selected a reference out of the list to do the actual work. This gave me load balancing and the ability to dynamically add and remove services. It also explains why I was seeing, under heavy load, calls blocking when I removed a service. I'm guessing these were cases where the client was getting a reference to the ephemeral node that was no longer backed. In this circumstance, is there any surefire check, other than trying to make a method call, to assess the validity of the reference?

I don't see any good way other than actually trying to call the remote service. If you add a simple ping() method to your service interface, which actually does nothing, and try to call it first, you could minimize the chances of calling a dead service. But that's always the problem with remoting, as there will always be a certain time window between the server being killed and the client knowing about it, in addition to the fact that remoting can always fail in the middle of a call. You need to be able to cope with such problems in the calling code anyway; else, you need to use asynchronous messaging instead.
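Something along these lines, with illustrative names (the ping() probe narrows the window but can never close it, so the catch around the real call still matters):

```java
import java.util.List;

// Hypothetical remote interface; ping() does nothing server-side and
// exists only so the client can cheaply probe liveness.
interface IMyService {
    void ping();
    String doWork(String input);
}

// Tries each candidate proxy in turn: a successful ping() selects it,
// a failure (stale ephemeral node, connection refused, ...) moves on
// to the next one.
class PingingFailover {
    static String callFirstAlive(List<IMyService> candidates, String input) {
        for (IMyService s : candidates) {
            try {
                s.ping();             // cheap probe; can still race with a crash
                return s.doWork(input);
            } catch (RuntimeException dead) {
                // provider vanished between lookup and call; try the next one
            }
        }
        throw new IllegalStateException("no live providers");
    }
}
```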
 

The reason I was playing with this stuff is that I wanted to be able to load balance across a dynamic set of JAX-RS CXF endpoints. I did try the org.fusesource.fabric.cxf.FabricLoadBalancerFeature approach, but this didn't want to play ball with a JAX-RS client, only JAX-WS. The solution I landed on ended up being a bit convoluted.

  • Write a JAX-RS service that implements IMyService, and expose two instances, in prov1 and prov2, over DOSGi
  • Write another implementation of IMyService, but this implementation takes a reference list of implementations of IMyService, as above, and is exposed to the big wide world over HTTP
  • I then choose a random reference within the exposed IMyService implementation and use a JAX-RS proxy client to drive the service proper and return its response
  • If I need HA for the root container, I can create a duplicate and use Camel to load balance + fail over across the two exposed HTTP endpoints
Whilst this gives me the ability to spin up new service providers and automatically use them, load balance across them, and remove them if required, it seems like a complicated solution. I'm introducing an additional two layers, containers and new bundles, just to achieve this. Is there a better solution to what I'm trying to achieve?

Apologies for the rather wordy reply, I'm just keen to get this optimal solution early on in our project.

That looks good to me given the current state, though I guess Fabric could be enhanced to provide some of the bits you've done in a generic way, such as JAX-RS support in cxf/fabric or a generic proxy mechanism for DOSGi (the underlying implementation will always register multiple services, but we could still expose some helper classes to create a load-balancing / failover proxy easily).
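For instance, a helper along these lines, using the JDK's dynamic proxies (just a sketch, not an actual Fabric API; IMyService is illustrative):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;

// Illustrative single-method service interface.
interface IMyService {
    String doWork(String input);
}

// Wraps several service instances behind one dynamic proxy that
// fails over to the next delegate whenever a call throws.
class FailoverProxy {
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> iface, List<T> delegates) {
        InvocationHandler handler = (proxy, method, args) -> {
            Throwable last = null;
            for (T delegate : delegates) {
                try {
                    return method.invoke(delegate, args);
                } catch (InvocationTargetException e) {
                    last = e.getCause();   // this delegate failed; try the next
                }
            }
            throw last != null ? last : new IllegalStateException("no delegates");
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(),
                new Class<?>[] { iface }, handler);
    }
}
```

The caller then holds a single IMyService and never sees the multiple underlying registrations, which is roughly what the DOSGi helper would hide.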

Ben Day

Jul 11, 2012, 6:27:29 AM
to fusef...@googlegroups.com
OK cool, at least I'm on the right track.

It would be great to see the enhancements you mentioned included, particularly the inclusion of JAX-RS in fabric cxf. Given the increasing popularity of RESTful APIs, I could see it becoming a pretty common requirement, if it's not one already.

Great product by the way, and thanks for your help on this, it's been really useful.