Wrapping the bus and responding to connection failures


shadow...@gmail.com

May 18, 2016, 10:27:03
to masstransit-discuss
Hi masstransit-discuss,
I've been tasked with creating a sort of wrapper for MassTransit (BusRollCage) which can monitor the bus for "RabbitMqConnectionException >Inner> BrokerUnreachableException"s, and then essentially stop the bus (put the brakes on) and change the host (change the bus driver). Essentially, if I receive a connection failed event, I want the bus to stop and the monitor to create a new bus instance. This event can happen at startup or after it's already been running for a while.

We have an HA cluster (on Windows) and we haven't been able to find a working solution for load balancing or proxying, and when it comes down to it, having the producers and consumers be aware of all the cluster nodes and just pick a new one on error doesn't seem like such a bad idea.

Anyway, I started implementing this for the consumer side, but I can't seem to get the connection exception bubbled up to my code during startup. I looked through the source and I see there is a ConnectBusObserver on BusBuilder, but no way to actually call it from the factory (Bus.Factory.CreateUsingRabbitMq(..)). BusObserver has the StartFault method I need to tie into. [This use case is if the cluster is degraded when I try to start the bus. I want the bus to stop, change hosts, and try again.]

Another hurdle is the IReceiveEndpointObserver's Fault method, which will keep being called each time MassTransit attempts to reconnect to the dead host (404, bus driver not found), even after I call bus.Stop(). I think I can solve this one with a ResetEvent array, and just ignore the faults if I already got one for the particular host.
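
To illustrate the "ignore repeat faults for a host" idea, here is a minimal sketch. It swaps the ResetEvent array for a ConcurrentDictionary keyed by host name, and HostFaultGate, TryMarkFaulted, and Reset are hypothetical names, not MassTransit APIs. An observer's Fault handler could call TryMarkFaulted and only act when it returns true.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper (not part of MassTransit): remembers which hosts
// have already reported a fault so repeated faults can be ignored.
public sealed class HostFaultGate
{
    private readonly ConcurrentDictionary<string, bool> _faulted =
        new ConcurrentDictionary<string, bool>(StringComparer.OrdinalIgnoreCase);

    // Returns true only the first time a fault is reported for the host;
    // later faults for the same host return false until Reset is called.
    public bool TryMarkFaulted(string host)
    {
        return _faulted.TryAdd(host, true);
    }

    // Clears the faulted flag, for example after the host is reachable again.
    public void Reset(string host)
    {
        bool ignored;
        _faulted.TryRemove(host, out ignored);
    }
}
```

TryAdd is atomic, so if several reconnect attempts fault at the same time, only one caller sees true for a given host.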

The short, short version: is there any way to attach a custom IBusObserver to the bus so I can respond to start/stop fault events?

-ShadowFoxish

Chris Patterson

May 18, 2016, 11:07:22
to masstrans...@googlegroups.com
So, the thing about this is that it really should happen at the RabbitMqHost level in the code. What I've thought about is being able to rotate through a host configuration (actually configurations, since there would be multiple hosts), and if a host becomes unavailable, reconnect to the next host in the list. This way, it would happen under the covers and be just part of how the host configuration works.

With a mix of strategies, such as first available, round robin, etc. to allow connections to be shared across nodes.

The thing is, there would need to be a virtual host name "my-cluster", so that the message addresses would know which host to use, and then the host would have multiple node names ("my-node1", "my-node2", etc.).

I'm not sure I've spelled this out well, but I'd really like this to be built into MT at the RabbitMqHost level, so that when a connection is attempted, the underlying host name is masked by a logical host name that the host maps to physical machines.

Like:

x.LogicalHost("my-cluster", h => 
{
    h.Node("my-node1");
    h.Node("my-node2");
});

Then, the URI would be:

rabbitmq://my-cluster/my-queue

And underneath, it would map to the connection for that logical host.

Thoughts?

I'd be happy to help on a branch if you're already getting started.


--
You received this message because you are subscribed to the Google Groups "masstransit-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to masstransit-dis...@googlegroups.com.
To post to this group, send email to masstrans...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/masstransit-discuss/4a50e055-7e52-48d2-8a87-24c8a56f2a1a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

shadow...@gmail.com

May 18, 2016, 12:41:06
to masstransit-discuss
I think that's a great idea. I also discussed with my boss and we agreed that this solution would benefit everyone as a contribution to the project.

I'll change my approach and see if I can make something work. No branch yet, but when I get somewhere I'll be sure to loop you in.

shadow...@gmail.com

May 19, 2016, 16:43:49
to masstransit-discuss
So I started down this path playing with a variety of configurators; I wonder if we can simplify the host settings. Ultimately, it's the RabbitMqHost object that needs to know about the cluster nodes, so why not add that to the already existing RabbitMqHostConfigurator?

Making a separate cfg.LogicalHost(..) method with node names seems like a lot of extra infrastructure and classes. What do you think if we did something like this instead:

var host = cfg.Host(new Uri("rabbitmq://my-cluster"), h =>
{
   h.Username("user");
   h.Password("pass");
   h.ClusterNode("node1HostName");
   h.ClusterNode("node2HostName");
   h.ClusterNode(...)...
});

This way, we don't need to specify the logical host name (we can figure it out at run-time by dismantling the Uri). It side-steps the awkward ordering problem of needing to specify a logical host before specifying a regular host and the issue of having multiple hosts and then trying to match cluster nodes to hosts after the fact.

Neither approach introduces a breaking change to users, but I feel like this way might be cleaner.
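
As a small illustration of "dismantling the Uri": the logical host name is just the host portion of the address. This is only a sketch; ClusterAddress and LogicalHostName are hypothetical names, and only System.Uri is a real API here.

```csharp
using System;

// Hypothetical helper (not part of MassTransit): reads the logical
// cluster name from a rabbitmq:// address using System.Uri.
public static class ClusterAddress
{
    // For "rabbitmq://my-cluster/my-queue" the host portion is "my-cluster".
    public static string LogicalHostName(Uri address)
    {
        if (address == null)
            throw new ArgumentNullException(nameof(address));

        return address.Host;
    }
}
```

Because the name is read from the address itself, the configuration would not need a separate logical-host declaration.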

Chris Patterson

May 19, 2016, 17:13:41
to masstrans...@googlegroups.com
I see where you are going, but I think that making an entry point to a new nested closure to configure the host as a cluster, versus a single node, would help with clarity. I've tried to make it so that methods within an interface are always relevant. Having ClusterNode() on a regular host configurator would make it unclear if it is required or not. Perhaps...

var host = cfg.Host(new Uri("rabbitmq://my-cluster"), h =>
{
  h.Username(...);
  h.Password(...);
  h.Cluster(c =>
  {
    c.Node("node1");
    c.Node("node2");
    c.Node("node3", n => n.SomethingForThisNode(...));
  });
});

That way, opting into a cluster makes it possible to encapsulate the cluster configuration inside the Cluster nested closure, eliminating any confusion on what is required.


shadow...@gmail.com

May 20, 2016, 15:06:22
to masstransit-discuss
I have a decent start at this, but I wanted to check in implementation-wise (before I go too far down a rabbit hole, har har).

var host1 = cfg.Host(new Uri("rabbitmq://mycluster"), h =>
{
    h.Username("user");
    h.Password("pass");
    h.UseCluster(clst =>
    {
        clst.Node("node1");
        clst.Node("node2");
        clst.Node("node3");
    });
});
I made the closure like we talked about; named it "UseCluster" to stay consistent with the "UseSsl" one.
I store an array of node host names in the RabbitMqHostSettings object. I modified the GetConnectionFactory method to include the cluster members and set a HostnameSelector instance on the rabbit ConnectionFactory object. This gets pushed down and used by the RabbitMqConnectionCache, specifically the SendUsingNewConnection method. That's mostly the bulk of this, but there are some exceptions which still bubble up out of the Connection object, and logging might need to be revised so that when it outputs the Uri being connected to, it reflects the correct host name (instead of the 'mycluster' bit).

The selection method, just to start, uses a sequential ordering. That part was actually pretty easy, and it should be easy to implement other selection schemes.
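
For illustration, sequential selection over the configured node names could look roughly like this. It is a standalone sketch, not the code from the branch: SequentialNodeSelector and NextFrom are hypothetical names, and the class does not implement any RabbitMQ client interface.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of sequential node selection: each call returns
// the next node in the list and wraps around at the end.
public sealed class SequentialNodeSelector
{
    private readonly object _sync = new object();
    private int _last = -1;

    public string NextFrom(IList<string> nodes)
    {
        if (nodes == null || nodes.Count == 0)
            throw new ArgumentException("At least one node is required.", nameof(nodes));

        lock (_sync)
        {
            // Advance to the next index, wrapping back to the first node.
            _last = (_last + 1) % nodes.Count;
            return nodes[_last];
        }
    }
}
```

Another scheme, such as picking a random node, would only need to change the body of NextFrom, which is why alternative strategies should be cheap to add.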

I'm happy to share the changes so far, but I don't have the appropriate rights on GitHub to push up a branch. (shadowfoxish@github)

Chris Patterson

May 20, 2016, 20:32:25
to masstrans...@googlegroups.com
It sounds like a great start, I can't wait to take a look at it.

As far as pushing up to a branch: if you fork the repository and add your own fork of MassTransit to the remotes in your local git repository, you should be able to push to a branch within your own repo and then open a pull request. Then I can review and monitor it, and we can discuss it there. That way we can get this landed; it will be pretty awesome.

__
Chris Patterson


shadow...@gmail.com

May 21, 2016, 09:08:16
to masstransit-discuss
Alright, cool. I was wondering how that worked!

shadow...@gmail.com

August 12, 2016, 16:20:26
to masstransit-discuss
Hi Phatboyg, I have some good news!

A while back, when I helped implement this feature (https://github.com/MassTransit/MassTransit/issues/563) I was doing some local testing with dropping connections on the server / clients using TCPViewer and seeing what MassTransit/RabbitMQ driver was doing. I was seeing a bizarre situation where if you closed a connection, a new connection would be opened to a new server (presumably by the RMQ driver), and then MT would roll to the next one and create another connection. Each time a connection would close, the rabbit driver seemed to create a whole bunch more connections. At that time, I was able to get into a state where I had lots and lots of excess connections to the same server from the same client and they would persist, and each time a connection failure happened more would be created.

Thankfully, I am not seeing that case any more with the current version of MT and its dependencies. The clustering looks good. There can be duplicate message processing, but that is the nature of the beast. Thanks a bunch for helping to make this feature a reality!

-ShadowFoxish

Chris Patterson

August 12, 2016, 16:30:50
to masstrans...@googlegroups.com
Must have been a fix in the RMQ driver, glad to hear it's behaving more appropriately!

I plan to drop a new release of MT soon, just finishing up a few other pull requests and outstanding issues.

Regards,

Chris

