SCTP multi-homing examples?


Tim Boudreau

Mar 23, 2015, 1:35:01 PM
to ne...@googlegroups.com
Hi, folks,

I wrote a little framework to make it easy to write servers/clients/protocols using Netty's SCTP support:

I am trying to get multi-homing working with it, using a simple test setup of three machines running the same code. The primary host/port pair is passed to bootstrap.remoteAddress(), and the others are passed in the channel initializer to SctpChannel.bindAddress() before the call to connect(). I then send a little data to verify the connection is up and stop the process on the primary host, at which point my expectation is that one of the secondary addresses will receive the data.

What happens is that the closeFuture() of the channel obtained from the connect future completes when I take the primary process down; data sent on the channel afterwards is not received by the other two endpoints. And even when one of the remote hosts is localhost (which, if I understand correctly, SCTP should prefer based on route length), data is routed to the primary address.

I'm not sure if I'm doing something wrong here, or this is expected behavior (but not the behavior I would prefer), or what.

It's also not clear whether I should iterate the peer addresses and call bootstrap.connect() with each of them, or whether SctpChannel.bindAddress() is supposed to cover that.

I'd expect that this, on a multi-homed association, would show more than one remote IP (should be 7.3.1.5 and 7.3.1.6 in this example):

cat /proc/net/sctp/assocs 
 ASSOC     SOCK   STY SST ST HBKT ASSOC-ID TX_QUEUE RX_QUEUE UID INODE LPORT RPORT LADDRS <-> RADDRS HBINT INS OUTS MAXRT T1X T2X RTXC wmema wmemq sndbuf rcvbuf
ffff880049327000 ffff8800093c8e40 2   1   3  26132  553        0        0    1000 4807380 46105  6701  7.3.2.3 7.3.2.3 7.3.2.3 7.3.2.3 7.3.2.3 7.3.2.3 7.3.2.3 7.3.2.3 7.3.2.3 7.3.2.3 7.3.2.3 <-> *7.3.1.6   30000    10    10   10    0    0        0        1        0    65535    65535
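
A quick way to check this from a test is to parse the line and count the distinct remote addresses. The following is only a sketch: the class name SctpAssocLine is made up for illustration, and it assumes the layout shown above (addresses on either side of a "<->" token, with "*" marking the primary), which may differ between kernel versions.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Illustrative parser for one data line of /proc/net/sctp/assocs (layout assumed from the output above). */
final class SctpAssocLine {
    final List<String> localAddrs = new ArrayList<>();
    final List<String> remoteAddrs = new ArrayList<>();
    String primaryRemote; // the remote address marked with '*', if any

    // IPv4 addresses contain '.', IPv6 addresses contain ':'; the numeric columns contain neither.
    private static boolean looksLikeAddress(String tok) {
        return tok.indexOf('.') >= 0 || tok.indexOf(':') >= 0;
    }

    private static String stripMarker(String tok) {
        return tok.startsWith("*") ? tok.substring(1) : tok;
    }

    static SctpAssocLine parse(String line) {
        String[] tok = line.trim().split("\\s+");
        int sep = -1;
        for (int i = 0; i < tok.length; i++) {
            if (tok[i].equals("<->")) { sep = i; break; }
        }
        if (sep < 0) {
            throw new IllegalArgumentException("no '<->' separator in: " + line);
        }
        SctpAssocLine result = new SctpAssocLine();
        // Local addresses: walk backwards from the separator until a non-address column (RPORT) is hit.
        for (int i = sep - 1; i >= 0 && looksLikeAddress(tok[i]); i--) {
            result.localAddrs.add(0, stripMarker(tok[i]));
        }
        // Remote addresses: walk forwards until the first numeric column (HBINT).
        for (int i = sep + 1; i < tok.length && looksLikeAddress(tok[i]); i++) {
            if (tok[i].startsWith("*")) {
                result.primaryRemote = stripMarker(tok[i]);
            }
            result.remoteAddrs.add(stripMarker(tok[i]));
        }
        return result;
    }

    /** True when the peer is known under more than one distinct address. */
    boolean isMultiHomedPeer() {
        return new LinkedHashSet<>(remoteAddrs).size() > 1;
    }

    public static void main(String[] args) {
        String sample = "ffff880049327000 ffff8800093c8e40 2 1 3 26132 553 0 0 1000 4807380 46105 6701 "
                + "7.3.2.3 7.3.2.3 <-> *7.3.1.6 30000 10 10 10 0 0 0 1 0 65535 65535";
        SctpAssocLine assoc = parse(sample);
        System.out.println("remote addresses: " + assoc.remoteAddrs);
        System.out.println("multi-homed peer: " + assoc.isMultiHomedPeer());
    }
}
```

For the line above this reports a single remote address, which matches the problem described: the association only knows about 7.3.1.6.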

Am I misunderstanding what this should be able to do, or calling bindAddress() too late (the docs say it must be called before the channel is connected, which I think calling it inside a ChannelInitializer would satisfy), or something else?

Any help appreciated,

Tim

이희승 (Trustin Lee)

Mar 23, 2015, 10:34:13 PM
to ne...@googlegroups.com, Jestan Nirojan
CC'd Jestan who is probably the guy who can answer your question. What do you think Jestan?
 
--
You received this message because you are subscribed to the Google Groups "Netty discussions" group.
To unsubscribe from this group and stop receiving emails from it, send an email to netty+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
 

Tim Boudreau

Mar 23, 2015, 11:52:24 PM
to ne...@googlegroups.com, Jestan Nirojan
FYI, there is a "multihome" branch with a test that I just added that fails, to make the behavior clear:


Thanks for passing this along, Trustin,

-Tim


al...@alanmurphy.org

Apr 9, 2015, 10:05:12 PM
to ne...@googlegroups.com, jestan...@gmail.com
Hi Tim,

We encountered the same problem. The issue seems to be that you need to do the following for multihomed addresses:
- bind the primary IP address
- call SctpChannel.bindAddress() on each of the secondary IPs
- connect.

If you use Bootstrap to bind, it will create an SctpChannel and bind the primary IP. However, if you then use the same Bootstrap to connect(), it creates a new SctpChannel completely without the bound information from before.

Since Netty 4 combines the bound and connected events into a single channelActive notification on a handler, I couldn't see a way of inserting a custom handler to call SctpChannel.bindAddress() after the primary IP is bound but before the channel is connected.

We used Bootstrap to simply bind the channel. Once bound, we use the ChannelFuture to extract the SctpChannel and call bindAddress on each of the secondary IPs, then directly call SctpChannel.connect(destination) on the channel itself. Multihoming and failover works then. Not perfect, but it basically does the job.
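
For comparison, the same bind / bindAddress / connect ordering can be sketched directly against the JDK's com.sun.nio.sctp.SctpChannel, which is what Netty's NioSctpChannel delegates to (see the stack trace further down the thread). This is a sketch rather than the code we run: the class name MultiHomedConnect and the up-front wildcard check are illustrative, and actually connecting needs a platform with SCTP support (e.g. Linux with lksctp).

```java
import com.sun.nio.sctp.SctpChannel;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.Collections;
import java.util.List;

/** Illustrative helper: one channel, several local addresses, one connect. */
final class MultiHomedConnect {

    static SctpChannel open(InetSocketAddress primaryLocal,
                            List<InetAddress> secondaryLocals,
                            InetSocketAddress remote) throws IOException {
        // bindAddress() does not accept the wildcard address, so reject it before opening a socket.
        for (InetAddress a : secondaryLocals) {
            if (a.isAnyLocalAddress()) {
                throw new IllegalArgumentException("secondary address must not be the wildcard: " + a);
            }
        }
        SctpChannel ch = SctpChannel.open();
        try {
            ch.bind(primaryLocal);               // 1. bind the primary IP (this fixes the local port)
            for (InetAddress a : secondaryLocals) {
                ch.bindAddress(a);               // 2. add each secondary IP to the same socket
            }
            ch.connect(remote);                  // 3. connect the channel that carries all the bindings
            return ch;
        } catch (IOException | RuntimeException e) {
            ch.close();
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: MultiHomedConnect <primaryLocalIp> <secondaryLocalIp> <remoteIp> <remotePort>");
            return;
        }
        InetSocketAddress primary = new InetSocketAddress(InetAddress.getByName(args[0]), 0);
        List<InetAddress> secondaries = Collections.singletonList(InetAddress.getByName(args[1]));
        InetSocketAddress remote =
                new InetSocketAddress(InetAddress.getByName(args[2]), Integer.parseInt(args[3]));
        try (SctpChannel ch = open(primary, secondaries, remote)) {
            System.out.println("local:  " + ch.getAllLocalAddresses());
            System.out.println("remote: " + ch.getRemoteAddresses());
        }
    }
}
```

The important point is the same as in the Netty case: all three steps have to happen on the same channel object.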

You can check with tcpdump/wireshark that the SCTP INIT contains the correctly bound addresses on the way to the destination. I advise using sctp_darn as both client and server to check that your local setup works before testing this; I had some issues trying to make it work on a local machine that wasted lots of time.

If there's a better way to get this working using more 'idiomatic' Netty, I'd love to hear it.

Hope this helps,

Alan.

이희승 (Trustin Lee)

Apr 9, 2015, 10:49:35 PM
to ne...@googlegroups.com
Hi Tim and Alan,
 
Do you think it's a good idea to introduce a dedicated Bootstrap implementation for SCTP channels to cover this case? (e.g. SctpBootstrap, SctpServerBootstrap)  Let us know what you think.
 
Thanks,
T

Tim Boudreau

Apr 9, 2015, 11:31:23 PM
to ne...@googlegroups.com
Yes.  Nobody is going to figure that out without a lot of pain, and it's pretty awkward.

Alan, I think I tracked your instructions, but it'd be lovely to see working sample code.

-Tim



al...@alanmurphy.org

Apr 10, 2015, 4:49:29 AM
to ne...@googlegroups.com
Hi Trustin, Tim,

@Trustin: I think there may be a case for a specific SCTP Bootstrap that allows specifying additional multihomed IPs. However, there may be a more general case in Netty for a callback or method to perform some custom actions when the channel is bound, which would also cover SCTP. It's up to you guys really to see if it fits in generally. It may be considered a retrograde step since those events (bound, connected, etc) were merged into channelActive from Netty 3 -> Netty 4.

@Tim: The application we're writing is in Clojure and I'm using the Java interop to invoke Netty and use it as the underlying transport mechanism. Works great, but the code isn't in the same structure as standard Java. Not sure if you're familiar with Clojure, but I'm using the core.async method of turning callbacks into more sequential code, which is fine for the startup of the channels.

Basically, in Java, it's something like the following:

ChannelFuture cf = new Bootstrap().group(..).option(..).handler(..).bind(PRIMARY_IP).addListener(CUSTOM_LISTENER);

In the CUSTOM_LISTENER, if success, get the underlying SctpChannel from the future and call: sctpChannel.bindAddress(SECONDARY_IP).addListener(CUSTOM_LISTENER)

Keep going through the listener to bind as many secondary IPs as you have.

Once all secondary IPs are successfully bound, the last step should be to call connect on the channel: sctpChannel.connect(DEST).

The ChannelInitializer set up as part of the Bootstrap will then be invoked and your pipeline set up as normal.

Be aware that we only really have this up and running in the last day or so. It seems to work, but there may yet be issues. We have yet to look on the server side for reporting the multihomed IP addresses back to connecting clients, but will probably do this soon and I'll share what we learn.

Hope it helps anyway.

Alan.

Jestan Nirojan

May 9, 2015, 2:44:00 PM
to ne...@googlegroups.com, al...@alanmurphy.org
Hi all,

Sorry for my very late reply. 

I think the lack of examples for the SCTP transport is the main issue here.
As Alan explained, setting up a multi-homing channel looks like the following (I have skipped error handling to keep it simple).


Bootstrap b = new Bootstrap();

// set up the event loop
.............

// step 1: bind the primary local address
ChannelFuture f1 = b.bind(primaryInetAddress, port).sync();

// step 2: get the underlying SCTP channel
SctpChannel channel = (SctpChannel) f1.channel();

// step 3: add the secondary local address to the same channel
ChannelFuture f2 = channel.bindAddress(secondaryInetAddress).sync();

// step 4: connect the same channel to the remote peer; b.connect() would
// create a new channel without the bindings (remoteSocketAddress is a placeholder name)
ChannelFuture f3 = channel.connect(remoteSocketAddress).sync();


The only ugly thing here is step 2, casting the Channel to SctpChannel.

The reasons for having bindAddress() in the channel and not having a separate bootstrap were:

1) To support Dynamic Address Reconfiguration in Netty's SctpChannel (the .bindAddress() and .unbindAddress() methods), which happens later in the channel's life and should be possible without a reference to the bootstrap object.

    (Dynamic Address Reconfiguration is not enabled by default in lksctp; you have to enable it using the addip_enable parameter. See http://linux.die.net/man/7/sctp)

2) Netty has several transports but doesn't have transport-specific bootstraps, and I wanted to keep that convention :)

Having a separate bootstrap implementation will be a better solution. 
I will send a pull request for documentations/examples update.


best regards,
-Jestan Nirojan

Jestan Nirojan

May 10, 2015, 7:11:18 AM
to ne...@googlegroups.com, al...@alanmurphy.org

It looks like the bind() and connect() methods in the client bootstrap do not reuse the channel, and do not allow setting a specific local port in a multi-homing scenario. The simple scenario works.

To generalize all the scenarios, we can use the bootstrap to only bind the channel and connect later using the channel (from bind future).

I have sent a pull request https://github.com/netty/netty/pull/3769 
and here are the properly working examples

best regards,
-Jestan Nirojan

al...@alanmurphy.org

May 15, 2015, 12:13:36 PM
to ne...@googlegroups.com, al...@alanmurphy.org
Hi Jestan,

Great! The examples should make it clear to others in future the process to ensure multihoming is catered for.

As an update, both client and server tests with multihoming on our project work perfectly.

Keep up the good work,

Alan.

Tim Boudreau

Sep 3, 2015, 3:55:33 AM
to Netty discussions, al...@alanmurphy.org
I'm attempting to wrestle this into submission, and still not having a lot of luck.

Doing essentially the same thing as Jestan's example, I get the AlreadyBoundException below. Specifically, I call bind(new InetSocketAddress("localhost", 0)) for the local socket, then iterate the association addresses and call channel.bindAddress() for each host, then call bootstrap.remoteAddress(primary), followed by bootstrap.connect().

This is all a little baffling - particularly the fact that the API doesn't entirely make sense (or maybe I'm seriously misusing it).  At the moment I'm just writing a test which will start two servers, and then start a client that will have an association with both servers' addresses.  For that, I'm iterating the NICs, and wind up with an association with my network address + 127.0.0.1 for the client.  In the test, I send data across the connection, and verify that it's been received - then shut down the primary server, and test that data is still received by the secondary one - i.e. failover using SCTP works.

But what's mysterious is that channel.bindAddress() does not take a port - so if I wanted an association that contained two hostnames and ports, it's not clear that that's possible.  I work around it by using the loopback interface plus the machine's NIC's addresses in my test - and I suppose you could just run a server on multiple hosts but the same port - but that seems like a bizarre restriction.

Maybe what I'm looking for is SctpMultiChannel (which has no analogue in Netty)?  Or maybe it's a misuse of SCTP to try to do that?

At any rate, I'd settle for knowing why bindAddress() is failing for now.

java.nio.channels.AlreadyBoundException
at sun.nio.ch.sctp.SctpNet.throwAlreadyBoundException(SctpNet.java:59)
at sun.nio.ch.sctp.SctpChannelImpl.bindUnbindAddress(SctpChannelImpl.java:243)
at sun.nio.ch.sctp.SctpChannelImpl.bindAddress(SctpChannelImpl.java:209)
at io.netty.channel.sctp.nio.NioSctpChannel.bindAddress(NioSctpChannel.java:352)
at io.netty.channel.sctp.nio.NioSctpChannel$1.run(NioSctpChannel.java:361)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:343)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$3.run(SingleThreadEventExecutor.java:131)
at io.netty.util.internal.chmv8.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1412)
at io.netty.util.internal.chmv8.ForkJoinTask.doExec(ForkJoinTask.java:280)
at io.netty.util.internal.chmv8.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:877)
at io.netty.util.internal.chmv8.ForkJoinPool.scan(ForkJoinPool.java:1706)
at io.netty.util.internal.chmv8.ForkJoinPool.runWorker(ForkJoinPool.java:1661)
at io.netty.util.internal.chmv8.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:126)


-Tim

Tim Boudreau

Sep 3, 2015, 4:51:29 AM
to ne...@googlegroups.com, al...@alanmurphy.org

Tim Boudreau

Sep 3, 2015, 1:30:40 PM
to Netty discussions, al...@alanmurphy.org
To clarify, what I'm trying to do in this case (tests, but also a demo at a conference soon - I hope) is to have a one-client/many-servers scenario.

SCTP allows for this:

but bindAddress() seems to be for binding server sockets (and the lack of a port parameter makes this obviously wrong for this case), and I'm looking for a Java equivalent of sctp_connectx() - connect to multiple servers as a single association.

So, SCTP can do it, but it's not clear how to do it with either Netty or the JDK's raw NIO support - unless it means just issuing a whole bunch of calls to connect()?

Any ideas?

-Tim

Jestan Nirojan

Sep 10, 2015, 9:08:06 AM
to Netty discussions
Hi Tim,

Multi homing is used to provide redundant routes between server/client and only one association will be active at a time.
When you are creating a multi homing association, you have to add the secondary local address to the channel via channel.bindAddress(..).

For example, if you are using loopback + eth1 and you use loopback for channel.bind(), you can only use eth1 for channel.bindAddress(..).
Calling channel.bindAddress() on loopback as well will result in an AlreadyBoundException.
The reason no port parameter used in channel.bindAddress(..) is "it is a method call to add the secondary address and it will share the port parameter given at connect()"
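
One way to avoid the exception when iterating the machine's interfaces is to filter the candidate list before calling bindAddress(). The sketch below is only an illustration: the class and method names are invented, and it uses plain java.net.NetworkInterface enumeration rather than anything SCTP-specific.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative helper for picking the addresses that are safe to pass to bindAddress(). */
final class SecondaryAddresses {

    /** Returns the candidates minus the primary, wildcard addresses and duplicates, keeping their order. */
    static List<InetAddress> select(InetAddress primary, Iterable<InetAddress> candidates) {
        Set<InetAddress> result = new LinkedHashSet<>();
        for (InetAddress a : candidates) {
            if (a.equals(primary) || a.isAnyLocalAddress()) {
                continue; // already bound via bind(), or not a concrete address
            }
            result.add(a);
        }
        return new ArrayList<>(result);
    }

    /** Collects the addresses of all interfaces that are currently up. */
    static List<InetAddress> allLocal() throws SocketException {
        List<InetAddress> all = new ArrayList<>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return all;
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            if (nic.isUp()) {
                all.addAll(Collections.list(nic.getInetAddresses()));
            }
        }
        return all;
    }

    public static void main(String[] args) throws SocketException {
        InetAddress primary = InetAddress.getLoopbackAddress(); // the address given to bind()
        System.out.println("primary:     " + primary);
        System.out.println("secondaries: " + select(primary, allLocal()));
    }
}
```

Each address in the returned list is then passed to bindAddress() exactly once, after bind() and before connect().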

(Side note: I would avoid using loopback for multi homing testing, better use a machine with two network cards, so you would need two machines with dual NIC)

If you are trying to create multiple associations via a single channel, you have to use SctpMultiChannel (using JDK NIO; netty-sctp support is not available at the moment).

I am not quite sure that I have understood your use case. Could you please explain a bit?


thanks and regards,
-Jestan Nirojan

Tim Boudreau

Sep 10, 2015, 10:43:07 AM
to ne...@googlegroups.com
> Multi homing is used to provide redundant routes between server/client and only one association will be active at a time.
> When you are creating a multi homing association, you have to add the secondary local address to the channel via channel.bindAddress(..).
>
> For example, if you are using loopback + eth1 and you use loopback for channel.bind(), you can only use eth1 for channel.bindAddress(..).

Is this useful for associating client -> server1, server2, server3? Or just for binding server sockets (not what I'm after)?

> Calling channel.bindAddress() on loopback will result in an AlreadyBoundException.
> The reason no port parameter used in channel.bindAddress(..) is "it is a method call to add the secondary address and it will share the port parameter given at connect()"

It's never been entirely clear how this is different from binding 0.0.0.0.

> (Side note: I would avoid using loopback for multi homing testing, better use a machine with two network cards, so you would need two machines with dual NIC)

Understood.

> If you are trying to create multiple associations via a single channel, you have to use SctpMultiChannel (using JDK NIO; netty-sctp support is not available at the moment).
>
> I am not quite sure that I have understood your use case. Could you please explain a bit?

Well, thing one is just a unit (maybe more like integration) test.

A couple of more practical use cases:
 - Connecting to multiple redundant metrics servers - think the same use cases as statsd - lots of tiny packets each of which describes a single high-frequency "event" and some data about it - this is traditionally done with fire-and-forget UDP packets
 - Connecting to multiple redundant logging servers (consistency doesn't matter, just collect the logs and cat | sort or do something else with them later)

In both of these cases, the client doesn't care which server it hits, it simply should be the closest one and fail over if that fails.

Viable use case or not?

-Tim

Jestan Nirojan

Sep 10, 2015, 2:42:10 PM
to Netty discussions
Hi Tim


On Thursday, September 10, 2015 at 8:13:07 PM UTC+5:30, Tim Boudreau wrote:
> > Multi homing is used to provide redundant routes between server/client and only one association will be active at a time.
> > When you are creating a multi homing association, you have to add the secondary local address to the channel via channel.bindAddress(..).
> >
> > For example, if you are using loopback + eth1 and you use loopback for channel.bind(), you can only use eth1 for channel.bindAddress(..).
>
> Is this useful for associating client -> server1, server2, server3? Or just for binding server sockets (not what I'm after)?
 
Multi-homing is useful when your setup is like client1 (eth0, eth1) --> server1 (eth0, eth1).
One multi-homing association is like having one connection, but with redundant paths to provide failover.
It is not only for server sockets; it is supported for client sockets too.

The SCTP socket API has two styles of programming:

1) One-to-one style: similar to the TCP socket API.

2) One-to-many style: similar to the UDP socket API.

The one-to-many style is similar to what you have asked for: client1 (eth0) -> server1 (eth0), server2 (eth0), server3 (eth0).
But it doesn't have failover between servers.
For example, if sending a message to server1 fails, it will not be retried against the other servers. The application might have to do that manually.
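
As a rough idea of what "manually" could look like, here is a minimal sketch of application-level failover across a list of servers. Transport is a hypothetical hook standing in for the real send call, and it assumes a failure shows up as an IOException; with a real SCTP socket, delivery failures may instead arrive asynchronously as notifications, so treat this as an outline only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative application-level failover: try the preferred server, then the rest in order. */
final class FailoverSender {

    /** Hypothetical transport hook: sends one message to one server or throws. */
    interface Transport {
        void send(String server, byte[] payload) throws IOException;
    }

    private final List<String> servers;
    private final Transport transport;
    private int preferred = 0; // index of the server that last accepted a message

    FailoverSender(List<String> servers, Transport transport) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = new ArrayList<>(servers);
        this.transport = transport;
    }

    /** Returns the server that accepted the message, or throws if every server failed. */
    String send(byte[] payload) throws IOException {
        IOException last = null;
        for (int i = 0; i < servers.size(); i++) {
            int idx = (preferred + i) % servers.size();
            try {
                transport.send(servers.get(idx), payload);
                preferred = idx; // stick with the server that worked
                return servers.get(idx);
            } catch (IOException e) {
                last = e; // remember the failure and try the next server
            }
        }
        throw new IOException("all " + servers.size() + " servers failed", last);
    }

    public static void main(String[] args) throws IOException {
        // Fake transport for demonstration: server1 is treated as down.
        Transport fake = (server, payload) -> {
            if (server.equals("server1")) {
                throw new IOException(server + " unreachable");
            }
        };
        FailoverSender sender = new FailoverSender(Arrays.asList("server1", "server2", "server3"), fake);
        byte[] msg = "metric:1|c".getBytes(StandardCharsets.UTF_8);
        System.out.println("delivered to " + sender.send(msg));
        System.out.println("delivered to " + sender.send(msg));
    }
}
```

Remembering the last working server avoids paying the failed attempt on every message once a server has gone away.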
 
 
> > Calling channel.bindAddress() on loopback will result in an AlreadyBoundException.
> > The reason no port parameter used in channel.bindAddress(..) is "it is a method call to add the secondary address and it will share the port parameter given at connect()"
>
> It's never been entirely clear how this is different from binding 0.0.0.0.

The main difference is automatic failover for both client and server; please check the description of multi-homing in the SCTP RFC.

By using the 0.0.0.0 address for a server socket, the server can accept clients on all local interfaces.
But when one connection fails due to a routing or network-card failure, it cannot switch that connection over to one of the working interfaces.


 
> > (Side note: I would avoid using loopback for multi homing testing, better use a machine with two network cards, so you would need two machines with dual NIC)
>
> Understood.
>
> > If you are trying to create multiple associations via a single channel, you have to use SctpMultiChannel (using JDK NIO; netty-sctp support is not available at the moment).
> >
> > I am not quite sure that I have understood your use case. Could you please explain a bit?
>
> Well, thing one is just a unit (maybe more like integration) test.
>
> A couple of more practical use cases:
>  - Connecting to multiple redundant metrics servers - think the same use cases as statsd - lots of tiny packets each of which describes a single high-frequency "event" and some data about it - this is traditionally done with fire-and-forget UDP packets
>  - Connecting to multiple redundant logging servers (consistency doesn't matter, just collect the logs and cat | sort or do something else with them later)
>
> In both of these cases, the client doesn't care which server it hits, it simply should be the closest one and fail over if that fails.
>
> Viable use case or not?

I think SCTP one-to-many socket-style programming would help in designing such services, but it doesn't handle automatic failover between servers.
Multi-homing is used for high availability by providing redundant network paths. It handles automatic failover between one server's primary and secondary addresses (not between many remote servers).

I have seen multi-homing association failover cause network congestion at high message rates in our internal systems.
SCTP kernel-level parameters need to be calculated and tuned for a proper multi-homing implementation.
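
To give a feel for the kind of calculation involved, the sketch below estimates how long a sender keeps retrying a dead path before failing over. It assumes the RFC 4960 rules that the retransmission timeout doubles after each timeout up to RTO.Max, and that a destination is marked inactive once its error count exceeds Path.Max.Retrans. The starting RTO is an input here; on a real association it is derived from measured round-trip times, and heartbeat intervals on idle paths are not modelled, so this is an approximation only.

```java
/** Illustrative estimate of worst-case path failover time from the RTO and retransmission limits. */
final class SctpFailoverEstimate {

    /**
     * Sums the timeouts that elapse before a path is declared inactive:
     * pathMaxRetrans + 1 consecutive timeouts, each twice the previous one, capped at rtoMaxMillis.
     */
    static long worstCaseFailoverMillis(long startRtoMillis, long rtoMaxMillis, int pathMaxRetrans) {
        long total = 0;
        long rto = Math.min(startRtoMillis, rtoMaxMillis);
        for (int timeouts = 0; timeouts <= pathMaxRetrans; timeouts++) {
            total += rto;
            rto = Math.min(rto * 2, rtoMaxMillis); // exponential back-off, bounded by RTO.Max
        }
        return total;
    }

    public static void main(String[] args) {
        // RFC 4960 suggested values: RTO.Min = 1 s, RTO.Max = 60 s, Path.Max.Retrans = 5.
        System.out.println("PMR=5: " + worstCaseFailoverMillis(1000, 60000, 5) + " ms");
        // A lower retransmission limit shortens the time spent on a dead path.
        System.out.println("PMR=2: " + worstCaseFailoverMillis(1000, 60000, 2) + " ms");
    }
}
```

With a 1 s starting RTO and Path.Max.Retrans = 5 the sum is 1 + 2 + 4 + 8 + 16 + 32 = 63 s, during which messages queue up on the failed path; that backlog is one plausible source of the congestion at high message rates. On Linux the corresponding knobs live under net.sctp (for example rto_min, rto_max and path_max_retrans).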

Regards
-Jestan Nirojan


