The subport feature (multiplexing several connections on top of the same
TCP port) has been requested for a long time.
I've committed the initial version of the functionality to the "vtcp" branch
in the libzmq repo. It's still a bit creaky, but it should work at least on
POSIX-compliant systems.
How it works
------------
There's a "vtcp" project that deals with TCP port multiplexing. Note that
it is not 0MQ-specific. It can multiplex connections from any application.
Technically, there's a "vtcpd" daemon that listens on a specific TCP port
and dispatches new connections to the applications on the local box
based on the subport number.
Applications (including 0MQ) use the libvtcp library to bind/connect to vtcp
subports.
In wire-protocol terms, vtcp requires that a 32-bit integer in network byte
order (the subport number) is sent by the connecting side immediately
after creating the TCP connection.
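A minimal sketch of that 4-byte prefix, assuming nothing beyond what is stated above (a 32-bit subport number, most significant byte first). The helper names subport_encode and subport_decode are made up for illustration and are not part of libvtcp:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: serialise a subport number into the 4 bytes the
   connecting side sends right after the TCP connection is created.
   Network byte order means the most significant byte goes first. */
void subport_encode (uint32_t subport, unsigned char buf [4])
{
    assert (buf != NULL);
    buf [0] = (unsigned char) ((subport >> 24) & 0xff);
    buf [1] = (unsigned char) ((subport >> 16) & 0xff);
    buf [2] = (unsigned char) ((subport >> 8) & 0xff);
    buf [3] = (unsigned char) (subport & 0xff);
}

/* Hypothetical helper: what the accepting side would do with the first
   4 bytes it reads from a new connection. */
uint32_t subport_decode (const unsigned char buf [4])
{
    assert (buf != NULL);
    return ((uint32_t) buf [0] << 24) | ((uint32_t) buf [1] << 16) |
           ((uint32_t) buf [2] << 8) | (uint32_t) buf [3];
}
```

Shifting byte by byte instead of calling htonl keeps the sketch independent of the host's endianness and of any particular socket header.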
Usage
-----
1. Get the vtcp project here:
    https://github.com/sustrik/vtcp
2. Build and install it:
    ./autogen.sh
    ./configure
    make
    sudo make install
3. Get 0MQ from the libzmq repo, vtcp branch
4. Build it with the --with-vtcp option:
    ./autogen.sh
    ./configure --with-vtcp
    make
    sudo make install
5. Run the vtcp daemon on the port you want to multiplex:
    vtcpd 5555
6. Test it using the 0MQ perf tests:
    local_lat vtcp://*:5555.123 1 1
    remote_lat vtcp://127.0.0.1:5555.123 1 1
Please note that vtcp only allows binding to *all* the network
interfaces (*). Trying to bind to a specific interface will produce an error.
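The address format used in the test commands can be read as "vtcp://<host>:<port>.<subport>". A rough sketch of how such a string could be split up, assuming that format and a host of at most 63 characters. The struct and function names are hypothetical; this is not how libzmq parses addresses:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of "vtcp://<host>:<port>.<subport>". */
struct vtcp_addr {
    char host [64];
    unsigned port;
    unsigned subport;
};

/* Returns 0 on success, -1 if the string does not match the format. */
int vtcp_addr_parse (const char *s, struct vtcp_addr *out)
{
    char tail;
    int n;

    assert (s != NULL && out != NULL);
    /* %63[^:] reads the host up to the colon. The trailing %c must NOT
       match, which rejects anything left over after the subport. */
    n = sscanf (s, "vtcp://%63[^:]:%u.%u%c",
        out->host, &out->port, &out->subport, &tail);
    return n == 3 ? 0 : -1;
}

/* Binding is only allowed on all interfaces, i.e. host "*". */
int vtcp_addr_bindable (const struct vtcp_addr *a)
{
    assert (a != NULL);
    return strcmp (a->host, "*") == 0;
}
```

With this, "vtcp://*:5555.123" parses and is bindable, while "vtcp://127.0.0.1:5555.123" parses but is only usable for connecting.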
Question
--------
Addresses like "vtcp://127.0.0.1:5555.123" are annoyingly long. Would it
make sense to use a well-known fixed port (such as 5555) for
multiplexing? That way the address could be shortened to
"vtcp://127.0.0.1:123".
Any feedback is welcome.
Martin
_______________________________________________
zeromq-dev mailing list
zerom...@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
It is something that could be easily implemented on top of 0MQ using
normal tcp transport, while still being transparent for the
application.
> What is the rationale for this feature?
>
> It is something that could be easily implemented on top of 0MQ using
> normal tcp transport, while still being transparent for the
> application.
How would you pass say a pub/sub feed and a req/rep feed through a
single TCP port?
> But once again: what is the use case?
I am not a user so I am not really competent to answer. However, at San
Francisco meetup early this year, subports were identified as the most
desired feature to add to 0MQ.
I guess the problem is that opening a port in a firewall is an
administrative and often quite bureaucratic task. Thus, updating your
application by adding a new feed becomes a costly and time-consuming
operation (all the clients have to go through the associated bureaucratic
procedures).
> If this feature is designed to
> allow apps to run transparently in environments when available open
> ports are scarce, then the 'well known port' idea does not make sense,
> unless the port is configurable at runtime, say, per zmq::context.
It has to do with the argument above. If there's a fixed VTCP port you
can just ask all the clients to get it open once and then tunnel all the
traffic through that port. Moreover, if they are already using a VTCP
service from a different vendor, the port is already open and no
administrative procedures are required.
> I am not a user so I am not really competent to answer. However, at San
> Francisco meetup early this year, subports were identified as the most
> desired feature to add to 0MQ.
I don't want to be negative. This is literally accurate, but what
you've made is not "subports" as people hope to see, and it's about
the third time we have this thread :-).
At SFO we discussed both tunneling and in-process subports, and
consensus demand was for in-process named subports.
One quite significant problem with complex 0MQ apps is having to open
and manage multiple ports. It's both a firewall and a topology issue.
It is *significantly* more complex to manage 2+ ports than one port.
It's made worse by the design pattern of splitting different flows
into separate sockets. One example cited had to open 4 sockets (across
a firewall, no less).
Named in-process subports would work as follows: server binds one
socket to e.g. tcp://*:9001:control and a second socket to
tcp://*:9001:data. Clients connect to
tcp://192.168.55.201:9001:control or tcp://192.168.55.201:9001:data
depending on the specific service _within that process_ they need. The
peers would do a normal TCP connection and then specify the subport in
the connection handshake.
Also discussed at the time:
* In-process subports would presumably allow tunneling, but not vice-versa
* Subports would need to be named, not numbered.
-Pieter
> At SFO we discussed both tunneling and in-process subports, and
> consensus demand was for in-process named subports.
The problem with in-process subports is that a single-port setup,
which is the most desirable configuration on the client side, then implies
a single-server setup on the service provider side, which is apparently
not a desirable feature.
I agree that running a vtcpd daemon on the server side is a bit annoying
and that it should rather be running in kernel space. However, we
can address that problem later on.
As for named/numbered subports, I've chosen a fixed-length binary
identifier for HW-friendliness.
The user should of course use a string to access a service. The right
way to achieve it IMO is providing a name service (which is on the
roadmap anyway, irrespective of the subports feature) that would translate
a name into a subport number.
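As an illustration only, the name-to-number translation could be as simple as a lookup table. The sketch below assumes a static in-memory table; the names "control" and "data" are borrowed from the earlier example in this thread, and the numbers assigned to them, like the function name name_resolve, are invented:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mapping from a service name to a 32-bit subport number. */
struct name_entry {
    const char *name;
    uint32_t subport;
};

/* Invented example data; a real name service would not be hard-coded. */
static const struct name_entry name_table [] = {
    {"control", 1},
    {"data", 2}
};

/* Returns 0 and stores the number in *subport if the name is known,
   -1 otherwise. */
int name_resolve (const char *name, uint32_t *subport)
{
    size_t i;

    assert (name != NULL && subport != NULL);
    for (i = 0; i != sizeof name_table / sizeof name_table [0]; i++) {
        if (strcmp (name_table [i].name, name) == 0) {
            *subport = name_table [i].subport;
            return 0;
        }
    }
    return -1;
}
```

The user-facing API would deal in strings, and only the resolved fixed-length number would go on the wire.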
Martin
> The problem with-in process subports is that that way single-port setup,
> which is the most desirable configuration on the client side, implies
> single-server setup at the service provider side, which is apparently not a
> desirable feature.
Do you have data backing that "apparently not desirable" statement?
The use case I gave was quite specific. It's absolutely about reducing
the number of TCP ports needed by a _single_ service. Highly
desirable. This was explicitly discussed both at that meeting, and on
this list.
> I agree that running a vtcpd deamon at the server side is a bit annoying...
Well, this may be a useful tool in its own right but it's just the
wrong solution (name service?) for the problem I described. Calling
them both "subports" is highly confusing.
I'd suggest using "port forwarding" and "named subports".
-Pieter
> Do you have data backing that "apparently not desirable" statement?
Putting all the services into a single process doesn't scale. To move
one service to a different box you have to rip the source code from the
original application, untangle all the dependencies, build it as a
stand-alone service etc.
Martin
> Well, this may be a useful tool in its own right but it's just the
> wrong solution (name service?) for the problem I described. Calling
> them both "subports" is highly confusing.
At the same time, what might work is embedding VTCP as an inproc
bridge, same as I'm doing with VTX.
> Putting all the services into a single process doesn't scale. To move one
> service to a different box you have to rip the source code from the original
> application, untangle all the dependencies, build it as a stand-alone
> service etc.
You're assuming services are independent, which isn't the use case. Of
course independent services can exist on their own ports. That's
obvious.
By definition we're discussing entangled co-dependent services that
naturally go together. E.g.
http://www.zeromq.org/topics:pubsub-security
(This is a simple case, there are many more complex ones).
Now, this model requires a ROUTER/DEALER pair for authentication and a
PUB/SUB pair for data. These two services are completely entangled,
there is no sense in speaking of "scaling these by running them on
different boxes". Yes, you might scale up, but you'd do it using
devices in front of the PUB socket.
Does this help to define the use case?
-Pieter
> You're assuming services are independent, which isn't the use case. Of
> course independent services can exist on their own ports. That's
> obvious.
Whether services are dependent or independent should be completely
transparent to the client.
The idea here is that you can distribute the server-side application
while keeping the clients intact. Say, if you are a stock exchange, you
can start with the market data publisher and the order gateway collocated
in the same process. However, as you grow you will possibly distribute
the market data from a set of gateways completely separate from the order
management gateways.
As for the subports, AFAICS the ultimate requirement here is to lower
the time-to-market by circumventing the administrative processes on the
client site, which in turn translates to "open a port once, don't get
hit by the same thing later on as you evolve your service".
In short, it should be possible to funnel all the traffic via a single
port, irrespective of what your server-side service topology looks like.
To give a simple example: Imagine service A using port 5555, subport 1
and B using port 5555, subport 2.
Imagine A's box burns down and so the admin starts an instance of A
on B's box.
If the port multiplexer is located in-process, A fails with EADDRINUSE
(because both processes try to bind to the same port) although there is
no address collision (each service uses a different subport).
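The EADDRINUSE failure can be reproduced with plain TCP sockets. A minimal sketch, assuming a POSIX system. It uses two sockets in one process to stand in for the two services, and a kernel-chosen port rather than 5555 so that it does not clash with anything already running. The function name is made up, and nothing here uses vtcp or 0MQ:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Binds two independent TCP sockets to the same port on all interfaces.
   Returns the errno of the second bind, 0 if it unexpectedly succeeds,
   or -1 if the setup itself fails. */
int second_bind_errno (void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int a, b, rc, err = 0;

    a = socket (AF_INET, SOCK_STREAM, 0);
    b = socket (AF_INET, SOCK_STREAM, 0);
    if (a < 0 || b < 0) {
        if (a >= 0)
            close (a);
        if (b >= 0)
            close (b);
        return -1;
    }

    memset (&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl (INADDR_ANY);
    addr.sin_port = 0;  /* let the kernel pick a free port for "B" */

    /* "Service B" owns the TCP port; read back which one it got. */
    rc = bind (a, (struct sockaddr*) &addr, sizeof addr);
    if (rc == 0)
        rc = getsockname (a, (struct sockaddr*) &addr, &len);
    if (rc == 0)
        rc = listen (a, 1);
    if (rc != 0) {
        close (a);
        close (b);
        return -1;
    }

    /* "Service A", restarted on the same box, tries the same TCP port.
       Save errno before close () has a chance to overwrite it. */
    if (bind (b, (struct sockaddr*) &addr, sizeof addr) != 0)
        err = errno;

    close (a);
    close (b);
    return err;
}
```

The kernel rejects the second bind on the TCP port alone; it has no notion of the subport each service intended to use.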
Martin
> Whether services are dependent or independent should be completely
> transparent to the client.
You may be right in the long term -- and surely are -- but your
solution ignores the short term problem and indeed, argues against
that problem even existing.
Which is kind of pointless IMO.
-Pieter
Ok. What's the exact problem with the proposed solution? AFAICS it's
just a more generic version of what you are describing, thus it should
cover all associated use cases. Am I missing something?
Martin
> Ok. What's the exact problem with the proposed solution? AFAICS it's just a
> more generic version of what you are describing, thus it should cover all
> associated use cases. Am I missing something?
Here's my immediate reaction as a user: too complex, not complete, kind
of seems to miss the target.
That could be mitigated by wrapping it up as an inprocess bridge but
that seems incomplete too. The problem will come down to mapping
socket patterns over this. That leads either to "reinvent the wheel"
like VTX is doing, or "do it in the core", which seems the most
accurate solution.
This problem hits all tunneling solutions afaics, unless you know better.
-Pieter
> Here's my immediate reaction as user:
Yes. My initial take on the problem was exactly the same: Implement it
in-process and make it as simple to use as tcp transport.
However, it turns out that tcp transport is as simple as it is because
it is implemented in kernel space.
Specifically, a port is a system-wide entity and thus needs a shared
system-wide table of bound ports. For TCP this table resides in kernel
space, so it's automatically set up when the system boots.
As for subports, the possible solutions are either to move the table to
kernel space (possibly a reasonable long-term solution) or to run a
system-wide daemon holding the table (which is what I did).
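A minimal sketch of the kind of table such a daemon would hold, assuming a fixed-size array and integer owner handles. All names here (subport_table, subport_bind, subport_lookup) are hypothetical and are not taken from vtcpd:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SUBPORT_TABLE_SIZE 64

/* Hypothetical table of bound subports for one multiplexed TCP port.
   "owner" stands for whatever identifies the application that bound
   the subport, e.g. a descriptor the daemon hands connections to. */
struct subport_table {
    uint32_t subport [SUBPORT_TABLE_SIZE];
    int owner [SUBPORT_TABLE_SIZE];
    size_t count;
};

/* Returns the owner of the subport, or -1 if nobody has bound it. */
int subport_lookup (const struct subport_table *t, uint32_t subport)
{
    size_t i;

    assert (t != NULL);
    for (i = 0; i != t->count; i++)
        if (t->subport [i] == subport)
            return t->owner [i];
    return -1;
}

/* Returns 0 on success, EADDRINUSE if the subport is already bound,
   ENOMEM if the table is full. */
int subport_bind (struct subport_table *t, uint32_t subport, int owner)
{
    assert (t != NULL && owner >= 0);
    if (subport_lookup (t, subport) != -1)
        return EADDRINUSE;
    if (t->count == SUBPORT_TABLE_SIZE)
        return ENOMEM;
    t->subport [t->count] = subport;
    t->owner [t->count] = owner;
    t->count++;
    return 0;
}
```

Because the table is shared by every application on the box, two services collide only when they ask for the same subport, not merely because they share the TCP port.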
> too complex
The suggestion to use a fixed port for vtcp is trying to address the
complexity issue.
The idea is that you can start the vtcp daemon in the init script (no
parameters needed) and then stop caring about it.
From 0MQ's point of view it would mean that the 'port' part of the address
can be omitted, which would make addresses look exactly the same as tcp
addresses:
zmq_bind (s, "vtcp://*:5555"); // 5555 is a subport!
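To make the shortening concrete, here is a rough sketch of how a short address could be expanded back into the long "<port>.<subport>" form. The well-known port value is an assumption (the thread only floats "such as 5555"), and vtcp_addr_expand is a made-up name, not a libzmq or libvtcp function:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed value; no port has actually been fixed. */
#define VTCP_WELL_KNOWN_PORT 5555

/* Rewrites "vtcp://<host>:<subport>" as "vtcp://<host>:<port>.<subport>".
   Returns 0 on success, -1 on malformed input or if out is too small. */
int vtcp_addr_expand (const char *short_addr, char *out, size_t out_size)
{
    char host [64];
    unsigned subport;
    char tail;
    int n;

    assert (short_addr != NULL && out != NULL);
    /* The trailing %c must not match, so the long form (which still
       has ".<subport>" after the number) is rejected here. */
    if (sscanf (short_addr, "vtcp://%63[^:]:%u%c",
          host, &subport, &tail) != 2)
        return -1;
    n = snprintf (out, out_size, "vtcp://%s:%d.%u",
        host, VTCP_WELL_KNOWN_PORT, subport);
    return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}
```

Under these assumptions "vtcp://127.0.0.1:123" expands to "vtcp://127.0.0.1:5555.123".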
> not complete
Yes. Binding to specific interfaces is missing. My guess is that it can
be implemented without having to mess with the kernel. I'll check it out
and report back.
Martin
>> not complete
>
> Yes. Binding to specific interfaces is missing. My guess is that it can
> be implemented without having to mess with the kernel. I'll check it out
> and report back.
Yes. It looks like it is doable in the user space.