Message from discussion Future of epmd

Message-ID: <509BE628.9050...@erlang.org>
Date: Thu, 8 Nov 2012 18:04:40 +0100
From: Patrik Nyblom <p...@erlang.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:16.0) Gecko/20121028 Thunderbird/16.0.2
MIME-Version: 1.0
To: erlang-questi...@erlang.org
References: <CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ...@mail.gmail.com>
In-Reply-To: <CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ...@mail.gmail.com>
Subject: Re: [erlang-questions] Future of epmd
X-BeenThere: erlang-questi...@erlang.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: General Erlang/OTP discussions <erlang-questions.erlang.org>
List-Unsubscribe: <http://erlang.org/mailman/options/erlang-questions>,
 <mailto:erlang-questions-requ...@erlang.org?subject=unsubscribe>
List-Archive: <http://erlang.org/pipermail/erlang-questions>
List-Post: <mailto:erlang-questi...@erlang.org>
List-Help: <mailto:erlang-questions-requ...@erlang.org?subject=help>
List-Subscribe: <http://erlang.org/mailman/listinfo/erlang-questions>,
 <mailto:erlang-questions-requ...@erlang.org?subject=subscribe>
Content-Type: multipart/mixed; boundary="===============8140368874242366160=="
Errors-To: erlang-questions-boun...@erlang.org
Sender: erlang-questions-boun...@erlang.org

--===============8140368874242366160==
Content-Type: multipart/alternative;
	boundary="------------020309020201080803060106"

--------------020309020201080803060106
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit

Hi!
On 11/07/2012 08:03 AM, Dmitry Demeshchuk wrote:
> Hello, list.
>
> As you may know, epmd may sometimes misbehave. Loses nodes and doesn't 
> add them back, for example (unless you do some magic, like this: 
> http://sidentdv.livejournal.com/769.html ).
>
First of all, we have no bug reports where epmd loses nodes, except when
you deliberately kill epmd or deliberately disconnect. Unfortunately I
cannot read the article you are referring to (it is in a language I do
not understand), so I cannot explain what is going on there.
> A while ago, Peter Lemenkov got a wonderful idea that epmd may be 
> actually written in Erlang instead. EPMD protocol is very simple, and 
> it's much easier to implement all the failover scenarios in Erlang 
> than in C. So far, here's a prototype of his: 
> https://github.com/lemenkov/erlpmd
>
Failover is usually not needed: epmd is a single process on a machine
and it should only stop if the machine stops. What scenario are we
talking about here?

As epmd works today, a distributed Erlang node connects to a *local*
epmd (it is after all just a portmapper, similar to many other
portmappers) and tells it what name and port number it has. When the
beam process ends, in one way or another, the socket gets closed and
epmd is informed more or less instantly. Epmd survives starts and stops
of Erlang nodes on the machine and is the single database mapping ports
for Erlang distribution on the host.
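
To make the registration scheme concrete, here is a minimal sketch in
Python of the bookkeeping described above. It is a toy model, not how
the real epmd is implemented; the class name PortMapper, its method
names and the integer connection ids are all hypothetical. The only
point it illustrates is that a registration lives exactly as long as
the connection that created it.

```python
class PortMapper:
    """Toy per-host port mapper: a name -> port table whose entries
    are tied to the connection that registered them (hypothetical model)."""

    def __init__(self):
        self._by_name = {}  # node name -> (port, connection id)
        self._by_conn = {}  # connection id -> node name

    def register(self, conn_id, name, port):
        """Register a node; refuse if the name is already taken."""
        if name in self._by_name:
            return False
        self._by_name[name] = (port, conn_id)
        self._by_conn[conn_id] = name
        return True

    def connection_closed(self, conn_id):
        """Drop whatever the closed connection had registered."""
        name = self._by_conn.pop(conn_id, None)
        if name is not None:
            del self._by_name[name]

    def lookup(self, name):
        """Return the registered port, or None if the name is unknown."""
        entry = self._by_name.get(name)
        return entry[0] if entry else None
```

A node that dies without saying goodbye still disappears from the
table, because the operating system closes its socket and the mapper
then calls connection_closed for that connection.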

If we were to implement epmd in Erlang with that scheme, the first
Erlang node would either have to survive for the whole lifespan of the
host or transfer ownership of the open sockets (ALIVE-sockets) to "the
next" node, which then takes over the task of epmd. Note that these
nodes may not be in the same cluster: epmd is bound to a machine, not
to an Erlang cluster, and Erlang VMs participating in different Erlang
clusters may exist on the same machine. The scheme would be feasible if
we had an *extra* Erlang node for port mapping, which of course could
be a working solution.

Implementing this in Erlang, using the distributed Erlang machines that
are already present, would probably require a different mechanism for
registering and unregistering nodes. Looking out for closed sockets
will not do, as we would need to monitor nodes that have no connection
to us (or they would at least have to re-establish such a connection,
which is not needed today). A reliable takeover by nodes participating
in different clusters could also be implemented; it is in no way
impossible. You would also need to reopen the known port when taking
over, so there will be a race, or rather a short time with no epmd
listening. All clients have to handle that.
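
One way a client could cope with that short window is to retry a
refused connection a few times before giving up. The following Python
sketch shows the idea under stated assumptions: the function name
connect_with_retry, the attempt count and the doubling delay are
illustrative choices, not anything the current epmd clients do.

```python
import time


def connect_with_retry(connect, attempts=5, delay=0.1):
    """Call connect() until it succeeds or the attempts run out.

    `connect` is any zero-argument callable that returns a connection
    or raises ConnectionRefusedError while nothing is listening.
    The delay doubles after each refused attempt.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionRefusedError:
            if attempt == attempts - 1:
                raise  # still nobody listening: report the failure
            time.sleep(delay * (2 ** attempt))
```

For example, `connect_with_retry(lambda: socket.create_connection((host, 4369)))`
(with `import socket`) would ride out a brief gap on the default epmd
port, while a port mapper that is really gone still produces an error
after the last attempt.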

Implementing a simpler epmd for a machine with only one Erlang node
is far easier and could be useful for small embedded systems. In that
case we would not need to change the protocol. Usage would of course be
limited.

You could also rewrite epmd in Erlang and have an extra
(non-distributed) Erlang machine resident in the system (after all,
that would be more or less the same thing as having a C program
resident). That would not require complicated takeover scenarios, but
it would increase the memory footprint slightly. An implementation in
Erlang could cover both the single-VM system and the solution with an
extra Erlang machine, which would be nice.
> When hacking it, I've noticed several things:
>
> 1. When we send ALIVE2_REQ and reply with ALIVE2_RESP, we establish a 
> TCP connection. Closing of which is a signal of node disconnection. 
> This approach does have a point, since we can use keep-alive and 
> periodically check that the node is still here on the TCP level. But 
> next, some weird thing follows:
Note that these are local connections. Keep-alive has nothing to do
with it. The loopback interface detects a close and informs epmd
immediately. Keep-alive detects network problems (badly) and is only
useful when talking across a real network.
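
The loopback behaviour is easy to observe. The Python sketch below
(the function name is made up for illustration) opens a TCP connection
to a listener on 127.0.0.1, closes the client end, and checks that the
accepting side sees end-of-file without any keep-alive being
configured.

```python
import socket


def close_is_seen_on_loopback(timeout=2.0):
    """Return True if the accepting side sees EOF after the peer closes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(timeout)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=timeout)
    server_side, _ = listener.accept()
    server_side.settimeout(timeout)
    try:
        client.close()  # stands in for the registered node going away
        # recv() returning b"" means the peer closed the connection
        return server_side.recv(1) == b""
    finally:
        server_side.close()
        listener.close()
```

No SO_KEEPALIVE option is set anywhere in the sketch; if the close were
not reported, recv would time out and raise instead of returning.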
>
> 2. When we send other control messages from a node connected to epmd, 
> we establish a new TCP connection, each time. Could use the main 
> connection instead. Was it a design decision or it's just a legacy thing?
When you communicate with epmd after the ALIVE request has been sent,
you establish a connection to the epmd *on the host you want to connect
to*. That is only the same epmd as the one you used for registration if
the Erlang node you want to talk to is on the same host as you are. You
are looking for a port on the particular machine where the remote
Erlang node resides. Only in the local case could you reuse your
connection, which would add a special case with very little gain.
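
As an illustration of why the lookup goes to a different endpoint, here
is a small Python sketch of the port query. The helper names are
hypothetical, and the byte layout (a two-byte big-endian length, tag
122 for PORT_PLEASE2_REQ, tag 119 for PORT2_RESP, result 0 meaning
success followed by a two-byte port) is an assumption taken from the
published EPMD protocol description; check it against that
documentation before relying on it.

```python
import struct

PORT_PLEASE2_REQ = 122  # assumed request tag ('z')
PORT2_RESP = 119        # assumed response tag ('w')


def split_node(node):
    """Split 'name@host' into (name, host); the host decides which epmd to ask."""
    name, host = node.split("@", 1)
    return name, host


def encode_port_please2(alive_name):
    """Build a length-prefixed PORT_PLEASE2_REQ for the given node name."""
    body = bytes([PORT_PLEASE2_REQ]) + alive_name.encode("latin-1")
    return struct.pack(">H", len(body)) + body


def decode_port2_resp(data):
    """Return the distribution port, or None if the name is not registered."""
    tag, result = data[0], data[1]
    if tag != PORT2_RESP:
        raise ValueError("unexpected response tag %d" % tag)
    if result != 0:
        return None
    (port,) = struct.unpack(">H", data[2:4])
    return port
```

To find node foo@hostb, a client would open a fresh TCP connection to
the epmd on hostb (default port 4369), send
`encode_port_please2("foo")`, and pass the reply to
`decode_port2_resp`. The registration connection to the local epmd
plays no part in this.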
>
> 3. The client (node) part of epmd seems to be all implemented in C and 
> sealed inside ERTS. However, it seems like this code could be 
> implemented inside the net_kernel module instead (or something similar).
erl_epmd is the module, and it is called by net_kernel. On that side,
nothing in the epmd communication is written in C except the
inet_driver itself. The epmd daemon is of course written in C, but it
is not part of the VM.
>
>
> Why bother and switch to Erlang when everything is already written and 
> working? First of all, sometimes it doesn't really work in big 
> clusters (see my first link). And, secondly, using Erlang we can 
> easily extend the protocol. For example, add auto-discovery feature, 
> which has been discussed on the list a lot. Add an ability for a node 
> to reconnect if its TCP session has been terminated for some reason. 
> Add lookups of nodes by prefix (like, "give me all nodes that match 
> mynode@*"). The list can be probably extended further.
I think a lot of this should be solved in the client, which is already 
written in Erlang. Rewriting the server might just add complexity, at 
least if you want to solve it in the already running distributed nodes, 
with takeover and whatnot.

> Do you think such a thing (with full backwards compatibility, of 
> course) could go upstream? Also, a question for epmd maintainers: is 
> it going to change at all, or the protocol is considered to be full 
> enough for its purposes?
We have thought about a distributed epmd over the years, but have never
considered it worth the effort, due to the takeover complexity etc.
Portmapping is really basic functionality, and you would not want to
mess that up. A separate Erlang machine might be a solution, but as
epmd is such a simple program, we have not really thought it worth the
extra memory footprint.

So it would not be the easiest thing to convince us to take upstream,
but given a well-thought-through solution, we could get rid of some
maintenance; Erlang is after all far nicer to maintain than C. One
could also make it possible to choose between different epmd solutions,
which would cover the cases where people do not want an extra Erlang
machine for portmapping. More elaborate things could then be
experimented with in the Erlang-written epmd.

If you can isolate a bug or explain a malfunction in the current epmd, 
it would be a great contribution!

>
> -- 
> Best regards,
> Dmitry Demeshchuk
>
Cheers,
/Patrik


--------------020309020201080803060106--

--===============8140368874242366160==--