IP Address for machine names


Max Tritschler

Apr 12, 2011, 10:24:59 AM
to s4-project
Hi all,

I am trying to get S4 to run in distributed mode across different
virtual machines. I wanted to start simple, so I just added another
node to the cluster, located on the same machine (both run on
localhost). Everything worked well: I received the debug messages from
the partitioners on System.out, and events arrived at all PEs on both
nodes.

Then I replaced "localhost" with "127.0.0.1" in clusters.xml, and now
I no longer receive any output on System.out. As far as I can tell
from the logs, no events are arriving at the PEs, although the
adapter is happily producing them.

Could someone please confirm that using IP addresses in clusters.xml
is valid? I would appreciate any insight on this.

Cheers, Max

PS: I just confirmed that even with the original clusters.xml with
only one node, changing the host name to an IP address leads to the
same behavior. The debug output also says "processAvailable: false"
after the change. Any ideas?

kishore g

Apr 12, 2011, 4:37:44 PM
to s4-pr...@googlegroups.com, Max Tritschler
Hi Max,

Using an IP address is valid; however, 127.0.0.1 does not give you the intended behavior. You can use either of the following:

* real ip of your box
* hostname of your box //output of hostname command

In case you want the details of what the code does, we compare

InetAddress.getByName(host) and InetAddress.getLocalHost().

Neither localhost nor 127.0.0.1 satisfies this check. We already treat localhost as a special case; we should probably treat 127.0.0.1 as a special case as well.
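
To make the comparison concrete, here is a minimal standalone sketch of that check. It is not the S4 code: the class HostMatchSketch and the method resolvesTo are hypothetical names chosen for illustration, and only the two InetAddress calls come from the description above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of S4: mirrors the comparison described above.
class HostMatchSketch {

    // True if the configured host resolves to exactly the given address.
    static boolean resolvesTo(String configuredHost, InetAddress local) {
        try {
            return InetAddress.getByName(configuredHost).equals(local);
        } catch (UnknownHostException e) {
            return false; // an unresolvable name cannot match
        }
    }

    // Convenience overload that takes the second address as a literal or name.
    static boolean resolvesTo(String configuredHost, String other) {
        try {
            return resolvesTo(configuredHost, InetAddress.getByName(other));
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("getLocalHost() = " + local);
        // Print, for a few candidate entries, whether the comparison would pass.
        String[] candidates = {"localhost", "127.0.0.1", local.getHostName()};
        for (String host : candidates) {
            System.out.println(host + " matches: " + resolvesTo(host, local));
        }
    }
}
```

Running main on the box in question shows what getLocalHost() returns there and which candidate entries would pass the comparison. The output depends on the machine's network configuration.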

Is it a must to use 127.0.0.1? If you are just testing whether an IP address can be used, then using the real IP of the box directly will work.

thanks
Kishore G

Max Tritschler

Apr 13, 2011, 7:35:55 AM
to s4-project
Hi Kishore, all,

Thanks for your reply.
You are right, using the hostname works. As it turned out, the local
IP of this box is actually 127.0.1.1 and not 127.0.0.1 as one would
expect. It also works fine with this IP, so the mistake here was on my
side.

After this I ran into another issue. When I replaced the local IP
with the public IP, it stopped working again. I looked into the code
and finally found the relevant check in
io.s4.comm.file.StaticTaskManager#canTakeupProcess():

    if (!host.equals("localhost")) {
        if (!InetAddress.getLocalHost().equals(inetAddress)) {
            return false;
        }
    }

The problem is that on my machine InetAddress.getLocalHost() does not
return the public IP but rather <HOST-NAME>/127.0.1.1. If I configure
clusters.xml to use my public IP, this comparison fails. As I read
this code, it checks whether the address configured in clusters.xml is
the address of this machine. Unfortunately, the result of
InetAddress.getLocalHost() depends on your network configuration. For
example, on my Windows box this method at first returned an address
that came from a VirtualBox network adapter instead of my public IP.
Only after I deactivated the VirtualBox adapter did the method return
my public IP. To solve my problem, I then modified the code in
StaticTaskManager to look for a match for inetAddress among all
available addresses of the installed NICs.

So my question is this: If it really is necessary to check if the
configured IP is a valid address for this machine, wouldn't it make
more sense to check all addresses from the NICs and not only the one
returned by getLocalHost()?
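
A rough sketch of what such a check over all NICs could look like, using only the standard java.net API. The class LocalAddressSketch and its method names are hypothetical and chosen for illustration; this is not the code from StaticTaskManager.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of S4.
class LocalAddressSketch {

    // Collects every address bound to any network interface on this machine.
    static List<InetAddress> localAddresses() {
        List<InetAddress> result = new ArrayList<InetAddress>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return result; // some JVMs return null when no interface is found
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                result.addAll(Collections.list(nic.getInetAddresses()));
            }
        } catch (SocketException e) {
            // Interfaces could not be queried; report what was collected so far.
        }
        return result;
    }

    // True if the configured host resolves to an address owned by some local NIC.
    static boolean isLocalAddress(String configuredHost) {
        try {
            return localAddresses().contains(InetAddress.getByName(configuredHost));
        } catch (UnknownHostException e) {
            return false; // an unresolvable name cannot be a local address
        }
    }

    public static void main(String[] args) {
        System.out.println("Local addresses: " + localAddresses());
        for (String host : args) {
            System.out.println(host + " is local: " + isLocalAddress(host));
        }
    }
}
```

Unlike a comparison against getLocalHost() alone, this does not depend on which single address the JVM picks for the machine, so a public IP and an address on a virtual adapter would both be recognized as local.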

I also found a discussion on this topic (from 2002 though):
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4665037

Thank you,
Max

kishore g

Apr 13, 2011, 1:26:02 PM
to s4-pr...@googlegroups.com, Max Tritschler
Hi Max,

Yes, it does make sense to check the addresses of all the NICs. But please keep in mind that this code is needed only in static mode, where you don't have true failover (basically a mode without ZooKeeper). So if you are planning to run in dynamic mode with ZooKeeper, you do not need to specify any of this.

Feel free to submit a patch for the same.

thanks,
Kishore G

Max Tritschler

Apr 20, 2011, 4:52:30 AM
to s4-project
Hi,

I don't know whether you can pull the change directly from our
repository at git://git.berlios.de/s4fokus, so I'll also paste the
patch below.

Max


Date: Wed, 20 Apr 2011 10:42:13 +0200
Subject: [PATCH] Fix handling of hostnames for task assignment

---
 .../java/io/s4/comm/file/StaticTaskManager.java |   24 ++++++++++++++++---
 1 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/src/main/java/io/s4/comm/file/StaticTaskManager.java b/src/main/java/io/s4/comm/file/StaticTaskManager.java
index 89836d3..1c18b51 100644
--- a/src/main/java/io/s4/comm/file/StaticTaskManager.java
+++ b/src/main/java/io/s4/comm/file/StaticTaskManager.java
@@ -25,6 +25,7 @@ import io.s4.comm.util.ConfigParser.Cluster.ClusterType;
 import java.io.File;
 import java.io.FileOutputStream;
 import java.net.InetAddress;
+import java.net.NetworkInterface;
 import java.nio.channels.FileLock;
 import java.util.HashMap;
 import java.util.HashSet;
@@ -162,16 +163,31 @@ public class StaticTaskManager implements TaskManager {
         }
     }
 
+    private boolean isAddressValid(InetAddress inetAddress) {
+        boolean result = false;
+        try {
+            if (InetAddress.getLocalHost().equals(inetAddress)) {
+                result = true;
+            } else {
+                result = NetworkInterface.getByInetAddress(inetAddress) != null;
+            }
+        } catch (Exception e) {
+            logger.warn(e.getMessage(), e);
+            result = false;
+        }
+        return result;
+    }
+
     private boolean canTakeupProcess(Map<String, String> processConfig) {
         String host = processConfig.get("process.host");
         try {
             InetAddress inetAddress = InetAddress.getByName(host);
             logger.info("Host Name: "
                         + InetAddress.getLocalHost().getCanonicalHostName());
-            if (!host.equals("localhost")) {
-                if (!InetAddress.getLocalHost().equals(inetAddress)) {
-                    return false;
-                }
+            if (!isAddressValid(inetAddress)) {
+                logger.error(host + " is not a valid address for this machine."+
+                        " Check the configuration.");
+                return false;
             }
         } catch (Exception e) {
             logger.error("Invalid host:" + host);
--
1.7.0.4



