Address resolution timeout (and 303000 USER_RODS_HOSTNAME_ERR) in federation


buu...@org.dkrz.de

Jul 11, 2018, 1:50:28 PM
to iRODS-Chat
Hi!

I have an iRODS 4.2.3 installation that I'm trying to federate with two other installations.

It does not seem to work. iCommands issued to the server take a long time to complete, and I receive "-303000 USER_RODS_HOSTNAME_ERR". The help page for that error did not help me: the FQDNs in hosts_config look fine to me, and the owners of the other installations have checked my config.

Every minute (or so), "address resolution timeout" messages get printed into my rodsLog:

> Jul 11 18:20:36 pid:8945 remote addresses: 136.x.x.x, 2001:638:x.. ERROR: getaddrinfo_with_retry address resolution timeout [zone2-iCAT.130.x.x.x] [ai_flags: [2] ai_family: [0] ai_socktype: [0] ai_protocol: [0]]

(Same message for the other federated zone)
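
The name the server is trying to resolve is the bracketed value right after "address resolution timeout" in that message, so it can help to pull it out of the log and look at it on its own. A minimal Python sketch, assuming the log line format quoted above (it is taken from that single example and may differ in other iRODS versions); the function name `failed_hostnames` is made up for illustration:

```python
import re

# Matches the bracketed hostname that follows "address resolution timeout"
# in the rodsLog line quoted above. Assumption: other lines use the same layout.
_TIMEOUT_RE = re.compile(r"address resolution timeout \[([^\]]+)\]")


def failed_hostnames(log_lines):
    """Return the distinct hostnames that failed to resolve, in first-seen order."""
    seen = []
    for line in log_lines:
        match = _TIMEOUT_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    sample = (
        "Jul 11 18:20:36 pid:8945 remote addresses: 136.x.x.x, 2001:638:x.. "
        "ERROR: getaddrinfo_with_retry address resolution timeout "
        "[zone2-iCAT.130.x.x.x] [ai_flags: [2] ai_family: [0] "
        "ai_socktype: [0] ai_protocol: [0]]"
    )
    print(failed_hostnames([sample]))
```

Each name it reports should be something a plain DNS lookup on the server can resolve, or a bare IP address.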

I am quite sure that there is no network or firewall issue: I can log in to the other iRODS instances as a local user of their zone, and an earlier iRODS installation on the same machine was federated with the same remote zones without any networking problems.

My zone:
zone1, 136.x.x.x, 2001:638:x...

The other zones:
zone2, 130.x.x.x (4.2.1 or 4.2.3, unsure)
zone3, 193.x.x.x (4.2.1)

Does anybody here have a hint for me? Someone suggested that IPv6 might cause problems (my server has both IPv4 and IPv6, the other servers only IPv4), but as far as I remember, this worked previously.

Best,
Merret



Some more details:

When I switch to DEBUG, the message following the address resolution timeout is:
Jul 11 18:20:36 pid:8945 DEBUG: getZoneInfo: Invalid zone name from hint zone2

I also see a stack trace dumped at some point, after which the agent exits (so every minute, irodsctl status returns a different pid):

Jul 11 18:21:08 pid:8945 DEBUG: CS_NEG
iRODS Exception:
file: /tmp/tmpIhxPZi/lib/core/include/irods_configuration_parser.hpp
function: T &irods::configuration_parser::get(const std::string &) [T = const std::__1::basic_string<char>]
line: 81
code: -1800000
message:
key "LocalZoneSID" not found in map.
: [-] /tmp/tmpIhxPZi/server/core/src/irods_server_negotiation.cpp:175:irods::error irods::client_server_negotiation_for_server(irods::network_object_ptr, std::string &) : status [KEY_NOT_FOUND] errno [] -- message []
stack trace: (...)

(Many more messages)

Jul 11 18:21:08 pid:8945 DEBUG: Agent [8945] exiting with status = 0

When I try to list my files in the remote zone (ils -l /zone2/home/bob#zone1), I get the following after a long wait, during which the log accumulates messages about retrying getaddrinfo:

remote addresses: 127.0.0.1 ERROR: rcObjStat of /zone2/home/bob#zone1 failed status = -303000 USER_RODS_HOSTNAME_ERR

In the log, I see:

Jul 11 19:28:50 pid:10901 remote addresses: 127.0.0.1, 2001:638:x... ERROR: _rcConnect: setRhostInfo error, IRODS_HOST is probably not set correctly status = -303000 USER_RODS_HOSTNAME_ERR
Jul 11 19:28:50 pid:10901 NOTICE: getAndConnRcatHost: svrToSvrConnect to zone2-iCAT.130... failed

buu...@org.dkrz.de

Jul 12, 2018, 7:22:12 AM
to iRODS-Chat
Hi,
After some more experimenting, I found the error myself.

The problem was that I had misunderstood the documentation. In the command
"iadmin mkzone ZoneB remote zoneB-iCAT.hostname.example.org:ZoneBPort"
I had replaced only the "hostname.example.org" part with the IP address, so I ran
"iadmin mkzone zone2 remote zone2-iCAT.130.x.x.x:1247".
The server then tried to connect to the non-existent host "zone2-iCAT.130.x.x.x:1247". The whole "zoneB-iCAT.hostname.example.org" string is the placeholder for the host, so I should have used
"iadmin mkzone zone2 remote 130.x.x.x:1247".

Removing the wrong zones and recreating them correctly solved the problem.
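
A check like the following, run before "iadmin mkzone", would have caught my mistake. It is only a rough Python sketch: the helper names `split_connection_info` and `host_is_usable` are made up, 192.0.2.10 is a documentation placeholder rather than a real server, and it assumes the connection info is a plain "host:port" string with a hostname or IPv4 address.

```python
import ipaddress
import socket


def split_connection_info(conn_info):
    """Split 'host:port' into (host, port); raise ValueError if malformed."""
    host, sep, port = conn_info.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {conn_info!r}")
    return host, int(port)


def host_is_usable(host, resolver=socket.getaddrinfo):
    """Return True if host is an IP literal or the resolver can look it up."""
    try:
        ipaddress.ip_address(host)  # bare IP addresses need no lookup
        return True
    except ValueError:
        pass  # not an IP literal, fall through to name resolution
    try:
        resolver(host, None)
        return True
    except socket.gaierror:
        return False
```

With these helpers, split_connection_info("192.0.2.10:1247") gives the host "192.0.2.10", which host_is_usable accepts without any lookup. A mixed-up value such as "zone2-iCAT.192.0.2.10" is not an IP literal, so it goes through name resolution and is rejected if the lookup fails.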

A small suggestion: adding an example to the docs, or putting angle brackets (<...>) around the part to be replaced, would make the command easier for beginners to understand.

Thanks!