I wrote a simple RMI-Server that starts up and listening for remote
requests as well as an client, that will access this server.
All platforms (win32, Linux, zLinux, Solaris 2.7, HP-UX 11.10)
performing well.
Only AIX 5.2 has a performance problem during the methods
LocateRegistry.createRegistry(),
Naming.rebind() and Naming.lookup().
All other platforms takes only 20-35 ms for these methods.
AIX needs between 75-90 sec !!!.
I test all of my code locally (localhost - 127.0.0.1), but I found no
explanation. IBM has reproduced this problem, but has no answer.
Do you have any idea?
Example Code below
TestService.java:
-----------------
import java.rmi.Remote;
import java.rmi.RemoteException;
public interface TestService extends Remote
{
public String getTest()
throws RemoteException;
}
TestServer.java:
----------------
import java.rmi.Naming;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.UnicastRemoteObject;
public class TestServer extends UnicastRemoteObject implements
TestService
{
public TestServer() throws RemoteException {
super();
}
public String getTest() {
return "foo.bar";
}
public static void main(String[] args) {
long time,start;
try {
//TestServer testServer = new TestServer();
start = System.currentTimeMillis();
time = start;
System.out.println("before create: "+time);
LocateRegistry.createRegistry(4711);
time = System.currentTimeMillis() - start;
System.out.println("after create: "+time);
start = System.currentTimeMillis();
TestServer testServer = new TestServer();
time = System.currentTimeMillis() - start;
System.out.println("after new: "+time);
start = System.currentTimeMillis();
Naming.rebind("//127.0.0.1:4711/TestServer", testServer);
time = System.currentTimeMillis() - start;
System.out.println("after rebind: "+time);
} catch (Exception e) {
e.printStackTrace();
}
}
}
TestClient.java:
----------------
import java.rmi.Naming;
public class TestClient
{
public TestClient() {
}
public static void main(String args[]) {
TestClient testClient = new TestClient();
testClient.run();
}
static TestService testStub;
void run() {
long time, start;
try {
start = System.currentTimeMillis();
time = start;
System.out.println("before lookup: "+time);
testStub = (TestService)Naming.lookup("//127.0.0.1:4711/TestServer");
time = System.currentTimeMillis() - start;
System.out.println("after lookup: "+time);
start = System.currentTimeMillis();
System.out.println(testStub.getTest());
time = System.currentTimeMillis() -start;
System.out.println("end: "+time);
} catch (Exception e) {
e.printStackTrace();
}
testStub = null;
}
}
Note: JIT is not the problem
Many thanks
Tassilo
this is a long shot, but is there a possibility that something in the
innards is trying to do a reverse DNS lookup and that is taking a long time
to timeout?
I had this idea too and I tested this before.
If I do a nslookup everything is fast and ok
localhost <-> 127.0.0.1 ok
FQHN <-> IP ok
I suppose that there must a time-out problem inside the AIX JVM, but
I can't figure out what happens inside. I 'truss'ed the Java process
and got a lot of polls ending in _nsleep... and after a huge amount of
time
everything works as expected. can you see anything inside this:
749646: 0.0003: __loadx(0x07000000, 0x3493E5B8, 0x0000001C,
0x35687748, 0x34935E38) = 0x356879BC
749646: 0.0003: gethostname(0x2FF215D8, 256) = 0
749646: 0.0003: _getpid() = 749646
749646: 0.0002: open("/etc/resolv.conf", O_RDONLY) = 23
749646: 0.0002: kioctl(23, 22528, 0x00000000, 0x00000000)
Err#25 ENOTTY
749646: = 0
749646: 0.0001: kioctl(23, 22528, 0x00000000, 0x00000000)
Err#25 ENOTTY
749646: kread(23, " d o m a i n\t u n i x .".., 4096) = 86
749646: kread(23, " d o m a i n\t u n i x .".., 4096) = 0
749646: 0.0002: close(23) = 0
749646: 0.0002: _getpid() = 749646
749646: 0.0003: socket(2, 2, 0) = 23
749646: 0.0002: getsockopt(23, 65535, 4104, 0x2FF1FE34,
0x2FF1FE30) = 0
749646: 0.0002: connext(23, 0x34879DF0, 16) = 0
749646: 0.0003: send(23, 0x2FF201A0, 48, 0) = 48
749646: 0.0002: _poll(0x2FF1FEA8, 1, 5000) = 1
749646: 0.0003: nrecvfrom(23, 0x2FF20F50, 1024, 0, 0x2FF1FEE0,
0x2FF1FEA0) = 101
749646: 0.0002: close(23) = 0
749646: 0.0003: socket(2, 2, 0) = 23
749646: 0.0002: getsockopt(23, 65535, 4104, 0x2FF1FE34,
0x2FF1FE30) = 0
749646: 0.0002: connext(23, 0x34879DF0, 16) = 0
749646: 0.0005: send(23, 0x2FF201A0, 48, 0) = 48
749646: 0.0002: _poll(0x2FF1FEA8, 1, 5000) = 1
749646: 0.0003: nrecvfrom(23, 0x2FF20F50, 1024, 0, 0x2FF1FEE0,
0x2FF1FEA0) = 96
749646: 0.0002: close(23) = 0
749646: 0.0002: socket(2, 2, 0) = 23
749646: 0.0002: getsockopt(23, 65535, 4104, 0x2FF1FE34,
0x2FF1FE30) = 0
749646: 0.0002: connext(23, 0x34879DF0, 16) = 0
749646: 0.0002: send(23, 0x2FF201A0, 24, 0) = 24
749646: 0.0002: _poll(0x2FF1FEA8, 1, 5000) = 0
749646: 0.0043: _nsleep(0x355E99F0, 0x355E9A68) = 0
749646: 0.0102: _nsleep(0x355E99F0, 0x355E9A68) = 0
749646: 0.0102: _nsleep(0x355E99F0, 0x355E9A68) = 0
749646: 0.0102: _nsleep(0x355E99F0, 0x355E9A68) = 0
749646: 0.0064: _nsleep(0x346CE9B0, 0x346CEA28) = 0
<snip snip>
749646: 0.0102: _nsleep(0x355E99F0, 0x355E9A68) = 1
749646: 0.0060: nrecvfrom(24, 0x2FF20160, 1024, 0, 0x2FF1F0F0,
0x2FF1F0B0) = 24
749646: 0.0002: close(24) = 0
749646: 0.0004: _getpid() = 749646
749646: 0.0002: socket(2, 2, 0) = 24
749646: 0.0002: bind(24, 0x2FF20450, 16) Err#13 EACCES
749646: 0.0002: close(24) = 0
749646: 0.0002: getuidx(1) = 503
749646: 0.0002: _getpid() = 749646
749646: 0.0005: socket(2, 2, 17) = 24
749646: 0.0003: _getpid() = 749646
749646: 0.0002: bind(24, 0x2FF20280, 16) Err#13 EACCES
749646: 0.0002: kfcntl(24, F_GETFL, 0x00000000) = 2
749646: 0.0002: kfcntl(24, F_SETFL, 0x00000006) = 0
749646: 0.0002: kioctl(24, -2147195266, 0x2FF202F0, 0x00000000)
= 0
749646: 0.0002: sendto(24, 0x3577D098, 52, 0, 0x3577CE90, 16) =
52
749646: 0.0004: _select(2000, 0x2FF1E300, 0x00000000,
0x00000000, 0x2FF1C290) = 1
749646: 0.0003: nrecvfrom(24, 0x3577CF08, 400, 0, 0x2FF1C2A8,
0x2FF1C280) = 36
749646: 0.0002: close(24) = 0
749646: 0.0002: _getpid() = 749646
749646: 0.0002: socket(2, 2, 17) = 24
749646: 0.0002: _getpid() = 749646
749646: 0.0002: bind(24, 0x2FF20280, 16) Err#13 EACCES
749646: 0.0002: kfcntl(24, F_GETFL, 0x00000000) = 2
749646: 0.0002: kfcntl(24, F_SETFL, 0x00000006) = 0
749646: 0.0002: kioctl(24, -2147195266, 0x2FF202F0, 0x00000000)
= 0
749646: 0.0002: kfcntl(24, F_SETFD, 0x00000001) = 0
749646: 0.0002: bind(24, 0x2FF203B0, 16) = 0
749646: 0.0002: ngetsockname(24, 0x2FF203C0, 0x2FF203A0) = 0
749646: 0.0002: _getpid() = 749646
749646: 0.0002: statx("/var/yp/binding/nis.beta.2", 0x3577B958,
76, 0) = 0
749646: 0.0003: ngetsockname(24, 0x2FF20378, 0x2FF20370) = 0
749646: 0.0002: sendto(24, 0x35780D38, 80, 0, 0x3577EA60, 16) =
80
749646: 0.0002: _select(2000, 0x2FF1E370, 0x00000000,
0x00000000, 0x2FF1C300) = 1
749646: 0.0003: nrecvfrom(24, 0x3577EAD8, 8800, 0, 0x2FF1C318,
0x2FF1C2F0) = 32
749646: 0.0002: open("/etc/hosts", O_RDONLY) = 25
749646: 0.0002: kioctl(25, 22528, 0x00000000, 0x00000000)
Err#25 ENOTTY
749646: 0.0002: kfcntl(25, F_SETFD, 0x00000001) = 0
749646: 0.0003: kioctl(25, 22528, 0x00000000, 0x00000000)
Err#25 ENOTTY
749646: kread(25, " # @ ( # ) 4 7\t 1 . 1".., 4096) = 1920
749646: kread(25, " # @ ( # ) 4 7\t 1 . 1".., 4096) = 0
749646: 0.0002: close(25) = 0
---------
Regards
Tassilo
> I suppose that there must a time-out problem inside the AIX JVM, but
> I can't figure out what happens inside. I 'truss'ed the Java process
> and got a lot of polls ending in _nsleep... and after a huge amount of
> time everything works as expected.
If nobody can suggest anything better, then it might be worthwhile putting a
network sniffer on the box and seeing what (if anything) it's trying to
communicate with and what it's saying.
There seems to be a version of ethereal <www.ethereal.com> for AIX.
-- chris