
RMI connection refused


FutureScalper

Apr 30, 2010, 11:07:38 AM

I have a situation where a number of local Java processes register
themselves in a local rmiregistry so they can talk to each other. All
the correct things are being done, and this works 99.9% of the time
perfectly. Each process uses a unique name "XXServer" to bind, etc.

Each process periodically unbinds and rebinds itself to the registry
successfully. But when this Connection refused problem occurs
(rarely), even though a process unbinds and rebinds itself
successfully in the registry, it cannot invoke RMI methods, due to
connection refused.

The URL is correct, specifying the port, etc.; there is no issue in this area.
No firewalls.

This Connection refused runtime problem in the problem process never
resolves itself, and I don't know what I can do to get the process to
recover from this error. All other processes continue to work
normally, until they might rarely experience the same issue.

When I kill and restart the particular application process, then
everything is again normal. So it's that particular process which is
somehow being refused connection due to < insert solution here > I
can't reproduce the problem easily, but once it happens the process
never recovers.

sun.rmi.transport.tcp.TCPEndpoint.newSocket is where the Connection
refused originates. I wonder how I can avoid what appears to be some
resource limitation problem near TCPEndpoint as in the stack trace
below (no line numbers). I'm trying to invoke an RMI method at the
time of exception.

Java 6 Update 17

java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getAskRunLengthFast(Unknown Source)   <------- invoking
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more

FutureScalper

Apr 30, 2010, 11:16:43 AM

Perhaps I need to do this:

sun.rmi.transport.tcp.TCPEndpoint

public static void shedConnectionCaches()

Release idle outbound connections to reduce demand on I/O resources.
All transports are asked to release excess connections.

Seems to me something is accumulating in this area.

EJP

May 2, 2010, 9:16:26 PM
On 1/05/2010 1:16 AM, FutureScalper wrote:
>
> Perhaps I need to do this:
>
> sun.rmi.transport.tcp.TCPEndpoint
>
> public static void shedConnectionCaches()
>
> Release idle outbound connections to reduce demand on I/O resources.
> All transports are asked to release excess connections.
>
> Seems to me something is accumulating in this area.

That wouldn't help in the slightest. 'Connection refused' has one
meaning only. Nothing is listening at the target host:port.

Is 127.0.0.1 the expected IP address to connect to?
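
To make the point about 'Connection refused' concrete, here is a minimal sketch, not taken from the poster's application. It finds a free loopback port, closes the listener, and tries to connect; it then repeats the attempt with a listener open. The class and method names (RefusedDemo, isRefused, freePort) are made up for illustration.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class RefusedDemo {
    /** Returns true if a TCP connect to the loopback address on this port is refused. */
    static boolean isRefused(int port) throws IOException {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 2000);
            return false; // the connection was accepted by a listener's backlog
        } catch (ConnectException e) {
            return true;  // typically: nothing is listening on that port
        }
    }

    /** Finds a currently free port by opening and immediately closing a listener. */
    static int freePort() throws IOException {
        try (ServerSocket ss = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            return ss.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        int port = freePort();
        System.out.println("refused with no listener: " + isRefused(port));
        try (ServerSocket ss = new ServerSocket(port, 50, InetAddress.getLoopbackAddress())) {
            System.out.println("refused with a listener: " + isRefused(port));
        }
    }
}
```

A probe like this, pointed at the port a stub actually connects to rather than at the registry port, is one way to check whether the listener behind a failing remote object is still there.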

FutureScalper

May 6, 2010, 12:46:28 PM

Well, that can't be the case. There is an RMIRegistry process running
locally and N other apps are bound to it.

Restarting the rmiregistry doesn't help. The client actually has to
be killed and restarted.

Thanks for any other suggestions, as I try and debug this situation
which develops only after some hard usage.

I still think something is accumulating, such as connections, etc., or
some resource.


FutureScalper

May 6, 2010, 12:57:17 PM

I explicitly use a Java Web Start property to fully specify the
localhost as:
<property name="java.rmi.server.hostname" value="127.0.0.1" />
I also specify the port explicitly, and periodically unbind and rebind
processes. The convention is XXServer, so, for example,
Core.rebindToRemote [rmi://127.0.0.1:1098/YMServer] serverImpl
is an example of the bind URL for a client. Everything is explicitly
specified. As I said, this thing works for quite a long time, and then
under circumstances I can't figure out, connection is permanently
refused, and I have so far been unable to get the Java application
client to recover without restart.

But the rmiregistry is there, and running. Now, perhaps it is refusing
connection to a specific client for some reason, such as an accumulating
resource within the rmiregistry process itself? However, restarting
rmiregistry, and clients rebinding to it, does not fix the client's
connection problems, so I don't think that's it.

Roedy Green

May 6, 2010, 1:58:59 PM
On Thu, 6 May 2010 09:57:17 -0700 (PDT), FutureScalper
<future...@gmail.com> wrote, quoted or indirectly quoted someone
who said :

You might try snooping on the conversation with something like
WireShark. There might be something in the messages that would give
you a bit more information.

Is there any sort of debug mode on the RMI server that will log stuff
that could give you a clue?

When you detect the failure in the client, do you go right back to
square 1? Have you profiled to see if there is some strange object
accumulation in either client or server?

After you get a fail, do all other clients thereafter fail, or just
the one that failed?
--
Roedy Green Canadian Mind Products
http://mindprod.com

What is the point of a surveillance camera with insufficient resolution to identify culprits?

EJP

May 6, 2010, 11:44:55 PM
On 7/05/2010 2:57 AM, FutureScalper wrote:
>>> That wouldn't help in the slightest. 'Connection refused' has one
>>> meaning only. Nothing is listening at the target host:port.
>>
>> Well, that can't be the case. There is an RMIRegistry process running
>> locally and N other apps are bound to it.

The number of apps that are bound to the registry isn't relevant. It is
possible that the Registry's accept thread has stopped somehow, which
would cause its backlog queue to fill up, which on Windows also provokes
an ECONNREFUSED. That and a firewall are the only other conditions
besides no listener that cause ECONNREFUSED, and ECONNREFUSED is the
only condition that causes 'connection refused' in Java.

>> Restarting the rmiregistry doesn't help. The client actually has to
>> be killed and restarted.

That's bizarre. Indicative but bizarre. Does restarting the client help
without restarting anything else?

> Now, perhaps it is
> refusing connection
> to a specific client for some reason

TCP can't do that. But it could start refusing all clients, as above.

> However, restarting rmiregistry,
> and clients rebinding to it

Don't you mean servers rebinding to it? (as clients ;-))

> does not fix the client's connection problems, so I
> don't think that's it.

I'm getting confused here. RMI Servers do Registry.bind/rebind. RMI
clients do Registry.lookup. RMI Servers are in fact clients of the
Registry, which is also an RMI server, but let's not add that
complication. If you restart the Registry it will have no bindings, so
servers would have to rebind (or be restarted) before clients would work.

I think it would be worthwhile running the Registry with some RMI
tracing properties - see the links via the RMI home page.
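
As a rough sketch of what such tracing could look like inside one of the application processes: the snippet below sets one standard RMI property and two Sun/Oracle implementation-specific ones. The class name RmiTracing is hypothetical. The properties are generally read once, when the RMI classes initialize, so the assumption here is that enable() runs before any RMI call.

```java
class RmiTracing {
    /** Turns on RMI logging; intended to run before any RMI class is used. */
    static void enable() {
        // Standard property: log incoming remote calls and exceptions to System.err.
        System.setProperty("java.rmi.server.logCalls", "true");
        // Implementation-specific (Sun/Oracle JVMs): TCP transport activity.
        System.setProperty("sun.rmi.transport.tcp.logLevel", "VERBOSE");
        // Implementation-specific: print stack traces of server-side exceptions.
        System.setProperty("sun.rmi.server.exceptionTrace", "true");
    }

    public static void main(String[] args) {
        enable();
        System.out.println("logCalls = " + System.getProperty("java.rmi.server.logCalls"));
    }
}
```

For the standalone rmiregistry tool, the same properties can be passed through to its JVM with the -J prefix, for example -J-Dsun.rmi.transport.tcp.logLevel=VERBOSE.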

FutureScalper

May 7, 2010, 12:28:22 AM
> servers would have to rebind (or be restarted) before clients would work.

>
> I think it would be worthwhile running the Registry with some RMI
> tracing properties - see the links via the RMI home page.

I'm sorry I neglected to make it clear that EACH process is both a
client and a server.

In other words, it can look up itself using RMI, as well as looking up
other clients.

Each process exposes the same interface both to itself, and to other
clients.

I know, sounds a little weird, but it's a trading application which
has to query both its own state, and potentially the state of other
apps, which analyze other futures contracts. Works perfectly, except
when it doesn't, which is quite RARE.

So part of each process, if you like, is a client, and the other part
is a server.

Thanks to all who made suggestions. I'm following up.

FutureScalper

May 7, 2010, 12:34:53 AM

Another thing, is that there is NO FIREWALL being used.

Sorry I am unable to reproduce this problem at will. I've enabled
line number tracebacks in my clients so that if/when this happens
again I'll get precise info on just exactly where it failed.

For performance reasons I usually do not run with debug, as this thing
does dozens of rmi queries as a group, about 6 times per second both
locally and remotely.

So it's pretty fast, and so far quite reliable until I get this
connection refused issue in one of the clients and I can't figure out
how to help it recover.

FutureScalper

May 7, 2010, 12:52:32 AM

Here's what it looks like, and each stack trace (sorry, no line numbers)
can be seen to be calling a different method on the interface. Once I
get this, I'm not sure what to do to clear the problem. I have a
watchdog which unbinds/rebinds the server implementation to the
rmiregistry periodically, and I am also calling that TCPEndpoint static
method,
sun.rmi.transport.tcp.TCPEndpoint
public static void shedConnectionCaches()
hoping to "harvest" or recycle whatever may lurk in the connection
caches :) I don't believe it helps at all, and I call it prior to each
periodic rebind. So I expect it to fail until the next watchdog rebind,
and then to clear itself... but, alas, it doesn't.

ERR: 10.05.06 12:36:02.369: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown Source)   <---
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more

OUT: 10.05.06 12:36:02.853: MainFrame focus LOST.
ERR: 10.05.06 12:36:03.380: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown Source)   <---
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more

OUT: 10.05.06 12:36:04.013: FutureScalperBookChart average elapsed (msecs) is: 1.7
OUT: 10.05.06 12:36:04.368: ## ChartFiller run [10m] 17(0) msec (ACTIVE)
ERR: 10.05.06 12:36:04.388: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getBidRunLengthFast(Unknown Source)   <---
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more

OUT: 10.05.06 12:36:04.388: displayXTMessage:---fail: B-
notAboveTrigger
OUT: 10.05.06 12:36:04.388: displayXTMessage:*** BUY checks per sec:
0.3 <-- normally 6.0 per second
OUT: 10.05.06 12:36:04.594: UnifiedInventory avg(50) elapsed: 1.1
(slowed down due to exception processing, etc.)
ERR: 10.05.06 12:36:05.554: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown Source)   <---
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more

ERR: 10.05.06 12:36:06.554: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown   <---

EJP

May 7, 2010, 4:28:18 AM
On 7/05/2010 2:52 PM, FutureScalper wrote:
> I am calling that TCPEndpoint
> static method sun.rmi.transport.tcp.TCPEndpoint
> public static void shedConnectionCaches() hoping to "harvest" or
> recycle whatever may lurk in the connection caches :)

Don't do that.

> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown
> Source)<---
> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown
> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getBidRunLengthFast(Unknown
> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown
> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown

None of these has anything to do with the Registry so I don't know why
you thought restarting the Registry would do anything. They are all in
calls to the *same* remote object, RemoteIndicatorService. So is that
one doing something odd? like deadlocking itself?

BTW are all these objects exported on the same port? They should be.
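
For readers unfamiliar with what "exported on a port" means: calls through a stub go to the TCP port the remote object was exported on, not to the registry port. The sketch below exports an object on an explicit port, makes one call through the stub, and unexports it again. IndicatorService, IndicatorServiceImpl and FixedPortExport are hypothetical stand-ins, not the poster's classes, and port 2001 is an arbitrary example.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

/** Hypothetical stand-in for the poster's remote interface. */
interface IndicatorService extends Remote {
    int getLastAct() throws RemoteException;
}

class IndicatorServiceImpl implements IndicatorService {
    @Override
    public int getLastAct() {
        return 42; // placeholder value for the demo
    }
}

class FixedPortExport {
    /**
     * Exports an implementation on the given TCP port (0 lets RMI choose),
     * makes one call through the stub, then unexports the object so the
     * JVM is free to exit.
     */
    static int roundTrip(int port) throws RemoteException {
        // Mirrors the java.rmi.server.hostname setting quoted earlier in the thread.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        IndicatorServiceImpl impl = new IndicatorServiceImpl();
        IndicatorService stub =
                (IndicatorService) UnicastRemoteObject.exportObject(impl, port);
        try {
            return stub.getLastAct(); // travels over a socket, even within one JVM
        } finally {
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws RemoteException {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 2001;
        System.out.println("getLastAct via port " + port + ": " + roundTrip(port));
    }
}
```

With a known export port, a tool such as netstat can show whether that port is still in the listening state when the refusals start.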

FutureScalper

May 7, 2010, 9:18:56 AM

I thank you for the suggestions.

No, concurrency is not an issue, and no deadlocks are taking place as
far as I know; this whole thing is highly thread-tolerant.

I have enough suggestions to work on and I'm also not reproducing the
issue myself since last post.

I'll try and break it again by having a couple of them heavily cross-
referencing each other.

Appreciate your help, will post resolution if I find one.

FutureScalper

May 12, 2010, 9:11:19 AM
On May 7, 9:18 am, FutureScalper <futurescal...@gmail.com> wrote:
> On May 7, 4:28 am, EJP <esmond.not.p...@not.bigpond.com> wrote:
>
> > On 7/05/2010 2:52 PM, FutureScalper wrote:
>
> > > I am calling that TCPEndpoint
> > > static method sun.rmi.transport.tcp.TCPEndpoint
> > > public static void shedConnectionCaches() hoping to "harvest" or
> > > recycle whatever may lurk in the connection caches :)
>
> > Don't do that.
>
> > > com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown
> > > Source)<---
> > > com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown
> > > com.twc.remote.RemoteIndicatorServiceImpl_Stub.getBidRunLengthFast(Unknown
> > > com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown
> > > com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown
>
> > None of these has anything to do with the Registry so I don't know why
> > you thought restarting the Registry would do anything. They are all in
> > calls to the *same* remote object, RemoteIndicatorService. So is that
> > one doing something odd? like deadlocking itself?
>
> > BTW are all these objects exported on the same port? They should be.
>
> I thank you for the suggestions.
>
> No, concurrency is not an issue, and no deadlocks taking place as far
> as I know and this whole thing is highly threads tolerant.
>
> I have enough suggestions to work on and I'm also not reproducing the
> issue myself since last post.
>
> I'll try and break it again by having a couple of them heavily cross-
> referencing each other.
>
> Appreciate your help, will post resolution if I find one.

Thanks again for suggestions. To avoid possible deadlock, I've just
made the RMI server implementation single-threaded even though it's
read only stuff and should not require synchronization.

Don't think that's the issue, but just as a sanity check.

I've experienced the issue once since last post under fairly heavy
usage so it's hard for me to reproduce.

FutureScalper

May 28, 2010, 2:10:35 PM

I've been thinking about this issue, and I suspect that the RMI socket
reader thread may be crashing for some unknown reason. I don't have
any direct evidence of this right now, but I smell something like that
happening.

So, I'll look into how I can determine that, and possibly implement my
own more reliable RMI reader thread if that's the issue. Can't think
of anything else that would cause this situation, other than the
socket reader thread dying within my client process (acting as a
server).

Esmond Pitt

Jun 23, 2010, 11:29:22 PM
> I've been thinking about this issue, and I suspect that the RMI socket
> reader thread may be crashing for some unknown reason. I don't have
> any direct evidence of this right now, but I smell something like that
> happening.

*I* smell the remote object deadlocking itself so that nothing can
proceed. You don't have any evidence about the RMI reader thread
crashing and I have personally never seen it in 13 years of RMI. And why
would it only crash for one specific remote object? Obviously the
problem is associated with that object, not the RMI system. You're
barking up the wrong tree.

> Can't think of anything else that would cause this situation

I had already made the suggestion above.
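
One way to test the deadlock hypothesis from inside the affected process is to ask the JVM directly. The following is a minimal sketch using the standard java.lang.management API (available since Java 6); the class and method names are made up for illustration. It only reports cycles on monitors and ownable synchronizers, so an empty result does not rule out a thread that is merely stuck or blocked on I/O.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

class DeadlockCheck {
    /** Returns a description of each thread the JVM reports as deadlocked (empty if none). */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is detected
        List<String> names = new ArrayList<String>();
        if (ids != null) {
            for (ThreadInfo info : mx.getThreadInfo(ids)) {
                if (info != null) { // a thread may have ended in the meantime
                    names.add(info.getThreadName() + " waiting on " + info.getLockName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = deadlockedThreadNames();
        System.out.println(names.isEmpty() ? "no deadlock detected" : names.toString());
    }
}
```

A watchdog thread could call a check like this periodically and log the result; running jstack against the process when the refusals start gives similar information without code changes.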
