Problems with chronicle engine after network disconnect(s)

Alex Smith

Nov 4, 2015, 1:21:39 PM
to Chronicle
Hi,

I've got a chronicle engine server/client setup in the following manner:

Server:
..
assetTree = new VanillaAssetTree().forServer( .. )
endPoint = new ServerEndpoint("localhost:9090", assetTree, WireType.TEXT);
..

Client:
servers = {"localhost:9090"};
assetTree = new VanillaAssetTree().forRemoteAccess(servers, WireType.TEXT, new ConnectionMonitor());

I'm able to write messages at the server and subscribe to these writes on the client. This works very well.

Problems start occurring when I have one or more disconnects from the client. For each disconnect it seems like the server is unable to release the objects it's trying to send to the non-existing client. I tried to profile this in jvisualvm, and I see the following behavior:

1) One client, no disconnects:
The instance count of net.openhft.chronicle.engine.server.internal.ObjectKVSubscriptionHandler goes up, then drops back to zero, and this cycle repeats.
2) One client, client disconnected:
The instance count of net.openhft.chronicle.engine.server.internal.ObjectKVSubscriptionHandler keeps going up.
3) One client, connected, disconnected, reconnected, and so on:
The instance count of net.openhft.chronicle.engine.server.internal.ObjectKVSubscriptionHandler keeps going up, growing in proportion to the number of disconnects.

Anyone able to help out with this?

Thanks,
Alex

Alex Smith

Nov 4, 2015, 1:27:53 PM
to Chronicle
Forgot to mention that I'm using chronicle-engine 1.9.4.

Peter Lawrey

Nov 4, 2015, 1:30:17 PM
to java-ch...@googlegroups.com
Hello Alex,
   At the moment, it detects bad subscriptions lazily, i.e. when the server tries to send an event to a subscription whose connection has disconnected, it detects that the connection needs cleaning up. This is not ideal, but works well enough in a system which has events running through it.
   Can you confirm this is the case?
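
   To make the idea of lazy detection concrete, here is a minimal sketch in plain Java. It is not Chronicle Engine code: Connection and Publisher are hypothetical stand-ins, and the real server classes are structured differently. It only illustrates the pattern, where a dead subscriber is noticed and removed at the moment the next event is published to it, not when the connection drops.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class LazySubscriptionCleanup {

    /** Hypothetical stand-in for a client connection; not a Chronicle type. */
    static final class Connection {
        private volatile boolean open = true;
        final List<String> received = new ArrayList<>();

        void close() {
            open = false;
        }

        void send(String event) {
            if (!open) {
                throw new IllegalStateException("connection closed");
            }
            received.add(event);
        }
    }

    /** Hypothetical publisher that only notices dead connections when it publishes. */
    static final class Publisher {
        // CopyOnWriteArrayList iterates over a snapshot, so removing during the loop is safe.
        private final List<Connection> subscribers = new CopyOnWriteArrayList<>();

        void subscribe(Connection connection) {
            subscribers.add(connection);
        }

        int subscriberCount() {
            return subscribers.size();
        }

        void publish(String event) {
            for (Connection connection : subscribers) {
                try {
                    connection.send(event);
                } catch (IllegalStateException e) {
                    // Lazy cleanup: the failed send is the only trigger for removal.
                    subscribers.remove(connection);
                }
            }
        }
    }

    public static void main(String[] args) {
        Publisher publisher = new Publisher();
        Connection live = new Connection();
        Connection dead = new Connection();
        publisher.subscribe(live);
        publisher.subscribe(dead);

        dead.close();
        // The dead subscriber is still registered because nothing has been published yet.
        System.out.println(publisher.subscriberCount()); // prints 2

        publisher.publish("update");
        // The failed send removed the dead subscriber.
        System.out.println(publisher.subscriberCount()); // prints 1
    }
}
```

   In this sketch a closed connection stays registered until the next publish, so a system with no events flowing would never release it.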

Regards,
   Peter.


Alex Smith

Nov 4, 2015, 1:37:22 PM
to Chronicle
Hi Peter,

thank you for your reply. Getting a new subscription up and running again works fine on both the server and client side. The problem for me is that memory seems to fill up, e.g. 100 reconnects seem to create 100x more instances of net.openhft.chronicle.engine.server.internal.ObjectKVSubscriptionHandler.

The reason this is a problem is that I saw throughput of my application fall drastically, therefore causing latency as it was not able to handle data quickly enough.


Br
Alex

Rob Austin

Nov 4, 2015, 1:42:23 PM
to java-ch...@googlegroups.com
Alex

I'm sorry, I don't understand why you have 100 reconnects (why are you not able to reconnect after the first attempt?).

I take it that the problem is that the net.openhft.chronicle.engine.server.internal.ObjectKVSubscriptionHandler is not being freed (something still has a reference to it once the connection drops). Does that sound likely?

Rob

Rob Austin

Nov 4, 2015, 1:45:54 PM
to java-ch...@googlegroups.com
Alex

Do you also see 100x instances of net.openhft.chronicle.engine.server.internal.EngineWireHandler?

Rob

Alex Smith

Nov 4, 2015, 1:53:18 PM
to Chronicle
Hi Rob,

the 100x reconnect was just an example. I had this running on a server, and after one reconnect I started seeing latency after a couple of hours. 

In order to reproduce the problem I tried running locally. By closing the network connection between the client and the server (using TCPView) I could see that the net.openhft.chronicle.engine.server.internal.ObjectKVSubscriptionHandler was not freed (the instance count kept increasing). With multiple disconnects, the number of instances increased faster.

No, it seems like it's only ObjectKVSubscriptionHandler that is not being freed.

Br 
Alex

Rob Austin

Nov 4, 2015, 1:58:40 PM
to java-ch...@googlegroups.com
Alex

Are you able to tell me which classes hold the reference to the net.openhft.chronicle.engine.server.internal.ObjectKVSubscriptionHandler? Looking at the code, there does not appear to be anything out of place (that I can see).

Rob

Alex Smith

Nov 4, 2015, 2:11:11 PM
to Chronicle
Seems like it could be VanillaWireOutPublisher. Besides LinkedTransferQueue, that's what shows up in the references within jvisualvm when I look at ObjectKVSubscriptionHandler.

I'll get some additional help looking at this tomorrow.

Alex

Rob Austin

Nov 4, 2015, 2:12:15 PM
to java-ch...@googlegroups.com
thanks

Peter Lawrey

Nov 4, 2015, 2:35:11 PM
to java-ch...@googlegroups.com
If this is the case, it means the subscription to a dead connection isn't being cleaned up, which is a bug.

When an event is passed to a connection which is closed, it should clean up.

Rob Austin

Nov 4, 2015, 3:46:27 PM
to java-ch...@googlegroups.com
Alex, Peter

I've written a test (see below) which I think reproduces the problem. However, in my profiler the problem looks more likely to be around the WireTcpHandler's lambda expressions (their instance count ticks up with the number of disconnections).

Rob




My test is as follows (run this and then run the profiler):

import net.openhft.chronicle.engine.server.ServerEndpoint;
import net.openhft.chronicle.engine.tree.VanillaAssetTree;
import net.openhft.chronicle.network.TCPRegistry;
import net.openhft.chronicle.network.connection.TcpChannelHub;
import net.openhft.chronicle.wire.WireType;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;

/**
 * The class name must differ from the imported org.junit.Test annotation,
 * otherwise the file does not compile.
 *
 * @author Rob Austin.
 */
public class DisconnectReconnectTest {

    private static final String CONNECTION = "host.port.KeySubscriptionTest";
    public static final WireType WIRE_TYPE = WireType.TEXT;
    private VanillaAssetTree serverAssetTree;

    @Before
    public void before() throws IOException {
        serverAssetTree = new VanillaAssetTree().forTesting();

        TCPRegistry.createServerSocketChannelFor(CONNECTION);
        new ServerEndpoint(CONNECTION, serverAssetTree, WIRE_TYPE);
    }

    /**
     * Run this with a profiler attached; the loop never terminates.
     *
     * @throws InterruptedException if the sleep is interrupted
     */
    @Test
    public void test2() throws InterruptedException {

        VanillaAssetTree clientTree = new VanillaAssetTree().forRemoteAccess(CONNECTION, WIRE_TYPE);
        final TcpChannelHub hub = clientTree.root().findView(TcpChannelHub.class);

        assert hub != null;

        for (; ; ) {
            Thread.sleep(2000); // give time for the disconnect and reconnect
            hub.forceDisconnect();
        }
    }
}

Alex Smith

Nov 8, 2015, 6:41:22 AM
to Chronicle
Hi Rob,

Can you try the following test with both hasSubscriber = true and hasSubscriber = false, using chronicle-engine 1.9.4?

On my machine the number of instances of ObjectKVSubscriptionHandler keeps increasing with hasSubscriber = true, while it stays low with hasSubscriber = false.

I will try the same code using the snapshots in a bit.

package testing;

import net.openhft.chronicle.engine.api.map.MapView;
import net.openhft.chronicle.engine.api.pubsub.TopicSubscriber;
import net.openhft.chronicle.engine.server.ServerEndpoint;
import net.openhft.chronicle.engine.tree.VanillaAssetTree;
import net.openhft.chronicle.network.TCPRegistry;
import net.openhft.chronicle.network.connection.TcpChannelHub;
import net.openhft.chronicle.wire.WireType;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;

public class EngineTest {

    private static final String CONNECTION = "host.port.KeySubscriptionTest";
    public static final WireType WIRE_TYPE = WireType.TEXT;
    private VanillaAssetTree serverAssetTree;
    private MapView<String, String> dataMap;
    private boolean hasSubscriber = true;

    @Before
    public void before() throws IOException {
        serverAssetTree = new VanillaAssetTree().forTesting();
        dataMap = serverAssetTree.acquireMap("/data/chronicle", String.class, String.class);

        TCPRegistry.createServerSocketChannelFor(CONNECTION);
        new ServerEndpoint(CONNECTION, serverAssetTree, WIRE_TYPE);
    }

    @Test
    public void test2() throws InterruptedException {

        VanillaAssetTree clientTree = new VanillaAssetTree().forRemoteAccess(CONNECTION, WIRE_TYPE);
        final TcpChannelHub hub = clientTree.root().findView(TcpChannelHub.class);

        if (hasSubscriber) {
            TopicSubscriber<String, String> topicSubscriber =
                    (topic, message) -> System.out.println("Topic " + topic + " message " + message);
            clientTree.registerTopicSubscriber("/data/chronicle", String.class, String.class, topicSubscriber);
        }
        assert hub != null;

        for (; ; ) {
            if (hasSubscriber) {
                dataMap.put("TestKey", "TestValue");
            }

            Thread.sleep(2000); // give time for the disconnect and reconnect
            hub.forceDisconnect();
        }
    }
}

Rob Austin

Nov 8, 2015, 6:46:59 AM
to java-ch...@googlegroups.com, Peter Lawrey
Alex

Thanks for the test. I'm a bit busy at the moment, but I will definitely take a look at fixing this toward the end of next week.

Rob


Rob Austin

Nov 13, 2015, 2:24:27 PM
to java-ch...@googlegroups.com
Alex


This should now be fixed. Can you retest with

<dependency>
    <groupId>net.openhft</groupId>
    <artifactId>chronicle-bom</artifactId>
    <version>1.10.1</version>
    <type>pom</type>
    <scope>import</scope>
</dependency>

and let me know if it fixes it?

thanks


Rob Austin

