Correct way to disconnect from persistent subscription


Anand

May 8, 2017, 10:10:54 AM
to Event Store
What is the correct way to disconnect from a persistent subscription? And does the Event Store .NET API attempt to automatically connect to the persistent subscription under any circumstances?

After long periods of inactivity (greater than 24 hours), we see multiple open connections to the subscription group. Some subscriptions show around 100 connections!

Hayley-Jean Campbell

May 8, 2017, 10:39:35 AM
to Event Store
To disconnect from a persistent subscription you need to call Stop() on the subscription. And Event Store should not automatically connect to a persistent subscription.

How are you measuring the number of connections to the subscription group?
And how are your clients connecting to your persistent subscription?
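For reference, a minimal sketch of the connect/stop lifecycle with the .NET ClientAPI. The stream and group names, handler bodies, and the 5-second timeout are illustrative, not taken from the thread:

```csharp
using System;
using System.Threading.Tasks;
using EventStore.ClientAPI;

class SubscriberSketch
{
    // Hypothetical wiring: assumes the connection is open and the
    // subscription group already exists on the server.
    static async Task RunAsync(IEventStoreConnection connection)
    {
        var subscription = await connection.ConnectToPersistentSubscriptionAsync(
            "my-stream",
            "my-group",
            (sub, resolvedEvent) => { /* process resolvedEvent */ },
            (sub, reason, exception) => { /* log the drop */ });

        // ... run until shutdown is requested ...

        // Stop() is the clean way to disconnect: it unsubscribes and waits
        // up to the given timeout for in-flight events to finish.
        subscription.Stop(TimeSpan.FromSeconds(5));
    }
}
```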

Anand

May 8, 2017, 10:46:16 AM
to Event Store
Yep. That is what we call. 

I am looking at the connections for the subscriber group on the competing consumers web page.

We are using ConnectToPersistentSubscriptionAsync to connect.

And I think this problem is just specific to our clients running in a virtual machine after long periods of inactivity.

Anand

May 8, 2017, 10:48:04 AM
to Event Store
Also, is it possible to just disconnect all connections to a particular subscriber group? We have 100 connections for a client, and stopping the client only reduces the count by 1. The only way to get rid of the other 99 is to restart Event Store.

Our Event Store is running in Azure.

Greg Young

May 8, 2017, 10:52:49 AM
to event...@googlegroups.com
This sounds like you are leaking connections. Taking down the client process would also close them.



--
Studying for the Turing test

Anand

May 8, 2017, 10:55:54 AM
to Event Store
What causes that? And taking down the client process only reduces the number of connections by 1. The only way to get rid of all connections is to shut down Event Store.

Greg Young

May 8, 2017, 10:57:06 AM
to event...@googlegroups.com
The UI tells you where the connections are from. How are you measuring this?

Anand

May 8, 2017, 11:17:55 AM
to Event Store
I am looking at the connections column in the competing consumers page. 

I did not check the source of the connections today, but when it occurred previously the connections were all from the same machine where the client was running (we had no other client running on that machine). The client did not receive any events until we restarted it. Also, I have a suspicion this is related to virtual machines, since the subscription groups connected from the base machine do not have this issue.

Anand

May 17, 2017, 10:38:33 AM
to Event Store
The issue occurred again and I grabbed some screenshots:

1. CC1.png shows the main Connections page. There is only one connection.
2. CC2.png shows the Competing Consumers page. We connect to each subscriber group from a separate machine, so each connection count should be 1. Instead, the counts are all over the place; some show 100, which corresponds to the maximum subscribers setting.
3. CC100.txt shows the connection info for a particular subscriber group that shows 100 on the Competing Consumers page. I have replaced the actual IP with 0.0.0.0 (I can send it through email if needed).

The IP in CC100.txt is different from the IP in CC1.png, so some subscriber group connections are not being cleaned up when the connection drops.

Thoughts on why this is happening and how to avoid it? As I said before, I have a feeling this is limited to VMs.
CC100.txt
CC1.png
cc2.png

Hayley-Jean Campbell

May 18, 2017, 2:33:40 AM
to Event Store
Would you be able to share a sample of the code you are using to connect to your persistent subscriptions, or provide us with a test case?

Anand

Jul 17, 2017, 9:56:01 AM
to Event Store
Hey, I was able to replicate the issue with the attached test service. Steps to reproduce: install the service and run it over the weekend. More than 24 hours of inactivity results in orphan connections showing up, and event delivery has issues. The only solution we have found so far is to restart Event Store every Monday.
OrphanConnectionTest.zip

Greg Young

Jul 17, 2017, 10:02:55 AM
to event...@googlegroups.com
Can you include a log covering this period? (It will obviously take us a while to run the test.)


Anand

Jul 17, 2017, 10:48:25 AM
to Event Store
I have attached the logs.

EventStore Logs.zip

Greg Young

Jul 17, 2017, 11:22:25 AM
to event...@googlegroups.com
Which date did the problem happen on, and what setup are you running? I am noticing some odd messages going through, such as:

[PID:09600:011 2017.07.14 04:33:48.184 ERROR QueuedHandlerMresWit] ---!!! VERY SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 79642ms. Q: 0/2.
[PID:09600:011 2017.07.14 04:33:55.981 ERROR QueuedHandlerMresWit] ---!!! VERY SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 7797ms. Q: -1/0.
[PID:09600:011 2017.07.14 04:34:49.904 ERROR QueuedHandlerMresWit] ---!!! VERY SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 51251ms. Q: 0/1.
[PID:09600:011 2017.07.14 08:32:51.506 ERROR QueuedHandlerMresWit] ---!!! VERY SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 10297ms. Q: 0/0.
[PID:09600:011 2017.07.14 08:39:02.469 ERROR QueuedHandlerMresWit] ---!!! VERY SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 351228ms. Q: 0/11.
[PID:09600:011 2017.07.14 08:39:10.063 ERROR QueuedHandlerMresWit] ---!!! VERY SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 7594ms. Q: -10/0.
[PID:09600:011 2017.07.14 08:39:31.657 ERROR QueuedHandlerMresWit] ---!!! VERY SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 20063ms. Q: 0/0.

79 seconds to write data to disk is rather unusual.


Anand

Jul 17, 2017, 11:36:43 AM
to Event Store
The service has been running since the 14th. The problem should have occurred on the evening of the 15th or on the 16th; I checked around noon on the 15th and there were no orphan connections.

We are running a single instance of Event Store 4.0 (EventStore.ClusterNode.exe --db ./db --log ./logs --ExtIp 0.0.0.0). We are using the open source version (we are in the process of acquiring a commercial license).

Anand

Jul 25, 2017, 4:14:15 PM
to Event Store
Have you had a chance to look into this issue?

Hayley-Jean Campbell

Jul 26, 2017, 2:40:43 AM
to Event Store
Hi Anand,

I have not yet had a chance to test this properly, but I have looked through the code sample.
Could you please try calling Stop() on the subscription when it is dropped and see if that helps? The connection code currently doesn't do this.

For example:

var result = await _connection.ConnectToPersistentSubscriptionAsync(
    streamName,
    groupName,
    (subscription, resolved) => Task.Run(() =>
        OnEventAppearedPersistentStream(subscription, GetSubscriptionKey(streamName, groupName), resolved, false)),
    (subscription, dropReason, exception) =>
    {
        subscription.Stop(timeout);
        Task.Run(() => OnSubscriptionDropped(streamName, groupName, dropReason, exception));
    },
    null,
    Environment.ProcessorCount,
    false);


Also, how often is your connection to the persistent subscription being dropped and reconnected?
And what, if anything, causes the subscription connections to be dropped?

Anand

Jul 26, 2017, 11:10:23 AM
to Event Store
I will run the test with Stop and see what happens.

Based on our test program logs, the subscriptions dropped and reconnected around 15 times from the 13th to the 17th of July, and the connection dropped a similar number of times. The connection drops may be related to heartbeat timeouts; we used to see frequent connection drops and Greg recommended we increase the heartbeat timeout.
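For context, the heartbeat tuning mentioned above is done via the ClientAPI connection settings builder; something like the following, where the specific values are illustrative rather than recommendations:

```csharp
using System;
using EventStore.ClientAPI;

// A longer heartbeat interval/timeout makes the client more tolerant of
// pauses (e.g. a VM stalling after long inactivity) before it declares
// the connection dead and triggers a drop/reconnect cycle.
var settings = ConnectionSettings.Create()
    .SetHeartbeatInterval(TimeSpan.FromSeconds(30))
    .SetHeartbeatTimeout(TimeSpan.FromSeconds(60));
```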

Hayley-Jean Campbell

Jul 27, 2017, 5:51:17 AM
to Event Store
Hi Anand,

After more testing today, we believe we have found the bug causing your issue.

When the client attempts to create a connection to a persistent subscription, it will send off a request and wait for a response. If it does not receive a response within the OperationTimeout, it will retry the connection operation.
Since you have specified the KeepRetrying setting, the client will retry this connection operation until it eventually succeeds.
In addition to this, the Persistent Subscription Service on the Event Store server does not track which client operation caused the connection operation. This means that the server will create a connection for each retry.

From the logs you provided, your Event Store server was experiencing a lot of slow queues during the time you saw the extra connections being created.
This means that a lot of the connection operations were likely timing out, causing them to be retried, which would cause multiple connections to be created all at the same time.

An issue for this has been added to GitHub; you can track it here.

Anand

Jul 27, 2017, 9:24:16 AM
to Event Store
Hi, thanks for the response. Do you have an ETA for the fix? We will be using Event Store in a production environment in a few weeks.

Anand

Jul 28, 2017, 10:32:58 AM
to Event Store
I modified our code and added Stop() as you had recommended, but that is crashing our service. I was also able to replicate the issue with the test program I sent you.

Hayley-Jean Campbell

Jul 28, 2017, 10:59:50 AM
to Event Store
Hi Anand,

I do apologise for not mentioning this in my last comment, but calling Stop() will not help fix this issue.

We will be working on a fix for this after the release of 4.0.2.
For now, there are a few options for working around this:

1. The simplest is to find the source of the slow queue messages on your servers:
    i. What are the specs of your VM?
    ii. What kind of disks are you running on?
    iii. What kind of load are you putting on your Event Store when you see these issues?

2. You could reduce the number of occurrences of this issue by increasing the operation timeout on your connection settings. However, from the logs that you provided, there were some cases where the queues were taking more than 300 seconds to process a message. So in order for this to work, this timeout value would need to be set to a very high number.
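A sketch of option 2, using connection-settings builder methods from the ClientAPI. The timeout and retry values here are illustrative placeholders, not recommendations:

```csharp
using System;
using EventStore.ClientAPI;

// A longer operation timeout makes slow server queues less likely to trip
// the client-side retry that piles up duplicate subscription connections.
// Bounding retries (instead of KeepRetrying) limits the pile-up if an
// operation still times out.
var settings = ConnectionSettings.Create()
    .SetOperationTimeoutTo(TimeSpan.FromMinutes(6))
    .LimitRetriesForOperationTo(5);

var connection = EventStoreConnection.Create(
    settings, new Uri("tcp://admin:changeit@localhost:1113"));
```

The trade-off is that genuinely failed operations also take that long to surface to the caller.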

Anand

Jul 28, 2017, 11:07:59 AM
to Event Store
Are there any cons to increasing the operation timeout?

Our production environment should have better specs than our internal test servers, so 1(i) and 1(ii) should not be an issue. As for 1(iii), the issue occurs when no one is using Event Store over the weekend; we have never seen it occur while we are sending and receiving events without long periods of inactivity.

Greg Young

Jul 28, 2017, 11:13:59 AM
to event...@googlegroups.com
300 seconds is not reasonable in any environment.
