server down

dberjman...@gmail.com

Feb 13, 2013, 10:26:51 AM
to curato...@googlegroups.com
Hi.

We are using Curator recipes (1.3.1) to perform distributed locking. This is our code:

1) CuratorFrameworkFactory.newClient(this.connectString, new RetryOneTime(100));
2) client.getConnectionStateListenable().addListener(new ConnectionStateListener() {...
3) new InterProcessMutex(this.getCuratorClient(), lockPath);
4) if (lock.acquire(timeout, TimeUnit.MILLISECONDS)) {

We are testing different scenarios in which the server shuts down:

1) Shut down the server before 'lock.acquire'. We expect Curator to throw an exception, but it takes too long to throw it, much longer than 'timeout'.

2) Shut down the server while one process is blocked in 'lock.acquire' and another process holds the lock. The process holding the lock finishes without problems, but the other process waits on the lock much longer than the timeout. A few seconds later, Curator throws an exception.

Could anyone help us? Our goal is for Curator to throw an exception immediately after the client loses its connection to the server, and to release all locks.

Currently, in the ConnectionStateListener, we are just logging the events, and that works fine. We think the solution is, perhaps, around this listener.

Thanks,

Jordan Zimmerman

Feb 13, 2013, 2:07:39 PM
to curato...@googlegroups.com
From the wiki:


Error Handling
It is strongly recommended that you add a ConnectionStateListener and watch for SUSPENDED and LOST state changes. If a SUSPENDED state is reported you cannot be certain that you still hold the lock unless you subsequently receive a RECONNECTED state. If a LOST state is reported it is certain that you no longer hold the lock.

Your connection state listener has to handle SUSPENDED and LOST such that it signals your lock code to exit (or do something meaningful).
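
As a rough illustration of that signalling, here is a minimal sketch that uses only the JDK. The State enum is a local stand-in for Curator's ConnectionState, and LockGuardSketch, Ownership, onStateChange, and safeToWork are hypothetical names invented for this example, not part of Curator. In a real client, the stateChanged(CuratorFramework, ConnectionState) callback of your ConnectionStateListener would forward each event to something like onStateChange, and the code running under the lock would check safeToWork() between steps.

```java
public class LockGuardSketch {
    // Local stand-in for the Curator connection states named in the wiki text (assumption).
    public enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    // What the lock holder may assume about the lock, following the wiki's wording.
    public enum Ownership { HELD, UNCERTAIN, GONE }

    private Ownership ownership = Ownership.HELD;

    // Hypothetical hook: call this from the connection state listener callback.
    public synchronized void onStateChange(State newState) {
        if (ownership == Ownership.GONE) {
            return; // after LOST the lock is gone for good; it must be re-acquired
        }
        switch (newState) {
            case SUSPENDED:
                ownership = Ownership.UNCERTAIN; // cannot be certain the lock is still held
                break;
            case LOST:
                ownership = Ownership.GONE; // certain the lock is no longer held
                break;
            case RECONNECTED:
                ownership = Ownership.HELD; // reconnected after SUSPENDED
                break;
            default:
                break; // CONNECTED changes nothing here
        }
    }

    // The code doing work under the lock polls this and stops when it returns false.
    public synchronized boolean safeToWork() {
        return ownership == Ownership.HELD;
    }

    public synchronized Ownership currentOwnership() {
        return ownership;
    }
}
```

What the lock code does when safeToWork() turns false (pause, abort, roll back) depends on the application; the sketch only shows how the listener can pass the signal along.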

-JZ


dberjman...@gmail.com

Feb 13, 2013, 2:27:27 PM
to curato...@googlegroups.com

Thanks for your reply!

You wrote: "your lock code to exit". Curator has the lock, not our code. How do we tell Curator to release the lock? If we call lock.acquire(1 * 1000, TimeUnit.MILLISECONDS), why does it stay locked for much longer than 1 second?

Now we are trying to change the sessionTimeout and connectionTimeout:

CuratorFrameworkFactory.newClient(this.connectString, 1 * 1000, 1 * 1000, new RetryOneTime(1 * 1000));

With this, after the lock.acquire timeout, an exception is thrown and the lock is released ---> this is what we want!! But now we are dealing with the reconnect. After the server starts again, Curator cannot connect: "Connection timed out for connection string (hostxxx:2181) and timeout (1000) / elapsed (3608)". If we use a greater connection timeout, releasing the lock (our objective) takes more time. It's strange that it takes ~3608 ms to connect.

Could you guide us?

Thanks and sorry for my english!!!

           Demian

Jordan Zimmerman

Feb 13, 2013, 6:44:05 PM
to curato...@googlegroups.com
InterProcessMutex is like the JDK lock. Once you have acquired it, you own the lock until you call release. The values passed to acquire() are how long you are willing to block to acquire the lock, not the amount of time you want to hold the lock.
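
To make the comparison concrete, here is a small sketch using the JDK's ReentrantLock rather than Curator itself (AcquireTimeoutSketch, demo, and otherThreadCanAcquire are hypothetical names for this example). tryLock(timeout, unit) plays the role of acquire(timeout, unit): the timeout only bounds the wait. The lock is held well past that timeout, and a second thread can take it only after an explicit unlock().

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class AcquireTimeoutSketch {
    // Tries to take the lock from a second thread, waiting at most waitMillis.
    static boolean otherThreadCanAcquire(ReentrantLock lock, long waitMillis)
            throws InterruptedException {
        final boolean[] result = new boolean[1];
        Thread contender = new Thread(() -> {
            try {
                boolean got = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
                if (got) {
                    lock.unlock();
                }
                result[0] = got;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        contender.start();
        contender.join(); // join makes the contender's write to result visible here
        return result[0];
    }

    // Returns {acquired, stillHeldAfterTimeout, contenderWhileHeld, contenderAfterRelease}.
    public static boolean[] demo() {
        ReentrantLock lock = new ReentrantLock();
        try {
            // The 50 ms value limits how long we wait for the lock, nothing more.
            boolean acquired = lock.tryLock(50, TimeUnit.MILLISECONDS);
            Thread.sleep(150); // keep holding the lock well past the 50 ms timeout
            boolean stillHeld = lock.isHeldByCurrentThread();
            boolean contenderWhileHeld = otherThreadCanAcquire(lock, 50);
            lock.unlock(); // ownership ends only here
            boolean contenderAfterRelease = otherThreadCanAcquire(lock, 50);
            return new boolean[] {acquired, stillHeld, contenderWhileHeld, contenderAfterRelease};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The same shape applies to the mutex: check the boolean returned by acquire(), do the work, and call release() in a finally block.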

-Jordan
