[JIRA] (JENKINS-13828) locks-and-latches not release job and infinitely waits for unlock.

gentoo.integer@gmail.com (JIRA)

May 18, 2012, 2:45:24 PM
to jenkinsc...@googlegroups.com
Kanstantsin Shautsou created JENKINS-13828:
----------------------------------------------

Summary: locks-and-latches not release job and infinitely waits for unlock.
Key: JENKINS-13828
URL: https://issues.jenkins-ci.org/browse/JENKINS-13828
Project: Jenkins
Issue Type: Bug
Components: locks-and-latches
Affects Versions: current
Environment: RHEL 6.2, jenkins-1.463
Reporter: Kanstantsin Shautsou
Assignee: stephenconnolly
Priority: Critical


locks-and-latches periodically fails to release jobs. I don't know how to debug this or how to reproduce it manually, but I hit this bug regularly. When I log in to Jenkins, I see one job that cycles with "waiting for 1 minute" for hours...
I have 3 build executors, and some jobs wait in the queue. One lock is used in 3 jobs...
Could you give me any advice? How can I debug this? I can install a release of the plugin with debug info and collect data the next time it reproduces.
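
On the debugging question, here is a minimal read-only sketch of what could be collected the next time it reproduces. It assumes the plugin keeps its locks as java.util.concurrent.locks.ReentrantLock objects; that is an assumption, and the class and method names (LockStateReport, describe, lockHeldByFinishedThread) are made up for illustration and are not part of the plugin. It only reads the lock's public state and never modifies it.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockStateReport {

    /** Summarizes the public, read-only state of a lock without changing it. */
    static String describe(String name, ReentrantLock lock) {
        return name
            + ": locked=" + lock.isLocked()
            + ", queuedThreads=" + lock.getQueueLength()
            + ", detail=" + lock; // toString() names the owning thread when locked
    }

    /** Helper for the demo: a lock taken by a thread that ends without unlocking. */
    static ReentrantLock lockHeldByFinishedThread(String threadName) {
        ReentrantLock lock = new ReentrantLock();
        Thread holder = new Thread(lock::lock, threadName);
        holder.start();
        try {
            holder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return lock;
    }

    public static void main(String[] args) {
        // Hypothetical lock name and thread name, for illustration only.
        ReentrantLock stuck = lockHeldByFinishedThread("aborted-build");
        System.out.println(describe("xxx-lock-name", stuck));
    }
}
```

If the reported owner is a thread that no longer exists, that would point to a lock that was acquired and never released, rather than to a build that is still running.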

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jenkins-ci.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


bmiegemolle@sierrawireless.com (JIRA)

Apr 11, 2013, 9:00:32 AM
to jenkinsc...@googlegroups.com

This also sometimes happens to me when I cancel a build of a job that has a lock (but not every time).

For example, the following build cancellation went well. As mentioned in the logs, the lock was released, and subsequent builds could be executed.

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3:22.655s
[INFO] Finished at: Wed Apr 10 18:56:34 CEST 2013
[INFO] channel stopped
[locks-and-latches] Releasing all the locks
[locks-and-latches] All the locks released
Build was aborted
Aborted by bmiegemolle
Final Memory: 76M/551M
[INFO] ------------------------------------------------------------------------
Finished: ABORTED

But sometimes when I cancel a build, nothing about the lock release is displayed. See the following build log, for example:

Uglifying file: /srv/jenkins/platform/jobs/trunk/workspace/target/classes/web/resources-built/vendor/jquery.fileupload/jquery.fileupload.js
Uglifying file: /srv/jenkins/platform/jobs/trunk/workspace/target/classes/web/resources-built/vendor/jquery.fileupload/jquery.iframe-transport.js
Uglifying file: /srv/jenkins/platform/jobs/trunk/workspace/target/classes/web/resources-built/vendor/handlebars/handlebars-1.0.rc.1.js
Uglifying file: /srv/jenkins/platform/jobs/trunk/workspace/target/classes/web/resources-built/vendor/handlebars/handlebars-1.0.0.beta.6.js
Uglifying file: /srv/jenkins/platform/jobs/trunk/workspace/target/classes/web/resources-built/vendor/select2/select2.js
Uglifying file: /srv/jenkins/platform/jobs/trunk/workspace/target/classes/web/resources-built/vendor/jquery.storage/jquery.Storage.js
Uglifying file: /srv/jenkins/platform/jobs/trunk/workspace/target/classes/web/resources-built/vendor/humane/humane.js
Uglifying file: /srv/jenkins/platform/jobs/trunk/workspace/target/classes/web/resources-built/vendor/highcharts/highcharts.js
Build was aborted
Aborted by bmiegemolle
Finished: ABORTED

The following builds remained blocked just after the svn update:

[locks-and-latches] Checking to see if we really have the locks
[locks-and-latches] Could not get all the locks... sleeping for 1 minute
[locks-and-latches] Could not get all the locks... sleeping for 1 minute
[locks-and-latches] Could not get all the locks... sleeping for 1 minute
[locks-and-latches] Could not get all the locks... sleeping for 1 minute

Restarting Jenkins makes things work again. I don't have any clue about the conditions that lead to this issue.
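
For illustration only, here is one generic way a cancelled build could leave a lock held. This is not taken from the plugin's source and is not a confirmed cause of this issue; the class and method names (AbortedBuildLockDemo, runBuildSteps, lockStillHeldAfterAbort) are hypothetical. The sketch compares releasing a ReentrantLock after the build steps with releasing it in a finally block, when the build thread is interrupted part-way through.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class AbortedBuildLockDemo {

    /** Stand-in for build steps; throws if the thread is interrupted ("aborted"). */
    static void runBuildSteps() {
        try {
            Thread.sleep(10_000);
        } catch (InterruptedException e) {
            throw new IllegalStateException("build aborted", e);
        }
    }

    /** Aborts a simulated build and reports whether its lock is still held afterwards. */
    static boolean lockStillHeldAfterAbort(boolean releaseInFinally) {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch lockTaken = new CountDownLatch(1);

        Thread build = new Thread(() -> {
            lock.lock();
            lockTaken.countDown();
            if (releaseInFinally) {
                try {
                    runBuildSteps();
                } finally {
                    lock.unlock();   // runs even when the build is aborted
                }
            } else {
                runBuildSteps();     // throws on abort...
                lock.unlock();       // ...so this line is never reached
            }
        });
        // The abort exception is expected here; keep it out of stderr.
        build.setUncaughtExceptionHandler((t, e) -> { });
        build.start();

        try {
            lockTaken.await();       // wait until the build holds the lock
            build.interrupt();       // simulate the user cancelling the build
            build.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return lock.isLocked();
    }

    public static void main(String[] args) {
        System.out.println("release after steps: still held = " + lockStillHeldAfterAbort(false));
        System.out.println("release in finally:  still held = " + lockStillHeldAfterAbort(true));
    }
}
```

In the first variant the lock stays held by a thread that has already finished, so any later build waiting on the same lock would keep sleeping until the JVM is restarted, which matches the symptom above.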

bmiegemolle@sierrawireless.com (JIRA)

Apr 11, 2013, 10:47:32 AM
to jenkinsc...@googlegroups.com
 
Bernard Miegemolle edited a comment on Bug JENKINS-13828

This also sometimes happens to me when I cancel a build of a job that has a lock (but not every time).

Following builds remained blocked just after the svn update:

[locks-and-latches] Checking to see if we really have the locks
[locks-and-latches] Could not get all the locks... sleeping for 1 minute
[locks-and-latches] Could not get all the locks... sleeping for 1 minute
[locks-and-latches] Could not get all the locks... sleeping for 1 minute
[locks-and-latches] Could not get all the locks... sleeping for 1 minute

Restarting Jenkins makes things work again. I don't have any clue about the conditions that lead to this issue.

arnaud.nauwynck-ext@lyxor.com (JIRA)

Jan 7, 2015, 3:07:52 AM
to jenkinsc...@googlegroups.com

Hi,

I also encountered this bug ... and I have implemented a workaround that avoids restarting my Jenkins master.
The workaround is a Groovy script to execute as admin on the web page "https://<<jenkins>>/script"

ForceUnlockLatch.groovy
import hudson.plugins.locksandlatches.LockWrapper;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.AbstractOwnableSynchronizer;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Replace with the name of the lock that is stuck
String lockName = "xxx-lock-name";

String text = "";
LockWrapper.DescriptorImpl descr = LockWrapper.DESCRIPTOR;
// Read the private field "backupLocks" via reflection
Field backupLocksField = LockWrapper.DescriptorImpl.class.getDeclaredField("backupLocks");
backupLocksField.setAccessible(true);
Map<String,Object> backupLocks = (Map<String,Object>) backupLocksField.get(descr);

ReentrantLock lockObj = (ReentrantLock) backupLocks.get(lockName);

if (lockObj == null) {
  text += "NULL : lock not found";
} else if (lockObj.isLocked()) {
  text += "*** before unlock: " + lockObj;

  if (!lockObj.isHeldByCurrentThread()) {
    // A ReentrantLock cannot be released from a thread that does not own it
    // (unlock() would throw java.lang.IllegalMonitorStateException), so first
    // make the current thread the owner, i.e. invoke
    // "lockObj.sync.setExclusiveOwnerThread(currentThread)" via reflection.
    Field syncField = ReentrantLock.class.getDeclaredField("sync");
    syncField.setAccessible(true);
    AbstractOwnableSynchronizer lockSync = (AbstractOwnableSynchronizer) syncField.get(lockObj);

    Thread currentThread = Thread.currentThread();
    Method setExclusiveOwnerThreadMethod = AbstractOwnableSynchronizer.class.getDeclaredMethod("setExclusiveOwnerThread", Thread.class);
    setExclusiveOwnerThreadMethod.setAccessible(true);
    setExclusiveOwnerThreadMethod.invoke(lockSync, currentThread);
  }

  // *** do unlock ***
  lockObj.unlock();

  text += "\n *** after unlock: " + lockObj;
} else {
  text += "NOT locked : " + lockObj;
}

// The value of the last expression is shown as the script console result
text;
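
To show why the script has to change the lock's owner before unlocking, here is a small standalone Java sketch, separate from the workaround itself. It relies only on documented ReentrantLock behavior; the class and method names (NonOwnerUnlockDemo, lockLeftBehindByFinishedThread, plainUnlockIsRejected) are made up for illustration. A lock left behind by a finished thread stays locked, and a plain unlock() from any other thread is rejected with IllegalMonitorStateException.

```java
import java.util.concurrent.locks.ReentrantLock;

class NonOwnerUnlockDemo {

    /** Returns a lock that was acquired by a thread which has since finished. */
    static ReentrantLock lockLeftBehindByFinishedThread() {
        ReentrantLock lock = new ReentrantLock();
        Thread worker = new Thread(lock::lock); // locks and exits without unlocking
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return lock;
    }

    /** Tries a plain unlock() from the calling thread; true if it was rejected. */
    static boolean plainUnlockIsRejected(ReentrantLock lock) {
        try {
            lock.unlock();
            return false;
        } catch (IllegalMonitorStateException notTheOwner) {
            return true;
        }
    }

    public static void main(String[] args) {
        ReentrantLock lock = lockLeftBehindByFinishedThread();
        System.out.println("locked after owner finished: " + lock.isLocked());
        System.out.println("plain unlock rejected:       " + plainUnlockIsRejected(lock));
        System.out.println("still locked afterwards:     " + lock.isLocked());
    }
}
```

The reflection step in the script works around exactly this ownership check, so it should be treated as a last resort on a lock whose owning build is known to be gone.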

arnaud.nauwynck-ext@lyxor.com (JIRA)

Jan 7, 2015, 3:07:52 AM
to jenkinsc...@googlegroups.com
 
Arnaud Nauwynck edited a comment on Bug JENKINS-13828

rogerdpack@java.net (JIRA)

Mar 20, 2015, 8:21:32 PM
to jenkinsc...@googlegroups.com
rogerdpack commented on Bug JENKINS-13828

Just ran into something similar--no other jobs going, but when I start one, it says

[locks-and-latches] Could not get all the locks... sleeping for 1 minute

infinitely.

Possibly had something to do with interrupting a job?
