There are a lot of build jobs in Jenkins using Mesos nodes (more than two hundred), and allocating and removing Mesos nodes becomes much slower than in the normal situation.

We compiled the Jenkins core with some extra debug logging written to jenkins.log, as follows (core/src/main/java/jenkins/model/Nodes.java):

    public void removeNode(final @Nonnull Node node) throws IOException {
        // debug: entering removeNode
        Logger.getLogger(Nodes.class.getName()).log(Level.INFO, "node.getNodeName before");
        if (node == nodes.get(node.getNodeName())) {
            // debug: about to request the Queue lock
            Logger.getLogger(Nodes.class.getName()).log(Level.INFO, "removeNode Queue.withLock(new Runnable() before");
            Queue.withLock(new Runnable() {
                @Override
                public void run() {
                    // debug: Queue lock acquired
                    Logger.getLogger(Nodes.class.getName()).log(Level.INFO, "removeNode public void run() enter");
                    Computer c = node.toComputer();
                    if (c != null) {
                        c.recordTermination();
                        c.disconnect(OfflineCause.create(hudson.model.Messages._Hudson_NodeBeingRemoved()));
                    }
                    if (node == nodes.remove(node.getNodeName())) {
                        jenkins.updateComputerList();
                        jenkins.trimLabels();
                    }
                }
            });
            // no need for a full save() so we just do the minimum
            Util.deleteRecursive(new File(getNodesDir(), node.getNodeName()));
            NodeListener.fireOnDeleted(node);
        }
    }

We found the problem by running a large number of Mesos nodes against this custom build of the Jenkins core. The Jenkins master deletes Mesos nodes in periodic tasks. Mesos pending deleting slave cleanup.log shows that the longest run was as follows:

    Started at Sun Oct 13 16:17:01 CST 2019
    Finished at Sun Oct 20:10:01 CST 2019

During this cleanup of Mesos nodes, most of the time is spent waiting for a lock defined in the Queue class. The log is as follows:

    Oct 13 2019 4:17:35 PM jenkins.model.Nodes removeNode
    INFO: removeNode Queue.withLock(new Runnable() before
    Oct 13 2019 6:50:59 PM jenkins.model.Nodes$6 run
    INFO: removeNode public void run() enter

That is a wait of about 2 hours 33 minutes between requesting the lock and entering run(). At the same time, Jenkins is also slow to allocate Mesos nodes because of the same lock.

Is there a solution for this problem, where allocating and removing Mesos nodes becomes slower than in the normal situation?
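
For reference, the time spent waiting for the Queue lock can be worked out from the two log timestamps quoted above. The following is only a minimal sketch: the class name `LockWaitFromLog` and the method `between` are hypothetical, and the timestamp pattern is an assumption based on the log lines as they are quoted in this report.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of Jenkins: computes the gap between two log timestamps.
final class LockWaitFromLog {
    // Assumed pattern for timestamps such as "Oct 13 2019 4:17:35 PM"
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("MMM d yyyy h:mm:ss a", Locale.ENGLISH);

    // Returns the elapsed time between the "before withLock" entry and the "run() enter" entry.
    static Duration between(String beforeLock, String enteredRun) {
        LocalDateTime start = LocalDateTime.parse(beforeLock, FMT);
        LocalDateTime end = LocalDateTime.parse(enteredRun, FMT);
        return Duration.between(start, end);
    }

    public static void main(String[] args) {
        Duration wait = between("Oct 13 2019 4:17:35 PM", "Oct 13 2019 6:50:59 PM");
        long hours = wait.toHours();
        long minutes = wait.toMinutes() % 60;
        long seconds = wait.getSeconds() % 60;
        // prints "2h 33m 24s"
        System.out.println(hours + "h " + minutes + "m " + seconds + "s");
    }
}
```

With the two timestamps from the log, a single removeNode call waited 9204 seconds for the lock before its run() method was entered.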