Publish over CIFS suddenly stops working


Steve Robbins

Jun 23, 2015, 4:13:47 PM
to jenkins...@googlegroups.com
Hi,

I've had Jenkins running fine for some time with jobs that publish the build artifacts to a CIFS share. Then suddenly it stopped working; the build log says Java ran out of native threads. The master, slave, and CIFS server are all Windows machines.

The stack trace (below) is slightly different, but it seems very similar to https://issues.jenkins-ci.org/browse/JENKINS-19075. Unfortunately, the suggestion to use "-Djcifs.resolveOrder=DNS -Djcifs.smb.client.dfs.disabled=true" did not work for me. [Just to be certain: I used those arguments on the slave node that was failing. I presume that is where the "publish" is running, right? Or do I need to modify the master node's options? If so, how do you accomplish that when the service runs jenkins.exe?]

I find this sudden failure exceedingly strange since the source code being built has not changed for weeks. It is also odd that some builds on the same slave node do succeed in transferring their build output. It may be a size effect: the zip file being transferred is 28MB, but only 20MB gets transferred before the error occurs. The builds that succeed are transferring less than 3MB.

There was about a week between the last good build and the first failing one. During this time there was no change to the source code being built nor to Jenkins. So could a change to IT infrastructure, such as name resolution, be the trigger? Does anyone have ideas about where to look?

Thanks,
-Steve

Tail of the failing job's console output:

14:04:25 java.lang.OutOfMemoryError: unable to create new native thread
14:04:25 	at java.lang.Thread.start0(Native Method)
14:04:25 	at java.lang.Thread.start(Unknown Source)
14:04:25 	at jcifs.UniAddress.lookupServerOrWorkgroup(UniAddress.java:173)
14:04:25 	at jcifs.UniAddress.getAllByName(UniAddress.java:290)
14:04:25 	at jcifs.UniAddress.getByName(UniAddress.java:245)
14:04:25 	at jcifs.smb.Dfs.getTrustedDomains(Dfs.java:62)
14:04:25 	at jcifs.smb.Dfs.resolve(Dfs.java:167)
14:04:25 	at jcifs.smb.SmbFile.resolveDfs(SmbFile.java:671)
14:04:25 	at jcifs.smb.SmbFile.send(SmbFile.java:773)
14:04:25 	at jcifs.smb.SmbFileOutputStream.writeDirect(SmbFileOutputStream.java:245)
14:04:25 	at jcifs.smb.SmbFileOutputStream.write(SmbFileOutputStream.java:216)
14:04:25 	at com.slide.hudson.plugins.CIFSShare.upload(CIFSShare.java:397)



Steve Robbins

Jun 24, 2015, 5:30:17 PM
to jenkins...@googlegroups.com
Two follow-up notes:

1. Changed the CIFS share configuration to use IP address instead of host name.  Fails in the same way:

16:17:07 ERROR: Failed to upload files
16:17:07 java.lang.OutOfMemoryError: unable to create new native thread
16:17:07 	at java.lang.Thread.start0(Native Method)
16:17:07 	at java.lang.Thread.start(Unknown Source)
16:17:07 	at jcifs.UniAddress.lookupServerOrWorkgroup(UniAddress.java:173)
16:17:07 	at jcifs.UniAddress.getAllByName(UniAddress.java:290)
16:17:07 	at jcifs.UniAddress.getByName(UniAddress.java:245)
16:17:07 	at jcifs.smb.Dfs.getTrustedDomains(Dfs.java:62)
16:17:07 	at jcifs.smb.Dfs.resolve(Dfs.java:167)
16:17:07 	at jcifs.smb.SmbFile.resolveDfs(SmbFile.java:671)
16:17:07 	at jcifs.smb.SmbFile.send(SmbFile.java:773)
16:17:07 	at jcifs.smb.SmbFileOutputStream.writeDirect(SmbFileOutputStream.java:245)
16:17:07 	at jcifs.smb.SmbFileOutputStream.write(SmbFileOutputStream.java:216)
16:17:07 	at com.slide.hudson.plugins.CIFSShare.upload(CIFSShare.java:397)


2. Tried a manual copy using Windows Explorer on the slave node and it works fine.



Slide

Jun 24, 2015, 5:47:00 PM
to jenkins...@googlegroups.com

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/b79a5e36-6ee9-4cbe-83e8-5f0f3e2dc0d8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Steve Robbins

Jun 25, 2015, 11:16:46 AM
to jenkins...@googlegroups.com


On Tuesday, 23 June 2015 15:13:47 UTC-5, Steve Robbins wrote:

> The stack trace (below) is slightly different but it seems very similar to https://issues.jenkins-ci.org/browse/JENKINS-19075 Unfortunately, the suggestion to use "-Djcifs.resolveOrder=DNS -Djcifs.smb.client.dfs.disabled=true" did not work for me.  [Just to be certain: I used those arguments on the slave node that was failing.  I presume that is where the "publish" is running, right?  Or do I need to modify the master node's options?  If so, how do you accomplish that when the service runs jenkins.exe?]

So it turns out that you do indeed need to modify the arguments on the MASTER node, not the SLAVE. Once I did that (by editing jenkins.xml, which holds the Java arguments for the Windows service), the problem went away.
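
For anyone hitting the same confusion: a -D flag only becomes a system property in the JVM it was passed to, so flags given to the slave are invisible to the master. Below is a minimal sketch, not part of Jenkins or the plugin, that reports whether the two properties from JENKINS-19075 are set in the JVM running it. The class name JcifsFlagCheck and the "<not set>" placeholder are made up for illustration; only the two property names come from this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: reports the jcifs-related system properties of one JVM.
public class JcifsFlagCheck {
    // Property names taken from the suggestion quoted in this thread.
    static final String[] KEYS = {
        "jcifs.resolveOrder",
        "jcifs.smb.client.dfs.disabled"
    };

    // Returns each key with its value, or "<not set>" when the flag is absent.
    static Map<String, String> report(Properties props) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String key : KEYS) {
            out.put(key, props.getProperty(key, "<not set>"));
        }
        return out;
    }

    public static void main(String[] args) {
        // System.getProperties() reflects only the -D flags of this JVM.
        report(System.getProperties())
            .forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Running it once with and once without the two -D arguments shows the difference; the same per-JVM rule is presumably why the flags had no effect until they were added to the master's arguments.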


 

Daniel Beck

Jun 29, 2015, 2:11:43 PM
to jenkins...@googlegroups.com
Take a thread dump to see what all the threads are doing.
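
As a rough illustration of what a thread dump contains, here is a minimal sketch using the standard Thread.getAllStackTraces() call. It only sees the JVM it runs in, so for a live Jenkins process you would normally use an external tool such as the JDK's jstack instead; the class name ThreadSummary is invented for this example.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: prints a simple thread dump of the current JVM.
public class ThreadSummary {
    // Counts the threads in the snapshot by their current state.
    static Map<Thread.State, Integer> countByState(Map<Thread, StackTraceElement[]> traces) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : traces.keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        System.out.println("Live threads: " + traces.size());
        System.out.println("By state: " + countByState(traces));
        // Print each thread's name, state, and stack frames.
        for (Map.Entry<Thread, StackTraceElement[]> e : traces.entrySet()) {
            System.out.println("\"" + e.getKey().getName() + "\" " + e.getKey().getState());
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("\tat " + frame);
            }
        }
    }
}
```

A very large thread count, with many threads sitting in the same jcifs frames, would point at the lookup code as the source of the leak.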