[JIRA] (JENKINS-53030) Intermittent hudson.util.IOException2 + org.xml.sax.SAXParseException

5 views
Skip to first unread message

v.gimple@stemmer-imaging.de (JIRA)

unread,
Aug 14, 2018, 4:19:02 AM8/14/18
to jenkinsc...@googlegroups.com
Volker Gimple created an issue
 
Jenkins / Bug JENKINS-53030
Intermittent hudson.util.IOException2 + org.xml.sax.SAXParseException
Issue Type: Bug Bug
Assignee: Denis Saponenko
Attachments: Log114.txt, support_2018-08-14_07.44.30.zip
Components: pipeline-multibranch-defaults-plugin
Created: 2018-08-14 08:18
Environment: Jenkins 2.121.1 installed on Ubuntu Server 12.04. Plugin Pipeline: Multibranch at version 2.19.
Nodes running Windows 7 x64 and ubu1204 (x86_64, i686, armv7l), ubu1604 (x86_64, i686, armv7l, aarch64) and ubu1804 (x86_64).
See also support core zip in the attachments.
Labels: ioexception2 SAXParseException parallel
Priority: Major Major
Reporter: Volker Gimple

I am working on a build + test + deployment script for multiple platforms. So far the script is still mostly a dummy with the actual work still left to fill in. Nevertheless, when running the script I get intermittent (about one in three or four runs) failures caused by always the same two exceptions:

 

org.xml.sax.SAXParseException; systemId: file:/var/lib/jenkins/jobs/Temp.ScriptTest/branches/branches-nodenesting.skpm19/builds/109/changelog2.xml; lineNumber: 1; columnNumber: 1; Premature end of file.
 at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1239)
 at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
 at org.apache.commons.digester.Digester.parse(Digester.java:1871)
 at hudson.scm.SubversionChangeLogParser.parse(SubversionChangeLogParser.java:76)
Caused: hudson.util.IOException2: Failed to parse /var/lib/jenkins/jobs/Temp.ScriptTest/branches/branches-nodenesting.skpm19/builds/109/changelog2.xml
 at hudson.scm.SubversionChangeLogParser.parse(SubversionChangeLogParser.java:80)
 at hudson.scm.SubversionChangeLogParser.parse(SubversionChangeLogParser.java:43)
 at org.jenkinsci.plugins.workflow.job.WorkflowRun.onCheckout(WorkflowRun.java:1024)
 at org.jenkinsci.plugins.workflow.job.WorkflowRun.access$1400(WorkflowRun.java:143)
 at org.jenkinsci.plugins.workflow.job.WorkflowRun$SCMListenerImpl.onCheckout(WorkflowRun.java:1232)
 at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:127)
 at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:85)
 at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:75)
 at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)
 at hudson.security.ACL.impersonate(ACL.java:290)
 at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)

 

The file involved is always changelog2.xml.

The script I am running is this:

 

// --------------------------------------------------------------------------
// Project definitions used in the declarative pipeline below
// --------------------------------------------------------------------------
def maintainers = "${env.MAIL_VGIMPLE}"
    // these people will be notified when the build fails; comma separated
def platforms = util.StringToArray(env.CONAN_PLATFORMS)
    // alternative: ['win32', 'x64', ...]
def upstreamjobs = "BuildSystem.CMake.all"
    // these jobs will trigger this job; comma separated
def parallelbuild = true
    // set to true to execute builds parallel;
def crossplatformbuild = true
    // use cross platform builds where possible (i.e. Linux ARM builds will
    // be carried out on Linux PC nodes)

// --------------------------------------------------------------------------
// Diagnostic output and preparations
// --------------------------------------------------------------------------
echo "**** Building for the following platforms:"
print platforms

builds = [:]
builds.failFast = true

stashNameArtifacts = "artifacts." + env.BUILD_NUMBER + "."
stashNameTests = "tests." + env.BUILD_NUMBER + "."

// --------------------------------------------------------------------------
// The build and test step
// --------------------------------------------------------------------------
def BuildAndTestStep(platform) {
    def isCrossPlatformBuild = util.IsCrossPlatformBuild(platform)
    try {        testplatform = platform.startsWith('build.') ? platform - 'build.' : platform
        node (platform) {
            checkout scm
            echo "--------> build step for platform " + platform
            echo "--------> stash " + stashNameArtifacts + testplatform
            if (isCrossPlatformBuild) {
                echo "--------> stash " + stashNameTests + testplatform
            } else {
                echo "--------> execute tests directly on the node where the build ran"
            }
        }
        if (isCrossPlatformBuild) {
            node (testplatform) {
                echo "--------> unstash " + stashNameTests + testplatform
                echo "--------> execute cross platform test"
            }
        }
    } catch(Exception e) {
        echo "**** An error occurred on build platform " + platform
        print e
        throw e
    }
}

// --------------------------------------------------------------------------
// Pipeline Declaration
// --------------------------------------------------------------------------pipeline {
    agent none
    triggers {
        pollSCM('@hourly')
        upstream(upstreamProjects: upstreamjobs, threshold: hudson.model.Result.SUCCESS)
    }
    stages {
        // ------------------------------------------------------------------
        // Build and Test step
        // ------------------------------------------------------------------
        stage ('build & test') {
            steps {
                script {
                    // now execute the builds as determined - either parallel
                    // or sequentially
                    prefix = crossplatformbuild ? "build." : ""
                    if (parallelbuild) {
                        for (int i = 0; i < platforms.size(); ++i) {
                            def plt = prefix + platforms[i]
                            builds[plt] = { -> BuildAndTestStep(plt) }
                        }
                        parallel builds
                    } else {
                        for (int i = 0; i < platforms.size(); ++i) {
                            BuildAndTestStep(prefix + platforms[i])
                        }
                    }
                }
            }
        }
        // ------------------------------------------------------------------
        // Deployment step
        // ------------------------------------------------------------------
        stage ('deploy') {
            steps {
                script {
                    for (int i = 0; i < platforms.size(); ++i) {
                        node { // actual node does not matter!
                            echo "--------> unstash " + stashNameArtifacts + platforms[i]
                            echo "--------> install unstashed artifacts to either cvdev2 or conan"
                        }
                    }
                }
            }
        }
    }
    post {
        failure {
            echo "*** Build Regression ***"
            script {
                mail (to: "${env.MAIL_VGIMPLE}",
                subject: "Build failed in Jenkins: ${JOB_NAME} #${BUILD_NUMBER}",
                body: 'See ' + currentBuild.absoluteUrl)
            }
        }
        fixed {
            echo "*** Build Fixed ***"
            script {
                mail (to: "${env.MAIL_VGIMPLE}",
                subject: "Jenkins build is back to normal: ${JOB_NAME} #${BUILD_NUMBER}",
                body: 'See ' + currentBuild.absoluteUrl)
            }
        }
    }
}

A typical output log is attached (file Log114.txt).

Looking at the failed builds it looks like in most cases the (generally slower) ARM-based test nodes are the ones where the exception is raised. The exceptions never occur when executing the script sequentially (i.e. when setting 'parallelbuild' to false and the 'parallel' block is never hit).

My next move will be to try and filter out these particular exceptions and see if I have a chance at completing the build regardless but an approach where this is not necessary would of course be my preference.

 

Add Comment Add Comment
 
This message was sent by Atlassian JIRA (v7.10.1#710002-sha1:6efc396)

v.gimple@stemmer-imaging.de (JIRA)

unread,
Aug 14, 2018, 5:26:01 AM8/14/18
to jenkinsc...@googlegroups.com
Volker Gimple commented on Bug JENKINS-53030
 
Re: Intermittent hudson.util.IOException2 + org.xml.sax.SAXParseException

After doing some more testing with more finely grained and selective try/catch blocks I can add the following details:

  • If it does occur, then the IOException2 always occurs during the scm checkout call in BuildAndTestStep()
  • I cannot confirm my earlier impression that the exception is more likely to happen for one of the ARM platforms (in fact the scm step is usually not executed on one of those as these are typically used for testing only, not for building)
  • I have not seen the exception fly when there is no workspace left from the previous run
  • When the exception is thrown, the sources on the affected machines have not been updated to the latest scm version

For the moment I am working around this by sleeping something between 500 and 1500 milliseconds before retrying the scm checkout step.

tom.ghyselinck@excentis.com (JIRA)

unread,
Oct 3, 2018, 9:27:04 AM10/3/18
to jenkinsc...@googlegroups.com

Hi all,

I had a similar issue when running some (declarative) pipeline steps in parallel:

org.xml.sax.SAXParseException; systemId: file:/var/lib/jenkins/jobs/XRA31_components-software/branches/branches-1-x-sc-qam.9mj31f/builds/468/changelog2.xml; lineNumber: 1; columnNumber: 537; Content is not allowed in trailing section.
	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1239)
	at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
	at org.apache.commons.digester.Digester.parse(Digester.java:1871)
	at hudson.scm.SubversionChangeLogParser.parse(SubversionChangeLogParser.java:76)
Caused: hudson.util.IOException2: Failed to parse /var/lib/jenkins/jobs/XRA31_components-software/branches/branches-1-x-sc-qam.9mj31f/builds/468/changelog2.xml
	at hudson.scm.SubversionChangeLogParser.parse(SubversionChangeLogParser.java:80)
	at hudson.scm.SubversionChangeLogParser.parse(SubversionChangeLogParser.java:43)
	at org.jenkinsci.plugins.workflow.job.WorkflowRun.onCheckout(WorkflowRun.java:1045)
	at org.jenkinsci.plugins.workflow.job.WorkflowRun.access$1400(WorkflowRun.java:143)
	at org.jenkinsci.plugins.workflow.job.WorkflowRun$SCMListenerImpl.onCheckout(WorkflowRun.java:1253)
	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:136)
	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:90)
	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:77)
	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution$1$1.call(SynchronousNonBlockingStepExecution.java:50)
	at hudson.security.ACL.impersonate(ACL.java:290)
	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution$1.run(SynchronousNonBlockingStepExecution.java:47)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

The created changelog2.xml was:

<?xml version="1.0" encoding="UTF-8"?><log><logentry revision="3931"><author>tom.g</author><date>2018-10-03T10:29:53.640919Z</date><paths><path action="M" localPath="src/xra31rpc/xra31rpc.thrift" kind="file">/software/branches/1.x-sc_qam/rpc/src/xra31rpc/xra31rpc.thrift</path><path action="M" localPath="src/samplecode/test_cpp_client.cpp" kind="file">/software/branches/1.x-sc_qam/rpc/src/samplecode/test_cpp_client.cpp</path></paths><msg>The `inputLevel` is not yet available for the `PhysicalConnectorStatus`.</msg></logentry></log>ware/branches/1.x-sc_qam/libxra31ctrl/src/xra31ctrl/PcapFileSaver.cpp</path><path action="M" localPath="src/xra31ctrl/PcapFileSaver.h" kind="file">/software/branches/1.x-sc_qam/libxra31ctrl/src/xra31ctrl/PcapFileSaver.h</path></paths><msg>- Remove packet filtering (i.e. `PacketFilter`) support in `PcapFileSaver`.
- Use correct thread protection by using `std::atomic` where applicable.
- Disable storing unused (member) variables.
- Code cleanup.</msg></logentry><logentry revision="3930"><author>tom.g</author><date>2018-10-03T09:41:08.941382Z</date><paths><path action="M" localPath="src/xra31ctrl/DsSynchronizer.cpp" kind="file">/software/branches/1.x-sc_qam/libxra31ctrl/src/xra31ctrl/DsSynchronizer.cpp</path></paths><msg>- Use C++ interface for `stdio`.</msg></logentry></log>

The localPath in this changelog file are part of different checkouts (from the same commit in the same repository though), but still I suspect it has something to do with the parallel execution...

I hope this helps!

With best regards,
Tom.

This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

v.gimple@stemmer-imaging.de (JIRA)

unread,
Oct 4, 2018, 2:27:03 AM10/4/18
to jenkinsc...@googlegroups.com
Volker Gimple updated an issue
Change By: Volker Gimple
I am working on a build + test + deployment script for multiple platforms. So far the script is still mostly a dummy with the actual work still left to fill in. Nevertheless, when running the script I get intermittent (about one in three or four runs) failures caused by always the same two exceptions:

 
{code:java}

org.xml.sax.SAXParseException; systemId: file:/var/lib/jenkins/jobs/Temp.ScriptTest/branches/branches-nodenesting.skpm19/builds/109/changelog2.xml; lineNumber: 1; columnNumber: 1; Premature end of file.
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1239)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at org.apache.commons.digester.Digester.parse(Digester.java:1871)
at hudson.scm.SubversionChangeLogParser.parse(SubversionChangeLogParser.java:76)
Caused: hudson.util.IOException2: Failed to parse /var/lib/jenkins/jobs/Temp.ScriptTest/branches/branches-nodenesting.skpm19/builds/109/changelog2.xml

at hudson.scm.SubversionChangeLogParser.parse(SubversionChangeLogParser.java:80)
at hudson.scm.SubversionChangeLogParser.parse(SubversionChangeLogParser.java:43)
at org.jenkinsci.plugins.workflow.job.WorkflowRun.onCheckout(WorkflowRun.java:1024)
at org.jenkinsci.plugins.workflow.job.WorkflowRun.access$1400(WorkflowRun.java:143)

at org.jenkinsci.plugins.workflow.job.WorkflowRun$SCMListenerImpl.onCheckout(WorkflowRun.java:1232)
at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:127)
at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:85)
at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:75)
at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)
at hudson.security.ACL.impersonate(ACL.java:290)
at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745){code}

 

The file involved is always changelog2.xml.

The script I am running is this:

 
{noformat}
}{noformat}

A typical output log is attached (file Log114.txt).

Looking at the failed builds it looks like in most cases the (generally slower) ARM-based test nodes are the ones where the exception is raised. The exceptions never occur when executing the script sequentially (i.e. when setting 'parallelbuild' to false and the 'parallel' block is never hit).

My next move will be to try and filter out these particular exceptions and see if I have a chance at completing the build regardless but an approach where this is not necessary would of course be my preference.

 

pallen@perforce.com (JIRA)

unread,
Dec 3, 2018, 9:44:02 AM12/3/18
to jenkinsc...@googlegroups.com
Paul Allen started work on Bug JENKINS-53030
 
Change By: Paul Allen
Status: Open In Progress

pallen@perforce.com (JIRA)

unread,
Dec 3, 2018, 9:45:05 AM12/3/18
to jenkinsc...@googlegroups.com
Paul Allen updated an issue
Change By: Paul Allen
Labels: P4_A SAXParseException ioexception2 parallel

pallen@perforce.com (JIRA)

unread,
Dec 3, 2018, 9:45:07 AM12/3/18
to jenkinsc...@googlegroups.com

pallen@perforce.com (JIRA)

unread,
Dec 3, 2018, 12:03:02 PM12/3/18
to jenkinsc...@googlegroups.com

pallen@perforce.com (JIRA)

unread,
Dec 3, 2018, 12:03:03 PM12/3/18
to jenkinsc...@googlegroups.com
Paul Allen updated an issue
Change By: Paul Allen
Labels: P4_A SAXParseException ioexception2 parallel

pallen@perforce.com (JIRA)

unread,
Dec 3, 2018, 12:03:03 PM12/3/18
to jenkinsc...@googlegroups.com
Paul Allen stopped work on Bug JENKINS-53030
 
Change By: Paul Allen
Status: In Progress Open

pallen@perforce.com (JIRA)

unread,
Dec 3, 2018, 12:06:02 PM12/3/18
to jenkinsc...@googlegroups.com

Sorry updating the wrong job. I have had to work around this issue in the `p4-plugin` by catching the exception, logging the error and returning an empty ChangeLogSet.

pallen@perforce.com (JIRA)

unread,
Dec 3, 2018, 12:08:02 PM12/3/18
to jenkinsc...@googlegroups.com
Paul Allen edited a comment on Bug JENKINS-53030
Sorry updating the wrong job. I


Internally we
have had to the same problem with the changelog files being read before the writer has finished. For the moment we work around this issue in the `p4-plugin` by catching the exception, logging the error and returning an empty ChangeLogSet.

DuAell@gmail.com (JIRA)

unread,
Mar 6, 2019, 5:04:04 AM3/6/19
to jenkinsc...@googlegroups.com

I had this error before when doing parallel checkouts.

I removed parallel and I now do sequential checkouts. It's working fine since several months.

DuAell@gmail.com (JIRA)

unread,
Jun 6, 2019, 8:12:02 AM6/6/19
to jenkinsc...@googlegroups.com
Reply all
Reply to author
Forward
0 new messages