JUnit4 Test Runner
.*** Error in `/home/jenkins/.cache/bazel/_bazel_jenkins/3239551e333dc09ba2b5ef07ff4549b6/execroot/gerrit/bazel-out/local-fastbuild/bin/gerrit-gpg/gpg_tests.runfiles/local_jdk/bin/java': free(): invalid pointer: 0x00000007c0035528 ***
external/bazel_tools/tools/test/test-setup.sh: line 105: 13710 Aborted (core dumped) "${EXE}" "$@"
--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
I know, we can try to reintroduce the retry to eliminate the residual flakiness.
Luca
On 20 Feb 2017, at 18:10, 'Shawn Pearce' via Repo and Gerrit Discussion <repo-d...@googlegroups.com> wrote:
> I'm seeing transient build failures in my series with errors like:
>
> JUnit4 Test Runner
> .*** Error in `/home/jenkins/.cache/bazel/_bazel_jenkins/3239551e333dc09ba2b5ef07ff4549b6/execroot/gerrit/bazel-out/local-fastbuild/bin/gerrit-gpg/gpg_tests.runfiles/local_jdk/bin/java': free(): invalid pointer: 0x00000007c0035528 ***
> external/bazel_tools/tools/test/test-setup.sh: line 105: 13710 Aborted (core dumped) "${EXE}" "$@"
>
> *sigh*
Looking at the failures, I can see the following:
- We already run 3 builds for each change, using different NoteDb modes.
- When there is some flakiness, like the one mentioned by Shawn, it typically affects only some of the 3 builds.
- When all 3 builds fail, it is typically because of a genuine code (or test) error.
I'd suggest, then, checking the status of the 3 builds and, should some of them fail, going into a retry cycle (3 times?).
What do you think?
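
A minimal sketch of the classification suggested above, written in Python purely for illustration (the real CI flow is not shown in this thread). The names `Outcome` and `classify` are hypothetical, and treating "all builds failed" as a genuine failure follows the observation in the list, not a confirmed implementation detail.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"                    # every build succeeded
    FLAKY_SUSPECT = "flaky-suspect"      # only some builds failed: worth a retry
    GENUINE_FAILURE = "genuine-failure"  # every build failed: likely a real error


def classify(statuses):
    """Classify one verification run.

    `statuses` maps a build name (e.g. "bazel/reviewdb") to its result
    string, where anything other than "SUCCESS" counts as a failure.
    Returns the outcome plus the sorted list of failed build names.
    """
    failed = sorted(name for name, result in statuses.items() if result != "SUCCESS")
    if not failed:
        return Outcome.PASSED, failed
    if len(failed) == len(statuses):
        return Outcome.GENUINE_FAILURE, failed
    return Outcome.FLAKY_SUSPECT, failed
```

Only the `FLAKY_SUSPECT` case would enter the retry cycle; the other two can be voted on immediately.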
What that means is that some of the build combinations were flaky and are retried up to 3 times. If they are then successful, the overall change is Verified +1; otherwise, if the flaky builds still fail 3 times, the change is Verified -1.

    ** FLAKY Builds detected: [bazel/reviewdb]
    ignore(FAILURE) {
        parallel {
            retry (attempt 1) {
                Schedule job Gerrit-verifier-bazel
                Build Gerrit-verifier-bazel #5169 started
                Gerrit-verifier-bazel #5169 completed
                Builds status:
                  bazel/notedbPrimary : SUCCESS (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5160/console)
                  bazel/notedbReadWrite : SUCCESS (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5159/console)
                  bazel/reviewdb : SUCCESS (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5169/console)
            }
        }
        // SUCCESS ignored
    }
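
The retry-then-vote policy described above could be sketched roughly as follows. This is an illustrative Python sketch, not the actual CI flow script; `retry_flaky` and the `run_build(mode, attempt)` callback are hypothetical names, and the default of 3 attempts is taken from the description.

```python
def retry_flaky(run_build, flaky_modes, max_retries=3):
    """Retry the builds flagged as flaky and return the Verified vote.

    `run_build(mode, attempt)` is a caller-supplied function (an assumption
    of this sketch) that reruns one build and returns "SUCCESS" or "FAILURE".
    Returns +1 once every flaky build has succeeded, or -1 if any of them
    is still failing after `max_retries` attempts.
    """
    still_failing = list(flaky_modes)
    for attempt in range(1, max_retries + 1):
        # Rerun only the builds that have not succeeded yet.
        still_failing = [
            mode for mode in still_failing
            if run_build(mode, attempt) != "SUCCESS"
        ]
        if not still_failing:
            return 1
    return -1 if still_failing else 1
```

Builds that succeed on an early attempt are not rerun, so a single stubborn build does not cause the healthy ones to be rescheduled.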
On Tue, Feb 21, 2017 at 8:38 AM, lucamilanesio <luca.mi...@gmail.com> wrote:
> Looking at the failures, I can see the following:
> - We already run 3 builds for each change, using different NoteDb modes.
> - When there is some flakiness, like the one mentioned by Shawn, it typically affects only some of the 3 builds
> - When all the 3 builds are failing, it is typically for a genuine code (or test) error
> I'd suggest then to check the status of the 3 builds and, should some of them fail, go into a retry cycle (3 times?).
> What do you think?

SGTM
I've implemented a change to the CI flow that detects and retries flaky builds.
I have to say that when we had Buck and Bazel in parallel, we noticed the Bazel builds were a bit more flaky :-(
However, with the retry logic implemented, things are going much better.
See below one example of a build that would have failed, but succeeded with the 2nd retry:

                Builds status:
                  bazel/notedbPrimary : SUCCESS (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5362/console)
                  bazel/reviewdb : SUCCESS (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5360/console)
                  bazel/notedbReadWrite : FAILURE (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5361/console)
            } // failed
        }
        // FAILURE ignored
    }
    ** FLAKY Builds detected: [bazel/notedbReadWrite]
    ignore(FAILURE) {
        parallel {
            retry (attempt 1) {
                Schedule job Gerrit-verifier-bazel
                Build Gerrit-verifier-bazel #5363 started
                Gerrit-verifier-bazel #5363 completed
                Builds status:
                  bazel/notedbPrimary : SUCCESS (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5362/console)
                  bazel/reviewdb : SUCCESS (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5360/console)
                  bazel/notedbReadWrite : SUCCESS (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5363/console)
            }
        }
        // SUCCESS ignored
    }
    ----------------------------------------------------------------------------
    Gerrit Review: Verified=1 to change 97818/ffb06c28ba727d7b337f7bffea438c7b6d4dc4c1
    ----------------------------------------------------------------------------
    Finished: SUCCESS
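
For anyone wanting to count how often a retry rescues a build, the "Builds status" entries in the log above are regular enough to scrape. The following Python sketch assumes the `name : RESULT (url)` layout seen in this thread; `STATUS_RE`, `parse_statuses` and `summarize` are hypothetical helper names, not part of the CI tooling.

```python
import re

# Matches entries such as:
#   bazel/reviewdb : SUCCESS (https://.../Gerrit-verifier-bazel/5360/console)
STATUS_RE = re.compile(r"([\w/]+)\s+:\s+([A-Z_]+)\s+\((https?://[^\s)]+)\)")


def parse_statuses(log_text):
    """Return (build name, result, console URL) tuples in log order."""
    return STATUS_RE.findall(log_text)


def summarize(log_text):
    """Return (final result per build, builds that failed and later passed)."""
    final = {}
    ever_failed = set()
    for name, result, _url in parse_statuses(log_text):
        final[name] = result  # later entries override earlier ones
        if result != "SUCCESS":
            ever_failed.add(name)
    recovered = sorted(n for n in ever_failed if final[n] == "SUCCESS")
    return final, recovered


# Excerpt of the status entries from the log in this thread.
SAMPLE = """
bazel/notedbReadWrite : FAILURE (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5361/console)
bazel/notedbReadWrite : SUCCESS (https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/5363/console)
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Because later entries override earlier ones, a build that fails first and passes on retry ends up with a final `SUCCESS` and is listed as recovered.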
1) pushWithoutChangeId(com.google.gerrit.acceptance.git.HttpPushForReviewIT) org.eclipse.jgit.api.errors.TransportException: Socket closed
As the breakage is random and makes the build flaky, it is sporadic and hard to reproduce :-(
On 23 Feb 2017, at 07:57, David Ostrovsky <david.o...@gmail.com> wrote:
On Thursday, February 23, 2017 at 8:34:45 AM UTC+1, lucamilanesio wrote:
> As the breakage is random and makes the build flaky, it means that is sporadic and hard to reproduce :-(

I think we have a multi-faceted problem here: docker, jenkins, bazel, flaky gerrit tests.
Still, we should try to identify similar failure patterns and try to address them by opening issues on the respective upstream projects.
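
One possible starting point for spotting similar failure patterns is to normalize the volatile parts of each error message (addresses, PIDs, line numbers) and count what remains. This is only a Python sketch under that assumption; `signature` and `group_failures` are hypothetical names, and the second pointer value in the usage example is made up for illustration.

```python
import re
from collections import Counter


def signature(message):
    """Reduce an error message to a stable signature for grouping."""
    sig = re.sub(r"0x[0-9a-fA-F]+", "<addr>", message)  # pointer values
    sig = re.sub(r"\b\d+\b", "<n>", sig)                # PIDs, line numbers
    return sig


def group_failures(messages):
    """Count how many failures share each normalized signature."""
    return Counter(signature(m) for m in messages)


if __name__ == "__main__":
    failures = [
        "free(): invalid pointer: 0x00000007c0035528",
        "free(): invalid pointer: 0x00000007c0041a10",  # made-up address
        "org.eclipse.jgit.api.errors.TransportException: Socket closed",
    ]
    for sig, count in group_failures(failures).most_common():
        print(count, sig)
```

Signatures that recur across unrelated changes would be the natural candidates for an upstream issue.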
Since the introduction of the change on the Gerrit CI flow ... the green is predominant again :-)
Luca.
On 23 Feb 2017, at 07:57, David Ostrovsky <david.o...@gmail.com> wrote:
> On Thursday, February 23, 2017 at 8:34:45 AM UTC+1, lucamilanesio wrote:
>> As the breakage is random and makes the build flaky, it means that is sporadic and hard to reproduce :-(
>
> I think we have a multi-faceted problem here: docker

Do you believe the flakiness comes from Docker? I could set up a physical slave to check if that is the case.

> , jenkins

The build runs on a machine that has no Jenkins setup, only the minimum set of packages needed to build Gerrit.

> , bazel,

Yes.

> flaky gerrit tests.

Yes, but the SEGV highlighted by Shawn wasn't on a flaky test.

> Still, we should try to identify similar failure patterns and try to address them by opening issues on the respective upstream projects.

Seems easy to say, more difficult to implement :-(
Nice, this looks much better :)
Still, the number of flaky Gerrit tests is worrisome.
> Nice, this looks much better :)
> Still the number of flaky Gerrit tests is worrisome.

So, to be honest with you, *LOTS* of the tests I have analysed recently are inherently flaky :-(
Shall we start tracking them somewhere? Our issue tracker?