Tests broken on master (since the 19th of September!)

Luca Milanesio

Oct 2, 2014, 1:22:49 PM
to repo-discuss
Hi all,
It seems that a couple of tests have been broken on the master branch since the 19th of September (quite a long time)!

FAIL    104.0s 70 Passed   2 Skipped   2 Failed   com.google.gerrit.server.query.change.LuceneQueryChangesTest
FAILURE bySize: expected singleton List<ChangeInfo>, found [] for [added:3]
java.lang.AssertionError: expected singleton List<ChangeInfo>, found [] for [added:3]
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at com.google.gerrit.server.query.change.AbstractQueryChangesTest.queryOne(AbstractQueryChangesTest.java:1097)
	at com.google.gerrit.server.query.change.AbstractQueryChangesTest.bySize(AbstractQueryChangesTest.java:894)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.junit.runners.Suite.runChild(Suite.java:127)
	at org.junit.runners.Suite.runChild(Suite.java:26)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.junit.runners.Suite.runChild(Suite.java:127)
	at org.junit.runners.Suite.runChild(Suite.java:26)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:138)
	at com.facebook.buck.junit.JUnitRunner.run(JUnitRunner.java:139)
	at com.facebook.buck.junit.Main.main(Main.java:102)

FAILURE byPathRegex[noteDbEnabled]: expected singleton List<ChangeInfo>, found [] for [path:^dir.file.*]
java.lang.AssertionError: expected singleton List<ChangeInfo>, found [] for [path:^dir.file.*]
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at com.google.gerrit.server.query.change.AbstractQueryChangesTest.queryOne(AbstractQueryChangesTest.java:1097)
	at com.google.gerrit.server.query.change.AbstractQueryChangesTest.byPathRegex(AbstractQueryChangesTest.java:773)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.junit.runners.Suite.runChild(Suite.java:127)
	at org.junit.runners.Suite.runChild(Suite.java:26)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.junit.runners.Suite.runChild(Suite.java:127)
	at org.junit.runners.Suite.runChild(Suite.java:26)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:138)
	at com.facebook.buck.junit.JUnitRunner.run(JUnitRunner.java:139)
	at com.facebook.buck.junit.Main.main(Main.java:102)

Has anyone tried to run the full set of tests recently to see if the two above work for them?

Luca.

Dave Borowitz

Oct 2, 2014, 1:34:07 PM
to Luca Milanesio, repo-discuss
Works for me at 126f802.

I want to say we've had some flakiness in the Lucene tests, possibly a bug in NRT stuff, but in my experience the flakiness rate has been <<10%, essentially unnoticeable.

If you are seeing this fail 100% of the time, I'd like to know what's different about your environment that could help us reproduce and track this down.

--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hugo Arès

Oct 2, 2014, 1:59:25 PM
to repo-d...@googlegroups.com, luca.mi...@gmail.com
On my laptop this test has never failed, but in my Jenkins environment, where I build master on a daily basis, the flakiness rate is 65%. If you want to test a fix for this, I could try it in my environment.

Hugo

David Pursehouse

Oct 2, 2014, 8:09:19 PM
to Hugo Arès, repo-d...@googlegroups.com, luca.mi...@gmail.com
I've seen this failure a few times. I switch between branches a lot,
and thought it was a buck cache issue - doing a `buck clean` and
building again fixes it for me.

Hugo Arès

Oct 3, 2014, 8:58:52 AM
to repo-d...@googlegroups.com, hugo...@ericsson.com, luca.mi...@gmail.com
In the Jenkins job where I get a flakiness rate of 65%, I always delete the workspace before I check out and build, so I do not think it is related to the buck cache in my case.

Luca Milanesio

Oct 3, 2014, 8:59:30 AM
to David Pursehouse, Hugo Arès, repo-d...@googlegroups.com
Thanks, will give it a try :-)

Luca.

lucamilanesio

Oct 6, 2014, 4:06:29 AM
to repo-d...@googlegroups.com, david.pu...@sonymobile.com, hugo...@ericsson.com
No luck ... still broken.
I will now try on different machines and let you know.

It is strange, however, that it was working perfectly until the 18th of September ... and has been broken *constantly* (not flaky) since then.

Do the tests make any assumption about the parallelism of the machine?
(The CI has a single processor, whilst most machines nowadays have at least 4 cores.)

Luca.


lucamilanesio

Oct 6, 2014, 4:41:52 AM
to repo-d...@googlegroups.com, david.pu...@sonymobile.com, hugo...@ericsson.com
It also seems that the entire suite typically took 35 minutes ... and when it started failing, the overall execution time almost doubled to 1h!

Does anybody have a clue about any massive change that could have doubled the test execution time?

Luca.

David Pursehouse

Oct 6, 2014, 5:24:20 AM
to lucamilanesio, repo-d...@googlegroups.com, hugo...@ericsson.com
On 10/06/2014 05:41 PM, lucamilanesio wrote:
> It seems as well that the entire suite was typically taking 35 minutes
> ... and when started failing the overall execution time was actually
> almost doubled to 1h !!!
>
> Anybody any clue of massive changes introduced that could have doubled
> the tests execution time?
>

Since [1] the push tests are being executed with two configurations:
note DB enabled and not enabled.

That change was merged on the 25th. When exactly did you notice the
execution time has doubled?


[1] https://gerrit-review.googlesource.com/#/c/60295/

Luca Milanesio

Oct 6, 2014, 5:31:50 AM
to David Pursehouse, repo-d...@googlegroups.com, hugo...@ericsson.com
It was between the 18th and 19th of September.

The build on the 18th of September (http://ci.gerritforge.com/job/Gerrit-master-test/358/) took 35 mins, and the one on the day after, the 19th of September (http://ci.gerritforge.com/job/Gerrit-master-test/359/), took 1h.
Executing all the tests twice (with and without NoteDB) would explain it :-)

Luca.

Luca Milanesio

Oct 6, 2014, 5:33:59 AM
to David Pursehouse, repo-d...@googlegroups.com, hugo...@ericsson.com
The two tests mentioned (AbstractQueryChangesTest.bySize and AbstractQueryChangesTest.byPathRegex) fail with noteDbEnabled, whilst they do not fail when NoteDB is disabled.
Possibly some stability issues with NoteDB?

Luca.

Goerler, Adrian

Oct 6, 2014, 8:09:33 AM
to Hugo Arès, luca.mi...@gmail.com, repo-d...@googlegroups.com
We have been observing instabilities with these tests as well. Typically the tests run fine on a local machine, but in our Jenkins environment we have sporadic failures. Then the buck cache kicks in and caches the sporadic failures ;-(

I have investigated this for a while and suspect (not 100% sure) that Lucene does not guarantee that changes become visible immediately, but only eventually (in near real time).

To circumvent this issue, we have a local change in place that allows for some delay until the changes become visible. Since we started running the query tests with this change in place, we have been observing stable behaviour.
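
A minimal sketch of the kind of delay described above, not the actual change: it assumes a hypothetical helper, `queryOneEventually`, that re-runs a query until exactly one result is visible or a timeout expires. The class name, method name, and parameters are illustrative only and do not come from Gerrit or Lucene.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical test helper; not part of Gerrit or Lucene.
final class EventuallyVisible {
  private EventuallyVisible() {}

  /**
   * Re-runs the query until it returns exactly one result or the timeout
   * expires, sleeping pollMs milliseconds between attempts.
   */
  static <T> T queryOneEventually(Supplier<List<T>> query, long timeoutMs, long pollMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    List<T> results = query.get();
    // Keep polling while the result is not a singleton and time remains.
    while (results.size() != 1 && System.nanoTime() - deadline < 0) {
      Thread.sleep(pollMs);
      results = query.get();
    }
    if (results.size() != 1) {
      throw new AssertionError("expected singleton list, found " + results);
    }
    return results.get(0);
  }
}
```

With a helper along these lines, a test tolerates an index that catches up within the timeout, while still failing if the change never becomes visible.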

As others seem to be observing similar issues, I’ll propose the change.

-Adrian

Dave Borowitz

Oct 6, 2014, 9:05:07 AM
to Goerler, Adrian, luca milanesio, Hugo Arès, repo-d...@googlegroups.com

If it went from failing 0% to failing 100% then git bisect should be able to figure out the culprit, right?

Luca Milanesio

Oct 6, 2014, 9:10:46 AM
to Dave Borowitz, Goerler, Adrian, Hugo Arès, repo-d...@googlegroups.com
Shouldn't be hard for me to find the exact SHA1 ... let me do a couple of tests and send the breaking change.

Luca.