[Announce] GerritForge CI is about to start parallel docker-ised builds


lucamilanesio

Jun 15, 2016, 12:47:32 PM
to Repo and Gerrit Discussion
Hi Gerrit Contributors,
I am sure you have felt the pain of the slow builds and the long queues when pushing changes to Gerrit :-(

At times, getting a Verified label took up to 1-2 hours, mainly because the builds had to be serialised.
The good news is that things are going to change and get better soon, once [1] is reviewed and merged.

We will start using "dockerized Jenkins slaves" for executing the builds, allowing much more parallelism and off-loading the build executions to dedicated slaves.
The slave nodes can have dependencies pre-downloaded, which will speed up builds and make them more reliable.

Taking the concept to the "next level", we could then start testing the plugins in "dockerized" Gerrit set-ups and get them validated in a real integrated environment.

What do you think?

Luca.

Dave Borowitz

Jun 15, 2016, 1:00:58 PM
to lucamilanesio, Repo and Gerrit Discussion
Sounds awesome Luca, nice work.

--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Zaro

Jun 15, 2016, 2:20:56 PM
to Dave Borowitz, lucamilanesio, Repo and Gerrit Discussion
yeah, +1

Edwin Kempin

Jun 15, 2016, 2:24:36 PM
to Zaro, Dave Borowitz, lucamilanesio, Repo and Gerrit Discussion
Really cool! 
Thanks for all your work on this!

Nasser Grainawi

Jun 15, 2016, 6:24:53 PM
to lucamilanesio, Repo and Gerrit Discussion
On Jun 15, 2016, at 10:47 AM, lucamilanesio <luca.mi...@gmail.com> wrote:

We will start using "dockerized Jenkins slaves" for executing the builds, allowing much more parallelism and off-loading the build executions to dedicated slaves.
The slave nodes can have dependencies pre-downloaded, which will speed up builds and make them more reliable.

Are you going to do this in a layered way? i.e. container that has all the build tool dependencies + container on top with build time deps + container on top with jenkins slave

That would be pretty sweet.



-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, 
a Linux Foundation Collaborative Project

luca.mi...@gmail.com

Jun 15, 2016, 6:34:20 PM
to Nasser Grainawi, Repo and Gerrit Discussion


On 15 Jun 2016, at 23:27, Nasser Grainawi <nas...@codeaurora.org> wrote:


Are you going to do this in a layered way? i.e. container that has all the build tool dependencies + container on top with build time deps + container on top with jenkins slave

Yes, by leveraging the Dockerfile hierarchy mechanism.

The containers could then even be used standalone, should anyone have difficulties preparing a Gerrit dev box.

I know for sure Windows users had issues in the past ... and a Docker machine would ease their pain as well :-)

We could publish them on DockerHub too ;-)
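
The layering discussed above (build tools, then build-time dependencies, then the Jenkins slave) could look roughly like the sketch below. It only generates Dockerfile text in which each image's FROM line points at the layer beneath it. The image names, the base image and the placeholder instructions are all hypothetical and are not taken from the actual GerritForge CI set-up.

```python
# Hypothetical three-layer image hierarchy: each image builds on the previous one.
# Image names, base image and instructions are illustrative assumptions only.
LAYERS = [
    # (image name, parent image, extra Dockerfile lines)
    ("example/gerrit-build-tools", "ubuntu:16.04",
     ["# install the build tool dependencies here"]),
    ("example/gerrit-build-deps", "example/gerrit-build-tools",
     ["# pre-download the build-time dependencies here"]),
    ("example/gerrit-jenkins-slave", "example/gerrit-build-deps",
     ["# install and start the Jenkins slave agent here"]),
]


def render_dockerfile(parent, instructions):
    """Return Dockerfile text whose FROM line points at the parent image."""
    return "\n".join(["FROM " + parent] + list(instructions)) + "\n"


def render_all(layers):
    """Map each image name to the Dockerfile text that would build it."""
    return {name: render_dockerfile(parent, lines) for name, parent, lines in layers}


if __name__ == "__main__":
    for name, text in render_all(LAYERS).items():
        print("# Dockerfile for " + name)
        print(text)
```

Because every layer is a separate image, the lower ones (tools, dependencies) could also be pulled and used standalone, for example as a ready-made dev box.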

Nasser Grainawi

Jun 15, 2016, 7:02:40 PM
to luca.mi...@gmail.com, Repo and Gerrit Discussion
On Jun 15, 2016, at 4:34 PM, luca.mi...@gmail.com wrote:



We could publish them on DockerHub too ;-)

Yeah, then us with Gerrit forks can steal them for our CI too :D





luca.mi...@gmail.com

Jun 16, 2016, 2:20:11 AM
to Nasser Grainawi, Repo and Gerrit Discussion


On 16 Jun 2016, at 00:05, Nasser Grainawi <nas...@codeaurora.org> wrote:

Yeah, then us with Gerrit forks can steal them for our CI too :D

s/steal/leverage and contribute/g :-)

Yes, all the CI infrastructure is designed to be shared and easily reused internally by companies. You are not the only "fork" on the table, I guess, and having a standardised build helps a lot ;-)

Dave Borowitz

Jun 21, 2016, 10:58:49 AM
to Luca Milanesio, Nasser Grainawi, Repo and Gerrit Discussion
Any updates, Luca? I've been seeing a lot of "verification queued" messages and not many actual verifications.

luca.mi...@gmail.com

Jun 21, 2016, 11:17:38 AM
to Dave Borowitz, Nasser Grainawi, Repo and Gerrit Discussion
Seems today the CI is a bit overloaded by lots of broken builds :-(

The parallel execution is still under review ...

Luca

Sent from my iPhone

Luca Milanesio

Jun 21, 2016, 5:05:39 PM
to Dave Borowitz, Nasser Grainawi, Repo and Gerrit Discussion
Yes, it seems that an unusual number of invalid changes under review triggered a long list of failing jobs that have congested the build queue today :-(
To guard against flaky tests and results, in case of failure we repeat the same build 3 times: this makes the build time even longer.
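
As a rough illustration of the retry policy described above, the sketch below runs a build up to three times and stops at the first success. The function names are hypothetical and this is not the actual Jenkins job configuration; it only shows why a genuinely broken change occupies the queue for three full builds.

```python
def run_with_retries(build, max_attempts=3):
    """Call build() until it returns True or max_attempts is reached.

    Returns a (succeeded, attempts_used) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        if build():
            return True, attempt
    return False, max_attempts


def scripted_build(outcomes):
    """Return a fake build callable that yields the given outcomes in order."""
    remaining = iter(outcomes)
    return lambda: next(remaining)


if __name__ == "__main__":
    # A flaky build that fails once and then passes needs two attempts.
    print(run_with_retries(scripted_build([False, True])))
    # A change that is really broken uses all three attempts.
    print(run_with_retries(lambda: False))
```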

A recurring failure is:
java.lang.AbstractMethodError: com.google.gerrit.reviewdb.server.ReviewDb_Schema_GwtOrm$$17.getUnwrappedDb()Lcom/google/gerrit/reviewdb/server/ReviewDb;
	at com.google.gerrit.server.notedb.ChangeNotes$Factory.create(ChangeNotes.java:163)
	at com.google.gerrit.server.query.change.ChangeData.reloadChange(ChangeData.java:726)
	at com.google.gerrit.server.query.change.ChangeData.change(ChangeData.java:713)
	at com.google.gerrit.server.index.change.ChangeField$2.get(ChangeField.java:94)
	at com.google.gerrit.server.index.change.ChangeField$2.get(ChangeField.java:90)
	at com.google.gerrit.server.index.Schema$1.apply(Schema.java:196)
	at com.google.gerrit.server.index.Schema$1.apply(Schema.java:191)
	at com.google.common.collect.Iterators$8.transform(Iterators.java:817)
	at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
	at com.google.common.collect.Iterators$7.computeNext(Iterators.java:674)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
	at com.google.gerrit.lucene.AbstractLuceneIndex.toDocument(AbstractLuceneIndex.java:251)
	at com.google.gerrit.lucene.LuceneChangeIndex.replace(LuceneChangeIndex.java:205)
	at com.google.gerrit.lucene.LuceneChangeIndex.replace(LuceneChangeIndex.java:103)
	at com.google.gerrit.server.index.change.ChangeIndexer.index(ChangeIndexer.java:181)
	at com.google.gerrit.server.index.change.ChangeIndexer$IndexTask.call(ChangeIndexer.java:307)
	at com.google.gerrit.server.index.change.ChangeIndexer$IndexTask.call(ChangeIndexer.java:266)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:372)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Rings any bell?

Luca.

lucamilanesio

Jun 21, 2016, 5:53:37 PM
to Repo and Gerrit Discussion, dbor...@google.com, nas...@codeaurora.org
Anyway ... I'm also preparing a new, faster infrastructure based in Canada with:
- 12 cores
- 480 GB SSD Raid
- 10 GBps
- 500 MBps guaranteed bandwidth

More horsepower (in addition to parallelism) is coming too :-)
By using virtualised Docker slaves spawned on demand, the capacity can then be extended further.

Luca.



Edwin Kempin

Jun 22, 2016, 2:11:14 AM
to lucamilanesio, Repo and Gerrit Discussion, Dave Borowitz, nas...@codeaurora.org
On Tue, Jun 21, 2016 at 11:53 PM, lucamilanesio <luca.mi...@gmail.com> wrote:
Anyway ... I'm preparing as well a new faster infrastructure based in Canada with:
- 12 cores
- 480 GB SSD Raid
- 10 GBps
- 500 MBps guaranteed bandwidth

More horsepower (in addition to parallelism) is coming too :-)
By using virtualised docker slaves spawned on demand ... the power can be then further extended.
Sounds great!! :-) Thanks!!

 


Luca Milanesio

Jun 22, 2016, 3:21:38 AM
to Edwin Kempin, Repo and Gerrit Discussion, Dave Borowitz, nas...@codeaurora.org
I will start exporting it side-by-side with the current one, without right to vote of course :-)

I am also planning to collect analytics to understand and improve the overall workflow, publishing events to HDFS and ES for some analysis and graphs.
(that's why I recently started using RabbitMQ plugin ... and sending some fixes through as well :-) )

Luca.

Björn Pedersen

Jun 22, 2016, 3:47:26 AM
to Repo and Gerrit Discussion, dbor...@google.com, nas...@codeaurora.org

A recurring failure is:
java.lang.AbstractMethodError: com.google.gerrit.reviewdb.server.ReviewDb_Schema_GwtOrm$$17.getUnwrappedDb()Lcom/google/gerrit/reviewdb/server/ReviewDb;
	at com.google.gerrit.server.notedb.ChangeNotes$Factory.create(ChangeNotes.java:163)
Rings any bell?

Luca.

 
Yes, that's caused by https://gerrit-review.googlesource.com/#/c/79087/ (a change that I am actually not sure I will keep). One idea would be to skip retrying if there are CR-2 or VRFD-1 votes from other users as well?
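
The idea of skipping retries for changes that reviewers have already rejected could be sketched as below. This is only an illustration: the way votes are represented here (a list of label/value pairs) is an assumption, not a Gerrit or Jenkins API, with "CR-2" read as Code-Review -2 and "VRFD-1" as Verified -1.

```python
# Votes that signal the change is already known to be bad.
# The (label, value) representation is a hypothetical simplification.
BLOCKING_VOTES = {("Code-Review", -2), ("Verified", -1)}


def should_retry(votes_from_other_users):
    """Return False if any other user has cast a CR-2 or VRFD-1 vote."""
    return not any(vote in BLOCKING_VOTES for vote in votes_from_other_users)


if __name__ == "__main__":
    print(should_retry([("Code-Review", 1)]))                     # no blocking vote
    print(should_retry([("Code-Review", 2), ("Verified", -1)]))   # VRFD-1 present
```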

Luca Milanesio

Jun 22, 2016, 3:52:16 AM
to Björn Pedersen, Repo and Gerrit Discussion, dbor...@google.com, nas...@codeaurora.org
Once the parallel builds are in place, it won't be a problem anymore :-)
Jenkins already has plenty of mechanisms to manage those situations; no need to add any extra logic IMHO.

Luca.

lucamilanesio

Jun 22, 2016, 5:57:26 AM
to Repo and Gerrit Discussion, dbor...@google.com, nas...@codeaurora.org
DavidP was proposing a hashtag on the changes to skip the CI build: this could be used when the change's author is aware of a serious bug or problem with that change and does not want to be spammed by gerrit-ci.
E.g. #skipbuild

I think this is a great idea :-)

Luca.
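
A minimal sketch of the hashtag check proposed above: a change is built unless its hashtags include "skipbuild". How the CI would obtain the hashtags of a change is not covered in this thread, so the plain list of strings used here is an assumption for illustration.

```python
SKIP_HASHTAG = "skipbuild"  # from the "#skipbuild" example in the post above


def should_build(hashtags):
    """Return False when the change carries the skip hashtag.

    Hashtags are compared case-insensitively, with or without a leading '#'.
    """
    normalized = {tag.lstrip("#").lower() for tag in hashtags}
    return SKIP_HASHTAG not in normalized


if __name__ == "__main__":
    print(should_build(["wip"]))
    print(should_build(["#skipbuild", "wip"]))
```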

lucamilanesio

Jun 28, 2016, 6:47:00 AM
to Repo and Gerrit Discussion, luca.mi...@gmail.com, dbor...@google.com, nas...@codeaurora.org
One step closer: the changes are working and I've set-up the new environment, which will run in parallel with the current one for a couple of days.
(if you are curious: see jenkins.gerritcentral.com)

I plan to move gerrit-ci.gerritforge.com to the new parallel environment in the next couple of days, if no problems arise.

Luca.


lucamilanesio

Jun 29, 2016, 3:55:49 AM
to Repo and Gerrit Discussion, luca.mi...@gmail.com, dbor...@google.com, nas...@codeaurora.org
From the first night of parallel builds on Jenkins .... so far so good :-)

Overall build time is more volatile than before: it now ranges from 4' to 12' (the worst case, with 5 concurrent builds). This is kind of expected, because Buck already does a great job of using a multi-core machine even for a single build. Even with 5 concurrent builds, running them in parallel pays off: running them serialised would have taken 4 x 5 = 20', about 67% more time than the 12' worst case.
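
The comparison above can be checked with a few lines of arithmetic. The sketch assumes that each serialised build would take the 4' best case and that the 12' worst case covers all 5 concurrent builds; under those assumptions the serial run is about 67% longer than the parallel one, or equivalently the parallel run saves 40% of the serial time.

```python
single_build_min = 4      # fastest observed build, in minutes
concurrent_builds = 5
parallel_worst_min = 12   # worst case observed with 5 concurrent builds

# Serialised execution: the builds run one after another.
serial_min = single_build_min * concurrent_builds

# Extra time of serial execution relative to the parallel worst case.
extra_vs_parallel = (serial_min - parallel_worst_min) / parallel_worst_min

# Time saved by parallel execution relative to the serial total.
saved_vs_serial = (serial_min - parallel_worst_min) / serial_min

print("serial total: %d min" % serial_min)
print("serial is %.0f%% longer than parallel" % (extra_vs_parallel * 100))
print("parallel saves %.0f%% of the serial time" % (saved_vs_serial * 100))
```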

*Appeal* If any of the companies contributing to Gerrit has *spare capacity* on their servers, I can easily configure additional remote Docker servers to spawn additional containers and have more horsepower :-)

As things are progressing very well, I am planning to roll out the new CI by today COB B(rexit)ST.

Luca.

lucamilanesio

Jun 29, 2016, 6:29:14 AM
to Repo and Gerrit Discussion, luca.mi...@gmail.com, dbor...@google.com, nas...@codeaurora.org
... starting the roll-out now: it should take a few minutes.

Luca.

lucamilanesio

Jun 29, 2016, 7:13:06 AM
to Repo and Gerrit Discussion, luca.mi...@gmail.com, dbor...@google.com, nas...@codeaurora.org
Roll-out completed :-) Enjoy parallel and faster builds for Gerrit !!!

lucamilanesio

Jun 29, 2016, 7:16:19 AM
to Repo and Gerrit Discussion, luca.mi...@gmail.com, dbor...@google.com, nas...@codeaurora.org
The new IP address is propagating through DNS (see [1]); in the meantime the old NGINX redirects HTTP and HTTPS traffic to the new IP.

lucamilanesio

Jun 29, 2016, 9:08:40 AM
to Repo and Gerrit Discussion, luca.mi...@gmail.com, dbor...@google.com, nas...@codeaurora.org


Propagation is complete and the new Gerrit CI site is fully operational :-)

See below what a parallel build looks like:

