downloading NaCl's IRT in gclient hooks considered problematic

Paweł Hajdan, Jr.

9 May 2011, 02:45:59
to chromium-dev
About two weeks ago a gclient hook was added to download NaCl's IRT (integrated runtime) on each gclient sync: http://src.chromium.org/viewvc/chrome/trunk/src/build/download_nacl_irt.py?view=log

This is probably fine for the developers' workflow and the buildbot (although some valid concerns have been raised in http://codereview.chromium.org/6893080), but it breaks horribly for Linux distro packages. We're creating tarballs on http://build.chromium.org/official/ , and the script behind it uses --nohooks (otherwise we run into problems like http://bugs.gentoo.org/show_bug.cgi?id=337543, and it's not guaranteed that the machine generating tarballs has all build dependencies; embedding makefiles into the tarballs, even if they can get overwritten, is also not necessarily the best idea).

Now the immediate problem that download_nacl_irt.py is causing is http://bugs.gentoo.org/show_bug.cgi?id=366413 (since we don't run the hook, the build fails because of missing files). It even fails when -Ddisable_nacl=1 is passed to gyp, which makes it impossible to work around cleanly. The package should not be downloading any files from the Internet after unpacking the tarball, so everything should be in the tarball.

Is it really needed to have a script like download_nacl_irt.py? Why don't we handle this like everything else, i.e. DEPS rolls and possibly canary buildbots which always use the latest builds?

Mark Seaborn

9 May 2011, 09:01:15
to phajd...@chromium.org, chromium-dev, bradn...@chromium.org
On 9 May 2011 07:45, Paweł Hajdan, Jr. <phajd...@chromium.org> wrote:
> About two weeks ago a gclient hook was added to download NaCl's IRT (integrated runtime) on each gclient sync: http://src.chromium.org/viewvc/chrome/trunk/src/build/download_nacl_irt.py?view=log
>
> This is probably fine for the developers' workflow and the buildbot (although some valid concerns have been raised in http://codereview.chromium.org/6893080), but it breaks horribly for Linux distro packages. We're creating tarballs on http://build.chromium.org/official/ , and the script behind it uses --nohooks (otherwise we run into problems like http://bugs.gentoo.org/show_bug.cgi?id=337543, and it's not guaranteed that the machine generating tarballs has all build dependencies; embedding makefiles into the tarballs, even if they can get overwritten, is also not necessarily the best idea).
>
> Now the immediate problem that download_nacl_irt.py is causing is http://bugs.gentoo.org/show_bug.cgi?id=366413 (since we don't run the hook, the build fails because of missing files). It even fails when -Ddisable_nacl=1 is passed to gyp, which makes it impossible to work around cleanly.

As a short term fix, we can make it work with 'disable_nacl=1'.  Here's a change to do that:  http://codereview.chromium.org/6968007/

We can also make download_nacl_irt.py invokable directly to address the problem that "Calling the download_nacl_irt.py script from the ebuild is difficult because we would need to figure out the nacl_revision and file_hash values."  Here's a change to do that:  http://codereview.chromium.org/6966010/

Would that help, or is it still a problem to have pre-built binaries in your source tarball?


> The package should not be downloading any files from the Internet after unpacking the tarball, so everything should be in the tarball.
>
> Is it really needed to have a script like download_nacl_irt.py? Why don't we handle this like everything else, i.e. DEPS rolls and possibly canary buildbots which always use the latest builds?

Everything else in the Chromium build is (AFAIK) built using the host OS's toolchain.  Native Client's IRT library is different in that it is built as NaCl untrusted code, so it needs to be built using the NaCl toolchain (nacl-gcc).

If we wanted to build the IRT library from source inside the Chromium build, we'd need to do two things:

 1) Pull the NaCl toolchain into the Chromium build, either by pulling in binaries or by building the toolchain from source.  The toolchain is big, so this is more awkward than pulling in pre-built binaries for the IRT library.
 2) Extend Gyp to be able to build NaCl untrusted code.  Since Gyp has built-in knowledge of how to use the host OS's toolchain (to a greater degree than Make or Scons), this may not be trivial.

Cheers,
Mark

Paweł Hajdan, Jr.

9 May 2011, 10:20:38
to Mark Seaborn, chromium-dev, bradn...@chromium.org
Wow, thank you for the quick response! I commented on both code review issues.

On Mon, May 9, 2011 at 15:01, Mark Seaborn <msea...@chromium.org> wrote:
> As a short term fix, we can make it work with 'disable_nacl=1'.  Here's a change to do that:  http://codereview.chromium.org/6968007/

Yeah, wrapping NaCl-specific code in disable_nacl!=1 generally sounds like a good idea.

> We can also make download_nacl_irt.py invokable directly to address the problem that "Calling the download_nacl_irt.py script from the ebuild is difficult because we would need to figure out the nacl_revision and file_hash values."  Here's a change to do that:  http://codereview.chromium.org/6966010/

I commented on technical details, but it seems to be a change in the right direction.

> Would that help, or is it still a problem to have pre-built binaries in your source tarball?

My first concern is just to make it compile. Then of course building everything from source would be better than pre-built binaries, but if a special toolchain is needed (and also other changes like in gyp), I'm fine with the binaries, especially since it still seems somewhat experimental. It's clearly a trade-off, and I wouldn't like to inconvenience people working on NaCl. When things become more stable and mature, it would be nice to revisit this.

Antoine Labour

9 May 2011, 12:55:54
to phajd...@chromium.org, chromium-dev
+1 to it being problematic, though for a different reason in my case. In my environment I need to pass extra parameters to gyp so that I can concurrently generate makefiles for different configurations - "normal", chrome os, arm, ...
So I run gclient sync --nohooks so that updating doesn't clobber the makefiles with another configuration. Much to my surprise, my build started failing when this was introduced. I don't have a good solution though.

Antoine 

--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

Jói Sigurðsson

9 May 2011, 13:09:09
to pi...@google.com, phajd...@chromium.org, chromium-dev
Is there a reason you need to use a script executed by [gclient
runhooks] to pull down the pre-built binaries instead of checking the
pre-built binaries into an SVN repository somewhere and pulling them
down via DEPS during [gclient sync]? If not then that seems like a
simple short-term solution; apart from fixing the issues Paweł and
Antoine have mentioned, it seems like it would also help make the
build repeatable.

Cheers,
Jói

Mark Seaborn

9 May 2011, 13:28:35
to j...@chromium.org, bradn...@chromium.org, pi...@google.com, phajd...@chromium.org, chromium-dev
On 9 May 2011 18:09, Jói Sigurðsson <j...@chromium.org> wrote:
> Is there a reason you need to use a script executed by [gclient
> runhooks] to pull down the pre-built binaries instead of checking the
> pre-built binaries into an SVN repository somewhere and pulling them
> down via DEPS during [gclient sync]?

This came up before on http://codereview.chromium.org/6893080.

The binaries are currently uploaded automatically by Native Client's Buildbot because they change frequently, so if we were to put them into SVN rather than uploading them to Google Storage for Developers, we would have to have an SVN server that can cope with that volume of data.  Currently it is roughly 6MB per commit.

Brad Nelson set up some of this infrastructure and knows more about its limitations than me, so I'll defer to him for further explanation. :-)

Cheers,
Mark

Bradley Nelson

9 May 2011, 13:56:16
to Mark Seaborn, j...@chromium.org, pi...@google.com, phajd...@chromium.org, chromium-dev
We did look at checking into SVN, but decided against it because:
1. 6MB files x ~10 commits per day was a turn-off to the folks running our svn server.
2. We were concerned about reproducibility as well (as these actually go all the way down the pipeline and get baked into release builds). Short of having an SVN rev, we figured that archiving and storing both URL + sha1, while not guaranteeing we wouldn't lose the archive, would at least guarantee we'd detect it if we had.
3. While we could have set up a separate SVN server specifically for these artifacts, we concluded that if this were separate from the nacl svn repo (on chrome's svn server), it really would provide only one desirable attribute: no modification of history allowed. In other regards, in terms of migration, it would be less easy to relocate. By using google storage, we also believe that we will be less likely to exceed capacity limits and need to relocate.
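
The "URL + sha1" idea in point 2 can be illustrated with a minimal Python sketch. This is not the code of download_nacl_irt.py; the function names and the stand-in payload are hypothetical, and only the standard-library hashlib calls are real. It shows how a pinned hash lets a consumer detect that an archive has changed, even though it cannot prevent the change.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a lowercase hex string."""
    return hashlib.sha1(data).hexdigest()


def check_archive(data: bytes, expected_sha1: str) -> None:
    """Raise ValueError if data does not match the pinned hash."""
    actual = sha1_hex(data)
    if actual != expected_sha1.lower():
        raise ValueError(
            "hash mismatch: expected %s, got %s" % (expected_sha1, actual))


# Hypothetical stand-in for the bytes of a downloaded IRT binary.
payload = b"pretend this is irt_x86_64.nexe"

# The hash that would be recorded next to the revision when it is pinned.
pinned = sha1_hex(payload)

# A later download is accepted only if it still matches the pinned hash.
check_archive(payload, pinned)
```

If the archive were replaced or corrupted after the hash was recorded, check_archive would raise instead of letting the altered file flow into a build.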

We considered building IRT in the chrome tree as well (it's not particularly large). But this would have entailed downloading another large binary blob (nacl toolchain ~100MB+). We are actually doing this already for one test on some builders (but this doesn't affect most people at their desk). We concluded this was an ineffective gesture at reproducibility, since if we also wanted to build the toolchain (rather than relying on another download) it would require about 4hrs on windows. Strictly, the IRT can be built on linux and then used on all platforms, but this would further complicate the build. The thinking was that downloading the prebuilt 6MB file on chrome's end would be the least disruptive for chromium devs.

We considered doing the download as a gyp build step in chrome's build, but concluded that this felt disturbingly non-deterministic (it's more ok for your source control sync to fail on a bad download than your compile). We ruled out downloading tip of trunk, as again, this gets baked into releasing chrome builds on all channels without further rebuild.

Sorry about running the step everywhere. I was aware that not everyone was building nacl in, but had the impression that everyone was checking it out when it was noted that parts of gyp files seemed to assume so. Apparently not the case.

Now that Mark has it properly gated on nacl_enabled does anyone object to the current solution?

I'm not 100% happy with it (On the archiving end for instance there is no guarantee that archives don't get covered, just the hash to validate). I have some general notion that I'd like to wrap google storage with a sort of write once archival tool on AppEngine. I don't believe I can structure the desired behavior out of raw ACLs, but I may be able to have archivers write to a side location, then have a separate trusted service move the archives into place and enforce write-once. For that matter the toolchain archive has the same issue.
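
The write-once behavior described above can be sketched in a few lines of Python. This is an in-memory illustration only: the class name, the key format, and the storage model are all hypothetical, and it says nothing about how Google Storage ACLs or an AppEngine service would actually be wired up.

```python
class WriteOnceStore:
    """Toy archive that accepts each key exactly once (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        # Refuse any attempt to replace an archive that already exists.
        if key in self._objects:
            raise KeyError("refusing to overwrite %r" % key)
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]


store = WriteOnceStore()
store.put("irt/r5068/irt_x86_64.nexe", b"first upload")

try:
    store.put("irt/r5068/irt_x86_64.nexe", b"second upload")
except KeyError as err:
    print("rejected:", err)
```

The point of the design is that the trusted mover, not the archiver, decides whether a key may be written, so a misbehaving archiver cannot silently replace an earlier upload.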

-BradN

Jói Sigurðsson

9 May 2011, 14:13:10
to Bradley Nelson, Mark Seaborn, pi...@google.com, phajd...@chromium.org, chromium-dev
I don't object if everything is behind the nacl_enabled flag, but just
one thought:

> 1. 6MB files x ~10 commits per day was a turn off to the folks running our
> svn server.

Would it reduce the burden if these files were only checked in
whenever the NaCl revision used by Chrome is rolled in DEPS? I'm
guessing this is much less often.

Cheers,
Jói

Bradley Nelson

9 May 2011, 14:29:19
to Jói Sigurðsson, Mark Seaborn, pi...@google.com, phajd...@chromium.org, chromium-dev
On Mon, May 9, 2011 at 11:13 AM, Jói Sigurðsson <j...@chromium.org> wrote:
> I don't object if everything is behind the nacl_enabled flag, but just
> one thought:
>
> > 1. 6MB files x ~10 commits per day was a turn off to the folks running our
> > svn server.
>
> Would it reduce the burden if these files were only checked in
> whenever the NaCl revision used by Chrome is rolled in DEPS?  I'm
> guessing this is much less often.


Perhaps. Trouble is, typically we don't know a priori what revision of nacl will end up in chrome. So we'd either have to have the person rolling the DEPS do the irt build and make that part of their CL (painful in the presence of trybots that don't handle binary patches and a hassle for integration builders, i.e. they're now different than a 'regular' build) or we archive it in this same fashion and then at some later point check it in (potentially the same repro issues).

Additionally we're being encouraged to move to push the nacl DEPS roll daily, so the binary churn may be similar. (Admittedly it's more like every 2 weeks currently).
Also if we checked it in on DEPS roll, the logical place would be into the chrome svn repo, which might make git mirroring folks sad. I suppose a separate side repo could be used and then a deps change pointing to that, but then we'd also need to carefully track the correspondence to nacl's repo revs.

-BradN

Mark Mentovai

9 May 2011, 14:56:59
to bradn...@google.com, Jói Sigurðsson, Mark Seaborn, pi...@google.com, phajd...@chromium.org, chromium-dev
Bradley Nelson wrote:
> Also if we checked it in on DEPS roll, the logical place would be into the
> chrome svn repo, which might make git mirroring folks sad. I suppose a
> separate side repo could be used and then a deps change pointing to that,
> but then we'd also need to carefully track the correspondence to nacl's repo
> revs.

It seems that you’ve got to track this correspondence anyway, whether
a specific build of the IRT is identified by a hash or a
Subversion revision number. At least as far as this aspect is
concerned, the download script is no better than having a side
Subversion repository.

Bradley Nelson

9 May 2011, 15:17:07
to Mark Mentovai, Jói Sigurðsson, Mark Seaborn, pi...@google.com, phajd...@chromium.org, chromium-dev
Well but the archives are stored at locations organized by nacl's revision number (part of the url they're at like http://gsdview.appspot.com/nativeclient-archive2/irt/r5068/ ), so the nacl revision is implicit (given a nacl rev you know the corresponding download url).
I suppose an SVN checkin could have the nacl rev in log text or something, but then you'd have to scan the svn logs or something to find the right rev.
(and the hassle of automated svn checkins).
Though admittedly there is a window of time where you're blindly trusting the integrity of the archive (from the nacl commit to the chrome deps roll).
Ideally, the hash would be locked in the moment the builder emits it (granted an svn repo would have this property). Maybe I need to write that write-once app soon...
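
A small Python sketch of "given a nacl rev you know the corresponding download url". The URL layout is taken from the commondatastorage.googleapis.com address quoted in the 404 error later in this thread; the function name is hypothetical, and the assumption that other architectures follow the same irt_<arch>.nexe naming is not confirmed by the thread.

```python
# Base location, copied from the download URL quoted elsewhere in this thread.
BASE_URL = "http://commondatastorage.googleapis.com/nativeclient-archive2/irt"


def irt_url(nacl_revision: int, arch: str) -> str:
    """Build the archive URL for one NaCl revision and architecture.

    The irt_<arch>.nexe file naming is an assumption generalized from
    the single irt_x86_64.nexe URL that appears in the thread.
    """
    return "%s/r%d/irt_%s.nexe" % (BASE_URL, nacl_revision, arch)


print(irt_url(5443, "x86_64"))
```

Because the revision is part of the path, no separate lookup table from NaCl revisions to archive locations is needed.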

-BradN 

Mark Mentovai

9 May 2011, 15:28:36
to Bradley Nelson, Jói Sigurðsson, Mark Seaborn, pi...@google.com, phajd...@chromium.org, chromium-dev
Bradley Nelson wrote:
> Well but the archives are stored at locations organized by nacl's revision
> number (part of the url they're at
> like http://gsdview.appspot.com/nativeclient-archive2/irt/r5068/ ), so the
> nacl revision is implicit (given a nacl rev you know the corresponding
> download url).

For that matter, in a Subversion checkin, you could just use something
like http://nacl-irt-binaries.googlecode.com/svn/5068. Now it’s
written right into the path again. There isn’t any requirement that
you replace the files as time goes on, and since nobody is likely to
check out the repository in its entirety, it’s not likely to make
things cumbersome for anyone.

Bradley Nelson

9 May 2011, 15:31:47
to Mark Mentovai, Jói Sigurðsson, Mark Seaborn, pi...@google.com, phajd...@chromium.org, chromium-dev
Ah, very good point (hadn't considered that!).

-BradN


Nico Weber

25 May 2011, 22:11:30
to msea...@chromium.org, j...@chromium.org, bradn...@chromium.org, Antoine Labour, Paweł Hajdan Jr., chromium-dev
Hi Mark & Brad,

I just tried to update the NaCl version used in chrome. I saw this
comment in DEPS:

# These hashes need to be updated when nacl_revision is changed.
# After changing nacl_revision, run gclient sync to get the new values.

When I ran `gclient sync` after bumping up nacl_revision to 5443
(which has a build file dependency fix that gyp rolls are blocked on),
I got a long stack ending in
"http://commondatastorage.googleapis.com/nativeclient-archive2/irt/r5443/irt_x86_64.nexe
– urllib2.HTTPError: HTTP Error 404: Not Found"

By trial and error I discovered that 5445 does have a nexe, so I'm
trying to roll to that. Is there a way to know which nacl revisions
are valid roll targets?

Thanks,
Nico

Mark Seaborn

25 May 2011, 23:06:46
to Nico Weber, chromium-dev
On 26 May 2011 03:11, Nico Weber <tha...@chromium.org> wrote:
> Hi Mark & Brad,
>
> I just tried to update the NaCl version used in chrome. I saw this
> comment in DEPS:
>
>  # These hashes need to be updated when nacl_revision is changed.
>  # After changing nacl_revision, run gclient sync to get the new values.
>
> When I ran `gclient sync` after bumping up nacl_revision to 5443
> (which has a build file dependency fix that gyp rolls are blocked on),
> I got a long stack ending in
> "http://commondatastorage.googleapis.com/nativeclient-archive2/irt/r5443/irt_x86_64.nexe
> – urllib2.HTTPError: HTTP Error 404: Not Found"
>
> By trial and error I discovered that 5445 does have a nexe, so I'm
> trying to roll to that. Is there a way to know which nacl revisions
> are valid roll targets?

One way is to look at http://build.chromium.org/p/client.nacl/console, select the "merge" option at the bottom of the page, and look for the newest SVN revision that got built by the two bots "lucid-newlib-{32,64}-dbg".  (Those are the two bots that upload the IRT binaries.)  These are the two columns which, if you click on a green build, contain the step "archive irt.nexe [stripped] [unstripped]" at the top.

Admittedly that is a bit tedious. :-)

We have a script in the NaCl tree that finds revisions of Chromium for which binaries have been built by Buildbot (native_client/build/chromebinaries.py).  We could do something similar for the IRT library.  How about something like this:  http://codereview.chromium.org/7074003

Currently this prints:
5456 complete ['x86_32', 'x86_64']
5455 complete ['x86_32', 'x86_64']
5454 complete ['x86_32', 'x86_64']
5453 complete ['x86_32', 'x86_64']
5452 complete ['x86_32', 'x86_64']
5451 complete ['x86_32', 'x86_64']
5450 incomplete []
5449 complete ['x86_32', 'x86_64']
5448 incomplete ['x86_64']
5447 incomplete []
5446 complete ['x86_32', 'x86_64']
5445 complete ['x86_32', 'x86_64']
5444 incomplete ['x86_32']
5443 incomplete ['x86_32']
5442 incomplete ['x86_32']
5441 incomplete ['x86_32']
...
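
The complete/incomplete classification in this listing can be reproduced with a short Python sketch. This is not the script from the code review; the function names are hypothetical, and the availability data is copied by hand from a few rows of the listing rather than fetched from the archive.

```python
# Architectures that must both be present for a revision to be a roll target.
REQUIRED_ARCHES = ("x86_32", "x86_64")


def status(arches):
    """Classify a revision by whether every required binary was uploaded."""
    if all(arch in arches for arch in REQUIRED_ARCHES):
        return "complete"
    return "incomplete"


def newest_complete(available):
    """Return the highest revision with all binaries, or None if none exist."""
    complete = [rev for rev, arches in available.items()
                if status(arches) == "complete"]
    return max(complete) if complete else None


# Hand-copied sample of the listing above: revision -> uploaded architectures.
available = {
    5449: ["x86_32", "x86_64"],
    5448: ["x86_64"],
    5447: [],
    5446: ["x86_32", "x86_64"],
    5445: ["x86_32", "x86_64"],
    5444: ["x86_32"],
    5443: ["x86_32"],
}

for rev in sorted(available, reverse=True):
    print(rev, status(available[rev]), available[rev])

print("newest complete:", newest_complete(available))
```

With this sample, 5443 is classified as incomplete because only the x86_32 binary is present, which is consistent with the 404 on irt_x86_64.nexe reported for that revision.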

Cheers,
Mark

Nico Weber

26 May 2011, 12:10:28
to Mark Seaborn, chromium-dev

That would be helpful. Please add a comment to the DEPS file that
points to this script.

Maybe also mention what the hashes are for – I assume to make sure the
downloaded nexes aren't corrupted / haven't been tampered with?
