FYI: Chromium Mac Valgrind can't link


Robert Sesek

Jun 28, 2010, 2:30:36 PM
to chromium-dev
The Chromium Mac Valgrind bot [1] can no longer link due to linker address space exhaustion.  We've run out of bandaids and there's nothing more that we can do without causing a significant divergence between our testing and release configurations.  We're flying blind until we can get 10.6 bots up and running, which Mark says is about 2 weeks away.  Previous bandaids were shuffling link ordering of libraries (putting libWebCore.a first), disabling SVG, and switching to -Os optimization (rather than -O1).


rsesek / @chromium.org

Evan Martin

Jun 28, 2010, 2:35:38 PM
to rse...@chromium.org, chromium-dev
On Mon, Jun 28, 2010 at 11:30 AM, Robert Sesek <rse...@chromium.org> wrote:
> there's nothing more that
> we can do

DELETE MOAR CODE

Marc-Antoine Ruel

Jun 28, 2010, 3:26:06 PM
to ev...@chromium.org, rse...@chromium.org, chromium-dev
Didn't disabling svg require a clobber build? I didn't force a clobber
when I restarted the memory master this morning.


Robert Sesek

Jun 28, 2010, 3:29:54 PM
to Marc-Antoine Ruel, ev...@chromium.org, chromium-dev
SVG has been disabled for a week or two.

rsesek / @chromium.org

Scott Hess

Jun 28, 2010, 3:52:48 PM
to rse...@chromium.org, chromium-dev
I believe that the problem wasn't lack of address space, but address
space fragmentation? Perhaps a horrible bandaid solution would be to
write a script to reshard libwebcore.a and libbrowser.a across
multiple smaller .a files, and then introduce another layer of
dependencies at the link phase.

-scott


Aaron Boodman

Jun 29, 2010, 1:04:31 AM
to ev...@chromium.org, rse...@chromium.org, chromium-dev

I will remove some of my apps experiments that didn't work out. I
don't think it is enough to help, but it's the thought that counts
(right?).

- a

Timur Iskhodzhanov

Jun 29, 2010, 9:29:42 AM
to rse...@chromium.org, chromium-dev, Alexander, Evan Martin, Marc-Antoine Ruel, Nicolas Sylvain
Here are a few bugs open:
http://code.google.com/p/chromium/issues/detail?id=45207 (Mac issues)
http://code.google.com/p/chromium/issues/detail?id=44241 (similar issues on Linux 32-bit build)

Can we split unit_tests into two or more parts?

Currently, unit_tests take 5x-10x more time than any other test (under Valgrind)

I think the OS X 10.6 Valgrind bot will not be fully functional in two weeks, since 10.6 support is still on a branch in the Valgrind repo (and there are a few open bugs).

Timur Iskhodzhanov,
Google Russia



Scott Hess

Jun 29, 2010, 10:56:09 AM
to timu...@chromium.org, rse...@chromium.org, chromium-dev, Alexander, Evan Martin, Marc-Antoine Ruel, Nicolas Sylvain
On Tue, Jun 29, 2010 at 6:29 AM, Timur Iskhodzhanov
<timu...@chromium.org> wrote:
> Here are a few bugs open:
> http://code.google.com/p/chromium/issues/detail?id=45207 (Mac issues)
> http://code.google.com/p/chromium/issues/detail?id=44241 (similar issues on
> Linux 32-bit build)
> Can we split unit_tests into two or more parts?

Unfortunately, the issue is mapping the various libraries involved at
link time. Breaking up unit_tests might only buy you a week or two
before you hit another problem mapping things.

Hmm. Another possibility on the memory fragmentation angle would be
to put the unit_test object files into a library, and then link
against it. That way instead of 400 or so .o files scattered about,
there would be another half-gig .a file.

-scott

Timur Iskhodzhanov

Jun 29, 2010, 10:59:56 AM
to Scott Hess, rse...@chromium.org, chromium-dev, Alexander, Evan Martin, Marc-Antoine Ruel, Nicolas Sylvain
On Tue, Jun 29, 2010 at 6:56 PM, Scott Hess <sh...@google.com> wrote:
> On Tue, Jun 29, 2010 at 6:29 AM, Timur Iskhodzhanov
> <timu...@chromium.org> wrote:
>> Here are a few bugs open:
>> http://code.google.com/p/chromium/issues/detail?id=45207 (Mac issues)
>> http://code.google.com/p/chromium/issues/detail?id=44241 (similar issues on
>> Linux 32-bit build)
>> Can we split unit_tests into two or more parts?
>
> Unfortunately, the issue is mapping the various libraries involved at
> link time.  Breaking up unit_tests might only buy you a week or two
> before you hit another problem mapping things.

Building ui_tests (which includes building chrome itself) is less problematic, though it is bigger than unit_tests IMO.

> Hmm.  Another possibility on the memory fragmentation angle would be
> to put the unit_test object files into a library, and then link
> against it.  That way instead of 400 or so .o files scattered about,
> there would be another half-gig .a file.

LGTM

> -scott

Mark Mentovai

Jun 29, 2010, 11:35:53 AM
to sh...@google.com, timu...@chromium.org, rse...@chromium.org, chromium-dev, Alexander, Evan Martin, Marc-Antoine Ruel, Nicolas Sylvain
Scott Hess wrote:
> Hmm.  Another possibility on the memory fragmentation angle would be
> to put the unit_test object files into a library, and then link
> against it.  That way instead of 400 or so .o files scattered about,
> there would be another half-gig .a file.

That’s problematic, especially for tests, because there won’t be any
reason for the linker to actually look at the .o objects inside the
test code .a archive—they won’t resolve any undefined symbols. The Mac
linker doesn’t have -whole-archive/-no-whole-archive, it only has
-all_load, which can only be enabled for an entire link session and
can’t be individually enabled or defeated for specific .a archives.

I appreciate all of the suggestions people are offering, but we’ll be
moving to 10.6 as a build platform with its 64-bit linker. I’ll send
out an e-mail about this later today for those that weren’t present
when we discussed this yesterday. We’ve invested enough time and
energy in band-aids already, and each “success” only buys a few days,
so I think it’s time to bite the bullet and move on.

Timur, we can continue testing with Valgrind on 10.5 by ferrying
builds done on 10.6 between machines.

Mark

Evan Martin

Jun 29, 2010, 12:45:34 PM
to rse...@chromium.org, chromium-dev
On Mon, Jun 28, 2010 at 11:35 AM, Evan Martin <ev...@chromium.org> wrote:
> DELETE MOAR CODE

In seriousness, on IRC we poked at the size buildbots and noticed that
turning on NSS on Mac gained us another 1.4 pounds -- er, megabytes.
I wonder if you could quickly turn that off until Mark's (better) 10.6
plan comes to fruition.

Mark Mentovai

Jun 29, 2010, 12:56:02 PM
to ev...@chromium.org, rse...@chromium.org, chromium-dev

Not gonna help. :(

Mark Mentovai

Jun 29, 2010, 1:02:07 PM
to sh...@google.com, rse...@chromium.org, chromium-dev
Scott Hess wrote:
> I believe that the problem wasn't lack of address space, but address
> space fragmentation?  Perhaps a horrible bandaid solution would be to
> write a script to reshard libwebcore.a and libbrowser.a across
> multiple smaller .a files, and then introduce another layer of
> dependencies at the link phase.

That’s an interesting idea, but I think we’re at—beyond, actually—the
point of diminishing returns. libwebcore.a is a beast, at over 1GB in
a debug build from a few days ago. One of the existing band-aids has
us loading it while the address space is relatively wide open.
libbrowser.a, second-heaviest at over 600MB, was formerly the first
library on the link line, and is now #2. Fragmentation is a big
problem, but with these numbers, we’re kind of pushing the limits in a
more absolute sense, too.

This may be worthwhile if other platforms up against this same problem
can’t move to 64-bit linkers, but it’s probably not going to do too
much for us for too long.

Mark
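
For a rough sense of the numbers Mark quotes, the sketch below adds up the on-disk sizes of a set of link inputs and reports the total as a share of a 4 GiB (32-bit) address space. The function name link_input_budget is made up for illustration and is not part of the Chromium build. The 4 GiB figure is only an upper bound, since a 32-bit linker process can use less than that in practice, and on-disk size is only a proxy for what the linker actually maps.

```shell
#!/bin/sh
# Hypothetical helper: sum the sizes of the given files and print the total
# as a percentage of a 32-bit address space (2^32 bytes, an upper bound).
link_input_budget() {
  total=0
  for f in "$@"; do
    # wc -c on redirected input prints only the byte count.
    size=$(wc -c < "$f")
    total=$((total + size))
  done
  limit=4294967296   # 2^32 bytes
  echo "$total bytes, $((total * 100 / limit))% of 4 GiB"
}

# Example (paths are placeholders):
#   link_input_budget libwebcore.a libbrowser.a
```

With libwebcore.a at over 1GB and libbrowser.a at over 600MB, those two archives alone come to more than a third of that bound before any other library or .o file is mapped.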

Scott Hess

Jun 29, 2010, 1:49:24 PM
to Mark Mentovai, timu...@chromium.org, rse...@chromium.org, chromium-dev, Alexander, Evan Martin, Marc-Antoine Ruel, Nicolas Sylvain
On Tue, Jun 29, 2010 at 8:35 AM, Mark Mentovai <ma...@chromium.org> wrote:
> Scott Hess wrote:
>> Hmm.  Another possibility on the memory fragmentation angle would be
>> to put the unit_test object files into a library, and then link
>> against it.  That way instead of 400 or so .o files scattered about,
>> there would be another half-gig .a file.
>
> That’s problematic, especially for tests, because there won’t be any
> reason for the linker to actually look at the .o objects inside the
> test code .a archive—they won’t resolve any undefined symbols. The Mac
> linker doesn’t have -whole-archive/-no-whole-archive, it only has
> -all_load, which can only be enabled for an entire link session and
> can’t be individually enabled or defeated for specific .a archives.

For things where you know "Link all of these", instead of a library
you could use ld's:
    -r          Merges object files to produce another mach-o object file
                with file type MH_OBJECT.
Seems to work alright when I snag the link command-line and bust
things up manually. We have a really large number of items on our
link command-line :-).

> I appreciate all of the suggestions people are offering, but we’ll be
> moving to 10.6 as a build platform with its 64-bit linker. I’ll send
> out an e-mail about this later today for those that weren’t present
> when we discussed this yesterday. We’ve invested enough time and
> energy in band-aids already, and each “success” only buys a few days,
> so I think it’s time to bite the bullet and move on.

This is surely the correct overall solution, but in the meanwhile I'm
nervous about whether we'll lack coverage for a period. The 10.6 bots
on the waterfall provide some, but not having trybots for any length
of time could get annoying!

-scott

Evan Martin

Jun 29, 2010, 1:51:23 PM
to ma...@chromium.org, sh...@google.com, rse...@chromium.org, chromium-dev
On Tue, Jun 29, 2010 at 10:02 AM, Mark Mentovai <ma...@chromium.org> wrote:
> Scott Hess wrote:
>> I believe that the problem wasn't lack of address space, but address
>> space fragmentation?  Perhaps a horrible bandaid solution would be to
>> write a script to reshard libwebcore.a and libbrowser.a across
>> multiple smaller .a files, and then introduce another layer of
>> dependencies at the link phase.
>
> That’s an interesting idea, but I think we’re at—beyond, actually—the
> point of diminishing returns. libwebcore.a is a beast, at over 1GB in
> a debug build from a few days ago. One of the existing band-aids has
> us loading it while the address space is relatively wide open.
> libbrowser.a, second-heaviest at over 600MB, was formerly the first
> library on the link line, and is now #2. Fragmentation is a big
> problem, but with these numbers, we’re kind of pushing the limits in a
> more absolute sense, too.

One last workaround to mention: shared library link. Build a
webcore.dylib or whatever. Would require some gyp surgery and dealing
with build breaks occasionally.

Marc-Antoine Ruel

Jun 29, 2010, 1:58:11 PM
to ev...@chromium.org, Victor Wang, ma...@chromium.org, sh...@google.com, rse...@chromium.org, chromium-dev

Victor is already doing the windows part so it'll eventually be
(relatively) easy to do on the Mac.

Shinichiro Hamaji

Jun 29, 2010, 2:09:53 PM
to maruel...@google.com, ev...@chromium.org, Victor Wang, ma...@chromium.org, sh...@google.com, rse...@chromium.org, chromium-dev
Hi,

Can't we use a 64-bit linker on 10.5? I've tried this idea on my 10.5
machine and it seems to be OK.

If you trust me, you can download the following file, put it into
/Developer/usr/bin, and see if it works.

http://shinh.skr.jp/t/ld

I created the above file as follows:

1. download the source code of ld64 from Apple's website

% wget http://www.opensource.apple.com/tarballs/ld64/ld64-85.2.1.tar.gz
% tar -xvzf ld64-85.2.1.tar.gz

2. fix project.pbxproj

% cat ld64.patch
diff -ur ld64-85.2.1.orig/ld64.xcodeproj/project.pbxproj ld64-85.2.1/ld64.xcodeproj/project.pbxproj
--- ld64-85.2.1.orig/ld64.xcodeproj/project.pbxproj 2010-06-29 07:54:08.000000000 -0700
+++ ld64-85.2.1/ld64.xcodeproj/project.pbxproj 2010-06-29 10:41:27.000000000 -0700
@@ -552,7 +552,7 @@
PREBINDING = NO;
PRODUCT_NAME = ld;
SECTORDER_FLAGS = "";
- VALID_ARCHS = "i386 ppc";
+ VALID_ARCHS = "i386 ppc x86_64";
VERSIONING_SYSTEM = "apple-generic";
WARNING_CFLAGS = "-Wall";
};
@@ -712,7 +712,7 @@
INSTALL_PATH = /usr/bin;
PREBINDING = NO;
PRODUCT_NAME = rebase;
- VALID_ARCHS = "i386 ppc";
+ VALID_ARCHS = "i386 ppc x86_64";
};
name = Release;
};
% patch -p0 < ld64.patch

3. build the linker

% xcodebuild ARCHS=x86_64 -project ld64.xcodeproj -configuration Release

(note that the build of "rebase" seems to fail... but we don't need this binary)

4. put it into /Developer/usr/bin

% sudo cp build/Release/ld /Developer/usr/bin

Obviously, ld64-85.2.1 doesn't support x86_64 officially so the output
of the 64bit ld can be broken.

Thanks,

Mark Mentovai

Jun 29, 2010, 2:36:26 PM
to Shinichiro Hamaji, chromium-dev
Shinichiro Hamaji wrote:
> Obviously, ld64-85.2.1 doesn't support x86_64 officially so the output
> of the 64bit ld can be broken.

That’s what makes this pretty scary.

In my work with getting the 10.5/Xcode 3.1 toolchain running on Linux,
I found that significant parts would not build or work correctly for
x86_64. ld64-85.2.1 was not among those parts, but it depended on
components in other packages that had problems. I consider this
approach to be “shaky ground.”

If it works for you and others, that’s awesome, but I don’t think we
can really take this approach for production builds (including bots).

Mark

Mark Mentovai

Jun 29, 2010, 2:43:42 PM
to Scott Hess, timu...@chromium.org, rse...@chromium.org, chromium-dev, Alexander, Evan Martin, Marc-Antoine Ruel, Nicolas Sylvain
Scott Hess wrote:
> For things where you know "Link all of these", instead of a library
> you could use ld's:
>     -r          Merges object files to produce another mach-o object file
>                 with file type MH_OBJECT.
> Seems to work alright when I snag the link command-line and bust
> things up manually.

Have you tried this with unit_tests in a debug build that wouldn’t
link? Did it make it linkable?

> We have a really large number of items on our
> link command-line :-).

Yeah, and the command line doesn’t even have the .o files in it,
they’re listed separately in a file. Huge, huge command line anyway.

Mark

Scott Hess

Jun 29, 2010, 4:31:25 PM
to Mark Mentovai, timu...@chromium.org, rse...@chromium.org, chromium-dev, Alexander, Evan Martin, Marc-Antoine Ruel, Nicolas Sylvain
On Tue, Jun 29, 2010 at 11:43 AM, Mark Mentovai <ma...@chromium.org> wrote:
> Scott Hess wrote:
>> For things where you know "Link all of these", instead of a library
>> you could use ld's:
>>     -r          Merges object files to produce another mach-o object file
>>                 with file type MH_OBJECT.
>> Seems to work alright when I snag the link command-line and bust
>> things up manually.
>
> Have you tried this with unit_tests in a debug build that wouldn’t
> link? Did it make it linkable?

Yes. What I did:
- reenable svg for gyp.
- gclient sync.
- build target unit_tests.
- failed with mmap error.
- LINKFILE=xcodebuild/chrome.build/Debug/unit_tests.build/Objects-normal/i386/unit_tests.LinkFileList
- egrep unittest "$LINKFILE" | xargs ld -ObjC -o uber.o -r
- egrep -v unittest "$LINKFILE" > unit_tests.LinkFileList
- echo uber.o >>unit_tests.LinkFileList
- re-run original gcc line with $LINKFILE replaced by unit_tests.LinkFileList

Even passed a test! Unfortunately, when I re-enabled svg to capture
the command-line, I noticed that a bunch of other stuff didn't look
like it was linking right anymore. AFAICT some don't have a huge
LinkFileList (Chromium Framework.LinkFileList lists three files). So
this might only work for a little while, sigh.

Unless we add another layer to aggregate the .a files into an uber.a
file ... and then rewrite ld entirely at some point. Oh yeah.

-scott
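
The manual steps Scott describes could be scripted roughly as follows. This is a minimal sketch, not what the bots run: the function name split_link_file_list and the intermediate file name unittest_objs.txt are hypothetical, and it assumes the object paths in the LinkFileList contain no spaces or quotes (xargs would split them otherwise). The ld flags are the ones from Scott's message.

```shell
#!/bin/sh
# Hypothetical helper: split a LinkFileList (one object path per line) in two.
#   $1 = input LinkFileList
#   $2 = extended regex selecting the objects to merge (e.g. "unittest")
#   $3 = output list of matching objects
#   $4 = output list of everything else
split_link_file_list() {
  # grep exits non-zero when nothing matches; ignore that so an empty list
  # is allowed (note that this also hides other grep errors).
  grep -E -- "$2" "$1" > "$3" || true
  grep -E -v -- "$2" "$1" > "$4" || true
}

# Usage on a Mac, mirroring the manual steps (left commented out because it
# needs the Xcode toolchain and a finished compile):
#
#   LINKFILE=xcodebuild/chrome.build/Debug/unit_tests.build/Objects-normal/i386/unit_tests.LinkFileList
#   split_link_file_list "$LINKFILE" unittest unittest_objs.txt unit_tests.LinkFileList
#   xargs ld -ObjC -r -o uber.o < unittest_objs.txt   # merge into one MH_OBJECT file
#   echo uber.o >> unit_tests.LinkFileList            # link the merged object instead
#   ...then re-run the original gcc link line against unit_tests.LinkFileList
```

If the list is long enough that xargs splits it across several ld invocations, each invocation would overwrite uber.o, so that case is worth checking before relying on this.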

Mark Mentovai

Jun 29, 2010, 4:42:36 PM
to Scott Hess, chromium-dev
Scott Hess wrote:
> Yes.  What I did:
>  - reenable svg for gyp.
>  - gclient sync.
>  - build target unit_tests.
>  - failed with mmap error.
>  - LINKFILE=xcodebuild/chrome.build/Debug/unit_tests.build/Objects-normal/i386/unit_tests.LinkFileList
>  - egrep unittest "$LINKFILE" | xargs ld -ObjC -o uber.o -r
>  - egrep -v unittest "$LINKFILE" > unit_tests.LinkFileList
>  - echo uber.o >>unit_tests.LinkFileList
>  - re-run original gcc line with $LINKFILE replaced by unit_tests.LinkFileList
>
> Even passed a test!  Unfortunately, when I re-enabled svg to capture
> the command-line, I noticed that a bunch of other stuff didn't look
> like it was linking right anymore.  AFAICT some don't have a huge
> LinkFileList (Chromium Framework.LinkFileList lists three files).  So
> this might only work for a little while, sigh.

Right, this wouldn’t really work for the framework (chrome_dll), so
it’ll only buy some time for unit_tests.

If you wanted to clean this up as a final band-aid for unit_tests,
that’s OK. Of all of the band-aids we’ve seen so far, this one is the
least hackalicious, although the GYP/Xcode work will get a little
“cute.”

GYP r833 (yesterday) added support for listing .o files in 'sources'
sections for the Xcode generator. If you wanted to do this, you’d
probably need to make unit_tests a none-type target on the Mac (and
also conditionally rename it), then hang a postbuild off of it to
build a monster .o file (would "ld -r -all_load
libunit_tests_monster.a -o unit_tests_monster.o" work? if not, use ar
x to get a bunch of .o files and drop the __.SYMDEF) and then have
(again, conditionally on the Mac) a unit_tests target that depends on
unit_tests_monster and lists unit_tests_monster.o in its sources list.
The new target would also need a dummy .mm file (which could be empty)
to actually convince Xcode to link, and to do it with the right
support libraries for Obj-C++.

Mark

Timur Iskhodzhanov

Aug 11, 2010, 4:57:59 PM
to maruel...@google.com, ev...@chromium.org, Victor Wang, ma...@chromium.org, sh...@google.com, rse...@chromium.org, chromium-dev, Alexander Potapenko
[from the right address, sigh]

On Thu, Aug 12, 2010 at 12:56 AM, Timur Iskhodzhanov <timu...@google.com> wrote:
Any progress on this?

Mac Valgrind can't build chrome and ui_tests right now...

I know we're moving towards 10.6 but Valgrind is still 10.5-only
(Alexander is working on Valgrind for 10.6)

Can anyone take a look and maybe apply yet another bandaid to the Mac UI Valgrind bots?

Timur Iskhodzhanov,
Google Russia




Darin Fisher

Aug 11, 2010, 6:25:33 PM
to rse...@chromium.org, chromium-dev

Has anyone tried breaking up libwebcore.a?
