Thanks, I'm sure, because I printed gn_path in depot_tools/gn.py and because I tried "buildtools/linux64/gn help" explicitly. It just hung in exactly the same way.
Thanks for the tip about bootstrapping GN. Unfortunately this is the output:
> $ tools/gn/bootstrap/bootstrap.py
> Building gn manually in a temporary directory for bootstrapping...
> ninja: Entering directory `/tmp/tmp1CmrVG'
> ninja: error: '/chromium/src/base/memory/ref_counted.cc', needed by 'base/memory/ref_counted.o', missing and no known rule to make it
> Command '['ninja', '-C', '/tmp/tmp1CmrVG', 'gn']' returned non-zero exit status 1
> $
I think this means I have a functional ninja binary? What am I missing?
What's the difference between the intermediate executable in /tmp/tmpGnEMBR and the final one in out/Debug?
I don't think there's a particularly compelling reason for GN to be using tcmalloc
--
You received this message because you are subscribed to the Google Groups "gn-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gn-dev+unsubscribe@chromium.org.
On Mon, Sep 26, 2016 at 7:09 PM, Dirk Pranke <dpr...@chromium.org> wrote:
> I don't think there's a particularly compelling reason for GN to be using tcmalloc
In crbug.com/586444 brettw/you reported a 10-40% perf hit by going back to the default allocator
On Mon, Sep 26, 2016 at 11:20 AM, Primiano Tucci <prim...@google.com> wrote:
> On Mon, Sep 26, 2016 at 7:09 PM, Dirk Pranke <dpr...@chromium.org> wrote:
>> I don't think there's a particularly compelling reason for GN to be using tcmalloc
> In crbug.com/586444 brettw/you reported a 10-40% perf hit by going back to the default allocator
Yes, but all other things being equal I don't think that's a compelling reason enough by itself to stay on it :).
Thanks everyone for your help.
I'm happy to open a bug, now or after we understand what's going on.
The original problem is that when I follow these instructions [1], GN just hangs:
> $ fetch --no-history chromium
> $ gn gen out/Default
> (No output, just hangs here)
I can work around this by first building GN without TCMalloc:
> $ tools/gn/bootstrap/bootstrap.py --gn-gen-args 'use_allocator="none"'
To understand what's going on, I'm attempting to isolate and reproduce the problem.
I'm stuck here at the moment:
> #include <malloc.h>
>
> void* malloc(size_t size) {
>   return tc_malloc(size);
> }
>
> int main() {
>   malloc(1);
> }
> $ gcc -c main.c
> $ g++ main.o out/Debug/obj/base/allocator/tcmalloc/*.o out/Debug/obj/base/third_party/dynamic_annotations/dynamic_annotations/dynamic_annotations.o
> $ ./a.out
> Segmentation fault
> $
I'm trying to replicate the relevant part of allocator_shim_override_libc_symbols.h but I think I'm missing some magic words.
Can anyone spot what's wrong with this test case?
Or suggest another approach to figuring out what's going on?
[1] https://www.chromium.org/developers/how-tos/get-the-code
Do we even support building on debian (let alone debian testing)? I just tried running install-build-deps.sh and got this
ERROR: Only Ubuntu 12.04 (precise), 14.04 (trusty), 14.10
(utopic), 15.04 (vivid), 15.10 (wily) and 16.04 (xenial) are
currently supported
I'm not seeing this issue on a fresh install of Debian Stretch. The steps I took were:
1. fetch chromium
2. sudo apt-get install --reinstall libasound2:i386 libcap2:i386
libelf-dev:i386 libfontconfig1:i386 libgconf-2-4:i386
libgl1-mesa-glx:i386 libglib2.0-0:i386 libgpm2:i386
libgtk2.0-0:i386 libncurses5:i386 libnss3:i386
libpango1.0-0:i386 libtinfo-dev:i386 libudev1:i386
libxcomposite1:i386 libxcursor1:i386 libxdamage1:i386
libxi6:i386 libxrandr2:i386 libxss1:i386 libxtst6:i386
linux-libc-dev:i386 ant apache2-bin autoconf bison cdbs cmake
curl devscripts dpkg-dev elfutils fakeroot flex fonts-indic
fonts-thai-tlwg g++ g++-6-multilib gawk git-core git-svn
g++-mingw-w64-i686 gperf intltool lib32gcc1 lib32ncurses5-dev
lib32stdc++6 lib32z1-dev libapache2-mod-php7.0 libasound2
libasound2-dev libatk1.0-0 libav-tools libbluetooth-dev
libbrlapi0.6 libbrlapi-dev libbz2-1.0 libbz2-dev libc6 libc6-dbg
libc6-dev-armhf-cross libc6-i386 libcairo2 libcairo2-dbg
libcairo2-dev libcap2 libcap-dev libcups2 libcups2-dev
libcurl4-gnutls-dev libdrm-dev libelf-dev libexpat1 libffi6
libffi6-dbg libffi-dev libfontconfig1 libfontconfig1-dbg
libfreetype6 libgbm-dev libgconf2-dev libgl1-mesa-dev
libgles2-mesa-dev libglib2.0-0 libglib2.0-0-dbg libglib2.0-dev
libglu1-mesa-dev libgnome-keyring0 libgnome-keyring-dev
libgtk2.0-0 libgtk2.0-0-dbg libgtk2.0-dev libjpeg-dev
libkrb5-dev libnspr4 libnspr4-dbg libnspr4-dev libnss3
libnss3-dbg libnss3-dev libpam0g libpam0g-dev libpango1.0-0
libpango1.0-0-dbg libpci3 libpci-dev libpcre3 libpcre3-dbg
libpixman-1-0 libpixman-1-0-dbg libpulse0 libpulse-dev
libsctp-dev libspeechd2 libspeechd-dev libsqlite3-0
libsqlite3-0-dbg libsqlite3-dev libssl-dev libstdc++6
libtinfo-dev libtool libudev1 libudev-dev libwww-perl libx11-6
libx11-6-dbg libx11-xcb1 libx11-xcb1-dbg libxau6 libxau6-dbg
libxcb1 libxcb1-dbg libxcomposite1 libxcomposite1-dbg
libxcursor1 libxcursor1-dbg libxdamage1 libxdamage1-dbg
libxdmcp6 libxdmcp6-dbg libxext6 libxext6-dbg libxfixes3 libxi6
libxi6-dbg libxinerama1 libxinerama1-dbg libxkbcommon-dev
libxrandr2 libxrandr2-dbg libxrender1 libxrender1-dbg
libxslt1-dev libxss-dev libxt-dev libxtst6 libxtst6-dbg
libxtst-dev linux-libc-dev-armhf-cross mesa-common-dev openbox
patch perl php7.0-cgi pkg-config python python-cherrypy3
python-crypto python-dev python-numpy python-opencv
python-openssl python-psutil python-yaml realpath rpm ruby
subversion texinfo ttf-dejavu-core wdiff xcompmgr xsltproc
xutils-dev xvfb zip zlib1g zlib1g-dbg
3. gn gen out/Debug
On Mon, Sep 26, 2016 at 3:53 PM, Dirk Pranke <dpr...@chromium.org> wrote:
> On Mon, Sep 26, 2016 at 11:20 AM, Primiano Tucci <prim...@google.com> wrote:
>> On Mon, Sep 26, 2016 at 7:09 PM, Dirk Pranke <dpr...@chromium.org> wrote:
>>> I don't think there's a particularly compelling reason for GN to be using tcmalloc
>> In crbug.com/586444 brettw/you reported a 10-40% perf hit by going back to the default allocator
> Yes, but all other things being equal I don't think that's a compelling reason enough by itself to stay on it :).
I'm not aware of these other discussions, but slowing down gn 10-40% because of this thread (where gn doesn't run on one person's linux installation, which might be weird in a million ways, and where we don't understand yet what's going on) seems a bit over-eager to me...
On Mon, Sep 26, 2016 at 12:57 PM, Nico Weber <tha...@chromium.org> wrote:
> On Mon, Sep 26, 2016 at 3:53 PM, Dirk Pranke <dpr...@chromium.org> wrote:
>> On Mon, Sep 26, 2016 at 11:20 AM, Primiano Tucci <prim...@google.com> wrote:
>>> On Mon, Sep 26, 2016 at 7:09 PM, Dirk Pranke <dpr...@chromium.org> wrote:
>>>> I don't think there's a particularly compelling reason for GN to be using tcmalloc
>>> In crbug.com/586444 brettw/you reported a 10-40% perf hit by going back to the default allocator
>> Yes, but all other things being equal I don't think that's a compelling reason enough by itself to stay on it :).
> I'm not aware of these other discussions, but slowing down gn 10-40% because of this thread (where gn doesn't run on one person's linux installation, which might be weird in a million ways, and where we don't understand yet what's going on) seems a bit over-eager to me...
Sure. The larger issue is that tcmalloc causes us pain in other ways as well in Chromium (though this is not the thread to go into those details) and it's unclear if tcmalloc is even still a win for Chromium, perf-wise.
I.e., all other things aren't equal, so it's not really worth debating this by itself :).
More fully, very early, dl-init.c [1] calls malloc(), via e.g. [2] or [3].
tc_malloc() calls open("/dev/urandom") [4].
On my system, open() calls dlsym() [5].
dlsym() calls calloc() [6].
Finally tc_calloc() calls ThreadCache::InitModule() which calls Static::pageheap_lock() [7], which the original tc_malloc() is already holding :-(
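The call chain above can be modelled in a few lines of C. This is only a sketch under assumptions: the `model_*` functions and the `pageheap_lock` flag are hypothetical stand-ins for tc_malloc(), the interposed open(), dlsym() and tc_calloc(), not the real code. The lock is a try-acquire, so the model reports the re-entry instead of hanging the way a real non-recursive spinlock would.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for Static::pageheap_lock(): a non-recursive lock. */
static atomic_flag pageheap_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free. A real spinlock would spin forever
 * at the point where this returns false. */
static bool try_lock(void) { return !atomic_flag_test_and_set(&pageheap_lock); }
static void unlock(void) { atomic_flag_clear(&pageheap_lock); }

/* Models tc_calloc() -> ThreadCache::InitModule(): needs the lock. */
static bool model_calloc(void) {
    if (!try_lock()) return false;  /* lock already held: would deadlock */
    unlock();
    return true;
}

/* Models dlsym() with threads present: it allocates via calloc(). */
static bool model_dlsym(void) { return model_calloc(); }

/* Models an interposed open() that looks up the real symbol via dlsym(). */
static bool model_open(void) { return model_dlsym(); }

/* Models the first tc_malloc(): takes the lock, then opens /dev/urandom.
 * Returns true if the whole chain could complete without blocking. */
bool model_malloc(void) {
    bool acquired = try_lock();
    assert(acquired);  /* nothing else holds the lock on entry */
    bool completed = model_open();
    unlock();
    return completed;
}
```

Calling `model_malloc()` returns false because the innermost step finds the lock already taken by the same thread, while calling `model_calloc()` on its own succeeds.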
So is there a bug?
Is it wrong of my system to call dlsym() from open()?
Is it wrong of TCMalloc to call open() from malloc()?
Should ThreadCache::InitModule() or tc_malloc(), etc. handle reentrancy, even with an error?
Or is this just an understandable conflict between Chromium and the cowdancer package?
(I guess dlsym() does what's expected.)
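On the reentrancy question, one conceivable shape for a guard is a per-thread flag that makes a nested call fail instead of waiting on a lock the same thread already holds. The sketch below is an illustration only: `guarded_alloc`, `model_first_use_init` and the fixed `arena` are hypothetical names, and nothing here describes how TCMalloc actually behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Per-thread flag: nonzero while this thread is inside guarded_alloc(). */
static _Thread_local int inside_allocator = 0;

/* Counts nested calls that were refused instead of being allowed to block. */
static int rejected_reentries = 0;

static int initialized = 0;
static char arena[64];  /* trivial backing store, enough for the demo */

void *guarded_alloc(size_t size);

/* Hypothetical first-use setup that, like open() -> dlsym() -> calloc()
 * in this thread, ends up calling back into the allocator. */
static void model_first_use_init(void) {
    void *nested = guarded_alloc(1);
    (void)nested;  /* NULL here: the guard refused the nested call */
}

void *guarded_alloc(size_t size) {
    if (inside_allocator) {
        rejected_reentries++;
        return NULL;  /* fail with an error rather than deadlock */
    }
    inside_allocator = 1;
    if (!initialized) {
        initialized = 1;
        model_first_use_init();
    }
    assert(inside_allocator == 1);  /* the nested call must not clear the flag */
    inside_allocator = 0;
    return size <= sizeof arena ? arena : NULL;
}
```

The trade-off is that the nested caller has to cope with a failed allocation, so whether this is preferable to a hang depends on what the re-entering code does with NULL.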
You can reproduce the problem with the following:
> #define _GNU_SOURCE
> #include <dlfcn.h>
>
> /* Like https://anonscm.debian.org/git/pbuilder/cowdancer.git/tree/cowdancer.c */
> int open64(const char* filename, int flags) {
>   int (*origlibc_open64)(const char *, int) = dlsym(RTLD_NEXT, "open64");
>
>   return origlibc_open64(filename, flags);
> }
>
> int main() {}
> $ tools/gn/bootstrap/bootstrap.py --gn-gen-args use_experimental_allocator_shim=false
> $ gcc -c main.c
> $ g++ -lpthread main.o out/Release/obj/base/allocator/tcmalloc/*.o out/Release/obj/base/third_party/dynamic_annotations/dynamic_annotations/dynamic_annotations.o -ldl
> $ ./a.out
> (No output, just hangs here)
The problem goes away unless all of the following are true:
The Chromium TCMalloc fork is used. Vanilla TCMalloc doesn't open("/dev/urandom").
ASLR is turned on. Address space layout randomization is what opens /dev/urandom.
Threads are present. Without libpthread, dlsym() uses a static buffer, not calloc().
ASLR isn't already initialized. Presumably if GetRandomAddrHint() were called outside tc_malloc(), it wouldn't deadlock.
cowdancer isn't already initialized. I didn't test, but if cowdancer.c:initialize_functions() were called outside tc_malloc(), presumably it wouldn't deadlock.
Is there a bug anywhere in all this?
[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/dl-init.c;h=818c3aa37cd052e6edbf5f55524647b45b5bfe87;hb=HEAD#l72
[2] https://git.gnome.org/browse/glib/tree/glib/gthread-posix.c#n1000
[3] https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/eh_alloc.cc#L123
[4] https://chromium.googlesource.com/chromium/src.git/+/master/third_party/tcmalloc/chromium/src/system-alloc.cc#180
[5] https://anonscm.debian.org/git/pbuilder/cowdancer.git/tree/cowdancer.c#n182
[6] https://sourceware.org/git/?p=glibc.git;a=blob;f=dlfcn/dlerror.c;h=41b2bd6bf29be5f61affc5e750775ab2f64ee4b0;hb=HEAD#l141
[7] https://chromium.googlesource.com/chromium/src.git/+/master/third_party/tcmalloc/chromium/src/thread_cache.cc#322
Okay, here's what's going on:
There's a package on my system that calls dlsym() from open(), and glibc calls calloc() from dlsym(), which deadlocks.
Yeah, don't use cowdancer, use a civilised kernel based CoW implementation instead (like btrfs snapshots) :)
On Tue, 27 Sep 2016, 8:24 pm 'Primiano Tucci' via gn-dev, <gn-...@chromium.org> wrote:
On Tue, Sep 27, 2016 at 7:51 PM <jack....@gmail.com> wrote:
> Okay, here's what's going on:
> There's a package on my system that calls dlsym() from open(), and glibc calls calloc() from dlsym(), which deadlocks.
This really reminds me of crbug.com/586444
> So is there a bug?
IMHO a couple
> Is it wrong of my system to call dlsym() from open()?
A lot of code (like tcmalloc) expects open() to be a pure syscall wrapper. By doing fancy things which are out of your control, like invoking other glibc functions, you break this expectation. There is no right or wrong in these cases, just: how many things depend on that assumption, and how many things you break when you change it.
> Is it wrong of TCMalloc to call open() from malloc()?
Again, see "expectations" above. Honestly I find it very odd for open() to end up calling malloc() (directly or indirectly).
> Should ThreadCache::InitModule() or tc_malloc(), etc. handle reentrancy, even with an error?
The best place to discuss this would be on https://github.com/gperftools/gperftools
[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=839101
On Wednesday, September 28, 2016 at 12:18:12 PM UTC-7, Torne (Richard Coles) wrote:
> btrfs subvolume snapshots are even faster and more disk-efficient than cp --reflink; if you make the thing you want to be able to treat as a CoW volume into a btrfs subvolume you can create snapshots of that subvolume basically instantly at the cost of a couple of kilobytes of disk at most (instead of being linearly proportional to number-of-files like cp --reflink) and also delete the subvolume similarly efficiently instead of waiting for rm -rf :)
>
> But yes, you need to be using btrfs to use it. :)
I've added trying out btrfs and a subvolume solution to my to-do list, thanks!