lto, sound: Fix export symbols for !CONFIG_MODULES
Andi Kleen  
Newsgroups: linux.kernel
From: Andi Kleen <a...@firstfloor.org>
Date: Mon, 20 Aug 2012 11:50:03 +0200
Local: Mon, Aug 20 2012 5:50 am
Subject: Re: [PATCH 26/74] lto, sound: Fix export symbols for !CONFIG_MODULES

I don't strictly need it for 3.6, 3.7 is ok.

> And shall I apply this one to sound git tree, or would you like to
> apply all in a single tree?

Please apply it in yours.

-Andi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


 
Takashi Iwai
Newsgroups: linux.kernel
From: Takashi Iwai <ti...@suse.de>
Date: Mon, 20 Aug 2012 12:00:02 +0200
Local: Mon, Aug 20 2012 6:00 am
Subject: Re: [PATCH 26/74] lto, sound: Fix export symbols for !CONFIG_MODULES
At Mon, 20 Aug 2012 11:45:45 +0200,

OK, applied now.  Thanks.

Takashi

Discussion subject changed to "RFC: Link Time Optimization support for the kernel" by Andi Kleen
Andi Kleen
Newsgroups: linux.kernel
From: Andi Kleen <a...@firstfloor.org>
Date: Mon, 20 Aug 2012 12:20:02 +0200
Local: Mon, Aug 20 2012 6:20 am
Subject: Re: RFC: Link Time Optimization support for the kernel

On Mon, Aug 20, 2012 at 09:48:35AM +0200, Ingo Molnar wrote:

> * Andi Kleen <a...@firstfloor.org> wrote:

> > This rather large patchkit enables gcc Link Time Optimization (LTO)
> > support for the kernel.

> > With LTO gcc will do whole program optimizations for
> > the whole kernel and each module. This increases compile time,
> > but can generate faster code.

> By how much does it increase compile time?

All numbers are preliminary at this point. The current patchkit still misses
some code quality and compile time improvements that LTO could deliver,
because it works around some issues that are fixable.

Compile time:

Compilation slowdown depends on the largest binary size.  I see between
50% and 4x.  The 4x case is mainly for allyes (so unlikely); a normal
distro build, which is mostly modular, or a defconfig like build is more
towards the 50%.

Currently I have to disable slim LTO, which essentially means everything
is compiled twice. Once that's fixed it should compile faster for
the normal case too (although it will still be slower than non-LTO).

A lot of the overhead on the larger builds is also some specific
gcc code that I'm working with the gcc developers on to improve.
So the 4x extreme case will hopefully go down.

The large builds also currently suffer from too much memory
consumption. That will hopefully improve too, as gcc improves.

I wouldn't expect anyone to use it for day to day kernel hacking
(I understand that a 50% slowdown is annoying for that). It's more like a
"release build" mode.

The performance is currently also missing some improvements due
to workarounds.

Performance:

Hackbench goes about 5% faster, so the scheduler benefits. Kbuild
is not changing much. Various network benchmarks over loopback
go faster too (best case seen 18%+), so the network stack seems to
benefit. A lot of micro benchmarks go faster, sometimes by larger margins.
There are some minor regressions.

A lot of benchmarking on larger workloads is still outstanding.
But the existing numbers are promising I believe. Things will still
change, it's still early.

I would welcome any benchmarking from other people.

I also expect gcc to do more LTO optimizations in the future, so we'll
hopefully see more gains over time. Essentially it gives more
power to the compiler.

Long term it would also help the kernel source organization. For example
there's no reason with LTO to have gigantic includes with large inlines,
because cross file inlining works in an efficient way without reparsing.

In theory (but that's not realized today) the automatic repartitioning of
compilation units could improve compile time with lots of small files.

-Andi

Discussion subject changed to "x86, lto: Disable fancy hweight optimizations for LTO" by Jan Beulich
Jan Beulich
Newsgroups: linux.kernel
From: "Jan Beulich" <JBeul...@suse.com>
Date: Mon, 20 Aug 2012 13:00:01 +0200
Local: Mon, Aug 20 2012 7:00 am
Subject: Re: [PATCH 46/74] x86, lto: Disable fancy hweight optimizations for LTO

That's not the point: The point really is that you could allow the
alternative regardless of LTO, and just penalize the LTO case
by having even the asm clobber the registers that a function call
would not preserve.

> I'm not sure the optimization is really worth it anyways, hweight should
> be uncommon.

That's a separate question (but I sort of agree - not sure whether
CPU mask weights ever get calculated on hot paths).

Jan

Discussion subject changed to "lto, workaround: Add workaround for initcall reordering" by Jan Beulich
Jan Beulich
Newsgroups: linux.kernel
From: "Jan Beulich" <JBeul...@suse.com>
Date: Mon, 20 Aug 2012 13:10:01 +0200
Local: Mon, Aug 20 2012 7:10 am
Subject: Re: [PATCH 55/74] lto, workaround: Add workaround for initcall reordering

>>> On 19.08.12 at 17:01, Andi Kleen <a...@firstfloor.org> wrote:
> On Sun, Aug 19, 2012 at 09:46:04AM +0100, Jan Beulich wrote:
>> >>> Andi Kleen <a...@firstfloor.org> 08/19/12 5:05 AM >>>
>> >Work around a LTO gcc problem: when there is no reference to a variable
>> >in a module it will be moved to the end of the program. This causes
>> >reordering of initcalls which the kernel does not like.
>> >Add a dummy reference function to avoid this. The function is
>> >deleted by the linker.

>> This is not even true on x86, not to speak of generally.

> Why is it not true ?

> __initcall is only defined for !MODULE and there __exit discards.

__exit, on x86 and perhaps other arches, causes the code
to be discarded at runtime only.

No - see above. Using .discard.* enforces the discarding at link
time.

Jan

Discussion subject changed to "x86, lto: Disable fancy hweight optimizations for LTO" by Andi Kleen
Andi Kleen
Newsgroups: linux.kernel
From: Andi Kleen <a...@linux.intel.com>
Date: Mon, 20 Aug 2012 13:20:02 +0200
Local: Mon, Aug 20 2012 7:20 am
Subject: Re: [PATCH 46/74] x86, lto: Disable fancy hweight optimizations for LTO

> That's not the point: The point really is that you could allow the
> alternative regardless of LTO, and just penalize the LTO case
> by having even the asm clobber the registers that a function call
> would not preserve.

That's just what a normal call does, right?

-Andi
--
a...@linux.intel.com -- Speaking for myself only

Jan Beulich
Newsgroups: linux.kernel
From: "Jan Beulich" <JBeul...@suse.com>
Date: Mon, 20 Aug 2012 14:40:02 +0200
Local: Mon, Aug 20 2012 8:40 am
Subject: Re: [PATCH 46/74] x86, lto: Disable fancy hweight optimizations for LTO

>>> On 20.08.12 at 13:18, Andi Kleen <a...@linux.intel.com> wrote:
>>  That's not the point: The point really is that you could allow the
>> alternative regardless of LTO, and just penalize the LTO case
>> by having even the asm clobber the registers that a function call
>> would not preserve.

> That's just what a normal call does, right?

Exactly.

Jan

Discussion subject changed to "x86, lto, vdso: Don't duplicate vvar address variables" by Andrew Lutomirski
Andrew Lutomirski
Newsgroups: linux.kernel
From: Andrew Lutomirski <l...@mit.edu>
Date: Mon, 20 Aug 2012 19:50:02 +0200
Local: Mon, Aug 20 2012 1:50 pm
Subject: Re: [PATCH 54/74] x86, lto, vdso: Don't duplicate vvar address variables

On Sat, Aug 18, 2012 at 7:56 PM, Andi Kleen <a...@firstfloor.org> wrote:
> From: Andi Kleen <a...@linux.intel.com>

> Every includer of vvar.h currently gets own static variables
> for all the vvar addresses. Generate just one set each for the
> main kernel and for the vdso. This saves some data space.

> Cc: Andy Lutomirski <l...@mit.edu>
> Signed-off-by: Andi Kleen <a...@linux.intel.com>

[This doesn't apply to -linus or to 3.5, so I haven't actually tested it.]

NACK, without significant further evidence that this is a good idea.

On input like this:

static const int * const vvaraddr_test = (const int *)0xffffffffff601000;

int func(void)
{
        return *vvaraddr_test;
}

gcc -O2 generates:

        .file   "constptr.c"
        .text
        .p2align 4,,15
        .globl  func
        .type   func, @function
func:
.LFB0:
        .cfi_startproc
        movl    -10481664, %eax
        ret
        .cfi_endproc
.LFE0:
        .size   func, .-func
        .ident  "GCC: (GNU) 4.6.3 20120306 (Red Hat 4.6.3-2)"
        .section        .note.GNU-stack,"",@progbits

Note, in particular, that (a) the load from the vvar uses an immediate
memory operand (this avoids a cacheline access, which is a measurable
speedup) and (b) vvaraddr_test was not emitted as data at all.

Your code will force each vvar address to be emitted as data and will
cause each reference to reference it as data.  Barring cleverness (and
I don't remember whether the vdso build is currently clever), this
could result in double-indirect access via the GOT from the vdso.

This kind of change IMO needs actual size measurements, benchmarks,
and some evidence that duplicate .data/.rodata things were emitted.

--Andy

Discussion subject changed to "lto, kprobes: Use KSYM_NAME_LEN to size identifier buffers" by Ananth N Mavinakayanahalli
Ananth N Mavinakayanahalli
Newsgroups: linux.kernel
From: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Date: Tue, 21 Aug 2012 06:50:02 +0200
Local: Tues, Aug 21 2012 12:50 am
Subject: Re: [PATCH 71/74] lto, kprobes: Use KSYM_NAME_LEN to size identifier buffers

On Sat, Aug 18, 2012 at 07:57:07PM -0700, Andi Kleen wrote:
> From: Joe Mario <jma...@redhat.com>

> Use KSYM_NAME_LEN to size identifier buffers, so that it can
> be increased more easily.

> Cc: ana...@in.ibm.com
> Signed-off-by: Joe Mario <jma...@redhat.com>
> Signed-off-by: Andi Kleen <a...@linux.intel.com>

Acked-by: Ananth N Mavinakayanahalli <ana...@in.ibm.com>

Discussion subject changed to "RFC: Link Time Optimization support for the kernel" by Ingo Molnar
Ingo Molnar
Newsgroups: linux.kernel
From: Ingo Molnar <mi...@kernel.org>
Date: Tue, 21 Aug 2012 09:50:02 +0200
Local: Tues, Aug 21 2012 3:50 am
Subject: Re: RFC: Link Time Optimization support for the kernel

* Andi Kleen <a...@firstfloor.org> wrote:

The other hope would be that if LTO is used by a high-profile
project like the Linux kernel then the compiler folks might look
at it and improve it.

> A lot of the overhead on the larger builds is also some
> specific gcc code that I'm working with the gcc developers on
> to improve. So the 4x extreme case will hopefully go down.

> The large builds also currently suffer from too much memory
> consumption. That will hopefully improve too, as gcc improves.

Are there any LTO build files left around, blowing up the size
of the build tree?

Can the current implementation of LTO optimize to the level of
inlining? A lot of our include file hell situation results from
the desire to declare structures publicly so that inlined
functions can use them directly.

If data structures could be encapsulated/internalized to
subsystems and only global functions are exposed to other
subsystems [which are then LTO optimized] then our include
file dependencies could become a *lot* simpler.

Thanks,

        Ingo

Don Zickus
Newsgroups: linux.kernel
From: Don Zickus <dzic...@redhat.com>
Date: Tue, 21 Aug 2012 16:10:01 +0200
Local: Tues, Aug 21 2012 10:10 am
Subject: Re: RFC: Link Time Optimization support for the kernel

On Tue, Aug 21, 2012 at 09:49:21AM +0200, Ingo Molnar wrote:
> > A lot of the overhead on the larger builds is also some
> > specific gcc code that I'm working with the gcc developers on
> > to improve. So the 4x extreme case will hopefully go down.

> > The large builds also currently suffer from too much memory
> > consumption. That will hopefully improve too, as gcc improves.

> Are there any LTO build files left around, blowing up the size
> of the build tree?

Hi Ingo,

Joe Mario from Red Hat has been assisting Andi with his LTO work.  One of
the ideas he had which may help here is to push the LTO granularity down
to the directory level.  This would allow subsystem maintainers to opt-in
and keep the compile overhead consistent across randconfigs as the linker
would have a smaller pool of files to deal with.

Joe was wondering: if he hacked something up for the scheduler directory
only, are there some preferred benchmark tools he could run to verify
whether performance increases?

Cheers,
Don

Avi Kivity
Newsgroups: linux.kernel
From: Avi Kivity <a...@redhat.com>
Date: Tue, 21 Aug 2012 16:30:01 +0200
Local: Tues, Aug 21 2012 10:30 am
Subject: Re: RFC: Link Time Optimization support for the kernel
On 08/21/2012 10:49 AM, Ingo Molnar wrote:

> Can the current implementation of LTO optimize to the level of
> inlining? A lot of our include file hell situation results from
> the desire to declare structures publicly so that inlined
> functions can use them directly.

> If data structures could be encapsulated/internalized to
> subsystems and only global functions are exposed to other
> subsystems [which are then LTO optimized] then our include
> file dependencies could become a *lot* simpler.

I think modules break this (if I understand what you mean correctly).
If the main kernel exposes symbol x as a global function, then lto will
not inline it into a module.

--
error compiling committee.c: too many arguments to function

Andi Kleen
Newsgroups: linux.kernel
From: Andi Kleen <a...@firstfloor.org>
Date: Tue, 21 Aug 2012 19:10:02 +0200
Local: Tues, Aug 21 2012 1:10 pm
Subject: Re: RFC: Link Time Optimization support for the kernel

> The other hope would be that if LTO is used by a high-profile
> project like the Linux kernel then the compiler folks might look
> at it and improve it.

Yes definitely.  I already got a lot of help from toolchain people.

> > A lot of the overhead on the larger builds is also some
> > specific gcc code that I'm working with the gcc developers on
> > to improve. So the 4x extreme case will hopefully go down.

> > The large builds also currently suffer from too much memory
> > consumption. That will hopefully improve too, as gcc improves.

> Are there any LTO build files left around, blowing up the size
> of the build tree?

The objdir size increases from the intermediate information in the objects,
even though it's compressed.  A typical LTO objdir is about 2.5x
as big as non LTO.

[this will go down a bit with slim LTO; right now there is an unnecessary
copy of the non LTOed code too; but I expect it will still be
significantly larger]

There's also the TMPDIR problem. If you put /tmp in tmpfs and gcc
defaults to putting the intermediate files during the final link into
/tmp, the memory fills up even faster, because tmpfs is competing
with anonymous memory.

4.7 improved a lot over 4.6 for this with better partitioning; with 4.6 I
had some spectacular OOMs. 4.6 is not supported for LTO anymore now,
with 4.7 it became much better.

I also hope tmpfs will get better algorithms eventually that make
this less likely.

Anyways this can be overridden by setting TMPDIR to the object directory.
With TMPDIR set and not too aggressive -j* for most kernels you should
be ok with 4GB of memory. Just allyes still suffers.

This was one of the reasons why I made it not default for allyesconfig.

> > so we'll hopefully see more gains over time. Essentially it
> > gives more power to the compiler.

> > Long term it would also help the kernel source organization.
> > For example there's no reason with LTO to have gigantic
> > includes with large inlines, because cross file inlining works
> > in a efficient way without reparsing.

> Can the current implementation of LTO optimize to the level of
> inlining? A lot of our include file hell situation results from

Yes, it does cross file inlining. Maybe a bit too much even
(currently there are about 40% fewer static CALLs when LTOed).
In fact some of the current workarounds limit it, so there may be
even more in the future.

One side effect is that backtraces are harder to read. You'll
need to rely more on addr2line than before (or we may need
to make kallsyms smarter)

It only inlines inside a final binary though, as Avi mentioned,
so it's more useful inside a subsystem for modular kernels.

> If data structures could be encapsulated/internalized to
> subsystems and only global functions are exposed to other
> subsystems [which are then LTO optimized] then our include
> file dependencies could become a *lot* simpler.

Yes, long term we could have these benefits.

BTW I should add LTO does more than just inlining:
- Drop unused global functions and variables
  (so may cut down on ifdefs)
- Detect type inconsistencies between files
- Partial inlining (inline only parts of a function like a test
  at the beginning)
- Detect pure and const functions without side effects that can be more
  aggressively optimized in the caller.
- Detect global clobbers globally. Normally any global call has to
  assume all global variables could be changed.  With LTO information some
  of them can be cached in registers over calls.
- Detect read only variables and optimize them
- Optimize arguments to global functions (drop unnecessary arguments,
  optimize input/output etc.)
- Replace indirect calls with direct calls, enabling other
  optimizations.
- Do constant propagation and specialization for functions. So if a
  function is called commonly with a constant it can generate a special
  variant of this function optimized for that.  This still needs more tuning (and
  currently the code size impact is on the largish side), but I hope
  to eventually have e.g. a special kmalloc optimized for GFP_KERNEL.
  It can also in principle inline callbacks.

-Andi
--
a...@linux.intel.com -- Speaking for myself only.

Discussion subject changed to "lto, workaround: Add workaround for missing LTO symbols in igb" by Arnd Bergmann
Arnd Bergmann
Newsgroups: linux.kernel
From: Arnd Bergmann <a...@arndb.de>
Date: Wed, 22 Aug 2012 10:50:01 +0200
Local: Wed, Aug 22 2012 4:50 am
Subject: Re: [PATCH 56/74] lto, workaround: Add workaround for missing LTO symbols in igb
On Sunday 19 August 2012, Andi Kleen wrote:

> -static struct e1000_mac_operations e1000_mac_ops_82575 = {
> +/* Workaround for LTO bug */
> +__visible struct e1000_mac_operations e1000_mac_ops_82575 = {

The comment is not very clear outside the context of this patch.
Maybe change it to /* __visible added to work around an LTO bug */.

        Arnd

Discussion subject changed to "RFC: Link Time Optimization support for the kernel" by Arnd Bergmann
Arnd Bergmann
Newsgroups: linux.kernel
From: Arnd Bergmann <a...@arndb.de>
Date: Wed, 22 Aug 2012 11:00:01 +0200
Local: Wed, Aug 22 2012 5:00 am
Subject: Re: RFC: Link Time Optimization support for the kernel
On Sunday 19 August 2012, Andi Kleen wrote:

> This rather large patchkit enables gcc Link Time Optimization (LTO)
> support for the kernel.

> With LTO gcc will do whole program optimizations for
> the whole kernel and each module. This increases compile time,
> but can generate faster code.

> LTO allows gcc to inline functions between different files and
> do various other optimization across the whole binary.

This looks quite nice overall. Have you seen other disadvantages
besides bugs and compile time? There are two possible issues that
I can see happening:

* Debuggability: When we get more aggressive optimizations, it
often becomes harder to trace back object code to a specific source
line, which may be a reason for distros not to enable it for their
product kernels in the end because it can make the work of their
support teams harder.

* Stack consumption: If you do more inlining, the total stack usage
of large functions can become higher than what the deepest path through
the same code in the non-inlined version would be. This bites us
more in the kernel than in user applications, which have much more
stack space available.

Have you noticed problems with either of these so far? Do you think
they are realistic concerns or is the LTO implementation good enough
that they would rarely become an issue?

        Arnd

Discussion subject changed to "lto, workaround: Add workaround for missing LTO symbols in igb" by Andi Kleen
Andi Kleen
Newsgroups: linux.kernel
From: Andi Kleen <a...@firstfloor.org>
Date: Wed, 22 Aug 2012 14:40:01 +0200
Local: Wed, Aug 22 2012 8:40 am
Subject: Re: [PATCH 56/74] lto, workaround: Add workaround for missing LTO symbols in igb

On Wed, Aug 22, 2012 at 08:43:35AM +0000, Arnd Bergmann wrote:
> On Sunday 19 August 2012, Andi Kleen wrote:
> > -static struct e1000_mac_operations e1000_mac_ops_82575 = {
> > +/* Workaround for LTO bug */
> > +__visible struct e1000_mac_operations e1000_mac_ops_82575 = {

> The comment is not very clear outside the context of this patch.
> Maybe change it to /* __visible added to work around an LTO bug */.

I hope to remove this soon, just needs another fix for initcalls
first.

-Andi
--
a...@linux.intel.com -- Speaking for myself only.

Discussion subject changed to "RFC: Link Time Optimization support for the kernel" by Andi Kleen
Andi Kleen
Newsgroups: linux.kernel
From: Andi Kleen <a...@firstfloor.org>
Date: Wed, 22 Aug 2012 14:40:03 +0200
Local: Wed, Aug 22 2012 8:40 am
Subject: Re: RFC: Link Time Optimization support for the kernel

On Wed, Aug 22, 2012 at 08:58:02AM +0000, Arnd Bergmann wrote:
> * Debuggability: When we get more aggressive optimizations, it
> often becomes harder to trace back object code to a specific source
> line, which may be a reason for distros not to enable it for their
> product kernels in the end because it can make the work of their
> support teams harder.

Yes, that's a potential issue with the larger functions. People looking
at oopses may need to rely more on addr2line with debug info. It's probably
less an issue for distributions (who should have debug info for their kernels
and may even use crash instead of only oops logs), but more for random reports
on linux-kernel.

That said, for the few LTO crashes I looked at it wasn't that big an issue.
Usually the inline chains are still broken up by indirect calls, and
a lot of kernel paths have that, so I could make sense of all the
backtraces without debug info.

> * Stack consumption: If you do more inlining, the total stack usage
> of large functions can become higher than what the deepest path through
> the same code in the non-inlined version would be. This bites us
> more in the kernel than in user applications, which have much more
> stack space available.

Newer gcc has a heuristic to not inline when the stack frame gets too
large. We set that option. Also there's a warning for too large
stack frames. With these two together we should be pretty safe.

iirc the warning mostly showed up in some staging drivers which were likely
already too large on their own. I haven't hunted for it explicitly,
but I don't remember seeing it much in other places. Also it was always
still in a range that does not necessarily crash.
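
A minimal sketch of the stack concern, not taken from the patch series. The function names and the buffer size are made up for illustration. Kept out of line, each helper's scratch buffer lives only in its own frame; if both helpers were inlined into the caller, the caller's frame may have to hold both buffers at once. GCC's -Wframe-larger-than=<bytes> option can be used to get a warning about oversized frames.

```c
#include <stddef.h>
#include <string.h>

#define SCRATCH_SIZE 512   /* hypothetical size, chosen only for illustration */

/* Out of line, the 512-byte scratch buffer exists only while this
 * function runs; its frame is released before the next helper starts. */
__attribute__((noinline))
size_t count_upper(const char *s)
{
    char scratch[SCRATCH_SIZE];
    size_t n = 0;

    strncpy(scratch, s, sizeof(scratch) - 1);
    scratch[sizeof(scratch) - 1] = '\0';
    for (size_t i = 0; scratch[i] != '\0'; i++)
        if (scratch[i] >= 'A' && scratch[i] <= 'Z')
            n++;
    return n;
}

/* Same shape as above, with its own 512-byte scratch buffer. */
__attribute__((noinline))
size_t count_digits(const char *s)
{
    char scratch[SCRATCH_SIZE];
    size_t n = 0;

    strncpy(scratch, s, sizeof(scratch) - 1);
    scratch[sizeof(scratch) - 1] = '\0';
    for (size_t i = 0; scratch[i] != '\0'; i++)
        if (scratch[i] >= '0' && scratch[i] <= '9')
            n++;
    return n;
}

/* Without the noinline attributes the compiler would be free to inline
 * both helpers here, and this frame could then grow to cover both
 * scratch buffers instead of only the deeper of the two. */
size_t count_both(const char *s)
{
    return count_upper(s) + count_digits(s);
}
```

The noinline attribute here is only a way to make the out-of-line case explicit; whether a given compiler would actually merge the two buffers when inlining depends on its stack slot sharing.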

> Have you noticed problems with either of these so far? Do you think
> they are realistic concerns or is the LTO implementation good enough
> that they would rarely become an issue?

I think the first is a realistic possible concern, but I personally haven't
had much trouble with it so far.

-Andi

--
a...@linux.intel.com -- Speaking for myself only.


 
Discussion subject changed to "lto, watchdog/hpwdt.c: Make assembler label global" by Wim Van Sebroeck
Wim Van Sebroeck  
Newsgroups: linux.kernel
From: Wim Van Sebroeck <w...@iguana.be>
Date: Wed, 22 Aug 2012 21:30:02 +0200
Local: Wed, Aug 22 2012 3:30 pm
Subject: Re: [PATCH 38/74] lto, watchdog/hpwdt.c: Make assembler label global
Hi Andi,

> From: Andi Kleen <a...@linux.intel.com>

> We cannot assume that the inline assembler code always ends up
> in the same file as the original C file. So make any assembler labels
> that are called with "extern" by C global

> Cc: w...@iguana.be
> Signed-off-by: Andi Kleen <a...@linux.intel.com>

You have my signed-off-by, but I'm also Cc-ing the author of the driver
(Tom Mingarelli) so that he is aware of the proposed change.
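
A minimal sketch of the pattern the patch description refers to, assuming an x86-64 ELF target. The symbol name asm_answer is hypothetical and this is not the hpwdt.c code. The ".globl" directive is the part that matters: without it the label is local to whichever object file the assembler text ends up in, and a C "extern" reference from another object file would not resolve.

```c
/* The C side only knows the symbol through this declaration. */
extern int asm_answer(void);

#if defined(__x86_64__) && defined(__ELF__)
/* Top-level asm: the compiler passes this text through to the assembler. */
__asm__(
    ".pushsection .text\n"
    ".globl asm_answer\n"              /* make the label a global symbol */
    ".type asm_answer, @function\n"
    "asm_answer:\n"
    "    movl $42, %eax\n"             /* return value goes in eax */
    "    ret\n"
    ".size asm_answer, .-asm_answer\n"
    ".popsection\n"
);
#else
/* Fallback so the sketch still builds on other targets. */
int asm_answer(void)
{
    return 42;
}
#endif

/* Ordinary C caller that reaches the label through the extern declaration. */
int call_asm_answer(void)
{
    return asm_answer();
}
```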

Kind regards,
Wim.



 
Mingarelli, Thomas  
Newsgroups: linux.kernel
From: "Mingarelli, Thomas" <Thomas.Mingare...@hp.com>
Date: Wed, 22 Aug 2012 22:20:02 +0200
Local: Wed, Aug 22 2012 4:20 pm
Subject: RE: [PATCH 38/74] lto, watchdog/hpwdt.c: Make assembler label global
I am OK with the changes. We have a few more coming soon to improve the kdump process when hpwdt is running. Just a heads up.

Thanks,
Tom


 
Discussion subject changed to "lto, workaround: Mark do_futex noinline to prevent clobbering ebp" by H. Peter Anvin
H. Peter Anvin  
Newsgroups: linux.kernel
From: "H. Peter Anvin" <h...@zytor.com>
Date: Thu, 23 Aug 2012 02:20:01 +0200
Local: Wed, Aug 22 2012 8:20 pm
Subject: Re: [PATCH 74/74] lto, workaround: Mark do_futex noinline to prevent clobbering ebp
On 08/18/2012 07:57 PM, Andi Kleen wrote:

> From: Andi Kleen <a...@linux.intel.com>

> On a 32bit build gcc 4.7 with LTO decides to clobber the 6th argument on the
> stack.  Unfortunately this corrupts the user EBP and leads to later crashes.
> For now mark do_futex noinline to prevent this.

> I wish there was a generic way to handle this. Seems like a ticking time
> bomb problem.

There is a generic way to handle this.  This is actually a bug in Linux
that has been known for at least 15 years and which we keep hacking around.

The right thing to do is to change head_32.S to not violate the i386
ABI.  Arguments pushed (by value) on the stack are property of the
callee, that is, they are volatile, so the hack of making them do double
duty as both being saved and passed as arguments is just plain bogus.
The problem is that it works "just well enough" that people (including
myself) keep hacking around it with hacks like this, with assembly
macros, and whatnot instead of fixing the root cause.
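
A minimal C-level sketch of why by-value arguments belong to the callee. The function names are made up and this is not the futex code. The callee treats its sixth parameter as an ordinary local and overwrites it; when arguments are passed on the stack, the compiler is allowed to write the new value into the incoming argument slot, so code that expects that slot to keep the caller's value is relying on luck.

```c
#define demo_noinline __attribute__((noinline))

/* The callee modifies its sixth argument. With stack-passed arguments
 * the compiler may store the result straight into the argument slot. */
demo_noinline long scale_last(long a, long b, long c, long d, long e, long f)
{
    f *= 2;                       /* callee owns its copy of f */
    return a + b + c + d + e + f;
}

long call_scale_last(void)
{
    long saved = 6;
    long sum = scale_last(1, 2, 3, 4, 5, saved);

    /* C guarantees that 'saved' is still 6 here, because the callee only
     * changed its own copy. Assembly that reuses the pushed slot as a
     * saved register has no such guarantee. */
    return sum * 100 + saved;
}
```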

        -hpa


 
H. Peter Anvin  
Newsgroups: linux.kernel
From: "H. Peter Anvin" <h...@zytor.com>
Date: Thu, 23 Aug 2012 04:20:01 +0200
Local: Wed, Aug 22 2012 10:20 pm
Subject: Re: [PATCH 74/74] lto, workaround: Mark do_futex noinline to prevent clobbering ebp
On 08/22/2012 05:17 PM, H. Peter Anvin wrote:

Just a clarification (Andi knows this, I'm sure, but others might not):
this wasn't done the way it is for no reason; back when Linus originally
wrote the code, i386 passed *all* arguments on the stack, and we still
do that for "asmlinkage" functions on i386.  Since gcc back then rarely
if ever mucked with the stack arguments, it made sense to make them
"double duty."  Fixing this really should entail changing the invocation
of system calls on i386 to use the regparm convention, which means we
only need to push three arguments twice, rather than six.
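
A minimal sketch of the two calling conventions mentioned above, using GCC's regparm function attribute. The macro and function names are hypothetical. On i386, regparm(0) keeps all arguments on the stack and regparm(3) passes the first three integer arguments in EAX, EDX and ECX. The attribute only applies to 32-bit x86, so the macros expand to nothing elsewhere.

```c
#if defined(__i386__)
# define demo_stack_args __attribute__((regparm(0)))  /* all arguments on the stack */
# define demo_reg_args   __attribute__((regparm(3)))  /* first three in registers */
#else
# define demo_stack_args   /* no effect outside 32-bit x86 */
# define demo_reg_args
#endif

/* Stack convention: every argument is pushed by the caller. */
demo_stack_args long add3_stack(long a, long b, long c)
{
    return a + b + c;
}

/* Register convention: a, b and c arrive in registers on i386. */
demo_reg_args long add3_regs(long a, long b, long c)
{
    return a + b + c;
}
```

Both functions compute the same result; only the way the arguments reach them differs, which is what entry code written in assembly has to match.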

        -hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



 
Andi Kleen  
Newsgroups: linux.kernel
From: Andi Kleen <a...@linux.intel.com>
Date: Thu, 23 Aug 2012 04:30:02 +0200
Local: Wed, Aug 22 2012 10:30 pm
Subject: Re: [PATCH 74/74] lto, workaround: Mark do_futex noinline to prevent clobbering ebp

> The right thing to do is to change head_32.S to not violate the i386
> ABI.  Arguments pushed (by value) on the stack are property of the
> callee, that is, they are volatile, so the hack of making them do double
> duty as both being saved and passed as arguments is just plain bogus.
> The problem is that it works "just well enough" that people (including
> myself) keep hacking around it with hacks like this, with assembly
> macros, and whatnot instead of fixing the root cause.

How about just using register arguments for the first three arguments?
This should work for the syscalls at least (may be too risky for all
other asm entry points).

And for syscalls with more than three, generate a stub that saves on the stack
explicitly.  This could be done using the new fancy SYSCALL definition macros
(except that arch/x86 would need to start using them too in its own code).

Or is there some subtle reason with syscall restart and updated args
that prevents it?

Perhaps newer gcc can do regparm(X), X > 3 too, may be worth trying.

Don't have time to look into this currently though.

-Andi

--
a...@linux.intel.com -- Speaking for myself only


 
H. Peter Anvin  
Newsgroups: linux.kernel
From: "H. Peter Anvin" <h...@zytor.com>
Date: Thu, 23 Aug 2012 05:20:02 +0200
Local: Wed, Aug 22 2012 11:20 pm
Subject: Re: [PATCH 74/74] lto, workaround: Mark do_futex noinline to prevent clobbering ebp
On 08/22/2012 07:29 PM, Andi Kleen wrote:

> How about just use register arguments for the first three arguments.
> This should work for the syscalls at least (may be too risky for all
> other asm entry points)

Well, it's just an effort to convert each one in turn...

> And for syscalls with more than three generate a stub that saves on the stack
> explicitly.  This could be done using the new fancy SYSCALL definition macros
> (except that arch/x86 would need to start using them too in its own code)

I don't think there is any point.  Just push the six potential arguments
to the stack and be done with it.

> Or is there some subtle reason with syscall restart and updated args
> that prevents it?

> Perhaps newer gcc can do regparm(X), X > 3 too, may be worth trying.

No, there is no such ABI defined.

> Don't have time to look into this currently though.

Always the problem.

        -hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



 
Discussion subject changed to "RFC: Link Time Optimization support for the kernel" by Jan Hubicka
Jan Hubicka  
Newsgroups: linux.kernel
From: Jan Hubicka <hubi...@ucw.cz>
Date: Thu, 23 Aug 2012 17:20:02 +0200
Local: Thurs, Aug 23 2012 11:20 am
Subject: Re: RFC: Link Time Optimization support for the kernel

> > If data structures could be encapsulated/internalized to
> > subsystems and only global functions are exposed to other
> > subsystems [which are then LTO optimized] then our include
> > file dependencies could become a *lot* simpler.

> Yes, long term we could have these benefits.

Yes, in the long term LTO should make developers' lives easier; it is not just a tool
for getting a few extra % of performance.
There is a lot to do.

> BTW I should add LTO does more than just inlining:
> - Drop unused global functions and variables
>   (so may cut down on ifdefs)
> - Detect type inconsistencies between files
> - Partial inlining (inline only parts of a function like a test
>   at the beginning)
> - Detect pure and const functions without side effects that can be more
>   aggressively optimized in the caller.

Also noreturn and nothrow are autodetected (the second is probably not a big deal
for the kernel, but it makes some C++ codebases a lot smaller by eliminating EH
and cleanups). We plan to add more in the near future.

> - Detect global clobbers globally. Normally any global call has to
>   assume all global variables could be changed.  With LTO information some
>   of them can be cached in registers over calls.
> - Detect read only variables and optimize them
> - Optimize arguments to global functions (drop unnecessary arguments,
>   optimize input/output etc.)

At the moment this really happens within compilation units only.
It is one of the harder optimizations to get working over the whole program;
we are slowly getting the infrastructure to make this possible.

> - Replace indirect calls with direct calls, enabling other
>   optimizations.
> - Do constant propagation and specialization for functions. So if a
>   function is called commonly with a constant it can generate a special
>   variant of this function optimized for that.  This still needs more tuning (and
>   currently the code size impact is on the largish side), but I hope
>   to eventually have e.g. a special kmalloc optimized for GFP_KERNEL.
>   It can also in principle inline callbacks.

Also profile propagation is done.  When a function is called only on cold paths,
it becomes cold.
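
A minimal sketch of what the constant propagation and specialization item in the list above amounts to, written out by hand. The flag values and function names are hypothetical and are not the kernel's GFP_* definitions. When most callers pass the same constant, the compiler can create a clone with that constant folded in (in GCC this is the job of interprocedural constant propagation, controlled by options such as -fipa-cp-clone), so the flag tests disappear from the common path.

```c
#define DEMO_FLAG_WAIT 0x1u   /* hypothetical flag bits, for illustration only */
#define DEMO_FLAG_ZERO 0x2u

/* General version: every call evaluates both flag tests. */
unsigned cost_general(unsigned size, unsigned flags)
{
    unsigned cost = size;

    if (flags & DEMO_FLAG_WAIT)
        cost += 10;
    if (flags & DEMO_FLAG_ZERO)
        cost += size;
    return cost;
}

/* What a clone specialized for flags == DEMO_FLAG_WAIT reduces to once
 * the constant has been folded in: no flag tests are left. */
unsigned cost_wait_only(unsigned size)
{
    return size + 10;
}
```

The specialized clone costs extra code size, which matches the remark in the quoted text that the size impact still needs tuning.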

Thanks for all the hard work on LTO kernel, Andi!
Honza

> -Andi
> --
> a...@linux.intel.com -- Speaking for myself only.


 
Discussion subject changed to "x86, lto: Change dotraplinkage into __visible on 32bit" by Michal Marek
Michal Marek  
Newsgroups: linux.kernel
From: Michal Marek <mma...@suse.cz>
Date: Sat, 01 Sep 2012 16:50:01 +0200
Local: Sat, Sep 1 2012 10:50 am
Subject: Re: [PATCH 20/74] x86, lto: Change dotraplinkage into __visible on 32bit
On 19.8.2012 04:56, Andi Kleen wrote:

> From: Andi Kleen <a...@linux.intel.com>

> Mark 32bit dotraplinkage functions as __visible for LTO.
> 64bit already is using asmlinkage which includes it.

You can make it __visible on both 32bit and 64bit, the result is the same.
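
A minimal sketch of what a "visible" marker does, assuming it maps to GCC's externally_visible function attribute; the macro and function names here are hypothetical. The attribute tells a whole-program or LTO build that the function is referenced from somewhere the compiler cannot see, such as assembly entry code, so it must not be dropped or turned into a local symbol. Compilers without that attribute fall back to "used" in this sketch.

```c
#if defined(__has_attribute)
# if __has_attribute(externally_visible)
#  define demo_visible __attribute__((externally_visible))
# endif
#endif
#ifndef demo_visible
# define demo_visible __attribute__((used))   /* fallback for other compilers */
#endif

/* Imagine the only caller is assembly code that the optimizer never
 * sees. The marker keeps the symbol alive and global anyway. */
demo_visible int trap_handler_demo(int vector)
{
    return vector + 1;
}
```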

Michal


 