Large data sections for the large code model


mas...@google.com

12 May 2023, 22:54:13
to X86-64 System V Application Binary Interface
There are several types of relocation overflows that we need to pay attention to:

* `.text <-> .rodata`
* `.text <-> .eh_frame`: `.eh_frame` uses 32-bit offsets. 64-bit code offsets are possible in the format, but I am not aware of an existing implementation.
* `.text <-> .bss`
* `.rodata <-> .bss`

In many programs, `.text <-> .data/.bss` relocations have the tightest constraints.
Overflows due to `.text <-> .rodata` relocations are possible but rare (I have seen such issues in the past).
`.rodata` is usually larger than `.data+.bss`.

The current section layout of ld.lld is as follows:
```
.rodata
.text
.data
.bss
```

One notable difference from GNU ld is that `.rodata` precedes `.text`. This arrangement helps alleviate relocation overflow pressure for references from `.text` to `.bss`.

---

The medium code model introduces large data sections such as `.lrodata`, `.ldata`, and `.lbss`.
GCC's implementation provides several variants for `.ldata`, including `.ldata.rel`, `.ldata.rel.local`, `.ldata.rel.ro`, and `.ldata.rel.ro.local`.

GNU ld uses the following section layout:
```
.text
.rodata
.data
.bss
.lrodata
.ldata
.lbss
```

For ld.lld, I am contemplating the following section layout:
```
.lrodata
.rodata
.text
.data
.bss
.lbss
.ldata
```
(`.lrodata` is placed before `.rodata` so that we can save one maxpagesize alignment.)

GCC generates both regular and large data sections with `-mcmodel=medium`. This is decided by a section size threshold (`-mlarge-data-threshold`).
In practice, we always mix object files built with the small and medium/large code models (just think of prebuilt object files, including libc).
The large data sections built with `-mcmodel=medium` do not exert relocation pressure on sections in object files built with `-mcmodel=small`.

However, GCC only generates regular data sections with `-mcmodel=large`. `-mlarge-data-threshold` is ignored.
As a result, the data sections built with `-mcmodel=large` may exert relocation pressure on sections in object files with `-mcmodel=small`.

I propose that we make `-mcmodel=large` respect `-mlarge-data-threshold` and generate large data sections as well.
I posted a GCC patch and Uros Bizjak [asked](https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617999.html) me to discuss the issue with the x86_64 psABI group.
So here is the discussion:)

Here are the concrete suggestions for the psABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/42/

We have large data sections listed under "Table 4.4: Additional Special Sections for the Large Code Model".
We should clarify that these `.l*` sections can be used with the medium code model as well.

State that the large code model, like the medium code model, can use large data sections.

Fangrui Song

22 May 2023, 17:07:03
to X86-64 System V Application Binary Interface, Jan Hubicka, H.J. Lu, Michael Matz, Uros Bizjak

Michael Matz

23 May 2023, 09:16:35
to Fangrui Song, X86-64 System V Application Binary Interface, Jan Hubicka, H.J. Lu, Uros Bizjak
Hello,

On Mon, 22 May 2023, 'Fangrui Song' via X86-64 System V Application Binary Interface wrote:

> > However, GCC only generates regular data sections with
> > `-mcmodel=large`. `-mlarge-data-threshold` is ignored. As a result,
> > the data sections built with `-mcmodel=large` may exert relocation
> > pressure on sections in object files with `-mcmodel=small`.

The thing is that in the large code model there's no difference between
.data and .ldata. The large model has to assume that .text is larger than
2GB, so even references into .data need to be 64bit-aware, like they would
be with .ldata already in the medium model. As this is so, there's now
not even a size restriction on .data, and so there's no real need to use
the .ldata scheme.

The existence of .ldata is an optimization, so to speak: it's there to be
able to get away with a size restriction on .data. But that only makes
sense with the medium code model. The large code model leaves no such
inherent optimization possibilities (obviously the linker can of course
relax certain accesses if it can determine that the distances happen to
be small enough).

So, in the large model the compiler can freely place everything into .data
without any heuristics.

> > I propose that we make `-mcmodel=large` respect `-mlarge-data-threshold`
> > and generate large data sections as well.
> > I posted a GCC patch and Uros Bizjak [asked](
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617999.html) me to
> > discuss the issue with the x86_64 psABI group.
> > So here is the discussion:)

So, in light of the above (.ldata not required in cmodel=large), why would
you want to make GCC emit stuff into .ldata?

> > Here are the concrete suggestions for the psABI:
> > https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/42/

I agree with the clarification on the first part, but not to the addition
of the requirement to split data sections into two categories for the
large model.

> > We have large data sections listed under "Table 4.4: Additional Special
> > Sections for the Large Code Model".
> > We should clarify that these `.l*` sections can be used with the medium
> > code model as well.
> >
> > State that the large code model, like the medium code model, can use
> > large data sections.

"The large code model makes no assumptions about addresses and sizes
of sections." is already part of the text. You may use any section names
you like, they are all allowed to be > 2GB. If you insist on .ldata, you
may use that without any addition to the psABI.


Ciao,
Michael.

Fangrui Song

23 May 2023, 12:18:17
to Michael Matz, X86-64 System V Application Binary Interface, Jan Hubicka, H.J. Lu, Uros Bizjak
On 2023-05-23, Michael Matz wrote:
>Hello,
>
>On Mon, 22 May 2023, 'Fangrui Song' via X86-64 System V Application Binary Interface wrote:
>
>> > However, GCC only generates regular data sections with
>> > `-mcmodel=large`. `-mlarge-data-threshold` is ignored. As a result,
>> > the data sections built with `-mcmodel=large` may exert relocation
>> > pressure on sections in object files with `-mcmodel=small`.
>
>The thing is that in the large code model there's no difference between
>.data and .ldata. The large model has to assume that .text is larger than
>2GB, so even references into .data need to be 64bit-aware, like they would
>be with .ldata already in the medium model. As this is so, there's now
>not even a size restriction on .data, and so there's no real need to use
>the .ldata scheme.
>
>The existence of .ldata is an optimization, so to speak: it's there to be
>able to get away with a size restriction on .data. But that only makes
>sense with the medium code model. The large code model leaves no such
>inherent optimization possibilities (obviously the linker can of course
>relax certain accesses if it can determine that the distances happen to
>be small enough).
>
>So, in the large model the compiler can freely place everything into .data
>without any heuristics.

If an executable consists of only -mcmodel=large object files, having
just .data sections should be fine. However, in reality we have to deal
with both -mcmodel=small and -mcmodel=large object files, and we should
think about reducing relocation overflow pressure on -mcmodel=small
object files.

Similarly, -mcmodel=large object files can impose pressure on the .data
sections from -mcmodel=medium object files (data smaller than the threshold).

>> > I propose that we make `-mcmodel=large` respect `-mlarge-data-threshold`
>> > and generate large data sections as well.
>> > I posted a GCC patch and Uros Bizjak [asked](
>> > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617999.html) me to
>> > discuss the issue with the x86_64 psABI group.
>> > So here is the discussion:)
>
>So, in light of the above (.ldata not required in cmodel=large), why would
>you want to make GCC emit stuff into .ldata?
>

Above:)

Jan Beulich

24 May 2023, 02:04:50
to Fangrui Song, X86-64 System V Application Binary Interface, Jan Hubicka, H.J. Lu, Uros Bizjak, Michael Matz
Isn't it an inherent assumption that all object files contributing to a
single final binary use the same model?

Jan

Florian Weimer

24 May 2023, 03:21:20
to 'Jan Beulich' via X86-64 System V Application Binary Interface, Fangrui Song, Jan Beulich, Jan Hubicka, H.J. Lu, Uros Bizjak, Michael Matz
* 'Jan Beulich' via X86-64 System V Application Binary Interface:

> Isn't it an inherent assumption that all object files contributing to a
> single final binary use the same model?

It would certainly be nice if the larger models would be usable without
having to rebuild GCC and glibc to get the appropriate crt*.o files.
That's only possible with some level of ABI compatibility.

Thanks,
Florian

Michael Matz

24 May 2023, 08:39:24
to Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Fangrui Song, Jan Beulich, Jan Hubicka, H.J. Lu, Uros Bizjak
Hello,
It's downward, but not upward, compatible. So it's enough if the crt*.o
files were large-model aware, i.e. simply used 64-bit relocs for
everything they need to do, instead of PC-relative ones. There's no harm
(except for a couple bytes of code-size waste) vis-à-vis the small/medium models.

Otherwise, in principle I agree with Jan: all .o files have to be compiled
with the same (or a larger) model. It simply can't be made to work 100%
reliably otherwise. E.g. if the text size is > 2GB for whatever
reason, and you link in some small/medium model .o (e.g. from static
archives), you will run into problems. _Even_ if we were to try to specify
a way where some use-cases would happen to work.

Take Fangrui's suggestion of still trying to keep .data small (by moving
large data items to .ldata) in the large model, even though that's not
necessary: it will make some more mixed examples happen to work, without
being a reliable solution for the mixed-model problem.

So, I'm fine with putting a _suggestion_ of such a split, even in the large
model, into the psABI, as a hint. But I'm not fine with putting in
language that implies that .ldata (and friends) have to be used in the
large model.

FWIW, I consider the large code model to be a theoretical exercise in
trying to reach feature completeness. At the time we added it to the
psABI there were only hearsay in-house requirements for actually having
ELF executables or shared libs with > 2GB of text, and I have never seen
such in public. The large model code sequences are _so_ bad that really
no one should use it, but rather split the code into different shared libs
if need be. The only case where the large model is necessary is if you
have an individual function that's larger than 2GB, and if your
project/code structure is bad enough to result in _that_, then you have
other problems and obviously aren't interested in performance.


Ciao,
Michael.

Michael Matz

24 May 2023, 08:46:21
to Fangrui Song, X86-64 System V Application Binary Interface, Jan Hubicka, H.J. Lu, Uros Bizjak
Hello,

On Tue, 23 May 2023, Fangrui Song wrote:

> > So, in the large model the compiler can freely place everything into .data
> > without any heuristics.
>
> If an executable consists of only -mcmodel=large object files, having
> just .data sections should be fine. However, in reality we have to deal
> with both -mcmodel=small and -mcmodel=large object files and think about
> giving less relocation overflow pressure to -mcmodel=small object files.

So you want to make some more of the mixed-model examples work. That's
really outside the psABI, but I think a sensible goal. As I said in the
answer to Florian I would be fine with hints about this in the psABI, but
not with a requirement to use .ldata and friends.

Also, I think the biggest source of mixing models comes from crt*.o files,
and I further think the solution to that is to make them large-model
always. There are no downsides to that. I was contemplating submitting
something like that for the compiler and glibc crt files a dozen years ago
when we added the large model, but never found the energy, as I inherently
think no one should use the large model.


Ciao,
Michael.

Fangrui Song

24 May 2023, 11:32:42
to Michael Matz, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Jan Beulich, Jan Hubicka, H.J. Lu, Uros Bizjak
On 2023-05-24, Michael Matz wrote:
>Hello,
>
>On Wed, 24 May 2023, Florian Weimer wrote:
>
>> > Isn't it an inherent assumption that all object files contributing to
>> > a single final binary use the same model?
>>
>> It would certainly be nice if the larger models would be usable without
>> having to rebuild GCC and glibc to get the appropriate crt*.o files.
>> That's only possible with some level of ABI compatibility.
>
>It's downward, but not upward, compatible. So it's enough if the crt*.o
>files were large-model aware, i.e. simply used 64-bit relocs for
>everything they need to do, instead of PC-relative ones. There's no harm
>(except for a couple bytes of code-size waste) vis-à-vis the small/medium models.
>
>Otherwise, in principle I agree with Jan, all .o files have to be compiled
>with the same (or a larger) model. It simply can't be made to work 100%
>reliable otherwise. E.g. if the text size will be > 2GB for whatever
>reason, and you link in some small/medium model .o (e.g. from static
>archives) you will run into problems. _Even_ if we were to try to specify
>a way where some use-cases would happen to work.
>
>Like Fangruis suggestion of still trying to keep .data small (by moving
>large data items to .ldata) in the large model, even though that's not
>necessary. It will make some more mixed examples happen to work, without
>being a reliable solution for the mixed-model problem.

On
https://maskray.me/blog/2023-05-14-relocation-overflow-and-code-models#aarch64-code-models
, I have analyzed the AArch64 code models. Given that AArch64 and x86-64
executables are comparable in size, and R_AARCH64_ADR_PREL_PG_HI21 has
double the range, AArch64's small code model is good enough for many
programs that only slightly exceed the normal limit.

For these programs on x86-64, compiling the majority of source files
with -mcmodel=medium and a suitable -mlarge-data-threshold=N interacts well
with the -mcmodel=small object files (glibc, libstdc++, prebuilt .a
files, etc). I find it odd that changing some -mcmodel=medium object
files to use -mcmodel=large may exert relocation overflow pressure on
-mcmodel=small object files, therefore I created a GCC patch and was
asked to start a discussion in the x86-64 ABI group.

I agree that mixing object files doesn't unleash the full power of
-mcmodel=medium and -mcmodel=large, but I don't think that matters.
Practicality should weigh a lot when we discuss code models;
otherwise we wouldn't need the regular/large data section distinction.

>So, I'm fine with putting a _suggestion_ of such split even in the large
>model into the psABI, as a hint. But I'm not fine with putting in
>language that implies that .ldata (and friends) have to be used in the
>large model.

If I change

+ Like the medium code model, the data sections are split into two
+ parts.

to

+ Like the medium code model, the data sections can be split into two
+ parts.

for https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/42 ,
would that look good?

>FWIW, I consider the large code model to be a theoretical exercise in
>trying to reach feature completeness. At the time we added it to the
>psABI there were only hearsay in-house requirements for actually having
>ELF executables or shared libs with > 2GB of text, and I have never seen
>such in public. The large model code sequences are _so_ bad that really
>no one should use it, but rather split the code into different shared libs
>if need be. The only case where the large model is necessary is if you
>have an individual function that's larger than 2GB, and if your
>project/code structure is bad enough to result in _that_, then you have
>other problems and obviously aren't interested in performance.
>
>
>Ciao,
>Michael.

If we use range extension thunks instead of
_GLOBAL_OFFSET_TABLE_+R_X86_64_PLTOFF64+indirect jump, the large code
model code sequence will not be that bad. I think global data symbols
are not a bottleneck for a significant portion of programs.

(Large code models may be used by some JIT engines. I have heard that
LLVM MCJIT (now deprecated in favor of ORC) uses (used?) the x86-64 (?)
large code model.)

Michael Matz

25 May 2023, 09:16:51
to Fangrui Song, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Jan Beulich, Jan Hubicka, H.J. Lu, Uros Bizjak
Hello,

On Wed, 24 May 2023, Fangrui Song wrote:

> For these programs on x86-64, compiling the majority of source files
> with -mcmodel=medium and a suitable -mlarge-data-threshold=N interacts well
> with the -mcmodel=small object files (glibc, libstdc++, prebuilt .a
> files, etc). I find it odd that changing some -mcmodel=medium object
> files to use -mcmodel=large may exert relocation overflow pressure to
> -mcmodel=small object files, therefore I created a GCC patch and was
> asked to start a discussion in the x86-64 ABI group.

Yeah, as I said, I do understand the wish to make more mixed examples
work.

> > So, I'm fine with putting a _suggestion_ of such split even in the large
> > ...
> If I change
>
> + Like the medium code model, the data sections are split into two
> + parts.
>
> to
>
> + Like the medium code model, the data sections can be split into two
> + parts.
>
> for https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/42 , would that
> look good?

I was thinking about something a little more verbose, ala:

"Although not strictly necessary the data sections can be split into
normal and large parts like in the medium model, to improve
interoperability." or something to that effect.

What do others here think?

> > such in public. The large model code sequences are _so_ bad that really
> > no one should use it, but rather split the code into different shared libs
> > if need be. The only case where the large model is necessary is if you
> > have an individual function that's larger than 2GB, and if your
> > project/code structure is bad enough to result in _that_ then you have
> > other problems and obviously aren't interested in performance.
>
> If we use range extension thunks instead of
> _GLOBAL_OFFSET_TABLE_+R_X86_64_PLTOFF64+indirect jump, the large code
> model code sequence will not be that bad.

Hmm? All jumps/calls will always contain indirect jumps (except where
relaxable); I don't see how range extension thunks will avoid that. I
think trying to optimize the large code model sequences is a waste, as
nothing should use them to start with. (For other architectures that's
different, as their reachable range for jumps is much smaller, but 2GB
displacement for x86-64 ...)

> I think global data symbols
> are not a bottleneck for a significant portion of programs.

I tend to agree. I really was only talking about code-internal
references, like jumps.

> (
> Large code models may be used by some JIT engines.

JIT engines needn't restrict themselves to the psABI; it's all internal
only (except of course for interacting with foreign stuff).

> I have heard that
> LLVM MCJIT (now deprecated in favor of ORC) uses (used?) the x86-64 (?)
> large code model.)

Hopefully not for internal code references. I can easily see how it's
easier to just always use large model sequences for all external
references, though. But really, I'm not worried about JITs vis the large
model, only about final ELF objects using it. None should :)


Ciao,
Michael.

Jan Beulich

25 May 2023, 09:20:27
to Michael Matz, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Jan Hubicka, H.J. Lu, Uros Bizjak, Fangrui Song
Your suggestion reads well to me, fwiw, and I'd prefer it over Fangrui's.

Jan

Fangrui Song

25 May 2023, 10:45:10
to Michael Matz, Jan Beulich, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Jan Hubicka, H.J. Lu, Uros Bizjak
On 2023-05-25, Jan Beulich wrote:
>On 25.05.2023 15:16, Michael Matz wrote:
>> On Wed, 24 May 2023, Fangrui Song wrote:
>>> For these programs on x86-64, compiling the majority of source files
>>> with -mcmodel=medium and a suitable -mlarge-data-threshold=N interacts well
>>> with the -mcmodel=small object files (glibc, libstdc++, prebuilt .a
>>> files, etc). I find it odd that changing some -mcmodel=medium object
>>> files to use -mcmodel=large may exert relocation overflow pressure to
>>> -mcmodel=small object files, therefore I created a GCC patch and was
>>> asked to start a discussion in the x86-64 ABI group.
>>
>> Yeah, as I said, I do understand the wish to make more mixed examples
>> work.

Thanks for the acknowledgement.

>>>> So, I'm fine with putting a _suggestion_ of such split even in the large
>>>> ...
>>> If I change
>>>
>>> + Like the medium code model, the data sections are split into two
>>> + parts.
>>>
>>> to
>>>
>>> + Like the medium code model, the data sections can be split into two
>>> + parts.
>>>
>>> for https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/42 , would that
>>> look good?
>>
>> I was thinking about something a little more verbose, ala:
>>
>> "Although not strictly necessary the data sections can be split into
>> normal and large parts like in the medium model, to improve
>> interoperability." or something to that effect.
>>
>> What do others here think?
>
>Your suggestion reads good to me, fwiw, and I'd prefer it over Fangrui's.
>
>Jan

Thanks, the wording looks better. I incorporated it into
https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/42

Fangrui Song

25 May 2023, 11:05:34
to Michael Matz, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Jan Beulich, Jan Hubicka, H.J. Lu, Uros Bizjak
On 2023-05-25, Michael Matz wrote:
>[...]
>> > such in public. The large model code sequences are _so_ bad that really
>> > no one should use it, but rather split the code into different shared libs
>> > if need be. The only case where the large model is necessary is if you
>> > have an individual function that's larger than 2GB, and if your
>> > project/code structure is bad enough to result in _that_ then you have
>> > other problems and obviously aren't interested in performance.
>>
>> If we use range extension thunks instead of
>> _GLOBAL_OFFSET_TABLE_+R_X86_64_PLTOFF64+indirect jump, the large code
>> model code sequence will not be that bad.
>
>Hmm? All jumps/calls will always contain indirect jumps (except where
>relaxable); I don't see how range extension thunks will avoid that. I
>think trying to optimize the large code model sequences is a waste, as
>nothing should use them to start with. (For other architectures that's
>different, as their reachable range for jumps is much smaller, but 2GB
>displacement for x86-64 ...)

Range extension thunks don't avoid indirect jumps, but leveraging them
can avoid unneeded indirect jumps. Consider

void ext();
__attribute__((noinline)) static void foo() { ext(); }
void test() { foo(); ext(); }

gcc -mcmodel=large -O2 uses indirect jumps for both foo and ext.
During linking, it's fairly common for foo and ext to be reachable with
a direct jump (foo is in the same translation unit and has a better
chance of being reachable even without features like
--symbol-ordering-file or machine-level function reordering).

If gcc generates `call foo; call ext` instead, the linker-generated
range extension thunks will only affect those call sites that need them.
For a slightly over-sized program, the majority of call sites do not
need indirect jumps.

>> I think global data symbols
>> are not a bottleneck for a significant portion of programs.
>
>I tend to agree. I really was only talking about code-internal
>references, like jumps.
>
>> (
>> Large code models may be used by some JIT engines.
>
>JIT engines needn't restrict themselves to the psABI; it's all internal
>only (except of course for interacting with foreign stuff).

Sounds good.

>> I have heard that
>> LLVM MCJIT (now deprecated in favor of ORC) uses (used?) the x86-64 (?)
>> large code model.)
>
>Hopefully not for internal code references. I can easily see how it's
>easier to just always use large model sequences for all external
>references, though. But really, I'm not worried about JITs vis the large
>model, only about final ELF objects using it. None should :)
>
>
>Ciao,
>Michael.

Hopefully not for internal code references:)

(I just recalled another JIT project, https://git.ageinghacker.net/jitter
(used by GNU poke), that generates R_X86_64_PLTOFF64. I haven't
investigated whether its code sequences can be improved.)

Michael Matz

25 May 2023, 12:11:10
to Fangrui Song, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Jan Beulich, Jan Hubicka, H.J. Lu, Uros Bizjak
Hello,

On Thu, 25 May 2023, Fangrui Song wrote:

> > Hmm? All jumps/calls will always contain indirect jumps (except where
> > relaxable); I don't see how range extension thunks will avoid that. I
> > think trying to optimize the large code model sequences is a waste, as
> > nothing should use them to start with. (For other architectures that's
> > different, as their reachable range for jumps is much smaller, but 2GB
> > displacement for x86-64 ...)
>
> Range extension thunks don't avoid indirect jumps, but leveraging them
> can avoid unneeded indirect jumps. Consider
>
> void ext();
> __attribute__((noinline)) static void foo() { ext(); }
> void test() { foo(); ext(); }
>
> gcc -mcmodel=large -O2 uses indirect jumps for both foo and ext.

As it has to, yes.

> During linking, it's fairly common for foo and ext to be reachable with
> a direct jump (foo is in the same translation unit and has a larger
> chance to be reachable without features like
> --symbol-ordering-file/machine-level function move).

Yes, the linker may be able to know this.

> If gcc generates `call foo; call ext` instead, the linker-generated
> range extension thunks will only affect those call sites that need them.
> For a slightly over-sized program, the majority of call sites do not
> need indirect jumps.

My point is that GCC simply can't just generate 'call foo' in the large
model. GCC has imperfect knowledge about the size of the current
function, and in cmodel=large it has to assume that even just the current
function is > 2GB, and so, even if 'foo' is in the very same translation
unit, has to assume that it _cannot_ be reached by a 32bit displacement in
the call insn. In this situation there may not even be any other space
reachable with 32bit displacements where the extension thunk could be
placed by the linker. Placing it outside 'test' will be out-of-range
(because test is possibly much larger than 2GB), and placing it within
'test' needs preparation from the compiler and requires text relocations.

If you now say "but that only matters if individual functions are larger
than 2GB", then that's correct. But exactly that is the very feature for
which the large code model exists. If we say that extension thunks are the
solution to the speed problem, then we basically say "don't use 2GB
functions". At that point we can just as well say "don't use the large
model", which also solves all speed problems related to it.


> Hopefully not for internal code references:)
>
> (I just recalled another JIT project, https://git.ageinghacker.net/jitter
> (used by GNU poke), that generates R_X86_64_PLTOFF64. I haven't
> investigated whether its code sequences can be improved.)

Thanks for the reference, it looks interesting :)


Ciao,
Michael.

Fangrui Song

25 May 2023, 12:25:30
to Michael Matz, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Jan Beulich, Jan Hubicka, H.J. Lu, Uros Bizjak
You are right. If x86-64's large code model imposed a restriction that
an input code section cannot be too large, `call callee` would be
usable. But I see that there is no intention to impose such a
restriction.

Users may just use the medium code model instead, and when `call callee`
causes a problem, linkers can implement range extension thunks.

> > Hopefully not for internal code references:)
> >
> > (I just recalled another JIT project, https://git.ageinghacker.net/jitter
> > (used by GNU poke), that generates R_X86_64_PLTOFF64. I haven't
> > investigated whether its code sequences can be improved.)
>
> Thanks for the reference, it looks interesting :)
>
>
> Ciao,
> Michael.



--
宋方睿

Fangrui Song

13 Oct 2023, 15:57:09
to 'Jan Beulich' via X86-64 System V Application Binary Interface, Michael Matz, Florian Weimer, Jan Beulich, Jan Hubicka, H.J. Lu, Uros Bizjak
Hi, I'd like to revisit this matter and hope we can proceed with the two patches for x86-64-ABI and GCC:

* https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/42 "Clarify large data section usage in medium and large code models"
  + The wording has adopted Michael's suggestion at https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU/m/vODgg546AgAJ
* https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625993.html "i386: Allow -mlarge-data-threshold with -mcmodel=large"

To rephrase the problem for those who don't want to read through all the comments:

The essential point is that all object files contributing to a single final binary don't necessarily have to use the same code model (as in my response to Florian's question).
Otherwise, the medium and large code models become impractical since there are almost always some object files that use the small code model (e.g. prebuilt crt*.o files and assembly files that don't consider medium/large).

Mixing small and medium/large code models does work; it has been functioning for quite some time (perhaps 10+ years) with GNU ld's internal linker scripts and more recently with LLD 17 (https://reviews.llvm.org/D150510).

What I aim to address is: when switching from -mcmodel=medium -mlarge-data-threshold=N to -mcmodel=large -mlarge-data-threshold=N, the compatibility may break.
(large .data from -mcmodel=large object files may render .data from -mcmodel=small object files far away, depending on the order we concatenate .data sections)
This is because GCC -mcmodel=large doesn't currently utilize .lrodata/.ldata/.lbss, which seems counterintuitive. Therefore I posted the two patches mentioned at the start of this message.
(I have more notes about the current status at https://maskray.me/blog/2023-05-14-relocation-overflow-and-code-models#x86-64-linker-requirement )

I've reached out to Jan Beulich, and he mentioned that my GCC patch has addressed all his comments. Uros Bizjak, GCC's i386 port maintainer, has indicated that he's awaiting a resolution from x86-64-ABI. The wording of https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/42 has adopted Michael's suggestion.
--
宋方睿

Michael Matz

16 Oct 2023, 09:29:10
to Fangrui Song, 'Jan Beulich' via X86-64 System V Application Binary Interface, Florian Weimer, Jan Beulich, Jan Hubicka, H.J. Lu, Uros Bizjak
Heyho,

On Fri, 13 Oct 2023, Fangrui Song wrote:

> I've reached out to Jan Beulich, and he mentioned that my GCC patch has
> addressed all his comments. Uros Bizjak, GCC's i386 port maintainer, has
> indicated that he's awaiting a resolution from x86-64-ABI. The wording of
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/42 has adopted
> Michael's suggestion.

This merely needed a ping for approval, rebase and merge. With the GCC
patches (and their pings) I had it in my head that the revised psABI
language was already in. Sorry about that. Now it is. So if that was
the thing holding back the GCC patch: weee! ;-)


Ciao,
Michael.

Fangrui Song

16 Oct 2023, 13:03:11
to Uros Bizjak, Michael Matz, 'Jan Beulich' via X86-64 System V Application Binary Interface, Florian Weimer, Jan Beulich, Jan Hubicka, H.J. Lu
On Mon, Oct 16, 2023 at 6:38 AM Uros Bizjak <ubi...@gmail.com> wrote:
> Yes, this was the only thing holding back the GCC patch. Fangrui,
> please repost the final version of the patch to gcc-patches@ mailing
> list, so I can formally approve it.
>
> Thanks,
> Uros.

Hi Uros and Michael,

https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625993.html
("[PATCH v4] i386: Allow -mlarge-data-threshold with -mcmodel=large")
is the final version.
v2 and v3 refined the commit message, while v4 improved the test
directives.

Thanks to everyone for their assistance in improving the documentation. :)


--
宋方睿