glibc regression on alpha with 2.34+

John Paul Adrian Glaubitz

Nov 12, 2022, 6:50:03 PM
Hello!

I just noticed that there is a regression in glibc on alpha with version 2.34 or later.

Looking at the build logs for Debian's 2.34-8 [1], 2.35-4 [2] and 2.36-4 [3], it's obvious
there is something wrong given the many "Segmentation Fault" errors.

I had hoped I could fix this issue by passing "--disable-default-pie" like we already did
on sparc64, but it seems it's not the same bug [4]. At least, this particular workaround
does not help.

I think the best approach would be bisecting this from 2.33 to 2.34 using the glibc sources
from git. I assume building glibc and running the testsuite should be enough for bisecting.

Adrian

> [1] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=alpha&ver=2.34-8&stamp=1662963628&raw=0
> [2] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=alpha&ver=2.35-4&stamp=1666729919&raw=0
> [3] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=alpha&ver=2.36-4&stamp=1667607306&raw=0
> [4] https://sourceware.org/bugzilla/show_bug.cgi?id=29575

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

Michael Cree

Nov 20, 2022, 4:50:03 AM
On Sun, Nov 13, 2022 at 12:45:17AM +0100, John Paul Adrian Glaubitz wrote:
> I just noticed that there is a regression in glibc on alpha with version 2.34 or later.
>
> Looking at the build logs for Debian's 2.34-8 [1], 2.35-4 [2] and 2.36-4 [3], it's obvious
> there is something wrong given the many "Segmentation Fault" errors.
>
> I had hoped I could fix this issue by passing "--disable-default-pie" like we already did
> on sparc64, but it seems it's not the same bug [4]. At least, this particular workaround
> does not help.

Interestingly, the vast majority of the failing tests pass if one builds
with a compiler that raises the baseline to EV67. This has been
proposed a number of times in the past for the Debian distribution.
I think it is time we did it. One of our last EV56 users has recently
bowed out due to hardware failure and I am only running EV67 hardware.

Cheers
Michael.

Frank Scheiner

Nov 20, 2022, 7:50:03 AM
I still have the following pre EV67 machines available and in working order:

* AXPpci 33 (LCA4)
* AlphaStation 200 (EV4) / 255 (EV45) / 500 (EV56)
* PWS 500au (EV56)
* AlphaServer 800 (EV56)

...and can provide testing on them. All of them eventually ran Debian
GNU/Linux Sid with up to Linux 5.x.x IIRC and I will also try them with
6.0.x. And I believe the majority of still existing, still working Alpha
systems are pre EV67 systems.

Given the fact that EV6[...] and EV7[...] based systems are nowadays
very expensive for hobby use (I don't want to say unobtainium), I expect
that dropping support for pre EV67 will kill off most of the user base
for Debian on Alpha (and also Gentoo I assume).

Phrasing it differently:

Who needs a port that only runs on the buildds and a handful of
(hobbyist) machines around the world (like ppc64le ;-))?

My two cents.

All the best,
Frank

Kirsten Bromilow

Nov 20, 2022, 8:00:03 AM
Please remove my email from your mailing list!

Sent from my iPhone

> On 20 Nov 2022, at 12:48, Frank Scheiner <frank.s...@web.de> wrote:

Michael Cree

Dec 12, 2022, 2:30:02 AM
On Sun, Nov 20, 2022 at 01:47:59PM +0100, Frank Scheiner wrote:
> On 20.11.22 10:03, Michael Cree wrote:
> > On Sun, Nov 13, 2022 at 12:45:17AM +0100, John Paul Adrian Glaubitz wrote:
> > > I just noticed that there is a regression in glibc on alpha with version 2.34 or later.
> > >
> > Interestingly, the vast majority of the failing tests pass if one builds
> > with a compiler that raises the baseline to EV67. This has been
> > proposed a number of times in the past for the Debian distribution.
> > I think it is time we did it. One of our last EV56 users has recently
> > bowed out due to hardware failure and I am only running EV67 hardware.
>
> I still have the following pre EV67 machines available and in working order:
>
> * AXPpci 33 (LCA4)
> * AlphaStation 200 (EV4) / 255 (EV45) / 500 (EV56)
> * PWS 500au (EV56)
> * AlphaServer 800 (EV56)
>
> ...and can provide testing on them. All of them eventually ran Debian

Can you fix the ev4 based bugs in glibc? If not, I am not interested.

With the usrmerge uploads now depending on a recent libc version Alpha
is now dead in the water. Nothing can be built. Thus we have to fix
glibc to continue building.

I am not prepared to fix ev4 issues so if no one else is prepared to
fix them then without an architecture baseline raise this is the end
of Alpha on Debian Ports.

Given that ev67 fixed the glibc test suite failures, and that it is
most likely BWX that is the issue, I am going to work towards
upping the baseline Alpha architecture to include BWX, i.e. ev56, in
Debian ports. It will probably take me a few days' work to rebuild
gcc and glibc with the baseline raised and fix any remaining issues
in glibc.

So if people want the baseline to remain at ev4 you have a few days
to fix all the test suite failures in glibc before I upload the raise
of the architecture baseline.

Cheers
Michael.

John Paul Adrian Glaubitz

Dec 12, 2022, 3:10:02 AM
Hi Frank!

> On Dec 12, 2022, at 8:57 AM, Frank Scheiner <frank.s...@web.de> wrote:
>
> I'm not sure I fully understand the issue here:
>
> See, glibc used to work for alpha up until 2.33 as I read. Then a change
> broke it for alpha with 2.34. Does the respective glibc maintainer for
> alpha (Richard Henderson according to [1]) really have no interest in
> fixing it?

Any chance you can bisect the issue?

FWIW, it’s not been reported upstream yet.

Adrian

Frank Scheiner

Dec 12, 2022, 3:10:02 AM
Dear Michael,

On 12.12.22 08:27, Michael Cree wrote:
> On Sun, Nov 20, 2022 at 01:47:59PM +0100, Frank Scheiner wrote:
>> On 20.11.22 10:03, Michael Cree wrote:
>>> On Sun, Nov 13, 2022 at 12:45:17AM +0100, John Paul Adrian Glaubitz wrote:
>>>> I just noticed that there is a regression in glibc on alpha with version 2.34 or later.
>>>>
>>> Interestingly, the vast majority of the failing tests pass if one builds
>>> with a compiler that raises the baseline to EV67. This has been
>>> proposed a number of times in the past for the Debian distribution.
>>> I think it is time we did it. One of our last EV56 users has recently
>>> bowed out due to hardware failure and I am only running EV67 hardware.
>>
>> I still have the following pre EV67 machines available and in working order:
>>
>> * AXPpci 33 (LCA4)
>> * AlphaStation 200 (EV4) / 255 (EV45) / 500 (EV56)
>> * PWS 500au (EV56)
>> * AlphaServer 800 (EV56)
>>
>> ...and can provide testing on them. All of them eventually ran Debian
>
> Can you fix the ev4 based bugs in glibc? If not, I am not interested.

I already told you what I can provide.

> With the usrmerge uploads now depending on a recent libc version Alpha
> is now dead in the water. Nothing can be built. Thus we have to fix
> glibc to continue building.
>
> I am not prepared to fix ev4 issues so if no one else is prepared to
> fix them then without an architecture baseline raise this is the end
> of Alpha on Debian Ports.

I'm not sure I fully understand the issue here:

See, glibc used to work for alpha up until 2.33 as I read. Then a change
broke it for alpha with 2.34. Does the respective glibc maintainer for
alpha (Richard Henderson according to [1]) really have no interest in
fixing it?

[1]: https://sourceware.org/glibc/wiki/MAINTAINERS#Machine_maintainers

Cheers,
Frank

Michael Cree

Dec 12, 2022, 3:20:02 AM
RTH hasn't had working Alpha hardware for quite some time.

One of the glibc maintainers did have access to one of my Alphas
until last year but unfortunately the hosting site is no longer
prepared to host it so I can no longer make that Alpha available
to developers.

So with that glibc Alpha support is rotting fast.

Many of the other ports (e.g. armel, armhf, i386) have had
architecture baseline increases in the last few years, and none
support hardware anywhere near as old as alpha ev4.

I am no longer personally prepared to support Alpha unless
the architecture baseline increase is done. I have no
ev4/ev45 hardware and no longer have any interest in supporting
them.

Regards,
Michael.

Michael Cree

Dec 12, 2022, 3:30:02 AM
I am not interested in supporting old Alphas without BWX anymore.
I am drawing the line. Either someone steps up to support non-BWX
Alpha and promptly fixes glibc or the architecture baseline is
increased to include BWX (thereby fixing most of the glibc issues).
Without either of those happening I give up being an Alpha porter
and switch off my Alpha buildd permanently. I have many other
interesting projects I could be working on!

Regards,
Michael.

Frank Scheiner

Dec 12, 2022, 4:40:03 AM


On 12.12.22 09:17, Michael Cree wrote:
> On Mon, Dec 12, 2022 at 08:56:40AM +0100, Frank Scheiner wrote:
>> Dear Michael,
>>
>> On 12.12.22 08:27, Michael Cree wrote:
>>> With the usrmerge uploads now depending on a recent libc version Alpha
>>> is now dead in the water. Nothing can be built. Thus we have to fix
>>> glibc to continue building.
>>>
>>> I am not prepared to fix ev4 issues so if no one else is prepared to
>>> fix them then without an architecture baseline raise this is the end
>>> of Alpha on Debian Ports.
>>
>> I'm not sure I fully understand the issue here:
>>
>> See, glibc used to work for alpha up until 2.33 as I read. Then a change
>> broke it for alpha with 2.34. Does the respective glibc maintainer for
>> alpha (Richard Henderson according to [1]) really have no interest in
>> fixing it?
>
> RTH hasn't had working Alpha hardware for quite some time.
>
> One of the glibc maintainers did have access to one of my Alphas
> until last year but unfortunately the hosting site is no longer
> prepared to host it so I can no longer make that Alpha available
> to developers.

Thanks for clarifying.

> So with that glibc Alpha support is rotting fast.
>
> Many of the other ports (e.g. armel, armhf, i386) have had
> architecture baseline increases in the last few years, and none
> support hardware anywhere near as old as alpha ev4.
>
> I am no longer personally prepared to support Alpha unless
> the architecture baseline increase is done. I have no
> ev4/ev45 hardware and no longer have any interest in supporting
> them.

Yeah, I figured that already from your first email today.

In your email to Adrian you write about BWX capable processors as new
baseline. So EV56 instead of EV67?

Cheers,
Frank

John Paul Adrian Glaubitz

Dec 12, 2022, 6:30:02 AM


> On Dec 12, 2022, at 9:27 AM, Michael Cree <mc...@orcon.net.nz> wrote:
>
> I am not interested in supporting old Alphas without BWX anymore.
> I am drawing the line. Either someone steps up to support non-BWX
> Alpha and promptly fixes glibc or the architecture baseline is
> increased to include BWX (thereby fixing most of the glibc issues).
> Without either of those happening I give up being an Alpha porter
> and switch off my Alpha buildd permanently. I have many other
> interesting projects I could be working on!

As a compromise, how about we fix the bug, create a final set of CD images for old Alphas, then raise the baseline after having verified it does not break QEMU (both -user and -system)?

Adrian

Michael Cree

Dec 12, 2022, 1:20:04 PM
You fix the bug then. I'm not interested so there is no "we" in this.

Cheers,
Michael.

John Paul Adrian Glaubitz

Dec 12, 2022, 1:30:02 PM


> On Dec 12, 2022, at 7:17 PM, Michael Cree <mc...@orcon.net.nz> wrote:

Please don’t be so negative.

We should be able to have a discussion on this topic without such sentiments.

There are valid arguments for both sides, so it’s not helpful to lead a discussion like this.

Adrian

Michael Cree

Dec 12, 2022, 2:50:03 PM
I had wanted to do this years ago. Every time I have raised it someone
has protested. The discussion has been had again and again and I am no
longer interested in it. The result of the discussion is that it is I
who ends up fixing the problems arising from supporting EV4/EV45.

The bottom line is that I am not prepared to support EV4/EV45 anymore.
This is not being negative. This is me being honest about the fact
that I have too limited time and many other projects that I want to
work on.

Either the arch baseline is raised to something that is easier to
maintain (which, frankly, I think is essential if the Alpha port is to
survive any longer), someone else steps up to fix the brokenness that
arises from non-atomic multi-cpu-instruction 8-bit and 16-bit memory
accesses, or I bail out of maintaining Debian-Ports Alpha.

Cheers,
Michael.

John Paul Adrian Glaubitz

Dec 13, 2022, 12:20:02 AM
Hello!

On 12/12/22 20:45, Michael Cree wrote:
> Either the arch baseline is raised to something that is easier to
> maintain (which, frankly, I think is essential if the Alpha port is to
> survive any longer), someone else steps up to fix the brokenness that
> arises from non-atomic multi-cpu-instruction 8-bit and 16-bit memory
> accesses, or I bail out of maintaining Debian-Ports Alpha.

So what baseline do we want? Would EV56 be sufficient? Because that would
still work with my AlphaStation 433au and XP1000 and gets us BWX.

I don't want to use something like EV67 as I think that would limit the
usable hardware too much. I guess I can live with dropping EV4 since NetBSD
and Gentoo would still run on these.

I am still interested in fixing the glibc bug and will work on bisecting it.

If EV56 is the baseline we can agree on, please go ahead and rebuild glibc
and gcc using this baseline.

Thanks,
Adrian

Michael Cree

Dec 13, 2022, 2:50:03 AM
On Tue, Dec 13, 2022 at 06:15:16AM +0100, John Paul Adrian Glaubitz wrote:
> Hello!
>
> On 12/12/22 20:45, Michael Cree wrote:
> > Either the arch baseline is raised to something that is easier to
> > maintain (which, frankly, I think is essential if the Alpha port is to
> > survive any longer), someone else steps up to fix the brokenness that
> > arises from non-atomic multi-cpu-instruction 8-bit and 16-bit memory
> > accesses, or I bail out of maintaining Debian-Ports Alpha.
>
> So what baseline do we want? Would EV56 be sufficient? Because that would
> still work with my AlphaStation 433au and XP1000 and gets us BWX.

Yes. The first extension added is the byte-word extension which came
in with EV56. That provides CPU instructions for byte and word (16-bit)
memory accesses. That is the most important one: possibly a third of
the bugs in the repository stem from non-atomic byte and word
accesses. The kernel developers have expressed a view that they would
like to assume on all arches that byte and word memory accesses are
atomic, and the only architecture holding them back from that
assumption is Alpha without BWX. There is an old open bug on
gcc related to the non-atomic memory accesses of old Alphas and that
one basically cannot be fixed.
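
To illustrate why byte stores are a problem without BWX: a pre-EV56 Alpha has to perform a byte store as a load of the enclosing quadword, a mask-and-insert, and a store of the whole quadword. The sketch below only emulates that sequence with shell arithmetic. The `store_byte` helper and the chosen interleaving are purely illustrative assumptions, not actual Alpha code. It shows how two CPUs writing neighbouring bytes can lose one of the updates.

```sh
#!/bin/sh
# store_byte WORD INDEX VALUE
# Emulate a byte store done as load / mask / insert / store on a 64-bit word.
# Prints the new value of the whole word.
store_byte() {
    word=$1
    index=$2
    value=$3
    shift_bits=$(( index * 8 ))
    mask=$(( 0xff << shift_bits ))
    echo $(( (word & ~mask) | ((value & 0xff) << shift_bits) ))
}

memory=0

# Both CPUs load the same quadword before either one stores back.
cpu0_loaded=$memory
cpu1_loaded=$memory

# CPU 0 wants byte 0 = 0x11, CPU 1 wants byte 1 = 0x22.
cpu0_result=$(store_byte "$cpu0_loaded" 0 17)
cpu1_result=$(store_byte "$cpu1_loaded" 1 34)

# The stores land one after the other; the second overwrites the first.
memory=$cpu0_result
memory=$cpu1_result

printf 'final quadword: 0x%x\n' "$memory"   # prints 0x2200, byte 0 is lost
```

With BWX each CPU would store only its own byte, and the final quadword would be 0x2211.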

If we went to BWX (i.e. EV56) then, as you say, the personal
workstations (e.g. PWS433au and PWS500au), which a lot of Alpha users
have, and AlphaStations such as the 433au will still be supported.

> I don't want to use something like EV67 as I think that would limit the
> usable hardware too much.

Yes, that's the problem with going fully to EV67. The CPU extensions we
would get are MVI (motion video instructions) that came in with
PCA56, CIX (count integer instructions, such as counting
trailing zeros) that came in with EV67 and FIX (floating point
extensions, primarily for efficient conversion between float and
integer, and a sqrt instruction) with EV6, but these are nowhere
near as important as BWX in terms of reducing the bug-fixing workload
of maintaining the port.
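
The extension sets described in this message can be summarised as a small lookup. This is only a sketch: the function names `alpha_extensions` and `has_bwx` are hypothetical, the core names are the ones used in this thread (lowercased), and the table reflects the description here rather than an authoritative list.

```sh
#!/bin/sh
# alpha_extensions CPU
# Print the ISA extensions implemented by a given Alpha core.
alpha_extensions() {
    case "$1" in
        lca4|ev4|ev45|ev5) echo "" ;;                 # no extensions
        ev56)              echo "BWX" ;;              # byte/word loads and stores
        pca56)             echo "BWX MVI" ;;          # adds motion video
        ev6)               echo "BWX MVI FIX" ;;      # adds float/int moves, sqrt
        ev67)              echo "BWX MVI FIX CIX" ;;  # adds count instructions
        *)                 echo "unknown" ;;
    esac
}

# has_bwx CPU
# Succeed if the core would still be supported with a BWX (EV56) baseline.
has_bwx() {
    case " $(alpha_extensions "$1") " in
        *" BWX "*) return 0 ;;
        *)         return 1 ;;
    esac
}
```

For example, `has_bwx ev56` succeeds (PWS 500au, AlphaServer 800), while `has_bwx ev45` fails (AlphaStation 255).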

> I guess I can live with dropping EV4 since NetBSD
> and Gentoo would still run on these.

Gentoo has the advantage (and disadvantage) of compiling from
source so one can optimise their own installation for their
hardware.

> I am still interested in fixing the glibc bug and will work on bisecting it.
>
> If EV56 is the baseline we can agree on, please go ahead and rebuild glibc
> and gcc using this baseline.

I am currently building gcc-12 to default to EV56/BWX. In the test
suite now so probably won't be finished till tomorrow. Then I will try
building latest glibc (2.36-6) with that gcc. I suspect there will
still be a couple of test suite failures so there will probably be
a further delay before I have it ready to upload to the repository. In
any case I will give fair warning before I do.

Cheers,
Michael.

John Paul Adrian Glaubitz

Dec 13, 2022, 4:00:02 AM
Hi!

On 12/13/22 08:44, Michael Cree wrote:
>> So what baseline do we want? Would EV56 be sufficient? Because that would
>> still work with my AlphaStation 433au and XP1000 and gets us BWX.
>
> Yes. The first extension added is the byte-word extension which came
> in with EV56. That provides CPU instructions for byte and word (16-bit)
> memory accesses. That is the most important one: possibly a third of
> the bugs in the repository stem from non-atomic byte and word
> accesses. The kernel developers have expressed a view that they would
> like to assume on all arches that byte and word memory accesses are
> atomic, and the only architecture holding them back from that
> assumption is Alpha without BWX. There is an old open bug on
> gcc related to the non-atomic memory accesses of old Alphas and that
> one basically cannot be fixed.
>
> If we went to BWX (i.e. EV56) then as you say that means the personal
> workstations (e.g. PWS433au and PWS500au), which a lot of Alpha users
> have and AlphaStations such as the 433au will still be supported.

Thanks for the confirmation and clarification!

>> I don't want to use something like EV67 as I think that would limit the
>> usable hardware too much.
>
> Yes, that's the problem going fully to EV67. The CPU extensions we
> would get are MVI (motion video instructions) that came in with
> PCA56, CIX (count integer instructions with the like of counting
> trailing zeros) that came in with EV67 and FIX (floating point
> extensions primarily for efficient conversion between float and
> integer and a sqrt instruction) with EV6, but these are nowhere
> near as important as BWX in terms of reducing bug fixing workload
> in maintaining the port.

OK, so EV56 sounds like a very good compromise. I guess we can still
keep the -ev67 glibc package for people with these CPUs.

>> I guess I can live with dropping EV4 since NetBSD
>> and Gentoo would still run on these.
>
> Gentoo has the advantage (and disadvantage) of compiling from
> source so one can optimise their own installation for their
> hardware.

Yes, of course.

>> I am still interested in fixing the glibc bug and will work on bisecting it.
>>
>> If EV56 is the baseline we can agree on, please go ahead and rebuild glibc
>> and gcc using this baseline.
>
> I am currently building gcc-12 to default to EV56/BWX. In the test
> suite now so probably won't be finished till tomorrow. Then I will try
> building latest glibc (2.36-6) with that gcc. I suspect there will
> still be a couple of test suite failures so there will probably be
> a further delay before I have it ready to upload to the repository. In
> any case I will give fair warning before I do.

Feel free to open bugs against glibc and gcc to request the raise of the
baseline.

Frank Scheiner

Dec 13, 2022, 4:40:04 AM
Hi guys,

On 13.12.22 06:15, John Paul Adrian Glaubitz wrote:
> [...]
> I am still interested in fixing the glibc bug and will work on bisecting
> it.

Yesterday I gave that a try on a DS15, but it already took hours to
get glibc 2.33 compiled.

During this compilation I got 4 segfaults from the compiler (gcc-12) and
a "gcc: internal compiler error: Aborted signal terminated program cc1".
If you are interested in the details, I have all the error messages
available.

This went on during `make test` with another segfault and this one here
after more than 2 hours of processing:

```
root@ds15:/srv/storage/build/glibc-2.33# time make test
[...]
g++ tst-thread_local1.cc -c -I/srv/storage/build/glibc-2.33/ -g -O2
-Wall -Wwrite-strings -Wundef -fmerge-all-constants -frounding-math
-fno-stack-protector -mlong-double-128 -mieee -mfp-rounding-mode=d
-std=gnu++11 -I../include -I/srv/storage/build/glibc-2.33/nptl
-I/srv/storage/build/glibc-2.33 -I../sysdeps/unix/sysv/linux/alpha/fpu
-I../sysdeps/alpha/fpu -I../sysdeps/unix/sysv/linux/alpha
-I../sysdeps/alpha/nptl -I../sysdeps/unix/sysv/linux/wordsize-64
-I../sysdeps/ieee754/ldbl-64-128 -I../sysdeps/ieee754/ldbl-opt
-I../sysdeps/unix/sysv/linux/include -I../sysdeps/unix/sysv/linux
-I../sysdeps/nptl -I../sysdeps/pthread -I../sysdeps/gnu
-I../sysdeps/unix/inet -I../sysdeps/unix/sysv -I../sysdeps/unix/alpha
-I../sysdeps/unix -I../sysdeps/posix -I../sysdeps/alpha
-I../sysdeps/wordsize-64 -I../sysdeps/ieee754/ldbl-128
-I../sysdeps/ieee754/dbl-64 -I../sysdeps/ieee754/flt-32
-I../sysdeps/ieee754 -I../sysdeps/generic -I.. -I../libio -I.
-D_LIBC_REENTRANT -include /srv/storage/build/glibc-2.33/libc-modules.h
-DMODULE_NAME=testsuite -include ../include/libc-symbols.h
-DTOP_NAMESPACE=glibc -o
/srv/storage/build/glibc-2.33/nptl/tst-thread_local1.o -MD -MP -MF
/srv/storage/build/glibc-2.33/nptl/tst-thread_local1.o.dt -MT
/srv/storage/build/glibc-2.33/nptl/tst-thread_local1.o
tst-thread_local1.cc: In function ‘int do_test()’:
tst-thread_local1.cc:177:5: error: variable ‘std::array<std::pair<const
char*, std::function<void(void* (*)(void*))> >, 2> do_thread_X’ has
initializer but incomplete type
177 | do_thread_X
| ^~~~~~~~~~~
tst-thread_local1.cc: At global scope:
tst-thread_local1.cc:133:1: warning: ‘void* thread_with_access(void*)’
defined but not used [-Wunused-function]
133 | thread_with_access (void *)
| ^~~~~~~~~~~~~~~~~~
tst-thread_local1.cc:127:1: warning: ‘void*
thread_without_access(void*)’ defined but not used [-Wunused-function]
127 | thread_without_access (void *)
| ^~~~~~~~~~~~~~~~~~~~~
make[2]: *** [../o-iterator.mk:9:
/srv/storage/build/glibc-2.33/nptl/tst-thread_local1.o] Error 1
make[2]: Leaving directory '/srv/storage/glibc/nptl'
make[1]: *** [Makefile:479: nptl/tests] Error 2
make[1]: Leaving directory '/srv/storage/glibc'
make: *** [Makefile:9: check] Error 2

real 129m3.940s
user 104m25.441s
sys 11m36.611s
```

...after which I stopped. I did use `--disable-werror` during the
configure step, but maybe this is not enough. OTOH it's only a warning,
so why does it err? Ah, I see, `-Wundef` is set. I'll have a look at what
the buildds use for the configure step.

Today I'll also give it another try, but with gcc-11 this time - just in
case something is wrong with gcc-12 - but frankly I don't think this
goes anywhere on the DS15:

Even the DS25 I have available here would only speed up the
compilation, and it already draws more than twice the power of the DS15,
so nothing is gained. My ES45s are in cold storage and I don't dare to
start them up in such a low-temperature environment.

Summarizing it, I'd be grateful if someone could do the bisecting on one
of the buildds or developer machines.

Cheers,
Frank


Frank Scheiner

Dec 13, 2022, 5:02:46 AM
Hi again,

just wanted to clarify something I saw in the build logs from the
buildds - imago to be specific.

On 13.12.22 10:33, Frank Scheiner wrote:
> [...]
> Summarizing it, I'd be grateful if someone could do the bisecting on one
> of the buildds or developer machines.

According to the logs on [1] and [2], the machine seems to run w/o Bcache:

```
uname -a
Linux imago 5.8.18-titan-p1+ #63 SMP Sat Jan 8 16:18:01 NZDT 2022 alpha
GNU/Linux

if [ -f /proc/cpuinfo ] ; then cat /proc/cpuinfo ; fi
cpu : Alpha
cpu model : EV68CB
cpu variation : 7
cpu revision : 0
cpu serial number : JA44900165
system type : Titan
system variation : Privateer
system revision : 0
system serial number : AY50901023
cycle frequency [Hz] : 1250000000
timer frequency [Hz] : 1024.00
page size [bytes] : 8192
phys. address bits : 44
max. addr. space # : 255
BogoMIPS : 2480.92
kernel unaligned acc : 0 (pc=0,va=0)
user unaligned acc : 338897612 (pc=200000e100c,va=1200e71cc)
platform string : AlphaServer ES45 Model 1B
cpus detected : 3
cpus active : 3
cpu active mask : 0000000000000007
L1 Icache : 64K, 2-way, 64b line
L1 Dcache : 64K, 2-way, 64b line
L2 cache : n/a
L3 cache : n/a
```

[1]:
https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=alpha&ver=2.34-8&stamp=1662963628&raw=0

[2]:
https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=alpha&ver=2.36-4&stamp=1667607306&raw=0

See "n/a" for the L2 cache line? Is that (1) an error in the kernel not
being able to detect the 16 MiB Bcache of the 1250 MHz processor modules
or (2) is this machine really running w/o active Bcache?
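
To check the same field on another machine, the `L2 cache` line can be pulled out of `/proc/cpuinfo` directly. A minimal sketch follows; the helper name `cpuinfo_field` is made up for this example, and it only reports what the kernel prints, so it cannot by itself tell case (1) from case (2).

```sh
#!/bin/sh
# cpuinfo_field NAME
# Read /proc/cpuinfo-style "key : value" lines from stdin and print the
# value of the first field whose key matches NAME exactly.
cpuinfo_field() {
    awk -v name="$1" '
    {
        sep = index($0, ":")
        if (sep == 0) next                 # not a "key : value" line
        key = substr($0, 1, sep - 1)
        sub(/[ \t]+$/, "", key)            # trim padding after the key
        if (key == name) {
            value = substr($0, sep + 1)
            sub(/^[ \t]+/, "", value)      # trim padding before the value
            print value
            exit
        }
    }'
}
```

Running `cpuinfo_field "L2 cache" < /proc/cpuinfo` on a machine with the output shown above would print `n/a`.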

If (2) I don't know how much effect this has on compilation, but I
recently had a similar issue with the DS15 - i.e. Bcache not activated -
and could see a noticeable difference in performance for e.g. `7za b`
and `openssl speed -elapsed` when compared to runs with active Bcache later.

Though I found a solution for the DS15, I didn't find anything related
yet for an ES45.

But maybe this is just a kernel bug and the Bcache is active despite
that message. You can check in SRM with `show config | more`:

```
>>>show config | more
hp AlphaStation DS15
[...]
Processors
CPU 0 Alpha EV68CB pass 4.0 1000 MHz 2MB Bcache
[...]
```

If you see "0MB Bcache" something is wrong.

Alternatively, the RMC should also be able to tell you the state of the
processors using the `cpu` command:

```
RMC>cpu
CPU Powerup Status Translation
EV6 BIST: PASS
CPU ID: 0 (primary)
STR Test: PASS
CSC Test: PASS
PCHIP0 Test: PASS
DIMx Test: PASS
TIG Bus Test: PASS
DPR Test: STARTED - PASS
CPU Speed Test: PASS - 1000MHz
SROM Power-On Time: 12-13-42 09:45:36
SROM Power-On Error: No error
System Bus Speed: 125MHz
Last Synch State Test: PASS
Bcache Size: 2MB
```

Though I'm not sure how this will look for multiple processors.

Cheers,
Frank

John Paul Adrian Glaubitz

Dec 13, 2022, 5:02:46 AM
Hello!

On 12/13/22 10:33, Frank Scheiner wrote:
> Hi guys,
>
> On 13.12.22 06:15, John Paul Adrian Glaubitz wrote:
>> [...]
>> I am still interested in fixing the glibc bug and will work on bisecting it.
>
> Yesterday I gave that a try on a DS15, but it already took hours to get glibc 2.33 compiled.
>
> During this compilation I got 4 segfaults from the compiler (gcc-12) and a "gcc: internal compiler
> error: Aborted signal terminated program cc1". If you are interested in the details, I have all the
> error messages available.

Is that glibc from upstream or the Debian package?

Also, is the machine's memory known to be good? Please make sure to test it.
You could cross-compile glibc. That's most likely what I am going to do.

Frank Scheiner

Dec 13, 2022, 5:10:03 AM
Hi,

On 13.12.22 10:52, John Paul Adrian Glaubitz wrote:
> [...]
>> During this compilation I got 4 segfaults from the compiler (gcc-12)
>> and a "gcc: internal compiler
>> error: Aborted signal terminated program cc1". If you are interested
>> in the details, I have all the
>> error messages available.
>
> Is that glibc from upstream or the Debian package?

Upstream, I followed [1].

[1]: https://sourceware.org/glibc/wiki/Testing/Builds

> Also, is the machine's memory known to be good? Please make sure to test
> it.

I don't know for sure - you never know for software running on the same
hardware that's gonna be tested - but the SROM testing didn't show any
problems at least:

```
............
SROM V1.0-0 CPU # 00 @ 1000 MHz
SROM program starting
Reloading SROM
............
SROM V1.0-1 CPU # 00 @ 1000 MHz
System Bus Speed @ 0125 MHz
SROM program starting
Bcache data tests in progress
Bcache address test in progress
CPU parity and ECC detection in progress
Bcache ECC data tests in progress
Bcache TAG lines tests in progress
Memory sizing in progress
Memory configuration in progress
Testing AAR2
Memory data test in progress
Memory address test in progress
Memory pattern test in progress
Testing AAR0
Memory data test in progress
Memory address test in progress
Memory pattern test in progress
Memory initialization
............Loading console
Code execution complete (transfer control)
```

I can try to increase the depth of testing - if possible - but I'd
expect some ECC related messaging for any failures happening and there
was none on the system console.

> [...]
>> Summarizing it, I'd be grateful if someone could do the bisecting on
>> one of the buildds or developer machines.
>
> You could cross-compile glibc. That's most likely what I am going to do.

Can the testing happen on a different arch, too?

Cheers,
Frank

John Paul Adrian Glaubitz

Dec 13, 2022, 11:30:02 AM
Hi!

On 11/13/22 00:45, John Paul Adrian Glaubitz wrote:
> I just noticed that there is a regression in glibc on alpha with version 2.34 or later.
>
> Looking at the build logs for Debian's 2.34-8 [1], 2.35-4 [2] and 2.36-4 [3], it's obvious
> there is something wrong given the many "Segmentation Fault" errors.

This regression was introduced by the following commit:

6c57d320484988e87e446e2e60ce42816bf51d53 is the first bad commit
commit 6c57d320484988e87e446e2e60ce42816bf51d53
Author: H.J. Lu <hjl....@gmail.com>
Date: Mon Feb 1 11:00:38 2021 -0800

sysconf: Add _SC_MINSIGSTKSZ/_SC_SIGSTKSZ [BZ #20305]

Add _SC_MINSIGSTKSZ for the minimum signal stack size derived from
AT_MINSIGSTKSZ, which is the minimum number of bytes of free stack
space required in order to gurantee successful, non-nested handling
of a single signal whose handler is an empty function, and _SC_SIGSTKSZ
which is the suggested minimum number of bytes of stack space required
for a signal stack.

FWIW, it does not seem to affect alpha systems with a higher baseline such as EV67.

John Paul Adrian Glaubitz

Dec 13, 2022, 11:30:02 AM
Hi!

On 12/13/22 10:52, John Paul Adrian Glaubitz wrote:
> You could cross-compile glibc. That's most likely what I am going to do.

For the record, here's how I am doing it.

1. Create an alpha chroot on an x86_64 host system using debootstrap
on a system with qemu-user-static installed.

# debootstrap --no-check-gpg --arch=alpha unstable /srv/sid-alpha-sbuild http://ftp.ports.debian.org/debian-ports

(See below for schroot configuration).

2. Install cross-compiler for alpha as well as build dependencies for glibc:

# apt install g++-alpha-linux-gnu
# apt build-dep --arch-only glibc

3. Cross-compile glibc on x86_64 on host system:

$ cd /path/to/glibc/
$ mkdir build && cd build
$ ../configure --host=alpha-linux-gnu --disable-werror --prefix=/usr --disable-sanity-checks && make -j8

4. Enter alpha schroot and run the following command from the build directory:

(sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$ LD_LIBRARY_PATH=/home/glaubitz/glibc-git/build /bin/bash

If the bug is present, this command will segfault:

Segmentation fault

Otherwise it will just spawn another bash which can be exited with "exit":

(sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$ LD_LIBRARY_PATH=/home/glaubitz/glibc-git/build /bin/bash
(sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$
exit
(sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$

The trick is to share the glibc source directory into the schroot. This is achieved by the following
two configuration files:

root@z6:~> cat /etc/schroot/chroot.d/sid-alpha-sbuild
[sid-alpha-sbuild]
description=Debian sid chroot for alpha
type=directory
directory=/local_scratch/sid-alpha-sbuild
profile=sbuild
#aliases=sid
groups=root,sbuild,glaubitz,buildd
root-groups=root,sbuild,glaubitz,buildd
root@z6:~>

root@z6:~> cat /etc/schroot/sbuild/fstab
# fstab: static file system information for chroots.
# Note that the mount point will be prefixed by the chroot path
# (CHROOT_PATH)
#
# <file system> <mount point> <type> <options> <dump> <pass>
/proc /proc none rw,bind 0 0
/sys /sys none rw,bind 0 0
/dev/pts /dev/pts none rw,bind 0 0
tmpfs /dev/shm tmpfs defaults 0 0
# Mount a large scratch space for the build, so we don't use up
# space on an LVM snapshot of the chroot itself.
/home/glaubitz /home/glaubitz none rw,bind 0 0
root@z6:~>

To bisect, just run the normal git bisect process from the host and run the
test command in the emulated schroot from a second terminal.
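
The manual check described above could also be scripted for `git bisect run`. The following is only a sketch, not what was used in this thread: the function names `bisect_verdict` and `run_reproducer` are made up for illustration, and `/bin/bash -c true` stands in for the interactive shell so that the check can run unattended. `git bisect run` treats exit status 0 as good, 1 to 127 (except 125) as bad and 125 as skip, and aborts on anything above 127, so the status 139 of a segfaulting process has to be mapped to a small value first.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical, not part of glibc or git.

# Translate the reproducer's exit status into a status that
# "git bisect run" accepts (0 = good, 1 = bad, 125 = skip).
bisect_verdict() {
    case "$1" in
        0)   echo 0 ;;    # bash started normally: commit is good
        139) echo 1 ;;    # 128 + 11 (SIGSEGV): commit is bad
        *)   echo 125 ;;  # any other failure: cannot judge, skip
    esac
}

# Meant to be run inside the alpha schroot. $1 is the glibc build
# directory. "-c true" replaces the interactive shell (an assumption
# made here so that the check needs no user input).
run_reproducer() {
    LD_LIBRARY_PATH="$1" /bin/bash -c true
    status=$?
    return "$(bisect_verdict "$status")"
}
```

Inside the schroot this would be called as `run_reproducer /home/glaubitz/glibc-git/build`; the cross-build itself still has to happen on the host before each check.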

John Paul Adrian Glaubitz

Dec 13, 2022, 12:50:03 PM
On 12/13/22 17:25, John Paul Adrian Glaubitz wrote:
> On 11/13/22 00:45, John Paul Adrian Glaubitz wrote:
>> I just noticed that there is a regression in glibc on alpha with version 2.34 or later.
>>
>> Looking at the build logs for Debian's 2.34-8 [1], 2.35-4 [2] and 2.36-4 [3], it's obvious
>> there is something wrong given the many "Segmentation Fault" errors.
>
> This regression was introduced by the following commit:

Reported here: https://sourceware.org/bugzilla/show_bug.cgi?id=29899

Frank Scheiner

Dec 14, 2022, 12:32:55 PM
Hi Adrian,

On 13.12.22 17:21, John Paul Adrian Glaubitz wrote:
> Hi!
>
> On 12/13/22 10:52, John Paul Adrian Glaubitz wrote:
>> You could cross-compile glibc. That's most likely what I am going to do.
>
> For the record, here's how I am doing it.
>
> [...]

Thanks for that, this is quite useful.

> 4. Enter alpha schroot and run the following command from the build
> directory:
>
>    (sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$
> LD_LIBRARY_PATH=/home/glaubitz/glibc-git/build /bin/bash
>
>    If the bug is present, this command will segfault:
>
>    Segmentation fault
>
>    Otherwise it will just spawn another bash which can be exited with
> "exit":
>
>    (sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$
> LD_LIBRARY_PATH=/home/glaubitz/glibc-git/build /bin/bash
>    (sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$
>    exit
>    (sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$

Can we be sure that this reproducer identifies the same problem as the
build failures from the original post ([1])?

[1]: https://lists.debian.org/debian-alpha/2022/11/msg00003.html

Regardless, I can confirm this on my DS15:

```
root@ds15:/srv/storage/build# LD_LIBRARY_PATH=$PWD/glibc-at-36231bee7ab36d59dd121ea85b91411ae86945f3 /bin/bash
root@ds15:/srv/storage/build# echo $?
0
root@ds15:/srv/storage/build# exit
exit

root@ds15:/srv/storage/build# LD_LIBRARY_PATH=$PWD/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53 /bin/bash
Segmentation fault
root@ds15:/srv/storage/build# echo $?
139
```

...6c57d320484988e87e446e2e60ce42816bf51d53 is the first bad commit and
36231bee7ab36d59dd121ea85b91411ae86945f3 is its parent.
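
For reference, the 139 above is how the shell reports a process that was killed by a signal: 128 plus the signal number, and signal 11 is SIGSEGV. A small sketch for decoding such a status follows; the helper name `decode_status` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: explain a shell exit status.
decode_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # "kill -l N" prints the name of signal number N.
        echo "killed by signal $sig ($(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}

decode_status 139
decode_status 0
```

The same arithmetic applies to other fatal signals, so a status of 136 would point at signal 8 (SIGFPE).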

Do we also have a result for
glibc@6c57d320484988e87e446e2e60ce42816bf51d53 with `-mcpu=ev67`?

Cheers,
Frank

Frank Scheiner

Dec 14, 2022, 2:50:02 PM
On 14.12.22 18:21, Frank Scheiner wrote:
> [...]
> Regardless, I can confirm this on my DS15:
>
> ```
> root@ds15:/srv/storage/build#
> LD_LIBRARY_PATH=$PWD/glibc-at-36231bee7ab36d59dd121ea85b91411ae86945f3
> /bin/bash
> root@ds15:/srv/storage/build# echo $?
> 0
> root@ds15:/srv/storage/build# exit
> exit
>
> root@ds15:/srv/storage/build#
> LD_LIBRARY_PATH=$PWD/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53
> /bin/bash
> Segmentation fault
> root@ds15:/srv/storage/build# echo $?
> 139
> ```
>
> ...6c57d320484988e87e446e2e60ce42816bf51d53 is the first bad commit and
> 36231bee7ab36d59dd121ea85b91411ae86945f3 is its parent.
>
> Do we also have a result for
> glibc@6c57d320484988e87e446e2e60ce42816bf51d53 with `-mcpu=ev67`?

```
root@ds15:/srv/storage/build/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53-ev67# CC="alpha-linux-gnu-gcc-12 -mcpu=ev67 -mtune=ev67 " \
    CXX="alpha-linux-gnu-g++-12 -mcpu=ev67 -mtune=ev67 " \
    MIG="alpha-linux-gnu-mig" \
    ../../glibc/configure --host=alphaev67-linux-gnu --disable-werror \
    --prefix=/usr --disable-sanity-checks

[...]

root@ds15:/srv/storage/build# LD_LIBRARY_PATH=$PWD/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53-ev67 /bin/bash
Segmentation fault
```

Unfortunately it also doesn't work here when optimized for EV67.

Cheers,
Frank

John Paul Adrian Glaubitz

Dec 14, 2022, 3:00:03 PM
Hi!

On 12/14/22 20:46, Frank Scheiner wrote:
> ```
> root@ds15:/srv/storage/build/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53-ev67#
> CC="alpha-linux-gnu-gcc-12 -mcpu=ev67 -mtune=ev67 "
> CXX="alpha-linux-gnu-g++-12 -mcpu=ev67 -mtune=ev67 "
> MIG="alpha-linux-gnu-mig" ../../glibc/configure
> --host=alphaev67-linux-gnu --disable-werror --prefix=/usr
> --disable-sanity-checks
>
> [...]
>
> root@ds15:/srv/storage/build#
> LD_LIBRARY_PATH=$PWD/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53-ev67
> /bin/bash
> Segmentation fault
> ```
>
> Unfortunately it also doesn't work here when optimized for EV67.

OK, this just confirms my cross-compile tests with "-mcpu=ev67 -mtune=ev67",
where raising the baseline did not fix the segfault either.

If you have a user account for glibc bugzilla, you should subscribe to the bug
report I opened for this particular issue [1]. H. J. Lu raises a good question,
namely whether alpha has any hardcoded values for "struct rtld_global_ro {}".

Adrian

> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=29899

John Paul Adrian Glaubitz

Dec 14, 2022, 3:00:03 PM
Hi Frank!

On 12/14/22 18:21, Frank Scheiner wrote:
>> (sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$ LD_LIBRARY_PATH=/home/glaubitz/glibc-git/build /bin/bash
>>
>> If the bug is present, this command will segfault:
>>
>> Segmentation fault
>>
>> Otherwise it will just spawn another bash which can be exited with "exit":
>>
>> (sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$ LD_LIBRARY_PATH=/home/glaubitz/glibc-git/build /bin/bash
>> (sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$
>> exit
>> (sid-alpha-sbuild)glaubitz@z6:~/glibc-git/build$
>
> Can we be sure that this reproducer identifies the same problem as the build failures from the original post ([1])?
>
> [1]: https://lists.debian.org/debian-alpha/2022/11/msg00003.html

Well, this is how I identified that there was a problem with glibc on alpha.

I built the packages manually with the testsuite enabled and installed them
into a chroot for testing which resulted in a segfault when dpkg tried to
configure the libc-bin package.

I assume the many testsuite failures are a direct result of this bug which
just causes many tests to segfault. We had a similar problem on sparc64 where
a single bug in the static build caused many testsuite failures.

> Regardless, I can confirm this on my DS15:
>
> ```
> root@ds15:/srv/storage/build# LD_LIBRARY_PATH=$PWD/glibc-at-36231bee7ab36d59dd121ea85b91411ae86945f3 /bin/bash
> root@ds15:/srv/storage/build# echo $?
> 0
> root@ds15:/srv/storage/build# exit
> exit
>
> root@ds15:/srv/storage/build# LD_LIBRARY_PATH=$PWD/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53 /bin/bash
> Segmentation fault
> root@ds15:/srv/storage/build# echo $?
> 139
> ```
>
> ...6c57d320484988e87e446e2e60ce42816bf51d53 is the first bad commit and 36231bee7ab36d59dd121ea85b91411ae86945f3 is its parent.

Good.

> Do we also have a result for glibc@6c57d320484988e87e446e2e60ce42816bf51d53 with `-mcpu=ev67`?

I actually tried to verify whether building with '-mcpu=ev67 -mtune=ev67' would fix the problem
but I was unable to. I'm not sure whether the cross-compiler supports the ev67 target.

On the other hand, I did some more testing and it turned out that with commit 6c57d320484988e87e446e2e60ce42816bf51d53
applied, I could fix the segfault with the following minimal change:

diff --git a/elf/dl-sysdep.c b/elf/dl-sysdep.c
index bd5066fe3b..bc45a6e9d3 100644
--- a/elf/dl-sysdep.c
+++ b/elf/dl-sysdep.c
@@ -115,10 +115,10 @@ _dl_sysdep_start (void **start_argptr,
user_entry = (ElfW(Addr)) ENTRY_POINT;
GLRO(dl_platform) = NULL; /* Default to nothing known about the platform. */

- /* NB: Default to a constant CONSTANT_MINSIGSTKSZ. */
- _Static_assert (__builtin_constant_p (CONSTANT_MINSIGSTKSZ),
- "CONSTANT_MINSIGSTKSZ is constant");
- GLRO(dl_minsigstacksize) = CONSTANT_MINSIGSTKSZ;
+ /* /\* NB: Default to a constant CONSTANT_MINSIGSTKSZ. *\/ */
+ /* _Static_assert (__builtin_constant_p (CONSTANT_MINSIGSTKSZ), */
+ /* "CONSTANT_MINSIGSTKSZ is constant"); */
+ /* GLRO(dl_minsigstacksize) = CONSTANT_MINSIGSTKSZ; */

for (av = GLRO(dl_auxv); av->a_type != AT_NULL; set_seen (av++))
switch (av->a_type)
@@ -184,9 +184,9 @@ _dl_sysdep_start (void **start_argptr,
case AT_RANDOM:
_dl_random = (void *) av->a_un.a_val;
break;
- case AT_MINSIGSTKSZ:
- GLRO(dl_minsigstacksize) = av->a_un.a_val;
- break;
+ /* case AT_MINSIGSTKSZ: */
+ /* GLRO(dl_minsigstacksize) = av->a_un.a_val; */
+ /* break; */
DL_PLATFORM_AUXV
}

diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 9720a4e446..9ead714718 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -536,8 +536,8 @@ struct rtld_global_ro
/* Cached value of `getpagesize ()'. */
EXTERN size_t _dl_pagesize;

- /* Cached value of `sysconf (_SC_MINSIGSTKSZ)'. */
- EXTERN size_t _dl_minsigstacksize;
+ /* /\* Cached value of `sysconf (_SC_MINSIGSTKSZ)'. *\/ */
+ /* EXTERN size_t _dl_minsigstacksize; */

/* Do we read from ld.so.cache? */
EXTERN int _dl_inhibit_cache;
diff --git a/sysdeps/unix/sysv/linux/sysconf.c b/sysdeps/unix/sysv/linux/sysconf.c
index 366fcef01e..5a5c89f80e 100644
--- a/sysdeps/unix/sysv/linux/sysconf.c
+++ b/sysdeps/unix/sysv/linux/sysconf.c
@@ -77,12 +77,12 @@ __sysconf (int name)
}
break;

- case _SC_MINSIGSTKSZ:
- assert (GLRO(dl_minsigstacksize) != 0);
- return GLRO(dl_minsigstacksize);
+ /* case _SC_MINSIGSTKSZ: */
+ /* assert (GLRO(dl_minsigstacksize) != 0); */
+ /* return GLRO(dl_minsigstacksize); */

- case _SC_SIGSTKSZ:
- return sysconf_sigstksz ();
+ /* case _SC_SIGSTKSZ: */
+ /* return sysconf_sigstksz (); */

default:
break;

Looking at the changes, it can only be this particular hunk that causes the segfault:

diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 9720a4e446..9ead714718 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -536,8 +536,8 @@ struct rtld_global_ro
/* Cached value of `getpagesize ()'. */
EXTERN size_t _dl_pagesize;

- /* Cached value of `sysconf (_SC_MINSIGSTKSZ)'. */
- EXTERN size_t _dl_minsigstacksize;
+ /* /\* Cached value of `sysconf (_SC_MINSIGSTKSZ)'. *\/ */
+ /* EXTERN size_t _dl_minsigstacksize; */

/* Do we read from ld.so.cache? */
EXTERN int _dl_inhibit_cache;

Interestingly, when I checked out the tag glibc-2.34 and disabled the _dl_minsigstacksize symbol
in "struct rtld_global_ro {}" again with the following hack, I no longer got a segfault
but a floating point exception instead:

diff --git a/elf/dl-sysdep.c b/elf/dl-sysdep.c
index d47bef1340..8462e5859a 100644
--- a/elf/dl-sysdep.c
+++ b/elf/dl-sysdep.c
@@ -116,10 +116,10 @@ _dl_sysdep_start (void **start_argptr,
user_entry = (ElfW(Addr)) ENTRY_POINT;
GLRO(dl_platform) = NULL; /* Default to nothing known about the platform. */

- /* NB: Default to a constant CONSTANT_MINSIGSTKSZ. */
- _Static_assert (__builtin_constant_p (CONSTANT_MINSIGSTKSZ),
- "CONSTANT_MINSIGSTKSZ is constant");
- GLRO(dl_minsigstacksize) = CONSTANT_MINSIGSTKSZ;
+ /* /\* NB: Default to a constant CONSTANT_MINSIGSTKSZ. *\/ */
+ /* _Static_assert (__builtin_constant_p (CONSTANT_MINSIGSTKSZ), */
+ /* "CONSTANT_MINSIGSTKSZ is constant"); */
+ /* GLRO(dl_minsigstacksize) = CONSTANT_MINSIGSTKSZ; */

for (av = GLRO(dl_auxv); av->a_type != AT_NULL; set_seen (av++))
switch (av->a_type)
@@ -185,9 +185,9 @@ _dl_sysdep_start (void **start_argptr,
case AT_RANDOM:
_dl_random = (void *) av->a_un.a_val;
break;
- case AT_MINSIGSTKSZ:
- GLRO(dl_minsigstacksize) = av->a_un.a_val;
- break;
+ /* case AT_MINSIGSTKSZ: */
+ /* GLRO(dl_minsigstacksize) = av->a_un.a_val; */
+ /* break; */
DL_PLATFORM_AUXV
}

diff --git a/elf/rtld_static_init.c b/elf/rtld_static_init.c
index 3f8abb6800..aeac492235 100644
--- a/elf/rtld_static_init.c
+++ b/elf/rtld_static_init.c
@@ -67,9 +67,9 @@ __rtld_static_init (struct link_map *map)
dl->_dl_hwcap = _dl_hwcap;
extern __typeof (dl->_dl_hwcap2) _dl_hwcap2 attribute_hidden;
dl->_dl_hwcap2 = _dl_hwcap2;
- extern __typeof (dl->_dl_minsigstacksize) _dl_minsigstacksize
- attribute_hidden;
- dl->_dl_minsigstacksize = _dl_minsigstacksize;
+ /* extern __typeof (dl->_dl_minsigstacksize) _dl_minsigstacksize */
+ /* attribute_hidden; */
+ /* dl->_dl_minsigstacksize = _dl_minsigstacksize; */
extern __typeof (dl->_dl_pagesize) _dl_pagesize attribute_hidden;
dl->_dl_pagesize = _dl_pagesize;
extern __typeof (dl->_dl_tls_static_align) _dl_tls_static_align
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 9c15259236..62117727e1 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -545,9 +545,6 @@ struct rtld_global_ro
/* Cached value of `getpagesize ()'. */
EXTERN size_t _dl_pagesize;

- /* Cached value of `sysconf (_SC_MINSIGSTKSZ)'. */
- EXTERN size_t _dl_minsigstacksize;
-
/* Do we read from ld.so.cache? */
EXTERN int _dl_inhibit_cache;

diff --git a/sysdeps/unix/sysv/linux/sysconf-pthread_stack_min.h b/sysdeps/unix/sysv/linux/sysconf-pthread_stack_min.h
index 9e0eb0f7fc..2ba132e1fe 100644
--- a/sysdeps/unix/sysv/linux/sysconf-pthread_stack_min.h
+++ b/sysdeps/unix/sysv/linux/sysconf-pthread_stack_min.h
@@ -22,7 +22,7 @@ static inline long int
__get_pthread_stack_min (void)
{
/* sysconf (_SC_THREAD_STACK_MIN) >= sysconf (_SC_MINSIGSTKSZ). */
- long int pthread_stack_min = GLRO(dl_minsigstacksize);
+ long int pthread_stack_min = 4096;
assert (pthread_stack_min != 0);
_Static_assert (__builtin_constant_p (PTHREAD_STACK_MIN),
"PTHREAD_STACK_MIN is constant");
diff --git a/sysdeps/unix/sysv/linux/sysconf.c b/sysdeps/unix/sysv/linux/sysconf.c
index daaeeb7d36..24afb2fdc5 100644
--- a/sysdeps/unix/sysv/linux/sysconf.c
+++ b/sysdeps/unix/sysv/linux/sysconf.c
@@ -84,11 +84,11 @@ __sysconf (int name)
break;

case _SC_MINSIGSTKSZ:
- assert (GLRO(dl_minsigstacksize) != 0);
- return GLRO(dl_minsigstacksize);
+ // assert (GLRO(dl_minsigstacksize) != 0);
+ return 4096; // GLRO(dl_minsigstacksize);

case _SC_SIGSTKSZ:
- return sysconf_sigstksz ();
+ return 16384 ; //sysconf_sigstksz ();

Could you verify this on your DS-15?

Frank Scheiner

Dec 14, 2022, 3:20:02 PM
On 14.12.22 20:55, John Paul Adrian Glaubitz wrote:
> [...]
>> Unfortunately it also doesn't work here when optimized for EV67.
>
> OK, this just confirms what my cross-compile tests with "-mcpu=ev67
> -mtune=ev67"
> where the segfault wasn't fixed either by raising the baseline.
>
> If you have a user account for glibc bugzilla, you should subscribe to
> the bug
> report I opened for this particular issue [1].

Or can you just put me on the CC list?

> H. J. Lu raises a good
> question,
> namely whether alpha has any hardcoded values for "struct rtld_global_ro
> {}".

I have no answer for that.

Cheers,
Frank

Frank Scheiner

Dec 14, 2022, 3:30:02 PM
Hi Adrian,

On 14.12.22 20:51, John Paul Adrian Glaubitz wrote:
> [...]
>> Can we be sure that this reproducer identifies the same problem as
>> the build failures from the original post ([1])?
>>
>> [1]: https://lists.debian.org/debian-alpha/2022/11/msg00003.html
>
> Well, this is how I identified that there was a problem with glibc on
> alpha.
>
> I built the packages manually with the testsuite enabled and installed them
> into a chroot for testing which resulted in a segfault when dpkg tried to
> configure the libc-bin package.
>
> I assume the many testsuite failures are a direct result of this bug which
> just causes many tests to segfault. We had a similar problem on sparc64
> where
> a single bug in the static build caused many testsuite failures.

I see.

> Interestingly, when I checkout the tag glibc-2.34 and disabled the
> _dl_minsigstacksize symbol
> in "struct rtld_global_ro {}" again with the following hack, I'm no
> longer getting a segfault
> but a floating point exception:
>
> [...]
>
> Could you verify this on your DS-15?

I'll do that tomorrow. The thing is that this diff doesn't apply cleanly:

```
root@ds15:/srv/storage/glibc# git checkout glibc-2.34
Note: switching to 'glibc-2.34'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

[...]
HEAD is now at ae37d06c7d Update ChangeLog.old/ChangeLog.23.

root@ds15:/srv/storage/glibc# patch -p1 < ../../glibc-fix.patch
patching file elf/dl-sysdep.c
Hunk #1 FAILED at 116.
Hunk #2 FAILED at 185.
2 out of 2 hunks FAILED -- saving rejects to file elf/dl-sysdep.c.rej
patching file elf/rtld_static_init.c
patching file sysdeps/generic/ldsodefs.h
patching file sysdeps/unix/sysv/linux/sysconf-pthread_stack_min.h
Hunk #1 succeeded at 22 with fuzz 1.
patching file sysdeps/unix/sysv/linux/sysconf.c
Hunk #1 succeeded at 84 with fuzz 2.
```

Not sure why; shouldn't we have the same source state? Should I try to
apply the rejected hunks manually?

Cheers,
Frank

Frank Scheiner

Dec 14, 2022, 3:50:03 PM
On 14.12.22 21:32, John Paul Adrian Glaubitz wrote:
> Hi!
>
> On 12/14/22 21:16, Frank Scheiner wrote:
>> I'll do that tomorrow. The thing is that this diff doesn't apply cleanly:
>
> Which version of the workaround diff did you use? There are two.
>
> There is one that applies cleanly on top of
> 6c57d320484988e87e446e2e60ce42816bf51d53
> and a second one that applies cleanly on top of glibc-2.34, I posted
> both. There were
> some changes between 6c57d320484988e87e446e2e60ce42816bf51d53 and
> glibc-2.34 in the
> minstksize/stksize code which is why you need the second diff that was
> also part of
> my mail.

I used the one from the bottom of your mail, just below "Interestingly,
when I checkout the tag glibc-2.34 and disabled the _dl_minsigstacksize
symbol in "struct rtld_global_ro {}" again with the following hack, I'm
no longer getting a segfault but a floating point exception: "

> I'm attaching the second diff as a patch.

I think there's some whitespace difference. I manually applied the
rejected hunks, ran `git diff`, and compared the result with your attached
patch:

```
root@nfs:/srv/nfs/ds15/root/srv# diff -Nur glibc-fix-2.patch bz20305-workaround2.patch
--- glibc-fix-2.patch 2022-12-14 21:24:01.259696291 +0100
+++ bz20305-workaround2.patch 2022-12-14 21:37:25.439904377 +0100
@@ -1,5 +1,5 @@
diff --git a/elf/dl-sysdep.c b/elf/dl-sysdep.c
-index d47bef1340..d3dc6e5c57 100644
+index d47bef1340..8462e5859a 100644
--- a/elf/dl-sysdep.c
+++ b/elf/dl-sysdep.c
@@ -116,10 +116,10 @@ _dl_sysdep_start (void **start_argptr,
@@ -12,7 +12,7 @@
- GLRO(dl_minsigstacksize) = CONSTANT_MINSIGSTKSZ;
+ /* /\* NB: Default to a constant CONSTANT_MINSIGSTKSZ. *\/ */
+ /* _Static_assert (__builtin_constant_p (CONSTANT_MINSIGSTKSZ), */
-+ /* "CONSTANT_MINSIGSTKSZ is constant"); */
++ /* "CONSTANT_MINSIGSTKSZ is constant"); */
+ /* GLRO(dl_minsigstacksize) = CONSTANT_MINSIGSTKSZ; */

for (av = GLRO(dl_auxv); av->a_type != AT_NULL; set_seen (av++))
@@ -25,8 +25,8 @@
- GLRO(dl_minsigstacksize) = av->a_un.a_val;
- break;
+ /* case AT_MINSIGSTKSZ: */
-+ /* GLRO(dl_minsigstacksize) = av->a_un.a_val; */
-+ /* break; */
++ /* GLRO(dl_minsigstacksize) = av->a_un.a_val; */
++ /* break; */
DL_PLATFORM_AUXV
}

```

...so I think we're covered, unless the difference in the index line is
important.
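
One way to check that two patch files differ only in whitespace and in their `index` lines is sketched below. The helper name `patches_equivalent` is invented for this example. It drops the `index` lines, which record abbreviated blob hashes and which, as far as I know, patch(1) does not consult when applying a plain text diff, and then lets `diff -b` ignore changes in the amount of whitespace.

```shell
#!/bin/sh
# Hypothetical helper: succeed if two patches are identical apart from
# their "index" lines and the amount of whitespace.
patches_equivalent() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    # Drop the "index <old>..<new> <mode>" lines from both patches.
    grep -v '^index ' "$1" > "$a"
    grep -v '^index ' "$2" > "$b"
    # -b treats runs of whitespace as equal; the output is not needed.
    diff -b "$a" "$b" > /dev/null
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}
```

With the two files from the transcript above, the call would be `patches_equivalent glibc-fix-2.patch bz20305-workaround2.patch`.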

I'll compile that tomorrow and see what happens.

Cheers,
Frank

John Paul Adrian Glaubitz

Dec 15, 2022, 3:20:03 AM
Hi!

On 12/14/22 21:44, Frank Scheiner wrote:
>> I'm attaching the second diff as a patch.
>
> I think there's some whitespace difference. I manually applied the
> rejected stuff, made a `git diff` and comparing that to your attached
> patch gives:

Or just use the attached patch file from my previous mail.

If your glibc fails with a floating point exception, I fear there might be
a second bug hiding somewhere which we need to bisect as well. This is
particularly annoying since we would have to apply the above diff at
every bisection step.

But we'll see.
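
Applying a workaround at every bisection step can be wrapped in a small helper. This is a sketch under assumptions, not something that was used in this thread: `with_patch` is a made-up name, and it relies on `git apply` and `git apply -R` to apply and revert the diff in the working tree. Returning 125 when the patch does not apply lets `git bisect run` skip that commit instead of misjudging it.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a patch temporarily applied
# to the current working tree, then revert the patch again.
with_patch() {
    patch_file=$1
    shift
    # 125 tells "git bisect run" to skip a commit that cannot be tested.
    git apply "$patch_file" || return 125
    "$@"
    rc=$?
    git apply -R "$patch_file"   # restore the pristine tree
    return "$rc"
}
```

The remaining arguments are the command to run while the patch is applied, for example the build followed by the reproducer.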

Frank Scheiner

Dec 15, 2022, 5:00:02 AM
Hi,

On 15.12.22 09:09, John Paul Adrian Glaubitz wrote:
> Hi!
>
> On 12/14/22 21:44, Frank Scheiner wrote:
>>> I'm attaching the second diff as a patch.
>>
>> I think there's some whitespace difference. I manually applied the
>> rejected stuff, made a `git diff` and comparing that to your attached
>> patch gives:
>
> Or just use the attached patch file from my previous mail.

Yeah, did that in the end to be sure, but it looks like both are
incomplete (because both versions gave the following result):

```
root@ds15:/srv/storage/glibc# git status
HEAD detached at glibc-2.34
nothing to commit, working tree clean

root@ds15:/srv/storage/glibc# patch -p1 < ../../bz20305-workaround2.patch
patching file elf/dl-sysdep.c
patching file elf/rtld_static_init.c
patching file sysdeps/generic/ldsodefs.h
patching file sysdeps/unix/sysv/linux/sysconf-pthread_stack_min.h
patching file sysdeps/unix/sysv/linux/sysconf.c

root@ds15:/srv/storage/glibc# cd ../build/glibc-2.34-plus-patch/
root@ds15:/srv/storage/build/glibc-2.34-plus-patch# CC="alpha-linux-gnu-gcc-12 -mcpu=ev67 -mtune=ev67 " \
    CXX="alpha-linux-gnu-g++-12 -mcpu=ev67 -mtune=ev67 " \
    MIG="alpha-linux-gnu-mig" \
    ../../glibc/configure --host=alphaev67-linux-gnu --disable-werror \
    --prefix=/usr --disable-sanity-checks
[...]

root@ds15:/srv/storage/build/glibc-2.34-plus-patch# time make
[...]
alpha-linux-gnu-gcc-12 -mcpu=ev67 -mtune=ev67
../sysdeps/unix/sysv/linux/alpha/sysconf.c -c -std=gnu11 -fgnu89-inline
-g -O2 -Wall -Wwrite-strings -Wundef -fmerge-all-constants
-frounding-math -fno-stack-protector -fno-common -Wstrict-prototypes
-Wold-style-definition -fmath-errno -mlong-double-128 -mieee
-mfp-rounding-mode=d -fexceptions
-DGETCONF_DIR='"/usr/libexec/getconf"' -ftls-model=initial-exec
-I../include -I/srv/storage/build/glibc-2.34-plus-patch/posix
-I/srv/storage/build/glibc-2.34-plus-patch
-I../sysdeps/unix/sysv/linux/alpha/alphaev67/fpu
-I../sysdeps/alpha/alphaev67/fpu
-I../sysdeps/unix/sysv/linux/alpha/alphaev67
-I../sysdeps/unix/sysv/linux/alpha/fpu -I../sysdeps/alpha/fpu
-I../sysdeps/unix/sysv/linux/alpha -I../sysdeps/alpha/nptl
-I../sysdeps/unix/sysv/linux/wordsize-64
-I../sysdeps/ieee754/ldbl-64-128 -I../sysdeps/ieee754/ldbl-opt
-I../sysdeps/unix/sysv/linux/include -I../sysdeps/unix/sysv/linux
-I../sysdeps/nptl -I../sysdeps/pthread -I../sysdeps/gnu
-I../sysdeps/unix/inet -I../sysdeps/unix/sysv -I../sysdeps/unix/alpha
-I../sysdeps/unix -I../sysdeps/posix -I../sysdeps/alpha/alphaev67
-I../sysdeps/alpha/alphaev6 -I../sysdeps/alpha/alphaev5
-I../sysdeps/alpha -I../sysdeps/wordsize-64
-I../sysdeps/ieee754/ldbl-128 -I../sysdeps/ieee754/dbl-64
-I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754 -I../sysdeps/generic
-I.. -I../libio -I. -D_LIBC_REENTRANT -include
/srv/storage/build/glibc-2.34-plus-patch/libc-modules.h
-DMODULE_NAME=libc -include ../include/libc-symbols.h
-DTOP_NAMESPACE=glibc -o
/srv/storage/build/glibc-2.34-plus-patch/posix/sysconf.o -MD -MP -MF
/srv/storage/build/glibc-2.34-plus-patch/posix/sysconf.o.dt -MT
/srv/storage/build/glibc-2.34-plus-patch/posix/sysconf.o
In file included from ../sysdeps/alpha/ldsodefs.h:40,
from ../sysdeps/gnu/ldsodefs.h:46,
from ../sysdeps/unix/sysv/linux/ldsodefs.h:25,
from ../sysdeps/unix/sysv/linux/sysconf.c:29,
from ../sysdeps/unix/sysv/linux/alpha/sysconf.c:127:
../sysdeps/unix/sysv/linux/sysconf-sigstksz.h: In function
‘sysconf_sigstksz’:
../sysdeps/generic/ldsodefs.h:512:21: error: ‘_dl_minsigstacksize’
undeclared (first use in this function); did you mean ‘minsigstacksize’?
512 | # define GLRO(name) _##name
| ^
../sysdeps/unix/sysv/linux/sysconf-sigstksz.h:24:30: note: in expansion
of macro ‘GLRO’
24 | long int minsigstacksize = GLRO(dl_minsigstacksize);
| ^~~~
../sysdeps/generic/ldsodefs.h:512:21: note: each undeclared identifier
is reported only once for each function it appears in
512 | # define GLRO(name) _##name
| ^
../sysdeps/unix/sysv/linux/sysconf-sigstksz.h:24:30: note: in expansion
of macro ‘GLRO’
24 | long int minsigstacksize = GLRO(dl_minsigstacksize);
| ^~~~
In file included from ../sysdeps/unix/sysv/linux/sysconf.c:30:
../sysdeps/unix/sysv/linux/sysconf-sigstksz.h: At top level:
../sysdeps/unix/sysv/linux/sysconf-sigstksz.h:22:1: warning:
‘sysconf_sigstksz’ defined but not used [-Wunused-function]
22 | sysconf_sigstksz (void)
| ^~~~~~~~~~~~~~~~
make[2]: *** [/srv/storage/build/glibc-2.34-plus-patch/sysd-rules:179:
/srv/storage/build/glibc-2.34-plus-patch/posix/sysconf.o] Error 1
make[2]: Leaving directory '/srv/storage/glibc/posix'
make[1]: *** [Makefile:478: posix/subdir_lib] Error 2
make[1]: Leaving directory '/srv/storage/glibc'
make: *** [Makefile:9: all] Error 2

real 32m23.084s
user 23m7.861s
sys 5m13.960s
```

Maybe adding [1] might help, but the patch actually removes it.

[1]:
https://github.com/bminor/glibc/commit/6c57d320484988e87e446e2e60ce42816bf51d53#diff-a0f6ca39e050317adcf156062ab073beb500c3c9d75b4d9adad7a8a08f42e5f3

Cheers,
Frank

John Paul Adrian Glaubitz

Dec 15, 2022, 5:10:02 AM
Hi!

On 12/15/22 10:49, Frank Scheiner wrote:
> Maybe adding [1] might help, but the patch actually removes it.

It's missing this hunk:

diff --git a/sysdeps/unix/sysv/linux/sysconf-sigstksz.h b/sysdeps/unix/sysv/linux/sysconf-sigstksz.h
index 64d450b22c..4552e77d59 100644
--- a/sysdeps/unix/sysv/linux/sysconf-sigstksz.h
+++ b/sysdeps/unix/sysv/linux/sysconf-sigstksz.h
@@ -21,7 +21,7 @@
static long int
sysconf_sigstksz (void)
{
- long int minsigstacksize = GLRO(dl_minsigstacksize);
+ long int minsigstacksize = 4096 ; //GLRO(dl_minsigstacksize);
assert (minsigstacksize != 0);
_Static_assert (__builtin_constant_p (MINSIGSTKSZ),
"MINSIGSTKSZ is constant");

I was experimenting with a custom sysconf-sigstksz.h like on ia64 which I forgot to purge, sorry.

Frank Scheiner

Dec 15, 2022, 5:10:11 AM
Hi,

On 15.12.22 11:02, John Paul Adrian Glaubitz wrote:
> Hi!
>
> On 12/15/22 10:49, Frank Scheiner wrote:
>> Maybe adding [1] might help, but the patch actually removes it.
>
> It's missing this hunk:
>
> diff --git a/sysdeps/unix/sysv/linux/sysconf-sigstksz.h
> b/sysdeps/unix/sysv/linux/sysconf-sigstksz.h
> index 64d450b22c..4552e77d59 100644
> --- a/sysdeps/unix/sysv/linux/sysconf-sigstksz.h
> +++ b/sysdeps/unix/sysv/linux/sysconf-sigstksz.h
> @@ -21,7 +21,7 @@
>  static long int
>  sysconf_sigstksz (void)
>  {
> -  long int minsigstacksize = GLRO(dl_minsigstacksize);
> +  long int minsigstacksize = 4096 ; //GLRO(dl_minsigstacksize);
>    assert (minsigstacksize != 0);
>    _Static_assert (__builtin_constant_p (MINSIGSTKSZ),
>                   "MINSIGSTKSZ is constant");
>
> I was experimenting with a custom sysconf-sigstksz.h like on ia64 which
> I forgot to purge, sorry.

Ok, I will use this and run it again.

Cheers,
Frank

Frank Scheiner

Dec 15, 2022, 6:52:28 AM
Hi Adrian,
I renamed the build directory to reflect that the build was optimized
for EV67. The result confirms your findings:

```
root@ds15:/srv/storage/build# LD_LIBRARY_PATH=$PWD/glibc-2.34-plus-patch-ev67 /bin/bash
Floating point exception
```

Cheers,
Frank

John Paul Adrian Glaubitz

Dec 20, 2022, 8:00:03 PM
Hello!

On 12/15/22 09:09, John Paul Adrian Glaubitz wrote:
> If your glibc fails with Floating Point exception, I fear there might be
> a second bug hiding somewhere which we need to bisect as well. This is
> particularly annoying since we would have to apply the above diff for
> every bisecting step.

FWIW, I'm still working on this issue and made some progress.

Apparently, the segfault was fixed somewhere in the 2.36 development
cycle but on the other hand, the floating point exception issue
was introduced.

John Paul Adrian Glaubitz

Dec 21, 2022, 12:00:03 PM
Hello!

I have not been able to identify the commit that introduced the floating point
issue. However, I seem to have found what fixes the segfault properly and also
another fix for a third problem, see below.

FWIW, Adhemerval told me he would be looking into the glibc issues on alpha in
the following days. Currently, I am out of ideas myself.

Adrian

===============================================================================

Issue:

(sid-alpha-sbuild)glaubitz@nofan:~/glibc-git/build$ LD_LIBRARY_PATH=/home/glaubitz/glibc-git/build /bin/bash
/bin/bash: symbol lookup error: /home/glaubitz/glibc-git/build/libc.so.6.1: undefined symbol: _dl_audit_preinit, version GLIBC_PRIVATE
(sid-alpha-sbuild)glaubitz@nofan:~/glibc-git/build$

Fixed by:

commit 144761540a1e40b85997d195d9a226a500531dc9
Author: Adhemerval Zanella <adhemerva...@linaro.org>
Date: Thu Jan 13 18:04:49 2022 -0300

elf: Remove LD_USE_LOAD_BIAS

It is solely for prelink with PIE executables [1].

[1] https://sourceware.org/legacy-ml/libc-hacker/2003-11/msg00127.html

Reviewed-by: Siddhesh Poyarekar <sidd...@sourceware.org>

Issue:

(sid-alpha-sbuild)glaubitz@nofan:~/glibc-git/build$ LD_LIBRARY_PATH=/home/glaubitz/glibc-git/build /bin/bash
Segmentation fault
(sid-alpha-sbuild)glaubitz@nofan:~/glibc-git/build$

Fixed by:

commit 0b98a8748759e88b58927882a8714109abe0a2d6
Author: Adhemerval Zanella <adhemerva...@linaro.org>
Date: Thu Jul 22 17:10:57 2021 -0300

elf: Add _dl_audit_preinit

It consolidates the code required to call la_preinit audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fwe...@redhat.com>

csu/libc-start.c | 23 +++--------------------
elf/Versions | 2 +-
elf/dl-audit.c | 15 +++++++++++++++
sysdeps/generic/ldsodefs.h | 3 +++
4 files changed, 22 insertions(+), 21 deletions(-)

Issue:

(sid-alpha-sbuild)glaubitz@nofan:~/glibc-git/build$ LD_LIBRARY_PATH=/home/glaubitz/glibc-git/build /bin/bash
Floating point exception
(sid-alpha-sbuild)glaubitz@nofan:~/glibc-git/build$

Introduced by:

(not found)