Bug#926539: rootskel: steal-ctty no longer works on at least sparc64

John Paul Adrian Glaubitz

Apr 6, 2019, 1:00:03 PM
Source: rootskel
Version: 1.128
Severity: important
User: debian...@lists.debian.org
Usertags: sparc64

Hello!

I built updated installation images [1] for Debian Ports today and tested
the sparc64 image on our SPARC T5 in an LDOM.

Unfortunately, it seems that the recent changes to rootskel broke the
serial console on sparc64 in d-i. The kernel boots fine, but d-i never
starts; the boot stops with:

steal-ctty: No such file or directory

My suspicion is that the support for multiple consoles in parallel [2]
introduced this particular regression. I haven't done any debugging yet,
though, as I'm not sure where to start: I haven't touched the rootskel
package before and would therefore be interested in any pointers on how
to debug this.

Thanks,
Adrian

> [1] https://cdimage.debian.org/cdimage/ports/2019-04-06/
> [2] https://salsa.debian.org/installer-team/rootskel/commit/b6048aafed7d73ba42da04d6f7a798710f271384

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glau...@debian.org
`. `' Freie Universitaet Berlin - glau...@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

Ben Hutchings

Apr 6, 2019, 8:00:03 PM
On Sat, 2019-04-06 at 21:33 +0200, John Paul Adrian Glaubitz wrote:
> On 4/6/19 6:46 PM, John Paul Adrian Glaubitz wrote:
> > My suspicion is that the support for multiple consoles in parallel [2]
> > introduced this particular regression. I haven't done any debugging yet,
> > though, as I'm not sure where to start: I haven't touched the rootskel
> > package before and would therefore be interested in any pointers on how
> > to debug this.
>
> The problem seems to be that the sparc64 kernel uses one name for the
> console in /proc/consoles and a different name for the actual device:
>
> root@landau:~# cat /proc/consoles
> ttyHV0 -W- (EC p ) 4:64
> tty0 -WU (E ) 4:1
> root@landau:~# readlink /sys/dev/char/4:64
> ../../devices/root/f0299a70/f029b788/tty/ttyS0

The inconsistent name seems like a kernel bug...

> root@landau:~#
>
> And this is what used to make it work [1]:
>
> *) # >= 2.6.38
> console_major_minor="$(get-real-console-linux)"
> console_raw="$(readlink "/sys/dev/char/${console_major_minor}")"
> console="${console_raw##*/}"
> ;;

So maybe rootskel should use that again, but applied to each console's
char device number.

(Though directly using the symlinks under /dev/char seems cleaner than
poking in sysfs.)
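
A minimal sketch of that idea, resolving every console listed in
/proc/consoles through its char device number instead of trusting the
printed name. This is not the rootskel implementation; the function names
and the CONSOLES_FILE and SYSFS_CHAR variables are hypothetical and exist
only so the lookup can be exercised against a fake tree. It assumes the
major:minor pair is the last field of each line, as in the output quoted
above.

```sh
#!/bin/sh
# Sketch only: look up each console by device number, not by name.
# CONSOLES_FILE and SYSFS_CHAR are hypothetical overrides for testing.
: "${CONSOLES_FILE:=/proc/consoles}"
: "${SYSFS_CHAR:=/sys/dev/char}"

# Print the device name behind a major:minor pair, e.g. "4:64".
devname_from_number() {
    target="$(readlink "${SYSFS_CHAR}/$1")" || return 1
    # Keep only the last path component of the symlink target.
    printf '%s\n' "${target##*/}"
}

# Print one device name per line of the consoles file.
list_console_devices() {
    while read -r line; do
        # The major:minor pair is the last space-separated field.
        number="${line##* }"
        # Fall back to the name the kernel printed if the lookup fails.
        devname_from_number "$number" || printf '%s\n' "${line%% *}"
    done < "$CONSOLES_FILE"
}
```

With the sparc64 output quoted above, list_console_devices would report
ttyS0 for the 4:64 entry, which is the node that actually exists in /dev.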

Ben.

--
Ben Hutchings
This sentence contradicts itself - no actually it doesn't.



Ben Hutchings

Apr 16, 2019, 7:20:03 AM
On Tue, 2019-04-16 at 11:47 +0200, John Paul Adrian Glaubitz wrote:
> Hi Ben!
>
> On 4/7/19 1:53 AM, Ben Hutchings wrote:
> > > root@landau:~# cat /proc/consoles
> > > ttyHV0 -W- (EC p ) 4:64
> > > tty0 -WU (E ) 4:1
> > > root@landau:~# readlink /sys/dev/char/4:64
> > > ../../devices/root/f0299a70/f029b788/tty/ttyS0
> >
> > The inconsistent name seems like a kernel bug...
>
> Yes. I'm trying to convince Dave Miller to fix this.
>
> Do you think we could carry a patch in src:linux for the time being?
[...]

I would rather not do that until it's accepted, because if that doesn't
happen we either have to switch back or carry it forever.

Ben.

--
Ben Hutchings
Make three consecutive correct guesses and you will be considered
an expert.



James Clarke

Jun 26, 2019, 5:30:03 PM
Control: reopen -1
Control: reassign -1 src:linux,rootskel
Control: severity -1 serious

(I don't know if this is a blocker for the release, but it should at
least be reviewed before we release IMO, hence the severity.)

Just got a report in #debian-cd of a user running into this issue on
s390x with Hercules; a subset of the messages from that conversation is
below:

[20:12:18] <gruetzkopf> steal-ctty: No such file or directory
[20:12:29] <gruetzkopf> will go hunt this down once i find time
[20:12:52] <gruetzkopf> (DI buster RC2 / s390x)
[21:52:40] <jrtc27> gruetzkopf: cat /proc/consoles ?
[21:54:00] <jrtc27> should give something like:
[21:54:00] <jrtc27> ttyS0 -W- (EC p ) 4:64
[21:54:22] <jrtc27> rootskel will prefer a console which has the C flag
[21:55:17] <gruetzkopf> now let's see how to get there
[21:55:57] <gruetzkopf> (note: running in hercules, not real hw or qemu where i'd have virtio console)
[22:01:39] <gruetzkopf> cat /proc/consoles
[22:01:40] <gruetzkopf> ttyS0 -W- (EC p ) 4:64
[22:02:05] <jrtc27> and ls -l /dev/ttyS0?
[22:03:06] <gruetzkopf> ls: /dev/ttyS0: No such file or directory
[22:03:06] <gruetzkopf> oh, fun!
[22:04:36] <jrtc27> and ls -l /sys/dev/char/4:64 ?
[22:06:06] <gruetzkopf> ls -l /sys/dev/char/4:64
[22:06:06] <gruetzkopf> lrwxrwxrwx 1 root root 0 Jun 26 21:05 /sys/dev/char/4:64 -> ../../devices/virtual/tty/sclp_line0
[22:06:28] <jrtc27> ok, so, it's not /dev/ttyS0, it's /dev/sclp_line0?
[22:06:32] <jrtc27> (does that exist?)
[22:06:48] <jrtc27> we had an issue like this on sparc64 (#926539)
[22:07:38] <gruetzkopf> i just found that
[22:07:53] <jrtc27> does that device node exist for you?
[22:08:13] <gruetzkopf> crw--w---- 1 root root 4, 64 Jun 26 20:58 /dev/sclp_line0
[22:08:43] <gruetzkopf> (and so does /dev/ttysclp0)

This is the "fault" of drivers/s390/char/sclp_tty.c. I don't know what
the best fix is; we could also patch the kernel to ensure this shows up
as /dev/sclp_line0 in /proc/consoles like sparc64 now does for sunhv,
but I worry now that this might be a game of whack-a-mole and there are
other character device drivers out there that also suffer from this.
Perhaps therefore we need to go back to looking up the device name from
the device number as has been suggested already...
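
As a rough illustration of that approach, the sketch below picks the
console carrying the C flag (the one rootskel is said above to prefer)
and builds a device path from its major:minor number rather than from its
name. It is not the rootskel code: the function names and the
CONSOLES_FILE and DEV_CHAR variables are hypothetical, and it assumes the
/dev/char/MAJOR:MINOR symlinks are present, which may not hold in every
installer environment.

```sh
#!/bin/sh
# Sketch only: choose the console flagged "C" and address it by number.
# CONSOLES_FILE and DEV_CHAR are hypothetical overrides for testing.

# Print the major:minor pair of the first console with the C flag.
preferred_console_number() {
    while read -r name ops rest; do
        flags="${rest%%)*}"     # text before the closing parenthesis
        number="${rest##* }"    # last field: major:minor
        case "$flags" in
            *C*) printf '%s\n' "$number"; return 0 ;;
        esac
    done < "${CONSOLES_FILE:-/proc/consoles}"
    return 1
}

# Print a device path that does not depend on the console's name.
preferred_console_path() {
    number="$(preferred_console_number)" || return 1
    printf '%s/%s\n' "${DEV_CHAR:-/dev/char}" "$number"
}
```

For the s390x output in the log above, this would yield /dev/char/4:64,
regardless of whether the node is called ttyS0 or sclp_line0.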

James

Valentin Vidić

May 20, 2020, 6:50:02 AM
On Wed, May 20, 2020 at 11:19:53AM +0200, John Paul Adrian Glaubitz wrote:
> Ah, sorry. I was seeing the cached version of the thread, refreshing helped.
>
> In any case, the SPARC kernel maintainer (Dave Miller) had the same argument
> that it would potentially break existing setups but eventually I could
> convince him that the change was right.
>
> Not sure which distributions he has in mind.

It is hard to tell, but it seems the current naming is hardcoded
in several places:

https://www.redhat.com/archives/libguestfs/2017-May/msg00068.html
https://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.lhdd/lhdd_r_console_sum.html

I think it would be better to make debian-installer smarter about
this since we will probably run into the same problem again with
a different architecture/driver.

--
Valentin

Salvatore Bonaccorso

Apr 18, 2021, 11:40:04 AM
Should this bug still be open?

The mentioned commit landed in 5.3-rc1, 4.19.54, and 4.9.183.

Regards,
Salvatore

Philipp Kern

Apr 18, 2021, 3:10:03 PM
On 18.04.21 17:27, Salvatore Bonaccorso wrote:
> Is this bug still valid to be open?
>
> The mentioned commit landed in 5.3-rc1, 4.19.54 and as well 4.9.183.

Unfortunately, the daily debian-installer build (on Linux 5.10.0-6-s390x)
is still broken on qemu-system-s390x, so the s390x part of this bug still
applies.

Kind regards
Philipp Kern

Omar Sandoval

Apr 21, 2021, 2:20:02 PM
https://bugzilla.redhat.com/show_bug.cgi?id=1351968 suggests looking up
the console by major:minor instead of by name when using /proc/consoles.
Would it be possible to have debian-installer do that?

At the very least, it'd be nice to have some kind of workaround
available, like specifying the console name on the command line or in a
preseed file.

John Paul Adrian Glaubitz

May 3, 2021, 3:10:03 AM
Hi!

On 5/3/21 8:36 AM, Cyril Brulebois wrote:
> From skimming through the bug log, it seems it was initially a sparc64
> problem, that was fixed in the kernel (inconsistent naming) eventually.

Correct.

> The same issue exists on s390x but isn't apparently going to get fixed
> so we need to have d-i be smarter (hence the merge request)?

Seems so.

> I'd suggest at least retitling the bug report to mention s390x (release
> arch, affected) instead of sparc64 (port arch, no longer affected), to
> lower the chances people could overlook this issue, thinking it's only
> about a port arch.

We could also unmerge #926539 and #961056 again, then close the former bug
which was sparc64-specific.

Adrian

Valentin Vidić

May 3, 2021, 6:30:03 AM
On Mon, May 03, 2021 at 08:58:02AM +0200, John Paul Adrian Glaubitz wrote:
> > The same issue exists on s390x but isn't apparently going to get fixed
> > so we need to have d-i be smarter (hence the merge request)?
>
> Seems so.

The QEMU console might get fixed in the kernel, but it looks like LPAR
could have a similar problem (I don't have access to test this). So it
seems better (and more future-proof) to fix this on the Debian side too.
I have updated the merge request to trigger the new code only on s390x,
as suggested:

https://salsa.debian.org/installer-team/rootskel/-/merge_requests/2

> > I'd suggest at least retitling the bug report to mention s390x (release
> > arch, affected) instead of sparc64 (port arch, no longer affected), to
> > lower the chances people could overlook this issue, thinking it's only
> > about a port arch.
>
> We could also unmerge #926539 and #961056 again, then close the former bug
> which was sparc64-specific.

I have unmerged the bugs now, so the sparc one can be closed.

--
Valentin