Bug#1062073: grub-efi-amd64: GRUB 2.12 fails to boot with X64 exception

Morten Hein Tiljeset

Jan 31, 2024, 4:30:04 AM

Package: grub-efi-amd64
Version: 2.12-1
Severity: important

Dear Maintainer,

Upgrading to GRUB 2.12 makes my system unbootable. It immediately breaks
with the following error before giving me a chance to see any menu or
get any better debug output.

!!!! X64 Exception Type - 0D(#GP - General Protection) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000000
RIP - 000000005FDE0E22, CS - 0000000000000038, RFLAGS - 0000000000210202
RAX - 0000000000000004, RCX - 0000000000000000, RDX - 0000000000000004
RBX - 5043414600000000, RSP - 000000006FC56008, RBP - 000000006FC56050
RSI - 000000004D0ABEFA, RDI - 5043414600000000
R8 - 0000000000000053, R9 - 0000000000000001, R10 - 000000000001C201
R11 - 000000006FC55B80, R12 - 000000000000001C, R13 - 00000000600651EC
R14 - 000000005FDE0E1B, R15 - 0000000060065000
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080000013, CR2 - 0000000000000000, CR3 - 000000006FAD6000
CR4 - 0000000000000668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 00000000698AD500 0000000000000047, LDTR - 0000000000000000
IDTR - 0000000067949018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 000000006FC55C60
!!!! Find image (No PDB) (ImageBase=000000005FDD8000, EntryPoint=000000005FDD9000) !!!!

This is consistent across several similar machines, so I suspect a
software regression rather than a hardware issue.

FWIW the EFI BIOS is
American Megatrends
Version 2.20.1276
Core Version 5.14

I realize that this is not a lot to go on, but I would appreciate any
hints on how to debug this issue.

Best,
Morten Hein Tiljeset

Mate Kukri

Jan 31, 2024, 4:40:05 AM

Hello,

Do you happen to know which GRUB version you were running previously?
Was it 2.06 or 2.12~rc1?

Mate

Mate Kukri

Jan 31, 2024, 4:50:05 AM

In that case, I strongly suspect the regression was already present in 2.12~rc1.

Unfortunately we have multiple years of GRUB changes here, and this
particular issue doesn't reproduce either on OVMF, or on any of my
test hardware.

Also, what computer/motherboard model is it exactly? Do you have a
firmware release date? The version you quoted looks like a
vendor-specific number, and it would be rather difficult to determine
the "UEFI specification version" or "bugginess factor" from that.

If you could try older package versions between 2.06-n and 2.12-1, that
would be very helpful too. The more information we have, the more
likely it is that we can do something about this.

I'll also take a quick look at what code is in the GRUB image at the
crashing offset, but I suspect that won't be enough information to
work around this.
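
[Editor's sketch] The "crashing offset" mentioned above can be derived from the dump in the original report by subtracting ImageBase from RIP. The Python below is only an illustration of that arithmetic; it assumes the image reported by the firmware ("Find image") is the loaded GRUB EFI binary, and the helper name image_offset is hypothetical.

```python
# Values copied from the exception dump in the original report.
RIP = 0x000000005FDE0E22
IMAGE_BASE = 0x000000005FDD8000
ENTRY_POINT = 0x000000005FDD9000


def image_offset(address: int, image_base: int) -> int:
    """Return the offset of an absolute address inside a loaded image."""
    if address < image_base:
        raise ValueError("address lies below the image base")
    return address - image_base


# Offsets relative to the start of the loaded image.
fault_offset = image_offset(RIP, IMAGE_BASE)
entry_offset = image_offset(ENTRY_POINT, IMAGE_BASE)

print(f"faulting instruction at image offset {fault_offset:#x}")
print(f"entry point at image offset {entry_offset:#x}")
```

The resulting offset is relative to the loaded image, so mapping it back to a function would still need the matching GRUB build and its symbols, which the dump does not provide ("No PDB").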

Mate

On Wed, Jan 31, 2024 at 9:29 AM Morten Hein Tiljeset <mtil...@uber.com> wrote:
>
> Oh yeah, I should've included that. It was 2.06.
> I'm afraid I haven't tested 2.12~rc1 on this particular machine type.

Morten Hein Tiljeset

Jan 31, 2024, 6:10:06 AM

The motherboard is

Quanta Cloud Technology Inc. S5U-MB (1U2N LBG-1G)UBR
Firmware release date: 2020-04-16.
Compliancy: UEFI 2.7.0; PI 1.6.

Morten Hein Tiljeset

Feb 2, 2024, 7:50:04 AM

So I spent an evening bisecting the issue from upstream sources.

The problem was introduced with
> 7b192ec4c term/ns8250: Use ACPI SPCR table when available to configure serial

The old behaviour can be regained by setting the unit explicitly in grub.cfg:
> serial --unit=0 --speed=115200

This solves my issue, so you can go ahead and close the bug report.
I assume the ACPI lookup logic triggered a bug in my particular firmware.
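
[Editor's sketch] The register dump in the original report is at least consistent with an ACPI-related fault, although it does not prove one. RBX and RDI both hold 5043414600000000, which is not a canonical address if 48-bit virtual addressing is assumed, and whose upper four bytes in memory order spell "FACP", the signature of the ACPI FADT. The Python below only checks those two observations; the helper names are hypothetical and the 48-bit width is an assumption.

```python
import struct

# Register values copied from the exception dump in the original report.
RBX = 0x5043414600000000
RDI = 0x5043414600000000
RSP = 0x000000006FC56008


def is_canonical(address: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits - 1) are all equal (sign-extended address).

    va_bits=48 is an assumption (4-level paging).
    """
    top = address >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def memory_bytes(value: int) -> bytes:
    """Bytes of a 64-bit value in little-endian (x86 memory) order."""
    return struct.pack("<Q", value)


print("RBX canonical:", is_canonical(RBX))
print("RSP canonical:", is_canonical(RSP))
print("RBX as bytes: ", memory_bytes(RBX))
```

A memory access through a non-canonical address raises #GP with an error code of zero, which matches the exception type and ExceptionData in the dump. Whether the value came from a misread ACPI table pointer is a guess that would need to be confirmed against the code at the faulting offset.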

/Morten

Mate Kukri

Feb 2, 2024, 8:00:04 AM

Thanks for bisecting, appreciate that.

Having boot crashes like that by default still looks really bad.
I would rather revert that commit, given that it's new and no one
has had the opportunity to rely on it yet.

Morten Hein Tiljeset

Feb 5, 2024, 4:10:05 AM

That makes sense. The commit doesn't revert cleanly, but before finding the option I had simply reverted the default:

diff --git a/grub-core/term/serial.c b/grub-core/term/serial.c
index 8260dcb7a..72a6927b4 100644
--- a/grub-core/term/serial.c
+++ b/grub-core/term/serial.c
@@ -271,7 +271,7 @@ grub_cmd_serial (grub_extcmd_context_t ctxt, int argc, char **args)
     name = args[0];

   if (!name)
-    name = "auto";
+    name = "com0";

   port = grub_serial_find (name);
   if (!port)

I think (but please double-check me on this) that it should be possible to opt in to the new behaviour by just setting serial=auto in grub.cfg.

/Morten