
Re: Kernel versions 6.x don't boot on Amiga 4000


Geert Uytterhoeven

Feb 21, 2023, 10:00:03 AM
Hi Adrian,

On Tue, Feb 21, 2023 at 3:51 PM John Paul Adrian Glaubitz
<glau...@physik.fu-berlin.de> wrote:
> I tested Debian's most recent m68k kernels from the 6.0.x and 6.1.x series and

Thanks for testing!

> neither of these boot on my Amiga 4000/060. Both get stuck at the ABCDGHIJK
> message.

Looks surprisingly similar to the issue reported by Stan.
Do the mitigations given in
https://lore.kernel.org/all/CAMuHMdUtkr2zvZiJfLXvs9d_...@mail.gmail.com
help?

> FWIW, I noticed that the kernel image itself is already over 7 MB, not sure
> whether this is a problem.

Depends on how much RAM you have ;-)

> Anyone else tried a recent kernel on their Amigas?

I really should start booting on real Amiga hardware again...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

John Paul Adrian Glaubitz

Feb 21, 2023, 10:00:03 AM
Hi!

I tested Debian's most recent m68k kernels from the 6.0.x and 6.1.x series and
neither of these boot on my Amiga 4000/060. Both get stuck at the ABCDGHIJK
message.

Will try earlier kernels until I find the one where the breakage was introduced.
The latest kernel currently known to work is 5.10.5.

FWIW, I noticed that the kernel image itself is already over 7 MB, not sure
whether this is a problem.

Anyone else tried a recent kernel on their Amigas?

Adrian

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

John Paul Adrian Glaubitz

Feb 21, 2023, 11:00:04 AM
Hi Geert!

On Tue, 2023-02-21 at 15:55 +0100, Geert Uytterhoeven wrote:
> Looks surprisingly similar to the issue reported by Stan.
> Do the mitigations given in
> https://lore.kernel.org/all/CAMuHMdUtkr2zvZiJfLXvs9d_...@mail.gmail.com
> help?

The kernel actually crashes with a backtrace:

ABCDGHIJK
[ 0.000000] Linux version 6.0.0-6-m68k (debian...@lists.debian.org) (gcc-12 (Debian 12.2.0-9) 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #1 Debian 6.0.12-1 (2022-12-09)
[ 0.000000] Enabling workaround for errata I14
[ 0.000000] printk: bootconsole [debug0] enabled
[ 0.000000] Amiga hardware found: [A4000] VIDEO BLITTER AUDIO FLOPPY A4000_IDE KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA LISA ALICE_PAL ZORRO3
[ 0.000000] initrd: 0ef0602c - 0f800000
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000008000000-0x000000f7ffffffff]
[ 0.000000] Normal empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000008000000-0x000000000f7fffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000f7fffff]
[ 0.000000] Unable to handle kernel access at virtual address (ptrval)
[ 0.000000] Oops: 00000000
[ 0.000000] Modules linked in:
[ 0.000000] PC: [<00201d3c>] memcmp+0x28/0x56
[ 0.000000] SR: 2709 SP: (ptrval) a2: 004a5580
[ 0.000000] d0: 00000003 d1: 00000001 d2: 00201d14 d3: 00000272
[ 0.000000] d4: 00012750 d5: 08023ec0 a0: 0000000c a1: 0f7ffff4
[ 0.000000] Process swapper (pid: 0, task=(ptrval))
[ 0.000000] Frame format=4 fault addr=0f7ffff4 fslw=01051000
[ 0.000000] Stack from 004a3fac:
[ 0.000000] 00201d14 00000272 00374e40 0f7ffff4 0f800000 00534b22 0f7ffff4 0042e325
[ 0.000000] 0000000c 0055c000 00000272 00012750 08023ec0 00012750 080dbf48 08001000
[ 0.000000] 08001000 0f7ffff0 00553d9a 00000000 00533872
[ 0.000000] Call Trace: [<00201d14>] memcmp+0x0/0x56
[ 0.000000] [<00374e40>] _printk+0x0/0x18
[ 0.000000] [<00534b22>] start_kernel+0x8a/0x5d6
[ 0.000000] [<00012750>] LOGTBL+0x228/0x800
[ 0.000000] [<00012750>] LOGTBL+0x228/0x800
[ 0.000000] [<00533872>] _sinittext+0x872/0x11f8
[ 0.000000]
[ 0.000000] Code: b288 661e 4280 6030 2a49 284b 264c 224d <bb8c> 66ea 5988 7003 b088 65f0 224d 264c 60dc 4283 1631 1800 4282 1433 1800 2003
[ 0.000000] Disabling lock debugging due to kernel taint
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

> > FWIW, I noticed that the kernel image itself is already over 7 MB, not sure
> > whether this is a problem.
>
> Depends on how much RAM you have ;-)

128 MB.

> > Anyone else tried a recent kernel on their Amigas?
>
> I really should start booting on real Amiga hardware again...

You should ;-).

Michael Schmitz

Feb 21, 2023, 4:20:02 PM
Hi Adrian,

On 22/02/23 04:53, John Paul Adrian Glaubitz wrote:
> Hi Geert!
>
> On Tue, 2023-02-21 at 15:55 +0100, Geert Uytterhoeven wrote:
>> Looks surprisingly similar to the issue reported by Stan.
>> Do the mitigations given in
>> https://lore.kernel.org/all/CAMuHMdUtkr2zvZiJfLXvs9d_...@mail.gmail.com
>> help?
> The kernel actually crashes with a backtrace:
>
> ABCDGHIJK
> [ 0.000000] Linux version 6.0.0-6-m68k (debian...@lists.debian.org) (gcc-12 (Debian 12.2.0-9) 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #1 Debian 6.0.12-1 (2022-12-09)
> [ 0.000000] Enabling workaround for errata I14
> [ 0.000000] printk: bootconsole [debug0] enabled
> [ 0.000000] Amiga hardware found: [A4000] VIDEO BLITTER AUDIO FLOPPY A4000_IDE KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA LISA ALICE_PAL ZORRO3
> [ 0.000000] initrd: 0ef0602c - 0f800000
> [ 0.000000] Zone ranges:
> [ 0.000000] DMA [mem 0x0000000008000000-0x000000f7ffffffff]
> [ 0.000000] Normal empty
> [ 0.000000] Movable zone start for each node
> [ 0.000000] Early memory node ranges
> [ 0.000000] node 0: [mem 0x0000000008000000-0x000000000f7fffff]

In both your case and Kars', the memory does not start at 0x0. Kars
finds all memory reserved on his HP.

6.2rc8 boots fine on my 030 (memory starting at 0x0).

> [ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000f7fffff]
> [ 0.000000] Unable to handle kernel access at virtual address (ptrval)
> [ 0.000000] Oops: 00000000
> [ 0.000000] Modules linked in:
> [ 0.000000] PC: [<00201d3c>] memcmp+0x28/0x56
> [ 0.000000] SR: 2709 SP: (ptrval) a2: 004a5580
> [ 0.000000] d0: 00000003 d1: 00000001 d2: 00201d14 d3: 00000272
> [ 0.000000] d4: 00012750 d5: 08023ec0 a0: 0000000c a1: 0f7ffff4

a1 is just before the end of your RAM chunk. If that's a longword
access, you'd fall over the edge :) Can you disassemble the code snippet
(or memcmp()) so we can see what's happening?

I do recall recent changes to the mm code, but that was for NOMMU. I
wonder whether there was anything else that would introduce an implicit
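assumption about memory starting at 0x0 ...

>>> Anyone else tried a recent kernel on their Amigas?
>> I really should start booting on real Amiga hardware again...
> You should ;-).

Thirded :-)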

Cheers,

    Michael

>
> Adrian
>

John Paul Adrian Glaubitz

Feb 21, 2023, 4:50:04 PM
Hi Michael!

On Wed, 2023-02-22 at 10:09 +1300, Michael Schmitz wrote:
> a1 is just  before the end of your RAM chunk. If that's a longword
> access, you'd fall over the edge :) Can you disassemble the code snippet
> (or memcmp()) so we can see what's happening?

Here you go:

00201d14 <memcmp>:
201d14: 48e7 301c moveml %d2-%d3/%a3-%a5,%sp@-
201d18: 226f 0018 moveal %sp@(24),%a1
201d1c: 266f 001c moveal %sp@(28),%a3
201d20: 206f 0020 moveal %sp@(32),%a0
201d24: 7003 moveq #3,%d0
201d26: b088 cmpl %a0,%d0
201d28: 650a bcss 201d34 <memcmp+0x20>
201d2a: 4281 clrl %d1
201d2c: b288 cmpl %a0,%d1
201d2e: 661e bnes 201d4e <memcmp+0x3a>
201d30: 4280 clrl %d0
201d32: 6030 bras 201d64 <memcmp+0x50>
201d34: 2a49 moveal %a1,%a5
201d36: 284b moveal %a3,%a4
201d38: 264c moveal %a4,%a3
201d3a: 224d moveal %a5,%a1
201d3c: bb8c cmpml %a4@+,%a5@+
201d3e: 66ea bnes 201d2a <memcmp+0x16>
201d40: 5988 subql #4,%a0
201d42: 7003 moveq #3,%d0
201d44: b088 cmpl %a0,%d0
201d46: 65f0 bcss 201d38 <memcmp+0x24>
201d48: 224d moveal %a5,%a1
201d4a: 264c moveal %a4,%a3
201d4c: 60dc bras 201d2a <memcmp+0x16>
201d4e: 4283 clrl %d3
201d50: 1631 1800 moveb %a1@(0,%d1:l),%d3
201d54: 4282 clrl %d2
201d56: 1433 1800 moveb %a3@(0,%d1:l),%d2
201d5a: 2003 movel %d3,%d0
201d5c: 9082 subl %d2,%d0
201d5e: 5281 addql #1,%d1
201d60: b483 cmpl %d3,%d2
201d62: 67c8 beqs 201d2c <memcmp+0x18>
201d64: 4cdf 380c moveml %sp@+,%d2-%d3/%a3-%a5
201d68: 4e75 rts
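
Read as control flow, the disassembly above appears to compare four bytes at a time while more than three bytes remain, and then to finish byte by byte. The Python model below is a minimal sketch of that reading only. It is not the kernel's source, and the name `memcmp_model` is made up for illustration.

```python
def memcmp_model(cs: bytes, ct: bytes, count: int) -> int:
    """Model of the control flow seen in the disassembly (illustrative only)."""
    pos = 0

    # Longword phase (201d26..201d46): loop while count > 3.
    # The cmpml at 201d3c reads 4 bytes from each buffer and post-increments.
    while count > 3:
        if cs[pos:pos + 4] != ct[pos:pos + 4]:
            break  # bnes 201d2a: let the byte loop locate the differing byte
        pos += 4
        count -= 4  # subql #4,%a0

    # Byte phase (201d2a..201d62): compare the remaining bytes one at a time.
    for i in range(count):
        diff = cs[pos + i] - ct[pos + i]  # subl %d2,%d0
        if diff != 0:
            return diff
    return 0


print(memcmp_model(b"abcdefgh", b"abcdefgh", 8))  # 0
print(memcmp_model(b"abcdefgh", b"abcdefgx", 8))  # -16
```

With a count of 0xc, as in the register dump (a0), this model performs three longword reads from each buffer before the byte phase runs with nothing left to compare.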

The kernel image is actually unstripped. Is there a config option for that?

Do we want to keep symbols in a non-debug kernel?

> I do recall recent changes to the mm code, but that was for NOMMU. I
> wonder whether there was anything else that would introduce an implicit
> assumption about memory starting at 0x0 ...

Sounds like a possible culprit.

Michael Schmitz

Feb 21, 2023, 8:00:04 PM
Hi Adrian,

On 22/02/23 10:46, John Paul Adrian Glaubitz wrote:
> Hi Michael!
>
> On Wed, 2023-02-22 at 10:09 +1300, Michael Schmitz wrote:
>> a1 is just  before the end of your RAM chunk. If that's a longword

Actually it isn't that close - if I read the stack correctly, we're
comparing 0xc bytes from 0x0f7ffff4, which runs up to 0x0f7fffff.

The post-increment of a5 to 0x0f800000 might cause a pre-fetch beyond
end of memory - how does that get handled?
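
The address arithmetic behind this reading can be checked with a short sketch. The constants are copied from the oops and boot log quoted in this thread; the helper `longword_reads` is a hypothetical name, and the script only does arithmetic. It makes no claim about how the CPU or MMU treats these accesses.

```python
from typing import List

NODE_END = 0x0F7FFFFF  # inclusive end of node 0, from the boot log
START = 0x0F7FFFF4     # a1 and fault address, from the oops
COUNT = 0xC            # a0, from the oops


def longword_reads(start: int, count: int) -> List[int]:
    """Addresses read by a loop that compares 4 bytes at a time while count > 3."""
    return [start + offset for offset in range(0, count - 3, 4)]


reads = longword_reads(START, COUNT)
last_byte = reads[-1] + 3      # highest byte the longword loop touches
pointer_after = reads[-1] + 4  # pointer value after the final post-increment

print([hex(a) for a in reads])  # ['0xf7ffff4', '0xf7ffff8', '0xf7ffffc']
print(last_byte == NODE_END)    # True
print(hex(pointer_after))       # 0xf800000
```

All twelve bytes lie inside node 0, and only the post-incremented pointer reaches the first address past it. Note that the fault address reported in the oops is the first of these three reads, not the last.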

>> access, you'd fall over the edge :) Can you disassemble the code snippet
>> (or memcmp()) so we can see what's happening?
> Here you go:
>
> 00201d14 <memcmp>:
> 201d14: 48e7 301c moveml %d2-%d3/%a3-%a5,%sp@-
> 201d18: 226f 0018 moveal %sp@(24),%a1
> 201d1c: 266f 001c moveal %sp@(28),%a3
> 201d20: 206f 0020 moveal %sp@(32),%a0
> 201d24: 7003 moveq #3,%d0
> 201d26: b088 cmpl %a0,%d0
> 201d28: 650a bcss 201d34 <memcmp+0x20>
> 201d2a: 4281 clrl %d1
> 201d2c: b288 cmpl %a0,%d1
> 201d2e: 661e bnes 201d4e <memcmp+0x3a>
> 201d30: 4280 clrl %d0
> 201d32: 6030 bras 201d64 <memcmp+0x50>
> 201d34: 2a49 moveal %a1,%a5 <======= 0x0f7ffff4
> 201d36: 284b moveal %a3,%a4
> 201d38: 264c moveal %a4,%a3
> 201d3a: 224d moveal %a5,%a1
> 201d3c: bb8c cmpml %a4@+,%a5@+ <======= a5 will be 0x0f800000 after post-increment
> 201d3e: 66ea bnes 201d2a <memcmp+0x16>
> 201d40: 5988 subql #4,%a0
> 201d42: 7003 moveq #3,%d0
> 201d44: b088 cmpl %a0,%d0
> 201d46: 65f0 bcss 201d38 <memcmp+0x24>
> 201d48: 224d moveal %a5,%a1
> 201d4a: 264c moveal %a4,%a3
> 201d4c: 60dc bras 201d2a <memcmp+0x16>
> 201d4e: 4283 clrl %d3
> 201d50: 1631 1800 moveb %a1@(0,%d1:l),%d3
> 201d54: 4282 clrl %d2
> 201d56: 1433 1800 moveb %a3@(0,%d1:l),%d2
> 201d5a: 2003 movel %d3,%d0
> 201d5c: 9082 subl %d2,%d0
> 201d5e: 5281 addql #1,%d1
> 201d60: b483 cmpl %d3,%d2
> 201d62: 67c8 beqs 201d2c <memcmp+0x18>
> 201d64: 4cdf 380c moveml %sp@+,%d2-%d3/%a3-%a5
> 201d68: 4e75 rts
>
> The kernel image is actually unstripped. Is there a config option for that?

I'm sure the compressed kernel image is stripped but includes the kernel
symbol table (see below). The symbol table is definitely good to have
(otherwise you'd have to figure out what all the addresses on the stack mean
from a separate symbol table).

> Do we want to keep symbols in a non-debug kernel?

Definitely ...

Cheers,

    Michael

Output of objdump -h:

vmlinux-6.2.0-rc8-atari-fpuemu-atafbfix+:     file format elf32-m68k

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0030169c  00001000  00001000  00001000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 __ex_table    00001ab0  003026a0  003026a0  003026a0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .rodata       000c81e8  00305000  00305000  00305000  2**4
                  CONTENTS, ALLOC, LOAD, DATA
  3 __ksymtab     00009a14  003cd1e8  003cd1e8  003cd1e8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 __ksymtab_gpl 000057c0  003d6bfc  003d6bfc  003d6bfc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 __ksymtab_strings 000166a3  003dc3bc  003dc3bc  003dc3bc  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 __param       000006cc  003f2a60  003f2a60  003f2a60  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 __modver      00000088  003f312c  003f312c  003f312c  2**1
                  CONTENTS, ALLOC, LOAD, DATA
  8 .notes        00000054  003f31b4  003f31b4  003f31b4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 .data         00051a20  003f4000  003f4000  003f4000  2**4
                  CONTENTS, ALLOC, LOAD, DATA
 10 .bss          0002266c  00445a20  00445a20  00445a20  2**4
                  ALLOC
 11 .init.text    00017be0  00469000  00469000  00447000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 12 .init.data    00004c1c  00480be0  00480be0  0045ebe0  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 13 .m68k_fixup   00000480  004857fc  004857fc  004637fc  2**0
                  CONTENTS, ALLOC, LOAD, DATA
 14 .init_end     00000384  00485c7c  00485c7c  00463c7c  2**0
                  ALLOC
 15 .comment      0000002d  00000000  00000000  00463c7c  2**0
                  CONTENTS, READONLY

Michael Schmitz

Feb 23, 2023, 1:30:03 PM
Correcting myself again...

On 22/02/23 13:53, Michael Schmitz wrote:
> Hi Adrian,
>
> On 22/02/23 10:46, John Paul Adrian Glaubitz wrote:
>> Hi Michael!
>>
>> On Wed, 2023-02-22 at 10:09 +1300, Michael Schmitz wrote:
>>> a1 is just  before the end of your RAM chunk. If that's a longword
>
> Actually it isn't that close - if I read the stack correctly, we're
> comparing 0xc bytes from 0x0f7ffff4, which runs up to 0x0f7fffff.
>
> The post-increment of a5 to 0x0f800000 might cause a pre-fetch beyond
> end of memory - how does that get handled?

The stack frame format in this case (at least, going by the 68000 series
PRM) seems to indicate it's not something to do with prefetch.

Can you try Kars' recent patch? Maybe the old bug calculating the RAM
end address only now got 'active' on your configuration due to more
recent MM changes?

Cheers,

    Michael

Stephen Walsh

Feb 23, 2023, 7:20:04 PM
Hi Adrian,

On Tue, 21 Feb 2023 15:50:52 +0100
John Paul Adrian Glaubitz <glau...@physik.fu-berlin.de> wrote:

> Will try earlier kernels until I find the one where the breakage was
> introduced. Currently known latest kernel to work is 5.10.5.

From my testing last year trying to boot my Amiga 3000, the break
happens sometime after 5.15.0-2 (the last working kernel for me).

I've not been able to successfully boot any later kernels since.

All my attempts to compile my own kernel have failed.


--
Stephen - Vk3heg

Stephen Walsh

Feb 23, 2023, 8:10:03 PM
FYI:

Just caught this while trying to recompile kernel 5.15.2 from kernel.org.
Builds under debootstrap/sbuild and under qemu-system-m68k both produce these warnings:


CC mm/process_vm_access.o
CC mm/page_alloc.o
mm/page_alloc.c: In function ‘mem_init_print_info’:
mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
8167 | adj_init_size(__init_begin, __init_end, init_data_size,
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:27: note: use ‘&__init_begin[0] <= &_sinittext[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
8167 | adj_init_size(__init_begin, __init_end, init_data_size,
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
8167 | adj_init_size(__init_begin, __init_end, init_data_size,
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: note: use ‘&_sinittext[0] < &__init_end[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
8167 | adj_init_size(__init_begin, __init_end, init_data_size,
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:27: note: use ‘&_stext[0] <= &_sinittext[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: note: use ‘&_sinittext[0] < &_etext[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:27: note: use ‘&_sdata[0] <= &__init_begin[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: note: use ‘&__init_begin[0] < &_edata[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:27: note: use ‘&_stext[0] <= &__start_rodata[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: note: use ‘&__start_rodata[0] < &_etext[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:27: note: use ‘&_sdata[0] <= &__start_rodata[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^~
mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
| ^~~~~~~~~~~~~
mm/page_alloc.c:8163:41: note: use ‘&__start_rodata[0] < &_edata[0]’ to compare the addresses
8163 | if (start <= pos && pos < end && size > adj) \
| ^
mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
| ^~~~~~~~~~~~~
CC mm/init-mm.o
CC mm/memblock.o






--
Stephen - Vk3heg

Michael Schmitz

Feb 23, 2023, 10:10:03 PM
Hi Stephen,

That's apparently been corrected in later versions, by commit
ca831f29f8f25c97182e726429b38c0802200c8f (in since 5.17).

I doubt this would lead to different code generated.

Which was the first broken version you tried? That would narrow down the
search range considerably...

Cheers,

Michael

Michael Schmitz

Feb 24, 2023, 2:50:03 PM
Hi Stephen, Adrian

the only commits to hit arch/m68k/mm between 5.15 and now are:

29f28f8b826d m68k: fix livelock in uaccess
6d0b92254510 m68k/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
d92725256b4f mm: avoid unnecessary page fault retires on shared memory types
f95a387cdeb3 m68k: coldfire: drop ISA_DMA_API support
05d51e42df06 m68k: Introduce a virtual m68k machine
c4d5b6eef258 m68k: mm: Remove check for VM_IO to fix deferred I/O
36ef159f4408 mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit
0e25498f8cd4 exit: Add and use make_task_dead.
376e3fdecb0d m68k: Enable memtest functionality
952eea9b01e4 memblock: allow to specify flags with memblock_add_node()

The first is a fix for the second so these should be tested together.
None appear suspect to me.

Running memtest could incur a boot delay but AFAIR that isn't enabled by
default, and it isn't implicated in the panic log Adrian posted.

Cheers,

Michael

John Paul Adrian Glaubitz

Feb 24, 2023, 2:50:04 PM
Hi Michael!

On Sat, 2023-02-25 at 08:39 +1300, Michael Schmitz wrote:
> the only commits to hit arch/m68k/mm between 5.15 and now are:
>
> 29f28f8b826d m68k: fix livelock in uaccess
> 6d0b92254510 m68k/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
> d92725256b4f mm: avoid unnecessary page fault retires on shared memory types
> f95a387cdeb3 m68k: coldfire: drop ISA_DMA_API support
> 05d51e42df06 m68k: Introduce a virtual m68k machine
> c4d5b6eef258 m68k: mm: Remove check for VM_IO to fix deferred I/O
> 36ef159f4408 mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit
> 0e25498f8cd4 exit: Add and use make_task_dead.
> 376e3fdecb0d m68k: Enable memtest functionality
> 952eea9b01e4 memblock: allow to specify flags with memblock_add_node()
>
> The first is a fix for the second so these should be tested together.
> None appear suspect to me.
>
> Running memtest could incur a boot delay but AFAIR that isn't enabled by
> default, and it isn't implicated in the panic log Adrian posted.

I don't have time this weekend to bisect the issue, but I think I can start
bisecting it on Sunday evening. I will give it a try on Amiga Forever.

Michael Schmitz

Feb 24, 2023, 3:50:04 PM
Hi Adrian,

On 25.02.2023 at 08:49, John Paul Adrian Glaubitz wrote:
> Hi Michael!
>
> On Sat, 2023-02-25 at 08:39 +1300, Michael Schmitz wrote:
>> the only commits to hit arch/m68k/mm between 5.15 and now are:
>>
>> 29f28f8b826d m68k: fix livelock in uaccess
>> 6d0b92254510 m68k/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
>> d92725256b4f mm: avoid unnecessary page fault retires on shared memory types
>> f95a387cdeb3 m68k: coldfire: drop ISA_DMA_API support
>> 05d51e42df06 m68k: Introduce a virtual m68k machine
>> c4d5b6eef258 m68k: mm: Remove check for VM_IO to fix deferred I/O
>> 36ef159f4408 mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit
>> 0e25498f8cd4 exit: Add and use make_task_dead.
>> 376e3fdecb0d m68k: Enable memtest functionality
>> 952eea9b01e4 memblock: allow to specify flags with memblock_add_node()
>>
>> The first is a fix for the second so these should be tested together.
>> None appear suspect to me.
>>
>> Running memtest could incur a boot delay but AFAIR that isn't enabled by
>> default, and it isn't implicated in the panic log Adrian posted.
>
> I don't have time this weekend to bisect the issue. But I think, I can start
> bisecting it on Sunday evening. I will give it a try on Amiga Forever.

I had hoped we could maybe narrow down the range to bisect by compile...
As it stands, testing each Debian kernel image released since 5.15.2
already requires a bisect approach so you indeed have your work cut out
for you.

Let me know what you find - the list of commits in mm/ is too huge to
contemplate in its entirety but might be easier to digest from one
release to another.
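
Bisecting over released kernel images, as described above, amounts to a binary search over an ordered list of versions. The sketch below assumes the list is ordered oldest to newest, that the first entry boots and the last does not, and that the breakage persists once introduced. The version labels and the `boots` callback are illustrative stand-ins for manual boot tests on real hardware.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], boots: Callable[[str], bool]) -> str:
    """Return the earliest version that fails to boot.

    Assumes versions[0] boots, versions[-1] does not, and every version
    after the first failing one also fails.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if boots(versions[mid]):
            good = mid  # breakage was introduced later
        else:
            bad = mid   # breakage was introduced here or earlier
    return versions[bad]


# Illustrative labels only; each call to boots() stands for one boot attempt.
series = ["5.15", "5.16", "5.17", "5.18", "5.19", "6.0", "6.1"]
print(first_bad(series, lambda v: v in {"5.15", "5.16"}))  # 5.17
```

Seven candidate images need three boot attempts here instead of up to five when testing them in order, which matters when each attempt means rebooting an Amiga.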

Cheers,

Michael


>
> Adrian
>

Geert Uytterhoeven

Feb 26, 2023, 6:10:04 AM
Hi Adrian,

On Tue, Feb 21, 2023 at 4:53 PM John Paul Adrian Glaubitz
<glau...@physik.fu-berlin.de> wrote:
> On Tue, 2023-02-21 at 15:55 +0100, Geert Uytterhoeven wrote:
> > Looks surprisingly similar to the issue reported by Stan.
> > Do the mitigations given in
> > https://lore.kernel.org/all/CAMuHMdUtkr2zvZiJfLXvs9d_...@mail.gmail.com
> > help?
>
> The kernel actually crashes with a backtrace:
>
> ABCDGHIJK
> [ 0.000000] Linux version 6.0.0-6-m68k (debian...@lists.debian.org) (gcc-12 (Debian 12.2.0-9) 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #1 Debian 6.0.12-1 (2022-12-09)
> [ 0.000000] Enabling workaround for errata I14
> [ 0.000000] printk: bootconsole [debug0] enabled
> [ 0.000000] Amiga hardware found: [A4000] VIDEO BLITTER AUDIO FLOPPY A4000_IDE KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA LISA ALICE_PAL ZORRO3
> [ 0.000000] initrd: 0ef0602c - 0f800000
> [ 0.000000] Zone ranges:
> [ 0.000000] DMA [mem 0x0000000008000000-0x000000f7ffffffff]
> [ 0.000000] Normal empty
> [ 0.000000] Movable zone start for each node
> [ 0.000000] Early memory node ranges
> [ 0.000000] node 0: [mem 0x0000000008000000-0x000000000f7fffff]
> [ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000f7fffff]
> [ 0.000000] Unable to handle kernel access at virtual address (ptrval)

I see the same issue on my A4000, bisecting...

Stephen Walsh

Feb 26, 2023, 6:50:04 AM
Hi Michael,

> that's apparently been corrected in later versions. Commit
> ca831f29f8f25c97182e726429b38c0802200c8f (in from 5.17).
>
> I doubt this would lead to different code generated.
>
> Which was the first broken version you tried? That would narrow down
> the search range considerably...

Version 5.16-02-m68k is the last working version for me on my A3000
(reported back in September last year; only now have I been able to get
back to it).

I downloaded the kernel image debs from snapshot.debian.org.

Versions 5.16-3 through 5.16-6 boot but fall back to the initramfs,
saying they can't find the root file system. The hard disk is listed
during the boot process, though.

It's not the size of the initramfs, but a kernel issue. I changed the
initramfs settings from "most" to "dep", shrinking it in size, and
still had kernel boot issues.

Kernels from 5.17 on (5.17/5.19/6.x) fail and don't even start the heartbeat.

Searching for SAVEKMSG magic...
Found 2674 bytes at 0x001e0010
>>>>>>>>>>>>>>>>>>>>
[ 0.000000] Linux version 5.17.0-1-m68k (debian...@lists.debian.org) (gcc-11 (Debian 11.2.0-20) 11.2.0, GNU ld (GNU Binutils for Debian) 2.38) #1 Debian 5.17.3-1 (2022-04-18)
[ 0.000000] printk: console [debug0] enabled
[ 0.000000] Amiga hardware found: [A3000] VIDEO BLITTER AMBER_FF AUDIO FLOPPY A3000_SCSI KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA DENISE_HR AGNUS_HR_PAL MAGIC_REKICK ZORRO3
[ 0.000000] initrd: 0f7f395d - 10000000
[ 0.000000] Ignoring memory chunk at 0x7800000:0x800000 before the first chunk
[ 0.000000] Fix your bootloader or use a memfile to make use of this area!
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000008000000-0x000000ffffffffff]
[ 0.000000] Normal empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000008000000-0x000000000fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000fffffff]
[ 0.000000] Unable to handle kernel access at virtual address (ptrval)
[ 0.000000] Oops: 00000000
[ 0.000000] Modules linked in:
[ 0.000000] PC: [<001edcac>] memcmp+0x2c/0x5c
[ 0.000000] SR: 2700 SP: (ptrval) a2: 0047d530
[ 0.000000] d0: 00408ab1 d1: 0ffffff8 d2: 001edc80 d3: 0000019e
[ 0.000000] d4: 0804c588 d5: 0080c6a3 a0: 0000000c a1: 0ffffff4
[ 0.000000] Process swapper (pid: 0, task=(ptrval))
[ 0.000000] Frame format=7 eff addr=0047bfb8 ssw=0505 faddr=0ffffff4
[ 0.000000] wb 1 stat/addr/data: 0005 0804c588 0080c6a3
[ 0.000000] wb 2 stat/addr/data: 0005 0053c000 0000019e
[ 0.000000] wb 3 stat/addr/data: 0005 0047bfb0 001edc80
[ 0.000000] push data: 0080c6a3 00353a8a 08001000 08056094
[ 0.000000] Stack from 0047bfb0:
[ 0.000000] 001edc80 0000019e 00353a8a 00514b0e 0ffffff4 00408aad 0000000c 0053c000
[ 0.000000] 0000019e 0804c588 0080c6a3 08050258 0000ffff 08062278 08001000 08056094
[ 0.000000] 0ffffff0 005333b8 00000000 00513872
[ 0.000000] Call Trace: [<001edc80>] memcmp+0x0/0x5c
[ 0.000000] [<00353a8a>] _printk+0x0/0x18
[ 0.000000] [<00514b0e>] start_kernel+0x86/0x5ca
[ 0.000000] [<0000ffff>] sz_long+0x5/0x6
[ 0.000000] [<00513872>] _sinittext+0x872/0x11f8
[ 0.000000]
[ 0.000000] Code: 4280 6036 2209 200b 2640 2241 5881 5880 <2411> b493 66e4 2241 2640 5988 7403 b488 65e6 60d6 4283 1631 1800 4282 1433 1800
[ 0.000000] Disabling lock debugging due to kernel taint
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
<<<<<<<<<<<<<<<<<<<<




Searching for SAVEKMSG magic...
Found 2632 bytes at 0x001e0010
>>>>>>>>>>>>>>>>>>>>
[ 0.000000] Linux version 5.18.0-3-m68k (debian...@lists.debian.org) (gcc-11 (Debian 11.3.0-4) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 Debian 5.18.14-1 (2022-07-23)
[ 0.000000] printk: console [debug0] enabled
[ 0.000000] Amiga hardware found: [A3000] VIDEO BLITTER AMBER_FF AUDIO FLOPPY A3000_SCSI KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA DENISE_HR AGNUS_HR_PAL MAGIC_REKICK ZORRO3
[ 0.000000] initrd: 0facd79f - 10000000
[ 0.000000] Ignoring memory chunk at 0x7800000:0x800000 before the first chunk
[ 0.000000] Fix your bootloader or use a memfile to make use of this area!
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000008000000-0x000000ffffffffff]
[ 0.000000] Normal empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000008000000-0x000000000fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000fffffff]
[ 0.000000] Unable to handle kernel access at virtual address (ptrval)
[ 0.000000] Oops: 00000000
[ 0.000000] Modules linked in:
[ 0.000000] PC: [<001ed3a8>] memcmp+0x2c/0x5c
[ 0.000000] SR: 2700 SP: (ptrval) a2: 00481530
[ 0.000000] d0: 0040af9d d1: 0ffffff8 d2: 001ed37c d3: 0000019e
[ 0.000000] d4: 0806ba68 d5: 00532861 a0: 0000000c a1: 0ffffff4
[ 0.000000] Process swapper (pid: 0, task=(ptrval))
[ 0.000000] Frame format=7 eff addr=0047ffbc ssw=0505 faddr=0ffffff4
[ 0.000000] wb 1 stat/addr/data: 0005 0806ba68 00532861
[ 0.000000] wb 2 stat/addr/data: 0005 00536000 0000019e
[ 0.000000] wb 3 stat/addr/data: 0005 0047ffb4 001ed37c
[ 0.000000] push data: 00532861 00355728 08001000 0804dcc4
[ 0.000000] Stack from 0047ffb4:
[ 0.000000] 001ed37c 0000019e 00355728 0050fb0e 0ffffff4 0040af99 0000000c 00536000
[ 0.000000] 0000019e 0806ba68 00532861 0806f738 08091288 08001000 0804dcc4 0ffffff0
[ 0.000000] 0052e2b8 00000000 0050e872
[ 0.000000] Call Trace: [<001ed37c>] memcmp+0x0/0x5c
[ 0.000000] [<00355728>] _printk+0x0/0x18
[ 0.000000] [<0050fb0e>] start_kernel+0x86/0x5a0
[ 0.000000] [<0050e872>] _sinittext+0x872/0x11f8
[ 0.000000]
[ 0.000000] Code: 4280 6036 2209 200b 2640 2241 5881 5880 <2411> b493 66e4 2241 2640 5988 7403 b488 65e6 60d6 4283 1631 1800 4282 1433 1800
[ 0.000000] Disabling lock debugging due to kernel taint
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
<<<<<<<<<<<<<<<<<<<<

--
Stephen - Vk3heg

Geert Uytterhoeven

Feb 26, 2023, 8:00:04 AM
Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
functionality") in v5.17-rc1. Reverting that on top of latest fixes the
issue.

Michael Schmitz

Feb 26, 2023, 9:10:04 PM
Hi Geert, Stephen,
Yes, I'm sorry to say that was the only likely candidate. Can't see why
though - are Macs all configured to have RAM start at address zero, and
possibly contiguous, Finn?

Cheers,

Michael


>
> Gr{oetje,eeting}s,
>
> Geert
>

Michael Schmitz

Feb 26, 2023, 9:20:04 PM
Hi Geert,

On 27.02.2023 at 01:52, Geert Uytterhoeven wrote:
What about instead changing the piece of code that you identified as
problematic in Kars' case to claim/map the last few bits as well
(memblock_cap_size() to be precise)?

I wonder whether Finn's memtest patch merely exposed another MM bug that
we don't hit as easily (not without putting memory under a lot of pressure)?

Cheers,

Michael

Finn Thain

Feb 27, 2023, 1:10:03 AM
On Mon, 27 Feb 2023, Michael Schmitz wrote:

> >
> > Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
> > functionality") in v5.17-rc1. Reverting that on top of latest fixes
> > the issue.
>
> Yes, I'm sorry to say that was the only likely candidate. Can't see why
> though - are Macs all configured to have RAM start at address zero, and
> possibly contiguous, Finn?
>

I don't really understand your question. This was not a Mac patch. The
issue seems to be about the locations initrd_start and initrd_end in
relation to the various memory segments (?)

This seems to be the same bug that was raised about 6 months ago... I had
thought it was a bootloader bug but I'm out of my depth here.

https://lists.debian.org/debian-68k/2022/09/msg00047.html
https://lists.debian.org/debian-68k/2022/09/msg00051.html
https://lists.debian.org/debian-68k/2022/09/msg00055.html

Finn Thain

Feb 27, 2023, 1:50:03 AM
On Mon, 27 Feb 2023, Michael Schmitz wrote:

>
> I wonder whether Finn's memtest patch merely exposed another MM bug
>

A kernel patch may be easier than a bootloader patch (even if this is a
bootloader bug) particularly if it affects multiple platforms.

A partial revert of my patch (below) will probably avoid the issue, but
with the side effect that use of memtest will clobber the initrd.

The initrd and memtest features aren't usually needed together. At the
time when I needed the memtest feature I did not have confidence in the
hardware. An initrd wasn't very useful at that point.

diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
index 3a2bb2e8fdad..92f1b9268dff 100644
--- a/arch/m68k/kernel/setup_mm.c
+++ b/arch/m68k/kernel/setup_mm.c
@@ -326,6 +326,8 @@ void __init setup_arch(char **cmdline_p)
 		panic("No configuration setup");
 	}

+	paging_init();
+
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (m68k_ramdisk.size) {
 		memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);
@@ -335,8 +337,6 @@ void __init setup_arch(char **cmdline_p)
 	}
 #endif

-	paging_init();
-
 #ifdef CONFIG_NATFEAT
 	nf_init();
 #endif

Michael Schmitz

Feb 27, 2023, 2:30:03 AM
Hi Finn,

On 27.02.2023 at 18:55, Finn Thain wrote:
> On Mon, 27 Feb 2023, Michael Schmitz wrote:
>
>>>
>>> Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
>>> functionality") in v5.17-rc1. Reverting that on top of latest fixes
>>> the issue.
>>
>> Yes, I'm sorry to say that was the only likely candidate. Can't see why
>> though - are Macs all configured to have RAM start at address zero, and
>> possibly contiguous, Finn?
>>
>
> I don't really understand your question. This was not a Mac patch. The
> issue seems to be about the locations initrd_start and initrd_end in
> relation to the various memory segments (?)

I didn't realize that - thanks for pointing this out.

>
> This seems to be the same bug that was raised about 6 months ago... I had
> thought it was a bootloader bug but I'm out of my depth here.
>
> https://lists.debian.org/debian-68k/2022/09/msg00047.html
> https://lists.debian.org/debian-68k/2022/09/msg00051.html
> https://lists.debian.org/debian-68k/2022/09/msg00055.html

I had forgotten all about that one... Thanks for jogging my memory!

In this case though, the bug happens when the ramdisk is loaded in the
lowest address memory chunk, at least at a lower address than the one
the kernel runs from.

The crashes in the above thread were all from boots where the initrd got
loaded at the end of the memory chunk the kernel runs from.

Time to try using copy_from_kernel_nofault() to copy the ramdisk into
its final location? (just kidding)

Cheers,

Michael

Finn Thain

Feb 27, 2023, 3:20:03 AM

On Mon, 27 Feb 2023, I wrote:

> On Mon, 27 Feb 2023, Michael Schmitz wrote:
>
> >
> > I wonder whether Finn's memtest patch merely exposed another MM bug
> >
>
> A kernel patch may be easier than a bootloader patch (even if this is a
> bootloader bug) particularly if it affects multiple platforms.
>
> A partial revert of my patch (below) will probably avoid the issue, but
> with the side effect that use of memtest will clobber the initrd.
>

Maybe that's for the best now that the initrd/initramfs has grown so
large. That portion of memory is presently skipped by memtest, which means
you'd have to disable the initrd to get good coverage from memtest anyway.

Geert Uytterhoeven

Feb 27, 2023, 3:30:03 AM
Hi Finn,

FTR, here is the diff of the dmesg between good and bad:

+initrd: 07f61166 - 08000000

This is wrong (note the 6 trailing zeros), as phys_to_virt() is not
working correctly yet (module_fixup() is called from paging_init()).

Zone ranges:
DMA [mem 0x0000000007400000-0x0000007fffffffff]
Normal empty
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x0000000007400000-0x0000000007ffffff]
Initmem setup node 0 [mem 0x0000000007400000-0x0000000007ffffff]
-initrd: 00b61166 - 00c00000

This is correct (note the 5 trailing zeros).

-pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
-pcpu-alloc: [0] 0
[...]
+Unable to handle kernel access at virtual address (ptrval)
+Oops: 00000000
+Modules linked in:
+PC: [<002c11be>] memcmp+0x2c/0x5c
+SR: 2700 SP: (ptrval) a2: 003bd560
+d0: 0035eb83 d1: 07fffff8 d2: 002c1192 d3: 000000e6
+d4: 000684e8 d5: 00447000 a0: 0000000c a1: 07fffff4
+Process swapper (pid: 0, task=(ptrval))
+Frame format=7 eff addr=003bbfbc ssw=0505 faddr=07fffff4
+wb 1 stat/addr/data: 0005 00447000 07401000
+wb 2 stat/addr/data: 0005 000000e6 000684e8
+wb 3 stat/addr/data: 0005 003bbfb4 002c1192
+push data: 07401000 002c7d82 07401000 074a2cf4
+Stack from 003bbfb4:
+002c1192 000000e6 002c7d82 00428eda 07fffff4 0035eb7f 0000000c 00447000
+000000e6 000684e8 00447000 07401000 074bec08 07401000 074a2cf4 07fffff0
+00440406 00000000 00428322
+Call Trace: [<002c1192>] memcmp+0x0/0x5c
+[<002c7d82>] _printk+0x0/0x18
+[<00428eda>] start_kernel+0x80/0x5b0
+[<000684e8>] pcpu_alloc+0x88/0x3b4
+[<00428322>] _sinittext+0x322/0x9b0
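
The two initrd lines in the dmesg diff above differ by exactly the start of the RAM chunk (0x07400000). A minimal sketch of that arithmetic, assuming a single RAM chunk mapped at virtual address 0; `toy_phys_to_virt()` is a hypothetical helper for illustration, not the kernel's `phys_to_virt()`:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's phys_to_virt()): translate a
 * physical address inside one RAM chunk to a kernel virtual address,
 * assuming the chunk starting at ram_base is mapped at page_offset.
 */
static uint32_t toy_phys_to_virt(uint32_t phys, uint32_t ram_base,
				 uint32_t page_offset)
{
	assert(phys >= ram_base);	/* address must not lie below the chunk */
	return phys - ram_base + page_offset;
}
```

With `ram_base = 0x07400000` and `page_offset = 0`, the physical range 0x07f61166 - 0x08000000 maps to 0x00b61166 - 0x00c00000, which matches the "good" line. With `ram_base = 0` the input comes back unchanged, which is what the "bad" line looks like: as if no offset had been applied yet.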

On Mon, Feb 27, 2023 at 7:30 AM Finn Thain <fth...@linux-m68k.org> wrote:
> On Mon, 27 Feb 2023, Michael Schmitz wrote:
> > I wonder whether Finn's memtest patch merely exposed another MM bug
>
> A kernel patch may be easier than a bootloader patch (even if this is a
> bootloader bug) particularly if it affects multiple platforms.
>
> A partial revert of my patch (below) will probably avoid the issue, but
> with the side effect that use of memtest will clobber the initrd.

Which we can avoid, by moving the ramdisk handling inside paging_init().

> The initrd and memtest features aren't usually needed together. At the
> time when I needed the memtest feature I did not have confidence in the
> hardware. An initrd wasn't very useful at that point.
>
> diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
> index 3a2bb2e8fdad..92f1b9268dff 100644
> --- a/arch/m68k/kernel/setup_mm.c
> +++ b/arch/m68k/kernel/setup_mm.c
> @@ -326,6 +326,8 @@ void __init setup_arch(char **cmdline_p)
> panic("No configuration setup");
> }
>
> + paging_init();
> +
> #ifdef CONFIG_BLK_DEV_INITRD
> if (m68k_ramdisk.size) {
> memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

Presumably something in memblock_reserve() relies on having
called paging_init() before?

I'll do some more debugging later today...

> @@ -335,8 +337,6 @@ void __init setup_arch(char **cmdline_p)
> }
> #endif
>
> - paging_init();
> -
> #ifdef CONFIG_NATFEAT
> nf_init();
> #endif
>


--

Eero Tamminen

Feb 27, 2023, 4:50:03 AM
Hi,

On 27.2.2023 9.19, Michael Schmitz wrote:
> On 27.02.2023 at 18:55, Finn Thain wrote:
>> On Mon, 27 Feb 2023, Michael Schmitz wrote:
>>
>>>>
>>>> Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
>>>> functionality") in v5.17-rc1.  Reverting that on top of latest fixes
>>>> the issue.
>>>
>>> Yes, I'm sorry to say that was the only likely candidate. Can't see why
>>> though - are Macs all configured to have RAM start at address zero, and
>>> possibly contiguous, Finn?
>>>
>>
>> I don't really understand your question. This was not a Mac patch. The
>> issue seems to be about the locations initrd_start and initrd_end in
>> relation to the various memory segments (?)
>
> I didn't realize that - thanks for pointing this out.
>
>> This seems to be the same bug that was raised about 6 months ago... I had
>> thought it was a bootloader bug but I'm out of my depth here.
>>
>> https://lists.debian.org/debian-68k/2022/09/msg00047.html
>> https://lists.debian.org/debian-68k/2022/09/msg00051.html
>> https://lists.debian.org/debian-68k/2022/09/msg00055.html
>
> I had forgotten all about that one... Thanks for jogging my memory!
>
> In this case though, the bug happens when the ramdisk is loaded in the
> lowest address memory chunk, at least at a lower address than the one
> the kernel runs from.

I'm wondering whether this old Atari side boot issue is related at all...

When adding Linux bootinfo support to the Hatari emulator (from the Aranym
emulator) a few years ago, I noticed that:
"Linux barfs at ST-RAM memory range given after TT-RAM. However, if
kernel is loaded to TT-RAM and ST-RAM range is given before TT-RAM
range, kernel crashes."

=> Only working config was Linux being loaded to ST-RAM, TT-RAM being
given only after that in bootinfo, and initrd ramdisk after kernel.

Based on mails in the archive, this seems to have been a known Linux/Atari
issue already in 2013.


> The crashes in the above thread were all from boots where the initrd got
> loaded at the end of the memory chunk the kernel runs from.
>
> Time to try using copy_from_kernel_nofault() to copy the ramdisk into
> its final location? (just kidding)


- Eero

PS. For people familiar only with Amiga terminology, ST-RAM = chip RAM,
TT-RAM = fast RAM.

Michael Schmitz

Feb 27, 2023, 4:50:03 AM
Hi Geert,

adding Mike Rapoport to the recipient list who would know whether
memblock_reserve() relies on paging_init() having run.

Cheers,

Michael

Michael Schmitz

Feb 27, 2023, 5:00:03 AM
Eero,

that issue (kernel running from TT-RAM) was fixed quite a few years ago
(but maybe not in 2013), in the sense that ST-RAM could be used for
drivers (SCSI, atafb). Using ST-RAM as normal VM should have been made a
lot easier by changing to memblock, but AFAIR there are still some bits
missing.

RAM must be listed in bootinfo with the chunk holding the kernel first,
_not_ in ascending address order, so that second option is expected to
crash.
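
As a rough illustration of that ordering rule, here is a minimal sketch of a check a bootloader or emulator could run before handing over the chunk list. The struct and function names are hypothetical and are not taken from the kernel's bootinfo definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one memory chunk record in bootinfo. */
struct toy_mem_chunk {
	uint32_t addr;	/* physical start address */
	uint32_t size;	/* length in bytes */
};

/* True if the physical address lies inside the chunk. */
static bool toy_chunk_contains(const struct toy_mem_chunk *c, uint32_t phys)
{
	return phys >= c->addr && phys - c->addr < c->size;
}

/*
 * True if the first listed chunk is the one holding the kernel, which is
 * the ordering described above (address order does not matter).
 */
static bool toy_kernel_chunk_is_first(const struct toy_mem_chunk *chunks,
				      size_t n, uint32_t kernel_phys)
{
	assert(chunks != NULL);
	return n > 0 && toy_chunk_contains(&chunks[0], kernel_phys);
}
```

For a kernel loaded into TT-RAM, a list of { TT-RAM, ST-RAM } passes this check, while the ascending list { ST-RAM, TT-RAM } does not.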

This isn't related to the current issue as far as I can see.

Cheers,

Michael

Mike Rapoport

Feb 27, 2023, 7:10:03 AM
Hi,
memblock_reserve() does not rely on paging_init() as it operates on
physical addresses and it does not care if memory was already registered.

What does rely on paging_init() is the phys_to_virt() in the line after
memblock_reserve():

initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);
initrd_end = initrd_start + m68k_ramdisk.size;

So to have both memtest and initrd we'd need something like

memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

paging_init() {
	/* setup page tables and memblock */
	early_memtest();
}

initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);

or

paging_init(); /* without early_memtest() */

memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);
initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);

early_memtest();
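
Both orderings above reserve the ramdisk before the memory test runs. A minimal toy model of why that matters, assuming the test only writes to ranges that are not reserved at the time it runs; every `toy_*` name is hypothetical and none of this is kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_RAM_SIZE 64

struct toy_range { size_t start, size; };

static unsigned char toy_ram[TOY_RAM_SIZE];
static struct toy_range toy_reserved[4];
static size_t toy_reserved_count;

/* Record a range that the memory test must leave alone. */
static void toy_reserve(size_t start, size_t size)
{
	assert(toy_reserved_count < 4);
	assert(start + size <= TOY_RAM_SIZE);
	toy_reserved[toy_reserved_count++] = (struct toy_range){ start, size };
}

static bool toy_is_reserved(size_t addr)
{
	for (size_t i = 0; i < toy_reserved_count; i++)
		if (addr >= toy_reserved[i].start &&
		    addr - toy_reserved[i].start < toy_reserved[i].size)
			return true;
	return false;
}

/* Overwrite every byte that is not reserved at the time of the call. */
static void toy_memtest(unsigned char pattern)
{
	for (size_t addr = 0; addr < TOY_RAM_SIZE; addr++)
		if (!toy_is_reserved(addr))
			toy_ram[addr] = pattern;
}

/* Returns true if the toy "initrd" bytes survive the chosen ordering. */
static bool toy_initrd_survives(bool reserve_before_memtest)
{
	const size_t initrd_start = 48, initrd_size = 16;

	memset(toy_ram, 0, sizeof toy_ram);
	toy_reserved_count = 0;
	memset(toy_ram + initrd_start, 0x1f, initrd_size);

	if (reserve_before_memtest)
		toy_reserve(initrd_start, initrd_size);
	toy_memtest(0xaa);
	if (!reserve_before_memtest)
		toy_reserve(initrd_start, initrd_size);

	for (size_t i = 0; i < initrd_size; i++)
		if (toy_ram[initrd_start + i] != 0x1f)
			return false;
	return true;
}
```

In this model `toy_initrd_survives(true)` holds and `toy_initrd_survives(false)` does not, which mirrors the clobbering side effect mentioned for the partial revert.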


> > I'll do some more debugging later today...
> >
> > > @@ -335,8 +337,6 @@ void __init setup_arch(char **cmdline_p)
> > > }
> > > #endif
> > >
> > > - paging_init();
> > > -
> > > #ifdef CONFIG_NATFEAT
> > > nf_init();
> > > #endif
> > >
> >
> >

--
Sincerely yours,
Mike.

Geert Uytterhoeven

Feb 27, 2023, 7:40:04 AM
Hi Mike,
Of course... /me bangs his head against the TFT for not having
realized before the values saved into initrd_{start,end} are not just
for printing in the pr_info() line...

Mike Rapoport

Feb 27, 2023, 8:00:03 AM
Hi Geert,
Happens to the best of us :)

> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds

--
Sincerely yours,
Mike.