Unhandled EFAULT with letux kernels built with newer gcc compilers

H. Nikolaus Schaller

Apr 4, 2021, 11:56:00 AM
to William Cohen, Paul Boddie, Discussions about the Letux Kernel, mips-creat...@googlegroups.com
Hi Will,
I haven't seen the unhandled EFAULT you are describing but
there is something weird with futex on the jz4730. There
is no unhandled fault but a process can get stuck in
the futex system call.

So the two may be related. The kernel may fault on the jz4780 when
built with a newer compiler but work with an older one, and get
stuck on the jz4730 with an older one...

I need some time to upgrade my good old cross-toolchains
to something newer (yes, gcc 4.9 is quite old and arm64
already requires gcc 5.1)... I have to modify my own compiler
generator because e.g. crosstool-NG or prebuilt toolchains
do not fit my requirements for a cross-compiling farm for
multiple architectures. Especially since a toolchain is
more than the C compiler.

BR and thanks,
Nikolaus



William Cohen

Apr 4, 2021, 5:56:57 PM
to mips-creat...@googlegroups.com
On 4/4/21 11:55 AM, H. Nikolaus Schaller wrote:
> Hi Will,
> I haven't seen the unhandled EFAULT you are describing but
> there is something weird with futex on the jz4730. There
> is no unhandled fault but a process can get stuck in
> the futex system call.
>
> So the two may be related. The kernel may fault on the jz4780 when
> built with a newer compiler but work with an older one, and get
> stuck on the jz4730 with an older one...

Hi,

I have been building the kernels self-hosted on the Creator CI20 board.  I don't know if that might be causing issues.

One thing I tried was turning off CONFIG_FUTEX in the kernel configuration to work around the panic caused by futex_detect_cmpxchg. Turning CONFIG_FUTEX off caused:

The futex facility returned an unexpected error code.
[    3.174558] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000005
[    3.186706] Rebooting in 10 seconds..

So for the time being I am running a kernel that has  CONFIG_HAVE_FUTEX_CMPXCHG patched to be set.
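
For context, here is a minimal user-space sketch of the boot-time probe involved. It assumes the probe works the way kernel/futex.c documented it at the time: when CONFIG_HAVE_FUTEX_CMPXCHG is not set, futex_init() calls cmpxchg_futex_value_locked() on a NULL user pointer and treats a returned -EFAULT as proof that the operation works. The helper names below (cmpxchg_working, cmpxchg_missing, futex_cmpxchg_probe) are hypothetical stand-ins, not kernel functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an architecture's futex cmpxchg helper. */
typedef int (*cmpxchg_fn)(uint32_t *curval, uint32_t *uaddr,
                          uint32_t oldval, uint32_t newval);

/* Models a working implementation: a bad user pointer is reported
 * as -EFAULT instead of crashing the caller. */
static int cmpxchg_working(uint32_t *curval, uint32_t *uaddr,
                           uint32_t oldval, uint32_t newval)
{
    if (uaddr == NULL)
        return -EFAULT;
    *curval = *uaddr;
    if (*uaddr == oldval)
        *uaddr = newval;
    return 0;
}

/* Models an architecture that lacks the operation entirely. */
static int cmpxchg_missing(uint32_t *curval, uint32_t *uaddr,
                           uint32_t oldval, uint32_t newval)
{
    (void)curval; (void)uaddr; (void)oldval; (void)newval;
    return -ENOSYS;
}

/* Models the probe: deliberately pass NULL and expect -EFAULT.
 * Returns 1 if the cmpxchg operation is considered usable. */
static int futex_cmpxchg_probe(cmpxchg_fn fn)
{
    uint32_t curval = 0;

    return fn(&curval, NULL, 0, 0) == -EFAULT;
}
```

In this model futex_cmpxchg_probe(cmpxchg_working) yields 1 and futex_cmpxchg_probe(cmpxchg_missing) yields 0. In the real kernel the NULL access is a genuine fault that the exception fixup is supposed to turn into -EFAULT. If that fixup does not happen, the fault is unhandled and the probe itself panics, which would match the trace through futex_init below. Setting CONFIG_HAVE_FUTEX_CMPXCHG skips the probe.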

Another thought: something that might be causing a problem is that the addresses have the MSB set, as in the backtrace below:

                                                                                                                    
[    0.183009] Call Trace:
[    0.185486] [<801b97b4>] cmpxchg_futex_value_locked+0x30/0x6c
[    0.191318] [<80d8940c>] futex_init+0x80/0xe0
[    0.195730] [<80100c68>] do_one_initcall+0x150/0x308
[    0.113385] [<80d7db68>] do_initcall_level+0x128/0x160
[    0.118590] [<80d7d9f8>] do_initcalls+0x60/0xa8
[    0.123181] [<80d7d98c>] do_basic_setup+0x28/0x34
[    0.127949] [<80d7d834>] kernel_init_freeable+0x6c/0xa8
[    0.133249] [<80a0f4ac>] kernel_init+0x10/0x128
[    0.137838] [<801030a0>] ret_from_kernel_thread+0x10/0x18

Is there any place where addresses are being treated as 64-bit quantities and/or getting sign-extended with the newer compiler? There have been problems like this in the past, for example:

https://github.com/u-boot/u-boot/commit/c190fbd010a00e16e9599575b43c5a7c7bc7ec09#diff-980273600cca5b17c252b4ce1edb11b4cfd826fe6b2bbcad20393aa392f3cdce
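
To make the concern concrete, here is a small sketch of what happens when a 32-bit address with the MSB set is widened to 64 bits. It uses the first address from the backtrace as sample input. The function names widen_signed and widen_unsigned are made up for illustration and do not correspond to any kernel or U-Boot macro.

```c
#include <stdint.h>

/* Widen through a signed 32-bit type: the MSB is copied into the
 * upper 32 bits (sign extension). The uint32_t -> int32_t conversion
 * is implementation-defined in C; gcc wraps modulo 2^32. */
static uint64_t widen_signed(uint32_t addr)
{
    return (uint64_t)(int64_t)(int32_t)addr;
}

/* Widen as an unsigned value: the upper 32 bits are zero. */
static uint64_t widen_unsigned(uint32_t addr)
{
    return (uint64_t)addr;
}
```

With these definitions, widen_signed(0x801b97b4) gives 0xffffffff801b97b4 while widen_unsigned(0x801b97b4) gives 0x00000000801b97b4. Addresses below 0x80000000 come out identical either way. Code that mixes the two conventions can therefore compare or mask kernel-segment addresses incorrectly without affecting low addresses. Whether that is what happens here is only a guess.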


> I need some time to upgrade my good old cross-toolchains
> to something newer (yes, gcc 4.9 is quite old and arm64
> already requires gcc 5.1)... I have to modify my own compiler
> generator because e.g. crosstool-NG or prebuilt toolchains
> do not fit my requirements for a cross-compiling farm for
> multiple architectures. Especially since a toolchain is
> more than the C compiler.
>
> BR and thanks,
> Nikolaus
>
>
>
Thanks,

-Will

H. Nikolaus Schaller

Apr 5, 2021, 5:41:17 AM
to William Cohen, MIPS Creator CI20 Development, Discussions about the Letux Kernel
Hi William,

> On 04.04.2021 at 23:56, William Cohen <wco...@nc.rr.com> wrote:
>
> On 4/4/21 11:55 AM, H. Nikolaus Schaller wrote:
>> Hi Will,
>> I haven't seen the unhandled EFAULT you are describing but
>> there is something weird with futex on the jz4730. There
>> is no unhandled fault but a process can get stuck in
>> the futex system call.
>>
>> So the two may be related. The kernel may fault on the jz4780 when
>> built with a newer compiler but work with an older one, and get
>> stuck on the jz4730 with an older one...
>
> Hi,
>
> I have been building the kernels self-hosted on the Creator CI20 board. I don't know if that might be causing issues.

It shouldn't but one never knows...

>
> One thing I tried was turning off CONFIG_FUTEX in the kernel configuration to work around the panic caused by futex_detect_cmpxchg. Turning CONFIG_FUTEX off caused:
>
> The futex facility returned an unexpected error code.
> [ 3.174558] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000005
> [ 3.186706] Rebooting in 10 seconds..
>
> So for the time being I am running a kernel that has CONFIG_HAVE_FUTEX_CMPXCHG patched to be set.
>
> Another thought: something that might be causing a problem is that the addresses have the MSB set, as in the backtrace below:

Ok, that could also be an influence.

>
>
> [ 0.183009] Call Trace:
> [ 0.185486] [<801b97b4>] cmpxchg_futex_value_locked+0x30/0x6c
> [ 0.191318] [<80d8940c>] futex_init+0x80/0xe0
> [ 0.195730] [<80100c68>] do_one_initcall+0x150/0x308
> [ 0.113385] [<80d7db68>] do_initcall_level+0x128/0x160
> [ 0.118590] [<80d7d9f8>] do_initcalls+0x60/0xa8
> [ 0.123181] [<80d7d98c>] do_basic_setup+0x28/0x34
> [ 0.127949] [<80d7d834>] kernel_init_freeable+0x6c/0xa8
> [ 0.133249] [<80a0f4ac>] kernel_init+0x10/0x128
> [ 0.137838] [<801030a0>] ret_from_kernel_thread+0x10/0x18
>
> Is there any place where addresses are being treated as 64-bit quantities and/or getting sign-extended with the newer compiler? There have been problems like this in the past, for example:
>
> https://github.com/u-boot/u-boot/commit/c190fbd010a00e16e9599575b43c5a7c7bc7ec09#diff-980273600cca5b17c252b4ce1edb11b4cfd826fe6b2bbcad20393aa392f3cdce

There are also more 32<->64 bit address conversion macros in the kernel to better support 64 bit processors.
Maybe one of these is compiler dependent and/or processor dependent.

I'll keep an eye on this issue.

BR,
Nikolaus


>
>
>> I need some time to upgrade my good old cross-toolchains
>> to something newer (yes, gcc 4.9 is quite old and arm64
>> already requires gcc 5.1)... I have to modify my own compiler
>> generator because e.g. crosstool-NG or prebuilt toolchains
>> do not fit my requirements for a cross-compiling farm for
>> multiple architectures. Especially since a toolchain is
>> more than the C compiler.
>>
>> BR and thanks,
>> Nikolaus
>>
>>
>>
> Thanks,
>
> -Will
>
