3.5.7 f_functionlist crash

49 views
Skip to first unread message

Simon Lundmark

unread,
Nov 20, 2024, 5:46:32 AM11/20/24
to ldmud...@googlegroups.com
Hey all,

New to this group so if I'm in error to mail here about crashes or issues, please let me know.

We've recently (August) migrated to 3.5.7 from a 3.2.x driver and are starting to have some issues. First off we've managed to get a crash in  object.c:1923 in a f_functionlist call. We have a dump but since it's in optimized mode some of the things are hard to get good results on.

I'm attaching some gdb-info to this mail in a .html file that's pretty much all we've got so far. 

If I'm understanding this correctly we're trying to access fun[j] or the offset of fun (0x7f45048583d8, which looks reasonable) with j that's 4294966813, which obviously is problematic. I'm not nearly knowledgeable about the code-base to understand all of this but if 

j = (long)CROSSDEF_NAME_OFFSET(flags); where flags is 2149711389 and we get 4294966813 then obviously it seems like a function is somehow flagged wrongly. 

Looking into CROSSDEF_NAME_OFFSET it seems to extract the 18 lower bits and take the result minus 0x20000. Correct me if I'm wrong but the 18 bits (inherit mask) of the flags  is 01 1111 1110 0001 1101 or 0x1fe1d is smaller than 0x20000. So the result of that operation (0x1fe1d - 0x20000) seems to be correct with the result that we're seeing in the value of j as seen in the crash.

So I'm curious as to if the flags (2149711389) is incorrect and if so how could I possibly track down where the incorrect flags are introduced and how I can try to track this down further.

Thanks for reading and any help, the work that's been done on the driver is magnificent.

Thanks,
Simon

gdb-2024-11-20.html

Gnomi

unread,
Nov 20, 2024, 5:29:21 PM11/20/24
to ldmud...@googlegroups.com
Hello Simon,

If you want to make the binary and core dump available, I can have a look.

Simon Lundmark wrote:
> If I'm understanding this correctly we're trying to access fun[j] or the
> offset of fun (0x7f45048583d8, which looks reasonable) with j that's
> 4294966813,
> which obviously is problematic. I'm not nearly knowledgeable about the
> code-base to understand all of this but if
>
> j = (long)CROSSDEF_NAME_OFFSET(flags); where flags is 2149711389 and
> we get 4294966813
> then obviously it seems like a function is somehow flagged wrongly.

flags is 0x8021fe1d, CROSSDEF_NAME_OFFSET(flags) should be -483
(0x1fe1d - 0x20000), which would be fine. We are at function #532,
so we would arrive at function #49. So the flags seem sensible to me.
You can check in gdb, whether fun[-483] is a valid memory access.

However gdb shows 4294966813, and fun[4294966813] is very probable an
invalid memory access. 4294966813 is 2^32 - 483. So somehow it's
converting the value to an unsigned 32 bit integer before putting it into j
(which is declared as a long, so it probably is a signed 64 bit integer).

I looked at the code and cannot find a place where this would happen.
flags is an uint32, but it's cast to an int32. INHERIT_MASK being an enum
value should be an int. The return type of CROSSDEF_NAME_OFFSET is a long
as is the target variable. So I don't see where this conversion to uint32
could happen.

Could you check:

ptype CROSSDEF_NAME_OFFSET
ptype j

The first one should give type = long (const funflag_t), the second one
type = long.

This might be interesting as well:
disass CROSSDEF_NAME_OFFSET

With best regards,
Gnomi

Simon Lundmark

unread,
Nov 20, 2024, 5:34:22 PM11/20/24
to LDMud Talk
Hi Gnomi,

I just managed to reproduce this with the lp-245 mudlib using the 2.5.7 driver locally.

I just did:
    add_action("flist", "flist");

in the player object / wiz_commands() function, and then added the function

int flist(string target_name) {
  // Find target
  if( !target_name ) {
    write("For which object should I display the function list?\n");
    return 1;
  }
  object target = find_player(target_name);
  if( !target )
  {
    printf("Could not find \"%s\".\n",target_name);
    return 1;
  }
  mixed *list = functionlist(target,
   RETURN_FUNCTION_NAME |
   RETURN_FUNCTION_FLAGS );
  return 1;
}

to it as well. Log in and flist your own playername and you should get the crash.

Thanks!
Simon

Simon Lundmark

unread,
Nov 20, 2024, 5:41:35 PM11/20/24
to LDMud Talk
Sorry, tired and it's way past my bedtime.

I'm running this on a Ryzen 7 so any modern x64 architecture should probably do, running ubuntu 24.04 LTS.

I'll dig through the code tomorrow after sleep and try to figure out the signed/unsigned and width of the variables to see if I can understand why it's not properly handled as signed.

Cheers,
Simon

Gnomi

unread,
Nov 20, 2024, 6:08:04 PM11/20/24
to ldmud...@googlegroups.com
Hi,

I tried it with 3.5.7 (default configuration) and the LP-245 mudlib on an
x64 system (gcc 12.2 and clang 14), I could not reproduce it. I got a full
function list (I added a printout to the action).

But I'm glad, you can reproduce it.

With best regards,
Gnomi

Simon Lundmark

unread,
Nov 21, 2024, 3:54:44 AM11/21/24
to LDMud Talk
Hi again Gnomi!

Ok so I compiled the driver in debug.
I checked the types of both j (long) and CROSSDEF_NAME_OFFSET which is just as you assumed long (const funflag_t). I also validated that INHERIT_MASK is indeed enum function_flags which means that it should be signed. I'm using gcc 13.2.0 (Ubuntu 13.2.0-23ubuntu4), so maybe it's something that's introduced with gcc 13+.

Here's the disassembly:

(gdb) disass CROSSDEF_NAME_OFFSET
Dump of assembler code for function CROSSDEF_NAME_OFFSET:
   0x00005555555fae2f <+0>: push   %rbp
   0x00005555555fae30 <+1>: mov    %rsp,%rbp
   0x00005555555fae33 <+4>: mov    %edi,-0x4(%rbp)
   0x00005555555fae36 <+7>: mov    -0x4(%rbp),%eax
   0x00005555555fae39 <+10>: and    $0x3ffff,%eax
   0x00005555555fae3e <+15>: sub    $0x20000,%eax
   0x00005555555fae43 <+20>: mov    %eax,%eax
   0x00005555555fae45 <+22>: pop    %rbp
   0x00005555555fae46 <+23>: ret
End of assembler dump.

Unfortunately I have a few really buzy days ahead but once I get some time I will try to drag the code into godbolt and investigate further. I do have the intentions of getting to know the driver better so if you have better things to do, don't worry and I'll get back to you whenever I have some results.

I must say that it was really easy to get the driver running now and testing out stuff compared to 15-20 years ago so I'm really impressed with all the work that's been done here. Also I have gdb with the crash up and running so if there's anything else you want to know or if I missed anything - please let me know.

Best,
Simon

Gnomi

unread,
Nov 21, 2024, 5:27:51 AM11/21/24
to ldmud...@googlegroups.com
Hi Simon,

Just a short remark on the disassembly:

Simon Lundmark wrote:
> (gdb) disass CROSSDEF_NAME_OFFSET
> Dump of assembler code for function CROSSDEF_NAME_OFFSET:
> 0x00005555555fae2f <+0>: push %rbp
> 0x00005555555fae30 <+1>: mov %rsp,%rbp
> 0x00005555555fae33 <+4>: mov %edi,-0x4(%rbp)
> 0x00005555555fae36 <+7>: mov -0x4(%rbp),%eax
> 0x00005555555fae39 <+10>: and $0x3ffff,%eax
> 0x00005555555fae3e <+15>: sub $0x20000,%eax
> 0x00005555555fae43 <+20>: mov %eax,%eax
This is a zero extension of %eax into %rax, which is what we have seen.
In my version it's using 'cltq' (sign extension) instead.

> 0x00005555555fae45 <+22>: pop %rbp
> 0x00005555555fae46 <+23>: ret
> End of assembler dump.

With best regards,
Gnomi

Gnomi

unread,
Nov 21, 2024, 1:47:40 PM11/21/24
to ldmud...@googlegroups.com
Hi Simon,

I have played a bit with Godbolt's Compiler Explorer and browsed through the
C standards. This is caused by a change from C17 to C23.

The question resolves around the type of 'INHERIT_MASK', being a signed
or unsigned integer.

Until C17 the standard said:
An identifier declared as an enumeration constant has type int.

So INHERIT_MASK would be a (signed) int, therefore the expression in
CROSSDEF_NAME_OFFSET would be a signed int und therefore be extended
into a signed long by keeping the sign.

Now with C23 the standard says:
An identifier declared as an enumeration constant for an enumeration
without a fixed underlying type has either type int or the enumerated
type, as defined in 6.7.2.2.

As the enumeration type (function_flags) contains a constant that is
0x80000000, which doesn't fit into an int, the underlying integer type for
the enum type will be chosen as unsigned int. And therefore the type for all
enum constant identifiers will be unsigned int as well.

So INHERIT_MASK now is unsigned int, therefore the expression in
CROSSDEF_NAME_OFFSET would be an unsigned int and therefore it will be
extended into a signed long without keeping the sign (there is no sign
in an unsigned int).

So it seems GCC 13 already adopted that notion, Clang 20 will probably do it
as well. Specifying an older standard (--std=c99) doesn't help. :-(

So we need to adopt our code. I will do a corresponding change.

With best regards,
Gnomi.

PS: The change in the standard makes sense to me, enum constants having the
same type as the enum itself. Just wished they would not break our code...

Simon Lundmark

unread,
Nov 21, 2024, 4:12:04 PM11/21/24
to LDMud Talk
Hi Gnomi,

That explains it. Great find! 

Yeah I agree both with the change to the standard and the problem that it breaks old code. Let me know if there's anything I can help with. Meanwhile I'll see if we can patch our driver locally during next reboot. 

We do have a stray issue with mapping that is less severe and much harder to reproduce. I'll see if there's any possibility to reproduce it separately from our entire mud and post a separate thread about it once I get there.

Many thanks for your help, and also all of the hard work that you put into the driver. Much appreciated!

Best regards,
Simon
Reply all
Reply to author
Forward
0 new messages