Foolish attempt to support darwin/arm64

Christian Banse

Dec 5, 2020, 10:31:15 AM
to delv...@googlegroups.com

Hi everyone,

 

I am new to the delve developer community, but a longtime delve user. I have started a very hackish, very foolish attempt to get delve running on darwin/arm64. The work lives in my fork: https://github.com/oxisto/delve/tree/darwin-arm64-lldb

 

After realising that the native backend seems to be broken on macOS, @aarzilli suggested the gdbserver way. I essentially created arm64 versions of those files (they need to be integrated later to avoid duplicate code). I can successfully launch an executable and also read the register *info* from the macOS debug server. However, when trying to retrieve the actual registers with the ‘g:thread’ command, I receive an “E74” error from the debug server.

 

Not wanting to quit there, I tried a self-compiled debug server and this one actually gives me the registers.. yay. But that is not an option in the long run.

 

So progress so far: I can launch an executable on darwin/arm64, read registers and single-step CPU instructions, which also seems to update the registers. Reading memory at the address in the ‘pc’ register also yields the expected results.

 

However, there is something off about the calculation of the addresses of program code in memory. All memory reads involved in, e.g., setting a breakpoint based on a code location are denied by the debug server, so maybe delve is offsetting from a wrong base address. I am still trying to understand how and where those addresses are calculated in delve.

 

However, the main drawback is at the moment that it does not seem to work with the macOS supplied debug server. Is anyone else working on this and can share some insights? Also.. how broken is the native backend on macOS? I might consider taking this route again.

 

Best Regards,

 

Christian

 

Alessandro Arzilli

Dec 5, 2020, 1:04:10 PM
to Christian Banse, delv...@googlegroups.com
On Sat, Dec 05, 2020 at 03:31:12PM +0000, Christian Banse wrote:
> Hi everyone,
>
> I am new to the delve developer community, but a longtime delve user. I have started the very hackish, very foolish approach to get delve running on darwin/arm64. I have started this here at my fork: https://github.com/oxisto/delve/tree/darwin-arm64-lldb
>
> After realising that the native backend seems to be broken on MacOS, @aarzilli<https://github.com/go-delve/delve/commits?author=aarzilli> suggested the gdbserver way. I essentially created arm64-versions of those files (need to be integrated later to avoid duplicate code). I can successfully launch an executable, and also read the register *info* from the macOS debug server. However, when trying to retrieve the actual registers with the ‘g:thread’ command, I receive an “E74” error from the debug server.

If you haven't figured this out yet you can see exactly what's going on
between delve and debugserver by passing `--log --log-output=gdbwire`.
We have a lot of tests in pkg/proc/proc_test.go, you can also enable logging
for those with:

go test -run TestName -backend=lldb -log=gdbwire

I took a look at your branch and it seems that your code is still using
regnamePC, regnameSP, etc... those have hardcoded values that are valid only
for amd64 (rip, rsp...), they'll have to be changed for arm64. For the
correct name you'd have to see what debugserver is actually returning as
registers.

BTW you can find out the current architecture by looking at p.bi.Arch.Name
(p being a gdbProcess variable).
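
As a rough illustration of the point above, the hardcoded register names could be selected from the architecture name instead. This is only a sketch: the `regNames` type and the `regNamesFor` helper are hypothetical, and the arm64 names (`pc`, `sp`, `fp`) are an assumption that would have to be checked against what debugserver actually reports.

```go
package main

import (
	"errors"
	"fmt"
)

// regNames holds the debugserver register names the debugger needs.
// Hypothetical type, for illustration only.
type regNames struct {
	PC, SP, BP string
}

// regNamesFor picks register names from an architecture name such as the
// one found in p.bi.Arch.Name.
func regNamesFor(arch string) (regNames, error) {
	switch arch {
	case "amd64":
		return regNames{PC: "rip", SP: "rsp", BP: "rbp"}, nil
	case "arm64":
		// Assumed names; verify them against debugserver's register info.
		return regNames{PC: "pc", SP: "sp", BP: "fp"}, nil
	default:
		return regNames{}, errors.New("unsupported architecture: " + arch)
	}
}

func main() {
	names, err := regNamesFor("arm64")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(names.PC, names.SP, names.BP)
}
```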

Also the code in gdbserial.(*gdbThread).reloadRegisters needs to be changed,
it can never reach the line where reloadGAlloc/reloadGAtPC is called,
instead you should add some code, similar to the linux switch case that sets
t.regs.gaddr from a register, if it works like linux/arm64 it would look
like this:

    if t.p.bi.Arch.Name == "arm64" && t.p.bi.GOOS == "darwin" {
        // Take the address of the current g directly from the X28 register.
        if reg, hasX28 := t.regs.regs["name of X28 register goes here"]; hasX28 {
            t.regs.gaddr = binary.LittleEndian.Uint64(reg.value)
            t.regs.tls = 0
            t.regs.hasgaddr = true
        }
        return nil
    }

In theory this should be all that's needed. If debugserver works correctly
on arm64 and go emits good debug information for arm64 it should be all.

I also suggest you do the work with go1.16 (i.e. go built from tip).

> However, there is something off about the calculation of memory addresses of program code in memory. All memory read instructions that are involved with, i.e. setting a breakpoint based on a code location is denied by the debug server, so maybe it is offsetting from a wrong base address. Still trying to understand how and where those addresses are calculated in delve.

Go executables are not position independent by default, so it should be
fine. You can crosscheck this by launching lldb on the same executable and
setting a breakpoint. It could be a problem with delve, go or debugserver.

> However, the main drawback is at the moment that it does not seem to work with the macOS supplied debug server. Is anyone else working on this and can share some insights?

If debugserver really is broken on arm64 then it should be reported to lldb.
It would be helpful to figure out exactly what causes E74, the log would
help determine this.

> Also.. how broken is the native backend on macOS? I might consider taking this route again.

It's very broken. Also, it likely can not be made to work without changing the Go runtime.

Christian Banse

Dec 5, 2020, 4:16:10 PM
to Alessandro Arzilli, delv...@googlegroups.com

Hi Alessandro,

 

Thanks for your hints. I have adopted the approach from the linux code of taking the g address from register x28, and I have also discovered the logging options now. The error E74 with the Apple debugserver can be seen in the attached log (see line 765). It only occurs with the Apple debugserver; the one built fresh from the llvm-project GitHub repository works.

 

The regnamePC values are adapted to the ARM registers in the _arm64.go files. The stack traces were just some added debugging output, since I wanted to understand where the ReadMemory function was being called.

 

The second log is the one from my self-compiled debugserver.

 

The memory addresses still seem to be completely off somehow, I will dig deeper into this tomorrow.

 

Christian

log.txt
log-my-debugserver.txt

Alessandro Arzilli

Dec 6, 2020, 5:34:02 AM
to Christian Banse, delv...@googlegroups.com
On Sat, Dec 05, 2020 at 09:16:06PM +0000, Christian Banse wrote:
> Hi Allessandro,
>
> Thanks for your hints. I have adopted the similar behavior about taking the Gaddr from register x28 from the linux code and also discovered the logging options now. The error E74 with the apple-debugserver can be seen in the attached log (see line 765). It only occurs with the apple debugserver, the one fresh from the llvm-project github works.

I can probably work around that by setting gcmdok to false. Maybe it will be enough.

>
> The regnamePC values are adopted to the arm-register in the _arm64.go files. The stack traces were just some added debug, since I wanted to understand where the ReadMemory function was being called.

Ok, I see it now.

> The second log contains the log my the compiled debugserver.
>
> The memory addresses still seem to be completely off somehow, I will dig deeper into this tomorrow.

What makes you say this? I crosscompiled a target program to darwin/arm64
and the addresses I get from `go tool objdump` are in that range. To me this
looks more like a bug in debugserver. You can make it log stuff to a file by
passing '-l filename' on the command line.

>
> Christian
>

Christian Banse

Dec 6, 2020, 8:08:58 AM
to Alessandro Arzilli, delv...@googlegroups.com

Setting gcmdok to false helps! Reading registers one-at-a-time now indeed works with Apple’s debugserver.
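
For context, the difference between the two ways of reading registers can be sketched as follows. In the GDB remote serial protocol a packet is framed as `$<payload>#<checksum>`, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits. The `g` packet asks for all registers at once, while `p<n>` asks for a single register. The `;thread:<id>;` suffix and the helper names below are assumptions made for illustration, not delve's actual code.

```go
package main

import "fmt"

// frame wraps a payload in GDB remote protocol framing: $payload#checksum.
func frame(payload string) string {
	var sum byte
	for i := 0; i < len(payload); i++ {
		sum += payload[i] // byte arithmetic wraps around modulo 256
	}
	return fmt.Sprintf("$%s#%02x", payload, sum)
}

// readAllRegisters builds a request for every register of one thread.
// The thread suffix format is an assumption for this sketch.
func readAllRegisters(threadID string) string {
	return frame("g;thread:" + threadID + ";")
}

// readOneRegister builds a request for a single register, identified by
// its number, which is what the one-at-a-time fallback relies on.
func readOneRegister(regnum int, threadID string) string {
	return frame(fmt.Sprintf("p%x;thread:%s;", regnum, threadID))
}

func main() {
	fmt.Println(frame("g")) // prints "$g#67"
	fmt.Println(readAllRegisters("5f99"))
	fmt.Println(readOneRegister(0x20, "5f99"))
}
```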

 

You are correct regarding the addresses. I finally had time to play around with lldb and those addresses check out (see log). The first one seems to belong to Go’s panic function, where I assume delve is trying to set a breakpoint. Using LLDB directly, I can access all the memory areas that delve wants to access.

 

I have also enabled the lldb debugserver output (the one from Apple) in the logfile. mach_vm_read seems to fail with 0x1 (invalid address). Strange. From a quick look at the lldb source code, debugserver seems to be the only place that uses mach_vm_read directly. I am not sure which higher-level equivalent lldb uses to implement its “memory read” function, but it does seem to behave differently.

 


lldboutput.txt

Alessandro Arzilli

Dec 6, 2020, 8:45:14 AM
to Christian Banse, delv...@googlegroups.com
On Sun, Dec 06, 2020 at 01:08:54PM +0000, Christian Banse wrote:
> Setting gcmdok to false helps! Reading registers one-at-a-time now indeed works with Apple’s debugserver.
>
> You are correct reg. the addresses. I finally had time to play around with lldb and those addresses check out (see log). The first one seems to relate to Go’s panic function which I assume delve is trying to set a breakpoint there. Using LLDB directly, I can access all the memory areas that delve wants to access.
>
> I have also activated the lldb debugserver output (the one from Apple) in the logfile. mach_vm_read seems to fail with 0x1 (invalid address). Strange. Having a quick look at the lldb source code, it seems the debugserver is the only place that directly uses mach_vm_read. Not sure exactly, which higher level equivalent lldb uses to implement its “memory read” function, but it does seem to behave differently.
>

You can also make lldb connect to debugserver. Start debugserver with:

$ debugserver 127.0.0.1:30000 ./executable

And then (using the same port):

$ lldb ./executable
(lldb) process connect connect://127.0.0.1:30000
(lldb) break set ...

It should fail with the same error unless we're doing something wrong. If it
works, the logs would tell us what we're doing wrong; if it doesn't, then the
only remaining thing to do is report it to LLVM.

I've looked at the issue tracker and found nothing.

Christian Banse

Dec 6, 2020, 11:32:22 AM
to Alessandro Arzilli, delv...@googlegroups.com

I have found the culprit: it was ASLR. It causes the main __TEXT section to start not at 0x100000000 but at a random address (which is the point of it anyway). Starting dlv with --disable-aslr makes it work with the debugserver on arm64 as well (see log) on my branch. Basically, I can launch the inferior, set breakpoints and inspect locals. There is still some problem displaying the program code while stopped at a breakpoint.

 

Disclaimer: I am not an expert on ASLR. I usually do “higher”-level programming for Go microservices and stuff like that, so bear with me, as I am fairly new to this “low-level” programming.

 

It seems that on my darwin/arm64 machine ASLR is always used. On my old Intel Mac (still running 10.15), --disable-aslr is not needed and I can always find the Mach-O magic (0xcf 0xfa 0xed 0xfe) at 0x100000000. BTW: lldb seems to automagically launch the inferior with ASLR disabled, whereas debugserver needs a special flag to do it. Luckily, this was already integrated into delve.
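
To make the address arithmetic concrete, here is a minimal sketch of the two checks described above: recognising the 64-bit Mach-O magic in bytes read from memory, and computing how far ASLR has shifted an address away from the static base of 0x100000000. The helper names and the example load address are hypothetical.

```go
package main

import (
	"debug/macho"
	"encoding/binary"
	"fmt"
)

// staticBase is where the executable starts when ASLR is disabled,
// as observed in this thread.
const staticBase uint64 = 0x100000000

// isMachO64 reports whether b starts with the 64-bit Mach-O magic
// (0xcf 0xfa 0xed 0xfe in memory on a little-endian machine).
func isMachO64(b []byte) bool {
	return len(b) >= 4 && binary.LittleEndian.Uint32(b) == macho.Magic64
}

// slide is the offset ASLR added to the static base address.
func slide(actualBase uint64) uint64 {
	return actualBase - staticBase
}

// relocate converts an address from the binary's debug info into the
// address it has in a process loaded at actualBase.
func relocate(staticAddr, actualBase uint64) uint64 {
	return staticAddr + slide(actualBase)
}

func main() {
	fmt.Println(isMachO64([]byte{0xcf, 0xfa, 0xed, 0xfe})) // prints "true"

	// 0x104a3c000 is a made-up load address for illustration.
	fmt.Printf("%#x\n", relocate(0x10005ee90, 0x104a3c000)) // prints "0x104a9ae90"
}
```

With ASLR disabled the slide is zero, so the static addresses can be used unchanged, which matches the behaviour seen with --disable-aslr.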

 

Additionally, while I was trying to find out what was going on, I pretty much fixed the native backend (see second log file). The main fix is an entitlements file, and I replaced the fork-exec mechanism with posix_spawn, similar to what debugserver does on macOS, with the added flag for disabling ASLR. You can find the current progress at https://github.com/oxisto/delve/tree/fix-native-backend. It works quite well, except for some minor kinks, such as continue seeming to be broken if you set a second breakpoint.

 

The question would be… How to proceed from here?

 


gdbserver-disable-aslr.log
native-backend.log

Alessandro Arzilli

Dec 6, 2020, 12:00:14 PM
to Christian Banse, delv...@googlegroups.com
On Sun, Dec 06, 2020 at 04:32:18PM +0000, Christian Banse wrote:
> I have found the culprit: It was ASLR. causes the main.__TEXT section to not start at 0x100000000 but rather a random value ASLR (which is the point of it anyway). Starting dlv with –-disable-aslr makes it also worked with the debugserver on arm64 (see log) on my branch. Basically, I can launch the inferior, set breakpoints and watch for locals. There is still some problem displaying the program code while in the breakpoint.

That's great.

Instead of passing --disable-aslr you should probably use -buildmode=exe
when building the executable. We should also add the flag by default when
runtime.GOOS=="darwin" in pkg/gobuild/gobuild.go (optflags).

We should also detect if the executable is PIE in pkg/proc/bininfo.go
(loadBinaryInfoMacho) and return an error if it is.

https://golang.org/pkg/debug/macho/#Type

There's a FlagPIE constant for the Flags field of FileHeader, but I haven't
checked if that's it. There's a bug in dsymutil that makes PIE executables
not debuggable.
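
A minimal sketch of such a check, using only the standard debug/macho package, might look like this. The helper names `isPIE` and `checkNotPIE` are hypothetical, and whether the PIE flag alone is the right criterion is exactly the open question mentioned above.

```go
package main

import (
	"debug/macho"
	"errors"
	"fmt"
	"os"
)

// isPIE reports whether the Flags field of a Mach-O file header has the
// PIE bit set.
func isPIE(flags uint32) bool {
	return flags&macho.FlagPIE != 0
}

// checkNotPIE opens the Mach-O file at path and returns an error if it is
// marked as a position independent executable.
func checkNotPIE(path string) error {
	f, err := macho.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()
	if isPIE(f.Flags) {
		return errors.New("cannot debug PIE executable")
	}
	return nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: piecheck <mach-o executable>")
		return
	}
	if err := checkNotPIE(os.Args[1]); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("not PIE")
}
```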

> Disclaimer: I am not an export on ASLR. I usually do “higher” level programming for Go microservice and stuff like that, so bear with me that I am fairly new to this “low-level” programming
>
> It seems that on my darwin arm64 ASLR is always used, on my old Intel mac (still running 10.15 as well), the –-disable-aslr is not needed and I can always find the macho- magic (0xcf 0xfa 0xed 0xfe) at 0x100000000. BTW: lldb seems to automagically launch the inferior with ASLR disabled. debugserver needs a special flag to do it. Luckily, this was already integrated into delve.
>
> Additionally, when I was trying to find what was going on, I pretty much fixed the native-backend (see second log file). The main fix is done using an entitlements file and I replaced the fork-exec mechanisms with a posix_spawn – similarly to what debugserver does on macOS, with the added flag for disabling ASLR. You can find the current progress on https://github.com/oxisto/delve/tree/fix-native-backend. It works quite good, except some minor kinks, such as continue seems to be broken if you set a second breakpoint.
>
> The question would be… How to proceed from here?

Well the changes should be made to gdbserver.go and gdbserver_conn.go
instead of being in separate _arm64.go files, and then it should pass `make
test`.


Christian Banse

Dec 7, 2020, 5:44:11 AM
to Alessandro Arzilli, delv...@googlegroups.com

Is there any other way, e.g. a “go tool” command, to check for the PIE flag?

 

I have added

    if exe.Flags&macho.FlagPIE != 0 {
        return errors.New("Cannot debug PIE executable")
    }

to loadBinaryInfoMacho and also

    if runtime.GOOS == "darwin" {
        args = append(args, "-buildmode=exe")
    }

to optflags(). But it still seems to produce a PIE executable (at least according to the flag), and debugging fails the same way as before if --disable-aslr is not used. So I guess it really is a PIE. Maybe PIE is now enforced on darwin/arm64?

 

Apart from minor kinks, debugging a PIE with ASLR disabled seems to be fine. Or are you assuming that I will eventually run into the dsymutil bug you mentioned?

 

 


Christian Banse

Dec 7, 2020, 9:55:04 AM
to Alessandro Arzilli, delv...@googlegroups.com

It actually looks like I am running into a brick wall with the debugserver approach. Setting a breakpoint and “continue”’ing to it returns a 0x91 bad access from handleThreadSignals and I have no idea why.

 

Meanwhile, the native approach has made good progress and I have opened a “work in progress” PR here: https://github.com/go-delve/delve/pull/2254 for people to check out and test (it should work on arm64 and amd64). There are still some ominous test failures though; I need to work on those.


Alessandro Arzilli

Dec 9, 2020, 4:23:44 AM
to Christian Banse, delv...@googlegroups.com
On Mon, Dec 07, 2020 at 02:55:01PM +0000, Christian Banse wrote:
> It actually looks like I am running into a brick wall with the debugserver approach. Setting a breakpoint and “continue”’ing to it returns a 0x91 bad access from handleThreadSignals and I have no idea why.

Under what circumstance? What does the gdbwire log say?

FYI the problem with macos and PIE is described here: https://github.com/golang/go/issues/25841

Christian Banse

Dec 11, 2020, 11:47:50 AM
to Alessandro Arzilli, delv...@googlegroups.com

It happens when simply setting an arbitrary breakpoint, in this case on a function of the stacktraceme test fixture. I have attached the log with the gdbwire and lldbout output.

 

./dlv debug --log --disable-aslr --log-dest log.txt --log-output lldbout,gdbwire _fixtures/stacktraceprog.go

 


log.txt

Alessandro Arzilli

Dec 13, 2020, 7:35:58 AM
to Christian Banse, delv...@googlegroups.com
On Fri, Dec 11, 2020 at 04:47:44PM +0000, Christian Banse wrote:
> It does so, by just setting an arbitrary breakpoint, in this case to a function of the stacktraceme test fixture. I have attached the log with gdbwire and lldbout
>
> ./dlv debug --log --disable-aslr --log-dest log.txt --log-output lldbout,gdbwire _fixtures/stacktraceprog.go

That's very strange. 0x91 means the program made a bad access and the
response to vCont has the main thread (0x5f99) with a PC register set to 0.

I think the first thing to do would be to verify that 0x10005ee90 is not in
the middle of an instruction (it looks like it is, but just to be sure).

And then I'd check if it also happens with lldb+debugserver, with logging
enabled on debugserver, and report it to the lldb project if it does.
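
As a quick sanity check for the first step: arm64 instructions are a fixed 4 bytes wide and 4-byte aligned, so an address that is not a multiple of 4 cannot be the start of an instruction. The sketch below only tests this necessary condition, it does not replace looking at the disassembly, and the helper name is hypothetical.

```go
package main

import "fmt"

// arm64InsnSize is the fixed width of an arm64 instruction in bytes.
const arm64InsnSize = 4

// alignedForArm64 reports whether addr could be the start of an arm64
// instruction, judged by alignment alone.
func alignedForArm64(addr uint64) bool {
	return addr%arm64InsnSize == 0
}

func main() {
	fmt.Println(alignedForArm64(0x10005ee90)) // prints "true"
	fmt.Println(alignedForArm64(0x10005ee92)) // prints "false"
}
```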

Christian Banse

Dec 31, 2020, 10:42:10 AM
to Alessandro Arzilli, delv...@googlegroups.com

It only took me almost half a month to figure it out, but it turns out the set-breakpoint command was off. It seems that you need to specify the “kind” of breakpoint, which is usually the length of the breakpoint instruction. On amd64 that is 1, but it needs to be 4 for arm64. Setting a breakpoint now works with the lldb approach.
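
A minimal sketch of the idea, assuming the standard GDB remote protocol form `Z0,<addr>,<kind>` for inserting a software breakpoint: the kind is chosen per architecture instead of being hardcoded to 1. The helper names are hypothetical and this is not the actual delve code.

```go
package main

import (
	"errors"
	"fmt"
)

// breakpointKind returns the "kind" field for a software breakpoint,
// i.e. the length in bytes of the breakpoint instruction.
func breakpointKind(arch string) (int, error) {
	switch arch {
	case "amd64":
		return 1, nil // the amd64 breakpoint instruction is one byte
	case "arm64":
		return 4, nil // arm64 instructions are always four bytes
	default:
		return 0, errors.New("unsupported architecture: " + arch)
	}
}

// setBreakpointPayload builds the payload of a packet that inserts a
// software breakpoint at addr. Address and kind are written in hex.
func setBreakpointPayload(addr uint64, arch string) (string, error) {
	kind, err := breakpointKind(arch)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("Z0,%x,%x", addr, kind), nil
}

func main() {
	payload, err := setBreakpointPayload(0x10005ee90, "arm64")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(payload) // prints "Z0,10005ee90,4"
}
```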

 

