One approach is to maintain a shadow stack that holds the pointers in a place the GC already knows about, such as an array allocated in the heap. This can be done in Go, the language. Dereferences would go through a level of indirection. Perhaps one would pass an index into the array instead of the pointer and look up the location of the shadow stack from a VM structure. This way the stack contains no pointers _into the heap_, so the _GC_ is happy. You might still have to deal with pointers to stack-allocated objects, since stacks can be moved and so forth, but that is not the problem being discussed.
Go the implementation, as of Go 1.13, has a GC that does not move heap objects. This means that to keep a heap object live, the GC only needs to know about a single pointer to it. That is handy, since you can push the pointer onto the shadow stack and also onto the call stack: as long as the shadow stack is visible, the object will not be collected. Note that this involves a barrier on all pointer writes, so it is more than just a change to the calling conventions. Reads, on the other hand, would run at full speed and would not require a level of indirection or a barrier unless and until Go the implementation moved to a moving collector.
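A rough sketch of that variant, again with made-up names (`shadow`, `pin`, `unpinAll`, `readPinned`). It assumes a non-moving collector and deliberately breaks the documented `unsafe.Pointer` rules by converting a `uintptr` back to a pointer, so `go vet` will complain, and it may be rejected when checkptr instrumentation is on (for example under `-race`). The `uintptr` stands in for the copy that JIT code would keep on the call stack, invisible to the GC.

```go
package main

import (
	"runtime"
	"unsafe"
)

// shadow is a hypothetical package-level shadow stack. It is an ordinary
// Go slice of pointers, so the GC keeps every object it refers to alive.
var shadow []unsafe.Pointer

// pin is the write-side hook: it records p on the shadow stack and
// returns the raw address that JIT code would keep on the call stack.
// Assumption: the collector never moves heap objects.
func pin(p unsafe.Pointer) uintptr {
	shadow = append(shadow, p)
	return uintptr(p)
}

// unpinAll clears the shadow stack so the objects become collectable.
func unpinAll() {
	for i := range shadow {
		shadow[i] = nil
	}
	shadow = shadow[:0]
}

// readPinned reads through the raw address after a GC cycle. The only
// pointer the GC can see at that point is the one in shadow.
func readPinned() int {
	v := new(int)
	*v = 7
	raw := pin(unsafe.Pointer(v))
	v = nil      // drop the typed pointer; only the shadow slot remains
	runtime.GC() // the object survives because shadow still refers to it
	got := *(*int)(unsafe.Pointer(raw)) // full-speed read, no indirection
	unpinAll()
	return got
}
```

Every place where the JIT code stores a heap pointer would need a matching call like `pin`, which is the write barrier cost mentioned above.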
I would explore this approach first, since all of the pieces are under your control. Developing an ABI for stack maps would involve other people with differing agendas and would likely slow you down. Likewise, forking would come with the usual maintenance/merge headaches.
Update:
I found a solution using AMD64 assembly trickery to call arbitrary Go functions from JIT code, while the JIT code is running on the Go stack and accessing Go memory.
It is extremely hackish and guaranteed to break as soon as the Go function call ABI changes.
I will try to implement the same trick in ARM64 assembler - there are differences, and I am not yet sure it will work.
cosmos72
Don't forget to call the write barriers.