Guaranteeing deterministic execution of Golang code

396 views
Skip to first unread message

Stan Srednyak

unread,
Jan 14, 2023, 5:52:55 PM1/14/23
to golang-nuts
How much of Golang functionality must be excluded in order to guarantee deterministic execution on a fixed system?

There are typical sources of nondeterminism

1. /dev/urandom

2. time

We should include system variables here, but lets suppose we fix the system.

One more source could be system files. Lets say we use chroot to jail the process. This should be done carefully: naive use does not exclude /dev/urandom , and as a result e.g. RSA key generation has access to randomness. But lets assume that we dealt with this issue.


Also, the stack can be a source of randomness. In Golang, it is possible to get info about the stack. But lets say, we blocked these possibilities by forbidding the access to runtime.

What else is there?

At the moment it seems that to guarantee deterministic execution it is necessary to block access to modules:
runtime
syscall
time
crypto -- the parts that have to do with key generation
os -- parts that get system variables and process information
math/rand

Which other standard modules can give rise to nondeterminism?

It seems that modules
-ioutil
-bufio
-most of os
-strings
-bytes

are "nondeterminism safe".

File info can be a source of nondeterminism - the last access time in nanoseconds.



Axel Wagner

unread,
Jan 14, 2023, 6:30:13 PM1/14/23
to Stan Srednyak, golang-nuts
There's also maps, select and goroutines in general. Funnily enough I blogged about this some years ago, for fun.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/731a9430-6c0d-4f9f-87ec-75f833cc544fn%40googlegroups.com.

Axel Wagner

unread,
Jan 14, 2023, 6:38:18 PM1/14/23
to Stan Srednyak, golang-nuts
Oh, also, fmt: https://go.dev/play/p/xxOw3vqWR4u (probably need to run this offline)

Honestly, I don't think there really is a practical way to prevent non-determinism without severly hampering the language. Your best bet is probably to compile a program to webassembly and then don't give the VM access to any source of entropy.

Kevin Chowski

unread,
Jan 15, 2023, 3:03:50 PM1/15/23
to golang-nuts
I think a little more background about the context of the question would be helpful. Determinism may mean different things in different situations depending ont he guarantees you need. Also, did you know that the go playground (play.golang.org) is intended to be generally deterministic so that output can be cached? That might be a good place for you to start. I have a few more general thoughts:

Is the code trusted or could it be "attacker" controller? There are going to be a lot of random execution delay differences on a modern multiprocessor, and they'd be noticeable to a goroutine calling time.Sleep in a loop especially if the program was not the only application on the physical host. Theoretically, you could use this as a random number generator and cause non-deterministic execution, but obviously that code would be quite weird and probably wouldn't make it past code review. But if the code is untrusted, you may have to worry about it.

Another thought is that goroutine scheduling is effectively non-deterministic on a multiprocessor (I don't know enough about the runtime to say whether this is true on a uniprocessor), so the runtime's scheduling algorithm (plus natural delays I mentioned above) can cause non-determinism to emerge. For example, depending on goroutine scheduling order, this code may sometimes crash: https://go.dev/play/p/lEZEAYdji-f . As Axel said you will need to run it locally, the playground results may be cached iirc. I ran it 100 times and it crashed 9 times. This code is contrived, but in a large system this sort of non-determinism may still happen due to the general complexity of it all, and it may or may not matter to you.

Also, what exactly do you mean by "determinism"? Another example: if you create many goroutines in a loop and have them all do an atomic.AddInt64 on a shared variable (e.g. for them to get a global ID for themselves), the specific number each goroutine is assigned will essentially be random. The numbers will not form a continuous distribution, but they will be random at least in some sense on a multiprocessor. If the code does different things based on a combination of this kind of simple non-determinism, it can cause different code paths and different specific interleaving of the code in different executions. But, in practice it there may not be an observable effect in your average program, so I don't know if you consider this aspect important.

-Kevin

Kevin Chowski

unread,
Jan 15, 2023, 3:11:19 PM1/15/23
to golang-nuts
Sorry, I should have read Axel's blogpost before posting. I enjoyed reading it, thanks for sharing :)  some of what I was talking about is covered with more code examples in the post.

Stan Srednyak

unread,
Jan 15, 2023, 6:38:42 PM1/15/23
to Axel Wagner, golang-nuts
hi Axel,

thanks for sharing. Why did you mention webassembly specifically, why not go or c?

Stan

Axel Wagner

unread,
Jan 15, 2023, 6:59:21 PM1/15/23
to Stan Srednyak, golang-nuts
On Mon, Jan 16, 2023 at 12:38 AM Stan Srednyak <stan....@gmail.com> wrote:
hi Axel,

thanks for sharing. Why did you mention webassembly specifically, why not go or c?

I mentioned WebAssembly specifically because a) it has a well sandboxed, single-threaded VM and b) there are existing compilers for Go.
I don't understand why you mention C. If you want control over the execution environment (i.e. if you want to isolate it to be deterministic) that is literally the worst choice.

Robert Engels

unread,
Jan 15, 2023, 9:18:37 PM1/15/23
to Axel Wagner, Stan Srednyak, golang-nuts
This is a very strange discussion. I don’t understand the purpose. Malloc() isn’t even deterministic. Which means any arbitrary program can take the address to generate entropy regardless of having time, etc available. 

I didn’t look it up, but I’m pretty sure there is a “law” that if the language/system has any concurrency available then it is the correctness of the program that guarantees determinism - ie. same inputs generate same outputs. 



On Jan 15, 2023, at 5:59 PM, 'Axel Wagner' via golang-nuts <golan...@googlegroups.com> wrote:



Brian Candler

unread,
Jan 16, 2023, 4:10:47 AM1/16/23
to golang-nuts
Are you trying to find a pure functional subset of go?

In a "pure function", the return value(s) are calculated only from the arguments. The result is not affected by the state of the system, and the state of the system is not affected by the function.  That means you can't even do a print() inside a function.

Or are you trying to do some sandboxing of untrusted, user-submitted code, but allowing a larger scope than pure functional behaviour?  For example, do you need to allow access the filesystem, or send and receive messages on a channel or queue?  If so, does it have to be Go code that is sandboxed?

For some applications, an interpreter for a "safe" language may be what you want (e.g. jsonnetcuestarlark).  Or you might be able to configure an interpreter for some other language - even an interpreter for Go - to have the restricted behaviour you require.

More generally: it's possible to write pure functional code which returns a value which is a "request" to do something.  Such requests are then very easy to filter, perform if safe, and then invoke the next function with the result.  This is the basic idea behind the IO Monad in functional languages.

Axel Wagner

unread,
Jan 16, 2023, 4:34:25 AM1/16/23
to Robert Engels, Stan Srednyak, golang-nuts
On Mon, Jan 16, 2023 at 3:18 AM Robert Engels <ren...@ix.netcom.com> wrote:
This is a very strange discussion. I don’t understand the purpose. Malloc() isn’t even deterministic. Which means any arbitrary program can take the address to generate entropy regardless of having time, etc available. 

I didn’t look it up, but I’m pretty sure there is a “law” that if the language/system has any concurrency available then it is the correctness of the program that guarantees determinism - ie. same inputs generate same outputs. 

FWIW this is definitely not true, at least not in the context of Go. This function is correct, yet defined to be non-deterministic:

func F() string {
    ch := make(chan int)
    close(ch)
    select {
    case <-ch:
        return "foo"
    case <-ch:
        return "bar"
    }
}

I think the question of "when is a Go program deterministic" makes full sense, FWIW. Because being deterministic is a desirable property of many programs (compilers being one obvious example). So considering this question gives you insight into how to write such programs. For example, realizing that map iteration order is not defined to be deterministic leads you to sort keys of a map before output, if you want deterministic output.

I don't think there's much more to be learned from it though - you quickly find out, that it's essentially impossible (or is unreasonably restrictive to the language) to guarantee a program's determinism against an adversary. So there's no real reasonable "deterministic subset of Go" - all you can do, is do your best to avoid the stuff you discovered.

Bakul Shah

unread,
Jan 16, 2023, 7:27:23 AM1/16/23
to Axel Wagner, Robert Engels, Stan Srednyak, golang-nuts
I would think the map iteration order is arbitrary but deterministic. That is, the same set of keys will be iterated the same way every time. This is not the case with the select statement.

Axel Wagner

unread,
Jan 16, 2023, 7:39:38 AM1/16/23
to Bakul Shah, Robert Engels, Stan Srednyak, golang-nuts
On Mon, Jan 16, 2023 at 1:27 PM Bakul Shah <ba...@iitbombay.org> wrote:
I would think the map iteration order is arbitrary but deterministic. That is, the same set of keys will be iterated the same way every time.

If you would think that, you'd be wrong. The iteration order of a map is unspecified. But gc (the most prevalent Go implementation) explicitly randomizes it.
 
This is not the case with the select statement.

Well, that's why I chose that example. Because it makes it easier to talk about "a definitely non-deterministic program" and "a program who's determinism is not specified, but is randomized in practice, but you might argue that in certain circumstances…".

But the point still stands: Thinking about where non-determinism comes from, let you avoid it, where it matters. "Sorting keys of a map before output" is just as much a strategy of that as "synchronizing the order of operations" or "ordering the output". There are different strategies to achieve deterministic output, with different tradeoffs. And they are worth thinking about, to me.
Reply all
Reply to author
Forward
0 new messages