Defer in goroutines isn't guaranteed to execute


Robert Tweed

unread,
May 26, 2014, 1:29:27 AM5/26/14
to golan...@googlegroups.com
I'm having an issue with the fact that it appears (perhaps by design,
perhaps a bug) that a deferred function in a goroutine is not guaranteed
to be run if the program terminates. That doesn't seem right to me,
since the idiomatic use of defer is to perform cleanups where it's
implicit that the programmer wants to have that guarantee - a bit like a
finally block.

In any case, here's what I'm trying to do. I am trying to write a server
object (in fact I plan to repeat this pattern for lots of services in
the application) that does its own lazy resource initialisation on
demand. It should therefore also clean up after itself, either when
destroyed explicitly or when the program terminates.

My first attempt has something like this, which doesn't work - note that
because this is intended to be lazy, the client is not expected to call
open() directly; that's why it's private, and it gets called automatically
from certain public methods as needed:

func (this *Server) open() {
    this.handle = someresource.Open()
    go func() {
        defer func() {
            someresource.Close()
            this.handle = nil
        }()
        <-this.sigClose
    }()
}

The idea is that if you push anything into the sigClose channel, that
will explicitly kill the object. The intention of using defer is to
*guarantee* that if the client never does that explicitly, it will
eventually happen implicitly at program exit, in spite of the goroutine
remaining blocked.

I haven't got as far as testing with panic conditions etc., because if
main() just exits, the deferred function is never called.
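
Here's a minimal, self-contained sketch of the behaviour I'm describing (the names are just made up for illustration):

package main

import (
    "fmt"
    "time"
)

func main() {
    sigClose := make(chan struct{})
    go func() {
        defer fmt.Println("cleanup") // never printed
        <-sigClose                   // blocks forever; nothing ever sends
    }()
    time.Sleep(100 * time.Millisecond)
    // main returns here and the whole program exits; the blocked
    // goroutine's deferred call never runs
}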

Is there a way to change this behaviour so that the deferred function is
guaranteed to be run?

Otherwise, it looks like the caller is going to have to know up-front
which objects need to have deferred cleanup operations and call them
itself from within the main loop, which I want to avoid. I don't want to
clutter up the code by deferring close operations for every object that
conceivably /might/ need it (either in the future, or depending on a
config file option that can't be known about at compile-time) when most
of the time it won't be necessary. The caller should not care about
whether the object is buffered internally or not; it should just work
the same in either case.

So, is there a general (preferably idiomatic) solution to this problem?
This pattern is really just RAII except that Go doesn't have
deterministic destructors, so we generally need to use defer to achieve
the same effect. I'd be fairly happy with another method that gets the
same result, as long as it doesn't break encapsulation by putting the
burden of the defer onto the caller. I also don't want to introduce any
unnecessary dependencies, e.g., passing a WaitGroup to every object
whether it's needed or not.

- Robert

Tamás Gulácsi

unread,
May 26, 2014, 1:41:52 AM5/26/14
to golan...@googlegroups.com
Put that defer in your main.
Btw how do you signal that channel?

David Symonds

unread,
May 26, 2014, 1:41:58 AM5/26/14
to Robert Tweed, golang-nuts
On 26 May 2014 15:29, Robert Tweed <fistful.o...@gmail.com> wrote:

> I'm having an issue with the fact that it appears (perhaps by design,
> perhaps a bug) that a deferred function in a goroutine is not guaranteed to
> be run if the program terminates. That doesn't seem right to me, since the
> idiomatic use of defer is to perform cleanups where it's implicit that the
> programmer wants to have that guarantee - a bit like a finally block.

That programmer has made an error. Deferred functions are only
guaranteed to run when the function returns (including during an
unwind due to a panic). It's not like a finally block, which is also
why panic/recover is not like exceptions in other languages.


> Is there a way to change this behaviour so that the deferred function is
> guaranteed to be run?

There isn't, no.

The recommended pattern if you need some cleanup to be done if your
program abnormally terminates is to run the program in a script or
supervisor program, and have that do the cleanup.

Robert Tweed

unread,
May 26, 2014, 2:15:43 AM5/26/14
to golan...@googlegroups.com
On 26/05/2014 06:41, Tamás Gulácsi wrote:
> Put that defer in your main.
That's exactly the pattern I'm trying to avoid as it breaks
encapsulation on the object, or is at best a leaky abstraction.
> Btw how do you signal that channel?
>
Any method that might signal that the resource isn't needed anymore
would just put something in it, e.g:

func (this *Server) Kill() {
    this.sigClose <- nil
}

Note that the code as it stands does work if I call this explicitly from main:

func main() {
    bar := foo.NewServer()
    defer bar.Kill()
}

However, that's somewhat redundant code, and my primary concern here is
objects where the internal resource usage changes for some reason, so
the client will not know about the need to call Kill(). Indeed this
could also occur quite normally because NewServer() could return
different concrete types, some of which use some kind of buffering that
requires a shutdown function, but most of which do not. Imagine for a
moment that NewServer is actually a factory that returns the appropriate
object based on a config file.

It seems highly redundant to require that every interface include a Kill
method, and every caller must add it to their defers, even though it
won't be used most of the time.

This is of course relevant where dealing with high level abstractions
where the caller shouldn't be expected to care about the low-level
semantics of the object it's dealing with. Say I create a logging server
- the caller should expect to be able to call logging functions, but
shouldn't care whether those logs are going to disk, memcached, a database,
/dev/null, or whatever. Nor should it care whether or not that object
requires some deferred sync on exit.
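
To make that concrete, here's a rough hypothetical sketch (all names invented) of the kind of factory and abstraction I have in mind:

package main

type Logger interface {
    Log(msg string)
}

// diskLogger buffers writes in memory and really needs a final flush.
type diskLogger struct{ buf []string }

func (d *diskLogger) Log(msg string) { d.buf = append(d.buf, msg) }

// nullLogger has nothing to clean up.
type nullLogger struct{}

func (nullLogger) Log(msg string) {}

// NewLogger picks the concrete type from configuration, so the caller
// can't know (and shouldn't care) whether any cleanup is needed.
func NewLogger(backend string) Logger {
    if backend == "disk" {
        return &diskLogger{}
    }
    return nullLogger{}
}

func main() {
    l := NewLogger("disk")
    l.Log("hello") // same call whether or not it's buffered internally
}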

It's fine when there are only one or two objects like that, but it's not
fine when you want to compose a complex application out of many small
service objects like this and where it's likely the low-level semantics
of those objects is going to change over time.

- Robert

Robert Tweed

unread,
May 26, 2014, 3:17:52 AM5/26/14
to golang-nuts
On 26/05/2014 06:41, David Symonds wrote:
> The recommended pattern if you need some cleanup to be done if your
> program abnormally terminates is to run the program in a script or
> supervisor program, and have that do the cleanup.

That does seem like using a sledgehammer to crack a nut in this case.
This issue is not with abnormal program termination (in which case all
bets are off anyway) but simply guaranteeing that certain things will
definitely happen at some point, either during the normal lifetime
(which can be done with SetFinalizer) or when the program terminates
*normally*. A supervisor program would solve the problem of cleaning up
after messy crashes, but doesn't solve the problem of, for example,
synchronising a buffer to disk when the program exits, since the
supervisor wouldn't have access to the contents of that buffer anyway.

On 26/05/2014 06:41, David Symonds wrote:
> On 26 May 2014 15:29, Robert Tweed <fistful.o...@gmail.com> wrote:
>
>> ... the idiomatic use of defer is to perform cleanups where it's
>> implicit that the programmer wants to have that guarantee - a bit like
>> a finally block.
>
> That programmer has made an error.

I don't mean to imply that there's anything in the specification to say
that defer should work like a finally block, but Go is primarily
designed around the principle of least surprise and intuitively, defer
just means "make sure this code gets run after what follows, even if
there is a panic". It therefore seems contradictory that such code is
not run on normal program termination. It's only an issue for goroutines
because normal program termination is the only time there's ever a
distinction between a function returning and a function just "ending".

- Robert

Robert Tweed

unread,
May 26, 2014, 3:25:38 AM5/26/14
to golan...@googlegroups.com
On 26/05/2014 07:54, Tianran Shen wrote:
> Maybe a global defer list ?
>
> http://play.golang.org/p/NUglosQlrT
>
Something like that might be the only practical solution - in which case
I'll probably wrap it up in a factory object of some sort so that there
isn't too much exposed wiring and to try to avoid introducing a global
dependency into every object that needs a destructor. I'd also prefer a
cleaner solution if one exists.

- Robert

Jan Mercl

unread,
May 26, 2014, 4:02:59 AM5/26/14
to Robert Tweed, golang-nuts
On Mon, May 26, 2014 at 7:29 AM, Robert Tweed
<fistful.o...@gmail.com> wrote:

The problem is not rooted in the defers. It's just that you rely on
some goroutines to finish but no synchronization is used to guarantee
that. One can for example use a sync.WaitGroup for that purpose.
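
For illustration only, a minimal sketch (not taken from your code):

package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    wg.Add(1)
    go func() {
        defer wg.Done()
        defer fmt.Println("cleanup") // runs, because main waits below
        fmt.Println("work")
    }()
    wg.Wait() // without this, main could return before the defers run
}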

-j

Robert Tweed

unread,
May 26, 2014, 4:06:16 AM5/26/14
to golang-nuts
On 26/05/2014 08:59, Tianran Shen wrote:
> Also, runtime.SetFinalizer might be another choice if you were searching for something like destructors in OO.
Not quite. It's useful if you want to do lazy cleanup of a resource and don't particularly care when or whether it actually happens, but there's still no guarantee a finalizer will run. It gets run when the GC decides to run it and isn't run at all if the program ends before then.

Again the canonical example is if you have an in-memory buffer (i.e., a write cache) and you want to guarantee that it is flushed properly on exit, a finalizer will not provide that guarantee.

- Robert

Tamás Gulácsi

unread,
May 26, 2014, 4:24:31 AM5/26/14
to golan...@googlegroups.com
See what io does, and implement a Closer. Thus every user will know to put Close in a defer or call it explicitly.
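
Something like this (just a sketch - Server and sigClose are as in your first post):

// Close satisfies io.Closer, so callers know to defer it.
func (s *Server) Close() error {
    s.sigClose <- nil
    return nil
}

// caller:
//   srv := NewServer()
//   defer srv.Close()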

Robert Tweed

unread,
May 26, 2014, 4:29:37 AM5/26/14
to golang-nuts
Not exactly. Adding a WaitGroup doesn't solve the basic problem of
offloading the responsibility to the caller instead of encapsulating it
properly, which is fundamentally what I'm trying to solve.

Also, in this instance a WaitGroup isn't quite the right thing anyway.
The goroutine is blocked permanently, unless you send something on the
quit channel. A WaitGroup would allow you to stop main from ending until
the goroutine is done, but all that would do here is prevent the program
from ever ending normally as the goroutine isn't waiting for some work
to complete. It's only there to free up the resource when the program
ends (which doesn't work) or when asked to explicitly (which the client
might never do).

So we are left with the problem that there doesn't appear to be any way
to guarantee that synchronisation will occur without some kind of global
god object, or offloading responsibility onto the caller in every
instance (the god object is still offloading responsibility to the
caller, but it only needs to happen once).

I still think both this practical issue and the apparently surprising
behaviour of defer would be resolved if the language spec provided a
guarantee that deferred functions in goroutines will be run on normal
program termination. However, that's just my opinion and it doesn't
provide an immediate solution.

- Robert

Robert Tweed

unread,
May 26, 2014, 4:35:44 AM5/26/14
to golan...@googlegroups.com
On 26/05/2014 09:24, Tamás Gulácsi wrote:
> See what io does, and implement a Closer. Thus every user will know to put Close in a defer or call it explicitly.
>
Again this is what I'm trying to avoid, since it means that if I have a
system comprised of a very large number of relatively high level
objects, we'd need to make that assumption about all of them even though
only a small number would ever need that. It's fine for something
low-level like io where you know exactly what kind of object you are
dealing with and therefore you know you should close it. It doesn't work
with this kind of high-level object that may or may not acquire some
resources on demand - the caller doesn't, and shouldn't, know about such
implementation details.

Also, even if you implement a Close() function that just means the
object meets the Closer interface, so it can be passed to things that
expect a Closer. It doesn't provide any guarantees that the client will
actually call Close.

- Robert

Dan Kortschak

unread,
May 26, 2014, 5:02:53 AM5/26/14
to Robert Tweed, golang-nuts
You either write a "god" object that manages the system you are creating or you expect the language to provide the deity for you. Go makes it pretty easy to commit theogenesis.

BTW Have you had a look at tomb (https://godoc.org/launchpad.net/tomb), which might help you in your quest.
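
Very roughly, the pattern it encourages looks like this (written from memory, so check the package docs rather than trusting the details here):

// assumes: import "launchpad.net/tomb"
type Worker struct {
    t tomb.Tomb
}

func (w *Worker) loop() {
    defer w.t.Done() // marks this goroutine as finished
    for {
        select {
        case <-w.t.Dying(): // someone called Kill; clean up and return
            return
            // ... other cases doing the real work ...
        }
    }
}

func (w *Worker) Stop() error {
    w.t.Kill(nil)     // ask the goroutine to stop
    return w.t.Wait() // block until it (and its defers) have finished
}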

Robert Tweed

unread,
May 26, 2014, 5:47:04 AM5/26/14
to golang-nuts
On 26/05/2014 10:02, Dan Kortschak wrote:
> You either write a "god" object that manages the system you are creating or you expect the language to provide the deity for you. Go makes it pretty easy to commit theogenesis.

Having certain things built-in helps in that at least it keeps things
standard. Hacking in global objects and the like gives you some
convenience, at the expense of portability. Or at least, it means an
increased chance of collisions or unnecessary duplication between
different libraries trying to do the same thing in different ways.

Take the destructor list package suggested by Tianran Shen. That's
certainly a way to solve the problem, except that now any code that uses
it is dependent on that package. Now say some 3rd party library does the
same thing but re-implements the same package... now you have two of
them. And so on. Of course the latter is not an extant problem, but I
don't particularly favour approaches that naturally lead to such
problems. Plus, if you have a hard-coded dependency within your objects,
it could create issues with unit testing. It's much better if the
dependency is injected from outside, but again the problem is doing that
without forcing the caller to know more than it should. That can be
solved by putting it all in a factory, but then you lose the ability to
use idiomatic constructor functions within the package itself - now you
have another package to maintain just to keep track of some dependencies
that shouldn't really be needed in the first place.

So yes, there are ways to solve the problem, but they are all a bit messy.
> BTW Have you had a look at tomb (https://godoc.org/launchpad.net/tomb), which might help you in your quest.

No, I haven't seen that. It looks like it might be handy for certain
situations, but unfortunately I can't see how it would help here.

- Robert

Jesse McNelis

unread,
May 26, 2014, 6:36:59 AM5/26/14
to Robert Tweed, golang-nuts
On Mon, May 26, 2014 at 6:35 PM, Robert Tweed
<fistful.o...@gmail.com> wrote:
> It doesn't work with this kind of high-level
> object that may or may not acquire some resources on demand - the caller
> doesn't, and shouldn't, know about such implementation details.

It's pretty hard to actually hide this kind of thing.
If you've got a non-memory resource (network connection, file, etc.)
then you've also got error handling associated with it.
You can't pretend to the caller that it's not there because the caller
has to handle the errors that result from it.

In Go it's common to expose these things and Go devs expect to have to
handle them.

> Also, even if you implement a Close() function that just means the object
> meets the Closer interface, so it can be passed to things that expect a
> Closer. It doesn't provide any guarantees that the client will actually call
> Close.

The caller not calling Close() is a bug in the caller. All the
solutions that attempt to make up for incompetence on the part of the
caller always end up making code harder to deal with and understand
for the competent.

Go makes very little effort to save programmers from their own
mistakes, but makes it much easier for them to understand those
mistakes.
If you're worried about callers forgetting to call Close() then you
should be terrified that they can forget to handle any errors you
return.

Andrew Gerrand

unread,
May 26, 2014, 7:40:44 AM5/26/14
to Robert Tweed, golang-nuts

On 26 May 2014 19:46, Robert Tweed <fistful.o...@gmail.com> wrote:
Having certain things built-in helps in that at least it keeps things standard. Hacking in global objects and the like gives you some convenience, at the expense of portability. Or at least, it means an increased chance of collisions or unnecessary duplication between different libraries trying to do the same thing in different ways.

So implement this as a library and start using it. Release it as open source, see if other people want to use it too. If enough people actually want to use such a thing, then it may be made part of the standard library at some point.

Note that if only part of this story actually happens, you still get what you need.

Andrew

John Waycott

unread,
May 26, 2014, 9:48:05 AM5/26/14
to golan...@googlegroups.com
It is the caller's responsibility to clean up its resources. Presumably, main() created the server, so it should shut it down. The server should have a public Shutdown() function that sends over your hidden channel to close resources the server was responsible for opening. Add a defer in the main() to call server.Shutdown().

The defers will still not be called in a few cases like calling log.Fatalf(), but you can't expect things to clean up properly in those situations.

Tianran Shen

unread,
May 26, 2014, 10:06:08 AM5/26/14
to Robert Tweed, golan...@googlegroups.com
Maybe a global defer list ?

http://play.golang.org/p/NUglosQlrT





--
Tianran Shen
School of Software Engineering
South China University of Technology
Guangzhou,PRC

Tianran Shen

unread,
May 26, 2014, 10:06:23 AM5/26/14
to Robert Tweed, golan...@googlegroups.com
Also, runtime.SetFinalizer might be another choice if you were searching for something like destructors in OO.



Steven Blenkinsop

unread,
May 26, 2014, 3:15:50 PM5/26/14
to Robert Tweed, golang-nuts
You could fairly easily combine it with a WaitGroup to create a MultiTomb, which you could then use to synchronize your program lifetime and ensure that all goroutines unwind before exiting. Of course, this approach requires each goroutine to synchronize and check whether the program is dying at appropriate times, but this is just how concurrent programs are written in Go.

Dan Kortschak

unread,
May 26, 2014, 4:11:35 PM5/26/14
to Steven Blenkinsop, Robert Tweed, golang-nuts
There's actually an example hidden in a link in the tomb docs to the playground that does this.

Jason E. Aten

unread,
May 26, 2014, 6:27:25 PM5/26/14
to golan...@googlegroups.com
On Sunday, May 25, 2014 10:29:27 PM UTC-7, Robert Tweed wrote:
I'm having an issue with the fact that it appears (perhaps by design,
perhaps a bug) that a deferred function in a goroutine is not guaranteed
to be run if the program terminates.

In my limited experience, shutdown sequences *are*, as you've noticed, the trickiest part of writing go servers. Plenty of opportunity for races.

Here's an example of what was involved for me, and how I handled it:

https://github.com/glycerine/goq/blob/master/workstart.go#L344

Note the signalling by closing two specific (and distinct) channels at the beginning and at the end of the shutdown sequence. For me, one rule of thumb that has evolved is that closing a channel is almost always preferable to a waitgroup, since it is like a broadcast that doesn't have to care how many listeners it has. Closing two channels in this way is a little like a two-phase commit. First announce to all goroutines that you are shutting down, then wait for them to all signal that they have shut down, then signal your client by closing your own Done channel.

Anyway, the above example is just an elaboration on the fundamental "close-a-channel" communication pattern involved in starting a goroutine and waiting for it to finish its job:

...
taskDone := make(chan bool)
go func() {
    // ... do some task work, then tell the world we are done
    close(taskDone)
}()
<-taskDone
// the goroutine has finished its task
...


Jason E. Aten

unread,
May 26, 2014, 6:39:28 PM5/26/14
to golan...@googlegroups.com
On Monday, May 26, 2014 3:27:25 PM UTC-7, Jason E. Aten wrote:

I meant to add, and you probably figured this out yourself already, but it's only a slight elaboration that ensures that all defers are run, using the fact that defers run at function boundaries.


taskDone := make(chan bool)

go func() {
    func() {
        defer func() { ... stuff you want to defer ... }()

        // ... do some task work, then tell the world we are done
    }()
    // here we know the deferred stuff is done.
    close(taskDone)
}()

// back on main/client goroutine:
<-taskDone
// the goroutine has finished its task, and all of its defers
...

Robert Tweed

unread,
May 26, 2014, 6:55:09 PM5/26/14
to golang-nuts
This actually sounds much more like a workable answer to the original question - it may not be part of the standard library, but it seems to be fairly well recognised so that's a step in the right direction at least. I'll investigate further and see if a MultiTomb does exactly what I want, or can be adapted to do so.

There's nothing in my original example that requires the goroutine to block forever with <-sigClose - that's just the simplest thing I could have used and I was hoping that defer would just do what I was expecting it to do in that case. It's easy enough to replace that pattern with something based on tomb instead so that it will quit if main is quitting (and main will wait for it). At least this should provide some consistency across the project. I still have the problem of how to manage the dependencies, but it looks like there isn't a simpler option.

- Robert

Robert Tweed

unread,
May 26, 2014, 6:56:39 PM5/26/14
to golan...@googlegroups.com
On 26/05/2014 23:27, Jason E. Aten wrote:
> ...
> taskDone := make(chan bool)
> go func() {
>     // ... do some task work, then tell the world we are done
>     close(taskDone)
> }()
> <-taskDone
> // the goroutine has finished its task
> ...
>
Thanks, I agree this stuff can be a pain - more generally just getting
different objects to synchronise correctly can be a pain. It usually
requires more than one channel for signalling and it can become
cumbersome at times. The tomb package looks pretty good for dealing with
the general problem (alongside waitgroups).

However there's a general misunderstanding of the original problem that
nearly everyone seems to be making here, which is that the goroutine is
doing "some work" in the background and needs to signal to the main
thread (or its own caller) that it is done. If that were the problem, a
WaitGroup could solve that, but that's not the problem at all.

The only reason the goroutine in my original example is blocking is
because I set it up like that, so it won't run Close() on its resource
handle until:

(a) The caller explicitly indicates they are done with it
(b) GC forces it (with the addition of a finalizer)
(c) The program terminates normally

The problem is that (c) never happens because the defer isn't called on
program termination if the goroutine is blocked; only if the goroutine
shuts down on its own before main() returns. It seems that the only way
to make that happen is to add something based on tomb so that main can
signal its intent to shut down, which in turn causes all goroutines to
shut down - basically I replace <-sigClose with something a bit more
complex to allow for that signalling from main().
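
In other words, something roughly like this (just a sketch - the shutdown channel and WaitGroup would come from whatever coordinates program exit, which is exactly the dependency I was hoping to avoid):

func (this *Server) open(shutdown <-chan struct{}, wg *sync.WaitGroup) {
    this.handle = someresource.Open()
    wg.Add(1)
    go func() {
        defer func() {
            someresource.Close()
            this.handle = nil
            wg.Done()
        }()
        select {
        case <-this.sigClose: // explicit Kill()
        case <-shutdown:      // main (or a shutdown manager) is exiting
        }
    }()
}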

- Robert

Dan Kortschak

unread,
May 26, 2014, 7:02:55 PM5/26/14
to Robert Tweed, golang-nuts
Roger Peppe's example linked from the tomb docs:

http://play.golang.org/p/Xh7qWsDPZP

Gustavo Niemeyer

unread,
May 27, 2014, 4:27:10 AM5/27/14
to Dan Kortschak, Robert Tweed, golang-nuts
This post provided background at the time the tomb package was
released, and is probably still a good read in this context:

http://blog.labix.org/2011/10/09/death-of-goroutines-under-control

As a coincidence, I've also been bringing back to life the project
that originally motivated the development of tomb: mup, an IRC bot
originally written in Erlang several years ago, mainly to get a more
practical feeling for the language characteristics. The bot is still
in use nowadays, with a single process sitting on 53 channels both in
Canonical and FreeNode, providing various productivity-related
features (bug creation/change reporting, commit messages, contact
information, etc).

I started porting it to Go back in 2011, but never got to finish the
task. Recently the co-workers that have been maintaining the machine
where it runs (pretty much unattended) contacted me as they want to
shut the machine down, and thus hand the bot back to me. That, plus an
idea I've got for a nice new design, prompted me to revive the 2011
project, and finish the port.

If you want to track progress, the project is evolving at:

https://github.com/niemeyer/mup

It's not yet ready for general consumption, but it's already a pretty
good example for how to properly handle clean termination and error
tracking for several goroutines under the responsibility of various
parties within the code, while still offering a clean exported API
that hides all the details away. If you're curious about that, I
suggest starting to read the code from StartBridge on bridge.go.

I'll do a more reasonable blog post explaining the project once it's ready.



--

gustavo @ http://niemeyer.net

Robert Tweed

unread,
May 27, 2014, 4:45:11 AM5/27/14
to golang-nuts
On 27/05/2014 09:26, Gustavo Niemeyer wrote:
> It's not yet ready for general consumption, but it's already a pretty
> good example for how to properly handle clean termination and error
> tracking for several goroutines under the responsibility of various
> parties within the code, while still offering a clean exported API
> that hides all the details away. If you're curious about that, I
> suggest starting to read the code from StartBridge on bridge.go.
>

Thanks, this looks quite similar to the kind of thing I'm aiming for.
I'll have a more thorough read over the code later.

JOOI, how do you find Go compares with Erlang for this kind of project?
I don't know Erlang so I'm curious to get a sense of how it compares
with Go. This kind of application is pretty much the canonical use-case
for either of the two languages so it's interesting that you've got a
version written in both, one of which is in production. I'm thinking in
terms of things like how it performs under real-world conditions; did it
take a lot of tweaking & system-level configuration to get it from
first-working to actually-works-in-production; is there a big difference
in the size & complexity of the code; etc. Mainly just your subjective
observations on the bigger pros and cons of each language in the context
of this specific application.

- Robert

Nate Finch

unread,
May 27, 2014, 7:02:50 AM5/27/14
to golan...@googlegroups.com
I guess I don't understand what you're actually asking for, because you seem to already know what you need to do, you just don't like the answer.  

You were using defer in a goroutine because you thought that would work for your case c, and it doesn't.  Ok, let's accept that and go back to the original problem.  

You need something that runs at the end of the program (which in Go is defined as the end of the main() function) to clean up.  So.... put something at the end of main() to clean up.  Make it a fancy service that you can register cleanup tasks with if you want.  It seems like you already know that's the answer.  

Am I missing something?

roger peppe

unread,
May 27, 2014, 8:42:54 AM5/27/14
to Nate Finch, golang-nuts
On 27 May 2014 12:02, Nate Finch <nate....@gmail.com> wrote:
> You need something that runs at the end of the program (which in Go is
> defined as the end of the main() function) to clean up. So.... put
> something at the end of main() to clean up. Make it a fancy service that
> you can register cleanup tasks with if you want. It seems like you already
> know that's the answer.

This is what I was about to suggest. Something like this.

package atexit

import (
    "os"
    "sync"
)

var (
    mu        sync.Mutex
    functions []func()
)

// Do registers the given function to be called
// when Exit is called.
func Do(f func()) {
    mu.Lock()
    defer mu.Unlock()
    functions = append(functions, f)
}

// Exit calls all functions registered with
// Do and then calls os.Exit with the given error code.
func Exit(code int) {
    for _, f := range functions {
        f()
    }
    os.Exit(code)
}
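
Then, hypothetically, the only thing to remember is that main must finish via Exit rather than just returning:

func main() {
    atexit.Do(func() { fmt.Println("flushing buffers") }) // registered by anyone
    // ... program logic ...
    atexit.Exit(0) // runs the registered functions, then exits
}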

Robert Tweed

unread,
May 27, 2014, 10:38:43 AM5/27/14
to golan...@googlegroups.com
On 27/05/2014 12:02, Nate Finch wrote:
> I guess I don't understand what you're actually asking for, because
> you seem to already know what you need to do, you just don't like the
> answer.
Yes, I've been programming in Go long enough that I can generally solve
most problems that come up. That doesn't mean those solutions are
necessarily optimal :)

Per my original question:

"I am trying to write a server object (in fact I plan to repeat this
pattern for lots of services in the application) that does its own lazy
resource initialisation on demand. It should therefore also clean up
after itself, either when destroyed explicitly or when the program
terminates. ... Otherwise, it looks like the caller is going to have to
know up-front which objects need to have deferred cleanup operations and
call them itself from within the main loop, which I want to avoid."

My intent was specifically to ensure that an object that internally (and
lazily) allocates a resource can hold on to it for as long as it needs
to and release it either in response to some runtime condition, or if
all else fails, when the program ends. The latter is the thing that
needs some kind of guarantee that it will actually run.

The idea of doing it with defer inside a blocked goroutine is that it
doesn't put any dependency on the caller to know about what the object
is doing internally. It would be perfect if it worked, but it doesn't.
Given David Symonds' clarification, I can understand /why/ it doesn't
work, but that doesn't mean I have to like it :)

The issue has always been about avoiding the abstraction-leakage or an
extra dependency, not about whether it's possible to make it work by
/any/ means (I did indeed already know that this was possible before
asking the question - I was specifically looking for something _better_
as I consider it quite messy compared to the RAII pattern). The somewhat
disappointing thing about defer not being guaranteed to run on exit is
that it would make an RAII-like implementation pretty trivial in Go
(i.e., the code I originally posted, which is nice and simple, without
any dependencies outside the core language, would have worked).

It doesn't look like there is a solution to that problem, but so far
some kind of shutdown manager based on tomb or something similar seems
to be the cleanest and most reusable of the options that will actually work.

- Robert

yy

unread,
May 27, 2014, 10:51:15 AM5/27/14
to roger peppe, Nate Finch, golang-nuts
On 27 May 2014 14:42, roger peppe <rogp...@gmail.com> wrote:
> // Do registers the given function to be called
> // when Exit is called.
> func Do(f func()) {
>     mu.Lock()
>     defer mu.Unlock()
>     functions = append(functions, f)
> }

Nice! Although I think you lost a great opportunity to call this
function Defer, which is what the OP was initially asking for. Of course,
then it should store the functions (or Exit should run them) in LIFO order, and
atexit would not be the best name for the package. Maybe something
like global.Defer and global.Return:

http://play.golang.org/p/LCn1yF7RGV

If this were a common idiom, it might come in handy to have a runtime.Defer,
and then Return would not be needed. But there doesn't seem to be a
big demand for the feature and it can be easily implemented by the
user.


--
- yiyus || JGL .

Robert Tweed

unread,
May 27, 2014, 11:14:40 AM5/27/14
to golan...@googlegroups.com
On 27/05/2014 15:51, yy wrote:
> On 27 May 2014 14:42, roger peppe <rogp...@gmail.com> wrote:
>> // Do registers the given function to be called
>> // when Exit is called.
>> func Do(f func()) {
>>     mu.Lock()
>>     defer mu.Unlock()
>>     functions = append(functions, f)
>> }
> Nice! Although I think you lost a great opportunity to call this function Defer... If this was a common idiom, it may come handy to have a runtime.Defer, and then Return would not be needed. But there doesn't seem to be a big demand for the feature and it can be easily implemented by the user.
Almost, but I'm not quite ready to back this as a perfect solution or
ready to go into stdlib. First off, with the original function, there's
a quit channel (sigClose) so if another method sends on that channel,
the deferred function will run and there won't be any references to it
held in RAM anymore.

With this, callbacks are deferred until the end no matter what, and
there's no way to remove them. This means if you create and destroy lots
of objects, you have a memory leak, as you'll keep adding more and more
deferred functions that will never be run until the program exits. Not
only that, but you now also have the problem that those deferred
functions are probably trying to call back to an object that should have
been destroyed already, and so now there's a reference to that object
(and its descendants) that won't get GC'd, making the potential memory
leak even worse.

(thinking about this, I'm not sure if having a blocked goroutine as per
my original code would also cause a memory leak if the object isn't
killed explicitly... I don't know enough about the GC to know if it will
know the goroutine is orphaned and run the finalizer that would kill it,
or if the reference to sigClose would keep it alive forever)

And even with that you still have the responsibility on someone (main,
really) to call Exit - if you just let main return then none of the
callbacks will ever be run, so it might be dangerous as a hidden
dependency (i.e., if some object uses it internally but the caller
doesn't realise they depend on it to function correctly).

The shutdown manager pattern is more complex, but it's much closer to a
complete solution. It's probably too complex to go into the standard
library though.

- Robert

Gustavo Niemeyer

unread,
May 27, 2014, 11:37:35 AM5/27/14
to Robert Tweed, golang-nuts
On Tue, May 27, 2014 at 10:44 AM, Robert Tweed
<fistful.o...@gmail.com> wrote:
> JOOI, how do you find Go compares with Erlang for this kind of project?
> I don't know Erlang so I'm curious to get a sense of how it compares with Go.

One of the most interesting realizations from using Erlang for a real
project was that most of the properties that are generally associated
with the language in terms of reliability actually come from
conventions that are encouraged via the standard library and community
consensus, rather than language features.

This might be a good introduction to the topic:

http://www.erlang.org/doc/design_principles/des_princ.html

These libraries make use of Erlang's process ("goroutine") monitoring
features to track termination via channels, but the core concepts work
just as well with standard Go language features.

> This kind of application is pretty much the canonical use-case for either of
> the two languages so it's interesting that you've got a version written in
> both, one of which is in production. I'm thinking in terms of things like
> how it performs under real-world conditions; did it take a lot of tweaking &
> system-level configuration to get it from first-working to
> actually-works-in-production;

It's been too many years for me to remember details about putting it
in production, but I don't recall anything traumatic at least. The
fact I've had almost zero maintenance work all those years is
definitely a major benefit which I'm hoping the Go version can keep up
with.

> is there a big difference in the size &
> complexity of the code; etc. Mainly just your subjective observations on the
> bigger pros and cons of each language in the context of this specific
> application.

In this specific context the major drawback I've observed with Erlang
is that multiple developers that wanted to extend the bot with some
functionality ended up abandoning it before accomplishing anything,
because going over the bridge of learning about the different
programming paradigm is too high a cost for what is supposed to be a
quick hack. In that regard Go is much simpler and uses more well known
programming strategies, so I'm sure it'll do much better there.


gustavo @ http://niemeyer.net

Ian Lance Taylor

unread,
May 27, 2014, 12:21:44 PM5/27/14
to Robert Tweed, golang-nuts
On Tue, May 27, 2014 at 7:38 AM, Robert Tweed
<fistful.o...@gmail.com> wrote:
>
> "I am trying to write a server object (in fact I plan to repeat this pattern
> for lots of services in the application) that does its own lazy resource
> initialisation on demand. It should therefore also clean up after itself,
> either when destroyed explicitly or when the program terminates. ...
> Otherwise, it looks like the caller is going to have to know up-front which
> objects need to have deferred cleanup operations and call them itself from
> within the main loop, which I want to avoid."
>
> My intent was specifically to ensure that an object that internally (and
> lazily) allocates a resource can hold on to it for as long as it needs to
> and release it either in response to some runtime condition, or if all else
> fails, when the program ends. The latter is the thing that needs some kind
> of guarantee that it will actually run.

A resource that is internal to the program--memory, say, or a file
descriptor--will go away when the program exits. So you can only be
talking about a resource that is external to the program. Any program
can crash, or the machine itself can be shut down suddenly, so any
program that uses an external resource must consider the possibility
that the resource was not cleaned up properly the last time the
program was run. So what we are talking about here must be a matter
of cleanliness rather than correctness.

One thing we've learned from large C++ programs is that shutting them
down often means running a large number of global destructors and
atexit functions. The effect is that shutting down a large program
can take surprising amounts of time--all you want to do is quit, and
the program just sits there trying to quit. This situation is so bad,
and so general, that C++11 introduced a new way to exit:
std::quick_exit. And, of course, much like your concern, people worry
about clean exits when using std::quick_exit. So C++11 also has
std::at_quick_exit. I think that if you step back from the history of
how we got here and just look at where we are, we are in an absurd
place.

So a lesson for Go is to not go to that absurd place. Therefore, Go
does not have global destructors and does not have atexit. It has no
enforced finalization at all. An unfortunate consequence of that is
that some libraries find it harder to clean up nicely when the program
is shut down. That is not ideal. But, while not ideal, it's better
than the alternative.

Ian

Robert Tweed

unread,
May 27, 2014, 12:38:39 PM5/27/14
to golang-nuts
On 27/05/2014 16:36, Gustavo Niemeyer wrote:
> One of the most interesting realizations from using Erlang for a real
> project was that most of the properties that are generally associated
> with the language in terms of reliability actually come from
> conventions that are encouraged via the standard library and community
> consensus, rather than language features.
>
> This might be a good introduction to the topic:
>
> http://www.erlang.org/doc/design_principles/des_princ.html
>
> These libraries make use of Erlang's process ("goroutine") monitoring
> features to track termination via channels, but the core concepts work
> just as well with standard Go language features.

That's interesting, but the linked page doesn't explain something, which
is that I always thought the thing that differentiates the supervision
tree model in Erlang is that workers are isolated, so they can be
killed, replaced or reloaded as needed.

AFAIK, you can do the same thing with Go, but you'd need to write your
supervisors and workers as completely separate applications, since
there's no way to have any kind of process isolation within a single Go
application. I.e., you can't launch a goroutine with its own isolated
memory address space and pre-emptively kill it if it stops responding.
You can do that at the OS level, which is why people quite often use
something like nginx as a front end for a bunch of server processes
(written in Go or Node or whatever) and restart any that start acting up.

AFAIK, Erlang gives you this process isolation for free as part of the
language. Is that incorrect?

> In this specific context the major drawback I've observed with Erlang
> is that multiple developers that wanted to extend the bot with some
> functionality ended up abandoning it before accomplishing anything,
> because going over the bridge of learning about the different
> programming paradigm is too high a cost for what is a supposed to be a
> quick hack. In that regard Go is much simpler and uses more well known
> programming strategies, so I'm sure it'll do much better there.

Yes, this is always an issue with any language other than the top 4 or
5. I suppose regardless of popularity, Go is at least easy to learn and
hack on quickly. The only potential issue with that is that Go has a lot
of subtle edge cases and things that are not intuitive to anyone that
isn't completely familiar with its inner workings. This particular
thread is a just one example of something where the way it works is
simple in terms of the machine specification, but not necessarily
intuitive in terms of what a programmer would expect. Go mostly does a
good job of hiding complexity, but it doesn't entirely get rid of it.
IMO this can lead to a risk of subtle bugs due to new programmers having
a false sense of security.

It takes about 20 minutes to learn Go, but about 6 months or more to
really learn Go.

In many ways this is the same Achilles heel that JavaScript suffers
from, as Douglas Crockford called it, "the only language people feel
like they don't need to learn to use". It's not necessarily a good thing.

- Robert

Robert Tweed

unread,
May 27, 2014, 1:07:23 PM5/27/14
to golang-nuts
On 27/05/2014 17:21, Ian Lance Taylor wrote:
> A resource that is internal to the program--memory, say, or a file
> descriptor--will go away when the program exits.

I will once again restate the canonical case where this is important,
which is where you have an in-memory buffer that needs to be flushed. Of
course, if the system crashes then all bets are off and you're going to
lose some data, but we're talking about normal termination here.

In some very small number of cases you are going to need to design a
system that is fault tolerant even in the case of hardware failures, but
that's a much, much bigger engineering problem. It doesn't seem
unreasonable to expect a system to be correct in normal operation, even
if it isn't fault tolerant.

> One thing we've learned from large C++ programs is that shutting them
> down often means running a large number of global destructors and
> atexit functions. ...
> So a lesson for Go is to not go to that absurd place.

This is a reasonable argument. However, that's basically an application
design issue. If your application is registering a ton of long-running
shutdown processes then of course it's going to take ages to shut down.
The solution is Don't Do That. The C++ solution of layering hacks upon
hacks is of course a silly solution to the wrong problem, which is that
people have been writing bad shutdown code and instead of properly
testing and fixing that code, want to have ways to force terminate
things at the application level if they are running for too long. If a
program isn't responding, the OS can kill it, and that might be a valid
response to certain kinds of problem. In that sense, an excessively slow
destructor is really no different than an infinite loop. If it could
safely finish any faster and doesn't, it's a bug. If it can't safely
finish any faster and you force kill it, that's a bug!

I wouldn't be surprised if the issues with large C++ applications are
more to do with concurrency issues like deadlocks occurring during
shutdown, which in theory shouldn't happen with properly written Go code.

- Robert

Gustavo Niemeyer

unread,
May 27, 2014, 1:21:36 PM5/27/14
to Robert Tweed, golang-nuts
On Tue, May 27, 2014 at 6:38 PM, Robert Tweed
<fistful.o...@gmail.com> wrote:
> AFAIK, you can do the same thing with Go, but you'd need to write your
> supervisors and workers as completely separate applications, since there's
> no way to have any kind of process isolation within a single Go application.
> I.e., you can't launch a goroutine with its own isolated memory address
> space and pre-emptively kill it if it stops responding.

It's true that you cannot forcefully kill a goroutine, but
well-designed goroutines are very commonly isolated and communicated
with via channels, and the killing can be included in the design via
simple conventions, such as the one offered by the tomb package.

> It takes about 20 minutes to learn Go, but about 6 months or more to really
> learn Go.

"It takes about 20 minutes to learn X, but about 6 months or more to really
learn X."

Fixed it for you. :-)


gustavo @ http://niemeyer.net

roger peppe

unread,
May 27, 2014, 1:31:11 PM5/27/14
to Robert Tweed, golang-nuts
I see that concern. That's easily addressed with something like this:

http://play.golang.org/p/ov-9pl8NtI

You could do:

func foo() {
    defer atexit.Do(func() {
        myCleanup()
    }).Exec()
}

Usually though, it's better to arrange things so that you have no
global cleanups like this - you can make types responsible for
their own cleanup.
This blog post might give you some ideas:
http://rogpeppe.wordpress.com/2014/03/15/cancellation-in-go-the-juju-way/

yy

unread,
May 27, 2014, 1:50:35 PM5/27/14
to Robert Tweed, golang-nuts
On 27 May 2014 17:14, Robert Tweed <fistful.o...@gmail.com> wrote:
> Almost, but I'm not quite ready to back this as a perfect solution or ready
> to go into stdlib. First off, with the original function, there's a quit
> channel (sigClose) so if another method sends on that channel, the deferred
> function will run and there won't be any references to it held in RAM
> anymore.

Of course this solution is not perfect, nor worth considering for
inclusion in the standard library. And you know better than
anybody else if it can help you with your problem. However, the issue
you point out can be solved in several ways, for example:

http://play.golang.org/p/UOxTftOKAd

> And even with that you still have the responsibility on someone (main,
> really) to call Exit - if you just let main return then none of the
> callbacks will ever be run, so it might be dangerous as a hidden dependency
> (i.e., if some object uses it internally but the caller doesn't realise they
> depend on it to function correctly).

I don't think adding a call at the end of main for cleanup is asking
for too much, but it is true that there is no way to enforce it. In my
last example, you also have to take care of using "defer
global.Defer(func(){ ... })()" in your goroutines, which is not the
prettiest construction ever.

> The shutdown manager pattern is more complex, but it's much closer to a
> complete solution. It's probably too complex to go into the standard library
> though.

Whatever works for you, but please let us know once you get it fixed.
The problem already piqued my curiosity.

Ian Lance Taylor

unread,
May 27, 2014, 1:56:40 PM5/27/14
to Robert Tweed, golang-nuts
On Tue, May 27, 2014 at 10:06 AM, Robert Tweed
<fistful.o...@gmail.com> wrote:
> On 27/05/2014 17:21, Ian Lance Taylor wrote:
>>
>> On Tue, May 27, 2014 at 7:38 AM, Robert Tweed
>> <fistful.o...@gmail.com> wrote:
>>>
>>> A resource that is internal to the program--memory, say, or a file
>>> descriptor--will go away when the program exits.
>
>
> I will once again restate the canonical case where this is important, which
> is where you have an in-memory buffer that needs to be flushed. Of course,
> if the system crashes then all bets are off and you're going to lose some
> data, but we're talking about normal termination here.

Sorry, I missed that use case in the thread. But I don't understand
it. It sounds like you have an API that must be closed down, but you
don't want the users of the API to have to close it down. I don't
find that compelling. Any flush operation to a destination that is
not in-program memory, or the write cache I now see that you mentioned
earlier, can fail. So you seem to have an API that can fail without
ever reporting the failures back to the API's caller. It does
presumably report failures to the user of the program, but that's not
quite the same thing.

And, of course, there are some natural though incomplete workarounds,
like flushing after a timeout, and in a finalizer. Those would only
be appropriate for cases where some data loss is OK--but this API in
general seems only suitable for such a case.
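
For instance, something like this (a rough sketch; buf and its Flush method are placeholders):

// flush periodically, accepting that the last few seconds of data
// may be lost on an unclean exit
go func() {
    ticker := time.NewTicker(5 * time.Second)
    defer ticker.Stop()
    for {
        <-ticker.C
        if err := buf.Flush(); err != nil {
            log.Println("flush:", err)
        }
    }
}()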

In short, it's a valid use case, but I don't find it to be a
compelling one. In particular I don't find it to be a compelling
argument for adding a feature to Go that was explicitly considered and
rejected.


>>> One thing we've learned from large C++ programs is that shutting them
>>> down often means running a large number of global destructors and
>>> atexit functions. ...
>>>
>>> So a lesson for Go is to not go to that absurd place.
>
>
> This is a reasonable argument. However, that's basically an application
> design issue.

Yes. And that is also true of what you are arguing for. I am
suggesting: don't design your application that way.

Ian

Andrew Gerrand

unread,
May 27, 2014, 5:24:15 PM5/27/14
to Robert Tweed, golang-nuts


On Wednesday, 28 May 2014, Robert Tweed <fistful.o...@gmail.com> wrote:
> On 27/05/2014 17:21, Ian Lance Taylor wrote:
>> A resource that is internal to the program--memory, say, or a file
>> descriptor--will go away when the program exits.
>
> I will once again restate the canonical case where this is important, which is where you have an in-memory buffer that needs to be flushed. Of course, if the system crashes then all bets are off and you're going to lose some data, but we're talking about normal termination here.

"All bets are off" in system crashes only if you are betting that the system won't crash. But they do crash all the time. 

> In some very small number of cases you are going to need to design a system that is fault tolerant even in the case of hardware failures, but that's a much, much bigger engineering problem. It doesn't seem unreasonable to expect a system to be correct in normal operation, even if it isn't fault tolerant.

It's not too hard. You just need to design your software assuming that it will crash at any time.

Forget clean shutdown. Instead you want to write a rock solid initialisation procedure that can recover from any kind of failure mode. 

This has a lot of benefits, chiefly that your initialisation and recovery code is exercised every time you run your program, so there's a much higher chance it is correct. Second is that it's much easier to test/simulate program recovery than program failure. 
 
This talk from GopherCon discusses this in the context of embedded software, but I think the advice is generally applicable.


Andrew




Gustavo Niemeyer

unread,
May 27, 2014, 5:48:28 PM5/27/14
to Andrew Gerrand, Robert Tweed, golang-nuts
On Tue, May 27, 2014 at 11:24 PM, 'Andrew Gerrand' via golang-nuts
<golan...@googlegroups.com> wrote:
> It's not too hard. You just need to design your software assuming that it
> will crash at any time.

+1

> Forget clean shutdown. Instead you want to write a rock solid initialisation
> procedure that can recover from any kind of failure mode.

I'm totally on board with the rock solid initialization advice, but not
so much with the "forget clean shutdown" one. Quite often there are
benefits to complementing the rock solid initialization with a polite
shutdown cleanup. This might be a last flush of the journal, or a
timely release of a leader lease, or a graceful disconnection notice
to the server, etc. These are not about correctness, though, but about
using the chance of a polite shutdown notice to improve the system
behavior.

Besides that, there's also another important reason for having clean
shutdown procedures: testing. It's a nightmare to test code that
spawns arbitrary concurrent logic and offers no means for a timely
cancellation of the side effects. Speaking as a consumer of
third-party packages, please do offer a proper Stop/Close method in
your packages, so we can write tests including their logic.

Please note that I'm not defending any changes to the language with
that. This is just about what we do with it.


gustavo @ http://niemeyer.net

Andrew Gerrand

unread,
May 27, 2014, 6:15:04 PM5/27/14
to Gustavo Niemeyer, Robert Tweed, golang-nuts

On 28 May 2014 07:47, Gustavo Niemeyer <gus...@niemeyer.net> wrote:
> I'm totally on board with the rock solid initialization advice, but not
> so much with the "forget clean shutdown" one. Quite often there are
> benefits to complementing the rock solid initialization with a polite
> shutdown cleanup. This might be a last flush of the journal, or a
> timely release of a leader lease, or a graceful disconnection notice
> to the server, etc. These are not about correctness, though, but about
> using the chance of a polite shutdown notice to improve the system
> behavior.

You're right, of course. In my enthusiasm I overstated my point. :-)

As a part of a system it is good to be polite, but it is important not to assume that other processes are polite.

Andrew

Jason E. Aten

unread,
May 27, 2014, 7:52:31 PM5/27/14
to Gustavo Niemeyer, golang-nuts

> On May 27, 2014, at 2:47 PM, Gustavo Niemeyer <gus...@niemeyer.net> wrote:
> Besides that, there's also another important reason for having clean
> shutdown procedures: testing. It's a nightmare to test code that
> spawns arbitrary concurrent logic and offers no means for a timely
> cancellation of the side effects.

+1

Exactly. Couldn't have said it better. I do BDD, and if my shutdown isn't squeaky clean then one (even passing) test can really mess up subsequent tests.

Robert Tweed

unread,
May 28, 2014, 1:40:26 AM5/28/14
to golang-nuts
On 27/05/2014 18:56, Ian Lance Taylor wrote:
> It sounds like you have an API that must be closed down, but you don't
> want the users of the API to have to close it down. I don't find that
> compelling. Any flush operation to a destination that is not
> in-program memory, or the write cache I now see that you mentioned
> earlier, can fail. So you seem to have an API that can fail without
> ever reporting the failures back to the API's caller.
Yes, the caller should not know how it works because it probably won't
work like that. I will again use the word "encapsulation" because it
really is more of a general design principle than a one-off specific
problem to be solved.

The more general idea behind the design is a composable architecture
made out of services, where each service may or may not actually be a
service residing in another process or on another machine. A service
wouldn't necessarily report failures to the caller - it would report
general failures to another service that may or may not be the original
caller. Exactly how it's composed depends on the configuration, which
could be different per environment.

So the point is that the service is a "black box". In the event that one
of these black boxes decides (based on the runtime configuration) that
it needs a write-back cache, it should be able to create one and use it
with some guarantee of safety without the caller having to have been
designed with that eventuality in mind ahead of time. I plan to include
asserted safety guarantees elsewhere such that if a caller tries to do
this in a situation where it would be unsafe and against a
fault-tolerance policy, it will cause a panic, but otherwise it will be
assumed that fault tolerance isn't important.

However, I understand your concern about making unnecessary changes to
the language, which is why I have not used the words "I propose a change
to the language". I have merely expressed my initial surprise that defer
doesn't do what I thought it would do in this instance, and would like
to find the most elegant solution to the problem - preferably one that
doesn't require injecting dependencies unless they are needed.

So before anyone says "but aha! if you do this, you need to inject a
dependency to handle errors!" the answer is no I don't - I can
potentially create and use the same service in different ways that may
or may not require a delegated error handler, or the caller can even
decide that it doesn't care about errors (although the wider system may
still care about logging them), but that's up to the caller. However the
service still needs to be able to maintain its own general
encapsulation such that it doesn't expose internal wiring unless the
caller is explicitly requesting something where such wiring is
appropriate. There's also an inherent difference between making
synchronous and asynchronous calls to a given service, which affects how
much responsibility a caller can potentially offload.

I also realise that this kind of black-box architecture isn't really
"the Go way", but while the typical Go pattern of explicit caller
responsibility and transparency works great for low-level system stuff
involving only concrete types, it's not appropriate for the kind of
framework I am trying to build. OTOH, it turns out that the language
features of Go are the best fit for this type of architecture compared
to other languages I've considered, with the possible exception of
Erlang, but as per the discussion elsethread, Erlang is somewhat
over-complicated and, as I intend to open source the framework
eventually, it would be nice if other people actually used it!

- Robert

Robert Tweed

unread,
May 28, 2014, 1:49:37 AM5/28/14
to golang-nuts
On 27/05/2014 18:20, Gustavo Niemeyer wrote:
>> AFAIK, you can do the same thing with Go, but you'd need to write your
>> supervisors and workers as completely separate applications, since there's
>> no way to have any kind of process isolation within a single Go application.
>> I.e., you can't launch a goroutine with its own isolated memory address
>> space and pre-emptively kill it if it stops responding.
> It's true that you cannot forcefully kill a goroutine, but
> well-designed goroutines are very commonly isolated and communicated
> with via channels, and the killing can be included in the design via
> simple conventions, such as the one offered by the tomb package.
>
That's true, but one of the key principles of Erlang, if I understand it
correctly, is that it can kill and replace processes even if they
contain bugs. Otherwise you are reliant on "write code without any bugs"
to ensure that the system remains continuously operational, and that's
not a guarantee anyone can give. Following some conventions will help,
but all it takes is one mistake and suddenly the whole architecture
is compromised.

- Robert

Gustavo Niemeyer

unread,
May 28, 2014, 5:30:37 AM5/28/14
to Robert Tweed, golang-nuts
On Wed, May 28, 2014 at 7:49 AM, Robert Tweed
<fistful.o...@gmail.com> wrote:
> That's true, but one of the key principles of Erlang, if I understand it
> correctly, is that it can kill and replace processes even if they contain
> bugs. Otherwise you are reliant on "write code without any bugs" to ensure
> that the system remains continuously operational, and that's not a guarantee
> anyone can give. Following some conventions will help, but all it takes
> is one mistake and suddenly the whole architecture is compromised.

Again, Erlang's reliability is largely based on conventions as well,
not on language features, and the same style of conventions is
available to Go code. More fundamentally, though, the idea that you
might get away with writing arbitrarily buggy Erlang software and
remain continuously operational is absurd.


gustavo @ http://niemeyer.net

Geoffrey Teale

unread,
May 28, 2014, 6:06:22 AM5/28/14
to Gustavo Niemeyer, Robert Tweed, golang-nuts
What Gustavo says is true.  Erlang's OTP libraries provide out-of-the-box support for building monitored processes that restart failed nodes and allow for things like live upgrades.  That's the real value of Erlang.  It's less about the language than the libraries - you definitely could build a framework like that in Go, but nothing that mature exists for Go yet.

The convention that these libraries encourage is "fail often and fail early".  That's fine if you're watching the logs and actively managing your system, and it certainly suits the situation that Erlang was built for, but it's not a panacea, and everyone should be wary of calling any solution the "one true way".

My own experience of Erlang is that OTP is great, but the general practice of programming Erlang was a steep learning curve, fraught with problems, even for someone with a functional programming background.  Debugging was also a bit of a nightmare.  I suspect that had I had the opportunity to do more than a handful of projects in Erlang this would have improved.  Go, on the other hand, feels like a bigger immediate win, with fewer hurdles to jump over.

-- 
Geoff Teale


Jason E. Aten

unread,
May 28, 2014, 6:30:26 AM5/28/14
to Geoffrey Teale, Gustavo Niemeyer, Robert Tweed, golang-nuts
I do wonder how one might achieve hot code swapping with no downtime or lost TCP connections (à la Erlang) in Go. Obviously the entire binary would have to be replaced. How could socket connections not get dropped?

Geoffrey Teale

unread,
May 28, 2014, 7:16:00 AM5/28/14
to Jason E. Aten, Gustavo Niemeyer, Robert Tweed, golang-nuts
Hot code swapping is fine if you're using distinct processes and communicating between them - doing it with goroutines in a single process would, admittedly, be hard - though if it's possible to patch a running Linux kernel without a reboot, I'm pretty sure it's possible to live-patch a Go binary.  Whether it's a worthwhile or sensible thing to do is not clear to me at all.

Socket connections remaining up would require an intermediary to handle the network layer communication. 
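
For the listening socket itself there is a well-known trick, sketched
below (the flag name and binary path are invented): dup the listener's
file descriptor into a freshly exec'd binary before the old process
exits. It does nothing for connections that are already established,
though - those still have to be drained, or fronted by an intermediary.

package main

import (
    "net"
    "os"
    "os/exec"
)

func main() {
    ln, err := net.Listen("tcp", ":8080")
    if err != nil {
        panic(err)
    }

    // ... serve for a while, then decide to hand over to a new binary ...

    f, err := ln.(*net.TCPListener).File() // duplicate of the listening socket
    if err != nil {
        panic(err)
    }
    cmd := exec.Command("./server.new", "-inherit-listener")
    cmd.ExtraFiles = []*os.File{f} // the child sees this as fd 3
    if err := cmd.Start(); err != nil {
        panic(err)
    }

    // The new binary rebuilds its listener with something like:
    //   ln, err := net.FileListener(os.NewFile(3, "listener"))
    // Connections accepted by the old process are not transferred.
}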

-- 
Geoff Teale

Gustavo Niemeyer

unread,
May 28, 2014, 8:20:36 AM5/28/14
to Jason E. Aten, Geoffrey Teale, Robert Tweed, golang-nuts
Hot swapping, in Erlang or otherwise, feels like a lot of work for
little benefit. Reliable applications should live in more than one box
anyway, and remain available even if one of them goes away. I'd rather
focus on doing that well, than on never restarting any one box.


gustavo @ http://niemeyer.net

Robert Johnstone

unread,
May 28, 2014, 9:00:32 AM5/28/14
to golan...@googlegroups.com, Robert Tweed
I'm doubtful of your first point.  My Erlang experience is limited, but the ability to monitor other processes, and get signals when they fail, is not a secondary design decision.  The libraries built on top of that are also important, but I wouldn't overlook the importance of process identity and process monitoring in Erlang's philosophy towards building fault-tolerance systems.

(As an aside, it is funny reading this discussion, where there is an assumption that Go's goroutines and Erlang's processes are very similar.  I was following a much older thread where there was considerable hostility to any comparison between the two approaches)

Ian Lance Taylor

unread,
May 28, 2014, 9:50:03 AM5/28/14
to Robert Tweed, golang-nuts
On Tue, May 27, 2014 at 10:40 PM, Robert Tweed
<fistful.o...@gmail.com> wrote:
>
> However, I understand your concern about making unnecessary changes to the
> language, which is why I have not used the words "I propose a change to the
> language".

Understood, that's not really on the table. I thought that at this
point you were arguing for os.AtExit, and that's what I've been
arguing against.

> I also realise that this kind of black-box architecture isn't really "the Go
> way", but while the typical Go pattern of explicit caller responsibility and
> transparency works great for low-level system stuff involving only concrete
> types, it's not appropriate for the kind of framework I am trying to build.
> OTOH, it turns out that the language features of Go are the best fit for
> this type of architecture compared to other languages I've considered, with
> the possible exception of Erlang, but as per the discussion elsethread,
> Erlang is somewhat over-complicated and, as I intend to open source the
> framework eventually, it would be nice if other people actually used it!

I think that I would add Exit or Shutdown to your framework. I
understand that that makes your framework less elegant and less
standalone.

Ian

John Waycott

unread,
May 28, 2014, 10:17:30 AM5/28/14
to golan...@googlegroups.com, Robert Tweed
Just to add to what Ian said, here are a few practical examples:
1. To do hardware maintenance, an admin needs to shut down the service and restart it on another server.
2. A failure is causing the black box service to misbehave, so it needs to be shut down to prevent any more problems (maybe a network issue is causing the service to spew garbage, for example). The black box is oblivious to the problem, so it will not know it needs to shut down.
3. The service should run at scheduled intervals, not under the control of the service itself.

The black box reports problems to another entity, so presumably someone or some service looks at those events and makes a decision to let the service continue to run. If the service cannot be controlled, there would really be no point in it reporting failures; no one could shut it down or take action to fix the problem.

Gustavo Niemeyer

unread,
May 28, 2014, 11:46:00 AM5/28/14
to Robert Johnstone, golan...@googlegroups.com, Robert Tweed
On Wed, May 28, 2014 at 3:00 PM, Robert Johnstone
<r.w.jo...@gmail.com> wrote:
> I'm doubtful of your first point. My Erlang experience is limited, but the
> ability to monitor other processes, and get signals when they fail, is not a
> secondary design decision. The libraries built on top of that are also
> important, but I wouldn't overlook the importance of process identity and
> process monitoring in Erlang's philosophy towards building fault-tolerance
> systems.

Erlang does not force you to use these features in any particular way.
It is the libraries and conventions embraced by the community which
turn them into reliability assets, and some of these ideas can be put
to good use in Go software. This is not a hypothesis - there's
production Go software out there today.

> (As an aside, it is funny reading this discussion, where there is an
> assumption that Go's goroutines and Erlang's processes are very similar. I
> was following a much older thread where there was considerable hostility
> to any comparison between the two approaches)

Let's avoid feeding the trolls: nobody said they are "very similar" in
this thread. They're both microthreading models - there are
similarities, and there are differences.


gustavo @ http://niemeyer.net

Robert Johnstone

unread,
May 28, 2014, 12:14:28 PM5/28/14
to golan...@googlegroups.com, Robert Johnstone, Robert Tweed


On Wednesday, 28 May 2014 11:46:00 UTC-4, Gustavo Niemeyer wrote:
On Wed, May 28, 2014 at 3:00 PM, Robert Johnstone
<r.w.jo...@gmail.com> wrote:
> I'm doubtful of your first point.  My Erlang experience is limited, but the
> ability to monitor other processes, and get signals when they fail, is not a
> secondary design decision.  The libraries built on top of that are also
> important, but I wouldn't overlook the importance of process identity and
> process monitoring in Erlang's philosophy towards building fault-tolerance
> systems.

Erlang does not force you to use these features in any particular way.
It is the libraries and conventions embraced by the community which
turn them into reliability assets, and some of these ideas can be put
to good use in Go software. This is not a hypothesis - there's
production Go software out there today.

What point are you making?  We don't disagree that Erlang's libraries are important.  We don't disagree that there is Go software in production today.  We don't disagree that there are good ideas in Erlang's library that might be useful in Go.  Your response seems to be tangential.

I remain doubtful that the importance of process identity and monitoring can be neglected in a comparison.  As an example, the use of supervision trees, which are built on this functionality, is pretty central to how Erlang structures services.
 
> (As an aside, it is funny reading this discussion, where there is an
> assumption that Go's goroutines and Erlang's processes are very similar.  I
> was following a much older thread where there was considerable hostility
> to any comparison between the two approaches)

Let's avoid feeding the trolls: nobody said they are "very similar" in
this thread. They're both microthreading models - there are
similarities, and there are differences.

I agree that nobody said they are "very similar"; that is why I wrote "there is an assumption".

Gustavo Niemeyer

unread,
May 28, 2014, 12:56:45 PM5/28/14
to Robert Johnstone, golan...@googlegroups.com, Robert Tweed
On Wed, May 28, 2014 at 6:14 PM, Robert Johnstone
<r.w.jo...@gmail.com> wrote:
> What point are you making? We don't disagree that Erlang's libraries are
> important. We don't disagree that there is Go software in production today.
> We don't disagree that there are good ideas in Erlang's library that might
> be useful in Go. Your response seems to be tangential.

If you agree with all of that, there isn't much to be doubtful about.

> I remain doubtful that the importance of process identity and monitoring can
> be neglected in a comparison. As an example, the use of supervision trees,
> which are built on this functionality, is pretty central to how Erlang
> structures services.

It's trivial to monitor a goroutine in Go, and it's also trivial to
have goroutines that carry identifiers, although the latter is not
even required for implementing the ideas described.
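
For example, a tiny supervisor in that spirit is only a few lines (a
sketch only, not production code): run the worker, turn panics into
errors, and restart it until asked to stop.

package supervisor

import (
    "fmt"
    "log"
    "time"
)

// Supervise runs work in a loop, recovering from panics and restarting
// it after a short backoff, until stop is closed.
func Supervise(work func() error, stop <-chan struct{}) {
    for {
        log.Println("worker exited:", runSafely(work))
        select {
        case <-stop:
            return
        case <-time.After(time.Second): // crude restart backoff
        }
    }
}

// runSafely converts a panic in the worker into an ordinary error.
func runSafely(work func() error) (err error) {
    defer func() {
        if r := recover(); r != nil {
            err = fmt.Errorf("worker panicked: %v", r)
        }
    }()
    return work()
}

A caller just does "go supervisor.Supervise(myWorker, stop)" and the
worker is monitored and restarted, whether or not it carries any
identity.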

> I agree that nobody said they are "very similar", that is why I wrote "there
> is an assumption".

I haven't observed, assumed, or said so. You're just feeding the
trolls which you accuse of being hostile.


gustavo @ http://niemeyer.net

Robert Tweed

unread,
May 28, 2014, 4:57:14 PM5/28/14
to golan...@googlegroups.com
On 28/05/2014 15:17, John Waycott wrote:
>
> The black box reports problems to another entity, so presumably
> someone or some service looks at those events and makes a decision to
> let the service continue to run. If the service cannot be controlled,
> there would really be no point in it reporting failures; no one could
> shut it down or take action to fix the problem.
To clarify, just because something is a black box at the point of use
doesn't mean it's a black box everywhere. There still needs to be
something to decide which concrete type should fulfill a particular
interface in a particular situation, and that will take responsibility
for instantiating and hooking up the correct control and output
connections as needed. However the problem here is that no matter how
these responsibilities are delegated out, there's always a
responsibility on main() to do the shutdown, for all running processes,
if the program exits normally.

- Robert

Robert Tweed

unread,
May 28, 2014, 5:02:31 PM5/28/14
to golang-nuts
On 28/05/2014 14:49, Ian Lance Taylor wrote:
> However, I understand your concern about making unnecessary changes to the
> language, which is why I have not used the words "I propose a change to the
> language".
> Understood, that's not really on the table. I thought that at this
> point you were arguing for os.AtExit, and that's what I've been
> arguing against.
No, in fact I argued against something along those lines going into
stdlib where it was suggested elsethread.

It would probably be useful if deferred functions in goroutines could be
guaranteed to run at exit without adding framework code, but I accept
the argument for why this is currently not the case and supporting it
would add some language complexity: either a global flag (which isn't a
good idea because it's spooky action at a distance and would wreak havoc
with portability) or another keyword, which is probably too much extra
complexity for what is a single use-case.

> I think that I would add Exit or Shutdown to your framework. I
> understand that that makes your framework less elegant and less
> standalone. Ian
Yes, it looks like there's going to have to be some kind of global
synchronisation manager that main will shut down before returning. I
just need to figure out the best way to get the extra dependency into
objects that need it without compromising portability in cases where it
isn't needed. As it's going to be part of a framework there may be lots
of objects that will never need this, but since I'll want standard ways
to instantiate things, those will have to assume it is necessary, for
forwards compatibility if nothing else. It might be doable with
automatic dependency injection.
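
What I have in mind is roughly this (a rough sketch; all names are
invented): a registry that services add cleanups to only when they
actually acquire something, and that main flushes exactly once before
returning.

package lifecycle

import "sync"

// Registry collects cleanup functions from services that lazily
// acquire resources, so main can run them all once before exiting.
type Registry struct {
    mu       sync.Mutex
    cleanups []func()
}

// OnShutdown registers a cleanup to be run by Shutdown.
func (r *Registry) OnShutdown(f func()) {
    r.mu.Lock()
    defer r.mu.Unlock()
    r.cleanups = append(r.cleanups, f)
}

// Shutdown runs the registered cleanups in reverse order, once.
func (r *Registry) Shutdown() {
    r.mu.Lock()
    cleanups := r.cleanups
    r.cleanups = nil
    r.mu.Unlock()
    for i := len(cleanups) - 1; i >= 0; i-- {
        cleanups[i]()
    }
}

main would create one of these and "defer reg.Shutdown()", and a
service that lazily opens a resource would call reg.OnShutdown at that
point instead of relying on a defer inside a blocked goroutine.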

- Robert