Best practice for sets in go?

267 views
Skip to first unread message

bwall...@gmail.com

unread,
Jul 26, 2016, 7:24:24 AM7/26/16
to golang-nuts
Heyo,

I'm wondering whether there's any official preference between the following two approaches for implementing a set (of, e.g., strings) in go:

    greetings := map[string]bool {
        "hi":      true,
        "hello": true,
    }

    greetings := map[string]struct{} {
        "hi":      {},
        "hello": {},
    }

Effective Go suggests the former:

"A set can be implemented as a map with value type bool. Set the map entry to true to put the value in the set, and then test it by simple indexing."

But the latter appears to be more memory efficient, and I've seen both in practice.

Thanks!
Brendan

Dave Cheney

unread,
Jul 26, 2016, 7:35:01 AM7/26/16
to golang-nuts
If saving one byte per entry is important for your application, the latter is your choice. The former has nicer properties that you can use the single value form of map access,

if m[key] {
// found
}

I'd say the former is the right choice for 97.25% of general use cases.

Sam Whited

unread,
Jul 26, 2016, 10:07:08 AM7/26/16
to Dave Cheney, golang-nuts
I'd agree with this, except that I don't trust that one guy in the
office not to sneak past something that somehow puts a false in the
map and causes bugs later. Arguably that's a review problem, but even
so one or two extra bytes of code (`_, ok = m[key]; ok') seems like a
small price to pay for safety guarantees and to save a few bytes of
memory.

(though as you said, it hardly matters either way; I just prefer to
recommend the struct{} version to new people)

—Sam


--
Sam Whited
pub 4096R/54083AE104EA7AD3

Uli Kunitz

unread,
Jul 26, 2016, 10:32:18 AM7/26/16
to golang-nuts
I concur. Readability of the actions on a bool map is far better than for the struct{} map. The latter would require a custom type to improve readability.

Tyler Compton

unread,
Jul 26, 2016, 12:27:29 PM7/26/16
to golang-nuts, da...@cheney.net
In my opinion, struct{} should always be used for placeholders. I also use it when sending to a channel where the data itself doesn't matter. struct{} looks gross but it's very expressive in this case and I think most all Go programmers are familiar with its meaning.

map[string]bool suggests that the bool value has some kind of significance. Sure, the documentation you put in probably says otherwise, but the bool type is still unnecessary indirection.

This issue is made worse if one uses the if m[key] { ... } idiom for the reason that Sam Whited brought up. Plus, then the bool has a meaning, but it's only for the purpose of slightly shortening code instead of using a well-known Go pattern. Why bother making the meaning of the map more complicated?

if _, ok = m[key]; ok {
}

Is something Go programmers are used to.

if m[ok] {
}

Requires that the programmer take a look at the map and its documentation to see what's going on unless they're already used to that pattern in the code base.

Jakob Borg

unread,
Jul 26, 2016, 1:33:39 PM7/26/16
to Tyler Compton, golang-nuts, Dave Cheney
2016-07-26 18:27 GMT+02:00 Tyler Compton <xav...@gmail.com>:
> if _, ok = m[key]; ok {
> }
>
> Is something Go programmers are used to.
>
> if m[ok] {
> }
>
> Requires that the programmer take a look at the map and its documentation to
> see what's going on unless they're already used to that pattern in the code
> base.

If this map is part of some sort of external API I'm inclined to
agree. (Except that a map[something]struct{} is a crappy external API
sort of type and should probably be a StringSet or whatever with
relevant methods on it.)

However, in my code these sort of maps usually pop up in deduplication
loops and similar:

seen := make(map[string]bool)
for _, v := range whatever {
seen[v] = true
}
// ...
for _, v := range somethingelse {
if seen[v] {
// ...
}
}

and there I think the bool value makes the code read much nicer
without any risk for confusion.

//jb

Edward Muller

unread,
Jul 26, 2016, 1:37:12 PM7/26/16
to Tyler Compton, golang-nuts, da...@cheney.net
I encourage people to use the later form because of the duality of a bool (true or false) and the fact that it implies meaning. When you see the bool you have to start considering the false path. I'd much prefer the extra few bytes for the longer inclusion checks because there is no ambiguity in the check. 

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Matt Harden

unread,
Jul 26, 2016, 1:56:10 PM7/26/16
to Edward Muller, Tyler Compton, golang-nuts, da...@cheney.net
Sets are in an unfortunate situation in Go. They're so easy to create from maps that it's hard to justify creating a special type for them, but the struct{}/bool dichotomy has good arguments on both sides. The result is that there is no idiomatic set type. I'm starting to think that it's best to always create/use a custom set type, just to make the usage clear and eliminate the possibility of having the "false" bug.

For deduplication specifically, I find closures useful:

seen := make(map[string]struct{})
first := func(s string) bool {
  if _, ok := seen[s]; !ok {
    seen[s] = struct{}{}
  }
  return ok
}
for _, v := range blah {
   if first(v) {
      // ...

Nate Finch

unread,
Jul 26, 2016, 3:27:53 PM7/26/16
to golang-nuts, bwall...@gmail.com
I much prefer struct{}.  For gophers, it is very expressive of "nothing to see here".    This value has no meaning, so don't even bother with it.

And as someone else said, the _, ok pattern is so pervasive, it's no harder for me to read than if myMap[foo] { 

if _, ok := myMap[foo]; ok {

}

I think it tells the story better than bool.  If you use bool, you're saying "this value could be true or false", when in fact, the value should always be true, or not exist.

Dan Kortschak

unread,
Jul 26, 2016, 6:59:01 PM7/26/16
to Nate Finch, golang-nuts, bwall...@gmail.com
Very much agree, but also something that has not been explicitly (or at
least deeply) said here is the use of a tiny type (when the number of
uses warrants - this is salt to taste).

type set map[T]struct{}

func (s set) has(v T) bool {
_, ok := s[v]
return ok
}

func (s set) add(v T) { s[v] = struct{}{} }

This gets you the expressiveness that's needed, doesn't expose what some
people think is ugly and avoids the ternary state problem of map[T]bool.
Reply all
Reply to author
Forward
0 new messages