Hi Tushar,
My rule of thumb in practice is: for returning structs, I almost always return a pointer.
The exception, and in general the only time I would return a value, is when:
a) the returned value is _intended_ to be an immutable value, like a time.Time or a string;
*and*
b) the returned value is 3 words or fewer (a word is 8 bytes on a 64-bit architecture like amd64).
Both time.Time (3 words) and string (2 words) meet these criteria.
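If you want to check the word counts yourself, here is a minimal sketch using unsafe.Sizeof. The names wordSize, words, wordsOf and fourWords are mine, made up for illustration, and the 3-word figure for time.Time assumes a 64-bit platform.

```go
package main

import (
	"fmt"
	"time"
	"unsafe"
)

// wordSize is the size in bytes of one machine word (a pointer).
const wordSize = unsafe.Sizeof(uintptr(0))

// words converts a size in bytes to a count of machine words, rounding up.
func words(size uintptr) uintptr {
	return (size + wordSize - 1) / wordSize
}

// wordsOf reports how many machine words a value of type T occupies.
func wordsOf[T any]() uintptr {
	var v T
	return words(unsafe.Sizeof(v))
}

// fourWords is a hypothetical struct that is over the 3-word threshold.
type fourWords struct {
	a, b, c, d uintptr
}

func main() {
	fmt.Println("string:   ", wordsOf[string]())
	fmt.Println("time.Time:", wordsOf[time.Time]()) // 3 on 64-bit platforms
	fmt.Println("fourWords:", wordsOf[fourWords]())
}
```

By rule (b), string and time.Time stay candidates for returning by value, while something like fourWords would get a pointer.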
Why this (b) heuristic? Because over 3 words
I assume, as a heuristic that should be measured if it
matters, that it will be faster to copy the one-word pointer
than the struct value. A pointer of course
is more likely to be faster to copy initially, but slower if
there is a cache miss on use or if it induces a lot more garbage for
the garbage collector.
A value is more likely to be on the stack, so typically less garbage;
and since it's more likely to be on the stack, it is also more likely to be in the
hardware's cache lines, so use will be faster. You can see why you
have to measure your actual use to see which is faster, if
profiling shows that it is your bottleneck
and thus matters.
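As a rough sketch of what "measure it" could look like, the program below compares returning a hypothetical 16-word struct by value and by pointer using testing.Benchmark. The type large and the helper names are assumptions for illustration only; the numbers depend on your machine, Go version, and how the result is actually used, so treat this as a starting point and not a verdict.

```go
package main

import (
	"fmt"
	"testing"
)

// large is a hypothetical 16-word struct, well over the 3-word threshold.
type large struct {
	v [16]uint64
}

// newLargeValue returns the struct by value (copied to the caller).
//
//go:noinline
func newLargeValue() large {
	return large{v: [16]uint64{0: 1}}
}

// newLargePointer returns a pointer; the struct will likely be heap allocated.
//
//go:noinline
func newLargePointer() *large {
	return &large{v: [16]uint64{0: 1}}
}

// sink keeps the compiler from discarding the results.
var sink uint64

func benchValue(b *testing.B) {
	for i := 0; i < b.N; i++ {
		l := newLargeValue()
		sink += l.v[0]
	}
}

func benchPointer(b *testing.B) {
	for i := 0; i < b.N; i++ {
		l := newLargePointer()
		sink += l.v[0]
	}
}

func main() {
	rv := testing.Benchmark(benchValue)
	rp := testing.Benchmark(benchPointer)
	fmt.Println("value:  ", rv.String(), rv.MemString())
	fmt.Println("pointer:", rp.String(), rp.MemString())
}
```

In a real project you would normally put these in a _test.go file and run them with go test -bench, ideally against your actual types and call patterns.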
And of course the other rules in the previously referenced
guidelines provided by Ian would also apply.
So if you have a sync.Mutex value inside, or something
else that cannot be copied (sync.RWMutex, sync.WaitGroup),
then you _must_ return a pointer, per their documentation. Notice that if you had
a *sync.Mutex inside, then you might get away with returning a value, but
that gets tricky. Are shallow copies enforced, somehow? Will the user
make a mistake in copying them by value with a default shallow copy? Your
API design needs to balance the possibility of user error with space/time efficiency.
The docs can say copies are forbidden (as sync and math/big, below, do), but
the user might not read the docs, or remember their rules.
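To make the sync.Mutex case concrete, here is a minimal sketch. Counter and NewCounter are hypothetical names of mine. Because the mutex is held by value inside the struct, the constructor returns a pointer and all methods use pointer receivers, so every caller shares the same lock.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical type holding a sync.Mutex by value,
// so a Counter must not be copied after first use.
type Counter struct {
	mu sync.Mutex
	n  int
}

// NewCounter returns a pointer so that all users share one mutex.
func NewCounter() *Counter {
	return &Counter{}
}

// Inc increments the counter under the lock.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value reads the counter under the lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

func main() {
	c := NewCounter()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	fmt.Println(c.Value()) // prints 100
}
```

As partial enforcement, go vet has a copylocks check that reports many by-value copies of types containing a sync.Mutex, but it only helps users who actually run vet.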
More detail/an example:
The idea of an immutable value can be subtle. An integer (as in the math
concept of an integer that can grow towards infinity and possibly become very big)
is a good example of the tradeoffs.
Usually an integer value fits in a word, because _usually_ it needs
only be under 64 bits (or 63 bits for signed), and can be represented with
an int (word sized) or int64. For example: if you are incrementing a 63-bit
positive integer once every clock cycle, and your clock cycle is
an optimistic 10GHz, so 0.1 nanosecond, and assuming that said word
integer can be incremented in a single clock cycle, then a 63-bit
integer (2^63 distinct values) could be incremented for 29 years before overflowing,
since 2^63/(1e10 increments/sec*60 sec/minute*60 minutes/hour*
24 hours/day*365.25 days/year) = 29.2271 years. Usually we assume
that our code is doing other things as well, and will be restarted and/or
ported to 128-bit or higher architectures before then.
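The arithmetic above can be reproduced with a few lines of Go. The function name yearsToOverflow is made up for this sketch; it uses floating point, which is plenty accurate for an estimate like this.

```go
package main

import (
	"fmt"
	"math"
)

// yearsToOverflow returns how many years it takes to count through
// 2^bits values at the given number of increments per second.
func yearsToOverflow(bits int, incrementsPerSecond float64) float64 {
	const secondsPerYear = 60 * 60 * 24 * 365.25
	total := math.Ldexp(1, bits) // 2^bits, exact in float64
	return total / (incrementsPerSecond * secondsPerYear)
}

func main() {
	// 63 usable bits, incremented at 10GHz.
	fmt.Printf("%.4f years\n", yearsToOverflow(63, 1e10)) // prints 29.2271 years
}
```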
But notice the distinction when the integers need to get bigger today, say
for checking the math involved with cryptography: the math/big package
uses pointers for big integers because very big integers are going to take up many
more words than 3. So the big package returns pointers and insists on using
pointers. But that can mean user error if the user accidentally copies them
by value. The math/big documentation warns:
> Operations always take pointer arguments (*Int) rather
> than Int values, and each unique Int value requires its own
> unique *Int pointer. To "copy" an Int value, an existing
> (or newly allocated) Int must be set to a new value using
> the Int.Set method; shallow copies of Ints are not
> supported and may lead to errors.
>
> func (z *Int) Set(x *Int) *Int // signature of math/big.Int.Set()
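Here is a short sketch of the copy-via-Set pattern the docs describe. The helpers copyInt and demo are hypothetical names of mine; Set, Lsh, Add and NewInt are the real math/big calls.

```go
package main

import (
	"fmt"
	"math/big"
)

// copyInt returns an independent copy of x using Int.Set,
// instead of a shallow copy such as y := *x.
func copyInt(x *big.Int) *big.Int {
	return new(big.Int).Set(x)
}

// demo builds a = 2^100, copies it, modifies only the copy,
// and returns both values as decimal strings.
func demo() (string, string) {
	a := big.NewInt(1)
	a.Lsh(a, 100) // a = 2^100, far too big for one machine word

	b := copyInt(a)
	b.Add(b, big.NewInt(1)) // changes b only; a is untouched

	return a.String(), b.String()
}

func main() {
	a, b := demo()
	fmt.Println(a) // 1267650600228229401496703205376
	fmt.Println(b) // 1267650600228229401496703205377
}
```

A shallow copy (b := *a) would compile without complaint, which is exactly the kind of user error the documentation is warning about.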
Hope this helps capture some of the nuance. Generally pointers
are the safer bet, and values are a performance optimization.
Jason