Proposal: auto return String instead of []byte if requested


Kevin Chadwick

Sep 11, 2020, 12:45:52 PM
to golang-nuts
I apologise if this has already been discussed. Google didn't turn up anything
directly related.

Suppose you read a file using the following, which returns a byte slice:

tlsCertb, err := ioutil.ReadFile("/etc/ssl/mycert")
if err != nil {
	log.Fatal(err)
}
tlsCert := string(tlsCertb)

Is there a way to get a string without the conversion?

Otherwise couldn't the language automatically return a string rather than a byte
slice in these cases if the receiving var is already a string?

e.g.

// Proposed behaviour; this does not compile today.
var tlsCert string
tlsCert, err = ioutil.ReadFile("/etc/ssl/mycert")
if err != nil {
	log.Fatal(err)
}

Volker Dobler

Sep 11, 2020, 2:31:54 PM
to golang-nuts
Please no. This is just begging for problems.
A simple type conversion is 1 (one!) line and pretty clear.
Once you open this can of worms someone would like
to have a []rune and then automatic conversions from
int32 to int, and 6 months later you have a JavaScript-like
nonsense language just because you saved one
trivial line of code.

V

Ian Lance Taylor

Sep 11, 2020, 3:09:57 PM
to Kevin Chadwick, golang-nuts
The way Go works, ioutil.ReadFile is compiled with the io/ioutil
package. It can't change based on how it is called. So it is always
going to return []byte.

The only way to implement what you suggest would be to add an implicit
conversion from []byte to string. But that seems problematic. In
general Go avoids implicit conversions except to interface types. And
a conversion to string does require a copy, so it doesn't seem like a
great idea to do that implicitly.
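
To illustrate the copy mentioned above, here is a minimal sketch. The name toString is made up for this example. Because a string is immutable while a []byte is not, the conversion has to produce data that later writes to the slice cannot reach:

```go
package main

import "fmt"

// toString is a hypothetical name used only for this illustration.
// The conversion copies the bytes, so the resulting string is
// independent of any later writes to b.
func toString(b []byte) string {
	return string(b)
}

func main() {
	b := []byte("cert")
	s := toString(b)
	b[0] = 'X' // mutate the slice after the conversion

	fmt.Println(string(b), s) // prints "Xert cert"
}
```

An implicit conversion would hide that copy at every assignment, which is the cost being pointed at here.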

If this happens a lot in your code, it seems easy enough to use a tiny
helper function.

Ian

Kevin Chadwick

Sep 11, 2020, 3:43:18 PM
to golang-nuts
On 2020-09-11 19:08, Ian Lance Taylor wrote:
> The way Go works, ioutil.ReadFile is compiled with the io/ioutil
> package. It can't change based on how it is called. So it is always
> going to return []byte.

Ok. I figured it might also save an allocation if the coder made their
intention clear beforehand. I thought it might not be worth the effort, even if
it were straightforward.

Thanks for the consideration.

Amnon

Sep 12, 2020, 2:37:14 AM
to golang-nuts
People tend to use strings far too much.

Does tlsCert in this code really need to be a string?

Probably not.
But if it does really need to be a string, then converting to a string as you have done is
quite straightforward. And explicit is better than implicit.
The beauty of Go is its simplicity. Go doesn't do magic.

tapi...@gmail.com

Sep 12, 2020, 3:42:14 AM
to golang-nuts
Would it be good to introduce ownership-transfer-based string<->[]byte conversions?
After the conversion, the value that was converted must not be used any more.
For example:

tlsCertData, _ := ioutil.ReadFile("/etc/ssl/mycert")
var tlsCert string = builtin.ByteSlice2String(tlsCertData)

// forbid using tlsCertData any more
_ = tlsCertData // error: tlsCertData can only be used after a re-assignment.

tapi...@gmail.com

Sep 12, 2020, 3:46:17 AM
to golang-nuts
There is a prerequisite to transfer ownership: it must be proved that
no other values share ownership of the byte slice returned by ioutil.ReadFile.
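
For comparison, here is a rough sketch of what a manual version of this idea looks like today. It is not the proposed builtin: the name bytesToStringNoCopy is made up, it assumes Go 1.20 or later for unsafe.String and unsafe.SliceData, and nothing in the language checks the ownership rule, so the caller has to guarantee that the slice is never written to again.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToStringNoCopy is a hypothetical helper (requires Go 1.20+).
// It reinterprets the slice's backing array as a string without copying.
// The compiler does NOT enforce ownership here: if anything writes to b
// afterwards, the "immutable" string changes too, which is undefined
// behaviour as far as the rest of the program is concerned.
func bytesToStringNoCopy(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	data := []byte("certificate bytes")
	s := bytesToStringNoCopy(data)
	data = nil // drop our reference; the string now "owns" the memory

	fmt.Println(s) // prints "certificate bytes"
}
```

The proof obligation described above is exactly what this sketch leaves to the programmer.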

Axel Wagner

Sep 12, 2020, 4:26:34 AM
to golang-nuts
Hi y'all,

given that the concern here seems to be performance (though, TBH, I doubt this particular case is much of a bottleneck), this seems to be far simpler to address as a compiler optimization - if the compiler can prove there are no other references to a `[]byte`, it can do the conversion cheaply. You need to implement that check anyway, and optimizations are transparent to the user, so the hurdle for them is lower.

The only problem with that is that it would need to happen cross-package. That's where the type-system might help. AFAICT however, if you're willing to change the API anyway, the simpler fix here would be to take an `io.Writer` instead of returning a `[]byte`. The user is then free to pass a `*bytes.Buffer` or a `*strings.Builder`, depending on which of the two they need. And if the wins of this optimization are large enough to justify changing the language (which, again, remains to be proven) they certainly are large enough to justify this slightly less user-friendly API.

So, really, IMO the case for language changes here is pretty weak.


David Finkel

Sep 12, 2020, 7:03:41 PM
to Axel Wagner, golang-nuts
I agree that this is more of a missing compiler optimization. The compiler already has the ability to elide copies for a subset of []byte -> string conversions.

As an example, here's the implementation of bytes.Equal:

func Equal(a, b []byte) bool {
	// Neither cmd/compile nor gccgo allocates for these string conversions.
	return string(a) == string(b)
}


Compiler Explorer shows that this actually gets compiled directly down to the appropriate runtime.memequal call at a call site.

In contrast, in a very trivial package with two functions, the compiler does appear to inline a trivial function doing a string->bytes conversion, but still ends up calling both runtime.slicebytetostring and runtime.stringtoslicebyte. In this contrived example, simple escape analysis would be enough to elide the copies.

My general impression is that the gc compiler's inliner works rather well across packages in many cases, so if this does end up being a bottleneck for someone, this may not be an expensive optimization to add on top of the current infrastructure. (Disclaimer: I haven't delved deeply into the inner workings of the gc compiler's optimization passes.)
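
One rough way to observe the difference without reading assembly is to count allocations with testing.AllocsPerRun. This is only a sketch: the function names and the package-level sink variables are made up for the example, and the exact numbers depend on the compiler and its version.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks keep the results alive so the work is not optimized away.
var (
	sinkString string
	sinkBool   bool
)

// compareAllocs measures the bytes.Equal-style pattern: the converted
// strings are only compared, never stored.
func compareAllocs(a, b []byte) float64 {
	return testing.AllocsPerRun(100, func() {
		sinkBool = string(a) == string(b)
	})
}

// convertAllocs measures a conversion whose result escapes to a global,
// so the bytes have to be copied into a new string.
func convertAllocs(a []byte) float64 {
	return testing.AllocsPerRun(100, func() {
		sinkString = string(a)
	})
}

func main() {
	a := make([]byte, 64)
	b := make([]byte, 64)
	fmt.Println("compare only:    ", compareAllocs(a, b))
	fmt.Println("convert and keep:", convertAllocs(a))
}
```

With the gc compiler one would expect the comparison to report no allocations and the stored conversion to report one per run, but treat that as something to measure rather than a guarantee.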
 