Converting int to string


Ted Stockwell

Jun 13, 2012, 1:49:18 PM
to golang-nuts
This test....

package qwerty

import (
	"fmt"
	"testing"
)

func TestSomething(t *testing.T) {
	f := 0
	fmt.Printf("f=%d\n", f)
	fmt.Printf("f=" + string(f) + "\n")
}

outputs this...

=== RUN TestSomething
f=0
f=

I expected string(f) to evaluate to "0", why does it evaluate to ""?

Rémy Oudompheng

Jun 13, 2012, 1:58:02 PM
to Ted Stockwell, golang-nuts
2012/6/13 Ted Stockwell <emor...@yahoo.com>:
http://golang.org/ref/spec#Conversions_to_and_from_a_string_type

Here string(0) == "\0"

Rémy.

Karl J. Smith

Jun 13, 2012, 2:01:14 PM
to Ted Stockwell, golang-nuts

It's treating your int as a Unicode code point, as in the 'A' example in the spec section linked above. Use strconv.Itoa for what you're trying to do instead.
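A minimal sketch of the strconv.Itoa suggestion, applied to the line from the original test. The helper name `label` is made up for this illustration; only `strconv.Itoa` comes from the thread.

```go
package main

import (
	"fmt"
	"strconv"
)

// label is a hypothetical helper that builds the text the original test
// was after, using the decimal form of f rather than a string conversion.
func label(f int) string {
	return "f=" + strconv.Itoa(f)
}

func main() {
	fmt.Println(label(0)) // f=0
}
```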

Ted Stockwell

Jun 13, 2012, 2:53:41 PM
to golang-nuts


On Jun 13, 12:58 pm, Rémy Oudompheng <remyoudomph...@gmail.com> wrote:
>
> http://golang.org/ref/spec#Conversions_to_and_from_a_string_type
>
> Here string(0) == "\0"
>


That link specifically says "Converting a signed or unsigned integer
value to a string type yields a string containing the UTF-8
representation of the integer."
So, according to the specification I think that the current behavior
is incorrect since "" is not a valid UTF-8 representation of the value
0.

Karl J. Smith

Jun 13, 2012, 3:02:03 PM
to Ted Stockwell, golang-nuts
The '\0' is just invisible. It's not blank.

Rémy Oudompheng

Jun 13, 2012, 3:21:22 PM
to Karl J. Smith, Ted Stockwell, golang-nuts
On 2012/6/13 Karl J. Smith <frum...@gmail.com> wrote:
> The '\0' is just invisible. It's not blank.
>
> http://play.golang.org/p/XjHn6zT9Lr

Let me mention the very useful
fmt.Printf("%q\n", string(0))

Rémy.

Karl J. Smith

Jun 13, 2012, 3:42:51 PM
to Rémy Oudompheng, Ted Stockwell, golang-nuts
Or even %#v, which is great for debugging, and does the same as %q in this case (http://golang.org/pkg/fmt/)
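A small sketch of the two verbs mentioned here, assuming the string in question is the one-byte result of converting code point 0. The helper names `quoted` and `goSyntax` are hypothetical; the conversion is spelled string(rune(0)) so the example does not depend on how a given toolchain treats a bare integer conversion.

```go
package main

import "fmt"

// quoted formats s with %q, which escapes non-printable bytes.
func quoted(s string) string {
	return fmt.Sprintf("%q", s)
}

// goSyntax formats s with %#v, the Go-syntax representation of the value.
func goSyntax(s string) string {
	return fmt.Sprintf("%#v", s)
}

func main() {
	s := string(rune(0)) // one NUL byte, not an empty string

	fmt.Println(len(s))      // 1
	fmt.Println(quoted(s))   // "\x00"
	fmt.Println(goSyntax(s)) // "\x00"
}
```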

Dominik Honnef

Jun 13, 2012, 2:04:29 PM
to golan...@googlegroups.com
It does not evaluate to "", it evaluates to "\x00". string(int) treats
the number as a Unicode code point. If you set f to 65, string(f) will
be "A"; if you set it to 0x00BF, it will be "¿", and so on.

To properly convert a number to its string representation, use the
appropriate functions from fmt (e.g. Sprintf) or use the strconv package.

P.S.: use "%#v" as the format string in fmt.Printf to see the actual
object representation, which would be "\x00" in your case.
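The two behaviours described in this message can be put side by side. This is only a sketch: `asRune` and `asDecimal` are names invented for the illustration, and the values 65 and 0x00BF are the ones from the message.

```go
package main

import (
	"fmt"
	"strconv"
)

// asRune (hypothetical helper) returns the string holding the single
// code point n, which is what the conversion in the question produces.
func asRune(n int) string {
	return string(rune(n))
}

// asDecimal (hypothetical helper) formats n as decimal digits with fmt;
// strconv.Itoa(n) gives the same text.
func asDecimal(n int) string {
	return fmt.Sprintf("%d", n)
}

func main() {
	for _, n := range []int{0, 65, 0x00BF} {
		fmt.Printf("n=%d rune=%q fmt=%q strconv=%q\n",
			n, asRune(n), asDecimal(n), strconv.Itoa(n))
	}
}
```

As far as I know, go vet in Go 1.15 and later reports a direct string(n) on a plain int, which is why the sketch goes through rune explicitly.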

Ted Stockwell

Jun 13, 2012, 5:19:01 PM
to golang-nuts


On Jun 13, 1:04 pm, Dominik Honnef <domin...@fork-bomb.org> wrote:
>
> string(int) treats the number as a unicode code point.
>

That's my point, the spec doesn't say that string(int) creates a
string representation of the code points associated with an integer.
The spec says that it converts an integer into its corresponding
string representation.

I would expect the result of a string conversion to be able to be
converted back into its corresponding number using strconv.Atoi.
strconv.Atoi won't parse either "\x00" or "A"
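The round trip asked for here does exist, but between strconv.Itoa and strconv.Atoi rather than through the string conversion. A short sketch, with `roundTrip` and `parses` as hypothetical helper names:

```go
package main

import (
	"fmt"
	"strconv"
)

// roundTrip formats n as decimal text and parses that text back.
func roundTrip(n int) (int, error) {
	return strconv.Atoi(strconv.Itoa(n))
}

// parses reports whether strconv.Atoi accepts s.
func parses(s string) bool {
	_, err := strconv.Atoi(s)
	return err == nil
}

func main() {
	n, err := roundTrip(65)
	fmt.Println(n, err) // 65 <nil>

	// The code-point conversion yields no digits, so Atoi rejects it.
	fmt.Println(parses(string(rune(65)))) // false ("A")
	fmt.Println(parses(string(rune(0))))  // false ("\x00")
}
```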

Dominik Honnef

Jun 13, 2012, 5:30:49 PM
to golan...@googlegroups.com
The link also specifically gives examples, for those who are inclined to
misinterpret what "UTF-8 representation of the integer" means. It says
UTF-8 representation, not string representation. UTF-8, because the
integer is a Unicode code point, which can be represented in various
ways (UTF-8, UTF-16BE, etc.). It has absolutely nothing to do with
converting numbers into their string representation. It is about runes.
The examples show this, and since the examples are part of the
specification, behaviour and specification match.

And as others (and I) have already said, it isn't "", it's "\0". The
UTF-8 representation of 0 is the single byte 0. The UTF-8 representation
of a value in [0, 128) is the value itself as one byte; for anything
above that it takes 2 to 4 bytes.
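The byte counts mentioned above can be checked with the standard unicode/utf8 package. A rough sketch; `encodedLen` is a hypothetical wrapper around utf8.RuneLen, and the sample code points are chosen only to sit on the boundaries between lengths.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLen returns how many bytes the UTF-8 encoding of r occupies.
func encodedLen(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// Values below 0x80 take one byte; larger ones take two to four.
	for _, r := range []rune{0, 'A', 0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000} {
		fmt.Printf("U+%04X: %d byte(s), % x\n", r, encodedLen(r), string(r))
	}
}
```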

Rob 'Commander' Pike

Jun 13, 2012, 7:26:46 PM
to Ted Stockwell, golang-nuts
On Wed, Jun 13, 2012 at 4:19 PM, Ted Stockwell <emor...@yahoo.com> wrote:
>
>
> On Jun 13, 1:04 pm, Dominik Honnef <domin...@fork-bomb.org> wrote:
>>
>> string(int) treats the number as a unicode code point.
>>
>
> That's my point, the spec doesn't say that string(int) creates a
> string representation of the code points associated with an integer.
> The spec says that it converts an integer into its corresponding
> string representation.

It says it creates the "UTF-8 string representation of the integer".
UTF-8 is an encoding of integer values. Given the integer 7, it is the
byte "\x07". Given the integer 0x1234, it is the bytes "\xe1\x88\xb4".
And given the integer 0, it is the byte "\0".

You read the spec as saying the decimal representation of the integer,
but that's not what it says or does.

> I would expect the result of a string conversion to be able to be
> converted back into its corresponding number using strconv.Atoi.
> strconv.Atoi won't parse either "\x00" or "A"

Right idea, wrong encoding. Try []int("\xe1\x88\xb4").

-rob
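The byte sequences in this message can be reproduced directly. Note that the three bytes "\xe1\x88\xb4" are the encoding of code point 0x1234 (hexadecimal); decimal 1234 is code point 0x4D2 and encodes to two bytes. A small sketch, with `utf8Bytes` as a hypothetical helper:

```go
package main

import "fmt"

// utf8Bytes returns the bytes of the UTF-8 encoding of code point r.
func utf8Bytes(r rune) []byte {
	return []byte(string(r))
}

func main() {
	fmt.Printf("% x\n", utf8Bytes(7))      // 07
	fmt.Printf("% x\n", utf8Bytes(0x1234)) // e1 88 b4
	fmt.Printf("% x\n", utf8Bytes(1234))   // d3 92
	fmt.Printf("% x\n", utf8Bytes(0))      // 00
}
```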

Ted Stockwell

Jun 13, 2012, 8:02:50 PM
to golang-nuts


On Jun 13, 6:26 pm, "Rob 'Commander' Pike" <r...@golang.org> wrote:
>
> It says it creates the "UTF-8 string representation of the integer".
> UTF-8 is an encoding of integer values. Given the integer 7, it is the
> byte "\x07". Given the integer 0x1234, it is the bytes "\xe1\x88\xb4".
> And given the integer 0, it is the byte "\0".
>
> You read the spec as saying the decimal representation of the integer,
> but that's not what it says or does.
>

True.
It might be less confusing to newbies like myself if it said "UTF-8
encoding of the integer" instead of "UTF-8 representation of the
integer".
To me "UTF-8 representation of the integer" means a string
representation of the integer where the string uses UTF-8 encoding.
I interpreted it that way because my interpretation seems, to me, to
be the more reasonable thing to do :-).

> > I would expect the result of a string conversion to be able to be
> > converted back into its corresponding number using strconv.Atoi.
> > strconv.Atoi won't parse either "\x00" or "A"
>
> Right idea, wrong encoding.  Try []int("\xe1\x88\xb4").
>

I know I'm pissing people off with my persistence but I'm trying hard
to build a consistent mental model of how Go works.
The expression above, []int("\xe1\x88\xb4"), does not compile.
I get "cannot convert "?" (type string) to type []int".
Same for []int("\x07")

One more thing :-)...
[]int("\x07") would convert the string "\x07" to an []int, how would I
convert it back to the original int of 7?

Nigel Tao

Jun 13, 2012, 8:10:17 PM
to Ted Stockwell, golang-nuts
On 14 June 2012 10:02, Ted Stockwell <emor...@yahoo.com> wrote:
> The expression above, []int("\xe1\x88\xb4"), does not compile.
> I get "cannot convert "?" (type string) to type []int".
> Same for []int("\x07")

Sorry, use "rune" instead of "int".


> One more thing :-)...
> []int("\x07") would convert the string "\x07" to an []int, how would I
> convert it back to the original int of 7?

[]rune("\x07")[0]

Rob 'Commander' Pike

Jun 13, 2012, 8:18:30 PM
to Nigel Tao, Ted Stockwell, golang-nuts
Yes, apologies, it's []rune (which is a form of integer).

-rob

Andrew Gerrand

Jun 13, 2012, 8:38:30 PM
to Ted Stockwell, golang-nuts
On 14 June 2012 10:02, Ted Stockwell <emor...@yahoo.com> wrote:
> It might be less confusing to newbies like myself if it said "UTF-8
> encoding of the integer" instead of "UTF-8 representation of the
> integer".

The goal of the spec is to be a precise definition of the language.
It's great that it's short and accessible to newcomers, but that is a
result of good process, not a design consideration.

The choice of words is deliberate. Elsewhere in the document are
references to a "decimal representation," and so "UTF-8
representation" is appropriate here IMO.

> []int("\x07") would convert the string "\x07" to an []int, how would I
> convert it back to the original int of 7?

package main

import "fmt"

func main() {
	s := []rune("\x07")
	i := s[0]
	fmt.Println(i)
}

Andrew
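An alternative sketch for the same conversion back to a number, assuming only the first code point is wanted: utf8.DecodeRuneInString from the standard unicode/utf8 package decodes it without building a []rune. The helper name `firstCodePoint` is made up for the illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstCodePoint decodes the first code point of s and reports how many
// bytes it used. For an empty string it returns (utf8.RuneError, 0).
func firstCodePoint(s string) (rune, int) {
	return utf8.DecodeRuneInString(s)
}

func main() {
	r, size := firstCodePoint("\x07")
	fmt.Println(r, size) // 7 1

	r, size = firstCodePoint("\xe1\x88\xb4")
	fmt.Printf("%#x %d\n", r, size) // 0x1234 3
}
```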

Marvin Renich

Jul 9, 2012, 9:13:58 AM
to golang-nuts
* Andrew Gerrand <a...@golang.org> [120613 20:45]:
> On 14 June 2012 10:02, Ted Stockwell <emor...@yahoo.com> wrote:
> > It might be less confusing to newbies like myself if it said "UTF-8
> > encoding of the integer" instead of "UTF-8 representation of the
> > integer".
>
> The goal of the spec is to be a precise definition of the language.
> It's great that it's short and accessible to newcomers, but that is a
> result of good process, not a design consideration.
>
> The choice of words is deliberate. Elsewhere in the document are
> references to a "decimal representation," and so "UTF-8
> representation" is appropriate here IMO.

Sorry to reopen a month-old thread; I'm that far behind.

I don't think "UTF-8 representation" is at all ambiguous, and I don't
think "UTF-8 encoding" would improve it. What is ambiguous is
"integer". A Unicode code point is always an integer, and its UTF-8
representation is not in question. But an integer is not always a
Unicode code point. The UTF-8 representation of the integer that
represents the number of records in a database that match a certain
criterion is *not* the UTF-8 representation of that same number treated
as a Unicode code point.

The examples do, indeed, disambiguate what the words intend, but most
specification documents treat examples as clarifying, but non-normative.
The normative description should be unambiguous without the examples.
Examples should clarify the difficult-to-understand, but should not be
necessary to disambiguate.

I think it would remove any ambiguity to say "...string containing the
UTF-8 representation of the integer treated as a Unicode code point."

...Marvin

Rémy Oudompheng

Jul 9, 2012, 3:00:40 PM
to Marvin Renich, golang-nuts
On 2012/7/9 Marvin Renich <mr...@renich.org> wrote:
> I don't think "UTF-8 representation" is at all ambiguous, and I don't
> think "UTF-8 encoding" would improve it. What is ambiguous is
> "integer". A Unicode code point is always an integer, and its UTF-8
> representation is not in question. But an integer is not always a
> Unicode code point.
> [...]
> I think it would remove any ambiguity to say "...string containing the
> UTF-8 representation of the integer treated as a Unicode code point."

What do you mean by the combination of these sentences?

Rémy.

Marvin Renich

Jul 9, 2012, 4:05:54 PM
to golang-nuts
* Rémy Oudompheng <remyoud...@gmail.com> [120709 15:06]:
The thread was about the following statement in the spec being mistaken
to mean that the result was the ASCII representation of the integer:

Converting a signed or unsigned integer value to a string type yields
a string containing the UTF-8 representation of the integer.

In the specific message to which I responded, Andrew Gerrand was
responding to a suggestion that "UTF-8 representation" be changed to
"UTF-8 encoding" by saying that this wording was deliberate. My first
sentence above was agreeing with that.

My second sentence was saying that where the spec says "integer" is
where I believe the ambiguity arises. A pear (or Unicode code point) is
a fruit (or integer), but a fruit is not always a pear.

The spec says "UTF-8 representation of the integer", but an integer
variable (in Go or any other computer language) can represent many other
things besides a Unicode code point, and if I were to read "UTF-8
representation of the integer representing the number of wombats in cave
3", I would expect the result to be the same as Itoa. Only if the
integer in question is a Unicode code point would I expect its UTF-8
representation to be what the spec intends here.

My last sentence quoted above is a suggestion to improve the spec by
adding the phrase "treated as a Unicode code point" so as to remove any
possible misinterpretation.

...Marvin

Rémy Oudompheng

Jul 10, 2012, 1:14:01 AM
to Marvin Renich, golang-nuts
I tend to think it cannot really be made much clearer, and that
examples are a lot more efficient at explaining what is meant; there
happen to be several of them.

Rémy.