How does the go compiler treat initialized string variables?

Frank Jüdes

Nov 1, 2022, 7:36:29 PM
to golang-nuts
I have to write a program that should verify the content of configuration-servers. It will need a lot of pre-initialized data to verify the actual content of a server, so it has to initialize many strings. What I know from other C-style languages is that code like

var MyString *string = 'Hello World!';

Will result in having two identical copies of the string »Hello World!« in the memory: The first one within the program-code itself and a second copy somewhere in the heap-memory.
How will the Go compiler handle something like this:

package main

import "fmt"

type MyStruct struct {
	Text  string
	Count int32
}

func main() {
	MyVar := MyStruct{Text: "Hello World!", Count: 20}
	fmt.Printf("%#v\n", MyVar)
}

Will there be two copies of the string »Hello World!« in the memory or just one? As said, my code will contain a gazillion pre-defined string variables, and having each string allocate two copies of itself in memory would be bad for small systems.
Thank you very much in advance for your help.

Jan Mercl

Nov 1, 2022, 7:47:47 PM
to Frank Jüdes, golang-nuts


On Tue, Nov 1, 2022, 20:36 Frank Jüdes <jue...@gmail.com> wrote:
> I have to write a program that should verify the content of configuration-servers. It will need a lot of pre-initialized data to verify the actual content of a server, so it has to initialize many strings. What I know from other C-style languages is that code like
>
> var MyString *string = 'Hello World!';
>
> Will result in having two identical copies of the string »Hello World!« in the memory: The first one within the program-code itself and a second copy somewhere in the heap-memory.

I think the string backing array will be in the text segment, as in C. The string header will end up in the data segment, provided it's a package-scoped variable, but the header has no direct equivalent in C.

> How will the go-compiler handle something like this:
>
> package main
>   import ("fmt")
>   type MyStruct struct {
>     Text string
>     Count int32
>   }
>   func main() {
>     MyVar := MyStruct{Text: "Hello World!", Count: 20 }
>     fmt.Printf("%#v\n",MyVar) }
>
> Will there be two copies of the string »Hello World!« in the memory or just one?

The string's backing array will exist only once, again in the text segment, I believe, because there's no reason to make any copies of it in this case.

Not tested/verified 
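
One way to check the claim above is to compare the data pointers of the strings involved. The following is a minimal sketch, assuming Go 1.20 or newer (for unsafe.StringData); the helper sameBacking is a hypothetical name introduced only for this illustration. Copying a struct copies the string header (pointer and length), so the copy must point at the same bytes. Whether two separate but identical literals share their bytes is up to the toolchain and is not promised by the language specification, so the program only prints that result.

```go
package main

import (
	"fmt"
	"unsafe"
)

type MyStruct struct {
	Text  string
	Count int32
}

// sameBacking reports whether two non-empty strings start at the same byte
// in memory. It is a hypothetical helper for this example only.
// unsafe.StringData requires Go 1.20 or newer.
func sameBacking(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

func main() {
	original := MyStruct{Text: "Hello World!", Count: 20}

	// Copying the struct copies the string header (pointer and length),
	// not the bytes the header points to.
	copied := original

	fmt.Println("copy shares bytes:", sameBacking(original.Text, copied.Text))

	// Two separate, identical literals. Sharing here depends on the
	// toolchain, so no particular result is assumed.
	fmt.Println("identical literals share bytes:",
		sameBacking("Mild Error", "Mild Error"))
}
```

If the first line prints true, the copy did not duplicate the text; only the small header was duplicated.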

Frank Jüdes

Nov 2, 2022, 2:15:27 AM
to golang-nuts
Thank you very much for the answer! It actually turns out that my structure is a bit more complex than I thought. The test cases themselves are a structure of seven strings and one int64, which are organized as a test-case list into a map[string], and those lists are organized into groups, also as a map[string]… 🤯
I am generating the variable that holds all this data straight out of two database tables; it looks like this:
var TestCaseGroup = T_TestCaseGroup {
  "cn=config": {
    "ds-cfg-add-missing-rdn-attributes": {
      RecommendedValue: "true",
      MessageClass: "Recommendation",
      MessageType: "Compatibility",
      MessageText: "It is recommended to enable this feature to make OUD more compatible to older applications."},
    "ds-cfg-allow-attribute-name-exceptions": {
      RecommendedValue: "false",
      MessageClass: "Severe Warning",
      MessageType: "Data-Quality",
      MessageText: "This feature should be disabled, to prevent the directory-schema to become incompatible with LDAP standards."},

And I have no idea how to figure out whether these strings are being copied into the heap.
But the good news is that the compiler performs string de-duplication: for example, the string "Mild Error" appears in hundreds of test cases but only once in the whole program code. Tested with strings | grep 'Mild Error'. I think that's a good sign.

Frank Jüdes

Nov 2, 2022, 2:47:39 AM
to golang-nuts
Just had an idea: I printed the address of the initialized variable and, for comparison, the address of another variable that was created on the heap:
Address of package variable: 0x818710
Address of Program variable: 0xc000010028
IMHO it is a safe assumption that the initialized data structure is not located on the heap, but somewhere in a text segment.
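
A caveat on this kind of measurement: the address of a variable is the address of the variable itself (for a struct, the fields including the string header; for a map, just a pointer to the map data), not the address of the string bytes. The sketch below prints both so they can be compared. It assumes Go 1.20 or newer; PackageVar, bytesAddr and structAddr are hypothetical names standing in for the generated data. No particular addresses are implied, since they vary by platform and build.

```go
package main

import (
	"fmt"
	"unsafe"
)

type MyStruct struct {
	Text  string
	Count int32
}

// PackageVar is a hypothetical stand-in for the generated package-level data.
var PackageVar = MyStruct{Text: "Hello World!", Count: 20}

// bytesAddr returns the address of the first byte of a non-empty string.
// unsafe.StringData requires Go 1.20 or newer.
func bytesAddr(s string) uintptr {
	return uintptr(unsafe.Pointer(unsafe.StringData(s)))
}

// structAddr returns the address of the struct itself, which is where the
// string header is stored.
func structAddr(p *MyStruct) uintptr {
	return uintptr(unsafe.Pointer(p))
}

func main() {
	// Allocated with new and passed to Printf below, so with the gc
	// compiler this value typically ends up on the heap.
	heapVar := new(MyStruct)
	*heapVar = PackageVar

	fmt.Printf("package variable:     %#x\n", structAddr(&PackageVar))
	fmt.Printf("heap variable:        %p\n", heapVar)
	fmt.Printf("package var's bytes:  %#x\n", bytesAddr(PackageVar.Text))
	fmt.Printf("heap copy's bytes:    %#x\n", bytesAddr(heapVar.Text))
}
```

The last two lines are the interesting ones: if they print the same address, the heap copy of the struct still refers to the single original copy of the text.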

jake...@gmail.com

Nov 2, 2022, 8:13:38 PM
to golang-nuts
Just to add a tidbit to what Jan said. The key here is that strings (type string) in Go are immutable, whereas strings ("char *"-based types) in C are not. That is why the same string can be used again and again without ever needing to be copied, and why they can live in the text segment.
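
The effect of immutability can be sketched in a few lines: slicing a string produces a new header that points into the same bytes, while a conversion to []byte has to copy, because a byte slice is mutable. This is only an illustration, assuming Go 1.20 or newer and the gc toolchain; bytesAddr is a hypothetical helper, not part of any library.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesAddr returns the address of the first byte of a non-empty string.
// unsafe.StringData requires Go 1.20 or newer.
func bytesAddr(s string) uintptr {
	return uintptr(unsafe.Pointer(unsafe.StringData(s)))
}

func main() {
	s := "Hello World!"

	// A substring is a new header pointing into the same bytes.
	sub := s[6:] // "World!"
	fmt.Println("offset of substring:", bytesAddr(sub)-bytesAddr(s)) // expected: 6 with the gc toolchain

	// Converting to []byte copies the bytes, so changing the slice
	// leaves the original string untouched.
	b := []byte(s)
	b[0] = 'J'
	fmt.Println(s, "/", string(b)) // prints "Hello World! / Jello World!"
}
```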

Konstantin Khomoutov

Nov 3, 2022, 3:41:33 PM
to Frank Jüdes, golang-nuts
On Tue, Nov 01, 2022 at 11:18:48AM -0700, Frank Jüdes wrote:

> I have to write a program that should verify the content of
> configuration-servers. It will need a lot of pre-initialized data to verify
> the actual content of a server, so it has to initialize many strings. What
> I know from other C-style languages is that code like
>
> var MyString *string = 'Hello World!';
>
> Will result in having two identical copies of the string »Hello World!« in
> the memory: The first one within the program-code itself and a second copy
> somewhere in the heap-memory.
[...]
> Will there be two copies of the string »Hello World!« in the memory or just
> one? As said, my code will contain a gazillion pre-defined string
> variables and having each string allocate two copies of itself in memory
> would be bad for small systems.

In addition to what others have said, please note that you can analyze the
generated executable image file with the usual tooling such as `objdump` and
`nm`, as well as with Go's native `go tool objdump` and
`go build -gcflags=-S`, the latter producing the assembly code.
