Can a local string literal have different addresses every time the enclosing function is called?

Brian Bi

Aug 13, 2017, 9:48:33 PM
to std-dis...@isocpp.org
According to the current draft, "whether successive evaluations of a string-literal yield the same or a different object is unspecified".

This wording seems to suggest that in the following program, two different addresses may be printed:

#include <iostream>
void foo() {
    const char* bar = "bar";
    std::cout << (void*)bar << '\n';
}
int main() {
    foo();
    foo();
}

This is based on the interpretation that the string-literal is evaluated once every time the function foo() is called.

However, the same paragraph also says that "Evaluating a string-literal results in a string literal object with static storage duration". This suggests that the evaluation of a local string literal happens once, as if it were any other static local variable. After all, surely it was not intended that a program can contain an unbounded number of copies of the string literal object that all have static storage duration.

I wonder whether a wording change is needed to clarify the situation?
--
Brian Bi

Bo Persson

Aug 13, 2017, 10:18:53 PM
to std-dis...@isocpp.org
On 2017-08-14 03:48, Brian Bi wrote:
> According to the current draft, "whether successive evaluations of a
> string-literal <http://eel.is/c++draft/lex.string#nt:string-literal>
> yield the same or a different object is unspecified".
>
> This wording seems to suggest that in the following program, two
> different addresses may be printed:
>
> #include <iostream>
> void foo() {
>     const char* bar = "bar";
>     std::cout << (void*)bar << '\n';
> }
> int main() {
>     foo();
>     foo();
> }
>
> This is based on the interpretation that the string-literal is evaluated
> once every time the function foo() is called.
>
> However, the same paragraph also says that "Evaluating a string-literal
> <http://eel.is/c++draft/lex.string#nt:string-literal> results in a
> string literal object with static storage duration". This suggests that
> the evaluation of a local string literal happens once, as if it were any
> other static local variable. After all, surely it was not intended that
> a program can contain an unbounded number of copies of the string
> literal object that all have static storage duration.
>
> I wonder whether a wording change is needed to clarify the situation?


I believe the intention is to clarify this case

#include <iostream>

int main()
{
    const char* bar1 = "bar";
    const char* bar2 = "bar";

    std::cout << (void*)bar1 << ' ' << (void*)bar2 << '\n';
}

and that bar1 and bar2 might, or might not, contain the same address.



Bo Persson


Richard Smith

Aug 13, 2017, 10:19:50 PM
to std-dis...@isocpp.org
On 13 August 2017 at 18:48, Brian Bi <bbi...@gmail.com> wrote:
> According to the current draft, "whether successive evaluations of a string-literal yield the same or a different object is unspecified".
>
> This wording seems to suggest that in the following program, two different addresses may be printed:
>
> #include <iostream>
> void foo() {
>     const char* bar = "bar";
>     std::cout << (void*)bar << '\n';
> }
> int main() {
>     foo();
>     foo();
> }
>
> This is based on the interpretation that the string-literal is evaluated once every time the function foo() is called.

Yes, that's correct. See CWG 1823.
 
> However, the same paragraph also says that "Evaluating a string-literal results in a string literal object with static storage duration". This suggests that the evaluation of a local string literal happens once, as if it were any other static local variable.

Is that based on something in the wording or on your expectation for how the rule would work? (If the former, we likely have some wording to fix.)
 
> After all, surely it was not intended that a program can contain an unbounded number of copies of the string literal object that all have static storage duration.

Actually, yes, that is intended. In practice, if an inline function or function template specialization contains a string-literal, then in some implementations, each translation unit will see a different string-literal object. That could be avoided by an implementation change (and an ABI change), but we did not deem the implementation cost and negative impact on the resulting program quality to be justified by the questionable benefits of guaranteeing that you get the same string literal object each time. So we chose to instead say that implementations that give different string literal objects in different evaluations are conforming.

Rather than add a special-case rule that only applies in the situations where you would need a mangled name or similar for the string literal, we chose to use a general rule to keep the language semantics as simple as we could. (This latitude also incidentally means we can write a sanitizer-esque mode for a compiler that intentionally gives different values sometimes.)
 
> I wonder whether a wording change is needed to clarify the situation?

The current wording reflects the considered intent, and it sounds like you did interpret it correctly (but found it surprising). It's probably not worth further discussion in CWG unless there's new information of some kind to discuss. But I would accept a pull request adding an example :)