
Is a char array initializer a "string literal"?


Juha Nieminen

Jul 26, 2022, 3:57:48 AM
In another forum I was discussing string literals and char arrays.
During the conversation this question occurred to me:
In an array initialization like this:

char str[] = "hello";

can that "hello" even be considered a "string literal", or should it
be classified as something else?

The question arose because an (actual) string literal has a type of
array-of-const-char (of a size needed to store the contents of the
string literal). So for example the type of "hello" is const char[6].

However, you can't initialize an array with another array. Given that
fact, that would mean that the "hello" in

char str[] = "hello";

is not a const char[6] (because you can't initialize 'str' with one).
It's something else. In fact, the above is essentially just syntactic
sugar for:

char str[] = { 'h', 'e', 'l', 'l', 'o', '\0' };

Thus, should that "hello", in this context, be considered an initializer
list rather than a "string literal"?
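
The claimed equivalence of results is easy to check. The following is only a minimal sketch: the function name `literal_matches_brace_list` is made up for illustration, and it shows that the two forms produce identical arrays, not how the standard classifies the initializer.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper, for illustration only.
// Returns true when both initialization forms yield byte-identical arrays.
bool literal_matches_brace_list() {
    char from_literal[] = "hello";
    char from_braces[]  = { 'h', 'e', 'l', 'l', 'o', '\0' };

    // Both arrays get 6 elements: five letters plus the terminating null.
    static_assert(sizeof(from_literal) == 6, "size deduced from the literal");
    static_assert(sizeof(from_braces) == 6, "size deduced from the list");

    assert(from_literal[5] == '\0');
    return std::memcmp(from_literal, from_braces, sizeof from_literal) == 0;
}
```

Calling `literal_matches_brace_list()` from `main` should return `true` on a conforming compiler.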

(Yes, I wouldn't be surprised if the standard still calls it a "string
literal". But even standards are flawed and ambiguous sometimes. No
standard is absolutely perfect.)

Mut...@dastardlyhq.com

Jul 26, 2022, 4:03:17 AM
On Tue, 26 Jul 2022 07:57:32 -0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>In another forum I was discussing string literals and char arrays.
>During the conversation this question occurred to me:
>In an array initialization like this:
>
> char str[] = "hello";
>
>can that "hello" even be considered a "string literal", or should it
>be classified as something else?

It's a literal and it's a string.

On a side note, it's interesting how many people don't know the critical
difference between

char str[] = "hello";

and

char *str = "hello";

in C and C++.
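
A minimal sketch of that difference, assuming a C++11 or later compiler (the function and variable names are invented for this example). Note that in C++11 and later the pointer has to be `const char*`; the non-const form shown above is only accepted by C and by pre-C++11 compilers.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical demo function; the names are illustrative.
char modify_array_copy() {
    char arr[] = "hello";        // arr is its own array of 6 chars, initialized from the literal
    const char* ptr = "hello";   // ptr points at the literal's storage, which must not be modified
    // char* bad = "hello";      // ill-formed since C++11: no conversion from a literal to char*

    static_assert(sizeof(arr) == 6, "array size includes the terminating null");
    static_assert(sizeof(ptr) == sizeof(const char*), "size of a pointer, not of the text");

    arr[0] = 'j';                // fine: modifies the local array only
    assert(std::strcmp(arr, "jello") == 0);
    assert(std::strcmp(ptr, "hello") == 0);
    return arr[0];
}
```

The array form gives a writable copy with a known size; the pointer form only names storage that belongs to the literal.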

Öö Tiib

Jul 26, 2022, 4:51:09 AM
On Tuesday, 26 July 2022 at 10:57:48 UTC+3, Juha Nieminen wrote:
> In another forum I was discussing string literals and char arrays.
> During the conversation this question occurred to me:
> In an array initialization like this:
>
> char str[] = "hello";
>
> can that "hello" even be considered a "string literal", or should it
> be classified as something else?
>
> The question arose because an (actual) string literal has a type of
> array-of-const-char (of a size needed to store the contents of the
> string literal). So for example the type of "hello" is const char[6].
>
> However, you can't initialize an array with another array. Given that
> fact, that would mean that the "hello" in
>
> char str[] = "hello";
>
> is not a const char[6] (because you can't initialize 'str' with one).
> It's something else. In fact, the above is essentially just syntactic
> sugar for:
>
> char str[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
>
> Thus, should that "hello", in this context, be considered an initializer
> list rather than a "string literal"?

No. One is a string literal used in array initialization (a special case for
character and wide character arrays); the other is array initialization from a
brace-enclosed list. Both are also aggregate initializations, as
arrays are aggregates.

C++ has about a dozen ways to initialize something, and a lot
of those were made to have confusingly similar syntax (that is the
dreaded "uniform" initialization), so being confused sometimes is
unavoidable. Just design classes in a way that only a narrow subset
is usable.

>
> (Yes, I wouldn't be surprised if the standard still calls it a "string
> literal". But even standards are flawed and ambiguous sometimes. No
> standard is absolutely perfect.)

The str there is not a string literal in either case, and you have no way
to interact with the "hello" before or after that initialization, so it is
hard to see how it can be different from a string literal under the as-if rule. I
have never had an issue with it not behaving the way a string literal is promised
to behave in said context.

Bo Persson

Jul 26, 2022, 5:12:30 AM
In the grammar "hello" is a string-literal (with a dash).

http://eel.is/c++draft/lex.string


You also have basic_string literals, with an s at the end - "hello"s.

http://eel.is/c++draft/basic.string.literals


Perhaps they are the real string literals? ;-)

Or should we say std::string literals?


Malcolm McLean

Jul 26, 2022, 5:30:19 AM
C doesn't really have strings. It just has char arrays. Whilst strings are pretty
much always implemented as char arrays in any language, in most languages
you have a "string type", and you can copy, assign, concatenate and so on
with inbuilt language features. C doesn't do this.
The exception is that string literals (double quoted text in source code) are
allowed as special syntax for creating a string.

Alf P. Steinbach

Jul 26, 2022, 5:36:45 AM
It's a string literal with all the rules of string literals, but there's
a special case exception allowing this usage of initializing a `char` or
other character type array; in this context it doesn't decay to pointer.

C++ requires that the array that's initialized is of sufficient size for
the whole string including the zero-terminator, while C, in my opinion
more practically oriented, does not require that.

When one is primarily concerned with not throwing away the compile time
string size information one can just declare a reference, like

const auto& str = "hello";

... or perhaps a `string_view`, which has `constexpr` construction,

const string_view str = "hello";

Sadly, students are still taught to declare pointers as a way of naming
string literals.
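
What each of those declarations keeps can be checked at compile time. This is a sketch under the assumption of a C++17 compiler; the variable names and the helper `all_agree` are invented for the example.

```cpp
#include <cassert>
#include <string_view>
#include <type_traits>

const auto& ref = "hello";                   // reference to the literal's array
constexpr std::string_view view = "hello";   // length computed at compile time
const char* ptr = "hello";                   // size information is lost

static_assert(std::is_same_v<decltype(ref), const char (&)[6]>,
              "auto is deduced as const char[6]");
static_assert(sizeof(ref) == 6, "array size, including the terminating null");
static_assert(view.size() == 5, "string_view length excludes the terminating null");
static_assert(sizeof(ptr) == sizeof(const char*), "only the pointer size remains");

// Hypothetical runtime check: all three name the same five characters.
bool all_agree() {
    assert(view == ref);   // ref converts to a string_view for the comparison
    return view == ptr;
}
```

With the pointer, the length has to be recomputed at run time (for example with `std::strlen`); the reference and the `string_view` carry it along.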

- Alf

Juha Nieminen

Jul 26, 2022, 6:08:08 AM
Öö Tiib <oot...@hot.ee> wrote:
> The str there is not a string literal in either case, and you have no way
> to interact with the "hello" before or after that initialization, so it is
> hard to see how it can be different from a string literal under the as-if rule.

By the fact that a string literal is of type array-of-const-char, and that
you can't initialize an array with another array (even if it's a temporary).

If you, for example, print the value of sizeof("hello"), you'll get 6 as
the answer because "hello" is of type const char[6]. But quite clearly
it is not in:

char str[] = "hello";

because you can't initialize an array with an array.

OTOH, perhaps I'm approaching this completely incorrectly. Perhaps
"string literal" shouldn't be interpreted as "a literal that's an
array of const char".

Instead, perhaps it should be thought of as: "The character
sequence '"hello"' appearing in the source code (after preprocessing)
is a string literal. How that string literal is interpreted by
the compiler depends on the particular context in which it appears.
Depending on the context it can be an array of const char, or an
array initializer list."

Juha Nieminen

Jul 26, 2022, 6:19:16 AM
Malcolm McLean <malcolm.ar...@gmail.com> wrote:
> C doesn't really have strings. It just has char arrays. Whilst strings are pretty
> much always implemented as char arrays in any language, in most languages
> you have a "string type", and you can copy, assign, concatenate and so on
> with inbuilt language features. C doesn't do this.
> The exception is that string literals (double quoted text in source code) are
> allowed as special syntax for creating a string.

That's not really relevant to what I was asking.

Besides, the C standard calls them "strings" throughout. For example,
it says that the 'argv' value that the main() function gets is an array
of pointers to strings. On a particularly curious note, in another part it
says:

"A character string literal need not be a string (see 7.1.1), because
a null character may be embedded in it by a \0 escape sequence."

7.1.1 defines: "A string is a contiguous sequence of characters
terminated by and including the first null character."
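
A small sketch of what that quoted wording means in practice (the function name is made up for the example): a literal with an embedded `\0` stores all of its characters, but the "string" in the 7.1.1 sense ends at the first null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: length of the string (in the quoted 7.1.1 sense)
// that starts at the beginning of a literal with an embedded null.
std::size_t string_length_of_embedded_null_literal() {
    // The literal has 6 elements: 'a', 'b', '\0', 'c', 'd' and the terminating '\0'.
    static_assert(sizeof("ab\0cd") == 6, "the whole literal is stored");

    // The characters after the embedded null are still part of the array.
    assert("ab\0cd"[3] == 'c');

    // strlen stops at the first null character, so it sees only "ab".
    return std::strlen("ab\0cd");
}
```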

Juha Nieminen

Jul 26, 2022, 6:23:02 AM
Alf P. Steinbach <alf.p.s...@gmail.com> wrote:
> When one is primarily concerned with not throwing away the compile time
> string size information one can just declare a reference, like
>
> const auto& str = "hello";

Well, there's a quiz question if I ever saw one. What does 'auto' expand
to there?

Honestly, I wouldn't be certain without a bit further research or
testing.

Öö Tiib

Jul 26, 2022, 8:10:12 AM
On Tuesday, 26 July 2022 at 13:08:08 UTC+3, Juha Nieminen wrote:
> Öö Tiib <oot...@hot.ee> wrote:
> > The str there is not a string literal in either case, and you have no way
> > to interact with the "hello" before or after that initialization, so it is
> > hard to see how it can be different from a string literal under the as-if rule.
> By the fact that a string literal is of type array-of-const-char, and that
> you can't initialize an array with another array (even if it's a temporary).

I am allowed by the standard to initialize a character array with a string literal
(and a wide character array with a wide string literal).

> If you, for example, print the value of sizeof("hello"), you'll get 6 as
> the answer because "hello" is of type const char[6]. But quite clearly
> it is not in:
>
> char str[] = "hello";
>
> because you can't initialize an array with an array.

This is the third time I say that the standard promises that we can.
(The first time was in my previous reply.) The easiest place to see it is cppreference,
as its text is easier to read than that of the standard and it is readily available
online.
<https://en.cppreference.com/w/c/language/array_initialization>

> OTOH, perhaps I'm approaching this completely incorrectly. Perhaps
> "string literal" shouldn't be interpreted as "a literal that's an
> array of const char".

I suspect you are treating initialization as assignment. Some of the (many)
ways of initialization look slightly like assignment. We can't assign an array or
pass it by value unless it is a member of a class ... but that does not
restrict the use of string literals for initialization.

> Instead, perhaps it should be thought of as: "The character
> sequence '"hello"' appearing in the source code (after preprocessing)
> is a string literal. How that string literal is interpreted by
> the compiler depends on the particular context in which it appears.
> Depending on the context it can be an array of const char, or an
> array initializer list."

No, it is not an initializer list. C++ can't be simplified much. It just has a
whole pile of different things in it, some of which look confusingly
similar but are not. A string literal and an initializer list look very different
and are very different. For example:

auto str = { 'h', 'e', 'l', 'l', 'o', '\0' };

That is required to result in str being std::initializer_list<char>.

auto str = "hello";

That is required to result in str being char const *.
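
Both deductions can be verified with `static_assert`. This is a minimal sketch assuming C++17; the function and variable names are made up for the example.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

// Hypothetical check of what `auto` deduces for the two initializers.
bool check_auto_deduction() {
    auto from_braces  = { 'h', 'e', 'l', 'l', 'o', '\0' };
    auto from_literal = "hello";

    static_assert(std::is_same_v<decltype(from_braces), std::initializer_list<char>>,
                  "a braced list with copy-initialization gives an initializer_list");
    static_assert(std::is_same_v<decltype(from_literal), const char*>,
                  "the literal's array decays to a pointer");

    assert(from_braces.size() == 6);
    return from_literal[0] == 'h';
}
```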

Öö Tiib

Jul 26, 2022, 9:19:02 AM
On Tuesday, 26 July 2022 at 15:10:12 UTC+3, Öö Tiib wrote:
>
> <https://en.cppreference.com/w/c/language/array_initialization>

Oops, I gave the wrong page; that one is about C. The C++ one is
<https://en.cppreference.com/w/cpp/language/aggregate_initialization>
Scroll down to Character arrays.

Alf P. Steinbach

Jul 26, 2022, 10:50:44 AM
It's a reference to the array.

- Alf

Juha Nieminen

Jul 26, 2022, 12:03:23 PM
Öö Tiib <oot...@hot.ee> wrote:
> No, it is not an initializer list. C++ can't be simplified much. It just has a
> whole pile of different things in it, some of which look confusingly
> similar but are not. A string literal and an initializer list look very different
> and are very different. For example:

Just because two things "look different" doesn't mean they are. One can
perfectly well be just syntactic sugar that means the same as the other.

> auto str = { 'h', 'e', 'l', 'l', 'o', '\0' };
>
> That is required to result in str being std::initializer_list<char>.
>
> auto str = "hello";
>
> That is required to result in str being char const *.

Actually I think 'auto' in the last case will expand to char[6],
not const char*. That's because the type of "hello" is char[6].

Alf P. Steinbach

Jul 26, 2022, 12:18:41 PM
`auto` on its own is roughly like `std::decay` of the initializer type,
<url: https://en.cppreference.com/w/cpp/types/decay>,

arrays decay to pointers, functions decay to pointers, and top level
const/volatile qualification is removed.

But if you make a reference `auto&`, then there must be a more exact
match with the initializer type.
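
A sketch of those rules, assuming C++17. The function `answer` and the variable names are invented for the example; it covers the three decay cases (array, function, top-level const) and the corresponding `auto&` forms.

```cpp
#include <cassert>
#include <type_traits>

int answer() { return 42; }   // hypothetical function, used only as an initializer

bool check_decay_rules() {
    const int ci = 7;

    auto  a1 = "hello";   // array decays:            const char*
    auto& r1 = "hello";   // no decay:                const char (&)[6]
    auto  a2 = answer;    // function decays:         int (*)()
    auto& r2 = answer;    // no decay:                int (&)()
    auto  a3 = ci;        // top-level const dropped: int
    auto& r3 = ci;        // const kept:              const int&

    static_assert(std::is_same_v<decltype(a1), std::decay_t<const char[6]>>);
    static_assert(std::is_same_v<decltype(a1), const char*>);
    static_assert(std::is_same_v<decltype(r1), const char (&)[6]>);
    static_assert(std::is_same_v<decltype(a2), int (*)()>);
    static_assert(std::is_same_v<decltype(r2), int (&)()>);
    static_assert(std::is_same_v<decltype(a3), int>);
    static_assert(std::is_same_v<decltype(r3), const int&>);

    assert(a2() == 42 && r2() == 42);
    return a1[0] == 'h' && r1[0] == 'h' && a3 == r3;
}
```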


- Alf

Keith Thompson

Jul 26, 2022, 3:03:23 PM
Malcolm McLean <malcolm.ar...@gmail.com> writes:
[...]
> C doesn't really have strings.

Yes it does.

The C standard defines a "string" as "a contiguous sequence of
characters terminated by and including the first null character".

C has no string *type*, but it certainly does have strings.

(C++ has the same thing, but with different terminology. And of course
C++ also has the type std::string.)

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Keith Thompson

Jul 26, 2022, 3:13:35 PM
Juha Nieminen <nos...@thanks.invalid> writes:
> In another forum I was discussing string literals and char arrays.
> During the conversation this question occurred to me:
> In an array initialization like this:
>
> char str[] = "hello";
>
> can that "hello" even be considered a "string literal", or should it
> be classified as something else?

It's a string literal.

A string literal is a syntactic construct, a sequence of characters
in a source file. "hello" unquestionably meets the definition of
*string-literal* given in the C++ grammar.

A char array is an object that can exist during program execution.

> The question arose because an (actual) string literal has a type of
> array-of-const-char (of a size needed to store the contents of the
> string literal). So for example the type of "hello" is const char[6].

Yes.

> However, you can't initialize an array with another array. Given that
> fact, that would mean that the "hello" in
>
> char str[] = "hello";
>
> is not a const char[6] (because you can't initialize 'str' with one).

There's a specific rule that allows a char array to be initialized with
a string literal. See [dcl.init.string] in the standard.

The string literal "hello" is always of type `const char[6]`.
In most contexts, it "decays" to a pointer expression of type `const
char*` which evaluates to the address of the initial character
of the array. A string literal used as an initializer for a
character array object is one of the exceptions, where it does not
"decay". The array value is used to initialize the array object.
(Another exception is the operand of `sizeof`.)
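
The two exceptions named here can be observed directly. This is a minimal sketch with a made-up function name; unary `+` is used only as a convenient way to force the array-to-pointer conversion.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper showing where "hello" keeps its array type.
std::size_t literal_array_size() {
    // Operand of sizeof: no decay, so this is the size of the array.
    static_assert(sizeof("hello") == 6, "const char[6]");

    // Unary plus forces the decay, so this is the size of a pointer.
    static_assert(sizeof(+"hello") == sizeof(const char*), "const char*");

    // Initializer of a char array: no decay, the elements are copied.
    char str[] = "hello";
    assert(str[0] == 'h' && str[5] == '\0');
    return sizeof(str);
}
```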

> It's something else. In fact, the above is essentially just syntactic
> sugar for:
>
> char str[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
>
> Thus, should that "hello", in this context, be considered an initializer
> list rather than a "string literal"?

No, it's a string literal. The fact that it happens to be equivalent to
some other syntactic construct does not change that fact.

> (Yes, I wouldn't be surprised if the standard still calls it a "string
> literal". But even standards are flawed and ambiguous sometimes. No
> standard is absolutely perfect.)

Yes, standards are flawed and ambiguous, but this is not a flaw or an
ambiguity.

Andrey Tarasevich

Jul 26, 2022, 8:57:08 PM
On 7/26/2022 12:57 AM, Juha Nieminen wrote:
> In another forum I was discussing string literals and char arrays.
> During the conversation this question occurred to me:
> In an array initialization like this:
>
> char str[] = "hello";
>
> can that "hello" even be considered a "string literal", or should it
> be classified as something else?

Yes, it is a string literal. I.e. formally it is an independent object
with static storage duration, independent and separate from `str` in
your example.

> The question arose because an (actual) string literal has a type of
> array-of-const-char (of a size needed to store the contents of the
> string literal). So for example the type of "hello" is const char[6].
>
> However, you can't initialize an array with another array.

But you _can_ initialize a char array with a string literal. A special
exception is deliberately made for this specific case in C and C++
standards. They explicitly state that you _can_ initialize a char array
with a string literal and explicitly specify the semantics of such
initialization.
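
A short sketch of those semantics in C++ (the function and variable names are invented for the example): the characters are copied into the array, any remaining elements are zero-initialized, and an array too small for the terminating null is rejected by C++.

```cpp
#include <cassert>

// Hypothetical check of char-array initialization from a string literal.
bool check_array_from_literal() {
    char exact[6]  = "hello";    // exactly enough room for the terminating null
    char larger[8] = "hello";    // remaining elements are zero-initialized
    // char small[5] = "hello";  // ill-formed in C++; C accepts it and drops the null

    assert(exact[5] == '\0');
    assert(larger[5] == '\0' && larger[6] == '\0' && larger[7] == '\0');

    larger[0] = 'J';             // the array is an independent, writable copy
    return larger[0] == 'J' && exact[0] == 'h';
}
```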

--
Best regards,
Andrey

Keith Thompson

Jul 26, 2022, 10:24:49 PM
Andrey Tarasevich <andreyta...@hotmail.com> writes:
> On 7/26/2022 12:57 AM, Juha Nieminen wrote:
>> In another forum I was discussing string literals and char arrays.
>> During the conversation this question occurred to me:
>> In an array initialization like this:
>> char str[] = "hello";
>> can that "hello" even be considered a "string literal", or should it
>> be classified as something else?
>
> Yes, it is a string literal. I.e. formally it is an independent object
> with static storage duration, independent and separate from `str` in
> your example.

It is a string literal, i.e., it is a source code token starting and
ending with a '"' character. The corresponding object with static
storage duration is not a string literal. It doesn't exist until
execution time.

[...]

Juha Nieminen

Jul 27, 2022, 3:41:17 AM
Alf P. Steinbach <alf.p.s...@gmail.com> wrote:
> `auto` on its own is roughly like `std::decay` of the initializer type,
> <url: https://en.cppreference.com/w/cpp/types/decay>,
>
> But if you make a reference `auto&`, then there must be a more exact
> match with the initializer type.

Is there a good reason that they added such confusing needless complexity
to the keyword? What's the purpose of such inconsistent behavior?

Öö Tiib

Jul 27, 2022, 4:17:25 AM
They reused the rules of template argument deduction. I am not 100% sure
why, but most likely it was in the hope that both would be easier to learn and
to implement when both are the same.

Template arguments are deduced from the function call arguments. We can pass
an array to a function as an argument, but it will decay into a pointer. We can take
a reference to an array, but then the template also needs to indicate that the
parameter takes reference with that &.
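
The parallel with templates can be sketched as follows, assuming C++17. The templates `size_by_value` and `size_by_reference` and the helper `deduction_matches_auto` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// By-value parameter: an array argument decays, so T is deduced as a pointer.
template <typename T>
constexpr std::size_t size_by_value(T) { return sizeof(T); }

// By-reference parameter: T is deduced as the array type itself.
template <typename T>
constexpr std::size_t size_by_reference(T&) { return sizeof(T); }

static_assert(size_by_value("hello") == sizeof(const char*), "T is const char*");
static_assert(size_by_reference("hello") == 6, "T is const char[6]");

// Hypothetical helper: `auto` and `auto&` follow the same two rules.
bool deduction_matches_auto() {
    auto  by_value = "hello";   // like size_by_value:     const char*
    auto& by_ref   = "hello";   // like size_by_reference: const char (&)[6]
    assert(sizeof(by_value) == size_by_value("hello"));
    return sizeof(by_ref) == size_by_reference("hello");
}
```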

Malcolm McLean

Jul 27, 2022, 4:56:15 AM
On Tuesday, 26 July 2022 at 20:03:23 UTC+1, Keith Thompson wrote:
> Malcolm McLean <malcolm.ar...@gmail.com> writes:
> [...]
> > C doesn't really have strings.
> Yes it does.
>
> The C standard defines a "string" as "a contiguous sequence of
> characters terminated by and including the first null character".
> C has no string *type*, but it certainly does have strings.
>
The C standard uses a lot of words in a special way. Like "function".
"string" is another of those words.

Keith Thompson

Jul 27, 2022, 2:03:12 PM
Yes, and when you refuse to acknowledge that the standard has its own
meanings for those words, you cause confusion.

It would help if you'd at least acknowledge in the first place that you
have your own meanings for words.

(Since this is comp.lang.c++, let's drop this.)

Andrey Tarasevich

Jul 27, 2022, 4:05:57 PM
On 7/26/2022 7:24 PM, Keith Thompson wrote:
> Andrey Tarasevich <andreyta...@hotmail.com> writes:
>> On 7/26/2022 12:57 AM, Juha Nieminen wrote:
>>> In another forum I was discussing string literals and char arrays.
>>> During the conversation this question occurred to me:
>>> In an array initialization like this:
>>> char str[] = "hello";
>>> can that "hello" even be considered a "string literal", or should it
>>> be classified as something else?
>>
>> Yes, it is a string literal. I.e. formally it is an independent object
>> with static storage duration, independent and separate from `str` in
>> your example.
>
> It is a string literal, i.e., it is a source code token starting and
> ending with a '"' character. The corresponding object with static
> storage duration is not a string literal. It doesn't exist until
> execution time.

No. The distinction is there, but it is different.

While it is true that the standard text differentiates between
lexical/grammatical `string-literal` (hyphenated) and "string literal
object", the former is generally treated as an expression, not as a mere
token in source code. For example, in 7.5.1: "[...] A string-literal is
an lvalue. [...]".

A "string literal" is an expression that evaluates to a "string literal
object".

Also, Note 7 on 5.13.5 (admittedly non-normative) goes as far as stating
"The effect of attempting to modify a string-literal is undefined." I
hope you understand that this is not an attempt to dissuade us from
modifying the source code of our programs.
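
That a string-literal is an lvalue expression can be observed with `decltype` and the address-of operator. This is a sketch assuming C++17, with an invented function name.

```cpp
#include <cassert>
#include <type_traits>

// decltype of an lvalue expression of type T yields T&.
static_assert(std::is_same_v<decltype("hello"), const char (&)[6]>,
              "the literal is an lvalue of type const char[6]");

// Being an lvalue, the literal can have its address taken.
static_assert(std::is_same_v<decltype(&"hello"), const char (*)[6]>,
              "pointer to the whole array");

// Hypothetical runtime check through a pointer to the literal's array.
bool address_of_literal_is_usable() {
    const char (*p)[6] = &"hello";
    assert((*p)[5] == '\0');
    return (*p)[0] == 'h';
}
```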

There's nothing wrong with taking the same terminological liberties in a
Usenet discussion.

--
Best regards,
Andrey

Keith Thompson

Jul 27, 2022, 6:38:24 PM
Andrey Tarasevich <andreyta...@hotmail.com> writes:
> On 7/26/2022 7:24 PM, Keith Thompson wrote:
>> Andrey Tarasevich <andreyta...@hotmail.com> writes:
>>> On 7/26/2022 12:57 AM, Juha Nieminen wrote:
>>>> In another forum I was discussing string literals and char arrays.
>>>> During the conversation this question occurred to me:
>>>> In an array initialization like this:
>>>> char str[] = "hello";
>>>> can that "hello" even be considered a "string literal", or should it
>>>> be classified as something else?
>>>
>>> Yes, it is a string literal. I.e. formally it is an independent object
>>> with static storage duration, independent and separate from `str` in
>>> your example.
>> It is a string literal, i.e., it is a source code token starting and
>> ending with a '"' character. The corresponding object with static
>> storage duration is not a string literal. It doesn't exist until
>> execution time.
>
> No. The distinction is there, but it is different.

I don't think it's different.

> While it is true that the standard text differentiates between
> lexical/grammatical `string-literal` (hyphenated) and "string literal
> object", the former is generally treated as an expression, not as a
> mere token in source code. For example, in 7.5.1: "[...] A
> string-literal is an lvalue. [...]".

It's a token and it's an expression. "A *literal* is a primary
expression" [expr.prim.literal].

> A "string literal" is an expression that evaluates to a "string
> literal object".

Yes. How does that disagree with what I wrote? (A string literal
object is not a string literal.)

> Also, Note 7 on 5.13.5 (admittedly non-normative) goes as far as
> stating "The effect of attempting to modify a string-literal is
> undefined." I hope you understand that this is not an attempt to
> dissuade us from modifying the source code of our programs.

Yes. It would have been more precise to say that "The effect of
attempting to modify a string literal object is undefined." The current
wording is informal, and I don't have a huge problem with it.

> There's nothing wrong with taking the same terminological liberties in a
> Usenet discussion.

Unless the point of confusion in the original post is precisely the
distinction I pointed out.