Arrays and C++

arnuld

Oct 12, 2014, 1:26:54 PM
AIM: To understand why "arrays of int/float" and "arrays of char" behave
differently in C++


#include <iostream>

int main()
{
int arr1[] = {10,11,12};
char arr2[] = {'a','b','c'};
double arr3[] = {0,10, 0.11, 0.12};
const char* str = "comp.lang.c++";
const wchar_t* alpha = L"first line"
"Second Line";

std::cout << "arr1 = " << arr1 << std::endl;
std::cout << "arr2 = " << arr2 << std::endl;
std::cout << "arr3 = " << arr3 << std::endl;
std::cout << "str = " << str << std::endl;
std::wcout << "alpha = " << alpha << std::endl;

return 0;
}
========================= OUTPUT ================================
[arnuld@arch64 c++]$ g++ -ansi -pedantic -Wall -Wextra arrays.cpp
[arnuld@arch64 c++]$ ./a.out
arr1 = 0x7fffffffe960
arr2 = abc
arr3 = 0x7fffffffe930
str = comp.lang.c++
alpha = first lineSecond Line
[arnuld@arch64 c++]$


An array name is converted to a pointer to its first element. arr1 and arr3
(1st and 3rd variables) are arrays of int and double respectively, and they
behave according to this rule, but arr2 and str (2nd and 4th variables)
do not. Why?

alpha (5th variable) has its initialization spread across 2 lines. I
need to give L only on first line and not on 2nd line. Is this assumption
correct ?




--
arnuld
http://lispmachine.wordpress.com/

Victor Bazarov

Oct 12, 2014, 1:38:17 PM
When you ask "why", you presume that your conclusion is correct. It
isn't. All arrays behave according to the rule.

> alpha (5th variable) has its initialization spread across 2 lines. I
> need to give L only on first line and not on 2nd line. Is this assumption
> correct ?

Yes. Concatenation of the adjacent string literals is done prior to the
syntactic analysis of the code.

V
--
I do not respond to top-posted replies, please don't ask

Barry Schwarz

Oct 12, 2014, 1:46:06 PM
Actually they do. The difference is how the compiler generates code
for the overloaded operator <<. When the right operand has type
pointer to char, the generated code treats the address as the start of
a C style string (similar to using %s in printf). When the operand
has type pointer to "object that cannot be a string," the generated
code treats it the same as using %p in printf.
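
A minimal sketch of the overload behaviour described above, assuming a C++11 or later compiler. The helper names are hypothetical and exist only for this illustration; the output is captured in a std::ostringstream so it can be inspected instead of printed.

```cpp
#include <sstream>
#include <string>

// Helper names below are hypothetical, chosen for this illustration.

// A char array that ends in '\0' decays to char* and is printed as text.
std::string print_char_array() {
    char text[] = "abc";               // the string literal supplies the '\0'
    std::ostringstream out;
    out << text;
    return out.str();
}

// Casting to const void* selects the address-printing overload instead.
std::string print_char_array_address() {
    char text[] = "abc";
    std::ostringstream out;
    out << static_cast<const void*>(text);
    return out.str();
}

// An int array decays to int*, which converts to const void*, so streaming
// the array gives the same text as streaming the address explicitly.
bool int_array_prints_as_address() {
    int values[] = {10, 11, 12};
    std::ostringstream decayed, explicit_address;
    decayed << values;
    explicit_address << static_cast<const void*>(values);
    return decayed.str() == explicit_address.str();
}
```

The cast in print_char_array_address() is the usual way to get the address of a char buffer printed rather than its contents.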

>
>alpha (5th variable) has its initialization spread across 2 lines. I
>need to give L only on first line and not on 2nd line. Is this assumption
>correct ?

Yes. It only looks like the initialization is spread across two
lines. Two string literals separated only by white space are merged
into a single string literal in one of the early compilation phases.
Since '\n' and ' ' are both white space characters, your
initialization of alpha is processed as
L"first lineSecond Line"
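
The merging of adjacent literals can be checked with a small sketch like the one below. It assumes a C++11 or later compiler for the mixed L"..." "..." form, and the function names are hypothetical.

```cpp
#include <string>

// Adjacent string literals are merged into one literal before the
// initializer is parsed; the line break between them is just white space.
// Assumption: C++11 or later, where the unprefixed literal takes the
// encoding prefix of its neighbour.
std::wstring joined_wide() {
    const wchar_t* alpha = L"first line"
                           "Second Line";
    return alpha;
}

// The same merging applies to narrow literals, on one line or several.
std::string joined_narrow() {
    const char* text = "comp." "lang."
                       "c++";
    return text;
}
```

Note that no space or newline is inserted between the pieces; any separator has to be written inside one of the literals.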

--
Remove del for email

Wouter van Ooijen

Oct 12, 2014, 2:12:03 PM
arnuld wrote on 12-Oct-14 7:26 PM:
> AIM: To understand why "arrays of int/float" and "arrays of char" behave
> differently in C++
>
>
> #include <iostream>
>
> int main()
> {
> int arr1[] = {10,11,12};
> char arr2[] = {'a','b','c'};
> double arr3[] = {0,10, 0.11, 0.12};
> const char* str = "comp.lang.c++";
> const wchar_t* alpha = L"first line"
> "Second Line";
>
> std::cout << "arr1 = " << arr1 << std::endl;
> std::cout << "arr2 = " << arr2 << std::endl;

Note that you are doing something dangerous here: you ask a string
(const char pointer) to be printed that is not 0-terminated.

Wouter

JiiPee

Oct 12, 2014, 2:57:41 PM
Are you talking about this:
const char* str = "comp.lang.c++";

?

arnuld

Oct 12, 2014, 3:46:56 PM
> On Sun, 12 Oct 2014 10:45:49 -0700, Barry Schwarz wrote:

> Actually they do. The difference is how the compiler generates code for
> the overloaded operator <<. When the right operand has type pointer to
> char, the generated code treats the address as the start of a C style
> string (similar to using %s in printf). When the operand has type
> pointer to "object that cannot be a string," the generated code treats
> it the same as using %p in printf.

But arr2 is an array of chars not terminated by null character:

char arr2[] = {'a','b','c'};

Still, it prints correctly. How does std::cout know where to stop?
Where can I find the rules regarding this?




--
arnuld
http://lispmachine.wordpress.com/

arnuld

Oct 12, 2014, 4:08:58 PM
> On Sun, 12 Oct 2014 19:57:26 +0100, JiiPee wrote:

> Are you talking about this:
> const char* str = "comp.lang.c++";

No. He was talking about this:

char arr2[] = {'a','b','c'};




--
arnuld
http://lispmachine.wordpress.com/

JiiPee

Oct 12, 2014, 4:28:06 PM
On 12/10/2014 21:08, arnuld wrote:
>> On Sun, 12 Oct 2014 19:57:26 +0100, JiiPee wrote:
>> Are you talking about this:
>> const char* str = "comp.lang.c++";
>
> No. He was talking about this:
>
> char arr2[] = {'a','b','c'};
>
>
>
>

But arr2 is not a const char pointer, is it? It's a char pointer... we can
change its content.

JiiPee

Oct 12, 2014, 4:39:01 PM
On my GCC it does not print arr2 correctly... it adds a line break at the
end of it when I do:

printf("the size of %s is %zu and the length is %zu\n\n",
arr2, sizeof(arr2), strlen(arr2));

If I use arr2[] = "abc" it works.

I guess the behaviour is undefined... on some machines it's correct, on
some not.
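
A short sketch of why the two initializers differ, assuming C++11 or later (for %zu and std::snprintf). The function names are hypothetical. strlen is only called on the terminated array, because calling it on {'a','b','c'} reads past the end.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Initializing from a string literal stores four bytes: 'a', 'b', 'c', '\0'.
std::string describe_terminated() {
    char text[] = "abc";
    char buffer[64];
    // %zu is the matching conversion for size_t values.
    std::snprintf(buffer, sizeof(buffer), "size %zu, length %zu",
                  sizeof(text), std::strlen(text));
    return buffer;
}

// The brace initializer stores exactly three bytes and no terminator,
// so sizeof is safe here but strlen(arr2) would not be.
std::size_t size_of_unterminated() {
    char arr2[] = {'a', 'b', 'c'};
    return sizeof(arr2);
}
```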

Richard Damon

Oct 12, 2014, 5:20:15 PM
It depends on what follows it in memory. Working sometimes and not at
other times is a very possible outcome of undefined behavior (sometimes
undefined behavior does what you want, but don't count on it, which can
make it a very hard bug to find).

Since the array before it holds ints, which will likely be aligned to at
least a 2 byte boundary (more likely 4, maybe 8), and the double array
after it will likely want to be aligned to 4 or 8 bytes too, the 3 byte
char array is almost certainly going to have a padding byte (or more)
added after it. The compiler might set those bytes to 0, or it might
leave them with arbitrary values.

Wouter van Ooijen

Oct 12, 2014, 5:33:29 PM
JiiPee wrote on 12-Oct-14 10:27 PM:
You are correct in that. But the arr2 is still dangerous to print, as
others have already spelled out. Accessing array elements beyond the
array itself (which is what will happen when operator<< tries to print
this char array) is undefined behaviour.
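
For comparison, here is a sketch of three ways to print such an array without reading past its end. The function names are hypothetical, and an std::ostringstream stands in for std::cout so the result can be checked.

```cpp
#include <sstream>
#include <string>

// Option 1: tell the stream exactly how many characters to write.
std::string print_with_write() {
    char arr2[] = {'a', 'b', 'c'};
    std::ostringstream out;
    out.write(arr2, sizeof(arr2));
    return out.str();
}

// Option 2: build a std::string from the pointer and an explicit length.
std::string print_via_string() {
    char arr2[] = {'a', 'b', 'c'};
    std::ostringstream out;
    out << std::string(arr2, sizeof(arr2));
    return out.str();
}

// Option 3: store the terminator so that operator<< can find the end.
std::string print_terminated() {
    char arr2[] = {'a', 'b', 'c', '\0'};
    std::ostringstream out;
    out << arr2;
    return out.str();
}
```

In all three cases the length comes from the program itself, not from whatever bytes happen to follow the array in memory.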

Wouter

JiiPee

Oct 12, 2014, 5:37:44 PM
in an error case it will be printed:

the size of abc
is 3 ...


JiiPee

Oct 12, 2014, 5:40:11 PM
yes i know, it printed wrongly on my machine

Ike Naar

Oct 12, 2014, 6:04:27 PM
On 2014-10-12, arnuld <sun...@invalid.address> wrote:
>> On Sun, 12 Oct 2014 10:45:49 -0700, Barry Schwarz wrote:
>
>> Actually they do. The difference is how the compiler generates code for
>> the overloaded operator <<. When the right operand has type pointer to
>> char, the generated code treats the address as the start of a C style
>> string (similar to using %s in printf). When the operand has type
>> pointer to "object that cannot be a string," the generated code treats
>> it the same as using %p in printf.
>
> But arr2 is an array of chars not terminated by null character:
>
> char arr2[] = {'a','b','c'};
>
> Still it prints it correctly.

You were lucky.

> How does std::cout know where to stop.

Most probably the byte following the 'c' in memory, by chance, happened
to be a null byte.
But that's not something you can rely on if you write your code like that.

Charles J. Daniels

Oct 12, 2014, 7:11:51 PM
I don't fully trust the answers here. I know that arr2 points to the first char in the array by address, but the type of arr2 should be char[3], not char* -- haven't you ever seen compiler errors that say a function "takes type X, not char[3]"? It's entirely possible cout does a sizeof and will not print past the size of the array, but that does mean it may print junk within the bounds not preceded by null.

Barry Schwarz

Oct 12, 2014, 9:44:56 PM
On Sun, 12 Oct 2014 16:11:40 -0700 (PDT), "Charles J. Daniels"
<chaj...@gmail.com> wrote:

>I don't fully trust the answers here. I know that arr2 points to the first char in the array by address, but the type of arr2 should be char[3], not char* -- haven't you ever seen compiler errors that say a function "takes type X, not char[3]"? It's entirely possible cout does a sizeof and will not print past the size of the array, but that does mean it may print junk within the bounds not preceded by null.

All answers or just the ones in this thread?


The type of the object arr2 is char[3]. The type of the expression
arr2 is subject to the following rule quoted from the C standard:

"Except when it is the operand of the sizeof operator, the _Alignof
operator, or the unary & operator, or is a string literal used to
initialize an array, an expression that has type 'array of type' is
converted to an expression with type 'pointer to type' that points
to the initial element of the array object and is not an lvalue. If
the array object has register storage class, the behavior is
undefined."

Therefore, in this context
cout << arr2;
is identical to and compiled the same as
cout << &arr2[0];
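
The distinction between the type of the object and the type of the converted expression can be checked at compile time. This is a sketch assuming C++11 or later; the two function names are hypothetical. It also shows why a function that receives only a pointer cannot apply sizeof to recover the array length, while a function template taking a reference to the array can.

```cpp
#include <cstddef>
#include <type_traits>

char arr2[] = {'a', 'b', 'c'};

// The object itself has array type and occupies three bytes.
static_assert(std::is_same<decltype(arr2), char[3]>::value,
              "arr2 is declared as char[3]");
static_assert(sizeof(arr2) == 3, "sizeof sees the whole array");

// After the array-to-pointer conversion only a char* remains.
static_assert(std::is_same<std::decay<decltype(arr2)>::type, char*>::value,
              "the converted expression has type char*");
static_assert(std::is_same<decltype(&arr2[0]), char*>::value,
              "&arr2[0] has type char*");

// A function taking a pointer sees only the pointer, never the array size.
std::size_t size_seen_through_pointer(const char* p) { return sizeof(p); }

// A reference-to-array parameter keeps the length N in the type.
template <typename T, std::size_t N>
std::size_t element_count(T (&)[N]) { return N; }
```

The stream inserters for character strings take a pointer parameter, so they are in the position of size_seen_through_pointer, not element_count.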

Öö Tiib

Oct 13, 2014, 11:40:53 PM
On Sunday, 12 October 2014 20:26:54 UTC+3, arnuld wrote:
>
> const wchar_t* alpha = L"first line"
> "Second Line";

...

> alpha (5th variable) has its initialization spread across 2 lines. I
> need to give L only on first line and not on 2nd line. Is this assumption
> correct ?

It is correct if you have a C++11 compiler.

In C++98 and C++03 it was undefined behavior [lex.string]:
"If a narrow string literal token is adjacent to a wide string
literal token, the behavior is undefined."

That changed in C++11:
"If one string literal has no encoding-prefix, it is treated as
a string literal of the same encoding-prefix as the other operand."

It is perhaps worth mentioning since some people sometimes post here that
they keep using very old compilers.

Urs Thuermann

Oct 16, 2014, 5:39:20 PM
Barry Schwarz <schw...@dqel.com> writes:

> On 12 Oct 2014 17:26:40 GMT, arnuld <sun...@invalid.address> wrote:
>
> >#include <iostream>
> >
> >int main()
> >{
> > int arr1[] = {10,11,12};
> > char arr2[] = {'a','b','c'};
> > double arr3[] = {0,10, 0.11, 0.12};
> > const char* str = "comp.lang.c++";
> > const wchar_t* alpha = L"first line"
> > "Second Line";
> >
> > std::cout << "arr1 = " << arr1 << std::endl;
> > std::cout << "arr2 = " << arr2 << std::endl;
> > std::cout << "arr3 = " << arr3 << std::endl;
> > std::cout << "str = " << str << std::endl;
> > std::wcout << "alpha = " << alpha << std::endl;
> >
> > return 0;
> >}

> >An array name is converted to a pointer to its first element. arr1 and arr3
> >(1st and 3rd variables) are arrays of int and double respectively, and they
> >behave according to this rule, but arr2 and str (2nd and 4th variables)
> >do not. Why?
>
> Actually they do. The difference is how the compiler generates code
> for the overloaded operator <<. When the right operand has type
> pointer to char, the generated code treats the address as the start of
> a C style string (similar to using %s in printf). When the operand
> has type pointer to "object that cannot be a string," the generated
> code treats it the same as using %p in printf.

Actually, the difference is in the definition of the operator<<()
functions involved. In all cases the array is converted to a
pointer to its first element, i.e. this code calls

std::ostream::operator<<(const void *);
std::operator<<(std::ostream &, const char *)
std::ostream::operator<<(const void *);
std::operator<<(std::ostream &, const char *)
std::operator<<(std::wostream &, const wchar_t *)

There is no overload for int* or double*; those pointers convert to
const void*, and that operator simply prints the address (typically in
hex), since it cannot know the size of the array. The operator functions
don't even know that an array is given as argument, they only get a
pointer. The operator<< for char* and wchar_t* also don't know the size of
the array, but they rely on the programmer to pass a pointer to a
null-terminated string of chars or wchar_ts, respectively. They
print all characters up to (but not including) the first null.

BTW, the GNU implementation has

std::operator<<(std::ostream &, const char *)
std::operator<<(std::wostream &, const wchar_t *)

which print the characters, and it also has

std::ostream::operator<<(const void *);
std::wostream::operator<<(const void *);

which print the pointer value in hex. But I don't know the rules
which decide between the two when you write

cout << arr2;

I assume the member function is via a template and the other operators
are more specific, so they are chosen. But I haven't checked in the
header files.
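
The choice between the two candidates can be sketched with plain functions. This is a simplified model under stated assumptions: which_overload is a hypothetical stand-in for the two inserters, and it ignores that the real ones are a member function and a non-member function template. For a char array, the conversion to const char* only adds a const qualifier, which ranks as an exact match, while the conversion to const void* is a pointer conversion and ranks lower.

```cpp
#include <string>

// Hypothetical stand-ins for the address-printing and the
// string-printing inserters.
std::string which_overload(const void*) { return "const void*"; }
std::string which_overload(const char*) { return "const char*"; }

// char[4] -> char* -> const char* is a qualification conversion (exact
// match), which beats the pointer conversion to const void*.
std::string chosen_for_char_array() {
    char arr2[] = {'a', 'b', 'c', '\0'};
    return which_overload(arr2);
}

// int* cannot convert to const char*, so only const void* is viable.
std::string chosen_for_int_array() {
    int arr1[] = {10, 11, 12};
    return which_overload(arr1);
}
```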

urs