pointer and storage

dis_is...@yahoo.com

Sep 17, 2006, 10:21:10 AM
Hi. I have a question about the following statement.

char* a="hello";

The question is where "hello" gets stored. Is it in some static area,
the stack, or the heap? I have observed that attempting to modify "hello"
results in a segmentation fault. Thanks for any help.

Eric

deepak

Sep 17, 2006, 10:44:52 AM

The *a will be stored in the read-only part of the data area.
That's the reason behind the segmentation fault.

-Deepak.
>
> Eric

Richard Heathfield

Sep 17, 2006, 10:49:54 AM
dis_is...@yahoo.com said:

> Hi.I have a question on the following statement.
>
> char* a="hello";
>
> The question is where "hello" gets stored.Is it in some static area
> ,stack or heap.

It depends on the implementation. No implementation is required to use
stacks or heaps.

You would be well-advised to use const char * rather than mere char *.
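
A minimal sketch of what the const qualifier buys here. The helper names
greeting and count_chars are made up for illustration and do not appear in
the original post. Reading through the pointer is fine; the commented-out
write is the kind of line a compiler has to diagnose once the pointer is
const-qualified.

```c
#include <assert.h>
#include <stddef.h>

/* Counts characters through a pointer-to-const; reading a string
   literal this way is always fine. */
size_t count_chars(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}

/* Hypothetical helper: hands out a string literal as const char *.
   The literal has static storage duration, so the pointer stays
   valid after the function returns. */
const char *greeting(void)
{
    const char *a = "hello";

    /* a[0] = 'j';   constraint violation: a diagnostic is required */
    return a;
}
```

With plain char * the same write would be accepted quietly, and the mistake
would surface at run time, if at all.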

> I have observed that attempting to modify "hello"
> results in segmentation fault.

The behaviour resulting from modifying a constant string is undefined. A
segmentation fault is one possible result. The absence of a segmentation
fault is another possible result. And the destruction of Rome by fire is
another possible result.


--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Frederick Gotham

Sep 17, 2006, 11:02:11 AM
dis_is...@yahoo.com posted:

> char* a="hello";


This is a definition of a non-const pointer to a non-const char. It also
initialises the pointer to the address of a string literal (which is
ill-advised).


> The question is where "hello" gets stored.Is it in some static area
> ,stack or heap.I have observed that attempting to modify "hello"
> results in segmentation fault.Thanks for any help.


The following two programs are equivalent:

/* Program 1 */

int main(void)
{
    char const *p = "Hello";
    return 0;
}

/* Program 2 */

char const str_literal1[] = {'H','e','l','l','o',0};

#define LITERAL1 (*(char(*)[sizeof str_literal1])&str_literal1)

int main(void)
{
    char const *p = LITERAL1;
    return 0;
}

--

Frederick Gotham

Ark

Sep 17, 2006, 2:47:40 PM
Richard Heathfield wrote:
> dis_is...@yahoo.com said:
>
>> Hi.I have a question on the following statement.
>>
>> char* a="hello";
>>
>> The question is where "hello" gets stored.Is it in some static area
>> ,stack or heap.
>
> It depends on the implementation. No implementation is required to use
> stacks or heaps.
>
> You would be well-advised to use const char * rather than mere char *.
>
>> I have observed that attempting to modify "hello"
>> results in segmentation fault.
>
> The behaviour resulting from modifying a constant string is undefined. A
> segmentation fault is one possible result. The absence of a segmentation
> fault is another possible result. And the destruction of Rome by fire is
> another possible result.
>
>
I am confused profoundly.
I always thought that where the string literals are stored (RO vs. RW)
is implementation-defined (and decent compilers would allow me to choose
my way with a command-line switch).
However, the /type/ of (a pointer to) a string literal is char *,
regardless of the switch, or so I read the standard a while ago.
So the statement
*a='a';
must compile OK *without diagnostics* and then cause or not cause
undefined behavior depending on implementation-defined behavior.

<OT> BTW, as an embedded type, I talked myself out of using compound
literals because they end up in my scarce RAM. (IAR EWARM 4.40
toolchain.) I find it odd that IAR puts string literals, which are, at
least conceptually, compound literals, in RO and true (syntactically)
compound literals, in RW. </OT>

Can anyone shed a bright light on this?
Thank you,
- Ark

Ben Pfaff

Sep 17, 2006, 3:06:27 PM
Ark <akh...@macroexpressions.com> writes:

> I am confused profoundly.
> I always thought that where the string literals are stored (RO vs. RW)
> is implementation-defined (and decent compilers would allow me to choose
> my way with a command-line switch).

The Standard doesn't say, so yes, the location of string literals
is defined by an implementation.

> However, the /type/ of (a pointer to) a string literal is char *,
> regardless of the switch, or so I read the standard a while ago.
> So the statement
> *a='a';
> must compile OK *without diagnostics* and then cause or not cause
> undefined behavior depending on implementation-defined behavior.

No compiler is required to compile anything without diagnostics.
The Standard does not restrict an implementation from diagnosing
anything at all. It does *require* a diagnostic for many
programs, but it does not *forbid* diagnostics for other
programs.

Furthermore, the statement in question yields undefined behavior,
so the program is not strictly conforming. Implementations are
only required to successfully translate and execute strictly
conforming programs. The Standard does not clearly distinguish
translation time and runtime, so one cannot even say that the
program should be successfully translated and then executed as
undefined behavior.
--
Go not to Usenet for counsel, for they will say both no and yes.

Keith Thompson

Sep 17, 2006, 3:19:28 PM
Ark <akh...@macroexpressions.com> writes:
> Richard Heathfield wrote:
[...]

>> The behaviour resulting from modifying a constant string is
>> undefined. A segmentation fault is one possible result. The absence
>> of a segmentation fault is another possible result. And the
>> destruction of Rome by fire is another possible result.

It would be clearer to use the term "string literal" rather than
"constant string".

> I am confused profoundly.
> I always thought that where the string literals are stored (RO vs. RW)
> is implementation-defined

Yes.

> (and decent compilers would allow me to
> choose my way with a command-line switch).

That's debatable. I don't see much advantage in allowing string
literals to be modifiable (except *maybe* to handle old and broken
code).

> However, the /type/ of (a pointer to) a string literal is char *,
> regardless of the switch, or so I read the standard a while ago.

Yes.

> So the statement
> *a='a';
> must compile OK *without diagnostics* and then cause or not cause
> undefined behavior depending on implementation-defined behavior.

The following:
char *a = "hello";
*a = 'a';
(assuming it appears in an appropriate context) is legal (it violates
no syntax rules or constraints), and a conforming compiler must accept
it. But, as always, a compiler is free to issue any diagnostics it
likes. The standard requires diagnostics in certain cases; it never
forbids them.

If the initialization is executed, it invokes undefined behavior. The
undefined behavior is unconditional, though the effects of the
undefined behavior can be literally anything. There is no
implementation-defined behavior involved (implementation-defined
behavior must be documented by the implementation, and there is no
documentation requirement here).

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Keith Thompson

Sep 17, 2006, 3:31:30 PM
Ben Pfaff <b...@cs.stanford.edu> writes:
> Ark <akh...@macroexpressions.com> writes:
>> I am confused profoundly.
>> I always thought that where the string literals are stored (RO vs. RW)
>> is implementation-defined (and decent compilers would allow me to choose
>> my way with a command-line switch).
>
> The Standard doesn't say, so yes, the location of string literals
> is defined by an implementation.

But it's not "implementation-defined" (i.e., the implementation isn't
required to document its choice).

[snip]

> Furthermore, the statement in question yields undefined behavior,
> so the program is not strictly conforming. Implementations are
> only required to successfully translate and execution strictly
> conforming programs. The Standard does not clearly distinguish
> translation time and runtime, so one cannot even say that the
> program should be successfully translated and then executed as
> undefined behavior.

I don't think that's true. For example, this program:

#include <stdio.h>
#include <limits.h>
int main (void)
{
    printf("INT_MAX = %d\n", INT_MAX);
    return 0;
}

is not strictly conforming, since it produces output that depends on
implementation-defined behavior, but I don't believe an implementation
may reject it on that basis.

"Strictly conforming" programs are a very narrow class, and
"conforming" programs are a very wide class ("conforming" programs can
depend on any arbitrary compiler extensions). The standard doesn't
seem to have a name for the class of programs that must be accepted,
but whose behavior may depend on the implementation.

Frederick Gotham

Sep 17, 2006, 3:36:32 PM
Keith Thompson posted:

> For example, this program:
>
> #include <stdio.h>
> #include <limits.h>
> int main (void)
> {
> printf("INT_MAX = %d\n", INT_MAX);
> return 0;
> }
>
> is not strictly conforming, since it produces output that depends on
> implementation-defined behavior, but I don't believe an implementation
> may reject it on that basis.


Are you saying that there's something wrong with that program? It seems
perfectly alright to me.

--

Frederick Gotham

pete

Sep 17, 2006, 4:44:35 PM
deepak wrote:
>
> dis_is...@yahoo.com wrote:
> > Hi.I have a question on the following statement.
> >
> > char* a="hello";
> >
> > The question is where "hello" gets stored.Is it in some static area
> > ,stack or heap.I have observed that attempting to modify "hello"
> > results in segmentation fault.Thanks for any help.
>
> The *a will be stored in the readonly part of the data area.

The rules of C neither require nor prohibit
storing a string literal in read-only memory.

> Thats the reason behind the segmenatation fault.

--
pete

pete

Sep 17, 2006, 4:47:46 PM

"Correct program" comes pretty close.

N869
4. Conformance

[#3] A program that is correct in all other aspects,
operating on correct data, containing unspecified behavior
shall be a correct program and act in accordance with
5.1.2.3.

--
pete

Keith Thompson

Sep 17, 2006, 5:01:27 PM

No, not at all. I was responding to what Ben Pfaff wrote (and you
snipped) upthread:

| Furthermore, the statement in question yields undefined behavior,
| so the program is not strictly conforming. Implementations are
| only required to successfully translate and execution strictly
| conforming programs.

The program is not strictly conforming, but it's still perfectly
valid.

Frederick Gotham

Sep 17, 2006, 5:05:04 PM
Keith Thompson posted:

>>> #include <stdio.h>
>>> #include <limits.h>
>>> int main (void)
>>> {
>>> printf("INT_MAX = %d\n", INT_MAX);
>>> return 0;
>>> }
>

> The program is not strictly conforming, but it's still perfectly
> valid.


Why isn't that program strictly conforming?

--

Frederick Gotham

Ben Pfaff

Sep 17, 2006, 5:15:25 PM
pete <pfi...@mindspring.com> writes:

> Keith Thompson wrote:
>> "Strictly conforming" programs are a very narrow class, and
>> "conforming" programs are a very wide class ("conforming" programs can
>> depend on any arbitrary compiler extensions). The standard doesn't
>> seem to have a name for the class of programs that must be accepted,
>> but whose behavior may depend on the implementation.
>
> "Correct program" comes pretty close.

That's a good name. I'll try to remember that.

When I wrote my earlier article in this thread, I knew that
"strictly conforming" was not a perfect term, but I didn't have a
better one and didn't feel like adding a lot of qualifiers or an
extended explanation.
--
"Your correction is 100% correct and 0% helpful. Well done!"
--Richard Heathfield

Keith Thompson

Sep 17, 2006, 5:24:58 PM

C99 4p5:

A _strictly conforming program_ shall use only those features of
the language and library specified in this International
Standard. It shall not produce output dependent on any
unspecified, undefined, or implementation-defined behavior, and
shall not exceed any minimum implementation limit.

The program's behavior is implementation-defined, since it depends on
the implementation-defined value of INT_MAX.

(You just downloaded n1124, so you could have looked up the definition
yourself.)
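
To make the point about INT_MAX concrete, here is a small sketch
(format_int_max is a hypothetical helper, not something from the thread).
The text it produces is well defined on any given implementation but
differs between implementations, which is what rules out strict
conformance. The only portable guarantee is that INT_MAX is at least 32767.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Writes "INT_MAX = <value>" into buf and returns the number of
   characters snprintf reports. The digits depend on the
   implementation-defined value of INT_MAX. */
int format_int_max(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "INT_MAX = %d", INT_MAX);
}
```

An implementation with 16-bit int would produce "INT_MAX = 32767", one with
32-bit int a longer number; both are correct translations of the same
program.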

Richard Heathfield

Sep 17, 2006, 5:36:31 PM
Keith Thompson said:

<snip>

>> Richard Heathfield wrote:
> [...]
>>> The behaviour resulting from modifying a constant string is
>>> undefined. A segmentation fault is one possible result. The absence
>>> of a segmentation fault is another possible result. And the
>>> destruction of Rome by fire is another possible result.
>
> It would be clearer to use the term "string literal" rather than
> "constant string".

Yes, I nearly said "string literal", and then I thought along these lines:
'The term "string literal" refers to a source level construct, whereas the
behaviour in question (the modification thereof) refers to something that
happens at runtime. If I say "string literal", someone will nit-pick it,
and say that the string literal only exists in the "mind" of the lexer, and
it can't ever be modified, because by the time modification is (putatively)
possible the string literal has done its bit as a compilation token, and no
longer exists. So I'll play safe and call it a constant string instead.'

So much for playing safe. :-)

<snip>

Old Wolf

Sep 17, 2006, 6:35:52 PM
Frederick Gotham wrote:
>
> The following two programs are equivalent:
>
> /* Program 1 */
>
> int main(void)
> {
> char const *p = "Hello"; return 0;
> }
>
> /* Program 2 */
>
> char const str_literal1[] = {'H','e','l','l','o',0};
>
> #define LITERAL1 (*(char(*)[sizeof str_literal1])&str_literal1)
>
> int main(void)
> {
> char const *p = LITERAL1; return 0;
> }

Well, both of those programs cause no observable behaviour
so they are both equivalent to:

int main(void)
{
    return 0;
}

Chris Torek

Sep 17, 2006, 6:20:40 PM
>Keith Thompson said:
>> It would be clearer to use the term "string literal" rather than
>> "constant string".

In article <CM-dnTTRvrE_IpDY...@bt.com>
Richard Heathfield <inv...@invalid.invalid> wrote:
>Yes, I nearly said "string literal", and then I thought along these lines:
>'The term "string literal" refers to a source level construct, whereas the
>behaviour in question (the modification thereof) refers to something that
>happens at runtime. If I say "string literal", someone will nit-pick it ...

I think the most appropriate phrase would be: "an anonymous array
produced by a string literal". This is a little unwieldy, I admit,
but it sidesteps the case in which a string literal is used to
initialize a named array:

char s[] = "some string literal";

C99 throws in an interesting wrinkle though, in that compound
literals can produce modifiable objects:

void f(void) {
    int *p = (int []) { 1, 2, 3 };

    p[1]++;
    /* now the array has 1,3,3 in it */
    ...
}

This means one can write:

void g(void) {
    char *s = (char []) { "some string" };
    ...
}

to obtain a write-able anonymous array. I think I can still get
away with the wording I used above, though, because this anonymous
array is actually produced by the compound literal, and merely
*initialized* (not in fact "produced") by the string literal.
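
A sketch of the distinction drawn above, assuming a C99 compiler. The
function names are invented for the example. Both arrays below are created
by compound literals and merely initialised from their initialisers, so
writing to them is well defined.

```c
#include <assert.h>
#include <string.h>

/* Modifies an anonymous int array created by a C99 compound literal. */
int bump_middle(void)
{
    int *p = (int []) { 1, 2, 3 };

    p[1]++;              /* the array now holds 1, 3, 3 */
    return p[1];
}

/* Modifies an anonymous char array that a string literal only
   initialises; the array itself belongs to the compound literal. */
int capitalise_compound_literal(void)
{
    char *s = (char []) { "some string" };

    assert(s[0] == 's');
    s[0] = 'S';          /* well defined */
    return strcmp(s, "Some string") == 0;
}
```

Replacing the compound literal with a bare "some string" would make the
write to s[0] undefined.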
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Ark

Sep 17, 2006, 9:23:46 PM
That's exactly where my comprehension fails me.
After
char *a = "hello";
the pointer /is/ initialized, and if, as Keith writes,
*a = 'a';
produces the UB unconditionally, it means that the initialization of the
pointer is unconditionally bad (for the type), isn't it? There must be a
reason (like "old broken code"? or something else?) why the type of
"hello" is not const char *.
OK, I can drill this case down my brain, but this leaves the following
question:
What are (all) legal initializations of char *a such that assigning to
*a is UB-free?
Thanks,
Ark

lovecreatesbea...@gmail.com

Sep 17, 2006, 10:19:50 PM

Keith Thompson wrote:
> The following:
> char *a = "hello";
> *a = 'a';
<snip>

> If the initialization is executed, it invokes undefined behavior. The
> undefined behavior is unconditional, though the effects of the
> undefined behavior can be literally anything. There is no
> implementation-defined behavior involved (implementation-defined
> behavior must be documented by the implementation, and there is no
> documentation requirement here).

Is it the initialization itself that invokes the undefined behaviour?
Or do you mean the UB caused by the assignment statement that
follows it?

Jack Klein

Sep 17, 2006, 10:38:02 PM
On Sun, 17 Sep 2006 19:19:28 GMT, Keith Thompson <ks...@mib.org> wrote
in comp.lang.c:

No, a conforming compiler is not required to accept it, although I
don't know of any that will not. If the compiler can determine, at
compile time, that a statement or expression producing undefined
behavior will be executed by all possible paths through the program,
it is free to do anything at all at compile time.

For example:

#include <stdlib.h>
#include <time.h>

int main(void)
{
    char *a = "I'm a string literal";
    srand(time(0));
    if (rand() > (RAND_MAX / 2))
    {
        *a = 'a';
    }
    return 0;
}

A compiler must translate the program above.

However:

int main(void)
{
    char *a = "I'm a string literal";
    *a = 'a';
    return 0;
}

...a compiler is not required to translate the second form.

The really interesting question is if the call to srand() is omitted
from the first example. Is a compiler allowed to "know" that its
version of rand() will return a value greater than RAND_MAX / 2 with
default initialization, equivalent to srand(1)?

> If the initialization is executed, it invokes undefined behavior. The
         ^^^^^^^^^^^^^^
ITYM assignment.

> undefined behavior is unconditional, though the effects of the
> undefined behavior can be literally anything. There is no
> implementation-defined behavior involved (implementation-defined
> behavior must be documented by the implementation, and there is no
> documentation requirement here).

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html

Jack Klein

Sep 17, 2006, 10:48:21 PM
On Sun, 17 Sep 2006 21:23:46 -0400, Ark <akh...@macroexpressions.com>
wrote in comp.lang.c:

The simple fact is that string literals existed in the early C
language long before the const keyword appeared. So sufficiently old
code that assigned the address of a string literal to a plain old
ordinary pointer to char is not necessarily "broken", it was the only
character pointer type available at the time.

Having the const keyword available officially now for almost 17 years
does make it easier to avoid accidental errors, if it is used
properly. Attempting to write through a "pointer to const type" is a
constraint violation requiring a diagnostic.

> OK, I can drill this case down my brain, but this leaves the following
> question:
> What are (all) legal initializations of char *a such that assigning to
> *a is UB-free?

I'm too lazy to think hard about it right now, but assigning the
address of a modifiable array and using dynamic allocation come to
mind, without getting into type punning.

char ok [] = "hello";
char *a = ok;

...results in the pointer a pointing to characters that can be modified.
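
A rough sketch of the dynamic-allocation route mentioned above.
writable_copy is a hypothetical helper name, and the caller is assumed to
free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src that may be freely
   modified, or NULL if allocation fails. The caller owns the memory. */
char *writable_copy(const char *src)
{
    size_t n;
    char *a;

    assert(src != NULL);
    n = strlen(src) + 1;        /* include the terminating null */
    a = malloc(n);
    if (a != NULL)
        memcpy(a, src, n);
    return a;
}
```

A caller could write char *a = writable_copy("hello"); and, after checking
for NULL, assign to *a without undefined behaviour before calling free(a).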

Jack Klein

Sep 17, 2006, 10:49:54 PM
On Sun, 17 Sep 2006 15:02:11 GMT, Frederick Gotham
<fgot...@SPAM.com> wrote in comp.lang.c:

> dis_is...@yahoo.com posted:
>
> > char* a="hello";
>
>
> This is a definition of a non-const pointer to a non-const char. It also
> initialises the pointer to the address of a string literal (which is il-
> advised.)
>
>
> > The question is where "hello" gets stored.Is it in some static area
> > ,stack or heap.I have observed that attempting to modify "hello"
> > results in segmentation fault.Thanks for any help.
>
>
> The following two programs are equivalent:

No they are not. The type of a string literal in C is "array of
char", and most specifically not "array of const char".

> /* Program 1 */
>
> int main(void)
> {
> char const *p = "Hello"; return 0;
> }
>
> /* Program 2 */
>
> char const str_literal1[] = {'H','e','l','l','o',0};
>
> #define LITERAL1 (*(char(*)[sizeof str_literal1])&str_literal1)
>
> int main(void)
> {
> char const *p = LITERAL1; return 0;
> }

--

Keith Thompson

Sep 17, 2006, 11:06:49 PM
Ark <akh...@macroexpressions.com> writes:
> Keith Thompson wrote:
[...]

>> The following:
>> char *a = "hello";
>> *a = 'a';
>> (assuming it appears in an appropriate context) is legal (it violates
>> no syntax rules or constraints), and a conforming compiler must accept
>> it. But, as always, a compiler is free to issue any diagnostics it
>> likes. The standard requires diagnostics in certain cases; it never
>> forbids them.
>> If the initialization is executed, it invokes undefined behavior.
>> The
>> undefined behavior is unconditional, though the effects of the
>> undefined behavior can be literally anything. There is no
>> implementation-defined behavior involved (implementation-defined
>> behavior must be documented by the implementation, and there is no
>> documentation requirement here).
>>
> That's exactly where my comprehension fails me.
> After
> char *a = "hello";
> the pointer /is/ initialized, and if, as Keith writes,
> *a = 'a';
> produces the UB unconditionally, it means that the initialization of
> the pointer is unconditionally bad (for the type), isn't it?

No, it isn't, but it's a bad idea.

Initializing a char* object ("a" in this case) to point to the first
character of a string literal is perfectly legal. For example, reading
the elements of the array through the pointer will work just fine.
Undefined behavior occurs only if you try to *modify* elements of the
array.

> There
> must be a reason (like "old broken code"? or something else?) why the
> type of "hello" is not const char *.

It's to avoid breaking old code that may have been written before
"const" was introduced to the language (a *long* time ago). For example:

#include <stdio.h>

void print_string(char *s)
{
    printf("print_string(\"%s\")\n", s);
}

int main(void)
{
    char *message = "hello";
    print_string(message);
    return 0;
}

In old versions of the C language, before "const" was introduced, this
kind of thing was common. The language didn't provide a way to have
the compiler warn you if you tried to modify something that shouldn't
be modified.

Once "const" was introduced, it might have made sense to make string
literals const, but it would have broken existing code, which was
considered unacceptable. The alternative would have required all the
existing code to be modified by adding "const" qualifiers -- which
would have meant it would fail to compile under old compilers. It was
considered too high a price to pay.

> OK, I can drill this case down my brain, but this leaves the following
> question:
> What are (all) legal initializations of char *a such that assigning to
> *a is UB-free?

There are infinitely many such initializations. As long as a points
to modifiable memory, you can modify it.

Here's one example:

char str[] = "hello";
char *s = str;

The first line creates str as a non-const array. The second
initializes s to point to the first character of the array.
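
As a minimal sketch (the function name is invented for illustration), the
write through s below is well defined because it lands in the array str,
not in the string literal that initialised it.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if writing through s changed the array it points into. */
int modify_through_pointer(void)
{
    char str[] = "hello";   /* non-const array, initialised from the literal */
    char *s = str;          /* points at str[0] */

    assert(s == &str[0]);
    *s = 'j';               /* modifies str, which is ordinary writable storage */
    return strcmp(str, "jello") == 0;
}
```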

Keith Thompson

Sep 17, 2006, 11:12:05 PM
Jack Klein <jack...@spamcop.net> writes:
> On Sun, 17 Sep 2006 19:19:28 GMT, Keith Thompson <ks...@mib.org> wrote
> in comp.lang.c:
[...]

>> The following:
>> char *a = "hello";
>> *a = 'a';
>> (assuming it appears in an appropriate context) is legal (it violates
>> no syntax rules or constraints), and a conforming compiler must accept
>> it. But, as always, a compiler is free to issue any diagnostics it
>> likes. The standard requires diagnostics in certain cases; it never
>> forbids them.
>
> No, a conforming compiler is not required to accept it, although I
> don't know of any that will not. If the compiler can determine, at
> compile time, that a statement or expression producing undefined
> behavior will be executed by all possible paths through the program,
> it is free to do anything at all at compile time.

You're right.

[snip]

>> If the initialization is executed, it invokes undefined behavior. The
> ^^^^^^^^^^^^^^
> ITYM assignment.

Yes, thanks.

Frederick Gotham

Sep 18, 2006, 12:39:32 AM
Jack Klein posted:

>> The following two programs are equivalent:
>
> No they are not. The type of a string literal in C is "array of
> char", and most specifically not "array of const char".


Hence the macro which casts away the constness. (The actual underlying array
is defined as const to reflect that modifying a string literal yields
undefined behaviour.)
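
For what it's worth, a sketch of what the macro does and does not give you
(literal1_size and literal1_first are hypothetical helpers added for
illustration). The cast changes only the type of the expression: reading
through the resulting char * is fine, but writing through it would still
modify an object defined const, which is undefined behaviour.

```c
#include <assert.h>
#include <stddef.h>

char const str_literal1[] = {'H','e','l','l','o',0};

/* Same macro as in Program 2: yields an lvalue of type char[6]. */
#define LITERAL1 (*(char(*)[sizeof str_literal1])&str_literal1)

/* The cast keeps the array type, so sizeof still sees all six bytes. */
size_t literal1_size(void)
{
    assert(sizeof LITERAL1 == sizeof str_literal1);
    return sizeof LITERAL1;
}

/* Reads (never writes) through the non-const pointer the macro decays to. */
char literal1_first(void)
{
    char *p = LITERAL1;

    return p[0];
}
```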

--

Frederick Gotham
