
typedef, macro or a proper type for strings?


Thiago Adams

Aug 11, 2020, 9:42:53 AM

Is typedef for strings a good idea? (I think it is not)

If we try to define

typedef char * string;

We have the following situation:

void F(const string s);

where s looks like a constant but it is not.
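To see what the qualifier actually binds to, here is a minimal sketch using the typedef above. The helper name set_first is hypothetical; the point is only that, with the typedef, const applies to the pointer and not to the characters.

```c
#include <assert.h>

typedef char *string;   /* the typedef from the post */

/* `const string s` is `char *const s`: the pointer is const,
   the characters it points to are not. */
static void set_first(const string s, char c)
{
    assert(s != 0);
    s[0] = c;        /* accepted: the pointed-to chars are writable */
    /* s = 0; */     /* rejected if uncommented: s itself is const */
}
```

So a caller gets no protection for the characters, which is probably not what someone reading "const string" would assume.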

If we try this way:

typedef char string [];

then

void F(const string s);

is fine but it doesn't work for structs:

struct X {
string s;
int i;
};
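A small sketch of how the array typedef behaves, assuming a C11 compiler. The function name count_chars is hypothetical. As a parameter, the array type is adjusted to a pointer, and the const ends up on the characters.

```c
#include <assert.h>
#include <stddef.h>

typedef char string[];   /* array-of-unknown-size typedef from the post */

/* As a parameter, `const string s` is adjusted to `const char *s`. */
static size_t count_chars(const string s)
{
    size_t n = 0;
    assert(s != 0);
    while (*s != '\0') {   /* reading through s is fine */
        s++;               /* s itself is an ordinary, modifiable pointer */
        n++;
    }
    /* s[0] = 'c'; would be rejected: the chars are const */
    return n;
}

/* struct X { string s; int i; };
   is rejected: an array of unknown size is an incomplete type and is
   only allowed as the last member (a flexible array member). */
```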


One alternative is a macro

#define string char*

void F(const string s) {
//s[0] = 'c'; ERROR as expected
}

works here as well

struct Person {
string name;
};
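A sketch of what the macro expands to, with hypothetical helper names. Because the substitution is purely textual, const lands on the characters, the pointer stays assignable, and a declaration with two declarators only makes the first one a pointer.

```c
#include <assert.h>
#include <stddef.h>

#define string char*   /* textual substitution, not a type */

/* Expands to `const char* s`: the chars are const, the pointer is not. */
static const char *skip_first(const string s)
{
    assert(s != 0 && s[0] != '\0');
    s = s + 1;      /* accepted: s itself can be reassigned */
    return s;
}

/* Expands to `char* a = 0, b = 0;`, so only a is a pointer. */
static size_t size_of_second(void)
{
    string a = 0, b = 0;
    (void)a;
    return sizeof b;   /* b is a plain char */
}

#undef string
```

The second function shows a well-known cost of the macro approach that the typedef does not have.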

If string were a proper type in C, it could have
a mix of the properties of pointers, values, and arrays.

for instance:

disabled some pointer operations:
name++, name--, *name
(name + 1)

kept:

name[0]
name = 0;
if (name) { }

The motivation for having a proper type named string is similar
to the motivation for creating bool/true/false instead of using int with 1 and 0.


Bonita Montero

Aug 11, 2020, 9:45:58 AM
Only use a typedef if the type may change, e.g. with an #ifdef.
And if you want a proper language with real strings close to
C, use C++.

Thiago Adams

Aug 11, 2020, 9:56:53 AM
C++ strings are bad in many aspects. But I don't want to
use this topic to talk about it.

Ben Bacarisse

Aug 11, 2020, 9:58:37 AM
Thiago Adams <thiago...@gmail.com> writes:

> Is typedef for strings a good idea? (I think it is not)

Agreed. There is no C type that corresponds to a string (which, in C,
is really a data layout).

> If we try to define
>
> typedef char * string;
>
> We have the following situation:
>
> void F(const string s);
>
> were s looks like constant but it is not.

Hmm... s looks const and is const. The trouble is that s is not a string,
so the type name is misleading.

> If we try this way:
>
> typedef char string [];
>
> then
>
> void F(const string s);
>
> is fine

Again, I'd classify this one as not fine. s 'looks' const but isn't --
it's *s (etc.) that are const qualified.

> but it doesn't work for structs:
>
> struct X {
> string s;
> int i;
> }

Yup.

> One alternative is a macro
>
> #define string char*
>
> void F(const string s) {
> //s[0] = 'c'; ERROR as expected

Again, not as I'd expect. I'd expect s = 0 (etc.) to be prohibited but
not necessarily accesses through s.

> }
>
> works here as well
>
> struct Person {
> string name;
> };
>
> if string was a proper type in C it could have
> a mix of properties of pointers/values/arrays
>
> for instance:
>
> disabled some pointer operations:
> name++, name--, *name
> (name + 1)
>
> kept:
>
> name[0]
> name = 0;
> if (name) { }

This creates a mess of inconsistencies. name[i] means *(name + i) so
banning pointer arithmetic and dereference would be a special case.
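The equivalence can be checked directly. This is a minimal sketch with a hypothetical function name; it only compares addresses, so nothing is dereferenced.

```c
#include <assert.h>

/* Returns 1 when all spellings of "element i" refer to the same object. */
static int same_element(const char *name, int i)
{
    assert(name != 0 && i >= 0);
    return &name[i] == name + i     /* name[i] is defined as *(name + i) */
        && &i[name] == name + i;    /* so the operands can even be swapped */
}
```

A type that kept name[0] but rejected *name and name + 1 would have to break this identity.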

> The motivation of having a proper name string is similar
> of the creation of bool/true/false instead of int 1 0.

I think it's a much more complex proposition than that.

--
Ben.

Thiago Adams

Aug 11, 2020, 10:10:14 AM
On Tuesday, August 11, 2020 at 10:58:37 AM UTC-3, Ben Bacarisse wrote:
Yes.

> > }
> >
> > works here as well
> >
> > struct Person {
> > string name;
> > };
> >
> > if string was a proper type in C it could have
> > a mix of properties of pointers/values/arrays
> >
> > for instance:
> >
> > disabled some pointer operations:
> > name++, name--, *name
> > (name + 1)
> >
> > kept:
> >
> > name[0]
> > name = 0;
> > if (name) { }
>
> This creates a mess of inconsistencies. name[i] means *(name + i) so
> banning pointer arithmetic and dereference would be a special case.

In this case, *(name + i) is only a syntax difference.

(
Actually, I have a side question about style when we pass
arrays to be used as pointers.

For instance: F(int *p) and we call

int a[10];

we have 3 forms

F(a);
F(&a);
F(&a[0]);

)


> > The motivation of having a proper name string is similar
> > of the creation of bool/true/false instead of int 1 0.
>
> I think it's a much more complex proposition than that.
>

Yes sure!

I am trying to think what semantics I would like to have
if we had a string type. Also the current alternatives.

Another sample:

#define string char *

void F(const string s) {
//s[0] = 'a'; ERROR AS EXPECTED
s = 0; //IS ALLOWED
}

I would expect both to be disallowed. (Like you said)

This syntax also could be useful:

void F(int size, const string s[size]);




Thiago Adams

Aug 11, 2020, 10:26:22 AM
On Tuesday, August 11, 2020 at 11:10:14 AM UTC-3, Thiago Adams wrote:
> On Tuesday, August 11, 2020 at 10:58:37 AM UTC-3, Ben Bacarisse wrote:
...
> > I think it's a much more complex proposition than that.
> >
>
> Yes sure!


Here is another decision about semantics:

Option 1 like : typedef char string[];

string s = "123456789";
static_assert(sizeof(s) == 10, "");
string s[] = { "123456789", "abcd" }; //error ?


Option 2 like: typedef char * string;

string s = "123456789";
static_assert(sizeof(s) == sizeof(char*), "");
string s[] = { "123456789", "abcd" };
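Both options can be tried today with plain typedefs. The sketch below assumes C11 (for static_assert) and uses two hypothetical names, string_array and string_pointer, so that both options fit in one file.

```c
#include <assert.h>

typedef char  string_array[];   /* Option 1 (name is hypothetical) */
typedef char *string_pointer;   /* Option 2 (name is hypothetical) */

static string_array   s1 = "123456789";   /* completed to char[10] */
static string_pointer s2 = "123456789";   /* pointer to a literal */

static_assert(sizeof s1 == 10, "nine digits plus the terminator");
static_assert(sizeof s2 == sizeof(char *), "just a pointer");

/* string_array list1[] = { "123456789", "abcd" };
   is rejected: array elements must have a complete type. */
static string_pointer list2[] = { "123456789", "abcd" };
```

With the array typedef the answer to the "error ?" question is yes, at least as the language stands.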

Bonita Montero

Aug 11, 2020, 10:41:39 AM
> C++ strings are bad in many aspects. ...

Why ?

Ben Bacarisse

Aug 11, 2020, 10:53:48 AM
I would not say "only" but I think we understand each other.

> (
> actually I have a side question about style is when we pass
> arrays to be used as pointers.
>
> For instance: F(int *p) and we call
>
> int a[10];
>
> we have 3 forms
>
> F(a);
> F(&a);
> F(&a[0]);

The middle one is a constraint violation.

For the third, &a[0] means &*(a+0) which means a+0 or essentially a.
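A short sketch of the three forms, with hypothetical function names. The address is the same in each case; only the type differs, which is why the middle form needs a diagnostic.

```c
#include <assert.h>

static int first(int *p) { return p[0]; }

static int call_forms(void)
{
    int a[10] = { 7 };
    int (*whole)[10] = &a;   /* &a has type int (*)[10], not int * */

    assert((void *)whole == (void *)a);   /* same address, different type */

    /* first(&a); would need a diagnostic: incompatible pointer types */
    return first(a) + first(&a[0]) + first(*whole);
}
```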

> )
>
>
>> > The motivation of having a proper name string is similar
>> > of the creation of bool/true/false instead of int 1 0.
>>
>> I think it's a much more complex proposition than that.
>>
>
> Yes sure!
>
> I am trying to think what semantics I would like to have
> if we had a string type.

To quote the old joke: "I wouldn't start from here" (here being C).

--
Ben.

Thiago Adams

Aug 11, 2020, 10:59:36 AM
On Tuesday, August 11, 2020 at 11:41:39 AM UTC-3, Bonita Montero wrote:
> > C++ strings are bad in many aspects. ...
>
> Why ?

For instance:

void F(std::string& s);
F("123456789");




Bonita Montero

Aug 11, 2020, 11:36:49 AM
>> Why ?

> For instance:
> void F(std::string& s);

void F( std::string const &s );

> F("123456789");

Where's the problem ?

Thiago Adams

Aug 11, 2020, 11:52:25 AM
C++ string is a specialized container of char and it does not
map directly to the C string type, that is, the type of the string
literals used in C and C++.

In this sample, what happens is that the string literal is copied
into new memory inside std::string. As you probably know, std::string
can have an optimization for small strings, but in any case
this is an unnecessary and bad copy.





Bonita Montero

Aug 11, 2020, 12:02:36 PM
> C++ string is a specialized container of char and it does not
> direct map the c string type - that is the type of string literals
> used in C and C++.

But you can treat a C++-string as a C-string by getting its storage
with c_str().
And you can't modify string-literals anyway, so where's the problem
with passing a string literal converted to a std::string-temporary ?

> In this sample, what happens is the copy of the literal string
> to a new memory inside std::string.

So what ? When you pass a string literal to a function the
storage of the string is inaccessible outside the call anyway.
You can't criticise C++ for using the wrong language-facilities;
you can still use a non-temporary string initialized with the
string-literal and everything is fine.

> Probably you known, std::string can have a optimization for
> small strings but in any case this is unnecessarily and bad copy.

This has nothing to do with the above since the small string storage
is also distinct from the literal-memory.

Thiago Adams

Aug 11, 2020, 12:14:00 PM
On Tuesday, August 11, 2020 at 1:02:36 PM UTC-3, Bonita Montero wrote:
> > C++ string is a specialized container of char and it does not
> > direct map the c string type - that is the type of string literals
> > used in C and C++.
>
> But you can treat a C++-string as a C-string by getting its storage
> with c_str().
> And you can't modify string-literals anyway, so where's the problem
> with passing a string literal converted to a std::string-temporary ?
>
> > In this sample, what happens is the copy of the literal string
> > to a new memory inside std::string.
>
> So what ?
If you are happy with this copy then there is nothing
I can say to convince you of the opposite.

Bonita Montero

Aug 11, 2020, 12:15:21 PM
>> So what ?

> If you are happy with this copy then there is nothing
> I can say to convince you the opposite.

C-strings are no alternative; they're simply crap.

Thiago Adams

Aug 11, 2020, 1:04:43 PM
On Tuesday, August 11, 2020 at 11:10:14 AM UTC-3, Thiago Adams wrote:
...
> I am trying to think what semantics I would like to have
> if we had a string type. Also the current alternatives.

Another alternative is to use typedef char string;

// Makes sense here
// s points to a c-string
void F1(const string* s);


// Considering C string is an array of chars ended with '\0'
// the only option here is a string of size 1 with '\0'

string s;
string s[1];

(this could be a warning or an error)

When the string is used as array (2 or more items)

string s[10];

We need to interpret/reinterpret string as char because
we don't have an array of 10 strings of size 1 ('\0').

Instead we have an array of chars that can also be
initialized as:

string s[] = "test";

David Brown

Aug 11, 2020, 1:08:19 PM
On 11/08/2020 15:42, Thiago Adams wrote:
>
> Is typedef for strings a good idea? (I think it is not)
>
> If we try to define
>
> typedef char * string;
>

A typedef for a string type is fine. But a "char*" alone is perhaps not
a suitable type for a useful string type (in the general sense rather
than the C library definition). If you consider a "string" to track
allocation of memory, length of string and length of allocation, and
perhaps also reference counting and copy-on-write, sub-strings and
slices, small string optimisations, etc., then you would pack all the
details into a struct. And then a typedef would be fine.

But "typedef char * string;" is unlikely to be a good plan. At the very
least, use:

typedef struct { char * s; } string;

That gives you a new type and eliminates many (but not all) of the
potential mixups and mistakes possible with the simple typedef.
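A minimal sketch of the wrapper. The helper string_length is hypothetical and only illustrates that the struct is a distinct type which has to be unwrapped explicitly.

```c
#include <assert.h>
#include <string.h>

typedef struct { char *s; } string;   /* wrapper type from the post */

/* Hypothetical helper: the only place that unwraps the pointer. */
static size_t string_length(string str)
{
    assert(str.s != NULL);
    return strlen(str.s);
}

/* strlen(str) with a `string` argument is rejected, and so is passing
   a bare char * where a `string` is expected. */
```

The price is that every call into the standard library has to go through the .s member.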

Thiago Adams

Aug 11, 2020, 1:13:41 PM
The idea is not to create a new string type for C
(in the sense of a string class or a totally new object with a new
memory layout), but instead just to map the current concept of C strings:
arrays of char ended with zero.


John Bode

Aug 11, 2020, 2:11:33 PM
On Tuesday, August 11, 2020 at 8:42:53 AM UTC-5, Thiago Adams wrote:
> Is typedef for strings a good idea? (I think it is not)
>

Not unless you build an entire API around it and make it a full-fledged ADT, no.

Same for all your other ideas, you can't just use a typedef or a macro, you have
to write an entire library to support it. Think of the FILE type and the stdio
routines as a template for the level of abstraction you need.
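As a rough idea of what the FILE-style approach looks like, here is a minimal sketch. The names Text, text_new, text_length, text_cstr and text_free are all hypothetical, and a real library would need many more operations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* In a header, clients would see only this incomplete type. */
typedef struct text Text;

/* In the implementation file: the layout stays private. */
struct text {
    size_t length;
    char  *data;     /* always null-terminated */
};

Text *text_new(const char *init)
{
    assert(init != NULL);
    Text *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->length = strlen(init);
    t->data = malloc(t->length + 1);
    if (t->data == NULL) {
        free(t);
        return NULL;
    }
    memcpy(t->data, init, t->length + 1);   /* copy the terminator too */
    return t;
}

size_t text_length(const Text *t) { return t->length; }
const char *text_cstr(const Text *t) { return t->data; }

void text_free(Text *t)
{
    if (t != NULL) {
        free(t->data);
        free(t);
    }
}
```

Client code only ever holds a Text *, in the same way stdio clients only hold a FILE *.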

Or, hack the language to add support for an honest-to-God string type, with
proper string semantics and operators.

The Real Non Homosexual

Aug 11, 2020, 2:49:08 PM
You're still a fucking idiot. Go off and write another 200 line undocumented function you fucking retard.

Keith Thompson

Aug 11, 2020, 3:07:31 PM
The problem with that is that "string" in C is not a type, so there
is no existing type that can reasonably be named "string".

A char[] object might or might not contain a string. A char* might
or might not point to the initial element of an array of char,
and that array might or might not contain a string.

Maybe C *should* have had a string type, but it doesn't. Which is
why I prefer to deal with pointers and arrays as pointers and arrays.

(Or you can define something similar to C++'s std::string, but
in C you'll need to do a lot of work manually and it's going to
be error-prone.)

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Bart

Aug 11, 2020, 3:18:41 PM
Go through the C standard library (and perhaps POSIX too).

What proportion of the char* parameters and return types are going to be
pointers to zero-terminated strings?

I'd say the majority will be.

So what's wrong with creating a special typedef for that? (Other than
the problem of properly applying a const attribute to it.) That would at
least have made my question much easier to answer than having to look
at the detailed specs of 1500 functions.

Keith Thompson

Aug 11, 2020, 3:41:06 PM
I find it misleading.

Something like
typedef char *string;
is particularly misleading, because it implies that strings are
pointers. Hiding a pointer type behind a typedef is often problematic
anyway, unless it's treated as an entirely opaque type (i.e.,
client code doesn't know it's a pointer).

C's treatment of strings is messy. Hiding that behind a typedef
doesn't make it less messy. As you say, it wouldn't interact well
with const, unless you define const and non-const variants.

I suppose if you define something like
typedef char *string_pointer;

that's not quite as bad. The name could act as a reminder that objects
declared with type string_pointer are intended to be pointers to strings
(or they can be null).

Of course it's at least partly a matter of taste, and nobody is
obligated to share mine.

Thiago Adams

Aug 11, 2020, 4:10:55 PM
On Tuesday, August 11, 2020 at 4:41:06 PM UTC-3, Keith Thompson wrote:
> Bart writes:
> > On 11/08/2020 20:07, Keith Thompson wrote:
How about?

typedef char string;

void F(const string* s);

struct Person {
string * name;
};

string s[] = "test";
string s[100] = {0};

The only thing that may look inconsistent is that
string s[100] is not an array of 100 strings of size 1.
Instead it is a string with 100 chars.

If it was a type (not a typedef) we could have a warning
here:

string s;
string s[1];

because it does not make sense.
The only option to represent a c string here is size 1 with '\0';

So this could be written as:
string s[] = "";
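With a plain typedef this already compiles. The sketch below assumes C11 for static_assert; the object names and the helper length are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef char string;   /* the typedef proposed above */

static string word[]      = "test";   /* 4 chars plus '\0' */
static string empty[]     = "";       /* just '\0' */
static string buffer[100] = { 0 };    /* room for 99 chars plus '\0' */

static_assert(sizeof word == 5, "");
static_assert(sizeof empty == 1, "");
static_assert(sizeof buffer == 100, "");

/* `const string *s` is simply `const char *s`. */
static size_t length(const string *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

The warning for a lone "string s;" is the part that would need compiler support, since to the compiler it is just a char.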









Keith Thompson

Aug 11, 2020, 5:44:07 PM
I don't find that any better. Just as a string is not a pointer,
a string is not a character.

> void F(const string* s);
>
> struct Person {
> string * name;
> };
>
> string s[] = "test";
> string s[100] = {0};
>
> The only thing that may look inconsistent is that
> string s[100] is not a array of 100 strings of size 1.
> Instead it is a string with 100 chars.
>
> If it was a type (not a typedef) we could have a warning
> here:
>
> string s;
> string s[1];
>
> because it does not make sense.
> The only option to represent a c string here is size 1 with '\0';
>
> So this could be written as:
> string s[] = "";

Any attempt to define a typedef for a C "string" that doesn't create an
opaque type will inevitably leave ugly gaps that will be visible to
client code.

In my opinion, the best approach is just to use char* and char[] as
appropriate. Anyone writing C code will have to understand how C
defines and manipulates strings, and the often confusing relationship
between pointers and arrays. (The comp.lang.c FAQ http://www.c-faq.com/
is very helpful for that.) If I'm reading code that uses a low-level
typedef for "string", I have to understand all that *and* how the
typedef is used.

It's rare for the intended audience for C source code to be people who
don't know the language well. I wouldn't optimize readability for
inexperienced readers at the expense of reducing it for experienced
readers.

Or you can define your own String type, perhaps something inspired by
C++'s std::string, but C doesn't have all the features needed to make
such a type conveniently usable, and the standard library still wants
char* and const char* arguments.

<OT>Or you can use another language.</OT>

Thiago Adams

Aug 12, 2020, 2:25:42 PM
On Tuesday, August 11, 2020 at 4:41:06 PM UTC-3, Keith Thompson wrote:
Just out of curiosity, the Windows API has the following typedefs.

LPCSTR for constant string and
LPSTR for non-constant string

typedef _Null_terminated_ CHAR *NPSTR, *LPSTR, *PSTR;
..
typedef _Null_terminated_ CONST CHAR *LPCSTR, *PCSTR;

This '_Null_terminated_' is used by the static analysis.


So, to avoid the const problem in pointers they created two
typedefs.
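The same two-typedef pattern in portable C might look like the sketch below. The names str_ptr and cstr_ptr are hypothetical stand-ins, not the Windows definitions, and there is no annotation for static analysis here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef char       *str_ptr;    /* hypothetical analogue of LPSTR  */
typedef const char *cstr_ptr;   /* hypothetical analogue of LPCSTR */

/* Read-only access: takes the const variant. */
static size_t count_upper(cstr_ptr s)
{
    size_t n = 0;
    assert(s != 0);
    for (; *s != '\0'; s++)
        if (isupper((unsigned char)*s))
            n++;
    return n;
}

/* Modifies the characters: takes the non-const variant. */
static void make_upper(str_ptr s)
{
    assert(s != 0);
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}
```

Writing const in front of str_ptr would still only make the pointer const, which is why the second typedef is needed.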

I don't know when the 'L' for 'long pointer'
last made sense.



Siri Cruise

Aug 12, 2020, 4:49:17 PM
> >> But "typedef char * string;" is unlikely to be a good plan. At the
> >> very least, use:
> >>
> >> typedef struct { char * s; } string;

And with that you can't pass it to str* functions, which is
presumably the point of naming it 'string'.

> The problem with that is that "string" in C is not a type, so there
> is no existing type that can reasonably be named "string".

A rational number is not simply a pair of integers. Since a
typedef cannot express that, rather than

typedef struct {int num; unsigned den;} Rational;

and letting all the rational* functions imply the rest of type,
you are recommending I instead do

typedef struct {int integer1; unsigned integer2;} IntegerPair;

because the implicit documentation that the integer pair is
intended to represent a rational is outweighed by the typedef not
completely and explicitly forcing it to be rational. Like the
typedef allows (IntegerPair){1, 0}, but you would need
rationalnumber(1, 0) to enforce that restriction.

> Maybe C *should* have had a string type, but it doesn't. Which is
> why I prefer to deal with pointers and arrays as pointers and arrays.

I prefer to deal with abstractions and try to contain
implementation details to a single place.

So rat.h would define the type Rat as
typedef struct {int num; unsigned den;} Rat;
Rat ratnumber (int num, int den);
// Return a rational number with a positive den
// and no common factor in num and den. Or
// return r such that ratisnan(r).
bool ratisrat (Rat r);
// Whether r is a valid rational.
bool ratisnan (Rat r);
// Whether r is an invalid rational.
int ratnum (Rat r);
// The numerator of r if ratisrat(r).
int ratden (Rat r);
// The denominator of r if ratisrat(r).
Rat ratadd (Rat q, Rat r);
// Add rationals.
// ratnumber(ratnum(q)*ratden(r)+ratnum(r)*ratden(q),
// ratden(q)*ratden(r))
// if ratisrat(q) and ratisrat(r).
...
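One possible rat.c for part of the interface above, as a sketch only. It assumes that an invalid rational is represented by den == 0 (the post does not say how), the helper gcd_u is hypothetical, and integer overflow is not handled.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int num; unsigned den; } Rat;

/* Hypothetical helper: greatest common divisor by Euclid's algorithm. */
static unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Assumption: den == 0 marks an invalid rational. */
bool ratisnan(Rat r) { return r.den == 0; }
bool ratisrat(Rat r) { return r.den != 0; }

Rat ratnumber(int num, int den)
{
    Rat r = { 0, 0 };
    if (den == 0)
        return r;                               /* invalid rational */

    bool negative = (num < 0) != (den < 0);
    unsigned n = num < 0 ? 0u - (unsigned)num : (unsigned)num;
    unsigned d = den < 0 ? 0u - (unsigned)den : (unsigned)den;
    unsigned g = gcd_u(n, d);                   /* d != 0, so g != 0 */

    n /= g;
    d /= g;
    r.num = negative ? -(int)n : (int)n;        /* overflow not handled */
    r.den = d;
    return r;
}

Rat ratadd(Rat q, Rat r)
{
    if (ratisnan(q) || ratisnan(r))
        return ratnumber(0, 0);
    /* Same formula as in the header comment; overflow not handled. */
    return ratnumber(q.num * (int)r.den + r.num * (int)q.den,
                     (int)(q.den * r.den));
}
```

Because every Rat is built through ratnumber, the invariants (positive denominator, no common factor) live in one place, which is the point being made about abstractions.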

Compile rat.c into rat.a and document you use rationals with

cc -lrat source.c
#include "rat.h"

--
:-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted. @
'I desire mercy, not sacrifice.' /|\
The first law of discordiamism: The more energy This post / \
to make order is nore energy made into entropy. insults Islam. Mohammed

Tim Rentsch

Aug 25, 2020, 9:34:25 AM
Speaking for myself, I often find it helpful, and not confusing,
to introduce a typedef name

typedef const char *String;

used to refer to null-terminated character sequences (and of
course the 'const' prevents modification without resorting to
some sort of type trickery). Other people I have worked with
have similarly found it useful and were not confused by it.

Of course, as with all things subjective, YMMV.