Is there any way in C to print ® ?

g.kana...@gmail.com

Nov 15, 2005, 12:08:56 AM

Hi,

I'm looking for a way to print ® in a C program. Could any of you
help me out?

Regards,
Raju

Peter Nilsson

Nov 15, 2005, 12:18:20 AM

g.kanaka.r...@gmail.com wrote:
> Hi,
>
> I'm looking for a way to print ® in a C program. Could any of you
> help me out?

putchar('®');

Though this is not strictly conforming since ® is not a member of the
basic execution character set.

Under C99, you can try...

putchar('\u00AE');

You could do...

printf("&reg;");

...and pipe your output through an HTML browser.

Lastly, just do printf("(R)");

--
Peter

haroon

Nov 15, 2005, 12:36:06 AM

Try this:
/*CODE BEGINS*/
putchar(174);
/*CODE ENDS*/

It will only work, however, if your system's character set puts ® at
code 174 (ISO-8859-1 does, for example).

Jordan Abel

Nov 15, 2005, 12:51:39 AM

What's the legal status of (R)? I know that (c) doesn't have any legal
status, though it hardly matters in jurisdictions where copyright is
automatic.

Sandeep

Nov 15, 2005, 12:10:43 PM

Peter Nilsson wrote:
>
> Though this is not strictly conforming since ® is not a member of the
> basic
> execution character set.
>
> Under C99, you can try...
>
> putchar('\u00AE');
>
> You could do...
>
> printf("&reg;");
>

Being more generic, you can look up the character/symbol that you want
to print in an ASCII table. 174 is ASCII for (R).

http://www.arachnoid.com/javascript/ascii.html

Lew Pitcher

Nov 15, 2005, 12:25:20 PM

Sandeep wrote:
[snip]

> 174 is ascii for (R).

Wrong. ASCII only extends from codepoint 0 through to codepoint 127. Any
characterset that has other codepoints /is not ASCII/.

> http://www.arachnoid.com/javascript/ascii.html

A more definitive source would be the ISO, ANSI or ECMA standards
documents. Of the three, the ECMA documents are the only 'free' ones
around. Check
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-006.pdf
and
http://www.ecma-international.org/publications/files/ecma-st/ECMA-048.pdf

Alternatively, you can take a look at the ISO/IEC JTC 1/SC 2 definition
of ASCII at http://anubis.dkuug.dk/i18n/charmaps/ASCII

--

Lew Pitcher, IT Specialist, Enterprise Data Systems
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed here are my own, not my employer's)

Robert Gamble

Nov 15, 2005, 1:45:46 PM

Sandeep wrote:
> Peter Nilsson wrote:
> >
> > Though this is not strictly conforming since ® is not a member of the
> > basic
> > execution character set.
> >
> > Under C99, you can try...
> >
> > putchar('\u00AE');
> >
> > You could do...
> >
> > printf("&reg;");
> >
>
>
> Being more generic, you can lookup the character/symbol that you want
> to print in a ascii table. 174 is ascii for (R).

Last I checked ASCII used seven bits to represent character values and
there are not 174 different possible encodings with 7 bits. Certain
extended codesets may define larger values using more bits but they are
not ASCII.

Robert Gamble

Clark S. Cox III

Nov 15, 2005, 1:55:45 PM

On 2005-11-15 12:10:43 -0500, "Sandeep" <sandeep...@gmail.com> said:
>
> Being more generic, you can lookup the character/symbol that you want
> to print in a ascii table. 174 is ascii for (R).
>
> http://www.arachnoid.com/javascript/ascii.html

This is not true, but is a common misconception. ASCII has values in
the range [0, 127]. Any character value that is greater than 127 is
*not* ASCII.
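
The range check is easy to state in code. A minimal sketch, not taken from any post in this thread (is_all_ascii is a name made up here), that reports whether a string stays inside 0..127:

```c
#include <stddef.h>

/* Returns 1 if every byte of s is in the ASCII range 0..127, else 0.
   is_all_ascii is a name made up for this sketch. */
int is_all_ascii(const char *s)
{
    size_t i;

    for (i = 0; s[i] != '\0'; i++) {
        /* Go through unsigned char: plain char may be signed, in which
           case byte 174 would otherwise show up as a negative value. */
        if ((unsigned char)s[i] > 127)
            return 0;
    }
    return 1;
}
```

With this, is_all_ascii("(R)") is 1, while a string containing the single byte 174 gives 0, whatever character that byte happens to mean in the local character set.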

--
Clark S. Cox, III
clar...@gmail.com

Simon Biber

Nov 15, 2005, 8:43:17 PM

g.kana...@gmail.com wrote:
> Hi,
>
> I'm looking for a way to print ® in a C program. Could any of you
> help me out?

Yeah, this should work...

#include <wchar.h>
#include <locale.h>

int main(void)
{
    setlocale(LC_CTYPE, "");
    putwchar(L'\u00AE');
    putwchar(L'\n');
    return 0;
}

Here is what happens when I run it on my system:

[sbiber@eagle c]$ echo $LANG
en_AU.UTF-8

This means that my system locale is set to Australian English, and the
execution character set is UTF-8.

[sbiber@eagle c]$ c99 -pedantic regtr.c -o regtr

It compiles cleanly as a C99 program.

[sbiber@eagle c]$ ./regtr
®

When run in the usual way, it outputs a registered trademark symbol in
the locale's character set, UTF-8.

[sbiber@eagle c]$ LANG=C ./regtr
(R)

When run in the C locale, it outputs the best-possible ASCII
representation of the character, which is the three characters '(', 'R',
')'.

[sbiber@eagle c]$ LANG=en_AU.ISO-8859-1 ./regtr | iconv -f ISO-8859-1
®

When run in the locale corresponding to Australian English in the
ISO-8859-1 character set, it outputs the registered trademark symbol in
that character set, which I then send to the 'iconv' command to convert
it from that character set back into the system default, UTF-8.
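
The "(R)" shown for the C locale is something the C library did on that system; as far as I know the standard does not promise it, and an implementation may simply report an encoding error instead. A sketch of doing the fallback by hand, assuming wchar_t holds Unicode code points (the case where __STDC_ISO_10646__ is defined) and a C99 compiler for the \u escape. registered_sign is a hypothetical helper, not a library function:

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: stores the registered sign, encoded for the
   current locale, in buf as a string. If the locale cannot represent
   it, stores the ASCII stand-in "(R)" instead. Returns the number of
   bytes stored (not counting the terminator), or 0 if buf is too small. */
size_t registered_sign(char *buf, size_t bufsize)
{
    static const char fallback[] = "(R)";
    char tmp[MB_LEN_MAX];
    const char *src = tmp;
    mbstate_t state;
    size_t n;

    memset(&state, 0, sizeof state);      /* initial conversion state */
    n = wcrtomb(tmp, L'\u00AE', &state);  /* C99 universal character name */

    if (n == (size_t)-1) {
        /* The current locale cannot represent the character. */
        src = fallback;
        n = sizeof fallback - 1;
    }
    if (n >= bufsize)
        return 0;                         /* no room; buf left untouched */

    memcpy(buf, src, n);
    buf[n] = '\0';
    return n;
}
```

Call setlocale(LC_CTYPE, "") first, then pass the buffer to fputs. The result is an ordinary narrow string, so it can be mixed with printf output without switching stdout to wide orientation.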

--
Simon.

Simon Biber

Nov 15, 2005, 9:05:21 PM

Peter Nilsson wrote:
> g.kanaka.r...@gmail.com wrote:
>
>>Hi,
>>
>>I'm looking for a way to print ® in a C program. Could any of you
>>help me out?
>
>
> putchar('®');
>
> Though this is not strictly conforming since ® is not a member of the
> basic
> execution character set.

This does NOT work even on some systems where '®' is a member of the
execution character set, because the character '®' may be a multi-byte
character. To be specific, on this Linux machine it consists of the two
bytes 0xC2 and 0xAE:

[sbiber@eagle c]$ echo -n ® | od -t x1
0000000 c2 ae
0000002
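
Those two bytes are just U+00AE run through the UTF-8 encoding rules. A small sketch of the encoder, written for illustration (utf8_encode is a made-up name, not a standard function):

```c
#include <stddef.h>

/* Encodes code point cp as UTF-8 into out, which must have room for
   4 bytes. Returns the number of bytes written, or 0 if cp cannot be
   encoded (a surrogate, or beyond U+10FFFF). */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {                      /* 1 byte: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)     /* surrogates are not characters */
        return 0;
    if (cp < 0x10000) {                   /* 3 bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For 0xAE this takes the two-byte branch and produces 0xC2 0xAE, matching the od output. A single putchar call can only ever write one of those bytes.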

> Under C99, you can try...
>
> putchar('\u00AE');

This has the same problem, as \u00AE may be a multi-byte character. One
correct solution is to put it in a string.

For example, puts("®") will always work so long as the character exists
in the source and execution character sets, even if it happens to take
more than one byte.

If you can't guarantee that the source character set contains the
character, you can use a universal character name, i.e.
puts("\u00AE"). This will still fail if the execution character set does
not contain that character, of course.

#include <stdio.h>
#include <locale.h>

int main(void)
{
    setlocale(LC_CTYPE, "");
    puts("This is using a literal character: ®");
    puts("This is using a universal character sequence: \u00AE");
    return 0;
}

[sbiber@eagle c]$ c99 -pedantic regtr.c -o regtr

[sbiber@eagle c]$ ./regtr
This is using a literal character: ®
This is using a universal character sequence: ®

--
Simon.

Sandeep

Nov 15, 2005, 10:21:26 PM

I should have written "Extended Ascii Character Set".

Lew Pitcher

Nov 15, 2005, 10:59:32 PM

You still would have been wrong.

ASCII has but one extension, and that's the 8-bit NAPLPS extension that looks
nothing like the characterset documented at
http://www.arachnoid.com/javascript/ascii.html

The NAPLPS "extension" is ANSI_X3.110-1983 documented at
http://ra.dkuug.dk/i18n/charmaps/NAPLPS

What that arachnoid.com page documents is "some characterset that has 128
codepoints in common with ASCII, but is otherwise unrelated to ASCII".

--
Lew Pitcher

Master Codewright & JOAT-in-training | GPG public key available on request
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.

Jordan Abel

Nov 16, 2005, 2:43:06 AM

On 2005-11-16, Lew Pitcher <lpit...@sympatico.ca> wrote:
>
> Sandeep wrote:
>> Clark S. Cox III wrote:
>>
>>>On 2005-11-15 12:10:43 -0500, "Sandeep" <sandeep...@gmail.com> said:
>>>
>>>>Being more generic, you can lookup the character/symbol that you want
>>>>to print in a ascii table. 174 is ascii for (R).
>>>>
>>>>http://www.arachnoid.com/javascript/ascii.html
>>>
>>>This is not true, but is a common misconception. ASCII has values in
>>>the range [0, 127]. Any character value that is greater than 127 is
>>>*not* ASCII.
>>
>>
>> I should have written "Extended Ascii Character Set".
>
> You still would have been wrong.
>
> ASCII has but one extension, and that's the 8-bit NAPLPS extension that looks
> nothing like the characterset documented at
> http://www.arachnoid.com/javascript/ascii.html
>
> The NAPLPS "extension" is ANSI_X3.110-1983 documented at
> http://ra.dkuug.dk/i18n/charmaps/NAPLPS
>
> What that arachnoid.com page documents is "some characterset that has 128
> codepoints in common with ASCII, but is otherwise unrelated to ASCII".

Eh? How do you figure that ISO 8859 [as shown] and ISO 10646 [a
subset of which is shown] don't extend ASCII? Or, for that matter, ISO
2022? Where does this "NAPLPS" thing get sole legitimacy from?

Kenneth Brody

Nov 16, 2005, 10:49:51 AM

Simon Biber wrote:
>
> g.kana...@gmail.com wrote:
> > Hi,
> >

> > I'm looking for a way to print ® in a C program. Could any of you
> > help me out?
>
> Yeah, this should work...
>
> #include <wchar.h>
> #include <locale.h>
>
> int main(void)
> {
> setlocale(LC_CTYPE, "");
> putwchar(L'\u00AE');
> putwchar(L'\n');
> return 0;
> }

By "should work", I assume you mean "should work, if you happen to have
the same non-standard extensions that I do"? (Or are "setlocale" and
"putwchar" part of standard C?)

Here's my compiler's output:

==========
foo.c
foo.c(7) : warning C4129: 'u' : unrecognized character escape sequence
foo.c(7) : error C2065: 'stdout' : undeclared identifier
foo.c(7) : warning C4047: 'function' : 'struct _iobuf *' differs in levels of indirection from 'int '
foo.c(7) : warning C4024: 'fputwc' : different types for formal and actual parameter 2
foo.c(8) : warning C4047: 'function' : 'struct _iobuf *' differs in levels of indirection from 'int '
foo.c(8) : warning C4024: 'fputwc' : different types for formal and actual parameter 2
==========

> Here is what happens when I run it on my system:

Well, it won't even _compile_ on mine.

[...]

> [sbiber@eagle c]$ c99 -pedantic regtr.c -o regtr
>
> It compiles cleanly as a C99 program.

Ah. I don't have C99.

[...]

If I add the missing "#include <stdio.h>", I still get the warning about
"'u' : unrecognized character escape sequence", but it compiles and links.
Of course, running it simply gives me a "u".

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+
Don't e-mail me at: <mailto:ThisIsA...@gmail.com>

Roger Leigh

Nov 16, 2005, 3:37:27 PM

g.kana...@gmail.com writes:

> I'm looking for a way to print ® in a C program. Could any of you
> help me out?

$ cat cp.c
#include <stdio.h>

int main(void)
{
    printf ("Copyright © 2005 Roger Leigh <rleigh -at- debian.org>\n");
    return 0;
}

$ file cp.c
cp.c: UTF-8 Unicode C program text

$ c99 -o cp cp.c

$ locale -k charmap
charmap="UTF-8"

$ ./cp
Copyright © 2005 Roger Leigh <rleigh -at- debian.org>

Modern GCCs use UTF-8 as the narrow execution charset and UTF-32
(UCS-4) as the wide execution charset. The input charset defaults to
UTF-8, so you can just write UTF-8 Unicode C source files and build
them as you would plain ASCII source. So you can basically use the
universal character set for everything. You could even get gcc to
recode it with -fexecution-charset and/or -finput-charset.

I'm not sure how this is implemented by other compilers, but there's
nothing non-standard about this.
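
One way to see which narrow execution charset the compiler actually used is to dump the bytes of a string literal. A rough sketch (hex_bytes is a name invented for this example):

```c
#include <stddef.h>
#include <stdio.h>

/* Formats the bytes of s as space-separated lowercase hex into out.
   Returns 0 on success, -1 if out is too small. */
int hex_bytes(const char *s, char *out, size_t outsize)
{
    size_t used = 0;
    size_t i;

    if (outsize == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; s[i] != '\0'; i++) {
        /* Each step writes at most a separator, two hex digits and
           the terminator, so make sure 4 bytes are still free. */
        if (used + 4 > outsize)
            return -1;
        used += (size_t)sprintf(out + used, "%s%02x",
                                i ? " " : "",
                                (unsigned int)(unsigned char)s[i]);
    }
    return 0;
}
```

Passing the literal "©" should give "c2 a9" when the execution charset is UTF-8. I would expect a single "a9" after building with -fexecution-charset=ISO-8859-1, though I have not checked that on every GCC version.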

Regards,
Roger

--
Roger Leigh
Printing on GNU/Linux? http://gimp-print.sourceforge.net/
Debian GNU/Linux http://www.debian.org/
GPG Public Key: 0x25BFB848. Please sign and encrypt your mail.

Keith Thompson

Nov 16, 2005, 4:19:32 PM

Kenneth Brody <kenb...@spamcop.net> writes:
> Simon Biber wrote:
>>
>> g.kana...@gmail.com wrote:
>> > Hi,
>> >

>> > I'm looking for a way to print ® in a C program. Could any of you

>> > help me out?
>>
>> Yeah, this should work...
>>
>> #include <wchar.h>
>> #include <locale.h>
>>
>> int main(void)
>> {
>> setlocale(LC_CTYPE, "");
>> putwchar(L'\u00AE');
>> putwchar(L'\n');
>> return 0;
>> }
>
> By "should work", I assume you mean "should work, if you happen to have
> the same non-standard extensions that I do"? (Or are "setlocale" and
> "putwchar" part of standard C?)

Yes, both setlocale() and putwchar() are standard, at least in C99.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Simon Biber

Nov 16, 2005, 6:20:31 PM

Kenneth Brody wrote:
> Simon Biber wrote:
>
>>g.kana...@gmail.com wrote:
>>
>>>Hi,
>>>

>>>I'm looking for a way to print ® in a C program. Could any of you

>>>help me out?
>>
>>Yeah, this should work...
>>
>>#include <wchar.h>
>>#include <locale.h>
>>
>>int main(void)
>>{
>> setlocale(LC_CTYPE, "");
>> putwchar(L'\u00AE');
>> putwchar(L'\n');
>> return 0;
>>}
>
>
> By "should work", I assume you mean "should work, if you happen to have
> the same non-standard extensions that I do"? (Or are "setlocale" and
> "putwchar" part of standard C?)

Indeed they are part of standard C, and not just C99: setlocale has been
there since C89, and I seem to remember wide character support (<wchar.h>)
being added in the 1994 amendment.

Universal character names such as \u00AE are C99 only, though, which is why
your compiler gives warning C4129.

> Here's my compiler's output:
>
> ==========
> foo.c
> foo.c(7) : warning C4129: 'u' : unrecognized character escape sequence
> foo.c(7) : error C2065: 'stdout' : undeclared identifier
> foo.c(7) : warning C4047: 'function' : 'struct _iobuf *' differs in levels of indirection from 'int '
> foo.c(7) : warning C4024: 'fputwc' : different types for formal and actual parameter 2
> foo.c(8) : warning C4047: 'function' : 'struct _iobuf *' differs in levels of indirection from 'int '
> foo.c(8) : warning C4024: 'fputwc' : different types for formal and actual parameter 2
> ==========
>
>
>>Here is what happens when I run it on my system:
>
>
> Well, it won't even _compile_ on mine.
>
> [...]
>
>>[sbiber@eagle c]$ c99 -pedantic regtr.c -o regtr
>>
>>It compiles cleanly as a C99 program.
>
>
> Ah. I don't have C99.
>
> [...]
>
> If I add the missing "#include <stdio.h>", I still get the warning about
> "'u' : unrecognized character escape sequence", but it compiles and links.
> Of course, running it simply gives me a "u".

<stdio.h> was not actually missing. The putwchar function is declared in
<wchar.h>, and it is not necessary to #include <stdio.h>.

---- C99 quote ----
7.24.3.9 The putwchar function
Synopsis
#include <wchar.h>
wint_t putwchar(wchar_t c);

Description
The putwchar function is equivalent to putwc with the second
argument stdout.

Returns
The putwchar function returns the character written, or WEOF.
----

It sounds like your implementation implements putwchar as a macro such as:

#define putwchar(c) putwc((c), stdout)

which then fails because stdout is not defined by <wchar.h>. I believe
such an implementation would not conform to the standard. Can anyone
confirm that?
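
On the macro question: C99 7.1.4 lets an implementation provide a function-like macro for any library function, but the header must still declare the real function, and writing the name in parentheses suppresses the macro. A sketch of that technique, assuming a conforming <wchar.h> (put_wide is a name made up here):

```c
#include <wchar.h>

/* Writes one wide character to stdout through the real putwchar
   function, bypassing any function-like macro of the same name. */
wint_t put_wide(wchar_t c)
{
    /* Because the name is followed by ')' rather than '(', a
       function-like macro called putwchar cannot expand here. */
    return (putwchar)(c);
}
```

This only sidesteps the macro. It says nothing about whether the library behind it can actually convert the character for the current locale.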

--
Simon.

Kenneth Brody

Nov 17, 2005, 11:11:43 AM

Simon Biber wrote:
[...]

> > By "should work", I assume you mean "should work, if you happen to have
> > the same non-standard extensions that I do"? (Or are "setlocale" and
> > "putwchar" part of standard C?)
>
> Indeed they are part of standard C, and not just C99 -- I seem to
> remember locales and wide character support was added in the 1994 amendment.

[...]

> It sounds like your implementation implements putwchar as a macro such as:
>
> #define putwchar(c) putwc((c), stdout)
>
> which then fails because stdout is not defined by <wchar.h>. I believe
> such an implementation would not conform to the standard. Can anyone
> confirm that?

Actually:

#define putwchar(_c) fputwc((_c),stdout)

But, as I said, mine isn't a C99 compiler (AFAIK), so it doesn't have to
conform to the C99 standard.

Stephen Sprunk

Dec 7, 2005, 12:36:45 PM

"Jordan Abel" <jma...@purdue.edu> wrote in message
news:slrndnlop7...@random.yi.org...

> On 2005-11-16, Lew Pitcher <lpit...@sympatico.ca> wrote:
>> Sandeep wrote:
>>> Clark S. Cox III wrote:
>>>>On 2005-11-15 12:10:43 -0500, "Sandeep" <sandeep...@gmail.com> said:
>>>>>Being more generic, you can lookup the character/symbol that
>>>>>you want to print in a ascii table. 174 is ascii for (R).
>>>>>
>>>>>http://www.arachnoid.com/javascript/ascii.html
>>>>
>>>>This is not true, but is a common misconception. ASCII has values
>>>>in the range [0, 127]. Any character value that is greater than 127
>>>>is *not* ASCII.
>>>
>>> I should have written "Extended Ascii Character Set".
>>
>> You still would have been wrong.

...

> Eh? How do you figure that ISO 8859 [as shown] and ISO 10646 [as a
> subset of which is shown] don't extend ascii? Or, for that matter, ISO
> 2022? Where does this "NAPLPS" thing get sole legitimacy from?

If you're going to use that argument, then virtually every character set
ever developed is an "Extended ASCII Character Set", from Latin-1 to
Unicode, making it a completely meaningless term. And, since not all
character sets which are a superset of ASCII have ® as codepoint 174, you're
still wrong.

The page you cite as being "ASCII" appears, after a brief review, to
describe either Latin-1 or CP1252. That's fine if you know the execution
character set is one of those, but most of us can't assume that safely.
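
To make the point concrete, here is a small lookup of where ® sits in a few single-byte character sets. It is only a sketch: the function name and the charset spellings are invented for this example, and the table is nowhere near complete.

```c
#include <string.h>

/* Returns the single byte that encodes the registered sign in the
   named character set, or -1 if the set has no such character or is
   not in this (very short) table. */
int registered_byte(const char *charset)
{
    static const struct {
        const char *name;
        int byte;
    } table[] = {
        { "ISO-8859-1", 0xAE },
        { "CP1252",     0xAE },
        { "MacRoman",   0xA8 },  /* 0xAE is the AE ligature there */
        { "CP437",      -1   },  /* 0xAE is a left guillemet there */
        { "US-ASCII",   -1   }   /* nothing above 127 at all */
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(charset, table[i].name) == 0)
            return table[i].byte;
    }
    return -1;  /* unknown set: do not guess */
}
```

So putchar(174) prints ® under Latin-1 or CP1252, but something else entirely under Mac Roman or CP437, and only part of a multi-byte sequence under UTF-8.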

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin

Jordan Abel

Dec 7, 2005, 2:24:07 PM

On 2005-12-07, Stephen Sprunk <ste...@sprunk.org> wrote:
> "Jordan Abel" <jma...@purdue.edu> wrote in message
> news:slrndnlop7...@random.yi.org...
>> On 2005-11-16, Lew Pitcher <lpit...@sympatico.ca> wrote:
>>> Sandeep wrote:
>>>> Clark S. Cox III wrote:
>>>>>On 2005-11-15 12:10:43 -0500, "Sandeep" <sandeep...@gmail.com> said:
>>>>>>Being more generic, you can lookup the character/symbol that you
>>>>>>want to print in a ascii table. 174 is ascii for (R).
>>>>>>
>>>>>>http://www.arachnoid.com/javascript/ascii.html
>>>>>
>>>>>This is not true, but is a common misconception. ASCII has values
>>>>>in the range [0, 127]. Any character value that is greater than 127
>>>>>is *not* ASCII.
>>>>
>>>> I should have written "Extended Ascii Character Set".
>>>
>>> You still would have been wrong.
> ...
>> Eh? How do you figure that ISO 8859 [as shown] and ISO 10646 [as a
>> subset of which is shown] don't extend ascii? Or, for that matter,
>> ISO 2022? Where does this "NAPLPS" thing get sole legitimacy from?
>
> If you're going to use that argument, then virtually every character
> set ever developed is an "Extended ASCII Character Set", from Latin-1
> to Unicode, making it a completely meaningless term.

It excludes EBCDIC derivatives. Arguably also national iso646 variants,
if we take it to mean extensions of IRV-1991 rather than of the
invariant subset.

> And, since not all character sets which are a superset of ASCII have ®
> as codepoint 174, you're still wrong.
>
> The page you

It wasn't I. I was just curious about why "NAPLPS" has special
legitimacy.

Besides, looking up 174 will give you the value to put in \u00AE.
[Incidentally, what does that yield if ® is not in the execution
character set?]

Lew Pitcher

Dec 7, 2005, 2:42:05 PM

AFAIK, the only 'extension' to ASCII sanctioned by ISO is/was ISO/IEC
4873:1991 ("ISO 8-bit code for information interchange -- Structure and
rules for implementation").

IIRC, this is a codified version of the ANSI NAPLPS characterset
(http://anubis.dkuug.dk/i18n/charmaps/NAPLPS)

[snip]

--

Lew Pitcher, IT Specialist, Enterprise Data Systems
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed here are my own, not my employer's)

Jordan Abel

Dec 7, 2005, 3:12:24 PM

What does 'sanctioned' mean? Was ISO/IEC 8859:1998 "unsanctioned"? How
about 10646? 10637? 2022?

> IIRC,

AFAICT, YRI. [From what I can find online, 4873 is not a version of
NAPLPS, but is actually more closely related to 2022.]