
byte & char ?


perseus

Apr 24, 2002, 10:02:16 AM
hi all,

I am now reading the source code of a project and I find that there is a
type "byte" in it. So I wanna ask if there is a type called byte in
Standard C++? If yes, what's the relationship with char, and how could I
convert one to another? (Sorry for the last two questions if the byte type
does not exist.)

Perseus


perseus

Apr 24, 2002, 10:04:06 AM
Sorry, I have already found out that there isn't one. Thanks a lot.


"perseus" <perseus...@hotmail.com> wrote in message
news:3cc6...@newsgate.hknet.com...

Bob Hairgrove

Apr 24, 2002, 10:30:49 AM

<LOL mode> ARRRGGGHHH ... here we go again!!! </LOL mode>

May I refer you to the previous thread with subject of:
"B. Stroustrup RE: sizeof(char) and sizeof(int)"?
You'll find *lots* to read about in those messages.

Anyway, the crux of the issue is this:
In Standard C++, a char is defined as being one byte. However, the
number of bits in a byte is implementation-defined. The smallest byte
must have 8 bits, however, because it must accommodate 255 unsigned
char's or 127 signed char's.
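
The lower bounds involved can be read straight from <climits>. A minimal
sketch, assuming a C++11 compiler (static_assert postdates this thread);
the helper name bits_per_byte is made up for illustration:

```cpp
#include <climits>   // CHAR_BIT, UCHAR_MAX, SCHAR_MAX

// The standard sets lower bounds only; an implementation may exceed them.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(UCHAR_MAX >= 255, "unsigned char covers at least 0..255");
static_assert(SCHAR_MAX >= 127, "signed char reaches at least +127");

// Hypothetical helper: number of bits in one byte on this implementation.
constexpr int bits_per_byte() { return CHAR_BIT; }
```

On common desktop and server platforms bits_per_byte() returns 8, but
portable code should only rely on the "at least" bounds asserted here.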

For portable terminology, use "octet" when you want to refer to a
specific number of bits (i.e. 8 bits, to be exact).

There are machines with 9-bit bytes which are still actively used, AFAIK.


Bob Hairgrove
rhairgro...@Pleasebigfoot.com

Donovan Rebbechi

Apr 24, 2002, 10:31:25 AM

There's no type called "byte", but the term "byte" is often used in the context
of C++. Byte refers to a unit of size, and means "the size of one character",
not the conventional "8 bits".
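
A "byte" type found in project source is usually the project's own alias.
A minimal sketch, assuming the alias is a typedef for unsigned char (the
thread does not say how the project defines it); the namespace project and
the helpers to_byte and to_char are hypothetical names:

```cpp
namespace project {

// Assumption: the code base defines its own alias, as many do.
typedef unsigned char byte;

// char -> byte: well defined for every char value; negative values
// wrap modulo 2^CHAR_BIT.
inline byte to_byte(char c) { return static_cast<byte>(c); }

// byte -> char: where plain char is signed, values above CHAR_MAX
// gave an implementation-defined result in the C++ of this era.
inline char to_char(byte b) { return static_cast<char>(b); }

}  // namespace project
```

Both types have size 1, so the casts never change the storage size; only
the interpretation of the top bit can differ.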

--
Donovan

Gianni Mariani

Apr 24, 2002, 11:10:20 AM
Bob Hairgrove wrote:

> ... The smallest byte
> must have 8 bits, however, because it must accommodate 255 unsigned
> char's or 127 signed char's.

And I always thought there were 256 values in both ...

I'd better go back and re-learn it then, eh? :)
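
The count can be checked rather than remembered. A minimal sketch,
assuming C++11 (constexpr std::numeric_limits); value_count is a
hypothetical helper, and the long long arithmetic assumes the char types
are narrower than long long:

```cpp
#include <limits>

// Hypothetical helper: number of distinct values an integer type T holds,
// computed as max - min + 1 in a wider type to avoid overflow.
template <typename T>
constexpr long long value_count() {
    return static_cast<long long>(std::numeric_limits<T>::max())
         - static_cast<long long>(std::numeric_limits<T>::min()) + 1;
}
```

With 8-bit bytes and two's complement, both value_count<unsigned char>()
and value_count<signed char>() give 256. The C limits that C++ inherited
at the time only promised -127..127 for signed char, so 255 was the
portable minimum there.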

Alexander Terekhov

Apr 24, 2002, 12:04:28 PM

Donovan Rebbechi wrote:
>
> In article <3cc6...@newsgate.hknet.com>, perseus wrote:
> > hi all,
> >
> > I am now reading the source code of a project and I find that there is a
> > type "byte" in it. So I wanna ask if there is a type called byte in
> > standard c++ ?

If you have no spare 18 bucks to spend on "Official" Standard C++ PDF,
well... don't tell anyone, but you could get something *even better*
and with no charges at all (your ISP aside) here: ;-)

http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/2001/n1334/

"....Consolidated Technical Corrigendum for
     ^^^^^^^^^^^^
International Standard for Information Systems -
Programming Language C++ .... DRAFT: 9 November 2001"

Personally, I think that it would be really nice if more folks
would pay more attention to the early TC drafts and provide
their feedback/objections/whatever while the "internal" process
is still under way...

> > if yes, what's the relationship with char ? and how could I
> > convert one to another? (sorry for the last two question if byte type does
> > not exist).
>
> There's no type called "byte", but the term "byte" is often used in the context
> of C++. Byte refers to a unit of size, and means "the size of one character",
> not the conventional "8 bits".

Well, ;-) I think that you are "right" but that's actually INCORRECT
and/or IRRELEVANT *in practice* (real-life programming for 99.9999%
of C/C++ folks out there). Wintel aside, CONSIDER:

http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap03.html#tag_03_84

"Byte

An individually addressable unit of data storage
that is exactly an octet, used to store a character
or a portion of a character; see also Character.
A byte is composed of a contiguous sequence of 8
bits. The least significant bit is called the
"low-order" bit; the most significant is called
the "high-order" bit.

Note:

The definition of byte from the ISO C standard is
broader than the above and might accommodate hardware
architectures with different sized addressable units
than octets. "

http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap03.html#tag_03_87

"Character

A sequence of one or more bytes representing a single
graphic symbol or control code.

Note:

This term corresponds to the ISO C standard term
multi-byte character, where a single-byte character
is a special case of a multi-byte character. Unlike
the usage in the ISO C standard, character here has
no necessary relationship with storage space, and
byte is used when storage space is discussed.

See the definition of the portable character set
in Portable Character Set for a further explanation
of the graphical representations of (abstract)
characters, as opposed to character encodings."

http://www.opengroup.org/onlinepubs/007904975/basedefs/limits.h.html

"{CHAR_BIT}
Number of bits in a type char.
[CX] Value: 8 "

"[CX] Extension to the ISO C standard

The functionality described is an extension
to the ISO C standard. Application writers
may make use of an extension as it is supported
on all IEEE Std 1003.1-2001-conforming systems."

"The values for the limits {CHAR_BIT}, {SCHAR_MAX},
and {UCHAR_MAX} are now required to be 8, +127,
and 255, respectively."

http://www.opengroup.org/onlinepubs/007904975/xrat/xbd_chap03.html#tag_01_03_00_02

"Byte

The restriction that a byte is now exactly eight
bits was a conscious decision by the standard
developers. It came about due to a combination
of factors, primarily the use of the type
int8_t within the networking functions and the
alignment with the ISO/IEC 9899:1999 standard,
where the intN_t types are now defined.

According to the ISO/IEC 9899:1999 standard:

The [u]intN_t types must be two's complement
with no padding bits and no illegal values.

All types (apart from bit fields, which are not
relevant here) must occupy an integral number of
bytes.

If a type with width W occupies B bytes with C
bits per byte (C is the value of {CHAR_BIT}),
then it has P padding bits where P + W = B * C.

Therefore, for int8_t P=0, W=8. Since B>=1, C>=8,
the only solution is B=1, C=8.

The standard developers also felt that this was
not an undue restriction for the current state-of-
the-art for this version of IEEE Std 1003.1, but
recognize that if industry trends continue, a wider
character type may be required in the future."
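
The argument quoted above can be restated as compile-time checks. A
minimal sketch, assuming C++11 and <cstdint>, where int8_t is an optional
type and INT8_MAX is defined only when it exists; padding_bits is a
hypothetical helper for the relation P = B * C - W:

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::int8_t and INT8_MAX, if the platform has them

#ifdef INT8_MAX
// int8_t exists here, so W = 8 and P = 0, which forces B * C = 8.
// With B >= 1 and C >= 8 the only solution is B = 1, C = 8.
static_assert(sizeof(std::int8_t) == 1, "B must be 1");
static_assert(CHAR_BIT == 8, "C must be 8");
#endif

// Hypothetical helper: padding bits of a type that occupies `bytes`
// bytes and has `width` value and sign bits.
constexpr int padding_bits(int bytes, int width) {
    return bytes * CHAR_BIT - width;
}
```

On a platform without an 8-bit byte the #ifdef branch is simply skipped,
because such a platform cannot provide int8_t at all.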

"Character

The term "character" is used to mean a sequence
of one or more bytes representing a single graphic
symbol. The deviation in the exact text of the
ISO C standard definition for "byte" meets the
intent of the rationale of the ISO C standard
also clears up the ambiguity raised by the term
"basic execution character set". The octet-minimum
requirement is a reflection of the {CHAR_BIT} value."

regards,
alexander.

Bob Hairgrove

Apr 24, 2002, 12:33:31 PM
On 24 Apr 2002 15:10:20 GMT, Gianni Mariani <giann...@mariani.ws>
wrote:

Well, 256 unsigned values *if* you count the 0, which is
traditionally used as a delimiter ...

Instead of "char" I should have written "character" and specified that
a character is by definition (perhaps by mine<g>) unsigned, since a
character is not a number, so "signed char" would hold 127 characters,
the sign giving the character some special meaning (not in the sense
of "signed" as with a number type).


Bob Hairgrove
rhairgro...@Pleasebigfoot.com
