Re: representation of register's fields

James Harris

Feb 5, 2010, 1:05:36 PM

On 27 Jan, 23:25, James Harris <james.harri...@googlemail.com> wrote:
> On 25 Jan, 18:50, "bae...@gmail.com" <bae...@gmail.com> wrote:

...

> > james's suggestion works fine, but dependency remains. for example
> > what about machines having 2 byte integer ?
>
> Can you say a bit more about what you have in mind? For example, maybe
> you mean representing a 4-byte integer on a 16-bit machine or maybe
> you mean representing a 2-byte integer, etc.

I suspect there is no problem. Unlike C's bitfield syntax, which I
believe is non-portable, masking and shifting should be independent of
the endianness. All that's needed is that the register containing the
fields should have the right width.

...

> FWIW a general write of the field (in a machine word called "fields")
> would be something like
>
> fields = (fields & ~ B5_MASK) | ((new_b5 & B5_MASK) << B5_SHIFT)

Sorry, this is wrong. I had the mask and shift swapped. As penance
I've written up this and other examples of bitfield manipulation at

http://codewiki.wikispaces.com/bitfield_operations.c

Hopefully they are all correct. If anyone spots an error let me know
and I'll fix it.
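
For reference, a minimal sketch of the corrected write: shift the new value first, mask second. It assumes B5_MASK is the mask of the field in its final (already shifted) position and B5_SHIFT is the bit number of the field's lowest bit. The 3-bit layout and the function name are made up for illustration and are not taken from the wiki page.

```c
#include <stdint.h>

/* Hypothetical layout: a 3-bit field occupying bits 5..7 of the word. */
#define B5_SHIFT 5u
#define B5_MASK  ((uint32_t)0x7 << B5_SHIFT)  /* mask in the field's final position */

/* Return 'fields' with the B5 field replaced by new_b5.
   Bits of new_b5 that do not fit in the field are dropped. */
static uint32_t write_b5(uint32_t fields, uint32_t new_b5)
{
    return (fields & ~B5_MASK) | ((new_b5 << B5_SHIFT) & B5_MASK);
}
```

With this layout, write_b5(0, 5) sets only bits 5 and 7, and an oversized value such as 0xF is truncated to the three bits of the field.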

(added comp.lang.c)

James
--
comp.lang.c.moderated - moderation address: cl...@plethora.net -- you must
have an appropriate newsgroups line in your header for your mail to be seen,
or the newsgroup name in square brackets in the subject line. Sorry.

Jasen Betts

Feb 6, 2010, 5:33:35 AM

On 2010-02-05, James Harris <james.h...@googlemail.com> wrote:
> On 27 Jan, 23:25, James Harris <james.harri...@googlemail.com> wrote:
>> On 25 Jan, 18:50, "bae...@gmail.com" <bae...@gmail.com> wrote:
>
> ...
>
>> > james's suggestion works fine, but dependency remains. for example
>> > what about machines having 2 byte integer ?
>>
>> Can you say a bit more about what you have in mind? For example, maybe
>> you mean representing a 4-byte integer on a 16-bit machine or maybe
>> you mean representing a 2-byte integer, etc.
>
> I suspect there is no problem. Unlike C's bitfield syntax, which I
> believe is non-portable, masking and shifting should be independent of
> the endianness. All that's needed is that the register containing the
> fields should have the right width.

Masking and shifting on integers larger than 8 bits is non-portable if
you want the same bit order in memory or on disk,

or if you encounter hardware with a different byte width.

--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---

James Harris

Feb 6, 2010, 6:58:01 AM

On 6 Feb, 10:33, Jasen Betts <ja...@xnet.co.nz> wrote:

> On 2010-02-05, James Harris <james.harri...@googlemail.com> wrote:
>
>
>
> > On 27 Jan, 23:25, James Harris <james.harri...@googlemail.com> wrote:
> >> On 25 Jan, 18:50, "bae...@gmail.com" <bae...@gmail.com> wrote:
>
> > ...
>
> >> > james's suggestion works fine, but dependency remains. for example
> >> > what about machines having 2 byte integer ?
>
> >> Can you say a bit more about what you have in mind? For example, maybe
> >> you mean representing a 4-byte integer on a 16-bit machine or maybe
> >> you mean representing a 2-byte integer, etc.
>
> > I suspect there is no problem. Unlike C's bitfield syntax, which I
> > believe is non-portable, masking and shifting should be independent of
> > the endianness. All that's needed is that the register containing the
> > fields should have the right width.
>
> masking and shifting on integers larger than 8-bit is non-portable, if
> you want the same bit-order in memory or on disk.

Are you sure this is relevant? I would have thought masking and
shifting were machine-independent. Storing them in memory and passing
them between machines is endianness-dependent, yes, but that is true
of integers, period. It has nothing to do with shifting and masking.
It's a different issue.

Binary values are endianness-independent. Shift left moves to more
significant, shift right to less significant.
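
To illustrate, a small sketch (the function name and its parameters are hypothetical): extracting a field works purely on the value, so it should give the same result whatever the byte order of the machine.

```c
#include <stdint.h>

/* Extract 'width' bits starting at bit 'shift' (bit 0 = least significant).
   Assumes 0 < width < 32 and shift + width <= 32. */
static uint32_t get_field(uint32_t word, unsigned shift, unsigned width)
{
    uint32_t mask = ((uint32_t)1 << width) - 1u;
    return (word >> shift) & mask;
}
```

For example, get_field(0x45030000, 16, 5) picks out the five bits at positions 16..20 by significance; no memory layout is involved.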

James

James Kuyper

Feb 6, 2010, 4:29:31 PM

James Harris wrote:
> On 6 Feb, 10:33, Jasen Betts <ja...@xnet.co.nz> wrote:
>> On 2010-02-05, James Harris <james.harri...@googlemail.com> wrote:
>>
>>
>>
>>> On 27 Jan, 23:25, James Harris <james.harri...@googlemail.com> wrote:
>>>> On 25 Jan, 18:50, "bae...@gmail.com" <bae...@gmail.com> wrote:
>>> ...
>>>>> james's suggestion works fine, but dependency remains. for example
>>>>> what about machines having 2 byte integer ?
>>>> Can you say a bit more about what you have in mind? For example, maybe
>>>> you mean representing a 4-byte integer on a 16-bit machine or maybe
>>>> you mean representing a 2-byte integer, etc.
>>> I suspect there is no problem. Unlike C's bitfield syntax, which I
>>> believe is non-portable, masking and shifting should be independent of
>>> the endianness. All that's needed is that the register containing the
>>> fields should have the right width.
>> masking and shifting on integers larger than 8-bit is non-portable, if
>> you want the same bit-order in memory or on disk.
>
> Are you sure this is relevant? I would have thought masking and
> shifting were machine-independent.

Not quite. Bit masking operations on negative values can have different
results depending upon whether the machine uses 2's complement, 1's
complement, or sign-magnitude representation, all of which are permitted
by the C standard, though 2's complement is so widely used that many
people are unaware of the fact that it isn't universal.

Shifting by a negative number of bits, or a number of bits greater than
the width of the data type, has undefined behavior. Shifting a negative
value has undefined behavior. In both cases, the actual results you get
(if your program doesn't simply crash) are machine-dependent.

This is why many coding standards prohibit the use of bit-masking and
shift operations on signed integers. If you avoid all of the issues
mentioned above, then you're right, the effect of those operations on
the value of the expression is, indeed, machine independent.
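
One way to stay inside those limits, sketched with made-up helper names: keep the operand unsigned and check the shift count. Note that the standard makes a count equal to the width undefined as well, not only a greater one, so the check below is "less than 32". Returning 0 for an out-of-range count is a choice made for this sketch, not something the language requires.

```c
#include <stdint.h>

/* Left shift that is defined for every count: the operand is unsigned, and
   counts of 32 or more yield 0 instead of invoking undefined behaviour. */
static uint32_t shl32(uint32_t x, unsigned n)
{
    return n < 32u ? x << n : 0u;
}

/* Right shift with the same guard on the count. */
static uint32_t shr32(uint32_t x, unsigned n)
{
    return n < 32u ? x >> n : 0u;
}
```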

> Storing them in memory and passing

> them between machines is endianness-dependent, yes, ...

And that's what he was talking about.

James Harris

Feb 6, 2010, 9:28:04 PM

Both points accepted. I only had unsigned in mind. I had thought to
add some signed operations to the code samples (which, if anyone's
wondering, were at

http://codewiki.wikispaces.com/bitfield_operations.c)

but given your comments I don't think I'll bother!

James

Ben Bacarisse

Feb 7, 2010, 8:10:04 PM

James Kuyper <james...@verizon.net> writes:
<snip>

> Shifting by a negative number of bits, or a number of bits greater
> than the width of the data type, has undefined behavior. Shifting a
> negative value has undefined behavior.

Nit: right shifting a negative number is implementation defined, not
undefined. That does not alter the point you were making one iota:

> In both cases, the actual
> results you get (if your program doesn't simply crash) are
> machine-dependent.

<snip>
--
Ben.

James Kuyper

Feb 8, 2010, 12:16:42 AM

Ben Bacarisse wrote:
> James Kuyper <james...@verizon.net> writes:
> <snip>
>> Shifting by a negative number of bits, or a number of bits greater
>> than the width of the data type, has undefined behavior. Shifting a
>> negative value has undefined behavior.
>
> Nit: right shifting a negative number is implementation defined, not
> undefined. That does not alter the point you were making one iota:

You're right - that distinction was in an early version of the message,
but got lost during a re-write.

>> In both cases, the actual
>> results you get (if your program doesn't simply crash) are
>> machine-dependent.

Mark

Feb 10, 2010, 3:00:33 AM

James Harris wrote:
> Are you sure this is relevant? I would have thought masking and
> shifting were machine-independent. Storing them in memory and passing
> them between machines is endianness-dependent, yes, but that is true
> of integers, period. It has nothing to do with shifting and masking.
> It's a different issue.
>
> Binary values are endianness-independent. Shift left moves to more
> significant, shift right to less significant.

So, suppose we're writing code for a big-endian platform with 32-bit wide
registers: is it safe and correct to build a value for a register with bit
operations, rather than with bitfields, something like:

unsigned int val;

val = (reg << 16) | (0u << 21) | (phy << 24) | (SMI_READ << 29) | (1u << 30);
*(volatile unsigned int *)(REG_0) = val;

...or does this require some byte-swapping macros to swap bytes before storing
them in registers (what Linux provides in 'include/linux/byteorder/')?

--
Mark

Ben Bacarisse

Feb 10, 2010, 12:49:43 PM

"Mark" <mark_cruz...@hotmail.com> writes:

> James Harris wrote:
>> Are you sure this is relevant? I would have thought masking and
>> shifting were machine-independent. Storing them in memory and passing
>> them between machines is endianness-dependent, yes, but that is true
>> of integers, period. It has nothing to do with shifting and masking.
>> It's a different issue.
>>
>> Binary values are endianness-independent. Shift left moves to more
>> significant, shift right to less significant.
>
>
> So, suppose we're writing code for big-endian platform with 32-bit
> wide registers, it is safe and correct to build a value for a register
> with bit operations, rather than with bitfields,

The short answer is yes (but see some further points below).

A longer answer would also say that bit-fields can't be used for this
in any portable way, though people often do use them, presumably
because they have some assurance that the compiler they use will pack
them as desired.

> something like:
>
> unsigned int val;
>
> val = (reg << 16) | (0u << 21) | (phy << 24) | (SMI_READ << 29) | (1u
> << 30);

Assuming sane types for the undeclared names, this puts the various
pieces together according to bit-significance. For example, the
bottom 16 bits will be zero and the second most significant bit will
be 1.
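
A small sketch of that composition, assuming 32-bit values and the field positions from the quoted expression. The function name, the sample inputs and the value chosen for SMI_READ are made up for illustration.

```c
#include <stdint.h>

#define SMI_READ 0u   /* opcode value assumed for this sketch */

/* Combine the fields by bit significance: reg at bit 16, phy at bit 24,
   the opcode at bit 29, and a single set bit at position 30. */
static uint32_t smi_command(uint32_t reg, uint32_t phy)
{
    return (reg << 16) | (phy << 24) | ((uint32_t)SMI_READ << 29)
         | ((uint32_t)1 << 30);
}
```

With reg = 3 and phy = 5 the result is 0x45030000: the low 16 bits are zero and bit 30 is set, on any machine, whatever its byte order.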

> *(volatile unsigned int *)(REG_0) = val;

Byte order is involved here (as JH said) since this is storing a value
to memory. This is presumably "safe" since you will, I presume, want
this value stored according to the rules for this machine. Were this
a network protocol, it would be wrong to write the value directly to a
packet buffer using this sort of code.

Endianness is about the relationship between significance and
addresses. If we use a pointer to a small type to scan over the parts
of a larger one, it is the endianness that determines the significance
of the parts we get back. Thus, to extend your example, if unsigned
short is 16 bits:

unsigned short *sp = (void *)&val;

it is up to the machine architecture (and maybe even some run-time
settings) whether sp[0] is the low-order half word (and thus zero) or
the high-order one.
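
The same observation can be made with unsigned char, which is allowed to alias any object (strictly speaking, the unsigned short view runs into C's aliasing rules, though it is fine as an illustration). A sketch with a made-up function name, assuming 8-bit bytes:

```c
#include <stdint.h>
#include <string.h>

/* Return the byte stored at offset i (0..3) within the object
   representation of val. Which byte that is depends on endianness. */
static unsigned char byte_at(uint32_t val, size_t i)
{
    unsigned char bytes[sizeof val];
    memcpy(bytes, &val, sizeof val);
    return bytes[i];
}
```

byte_at(0x01020304, 0) is 0x04 on a little-endian machine and 0x01 on a big-endian one, whereas (0x01020304 & 0xFF) is 0x04 everywhere.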

<snip>
--
Ben.

James Kuyper

Feb 10, 2010, 12:50:21 PM

Mark wrote:
> James Harris wrote:
>> Are you sure this is relevant? I would have thought masking and
>> shifting were machine-independent. Storing them in memory and passing
>> them between machines is endianness-dependent, yes, but that is true
>> of integers, period. It has nothing to do with shifting and masking.
>> It's a different issue.
>>
>> Binary values are endianness-independent. Shift left moves to more
>> significant, shift right to less significant.
>
>
> So, suppose we're writing code for big-endian platform with 32-bit wide
> registers, it is safe and correct to build a value for a register with
> bit operations, rather than with bitfields, something like:
>
> unsigned int val;
>
> val = (reg << 16) | (0u << 21) | (phy << 24) | (SMI_READ << 29) | (1u
> << 30);

If 'reg', 'phy' and 'SMI_READ' are all unsigned, and contain suitable
values, code like that is a way of putting such a value into 'val', on
any machine where unsigned int has at least 32 bits, regardless of byte
order (or even bit order). As far as all of C's bitwise operators are
concerned, the correct bits will be set, in the correct order, from MSB
to LSB, regardless of what the physical order of those bits is in
memory. However, the following code seems to be intended to implement
what you're talking about when you say that you want to put the value in
a register:

> *(volatile unsigned int *)(REG_0) = val;

You don't say what REG_0 is; from the way you've used it, it is
presumably a pointer to a register. There's no way in standard C to
create such a pointer. Registers might or might not be used by the
compiler to implement C, but in itself C doesn't provide any mechanisms
for accessing them. Declaring a variable as 'register' is a request, and
an implementation of C is not obliged to honor that request. Even if it
is honored, standard C provides no mechanism for telling the compiler
which register to use for the variable.

If you're using an implementation that includes an extension to C which
provides such mechanisms, that extension is implementation-specific, and
any questions like this about those mechanisms are best asked in a forum
devoted to that implementation. You'll get better answers in such a
forum. More importantly, if someone gives you an incorrect answer, they
are more likely to be corrected by the other members of the forum, than
they would be in this group - there might not be anyone here who knows
enough about that extension to recognize that the answer was incorrect.

Phil Carmody

Feb 11, 2010, 5:39:47 AM

Unless all bits are zero, of course.

Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1

Mark

Feb 11, 2010, 8:56:03 AM

James Kuyper wrote:
> However, the following code seems to be
> intended to implement what you're talking about when you say that you
> want to put the value in a register:
>
>> *(volatile unsigned int *)(REG_0) = val;
>
> You don't say what REG_0 is;

This is a memory-mapped register, so REG_0 is a constant representing an
address in memory. In standard C we can't declare a variable so that it
resides at a specified address. So in order to access a device register we
can dereference a pointer whose value is the register's address.

--
Mark

Jasen Betts

Feb 11, 2010, 8:56:54 AM

On 2010-02-10, Mark <mark_cruz...@hotmail.com> wrote:
> James Harris wrote:
>> Are you sure this is relevant? I would have thought masking and
>> shifting were machine-independent. Storing them in memory and passing
>> them between machines is endianness-dependent, yes, but that is true
>> of integers, period. It has nothing to do with shifting and masking.
>> It's a different issue.
>>
>> Binary values are endianness-independent. Shift left moves to more
>> significant, shift right to less significant.
>
>
> So, suppose we're writing code for big-endian platform with 32-bit wide
> registers, it is safe and correct to build a value for a register with bit
> operations, rather than with bitfields, something like:
>
> unsigned int val;
>
> val = (reg << 16) | (0u << 21) | (phy << 24) | (SMI_READ << 29) | (1u <<
> 30);
> *(volatile unsigned int *)(REG_0) = val;
>
> ...or this requires some byte-swapping macros to swap bytes before storing
> them in registers (what Linux provides in 'include/linux/byteorder/') ?

It's only safe if the stored values are then used only on 32-bit
big-endian systems.

If you want to be binary-file compatible with other platforms, the
files should only be read and written in bytes.
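
As a sketch of what "read and written in bytes" can look like (the function names are made up, and big-endian is just the order picked for the file format in this example):

```c
#include <stdint.h>

/* Store a 32-bit value as four octets, most significant first. */
static void put_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuild the value from four octets stored most significant first. */
static uint32_t get_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Because both functions work on values with shifts and masks, the bytes in the buffer come out in the same order on any host.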


James Kuyper

Feb 11, 2010, 12:02:02 PM

Mark wrote:
> James Kuyper wrote:
>> However, the following code seems to be
>> intended to implement what you're talking about when you say that you
>> want to put the value in a register:
>>
>>> *(volatile unsigned int *)(REG_0) = val;
>>
>> You don't say what REG_0 is;
>
> This is a memory-mapped register, so REG_0 is a constant representing
> address in the memory. In standard C we can't declare a variable so that
> it resides at a specified address. So in order to access a device
> register we can dereference a pointer whose value is the register's
> address.

That was, as I've already stated, my best guess as to what you were
doing - I don't need to have such things explained to me.

I was making two points: standard C provides no mechanism for defining
such a constant. Therefore, you must be using an implementation-specific
extension to C. In order to get useful answers to any questions about
that extension, you should post your question in a forum devoted to the
kind of systems that support that particular extension. My second point
is that you didn't give us enough information to answer your question.
It contained three different expressions involving identifiers for which
you provided no information as to what it was that they identified. The
answer to your question depends strongly upon the types and values of
those expressions.

Mark

Feb 11, 2010, 9:17:43 PM

James Kuyper wrote:
> That was, as I've already stated, my best guess as to what you were
> doing - I don't need to have such things explained to me.

Sorry, didn't mean to offend you. I thought there was some misunderstanding
about my environment, that's why I was giving some trivial details.

> I was making two points: standard C provides no mechanism for defining
> such a constant. Therefore, you must be using an implementation-
> specific extension to C.

[snip]

And now I'm confused by your words. Just to make it clear and clean, here
is a more accurate example, and we assume 'phy' and 'reg' can't contain values
bigger than 28 and 31 respectively:

#define REG_0 0x01u
#define SMI_READ 0u

unsigned int val, reg, phy;

val = (reg << 16) | (0u << 21) | (phy << 24) | (SMI_READ << 29) | (1u << 30);

*(volatile unsigned int *)(REG_0) = val; /* XX */

The XX-marked statement is valid in standard C, and doesn't require any
compiler extensions, am I right?

--
Mark

James Kuyper

Feb 11, 2010, 11:23:15 PM

Mark wrote:
> James Kuyper wrote:

...

>> I was making two points: standard C provides no mechanism for defining
>> such a constant. Therefore, you must be using an implementation-
>> specific extension to C.
> [snip]
>
> And now I'm confused with your words. Just to make it clear and clean,
> here is more accurate example, and we assume 'phy' and 'reg' can't
> contain values bigger than 28 and 31 respectively:
>
> #define REG_0 0x01u
> #define SMI_READ 0u
>
> unsigned int val, reg, phy;
>
> val = (reg << 16) | (0u << 21) | (phy << 24) | (SMI_READ << 29) | (1u
> << 30);
>
> *(volatile unsigned int *)(REG_0) = val; /* XX */
>
> The XX-marked statement is valid in the standard C, and doesn't require
> any compiler's extensions, am I right ?

I was using the term "extensions" loosely, to cover all
implementation-specific features of a compiler. Your code makes a couple
of seriously non-portable assumptions, beyond simply assuming that
unsigned int has 32 bits. It assumes that there are pointer values that
can, when dereferenced, refer to registers. It assumes, in particular,
that (volatile unsigned int*)0x01u is one such value. The C standard
says nothing to support those assumptions.

What the standard does say is that when you convert an arbitrary integer
to a pointer type, "the result is implementation-defined, might not be
correctly aligned, might not point to an entity of the referenced type,
and might be a trap representation." Note, in particular, that if any of
the nasty alternatives mentioned in that sentence is actually true for a
given implementation, virtually any attempt to actually make use of the
resulting pointer value (for instance, by dereferencing it) renders the
behavior of your entire program undefined.

The standard doesn't say these things arbitrarily - it says them because
there are real world machines where these possibilities actually happen.
There are some real-world machines which have registers specialized to
hold memory addresses, where even simply loading an invalid pointer
value into an address register will cause the program to be aborted.
There are many real-world machines where converting an arbitrary integer
value into a pointer will generally result in a pointer to memory that
your program doesn't have permission to access, with the same result.

If your implementation does in fact define that the result of converting
0x01u to volatile unsigned int* is a pointer that, when dereferenced,
gives your code access to a particular register, then your code is
perfectly fine - for that implementation. However, only your
implementation's documentation can tell you whether or not byte-swapping
is needed when making use of such a pointer; standard C is deliberately
silent on such issues.
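
A sketch of how such code is often organised, under loud assumptions: the helper names are made up, and whether a swap is needed at all is something only the hardware and compiler documentation can settle. The register is passed in as a pointer so that the sketch can be tried against an ordinary variable instead of a real device address.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Write val through a pointer to a (possibly memory-mapped) register.
   'swap' is nonzero when the device expects the opposite byte order from
   the CPU; that is an assumption to verify for each platform. */
static void reg_write32(volatile uint32_t *reg, uint32_t val, int swap)
{
    *reg = swap ? swap32(val) : val;
}
```

On a real target the pointer would come from an implementation-defined integer-to-pointer conversion such as the one in the XX-marked statement; nothing in this sketch makes that conversion portable.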