which one is faster?

echo ma

Apr 4, 2012, 7:05:42 PM

On a 32-bit system:

#pragma pack(4)
struct TestStruct
{
    unsigned short a;
    unsigned short b;
    unsigned int c;
};
#pragma pack()
struct TestStruct t;
t.a = 0; //step 1
t.b = 1; //step 2
t.c = 2; //step 3

The question is: do these 3 steps take the same amount of time?

Sorry for my poor English.

James Kuyper

Apr 23, 2012, 9:31:16 AM

On 04/04/2012 07:05 PM, echo ma wrote:
> on a 32bit system.
>
> #pragma pack(4)
> struct TestStruct
> {
> unsigned short a;
> unsigned short b;
> unsigned int c;
> };
> #pragma pack()
> struct TestStruct t;
> t.a = 0; //step 1
> t.b = 1; //step 2
> t.c = 2; //step 3
>
> The question is: do these 3 steps take the same amount of time?

On some systems, #pragma pack() won't even be recognized as a supported
#pragma. It's not defined by the standard, and it might have different
effects on different systems.

Even on systems where #pragma pack() is recognized, and has similar
effects, there is no general answer to that question, just different
answers for different systems, and those answers will differ even on the
same system, depending upon the context. Tell us which 32-bit system
you're using, and if anyone has a similar system, they'll be able to
perform tests to determine which one is faster.
--
James Kuyper

Joel C. Salomon

Apr 23, 2012, 9:33:16 AM

More CPU cycles have been spent passing this message across the Internet
than would be saved by any such optimization.

David Brown

Apr 23, 2012, 9:33:30 AM

On 05/04/2012 01:05, echo ma wrote:
> on a 32bit system.
>
> #pragma pack(4)
> struct TestStruct
> {
> unsigned short a;
> unsigned short b;
> unsigned int c;
> };
> #pragma pack()
> struct TestStruct t;
> t.a = 0; //step 1
> t.b = 1; //step 2
> t.c = 2; //step 3
>
> The question is: do these 3 steps take the same amount of time?
>

Taken out of context, it is impossible to say. The speed of code
depends on a huge number of factors, including the compiler, compiler
options, processor (there are several dozen commonly used 32-bit
processor architectures), cache, memory, etc., as well as the
surrounding code.

As a wild generalisation, the timing will be much the same if the
processor uses absolute addressing modes, but for processors that make
more use of pointers there will be some initial code to set up the
pointer before the first access, then each access will be quickly done.

> Sorry for my poor English.

There is no problem with your English - the problem lies in not knowing
what you are trying to ask. It is seldom a good plan to take statements
out of context and ask if that one statement is fast or slow - you have
to look at the bigger picture.

Thomas Richter

Apr 23, 2012, 9:31:31 AM

On 05.04.2012 01:05, echo ma wrote:
> on a 32bit system.
>
> #pragma pack(4)
> struct TestStruct
> {
> unsigned short a;
> unsigned short b;
> unsigned int c;
> };
> #pragma pack()
> struct TestStruct t;
> t.a = 0; //step 1
> t.b = 1; //step 2
> t.c = 2; //step 3
>
> The question is: do these 3 steps take the same amount of time?

Sorry, I don't understand. To have something "the same", you need to
compare two different things. So which two "are the same" or "not the
same"? I see only one piece of code, and this is not even portable.
(#pragma pack() is not ANSI-C).

Anyhow, before performing micro-optimizations like the one you made
above, profile your code and identify whether this is really the cause
of a performance (or memory) problem. You should probably post a more
complete example to allow any judgement - and measure first before
trying non-portable constructs.

Greetings,
Thomas

Barry Schwarz

Apr 23, 2012, 9:32:01 AM

On Wed, 4 Apr 2012 18:05:42 -0500 (CDT), echo ma <fat...@gmail.com>
wrote:

>on a 32bit system.
>
>#pragma pack(4)
>struct TestStruct
>{
> unsigned short a;
> unsigned short b;
> unsigned int c;
>};
>#pragma pack()
>struct TestStruct t;
>t.a = 0; //step 1
>t.b = 1; //step 2
>t.c = 2; //step 3
>
>The question is: do these 3 steps take the same amount of time?
>
>Sorry for my poor English.

It will depend on your system (hardware, compiler, and options).

There is no guarantee that
t.a = 0;
and
t.a = 1;
will have the same performance. (Some systems have methods of zeroing
memory that are more efficient than storing an arbitrary value.)

For that matter, there is no guarantee that
t.a = 0;
and
t.b = 0;
have the same performance. (On some systems, access performance is
different for different alignments.)

And then step three accesses "more" memory than the first two, so why
should its performance be the same?

The only way to answer these types of questions is to benchmark your
code, and even that is questionable unless you disable the cache and
your code is the ONLY process executing on the system, something pretty
rare on most user systems these days.

--
Remove del for email

Keith Thompson

Apr 23, 2012, 9:32:16 AM

echo ma <fat...@gmail.com> writes:
> on a 32bit system.
>
> #pragma pack(4)
> struct TestStruct
> {
> unsigned short a;
> unsigned short b;
> unsigned int c;
> };
> #pragma pack()
> struct TestStruct t;
> t.a = 0; //step 1
> t.b = 1; //step 2
> t.c = 2; //step 3
>
> The question is: do these 3 steps take the same amount of time?

The C language standard says nothing about the relative performance
of various operations, so there is no general answer to your
question. The language specifies the behavior of programs, not
their performance.

You can answer it for your own system by measuring the performance.

Suppose you find out that one operation is faster than another on
your system; what would you do with that information? If you need
to assign a value to t.c, assigning a value to t.a might be faster,
but that doesn't do you much good if it's not the right thing to do.

(Note that "#pragma pack" is non-standard and potentially unsafe;
see <http://stackoverflow.com/q/8568432/827263>. For your particular
struct definition, it's unlikely to have any effect.)

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson

Apr 30, 2012, 10:57:44 PM

Thomas Richter <th...@math.tu-berlin.de> writes:
> On 05.04.2012 01:05, echo ma wrote:
>> on a 32bit system.
>>
>> #pragma pack(4)
>> struct TestStruct
>> {
>> unsigned short a;
>> unsigned short b;
>> unsigned int c;
>> };
>> #pragma pack()
>> struct TestStruct t;
>> t.a = 0; //step 1
>> t.b = 1; //step 2
>> t.c = 2; //step 3
>>
>> The question is: do these 3 steps take the same amount of time?
>
> Sorry, I don't understand. To have something "the same", you need to
> compare two different things. So which two "are the same" or "not the
> same"? I see only one piece of code, and this is not even portable.
> (#pragma pack() is not ANSI-C).

I think echo ma is asking about the relative performance of the three
statements "t.a = 0;", "t.b = 1;", and "t.c = 2;".

I suppose if, for example, the assignment to t.a turned out to be
significantly faster than the assignment to t.b, then it might make
sense to rearrange the struct definition so the most frequently accessed
members are first.

In practice, such rearrangement is unlikely to be helpful.

[...]


George Neuner

Apr 30, 2012, 10:58:47 PM

On Mon, 23 Apr 2012 08:32:16 -0500 (CDT), Keith Thompson
<ks...@mib.org> wrote:

>echo ma <fat...@gmail.com> writes:
>> on a 32bit system.
>>
>> #pragma pack(4)
>> struct TestStruct
>> {
>> unsigned short a;
>> unsigned short b;
>> unsigned int c;
>> };
>> #pragma pack()
>> struct TestStruct t;
>> t.a = 0; //step 1
>> t.b = 1; //step 2
>> t.c = 2; //step 3
>>
>> The question is: do these 3 steps take the same amount of time?
>

>Suppose you find out that one operation is faster than another on
>your system; what would you do with that information? If you need
>to assign a value to t.c, assigning a value to t.a might be faster,
>but that doesn't do you much good if it's not the right thing to do.
>
>(Note that "#pragma pack" is non-standard and potentially unsafe;
>see <http://stackoverflow.com/q/8568432/827263>. For your particular
>struct definition, it's unlikely to have any effect.)

The only real reasons for such tightly packed structures are to save
memory in *very* small systems, or to perform memory mapped device
access ... which brings up both coding issues such as register value
reuse and CPU/platform issues such as bus width, access alignment,
write-back caching, out-of-order execution, write combining, etc.

x86 being an exception, most 32-bit CPUs use memory mapped I/O
exclusively, and it often is desirable to create structures for
accessing groups of contiguous device "registers". However, in such
cases, making a single 32-bit access to A+B in order to write a 16-bit
value to B would be incorrect (and potentially destructive).

As someone else said already, it would be nice if the OP had given a
more complete description of the intended use.

George