Vector code in C

jacob navia

Jun 28, 2009, 5:36:11 AM

Hi

The preview of the X86 processor features for 2010
shows a 256 bit register file (called YMM) that
will be able to do SIMD of 4 double precision
data or 8 single precision data at a time.

In this context, I would be interested to know if there are
any extensions proposed or planned by the standards
committee to be able to use these hardware features.

Already the language is unable to use the clamped addition
capabilities of most modern processors, and missing these
vector features will increase the gap between the abstract
C CPU and the real ones.
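
To make the clamped-addition point concrete: standard C has no saturating "+", so the clamp has to be written out by hand, and whether the compiler maps it onto a hardware saturating instruction is up to its optimizer. A minimal sketch, with hypothetical function names:

```c
#include <stdint.h>

/* Unsigned 8-bit saturating ("clamped") addition in portable C.
   The result sticks at 255 instead of wrapping around. */
uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;   /* at most 510, no overflow */
    return (uint8_t)(sum > UINT8_MAX ? UINT8_MAX : sum);
}

/* Signed 16-bit variant: the result is clamped to [INT16_MIN, INT16_MAX]. */
int16_t sat_add_i16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;      /* always fits in 32 bits */
    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}
```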

Other extensions planned are 16 bit half precision floats.
NVIDIA graphics cards already use this format,
and it would be interesting to know if the construct

short float identifier;

could be used for this. This format features a sign bit,
5 bits exponent (excess 15) and 10 bits mantissa.
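
As an illustration of the layout just described (1 sign bit, 5 exponent bits with bias 15, 10 mantissa bits), the sketch below decodes such a 16-bit value into a float using only integer operations. The name half_to_float is made up for this example, and the code assumes that float is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Decode a 16-bit half precision value into a float.
   Assumption: float is IEEE 754 binary32 (8 exponent bits, bias 127). */
float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;
    float f;

    if (exp == 0x1Fu) {
        /* all exponent bits set: infinity or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {
        /* normal number: rebias the exponent from 15 to 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {
        /* signed zero */
        bits = sign;
    } else {
        /* subnormal half: shift the mantissa up until its implicit
           leading 1 appears, lowering the exponent for each shift */
        int e = 1;
        while ((man & 0x400u) == 0) {
            man <<= 1;
            e--;
        }
        bits = sign | ((uint32_t)(e + 112) << 23) | ((man & 0x3FFu) << 13);
    }

    memcpy(&f, &bits, sizeof f);   /* reinterpret the bit pattern */
    return f;
}
```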

Thanks

P.S. In the lcc-win compiler system this will be provided
by the operator overloading feature, which solves this and most
other similar problems.

Giacomo Catenazzi

Jun 29, 2009, 4:08:52 AM

to jacob navia

jacob navia wrote:
> The preview of the X86 processor features for 2010
> shows a 256 bit register file (called YMM) that
> will be able to do SIMD of 4 double precision
> data or 8 single precision data at a time.
>
> In this context, I would be interested to know if there are
> any extensions proposed or planned by the standards
> committee to be able to use this hardware features.
>
> Already the language is unable to use the clamped addition
> capabilities of most modern processors, and missing this
> vector features will increase the gap between the abstract
> C CPU and the real ones.

I consider this a feature.
If one wants to write a portable program, one uses standard
C; if one needs specific CPU features and speed, one uses
compiler extensions.

How would you write a program that works on new and old
CPUs (and maybe other vendors or other architectures)?

How would such an extension be handled efficiently if new families
increase the size of the registers (e.g. 12 or 16 single
precision values at a time)?

With a future CPU the speed could get very low if you use the
parameters of other CPUs (this has already happened several times on
Intel CPUs, with very heavily optimized programs).
Thus there is a risk that a program will be faster on X and
slower on X+1, which is the contrary of portability.

Considering the extra assumptions such a program already makes, I think
that adding a compiler extension assumption is also natural.

As you know, to write fast programs one needs to know
the CPU, the compiler and numerical maths well.
This is already true with standard C floating point, where a good
understanding of the three items will increase the speed considerably,
which is impossible at the language and/or compiler level.

So I think the problem should be solved:
- at compiler level (extensions) for the very specialized programs,
where the programmers must already know a lot of details
- at library level for general programmers. But in this case we
need some "reference" implementations. (MPI is one of the most
common, but probably much more parallelized and complex than your
initial question.)

ciao
cate

jacob navia

Jun 29, 2009, 5:10:53 AM

Vector extensions are a *common* feature of most modern CPUs
today

Processor                 Vector Extension                            Year
Sun UltraSPARC            VIS (Visual Instruction Set)                1995 (shipped)
Hewlett-Packard PA-RISC   MAX (Multimedia Acceleration eXtensions)    1995 (shipped)
Intel Pentium             MMX, SSE, SSE2                              1997 (shipped)
Silicon Graphics          MDMX (MIPS Digital Media eXtension)         1996
PowerPC                   AltiVec                                     1998
AMD K6-2                  3DNow!                                      Nov 1998 (shipped)

Most of the vector extensions are more than 10 years old!
The features I am speaking about in my message are quite old in
processor standards. True, they will NOT be useful in the 8051
or in the old 6502.

So what?

Why must it always be the case that we make C compatible with
obsolete hardware?

> So I think the problem should be solved at:
> - at compiler level (extensions) for the very specialized programs,
> where the programmers must already know a lot of details

The problem is that all those solutions make any code that uses them
completely non-portable. Suggestions could be made by the language as
to how to use those features in a portable manner.

> - at library level for general programmers. But in this cases we
> need some "reference" implementations. (MPI is one of the most
> common, but probably much more parallelized and complex that your
> initial question)
>

You are confusing parallel programming with allowing vector extensions.
What I was suggesting in my message is that we allow the declaration of
a vector data type with some specific annotation that allows the four
arithmetic operations to be performed on those vectors in a portable
manner. For instance:

typedef struct tagVector2double {
    double lowerPart;
    double higherPart;
} Vector2double;

Vector2double a,b,c;

// ... later in code

c = a+b; // Vector operation

The solution proposed by lcc-win is operator overloading, which
I have been proposing here for quite a long time.

Other solutions exist, but they are problematic, and specific
to each problem.

Even more problematic is the absolute lack of any initiative
from the standards committee on these operations, since it
increases the motivation for developers to leave C as a development
language. The main advantage of C is that it is close to the
hardware.
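
For comparison, GCC and Clang offer something close to this as a compiler extension (not standard C): a vector_size attribute that makes "+" work element-wise on a small fixed-size vector. A minimal sketch, assuming one of those compilers; the names v2d and v2d_add are chosen for this example only.

```c
/* Not standard C: vector_size is a GCC/Clang extension. */
typedef double v2d __attribute__((vector_size(16)));   /* two doubles */

/* Element-wise addition of two 2-element vectors. On targets with
   128-bit SIMD the compiler can typically use a single vector add. */
v2d v2d_add(v2d a, v2d b)
{
    return a + b;
}
```

With this typedef, "c = a + b;" compiles as written, but the code is tied to compilers that implement the attribute.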

Tom St Denis

Jun 29, 2009, 7:57:59 AM

On Jun 29, 5:10 am, jacob navia <ja...@jacob.remcomp.fr> wrote:
> Vector extensions are a *common* feature of most modern CPUs
> today
>
> Processor                 Vector Extension                            Year
> Sun UltraSPARC            VIS (Visual Instruction Set)                1995 (shipped)
> Hewlett-Packard PA-RISC   MAX (Multimedia Acceleration eXtensions)    1995 (shipped)
> Intel Pentium             MMX, SSE, SSE2                              1997 (shipped)
> Silicon Graphics          MDMX (MIPS Digital Media eXtension)         1996
> PowerPC                   AltiVec                                     1998
> AMD K6-2                  3DNow!                                      Nov 1998 (shipped)
>
> Most of the vector extensions are more than 10 years old!
> The features I am speaking about in my message are quite old in
> processor standards. True, they will NOT be useful in the 8051
> or in the old 6502.
>
> So what?
>
> Why must be always the case that we make C compatible with
> obsolete hardware?

Even when hardware has SIMD instructions they're not all compatible.
Even 3DNOW/SSE have different feature sets. So defining an extension
to C will just make things messier as every vendor would want their
subset of things added. Then on the lower end processors [ARM7 for
instance] the compiler vendors will have to provide more and more
runtime assistance to make it "compliant."

It's probably better if compilers just recognized vector ops and
optimized appropriately. Like if you had

struct vector {
    float x, y, z, w;
} v1, v2;

v1.x += v2.x;
v1.y += v2.y;
v1.z += v2.z;
v1.w += v2.w;

Or whatever, it recognizes you're adding 4 floats adjacent to one
another and issues a 128-bit SIMD vector add. That would make more
sense as it's still basically C and you're not adding to the vendors'
workload, but the processors that can take advantage of it will need
compilers modified to support it.

Which if I'm not mistaken is the direction GCC is eventually going to
head in.
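
A self-contained version of that idea might look like this. It is plain ISO C; whether the four additions actually become a single SIMD instruction depends on the compiler and the optimization flags in use, so treat that part as an expectation rather than a guarantee. The names vec4 and vec4_add are hypothetical.

```c
struct vec4 {
    float x, y, z, w;   /* four adjacent floats, 128 bits in total */
};

/* Four independent additions on adjacent floats: the pattern a
   vectorizing compiler may turn into one 128-bit SIMD add. */
struct vec4 vec4_add(struct vec4 a, struct vec4 b)
{
    struct vec4 r;

    r.x = a.x + b.x;
    r.y = a.y + b.y;
    r.z = a.z + b.z;
    r.w = a.w + b.w;
    return r;
}
```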

Tom

jacob navia

Jun 29, 2009, 8:04:41 AM

> } v1, v2;
>
> v1.x += v2.x;
> v1.y += v2.y;
> v1.z += v2.z;
> v1.w += v2.w;
>

This implies operator overloading, precisely the option that
has been developed in lcc-win.

The advantage of operator overloading is that there is no need
to have any vendor extensions:

c = a+b;

needs an underlying operator overload with a concrete implementation
in mind, but the user code remains unchanged.

As to the concern of the lower end CPUs, they would just overload
the operators and do the operations in software!

lawrenc...@siemens.com

Jun 29, 2009, 11:51:23 AM

In comp.std.c jacob navia <ja...@jacob.remcomp.fr> wrote:
>
> Even more problematic is the absolute lack of any initiative
> of the standards committee about this operations since it
> increases the motivation for developers to leave C as a development
> language. The main advantage of C is that it is close to the
> hardware.

Perhaps you overlooked the Data Parallel C Extensions technical report
that the C committee produced 11 years ago:

<http://www.cs.unh.edu/~pjh/dpce/>

This is definitely not my area of expertise, but my understanding is
that the proposed extensions never caught on because compiler technology
advanced to the point where automatic vectorization produces nearly as
good a result without requiring any special syntax.
--
Larry Jones

Even though we're both talking english, we're not speaking the same language.
-- Calvin

jacob navia

Jun 29, 2009, 12:37:47 PM

lawrenc...@siemens.com wrote:
> In comp.std.c jacob navia <ja...@jacob.remcomp.fr> wrote:
>> Even more problematic is the absolute lack of any initiative
>> of the standards committee about this operations since it
>> increases the motivation for developers to leave C as a development
>> language. The main advantage of C is that it is close to the
>> hardware.
>
> Perhaps you overlooked the Data Parallel C Extensions technical report
> that the C committee produced 11 years ago:
>
> <http://www.cs.unh.edu/~pjh/dpce/>
>
> This is definitely not my area of expertise, but my understanding is
> that the proposed extensions never caught on because compiler technology
> advanced to the point where automatic vectorization produces nearly as
> good a result without requiring any special syntax.

That report is a general specification for data parallelism. That is
not the point here. I am speaking about the possibility for the C
programmer to define small vectors of C data types like integers or
doubles, that will be treated in parallel BY ONE CPU!

This means that the complexities of parallel programming are
completely absent, since there is no shared memory, synchronization
and all the associated problems of parallel programming.

As to the reasons why those propositions did not catch on, this
is open to debate. 11 years ago the need for parallel programming
was not as widespread as today, just to start with.

But, again, here we are NOT speaking about parallel programs on
multiple CPUs but about parallel processing in a single CPU.

What is obvious (see my previous post) is that many modern CPUs
implement these features and there is no way for the programmer
to express this in a portable manner.

Tom St Denis

Jun 29, 2009, 12:40:06 PM

Ok, I'm going to answer this as bluntly as I can so as to inject some
sense.

C has these things called "functions."

Use them. Almost every time I hear about some shortcoming of C, like
difficulty working with strings, or otherwise tedious structure work,
it almost always boils down to "write a complicated function once, use
it many times."

At the very least use macros. I'd do something like

#define VEC_ADD(v1, v2, v3) do { v1.x = v2.x + v2.y; ... } while (0);

If I wanted to inline it, then my code would be simply

VEC_ADD(a, a, b);

Not only is that supported by every standards conforming C compiler
out there, but advanced compilers can make use of SIMD to optimize it,
while vendors for all other platforms don't have to add even more
support CRT code to their distribution.
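
A complete component-wise form of such a macro could look like the sketch below. The field names x, y, z, w are taken from the struct vector shown earlier in the thread; the argument order (destination first) is an assumption. The do/while (0) wrapper is left without a trailing semicolon so that a call behaves as one statement inside if/else.

```c
struct vector {
    float x, y, z, w;
};

/* dst = a + b, component by component. All three arguments must be
   lvalues of type struct vector; dst may be the same object as a or b. */
#define VEC_ADD(dst, a, b)                \
    do {                                  \
        (dst).x = (a).x + (b).x;          \
        (dst).y = (a).y + (b).y;          \
        (dst).z = (a).z + (b).z;          \
        (dst).w = (a).w + (b).w;          \
    } while (0)
```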

Tom

BartC

Jun 29, 2009, 1:07:38 PM

Using functions is never going to be as sweet as having a numeric type
(including these 'vectors') being fully supported by the language,
especially if there are memory management issues as well. (I think 'vector'
might have been used in the sense of 'array' rather than 'point')

> At the very least use macros. I'd do something like
>
> #define VEC_ADD(v1, v2, v3) do { v1.x = v2.x + v2.y; ... } while (0);

(Funny sort of vector add you have)

> If I wanted to inline it, then my code would be simply
>
> VEC_ADD(a, a, b);

Better than a += b; ? If you have a range of types for the vector
components, or 2D, 3D and 4D vectors, you'd need a differently named macro
for each, while the "+" is always "+". And then you may want to convert one
vector type to another. Or multiply a vector by a scalar, or by a matrix, or
compare a vector with another. You would end up with a whole family of
macros.

Having the type built-in to the language however means you can just use
+, -, *, /, == and != a lot of the time.
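
For what it is worth, C11 (published after this thread) added _Generic, which lets a single macro name pick a function by operand type. It does not give you "+", but it does avoid a differently named macro at each call site. A sketch with hypothetical types and function names:

```c
struct vec2f { float x, y; };
struct vec3f { float x, y, z; };

struct vec2f vec2f_add(struct vec2f a, struct vec2f b)
{
    struct vec2f r = { a.x + b.x, a.y + b.y };
    return r;
}

struct vec3f vec3f_add(struct vec3f a, struct vec3f b)
{
    struct vec3f r = { a.x + b.x, a.y + b.y, a.z + b.z };
    return r;
}

/* One name for both types: _Generic selects the function from the
   type of the first argument, then the selected function is called. */
#define vec_add(a, b) _Generic((a),        \
        struct vec2f: vec2f_add,           \
        struct vec3f: vec3f_add)((a), (b))
```

Each new vector type still needs its own function and its own line in the macro, so the "whole family" does not go away; it is only hidden behind one name.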

> Not only is that supported by every standards conforming C compiler
> out there, but advanced compilers can make use of SIMD to optimize it,
> while vendors for all other platforms don't have to add even more
> support CRT code to their distribution.

I don't personally think these higher-level types belong in C, but in a more
appropriate language, because you can't just add a handful of advanced
features while the rest of the language doesn't even have a proper 'for'
statement.

So, you're right, when talking about C.

--
Bart

user923005

Jun 29, 2009, 1:50:34 PM

If I wanted to manipulate vectors in C, I would probably use BLAS,
like everyone else has been doing since the late 1970s.
There are several variants. Intel has one for their chips, and ATLAS
is well known. The horribly named GotoBLAS is supposed to be very
good, though I have not tried that one.
If you have a compiler that cannot interface with Fortran, you can
always get CBLAS.
http://www.tacc.utexas.edu/resources/software/gotoblasfaq.php
http://software.intel.com/en-us/intel-mkl/
http://math-atlas.sourceforge.net/
http://www.netlib.org/clapack/cblas/

Flash Gordon

Jun 29, 2009, 2:14:39 PM

jacob navia wrote:
> Giacomo Catenazzi wrote:
>> jacob navia wrote:

<snip>

>>> In this context, I would be interested to know if there are
>>> any extensions proposed or planned by the standards
>>> committee to be able to use this hardware features.

<snip>

> Vector extensions are a *common* feature of most modern CPUs
> today
>
> Processor                 Vector Extension                            Year
> Sun UltraSPARC            VIS (Visual Instruction Set)                1995 (shipped)
> Hewlett-Packard PA-RISC   MAX (Multimedia Acceleration eXtensions)    1995 (shipped)
> Intel Pentium             MMX, SSE, SSE2                              1997 (shipped)
> Silicon Graphics          MDMX (MIPS Digital Media eXtension)         1996
> PowerPC                   AltiVec                                     1998
> AMD K6-2                  3DNow!                                      Nov 1998 (shipped)
>
> Most of the vector extensions are more than 10 years old!

So? Will one of those fit in your wrist watch?

> The features I am speaking about in my message are quite old in
> processor standards. True, they will NOT be useful in the 8051
> or in the old 6502.
>
> So what?
>
> Why must be always the case that we make C compatible with
> obsolete hardware?

Not all small devices are obsolete. Sometimes you need a small device
without all the extras that won't be used due to either limited space,
the need to keep the power consumption down, or even because the
processor is actually going to be implemented as part of an ASIC and you
need to keep the gate count down! Not everything is running off large
power supplies with plenty of space for cooling.

For example, a minute of searching with Google showed a 16 bit processor
core released in December 2005, so hardly obsolete! Also they are
continuing to develop the range, since there has been another model
since then from the same company.

There is a *lot* more to the world than large processors!

>> So I think the problem should be solved at:
>> - at compiler level (extensions) for the very specialized programs,
>> where the programmers must already know a lot of details
>
> The problem is that all those solutions make any code that uses them
> completely non-portable. Suggestions could be made by the language as
> to how to use those features in a portable manner.

One option is that you design your optimiser (or code generator) so that
it recognises certain simple idioms and translates them into the vector
instructions. Then you document and publish in large letters on your web
site that your compiler does this. Then people can write fully portable
code that will work anywhere which will run really fast when compiled
with your compiler. If it is worthwhile this will encourage other
compiler writers to do the same thing.

Oh, and I have seen a compiler writer use exactly this method to provide
a way to access one of the fancy instructions on a processor. As a
programmer my attitude was, "that makes a lot of sense, I'll remember
that whilst doing the coding".

>> - at library level for general programmers. But in this cases we
>> need some "reference" implementations. (MPI is one of the most
>> common, but probably much more parallelized and complex that your
>> initial question)
>>
>
> You are confusing parallel programming and allowing vector extensions.
> What I was suggesting in my message is that we allow the declaration of
> a vector data type with some specific annotation that allows to perform
> the 4 operations with those vectors in a portable manner. For instance:
>
> typedef struct tagVector2double {
> double lowerPart;
> double higherPart;
> } Vector2double;
>
> Vector2double a,b,c;
>
> // ... later in code
>
> c = a+b; // Vector operation

Alternatively, make your code generator recognise when there are two
(or more) additions of the form:

*(c_base+0) = *(a_base+0) + *(b_base+0)
*(c_base+1) = *(a_base+1) + *(b_base+1)
...

and also
for (i=0; i<N; i++)
    *(c_base+i) = *(a_base+i) + *(b_base+i)

Then lots of existing code will suddenly run faster without any need to
change it!

> The solution proposed by lcc-win is operator overloading, that
> I have been proposing since quite a long time here.

Operator overloading allows *you* to solve the problem in your compiler.
However, if I had an earlier version of your compiler (with operator
overloading but not the vector extension) then I still could not do it
without resorting to assembler. After all, how else would I get the
compiler to use these fancy instructions? So really, the thing that
allows you to implement using this new instruction is *not* the operator
overloading, it is whatever you use within the definition of the
"function" that overloads the + operator that tells it to use the fancy
vector instruction the processor provides!
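
To illustrate the point, here is a sketch of what the body of such an add "function" might contain on x86: SSE2 intrinsics from <emmintrin.h> when the compiler defines __SSE2__ (GCC and Clang do when targeting SSE2), and plain scalar code otherwise. The struct is the one from the quoted post; the function name vector2double_add is made up. Whatever syntax invokes it, overloaded operator or ordinary call, the vector instruction comes from the intrinsic.

```c
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics: __m128d, _mm_add_pd, ... */
#endif

typedef struct tagVector2double {
    double lowerPart;
    double higherPart;
} Vector2double;

Vector2double vector2double_add(Vector2double a, Vector2double b)
{
    Vector2double c;
#if defined(__SSE2__)
    double out[2];
    /* _mm_set_pd takes the high element first, then the low element */
    __m128d va = _mm_set_pd(a.higherPart, a.lowerPart);
    __m128d vb = _mm_set_pd(b.higherPart, b.lowerPart);

    /* one packed add for both doubles; out[0] receives the low lane */
    _mm_storeu_pd(out, _mm_add_pd(va, vb));
    c.lowerPart  = out[0];
    c.higherPart = out[1];
#else
    /* portable fallback for targets without SSE2 */
    c.lowerPart  = a.lowerPart  + b.lowerPart;
    c.higherPart = a.higherPart + b.higherPart;
#endif
    return c;
}
```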

You are a fan of operator overloading so you see every problem as being
solved by it. Other people are not fans of it (or just not as
enthusiastic about it) so they see other solutions.

> Other solutions exist, but they are problematic, and specific
> to each problem.

You still need to implement one of those other solutions to have a way
to use operator overloading to specify using a specific instruction.
Otherwise I could get any C++ compiler to use the vector instructions of
the processor using standard types and operators (that are common to
both C and C++) with the addition of the operator overloading provided
by C++. So show me some standard C++ that will do this on the versions
of gcc and MS Visual Studio I have for the Intel processors.

> Even more problematic is the absolute lack of any initiative
> of the standards committee about this operations

How do you know they are not considering how to take advantage of vector
operations? Have you spoken to all of the committee members? Try putting
forward your propositions without attacking the committee, including how
you intend to actually specify the use of vector operations (rather than
sequential normal add instructions) and people might be more sympathetic.

> since it
> increases the motivation for developers to leave C as a development
> language. The main advantage of C is that it is close to the
> hardware.

That is, for some things, a big advantage. Other times the advantage is
that it is relatively easy to port a C compiler to a target. There are
probably a whole host of reasons why people use C.
--
Flash Gordon

Flash Gordon

Jun 29, 2009, 2:31:10 PM

jacob navia wrote:
> lawrenc...@siemens.com wrote:
>> In comp.std.c jacob navia <ja...@jacob.remcomp.fr> wrote:
>>> Even more problematic is the absolute lack of any initiative
>>> of the standards committee about this operations since it
>>> increases the motivation for developers to leave C as a development
>>> language. The main advantage of C is that it is close to the
>>> hardware.
>>
>> Perhaps you overlooked the Data Parallel C Extensions technical report
>> that the C committee produced 11 years ago:
>>
>> <http://www.cs.unh.edu/~pjh/dpce/>
>>
>> This is definitely not my area of expertise, but my understanding is
>> that the proposed extensions never caught on because compiler technology
>> advanced to the point where automatic vectorization produces nearly as
>> good a result without requiring any special syntax.
>
> That report is a general specification for data parallelism. That is
> not the point here. I am speaking about the possibility for the C
> programmer to define small vectors of C data types like integers or
> doubles, that will be treated in parallel BY ONE CPU!

The first two examples were adding together two arrays of int, placing
the result in a third array of int (i.e. vector add done with an array
instead of a structure), and adding a constant to all elements of an
array (i.e. adding an int to a vector, with the addition being defined
as adding to all elements of the vector).

Sounds to me like it covers *exactly* what you are trying to cover, only
slightly differently and possibly covering a lot more things that you
did not try to cover.

Also, I would say that data parallel programming is specifically *not*
intended for doing things with multiple CPUs; for that you would want to
specify both instruction and data parallelism! Vectors, on the other
hand, fit *exactly* into data parallelism because you are performing
the same operation on multiple items of data (the elements of the vector).

> This means that the complexity of parallel programming are
> completely absent, since there is no shared memory, synchronization
> and all the associated problems of parallel programming.

No, you have not understood the title of the article.

> As to the reasons why those propositions did not catch on, this
> is open to debate. 11 years ago the need for parallel programming
> was not so widespread as today, just to start with that.

Vector arithmetic has been a big topic in some areas for a lot longer
than 11 years. For example in the image processing I worked on back in
1985 before C was even standardised!

> But, again, here we are NOT speaking about parallel programs in
> multiple CPUs but in parallel processing in a single CPU.

Which is what data parallelism is all about.

> What is obvious (see my previous post) is that many modern CPUs
> implement this features and there is no way for the programmer
> to express this in a portable manner.

See the comments by other people, including the post you just responded to.
--
Flash Gordon

robert...@yahoo.com

Jun 29, 2009, 4:35:01 PM

On Jun 29, 11:37 am, jacob navia <ja...@jacob.remcomp.fr> wrote:

> to express this in a portable manner.

Except by coding the loops as you would without any such extensions,
and then letting the compiler vectorize it automatically. Vectorizing
nominally scalar code is not a new idea - Fortran compilers started
doing it back in the 70s. C99 did add "restrict" specifically to aid
that. If that's a little harder for the compiler writer, I'm not sure
I really care (sorry).
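
A sketch of such a loop using C99 "restrict"; the function name add_arrays is hypothetical. The qualifier promises the compiler that the three arrays do not overlap, which removes one common obstacle to vectorizing. Whether vector instructions are actually emitted still depends on the compiler and its flags.

```c
#include <stddef.h>

/* c[i] = a[i] + b[i] for i in [0, n).
   restrict: the caller guarantees that c, a and b do not overlap,
   so the compiler is free to load and store several elements at once. */
void add_arrays(float *restrict c, const float *restrict a,
                const float *restrict b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```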

The syntactic niceties of being able to specify a whole vector as an
operand are a separate issue, but would be easily handled with some
templates or classes in C++.

And as for C needing to support obsolete CPUs... I suspect that's the
majority of its user base these days - smaller, and in some sense
"obsolete," embedded CPUs.

FWIW, MSVC happily generated vector (SSE/2) code for the following
with the right compiler flags ("-Ox -arch:SSE2"):

float a[64], b[64], c[64];

void f(void)
{
    int i;

    for (i=0; i<64; i++)
        c[i] = a[i] + b[i];
}

jacob navia

Jun 29, 2009, 5:21:37 PM

robert...@yahoo.com wrote:
>
> The syntactic niceties of being able to specify a whole vector as an
> operand are a separate issue, but would be easily handled with some
> templates or classes in C++.
>

Well, I am just saying that C could do the same.
Without classes or templates obviously, just with operator
overloading.

Your example of MSVC vectorizing is nice, but as you
know very well, after decades of effort, the result
is very brittle and you need very specific
loops and a very specific situation for it. With
straightforward addition it will work, but with a
slightly more complicated expression it will fail.

Phil Carmody

Jun 29, 2009, 5:31:28 PM

Flash Gordon <sm...@spam.causeway.com> writes:
> jacob navia wrote:
>> Giacomo Catenazzi wrote:
>>> jacob navia wrote:
>
> <snip>
>
>>>> In this context, I would be interested to know if there are
>>>> any extensions proposed or planned by the standards
>>>> committee to be able to use this hardware features.
>
> <snip>
>
>> Vector extensions are a *common* feature of most modern CPUs
>> today
>>
>> Processor                 Vector Extension                            Year
>> Sun UltraSPARC            VIS (Visual Instruction Set)                1995 (shipped)
>> Hewlett-Packard PA-RISC   MAX (Multimedia Acceleration eXtensions)    1995 (shipped)
>> Intel Pentium             MMX, SSE, SSE2                              1997 (shipped)
>> Silicon Graphics          MDMX (MIPS Digital Media eXtension)         1996
>> PowerPC                   AltiVec                                     1998
>> AMD K6-2                  3DNow!                                      Nov 1998 (shipped)
>>
>> Most of the vector extensions are more than 10 years old!
>
> So? Will one of those fit in your wrist watch?

VFP will.

>> The features I am speaking about in my message are quite old in
>> processor standards. True, they will NOT be useful in the 8051
>> or in the old 6502.
>>
>> So what?
>>
>> Why must be always the case that we make C compatible with
>> obsolete hardware?
>
> Not all small devices are obsolete. Sometimes you need a small device
> without all the extras that won't be used due to either limited space,
> the need to keep the power consumption down, or even because the
> processor actually going to be implemented as part of an ASIC and you
> need to keep the gate count down! Not everything is running off large
> power supplies with plenty of space for cooling.
>
> For example, a minute of searching with Google showed a 16 bit
> processor core released in December 2005, so hardly obsolete! Also
> they are continuing to develop the range, since there has been another
> model since then from the same company.
>
> There is a *lot* more to the world than large processors!

However, applications for them are now smaller than mobile phones.
It's an ever-shrinking portion of the market. Having said that,
the ultra-low-power field is a wonderful one to program for nowadays.
The average pointy-clicky-MSVCPP (l)user simply wouldn't have a clue
what to do.

Phil
--
Marijuana is indeed a dangerous drug.
It causes governments to wage war against their own people.
-- Dave Seaman (sci.math, 19 Mar 2009)

Tom St Denis

Jun 29, 2009, 7:43:12 PM

On Jun 29, 1:07 pm, "BartC" <ba...@freeuk.com> wrote:
> Using functions is never going to be as sweet as having a numeric type
> (including these 'vectors') being fully supported by the language,
> especially if there are memory management issues as well. (I think 'vector'
> might have been used in the sense of 'array' rather than 'point')

Vectors are just arrays.

> > At the very least use macros. I'd do something like
>
> > #define VEC_ADD(v1, v2, v3) do { v1.x = v2.x + v2.y; ... } while (0);
>
> (Funny sort of vector add you have)

Typo, more reason to use a vetted math library to deal with this sort
of thing.

> Better than a += b; ? If you have a range of types for the vector
> components, or 2D, 3D and 4D vectors, you'd need a differently named macro
> for each, while the "+" is always "+". And then you may want to convert one
> vector type to another. Or multiple a vector by scalar, or by a matrix, or
> compare a vector with another. You would end up with a whole family of
> macros.

Well if I were trying to work with arbitrary sized vectors ...

struct vector {
    int used, size;
    float *dp;
};

Then write appropriately. OMG hard. Hint: I wrote two math
libraries in C that deal with large integers, it's no different than
arbitrary vectors.
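
A minimal sketch of what "write appropriately" could mean for addition, built on the struct above. The function names, the meaning given to used and size, and the error convention (0 on success, -1 on failure) are assumptions made for this example.

```c
#include <stdlib.h>

struct vector {
    int used, size;   /* elements in use, elements allocated */
    float *dp;        /* element storage */
};

/* Allocate a zero-filled vector of n elements. */
int vector_init(struct vector *v, int n)
{
    if (n <= 0)
        return -1;
    v->dp = calloc((size_t)n, sizeof *v->dp);
    if (v->dp == NULL)
        return -1;
    v->used = v->size = n;
    return 0;
}

/* Release the storage and leave the vector empty. */
void vector_free(struct vector *v)
{
    free(v->dp);
    v->dp = NULL;
    v->used = v->size = 0;
}

/* out = a + b, element by element. The operands must have the same
   length and out must have room for the result; out may be a or b. */
int vector_add(struct vector *out, const struct vector *a,
               const struct vector *b)
{
    int i;

    if (a->used != b->used || out->size < a->used)
        return -1;
    for (i = 0; i < a->used; i++)
        out->dp[i] = a->dp[i] + b->dp[i];
    out->used = a->used;
    return 0;
}
```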

> Having the type built-in to the language however means you can just use
> +, -, *, /, == and != a lot of the time.

Then use C++.

> I don't personally think these higher-level types belong in C, but in a more
> appropriate language, because you can't just add a handful of advanced
> features while the rest of the language doesn't even have a proper 'for'
> statement.
>
> So, you're right, when talking about C.

I still contend that appropriate use of modular coding and algorithms
can make the average complicated math task relatively
straightforward.

Tom

jacob navia

Jun 29, 2009, 7:49:47 PM

Tom St Denis wrote:
>> Having the type built-in to the language however means you can just use
>> +, -, *, /, == and != a lot of the time.
>
> Then use C++.
>

Obviously operator overloading is OK if it is C++,
wrong if it is in C.

We should keep C in its cage, and keep all features
that make sense for C++.

C should just be kept frozen forever.

Change to C++!

jacob navia

Jun 29, 2009, 7:55:33 PM

Tom St Denis wrote:
> Ok, I'm going to answer this as bluntly as I can as to inject some
> sense.
>
> C has these things called "functions."
>
> Use them. Almost every time I hear about some short coming of C, like
> difficulty working with strings, or otherwise tedious structure work,
> it almost always boils down to "write a complicated function once, use
> it many times."
>

Ahhhhh functions. Yes, of course, I forgot that. Functions,
yes, that solves everything.

> At the very least use macros. I'd do something like
>
> #define VEC_ADD(v1, v2, v3) do { v1.x = v2.x + v2.y; ... } while (0);
>
> If I wanted to inline it, then my code would be simply
>
> VEC_ADD(a, a, b);
>

Great.

> Not only is that supported by every standards conforming C compiler
> out there, but advanced compilers can make use of SIMD to optimize it,
> while vendors for all other platforms don't have to add even more
> support CRT code to their distribution.

Wonderful world where you live. I like it very much. In my
stupid world (full of stupids like me), even the Intel compiler
will have big trouble vectorizing operations as soon as they
get a little bit sophisticated. Decades of research into
automatic vectorization give very poor results but... you live
in the better world of course.

Tom St Denis

Jun 29, 2009, 8:07:20 PM

Why the hell not? It's hard enough getting proper C support for some
obscure platforms, last thing I want is for them to have more to
support.

Frankly, if I wanted to do DSP work in C I'd just find [or write] a
competent DSP library so that from the high level application I'm
making calls like

mdct_encode(in, out, size);

And it does all the nitty gritty. Why do I need operator overloading
and all that nonsense at the high level?

It's not my fault you don't know how to write modular factored code.
I've written my share of large integer math and public key libraries
on top of basic C. I can't say C++ would have made any of it any
easier. In the case of the integer math you still need the algorithms
behind the scenes, and most of the time in the PK cases is
understanding the algorithms, then figuring out ways of optimizing
them. Very little time is spent "dealing" with having to make C calls
to large integer functions.

But really, if you want overloading I don't see why C++ is a bad
suggestion.

You might as well be asking C compiler vendors to add "natural
language extensions" so we can write C as we speak it ...

for x equals 0 from 1 to 30 curly open brace ...

It's so much easier!

Tom St Denis

Jun 29, 2009, 8:11:09 PM

On Jun 29, 7:55 pm, jacob navia <ja...@jacob.remcomp.fr> wrote:
> Wonderful world where you live. I like it very much. In my
> stupid world (full of stupids like me), even Intel compiler
> will have big trouble vectorizing operations as soon as they
> get a little bit sophisticated. Decades of research for
> automatic vectorizing gives very poor results but... you live
> in the better world of course.

Compilers ARE getting better in terms of optimizations. It's not my
fault your claim to fame is ripping off LCC and then adding a resource
editor on top.

GCC 4.4.0 is miles better than say GCC 2.8.2 and that's only a decade
and a bit apart.

But who's counting...

Sure, it's slow progress, but in the meantime there is platform-specific asm
and proper use of factored code.

I wrote a math library [tomsfastmath, though I don't distribute it
anymore...] which out of the box builds in ISO C mode, but also
supports x86_32/64, ARMv4, PPC, MIPS, AVR32, etc. I used macros that
meant the body of the code was the same for EVERY SINGLE PLATFORM.
OMG AMAZING!

So basically your main complaint is it's too much work to design your
software well.

Gotcha.

Tom

Keith Thompson

Jun 29, 2009, 8:42:29 PM

Sarcasm is not persuasive.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Kenny McCormack

Jun 29, 2009, 9:42:59 PM

In article <ln1vp2p...@nuthaus.mib.org>,
Keith Thompson <ks...@mib.org> wrote:
...

>> Change to C++!
>
>Sarcasm is not persuasive.

Having another anxiety attack there, Kiki?

(In case the reference is not clear - either because you're new to CLC
or you just don't remember it - the reference is to Han pointing out
just how anxious Kiki gets when he sees a sig that's too long by his
lights...)

Richard Heathfield

Jun 30, 2009, 2:27:48 AM

jacob navia said:

> Tom St Denis wrote:
>>> Having the type built-in to the language however means you can
>>> just use +, -, *, /, == and != a lot of the time.
>>
>> Then use C++.
>>
>
> Obviously operator overloading is OK if it is C++,
> wrong if it is in C.

Right. Whilst I would find operator overloading in C to be quite
convenient, ISO C doesn't support it and therefore I can't assume
it, which makes the concept practically useless for me (which isn't
to say that it would be useless for everyone).

> We should keep C in his cage, and keep all features
> that make sense for C++.

That's a stupid attitude, and as a supposedly bright bunny -
compiler maintainer and everything - you have no business
entertaining it. It's as if a racing bike designer said "hey, let's
add two more wheels - and let's get rid of those handlebars and put
a steering wheel there instead, and why not use a much bigger
engine, and how about a rear wing for stability on corners?" And
when everyone else says "if you want a racing *car*, you know where
to find it", he sarcastically retorts "we should keep bike-racing
in his cage, and keep all features that make sense for car-racing".

Yes, you *could* turn a racing bike into a racing car, but it
wouldn't be a very good racing car, and it would be a lousy racing
bike, too. There's a good reason why C and C++ have different
names. It's because they're different languages.

> C should just be kept frozen forever.

Well, not necessarily, but the last official attempt to change it
was hardly an outstanding success, was it?

> Change to C++!

If C++ has the features or the design philosophy that you want and
is sufficiently portable for your needs, that's precisely what you
should do. In my case, it mostly hasn't and isn't, so I don't. And
if you changed C to be like C++, C would no longer suit my needs
and I'd have to go looking for something as portable as C used to
be (which would almost certainly turn out to be what-C-is-now
anyway).

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Forged article? See
http://www.cpax.org.uk/prg/usenet/comp.lang.c/msgauth.php
"Usenet is a strange place" - dmr 29 July 1999

Flash Gordon

Jun 30, 2009, 2:38:43 AM

Can't say I know that one.

>>> The features I am speaking about in my message are quite old in
>>> processor standards. True, they will NOT be useful in the 8051
>>> or in the old 6502.
>>>
>>> So what?
>>>
>>> Why must be always the case that we make C compatible with
>>> obsolete hardware?
>> Not all small devices are obsolete. Sometimes you need a small device
>> without all the extras that won't be used due to either limited space,
>> the need to keep the power consumption down, or even because the
>> processor actually going to be implemented as part of an ASIC and you
>> need to keep the gate count down! Not everything is running off large
>> power supplies with plenty of space for cooling.
>>
>> For example, a minute of searching with Google showed a 16 bit
>> processor core released in December 2005, so hardly obsolete! Also
>> they are continuing to develop the range, since there has been another
>> model since then from the same company.
>>
>> There is a *lot* more to the world than large processors!
>
> However, applications for them are now smaller than mobile phones.

Yes, little things like planes...

> It's an ever-shrinking portion of the market. Having said that,
> the ultra-low-power field is a wonderful one to program for nowadays,

Oh, I agree the market is probably shrinking, but it is a long way from
dead or obsolete. However, if/when people start getting serious about
being environmentally friendly (and therefore using the minimum power
required) it could well grow again.

> The average pointy-clicky-MSVCPP (l)user simply wouldn't have a clue
> what to do.

Indeed.
--
Flash Gordon

luserXtrog

Jun 30, 2009, 4:05:10 AM

On Jun 29, 12:07 pm, "BartC" <ba...@freeuk.com> wrote:

> I don't personally think these higher-level types belong in C, but in a more
> appropriate language, because you can't just add a handful of advanced
> features while the rest of the language doesn't even have a proper 'for'
> statement.

In what way is C's 'for' statement improper?

--
lxt

Giacomo Catenazzi

Jun 30, 2009, 5:19:14 AM

to jacob navia

jacob navia wrote:
> Tom St Denis wrote:
>>> Having the type built-in to the language however means you can just use
>>> +, -, *, /, == and != a lot of the time.
>>
>> Then use C++.
>>
>
> Obviously operator overloading is OK if it is C++,
> wrong if it is in C.
>
> We should keep C in his cage, and keep all features
> that make sense for C++.

Yes, as you said it is obvious. Two different targets and needs.

C++ trades speed for understandability of programs (but
OTOH it is difficult for a human to track the object call order,
especially with overloading and implicit type conversions).

C wants to be faster (and thus low level). If C adds features
of C++, it will become a mess to understand and check C programs.

I think most programmers don't really understand C's implicit type
conversions and C's operator overloading, so adding such a thing
will have a negative impact on C.
1/2 is now obvious, but mixing unsigned short int with signed long?
'a' becoming an integer in an expression?

C++ handles it differently, and it recommends programming in another
way (using C++ and programming like C, with a few C++ extensions, is
IMO not a good way to program).

I really think adding overloading will confuse more users: they need
to parse the meaning carefully, implicit type conversions etc.
(thus putting an extra burden on the reader in the name of readability?).

Write it explicitly, e.g. "add_vector(a,b,n)". A few more keystrokes,
but the code is surely more readable and it gives less confusion!

ciao
cate

Note: also in maths there is a tendency to remove overloaded operators,
e.g. we have a lot of product symbols: nothing, ., x, wedge, circled, ...
to simplify readability, each restricted to one context.

IMHO we cannot do it in the C language like:
#include <vector-opers.h> or <set-opers.h> or <allfloat-opers.h> or
<string-opers.h> or ... to distinguish the context of overloading.
And without context it will create more problems for a reader
(programs should not only be written but also read by humans).

ciao
cate

BartC

Jun 30, 2009, 5:20:03 AM

Every time I write a C 'for' statement, it feels like I have to tell the
compiler how to implement it:

for ( i = A; i<=B; ++i) .....

Notice: the loop variable appears 3 times instead of once; the compare
operation has to be supplied; the increment operation has to be supplied. An
extra four things to get wrong.

A proper loop would just require 'for', i, A and B, and some syntax. And
often the syntax indicates whether the loop goes up or down.

(OK, a C for statement is just a kind of elaborate while loop, and is more
flexible than the for statements of other languages (but then, if statements
and gotos are also quite flexible).

Also, the zero-based indexing of C suits its for statement better,
otherwise many loops would be 0 to N-1 instead of 0; <N)

--
Bartc

BartC

Jun 30, 2009, 5:43:11 AM

Giacomo Catenazzi wrote:
> jacob navia wrote:
>> Tom St Denis wrote:
>>>> Having the type built-in to the language however means you can
>>>> just use +, -, *, /, == and != a lot of the time.
>>>
>>> Then use C++.
>>>
>>
>> Obviously operator overloading is OK if it is C++,
>> wrong if it is in C.

> I really think adding overloading will confuse more users: they need
> to parse the meaning carefully, implicit type conversions etc.
> (thus putting an extra burden on the reader in the name of readability?).
>
> Write it explicitly, e.g. "add_vector(a,b,n)". A few more keystrokes,
> but the code is surely more readable and it gives less confusion!

I'm curious: suppose you had two points p=(x1,y1,z1) and q=(x2,y2,z2), and
wanted to get the midpoint m; in operator notation it would just be:

m = (p+q)/2;

What would it look like using your function notation (and preserving p and
q)?

And why wouldn't you use the same function notation for, say, finding the
average of two ints?

--
Bart

Nick Keighley

Jun 30, 2009, 5:52:40 AM

On 30 June, 10:20, "BartC" <ba...@freeuk.com> wrote:
> luserXtrog wrote:
> > On Jun 29, 12:07 pm, "BartC" <ba...@freeuk.com> wrote:
>
> >> I don't personally think these higher-level types belong in C, but
> >> in a more appropriate language, because you can't just add a handful
> >> of advanced features while the rest of the language doesn't even
> >> have a proper 'for' statement.
>
> > In what way is C's 'for' statement improper?
>
> Every time I write a C 'for' statement, it feels like I have to tell the
> compiler how to implement it:
>
> for ( i = A; i<=B; ++i) .....
>
> Notice: the loop variable appears 3 times instead of once; the compare
> operation has to be supplied; the increment operation has to be supplied. An
> extra four things to get wrong.
>
> A proper loop would just require 'for', i, A and B, and some syntax. And
> often the syntax indicates whether the loop goes up or down.

ah, you want Algol-68!

for i from 1 to 10 by 1 do something;
for i from 1 to 10 do something;
to 10 do something;

(you can drop many of the fields, I'm not certain you can drop i)

> (OK, a C for statement is just a kind of elaborate while loop, and is more
> flexible than the for statements of other languages (but then, if statements
> and gotos are also quite flexible).

ah, you want flexibility. You want Algol-60

FOR k:=1,V12 WHILE V1<N DO something;
FOR j:=I+G,L,1 STEP 1 UNTIL N, C+D DO A[k,j]:=B[k,j];

the semantics are ... interesting

Nick Keighley

Jun 30, 2009, 5:58:02 AM

On 29 June, 17:37, jacob navia <ja...@jacob.remcomp.fr> wrote:

<snip>

> What is obvious (see my previous post) is that many modern CPUs
> implement this features and there is no way for the [C] programmer
> to express this in a portable manner.

what do Fortran programmers do?

I don't know, but I understood super-computer programmers
have had vector operations for years and they were traditionally
programmed in Fortran. Did Fortran have vectors? (It didn't
last time I wrote any Fortran).

--
Nick Keighley

GOD IS REAL
unless a type declaration to the contrary is made

Tom St Denis

Jun 30, 2009, 7:46:07 AM

On Jun 30, 5:20 am, "BartC" <ba...@freeuk.com> wrote:
> Every time I write a C 'for' statement, it feels like I have to tell the
> compiler how to implement it:
>
> for ( i = A; i<=B; ++i) .....
>
> Notice: the loop variable appears 3 times instead of once; the compare
> operation has to be supplied; the increment operation has to be supplied. An
> extra four things to get wrong.
>
> A proper loop would just require 'for', i, A and B, and some syntax. And
> often the syntax indicates whether the loop goes up or down.

But that takes away the power of it, you can write things like

for (i = 0; i < MAX_LEN && string[i]; ++i) { ... }

For instance....

sure you would write that as

i = 0;
while (i < MAX_LEN && string[i]) {
...
++i;
}

But I fail to see how that's better... or using a more basic syntax...

for i = 0 to MAX_LEN do
if (string[i] = 0) break;
...
next i

I fail to see how that's better either...

Tom

BartC

Jun 30, 2009, 8:13:04 AM

Tom St Denis wrote:
> On Jun 30, 5:20 am, "BartC" <ba...@freeuk.com> wrote:
>> Every time I write a C 'for' statement, it feels like I have to tell
>> the compiler how to implement it:
>>
>> for ( i = A; i<=B; ++i) .....
>>
>> Notice: the loop variable appears 3 times instead of once; the
>> compare operation has to be supplied; the increment operation has to
>> be supplied. An extra four things to get wrong.
>>
>> A proper loop would just require 'for', i, A and B, and some syntax.
>> And often the syntax indicates whether the loop goes up or down.
>
> But that takes away the power of it, you can write things like
>
> for (i = 0; i < MAX_LEN && string[i]; ++i) { ... }

There is no reason why there can't be both types, with the streamlined
version used in 90% of cases where you are just iterating over A to B.

>
> For instance....
>
> sure you would write that as
>
> i = 0;
> while (i < MAX_LEN && string[i]) {
> ...
> ++i;
> }
>
> But I fail to see how that's better... or using a more basic syntax...
>
> for i = 0 to MAX_LEN do
> if (string[i] = 0) break;
> ...
> next i
>
> I fail to see how that's better either...

It's better because you can't accidentally use <= instead of < or ++j
instead of ++i.

And in this last example, Nick mentioned Algol60 where it would be more
like:

for i:=0 to MAX_LEN while string[i] do ...

--
Bart

Nick Keighley

Jun 30, 2009, 8:13:09 AM

does that return a vector? Gets kind of expensive if the vectors
are large. Or you have memory management issues. Note the operator
overloading solutions have similar problems.

> A few more keystrokes,
> but the code is surely more readable and it gives less confusion!

m = (p + q) / 2

m = vec_div_const (vec_add (p, q), 2);
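
A sketch of how the nested-call form could be backed by value-returning functions, assuming a small hypothetical `vec3` struct of three doubles. For a type this small, returning by value is cheap; for large vectors the copying and memory-management concerns raised above would apply.

```c
/* Hypothetical point type; an assumption for illustration only. */
typedef struct { double x, y, z; } vec3;

/* Returns a + b without modifying either argument. */
static vec3 vec_add(vec3 a, vec3 b)
{
    vec3 r = { a.x + b.x, a.y + b.y, a.z + b.z };
    return r;
}

/* Returns a with every component divided by the scalar k. */
static vec3 vec_div_const(vec3 a, double k)
{
    vec3 r = { a.x / k, a.y / k, a.z / k };
    return r;
}
```

Because both functions take and return values, `m = vec_div_const(vec_add(p, q), 2);` leaves `p` and `q` untouched.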

> Note: also in maths there is a tendency to remove overloaded operators,

vectors used + and . (dot) and x (cross) when I were a lad.

> e.g. we have a lot of product symbols: nothing, ., x, wedge, circled, ...
> to simplify readability, each restricted to one context.
>
> IMHO we cannot do it in the C language like:
> #include <vector-opers.h> or <set-opers.h> or <allfloat-opers.h> or
> <string-opers.h> or ... to distinguish the context of overloading.
> And without context it will create more problems for a reader
> (programs should not only be written but also read by humans).

you really have code that mixes vectors, strings and sets in the
same piece of code?

Tom St Denis

Jun 30, 2009, 8:18:45 AM

On Jun 30, 8:13 am, "BartC" <ba...@freeuk.com> wrote:
> > for (i = 0; i < MAX_LEN && string[i]; ++i) { ... }
>
> There is no reason why there can't be both types, with the streamlined
> version used in 90% of cases where you are just iterating over A to B.

#define FOR(x,a,b) for (x = a; x < b; x++)

FOR(x, 0, 10) { printf("x == %d\n", x); }

For the lazy I suppose...

> It's better because you can't accidentally use <= instead of < or ++j
> instead of ++i.
>
> And in this last example, Nick mentioned Algol60 where it would be more
> like:
>
> for i:=0 to MAX_LEN while string[i] do ...

I'm going to say this as politely as I can. Real programmers/
developers do not get confused by the meaning of <= and < all that
often. Sure off-by-one errors do occur (technically you have one in
your Algol60 example since 0...MAX_LEN is MAX_LEN+1 elements) but
they're more the result of being tired or rushed than not
understanding what you are doing.

Basically, if you claim to be a C developer and can't handle a basic
comparison expression as say used in a for loop, you should
reconsider your profession.

Tom

Tom St Denis

Jun 30, 2009, 8:23:01 AM

On Jun 30, 5:43 am, "BartC" <ba...@freeuk.com> wrote:
> Giacomo Catenazzi wrote:
> > jacob navia wrote:
> >> Tom St Denis wrote:
> >>>> Having the type built-in to the language however means you can
> >>>> just use +, -, *, /, == and != a lot of the time.
>
> >>> Then use C++.
>
> >> Obviously operator overloading is OK if it is C++,
> >> wrong if it is in C.
> > I really think adding overloading will confuse more users: they need
> > to parse the meaning carefully, implicit type conversions etc.
> > (thus putting an extra burden on the reader in the name of readability?).
>
> > Write it explicitly, e.g. "add_vector(a,b,n)". A few more keystrokes,
> > but the code is surely more readable and it gives less confusion!
>
> I'm curious: suppose you had two points p=(x1,y1,z1) and q=(x2,y2,z2), and
> wanted to get the midpoint m; in operator notation it would just be:
>
> m = (p+q)/2;
>
> What would it look like using your function notation (and preserving p and
> q)?

You're right in that function calls for "math" get harder to read, but
that's where clean coding comes in.

vec_add(p, q, m);
vec_div(m, 2);

Is not only perfectly readable, but if you put it in a function like

vec_mid_point(p, q, m);

Is totally manageable.

This discussion is largely off topic since basically we're talking
about development practices and not language standards. Even in
something like C++ you should be working on factoring your code.

Tom
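
The out-parameter style above could look roughly like this. It is a sketch that assumes a hypothetical `vec3` struct and pointer arguments; the post does not say how `p`, `q` and `m` are declared.

```c
#include <assert.h>

/* Hypothetical point type; an assumption for illustration only. */
typedef struct { double x, y, z; } vec3;

/* out = a + b; a and b are left unchanged. */
static void vec_add(const vec3 *a, const vec3 *b, vec3 *out)
{
    out->x = a->x + b->x;
    out->y = a->y + b->y;
    out->z = a->z + b->z;
}

/* Divides v in place by the scalar k. */
static void vec_div(vec3 *v, double k)
{
    assert(k != 0.0);   /* dividing by zero is a caller error here */
    v->x /= k;
    v->y /= k;
    v->z /= k;
}

/* m = midpoint of p and q, built from the two primitives above. */
static void vec_mid_point(const vec3 *p, const vec3 *q, vec3 *m)
{
    vec_add(p, q, m);
    vec_div(m, 2.0);
}
```

No temporaries are returned by value, so the same pattern scales to larger vector types where the caller owns the storage.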

Francis Glassborow

Jun 30, 2009, 8:31:45 AM

Nick Keighley wrote:
> On 29 June, 17:37, jacob navia <ja...@jacob.remcomp.fr> wrote:
>
> <snip>
>
>> What is obvious (see my previous post) is that many modern CPUs
>> implement this features and there is no way for the [C] programmer
>> to express this in a portable manner.
>
> what do Fortran programmers do?

Use High Performance Fortran. That exists exactly because it helps if
the programmer has tools for parallel programming (particularly when
using large array processors).

>
> I don't know, but I understood super-computer programmers
> have had vector operations for years and they were traditionally
> programmed in Fortran. Did Fortran have vectors? (It didn't
> last time I wrote any Fortran).

See above.

The thing that concerns me is that neither C nor C++ provide tools to
enable programmers to explicitly write code for vector/array processors.
The best algorithm can be highly dependent on whether the code will be
processed sequentially or in parallel. This is an area where it is not
much use the compiler optimising because it may already be optimising
the wrong code.

We know that the best way to improve performance is with a 'better'
algorithm (e.g. one that matches the resources better) but most modern
languages adamantly refuse to allow the programmer to express the intent
to use massively parallel hardware.
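
For what it is worth, some compilers do offer explicit vector types as an extension. The sketch below assumes GCC or Clang and their `vector_size` attribute, which is not ISO C; the function name `add4` is made up for illustration.

```c
#include <string.h>

/* GCC/Clang extension, not ISO C: a 16-byte vector holding four floats. */
typedef float v4sf __attribute__((vector_size(16)));

/* Adds two groups of four floats using the vector type.
   memcpy moves data in and out without alignment assumptions. */
static void add4(const float a[4], const float b[4], float out[4])
{
    v4sf va, vb, vc;
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    vc = va + vb;               /* element-wise addition on the vector type */
    memcpy(out, &vc, sizeof vc);
}
```

This expresses the intent directly, at the cost of portability to compilers that lack the extension.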

Tom St Denis

Jun 30, 2009, 8:46:01 AM

On Jun 30, 8:31 am, Francis Glassborow
<francis.glassbo...@btinternet.com> wrote:
> We know that the best way to improve performance is with a 'better'
> algorithm (e.g. one that matches the resources better) but most modern
> languages adamantly refuse to allow the programmer to express the intent
> to use massively parallel hardware.

Again ... libraries people, libraries.

Suppose I was writing an application to perform massive ffts. I'd
love to have a single call from the high level be like

fft_forward(in, out, size);

Why can't that be a library call that breaks up the job, and seeds it
to spawned threads, etc, behind the scenes? That's the thing, you
write your highly [possibly platform specific] optimized library once,
and use it many times.

Ultimately your Fortran library code boils down to probably C and asm
behind the scenes. For example, calls to the kernel, pthreads, libc,
etc...

So if you just bought a $20M grid computer, and don't plan on having
common optimized libraries for it, regardless of the language, you're
an idiot. And if you do have optimized libraries for it, then clearly
you're not re-writing the wheel for every application you run on it.

James Kuyper

Jun 30, 2009, 8:58:35 AM

Tom St Denis wrote:
> On Jun 30, 8:13 am, "BartC" <ba...@freeuk.com> wrote:
>>> for (i = 0; i < MAX_LEN && string[i]; ++i) { ... }
>> There is no reason why there can't be both types, with the streamlined
>> version used in 90% of cases where you are just iterating over A to B.
>
> #define FOR(x,a,b) for (x = a; x < b; x++)
>
> FOR(x, 0, 10) { printf("x == %d\n", x); }
>
> For the lazy I suppose...

You're not thinking "lazy" enough. The point isn't to reduce a little
bit of typing. The point is to get rid of the necessity of keeping track
of the size of any array, when the compiler already knows it. Something
more like this:

#define FOR(array) for(size_t array##_index = 0; \
array##_index < sizeof array; array##_index++)

I'm using the token-pasting operator ## to reduce (but not eliminate)
the possibility of name conflicts.

James Kuyper

Jun 30, 2009, 9:03:13 AM

James Kuyper wrote:
...

> #define FOR(array) for(size_t array##_index = 0; \
> array##_index < sizeof array; array##_index++)

The loop limit should have been (sizeof array/sizeof array[0]), of course.
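
Putting the correction together, a complete version might read as follows. This is a sketch: `ARRAY_LEN` and `sum_demo` are names introduced here, the index is initialised to zero, and C99 is assumed for the declaration inside the `for`. It only works on true arrays, since `sizeof` applied to a pointer gives the pointer's size.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(array) (sizeof (array) / sizeof (array)[0])

/* Iterates array##_index over every valid index of `array`. */
#define FOR(array)                                         \
    for (size_t array##_index = 0;                         \
         array##_index < ARRAY_LEN(array);                 \
         array##_index++)

/* Usage example: sums a small array without spelling out its length. */
static int sum_demo(void)
{
    int values[] = { 1, 2, 3, 4 };
    int sum = 0;
    FOR(values)
        sum += values[values_index];
    return sum;
}
```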

Tom St Denis

Jun 30, 2009, 9:04:54 AM

But what if the array isn't of char? So you need to get a

(sizeof array / sizeof array[0]) in there.

What I think these people need is something along the lines of

char *my_program =
// C# program goes here
;

int main(void) {
FILE *f;
f = fopen("src.cs", "w");
fprintf(f, "%s", my_program);
fclose(f);
return system("mono src.cs");
}

Then they can write all the C# they want in C. :-)

Tom

Dik T. Winter

Jun 30, 2009, 9:49:21 AM

In article <f18003f2-8600-45d3...@d32g2000yqh.googlegroups.com> Nick Keighley <nick_keigh...@hotmail.com> writes:
...

> > A proper loop would just require 'for', i, A and B, and some syntax. And
> > often the syntax indicates whether the loop goes up or down.
>
> ah, you want Algol-68!
>
> for i from 1 to 10 by 1 do something;
> for i from 1 to 10 do something;
> to 10 do something;
>
> (you can drop many of the fields, I'm not certain you can drop i)

Yes, it can be dropped. You can also add a 'while'. And you *must* add the
'od'.

> > (OK, a C for statement is just a kind of elaborate while loop, and is more
> > flexible than the for statements of other languages (but then, if
> > statements and gotos are also quite flexible).
>
> ah, you want flexibility. You want Algol-60
> FOR k:=1,V12 WHILE V1<N DO something;
> FOR j:=I+G,L,1 STEP 1 UNTIL N, C+D DO A[k,j]:=B[k,j];
> the semantics are ... interesting

Indeed:
FOR days:= 31, if mod(year, 4) = 0 then 29 else 28, 31, 30, 31, 30, 31,
31, 30, 31, 30, 31 DO
BEGIN mdays[month]:= days; month:= month + 1 END;
--
dik t. winter, cwi, science park 123, 1098 xg amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Giacomo Catenazzi

Jun 30, 2009, 9:50:23 AM

Tom St Denis wrote:
> On Jun 30, 5:43 am, "BartC" <ba...@freeuk.com> wrote:
>> Giacomo Catenazzi wrote:
>>> jacob navia wrote:
>>>> Tom St Denis wrote:
>>>>>> Having the type built-in to the language however means you can
>>>>>> just use +, -, *, /, == and != a lot of the time.
>>>>> Then use C++.
>>>> Obviously operator overloading is OK if it is C++,
>>>> wrong if it is in C.
>>> I really think adding overloading will confuse more users: they need
>>> to parse the meaning carefully, implicit type conversions etc.
>>> (thus putting an extra burden on the reader in the name of readability?).
>>> Write it explicitly, e.g. "add_vector(a,b,n)". A few more keystrokes,
>>> but the code is surely more readable and it gives less confusion!
>> I'm curious: suppose you had two points p=(x1,y1,z1) and q=(x2,y2,z2), and
>> wanted to get the midpoint m; in operator notation it would just be:
>>
>> m = (p+q)/2;
>>
>> What would it look like using your function notation (and preserving p and
>> q)?
>
> You're right in that function calls for "math" get harder to read, but
> that's where clean coding comes in.
>
> vec_add(p, q, m);
> vec_div(m, 2);

Not really. Note the different domains:
the first one is a vector addition (by components), the second a division by a scalar.

But take another example, two vectors:
vector v[2];
vec_add(v[0], v[1]);

Now write it in the original notation.
Is p[0] the first vector or the first coordinate?

ciao
cate

Giacomo Catenazzi

Jun 30, 2009, 9:53:31 AM

No, but the syntax should allow all of these (I've seen proposals for
all of these).
Thus an error could give strange results because the compiler picked an
unexpected overload, and any resulting compiler errors could be
more confusing. Check C++ errors with a C style program.

ciao
cate

Dik T. Winter

Jun 30, 2009, 9:52:10 AM

In article <332e444e-6b33-440e...@j32g2000yqh.googlegroups.com> Nick Keighley <nick_keigh...@hotmail.com> writes:
> On 29 June, 17:37, jacob navia <ja...@jacob.remcomp.fr> wrote:

...

> > What is obvious (see my previous post) is that many modern CPUs
> > implement this features and there is no way for the [C] programmer
> > to express this in a portable manner.
>
> what do Fortran programmers do?
>
> I don't know, but I understood super-computer programmers
> have had vector operations for years and they were traditionally
> programmed in Fortran. Did Fortran have vectors? (It didn't
> last time I wrote any Fortran).

No, but Fortran has very strict anti-aliasing rules. That is, if you assign
something to a variable or an array element it must not be visible through
another name. This ensures that when a subroutine contains
      DO 10 I = 1, N
   10 A(I) = B(I)
the compiler can be certain that A and B do not reference the same storage
area (it can at least assume that).
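
The closest C counterpart is probably the C99 `restrict` qualifier, which lets the programmer make the same no-overlap promise per pointer. A minimal sketch, with a made-up function name:

```c
#include <stddef.h>

/* restrict tells the compiler that dst and src do not refer to
   overlapping storage, much like the Fortran rule for A and B.
   Calling this with overlapping arrays is undefined behaviour. */
static void copy_doubles(double *restrict dst,
                         const double *restrict src,
                         size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}
```

Unlike Fortran, the promise is opt-in: without `restrict` the compiler generally has to allow for `dst` and `src` aliasing each other.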

BartC

Jun 30, 2009, 11:47:38 AM

Tom St Denis wrote:
> On Jun 30, 8:13 am, "BartC" <ba...@freeuk.com> wrote:
>>> for (i = 0; i < MAX_LEN && string[i]; ++i) { ... }
>>
>> There is no reason why there can't be both types, with the
>> streamlined version used in 90% of cases where you are just
>> iterating over A to B.
>
> #define FOR(x,a,b) for (x = a; x < b; x++)
>
> FOR(x, 0, 10) { printf("x == %d\n", x); }

This is similar to the function argument. Some things just need to be
built-in and fully supported rather than bolted on with macros and
functions. As it is now no-one could use the above FOR macro in shared or
published code because it is not standard.

>
> For the lazy I suppose...
>
>> It's better because you can't accidentally use <= instead of < or ++j
>> instead of ++i.
>>
>> And in this last example, Nick mentioned Algol60 where it would be
>> more like:
>>
>> for i:=0 to MAX_LEN while string[i] do ...
>
> I'm going to say this as politely as I can. Real programmers/
> developers do not get confused by the meaning of <= and < all that
> often. Sure off-by-one errors do occur (technically you have one in
> your Algol60 example since 0...MAX_LEN is MAX_LEN+1 elements) but
> they're more the result of being tired or rushed than not
> understanding what you are doing.
>
> Basically, if you claim to be a C developer and can't handle a basic
> comparison expression as say used in a for loop, you should
> reconsider your profession.

I just happen to think that:

for (i=0; i<sizeof array/sizeof array[0]; ++i) printf("%d ",array[i]);

is more tedious (and error-prone) to write than, say:

for i=0 to array.upb do print array[i]

or:

forall x in array do print x

or even just:

print array

I'm not saying C should be upgraded to this level, but I do consider those
forms 'better' when they are available. It's not a question of being lazy or
incompetent.

And, getting back to the topic, it's far easier to recognize what 'print
array' is doing, and optimise that operation, than messing about trying to
analyse code written at far too low a level and figure what it's trying to
do.

And being built-in, 'print' will work for anything. (OK, print is probably
not a good example for optimising.)

--
Bart

jameskuyper

Jun 30, 2009, 12:24:46 PM

BartC wrote:
> Tom St Denis wrote:

...

> > #define FOR(x,a,b) for (x = a; x < b; x++)
> >
> > FOR(x, 0, 10) { printf("x == %d\n", x); }
>
> This is similar to the function argument. Some things just need to be
> built-in and fully supported rather than bolted on with macros and
> functions. As it is now no-one could use the above FOR macro in shared or
> published code because it is not standard.

Why not? Is there something that prohibits shared/published code from
#defining FOR in this fashion?

BartC

Jun 30, 2009, 12:41:41 PM

"jameskuyper" <james...@verizon.net> wrote in message
news:53f63d58-ee70-4c02...@x5g2000yqk.googlegroups.com...

For a start, every time someone posts code here, or in any other forum, or in
a book, or in a collection of snippets, it will have to be accompanied by
dozens or hundreds of lines of macro definitions.

And for every hundred people doing this, there will be a hundred slightly
different collections of macros. Effectively, everyone will be using their
own mini-language.

(And in the above example, I don't know if he intended the loop to run from
0 to 9, or 0 to 10.)

--
Bart
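
On the range question: the quoted macro tests `x < b`, so the upper bound is excluded and `FOR(x, 0, 10)` visits 0 through 9. The sketch below adds parentheses around the macro arguments (an addition, not part of the quoted macro) and uses a hypothetical helper, `count_iterations`, to make the half-open range visible.

```c
/* Half-open range: x takes the values a, a+1, ..., b-1. */
#define FOR(x, a, b) for ((x) = (a); (x) < (b); (x)++)

/* Counts how many times the loop body runs for a given range. */
static int count_iterations(int first, int limit)
{
    int x;
    int n = 0;
    FOR(x, first, limit)
        n++;
    return n;
}
```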

Francis Glassborow

Jun 30, 2009, 1:01:12 PM

Tom St Denis wrote:
> On Jun 30, 8:31 am, Francis Glassborow
> <francis.glassbo...@btinternet.com> wrote:
>> We know that the best way to improve performance is with a 'better'
>> algorithm (e.g. one that matches the resources better) but most modern
>> languages adamantly refuse to allow the programmer to express the intent
>> to use massively parallel hardware.
>
> Again ... libraries people, libraries.
>
> Suppose I was writing an application to perform massive ffts. I'd
> love to have a single call from the high level be like
>
> fft_forward(in, out, size);
>

But what are you going to write your library in? Libraries written in C
are just like user code. Libraries written in assembler are as
unportable as you can get. I want to be able to write my, for example,
FFT so that it will port to any system that provides me with appropriate
resources. Your suggestion stops me from doing that (at least effectively).

Tom St Denis

Jun 30, 2009, 1:15:30 PM

On Jun 30, 1:01 pm, Francis Glassborow

I have no idea what you're talking about. I'm the author of
performance bignum libraries written in C with the option of non-
portable assembler implementations. Basically with the right define
some macros change to asm and boom more speed. The bulk of the code
though is all portable C.

In this case, you could use pthreads to start up job threads which
wait for jobs to be sent to them. So your FFT library would have an
init function that sets that up, then each time you issue a job e.g.

fft_forward(in, out, size);

It farms it off to threads that are already waiting for work.
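
A minimal sketch of that arrangement, assuming POSIX threads are available. It is deliberately simplified to a single worker and a small fixed-size queue; the names `struct pool`, `pool_init`, `pool_submit`, `pool_shutdown` and `square_job` are hypothetical and are not taken from any existing library.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_JOBS 16

/* A job is a function plus its argument (hypothetical layout). */
struct job { void (*fn)(void *); void *arg; };

struct pool {
    pthread_t       thread;
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    struct job      queue[MAX_JOBS];   /* ring buffer of pending jobs */
    int             head, count, quit;
};

/* Worker: sleeps until a job arrives, runs it, repeats until told to quit. */
static void *worker(void *p)
{
    struct pool *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (pool->count == 0 && !pool->quit)
            pthread_cond_wait(&pool->wake, &pool->lock);
        if (pool->count == 0) {            /* quit requested, queue drained */
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        struct job j = pool->queue[pool->head];
        pool->head = (pool->head + 1) % MAX_JOBS;
        pool->count--;
        pthread_mutex_unlock(&pool->lock);
        j.fn(j.arg);                       /* run the job outside the lock */
    }
}

/* The "init function": starts the worker. Returns 0 on success. */
int pool_init(struct pool *pool)
{
    pool->head = pool->count = pool->quit = 0;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->wake, NULL);
    return pthread_create(&pool->thread, NULL, worker, pool);
}

/* Queues one job. Returns 0 on success, -1 if the queue is full. */
int pool_submit(struct pool *pool, void (*fn)(void *), void *arg)
{
    int rc = -1;
    pthread_mutex_lock(&pool->lock);
    if (pool->count < MAX_JOBS) {
        int tail = (pool->head + pool->count) % MAX_JOBS;
        pool->queue[tail].fn = fn;
        pool->queue[tail].arg = arg;
        pool->count++;
        rc = 0;
        pthread_cond_signal(&pool->wake);
    }
    pthread_mutex_unlock(&pool->lock);
    return rc;
}

/* Lets already-queued jobs finish, then stops and joins the worker. */
void pool_shutdown(struct pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->quit = 1;
    pthread_cond_signal(&pool->wake);
    pthread_mutex_unlock(&pool->lock);
    pthread_join(pool->thread, NULL);
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->wake);
}

/* Example job: squares the int it is pointed at. */
void square_job(void *arg)
{
    int *x = arg;
    *x = *x * *x;
}
```

A library call such as fft_forward would then split its input into pieces and hand each piece to pool_submit; a real implementation would presumably use several workers and a way to wait for individual jobs.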

My point though is if you dropped millions of dollars on a cluster
computer and can't spend the time to invest or acquire a performance
numerical library, you're an idiot. And the fact that performance
libraries already exist only serves to bolster my point.

Tom

Tom St Denis

Jun 30, 2009, 1:21:26 PM

On Jun 30, 11:47 am, "BartC" <ba...@freeuk.com> wrote:
> > FOR(x, 0, 10) { printf("x == %d\n", x); }
>
> This is similar to the function argument. Some things just need to be
> built-in and fully supported rather than bolted on with macros and
> functions. As it is now no-one could use the above FOR macro in shared or
> published code because it is not standard.

Well yeah, but my point is no real C developer fears the for keyword.
So I'll promise not to make up contrived counter-points if you don't
make up contrived grievances. Ok?

> I'm not saying C should be upgraded to this level, but I do consider those
> forms 'better' when they are available. It's not a question of being lazy or
> incompetent.

Depends on your design. You could encapsulate "size" data in a
struct ... e.g.

struct vector {
    int used, size;
    float *v;
};

Then you don't need to use sizeof, you just use vector.used to know
how many to step through.

It's called software design.

> And, getting back to the topic, it's far easier to recognize what 'print
> array' is doing, and optimise that operation, than messing about trying to
> analyse code written at far too low a level and figure what it's trying to
> do.

vector_print(myvector);

Which then steps through vector.v[0...vector.used-1].

Wow, that's hard.
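
A minimal sketch of such a routine, assuming the `struct vector` layout shown above. The signature used here (a pointer argument plus an output stream), the return value and the one-value-per-line format are assumptions made for illustration; the post itself only shows the call `vector_print(myvector)`.

```c
#include <stdio.h>

struct vector {
    int used, size;   /* elements in use, allocated capacity */
    float *v;
};

/* Prints v[0] .. v[used-1], one value per line, to the given stream.
   Returns the number of elements printed. */
int vector_print(const struct vector *vec, FILE *out)
{
    int i;
    for (i = 0; i < vec->used; i++)
        fprintf(out, "%g\n", vec->v[i]);
    return vec->used;
}
```

The caller never touches sizeof; the element count travels with the data.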

> And being built-in, 'print' will work for anything. (OK, print is probably
> not a good example for optimising.)

But it serves to promote my point that most of these "tedious" jobs
like manipulating strings, vectors, and what not can be hidden if you
build up from a strong base.

My math library encapsulates basically the same thing [except it's
integers not floats] and I have simple routines like

mp_exptmod(a,b,c,d); // d = a^b mod c

Which hides a LOT OF HARD MATH away from the user. And not only that
but how do you write 3 operand functions in C++ anyways?

My point of all this is C is mostly fine, and the sort of things you
guys are complaining about can be addressed through the proper use of
software design and existing libraries. We don't need to integrate
things like OpenMP if pthreads is available. You can easily write job
servers hiding away in the backend to split up things like DCT/FFT
work or what not.

Tom

jameskuyper

Jun 30, 2009, 1:46:22 PM

Tom St Denis wrote:
> On Jun 30, 11:47 am, "BartC" <ba...@freeuk.com> wrote:
> > > FOR(x, 0, 10) { printf("x == %d\n", x); }
> >
> > This is similar to the function argument. Some things just need to be
> > built-in and fully supported rather than bolted on with macros and
> > functions. As it is now no-one could use the above FOR macro in shared or
> > published code because it is not standard.
>
> Well yeah, but my point is no real C developer fears the for keyword.
> So I'll promise not to make up contrived counter-points if you don't
> make up contrived grievances. Ok?

I don't fear the 'for' keyword; but I don't want to use it more often
than I have to. I've used languages like APL and IDL where array
operations are built into the language itself. Without all those for
loops my code looks a lot cleaner. It more closely represents the way
that I think about the formulas that I'm converting into code, and I
gather it vectorizes pretty efficiently. Both of those languages are
interpreted, which makes them pretty inefficient, but if
array-processing forms a sufficiently large portion of what a program is
doing, it can operate at speeds approaching those achievable with C.

I'd like to combine the efficiency of C with the convenience of array
operations in those languages, and I don't think that it's
intrinsically impossible, but to do the job right would probably
result in a language that wasn't backward compatible with C. Operator
overloading in C++ allows something similar, but is quite a bit
clumsier.

Richard Bos

Jun 30, 2009, 3:36:04 PM

Keith Thompson <ks...@mib.org> wrote:

> jacob navia <ja...@jacob.remcomp.fr> writes:
> > Tom St Denis wrote:
> >>> Having the type built-in to the language however means you can just use
> >>> +, -, *, /, == and != a lot of the time.
> >>
> >> Then use C++.
> >
> > Obviously operator overloading is OK if it is C++,
> > wrong if it is in C.
> >

> > We should keep C in its cage, and keep all features
> > that make sense for C++.
> >

> > C should just be kept frozen forever.
> >
> > Change to C++!
>
> Sarcasm is not persuasive.

Yes, it is, but it needs to be intelligent, non-repetitive sarcasm for
that to work.

Richard

Richard Bos

Jun 30, 2009, 3:36:06 PM

jacob navia <ja...@jacob.remcomp.fr> wrote:

> robert...@yahoo.com wrote:
> >
> > The syntactic niceties of being able to specify a whole vector as an
> > operand are a separate issue, but would be easily handled with some
> > templates or classes in C++.
>
> Well, I am just saying that C could do the same.
> Without classes or templates obviously, just with operator
> overloading.

No, you're saying that it _should_, not that it could.
And yes, you're using that as an argument to try and bring your pet
extension into the language (again).

You don't need to convince anyone on the _could_. That much is obvious.
But you have yet to convince me of the _desirability_ both of vector
operations and of *barf* operator overloading.

Richard

Richard Bos

Jun 30, 2009, 3:36:04 PM

"BartC" <ba...@freeuk.com> wrote:

> luserXtrog wrote:
> > On Jun 29, 12:07 pm, "BartC" <ba...@freeuk.com> wrote:
> >
> >> I don't personally think these higher-level types belong in C, but
> >> in a more appropriate language, because you can't just add a handful
> >> of advanced features while the rest of the language doesn't even
> >> have a proper 'for' statement.
> >
> > In what way is C's 'for' statement improper?
>

> Every time I write a C 'for' statement, it feels like I have to tell the
> compiler how to implement it:
>
> for ( i = A; i<=B; ++i) .....
>
> Notice: the loop variable appears 3 times instead of once; the compare
> operation has to be supplied; the increment operation has to be supplied. An
> extra four things to get wrong.
>

> A proper loop would just require 'for', i, A and B, and some syntax. And
> often the syntax indicates whether the loop goes up or down.

If you want BASIC, you know where to find it.

C's for loop is more powerful than that. In what other language (except
those which nicked it from C) can you write

for (node=head; node; node=node->next)
    frobnicate(node->payload);

or

for (x=0, y=0; cell[x][y]<threshold; x+=dx, y+=dy) {
    colour_neighbours(cell, x, y);
    adjust_step(x,y, &dx,&dy);
}
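
For readers who want to try the first form, here is a self-contained sketch. The `struct node` layout and the function `sum_payloads` are assumptions made for illustration; only the shape of the for statement is taken from the example above.

```c
#include <stddef.h>

/* Hypothetical singly linked list node. */
struct node {
    int payload;
    struct node *next;
};

/* Walks the list with the same for-loop shape as the first example
   and returns the sum of all payloads (0 for an empty list). */
int sum_payloads(const struct node *head)
{
    int total = 0;
    const struct node *node;
    for (node = head; node; node = node->next)
        total += node->payload;
    return total;
}
```

Initialisation, termination test and step all sit in the loop header, so the body contains only the work done per node.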

Richard

Lew Pitcher

Jun 30, 2009, 3:42:36 PM

On June 30, 2009 05:20, in comp.lang.c, BartC (ba...@freeuk.com) wrote:

> luserXtrog wrote:
>> On Jun 29, 12:07 pm, "BartC" <ba...@freeuk.com> wrote:
>>
>>> I don't personally think these higher-level types belong in C, but
>>> in a more appropriate language, because you can't just add a handful
>>> of advanced features while the rest of the language doesn't even
>>> have a proper 'for' statement.
>>
>> In what way is C's 'for' statement improper?
>
> Every time I write a C 'for' statement, it feels like I have to tell the
> compiler how to implement it:
>
> for ( i = A; i<=B; ++i) .....
>
> Notice: the loop variable appears 3 times instead of once; the compare
> operation has to be supplied; the increment operation has to be supplied.
> An extra four things to get wrong.
>
> A proper loop would just require 'for', i, A and B, and some syntax. And
> often the syntax indicates whether the loop goes up or down.

OK, I propose the following syntax for a "proper" loop

PERFORM VARYING variable
FROM initial_value
BY increment_value
UNTIL termination_condition
<statements to be performed as a loop>
END-PERFORM;

;-)
--
Lew Pitcher

Master Codewright & JOAT-in-training | Registered Linux User #112576
http://pitcher.digitalfreehold.ca/ | GPG public key available by request
---------- Slackware - Because I know what I'm doing. ------

robert...@yahoo.com

Jun 30, 2009, 3:43:18 PM

On Jun 29, 4:21 pm, jacob navia <ja...@jacob.remcomp.fr> wrote:

> robertwess...@yahoo.com wrote:
>
> > The syntactic niceties of being able to specify a whole vector as an
> > operand are a separate issue, but would be easily handled with some
> > templates or classes in C++.
>
> Well, I am just saying that C could do the same.
> Without classes or templates obviously, just with operator
> overloading.
>

> Your example of MSVC vectorizing is nice, but as you
> know very well, after decades of efforts, the result
> is very brittle and you need very specific
> loops and a very specific situation for it. With
> straightforward addition it will work, but with a
> slightly more complicated expression it will fail.

While I don't know exactly what you're proposing to add in terms of
operations - it's likely that only simple vector operations will be
supported, just the kind that compilers *can* pick out of for loops.
In C (without "restrict") the aliasing thing is a big deal (which is
not a problem Fortran has), and the loops allow you to express any
code, not just code that can be easily vectorized.
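
The aliasing point can be illustrated with the C99 `restrict` qualifier. In the sketch below the function name `add_restrict` and its parameter order are assumptions; the qualifier itself is standard C99. It lets the programmer promise that the output array is not reachable through the input pointers, which is the guarantee a vectorizer would otherwise have to prove or check at run time.

```c
#include <stddef.h>

/* Elementwise c = a + b. The restrict qualifiers assert that, within this
   function, the storage written through c is not also accessed through
   a or b, so stores to c[i] cannot change later loads of a[] or b[]. */
void add_restrict(size_t n, float *restrict c,
                  const float *restrict a, const float *restrict b)
{
    size_t i;
    for (i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```

Calling this with overlapping arrays would break the promise and is undefined behaviour, which is the price paid for giving the compiler that freedom.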

Out of curiosity, do your proposed extensions support conditional
operations inside the vector? Are you intending to expose the usual
vector mask/merge way of doing that? For example, will your extensions allow
you to express:

for(i=0; i<x; i++)
{
    if (a[i]<b[i])
        c[i]=d[i]+e[i];
    else
        c[i]=d[i]*f[i]+3;
}

(Which would have been vectorized by pretty much any vectorizing
Fortran compiler in 1980).

jacob navia

Jun 30, 2009, 4:06:01 PM

jacob navia

Jun 30, 2009, 4:27:01 PM

robert...@yahoo.com wrote:
> Out of curiosity, do your proposed extensions support conditional
> operations inside the vector? Are you intending to expose the usual
> vector mask/merge way of doing that? For example, will your extensions allow
> you to express:
>
> for(i=0; i<x; i++)
> {
> if (a[i]<b[i])
> c[i]=d[i]+e[i];
> else
> c[i]=d[i]*f[i]+3;
> }
>
>
> (Which would have been vectorized by pretty much any vectorizing
> Fortran compiler in 1980).

Sure.

Can you tell me of a C compiler that vectorizes that today?

Thanks

In a 1991 review of vectorizing compilers, Levine et al. propose this loop:
do 1 n1=1,2*ntimes
do 10 i=2,n,2
a(i)=a(i-1)+b(i)
10 continue
call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
1 continue

Some 78% of similar simple loops were vectorized.

Of course in 1991 they forgot what they knew in 1980.

Sure.

And we are speaking of dependence analysis in FORTRAN, not
in C with its pointers, aliases and what have you.

To answer your question, yes, you build a boolean
vector with a<b then you assign to c

c = d+(booleanVector*e);
c = d+(!booleanVector)*(f+3);

Actually, it is just APL.

jacob navia

Jun 30, 2009, 4:48:28 PM

jacob navia wrote:
>
> To answer your question, yes, you build a boolean
> vector with a<b then you assign to c
>
> c = d+(booleanVector*e);
> c = d+(!booleanVector)*(f+3);

Sorry, the second line should have been
c += d+(!booleanVector)*(f+3);
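
For reference, the mask/merge idea can be spelled out in plain scalar C and checked against the branching loop quoted above. This is only a sketch: the function names `select_branch` and `select_mask` are hypothetical, and the arithmetic blend assumes finite values (hardware select instructions blend bitwise instead). The point it shows is that the blend has to choose between the two complete results, d+e and d*f+3.

```c
#include <stddef.h>

/* Branching form, as in the quoted loop. */
void select_branch(size_t n, float *c, const float *a, const float *b,
                   const float *d, const float *e, const float *f)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (a[i] < b[i])
            c[i] = d[i] + e[i];
        else
            c[i] = d[i] * f[i] + 3;
    }
}

/* Mask/merge form: compute both results, then blend with a 0/1 mask. */
void select_mask(size_t n, float *c, const float *a, const float *b,
                 const float *d, const float *e, const float *f)
{
    size_t i;
    for (i = 0; i < n; i++) {
        float m = (float)(a[i] < b[i]);   /* 1.0f where a<b, else 0.0f */
        float t = d[i] + e[i];            /* result of the "then" branch */
        float u = d[i] * f[i] + 3;        /* result of the "else" branch */
        c[i] = m * t + (1.0f - m) * u;    /* merge the two under the mask */
    }
}
```

In array notation the second function corresponds to c = m*(d+e) + (1-m)*(d*f+3), with m the boolean vector a<b.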

BartC

Jun 30, 2009, 5:17:46 PM

"Richard Bos" <ral...@xs4all.nl> wrote in message
news:4a49fef2...@news.xs4all.nl...

> "BartC" <ba...@freeuk.com> wrote:
>
>> luserXtrog wrote:

>> > In what way is C's 'for' statement improper?

>> A proper loop would just require 'for', i, A and B, and some syntax. And

>> often the syntax indicates whether the loop goes up or down.
>
> If you want BASIC, you know where to find it.
>
> C's for loop is more powerful than that. In what other language (except
> those which nicked it from C) can you write
>
> for (node=head; node; node=node->next)
> frobnicate(node->payload);

> for (x=0, y=0; cell[x][y]<threshold; x+=dx, y+=dy) {

> colour_neighbours(cell, x, y);
> adjust_step(x,y, &dx,&dy);
> }

In any language that supports while-do loops, I would imagine, since that is
pretty much what the C for-statement amounts to:

node=head;
while (node) {
frobnicate(node->payload); /* Do something x-rated, it sounds like */
node=node->next;
}

However what I had in mind was the very common requirement of simply
iterating over the range A to B. Doing it the C way is not the end of the
world but, well, the compiler isn't really doing very much to earn its keep
here.

--
Bart

jacob navia

Jun 30, 2009, 6:52:33 PM

jameskuyper wrote:
> I've used languages like APL and IDL where array
> operations are built into the language itself. Without all those for
> loops my code looks a lot cleaner. It more closely represents the way
> that I think about the formulas that I'm converting into code, and I
> gather it vectorizes pretty efficiently. Both of those languages are
> interpreted, which makes them pretty inefficient, but if
> array-processing forms a sufficiently large portion of what a program is
> doing, it can operate at speeds approaching those achievable with C.
>
> I'd like to combine the efficiency of C with the convenience of array
> operations in those languages, and I don't think that it's
> intrinsically impossible, but to do the job right would probably
> result in a language that wasn't backward compatible with C. Operator
> overloading in C++ allows something similar, but is quite a bit
> clumsier.

Well, that is exactly my goal. Essentially, APL.
This allows parallelism by using a naturally parallel
data type: arrays.

Tim Prince

Jun 30, 2009, 9:33:11 PM

Nick Keighley wrote:
> On 29 June, 17:37, jacob navia <ja...@jacob.remcomp.fr> wrote:
>

> <snip>

>
>> What is obvious (see my previous post) is that many modern CPUs
>> implement these features and there is no way for the [C] programmer
>> to express this in a portable manner.
>
> what do Fortran programmers do?
>
> I don't know, but I understood super-computer programmers
> have had vector operations for years and they were traditionally
> programmed in Fortran. Did Fortran have vectors? (It didn't
> last time I wrote any Fortran).
>

You may care to compare how your Fortran, C, and C++ compilers deal with
the same 18-year-old public auto-vectorization benchmark:

http://sites.google.com/site/tprincesite/levine-callahan-dongarra-vectors

But soon, Fortran will have co-arrays, as well as the array assignments it
gained 18 years ago:
www.fortran.bcs.org/2006/ukfortran06.pdf
said to be commonly implemented in UPC:
http://upc.lbl.gov/
(another withering offshoot from C?)

jacob navia

Jul 1, 2009, 2:19:50 AM

In that report I can read:

<quote>
However, the major vendors reported pressure
from users to provide co-arrays and it was
decided (straw vote 6-3-2) to keep them.
<end quote>

What a difference, yes. In Fortran, users can still influence the
decisions of the committee.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32

Flash Gordon

Jul 1, 2009, 3:35:35 AM

jacob navia wrote:
> Tim Prince wrote:

<snip>

>> www.fortran.bcs.org/2006/ukfortran06.pdf
>> said to be commonly implemented in UPC:
>> http://upc.lbl.gov/
>> (another withering offshoot from C?)
>
> In that report I can read:
>
> <quote>
> However, the major vendors reported pressure
> from users to provide co-arrays and it was
> decided (straw vote 6-3-2) to keep them.
> <end quote>
>
> What a difference, yes. In Fortran, users can still influence the
> decisions of the committee.

You forget that most of the people here disagreeing with you are users.
So, as far as I can see, by not introducing all your pet extensions the
committee *is* going with what the majority of C users want! Also that
report said it was major vendors being pressured for features who passed
this request on to the committee, not a single minor vendor. So possibly
if you want a better chance start by getting your extensions implemented
in gcc and persuading the gcc team to accept them, then go pressuring MS
to add them... You will probably need to get a lot of other people to
pressure for the extensions to be added as well.
--
Flash Gordon

robert...@yahoo.com

Jul 1, 2009, 4:14:55 AM

On Jun 30, 3:27 pm, jacob navia <ja...@jacob.remcomp.fr> wrote:
> Can you tell me of a C compiler that vectorizes that today?
>
> Thanks

Compiling the following with ICC 11.1 (with "-O3 -fp:fast"):

float a[64], b[64], c[64], d[64], e[64], f[64];

void v1(void)
{
    int i;

    for (i=0; i<64; i++)
        c[i] = a[i] + b[i];
}

void v2()
{
    int i;

    for(i=0; i<64; i++)
    {
        if (a[i]<b[i])
            c[i]=d[i]+e[i];
        else
            c[i]=d[i]*f[i]+3;
    }
}

It fully vectorized (and partially unrolled) both routines:

; -- Machine type PW
; mark_description "Intel(R) C++ Compiler for applications running on IA-32, Version 11.1 Build 20090511 %s";
; mark_description "-c -Fa -O3 -fp:fast";
.686P
.387
OPTION DOTNAME
ASSUME CS:FLAT,DS:FLAT,SS:FLAT
_TEXT SEGMENT PARA PUBLIC FLAT 'CODE'
; COMDAT _v1
TXTST0:
; -- Begin _v1
; mark_begin;
IF @Version GE 800
.MMX
ELSEIF @Version GE 612
.MMX
MMWORD TEXTEQU <QWORD>
ENDIF
IF @Version GE 800
.XMM
ELSEIF @Version GE 614
.XMM
XMMWORD TEXTEQU <OWORD>
ENDIF
ALIGN 16
PUBLIC _v1
_v1 PROC NEAR
.B1.1: ; Preds .B1.0
xor eax, eax ;12.5
; LOE eax ebx ebp esi edi
.B1.2: ; Preds .B1.2 .B1.1
movaps xmm0, XMMWORD PTR [_a+eax*4] ;13.16
movaps xmm1, XMMWORD PTR [_a+16+eax*4] ;13.16
movaps xmm2, XMMWORD PTR [_a+32+eax*4] ;13.16
movaps xmm3, XMMWORD PTR [_a+48+eax*4] ;13.16
movaps xmm4, XMMWORD PTR [_a+64+eax*4] ;13.16
movaps xmm5, XMMWORD PTR [_a+80+eax*4] ;13.16
movaps xmm6, XMMWORD PTR [_a+96+eax*4] ;13.16
movaps xmm7, XMMWORD PTR [_a+112+eax*4] ;13.16
addps xmm0, XMMWORD PTR [_b+eax*4] ;13.23
addps xmm1, XMMWORD PTR [_b+16+eax*4] ;13.23
addps xmm2, XMMWORD PTR [_b+32+eax*4] ;13.23
addps xmm3, XMMWORD PTR [_b+48+eax*4] ;13.23
addps xmm4, XMMWORD PTR [_b+64+eax*4] ;13.23
addps xmm5, XMMWORD PTR [_b+80+eax*4] ;13.23
addps xmm6, XMMWORD PTR [_b+96+eax*4] ;13.23
addps xmm7, XMMWORD PTR [_b+112+eax*4] ;13.23
movaps XMMWORD PTR [_c+eax*4], xmm0 ;13.9
movaps XMMWORD PTR [_c+16+eax*4], xmm1 ;13.9
movaps XMMWORD PTR [_c+32+eax*4], xmm2 ;13.9
movaps XMMWORD PTR [_c+48+eax*4], xmm3 ;13.9
movaps XMMWORD PTR [_c+64+eax*4], xmm4 ;13.9
movaps XMMWORD PTR [_c+80+eax*4], xmm5 ;13.9
movaps XMMWORD PTR [_c+96+eax*4], xmm6 ;13.9
movaps XMMWORD PTR [_c+112+eax*4], xmm7 ;13.9
add eax, 32 ;12.5
cmp eax, 64 ;12.5
jb .B1.2 ; Prob 98% ;12.5
; LOE eax ebx ebp esi edi
.B1.3: ; Preds .B1.2
ret ;15.1
ALIGN 16
; LOE
; mark_end;
_v1 ENDP
;_v1 ENDS
_TEXT ENDS
_DATA SEGMENT DWORD PUBLIC FLAT 'DATA'
_DATA ENDS
; -- End _v1
_TEXT SEGMENT PARA PUBLIC FLAT 'CODE'
; COMDAT _v2
TXTST1:
; -- Begin _v2
; mark_begin;
ALIGN 16
PUBLIC _v2
_v2 PROC NEAR
.B2.1: ; Preds .B2.0
movaps xmm0, XMMWORD PTR [_2il0floatpacket.0] ;
xor eax, eax ;24.5
; LOE eax ebx ebp esi edi xmm0
.B2.2: ; Preds .B2.2 .B2.1
movaps xmm3, XMMWORD PTR [_a+eax*4] ;26.13
movaps xmm2, XMMWORD PTR [_d+eax*4] ;27.18
movaps xmm4, XMMWORD PTR [_e+eax*4] ;27.23
movaps xmm1, XMMWORD PTR [_f+eax*4] ;29.23
movaps xmm7, XMMWORD PTR [_a+16+eax*4] ;26.13
movaps xmm6, XMMWORD PTR [_d+16+eax*4] ;27.18
movaps xmm5, XMMWORD PTR [_f+16+eax*4] ;29.23
cmpltps xmm3, XMMWORD PTR [_b+eax*4] ;26.18
cmpltps xmm7, XMMWORD PTR [_b+16+eax*4] ;26.18
addps xmm4, xmm2 ;27.23
mulps xmm2, xmm1 ;29.23
movaps xmm1, XMMWORD PTR [_e+16+eax*4] ;27.23
andps xmm4, xmm3 ;27.23
addps xmm2, xmm0 ;29.28
andnps xmm3, xmm2 ;29.28
orps xmm4, xmm3 ;29.28
movaps XMMWORD PTR [_c+eax*4], xmm4 ;27.13
addps xmm1, xmm6 ;27.23
mulps xmm6, xmm5 ;29.23
andps xmm1, xmm7 ;27.23
addps xmm6, xmm0 ;29.28
andnps xmm7, xmm6 ;29.28
orps xmm1, xmm7 ;29.28
movaps XMMWORD PTR [_c+16+eax*4], xmm1 ;27.13
add eax, 8 ;24.5
cmp eax, 64 ;24.5
jb .B2.2 ; Prob 98% ;24.5
; LOE eax ebx ebp esi edi xmm0
.B2.3: ; Preds .B2.2
ret ;31.1
ALIGN 16
; LOE
; mark_end;
_v2 ENDP
;_v2 ENDS
_TEXT ENDS
_DATA SEGMENT DWORD PUBLIC FLAT 'DATA'
_DATA ENDS
; -- End _v2
_RDATA SEGMENT DWORD PUBLIC FLAT 'DATA'
_2il0floatpacket.0 DD 040400000H,040400000H,040400000H,040400000H
_RDATA ENDS
_DATA SEGMENT DWORD PUBLIC FLAT 'DATA'
COMM _a:BYTE:256
COMM _b:BYTE:256
COMM _c:BYTE:256
COMM _d:BYTE:256
COMM _e:BYTE:256
COMM _f:BYTE:256
_DATA ENDS
END
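
The cmpltps / andps / andnps / orps sequence in the _v2 listing is the usual SSE way of selecting between two results under a mask. As a hedged sketch, the same idiom can be written by hand with the intrinsics from `<xmmintrin.h>`. The function name `v2_sse` and its parameter list are hypothetical, n is assumed to be a multiple of 4, unaligned loads are used for simplicity (the compiler output above uses aligned ones), and a scalar fallback is compiled where `__SSE__` is not defined.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* c[i] = (a[i] < b[i]) ? d[i] + e[i] : d[i] * f[i] + 3
   n is assumed to be a multiple of 4. */
void v2_sse(size_t n, float *c, const float *a, const float *b,
            const float *d, const float *e, const float *f)
{
    size_t i;
#if defined(__SSE__)
    const __m128 three = _mm_set1_ps(3.0f);
    for (i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vd = _mm_loadu_ps(d + i);
        /* Each lane becomes all-ones where a<b, all-zeros elsewhere. */
        __m128 mask = _mm_cmplt_ps(va, vb);
        /* Both branch results are computed for every lane. */
        __m128 t = _mm_add_ps(vd, _mm_loadu_ps(e + i));
        __m128 u = _mm_add_ps(_mm_mul_ps(vd, _mm_loadu_ps(f + i)), three);
        /* (mask & t) | (~mask & u): keep t where a<b, u elsewhere. */
        __m128 r = _mm_or_ps(_mm_and_ps(mask, t), _mm_andnot_ps(mask, u));
        _mm_storeu_ps(c + i, r);
    }
#else
    /* Portable fallback with the same results. */
    for (i = 0; i < n; i++)
        c[i] = (a[i] < b[i]) ? d[i] + e[i] : d[i] * f[i] + 3;
#endif
}
```

Writing this by hand ties the source to one instruction set, which is the portability cost that letting the compiler vectorize the plain loop avoids.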

robert...@yahoo.com

Jul 1, 2009, 4:37:58 AM

On Jul 1, 3:14 am, "robertwess...@yahoo.com" <robertwess...@yahoo.com>
wrote:

By rough count that executes 218 instructions for all 64 iterations of
the loop.

And in 64 bit mode with SSE4.1 turned on, ICC reduced the second
routine to (184 instructions):

v2 PROC
.B2.1:: ; Preds .B2.0
movaps xmm1, XMMWORD PTR [_2il0floatpacket.0] ;
xor edx, edx ;24.5
lea rax, QWORD PTR [__ImageBase] ;
; LOE rax rdx rbx rbp rsi rdi r12 r13 r14 r15 xmm1 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
.B2.2:: ; Preds .B2.2 .B2.1
movaps xmm0, XMMWORD PTR [imagerel(a)+rax+rdx*4] ;26.13
cmpltps xmm0, XMMWORD PTR [imagerel(b)+rax+rdx*4] ;26.18
movaps xmm3, XMMWORD PTR [imagerel(d)+rax+rdx*4] ;29.18
movaps xmm4, XMMWORD PTR [imagerel(f)+rax+rdx*4] ;29.23
mulps xmm4, xmm3 ;29.23
addps xmm4, xmm1 ;29.28
movaps xmm2, XMMWORD PTR [imagerel(e)+rax+rdx*4] ;27.23
movaps xmm5, XMMWORD PTR [imagerel(e)+16+rax+rdx*4] ;27.23
addps xmm3, xmm2 ;27.23
blendvps xmm4, xmm3, xmm0 ;29.28
movaps xmm0, XMMWORD PTR [imagerel(a)+16+rax+rdx*4] ;26.13
cmpltps xmm0, XMMWORD PTR [imagerel(b)+16+rax+rdx*4] ;26.18
movaps xmm2, XMMWORD PTR [imagerel(d)+16+rax+rdx*4] ;29.18
movaps xmm3, XMMWORD PTR [imagerel(f)+16+rax+rdx*4] ;29.23
mulps xmm3, xmm2 ;29.23
movaps XMMWORD PTR [imagerel(c)+rax+rdx*4], xmm4 ;27.13
addps xmm2, xmm5 ;27.23
addps xmm3, xmm1 ;29.28
blendvps xmm3, xmm2, xmm0 ;29.28
movaps XMMWORD PTR [imagerel(c)+16+rax+rdx*4], xmm3 ;27.13
add rdx, 8 ;24.5
cmp rdx, 64 ;24.5
jl .B2.2 ; Prob 98% ;24.5
; LOE rax rdx rbx rbp rsi rdi r12 r13 r14 r15 xmm1 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15

.B2.3:: ; Preds .B2.2
ret ;31.1

And AVX resulted in (86 instructions):

v2 PROC
.B2.1:: ; Preds .B2.0
sub rsp, 40 ;20.1
mov QWORD PTR [32+rsp], r13 ;20.1
vmovaps ymm0, YMMWORD PTR [_2il0floatpacket.0] ;
lea r13, QWORD PTR [71+rsp] ;20.1
and r13, -32 ;20.1
xor edx, edx ;24.5
lea rax, QWORD PTR [__ImageBase] ;
; LOE rax rdx rbx rbp rsi rdi r12 r14 r15 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15 ymm0
.B2.2:: ; Preds .B2.2 .B2.1
vmovaps ymm1, YMMWORD PTR [imagerel(a)+rax+rdx*4] ;26.13
vmovaps ymm3, YMMWORD PTR [imagerel(d)+rax+rdx*4] ;29.18
vmulps ymm2, ymm3, YMMWORD PTR [imagerel(f)+rax+rdx*4] ;29.23
vcmpltps ymm1, ymm1, YMMWORD PTR [imagerel(b)+rax+rdx*4] ;26.18
vaddps ymm5, ymm3, YMMWORD PTR [imagerel(e)+rax+rdx*4] ;27.23
vmovaps ymm3, YMMWORD PTR [imagerel(a)+32+rax+rdx*4] ;26.13
vaddps ymm4, ymm2, ymm0 ;29.28
vblendvps ymm2, ymm4, ymm5, ymm1 ;29.28
vcmpltps ymm1, ymm3, YMMWORD PTR [imagerel(b)+32+rax+rdx*4] ;26.18
vmovaps ymm5, YMMWORD PTR [imagerel(d)+32+rax+rdx*4] ;29.18
vmulps ymm4, ymm5, YMMWORD PTR [imagerel(f)+32+rax+rdx*4] ;29.23
vaddps ymm5, ymm5, YMMWORD PTR [imagerel(e)+32+rax+rdx*4] ;27.23
vmovaps YMMWORD PTR [imagerel(c)+rax+rdx*4], ymm2 ;27.13
vaddps ymm4, ymm4, ymm0 ;29.28
vblendvps ymm2, ymm4, ymm5, ymm1 ;29.28
vmovaps YMMWORD PTR [imagerel(c)+32+rax+rdx*4], ymm2 ;27.13
add rdx, 16 ;24.5
cmp rdx, 64 ;24.5
jl .B2.2 ; Prob 98% ;24.5
; LOE rax rdx rbx rbp rsi rdi r12 r14 r15 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15 ymm0
.B2.3:: ; Preds .B2.2
mov r13, QWORD PTR [32+rsp] ;31.1
add rsp, 40 ;31.1
ret ;31.1

Of course I don't have anything to run the AVX code on yet...

Richard Bos

Jul 2, 2009, 11:44:16 AM

jacob navia <ja...@jacob.remcomp.fr> wrote:

> jameskuyper wrote:
> > I've used languages like APL and IDL where array
> > operations are built into the language itself. Without all those for
> > loops my code looks a lot cleaner.

> Well, that is exactly my goal. Essentially, APL.

I must say, that explains a lot.

Richard

Richard Bos

Jul 5, 2009, 7:31:57 AM

"BartC" <ba...@freeuk.com> wrote:

> Tom St Denis wrote:
> > Basically, if you claim to be a C developer and can't handle a basic
> > comparison expression as say used in a for loop, you should re-
> > consider your profession.
>
> I just happen to think that:
>
> for (i=0; i<sizeof array/sizeof array[0]; ++i) printf("%d ",array[i]);
>
> is more tedious (and error-prone) to write than, say:
>
> for i=0 to array.upb do print array[i]
>
> or:
>
> forall x in array do print x
>
> or even just:
>
> print array

Possibly, but I think that "print array" does not do what I want at all.
It prints all array members in minimal resolution with spaces between
them, and I want a fixed number of digits with commas in between, except
for the last one, where I want " and ". A C for loop lets me do all that
and more. That's the problem with "high-level" solutions: they are, of
necessity, one-size-fits-all, and one-size-fits-all never really _does_
fit anyone.
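
As an illustration of that kind of formatting, here is a small sketch. The function name `format_list`, the three-digit zero-padded width and the snprintf-style return value are all assumptions chosen for the example; the requirement taken from the paragraph above is fixed-width numbers separated by commas, with " and " before the last one.

```c
#include <stdio.h>
#include <stddef.h>

/* Formats n integers as e.g. "001, 002 and 003" into buf, writing at most
   size bytes including the terminating NUL. Returns the number of
   characters the full text needs, excluding the terminator, or a
   negative value if snprintf reports an error. */
int format_list(char *buf, size_t size, const int *array, size_t n)
{
    size_t i;
    int total = 0;

    if (size > 0)
        buf[0] = '\0';                     /* empty list gives empty string */
    for (i = 0; i < n; i++) {
        /* No separator before the first item, " and " before the last,
           ", " between all the others. */
        const char *sep = (i == 0) ? "" : (i == n - 1) ? " and " : ", ";
        /* Clamp the write offset so a too-small buffer is never overrun. */
        size_t off = (size_t)total < size ? (size_t)total : size;
        int w = snprintf(buf + off, size - off, "%s%03d", sep, array[i]);
        if (w < 0)
            return w;
        total += w;
    }
    return total;
}
```

The separator logic is three lines of an ordinary for loop; a built-in "print array" would need some way to express the same choice.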

> I'm not saying C should be upgraded to this level,

I'm saying it would be a downgrading. C is not BASIC, and should not try
to be.

Richard

Richard Bos

Jul 5, 2009, 7:31:58 AM

"BartC" <ba...@freeuk.com> wrote:

> "Richard Bos" <ral...@xs4all.nl> wrote in message

> > C's for loop is more powerful than that. In what other language (except

> > those which nicked it from C) can you write
> >
> > for (node=head; node; node=node->next)
> > frobnicate(node->payload);
>
> > for (x=0, y=0; cell[x][y]<threshold; x+=dx, y+=dy) {
> > colour_neighbours(cell, x, y);
> > adjust_step(x,y, &dx,&dy);
> > }
>
> In any language that supports while-do loops, I would imagine, since that is
> pretty much what the C for-statement amounts to:
>
> node=head;
> while (node) {
> frobnicate(node->payload); /* Do something x-rated, it sounds like */
> node=node->next;
> }

Yes, but that's not nearly as elegant.

> However what I had in mind was the very common requirement of simply
> iterating over the range A to B. Doing it the C way is not the end of the
> world but, well, the compiler isn't really doing very much to earn its keep
> here.

If we were talking about Perl, I'd suggest adding (yet) another
statement, just for the purpose of looping over integers. But C should
be kept simple. That is its strength; catering to BASIC programming
styles is not.

Richard

Al Grant

Jul 8, 2009, 6:06:35 AM

On 1 July, 09:14, "robertwess...@yahoo.com" <robertwess...@yahoo.com>
wrote:

> On Jun 30, 3:27 pm, jacob navia <ja...@jacob.remcomp.fr> wrote:
> > Can you tell me of a C compiler that vectorizes that today?
>

> Compiling the following with ICC 11.1 (with "-O3 -fp:fast"):

...

Compiling it with armcc 3.1 (with "-O3 -Otime --vectorize"):

||v1|| PROC
LDR r0,|L1.140|
MOV r3,#0x10
ADD r1,r0,#0x100
ADD r2,r1,#0x100
|L1.16|
VLD1.32 {d0,d1},[r0]!
SUBS r3,r3,#1
VLD1.32 {d2,d3},[r1]!
VADD.F32 q0,q0,q1 ; 4-way FP addition
VST1.32 {d0,d1},[r2]!
BNE |L1.16|
BX lr
ENDP

||v2|| PROC
VMOV.F32 s0,#3.00000000
PUSH {r4,r5}
MOV r4,#0x10
LDR r5,|L1.140|
ADD r3,r5,#0x100
ADD r2,r5,#0x300
ADD r0,r5,#0x400
ADD r1,r5,#0x500
ADD r12,r5,#0x200
VDUP.32 q0,d0[0]
|L1.84|
SUBS r4,r4,#1
VLD1.32 {d6,d7},[r2]!
VLD1.32 {d18,d19},[r1]!
VMLA.F32 q0,q3,q9
VLD1.32 {d2,d3},[r5]!
VLD1.32 {d4,d5},[r3]!
VCGT.F32 q1,q2,q1 ; 4-way compare
VLD1.32 {d16,d17},[r0]!
VADD.F32 q2,q3,q8
VBSL q1,q2,q0 ; 4-way select
VST1.32 {d2,d3},[r12]!
BNE |L1.84|
POP {r4,r5}
BX lr
ENDP

BartC

Jul 8, 2009, 1:38:00 PM

"Richard Bos" <ral...@xs4all.nl> wrote in message

news:4a508a7d...@news.xs4all.nl...

> "BartC" <ba...@freeuk.com> wrote:
>
>> "Richard Bos" <ral...@xs4all.nl> wrote in message
>
>> > C's for loop is more powerful than that. In what other language (except
>> > those which nicked it from C) can you write
>> >
>> > for (node=head; node; node=node->next)
>> > frobnicate(node->payload);
>>
>> > for (x=0, y=0; cell[x][y]<threshold; x+=dx, y+=dy) {
>> > colour_neighbours(cell, x, y);
>> > adjust_step(x,y, &dx,&dy);
>> > }
>>
>> In any language that supports while-do loops, I would imagine, since that
>> is
>> pretty much what the C for-statement amounts to:
>>
>> node=head;
>> while (node) {
>> frobnicate(node->payload); /* Do something x-rated, it sounds like */
>> node=node->next;
>> }
>
> Yes, but that's not nearly as elegant.

The C 'for' statement has merit. I've since added it to one of my own
languages, first as a 'cfor' statement, then integrated it into the while
statement. Not too elegant, but works:

for (a;b;c) {something}
=> cfor a,b,c do something end
=> while a,b,c do something end

(which exercise took all of 30 minutes, showing syntax is essentially a
zero-cost benefit to a language).

The language has richer loop control than C, which makes this construct even
more useful: namely restart, redo, next and exit (== break), compared with C's
break ... if at the outer level of a loop ... and if not inside a switch ...
and it's not a leap year. But C's (lack of) break statements is another
topic.

--
Bart