Auto vectorization

javeri...@gmail.com

May 15, 2008, 8:45:14 AM
Hi all:

I am currently working on the auto-vectorization phase in GCC and
have some points to discuss.

1-- How do you define the profitability of the auto-vectorization
phase? Is it just the speed-up? If we do not get any speed-up over
scalar code, then there is no need to auto-vectorize.

2-- Which phases or features in a compiler (especially in GCC)
control the quality of auto-vectorization?

Thanks
JAY.

andreyb...@gmail.com

May 20, 2008, 6:33:15 AM
There is a brief description of how the vectorizer works in the Intel
compiler:
http://www.springerlink.com/content/fl3l017321p11760/fulltext.pdf

According to the paper, alignment optimization is important for
vectorization.

Andrey

Roland Leißa

May 21, 2008, 11:17:57 AM
Hi,

> 1-- How do you define the profitability of the auto-vectorization
> phase? Is it just the speed-up? If we do not get any speed-up over
> scalar code, then there is no need to auto-vectorize.

If the vectorized code is only as fast as the scalar code, use the
scalar code: the vectorized version usually consumes more memory. GCC
can also emit two versions of a loop, together with a runtime check of
whether the loop runs long enough for the vectorized version to be
worth its overhead; otherwise the scalar version is taken. If you know
that a loop will never benefit from vectorization, it is better to use
the scalar version outright. So profitability can be defined in terms
of both speed-up and memory consumption. (Even if you think you have
enough memory, keep in mind that your cache is not that big and
cache misses are expensive.)
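
The two-versions idea can be sketched by hand in C. This is only an illustration of the shape of the generated code, not what GCC actually emits: the function names and the threshold MIN_PROFITABLE_ITERS are made up here, and the 4-wide blocked loop merely stands in for a real SIMD loop.

```c
#include <stddef.h>

/* Hypothetical threshold: below this trip count the setup cost of the
   vector path is assumed not to pay off. A real compiler derives such
   a value from its cost model; 16 is only an illustration. */
#define MIN_PROFITABLE_ITERS 16

/* Scalar version: always correct, no setup overhead. */
static void add_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Stand-in for the vectorized version: processes 4 elements per
   iteration, then finishes the remainder with a scalar tail loop. */
static void add_blocked(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i]     = a[i]     + b[i];
        dst[i + 1] = a[i + 1] + b[i + 1];
        dst[i + 2] = a[i + 2] + b[i + 2];
        dst[i + 3] = a[i + 3] + b[i + 3];
    }
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Loop versioning: a runtime check on the trip count picks the version. */
void add_versioned(float *dst, const float *a, const float *b, size_t n)
{
    if (n >= MIN_PROFITABLE_ITERS)
        add_blocked(dst, a, b, n);
    else
        add_scalar(dst, a, b, n);
}
```

Calling add_versioned() with n = 3 takes the scalar path and with n = 20 takes the blocked path; both produce the same results, only the overhead differs.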

> 2-- Which phases or features in a compiler (especially in GCC)
> control the quality of auto-vectorization?

This is very complicated. I have fiddled around with the
auto-vectorizer more than once, and it is hard to get good results.
Have a look at this:
http://gcc.gnu.org/projects/tree-ssa/vectorization.html
and this:
http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01560.html

Some hints:

Use proper alignment (16-byte aligned data with SSE, for instance). You
can achieve this with __attribute__ ((aligned (16))). This is _NOT_
guaranteed to work with dynamic memory. There you have to use tricks
or use posix_memalign(). If you are using C++, you can overload the new
operator for your class so that new automatically uses
posix_memalign(). However, it seems to be very hard for gcc to prove
that memory is properly aligned. So use pragmas. Compile with
-ftree-vectorize and -ftree-vectorizer-verbose=5 to see whether all
your effort is worth the trouble. But in the end you can only be 100%
sure what happens if you look at the asm output (with -S).
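
A minimal sketch of the two alignment cases above, assuming GCC on a POSIX system. The helper names (alloc_aligned_floats, is_aligned16, static_table) are made up for the example; only __attribute__((aligned(16))) and posix_memalign() come from the hints.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* ask the headers to declare posix_memalign() */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Static storage: the attribute asks GCC to place the array on a
   16-byte boundary, which suits SSE loads and stores. */
static float table[1024] __attribute__((aligned(16)));

float *static_table(void)
{
    return table;
}

/* Heap storage: the attribute does not apply to malloc()ed memory, so
   request the alignment explicitly. Returns NULL on failure; the
   caller releases the block with free(). */
float *alloc_aligned_floats(size_t count)
{
    void *p = NULL;
    if (posix_memalign(&p, 16, count * sizeof(float)) != 0)
        return NULL;
    return p;
}

/* Returns nonzero if the address is a multiple of 16. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}
```

Checking is_aligned16() on both pointers at runtime is a cheap sanity test, but it does not tell you whether the compiler could prove the alignment at compile time; for that, the vectorizer report and the asm output are still the reference.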

Hope this helps.

Greetz,
Roland Leissa

Anton Lokhmotov

May 23, 2008, 10:58:06 AM
> 1-- How do you define the profitability of the auto-vectorization
> phase? Is it just the speed-up? If we do not get any speed-up over
> scalar code, then there is no need to auto-vectorize.

Seems to be right. Since vector instructions typically do more work
than scalar ones, vector code is usually more *power* hungry. However,
if you achieve a considerable speed-up, the overall *energy*
consumption (power * time) can be less than that of scalar
code. (That's why vector instructions are so popular in embedded DSP
architectures.)
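
The power-versus-energy argument can be written down as a small C sketch. The function names are made up for the example and any numbers fed to them are hypothetical, not measurements.

```c
/* Energy in joules = average power in watts * run time in seconds. */
double energy_joules(double power_watts, double seconds)
{
    return power_watts * seconds;
}

/* The vector code takes (scalar time / speedup) seconds, so it uses
   less energy exactly when the speed-up exceeds the ratio of vector
   power to scalar power. Returns nonzero in that case. */
int vector_saves_energy(double scalar_power, double vector_power,
                        double speedup)
{
    return speedup > vector_power / scalar_power;
}
```

For instance, if the vector code drew 1.5 times the power of the scalar code, a 2x speed-up would lower the total energy, while a 1.2x speed-up would raise it.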

> 2-- Which phases or features in a compiler (especially in GCC)
> control the quality of auto-vectorization?

Vectorization is (profitably) applicable to fairly specific code. Loop
restructuring transformations can massage code to a form more amenable
to vectorization.
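
One example of such a restructuring transformation is loop interchange, sketched here by hand in C. The function names and the 8x8 size are made up for the illustration; whether a given compiler version vectorizes either form is something to check in its vectorizer report.

```c
#define ROWS 8
#define COLS 8

/* Before: the inner loop walks down a column, so consecutive
   iterations touch addresses COLS floats apart. Strided access like
   this is awkward for a vectorizer. */
void add_scaled_strided(float a[ROWS][COLS], float b[ROWS][COLS], float k)
{
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            a[i][j] += k * b[i][j];
}

/* After loop interchange: the inner loop walks along a row, so
   consecutive iterations touch adjacent floats (unit stride). The
   iterations are independent, so the result is unchanged. */
void add_scaled_unit_stride(float a[ROWS][COLS], float b[ROWS][COLS], float k)
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            a[i][j] += k * b[i][j];
}
```

Both functions compute the same values; only the order of the memory accesses differs, and that order is what decides whether a contiguous vector load can replace several scalar loads.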

If you are interested, I can send you a pdf of my PhD thesis on
programming and compiling for embedded SIMD architectures, which has a
survey chapter on automatic vectorization techniques.

Cheers,
Anton.
