vDSP branching loop

Luigi Castelli

Oct 28, 2015, 2:31:20 PM
to PerfOpt-Dev
Hi there,

I have noticed there is no single vDSP function for the following:

for (n = 0; n < N; ++n)
    if (B[0] <= A[n])
        E[n] = C[0];
    else
        E[n] = D[0];

I have tried combining several vDSP functions to implement the above, but I am not satisfied with the results.
How would you implement this loop using vDSP?

Thank you.

- Luigi

Stephen Canon

Oct 28, 2015, 2:45:08 PM
to Luigi Castelli, PerfOpt-Dev
Hi Luigi —

Clang will happily autovectorize this loop for you if you write it like this (I made B, C, D scalar floats instead of pointers, since you only ever use the first element):

#include <stdlib.h>

void mySelect(const float *A, float B, float C, float D, float * restrict E, size_t N) {
    for (size_t n = 0; n < N; ++n)
        E[n] = A[n] >= B ? C : D;
}

– Steve
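
As a quick sanity check on the rewrite above, here is a minimal sketch that compares the original branching loop with the ternary form on a small input. The function names (select_branching, select_ternary, check_select_forms_agree) are hypothetical and chosen only for this illustration; they are not part of vDSP or of the thread. Assuming a reasonably recent clang, compiling with something like `clang -O3 -Rpass=loop-vectorize` should print a remark for each loop the vectorizer transformed, which is one way to confirm the claim on your own machine.

```c
#include <assert.h>
#include <stddef.h>

/* The branching loop from the question, with B, C, D passed as scalars. */
void select_branching(const float *A, float B, float C, float D,
                      float *restrict E, size_t N) {
    for (size_t n = 0; n < N; ++n) {
        if (B <= A[n])
            E[n] = C;
        else
            E[n] = D;
    }
}

/* The ternary form suggested in the reply. */
void select_ternary(const float *A, float B, float C, float D,
                    float *restrict E, size_t N) {
    for (size_t n = 0; n < N; ++n)
        E[n] = A[n] >= B ? C : D;
}

/* Hypothetical helper: asserts that both forms agree on a ramp input. */
void check_select_forms_agree(void) {
    enum { N = 37 };  /* deliberately not a multiple of a typical vector width */
    float A[N], E1[N], E2[N];

    /* Ramp from -18 to +18 so the threshold 0 falls inside the data. */
    for (size_t n = 0; n < N; ++n)
        A[n] = (float)n - 18.0f;

    select_branching(A, 0.0f, 1.0f, -1.0f, E1, N);
    select_ternary(A, 0.0f, 1.0f, -1.0f, E2, N);

    for (size_t n = 0; n < N; ++n)
        assert(E1[n] == E2[n]);
}
```

Using a length that is not a multiple of the vector width also exercises whatever scalar remainder loop the compiler generates.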

Luigi Castelli

Oct 28, 2015, 3:12:07 PM
to Stephen Canon, PerfOpt-Dev
Ok, thanks. Good to know.
What if I change the loop to this? (B is now a vector.)

for (n = 0; n < N; ++n)
    if (B[n] <= A[n])
        E[n] = C[0];
    else
        E[n] = D[0];

- Luigi

Stephen Canon

Oct 28, 2015, 3:26:44 PM
to Luigi Castelli, PerfOpt-Dev
Sure:

void select(const float *A, const float *B, float C, float D, float * restrict E, size_t N) {
    for (size_t n = 0; n < N; ++n)
        E[n] = A[n] >= B[n] ? C : D;
}

Vectorized by clang at -O3.
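
For the vector-B variant, a small check along the same lines may be useful. The sketch below uses hypothetical names (select_vec_branching, select_vec_ternary, check_select_vec_forms_agree) and includes NaN inputs, since `B[n] <= A[n]` and `A[n] >= B[n]` are both false when either operand is NaN, so both forms should fall through to D in that case. This only tests that the two forms agree; it does not by itself show that the loop was vectorized.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* The branching loop from the follow-up question (B is a vector). */
void select_vec_branching(const float *A, const float *B, float C, float D,
                          float *restrict E, size_t N) {
    for (size_t n = 0; n < N; ++n) {
        if (B[n] <= A[n])
            E[n] = C;
        else
            E[n] = D;
    }
}

/* The ternary form suggested in the reply. */
void select_vec_ternary(const float *A, const float *B, float C, float D,
                        float *restrict E, size_t N) {
    for (size_t n = 0; n < N; ++n)
        E[n] = A[n] >= B[n] ? C : D;
}

/* Hypothetical helper: asserts both forms agree, including on NaN inputs. */
void check_select_vec_forms_agree(void) {
    const float A[] = { -1.0f, 0.0f, 2.0f, NAN,  3.0f };
    const float B[] = {  0.0f, 0.0f, 1.0f, 1.0f, NAN  };
    enum { N = sizeof A / sizeof A[0] };
    const float C = 10.0f, D = 20.0f;
    float E1[N], E2[N];

    select_vec_branching(A, B, C, D, E1, N);
    select_vec_ternary(A, B, C, D, E2, N);

    for (size_t n = 0; n < N; ++n)
        assert(E1[n] == E2[n]);

    /* Any comparison involving NaN is false, so the else-branch value D is chosen. */
    assert(E1[3] == D);
    assert(E1[4] == D);
}
```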

Luigi Castelli

Oct 28, 2015, 3:33:03 PM
to Stephen Canon, PerfOpt-Dev
Perfect. Thanks Stephen!