[llvm-dev] RFC: SIMD math-function library

Naoki Shibata via llvm-dev

Jul 13, 2016, 12:00:39 PM
to llvm...@lists.llvm.org

Dear LLVM contributors,

I am Naoki Shibata, an associate professor at Nara Institute of Science
and Technology.

Hal Finkel and I would like to jointly propose adding my vectorized math
library to LLVM.

The library has been available as public-domain software for years; I am
willing to dual-license it if necessary.

********

Below is a proposal to add my vectorized math library, SLEEF [1], for
evaluating elementary functions (trigonometry, log, exp, etc.) to LLVM.
The library can be used directly, or can be targeted by an
autovectorization infrastructure. Patches to tie SLEEF into LLVM's
autovectorizer have been developed by Hal Finkel as part of the bgclang
project (which provides LLVM/Clang ported to the IBM BG/Q supercomputer
architecture). Hal has also developed a user-facing header for the
library, in the style of Clang's intrinsics headers, which we can use as
part of this project. SLEEF has been used as part of bgclang in this way
for several years.
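
To make the direct-use case concrete, here is a minimal sketch of a call
site on x86/SSE2. The prototype follows the xsin naming used in the SLEEF
sources, but the exact function and header names should be treated as
illustrative rather than as a settled public interface:

  #include <emmintrin.h>

  /* Prototype as it might appear in a user-facing header (illustrative). */
  __m128d xsin(__m128d d);

  /* Evaluate sin() over an array of doubles, two lanes at a time
     (handling of an odd-length tail is omitted for brevity). */
  void sin_array(double *out, const double *in, int n) {
    for (int i = 0; i + 1 < n; i += 2) {
      __m128d v = _mm_loadu_pd(&in[i]);
      _mm_storeu_pd(&out[i], xsin(v));
    }
  }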

The library currently supports several architectures:
* x86 - SSE2, FMA4, AVX, AVX2+FMA3
* ARM - NEON (single-precision only)
* A pure C (scalar) version
* PowerPC - QPX (supported in Hal's version [2])

It is fairly easy to port to other architectures. The library provides
functionality similar to Intel's Short Vector Math Library (available
with Intel's compilers).

Roadmap:
--------
1) Get agreement on incorporating the library.
2) Rename the public interface to use only the
implementation-reserved namespace (i.e. names starting with
underscores), as is appropriate for a compiler runtime library.
3) Convert the functions to use LLVM's naming conventions (including, if
desired, converting the source files to C++ to allow the use of function
overloading).
4) Create and document a public interface to the library.
5) Add support for targeting the library to LLVM's autovectorizer.
6) Work with the community to port the library to other architectures.

Motivation:

Recent CPUs and GPUs provide vectorized FP multipliers and adders to
improve floating-point throughput. To extract the maximum compute power
from processors with vectorized ALUs, software has to be vectorized to
use SIMD data structures, and conditional branches and scatter/gather
memory accesses should be eliminated as much as possible. However,
rewriting existing software in this fashion is a hard, time-consuming
task that involves converting data structures. Efficient vectorized
libraries and automatic vectorization are therefore desirable.

In this proposal, we are going to incorporate a vectorized math library,
currently named SLEEF, as an LLVM runtime library. This allows elementary
functions to be evaluated directly on SIMD data types. We can also expect
extra performance improvements by allowing LLVM to automatically target
the functions (and inline them with LTO).

Functionality of the library:

For each elementary function, the library contains subroutines for
evaluation in single precision and double precision. Different result
accuracies can be chosen for a subset of the elementary functions; for
this subset there are versions with at most 1 ulp of error and versions
with a few ulps of error. Obviously, the less accurate versions are
faster. Note that merely converting a real number into a floating-point
number already incurs up to 0.5 ulp of error. In Hal's bgclang port, the
less accurate versions are used with -ffast-math, and the more accurate
ones otherwise.
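
As an illustration of how a client-facing header might select between the
two accuracy levels (a minimal sketch: the _u1 suffix for the 1-ulp
variants follows the SLEEF sources, while the wrapper name vec_sin is
purely hypothetical):

  #include <emmintrin.h>

  __m128d xsin(__m128d d);     /* faster variant, a few ulps of error */
  __m128d xsin_u1(__m128d d);  /* slower variant, at most 1 ulp of error */

  static inline __m128d vec_sin(__m128d d) {
  #if defined(__FAST_MATH__)
    return xsin(d);            /* -ffast-math: accept the larger error bound */
  #else
    return xsin_u1(d);         /* default: use the more accurate version */
  #endif
  }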

For non-finite inputs and outputs, the library should return the same
results as libm. The library is tested to confirm that the evaluation
error is within the designed limit, by comparison against high-precision
evaluation using the libmpfr library. In particular, we rigorously
checked the error of the trigonometric functions when the arguments are
close to an integral multiple of PI/2.

The code size of each function is very small.

Implementation of the library:

Basically, each function consists of an argument reduction and a kernel.
For the kernel, a polynomial approximation is used. The coefficients are
carefully chosen to minimize the number of multiplications and additions
while keeping the error small. The reduction is devised so that the same
kernel can be used over the entire range of input arguments. In order to
improve the accuracy of the functions with 1-ulp error bounds,
double-double arithmetic is used. Fused multiply-add operations, which
are now widely available, can further improve the performance of these
functions. Some of the implementation techniques used in the library are
explained in [3].
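
To give a rough idea of the kernel structure only (these are the plain
Taylor coefficients, not SLEEF's tuned minimax coefficients, so the error
bound differs), a scalar sin kernel for a reduced argument |x| <= pi/4
built around fused multiply-adds looks like this:

  #include <math.h>

  /* Approximate sin(x) for |x| <= pi/4 as x + x^3 * p(x^2),
     with p evaluated by Horner's scheme using fma(). */
  static double sin_kernel(double x) {
    double x2 = x * x;
    double p =  2.7557319223985893e-06;        /*  1/9! */
    p = fma(p, x2, -1.9841269841269841e-04);   /* -1/7! */
    p = fma(p, x2,  8.3333333333333333e-03);   /*  1/5! */
    p = fma(p, x2, -1.6666666666666666e-01);   /* -1/3! */
    return fma(p * x2, x, x);
  }

The vectorized variants apply the same branch-free sequence lane-wise
with SIMD intrinsics, with double-double arithmetic where the 1-ulp bound
requires it.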

[1] https://github.com/shibatch/sleef
[2] https://github.com/hfinkel/sleef-bgq/blob/master/simd/qpxmath.h
[3] http://ito-lab.naist.jp/~n-sibata/pdfs/isc10simd.pdf


********

Regards,

Naoki Shibata
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Vedant Kumar via llvm-dev

Jul 13, 2016, 4:46:03 PM
to Naoki Shibata, llvm...@lists.llvm.org
Hi Naoki,

SLEEF looks very promising!

Are the SLEEF routines validated against libm, in addition to libmpfr? Are
performance-tracking tests in place to detect execution-time or code-size
regressions? If these are missing, IMO it would be good to add them to the
roadmap.

best,
vedant

Naoki Shibata via llvm-dev

Jul 14, 2016, 8:18:22 AM
to Vedant Kumar, llvm...@lists.llvm.org

Hi Vedant,

Thank you for your comment.

For checking the accuracy of finite outputs and the correctness of
handling non-finite inputs and outputs, I believe validating against
libmpfr is enough. Please tell me what kind of regressions we need to
detect. Do you have concerns about the correctness of libmpfr?

What kind of execution-time or code-size regressions are we going to
check for? Since SLEEF is completely branch-free, there should be no
serious execution-time or code-size regressions unless branches are
introduced.

I am of course fine with adding more regression checking; I just want to
understand the need for it.

Regards,

Naoki Shibata

Hal Finkel via llvm-dev

Jul 15, 2016, 3:53:58 AM
to Naoki Shibata, llvm...@lists.llvm.org, matt masten
Hi again,

As this RFC implies, I've been using the SLEEF library proposed here with Clang/LLVM for many years, and fully support its adoption into the LLVM project.

I'm CC'ing Matt and Xinmin from Intel, who have started working on contributing support for their SVML library to LLVM (http://reviews.llvm.org/D19544) and who, I understand, plan to contribute (some subset of) the vector math functions themselves. I'm also excited about Intel's planned contributions.

Here's how I currently see the situation: Regardless of what Intel contributes, we need a solution in this space for many different architectures. From personal experience, SLEEF is relatively easy to port to different architectures (i.e. different vector ISAs), and has already been ported to several. The performance is good, as is the accuracy. I think it would make a great foundation for a vector-math-function runtime library for the LLVM project. I don't know what routines Intel is planning to contribute, or for what architectures they're tuned, but I expect we'll want to use those implementations on x86 platforms where appropriate.

Matt, Xinmin, what do you think?

Thanks again,
Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Naoki Shibata via llvm-dev

Jul 15, 2016, 4:37:42 AM
to Hal Finkel, llvm...@lists.llvm.org, matt masten

Hi all,

Okay, the point is whether Intel will publish the source code for their
SVML. If Intel makes SVML open source, there would not be much advantage
in incorporating SLEEF into LLVM, since it would presumably also be
fairly easy to port SVML to other architectures. If Intel does not
open-source SVML, then there could be an advantage in using SLEEF for
x86 by inlining the functions.

Is it possible to ask the person in charge what exactly Intel is going
to contribute?

Naoki Shibata


On 2016/07/15 12:53, Hal Finkel wrote:
> Hi again,
>
> As this RFC implies, I've been using the SLEEF library proposed here with Clang/LLVM for many years, and fully support its adoption into the LLVM project.
>
> I'm CC'ing Matt and Xinmin from Intel who have started working on contributing support for their SVML library to LLVM (http://reviews.llvm.org/D19544), and I understand plan to contribute (some subset of) the vector math functions themselves. I'm also excited about Intel's planned contributions.
>
> Here's how I currently see the situation: Regardless of what Intel contributes, we need a solution in this space for many different architectures. From personal experience, SLEEF is relatively easy to port to different architectures (i.e. different vector ISAs), and has already been ported to several. The performance is good as is the accuracy. I think it would make a great foundation for a vector-math-function runtime library for the LLVM project. I don't know what routines Intel is planning to contribute, or for what architectures they're tuned, but I expect we'll want to use those implementations on x86 platforms where appropriate.
>
> Matt, Xinmin, what do you think?
>
> Thanks again,
> Hal

Tian, Xinmin via llvm-dev

Jul 15, 2016, 4:39:17 AM
to Hal Finkel, Naoki Shibata, llvm...@lists.llvm.org, Masten, Matt
I agree with Hal.

Since the SLEEF library targets (and is portable to) many different architectures, it would be a great addition to the LLVM community's SIMD support across all architectures.

Currently, Intel has open-sourced 6 functions (sin, cos, pow, exp, log, and sincos) for GCC and LLVM for x86 ( {SSE2, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, MIC, AVX512} x {mask, non-mask}; the AVX512 open-sourcing is still to be done). The plan is to open-source most of the Intel SVML library for LLVM x86 support.

For achieving "close to the metal" performance on x86, I assume Intel SVML would provide better performance and more control over accuracy for the time being, given that the SVML team has tuned it for many years across all x86 architectures; we have not yet done performance and accuracy comparisons between the SLEEF and SVML libraries.

In any case, I would suggest moving this RFC forward and starting this project. I think Intel's SVML code can be integrated into this project for x86. I will talk to the Intel SVML library owners/stakeholders and ask them to take a look at SLEEF and provide their recommendations/suggestions related to x86 and in general.

Thanks,
Xinmin

Naoki Shibata via llvm-dev

Jul 15, 2016, 4:45:48 AM
to Martin....@movidius.com, Vedant Kumar, llvm...@lists.llvm.org

Hi Martin,

Thank you for your comment.

It is of course possible to rewrite SLEEF in a more generic way, and I
actually once tried to do that using GCC's generic vector data types.
However, the code generated from that source was far less efficient than
the version written with explicit SIMD intrinsics.

Adding typedefs to specify the exact types is possible.
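
For example, a minimal sketch of such typedefs (the names below are
illustrative only; they are not part of SLEEF, and C/C++ do not yet
standardize float32_t/float64_t):

  /* Scalar bindings: each target maps these to whichever C type actually
     provides the required IEEE format. */
  typedef float  sleef_float32_t;   /* must be an IEEE binary32 type */
  typedef double sleef_float64_t;   /* must be an IEEE binary64 type */

  /* Vector bindings, chosen per target rather than hard-wiring __m128 etc. */
  #if defined(__SSE2__)
  #include <emmintrin.h>
  typedef __m128  sleef_vfloat32_t; /* 4 x binary32 */
  typedef __m128d sleef_vfloat64_t; /* 2 x binary64 */
  #endif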

Regards,

Naoki Shibata


On 2016/07/14 18:25, Martin J. O'Riordan wrote:
> Having support for vector equivalents to the ISO C math functions is very valuable, and this kind of work is of great benefit.
>
> There are a couple of things, though, that concern me about this proposal:
>
> 1. OpenCL C already provides a vector math binding that for the most
> part provides this equivalence. It also supports vectors of
> multiple types through overloading. Perhaps it might be possible
> to align SLEEF with OpenCL C?
>
> 2. There are hard assumptions about how 'float', 'double' and 'long
> double' are implemented. Libraries with these kinds of hard-wired
> assumptions (including 'compiler-rt') cause me a lot of trouble when
> porting to our platform, which is at variance with these common
> assumptions.
>
> So I would suggest that the implementation uses typedefs to
> specifically bind to the type that provides the specific FP
> precision required.
>
> Clang supports the IEEE FP16, FP32, FP64 and FP128 types, which can
> be bound to each of the higher-level C types. Our architecture
> binds these as FP16 for '__fp16' aka 'half', FP32 for 'float' AND
> for 'double', and FP64 for 'long double'. There is no hardware
> support for FP64, so having 'float' and 'double' be FP32 is
> important to avoid the costly consequences of the usual arithmetic
> conversions in C.
>
> Using specific synonyms would greatly enhance the portability of
> the library implementation. For example, 'float32_t' instead of
> 'float' - a pity C/C++ don't have these as standard yet.
>
> Thanks,
>
> MartinO

Tian, Xinmin via llvm-dev

Jul 15, 2016, 4:55:28 AM
to Naoki Shibata, Hal Finkel, llvm...@lists.llvm.org, Masten, Matt
Naoki,

Intel is planning to open-source the SVML library (most of it, if not 100%); 6 SVML functions are open-sourced for GCC and LLVM already. But Intel SVML is x86-centric (SSE2, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, ...). Personally, I am not sure it would be fairly easy to port SVML to other architectures. The SVML library team may provide a better answer; I will double-check with them.

Given that SLEEF supports many different architectures, I think it has value for LLVM, at least until the LLVM community has ported the SVML library to other architectures once Intel has open-sourced it.

Thanks,
Xinmin

-----Original Message-----
From: Naoki Shibata [mailto:shibatc...@gmail.com]
Sent: Thursday, July 14, 2016 9:38 PM
To: Hal Finkel <hfi...@anl.gov>
Cc: llvm...@lists.llvm.org; Chandler Carruth <chan...@gmail.com>; Tian, Xinmin <xinmi...@intel.com>; Masten, Matt <matt....@intel.com>
Subject: Re: RFC: SIMD math-function library


Martin J. O'Riordan via llvm-dev

Jul 15, 2016, 6:10:03 AM
to Naoki Shibata, Vedant Kumar, llvm...@lists.llvm.org
I am looking forward to porting it to our platform; I know that this will be a significant benefit.

We support 'v8f16' and 'v4f32' FP vector types natively, and having this library provide optimised math functions for them will definitely be very useful.

All the best,

MartinO

-----Original Message-----
From: Naoki Shibata [mailto:shibatc...@gmail.com]

Hal Finkel via llvm-dev

Jul 15, 2016, 6:34:57 AM
to Martin ORiordan, llvm...@lists.llvm.org, Naoki Shibata, Tom Stellard

[+Tom]

This is an interesting point. It might certainly make sense to integrate these routines with our OpenCL library implementation as well for targets that would benefit. Currently, we have scalar implementations of many math functions (e.g. http://llvm.org/svn/llvm-project/libclc/trunk/generic/lib/math/tanh.cl), and "vectorized" versions which just call the scalar functions (http://llvm.org/svn/llvm-project/libclc/trunk/generic/lib/clcmacro.h). If nothing else, it might make sense to borrow their naming convention?
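
For reference, the clcmacro.h approach mentioned above amounts to
expanding a scalar function lane by lane. A simplified stand-in for that
pattern (not the actual libclc macro, and using a GCC vector-extension
type rather than OpenCL's float4) looks roughly like this:

  #include <math.h>

  typedef float v4sf __attribute__((vector_size(16)));

  /* Build a 4-wide "vector" function by calling the scalar one per lane;
     this is the pattern that genuinely vectorized SLEEF kernels would
     replace. */
  #define DEFINE_V4_FROM_SCALAR(vname, sname)              \
    static v4sf vname(v4sf x) {                            \
      return (v4sf){sname(x[0]), sname(x[1]),              \
                    sname(x[2]), sname(x[3])};             \
    }

  DEFINE_V4_FROM_SCALAR(vtanhf, tanhf)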

Thanks again,
Hal

--

Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Naoki Shibata via llvm-dev

Jul 15, 2016, 7:02:08 AM
to Tian, Xinmin, llvm...@lists.llvm.org, Masten, Matt

Is it possible to see the source code of the open-sourced SVML? The diff
file does not include the library itself. I searched the Internet but
could not find it.

Regards,

Naoki Shibata


On 2016/07/15 13:55, Tian, Xinmin wrote:
> Naoki,
>
> Intel is planning open-source SVML library (most of them if it not 100%), 6 functions of SVML are open sourced for GCC and LLVM already. But, Intel SVML is x86 centric (SSE2, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 ....}. Personally, I am not sure if it would be fairly easy to port SVML to other architectures. SVML library team may provide a better answer, I will double check with them.
>
> Given that SLEEF supports many different architectures, I think it has a value for LLVM, at least before all porting is done for SVML library to other architectures by LLVM community after Intel open sourced it.
>
> Thanks,
> Xinmin

Tian, Xinmin via llvm-dev

Jul 15, 2016, 3:18:11 PM
to Naoki Shibata, llvm...@lists.llvm.org, Masten, Matt
It was open-sourced for GCC. I will get you the contact for the person who did the open-sourcing. Thanks.

Xinmin.

-----Original Message-----
From: Naoki Shibata [mailto:shibatc...@gmail.com]
Sent: Friday, July 15, 2016 12:02 AM
To: Tian, Xinmin <xinmi...@intel.com>
Cc: Hal Finkel <hfi...@anl.gov>; llvm...@lists.llvm.org; Chandler Carruth <chan...@gmail.com>; Masten, Matt <matt....@intel.com>
Subject: Re: RFC: SIMD math-function library


Tian, Xinmin via llvm-dev

Jul 15, 2016, 5:16:49 PM
to Naoki Shibata, llvm...@lists.llvm.org, Masten, Matt
Naoki, below is the link where you can get the code.

https://sourceware.org/git/?p=glibc.git;a=tree;f=sysdeps/x86_64/fpu/multiarch;h=2c567a353c2d258dbc08c50cd6fa189b825f3257;hb=HEAD


Xinmin
-----Original Message-----
From: Naoki Shibata [mailto:shibatc...@gmail.com]
Sent: Friday, July 15, 2016 12:02 AM
To: Tian, Xinmin <xinmi...@intel.com>
Cc: Hal Finkel <hfi...@anl.gov>; llvm...@lists.llvm.org; Chandler Carruth <chan...@gmail.com>; Masten, Matt <matt....@intel.com>
Subject: Re: RFC: SIMD math-function library


Naoki Shibata via llvm-dev

Jul 15, 2016, 5:32:21 PM
to Tian, Xinmin, llvm...@lists.llvm.org, Masten, Matt

Thank you. And now I understand why it is not very easy to port SVML to
other architectures.

Naoki Shibata

Hal Finkel via llvm-dev

Jul 27, 2016, 3:46:34 PM
to Chandler Carruth, llvm...@lists.llvm.org, Naoki Shibata, Matt Masten
Hi everyone,

I think that everyone is on the same page. We'll put together a patch for review.

One remaining question: There seem to be two potential homes for this library: parallel_libs and compiler-rt. Opinions on where the vectorized math functions should live? My inclination is to target the new parallel_libs project, in part because I feel like compiler-rt has too many things grouped together already, and in part because vectorization is a form of parallel execution. Thoughts?

Thanks again,
Hal

--

Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Tian, Xinmin via llvm-dev

Jul 27, 2016, 4:42:41 PM
to Hal Finkel, Chandler Carruth, llvm...@lists.llvm.org, Naoki Shibata, Masten, Matt
I don't have a strong preference between parallel_libs and compiler-rt as the home for the vector library, or a new one. Assuming we go with parallel_libs, the structure is more or less like the one below, right?

parallel_libs
  +-- vector
  |     +-- x86
  |     +-- arm
  |     +-- power
  +-- streamexecutor
  +-- offload

Thanks,
Xinmin

-----Original Message-----
From: Hal Finkel [mailto:hfi...@anl.gov]

Chandler Carruth via llvm-dev

Jul 28, 2016, 2:30:16 AM
to Hal Finkel, Chandler Carruth, llvm...@lists.llvm.org, Matt Masten, Naoki Shibata
On Wed, Jul 27, 2016 at 8:46 AM Hal Finkel via llvm-dev <llvm...@lists.llvm.org> wrote:
> Hi everyone,
>
> I think that everyone is on the same page. We'll put together a patch for review.
>
> One remaining question: There seem two potential homes for this library: parallel_libs and compiler-rt. Opinions on where the vectorized math functions should live? My inclination is to target it for the new parallel_libs project, in part because I feel like compiler-rt has too many things grouped together already, and in part because vectorization is a form of parallel execution. Thoughts?

I share your preference and the basis for it.

C Bergström

Jul 28, 2016, 2:44:01 AM
to Chandler Carruth, llvm-dev, Naoki Shibata, Matt Masten
Why is there any motivation to bundle it with unrelated stuff at all?
What's the benefit? If it's just to prop up the existence of
parallel_libs, then I don't think that makes sense. Should we move
LLVM loop optimizations over to parallel_libs as well?

If this is just a bikeshed argument, of course Chandler will get his
way and nobody else matters.

Hopefully, the decision is driven by points like: maintaining a clear
modular design, a repo with the same name it had before, something that
works independently of any compiler, and a clear definition of what it
is, who is working on it, and what the goals are.

(Which is the exact opposite of parallel_libs, which is a meta-bucket
for dumping "stuff".) Another reason why parallel_libs doesn't make
sense is that it still has extremely low visibility and relevance. Was
a mailing list set up for it? If it's a real project, why wasn't that
list on CC?

I'd opt to go with what the author wants or, worst case, compiler-rt in
the event people refuse to create another repo. The nature of the
functions it implements is complementary to what's there already, and
compiler-rt has better visibility as well as being something people may
be checking out already.

Hal Finkel via llvm-dev

Jul 28, 2016, 3:10:27 AM
to C Bergström, llvm-dev, Naoki Shibata, Matt Masten
----- Original Message -----
> From: "C Bergström" <cberg...@pathscale.com>
> To: "Chandler Carruth" <chan...@gmail.com>
> Cc: "Hal Finkel" <hfi...@anl.gov>, "llvm-dev" <llvm...@lists.llvm.org>, "Matt Masten" <matt....@intel.com>,
> "Naoki Shibata" <shibatc...@gmail.com>
> Sent: Wednesday, July 27, 2016 9:43:34 PM
> Subject: Re: [llvm-dev] RFC: SIMD math-function library
>
> Why is there any motivation to bundle it with unrelated stuff at all?
> What's the benefit? If it's just to prop up the existence of
> parallel_libs, then I don't think that makes sense..

I don't think that parallel_libs needs propping - at the moment it is so new that parallel_libs-dev has zero messages. I don't see a strong need for another new top-level project, with whatever administrative overhead that implies. I'm not against it either. If the community wants a new top-level project for this library, then I'm sure we can make one.

> Should we move
> llvm loop optimizations over to parallel_libs as well?

;)

> If this is just a bikeshed argument, of course chandler will get his
> way and nobody else matters..

While many of us respect Chandler's opinion, that's not actually the way the community works.

>
> Hopefully, the decision is driven by points like: maintaining a clear
> modular design, repo with the same name it had before, works
> independent of any compiler, clearly defined what it is and who is
> working on it as well as the goals..

To be clear, I think the community should decide on the name. Using the name it has now is one option. That name is SLEEF (SIMD Library for Evaluating Elementary Functions). We might also wish to name it something more generic as part of the project, as is our general custom (e.g. compiler-rt, libc++, libomp, etc.).

>
> (Which is the exact opposite of parallel_libs which is a meta-bucket
> of dumping "stuff") Another reason why parallel_libs doesn't make
> sense is that it's still extremely low visibility or relevance. Was a
> mailing list setup for it? If it's a real project, why wasn't that
> list on cc?

Because the RFC was on this list, and as you might recall, we recently had a big discussion on this list about mailing lists, and about how cross-posting between different lists is a real pain for the list moderators. Thus, I didn't. If we target the library to the parallel_libs project, then future discussion will go there. In the meantime, I am assuming that the relevant parties are on this list.

>
> I'd opt to go with what the author wants or worst case compiler-rt in
> the event people refuse to create another repo. The nature of the
> functions it implements is complementary to what's there already,
> better visibility as well as something people may be checking out
> already.

I agree that it is complementary to what is already in compiler-rt. That is why I suggested it as the second option.

Thanks again,
Hal

>
>
> On Thu, Jul 28, 2016 at 10:29 AM, Chandler Carruth via llvm-dev
> <llvm...@lists.llvm.org> wrote:
> > On Wed, Jul 27, 2016 at 8:46 AM Hal Finkel via llvm-dev
> > <llvm...@lists.llvm.org> wrote:
> >>
> >> Hi everyone,
> >>
> >> I think that everyone is on the same page. We'll put together a
> >> patch for
> >> review.
> >>
> >> One remaining question: There seem two potential homes for this
> >> library:
> >> parallel_libs and compiler-rt. Opinions on where the vectorized
> >> math
> >> functions should live? My inclination is to target it for the new
> >> parallel_libs project, in part because I feel like compiler-rt has
> >> too many
> >> things grouped together already, and in part because vectorization
> >> is a form
> >> of parallel execution. Thoughts?
> >
> >
> > I share your preference and the basis for it.
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm...@lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
>

--

Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

C Bergström

Jul 28, 2016, 3:20:12 AM
to Hal Finkel, llvm-dev, Naoki Shibata, Matt Masten
I'm a positive +1 for inclusion, since it has users and ongoing
development, and it overall fits with the compiler genre.

If there's a bikeshed discussion on changing the name or where it
lives, I'd hope that we start a new thread for that, so it is easier to
filter and the pros/cons of each proposal are stated clearly. I'm
really, really bored of reading all the recent bikeshed discussions and
opinions, sometimes without strong technical backing for why people
choose green or blue.

Hal Finkel via llvm-dev

Sep 27, 2016, 1:34:42 AM
to llvm...@lists.llvm.org, Matt Masten, Naoki Shibata
Hi again,

To start the conversation (and review) of the technical elements, I've put together an initial patch: https://reviews.llvm.org/D24951

Thanks again,
Hal

--
Hal Finkel
Lead, Compiler Technology and Programming Languages

C Bergström via llvm-dev

Sep 27, 2016, 3:50:29 AM
to Hal Finkel, llvm-dev, Naoki Shibata, Matt Masten
I should keep quiet and leave well enough alone, but playing devil's
advocate for a moment -

I see you didn't bundle this with compiler-rt, which I guess is good?
In the end, what was the reasoning for that? Do you see this as
sufficiently independent, and running on a different development track,
that it made sense?

1) Why rename C files to C++ (consistency?)

2) It seems your "x" prefix is probably safe, but safer may be clg
(like clgsinf) - this is a bit verbose, of course, but I think you're
trying to avoid a collision, right? If this library is meant only for
internal compiler usage and not exposed, wouldn't __vsinf really be
what makes sense? (If Apple or others use this, it shouldn't be
public-facing.) If there is meant to be a user-facing API, maybe it
could be exposed as an alias with a conflicting API but not
automatically used in the compiler. This would mean there's both
__vsinf (or xsinf if you prefer) as well as sinf.

3) I see #if for SP and DP, but what about REAL10 and REAL16? Any plans?

4) It appears to be a known situation, but I'm a little uncomfortable
in general with having a top-level project without *any* way to test
and validate it. The cmake option is there, but does it do what I
assume it should do? If it does, could you add a README with a small
amount of information on how to set up the tests? (I'm also strongly
against mpfr.)

5) Are there any benchmarks we can test this with to see the benefit?

6) Lastly, if Apple or Linux introduces something similar, what's the
benefit of having this vs. that? How do we decide which one to use, or
should the compiler just support both concurrently?

Hal Finkel via llvm-dev

Sep 27, 2016, 10:29:33 AM
to C Bergström, llvm-dev, Naoki Shibata, Matt Masten
----- Original Message -----
> From: "C Bergström" <cberg...@pathscale.com>
> To: "Hal Finkel" <hfi...@anl.gov>
> Cc: "llvm-dev" <llvm...@lists.llvm.org>, "Matt Masten" <matt....@intel.com>, "Naoki Shibata"
> <shibatc...@gmail.com>
> Sent: Monday, September 26, 2016 10:49:57 PM
> Subject: Re: [llvm-dev] RFC: SIMD math-function library
>
> I should keep quiet and leave well enough alone, but playing devil's
> advocate for a moment -
>
> I see you didn't bundle this with compiler-rt, which I guess is good?
> In the end what was the reasoning for that? Do you see this being
> sufficiently independent and running a different development track
> that it made sense?

compiler-rt is a conglomeration of many different things, and it is not clear to me that this needs to be yet-another thing that lives there. parallel-libs thematically makes sense to me; I really don't have a strong opinion about this. One thing that making the patch for parallel-libs made me realize was how much cmake functionality is currently housed in compiler-rt that perhaps should be made reusable somehow by all of the library projects.

>
> 1) Why rename C files to C++ (consistency?)

Consistency (and the code might naturally use several C++ features, such as function overloading and templates, as we refactor it).

>
> 2) It seems your "x" prefix is probably safe, but more safe may be
> clg
> (like clgsinf) - This is a bit verbose of course, but I think you're
> trying to avoid a collision, right? If this library is meant only for
> internal compiler usage and not exposed, wouldn't __vsinf be really
> what makes sense? (If Apple or others use this it shouldn't be public
> facing) If there is meant to be a user facing API - maybe it could be
> exposed as an alias with a conflicting API, but not automatically
> used
> in the compiler.. This would mean there's both __vsinf (or xsinf if
> you prefer) as well as sinf..

All of the external functions are prefixed with __x, which should be pretty safe. The x was in the library before; I just added the __. If you have a naming scheme you'd prefer, please feel free to suggest one (or more than one).

>
> 3) I see #if for SP and DP, but what about REAL10 and REAL16? Any
> plans?

I have no immediate plans; I also don't know of any targets that will support vectors of these higher-precision types.

>
> 4) It appears to be a known situation, but I'm a little uncomfortable
> in general with having a top level project without *any* way to test
> and validate it. The cmake option is there, but does it do what I
> assume it should do? If it does could you add a README with a small
> amount of information on how to setup the tests.. (I'm also strongly
> against mpfr)

In my opinion, this can't be committed at all without tests; having the regression tests is a must. We need to decide how to do them.

>
> 5) Is there any benchmarks we can test this with to see the benefit?

Good question. Eric F. has been working on adding benchmarks to libcxx, and we should consider leveraging that somehow to make sure we have benchmarks for this too.

>
> 6) Lastly, if Apple or Linux introduce something similar, what's the
> benefit of having this vs that? How do we decide which one to use or
> should the compiler just support both concurrently?
>

This isn't really an if. glibc now has libmvec, which provides a subset of the functionality proposed here for single and double precision (at least for x86). Apple's Accelerate framework vecLib provides much of this functionality, but only for single precision. What's provided here is fairly comprehensive, for both single and double precision, plus it is easy to port to other architectures.

Thanks again,
Hal

C Bergström via llvm-dev

Sep 27, 2016, 11:10:41 AM
to Hal Finkel, llvm-dev, Naoki Shibata, Matt Masten
On Tue, Sep 27, 2016 at 6:29 PM, Hal Finkel <hfi...@anl.gov> wrote:
> ----- Original Message -----
>> From: "C Bergström" <cberg...@pathscale.com>
>> To: "Hal Finkel" <hfi...@anl.gov>
>> Cc: "llvm-dev" <llvm...@lists.llvm.org>, "Matt Masten" <matt....@intel.com>, "Naoki Shibata"
>> <shibatc...@gmail.com>
>> Sent: Monday, September 26, 2016 10:49:57 PM
>> Subject: Re: [llvm-dev] RFC: SIMD math-function library
>>
>> I should keep quiet and leave well enough alone, but playing devil's
>> advocate for a moment -
>>
>> I see you didn't bundle this with compiler-rt, which I guess is good?
>> In the end what was the reasoning for that? Do you see this being
>> sufficiently independent and running a different development track
>> that it made sense?
>
> compiler-rt is a conglomeration of many different things, and it is not clear to me that this needs to be yet-another thing that lives there. parallel-libs thematically makes sense to me; I really don't have a strong opinion about this. One thing that making the patch for parallel-libs made me realize was how much cmake functionality is currently housed in compiler-rt that perhaps should be made reusable somehow by all of the library projects.

I didn't realize this was a patch specifically for parallel-libs. I'd
really hope you reconsider compiler-rt, or even a top-level project, as
the home for this.

Not to be taken too seriously... just thinking out loud below.

Sorry to digress, but...
So far "parallel-libs" is just a front for SE. Six posts on the
mailing list and not much sign of life. So much potential, but what's
happened since the initial commit in ~June?
--------------
In contrast, this has a broader reach and general day-to-day use.
Coupling it with another project which is already a strong dependency
seems like a logical fit, no? compiler-rt isn't particularly big, and
would anyone complain about it being included there?

Anyway - this library doesn't handle parallelism, it's vectorization
(semantics). Also, there's already a bunch of built-ins in compiler-rt's
lib/builtins that the compiler leverages in a somewhat similar fashion.

------------
For naming conventions - in general it may make sense to set a
precedent early for what internally named functions should look like.
Off the cuff I'd propose __clg or something short and simple.
Consistency is typically a good thing, but anything starting with __
should, per the standard, be safe.

Sorry I can't help with the tests or benchmark side, but hopefully
when things are ironed out a bit more I'll try to test as best as I
can.

Thanks

Hal Finkel via llvm-dev

Sep 29, 2016, 2:20:03 AM
to Hal Finkel, llvm-dev, Matt Masten, Naoki Shibata
----- Original Message -----
> From: "Hal Finkel via llvm-dev" <llvm...@lists.llvm.org>
> To: "C Bergström" <cberg...@pathscale.com>
> Cc: "llvm-dev" <llvm...@lists.llvm.org>, "Naoki Shibata" <shibatc...@gmail.com>, "Matt Masten"
> <matt....@intel.com>
> Sent: Tuesday, September 27, 2016 5:29:23 AM
> Subject: Re: [llvm-dev] RFC: SIMD math-function library
>

I've ported the unit tests to the gtest framework that LLVM uses and updated the patch. The library should now be pretty well tested.

-Hal
