x87 is not used by GCC on x86_64. By default GCC emits scalar SSE single-precision (32-bit) and double-precision (64-bit) floating point, and SSE floating point lacks transcendentals.
The x87 transcendentals date back to the original design of the 8087 coprocessor from 1980; they are too inaccurate for scientific applications and too slow for graphics applications.
I was thinking about GPUs when I mentioned F_TD and D_TD as possible ISA extensions in a prior email.
Modern GPUs have RISC-like ISAs (though on Nvidia GPUs the ISA is not exposed directly to the compiler, because the PTX IR sits in between; the underlying ISA can change with each new chip family). These ISAs have transcendental instructions, executed by what is usually called an SFU (Special Function Unit). I don’t think this makes the GPU ISAs CISC.
In graphics applications, sin and cos approximations can be “good enough” for use in pixel shaders, i.e. any loss of accuracy shows up only as reduced visual quality. These approximations wouldn’t be used for scientific computing.
Transcendentals don’t necessarily need to be micro-coded; they could be implemented as multi-stage units like other FPU operations. I’d say these instructions fall into the category of domain-specific instructions. GPUs do all sorts of things in their ISAs while avoiding micro-code, e.g. special load/store instructions for local and shared memory versus what we would consider one set of normal load/store instructions that access global memory. That doesn’t make them CISC. Neither does having transcendentals.
Also, FPUs are not micro-coded; they are multi-stage pipelined units. It’s entirely possible to have a multi-stage pipelined functional unit that is not micro-coded, i.e. one hard-coded for a special function.
ROMs are 1T cells versus the 6T or 8T cells of SRAMs, and depending on their geometry, ROMs can be extremely fast to access. If you place a small ROM next to the special functional units, the polynomial approximation vectors can be fetched without going through the memory bus or disturbing L1/L2, so transcendental functions can be optimised without resorting to micro-code. Essentially you broadcast x*x to a vector and multiply it against the polynomial constants from ROM. There are many ways to do this, and some are quite accurate.
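A minimal sketch of that scheme in C, as a scalar model of the vector evaluation (my illustration; the constants are the truncated Taylor coefficients a real design would hold in ROM, and the input is assumed already range-reduced to [-π/4, π/4]):

```c
/* Hypothetical scalar model of the ROM-constant scheme: compute
 * t = x*x once (the "broadcast"), then run a Horner chain of
 * multiply-adds against the coefficient table.  Accurate to well
 * under 1 ulp of double on [-pi/4, pi/4] is not claimed; the
 * truncation error of this 9th-order series is ~2e-9 at pi/4. */
static double sin_poly(double x)
{
    double t = x * x;                    /* broadcast x*x          */
    double p = 2.7557319223985893e-6;    /*  1/9!  from "ROM"      */
    p = p * t - 1.9841269841269841e-4;   /* -1/7!                  */
    p = p * t + 8.3333333333333333e-3;   /*  1/5!                  */
    p = p * t - 1.6666666666666667e-1;   /* -1/3!                  */
    return x + x * t * p;                /* sin x ~ x + x*t*P(t)   */
}
```

In a vector unit each multiply-add above becomes one FMA against a broadcast lane of ROM constants, which is why keeping the table next to the functional unit matters.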
Loading the constants from RAM wouldn’t be appropriate, and permanently reserving registers for them wouldn’t be either, unless a special shader-kernel compiler can reserve registers for the polynomial approximation vector. The only way I can think of implementing these intrinsics efficiently without special function units is to have a way to load vector constants from a constant memory address space.
GPUs have constant memory with load instructions that don’t disturb the local/shared/global memory data paths. When you execute sin, cos, log2 or exp2 in a pixel shader being called for 8M pixels at 60Hz, you don’t want to constantly bump the data you are working on in and out of cache by loading constants via the single data path to memory; that interferes with the memory system’s ability to keep loads constrained to the array you are working on, and with optimisations like coalescing loads from multiple threads. A graphics-optimised domain-specific processor would likely just include fcos.[sd]/fsin.[sd]; an AI-optimised processor might exclude them. A set of optional instructions that can be added or removed for processors customised to application-specific domains seems like the perfect example of where you might have a profile that includes transcendentals. It’s interesting to note that tan and arctan are not included in the graphics ISAs; they probably don’t occur frequently enough in shader kernels.
Now that we have domain-specific processors, the lines between NPU/CPU/GPU/TPU/XPU are blurring. Pixel shaders call transcendental functions frequently enough that they merit their own instructions in domain-specific processors.
I think CISC micro-code is orthogonal to domain-specific instruction extensions implemented with multi-stage functional units, such as FPUs or even crypto. A divider requires a multi-cycle state machine, and divide wasn’t included on early ARM and Alpha. Does divide make a processor CISC, given the number of algorithms that exist to implement high-radix multiply and divide?
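To make the “multi-cycle state machine” concrete, here is a hypothetical C model of a radix-2 restoring divider: one quotient bit per cycle, no micro-code, just a counter, a shift, and a conditional subtract. A real high-radix unit retires several bits per cycle but sequences the same way.

```c
#include <stdint.h>

/* Radix-2 restoring division: 32 "cycles", one quotient bit each.
 * Models the hardware state machine; d must be non-zero. */
static uint32_t divide_restoring(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint64_t r = 0;  /* partial remainder */
    uint32_t q = 0;  /* quotient under construction */
    for (int i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1);  /* shift in next dividend bit */
        if (r >= d) {                   /* trial subtract succeeds    */
            r -= d;
            q |= 1u << i;
        }                               /* else: "restore" is a no-op */
    }
    *rem = (uint32_t)r;
    return q;
}
```

The loop body is exactly one state of the hardware sequencer, which is why divide latency on early in-order cores was data-independent and long.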
I think the principle is that one can implement simple unary and binary instructions that make up the verbs for one’s particular application domain. I think it would be possible to hard-code a sincos functional unit.
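A hypothetical sketch of what such a fused unit would compute: one squaring stage t = x·x feeds both the odd (sin) and even (cos) polynomial pipelines, and that shared stage is exactly what makes a hard-coded sincos attractive. Coefficients are truncated Taylor constants, input assumed range-reduced to [-π/4, π/4]; this is my illustration, not a real design.

```c
/* Fused sincos model: compute t = x*x once, then evaluate the odd
 * polynomial (sin) and the even polynomial (cos) against it.  In
 * hardware the two Horner chains run in parallel off one squarer. */
static void sincos_poly(double x, double *s, double *c)
{
    double t = x * x;                     /* shared squaring stage  */

    double ps = 2.7557319223985893e-6;    /*  1/9!                  */
    ps = ps * t - 1.9841269841269841e-4;  /* -1/7!                  */
    ps = ps * t + 8.3333333333333333e-3;  /*  1/5!                  */
    ps = ps * t - 1.6666666666666667e-1;  /* -1/3!                  */
    *s = x + x * t * ps;                  /* sin x ~ x + x*t*Ps(t)  */

    double pc = 2.4801587301587302e-5;    /*  1/8!                  */
    pc = pc * t - 1.3888888888888889e-3;  /* -1/6!                  */
    pc = pc * t + 4.1666666666666667e-2;  /*  1/4!                  */
    pc = pc * t - 5.0e-1;                 /* -1/2!                  */
    *c = 1.0 + t * pc;                    /* cos x ~ 1 + t*Pc(t)    */
}
```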