[go] math: avoid assembly stubs

Austin Clements (Gerrit)

Apr 15, 2021, 12:11:17 AM

to Michael Knyszek, Robert Griesemer, Cherry Zhang, goph...@pubsubhelper.golang.org, Austin Clements, golang-co...@googlegroups.com

Attention is currently required from: Michael Knyszek, Robert Griesemer, Cherry Zhang.

Austin Clements would like Michael Knyszek, Robert Griesemer and Cherry Zhang to review this change.

math: avoid assembly stubs

Currently almost all math functions have the following pattern:

func Sin(x float64) float64

func sin(x float64) float64 {
    // ... pure Go implementation ...
}

Architectures that implement a function in assembly provide the
assembly implementation directly as the exported function (e.g., Sin),
and architectures that don't implement it in assembly use a small stub
to jump back to the Go code, like:

TEXT ·Sin(SB), NOSPLIT, $0
	JMP ·sin(SB)

However, most functions are not implemented in assembly on most
architectures, so this jump through assembly is a waste. It defeats
compiler optimizations like inlining. And, with regabi, it actually
adds a small but non-trivial overhead because the jump from assembly
back to Go must go through an ABI0->ABIInternal bridge function.

Hence, this CL reorganizes this structure across the entire package.
The new structure relies on inlining to achieve peak performance, and
in return lets the compiler see all the way through to the pure Go
implementation.

Now, functions follow this pattern:

func Sin(x float64) float64 {
	if haveArchSin {
		return archSin(x)
	}
	return sin(x)
}

func sin(x float64) float64 {
    // ... pure Go implementation ...
}

Architectures that have assembly implementations use build-tagged
files to set haveArchX to true and provide an archX implementation.
That implementation can also still call back into the Go
implementation (some of them do this).
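
A minimal sketch of the "arch implementation calls back into Go" case,
written entirely in Go so that it runs anywhere. The cpuHasFastRound flag
and the use of math.Floor as a stand-in for a hardware instruction are
assumptions of this example; in the real package archFloor would be an
assembly routine selected by build tags.

```go
package main

import (
	"fmt"
	"math"
)

// haveArchFloor would be set to true by a build-tagged file on an
// architecture that provides its own Floor.
const haveArchFloor = true

// cpuHasFastRound is a hypothetical runtime feature flag, standing in
// for a CPU capability check.
var cpuHasFastRound = false

// archFloor plays the role of the architecture-specific routine. When
// the feature is missing it calls back into the pure Go code.
func archFloor(x float64) float64 {
	if !cpuHasFastRound {
		return floor(x)
	}
	return math.Floor(x) // stand-in for a hardware rounding instruction
}

// floor is a pure Go implementation built on math.Modf.
func floor(x float64) float64 {
	if x == 0 || math.IsNaN(x) || math.IsInf(x, 0) {
		return x
	}
	if x < 0 {
		d, fract := math.Modf(-x)
		if fract != 0.0 {
			d = d + 1
		}
		return -d
	}
	d, _ := math.Modf(x)
	return d
}

// Floor dispatches on the compile-time constant, as in the CL.
func Floor(x float64) float64 {
	if haveArchFloor {
		return archFloor(x)
	}
	return floor(x)
}

func main() {
	fmt.Println(Floor(2.7), Floor(-1.5))
}
```

Either path should return the same values; the flag only changes which
implementation computes them.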

This reduces the overhead of enabling ABI wrappers on the Tile38
benchmarks from ~4% to nothing.

For #40724.

Change-Id: I44fbba2a17be930ec9daeb0a8222f55cd50555a0
---
M src/math/acosh.go
M src/math/arith_s390x.go
M src/math/asin.go
M src/math/asinh.go
M src/math/atan.go
M src/math/atan2.go
M src/math/atanh.go
M src/math/cbrt.go
M src/math/dim.go
M src/math/dim_amd64.s
M src/math/dim_arm64.s
A src/math/dim_asm.go
A src/math/dim_noasm.go
M src/math/dim_s390x.s
M src/math/erf.go
M src/math/exp.go
A src/math/exp2_asm.go
A src/math/exp2_noasm.go
A src/math/exp_amd64.go
M src/math/exp_amd64.s
M src/math/exp_arm64.s
M src/math/exp_asm.go
A src/math/exp_noasm.go
M src/math/expm1.go
M src/math/floor.go
M src/math/floor_386.s
M src/math/floor_amd64.s
M src/math/floor_arm64.s
A src/math/floor_asm.go
A src/math/floor_noasm.go
M src/math/floor_ppc64x.s
M src/math/floor_s390x.s
M src/math/floor_wasm.s
M src/math/frexp.go
M src/math/hypot.go
M src/math/hypot_386.s
M src/math/hypot_amd64.s
A src/math/hypot_asm.go
A src/math/hypot_noasm.go
M src/math/ldexp.go
M src/math/log.go
M src/math/log10.go
M src/math/log1p.go
M src/math/log_amd64.s
A src/math/log_asm.go
A src/math/log_stub.go
M src/math/mod.go
M src/math/modf.go
M src/math/modf_arm64.s
A src/math/modf_asm.go
A src/math/modf_noasm.go
M src/math/modf_ppc64x.s
M src/math/pow.go
M src/math/remainder.go
M src/math/sin.go
M src/math/sinh.go
M src/math/sqrt.go
M src/math/sqrt_386.s
M src/math/sqrt_amd64.s
M src/math/sqrt_arm.s
M src/math/sqrt_arm64.s
A src/math/sqrt_asm.go
M src/math/sqrt_mipsx.s
A src/math/sqrt_noasm.go
M src/math/sqrt_ppc64x.s
M src/math/sqrt_s390x.s
M src/math/sqrt_wasm.s
A src/math/stubs.go
D src/math/stubs_386.s
D src/math/stubs_amd64.s
D src/math/stubs_arm.s
D src/math/stubs_arm64.s
D src/math/stubs_mips64x.s
D src/math/stubs_mipsx.s
D src/math/stubs_ppc64x.s
D src/math/stubs_riscv64.s
M src/math/stubs_s390x.s
D src/math/stubs_wasm.s
M src/math/tan.go
M src/math/tanh.go
80 files changed, 855 insertions(+), 1,123 deletions(-)

diff --git a/src/math/acosh.go b/src/math/acosh.go
index 41ca871..f74e0b6 100644
--- a/src/math/acosh.go
+++ b/src/math/acosh.go
@@ -39,7 +39,12 @@
 //	Acosh(+Inf) = +Inf
 //	Acosh(x) = NaN if x < 1
 //	Acosh(NaN) = NaN
-func Acosh(x float64) float64
+func Acosh(x float64) float64 {
+	if haveArchAcosh {
+		return archAcosh(x)
+	}
+	return acosh(x)
+}
 
 func acosh(x float64) float64 {
 	const Large = 1 << 28 // 2**28
diff --git a/src/math/arith_s390x.go b/src/math/arith_s390x.go
index 90a7d4f..129156a 100644
--- a/src/math/arith_s390x.go
+++ b/src/math/arith_s390x.go
@@ -6,72 +6,165 @@
 
 import "internal/cpu"
 
-func log10TrampolineSetup(x float64) float64
-func log10Asm(x float64) float64
-
-func cosTrampolineSetup(x float64) float64
-func cosAsm(x float64) float64
-
-func coshTrampolineSetup(x float64) float64
-func coshAsm(x float64) float64
-
-func sinTrampolineSetup(x float64) float64
-func sinAsm(x float64) float64
-
-func sinhTrampolineSetup(x float64) float64
-func sinhAsm(x float64) float64
-
-func tanhTrampolineSetup(x float64) float64
-func tanhAsm(x float64) float64
-
-func log1pTrampolineSetup(x float64) float64
-func log1pAsm(x float64) float64
-
-func atanhTrampolineSetup(x float64) float64
-func atanhAsm(x float64) float64
-
-func acosTrampolineSetup(x float64) float64
-func acosAsm(x float64) float64
-
-func acoshTrampolineSetup(x float64) float64
-func acoshAsm(x float64) float64
-
-func asinTrampolineSetup(x float64) float64
-func asinAsm(x float64) float64
-
-func asinhTrampolineSetup(x float64) float64
-func asinhAsm(x float64) float64
-
-func erfTrampolineSetup(x float64) float64
-func erfAsm(x float64) float64
-
-func erfcTrampolineSetup(x float64) float64
-func erfcAsm(x float64) float64
-
-func atanTrampolineSetup(x float64) float64
-func atanAsm(x float64) float64
-
-func atan2TrampolineSetup(x, y float64) float64
-func atan2Asm(x, y float64) float64
-
-func cbrtTrampolineSetup(x float64) float64
-func cbrtAsm(x float64) float64
+func expTrampolineSetup(x float64) float64
+func expAsm(x float64) float64
 
 func logTrampolineSetup(x float64) float64
 func logAsm(x float64) float64
 
+// Below here all functions are grouped in stubs.go for other
+// architectures.
+
+const haveArchLog10 = true
+
+func archLog10(x float64) float64
+func log10TrampolineSetup(x float64) float64
+func log10Asm(x float64) float64
+
+const haveArchCos = true
+
+func archCos(x float64) float64
+func cosTrampolineSetup(x float64) float64
+func cosAsm(x float64) float64
+
+const haveArchCosh = true
+
+func archCosh(x float64) float64
+func coshTrampolineSetup(x float64) float64
+func coshAsm(x float64) float64
+
+const haveArchSin = true
+
+func archSin(x float64) float64
+func sinTrampolineSetup(x float64) float64
+func sinAsm(x float64) float64
+
+const haveArchSinh = true
+
+func archSinh(x float64) float64
+func sinhTrampolineSetup(x float64) float64
+func sinhAsm(x float64) float64
+
+const haveArchTanh = true
+
+func archTanh(x float64) float64
+func tanhTrampolineSetup(x float64) float64
+func tanhAsm(x float64) float64
+
+const haveArchLog1p = true
+
+func archLog1p(x float64) float64
+func log1pTrampolineSetup(x float64) float64
+func log1pAsm(x float64) float64
+
+const haveArchAtanh = true
+
+func archAtanh(x float64) float64
+func atanhTrampolineSetup(x float64) float64
+func atanhAsm(x float64) float64
+
+const haveArchAcos = true
+
+func archAcos(x float64) float64
+func acosTrampolineSetup(x float64) float64
+func acosAsm(x float64) float64
+
+const haveArchAcosh = true
+
+func archAcosh(x float64) float64
+func acoshTrampolineSetup(x float64) float64
+func acoshAsm(x float64) float64
+
+const haveArchAsin = true
+
+func archAsin(x float64) float64
+func asinTrampolineSetup(x float64) float64
+func asinAsm(x float64) float64
+
+const haveArchAsinh = true
+
+func archAsinh(x float64) float64
+func asinhTrampolineSetup(x float64) float64
+func asinhAsm(x float64) float64
+
+const haveArchErf = true
+
+func archErf(x float64) float64
+func erfTrampolineSetup(x float64) float64
+func erfAsm(x float64) float64
+
+const haveArchErfc = true
+
+func archErfc(x float64) float64
+func erfcTrampolineSetup(x float64) float64
+func erfcAsm(x float64) float64
+
+const haveArchAtan = true
+
+func archAtan(x float64) float64
+func atanTrampolineSetup(x float64) float64
+func atanAsm(x float64) float64
+
+const haveArchAtan2 = true
+
+func archAtan2(y, x float64) float64
+func atan2TrampolineSetup(x, y float64) float64
+func atan2Asm(x, y float64) float64
+
+const haveArchCbrt = true
+
+func archCbrt(x float64) float64
+func cbrtTrampolineSetup(x float64) float64
+func cbrtAsm(x float64) float64
+
+const haveArchTan = true
+
+func archTan(x float64) float64
 func tanTrampolineSetup(x float64) float64
 func tanAsm(x float64) float64
 
-func expTrampolineSetup(x float64) float64
-func expAsm(x float64) float64
+const haveArchExpm1 = true
 
+func archExpm1(x float64) float64
 func expm1TrampolineSetup(x float64) float64
 func expm1Asm(x float64) float64
 
+const haveArchPow = true
+
+func archPow(x, y float64) float64
 func powTrampolineSetup(x, y float64) float64
 func powAsm(x, y float64) float64
 
+const haveArchFrexp = false
+
+func archFrexp(x float64) (float64, int) {
+	panic("not implemented")
+}
+
+const haveArchLdexp = false
+
+func archLdexp(frac float64, exp int) float64 {
+	panic("not implemented")
+}
+
+const haveArchLog2 = false
+
+func archLog2(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchMod = false
+
+func archMod(x, y float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchRemainder = false
+
+func archRemainder(x, y float64) float64 {
+	panic("not implemented")
+}
+
 // hasVX reports whether the machine has the z/Architecture
 // vector facility installed and enabled.
 var hasVX = cpu.S390X.HasVX
diff --git a/src/math/asin.go b/src/math/asin.go
index 88b851e..989a741 100644
--- a/src/math/asin.go
+++ b/src/math/asin.go
@@ -16,7 +16,12 @@
 // Special cases are:
 //	Asin(±0) = ±0
 //	Asin(x) = NaN if x < -1 or x > 1
-func Asin(x float64) float64
+func Asin(x float64) float64 {
+	if haveArchAsin {
+		return archAsin(x)
+	}
+	return asin(x)
+}
 
 func asin(x float64) float64 {
 	if x == 0 {
@@ -48,7 +53,12 @@
 //
 // Special case is:
 //	Acos(x) = NaN if x < -1 or x > 1
-func Acos(x float64) float64
+func Acos(x float64) float64 {
+	if haveArchAcos {
+		return archAcos(x)
+	}
+	return acos(x)
+}
 
 func acos(x float64) float64 {
 	return Pi/2 - Asin(x)
diff --git a/src/math/asinh.go b/src/math/asinh.go
index 65ae4c6..6dcb241 100644
--- a/src/math/asinh.go
+++ b/src/math/asinh.go
@@ -36,7 +36,12 @@
 //	Asinh(±0) = ±0
 //	Asinh(±Inf) = ±Inf
 //	Asinh(NaN) = NaN
-func Asinh(x float64) float64
+func Asinh(x float64) float64 {
+	if haveArchAsinh {
+		return archAsinh(x)
+	}
+	return asinh(x)
+}
 
 func asinh(x float64) float64 {
 	const (
diff --git a/src/math/atan.go b/src/math/atan.go
index 7fcc90b..69af860 100644
--- a/src/math/atan.go
+++ b/src/math/atan.go
@@ -92,7 +92,12 @@
 // Special cases are:
 //      Atan(±0) = ±0
 //      Atan(±Inf) = ±Pi/2
-func Atan(x float64) float64
+func Atan(x float64) float64 {
+	if haveArchAtan {
+		return archAtan(x)
+	}
+	return atan(x)
+}
 
 func atan(x float64) float64 {
 	if x == 0 {
diff --git a/src/math/atan2.go b/src/math/atan2.go
index d84b332..11d7e81 100644
--- a/src/math/atan2.go
+++ b/src/math/atan2.go
@@ -26,7 +26,12 @@
 //	Atan2(y<0, -Inf) = -Pi
 //	Atan2(+Inf, x) = +Pi/2
 //	Atan2(-Inf, x) = -Pi/2
-func Atan2(y, x float64) float64
+func Atan2(y, x float64) float64 {
+	if haveArchAtan2 {
+		return archAtan2(y, x)
+	}
+	return atan2(y, x)
+}
 
 func atan2(y, x float64) float64 {
 	// special cases
diff --git a/src/math/atanh.go b/src/math/atanh.go
index 8904bd6..fe8bd6d 100644
--- a/src/math/atanh.go
+++ b/src/math/atanh.go
@@ -44,7 +44,12 @@
 //	Atanh(-1) = -Inf
 //	Atanh(x) = NaN if x < -1 or x > 1
 //	Atanh(NaN) = NaN
-func Atanh(x float64) float64
+func Atanh(x float64) float64 {
+	if haveArchAtanh {
+		return archAtanh(x)
+	}
+	return atanh(x)
+}
 
 func atanh(x float64) float64 {
 	const NearZero = 1.0 / (1 << 28) // 2**-28
diff --git a/src/math/cbrt.go b/src/math/cbrt.go
index 4afd4a8..45c8ecb 100644
--- a/src/math/cbrt.go
+++ b/src/math/cbrt.go
@@ -22,7 +22,12 @@
 //	Cbrt(±0) = ±0
 //	Cbrt(±Inf) = ±Inf
 //	Cbrt(NaN) = NaN
-func Cbrt(x float64) float64
+func Cbrt(x float64) float64 {
+	if haveArchCbrt {
+		return archCbrt(x)
+	}
+	return cbrt(x)
+}
 
 func cbrt(x float64) float64 {
 	const (
diff --git a/src/math/dim.go b/src/math/dim.go
index d2e5d47..6a857bb 100644
--- a/src/math/dim.go
+++ b/src/math/dim.go
@@ -32,7 +32,12 @@
 //	Max(x, NaN) = Max(NaN, x) = NaN
 //	Max(+0, ±0) = Max(±0, +0) = +0
 //	Max(-0, -0) = -0
-func Max(x, y float64) float64
+func Max(x, y float64) float64 {
+	if haveArchMax {
+		return archMax(x, y)
+	}
+	return max(x, y)
+}
 
 func max(x, y float64) float64 {
 	// special cases
@@ -59,7 +64,12 @@
 //	Min(x, -Inf) = Min(-Inf, x) = -Inf
 //	Min(x, NaN) = Min(NaN, x) = NaN
 //	Min(-0, ±0) = Min(±0, -0) = -0
-func Min(x, y float64) float64
+func Min(x, y float64) float64 {
+	if haveArchMin {
+		return archMin(x, y)
+	}
+	return min(x, y)
+}
 
 func min(x, y float64) float64 {
 	// special cases
diff --git a/src/math/dim_amd64.s b/src/math/dim_amd64.s
index 85c02e6..253f03b 100644
--- a/src/math/dim_amd64.s
+++ b/src/math/dim_amd64.s
@@ -8,8 +8,8 @@
 #define NaN    0x7FF8000000000001
 #define NegInf 0xFFF0000000000000
 
-// func ·Max(x, y float64) float64
-TEXT ·Max(SB),NOSPLIT,$0
+// func ·archMax(x, y float64) float64
+TEXT ·archMax(SB),NOSPLIT,$0
 	// +Inf special cases
 	MOVQ    $PosInf, AX
 	MOVQ    x+0(FP), R8
@@ -52,8 +52,8 @@
 	MOVQ    R9, ret+16(FP) // return other 0
 	RET
 
-// func Min(x, y float64) float64
-TEXT ·Min(SB),NOSPLIT,$0
+// func archMin(x, y float64) float64
+TEXT ·archMin(SB),NOSPLIT,$0
 	// -Inf special cases
 	MOVQ    $NegInf, AX
 	MOVQ    x+0(FP), R8
diff --git a/src/math/dim_arm64.s b/src/math/dim_arm64.s
index 2cb866f..f112003 100644
--- a/src/math/dim_arm64.s
+++ b/src/math/dim_arm64.s
@@ -8,8 +8,8 @@
 #define NaN    0x7FF8000000000001
 #define NegInf 0xFFF0000000000000
 
-// func ·Max(x, y float64) float64
-TEXT ·Max(SB),NOSPLIT,$0
+// func ·archMax(x, y float64) float64
+TEXT ·archMax(SB),NOSPLIT,$0
 	// +Inf special cases
 	MOVD	$PosInf, R0
 	MOVD	x+0(FP), R1
@@ -28,8 +28,8 @@
 	MOVD	R0, ret+16(FP)
 	RET
 
-// func Min(x, y float64) float64
-TEXT ·Min(SB),NOSPLIT,$0
+// func archMin(x, y float64) float64
+TEXT ·archMin(SB),NOSPLIT,$0
 	// -Inf special cases
 	MOVD	$NegInf, R0
 	MOVD	x+0(FP), R1
diff --git a/src/math/dim_asm.go b/src/math/dim_asm.go
new file mode 100644
index 0000000..9ba742a
--- /dev/null
+++ b/src/math/dim_asm.go
@@ -0,0 +1,16 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build amd64 || arm64 || riscv64 || s390x
+// +build amd64 arm64 riscv64 s390x
+
+package math
+
+const haveArchMax = true
+
+func archMax(x, y float64) float64
+
+const haveArchMin = true
+
+func archMin(x, y float64) float64
diff --git a/src/math/dim_noasm.go b/src/math/dim_noasm.go
new file mode 100644
index 0000000..ea46577
--- /dev/null
+++ b/src/math/dim_noasm.go
@@ -0,0 +1,20 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !amd64 && !arm64 && !riscv64 && !s390x
+// +build !amd64,!arm64,!riscv64,!s390x
+
+package math
+
+const haveArchMax = false
+
+func archMax(x, y float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchMin = false
+
+func archMin(x, y float64) float64 {
+	panic("not implemented")
+}
diff --git a/src/math/dim_s390x.s b/src/math/dim_s390x.s
index 74fdd75..1277026 100644
--- a/src/math/dim_s390x.s
+++ b/src/math/dim_s390x.s
@@ -11,7 +11,7 @@
 #define NegInf 0xFFF0000000000000
 
 // func ·Max(x, y float64) float64
-TEXT ·Max(SB),NOSPLIT,$0
+TEXT ·archMax(SB),NOSPLIT,$0
 	// +Inf special cases
 	MOVD    $PosInf, R4
 	MOVD    x+0(FP), R8
@@ -52,8 +52,8 @@
 	MOVD    R9, ret+16(FP) // return other 0
 	RET
 
-// func Min(x, y float64) float64
-TEXT ·Min(SB),NOSPLIT,$0
+// func archMin(x, y float64) float64
+TEXT ·archMin(SB),NOSPLIT,$0
 	// -Inf special cases
 	MOVD    $NegInf, R4
 	MOVD    x+0(FP), R8
diff --git a/src/math/erf.go b/src/math/erf.go
index 9b4048f..4d6fe47 100644
--- a/src/math/erf.go
+++ b/src/math/erf.go
@@ -185,7 +185,12 @@
 //	Erf(+Inf) = 1
 //	Erf(-Inf) = -1
 //	Erf(NaN) = NaN
-func Erf(x float64) float64
+func Erf(x float64) float64 {
+	if haveArchErf {
+		return archErf(x)
+	}
+	return erf(x)
+}
 
 func erf(x float64) float64 {
 	const (
@@ -264,7 +269,12 @@
 //	Erfc(+Inf) = 0
 //	Erfc(-Inf) = 2
 //	Erfc(NaN) = NaN
-func Erfc(x float64) float64
+func Erfc(x float64) float64 {
+	if haveArchErfc {
+		return archErfc(x)
+	}
+	return erfc(x)
+}
 
 func erfc(x float64) float64 {
 	const Tiny = 1.0 / (1 << 56) // 2**-56
diff --git a/src/math/exp.go b/src/math/exp.go
index bd4c5c9..d05eb91 100644
--- a/src/math/exp.go
+++ b/src/math/exp.go
@@ -11,7 +11,12 @@
 //	Exp(NaN) = NaN
 // Very large values overflow to 0 or +Inf.
 // Very small values underflow to 1.
-func Exp(x float64) float64
+func Exp(x float64) float64 {
+	if haveArchExp {
+		return archExp(x)
+	}
+	return exp(x)
+}
 
 // The original C code, the long comment, and the constants
 // below are from FreeBSD's /usr/src/lib/msun/src/e_exp.c
@@ -132,7 +137,12 @@
 // Exp2 returns 2**x, the base-2 exponential of x.
 //
 // Special cases are the same as Exp.
-func Exp2(x float64) float64
+func Exp2(x float64) float64 {
+	if haveArchExp2 {
+		return archExp2(x)
+	}
+	return exp2(x)
+}
 
 func exp2(x float64) float64 {
 	const (
diff --git a/src/math/exp2_asm.go b/src/math/exp2_asm.go
new file mode 100644
index 0000000..76d258f
--- /dev/null
+++ b/src/math/exp2_asm.go
@@ -0,0 +1,12 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build arm64
+// +build arm64
+
+package math
+
+const haveArchExp2 = true
+
+func archExp2(x float64) float64
diff --git a/src/math/exp2_noasm.go b/src/math/exp2_noasm.go
new file mode 100644
index 0000000..1b0ac87
--- /dev/null
+++ b/src/math/exp2_noasm.go
@@ -0,0 +1,14 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !arm64
+// +build !arm64
+
+package math
+
+const haveArchExp2 = false
+
+func archExp2(x float64) float64 {
+	panic("not implemented")
+}
diff --git a/src/math/exp_amd64.go b/src/math/exp_amd64.go
new file mode 100644
index 0000000..654ccce
--- /dev/null
+++ b/src/math/exp_amd64.go
@@ -0,0 +1,12 @@
+// Copyright 2017 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build amd64
+// +build amd64
+
+package math
+
+import "internal/cpu"
+
+var useFMA = cpu.X86.HasAVX && cpu.X86.HasFMA
diff --git a/src/math/exp_amd64.s b/src/math/exp_amd64.s
index b3e1c22..02b71c8 100644
--- a/src/math/exp_amd64.s
+++ b/src/math/exp_amd64.s
@@ -37,7 +37,7 @@
 GLOBL exprodata<>+0(SB), RODATA, $72
 
 // func Exp(x float64) float64
-TEXT ·Exp(SB),NOSPLIT,$0
+TEXT ·archExp(SB),NOSPLIT,$0
 	// test bits for not-finite
 	MOVQ    x+0(FP), BX
 	MOVQ    $~(1<<63), AX // sign bit mask
diff --git a/src/math/exp_arm64.s b/src/math/exp_arm64.s
index 19736cb..44673ab 100644
--- a/src/math/exp_arm64.s
+++ b/src/math/exp_arm64.s
@@ -23,7 +23,7 @@
 // This is an assembly implementation of the method used for function Exp in file exp.go.
 //
 // func Exp(x float64) float64
-TEXT ·Exp(SB),$0-16
+TEXT ·archExp(SB),$0-16
 	FMOVD	x+0(FP), F0	// F0 = x
 	FCMPD	F0, F0
 	BNE	isNaN		// x = NaN, return NaN
@@ -109,7 +109,7 @@
 // This is an assembly implementation of the method used for function Exp2 in file exp.go.
 //
 // func Exp2(x float64) float64
-TEXT ·Exp2(SB),$0-16
+TEXT ·archExp2(SB),$0-16
 	FMOVD	x+0(FP), F0	// F0 = x
 	FCMPD	F0, F0
 	BNE	isNaN		// x = NaN, return NaN
diff --git a/src/math/exp_asm.go b/src/math/exp_asm.go
index 654ccce..a1673ea 100644
--- a/src/math/exp_asm.go
+++ b/src/math/exp_asm.go
@@ -1,12 +1,12 @@
-// Copyright 2017 The Go Authors. All rights reserved.
+// Copyright 2021 The Go Authors. All rights reserved.
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-//go:build amd64
-// +build amd64
+//go:build amd64 || arm64 || s390x
+// +build amd64 arm64 s390x
 
 package math
 
-import "internal/cpu"
+const haveArchExp = true
 
-var useFMA = cpu.X86.HasAVX && cpu.X86.HasFMA
+func archExp(x float64) float64
diff --git a/src/math/exp_noasm.go b/src/math/exp_noasm.go
new file mode 100644
index 0000000..b757e6e
--- /dev/null
+++ b/src/math/exp_noasm.go
@@ -0,0 +1,14 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !amd64 && !arm64 && !s390x
+// +build !amd64,!arm64,!s390x
+
+package math
+
+const haveArchExp = false
+
+func archExp(x float64) float64 {
+	panic("not implemented")
+}
diff --git a/src/math/expm1.go b/src/math/expm1.go
index 8e77398..66d3421 100644
--- a/src/math/expm1.go
+++ b/src/math/expm1.go
@@ -121,7 +121,12 @@
 //	Expm1(-Inf) = -1
 //	Expm1(NaN) = NaN
 // Very large values overflow to -1 or +Inf.
-func Expm1(x float64) float64
+func Expm1(x float64) float64 {
+	if haveArchExpm1 {
+		return archExpm1(x)
+	}
+	return expm1(x)
+}
 
 func expm1(x float64) float64 {
 	const (
diff --git a/src/math/floor.go b/src/math/floor.go
index 18e89ef..7913a90 100644
--- a/src/math/floor.go
+++ b/src/math/floor.go
@@ -10,7 +10,12 @@
 //	Floor(±0) = ±0
 //	Floor(±Inf) = ±Inf
 //	Floor(NaN) = NaN
-func Floor(x float64) float64
+func Floor(x float64) float64 {
+	if haveArchFloor {
+		return archFloor(x)
+	}
+	return floor(x)
+}
 
 func floor(x float64) float64 {
 	if x == 0 || IsNaN(x) || IsInf(x, 0) {
@@ -33,7 +38,12 @@
 //	Ceil(±0) = ±0
 //	Ceil(±Inf) = ±Inf
 //	Ceil(NaN) = NaN
-func Ceil(x float64) float64
+func Ceil(x float64) float64 {
+	if haveArchCeil {
+		return archCeil(x)
+	}
+	return ceil(x)
+}
 
 func ceil(x float64) float64 {
 	return -Floor(-x)
@@ -45,7 +55,12 @@
 //	Trunc(±0) = ±0
 //	Trunc(±Inf) = ±Inf
 //	Trunc(NaN) = NaN
-func Trunc(x float64) float64
+func Trunc(x float64) float64 {
+	if haveArchTrunc {
+		return archTrunc(x)
+	}
+	return trunc(x)
+}
 
 func trunc(x float64) float64 {
 	if x == 0 || IsNaN(x) || IsInf(x, 0) {
diff --git a/src/math/floor_386.s b/src/math/floor_386.s
index 0960eec..1990cb0 100644
--- a/src/math/floor_386.s
+++ b/src/math/floor_386.s
@@ -4,8 +4,8 @@
 
 #include "textflag.h"
 
-// func Ceil(x float64) float64
-TEXT ·Ceil(SB),NOSPLIT,$0
+// func archCeil(x float64) float64
+TEXT ·archCeil(SB),NOSPLIT,$0
 	FMOVD   x+0(FP), F0  // F0=x
 	FSTCW   -2(SP)       // save old Control Word
 	MOVW    -2(SP), AX
@@ -18,8 +18,8 @@
 	FMOVDP  F0, ret+8(FP)
 	RET
 
-// func Floor(x float64) float64
-TEXT ·Floor(SB),NOSPLIT,$0
+// func archFloor(x float64) float64
+TEXT ·archFloor(SB),NOSPLIT,$0
 	FMOVD   x+0(FP), F0  // F0=x
 	FSTCW   -2(SP)       // save old Control Word
 	MOVW    -2(SP), AX
@@ -32,8 +32,8 @@
 	FMOVDP  F0, ret+8(FP)
 	RET
 
-// func Trunc(x float64) float64
-TEXT ·Trunc(SB),NOSPLIT,$0
+// func archTrunc(x float64) float64
+TEXT ·archTrunc(SB),NOSPLIT,$0
 	FMOVD   x+0(FP), F0  // F0=x
 	FSTCW   -2(SP)       // save old Control Word
 	MOVW    -2(SP), AX
diff --git a/src/math/floor_amd64.s b/src/math/floor_amd64.s
index 4ef02eb..0880499 100644
--- a/src/math/floor_amd64.s
+++ b/src/math/floor_amd64.s
@@ -6,8 +6,8 @@
 
 #define Big		0x4330000000000000 // 2**52
 
-// func Floor(x float64) float64
-TEXT ·Floor(SB),NOSPLIT,$0
+// func archFloor(x float64) float64
+TEXT ·archFloor(SB),NOSPLIT,$0
 	MOVQ	x+0(FP), AX
 	MOVQ	$~(1<<63), DX // sign bit mask
 	ANDQ	AX,DX // DX = |x|
@@ -28,8 +28,8 @@
 	MOVQ    AX, ret+8(FP) // return x
 	RET
 
-// func Ceil(x float64) float64
-TEXT ·Ceil(SB),NOSPLIT,$0
+// func archCeil(x float64) float64
+TEXT ·archCeil(SB),NOSPLIT,$0
 	MOVQ	x+0(FP), AX
 	MOVQ	$~(1<<63), DX // sign bit mask
 	MOVQ	AX, BX // BX = copy of x
@@ -54,8 +54,8 @@
 	MOVQ	AX, ret+8(FP)
 	RET
 
-// func Trunc(x float64) float64
-TEXT ·Trunc(SB),NOSPLIT,$0
+// func archTrunc(x float64) float64
+TEXT ·archTrunc(SB),NOSPLIT,$0
 	MOVQ	x+0(FP), AX
 	MOVQ	$~(1<<63), DX // sign bit mask
 	MOVQ	AX, BX // BX = copy of x
diff --git a/src/math/floor_arm64.s b/src/math/floor_arm64.s
index 6d240d4..d9c5df7 100644
--- a/src/math/floor_arm64.s
+++ b/src/math/floor_arm64.s
@@ -4,22 +4,22 @@
 
 #include "textflag.h"
 
-// func Floor(x float64) float64
-TEXT ·Floor(SB),NOSPLIT,$0
+// func archFloor(x float64) float64
+TEXT ·archFloor(SB),NOSPLIT,$0
 	FMOVD	x+0(FP), F0
 	FRINTMD	F0, F0
 	FMOVD	F0, ret+8(FP)
 	RET
 
-// func Ceil(x float64) float64
-TEXT ·Ceil(SB),NOSPLIT,$0
+// func archCeil(x float64) float64
+TEXT ·archCeil(SB),NOSPLIT,$0
 	FMOVD	x+0(FP), F0
 	FRINTPD	F0, F0
 	FMOVD	F0, ret+8(FP)
 	RET
 
-// func Trunc(x float64) float64
-TEXT ·Trunc(SB),NOSPLIT,$0
+// func archTrunc(x float64) float64
+TEXT ·archTrunc(SB),NOSPLIT,$0
 	FMOVD	x+0(FP), F0
 	FRINTZD	F0, F0
 	FMOVD	F0, ret+8(FP)
diff --git a/src/math/floor_asm.go b/src/math/floor_asm.go
new file mode 100644
index 0000000..1265e51
--- /dev/null
+++ b/src/math/floor_asm.go
@@ -0,0 +1,20 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build 386 || amd64 || arm64 || ppc64 || ppc64le || s390x || wasm
+// +build 386 amd64 arm64 ppc64 ppc64le s390x wasm
+
+package math
+
+const haveArchFloor = true
+
+func archFloor(x float64) float64
+
+const haveArchCeil = true
+
+func archCeil(x float64) float64
+
+const haveArchTrunc = true
+
+func archTrunc(x float64) float64
diff --git a/src/math/floor_noasm.go b/src/math/floor_noasm.go
new file mode 100644
index 0000000..821af21
--- /dev/null
+++ b/src/math/floor_noasm.go
@@ -0,0 +1,26 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !386 && !amd64 && !arm64 && !ppc64 && !ppc64le && !s390x && !wasm
+// +build !386,!amd64,!arm64,!ppc64,!ppc64le,!s390x,!wasm
+
+package math
+
+const haveArchFloor = false
+
+func archFloor(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchCeil = false
+
+func archCeil(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchTrunc = false
+
+func archTrunc(x float64) float64 {
+	panic("not implemented")
+}
diff --git a/src/math/floor_ppc64x.s b/src/math/floor_ppc64x.s
index 2ab011d..29b92a6 100644
--- a/src/math/floor_ppc64x.s
+++ b/src/math/floor_ppc64x.s
@@ -6,19 +6,19 @@
 
 #include "textflag.h"
 
-TEXT ·Floor(SB),NOSPLIT,$0
+TEXT ·archFloor(SB),NOSPLIT,$0
 	FMOVD   x+0(FP), F0
 	FRIM	F0, F0
 	FMOVD   F0, ret+8(FP)
 	RET
 
-TEXT ·Ceil(SB),NOSPLIT,$0
+TEXT ·archCeil(SB),NOSPLIT,$0
 	FMOVD   x+0(FP), F0
 	FRIP    F0, F0
 	FMOVD	F0, ret+8(FP)
 	RET
 
-TEXT ·Trunc(SB),NOSPLIT,$0
+TEXT ·archTrunc(SB),NOSPLIT,$0
 	FMOVD   x+0(FP), F0
 	FRIZ    F0, F0
 	FMOVD   F0, ret+8(FP)
diff --git a/src/math/floor_s390x.s b/src/math/floor_s390x.s
index 896e79b..b5dd462 100644
--- a/src/math/floor_s390x.s
+++ b/src/math/floor_s390x.s
@@ -4,22 +4,22 @@
 
 #include "textflag.h"
 
-// func Floor(x float64) float64
-TEXT ·Floor(SB),NOSPLIT,$0
+// func archFloor(x float64) float64
+TEXT ·archFloor(SB),NOSPLIT,$0
 	FMOVD	x+0(FP), F0
 	FIDBR	$7, F0, F0
 	FMOVD	F0, ret+8(FP)
 	RET
 
-// func Ceil(x float64) float64
-TEXT ·Ceil(SB),NOSPLIT,$0
+// func archCeil(x float64) float64
+TEXT ·archCeil(SB),NOSPLIT,$0
 	FMOVD	x+0(FP), F0
 	FIDBR	$6, F0, F0
 	FMOVD	F0, ret+8(FP)
 	RET
 
-// func Trunc(x float64) float64
-TEXT ·Trunc(SB),NOSPLIT,$0
+// func archTrunc(x float64) float64
+TEXT ·archTrunc(SB),NOSPLIT,$0
 	FMOVD	x+0(FP), F0
 	FIDBR	$5, F0, F0
 	FMOVD	F0, ret+8(FP)
diff --git a/src/math/floor_wasm.s b/src/math/floor_wasm.s
index 4d8a0eb..3751471 100644
--- a/src/math/floor_wasm.s
+++ b/src/math/floor_wasm.s
@@ -4,21 +4,21 @@
 
 #include "textflag.h"
 
-TEXT ·Floor(SB),NOSPLIT,$0
+TEXT ·archFloor(SB),NOSPLIT,$0
 	Get SP
 	F64Load x+0(FP)
 	F64Floor
 	F64Store ret+8(FP)
 	RET
 
-TEXT ·Ceil(SB),NOSPLIT,$0
+TEXT ·archCeil(SB),NOSPLIT,$0
 	Get SP
 	F64Load x+0(FP)
 	F64Ceil
 	F64Store ret+8(FP)
 	RET
 
-TEXT ·Trunc(SB),NOSPLIT,$0
+TEXT ·archTrunc(SB),NOSPLIT,$0
 	Get SP
 	F64Load x+0(FP)
 	F64Trunc
diff --git a/src/math/frexp.go b/src/math/frexp.go
index 0e26feb..3c8a909 100644
--- a/src/math/frexp.go
+++ b/src/math/frexp.go
@@ -13,7 +13,12 @@
 //	Frexp(±0) = ±0, 0
 //	Frexp(±Inf) = ±Inf, 0
 //	Frexp(NaN) = NaN, 0
-func Frexp(f float64) (frac float64, exp int)
+func Frexp(f float64) (frac float64, exp int) {
+	if haveArchFrexp {
+		return archFrexp(f)
+	}
+	return frexp(f)
+}
 
 func frexp(f float64) (frac float64, exp int) {
 	// special cases
diff --git a/src/math/hypot.go b/src/math/hypot.go
index c7f19d4..12af177 100644
--- a/src/math/hypot.go
+++ b/src/math/hypot.go
@@ -16,7 +16,12 @@
 //	Hypot(p, ±Inf) = +Inf
 //	Hypot(NaN, q) = NaN
 //	Hypot(p, NaN) = NaN
-func Hypot(p, q float64) float64
+func Hypot(p, q float64) float64 {
+	if haveArchHypot {
+		return archHypot(p, q)
+	}
+	return hypot(p, q)
+}
 
 func hypot(p, q float64) float64 {
 	// special cases
diff --git a/src/math/hypot_386.s b/src/math/hypot_386.s
index a89cdf7..80a8fd3 100644
--- a/src/math/hypot_386.s
+++ b/src/math/hypot_386.s
@@ -4,8 +4,8 @@
 
 #include "textflag.h"
 
-// func Hypot(p, q float64) float64
-TEXT ·Hypot(SB),NOSPLIT,$0
+// func archHypot(p, q float64) float64
+TEXT ·archHypot(SB),NOSPLIT,$0
 // test bits for not-finite
 	MOVL    p_hi+4(FP), AX   // high word p
 	ANDL    $0x7ff00000, AX
diff --git a/src/math/hypot_amd64.s b/src/math/hypot_amd64.s
index d7983a7..fe326c9 100644
--- a/src/math/hypot_amd64.s
+++ b/src/math/hypot_amd64.s
@@ -7,8 +7,8 @@
 #define PosInf 0x7FF0000000000000
 #define NaN 0x7FF8000000000001
 
-// func Hypot(p, q float64) float64
-TEXT ·Hypot(SB),NOSPLIT,$0
+// func archHypot(p, q float64) float64
+TEXT ·archHypot(SB),NOSPLIT,$0
 	// test bits for special cases
 	MOVQ    p+0(FP), BX
 	MOVQ    $~(1<<63), AX
diff --git a/src/math/hypot_asm.go b/src/math/hypot_asm.go
new file mode 100644
index 0000000..9435af4
--- /dev/null
+++ b/src/math/hypot_asm.go
@@ -0,0 +1,12 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build 386 || amd64
+// +build 386 amd64
+
+package math
+
+const haveArchHypot = true
+
+func archHypot(p, q float64) float64
diff --git a/src/math/hypot_noasm.go b/src/math/hypot_noasm.go
new file mode 100644
index 0000000..bc41b13
--- /dev/null
+++ b/src/math/hypot_noasm.go
@@ -0,0 +1,14 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !386 && !amd64
+// +build !386,!amd64
+
+package math
+
+const haveArchHypot = false
+
+func archHypot(p, q float64) float64 {
+	panic("not implemented")
+}
diff --git a/src/math/ldexp.go b/src/math/ldexp.go
index aa50a49..55c82f1 100644
--- a/src/math/ldexp.go
+++ b/src/math/ldexp.go
@@ -11,7 +11,12 @@
 //	Ldexp(±0, exp) = ±0
 //	Ldexp(±Inf, exp) = ±Inf
 //	Ldexp(NaN, exp) = NaN
-func Ldexp(frac float64, exp int) float64
+func Ldexp(frac float64, exp int) float64 {
+	if haveArchLdexp {
+		return archLdexp(frac, exp)
+	}
+	return ldexp(frac, exp)
+}
 
 func ldexp(frac float64, exp int) float64 {
 	// special cases
diff --git a/src/math/log.go b/src/math/log.go
index e328348..1b3e306 100644
--- a/src/math/log.go
+++ b/src/math/log.go
@@ -77,7 +77,12 @@
 //	Log(0) = -Inf
 //	Log(x < 0) = NaN
 //	Log(NaN) = NaN
-func Log(x float64) float64
+func Log(x float64) float64 {
+	if haveArchLog {
+		return archLog(x)
+	}
+	return log(x)
+}
 
 func log(x float64) float64 {
 	const (
diff --git a/src/math/log10.go b/src/math/log10.go
index ccd079d..e6916a5 100644
--- a/src/math/log10.go
+++ b/src/math/log10.go
@@ -6,7 +6,12 @@
 
 // Log10 returns the decimal logarithm of x.
 // The special cases are the same as for Log.
-func Log10(x float64) float64
+func Log10(x float64) float64 {
+	if haveArchLog10 {
+		return archLog10(x)
+	}
+	return log10(x)
+}
 
 func log10(x float64) float64 {
 	return Log(x) * (1 / Ln10)
@@ -14,7 +19,12 @@
 
 // Log2 returns the binary logarithm of x.
 // The special cases are the same as for Log.
-func Log2(x float64) float64
+func Log2(x float64) float64 {
+	if haveArchLog2 {
+		return archLog2(x)
+	}
+	return log2(x)
+}
 
 func log2(x float64) float64 {
 	frac, exp := Frexp(x)
diff --git a/src/math/log1p.go b/src/math/log1p.go
index e34e1ff..c117f72 100644
--- a/src/math/log1p.go
+++ b/src/math/log1p.go
@@ -92,7 +92,12 @@
 //	Log1p(-1) = -Inf
 //	Log1p(x < -1) = NaN
 //	Log1p(NaN) = NaN
-func Log1p(x float64) float64
+func Log1p(x float64) float64 {
+	if haveArchLog1p {
+		return archLog1p(x)
+	}
+	return log1p(x)
+}
 
 func log1p(x float64) float64 {
 	const (
diff --git a/src/math/log_amd64.s b/src/math/log_amd64.s
index 3d76389..d84091f 100644
--- a/src/math/log_amd64.s
+++ b/src/math/log_amd64.s
@@ -19,7 +19,7 @@
 #define PosInf 0x7FF0000000000000
 
-// func Log(x float64) float64
-TEXT ·Log(SB),NOSPLIT,$0
+// func archLog(x float64) float64
+TEXT ·archLog(SB),NOSPLIT,$0
 	// test bits for special cases
 	MOVQ    x+0(FP), BX
 	MOVQ    $~(1<<63), AX // sign bit mask
diff --git a/src/math/log_asm.go b/src/math/log_asm.go
new file mode 100644
index 0000000..4d3b7ee
--- /dev/null
+++ b/src/math/log_asm.go
@@ -0,0 +1,12 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build amd64 || s390x
+// +build amd64 s390x
+
+package math
+
+const haveArchLog = true
+
+func archLog(x float64) float64
diff --git a/src/math/log_stub.go b/src/math/log_stub.go
new file mode 100644
index 0000000..e169716
--- /dev/null
+++ b/src/math/log_stub.go
@@ -0,0 +1,14 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !amd64 && !s390x
+// +build !amd64,!s390x
+
+package math
+
+const haveArchLog = false
+
+func archLog(x float64) float64 {
+	panic("not implemented")
+}
diff --git a/src/math/mod.go b/src/math/mod.go
index 7efc018..6bc5f28 100644
--- a/src/math/mod.go
+++ b/src/math/mod.go
@@ -18,7 +18,12 @@
 //	Mod(x, 0) = NaN
 //	Mod(x, ±Inf) = x
 //	Mod(x, NaN) = NaN
-func Mod(x, y float64) float64
+func Mod(x, y float64) float64 {
+	if haveArchMod {
+		return archMod(x, y)
+	}
+	return mod(x, y)
+}
 
 func mod(x, y float64) float64 {
 	if y == 0 || IsInf(x, 0) || IsNaN(x) || IsNaN(y) {
diff --git a/src/math/modf.go b/src/math/modf.go
index c5bb894..bf08dc6 100644
--- a/src/math/modf.go
+++ b/src/math/modf.go
@@ -10,7 +10,12 @@
 // Special cases are:
 //	Modf(±Inf) = ±Inf, NaN
 //	Modf(NaN) = NaN, NaN
-func Modf(f float64) (int float64, frac float64)
+func Modf(f float64) (int float64, frac float64) {
+	if haveArchModf {
+		return archModf(f)
+	}
+	return modf(f)
+}
 
 func modf(f float64) (int float64, frac float64) {
 	if f < 1 {
diff --git a/src/math/modf_arm64.s b/src/math/modf_arm64.s
index 7c70ef3..1e4a329 100644
--- a/src/math/modf_arm64.s
+++ b/src/math/modf_arm64.s
@@ -4,8 +4,8 @@
 
 #include "textflag.h"
 
-// func Modf(f float64) (int float64, frac float64)
-TEXT ·Modf(SB),NOSPLIT,$0
+// func archModf(f float64) (int float64, frac float64)
+TEXT ·archModf(SB),NOSPLIT,$0
 	MOVD	f+0(FP), R0
 	FMOVD	R0, F0
 	FRINTZD	F0, F1
diff --git a/src/math/modf_asm.go b/src/math/modf_asm.go
new file mode 100644
index 0000000..ce431a9
--- /dev/null
+++ b/src/math/modf_asm.go
@@ -0,0 +1,12 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build arm64 || ppc64 || ppc64le
+// +build arm64 ppc64 ppc64le
+
+package math
+
+const haveArchModf = true
+
+func archModf(f float64) (int float64, frac float64)
diff --git a/src/math/modf_noasm.go b/src/math/modf_noasm.go
new file mode 100644
index 0000000..9607a08
--- /dev/null
+++ b/src/math/modf_noasm.go
@@ -0,0 +1,14 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !arm64 && !ppc64 && !ppc64le
+// +build !arm64,!ppc64,!ppc64le
+
+package math
+
+const haveArchModf = false
+
+func archModf(f float64) (int float64, frac float64) {
+	panic("not implemented")
+}
diff --git a/src/math/modf_ppc64x.s b/src/math/modf_ppc64x.s
index da58653..caa435e 100644
--- a/src/math/modf_ppc64x.s
+++ b/src/math/modf_ppc64x.s
@@ -6,8 +6,8 @@
 
 #include "textflag.h"
 
-// func Modf(f float64) (int float64, frac float64)
-TEXT ·Modf(SB),NOSPLIT,$0
+// func archModf(f float64) (int float64, frac float64)
+TEXT ·archModf(SB),NOSPLIT,$0
 	FMOVD	f+0(FP), F0
 	FRIZ	F0, F1
 	FMOVD	F1, int+8(FP)
diff --git a/src/math/pow.go b/src/math/pow.go
index 2219a90..e45a044 100644
--- a/src/math/pow.go
+++ b/src/math/pow.go
@@ -35,7 +35,12 @@
 //	Pow(+Inf, y) = +0 for y < 0
 //	Pow(-Inf, y) = Pow(-0, -y)
 //	Pow(x, y) = NaN for finite x < 0 and finite non-integer y
-func Pow(x, y float64) float64
+func Pow(x, y float64) float64 {
+	if haveArchPow {
+		return archPow(x, y)
+	}
+	return pow(x, y)
+}
 
 func pow(x, y float64) float64 {
 	switch {
diff --git a/src/math/remainder.go b/src/math/remainder.go
index 7c77d6e..bf8bfd5 100644
--- a/src/math/remainder.go
+++ b/src/math/remainder.go
@@ -34,7 +34,12 @@
 //	Remainder(x, 0) = NaN
 //	Remainder(x, ±Inf) = x
 //	Remainder(x, NaN) = NaN
-func Remainder(x, y float64) float64
+func Remainder(x, y float64) float64 {
+	if haveArchRemainder {
+		return archRemainder(x, y)
+	}
+	return remainder(x, y)
+}
 
 func remainder(x, y float64) float64 {
 	const (
diff --git a/src/math/sin.go b/src/math/sin.go
index 3b6dbe3..d95bb54 100644
--- a/src/math/sin.go
+++ b/src/math/sin.go
@@ -114,7 +114,12 @@
 // Special cases are:
 //	Cos(±Inf) = NaN
 //	Cos(NaN) = NaN
-func Cos(x float64) float64
+func Cos(x float64) float64 {
+	if haveArchCos {
+		return archCos(x)
+	}
+	return cos(x)
+}
 
 func cos(x float64) float64 {
 	const (
@@ -175,7 +180,12 @@
 //	Sin(±0) = ±0
 //	Sin(±Inf) = NaN
 //	Sin(NaN) = NaN
-func Sin(x float64) float64
+func Sin(x float64) float64 {
+	if haveArchSin {
+		return archSin(x)
+	}
+	return sin(x)
+}
 
 func sin(x float64) float64 {
 	const (
diff --git a/src/math/sinh.go b/src/math/sinh.go
index 573a37e..9fe9b4e 100644
--- a/src/math/sinh.go
+++ b/src/math/sinh.go
@@ -22,7 +22,12 @@
 //	Sinh(±0) = ±0
 //	Sinh(±Inf) = ±Inf
 //	Sinh(NaN) = NaN
-func Sinh(x float64) float64
+func Sinh(x float64) float64 {
+	if haveArchSinh {
+		return archSinh(x)
+	}
+	return sinh(x)
+}
 
 func sinh(x float64) float64 {
 	// The coefficients are #2029 from Hart & Cheney. (20.36D)
@@ -69,7 +74,12 @@
 //	Cosh(±0) = 1
 //	Cosh(±Inf) = +Inf
 //	Cosh(NaN) = NaN
-func Cosh(x float64) float64
+func Cosh(x float64) float64 {
+	if haveArchCosh {
+		return archCosh(x)
+	}
+	return cosh(x)
+}
 
 func cosh(x float64) float64 {
 	x = Abs(x)
diff --git a/src/math/sqrt.go b/src/math/sqrt.go
index 1077a62..903d57d 100644
--- a/src/math/sqrt.go
+++ b/src/math/sqrt.go
@@ -89,7 +89,12 @@
 //	Sqrt(±0) = ±0
 //	Sqrt(x < 0) = NaN
 //	Sqrt(NaN) = NaN
-func Sqrt(x float64) float64
+func Sqrt(x float64) float64 {
+	if haveArchSqrt {
+		return archSqrt(x)
+	}
+	return sqrt(x)
+}
 
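-// Note: Sqrt is implemented in assembly on some systems.
-// Others have assembly stubs that jump to func sqrt below.
+// Note: Sqrt is implemented in assembly on some systems (as archSqrt).
+// Others call the pure Go func sqrt below directly.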
diff --git a/src/math/sqrt_386.s b/src/math/sqrt_386.s
index 5a5c33a..90aec13 100644
--- a/src/math/sqrt_386.s
+++ b/src/math/sqrt_386.s
@@ -4,8 +4,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB),NOSPLIT,$0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB),NOSPLIT,$0
 	FMOVD   x+0(FP),F0
 	FSQRT
 	FMOVDP  F0,ret+8(FP)
diff --git a/src/math/sqrt_amd64.s b/src/math/sqrt_amd64.s
index 1102903..c3b110e 100644
--- a/src/math/sqrt_amd64.s
+++ b/src/math/sqrt_amd64.s
@@ -4,8 +4,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB), NOSPLIT, $0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB), NOSPLIT, $0
 	XORPS  X0, X0 // break dependency
 	SQRTSD x+0(FP), X0
 	MOVSD  X0, ret+8(FP)
diff --git a/src/math/sqrt_arm.s b/src/math/sqrt_arm.s
index ffc7d10..64792ec 100644
--- a/src/math/sqrt_arm.s
+++ b/src/math/sqrt_arm.s
@@ -4,8 +4,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB),NOSPLIT,$0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB),NOSPLIT,$0
 	MOVB	runtime·goarm(SB), R11
 	CMP	$5, R11
 	BEQ	arm5
diff --git a/src/math/sqrt_arm64.s b/src/math/sqrt_arm64.s
index 3041d25..36ba41a 100644
--- a/src/math/sqrt_arm64.s
+++ b/src/math/sqrt_arm64.s
@@ -4,8 +4,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB),NOSPLIT,$0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB),NOSPLIT,$0
 	FMOVD	x+0(FP), F0
 	FSQRTD	F0, F0
 	FMOVD	F0, ret+8(FP)
diff --git a/src/math/sqrt_asm.go b/src/math/sqrt_asm.go
new file mode 100644
index 0000000..276e290
--- /dev/null
+++ b/src/math/sqrt_asm.go
@@ -0,0 +1,12 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build 386 || amd64 || arm64 || arm || mips || mipsle || ppc64 || ppc64le || s390x || wasm
+// +build 386 amd64 arm64 arm mips mipsle ppc64 ppc64le s390x wasm
+
+package math
+
+const haveArchSqrt = true
+
+func archSqrt(x float64) float64
diff --git a/src/math/sqrt_mipsx.s b/src/math/sqrt_mipsx.s
index a63ea9e..c619c19 100644
--- a/src/math/sqrt_mipsx.s
+++ b/src/math/sqrt_mipsx.s
@@ -6,8 +6,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB),NOSPLIT,$0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB),NOSPLIT,$0
 #ifdef GOMIPS_softfloat
 	JMP ·sqrt(SB)
 #else
diff --git a/src/math/sqrt_noasm.go b/src/math/sqrt_noasm.go
new file mode 100644
index 0000000..87c22d4
--- /dev/null
+++ b/src/math/sqrt_noasm.go
@@ -0,0 +1,14 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !386 && !amd64 && !arm64 && !arm && !mips && !mipsle && !ppc64 && !ppc64le && !s390x && !wasm
+// +build !386,!amd64,!arm64,!arm,!mips,!mipsle,!ppc64,!ppc64le,!s390x,!wasm
+
+package math
+
+const haveArchSqrt = false
+
+func archSqrt(x float64) float64 {
+	panic("not implemented")
+}
diff --git a/src/math/sqrt_ppc64x.s b/src/math/sqrt_ppc64x.s
index 0469f4d..174b63e 100644
--- a/src/math/sqrt_ppc64x.s
+++ b/src/math/sqrt_ppc64x.s
@@ -6,8 +6,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB),NOSPLIT,$0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB),NOSPLIT,$0
 	FMOVD	x+0(FP), F0
 	FSQRT	F0, F0
 	FMOVD	F0, ret+8(FP)
diff --git a/src/math/sqrt_s390x.s b/src/math/sqrt_s390x.s
index 37ca0be..fa31f75 100644
--- a/src/math/sqrt_s390x.s
+++ b/src/math/sqrt_s390x.s
@@ -4,8 +4,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB),NOSPLIT,$0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB),NOSPLIT,$0
 	FMOVD x+0(FP), F1
 	FSQRT F1, F1
 	FMOVD F1, ret+8(FP)
diff --git a/src/math/sqrt_wasm.s b/src/math/sqrt_wasm.s
index cbfe598..fa6799d 100644
--- a/src/math/sqrt_wasm.s
+++ b/src/math/sqrt_wasm.s
@@ -4,7 +4,7 @@
 
 #include "textflag.h"
 
-TEXT ·Sqrt(SB),NOSPLIT,$0
+TEXT ·archSqrt(SB),NOSPLIT,$0
 	Get SP
 	F64Load x+0(FP)
 	F64Sqrt
diff --git a/src/math/stubs.go b/src/math/stubs.go
new file mode 100644
index 0000000..e1345eb
--- /dev/null
+++ b/src/math/stubs.go
@@ -0,0 +1,161 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !s390x
+// +build !s390x
+
+// This is a large group of functions that most architectures don't
+// implement in assembly.
+
+package math
+
+const haveArchAcos = false
+
+func archAcos(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchAcosh = false
+
+func archAcosh(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchAsin = false
+
+func archAsin(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchAsinh = false
+
+func archAsinh(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchAtan = false
+
+func archAtan(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchAtan2 = false
+
+func archAtan2(y, x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchAtanh = false
+
+func archAtanh(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchCbrt = false
+
+func archCbrt(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchCos = false
+
+func archCos(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchCosh = false
+
+func archCosh(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchErf = false
+
+func archErf(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchErfc = false
+
+func archErfc(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchExpm1 = false
+
+func archExpm1(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchFrexp = false
+
+func archFrexp(x float64) (float64, int) {
+	panic("not implemented")
+}
+
+const haveArchLdexp = false
+
+func archLdexp(frac float64, exp int) float64 {
+	panic("not implemented")
+}
+
+const haveArchLog10 = false
+
+func archLog10(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchLog2 = false
+
+func archLog2(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchLog1p = false
+
+func archLog1p(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchMod = false
+
+func archMod(x, y float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchPow = false
+
+func archPow(x, y float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchRemainder = false
+
+func archRemainder(x, y float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchSin = false
+
+func archSin(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchSinh = false
+
+func archSinh(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchTan = false
+
+func archTan(x float64) float64 {
+	panic("not implemented")
+}
+
+const haveArchTanh = false
+
+func archTanh(x float64) float64 {
+	panic("not implemented")
+}
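Reviewer note on the stubs.go pattern above: a minimal, self-contained sketch of how one haveArchX/archX pair is meant to behave on an architecture without an assembly implementation. The names Twice, twice, archTwice and haveArchTwice are hypothetical and exist only for illustration; they are not part of package math. Because the guard is a constant false, the compiler can discard the archTwice branch, so the panicking stub is never reached and the wrapper reduces to a call into pure Go code.

```go
package main

import "fmt"

// haveArchTwice mirrors the haveArchX constants in stubs.go.
// Hypothetical name: nothing called Twice exists in package math.
const haveArchTwice = false

// archTwice stands in for an assembly-backed function. This sketch
// assumes an architecture with no such implementation, so the stub
// only has to exist for the package to compile.
func archTwice(x float64) float64 {
	panic("not implemented")
}

// twice is the pure Go implementation.
func twice(x float64) float64 {
	return 2 * x
}

// Twice is the exported wrapper. The condition is a compile-time
// constant, so the archTwice branch is statically dead here.
func Twice(x float64) float64 {
	if haveArchTwice {
		return archTwice(x)
	}
	return twice(x)
}

func main() {
	fmt.Println(Twice(21)) // prints 42
}
```

On an architecture that does have assembly, a build-tagged file would instead set the constant to true and declare archTwice without a body, as hypot_asm.go and log_asm.go do in this change.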
diff --git a/src/math/stubs_386.s b/src/math/stubs_386.s
deleted file mode 100644
index bccb3ed..0000000
--- a/src/math/stubs_386.s
+++ /dev/null
@@ -1,98 +0,0 @@
-// Copyright 2019 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-#include "textflag.h"
-
-TEXT ·Acos(SB), NOSPLIT, $0
-	JMP ·acos(SB)
-
-TEXT ·Acosh(SB), NOSPLIT, $0
-	JMP ·acosh(SB)
-
-TEXT ·Asin(SB), NOSPLIT, $0
-	JMP ·asin(SB)
-
-TEXT ·Asinh(SB), NOSPLIT, $0
-	JMP ·asinh(SB)
-
-TEXT ·Atan(SB), NOSPLIT, $0
-	JMP ·atan(SB)
-
-TEXT ·Atan2(SB), NOSPLIT, $0
-	JMP ·atan2(SB)
-
-TEXT ·Atanh(SB), NOSPLIT, $0
-	JMP ·atanh(SB)
-
-TEXT ·Cbrt(SB), NOSPLIT, $0
-	JMP ·cbrt(SB)
-
-TEXT ·Cos(SB), NOSPLIT, $0
-	JMP ·cos(SB)
-
-TEXT ·Cosh(SB), NOSPLIT, $0
-	JMP ·cosh(SB)
-
-TEXT ·Erf(SB), NOSPLIT, $0
-	JMP ·erf(SB)
-
-TEXT ·Erfc(SB), NOSPLIT, $0
-	JMP ·erfc(SB)
-
-TEXT ·Exp(SB), NOSPLIT, $0
-	JMP ·exp(SB)
-
-TEXT ·Exp2(SB), NOSPLIT, $0
-	JMP ·exp2(SB)
-
-TEXT ·Expm1(SB), NOSPLIT, $0
-	JMP ·expm1(SB)
-
-TEXT ·Frexp(SB), NOSPLIT, $0
-	JMP ·frexp(SB)
-
-TEXT ·Ldexp(SB), NOSPLIT, $0
-	JMP ·ldexp(SB)
-
-TEXT ·Log10(SB), NOSPLIT, $0
-	JMP ·log10(SB)
-
-TEXT ·Log2(SB), NOSPLIT, $0
-	JMP ·log2(SB)
-
-TEXT ·Log1p(SB), NOSPLIT, $0
-	JMP ·log1p(SB)
-
-TEXT ·Log(SB), NOSPLIT, $0
-	JMP ·log(SB)
-
-TEXT ·Max(SB), NOSPLIT, $0
-	JMP ·max(SB)
-
-TEXT ·Min(SB), NOSPLIT, $0
-	JMP ·min(SB)
-
-TEXT ·Mod(SB), NOSPLIT, $0
-	JMP ·mod(SB)
-
-TEXT ·Modf(SB), NOSPLIT, $0
-	JMP ·modf(SB)
-
-TEXT ·Pow(SB), NOSPLIT, $0
-	JMP ·pow(SB)
-
-TEXT ·Remainder(SB), NOSPLIT, $0
-	JMP ·remainder(SB)
-
-TEXT ·Sin(SB), NOSPLIT, $0
-	JMP ·sin(SB)
-
-TEXT ·Sinh(SB), NOSPLIT, $0
-	JMP ·sinh(SB)
-
-TEXT ·Tan(SB), NOSPLIT, $0
-	JMP ·tan(SB)
-
-TEXT ·Tanh(SB), NOSPLIT, $0
-	JMP ·tanh(SB)
diff --git a/src/math/stubs_amd64.s b/src/math/stubs_amd64.s
deleted file mode 100644
index 31e01fd..0000000
--- a/src/math/stubs_amd64.s
+++ /dev/null
@@ -1,86 +0,0 @@
-// Copyright 2019 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-#include "textflag.h"
-
-TEXT ·Acos(SB), NOSPLIT, $0
-	JMP ·acos(SB)
-
-TEXT ·Acosh(SB), NOSPLIT, $0
-	JMP ·acosh(SB)
-
-TEXT ·Asin(SB), NOSPLIT, $0
-	JMP ·asin(SB)
-
-TEXT ·Asinh(SB), NOSPLIT, $0
-	JMP ·asinh(SB)
-
-TEXT ·Atan(SB), NOSPLIT, $0
-	JMP ·atan(SB)
-
-TEXT ·Atan2(SB), NOSPLIT, $0
-	JMP ·atan2(SB)
-
-TEXT ·Atanh(SB), NOSPLIT, $0
-	JMP ·atanh(SB)
-
-TEXT ·Cbrt(SB), NOSPLIT, $0
-	JMP ·cbrt(SB)
-
-TEXT ·Cos(SB), NOSPLIT, $0
-	JMP ·cos(SB)
-
-TEXT ·Cosh(SB), NOSPLIT, $0
-	JMP ·cosh(SB)
-
-TEXT ·Erf(SB), NOSPLIT, $0
-	JMP ·erf(SB)
-
-TEXT ·Erfc(SB), NOSPLIT, $0
-	JMP ·erfc(SB)
-
-TEXT ·Exp2(SB), NOSPLIT, $0
-	JMP ·exp2(SB)
-
-TEXT ·Expm1(SB), NOSPLIT, $0
-	JMP ·expm1(SB)
-
-TEXT ·Frexp(SB), NOSPLIT, $0
-	JMP ·frexp(SB)
-
-TEXT ·Ldexp(SB), NOSPLIT, $0
-	JMP ·ldexp(SB)
-
-TEXT ·Log10(SB), NOSPLIT, $0
-	JMP ·log10(SB)
-
-TEXT ·Log2(SB), NOSPLIT, $0
-	JMP ·log2(SB)
-
-TEXT ·Log1p(SB), NOSPLIT, $0
-	JMP ·log1p(SB)
-
-TEXT ·Mod(SB), NOSPLIT, $0
-	JMP ·mod(SB)
-
-TEXT ·Modf(SB), NOSPLIT, $0
-	JMP ·modf(SB)
-
-TEXT ·Pow(SB), NOSPLIT, $0
-	JMP ·pow(SB)
-
-TEXT ·Remainder(SB), NOSPLIT, $0
-	JMP ·remainder(SB)
-
-TEXT ·Sin(SB), NOSPLIT, $0
-	JMP ·sin(SB)
-
-TEXT ·Sinh(SB), NOSPLIT, $0
-	JMP ·sinh(SB)
-
-TEXT ·Tan(SB), NOSPLIT, $0
-	JMP ·tan(SB)
-
-TEXT ·Tanh(SB), NOSPLIT, $0
-	JMP ·tanh(SB)
diff --git a/src/math/stubs_arm.s b/src/math/stubs_arm.s
deleted file mode 100644
index 31bf872..0000000
--- a/src/math/stubs_arm.s
+++ /dev/null
@@ -1,110 +0,0 @@
-// Copyright 2019 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-#include "textflag.h"
-
-TEXT ·Acos(SB), NOSPLIT, $0
-	B ·acos(SB)
-
-TEXT ·Acosh(SB), NOSPLIT, $0
-	B ·acosh(SB)
-
-TEXT ·Asin(SB), NOSPLIT, $0
-	B ·asin(SB)
-
-TEXT ·Asinh(SB), NOSPLIT, $0
-	B ·asinh(SB)
-
-TEXT ·Atan(SB), NOSPLIT, $0
-	B ·atan(SB)
-
-TEXT ·Atan2(SB), NOSPLIT, $0
-	B ·atan2(SB)
-
-TEXT ·Atanh(SB), NOSPLIT, $0
-	B ·atanh(SB)
-
-TEXT ·Cbrt(SB), NOSPLIT, $0
-	B ·cbrt(SB)
-
-TEXT ·Cos(SB), NOSPLIT, $0
-	B ·cos(SB)
-
-TEXT ·Cosh(SB), NOSPLIT, $0
-	B ·cosh(SB)
-
-TEXT ·Erf(SB), NOSPLIT, $0
-	B ·erf(SB)
-
-TEXT ·Erfc(SB), NOSPLIT, $0
-	B ·erfc(SB)
-
-TEXT ·Exp2(SB), NOSPLIT, $0
-	B ·exp2(SB)
-
-TEXT ·Exp(SB), NOSPLIT, $0
-	B ·exp(SB)
-
-TEXT ·Expm1(SB), NOSPLIT, $0
-	B ·expm1(SB)
-
-TEXT ·Floor(SB), NOSPLIT, $0
-	B ·floor(SB)
-
-TEXT ·Ceil(SB), NOSPLIT, $0
-	B ·ceil(SB)
-
-TEXT ·Trunc(SB), NOSPLIT, $0
-	B ·trunc(SB)
-
-TEXT ·Frexp(SB), NOSPLIT, $0
-	B ·frexp(SB)
-
-TEXT ·Hypot(SB), NOSPLIT, $0
-	B ·hypot(SB)
-
-TEXT ·Ldexp(SB), NOSPLIT, $0
-	B ·ldexp(SB)
-
-TEXT ·Log10(SB), NOSPLIT, $0
-	B ·log10(SB)
-
-TEXT ·Log2(SB), NOSPLIT, $0
-	B ·log2(SB)
-
-TEXT ·Log1p(SB), NOSPLIT, $0
-	B ·log1p(SB)
-
-TEXT ·Log(SB), NOSPLIT, $0
-	B ·log(SB)
-
-TEXT ·Max(SB), NOSPLIT, $0
-	B ·max(SB)
-
-TEXT ·Min(SB), NOSPLIT, $0
-	B ·min(SB)
-
-TEXT ·Mod(SB), NOSPLIT, $0
-	B ·mod(SB)
-
-TEXT ·Modf(SB), NOSPLIT, $0
-	B ·modf(SB)
-
-TEXT ·Pow(SB), NOSPLIT, $0
-	JMP ·pow(SB)
-
-TEXT ·Remainder(SB), NOSPLIT, $0
-	B ·remainder(SB)
-
-TEXT ·Sin(SB), NOSPLIT, $0
-	B ·sin(SB)
-
-TEXT ·Sinh(SB), NOSPLIT, $0
-	B ·sinh(SB)
-
-TEXT ·Tan(SB), NOSPLIT, $0
-	B ·tan(SB)
-
-TEXT ·Tanh(SB), NOSPLIT, $0
-	B ·tanh(SB)
diff --git a/src/math/stubs_arm64.s b/src/math/stubs_arm64.s
deleted file mode 100644
index 24564d0..0000000
--- a/src/math/stubs_arm64.s
+++ /dev/null
@@ -1,88 +0,0 @@
-// Copyright 2014 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-// +build arm64
-
-#include "textflag.h"
-
-TEXT ·Asin(SB), NOSPLIT, $0
-	B ·asin(SB)
-
-TEXT ·Acos(SB), NOSPLIT, $0
-	B ·acos(SB)
-
-TEXT ·Asinh(SB), NOSPLIT, $0
-	B ·asinh(SB)
-
-TEXT ·Acosh(SB), NOSPLIT, $0
-	B ·acosh(SB)
-
-TEXT ·Atan2(SB), NOSPLIT, $0
-	B ·atan2(SB)
-
-TEXT ·Atan(SB), NOSPLIT, $0
-	B ·atan(SB)
-
-TEXT ·Atanh(SB), NOSPLIT, $0
-	B ·atanh(SB)
-
-TEXT ·Erf(SB), NOSPLIT, $0
-	B ·erf(SB)
-
-TEXT ·Erfc(SB), NOSPLIT, $0
-	B ·erfc(SB)
-
-TEXT ·Cbrt(SB), NOSPLIT, $0
-	B ·cbrt(SB)
-
-TEXT ·Cosh(SB), NOSPLIT, $0
-	B ·cosh(SB)
-
-TEXT ·Expm1(SB), NOSPLIT, $0
-	B ·expm1(SB)
-
-TEXT ·Frexp(SB), NOSPLIT, $0
-	B ·frexp(SB)
-
-TEXT ·Hypot(SB), NOSPLIT, $0
-	B ·hypot(SB)
-
-TEXT ·Ldexp(SB), NOSPLIT, $0
-	B ·ldexp(SB)
-
-TEXT ·Log10(SB), NOSPLIT, $0
-	B ·log10(SB)
-
-TEXT ·Log2(SB), NOSPLIT, $0
-	B ·log2(SB)
-
-TEXT ·Log1p(SB), NOSPLIT, $0
-	B ·log1p(SB)
-
-TEXT ·Log(SB), NOSPLIT, $0
-	B ·log(SB)
-
-TEXT ·Mod(SB), NOSPLIT, $0
-	B ·mod(SB)
-
-TEXT ·Remainder(SB), NOSPLIT, $0
-	B ·remainder(SB)
-
-TEXT ·Sin(SB), NOSPLIT, $0
-	B ·sin(SB)
-
-TEXT ·Sinh(SB), NOSPLIT, $0
-	B ·sinh(SB)
-
-TEXT ·Cos(SB), NOSPLIT, $0
-	B ·cos(SB)
-
-TEXT ·Tan(SB), NOSPLIT, $0
-	B ·tan(SB)
-
-TEXT ·Tanh(SB), NOSPLIT, $0
-	B ·tanh(SB)
-
-TEXT ·Pow(SB), NOSPLIT, $0
-	B ·pow(SB)
diff --git a/src/math/stubs_mips64x.s b/src/math/stubs_mips64x.s
deleted file mode 100644
index 187b069..0000000
--- a/src/math/stubs_mips64x.s
+++ /dev/null
@@ -1,115 +0,0 @@
-// Copyright 2014 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-// +build mips64 mips64le
-
-#include "textflag.h"
-
-TEXT ·Asin(SB), NOSPLIT, $0
-	JMP ·asin(SB)
-
-TEXT ·Acos(SB), NOSPLIT, $0
-	JMP ·acos(SB)
-
-TEXT ·Asinh(SB), NOSPLIT, $0
-	JMP ·asinh(SB)
-
-TEXT ·Acosh(SB), NOSPLIT, $0
-	JMP ·acosh(SB)
-
-TEXT ·Atan2(SB), NOSPLIT, $0
-	JMP ·atan2(SB)
-
-TEXT ·Atan(SB), NOSPLIT, $0
-	JMP ·atan(SB)
-
-TEXT ·Atanh(SB), NOSPLIT, $0
-	JMP ·atanh(SB)
-
-TEXT ·Min(SB), NOSPLIT, $0
-	JMP ·min(SB)
-
-TEXT ·Max(SB), NOSPLIT, $0
-	JMP ·max(SB)
-
-TEXT ·Erf(SB), NOSPLIT, $0
-	JMP ·erf(SB)
-
-TEXT ·Erfc(SB), NOSPLIT, $0
-	JMP ·erfc(SB)
-
-TEXT ·Exp2(SB), NOSPLIT, $0
-	JMP ·exp2(SB)
-
-TEXT ·Expm1(SB), NOSPLIT, $0
-	JMP ·expm1(SB)
-
-TEXT ·Exp(SB), NOSPLIT, $0
-	JMP ·exp(SB)
-
-TEXT ·Floor(SB), NOSPLIT, $0
-	JMP ·floor(SB)
-
-TEXT ·Ceil(SB), NOSPLIT, $0
-	JMP ·ceil(SB)
-
-TEXT ·Trunc(SB), NOSPLIT, $0
-	JMP ·trunc(SB)
-
-TEXT ·Frexp(SB), NOSPLIT, $0
-	JMP ·frexp(SB)
-
-TEXT ·Hypot(SB), NOSPLIT, $0
-	JMP ·hypot(SB)
-
-TEXT ·Ldexp(SB), NOSPLIT, $0
-	JMP ·ldexp(SB)
-
-TEXT ·Log10(SB), NOSPLIT, $0
-	JMP ·log10(SB)
-
-TEXT ·Log2(SB), NOSPLIT, $0
-	JMP ·log2(SB)
-
-TEXT ·Log1p(SB), NOSPLIT, $0
-	JMP ·log1p(SB)
-
-TEXT ·Log(SB), NOSPLIT, $0
-	JMP ·log(SB)
-
-TEXT ·Modf(SB), NOSPLIT, $0
-	JMP ·modf(SB)
-
-TEXT ·Mod(SB), NOSPLIT, $0
-	JMP ·mod(SB)
-
-TEXT ·Remainder(SB), NOSPLIT, $0
-	JMP ·remainder(SB)
-
-TEXT ·Sin(SB), NOSPLIT, $0
-	JMP ·sin(SB)
-
-TEXT ·Sinh(SB), NOSPLIT, $0
-	JMP ·sinh(SB)
-
-TEXT ·Cos(SB), NOSPLIT, $0
-	JMP ·cos(SB)
-
-TEXT ·Cosh(SB), NOSPLIT, $0
-	JMP ·cosh(SB)
-
-TEXT ·Sqrt(SB), NOSPLIT, $0
-	JMP ·sqrt(SB)
-
-TEXT ·Tan(SB), NOSPLIT, $0
-	JMP ·tan(SB)
-
-TEXT ·Tanh(SB), NOSPLIT, $0
-	JMP ·tanh(SB)
-
-TEXT ·Cbrt(SB), NOSPLIT, $0
-	JMP ·cbrt(SB)
-
-TEXT ·Pow(SB), NOSPLIT, $0
-	JMP ·pow(SB)
diff --git a/src/math/stubs_mipsx.s b/src/math/stubs_mipsx.s
deleted file mode 100644
index 4b82449..0000000
--- a/src/math/stubs_mipsx.s
+++ /dev/null
@@ -1,113 +0,0 @@
-// Copyright 2016 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-// +build mips mipsle
-
-#include "textflag.h"
-
-TEXT ·Asin(SB), NOSPLIT, $0
-	JMP ·asin(SB)
-
-TEXT ·Acos(SB), NOSPLIT, $0
-	JMP ·acos(SB)
-
-TEXT ·Asinh(SB), NOSPLIT, $0
-	JMP ·asinh(SB)
-
-TEXT ·Acosh(SB), NOSPLIT, $0
-	JMP ·acosh(SB)
-
-TEXT ·Atan2(SB), NOSPLIT, $0
-	JMP ·atan2(SB)
-
-TEXT ·Atan(SB), NOSPLIT, $0
-	JMP ·atan(SB)
-
-TEXT ·Atanh(SB), NOSPLIT, $0
-	JMP ·atanh(SB)
-
-TEXT ·Min(SB), NOSPLIT, $0
-	JMP ·min(SB)
-
-TEXT ·Max(SB), NOSPLIT, $0
-	JMP ·max(SB)
-
-TEXT ·Erf(SB), NOSPLIT, $0
-	JMP ·erf(SB)
-
-TEXT ·Erfc(SB), NOSPLIT, $0
-	JMP ·erfc(SB)
-
-TEXT ·Exp2(SB), NOSPLIT, $0
-	JMP ·exp2(SB)
-
-TEXT ·Expm1(SB), NOSPLIT, $0
-	JMP ·expm1(SB)
-
-TEXT ·Exp(SB), NOSPLIT, $0
-	JMP ·exp(SB)
-
-TEXT ·Floor(SB), NOSPLIT, $0
-	JMP ·floor(SB)
-
-TEXT ·Ceil(SB), NOSPLIT, $0
-	JMP ·ceil(SB)
-
-TEXT ·Trunc(SB), NOSPLIT, $0
-	JMP ·trunc(SB)
-
-TEXT ·Frexp(SB), NOSPLIT, $0
-	JMP ·frexp(SB)
-
-TEXT ·Hypot(SB), NOSPLIT, $0
-	JMP ·hypot(SB)
-
-TEXT ·Ldexp(SB), NOSPLIT, $0
-	JMP ·ldexp(SB)
-
-TEXT ·Log10(SB), NOSPLIT, $0
-	JMP ·log10(SB)
-
-TEXT ·Log2(SB), NOSPLIT, $0
-	JMP ·log2(SB)
-
-TEXT ·Log1p(SB), NOSPLIT, $0
-	JMP ·log1p(SB)
-
-TEXT ·Log(SB), NOSPLIT, $0
-	JMP ·log(SB)
-
-TEXT ·Modf(SB), NOSPLIT, $0
-	JMP ·modf(SB)
-
-TEXT ·Mod(SB), NOSPLIT, $0
-	JMP ·mod(SB)
-
-TEXT ·Remainder(SB), NOSPLIT, $0
-	JMP ·remainder(SB)
-
-TEXT ·Sin(SB), NOSPLIT, $0
-	JMP ·sin(SB)
-
-TEXT ·Sinh(SB), NOSPLIT, $0
-	JMP ·sinh(SB)
-
-TEXT ·Cos(SB), NOSPLIT, $0
-	JMP ·cos(SB)
-
-TEXT ·Cosh(SB), NOSPLIT, $0
-	JMP ·cosh(SB)
-
-TEXT ·Tan(SB), NOSPLIT, $0
-	JMP ·tan(SB)
-
-TEXT ·Tanh(SB), NOSPLIT, $0
-	JMP ·tanh(SB)
-
-TEXT ·Cbrt(SB), NOSPLIT, $0
-	JMP ·cbrt(SB)
-
-TEXT ·Pow(SB), NOSPLIT, $0
-	JMP ·pow(SB)
-
diff --git a/src/math/stubs_ppc64x.s b/src/math/stubs_ppc64x.s
deleted file mode 100644
index e7be07e..0000000
--- a/src/math/stubs_ppc64x.s
+++ /dev/null
@@ -1,101 +0,0 @@
-// Copyright 2014 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-// +build ppc64 ppc64le
-
-#include "textflag.h"
-
-TEXT ·Asin(SB), NOSPLIT, $0
-	BR ·asin(SB)
-
-TEXT ·Acos(SB), NOSPLIT, $0
-	BR ·acos(SB)
-
-TEXT ·Asinh(SB), NOSPLIT, $0
-	BR ·asinh(SB)
-
-TEXT ·Acosh(SB), NOSPLIT, $0
-	BR ·acosh(SB)
-
-TEXT ·Atan2(SB), NOSPLIT, $0
-	BR ·atan2(SB)
-
-TEXT ·Atan(SB), NOSPLIT, $0
-	BR ·atan(SB)
-
-TEXT ·Atanh(SB), NOSPLIT, $0
-	BR ·atanh(SB)
-
-TEXT ·Min(SB), NOSPLIT, $0
-	BR ·min(SB)
-
-TEXT ·Max(SB), NOSPLIT, $0
-	BR ·max(SB)
-
-TEXT ·Erf(SB), NOSPLIT, $0
-	BR ·erf(SB)
-
-TEXT ·Erfc(SB), NOSPLIT, $0
-	BR ·erfc(SB)
-
-TEXT ·Exp2(SB), NOSPLIT, $0
-	BR ·exp2(SB)
-
-TEXT ·Expm1(SB), NOSPLIT, $0
-	BR ·expm1(SB)
-
-TEXT ·Exp(SB), NOSPLIT, $0
-	BR ·exp(SB)
-
-TEXT ·Frexp(SB), NOSPLIT, $0
-	BR ·frexp(SB)
-
-TEXT ·Hypot(SB), NOSPLIT, $0
-	BR ·hypot(SB)
-
-TEXT ·Ldexp(SB), NOSPLIT, $0
-	BR ·ldexp(SB)
-
-TEXT ·Log10(SB), NOSPLIT, $0
-	BR ·log10(SB)
-
-TEXT ·Log2(SB), NOSPLIT, $0
-	BR ·log2(SB)
-
-TEXT ·Log1p(SB), NOSPLIT, $0
-	BR ·log1p(SB)
-
-TEXT ·Log(SB), NOSPLIT, $0
-	BR ·log(SB)
-
-TEXT ·Mod(SB), NOSPLIT, $0
-	BR ·mod(SB)
-
-TEXT ·Remainder(SB), NOSPLIT, $0
-	BR ·remainder(SB)
-
-TEXT ·Sin(SB), NOSPLIT, $0
-	BR ·sin(SB)
-
-TEXT ·Sinh(SB), NOSPLIT, $0
-	BR ·sinh(SB)
-
-TEXT ·Cos(SB), NOSPLIT, $0
-	BR ·cos(SB)
-
-TEXT ·Cosh(SB), NOSPLIT, $0
-	BR ·cosh(SB)
-
-TEXT ·Tan(SB), NOSPLIT, $0
-	BR ·tan(SB)
-
-TEXT ·Tanh(SB), NOSPLIT, $0
-	BR ·tanh(SB)
-
-TEXT ·Cbrt(SB), NOSPLIT, $0
-	BR ·cbrt(SB)
-
-TEXT ·Pow(SB), NOSPLIT, $0
-	BR ·pow(SB)
-
diff --git a/src/math/stubs_riscv64.s b/src/math/stubs_riscv64.s
deleted file mode 100644
index b36efb8..0000000
--- a/src/math/stubs_riscv64.s
+++ /dev/null
@@ -1,104 +0,0 @@
-// Copyright 2016 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-#include "textflag.h"
-
-TEXT ·Asin(SB),NOSPLIT,$0
-	JMP ·asin(SB)
-
-TEXT ·Acos(SB),NOSPLIT,$0
-	JMP ·acos(SB)
-
-TEXT ·Asinh(SB),NOSPLIT,$0
-        JMP ·asinh(SB)
-
-TEXT ·Acosh(SB),NOSPLIT,$0
-        JMP ·acosh(SB)
-
-TEXT ·Atan2(SB),NOSPLIT,$0
-	JMP ·atan2(SB)
-
-TEXT ·Atan(SB),NOSPLIT,$0
-	JMP ·atan(SB)
-
-TEXT ·Atanh(SB),NOSPLIT,$0
-	JMP ·atanh(SB)
-
-TEXT ·Erf(SB),NOSPLIT,$0
-	JMP ·erf(SB)
-
-TEXT ·Erfc(SB),NOSPLIT,$0
-	JMP ·erfc(SB)
-
-TEXT ·Exp2(SB),NOSPLIT,$0
-	JMP ·exp2(SB)
-
-TEXT ·Expm1(SB),NOSPLIT,$0
-	JMP ·expm1(SB)
-
-TEXT ·Exp(SB),NOSPLIT,$0
-	JMP ·exp(SB)
-
-TEXT ·Floor(SB),NOSPLIT,$0
-	JMP ·floor(SB)
-
-TEXT ·Ceil(SB),NOSPLIT,$0
-	JMP ·ceil(SB)
-
-TEXT ·Trunc(SB),NOSPLIT,$0
-	JMP ·trunc(SB)
-
-TEXT ·Frexp(SB),NOSPLIT,$0
-	JMP ·frexp(SB)
-
-TEXT ·Hypot(SB),NOSPLIT,$0
-	JMP ·hypot(SB)
-
-TEXT ·Ldexp(SB),NOSPLIT,$0
-	JMP ·ldexp(SB)
-
-TEXT ·Log10(SB),NOSPLIT,$0
-	JMP ·log10(SB)
-
-TEXT ·Log2(SB),NOSPLIT,$0
-	JMP ·log2(SB)
-
-TEXT ·Log1p(SB),NOSPLIT,$0
-	JMP ·log1p(SB)
-
-TEXT ·Log(SB),NOSPLIT,$0
-	JMP ·log(SB)
-
-TEXT ·Modf(SB),NOSPLIT,$0
-	JMP ·modf(SB)
-
-TEXT ·Mod(SB),NOSPLIT,$0
-	JMP ·mod(SB)
-
-TEXT ·Remainder(SB),NOSPLIT,$0
-	JMP ·remainder(SB)
-
-TEXT ·Sin(SB),NOSPLIT,$0
-	JMP ·sin(SB)
-
-TEXT ·Sinh(SB),NOSPLIT,$0
-	JMP ·sinh(SB)
-
-TEXT ·Cos(SB),NOSPLIT,$0
-	JMP ·cos(SB)
-
-TEXT ·Cosh(SB),NOSPLIT,$0
-	JMP ·cosh(SB)
-
-TEXT ·Tan(SB),NOSPLIT,$0
-	JMP ·tan(SB)
-
-TEXT ·Tanh(SB),NOSPLIT,$0
-	JMP ·tanh(SB)
-
-TEXT ·Cbrt(SB),NOSPLIT,$0
-	JMP ·cbrt(SB)
-
-TEXT ·Pow(SB),NOSPLIT,$0
-	JMP ·pow(SB)
diff --git a/src/math/stubs_s390x.s b/src/math/stubs_s390x.s
index d0087ab..7400179 100644
--- a/src/math/stubs_s390x.s
+++ b/src/math/stubs_s390x.s
@@ -4,31 +4,7 @@
 
 #include "textflag.h"
 
-TEXT ·Exp2(SB), NOSPLIT, $0
-	BR ·exp2(SB)
-
-TEXT ·Frexp(SB), NOSPLIT, $0
-	BR ·frexp(SB)
-
-TEXT ·Hypot(SB), NOSPLIT, $0
-	BR ·hypot(SB)
-
-TEXT ·Ldexp(SB), NOSPLIT, $0
-	BR ·ldexp(SB)
-
-TEXT ·Log2(SB), NOSPLIT, $0
-	BR ·log2(SB)
-
-TEXT ·Modf(SB), NOSPLIT, $0
-	BR ·modf(SB)
-
-TEXT ·Mod(SB), NOSPLIT, $0
-	BR ·mod(SB)
-
-TEXT ·Remainder(SB), NOSPLIT, $0
-	BR ·remainder(SB)
-
-TEXT ·Log10(SB), NOSPLIT, $0
+TEXT ·archLog10(SB), NOSPLIT, $0
 	MOVD ·log10vectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -49,7 +25,7 @@
 GLOBL ·log10vectorfacility+0x00(SB), NOPTR, $8
 DATA ·log10vectorfacility+0x00(SB)/8, $·log10TrampolineSetup(SB)
 
-TEXT ·Cos(SB), NOSPLIT, $0
+TEXT ·archCos(SB), NOSPLIT, $0
 	MOVD ·cosvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -70,7 +46,7 @@
 GLOBL ·cosvectorfacility+0x00(SB), NOPTR, $8
 DATA ·cosvectorfacility+0x00(SB)/8, $·cosTrampolineSetup(SB)
 
-TEXT ·Cosh(SB), NOSPLIT, $0
+TEXT ·archCosh(SB), NOSPLIT, $0
 	MOVD ·coshvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -91,7 +67,7 @@
 GLOBL ·coshvectorfacility+0x00(SB), NOPTR, $8
 DATA ·coshvectorfacility+0x00(SB)/8, $·coshTrampolineSetup(SB)
 
-TEXT ·Sin(SB), NOSPLIT, $0
+TEXT ·archSin(SB), NOSPLIT, $0
 	MOVD ·sinvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -112,7 +88,7 @@
 GLOBL ·sinvectorfacility+0x00(SB), NOPTR, $8
 DATA ·sinvectorfacility+0x00(SB)/8, $·sinTrampolineSetup(SB)
 
-TEXT ·Sinh(SB), NOSPLIT, $0
+TEXT ·archSinh(SB), NOSPLIT, $0
 	MOVD ·sinhvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -133,7 +109,7 @@
 GLOBL ·sinhvectorfacility+0x00(SB), NOPTR, $8
 DATA ·sinhvectorfacility+0x00(SB)/8, $·sinhTrampolineSetup(SB)
 
-TEXT ·Tanh(SB), NOSPLIT, $0
+TEXT ·archTanh(SB), NOSPLIT, $0
 	MOVD ·tanhvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -154,7 +130,7 @@
 GLOBL ·tanhvectorfacility+0x00(SB), NOPTR, $8
 DATA ·tanhvectorfacility+0x00(SB)/8, $·tanhTrampolineSetup(SB)
 
-TEXT ·Log1p(SB), NOSPLIT, $0
+TEXT ·archLog1p(SB), NOSPLIT, $0
 	MOVD ·log1pvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -175,7 +151,7 @@
 GLOBL ·log1pvectorfacility+0x00(SB), NOPTR, $8
 DATA ·log1pvectorfacility+0x00(SB)/8, $·log1pTrampolineSetup(SB)
 
-TEXT ·Atanh(SB), NOSPLIT, $0
+TEXT ·archAtanh(SB), NOSPLIT, $0
 	MOVD ·atanhvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -196,7 +172,7 @@
 GLOBL ·atanhvectorfacility+0x00(SB), NOPTR, $8
 DATA ·atanhvectorfacility+0x00(SB)/8, $·atanhTrampolineSetup(SB)
 
-TEXT ·Acos(SB), NOSPLIT, $0
+TEXT ·archAcos(SB), NOSPLIT, $0
 	MOVD ·acosvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -217,7 +193,7 @@
 GLOBL ·acosvectorfacility+0x00(SB), NOPTR, $8
 DATA ·acosvectorfacility+0x00(SB)/8, $·acosTrampolineSetup(SB)
 
-TEXT ·Asin(SB), NOSPLIT, $0
+TEXT ·archAsin(SB), NOSPLIT, $0
 	MOVD ·asinvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -238,7 +214,7 @@
 GLOBL ·asinvectorfacility+0x00(SB), NOPTR, $8
 DATA ·asinvectorfacility+0x00(SB)/8, $·asinTrampolineSetup(SB)
 
-TEXT ·Asinh(SB), NOSPLIT, $0
+TEXT ·archAsinh(SB), NOSPLIT, $0
 	MOVD ·asinhvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -259,7 +235,7 @@
 GLOBL ·asinhvectorfacility+0x00(SB), NOPTR, $8
 DATA ·asinhvectorfacility+0x00(SB)/8, $·asinhTrampolineSetup(SB)
 
-TEXT ·Acosh(SB), NOSPLIT, $0
+TEXT ·archAcosh(SB), NOSPLIT, $0
 	MOVD ·acoshvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -280,7 +256,7 @@
 GLOBL ·acoshvectorfacility+0x00(SB), NOPTR, $8
 DATA ·acoshvectorfacility+0x00(SB)/8, $·acoshTrampolineSetup(SB)
 
-TEXT ·Erf(SB), NOSPLIT, $0
+TEXT ·archErf(SB), NOSPLIT, $0
 	MOVD ·erfvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -301,7 +277,7 @@
 GLOBL ·erfvectorfacility+0x00(SB), NOPTR, $8
 DATA ·erfvectorfacility+0x00(SB)/8, $·erfTrampolineSetup(SB)
 
-TEXT ·Erfc(SB), NOSPLIT, $0
+TEXT ·archErfc(SB), NOSPLIT, $0
 	MOVD ·erfcvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -322,7 +298,7 @@
 GLOBL ·erfcvectorfacility+0x00(SB), NOPTR, $8
 DATA ·erfcvectorfacility+0x00(SB)/8, $·erfcTrampolineSetup(SB)
 
-TEXT ·Atan(SB), NOSPLIT, $0
+TEXT ·archAtan(SB), NOSPLIT, $0
 	MOVD ·atanvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -343,7 +319,7 @@
 GLOBL ·atanvectorfacility+0x00(SB), NOPTR, $8
 DATA ·atanvectorfacility+0x00(SB)/8, $·atanTrampolineSetup(SB)
 
-TEXT ·Atan2(SB), NOSPLIT, $0
+TEXT ·archAtan2(SB), NOSPLIT, $0
 	MOVD ·atan2vectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -364,7 +340,7 @@
 GLOBL ·atan2vectorfacility+0x00(SB), NOPTR, $8
 DATA ·atan2vectorfacility+0x00(SB)/8, $·atan2TrampolineSetup(SB)
 
-TEXT ·Cbrt(SB), NOSPLIT, $0
+TEXT ·archCbrt(SB), NOSPLIT, $0
 	MOVD ·cbrtvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -385,7 +361,7 @@
 GLOBL ·cbrtvectorfacility+0x00(SB), NOPTR, $8
 DATA ·cbrtvectorfacility+0x00(SB)/8, $·cbrtTrampolineSetup(SB)
 
-TEXT ·Log(SB), NOSPLIT, $0
+TEXT ·archLog(SB), NOSPLIT, $0
 	MOVD ·logvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -406,7 +382,7 @@
 GLOBL ·logvectorfacility+0x00(SB), NOPTR, $8
 DATA ·logvectorfacility+0x00(SB)/8, $·logTrampolineSetup(SB)
 
-TEXT ·Tan(SB), NOSPLIT, $0
+TEXT ·archTan(SB), NOSPLIT, $0
 	MOVD ·tanvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -427,7 +403,7 @@
 GLOBL ·tanvectorfacility+0x00(SB), NOPTR, $8
 DATA ·tanvectorfacility+0x00(SB)/8, $·tanTrampolineSetup(SB)
 
-TEXT ·Exp(SB), NOSPLIT, $0
+TEXT ·archExp(SB), NOSPLIT, $0
 	MOVD ·expvectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -448,7 +424,7 @@
 GLOBL ·expvectorfacility+0x00(SB), NOPTR, $8
 DATA ·expvectorfacility+0x00(SB)/8, $·expTrampolineSetup(SB)
 
-TEXT ·Expm1(SB), NOSPLIT, $0
+TEXT ·archExpm1(SB), NOSPLIT, $0
 	MOVD ·expm1vectorfacility+0x00(SB), R1
 	BR   (R1)
 
@@ -469,7 +445,7 @@
 GLOBL ·expm1vectorfacility+0x00(SB), NOPTR, $8
 DATA ·expm1vectorfacility+0x00(SB)/8, $·expm1TrampolineSetup(SB)
 
-TEXT ·Pow(SB), NOSPLIT, $0
+TEXT ·archPow(SB), NOSPLIT, $0
 	MOVD ·powvectorfacility+0x00(SB), R1
 	BR   (R1)
 
diff --git a/src/math/stubs_wasm.s b/src/math/stubs_wasm.s
deleted file mode 100644
index c97a2d7..0000000
--- a/src/math/stubs_wasm.s
+++ /dev/null
@@ -1,101 +0,0 @@
-// Copyright 2018 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-#include "textflag.h"
-
-TEXT ·Asin(SB), NOSPLIT, $0
-	JMP ·asin(SB)
-
-TEXT ·Asinh(SB), NOSPLIT, $0
-	JMP ·asinh(SB)
-
-TEXT ·Acos(SB), NOSPLIT, $0
-	JMP ·acos(SB)
-
-TEXT ·Acosh(SB), NOSPLIT, $0
-	JMP ·acosh(SB)
-
-TEXT ·Atan(SB), NOSPLIT, $0
-	JMP ·atan(SB)
-
-TEXT ·Atanh(SB), NOSPLIT, $0
-	JMP ·atanh(SB)
-
-TEXT ·Atan2(SB), NOSPLIT, $0
-	JMP ·atan2(SB)
-
-TEXT ·Cbrt(SB), NOSPLIT, $0
-	JMP ·cbrt(SB)
-
-TEXT ·Cos(SB), NOSPLIT, $0
-	JMP ·cos(SB)
-
-TEXT ·Cosh(SB), NOSPLIT, $0
-	JMP ·cosh(SB)
-
-TEXT ·Erf(SB), NOSPLIT, $0
-	JMP ·erf(SB)
-
-TEXT ·Erfc(SB), NOSPLIT, $0
-	JMP ·erfc(SB)
-
-TEXT ·Exp(SB), NOSPLIT, $0
-	JMP ·exp(SB)
-
-TEXT ·Expm1(SB), NOSPLIT, $0
-	JMP ·expm1(SB)
-
-TEXT ·Exp2(SB), NOSPLIT, $0
-	JMP ·exp2(SB)
-
-TEXT ·Frexp(SB), NOSPLIT, $0
-	JMP ·frexp(SB)
-
-TEXT ·Hypot(SB), NOSPLIT, $0
-	JMP ·hypot(SB)
-
-TEXT ·Ldexp(SB), NOSPLIT, $0
-	JMP ·ldexp(SB)
-
-TEXT ·Log(SB), NOSPLIT, $0
-	JMP ·log(SB)
-
-TEXT ·Log1p(SB), NOSPLIT, $0
-	JMP ·log1p(SB)
-
-TEXT ·Log10(SB), NOSPLIT, $0
-	JMP ·log10(SB)
-
-TEXT ·Log2(SB), NOSPLIT, $0
-	JMP ·log2(SB)
-
-TEXT ·Max(SB), NOSPLIT, $0
-	JMP ·max(SB)
-
-TEXT ·Min(SB), NOSPLIT, $0
-	JMP ·min(SB)
-
-TEXT ·Mod(SB), NOSPLIT, $0
-	JMP ·mod(SB)
-
-TEXT ·Modf(SB), NOSPLIT, $0
-	JMP ·modf(SB)
-
-TEXT ·Pow(SB), NOSPLIT, $0
-	JMP ·pow(SB)
-
-TEXT ·Remainder(SB), NOSPLIT, $0
-	JMP ·remainder(SB)
-
-TEXT ·Sin(SB), NOSPLIT, $0
-	JMP ·sin(SB)
-
-TEXT ·Sinh(SB), NOSPLIT, $0
-	JMP ·sinh(SB)
-
-TEXT ·Tan(SB), NOSPLIT, $0
-	JMP ·tan(SB)
-
-TEXT ·Tanh(SB), NOSPLIT, $0
-	JMP ·tanh(SB)
diff --git a/src/math/tan.go b/src/math/tan.go
index 49b1239..a25417f 100644
--- a/src/math/tan.go
+++ b/src/math/tan.go
@@ -79,7 +79,12 @@
 //	Tan(±0) = ±0
 //	Tan(±Inf) = NaN
 //	Tan(NaN) = NaN
-func Tan(x float64) float64
+func Tan(x float64) float64 {
+	if haveArchTan {
+		return archTan(x)
+	}
+	return tan(x)
+}
 
 func tan(x float64) float64 {
 	const (
diff --git a/src/math/tanh.go b/src/math/tanh.go
index 0b7fb7f..a825678 100644
--- a/src/math/tanh.go
+++ b/src/math/tanh.go
@@ -71,7 +71,12 @@
 //	Tanh(±0) = ±0
 //	Tanh(±Inf) = ±1
 //	Tanh(NaN) = NaN
-func Tanh(x float64) float64
+func Tanh(x float64) float64 {
+	if haveArchTanh {
+		return archTanh(x)
+	}
+	return tanh(x)
+}
 
 func tanh(x float64) float64 {
 	const MAXLOG = 8.8029691931113054295988e+01 // log(2**127)

To view, visit change 310331. To unsubscribe, or for help writing mail filters, visit settings.

Austin Clements (Gerrit)

Apr 15, 2021, 12:13:09 AM4/15/21

to Austin Clements, goph...@pubsubhelper.golang.org, Cherry Zhang, Robert Griesemer, Michael Knyszek, golang-co...@googlegroups.com

Attention is currently required from: Michael Knyszek, Robert Griesemer, Cherry Zhang.

Patch set 1:Run-TryBot +1

View Change

1 comment:

Patchset:
- Patch Set #1:
  TRY=386 amd64 arm arm64 mips64 mips64le mips mipsle ppc64 ppc64le s390x wasm

Austin Clements (Gerrit)

Apr 15, 2021, 12:33:10 AM4/15/21

to Austin Clements, goph...@pubsubhelper.golang.org, golang-co...@googlegroups.com

Attention is currently required from: Michael Knyszek, Robert Griesemer, Cherry Zhang.

Austin Clements uploaded patch set #2 to this change.

View Change

M src/math/dim_riscv64.s
M src/math/sqrt_riscv64.s
M src/math/sqrt_s390x.s
M src/math/sqrt_wasm.s
A src/math/stubs.go
D src/math/stubs_386.s
D src/math/stubs_amd64.s
D src/math/stubs_arm.s
D src/math/stubs_arm64.s
D src/math/stubs_mips64x.s
D src/math/stubs_mipsx.s
D src/math/stubs_ppc64x.s
D src/math/stubs_riscv64.s
M src/math/stubs_s390x.s
D src/math/stubs_wasm.s
M src/math/tan.go
M src/math/tanh.go

82 files changed, 861 insertions(+), 1,129 deletions(-)

Josh Bleecher Snyder (Gerrit)

Apr 15, 2021, 1:22:48 AM4/15/21

to Austin Clements, goph...@pubsubhelper.golang.org, Josh Bleecher Snyder, Go Bot, Cherry Zhang, Robert Griesemer, Michael Knyszek, golang-co...@googlegroups.com

Attention is currently required from: Austin Clements, Michael Knyszek, Robert Griesemer, Cherry Zhang.

Patch set 2:Code-Review +1

View Change

2 comments:

Patchset:
- Patch Set #2:
  nice idea. +1 instead of +2 because it’s too large to review on my phone. :)
File src/math/asin.go:
- Patch Set #2, Line 22: }
  I’m AFK now but if recollection serves, the inliner will be more generous if you spell this:
```
if have {
  return arch()
} else {
  return purego()
}
```
  because it gets analyzed in an early phase and the if-else form can be trivially trimmed.
  Might not matter in practice.

Austin Clements (Gerrit)

Apr 15, 2021, 8:23:59 AM4/15/21

to Austin Clements, goph...@pubsubhelper.golang.org, Go Bot, Josh Bleecher Snyder, Cherry Zhang, Robert Griesemer, Michael Knyszek, golang-co...@googlegroups.com

Attention is currently required from: Michael Knyszek, Robert Griesemer, Josh Bleecher Snyder, Cherry Zhang.

View Change

3 comments:

Patchset:
- Patch Set #2:
  nice idea. +1 instead of +2 because it’s too large to review on my phone. […]
  Thanks :)
- Patch Set #2:
  I'm running the math benchmarks now and will post the results once I have them.
File src/math/asin.go:
- Patch Set #2, Line 22: }
  I’m AFK now but if recollection serves, the inliner will be more generous if you spell this: […]
  I just checked this for Sin (no assembly on amd64) and Exp (assembly on amd64) and the inline cost is 61 in all cases. Since have is always a const, early deadcode will recognize the if { return } and cut the rest of the body (oddly, it leaves behind an empty `if have { }`, but inlining also recognizes that the condition is const and doesn't count it).
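  As a rough illustration of why the constant matters, here is a small sketch contrasting a constant guard with a variable guard. `haveArchConst`, `haveArchVar`, `archImpl`, and `pureGo` are hypothetical names made up for this example; only the constant form corresponds to what the change does.

```go
package main

import "fmt"

// haveArchConst mirrors the shape of the haveArchX constants (hypothetical name).
const haveArchConst = false

// haveArchVar is a hypothetical variable flag, shown only for contrast.
var haveArchVar = false

// archImpl stands in for an architecture-specific implementation that is
// absent on this platform, like the panicking stubs in the *_noasm.go files.
func archImpl(x float64) float64 { panic("not implemented") }

// pureGo stands in for the portable implementation (invented body).
func pureGo(x float64) float64 { return x + 1 }

// ConstGuarded tests a compile-time constant, so the compiler can tell at
// build time that the archImpl branch is never taken.
func ConstGuarded(x float64) float64 {
	if haveArchConst {
		return archImpl(x)
	}
	return pureGo(x)
}

// VarGuarded tests a package-level variable, which in general has to be
// loaded and checked when the function runs.
func VarGuarded(x float64) float64 {
	if haveArchVar {
		return archImpl(x)
	}
	return pureGo(x)
}

func main() {
	fmt.Println(ConstGuarded(1), VarGuarded(1))
}
```

  Building a file like this with `go build -gcflags=-m=2` should print the inliner's decisions along with the cost it computed for each function, which is one way to reproduce the kind of check described above.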

Cherry Zhang (Gerrit)

Apr 15, 2021, 10:43:40 AM4/15/21

to Austin Clements, goph...@pubsubhelper.golang.org, Go Bot, Josh Bleecher Snyder, Robert Griesemer, Michael Knyszek, golang-co...@googlegroups.com

Attention is currently required from: Austin Clements, Michael Knyszek, Robert Griesemer, Josh Bleecher Snyder.

Patch set 2:Code-Review +2

View Change

Austin Clements (Gerrit)

Apr 15, 2021, 11:43:14 AM4/15/21

to Austin Clements, goph...@pubsubhelper.golang.org, golang-co...@googlegroups.com

Attention is currently required from: Austin Clements, Michael Knyszek, Robert Griesemer, Josh Bleecher Snyder.

Austin Clements uploaded patch set #3 to this change.

View Change

Prior to this change, enabling ABI wrappers slows down the math
benchmarks by a geomean of 8.77% (full results:
https://perf.golang.org/search?q=upload:20210415.6) and the Tile38
benchmarks by ~4%. After this change, enabling ABI wrappers is
completely performance-neutral on Tile38 and on all but one math
benchmark (full results:
https://perf.golang.org/search?q=upload:20210415.7). ABI wrappers slow
down SqrtIndirectLatency-12 by 2.09%, which makes sense because that
call must still go through an ABI wrapper.
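
To make the direct/indirect distinction concrete, here is a minimal
sketch of the two call shapes. It is not the benchmark from the math
package; sqrtFn, direct, and indirect are names invented for this
example, and which path (if any) involves a wrapper depends on the
toolchain version and architecture.

```go
package main

import (
	"fmt"
	"math"
)

// sqrtFn holds math.Sqrt as a function value (hypothetical name). A
// package-level variable can be reassigned at run time, so calls through
// it are made via the function value rather than by name.
var sqrtFn = math.Sqrt

// direct calls math.Sqrt by name, so the compiler knows the callee.
func direct(x float64) float64 { return math.Sqrt(x) }

// indirect calls through the function value.
func indirect(x float64) float64 { return sqrtFn(x) }

func main() {
	// Both paths compute the same value; only the call mechanism differs.
	fmt.Println(direct(16), indirect(16))
}
```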

With ABI wrappers disabled (which won't be an option on amd64 much
longer), on linux/amd64, this change is largely performance-neutral
and slightly improves the performance of a few benchmarks:

(Because there are so many benchmarks, I've applied the Šidák
correction to the alpha threshold. It makes relatively little
difference in which benchmarks are statistically significant.)

name                    old time/op  new time/op  delta
Acos-12                 22.3ns ± 0%  18.8ns ± 1%  -15.44%  (p=0.000 n=18+16)
Acosh-12                28.2ns ± 0%  28.2ns ± 0%     ~     (p=0.404 n=18+20)
Asin-12                 18.1ns ± 0%  18.2ns ± 0%   +0.20%  (p=0.000 n=18+16)
Asinh-12                32.8ns ± 0%  32.9ns ± 1%     ~     (p=0.891 n=18+20)
Atan-12                 9.92ns ± 0%  9.90ns ± 1%   -0.24%  (p=0.000 n=17+16)
Atanh-12                27.7ns ± 0%  27.5ns ± 0%   -0.72%  (p=0.000 n=16+20)
Atan2-12                18.5ns ± 0%  18.4ns ± 0%   -0.59%  (p=0.000 n=19+19)
Cbrt-12                 22.1ns ± 0%  22.1ns ± 0%     ~     (p=0.804 n=16+17)
Ceil-12                 0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.663 n=18+16)
Copysign-12             0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.762 n=16+19)
Cos-12                  12.7ns ± 0%  12.7ns ± 1%     ~     (p=0.145 n=19+18)
Cosh-12                 22.2ns ± 0%  22.5ns ± 0%   +1.60%  (p=0.000 n=17+19)
Erf-12                  11.1ns ± 1%  11.1ns ± 1%     ~     (p=0.010 n=19+19)
Erfc-12                 12.6ns ± 1%  12.7ns ± 0%     ~     (p=0.066 n=19+15)
Erfinv-12               16.1ns ± 0%  16.1ns ± 0%     ~     (p=0.462 n=17+20)
Erfcinv-12              16.0ns ± 1%  16.0ns ± 1%     ~     (p=0.015 n=17+16)
Exp-12                  16.3ns ± 0%  16.5ns ± 1%   +1.25%  (p=0.000 n=19+16)
ExpGo-12                36.2ns ± 1%  36.1ns ± 1%     ~     (p=0.242 n=20+18)
Expm1-12                18.6ns ± 0%  18.7ns ± 0%   +0.25%  (p=0.000 n=16+19)
Exp2-12                 34.7ns ± 0%  34.6ns ± 1%     ~     (p=0.010 n=19+18)
Exp2Go-12               34.8ns ± 1%  34.8ns ± 1%     ~     (p=0.372 n=19+19)
Abs-12                  0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.766 n=18+16)
Dim-12                  0.84ns ± 1%  0.84ns ± 1%     ~     (p=0.167 n=17+19)
Floor-12                0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.993 n=18+16)
Max-12                  3.35ns ± 0%  3.35ns ± 0%     ~     (p=0.894 n=17+19)
Min-12                  3.35ns ± 0%  3.36ns ± 1%     ~     (p=0.214 n=18+18)
Mod-12                  35.2ns ± 0%  34.7ns ± 0%   -1.45%  (p=0.000 n=18+17)
Frexp-12                5.31ns ± 0%  4.75ns ± 0%  -10.51%  (p=0.000 n=19+18)
Gamma-12                14.8ns ± 0%  16.2ns ± 1%   +9.21%  (p=0.000 n=20+19)
Hypot-12                6.16ns ± 0%  6.17ns ± 0%   +0.26%  (p=0.000 n=19+20)
HypotGo-12              7.79ns ± 1%  7.78ns ± 0%     ~     (p=0.497 n=18+17)
Ilogb-12                4.47ns ± 0%  4.47ns ± 0%     ~     (p=0.167 n=19+19)
J0-12                   76.0ns ± 0%  76.3ns ± 0%   +0.35%  (p=0.000 n=19+18)
J1-12                   76.8ns ± 1%  75.9ns ± 0%   -1.14%  (p=0.000 n=18+18)
Jn-12                    167ns ± 1%   168ns ± 1%     ~     (p=0.038 n=18+18)
Ldexp-12                6.98ns ± 0%  6.43ns ± 0%   -7.97%  (p=0.000 n=17+18)
Lgamma-12               15.9ns ± 0%  16.0ns ± 1%     ~     (p=0.011 n=20+17)
Log-12                  13.3ns ± 0%  13.4ns ± 1%   +0.37%  (p=0.000 n=15+18)
Logb-12                 4.75ns ± 0%  4.75ns ± 0%     ~     (p=0.831 n=16+18)
Log1p-12                19.5ns ± 0%  19.5ns ± 1%     ~     (p=0.851 n=18+17)
Log10-12                15.9ns ± 0%  14.0ns ± 0%  -11.92%  (p=0.000 n=17+16)
Log2-12                 7.88ns ± 1%  8.01ns ± 0%   +1.72%  (p=0.000 n=20+20)
Modf-12                 4.75ns ± 0%  4.34ns ± 0%   -8.66%  (p=0.000 n=19+17)
Nextafter32-12          5.31ns ± 0%  5.31ns ± 0%     ~     (p=0.389 n=17+18)
Nextafter64-12          5.03ns ± 1%  5.03ns ± 0%     ~     (p=0.774 n=17+18)
PowInt-12               29.9ns ± 0%  28.5ns ± 0%   -4.69%  (p=0.000 n=18+19)
PowFrac-12              91.0ns ± 0%  91.1ns ± 0%     ~     (p=0.029 n=19+19)
Pow10Pos-12             1.12ns ± 0%  1.12ns ± 0%     ~     (p=0.363 n=20+20)
Pow10Neg-12             3.90ns ± 0%  3.90ns ± 0%     ~     (p=0.921 n=17+18)
Round-12                2.31ns ± 0%  2.31ns ± 1%     ~     (p=0.390 n=18+18)
RoundToEven-12          0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.280 n=18+19)
Remainder-12            31.6ns ± 0%  29.6ns ± 0%   -6.16%  (p=0.000 n=18+17)
Signbit-12              0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.385 n=19+18)
Sin-12                  12.5ns ± 0%  12.5ns ± 0%     ~     (p=0.080 n=18+18)
Sincos-12               16.4ns ± 2%  16.4ns ± 2%     ~     (p=0.253 n=20+19)
Sinh-12                 26.1ns ± 0%  26.1ns ± 0%   +0.18%  (p=0.000 n=17+19)
SqrtIndirect-12         3.91ns ± 0%  3.90ns ± 0%     ~     (p=0.133 n=19+19)
SqrtLatency-12          2.79ns ± 0%  2.79ns ± 0%     ~     (p=0.226 n=16+19)
SqrtIndirectLatency-12  6.68ns ± 0%  6.37ns ± 2%   -4.66%  (p=0.000 n=17+20)
SqrtGoLatency-12        49.4ns ± 0%  49.4ns ± 0%     ~     (p=0.289 n=18+16)
SqrtPrime-12            3.18µs ± 0%  3.18µs ± 0%     ~     (p=0.084 n=17+18)
Tan-12                  13.8ns ± 0%  13.9ns ± 2%     ~     (p=0.292 n=19+20)
Tanh-12                 25.4ns ± 0%  25.4ns ± 0%     ~     (p=0.101 n=17+17)
Trunc-12                0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.765 n=18+16)
Y0-12                   75.8ns ± 0%  75.9ns ± 1%     ~     (p=0.805 n=16+18)
Y1-12                   76.3ns ± 0%  75.3ns ± 1%   -1.34%  (p=0.000 n=19+17)
Yn-12                    164ns ± 0%   164ns ± 2%     ~     (p=0.356 n=18+20)
Float64bits-12          0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.383 n=18+18)
Float64frombits-12      0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.066 n=18+19)
Float32bits-12          0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.889 n=16+19)
Float32frombits-12      0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.007 n=18+19)
FMA-12                  23.9ns ± 0%  24.0ns ± 0%   +0.31%  (p=0.000 n=16+17)
[Geo mean]              9.86ns       9.77ns        -0.87%

(https://perf.golang.org/search?q=upload:20210415.5)

Austin Clements (Gerrit)

Apr 15, 2021, 11:44:04 AM4/15/21

to Austin Clements, goph...@pubsubhelper.golang.org, Cherry Zhang, Go Bot, Josh Bleecher Snyder, Robert Griesemer, Michael Knyszek, golang-co...@googlegroups.com

Attention is currently required from: Michael Knyszek, Robert Griesemer, Josh Bleecher Snyder.

View Change

1 comment:

Patchset:
- Patch Set #3:
  Updated with a pile of benchmark results. TL;DR: benchmarks are all neutral to good, and ABI wrappers are no longer a cost.

Austin Clements (Gerrit)

Apr 15, 2021, 11:48:26 AM4/15/21

to Austin Clements, goph...@pubsubhelper.golang.org, golang-...@googlegroups.com, Cherry Zhang, Go Bot, Josh Bleecher Snyder, Robert Griesemer, Michael Knyszek, golang-co...@googlegroups.com

Austin Clements submitted this change.

View Change

Approvals:
  Cherry Zhang: Looks good to me, approved
  Austin Clements: Trusted

Reviewed-on: https://go-review.googlesource.com/c/go/+/310331
Trust: Austin Clements <aus...@google.com>
Reviewed-by: Cherry Zhang <cher...@google.com>

diff --git a/src/math/dim_riscv64.s b/src/math/dim_riscv64.s
index 38f5fe7..5b2fd3d 100644
--- a/src/math/dim_riscv64.s
+++ b/src/math/dim_riscv64.s
@@ -9,8 +9,8 @@
 #define	PosInf	0x080
 #define	NaN	0x200
 
-// func Max(x, y float64) float64
-TEXT ·Max(SB),NOSPLIT,$0
+// func archMax(x, y float64) float64
+TEXT ·archMax(SB),NOSPLIT,$0
 	MOVD	x+0(FP), F0
 	MOVD	y+8(FP), F1
 	FCLASSD	F0, X5
@@ -39,8 +39,8 @@
 	MOVD	F1, ret+16(FP)
 	RET
 
-// func Min(x, y float64) float64
-TEXT ·Min(SB),NOSPLIT,$0
+// func archMin(x, y float64) float64
+TEXT ·archMin(SB),NOSPLIT,$0
 	MOVD	x+0(FP), F0
 	MOVD	y+8(FP), F1
 	FCLASSD	F0, X5
diff --git a/src/math/sqrt_asm.go b/src/math/sqrt_asm.go
new file mode 100644
index 0000000..b910256
--- /dev/null
+++ b/src/math/sqrt_asm.go
@@ -0,0 +1,12 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build 386 || amd64 || arm64 || arm || mips || mipsle || ppc64 || ppc64le || s390x || riscv64 || wasm
+// +build 386 amd64 arm64 arm mips mipsle ppc64 ppc64le s390x riscv64 wasm
+
+package math
+
+const haveArchSqrt = true
+
+func archSqrt(x float64) float64
diff --git a/src/math/sqrt_mipsx.s b/src/math/sqrt_mipsx.s
index a63ea9e..c619c19 100644
--- a/src/math/sqrt_mipsx.s
+++ b/src/math/sqrt_mipsx.s
@@ -6,8 +6,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB),NOSPLIT,$0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB),NOSPLIT,$0
 #ifdef GOMIPS_softfloat
 	JMP ·sqrt(SB)
 #else
diff --git a/src/math/sqrt_noasm.go b/src/math/sqrt_noasm.go
new file mode 100644
index 0000000..7b546b7
--- /dev/null
+++ b/src/math/sqrt_noasm.go
@@ -0,0 +1,14 @@
+// Copyright 2021 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+//go:build !386 && !amd64 && !arm64 && !arm && !mips && !mipsle && !ppc64 && !ppc64le && !s390x && !riscv64 && !wasm
+// +build !386,!amd64,!arm64,!arm,!mips,!mipsle,!ppc64,!ppc64le,!s390x,!riscv64,!wasm
+
+package math
+
+const haveArchSqrt = false
+
+func archSqrt(x float64) float64 {
+	panic("not implemented")
+}
diff --git a/src/math/sqrt_ppc64x.s b/src/math/sqrt_ppc64x.s
index 0469f4d..174b63e 100644
--- a/src/math/sqrt_ppc64x.s
+++ b/src/math/sqrt_ppc64x.s
@@ -6,8 +6,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB),NOSPLIT,$0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB),NOSPLIT,$0
 	FMOVD	x+0(FP), F0
 	FSQRT	F0, F0
 	FMOVD	F0, ret+8(FP)
diff --git a/src/math/sqrt_riscv64.s b/src/math/sqrt_riscv64.s
index 048171b..f223510 100644
--- a/src/math/sqrt_riscv64.s
+++ b/src/math/sqrt_riscv64.s
@@ -6,8 +6,8 @@
 
 #include "textflag.h"
 
-// func Sqrt(x float64) float64
-TEXT ·Sqrt(SB),NOSPLIT,$0
+// func archSqrt(x float64) float64
+TEXT ·archSqrt(SB),NOSPLIT,$0
 	MOVD	x+0(FP), F0
 	FSQRTD	F0, F0

2 is the latest approved patch-set.
No files were changed between the latest approved patch-set and the submitted one.
