Patch for SQRADDSC and SQRADDDB in fp_sqr_comba.c of TomsFastMath

wtc...@gmail.com

Feb 17, 2008, 8:54:30 PM
to LibTom Projects
The following is a patch for the SQRADDSC and SQRADDDB macros in
fp_sqr_comba.c of TomsFastMath (0.12). The patch was contributed by
Jakub Jelinek of Red Hat. The patch only modifies the definitions of
these macros for x86-64, but similar changes may be made to the other
processor architectures.

The changes to SQRADDSC and SQRADDDB are mutually independent.
I describe them below.

1. SQRADDSC: sc0, sc1, sc2 are output only, so they shouldn't be listed
as inputs. This change eliminates the GCC warnings that sc0, sc1, sc2
"is used uninitialized in this function".

The change is to remove "0"(sc0), "1"(sc1), "2"(sc2) from the input
list and renumber the operands i, j from %6, %7 to %3, %4 accordingly.
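
To make the renumbering concrete, here is a minimal standalone sketch of
the same idea. It is not code from TFM: the function name
sqraddsc_sketch and the non-x86 fallback are made up for illustration,
and I use "r"/"rm" instead of "g" for the inputs so that a constant
argument cannot end up as an immediate operand of mulq. With only three
outputs in the list, the two inputs become %3 and %4, and the
temporaries can stay uninitialized without a warning.

```c
#include <stdint.h>

/* Hypothetical helper (not in TomsFastMath): stores the 128-bit product
 * i*j in *lo and *hi and zeroes *top, which is what SQRADDSC does with
 * sc0, sc1, sc2. */
static void sqraddsc_sketch(uint64_t i, uint64_t j,
                            uint64_t *lo, uint64_t *hi, uint64_t *top)
{
    uint64_t sc0, sc1, sc2;   /* never read before they are written */

#if defined(__x86_64__) && defined(__GNUC__)
    __asm__(
        "movq %3,%%rax \n\t"
        "mulq %4       \n\t"
        "movq %%rax,%0 \n\t"
        "movq %%rdx,%1 \n\t"
        "xorq %2,%2    \n\t"
        : "=r"(sc0), "=r"(sc1), "=r"(sc2)   /* outputs only: %0, %1, %2 */
        : "r"(i), "rm"(j)                   /* inputs are now %3 and %4 */
        : "rax", "rdx", "cc");
#else
    {
        /* Assumed fallback for other targets; needs GCC/Clang __int128. */
        unsigned __int128 p = (unsigned __int128)i * j;
        sc0 = (uint64_t)p;
        sc1 = (uint64_t)(p >> 64);
        sc2 = 0;
    }
#endif

    *lo = sc0;
    *hi = sc1;
    *top = sc2;
}
```

Calling sqraddsc_sketch(a, b, &lo, &hi, &top) leaves the low and high
halves of a*b in lo and hi and sets top to 0.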

2. SQRADDDB: strictly speaking, we need "earlyclobbers" ("=&r") for
c0, c1, c2 because the inputs sc0, sc1, sc2 are used again after we
have modified c0, c1, c2.

The change is to replace "=r" with "=&r" for c0, c1, c2 in the output
list.
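
As an illustration of the earlyclobber point, here is a small
standalone sketch that adds a three-word value to an accumulator twice,
the way SQRADDDB does. Again this is not TFM code: the name
sqradddb_sketch, the array-based interface and the portable fallback
are assumptions for the example. The "=&r" outputs tell GCC that
c0, c1, c2 are written before the last read of sc0, sc1, sc2, so the
sc operands must not share a register with them, even when the compiler
can see that the values are equal.

```c
#include <stdint.h>

/* Hypothetical helper (not in TomsFastMath): c += 2 * sc on three
 * 64-bit words, least significant word first. */
static void sqradddb_sketch(uint64_t c[3], const uint64_t sc[3])
{
    uint64_t c0 = c[0], c1 = c[1], c2 = c[2];
    uint64_t sc0 = sc[0], sc1 = sc[1], sc2 = sc[2];

#if defined(__x86_64__) && defined(__GNUC__)
    __asm__(
        "addq %6,%0 \n\t"
        "adcq %7,%1 \n\t"
        "adcq %8,%2 \n\t"
        "addq %6,%0 \n\t"   /* %6..%8 are read again after %0..%2 changed */
        "adcq %7,%1 \n\t"
        "adcq %8,%2 \n\t"
        : "=&r"(c0), "=&r"(c1), "=&r"(c2)
        : "0"(c0), "1"(c1), "2"(c2), "r"(sc0), "r"(sc1), "r"(sc2)
        : "cc");
#else
    /* Assumed portable fallback: two add-with-carry passes in plain C. */
    int pass;
    for (pass = 0; pass < 2; pass++) {
        uint64_t carry0, carry1, t;
        c0 += sc0;
        carry0 = (c0 < sc0);
        t = c1 + sc1;
        carry1 = (t < sc1);
        c1 = t + carry0;
        carry1 += (c1 < t);
        c2 += sc2 + carry1;
    }
#endif

    c[0] = c0;
    c[1] = c1;
    c[2] = c2;
}
```

Passing the same array for both arguments, e.g. sqradddb_sketch(x, x),
is the kind of call where sharing a register between an output and an
sc input would give a wrong result without the earlyclobber.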

Wan-Teh Chang
NSS developer
(We use some code from TFM in
http://lxr.mozilla.org/security/source/security/nss/lib/freebl/mpi/mp_comba.c)


Here is the patch.
-----------------------------------------------------------------------------------------------------
--- tomsfastmath-0.12/src/sqr/fp_sqr_comba.c 2007-03-15 00:58:46.000000000 +0100
+++ tomsfastmath-0.12.patched/src/sqr/fp_sqr_comba.c 2008-02-18 02:23:03.000000000 +0100
@@ -120,22 +120,22 @@
"adcq $0,%2 \n\t" \
"addq %%rax,%0 \n\t" \
"adcq %%rdx,%1 \n\t" \
"adcq $0,%2 \n\t" \
:"=r"(c0), "=r"(c1), "=r"(c2): "0"(c0), "1"(c1), "2"(c2), "g"(i), "g"(j) :"%rax","%rdx","%cc");

#define SQRADDSC(i, j) \
asm( \
- "movq %6,%%rax \n\t" \
- "mulq %7 \n\t" \
+ "movq %3,%%rax \n\t" \
+ "mulq %4 \n\t" \
"movq %%rax,%0 \n\t" \
"movq %%rdx,%1 \n\t" \
"xorq %2,%2 \n\t" \
- :"=r"(sc0), "=r"(sc1), "=r"(sc2): "0"(sc0), "1"(sc1), "2"(sc2), "g"(i), "g"(j) :"%rax","%rdx","%cc");
+ :"=r"(sc0), "=r"(sc1), "=r"(sc2): "g"(i), "g"(j) :"%rax","%rdx","%cc");

#define SQRADDAC(i, j) \
asm( \
"movq %6,%%rax \n\t" \
"mulq %7 \n\t" \
"addq %%rax,%0 \n\t" \
"adcq %%rdx,%1 \n\t" \
"adcq $0,%2 \n\t" \
@@ -144,17 +144,17 @@
#define SQRADDDB \
asm( \
"addq %6,%0 \n\t" \
"adcq %7,%1 \n\t" \
"adcq %8,%2 \n\t" \
"addq %6,%0 \n\t" \
"adcq %7,%1 \n\t" \
"adcq %8,%2 \n\t" \
- :"=r"(c0), "=r"(c1), "=r"(c2) : "0"(c0), "1"(c1), "2"(c2), "r"(sc0), "r"(sc1), "r"(sc2) : "%cc");
+ :"=&r"(c0), "=&r"(c1), "=&r"(c2) : "0"(c0), "1"(c1), "2"(c2), "r"(sc0), "r"(sc1), "r"(sc2) : "%cc");

#elif defined(TFM_SSE2)

/* SSE2 Optimized */
#define COMBA_START

#define CLEAR_CARRY \
c0 = c1 = c2 = 0;