Re: sage-4.0.2


Bill Hart

Jun 19, 2009, 9:58:51 AM
to sage-r...@googlegroups.com, mpir-dev
I have just checked, and there are 13 instructions in SSE3 that are not available in SSE2. These are:

ADDSUBPD - (Add-Subtract-Packed-Double)
ADDSUBPS - (Add-Subtract-Packed-Single)
HADDPD - (Horizontal-Add-Packed-Double)
HADDPS - (Horizontal-Add-Packed-Single)
HSUBPD - (Horizontal-Subtract-Packed-Double)
HSUBPS - (Horizontal-Subtract-Packed-Single)
LDDQU - (misaligned integer vector load)
MOVDDUP, MOVSHDUP, MOVSLDUP - (for complex numbers)
FISTTP - (FISTP with "chop" truncate)
MONITOR, MWAIT - (Intel-only control instructions)

I grepped for these in the current MPIR assembly code and we don't use any of them. SSE4 instructions are only available on Intel i7 and AMD K10, so I doubt we use those except perhaps on the processors which actually support them.

I also checked SSSE3, which is an Intel extension to SSE3. It includes the instructions:

PSIGNB, PSIGNW, PSIGND - packed sign
PABSB, PABSW, PABSD - packed absolute value
PALIGNR - packed align right
PSHUFB - packed shuffle bytes
PMULHRSW - packed multiply high with round and scale
PMADDUBSW - multiply and add packed signed and unsigned bytes
PHSUBW, PHSUBD - packed horizontal subtract (words and dwords)
PHSUBSW - packed horizontal subtract and saturate words
PHADDW, PHADDD - packed horizontal add
PHADDSW - packed horizontal add and saturate words

These also don't appear in MPIR.
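
The grep described above could be reproduced roughly as follows. This is only a sketch, not the check that was actually run: the function name `find_mnemonics` is made up for illustration, the mnemonic sets are copied from the two lists above, and the comment stripping is deliberately crude.

```python
import re

# Mnemonics copied from the SSE3 and SSSE3 lists in this message.
SSE3 = {"addsubpd", "addsubps", "haddpd", "haddps", "hsubpd", "hsubps",
        "lddqu", "movddup", "movshdup", "movsldup", "fisttp",
        "monitor", "mwait"}
SSSE3 = {"psignb", "psignw", "psignd", "pabsb", "pabsw", "pabsd",
         "palignr", "pshufb", "pmulhrsw", "pmaddubsw", "phsubw",
         "phsubd", "phsubsw", "phaddw", "phaddd", "phaddsw"}


def find_mnemonics(asm_text, mnemonics):
    """Return {mnemonic: [line numbers]} for whole-word, case-insensitive hits.

    Hypothetical helper: anything after '#' or ';' on a line is treated
    as a comment and ignored, which is an assumption about the syntax.
    """
    pattern = re.compile(r"\b(" + "|".join(sorted(mnemonics)) + r")\b",
                         re.IGNORECASE)
    hits = {}
    for lineno, line in enumerate(asm_text.splitlines(), start=1):
        code = line.split("#", 1)[0].split(";", 1)[0]
        for match in pattern.finditer(code):
            hits.setdefault(match.group(1).lower(), []).append(lineno)
    return hits
```

Run over the text of every assembly file in a source tree, an empty result for both sets would match the "we don't use any of them" conclusion. It says nothing about what the C compiler emits.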

SSE5 is an AMD extension which has been announced but not implemented.

I'm fairly sure MPIR is currently safe wrt SSE. Could some other package in Sage be using SSE3 or SSSE3?

Bill.

2009/6/19 William Stein <wst...@gmail.com>

Hello,

Sage-4.0.2 has been released by Craig Citro and Nick Alexander! The
source code is available here:

  http://sagemath.org/src/sage-4.0.2.tar

Binaries (only for modern hardware with SSSE3) will be available in a
few days (at most), along with a release tour, release notes, etc.
Brave developers might also try
  sage -upgrade
to upgrade to the latest version.

Tom Boothby will be the main release manager for the next Sage release
(maybe sage-4.1).

-- William


--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org



William Stein

Jun 19, 2009, 10:02:51 AM
to mpir...@googlegroups.com, sage-r...@googlegroups.com
Yes.

Dumb question -- is it possible that the compiler inserts SSE*
instructions in the generated code, even though you don't explicitly
use them in your own assembler code?

I.e., I write code all the time in C and I never mention any explicit
instructions at all, but of course the C compiler generates all kinds
of instructions.

-- William

William Stein

Jun 19, 2009, 10:17:48 AM
to mpir...@googlegroups.com, sage-r...@googlegroups.com
On Fri, Jun 19, 2009 at 3:58 PM, Bill Hart<goodwi...@googlemail.com> wrote:

My second remark is that the only person who put any effort into
checking for other instruction sets in any of the Sage-built
code/libraries was Michael Abshoff. Nobody double-checked what he
did, and maybe he was just wrong in his claim that only ATLAS has
anything beyond SSE2 in it.

Does anybody know how to take a .so or other binary and tell whether
it includes specific assembly instructions?

-- william
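
On the question of inspecting a .so directly, one possible approach (sketched here, not taken from the thread) is to disassemble it with `objdump -d` from GNU binutils and look at the mnemonic column. The function names below are hypothetical, and the parser assumes objdump's usual tab-separated layout of address, bytes, and instruction.

```python
import subprocess


def mnemonics_in_disassembly(disassembly):
    """Collect the set of instruction mnemonics in `objdump -d` style text."""
    found = set()
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Instruction lines have three tab-separated fields and start
        # with an address ending in ':'. Headers and byte-only
        # continuation lines have fewer fields and are skipped.
        if len(fields) < 3 or not fields[0].strip().endswith(":"):
            continue
        words = fields[2].split()
        if words:
            # Note: a prefix such as "rep" or "lock" is recorded in
            # place of the instruction that follows it.
            found.add(words[0].lower())
    return found


def flagged_mnemonics(disassembly, wanted):
    """Return, sorted, the mnemonics from `wanted` that occur in the text."""
    wanted_lower = {name.lower() for name in wanted}
    return sorted(mnemonics_in_disassembly(disassembly) & wanted_lower)


def disassemble(path):
    """Run objdump on a binary or .so (assumes objdump is on PATH)."""
    result = subprocess.run(["objdump", "-d", path], check=True,
                            capture_output=True, text=True)
    return result.stdout
```

A hit only shows that the instruction is present in the file. A library with runtime dispatch may contain such code and never execute it on an older CPU.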

Bill Hart

Jun 19, 2009, 10:22:04 AM
to mpir...@googlegroups.com
That would be almost impossible I would think. You probably have to
grep the source code.

The other issue of course is to look out for -march and -mtune options
being passed to gcc. If you are building universal binaries, you do
not want to be passing -march=core2 for example, or -march=prescott or
-march=nocona, etc.
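
As a rough illustration of that check, the sketch below scans a single compiler command line (for example, one line of a build log) for options that pin the target CPU. The function name `risky_flags` is hypothetical. Including the `-msse*` and `-mssse3` options is an addition beyond the `-march`/`-mtune` options named above.

```python
import shlex

# Option prefixes that select a target CPU or enable SSE-family code
# generation in gcc. "-msse" also matches -msse2, -msse3, -msse4.1, etc.
RISKY_PREFIXES = ("-march=", "-mtune=", "-msse", "-mssse3")


def risky_flags(command_line):
    """Return the options on one compiler command line worth reviewing.

    -mtune on its own affects only scheduling, not the instruction set;
    it is reported because the message above suggests looking for it.
    """
    return [token for token in shlex.split(command_line)
            if token.startswith(RISKY_PREFIXES)]
```

For example, `risky_flags("gcc -O2 -march=core2 -c foo.c")` returns `["-march=core2"]`, which is the kind of option a universal binary build should not be passing.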

Most libraries Sage uses probably don't have fat binary support, so
dealing with this issue is probably something that needs to be done on
a library by library basis.

Bill.


Bill Hart

Jun 19, 2009, 10:31:58 AM
to mpir...@googlegroups.com, sage-r...@googlegroups.com
SSE2 is probably used by libm4ri and certainly by MPIR, but only on K8
and above or Pentium 4 and above. Again, fat binaries do not assume that
SSE2 support is available. In this case, however, the compiler might
well make some assumptions.

If you build on a 32-bit Pentium 4 with SSE2 support, then quite
possibly the resulting binary will not run on an AMD K7.

Of course anything earlier than Pentium 4 or K7 is probably so
out-of-date it won't run Sage at all.

It would be useful to have an example of a binary which would not run,
the architecture of the machine on which it would not run and the
architecture of the machine on which the binary was generated. Such an
example might lead to a better hypothesis about what is going wrong.

Bill.


Bill Hart

Jun 19, 2009, 10:42:36 AM
to mpir...@googlegroups.com, sage-r...@googlegroups.com
I should add that libm4ri probably only uses SSE2 on core2 machines. But if the binary was built on a core2, I don't know what Michael has set it up to do.

And then there is ATLAS of course. I don't know about that at all. I mean, it automatically tunes for the architecture on which it is built. It certainly will use SSE where available. So if it is available on the build machine...

Bill.


Jason Moxham

Jun 19, 2009, 10:55:44 AM
to mpir...@googlegroups.com, sage-r...@googlegroups.com
 
mpn_popcount and mpn_hamdist on K10 and Nehalem use SSE3+
mpn_l/rshift on K10/Core 2/Nehalem use SSE2
mpn_l/rshift on K8 use MMX

I think that's it.

gcc does emit SSE code, but I think you have to explicitly turn it on.
 
Jason

Bill Hart

Jun 19, 2009, 11:07:13 AM
to mpir...@googlegroups.com, sage-r...@googlegroups.com
2009/6/19 Jason Moxham <ja...@njkfrudils.plus.com>:
>
> mpn_popcount and mpn_hamdist on K10 and Nehalem use SSE3+
> mpn_l/rshift on K10/Core 2/Nehalem use SSE2
> mpn_l/rshift on K8 use MMX

Of course none of that would be a problem in a fat binary, regardless
of what system it was built on. So I think we are good.
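
To make the fat binary idea concrete, here is a minimal sketch of runtime selection. It is not MPIR's actual mechanism, which may key on the CPU model rather than on feature flags. The variant names are invented, and the flag names are those Linux prints in /proc/cpuinfo.

```python
def cpu_flags(cpuinfo_text):
    """Parse the first 'flags' line of Linux /proc/cpuinfo text into a set."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


# Hypothetical variant table, most demanding first: each entry is
# (flags the variant requires, name of the implementation).
LSHIFT_VARIANTS = [
    ({"sse2"}, "lshift_sse2"),
    ({"mmx"}, "lshift_mmx"),
    (set(), "lshift_generic_c"),
]


def select_variant(variants, flags):
    """Return the first variant whose required flags are all present."""
    for required, name in variants:
        if required <= flags:
            return name
    raise LookupError("no usable variant")
```

Because the choice is made on the machine that runs the code, the build machine's CPU does not matter, as long as every variant was assembled into the library.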

Bill Hart

Jun 19, 2009, 11:21:54 AM
to mpir...@googlegroups.com, sage-r...@googlegroups.com
Here is a quote from the web which seems to get it right:

"You can tell gcc to not use the 387 unit, and to use the sse unit
instead which does not suffer from this. That's what -mfpmath=sse does,
but you also have to tell gcc that it's okay to emit sse instructions,
thus -march=(something that supports sse) or -msse is required. But do
note however that plain sse can only do single precision (32 bit)
floats, so if you're using doubles you require sse2, otherwise the 387
will still be used. You mentioned you're using an Athlon-XP which does
not support sse2, so that's out."

Somewhere else it says that the gcc mindset is to have code it emits run
on any x86 anywhere by default, but if you explicitly tell it to use SSE
or if you do -march=(something that supports sse) then it will use SSE.

After that it is down to assembly language.

I checked FLINT through and there is nothing unusual used there,
except in longlong.h. But it seems that certain things would have to
be defined (e.g. __amd64__) for it to use illegal instructions.
Actually this is a bug in FLINT. Currently it probably only uses C
fallback stuff from longlong.h. And here I have been thinking all
along that it used assembly language where available!! It probably
would, if I used autoconf to build FLINT, in which case I'd have to
add fat binary support to FLINT. I wonder how much faster FLINT would
be if it used the assembly language!

Bill.
