survey: how many "cpus" do you have?

John H Palmieri

unread,
Sep 23, 2009, 1:13:44 AM
to sage-devel
The goal of trac #6283 is 'Make it so NUM_THREADS is set intelligently
instead of idiotically in makefile so doing "make ptest" or "make
ptestlong" doesn't kill some computer'. Right now, NUM_THREADS (set in
SAGE_ROOT/makefile) is used for parallel testing if you do "make
ptest" or "make ptestlong"; the makefile says that in the future, it
could be used for parallel building. One idea is to set NUM_THREADS
equal to the number of cpus, or processors, or cores, or something
like that. More precisely, the idea is to use the output from this:

sage: import multiprocessing
sage: multiprocessing.cpu_count()

This seems to give reasonable numbers for my iMac, an ubuntu box I
sometimes use, and sage.math. Does it give bad numbers for your
computer?

If we can get reasonable numbers using some method like this, then I
propose to change the command "sage -tp N <files>": if N==0, then
change N to be the number of cpus. Then we change NUM_THREADS to 0 at
the top of SAGE_ROOT/makefile, and away we go...
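The proposed fallback could be sketched roughly like this (a minimal illustration of the idea, not the actual #6283 patch; the helper name is made up):

```python
import multiprocessing

def resolve_num_threads(n):
    """Hypothetical helper illustrating the proposal for 'sage -tp N':
    treat N == 0 as 'detect the number of CPUs automatically'."""
    if n == 0:
        return multiprocessing.cpu_count()
    return n
```

With NUM_THREADS set to 0 at the top of SAGE_ROOT/makefile, "make ptest" would then pick up the detected count.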

John

mmarco

unread,
Sep 23, 2009, 6:11:30 AM
to sage-devel
I got the right answer (counting different cores of the same processor
as different processors) on all the computers I have checked (with
different numbers of processors and different flavours of Linux). All
of them are x86 or amd64 architecture. I still have to check how it
works with Atom processors that have one core but multi-threading
technology.

Miguel

Dr. David Kirkby

unread,
Sep 23, 2009, 8:55:13 AM
to sage-...@googlegroups.com
John H Palmieri wrote:
> The goal of trac #6283 is 'Make it so NUM_THREADS is set intelligently
> instead of idiotically in makefile so doing "make ptest" or "make
> ptestlong" doesn't kill some computer'. Right now, NUM_THREADS (set in
> SAGE_ROOT/makefile) is used for parallel testing if you do "make
> ptest" or "make ptestlong"; the makefile says that in the future, it
> could be used for parallel building. One idea is to set NUM_THREADS
> equal to the number of cpus, or processors, or cores, or something
> like that. More precisely, the idea is to use the output from this:
>
> sage: import multiprocessing
> sage: multiprocessing.cpu_count()

On my quad processor Sun, this returns a sensible value (4).

I will need to build Sage again on 't2' and let you know what it gives
on that, as it may be 2, 16 or 128!


> This seems to give reasonable numbers for my iMac, an ubuntu box I
> sometimes use, and sage.math. Does it give bad numbers for your
> computer?
>
> If we can get reasonable numbers using some method like this, then I
> propose to change the command "sage -tp N <files>": if N==0, then
> change N to be the number of cpus. Then we change NUM_THREADS to 0 at
> the top of SAGE_ROOT/makefile, and away we go...
>
> John

This used to be easy: count the physical CPUs in a machine, and that was
it. Nowadays it is a lot more difficult, as there are several issues.

* On systems with virtual machines, not all the physical processors may
be available inside a guest. (An admin can restrict them.)

* Many processors, such as those on 't2', have multiple cores. 't2' has
8 cores per CPU.

* Many processors, such as those on 't2', have what Intel calls
Hyperthreading and Sun calls 'CoolThreads'. Basically, there are more
hardware threads than CPU cores. So on 't2' there are

2 physical processors.
Each processor has 8 cores.
Each core has 8 hardware threads.
This gives a total of 2 x 8 x 8 = 128 threads.

So you can get 2, 16 or 128 as the number of 'CPUs', depending on what
you count.
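For what it's worth, cpu_count() reports *logical* processors (hardware threads). On Linux one can approximate the physical core count by parsing /proc/cpuinfo; below is a rough, Linux-only sketch (the helper names are made up, and it simply falls back to the logical count on systems like Solaris or OS X):

```python
import multiprocessing

def logical_cpu_count():
    # multiprocessing.cpu_count() reports *logical* processors,
    # i.e. online hardware threads.
    return multiprocessing.cpu_count()

def physical_cpu_count():
    """Rough Linux-only sketch: count distinct (physical id, core id)
    pairs in /proc/cpuinfo. Falls back to the logical count when the
    file is missing or lacks those fields."""
    phys = core = None
    pairs = set()
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("physical id"):
                    phys = line.split(":", 1)[1].strip()
                elif line.startswith("core id"):
                    core = line.split(":", 1)[1].strip()
                elif not line.strip() and phys is not None:
                    # blank line ends one logical-processor block
                    pairs.add((phys, core))
                    phys = core = None
        if phys is not None:  # file may not end with a blank line
            pairs.add((phys, core))
    except IOError:
        pass
    return len(pairs) if pairs else logical_cpu_count()
```

On a machine like 't2' the two counts would differ by the hardware-thread factor; on a plain dual-core box they should agree.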

I've got about 10 machines at home, with the number of processors
varying from 1 to 4. My laptop is dual-core. I should soon have a
single-processor, but quad-core, 3.33 GHz Xeon.


Dan Drake

unread,
Sep 23, 2009, 9:27:58 AM
to sage-...@googlegroups.com
On Wed, 23 Sep 2009 at 01:55PM +0100, Dr. David Kirkby wrote:
> I will need to build Sage again on 't2' and let you know what it gives
> on that, as it may be 2, 16 or 128!

I downloaded Python 2.6.2 to 't2', built it, and ran
multiprocessing.cpu_count() -- it returned 128 on that machine.

Dan

--
--- Dan Drake
----- http://mathsci.kaist.ac.kr/~drake
-------

mmarco

unread,
Sep 23, 2009, 12:21:27 PM
to sage-devel
I got 2 as the answer on my Core 2 Duo laptop (running Gentoo Linux),
which is correct, counting the two cores as two processors.
I also got 2 on my Atom N270 netbook (running Ubuntu), which has a
single-core processor with Hyperthreading. For multithreaded
computation purposes, I think that can be considered the right answer.

Nick Alexander

unread,
Sep 23, 2009, 1:09:44 PM
to sage-...@googlegroups.com
> This seems to give reasonable numbers for my iMac, an ubuntu box I
> sometimes use, and sage.math. Does it give bad numbers for your
> computer?

sage: import multiprocessing
sage: multiprocessing.cpu_count()
2
sage: !uname -a
Darwin pv139204.reshsg.uci.edu 9.7.0 Darwin Kernel Version 9.7.0: Tue
Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386 i386

This is correct.

Nick

John H Palmieri

unread,
Sep 23, 2009, 1:31:38 PM
to sage-devel
On Sep 23, 6:27 am, Dan Drake <dr...@kaist.edu> wrote:
> On Wed, 23 Sep 2009 at 01:55PM +0100, Dr. David Kirkby wrote:
> > I will need to build Sage again on 't2' and let you know what it gives
> > on that, as it may be 2, 16 or 128!
>
> I downloaded Python 2.6.2 to 't2', built it, and ran
> multiprocessing.cpu_count() -- it returned 128 on that machine.

I can't tell: is 128 a reasonable answer here? Is it okay to doctest
Sage in parallel using 128 threads?

John

William Stein

unread,
Sep 23, 2009, 3:17:26 PM
to sage-...@googlegroups.com

Ouch. I tried running > 16 processes in parallel using
multiprocessing on t2 and got no speed up. I think 128 at once would
not be faster than 16 without doing something clever (that I totally
don't understand).

William

Dr. David Kirkby

unread,
Sep 23, 2009, 6:25:07 PM
to sage-...@googlegroups.com

I *think* the issue is one of I/O, though I am the first to admit I am
no expert on this.

If you have 16 CPU bound threads, all wanting CPU time and not using any
disk access, using 16 threads would probably be optimal on 't2', as it
has 16 cores.

For the sort of applications for which the Sun T5240 (i.e. 't2') was
designed (web servers, databases, etc.), this is not the case. For those
applications, the processor will often sit waiting for I/O, so many of
those 16 cores will be idle. I believe this is where the multiple
hardware threads start to have an advantage.

I believe the only way to really know is to test this on 't2'. But it
should be noted that on another machine, the results might be very
different. The earlier Sun machines, based on the T1 processor, had only
1 floating-point unit shared between all the cores on the CPU. In that
case, if the task used a lot of floating-point maths, running with only
one thread was probably optimal. The Sun T5240, using the T2+ processor,
does at least have one FPU for each core.


It should be clear to anyone who has used 't2' that its architecture is
not well suited to what we are currently using it for. I believe it is
an excellent machine for ISPs to use for web servers, databases, etc.,
but it is not ideal for our purposes. My Blade 2000, which was about 5
years older than 't2', was about twice as fast as 't2' for anything I
tried to do with it.


Peter Jeremy

unread,
Sep 24, 2009, 3:57:28 AM
to sage-...@googlegroups.com
On 2009-Sep-23 12:17:26 -0700, William Stein <wst...@gmail.com> wrote:
>On Wed, Sep 23, 2009 at 10:31 AM, John H Palmieri
><jhpalm...@gmail.com> wrote:
>> I can't tell: is 128 a reasonable answer here?  Is it okay to doctest
>> Sage in parallel using 128 threads?
>
>Ouch. I tried running > 16 processes in parallel using
>multiprocessing on t2 and got no speed up. I think 128 at once would
>not be faster than 16 without doing something clever (that I totally
>don't understand).

Given that Sage is FP-intensive and a T2 processor has 8 FPUs (i.e. the
box 't2' has 16 FPUs), there's probably little point in going beyond 16.

--
Peter Jeremy

John H Palmieri

unread,
Sep 24, 2009, 11:32:50 AM
to sage-devel
So, back to the original question: for parallel testing, can we set
the number of threads to be the output from
multiprocessing.cpu_count()? On t2, is it actually *bad* to use 128
threads, or is it just about the same as using 16? (The point was to
have a non-idiotic way of setting the number of threads, and I'm
trying to figure out if cpu_count() qualifies or if it needs
refinement, or perhaps another approach altogether.)

--
John

Dr. David Kirkby

unread,
Sep 24, 2009, 6:17:26 PM
to sage-...@googlegroups.com

You would need to test this. My guess is that optimum performance would
be achieved somewhere between 16 and 128 - probably closer to 128.

Another issue is whether it is desirable to attempt to take over all the
resources of a large multi-user server.

To the best of my knowledge, there are no machines designed for personal
use (i.e. workstations) with more than 4 threads or cores. Certainly any
'sun4v' machine, which uses a T1, T2 or T2+ processor, is not designed
for personal use. In such cases, perhaps the default number of threads
should be limited to 4, with an informative message saying that better
performance could be achieved using more threads, but that this might be
considered anti-social on a system used by many people.


It so happens that 't2' is not very busy, and exploiting the hardware to
the full is no big deal right now. But one would normally expect
machines like the T5240 to be used by multiple users - they are not
personal workstations.

Dan Drake

unread,
Sep 24, 2009, 8:04:46 PM
to sage-...@googlegroups.com

Here's one way to look at this: right now, as shipped, "make ptest" does
the wrong thing on 99+ percent of the machines that Sage gets used on,
because it uses 15 threads, and how many machines on which Sage gets
used can handle that well?

With the patch (#6283), "make ptest" does the *right* thing on 99+
percent of machines, and it's very easy to override it if necessary.

I'm thinking that here, the perfect is the enemy of the good. We can
revisit this problem when sage-support is flooded with people
overloading their Sun machines when they run "make ptest". :)

Marshall Hampton

unread,
Sep 24, 2009, 10:11:45 PM
to sage-devel
Well said. It's clearly a big improvement, and simple. Works well on
all the machines I have available.

-Marshall

Dr. David Kirkby

unread,
Sep 25, 2009, 3:19:57 AM
to sage-...@googlegroups.com
Dan Drake wrote:
> On Thu, 24 Sep 2009 at 08:32AM -0700, John H Palmieri wrote:
>> So, back to the original question: for parallel testing, can we set
>> the number of threads to be the output from
>> multiprocessing.cpu_count()? On t2, is it actually *bad* to use 128
>> threads, or is it just about the same as using 16? (The point was to
>> have a non-idiotic way of setting the number of threads, and I'm
>> trying to figure out if cpu_count() qualifies or if it needs
>> refinement, or perhaps another approach altogether.)
>
> Here's one way to look at this: right now, as shipped, "make ptest" does
> the wrong thing on 99+ percent of the machines that Sage gets used on,
> because it uses 15 threads, and how many machines on which Sage gets
> used can handle that well?

Agreed. 15 is a silly number.

>
> With the patch (#6283), "make ptest" does the *right* thing on 99+
> percent of machines, and it's very easy to override it if necessary.

I'm not so convinced of this. It is fine on workstations, but not on big
multi-user servers. I would have set a maximum limit of 8 by default.

> I'm thinking that here, the perfect is the enemy of the good. We can
> revisit this problem when sage-support is flooded with people
> overloading their Sun machines when they run "make ptest". :)

It would *not* be *their* Sun machines. It would be Sun machines owned
by a university department or similar, designed to be used by a number
of people. As such, people should rarely be making full use of it. (It
will be fine on 't2' at the minute, but not in general).

I think 't2' was not an ideal choice for us. The T2+ processors are not
designed for what we are using them for.

But on a multi-core machine using multiple Opteron or Xeon processors,
using every available thread for Sage is not a good idea in my opinion.

Dave

Minh Nguyen

unread,
Sep 25, 2009, 3:38:30 AM
to sage-...@googlegroups.com
On Fri, Sep 25, 2009 at 5:19 PM, Dr. David Kirkby
<david....@onetel.net> wrote:

<SNIP>

> I'm not so convinced of this. It is fine on workstations, but not on big
> multi-user servers. I would have stuck a maximum limit of 8 by default.

A default of 1 would work for any (or at least most) machines.

--
Regards
Minh Van Nguyen

William Stein

unread,
Sep 25, 2009, 3:43:36 AM
to sage-devel, Casey Palowitch
Sun specifically donated that box to us for creating an optimized Sage
*notebook* server system. It was not for doing compute-bound research
work. It may well still turn out to be the case that T2 is a good choice
for running a notebook server with many simultaneous connections. I
don't know yet. It's not inconceivable, since I/O, many threads, etc.
are all important for serving, say, hundreds of clients at once all
doing basic calculus (which isn't CPU-bound).

The main concern I have so far is security -- I don't quite understand
how to 100% safely sandbox processes on T2. However, my understanding
is that Solaris Zones are supposed to do exactly this in an elegant
way.

Since I'm working on the notebook right now, one of my upcoming projects
will be testing ways to reconfigure and optimize the notebook server so
it can take full advantage of the unique capabilities of T2. Since I've
separated the notebook from the core Sage library, at least it will be
easy to do all this without having to worry about all the trouble of
building Sage (the notebook itself is pure Python).

-- William

--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

Robert Bradshaw

unread,
Sep 25, 2009, 4:21:41 AM
to sage-...@googlegroups.com

That would defeat the purpose of ptest (as opposed to just test...)
Capping at 8 seems both easy and reasonable.

- Robert
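The cap being discussed amounts to something like this (a sketch only; the value 8 is just the number suggested above, not a decided default):

```python
import multiprocessing

def default_test_threads(cap=8):
    # Use the detected CPU count, but cap it by default so that
    # "make ptest" stays polite on large shared machines like 't2'.
    return min(multiprocessing.cpu_count(), cap)
```

A user on a big server could still override the cap explicitly, which keeps the common case automatic.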

Robert Bradshaw

unread,
Sep 25, 2009, 4:29:06 AM
to sage-...@googlegroups.com

Another thought just came to mind--I've seen benefits to testing with
1 more thread than I have cores, as it's not entirely CPU-bound.

- Robert
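That heuristic (one more worker than the core count, since doctesting is not entirely CPU-bound) would look like the following sketch; the `extra` knob is purely illustrative, not an actual Sage option:

```python
import multiprocessing

def oversubscribed_threads(extra=1):
    # One more worker than logical CPUs can help when some
    # tests block on I/O rather than saturating a core.
    return multiprocessing.cpu_count() + extra
```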

Dr. David Kirkby

unread,
Sep 25, 2009, 11:01:57 AM
to sage-...@googlegroups.com, Casey Palowitch
William Stein wrote:

> SUN specifically donated that box to us for creating an optimized Sage
> *notebook* server system. It was not for doing research
> compute-bound work. It may well still turn out to be the case that T2
> is a good choice for running a notebook server with many simultaneous
> connections. I don't know yet. It's not inconceivable since i/o,
> many threads, etc., are all important for having say hundreds of
> clients at once all doing basic calculus (which isn't CPU bound).

Yes, that could certainly be so.

> The main concern I have so far is security -- I don't quite understand
> how to 100% safely sandbox processes on T2. However, my understanding
> is that Solaris Zones are supposed to do exactly this in an elegant
> way.

Zones work well. I am going to try to find out how secure it would be to
share an NFS file system into a zone. Given that 'disk' is a file
server, I don't know if there would be any security issues in sharing
that read/write to a zone.

There's no local disk space on 't2', which I think might be
advantageous for a machine designed to run multiple Sage instances for
untrusted users. I'm just a bit concerned that someone who managed to
hack a zone might be able to do a bit more damage, given that an NFS
share is available to them too.

Perhaps my fears are unfounded.


Dave


William Stein

unread,
Sep 25, 2009, 12:14:48 PM
to sage-...@googlegroups.com
On Fri, Sep 25, 2009 at 8:01 AM, Dr. David Kirkby
<david....@onetel.net> wrote:
>
> William Stein wrote:
>
>> SUN specifically donated that box to us for creating an optimized Sage
>> *notebook* server system.   It was not for doing research
>> compute-bound work.  It may well still turn out to be the case that T2
>> is a good choice for running a notebook server with many simultaneous
>> connections.  I don't know yet.  It's not inconceivable since i/o,
>> many threads, etc., are all important for having say hundreds of
>> clients at once all doing basic calculus (which isn't CPU bound).
>
> Yes, that could certainly be so.
>
>> The main concern I have so far is security -- I don't quite understand
>> how to 100% safely sandbox processes on T2.  However, my understanding
>> is that Solaris Zones are supposed to do exactly this in an elegant
>> way.
>
> Zones work well. I am going to try to find out how secure it would be to
> share an NFS file system into a zone. The fact 'disk' is a file server,
> I don't know if there would be any security issues in sharing that
> read/write to a zone.
>
> There's no disk space on 't2' that is local, which I think might be

Isn't this 22GB of unused local disk space:

$ df
...
rootpool2/scratch 30G 8.4G 22G 29% /scratch

22GB is plenty of local disk space for running something like a notebook server.
For performance and security reasons one would probably want to use
local disk space.


William
