platform independent way of getting the number of processors

Ondrej Certik

Mar 28, 2009, 5:53:46 PM
to sage-...@googlegroups.com
Hi,

I am trying to figure out the best way to automatically determine the
number of processors and use that information to speed up the Sage build.
What is the best way of doing it?

If I can assume Python is on the system, then one can just use:

    import os

    def ncpus():
        # for Linux, Unix and MacOS
        if hasattr(os, "sysconf"):
            if "SC_NPROCESSORS_ONLN" in os.sysconf_names:
                # Linux and Unix
                ncpus = os.sysconf("SC_NPROCESSORS_ONLN")
                if isinstance(ncpus, int) and ncpus > 0:
                    return ncpus
            else:
                # MacOS X (os.popen instead of os.popen2, which newer
                # Python versions no longer have)
                return int(os.popen("sysctl -n hw.ncpu").read())
        # for Windows
        if "NUMBER_OF_PROCESSORS" in os.environ:
            ncpus = int(os.environ["NUMBER_OF_PROCESSORS"])
            if ncpus > 0:
                return ncpus
        # return the default value
        return 1
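
For what it's worth, newer Python versions have this in the standard library: os.cpu_count() (Python 3.4 and later) returns the number of logical CPUs, or None when it cannot be determined. A minimal sketch, assuming such a Python is available; the wrapper name ncpus_stdlib is only for illustration:

```python
import os


def ncpus_stdlib():
    """Return the logical CPU count, falling back to 1 if it is unknown."""
    # os.cpu_count() returns None when the count cannot be determined.
    count = os.cpu_count()
    return count if count and count > 0 else 1


if __name__ == "__main__":
    print(ncpus_stdlib())
```

The fallback to 1 mirrors the default value returned by ncpus() above.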

If Python is not available, then I can use this simple C program:

http://github.com/certik/sysconf/blob/master/ncpus.c

but I suspect this will not work on Mac or Windows. So if it only
works on Linux, one can just run:

cat /proc/cpuinfo | grep processor | wc -l

no need to compile anything.
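
One caveat with the plain grep: it counts every line containing the word "processor", so it may overcount if some other line (a model name, say) happens to contain it. Below is a small sketch of the same count that only looks at the key before the colon. It is written in Python just to show the logic; the function names are made up for this example, and it assumes the usual "key : value" layout of /proc/cpuinfo.

```python
def count_processors(cpuinfo_text):
    """Count 'processor' entries in text formatted like /proc/cpuinfo."""
    count = 0
    for line in cpuinfo_text.splitlines():
        # Compare only the key before the colon, so a "model name" line
        # that mentions the word "processor" is not counted.
        key = line.split(":", 1)[0].strip()
        if key == "processor":
            count += 1
    return count


def ncpus_from_proc(path="/proc/cpuinfo"):
    """Return the count from the given file, or None if it is unusable."""
    try:
        with open(path) as f:
            return count_processors(f.read()) or None
    except OSError:
        # No such file: not Linux, or /proc is not mounted.
        return None
```

Returning None instead of 0 lets a caller tell "could not find out" apart from a real answer and move on to the next method.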

So it seems to me that a good strategy might be to write a bash script
that will:

1) check whether Python is installed; if so, run ncpus()
2) try to compile the above C program; if it builds, run it
3) if it doesn't build, we are probably on a Mac, so run sysctl -n
hw.ncpu (I don't have a Mac to test it on, but I guess there
might be a way to write the C program portably so that it
works on both Linux and Mac)

Alternatively, I can change 2) to:

2) try: cat /proc/cpuinfo | grep processor | wc -l

In any case, this bash script would be used only up to building
Python. Once we have Python, we can use ncpus() from that point on.
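
The try-each-method-in-order idea can be sketched as follows. This is in Python purely for illustration (the real pre-Python bootstrap would have to be shell), and first_positive and the probes passed to it are hypothetical names, not anything that exists in Sage:

```python
def first_positive(probes, default=1):
    """Run probes in order and return the first positive integer result."""
    for probe in probes:
        try:
            value = probe()
        except Exception:
            # A probe that fails (missing tool, unsupported OS) is skipped.
            continue
        if isinstance(value, int) and value > 0:
            return value
    # Nothing worked, so fall back to a safe default.
    return default
```

Each probe would wrap one of the methods above (the Python function, the compiled C program, sysctl, /proc/cpuinfo), and the default of 1 keeps the build working even when every probe fails.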

Any ideas welcome.

Ondrej

Ondrej Certik

Mar 28, 2009, 6:18:08 PM
to sage-...@googlegroups.com
> 3) if it doesn't build, we are on Mac probably, so run sysctl -n
> hw.ncpu     (I don't have any Mac to test it on, but I guess there
> might be a way to actually write the C program in a portable way to
> work both on linux and Mac)

So according to this thread:

http://www.nabble.com/-patch--libgomp-detect---of-cpus-Darwin-FreeBSD-td19577956.html

it seems that the following code should work on Mac:

----------
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int main()
{
    int ncpus = 1;
    size_t len = sizeof(ncpus);
    sysctl((int[2]) {CTL_HW, HW_NCPU}, 2, &ncpus, &len, NULL, 0);
    printf("%d", ncpus);
    return 0;
}
-----------------------

On Linux it complains that CTL_HW and HW_NCPU are not defined, but
according to this FAQ:

http://www.osxfaq.com/man/3/sysctl.ws

it seems it should be defined in sys/sysctl.h on Mac. So maybe a
simple portable C program that works on all platforms could be a
solution.


Ondrej

Elliott

Mar 28, 2009, 6:59:28 PM
to sage-devel
If the user has Java installed, you could execute a .class file to get
this information; for example:

public class NumProcessors {
    public static void main(String[] args) {
        System.out.println(Runtime.getRuntime().availableProcessors());
    }
}

If you compiled this and put the resulting .class file in the
directory of your script, you would just have to execute "java
NumProcessors" and then capture its output. This would be better than
trying to handle a different case for each OS, I think.

Elliott

Roman Pearce

Mar 28, 2009, 7:40:15 PM
to sage-devel
/* Linux */
#define _GNU_SOURCE   /* for cpu_set_t and sched_getaffinity() */
#include <sched.h>
#include <string.h>   /* memset() */

static inline int num_processors()
{
    unsigned int bit;
    int np;
    cpu_set_t aff;
    memset(&aff, 0, sizeof(aff));
    sched_getaffinity(0, sizeof(aff), &aff);
    /* count the bits set in the affinity mask */
    for (np = 0, bit = 0; bit < 8*sizeof(aff); bit++)
        np += (((char *)&aff)[bit / 8] >> (bit % 8)) & 1;
    return np;
}


/* Mac OS X */
#include <sys/types.h>
#include <sys/sysctl.h>
static inline int num_processors()
{
    int np = 1;
    size_t length = sizeof(np);
    sysctlbyname("hw.ncpu", &np, &length, NULL, 0);
    return np;
}


/* Windows NT */
#include <windows.h>
static inline int num_processors()
{
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    return info.dwNumberOfProcessors;
}
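
For comparison, the affinity-based count of the Linux variant above can be sketched in Python. Note that it answers a slightly different question than a count of online processors: it gives the number of CPUs the current process is allowed to run on, which can be smaller if the process was restricted (for example with taskset). os.sched_getaffinity() exists in Python 3.3 and later on Linux; the function name usable_cpus is made up for this sketch.

```python
import os


def usable_cpus():
    """Return the number of CPUs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: size of the affinity mask of the current process (pid 0).
        return len(os.sched_getaffinity(0))
    # Elsewhere, fall back to the total logical CPU count (or 1 if unknown).
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(usable_cpus())
```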

Ondrej Certik

Mar 29, 2009, 3:49:15 AM
to sage-...@googlegroups.com
Hi Roman,

Thanks a lot for the code!

I just tried the following code on several Linux distributions (Debian,
Ubuntu, Gentoo, Red Hat, OpenSUSE) and on OS X 10.5 Intel and it seems
to just work everywhere:

#include <unistd.h>
#include <stdio.h>

int main()
{
    int ncpus;
    ncpus = sysconf(_SC_NPROCESSORS_ONLN);
    printf("%d", ncpus);
    return 0;
}

It will not work on Windows, but is there any reason to use
different code for Linux and Mac, if the above seems to be working
just fine?

Ondrej

Ondrej Certik

Mar 29, 2009, 3:52:00 AM
to sage-...@googlegroups.com
Hi Elliott,

On Sat, Mar 28, 2009 at 3:59 PM, Elliott <elliott...@gmail.com> wrote:
>
> If the user has Java installed, you could execute a .class file to get
> this information; for example:
>
> public class NumProcessors {
>  public static void main(String[] args) {
>    System.out.println(Runtime.getRuntime().availableProcessors());
>  }
> }
>
> If you compiled this and put the resulting .class file in the
> directory of your script, you would just have to execute "java
> NumProcessors" and then capture its output. This would be better than
> trying to handle a different case for each OS, I think.

Thanks, that's indeed nice if Java is installed. I think when
Python is installed, the ncpus() function (which is also in Sage btw)
works fine. I am looking for something that will work even if Python
and Java are not installed, but I think I found it, see my previous
email.

Ondrej

Peter Jeremy

Mar 29, 2009, 2:34:11 PM
to sage-...@googlegroups.com
On 2009-Mar-28 14:53:46 -0700, Ondrej Certik <ond...@certik.cz> wrote:
>I am trying to figure out the best way to automatically determine the
>number of processors and use that information to speed up the Sage build.

Note that this should be able to be over-ridden by the operator -
just because a system has (say) 8 cores available doesn't mean that
the sage build should use them all.

>If Python is not available, then I can use this simple C program:
>
>http://github.com/certik/sysconf/blob/master/ncpus.c
>
>but I suspect this will not work on Mac or Windows.

sysconf() is part of POSIX so it should work in any POSIX environment.
Microsoft made a big claim about Windows being POSIX compliant so it
should work there - but may need to link against special libraries.
It should work on OS-X (though I can't test it).

>2) try: cat /proc/cpuinfo | grep processor | wc -l

That is far less portable than sysconf() because it _only_ works
on Linux, whereas sysconf() should work on nearly all Unix systems
(and some others).

--
Peter Jeremy

Ondrej Certik

Mar 29, 2009, 2:55:48 PM
to sage-...@googlegroups.com
On Sun, Mar 29, 2009 at 11:34 AM, Peter Jeremy
<peter...@optushome.com.au> wrote:
> On 2009-Mar-28 14:53:46 -0700, Ondrej Certik <ond...@certik.cz> wrote:
>>I am trying to figure out the best way to automatically determine the
>>number of processors and use that information to speed up the Sage build.
>
> Note that this should be able to be over-ridden by the operator -
> just because a system has (say) 8 cores available doesn't mean that
> the sage build should use them all.

Yes, exactly. I am still thinking of the best way to do that. I think
the default should be to just use one processor, but it should be easy
to tell it to use all processors, or just a certain number of them.

I think I will just add more targets to the makefile in the top
directory, e.g. something like

make # use 1 processor
make parallel # use all processors
JOBS=3 make # use 3 processors
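
The precedence behind these three invocations could be sketched like this. It is only a sketch: resolve_jobs and its parameters are hypothetical names, and letting an explicit JOBS value win even over the "parallel" target is my assumption, not something decided yet.

```python
def resolve_jobs(environ, detected, use_all=False):
    """Pick the number of parallel jobs for the build.

    environ  -- mapping such as os.environ
    detected -- CPU count found by whatever detection method is in use
    use_all  -- True for the "make parallel" style target
    """
    override = environ.get("JOBS", "").strip()
    if override:
        try:
            jobs = int(override)
        except ValueError:
            jobs = 0  # not a number: ignore the override
        if jobs > 0:
            return jobs  # an explicit operator choice wins
    if use_all:
        return max(1, detected)
    return 1  # conservative default: a single processor
```

With 8 detected processors, an empty environment gives 1 job, use_all=True gives 8, and JOBS=3 gives 3 in either case.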

>
>>If Python is not available, then I can use this simple C program:
>>
>>http://github.com/certik/sysconf/blob/master/ncpus.c
>>
>>but I suspect this will not work on Mac or Windows.
>
> sysconf() is part of POSIX so it should work in any POSIX environment.
> Microsoft made a big claim about Windows being POSIX compliant so it
> should work there - but may need to link against special libraries.
> It should work on OS-X (though I can't test it).
>
>>2) try: cat /proc/cpuinfo | grep processor | wc -l
>
> That is far less portable than sysconf() because it _only_ works
> on Linux, whereas sysconf() should work on nearly all Unix systems
> (and some others).

Yes, indeed, I found out that sysconf() does work on Mac. So that's awesome.


Ondrej

William Stein

Mar 29, 2009, 3:37:42 PM
to sage-...@googlegroups.com, spd...@googlegroups.com
On Sun, Mar 29, 2009 at 11:34 AM, Peter Jeremy
<peter...@optushome.com.au> wrote:
> On 2009-Mar-28 14:53:46 -0700, Ondrej Certik <ond...@certik.cz> wrote:
>>I am trying to figure out the best way to automatically determine the
>>number of processors and use that information to speed up the Sage build.
>
> Note that this should be able to be over-ridden by the operator -
> just because a system has (say) 8 cores available doesn't mean that
> the sage build should use them all.

Just for the record this is a discussion about "Simple Python
Distribution", not Sage. What Ondrej is doing doesn't a priori have
anything to do with how Sage is built (though I of course hope it
will). I've cc'd this to the spd-dev list since the discussion would
also make sense there:

http://groups.google.com/group/spd-dev/about

and there are no discussions there yet.

>>If Python is not available, then I can use this simple C program:
>>
>>http://github.com/certik/sysconf/blob/master/ncpus.c
>>
>>but I suspect this will not work on Mac or Windows.
>
> sysconf() is part of POSIX so it should work in any POSIX environment.
> Microsoft made a big claim about Windows being POSIX compliant so it
> should work there - but may need to link against special libraries.
> It should work on OS-X (though I can't test it).

Microsoft Windows only implements POSIX.1. According to Wikipedia:
"Because only the first version of POSIX (POSIX.1) is implemented, a
POSIX application cannot create a thread or window, nor can it use RPC
or socket. Instead of implementing the later versions of POSIX,
Microsoft offers Windows Services for UNIX."
http://en.wikipedia.org/wiki/Microsoft_POSIX_subsystem

I don't know whether sysconf is in POSIX.1 or not.

>
>>2) try: cat /proc/cpuinfo | grep processor | wc -l
>
> That is far less portable than sysconf() because it _only_ works
> on Linux, whereas sysconf() should work on nearly all Unix systems
> (and some others).
>
> --
> Peter Jeremy
>

--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

Roman Pearce

Mar 29, 2009, 3:51:41 PM
to sage-devel
On Mar 29, 12:49 am, Ondrej Certik <ond...@certik.cz> wrote:
> I just tried the following code on several linuxes (Debian, Ubuntu,
> Gentoo, Red Hat, OpenSUSE) and on OS X 10.5 Intel and it seems to just
> work everywhere:
>
> #include "unistd.h"
> #include "stdio.h"
>
> int main()
> {
>     int ncpus;
>     ncpus = sysconf(_SC_NPROCESSORS_ONLN);
>     printf("%d", ncpus);
>     return 0;
>
> }

This is the best way. I am using sysconf on POSIX systems now.

Peter Jeremy

Mar 30, 2009, 6:33:25 AM
to sage-...@googlegroups.com
On 2009-Mar-29 11:55:48 -0700, Ondrej Certik <ond...@certik.cz> wrote:
>I think I will just add more targets to the makefile in the top
>directory, e.g. something like
>
>make # use 1 processor
>make parallel # use all processors
>JOBS=3 make # use 3 processors

FWIW, FreeBSD has just implemented something similar in its ports
system. It defaults to a number of parallel builds equal to the
number of cores present and can be over-ridden (eg to 3) with:
make MAKE_JOBS_NUMBER=3 ...
I can't say that that particular name really grabs me but I
offer it as one example of prior art.

>> sysconf() is part of POSIX so it should work in any POSIX environment.
>> Microsoft made a big claim about Windows being POSIX compliant so it

To address William's later comment, sysconf() appears to be part of
IEEE Std 1003.1-1988 (POSIX.1), though I'm not sure whether POSIX
requires all the values to be queryable. (And you are still up
against Microsoft complying with the letter, rather than the spirit, of
POSIX.)

--
Peter Jeremy
