Running M2 on a multi-processor machine

Elizabeth Arnold

Feb 3, 2010, 4:56:06 PM
to Macaulay2
I am trying to run some serial code on a parallel machine to take
advantage of its larger memory. However, the computations are quite a
bit slower than on my laptop! The sysadmin says that since the machine
has a lot of processors, M2 seems to think that it can spawn a lot of
threads, which are stepping all over each other for resources. Do you
know if there is an option in M2 to set a cap on the number of threads?
Thanks,
Beth Arnold

Thomas Kahle

Feb 3, 2010, 5:52:41 PM
to maca...@googlegroups.com

I have seen this too on our 16-core machines, and could never figure out
a reason for it or a way around it.
For some time I suspected the power management of the Linux kernel. It
looked like the M2 process was switching frequently between different
cores, which would trigger CPU frequency scaling of the individual
cores.
However, with recent versions of the Linux kernel M2 stays on physically
the same core, but is still slower than on my laptop, while a simple
C program with a stupid loop runs several times faster than on the
laptop.

I asked about this on the mailing list. Look in the archive for a thread
"M2 Performance on Opteron Processor".

regards
Thomas

--
Thomas Kahle

The fundamental theorem of algebra is open source. Like any other
mathematical theorem it can be applied free of charge and everybody
has access to its proof and can convince himself how it works. Why
should software be any different?

Dan Grayson

Feb 3, 2010, 6:24:53 PM
to Macaulay2
I don't see why having multiple threads would be a problem: usually a
program benefits from parallel execution. (If M2 doesn't use multiple
cores, I suspect a problem with the way we compiled it.)

Anyway, try running M2 like this:

GC_NPROCS=3 GC_MARKERS=3 M2

Here is the explanation from the appropriate readme file for the
garbage collector we use:

GC_NPROCS=<n> - Linux w/threads only. Explicitly sets the number of
    processors that the GC should expect to use. Note that setting this
    to 1 when multiple processors are available will preserve
    correctness, but may lead to really horrible performance, since the
    lock implementation will immediately yield without first spinning.

GC_MARKERS=<n> - Only if compiled with PARALLEL_MARK. Set the number
    of marker threads. This is normally set to the number of
    processors. It is safer to adjust GC_MARKERS than GC_NPROCS,
    since GC_MARKERS has no impact on the lock implementation.

We do compile with PARALLEL_MARK.
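
If you want to try several values without retyping the variables, a small wrapper along these lines might help. This is only a sketch: the function name run_with_gc_caps is made up for illustration and is not part of M2 or the garbage collector. It just sets the two environment variables quoted above for a single command, leaving the rest of your shell session untouched.

```shell
#!/bin/sh
# Hypothetical helper (not part of M2 or the collector): run a command
# with both collector settings from the readme excerpt set to N.
run_with_gc_caps() {
    n="$1"   # value to use for both variables
    shift    # the remaining arguments are the command to run
    # GC_MARKERS: number of marker threads (needs PARALLEL_MARK).
    # GC_NPROCS:  number of processors the collector expects to use.
    # The assignments apply only to the environment of this one command.
    GC_NPROCS="$n" GC_MARKERS="$n" "$@"
}

# Usage (assumes M2 is on your PATH), equivalent to the command above:
#   run_with_gc_caps 3 M2
```

Going by the readme text, GC_MARKERS is the safer of the two to adjust, so if a small value makes things worse it may be worth setting GC_MARKERS alone and leaving GC_NPROCS unset.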
