Mathematica 7 is now available


Wolfram Research

Nov 19, 2008, 5:40:40 AM

Today, Wolfram Research announced Mathematica 7, a major release
that accelerates the drive to integrate and automate
functionality as core Mathematica capabilities, adding image
processing, parallel high-performance computing (HPC), new
on-demand curated data, and other recently developed
computational innovations--in total over 500 new functions and 12
application areas.

Image processing is one key integrated addition.
Industrial-strength, high-performance functions for image
composition, transformation, enhancement, and segmentation
combine with the existing Mathematica infrastructure of
high-level language, automated interface construction,
interactive notebook documents, and computational power to create
a uniquely versatile image processing solution.
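
For example, a first look at the new image functions might be as simple
as this (a minimal sketch using a built-in sample image):

img = ExampleData[{"TestImage", "Lena"}];     (* built-in sample image *)
GraphicsRow[{img, EdgeDetect[img], Binarize[img]}]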

Built-in parallel computing is another key new area of
integration in Mathematica 7 (and a first across technical
computing). For the first time, every copy of Mathematica comes
standard with the technology to parallelize computations over
multiple cores or over networks of Mathematica deployed across a
grid. Every copy of Mathematica 7 comes with four computation
processes included. More processes as well as network
capabilities can be added easily.
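
As a minimal sketch of what this looks like in practice, an independent
computation can be spread over the bundled kernels with a one-word change:

Table[PrimeQ[2^p - 1], {p, 9000, 9100}]          (* one kernel *)
ParallelTable[PrimeQ[2^p - 1], {p, 9000, 9100}]  (* all launched kernels *)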

Computable data sources, introduced in Mathematica 6, are unique
and popular innovations because of the ease with which data can
be utilized in Mathematica. Mathematica 7 builds on this with
major additions including the complete human genome, as well as
weather, astronomical, GIS, and geodesy data. Example uses
include finding, analyzing, and visualizing gene
sequences--making use of Mathematica's powerful string
capabilities (including new string alignment functionality),
pattern matching, and statistics. Similarly, both real-time and
historical weather data from 16,000 weather stations is included
in Mathematica 7, giving everyone from climatologists to
economists curated information to use in their analyses or
applications.
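
For instance (a rough sketch; the gene and city names here are
illustrative, following the curated-data documentation):

GenomeData["IGF2", "SequenceLength"]       (* length of a human gene sequence *)
SequenceAlignment["ACGGTACCT", "ACGTTAC"]  (* the new string-alignment function *)
WeatherData["Boston", "Temperature"]       (* current curated weather-station value *)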

Other areas of innovation in Mathematica 7 include:

* Charting and information visualization
* Vector field visualization
* Comprehensive spline support, including NURBS
* Industrial-strength Boolean computation
* Statistical model analysis
* Integrated geodesy and GIS data
* Many symbolic computation breakthroughs, including discrete
calculus, sequence recognition, and transcendental roots

To learn more about the enhancements available in Mathematica 7
and to see the full list of new features, visit:
http://www.wolfram.com/mathematica/newin7

NOTE: Mathematica users with Premier Service will receive an
email in the next few days with instructions on how to download
their free upgrades.

Szabolcs

Nov 20, 2008, 4:55:34 AM

Hello,

I would like to know if the two bugs mentioned here (and in other
messages) are still present in 7.0:

http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_frm/thread/265f5619682b67e/0188372330785945

I'd also be curious about the performance (especially memory use) of
the graphics system. In version 6, ListDensityPlot could use up an
enormous amount of memory when high order interpolation was turned on
(much much more memory than the size of the final graphic that was
generated).
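
The effect can be measured with something like this (a rough sketch;
the numbers will vary by machine):

data = RandomReal[1, {200, 200}];
before = MemoryInUse[];
ListDensityPlot[data, InterpolationOrder -> 2];
MemoryInUse[] - before   (* in v6, vastly more than the size of the graphic *)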


Finally I'd like to say that, though this might seem like a very minor
point compared to the other new features, I appreciate the addition of
line cap and join controls very much (CapForm, JoinForm). Those were
sorely missed in v6!
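
E.g., something like:

Graphics[{Thickness[0.05], CapForm["Round"], JoinForm["Round"],
  Line[{{0, 0}, {1, 1}, {2, 0}}]}]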

Murray Eisenberg

Nov 21, 2008, 5:31:43 AM

Mathematica 7.0: the first results are 8 for the Union and also 8 for
the Tally. With the different numbers, neither are 2 and 4 isomorphic,
nor are 4 and 2.

--
Murray Eisenberg mur...@math.umass.edu
Mathematics & Statistics Dept.
Lederle Graduate Research Tower phone 413 549-1020 (H)
University of Massachusetts 413 545-2859 (W)
710 North Pleasant Street fax 413 545-1801
Amherst, MA 01003-9305

DrMajorBob

Nov 21, 2008, 5:32:04 AM

That PARTICULAR failure of Tally, at least, seems to be gone in version 7.

No promises in general, mind you.

Bobby

On Thu, 20 Nov 2008 03:55:39 -0600, Szabolcs <szho...@gmail.com> wrote:

> Hello,
>
> I would like to know if the two bugs mentioned here (and in other
> messages) are still present in 7.0:
>
> http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_frm/thread/265f5619682b67e/0188372330785945
>
> I'd also be curious about the performance (especially memory use) of
> the graphics system. In version 6, ListDensityPlot could use up an
> enormous amount of memory when high order interpolation was turned on
> (much much more memory than the size of the final graphic that was
> generated).
>
>
> Finally I'd like to say that, though this might seem like a very minor
> point compared to the other new features, I appreciate the addition of
> line cap and join controls very much (CapForm, JoinForm). Those were
> sorely missed in v6!
>

--
DrMaj...@longhorns.com

Alexei Boulbitch

Nov 21, 2008, 5:33:07 AM

Dear MathGroup members,

I would like to put a trivial though important question to you. For some
time I have tried to find problems that are best solved* using
Mathematica and where parallel computing would be necessary. So far I
have not succeeded in finding any. This tells me that I do not
understand something important. I would be grateful if you could give a
few examples. Quite frankly, this mail is not meant to claim that
parallelization is useless. On the contrary, I need a few realistic
examples to make up my mind and begin to understand whether I may need
parallelization myself.


* Just to explain the point: molecular dynamics simulation, for
instance, often requires parallelization, but there Fortran or C would
probably be the methods of choice rather than Mathematica. In
analytical calculations, in contrast, Mathematica would be the method
of choice, but in that case I do not see how one could benefit from
parallelization.

It is evident how this question is related to the M7 release.

Best regards, and thank you in advance, Alexei

--
Alexei Boulbitch, Dr., Habil.
Senior Scientist

IEE S.A.
ZAE Weiergewan
11, rue Edmond Reuter
L-5326 Contern
Luxembourg

Phone: +352 2454 2566
Fax: +352 2454 3566

Website: www.iee.lu


Helen Read

Nov 21, 2008, 5:34:23 AM

Szabolcs wrote:
> Hello,
>
> I would like to know if the two bugs mentioned here (and in other
> messages) are still present in 7.0:
>
> http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_frm/thread/265f5619682b67e/0188372330785945

I'd like to know if the ability to set a RevolutionAxis (which was
present in the old SurfaceOfRevolution, and is sorely missing in
RevolutionPlot3D) has been added to RevolutionPlot3D.

--
Helen Read
University of Vermont

Michael Weyrauch

Nov 21, 2008, 5:32:35 AM

The Tally[] problem is solved. This I checked with a prerelease version
of Mathematica 7 I got at this year's Mathematica users conference.

Michael


Szabolcs wrote:

David Park

Nov 22, 2008, 6:10:10 AM

Yes, there is a new RevolutionAxis option in Version 7.


David Park
djm...@comcast.net
http://home.comcast.net/~djmpark

Albert Retey

Nov 22, 2008, 6:09:59 AM

Hi Alexei,

> I would like to put a trivial though important question to you. For some
> time I have tried to find problems that are best solved* using
> Mathematica and where parallel computing would be necessary. So far I
> have not succeeded in finding any. This tells me that I do not
> understand something important. I would be grateful if you could give a
> few examples. Quite frankly, this mail is not meant to claim that
> parallelization is useless. On the contrary, I need a few realistic
> examples to make up my mind and begin to understand whether I may need
> parallelization myself.
>
>
> * Just to explain the point: molecular dynamics simulation, for
> instance, often requires parallelization, but there Fortran or C would
> probably be the methods of choice rather than Mathematica. In
> analytical calculations, in contrast, Mathematica would be the method
> of choice, but in that case I do not see how one could benefit from
> parallelization.

I think in the above you are implicitly making at least two
presuppositions that are not well justified, and are probably plain wrong:

1) That whether or not an algorithm can be successfully parallelized has
anything to do with its being analytical or not.

Whether or not an algorithm can take advantage of parallelism depends
only on whether there are parts of the algorithm which can be executed
in parallel without suffering too much from communication overhead. It
has nothing to do with what these parts do: numerics, analytics, or
anything else. Trivial examples of easily parallelizable algorithms are
those called "embarrassingly parallel", where you do completely
independent calculations in parallel.

One example I know of where analytical calculations can profit from
"embarrassing" parallelism is multiloop calculations in quantum field
theory, where you have to calculate many, sometimes thousands, of
Feynman diagrams _analytically_, and in the end sum all of these up.
Since each of these diagrams can be calculated independently of the
others, an "embarrassingly" parallel algorithm would just distribute
their calculation to the available processors and gain almost linear
speedup, since there is practically no communication overhead.
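
A toy version of that pattern (just a sketch, with simple integrals
standing in for the diagrams):

(* each term is computed independently on its own kernel, then summed *)
terms = ParallelTable[Integrate[x^k Exp[-x], {x, 0, Infinity}], {k, 1, 40}];
Total[terms]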

2) That Mathematica is not useful for anything but analytic calculations
(e.g. numerical analysis), probably because it is too slow.

For every useful language/system/tool there is a trade-off between
execution speed, time of development, code maintenance, and many other
aspects. Of course, when you neglect the amount of time and work you
have to invest, you will always be able to write code in C or Fortran
that is faster than Mathematica code. The same is true for assembler
code, which will always outperform C or Fortran code if decently
written. Both these statements of course imply that the programmer
knows his business; otherwise Wolfram's developers, or the authors of
the compiler in use, will probably do a better job, although their code
must be much more general.

There are many aspects that contribute to a well-reasoned decision about
which tool/system/language is best for a specific task, and pure
execution speed is in many cases one of the less important factors. If
you can get a result by running 100 NDSolves in parallel within half an
hour, I would not even think about which ODE library to link into my
Fortran program to build a program that probably gets the same result in
1 minute -- once I have spent 2 days writing it (and another week
debugging that stupid off-by-one index). Let alone writing specialized
solver code that makes best use of the Cell processors of the new
supercomputer so it will run the same thing in 5 seconds.
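
A sketch of such a sweep, with a made-up ODE:

solutions = ParallelTable[
   First[y /. NDSolve[{y'[t] == -a y[t] + Sin[t], y[0] == 1},
      y, {t, 0, 10}]],
   {a, 0.1, 10., 0.1}];  (* 100 independent NDSolve runs, one per parameter *)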

> ... and where parallel computing would be necessary

Usually parallel computing cannot do more for you than speed up your
calculations. So when would parallel computing be _necessary_? I can
think of two cases:

1) The calculation times for a problem would be too long to wait for
(something on the order of years). For these cases to be solvable in
reasonable time you will need massive parallelism; that is, you are
talking about probably tens of thousands of processors. Here it
certainly makes sense to spend some time coding to get maximal
execution speed. This is probably not the class of problems that
Mathematica 7 really addresses.

2) The result is only useful if it is calculated fast enough. E.g. a
weather forecast for the next day is useless if it takes 4 days to
calculate. Reducing this by parallel computing to half a day would make
the same program useful, and I can see many cases where that would make
Mathematica programs useful which, without parallelism, would be
useless.

In all other cases parallelism just helps you to do more in less time,
which is often a good enough reason to make use of it. If you can
increase your throughput by a factor of 4 with the 4 kernels Mathematica
7 lets you use, you can probably make your boss quite happy...

hth,

albert

Brett Champion

Nov 22, 2008, 6:09:37 AM

On Nov 21, 2008, at 4:34 AM, Helen Read wrote:

> Szabolcs wrote:
>> Hello,
>>
>> I would like to know if the two bugs mentioned here (and in other
>> messages) are still present in 7.0:
>>
>> http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_frm/thread/265f5619682b67e/0188372330785945
>

> I'd like to know if the ability to set a RevolutionAxis (which was
> present in the old SurfaceOfRevolution, and is sorely missing in
> RevolutionPlot3D) has been added to RevolutionPlot3D.

Yes.

http://reference.wolfram.com/mathematica/ref/RevolutionAxis.html

Brett

Szabolcs

Nov 22, 2008, 6:09:27 AM

On Nov 21, 11:34 am, Helen Read <r...@math.uvm.edu> wrote:

> Szabolcs wrote:
> > Hello,
>
> > I would like to know if the two bugs mentioned here (and in other
> > messages) are still present in 7.0:
>
> >http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_...

>
> I'd like to know if the ability to set a RevolutionAxis (which was
> present in the old SurfaceOfRevolution, and is sorely missing in
> RevolutionPlot3D) has been added to RevolutionPlot3D.
>

Hello Helen,

RevolutionPlot3D does have a RevolutionAxis option in Mathematica
6.0.3. Does it differ in any way from the RevolutionAxis option of
SurfaceOfRevolution?

Try this:

RevolutionPlot3D[Cos[x], {x, 0, Pi}, RevolutionAxis -> {1, 1, 0}]

J Davis

Nov 22, 2008, 6:13:31 AM

I hope the problem where Exporting or Save As... for a Plot3D to PDF
format (on the Windows platform) resulted in HUGE file sizes has been
addressed in 7.

--JD

Ney Lemke

Nov 22, 2008, 6:13:52 AM

Hello,

Can someone explain why Mathematica is going directly to version 7.0?
Version 6.0 was a big improvement, but it was not very polished; I had
some problems using it, especially with the documentation system and
some Java issues. I am wondering whether, with this new version, we
will get another big improvement but still be left coping with the same
kinds of problems.

On Nov 21, 8:32 am, Michael Weyrauch <michael.weyra...@gmx.de> wrote:
> The Tally[] problem is solved. This I checked with a prerelease version
> of Mathematica 7 I got at this year's Mathematica users conference.
>
> Michael
>
> Szabolcs wrote:
>

> > Hello,
>
> > I would like to know if the two bugs mentioned here (and in other
> > messages) are still present in 7.0:
>

> >http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_...

Szabolcs

Nov 22, 2008, 6:14:25 AM

On Nov 21, 11:32 am, Michael Weyrauch <michael.weyra...@gmx.de> wrote:
> The Tally[] problem is solved. This I checked with a prerelease version
> of Mathematica 7 I got at this year's Mathematica users conference.
>

What about the other bug (the eigenvalue problem), linked from the
same thread I mentioned?

I copied the (wrong) results from Mathematica 6 here:


In[1]:= mat = {{-6, 0, -Sqrt[3], 0, 0, Sqrt[3], 0, 0, 0, 0, 0, 0, 0,
0}, {0, -6, 0, -Sqrt[3], 0, 0, Sqrt[3], 0, 0, 0, 0, 0, 0,
0}, {-Sqrt[3], 0, -4, 2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4),
2 Sqrt[2/3], 0, 0, Sqrt[3], 0, 0, 0, 0, 0, 0}, {0, -Sqrt[3],
2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), -4/3, -(2 Sqrt[2])/3, 0, 0, 0,
Sqrt[3], 0, 0, 0, 0, 0}, {0, 0, 2 Sqrt[2/3], -(2 Sqrt[2])/3, 7/3,
0, 0, 0, 0, Sqrt[3], 0, 0, 0, 0}, {Sqrt[3], 0, 0, 0, 0, -4, 0,
2 (-1/(4 Sqrt[3]) + Sqrt[3]/4), 0, 0, 2 Sqrt[2/3], 0, 0, 0}, {0,
Sqrt[3], 0, 0, 0, 0, -4, 0, 2 (-1/(4 Sqrt[3]) + Sqrt[3]/4), 0, 0,
2 Sqrt[2/3], 0, 0}, {0, 0, Sqrt[3], 0, 0,
2 (-1/(4 Sqrt[3]) + Sqrt[3]/4), 0, -14/3,
2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), 2 Sqrt[2/3], (2 Sqrt[2])/3, 0,
0, 0}, {0, 0, 0, Sqrt[3], 0, 0, 2 (-1/(4 Sqrt[3]) + Sqrt[3]/4),
2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), -2, -(2 Sqrt[2])/3,
0, (2 Sqrt[2])/3, 0, 0}, {0, 0, 0, 0, Sqrt[3], 0, 0,
2 Sqrt[2/3], -(2 Sqrt[2])/3, -7/3, 0, 0,
2 (1/(3 Sqrt[2]) + (2 Sqrt[2])/3), Sqrt[10/3]}, {0, 0, 0, 0, 0,
2 Sqrt[2/3], 0, (2 Sqrt[2])/3, 0, 0, -16/3,
2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), 2 Sqrt[2/3], 0}, {0, 0, 0, 0,
0, 0, 2 Sqrt[2/3], 0, (2 Sqrt[2])/3, 0,
2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), -8/3, -(2 Sqrt[2])/3, 0}, {0,
0, 0, 0, 0, 0, 0, 0, 0, 2 (1/(3 Sqrt[2]) + (2 Sqrt[2])/3),
2 Sqrt[2/3], -(2 Sqrt[2])/3, 1/2,
2 (-Sqrt[5/3]/16 - Sqrt[15]/16)}, {0, 0, 0, 0, 0, 0, 0, 0, 0,
Sqrt[10/3], 0, 0, 2 (-Sqrt[5/3]/16 - Sqrt[15]/16), 7/2}};

In[2]:= mat === Conjugate@Transpose[mat]
Out[2]= True

(mat is Hermitian so we expect real eigenvalues.)

In[3]:= N@Eigenvalues[mat]

Out[3]= {-9.41358 + 0.88758 I, -9.41358 - 0.88758 I,
 -7.37965 + 2.32729 I, -7.37965 - 2.32729 I, -4.46655 + 2.59738 I,
 -4.46655 - 2.59738 I, 4.36971, 3.21081, -2.32456 + 2.10914 I,
 -2.32456 - 2.10914 I, 2.04366 + 0.552265 I, 2.04366 - 0.552265 I,
 -0.249588 + 1.29034 I, -0.249588 - 1.29034 I}

In[4]:= Eigenvalues[N[mat]]

Out[4]= {-9.09122, -7.41855, -7.41855, -7.2915, 4.33734, -4., -4., \
3.2915, -3.24612, -2.38787, -2.38787, 1.80642, 1.80642,
9.21707*10^-16}

Murray Eisenberg

Nov 24, 2008, 4:07:10 AM

Mathematica 7.0 (Windows 32 bit). For your mat:

N@Eigenvalues[mat] // InputForm

{-9.091215416949623, -7.4185507188738455, -7.4185507188738455,
-7.291502622129181, 4.337337307188519, -4., -4., 3.2915026221291814,
-3.2461218902388955, -2.387873132949261, -2.387873132949261,
1.8064238518231066, 1.8064238518231066, 0.}

--

Curtis F. Osterhoudt

Nov 24, 2008, 4:07:53 AM

It seems that the sizes of the resulting PDFs are more reasonable now,
but there are still problems with strange artifacts appearing. If the
exported graphic is to look more like it does on screen, the "use bitmap
representation" option helps the look immensely, but it obviously leads
to more jaggies and to much larger file sizes.

Helen Read

Nov 24, 2008, 4:08:46 AM

Nice. This will make my students happy. Now they won't have to stand on
their heads and interchange x and y and set ViewVertical->{-1,0,0} to
revolve around the x-axis.

Helen Read

Nov 24, 2008, 4:08:35 AM

Wow, they sure snuck it in. It was missing prior to 6.0.3, and it's
nowhere to be found in the Documentation for 6.0.3.

DrMajorBob

Nov 24, 2008, 4:08:24 AM

Yes, that bug is gone too. (For that example.)

mat === Conjugate@Transpose[mat]

True

nEigen = N@Eigenvalues[mat]

{-9.09122, -7.41855, -7.41855, -7.2915, 4.33734, -4., -4., 3.2915,
 -3.24612, -2.38787, -2.38787, 1.80642, 1.80642, 0.}

eigenN = Eigenvalues[N[mat]]

{-9.09122, -7.41855, -7.41855, -7.2915, 4.33734, -4., -4., 3.2915,
 -3.24612, -2.38787, -2.38787, 1.80642, 1.80642, -9.93145*10^-16}

nEigen - eigenN // Chop

{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}

Bobby

--
DrMaj...@longhorns.com

Nasser Abbasi

Nov 24, 2008, 4:10:53 AM

"Szabolcs" <szho...@gmail.com> wrote in message
news:gg8pih$k4r$1...@smc.vnet.net...

It seems to be fixed in M7:

In[36]:= $Version
Out[36]= 7.0 for Microsoft Windows (32-bit) (November 10, 2008)

In[32]:= mat = {{-6, 0, -Sqrt[3], 0, 0, Sqrt[3], 0, 0, 0, 0, 0, 0, 0, 0},
    {0, -6, 0, -Sqrt[3], 0, 0, Sqrt[3], 0, 0, 0, 0, 0, 0, 0},
    {-Sqrt[3], 0, -4, 2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/4),
     2*Sqrt[2/3], 0, 0, Sqrt[3], 0, 0, 0, 0, 0, 0},
    {0, -Sqrt[3], 2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/4), -4/3,
     -(2*Sqrt[2])/3, 0, 0, 0, Sqrt[3], 0, 0, 0, 0, 0},
    {0, 0, 2*Sqrt[2/3], -(2*Sqrt[2])/3, 7/3, 0, 0, 0, 0, Sqrt[3],
     0, 0, 0, 0},
    {Sqrt[3], 0, 0, 0, 0, -4, 0, 2*(-(4*Sqrt[3])^(-1) + Sqrt[3]/4),
     0, 0, 2*Sqrt[2/3], 0, 0, 0},
    {0, Sqrt[3], 0, 0, 0, 0, -4, 0,
     2*(-(4*Sqrt[3])^(-1) + Sqrt[3]/4), 0, 0, 2*Sqrt[2/3], 0, 0},
    {0, 0, Sqrt[3], 0, 0, 2*(-(4*Sqrt[3])^(-1) + Sqrt[3]/4), 0,
     -14/3, 2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/4), 2*Sqrt[2/3],
     (2*Sqrt[2])/3, 0, 0, 0},
    {0, 0, 0, Sqrt[3], 0, 0, 2*(-(4*Sqrt[3])^(-1) + Sqrt[3]/4),
     2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/4), -2, -(2*Sqrt[2])/3, 0,
     (2*Sqrt[2])/3, 0, 0},
    {0, 0, 0, 0, Sqrt[3], 0, 0, 2*Sqrt[2/3], -(2*Sqrt[2])/3, -7/3,
     0, 0, 2*(1/(3*Sqrt[2]) + (2*Sqrt[2])/3), Sqrt[10/3]},
    {0, 0, 0, 0, 0, 2*Sqrt[2/3], 0, (2*Sqrt[2])/3, 0, 0, -16/3,
     2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/4), 2*Sqrt[2/3], 0},
    {0, 0, 0, 0, 0, 0, 2*Sqrt[2/3], 0, (2*Sqrt[2])/3, 0,
     2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/4), -8/3, -(2*Sqrt[2])/3, 0},
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 2*(1/(3*Sqrt[2]) + (2*Sqrt[2])/3),
     2*Sqrt[2/3], -(2*Sqrt[2])/3, 1/2,
     2*(-Sqrt[5/3]/16 - Sqrt[15]/16)},
    {0, 0, 0, 0, 0, 0, 0, 0, 0, Sqrt[10/3], 0, 0,
     2*(-Sqrt[5/3]/16 - Sqrt[15]/16), 7/2}};

In[33]:= mat === Conjugate[Transpose[mat]]
Out[33]= True

In[34]:= N[Eigenvalues[mat]]
Out[34]= {-9.091215416949623, -7.4185507188738455, -7.4185507188738455,
 -7.291502622129181, 4.337337307188519, -4., -4., 3.2915026221291814,
 -3.2461218902388955, -2.387873132949261, -2.387873132949261,
 1.8064238518231066, 1.8064238518231066, 0.}

In[35]:= Eigenvalues[N[mat]]
Out[35]= {-9.091215416949622, -7.4185507188738455, -7.418550718873844,
 -7.291502622129181, 4.337337307188519, -4.000000000000002,
 -3.999999999999999, 3.2915026221291814, -3.246121890238896,
 -2.387873132949261, -2.3878731329492604, 1.8064238518231066,
 1.8064238518231046, -2.8189256280805394*^-16}

Nasser


Mark Westwood

Nov 24, 2008, 6:45:14 AM

Hi Alexei

There are 2 possible justifications for parallel computing:

1) Just because it's there, and it's an interesting set of problems
to tackle. But it's mainly 'there' because:

2) Serial execution on a single CPU is too slow (or, as a variant,
your problem requires so much memory that you need to use the address
space of more than one process(or))

When is serial execution on a single CPU too slow? Well, that's the
question you have to answer for your problems and practices. Do you
want to wait 3 days for a serial answer, or spend 3 weeks
parallelising your program to wait 1 hour? That depends on how often
you are going to run the program, I suspect.

The questions you raise about parallel Mathematica and parallel
Fortran or C are interesting ones too. I do a fair bit of Mathematica
programming but my day job is really Fortran programming for solving
problems in computational electromagnetics. This is one of those
fields where parallel computing is firmly entrenched; solving large
inverse problems just needs lots of CPUs and (usually) lots of memory.

For me, the availability of parallel Mathematica, and a desktop with 2
quad-core processors, changes some of the variables in the equations
that determine whether a program should be written in Mathematica or in
Fortran. However, I have a strong preference for slow programs easily
written in Mathematica, easily integrated with graphics and all the
rest.

I expect you'll get a lot of other views on this one.

Regards

Mark Westwood



Murray Eisenberg

Nov 25, 2008, 7:15:45 AM

But...

$Version
6.0 for Microsoft Windows (32-bit) (May 21, 2008)
Options[RevolutionPlot3D][[51]]
RevolutionAxis -> {0, 0, 1}


Helen Read wrote:
> Szabolcs wrote:

> Wow, they sure snuck it in. It was missing prior to 6.0.3, and it's
> nowhere to be found in the Documentation for 6.0.3.
>
> --
> Helen Read
> University of Vermont
>

--

Jon Harrop

Dec 4, 2008, 7:15:23 AM

Alexei Boulbitch wrote:
> Dear MathGroup members,
>
> I would like to put a trivial though important question to you. For some
> time I have tried to find problems that are best solved* using
> Mathematica and where parallel computing would be necessary. So far I
> have not succeeded in finding any. This tells me that I do not
> understand something important...

Parallelism is used when performance is important. In general, Mathematica
is used when performance is largely unimportant. Consequently, they are
almost mutually exclusive.

For example, you would need to parallelize the following Mathematica program
perfectly across 700,000 CPU cores to obtain performance comparable to the
equivalent OCaml program:

delta = Sqrt[$MachineEpsilon];

RaySphere[o_, d_, c_, r_] :=
Block[{v, b, disc, t1, t2},
v = c - o;
b = v.d;
disc = Sqrt[b^2 - v.v + r^2];
t2 = b + disc;
If[Im[disc] != 0 || t2 <= 0, \[Infinity],
t1 = b - disc;
If[t1 > 0, t1, t2]]
]

Intersect[o_, d_][{lambda_, n_}, Sphere[c_, r_]] :=
Block[{lambda2 = RaySphere[o, d, c, r]},
If[lambda2 >= lambda, {lambda, n}, {lambda2,
Normalize[o + lambda2 d - c]}]
]
Intersect[o_, d_][{lambda_, n_}, Bound[c_, r_, s_]] :=
Block[{lambda2 = RaySphere[o, d, c, r]},
If[lambda2 >= lambda, {lambda, n},
Fold[Intersect[o, d], {lambda, n}, s]]
]

neglight = N@Normalize[{1, 3, -2}];

nohit = {\[Infinity], {0, 0, 0}};

RayTrace[o_, d_, scene_] :=
Block[{lambda, n, g, p},
{lambda, n} = Intersect[o, d][nohit, scene];
If[lambda == \[Infinity], 0,
g = n.neglight;
If[g <= 0, 0,
{lambda, n} =
Intersect[o + lambda d + delta n, neglight][nohit, scene];
If[lambda < \[Infinity], 0, g]]]
]

Create[level_, c_, r_] :=
Block[{obj = Sphere[c, r]},
If[level == 1, obj,
Block[{a = 3*r/Sqrt[12], Aux},
Aux[x1_, z1_] := Create[level - 1, c + {x1, a, z1}, 0.5 r];
Bound[c,
3 r, {obj, Aux[-a, -a], Aux[a, -a], Aux[-a, a], Aux[a, a]}]]]]

scene = Create[1, {0, -1, 4}, 1];

Main[level_, n_, ss_] :=
Block[{scene = Create[level, {0, -1, 4}, 1]},
Table[
Sum[
RayTrace[{0, 0, 0},
N@Normalize[{(x + s/ss/ss)/n - 1/2, (y + Mod[s, ss]/ss)/n - 1/2,
1}], scene], {s, 0, ss^2 - 1}]/ss^2, {y, 0, n - 1},
{x, 0, n - 1}]]

AbsoluteTiming[Export["image.pgm", Graphics@Raster@Main[9, 512, 4]]]

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u

Michael Weyrauch

Dec 5, 2008, 5:30:49 AM

Hello,

> Parallelism is used when performance is important. In general, Mathematica
> is used when performance is largely unimportant. Consequently, they are
> almost mutually exclusive.

This is a strange statement, which I hardly understand. First of all, in
many of my numerical calculations performance is important (e.g. solving
large sets of coupled differential equations) and Mathematica compares
rather well with compiled languages (in a few cases I can compare with
compiled code which does the same calculations as my Mathematica code)

Within such calculations often linear algebra routines are used, which
were parallelized in Mathematica even before Version 7, and, of course
it makes a big difference if your Computer has just one or more
processors available. So compared to compiled languages with programs
that are not parallelized (which is often the case) Mathematica may even
have an advantage.

>
> For example, you would need to parallelize the following Mathematica program
> perfectly across 700,000 CPU cores to obtain performance comparable to the
> equivalent OCaml program:
>

I don't know OCaml. Therefore, could you elaborate a bit on how this
tremendous difference comes about and under what conditions?

Thanks Michael

Jon Harrop

Dec 8, 2008, 6:21:43 AM

Michael Weyrauch wrote:
> Hello,
>
>> Parallelism is used when performance is important. In general,
>> Mathematica is used when performance is largely unimportant.
>> Consequently, they are almost mutually exclusive.
>
> This is a strange statement, which I hardly understand. First of all, in
> many of my numerical calculations performance is important (e.g. solving
> large sets of coupled differential equations) and Mathematica compares
> rather well with compiled languages (in a few cases I can compare with
> compiled code which does the same calculations as my Mathematica code)
>
> Within such calculations often linear algebra routines are used, which
> were parallelized in Mathematica even before Version 7, and, of course
> it makes a big difference if your Computer has just one or more
> processors available. So compared to compiled languages with programs
> that are not parallelized (which is often the case) Mathematica may even
> have an advantage.

Yes, and that is true of any task where Mathematica already bundles a decent
implementation that you can simply invoke. However, it is not true in
general.

>> For example, you would need to parallelize the following Mathematica
>> program perfectly across 700,000 CPU cores to obtain performance
>> comparable to the equivalent OCaml program:
>
> I don't know OCaml.

This is not specific to OCaml. Here are equivalent implementations of that
program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and all well
over 100,000x faster than the Mathematica I gave (and many are more concise
as well!):

http://www.ffconsultancy.com/languages/ray_tracer/

> Therefore, could you elaborate a bit on how this tremendous difference
> comes about and under what conditions?

Mathematica programs are only competitively performant when the bottleneck
is one of Mathematica's internal routines (FFT, Schur decomposition,
regular expressions etc.). Optimizing Mathematica code largely entails
reimplementing your code in such a way that stress is moved onto these
internal functions even if that is objectively inappropriate (e.g. in the
case of the recent example on this list with the subject "Floating-Point
Computing").

If you cannot express a solution to your problem in such a way then
Mathematica's performance is likely to be dire, often up to a thousand
times slower than a compiled language. The program I gave hits several of
Mathematica's inefficiencies and, consequently, it takes days to compute
instead of seconds.

I highly recommend learning when these performance issues are likely to
arise because you can save a huge amount of time by developing from scratch
in a performant language when you know performance will be important.

Daniel Lichtblau

Dec 9, 2008, 6:55:37 AM

Jon Harrop wrote:

> Michael Weyrauch wrote:
>> Hello,
>>
>>> Parallelism is used when performance is important. In general,
>>> Mathematica is used when performance is largely unimportant.
>>> Consequently, they are almost mutually exclusive.
>> This is a strange statement, which I hardly understand. First of all, in
>> many of my numerical calculations performance is important (e.g. solving
>> large sets of coupled differential equations) and Mathematica compares
>> rather well with compiled languages (in a few cases I can compare with
>> compiled code which does the same calculations as my Mathematica code)
>>
>> Within such calculations often linear algebra routines are used, which
>> were parallelized in Mathematica even before Version 7, and, of course
>> it makes a big difference if your Computer has just one or more
>> processors available. So compared to compiled languages with programs
>> that are not parallelized (which is often the case) Mathematica may even
>> have an advantage.
>
> Yes, and that is true of any task where Mathematica already bundles a decent
> implementation that you can simply invoke. However, it is not true in
> general.
>
>>> For example, you would need to parallelize the following Mathematica
>>> program perfectly across 700,000 CPU cores to obtain performance
>>> comparable to the equivalent OCaml program:
>> I don't know OCaml.
>
> This is not specific to OCaml. Here are equivalent implementations of that
> program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and all well
> over 100,000x faster than the Mathematica I gave (and many are more concise
> as well!):
>
> http://www.ffconsultancy.com/languages/ray_tracer/

I do not understand this claim that the speed difference is 5 orders of
magnitude. Some experiments indicate 3, perhaps.


>> Therefore, could you elaborate a bit on how this tremendous difference
>> comes about and under what conditions?
>

> Mathematica programs are only competitively performant when the bottleneck
> is one of Mathematica's internal routines (FFT, Schur decomposition,
> regular expressions etc.). Optimizing Mathematica code largely entails
> reimplementing your code in such a way that stress is moved onto these
> internal functions even if that is objectively inappropriate (e.g. in the
> case of the recent example on this list with the subject "Floating-Point
> Computing").
>
> If you cannot express a solution to your problem in such a way then
> Mathematica's performance is likely to be dire, often up to a thousand
> times slower than a compiled language. The program I gave hits several of
> Mathematica's inefficiencies and, consequently, it takes days to compute
> instead of seconds.
>
> I highly recommend learning when these performance issues are likely to
> arise because you can save a huge amount of time by developing from scratch
> in a performant language when you know performance will be important.

If you have code that is readily written in a compiled language, it is
certainly the case that it will run faster there than in an interpreted
language. Probably much faster. So yes, that would be the route to take.

Some modest improvements to the original Mathematica code, though, bring
it to within maybe 2 orders of magnitude of the OCaml speed. In the variant
below, I make sure everything is a machine real number or machine integer,
whichever is appropriate. I use Compile on RaySphere. There are a few
other tweaks for general efficiency (some might apply to the OCaml
implementation, I'm not sure).

delta = Sqrt[$MachineEpsilon];
inf = 100000.;
neglight = Normalize[{1., 3., -2.}];
nohit = {inf, {0., 0., 0.}};

Clear[RaySphere, Intersect, Create, RayTrace, Main];

RaySphere = With[{inf = inf}, Compile[
{{c, _Real, 1}, {o, _Real, 1}, {d, _Real, 1}, {r, _Real}},
Module[{v = c - o, b, disc, t},
b = v.d;
disc = b^2 - v.v + r;
If[disc < 0., inf,
disc = Sqrt[disc];
If[b > disc, b - disc,
t = b + disc;
If[t < 0., inf, t]]]]]];

Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_]] := Block[
{lambda2 = RaySphere[c, o, d, r]},
If[lambda2 >= lambda, ln, {lambda2, c}]]

Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_, s_]] := Block[
{lambda2 = RaySphere[c, o, d, r]},
If[lambda2 >= lambda, ln,
Fold[Intersect[o, d, #1, #2] &, ln, s]
]
]

RayTrace[o_, d_, scene_] := Block[
{lambda, n, g, p},
{lambda, n} = Intersect[o, d, nohit, scene];
If[lambda >= inf, 0.,
n = Normalize[o + d*lambda - n];
g = n.neglight;
If[g <= 0., 0.,
{lambda, n} =
Intersect[o + lambda d + delta n, neglight, nohit, scene];
If[lambda < inf, 0., g]]]]

Create[level_, c_, r_] :=
Block[{obj = List[c, r^2]},
If[level == 1, obj,
Block[{a = 3*r/Sqrt[12], Aux},
Aux[x1_, z1_] := Create[level - 1, c + {x1, a, z1}, 0.5 r];
List[c,
9*r^2, {obj, Aux[-a, -a], Aux[a, -a], Aux[-a, a], Aux[a, a]}]]]]

prepareTable = Compile[{{n, _Integer}, {s, _Integer}},
With[{nss = N[s]},
Flatten[
Table[Normalize[{(x + s1/nss)/n - .5, (y + s2/nss)/n - .5,
1.}], {s1, 0, s - 1}, {s2, 0, s - 1}, {y, 0, n - 1}, {x, 0,
n - 1}], 1]
]];

Main[level_, n_, ss_] :=
Block[{scene = Create[level, {0., -1., 4.}, 1.], nss = N[ss]},
scene = MapAll[Developer`ToPackedArray, scene];
Total[Map[RayTrace[{0., 0., 0.}, #, scene] &,
prepareTable[n, ss], {3}], 1]/nss^2]

In[22]:= Timing[ff512 = Main[9, 512, 4];]
Out[22]= {1542.27, Null}

At this point, it might make sense to use ParallelMap instead of Map, in
Main.

Daniel Lichtblau
Wolfram Research

Jon Harrop

Dec 10, 2008, 4:49:18 AM

Daniel Lichtblau wrote:

> Jon Harrop wrote:
>> This is not specific to OCaml. Here are equivalent implementations of
>> that program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and all
>> well over 100,000x faster than the Mathematica I gave (and many are more
>> concise as well!):
>>
>> http://www.ffconsultancy.com/languages/ray_tracer/
>
> I do not understand this claim that the speed difference is 5 orders of
> magnitude. Some experiments indicate 3, perhaps.

The code I gave is over 5 orders of magnitude slower by my measurements.

> Some modest improvements to the original Mathematica code, though, bring
> it to within maybe 2 orders of magnitude of the OCaml speed.

That is a huge improvement. Thank you.

> In the variant
> below, I make sure everything is a machine real number or machine integer,
> whichever is appropriate. I use Compile on RaySphere. There are a few
> other tweaks for general efficiency (some might apply to the OCaml
> implementation, I'm not sure).

I have a few questions about this. :-)

> delta = Sqrt[$MachineEpsilon];
> inf = 100000.;

This is nasty. Why is infinity not represented internally by the floating
point number infinity in Mathematica?

> neglight = Normalize[{1., 3., -2.}];
> nohit = {inf, {0., 0., 0.}};
>
> Clear[RaySphere, Intersect, Create, RayTrace, Main];
>
> RaySphere = With[{inf = inf}, Compile[

What is the purpose of "With[{inf = inf}, ..]" here?

> {{c, _Real, 1}, {o, _Real, 1}, {d, _Real, 1}, {r, _Real}},
> Module[{v = c - o, b, disc, t},
> b = v.d;
> disc = b^2 - v.v + r;
> If[disc < 0., inf,
> disc = Sqrt[disc];
> If[b > disc, b - disc,
> t = b + disc;
> If[t < 0., inf, t]]]]]];
>
> Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_]] := Block[
> {lambda2 = RaySphere[c, o, d, r]},
> If[lambda2 >= lambda, ln, {lambda2, c}]]

Defer calculation of the normal. Fair enough but it is only done once per
pixel.

> Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_, s_]] := Block[
> {lambda2 = RaySphere[c, o, d, r]},
> If[lambda2 >= lambda, ln,
> Fold[Intersect[o, d, #1, #2] &, ln, s]
> ]
> ]

No currying and a named subpattern, ok.

> RayTrace[o_, d_, scene_] := Block[
> {lambda, n, g, p},
> {lambda, n} = Intersect[o, d, nohit, scene];
> If[lambda >= inf, 0.,
> n = Normalize[o + d*lambda - n];
> g = n.neglight;
> If[g <= 0., 0.,
> {lambda, n} =
> Intersect[o + lambda d + delta n, neglight, nohit, scene];
> If[lambda < inf, 0., g]]]]
>
> Create[level_, c_, r_] :=
> Block[{obj = List[c, r^2]},
> If[level == 1, obj,
> Block[{a = 3*r/Sqrt[12], Aux},
> Aux[x1_, z1_] := Create[level - 1, c + {x1, a, z1}, 0.5 r];
> List[c,
> 9*r^2, {obj, Aux[-a, -a], Aux[a, -a], Aux[-a, a], Aux[a, a]}]]]]
>
> prepareTable = Compile[{{n, _Integer}, {s, _Integer}},
> With[{nss = N[s]},
> Flatten[
> Table[Normalize[{(x + s1/nss)/n - .5, (y + s2/nss)/n - .5,
> 1.}], {s1, 0, s - 1}, {s2, 0, s - 1}, {y, 0, n - 1}, {x, 0,
> n - 1}], 1]
> ]];

So you precompute a matrix of all the ray directions using a compiled
function.

> Main[level_, n_, ss_] :=
> Block[{scene = Create[level, {0., -1., 4.}, 1.], nss = N[ss]},
> scene = MapAll[Developer`ToPackedArray, scene];
> Total[Map[RayTrace[{0., 0., 0.}, #, scene] &,
> prepareTable[n, ss], {3}], 1]/nss^2]
>
> In[22]:= Timing[ff512 = Main[9, 512, 4];]
> Out[22]= {1542.27, Null}

That is orders of magnitude faster than my version but I am not sure why. I
suspect the main performance improvement comes from the use of packed
arrays but I haven't been able to reproduce it in my own code yet.

> At this point, it might make sense to use ParallelMap instead of Map, in
> Main.

Replacing Map with ParallelMap slows it down, even with
Method -> "CoarsestGrained". How can I get a speedup from my 8 cores?! :-)

Many thanks,

Daniel Lichtblau

Dec 11, 2008, 3:42:50 AM

[My comments are interspersed with the questions. --dl]

Jon Harrop wrote:
> Daniel Lichtblau wrote:
>> Jon Harrop wrote:

>>> This is not specific to OCaml. Here are equivalent implementations of
>>> that program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and all
>>> well over 100,000x faster than the Mathematica I gave (and many are more
>>> concise as well!):
>>>
>>> http://www.ffconsultancy.com/languages/ray_tracer/
>> I do not understand this claim that the speed difference is 5 orders of
>> magnitude. Some experiments indicate 3, perhaps.
>

> The code I gave is over 5 orders of magnitude slower by my measurements.

Offhand I do not know what would account for this. I ran it in
Mathematica 7 (a 64 bit Linux machine) on

Main[3, 52, 4]

and observed something around 7-10 times slower than what I coded. I did
not try it on larger scenes, so possibly there is something going wrong
in the recursion. Certainly any of these codes will scale linearly with
the number of pixels (so quadratically with n), so my using a dimension
of 52 instead of 512 here should not matter for assessment of relative
speeds.


>> Some modest improvements to the original Mathematica code, though, bring
>> it to within maybe 2 orders of magnitude of the OCaml speed.
>

> That is a huge improvement. Thank you.
>

>> In the variant
>> below, I make sure everything is a machine real number or machine integer,
>> whichever is appropriate. I use Compile on RaySphere. There are a few
>> other tweaks for general efficiency (some might apply to the OCaml
>> implementation, I'm not sure).
>

> I have a few questions about this. :-)
>

>> delta = Sqrt[$MachineEpsilon];
>> inf = 100000.;
>

> This is nasty. Why is infinity not represented internally by the floating
> point number infinity in Mathematica?

Mathematica does not use machine double infinities, or bignum infinities
for that matter. Generally when those are encountered in a computation
e.g. with dedicated machine arithmetic, an exception is thrown and
Mathematica attempts to work around it e.g. with software higher precision.

This is not to say there is no reason to have a machine double infinity
in Mathematica. I can say, however, that putting one in would involve
considerable work, and I think it would give but small improvement in
speed or functionality.


>> neglight = Normalize[{1., 3., -2.}];
>> nohit = {inf, {0., 0., 0.}};
>>
>> Clear[RaySphere, Intersect, Create, RayTrace, Main];
>>
>> RaySphere = With[{inf = inf}, Compile[
>

> What is the purpose of "With[{inf = inf}, ..]" here?

This is to inject the evaluated form into the Compile. Else it is a
nuisance to get that form, because Compile has the HoldAll attribute.
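
A small illustration of the difference (hypothetical names f1, f2):

inf = 100000.;
f1 = Compile[{{x, _Real}}, If[x > 0., x, inf]];
(* body still refers to the symbol inf, resolved via the evaluator *)
f2 = With[{inf = inf}, Compile[{{x, _Real}}, If[x > 0., x, inf]]];
(* the literal 100000. is injected before Compile sees the body *)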


>> {{c, _Real, 1}, {o, _Real, 1}, {d, _Real, 1}, {r, _Real}},
>> Module[{v = c - o, b, disc, t},
>> b = v.d;
>> disc = b^2 - v.v + r;
>> If[disc < 0., inf,
>> disc = Sqrt[disc];
>> If[b > disc, b - disc,
>> t = b + disc;
>> If[t < 0., inf, t]]]]]];
>>
>> Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_]] := Block[
>> {lambda2 = RaySphere[c, o, d, r]},
>> If[lambda2 >= lambda, ln, {lambda2, c}]]
>

> Defer calculation of the normal. Fair enough but it is only done once per
> pixel.

It makes a small but measurable difference to speed. Helps a bit more
than you indicate, because the original code uses Normalize inside
Intersect, where it might be done as many times as lambda keeps
decreasing for a given originating call (to Intersect, with the entire
scene).


>> Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_, s_]] := Block[
>> {lambda2 = RaySphere[c, o, d, r]},
>> If[lambda2 >= lambda, ln,
>> Fold[Intersect[o, d, #1, #2] &, ln, s]
>> ]
>> ]
>

> No currying and a named subpattern, ok.
>

>> RayTrace[o_, d_, scene_] := Block[
>> {lambda, n, g, p},
>> {lambda, n} = Intersect[o, d, nohit, scene];
>> If[lambda >= inf, 0.,
>> n = Normalize[o + d*lambda - n];
>> g = n.neglight;
>> If[g <= 0., 0.,
>> {lambda, n} =
>> Intersect[o + lambda d + delta n, neglight, nohit, scene];
>> If[lambda < inf, 0., g]]]]
>>
>> Create[level_, c_, r_] :=
>> Block[{obj = List[c, r^2]},
>> If[level == 1, obj,
>> Block[{a = 3*r/Sqrt[12], Aux},
>> Aux[x1_, z1_] := Create[level - 1, c + {x1, a, z1}, 0.5 r];
>> List[c,
>> 9*r^2, {obj, Aux[-a, -a], Aux[a, -a], Aux[-a, a], Aux[a, a]}]]]]
>>
>> prepareTable = Compile[{{n, _Integer}, {s, _Integer}},
>> With[{nss = N[s]},
>> Flatten[
>> Table[Normalize[{(x + s1/nss)/n - .5, (y + s2/nss)/n - .5,
>> 1.}], {s1, 0, s - 1}, {s2, 0, s - 1}, {y, 0, n - 1}, {x, 0,
>> n - 1}], 1]
>> ]];
>

> So you precompute a matrix of all the ray directions using a compiled
> function.
>

>> Main[level_, n_, ss_] :=
>> Block[{scene = Create[level, {0., -1., 4.}, 1.], nss = N[ss]},
>> scene = MapAll[Developer`ToPackedArray, scene];
>> Total[Map[RayTrace[{0., 0., 0.}, #, scene] &,
>> prepareTable[n, ss], {3}], 1]/nss^2]
>>
>> In[22]:= Timing[ff512 = Main[9, 512, 4];]
>> Out[22]= {1542.27, Null}
>

> That is orders of magnitude faster than my version but I am not sure why. I
> suspect the main performance improvement comes from the use of packed
> arrays but I haven't been able to reproduce it in my own code yet.

Yes, packed arrays, which among other things avoid ALL exact arithmetic
in places you do not want it. Also compiling RaySphere gives, by itself,
a factor of two or more in overall speed gain.
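
One can check this directly (a quick sketch):

Developer`PackedArrayQ[{1., 2., 3.}]   (* False: a typed-in list is not packed *)
packed = Developer`ToPackedArray[{1., 2., 3.}];
Developer`PackedArrayQ[packed]         (* True *)
Developer`ToPackedArray[{1., 2., Sqrt[2]}]  (* unchanged: exact Sqrt[2] prevents packing *)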


>> At this point, it might make sense to use ParallelMap instead of Map, in
>> Main.
>

> Replacing Map with ParallelMap slows it down, even with
> Method -> "CoarsestGrained". How can I get a speedup from my 8 cores?! :-)
>
> Many thanks,

I only made the suggestion, without trying it, because I have no actual
familiarity with ParallelMap. But one possibility: did you use
DistributeDefinitions on all the functions underneath that Map? If not,
that could certainly make it slower rather than faster.
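
That is, something along these lines (a sketch; rays stands for the
precomputed direction table):

LaunchKernels[];   (* start the configured parallel kernels *)
DistributeDefinitions[RaySphere, Intersect, RayTrace, Create, scene];
ParallelMap[RayTrace[{0., 0., 0.}, #, scene] &, rays,
 Method -> "CoarsestGrained"]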


Daniel Lichtblau
Wolfram Research


Jon Harrop

Dec 12, 2008, 6:57:39 AM

On Wednesday 10 December 2008 16:22:15 Daniel Lichtblau wrote:
> Jon Harrop wrote:
> > Daniel Lichtblau wrote:
> >> Jon Harrop wrote:
> >>> This is not specific to OCaml. Here are equivalent implementations of
> >>> that program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and
> >>> all
> >>> well over 100,000x faster than the Mathematica I gave (and many are
> >>> more concise as well!):
> >>>
> >>> http://www.ffconsultancy.com/languages/ray_tracer/
> >>
> >> I do not understand this claim that the speed difference is 5 orders of
> >> magnitude. Some experiments indicate 3, perhaps.
> >
> > The code I gave is over 5 orders of magnitude slower by my measurements.
>
> Offhand I do not know what would account for this. I ran it in
> Mathematica 7 (a 64 bit Linux machine) on
>
> Main[3, 52, 4]
>
> and observed something around 7-10 times slower than what I coded.

You'll get completely different performance results if you lower the first
argument because it changes the scene being rendered. You're only rendering
a handful of spheres with those arguments whereas the original rendered
over 80,000 spheres.

With 9,512,1 your version is 170x faster than my code, which took 26,000
seconds here. The original arguments 9,512,4 would take 16x longer still,
meaning your code is still 640x slower than the fastest compiled languages.
That is the five orders of magnitude I was referring to.

> I did
> not try it on larger scenes, so possibly there is something going wrong
> in the recursion. Certainly any of these codes will scale linearly with
> the number of pixels (so quadratically with n), so my using a dimension
> of 52 instead of 512 here should not matter for assessment of relative
> speeds.

I'd recommend cutting ss down to 1 before reducing the resolution (i.e. use
3,200,1 instead of 3,50,4) but there is still the chance you'll remove
thousands of tiny spheres from the image if you render a big scene, so the
resolution can have a dramatic effect on performance at very low
resolutions.

> >> inf = 100000.;
> >
> > This is nasty. Why is infinity not represented internally by the
> > floating
> > point number infinity in Mathematica?
>
> Mathematica does not use machine double infinities, or bignum infinities
> for that matter. Generally when those are encountered in a computation
> e.g. with dedicated machine arithmetic, an exception is thrown and
> Mathematica attempts to work around it e.g. with software higher
> precision.
>
> This is not to say there is no reason to have a machine double infinity
> in Mathematica. I can say, however, that putting one in would involve
> considerable work, and I think it would give but small improvement in
> speed or functionality.

Surely that is not the case here, where Compile'd code is intended to handle
infinities often? Or does Compile already handle infinity as a
machine-precision float?

> > What is the purpose of "With[{inf = inf}, ..]" here?
>
> This is to inject the evaluated form into the Compile. Else it is a
> nuisance to get that form, because Compile has the HoldAll attribute.

I see. :-)

> > Defer calculation of the normal. Fair enough but it is only done once
> > per
> > pixel.
>
> It makes a small but measurable difference to speed. Helps a bit more
> than you indicate, because the original code uses Normalize inside
> Intersect, where it might be done as many times as lambda keeps
> decreasing for a given originating call (to Intersect, with the entire
> scene).

True. Hierarchical bounds checks mean that lambda cannot be decreased
many times, though I should quantify exactly how often that occurs.

Perhaps a more interesting solution would be to wrap the result in Hold[..]
to get laziness?

> >> In[22]:= Timing[ff512 = Main[9, 512, 4];]
> >> Out[22]= {1542.27, Null}
> >
> > That is orders of magnitude faster than my version but I am not sure
> > why.
> > I suspect the main performance improvement comes from the use of packed
> > arrays but I haven't been able to reproduce it in my own code yet.
>
> Yes, packed arrays, which among other things avoid ALL exact arithmetic
> in places you do not want it. Also compiling RaySphere gives, by itself,
> a factor of two or more in overall speed gain.

I had already tried Compile but never got anywhere near that performance. I
suspect your packed arrays and precalculated normalized vectors do the rest.

> >> At this point, it might make sense to use ParallelMap instead of Map,
> >> in Main.
> >
> > Replacing Map with ParallelMap slows it down, even with
> > Method -> "CoarsestGrained". How can I get a speedup from my 8 cores?!
> > :-)
> >
> > Many thanks,
>
> I only made the suggestion, without trying it, because I have no actual
> familiarity with ParallelMap. But one possibility: did you use
> DistributeDefinitions on all the functions underneath that Map? If not,
> that could certainly make it slower rather than faster.

I've managed to get rid of all the errors by calling DistributeDefinitions
with the symbol names of all functions *and* the local "scene" variable
right before the ParallelMap. However, this still doesn't give any
performance improvement (and it only spawns 4 kernels for my 8 cores).

Might make a nice parallelism demo for Mathematica 7 if you can get it
working...

Jean-Marc Gulliet

Dec 14, 2008, 7:32:28 AM

Jon Harrop wrote:

<snip>

> I've managed to get rid of all the errors by calling DistributeDefinitions
> with the symbol names of all functions *and* the local "scene" variable
> right before the ParallelMap. However, this still doesn't give any
> performance improvement (and it only spawns 4 kernels for my 8 cores).

John,

You seem to believe that Mathematica 7.0 uses as many cores as are
available on a given machine (the marketing documentation is quite
misleading in this respect).

However, the default license allows only up to 4 cores to be used (of
course, one can upgrade one's license by paying an additional fee).

The number of cores can also be restricted by the user. Check the
"Parallel" tab in the "Preferences" dialog box (go to the menu "Edit" ->
"Preferences", at the bottom) [1].

Regards,
- Jean-Marc

[1] _Configuring and Monitoring_,
http://reference.wolfram.com/mathematica/ParallelTools/tutorial/ConfiguringAndMonitoring.html


Jon Harrop

Dec 15, 2008, 7:48:41 AM

Jean-Marc Gulliet wrote:

> Jon Harrop wrote:
>> I've managed to get rid of all the errors by calling
>> DistributeDefinitions with the symbol names of all functions *and* the
>> local "scene" variable right before the ParallelMap. However, this still
>> doesn't give any performance improvement (and it only spawns 4 kernels
>> for my 8 cores).
>
> John,
>
> You seem to believe that Mathematica 7.0 uses as many cores as are
> available on a given machine (the marketing documentation is quite
> misleading in this respect).
>
> However, the default license allows only up to 4 cores to be used (of
> course, one can upgrade one's license by paying an additional fee).

Yes. I believe I succeeded in parallelizing the ray tracer across 4 of my 8
cores but it gave no speedup. Specifically, I had 1 core at 100% usage,
three more at 20% usage and performance was degraded significantly.

For reference, my final timings of our implementations are:

420,000s  My Mathematica
  2,500s  Daniel's Mathematica
    3.9s  My serial OCaml
    1.4s  My parallel OCaml on 8 cores

So the performance difference really was five orders of magnitude as my
preliminary tests had indicated.
