
Nov 19, 2008, 5:40:40 AM

Today, Wolfram Research announced Mathematica 7, a major release

that accelerates the drive to integrate and automate

functionality as core Mathematica capabilities, adding image

processing, parallel high-performance computing (HPC), new

on-demand curated data, and other recently developed

computational innovations--in total over 500 new functions and 12

application areas.


Image processing is one key integrated addition.

Industrial-strength, high-performance functions for image

composition, transformation, enhancement, and segmentation

combine with the existing Mathematica infrastructure of

high-level language, automated interface construction,

interactive notebook documents, and computational power to create

a uniquely versatile image processing solution.

Built-in parallel computing is another key new area of

integration in Mathematica 7 (and a first across technical

computing). For the first time, every copy of Mathematica comes

standard with the technology to parallelize computations over

multiple cores or over networks of Mathematica deployed across a

grid. Every copy of Mathematica 7 comes with four computation

processes included. More processes as well as network

capabilities can be added easily.
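As a minimal sketch of what this looks like in practice (assuming a standard Mathematica 7 installation; the timing gain depends on your hardware and core count), the bundled kernels are driven by the Parallel* functions:

```mathematica
(* Start the computation kernels bundled with Mathematica 7
   (up to four in a standard license) *)
LaunchKernels[]

(* A task whose elements are independent parallelizes directly:
   compare the serial and parallel timings *)
Table[PrimeQ[2^p - 1], {p, 1000, 1040}] // AbsoluteTiming
ParallelTable[PrimeQ[2^p - 1], {p, 1000, 1040}] // AbsoluteTiming
```

Both calls return the same list of True/False values; only the wall-clock time differs.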

Computable data sources, introduced in Mathematica 6, are unique

and popular innovations because of the ease with which data can

be utilized in Mathematica. Mathematica 7 builds on this with

major additions including the complete human genome, as well as

weather, astronomical, GIS, and geodesy data. Example uses

include finding, analyzing, and visualizing gene

sequences--making use of Mathematica's powerful string

capabilities (including new string alignment functionality),

pattern matching, and statistics. Similarly, both real-time and

historical weather data from 16,000 weather stations is included

in Mathematica 7, giving everyone from climatologists to

economists curated information to use in their analyses or

applications.
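To illustrate the curated data functions mentioned above (a hedged sketch: both fetch from Wolfram's data servers, so an internet connection is required, and the exact property names should be checked in the documentation):

```mathematica
(* Length of the curated sequence for a named human gene *)
GenomeData["IGH", "SequenceLength"]

(* Monthly mean temperatures recorded near Boston during 2008 *)
WeatherData["Boston", "MeanTemperature",
  {{2008, 1, 1}, {2008, 12, 31}, "Month"}]
```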

Other areas of innovation in Mathematica 7 include:

* Charting and information visualization

* Vector field visualization

* Comprehensive spline support, including NURBS

* Industrial-strength Boolean computation

* Statistical model analysis

* Integrated geodesy and GIS data

* Many symbolic computation breakthroughs, including discrete

calculus, sequence recognition, and transcendental roots

To learn more about the enhancements available in Mathematica 7

and to see the full list of new features, visit:

http://www.wolfram.com/mathematica/newin7

NOTE: Mathematica users with Premier Service will receive an

email in the next few days with instructions on how to download

their free upgrades.

Nov 20, 2008, 4:55:34 AM

Hello,

I would like to know if the two bugs mentioned here (and in other

messages) are still present in 7.0:

I'd also be curious about the performance (especially memory use) of

the graphics system. In version 6, ListDensityPlot could use up an

enormous amount of memory when high order interpolation was turned on

(much much more memory than the size of the final graphic that was

generated).

Finally I'd like to say that, though this might seem like a very minor

point compared to the other new features, I appreciate the addition of

line cap and join controls very much (CapForm, JoinForm). Those were

very sorely missed in v6!

Nov 21, 2008, 5:31:43 AM

Mathematica 7.0: The first results are 8 for the Union and also 8 for

the Tally. With the different number, neither are 2 and 4 isomorphic

nor are 4 and 2 isomorphic.


--

Murray Eisenberg mur...@math.umass.edu

Mathematics & Statistics Dept.

Lederle Graduate Research Tower phone 413 549-1020 (H)

University of Massachusetts 413 545-2859 (W)

710 North Pleasant Street fax 413 545-1801

Amherst, MA 01003-9305

Nov 21, 2008, 5:32:04 AM

That PARTICULAR failure of Tally, at least, seems to be gone in version 7.

No promises in general, mind you.

Bobby

On Thu, 20 Nov 2008 03:55:39 -0600, Szabolcs <szho...@gmail.com> wrote:

> Hello,

>

> I would like to know if the two bugs mentioned here (and in other

> messages) are still present in 7.0:

>

> http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_frm/thread/265f5619682b67e/0188372330785945

>

> I'd also be curious about the performance (especially memory use) of

> the graphics system. In version 6, ListDensityPlot could use up an

> enormous amount of memory when high order interpolation was turned on

> (much much more memory than the size of the final graphic that was

> generated).

>

>

> Finally I'd like to say that, though this might seem like a very minor

> point compared to the other new features, I appreciate the addition of

> line cap and join controls very much (CapForm, JoinForm). Those very

> sorely missed in v6!

>

Nov 21, 2008, 5:33:07 AM

Dear MathGroup members,

I would like to put a trivial though important question to you. For

some time I have tried to find problems best solved* using Mathematica

for which parallel computing would be necessary. So far I have not

succeeded in finding them. This tells me that I do not understand

something important. I would be grateful if you could give a few

examples. Frankly, this mail is not meant to claim that parallelization

is useless. On the contrary, I need a few realistic examples to make up

my mind and begin to understand whether I may need parallelization myself.

* Just to explain the point: molecular dynamics simulation, for

instance, often requires parallelization, but there Fortran or C would

probably be the methods of choice, rather than Mathematica. In

analytical calculations, by contrast, Mathematica would be the method

of choice, but in that case I do not see how parallelization could be

of benefit.

It is evident how this question relates to the M7 release.

Best regards, and thank you in advance, Alexei

--

Alexei Boulbitch, Dr., Habil.

Senior Scientist

IEE S.A.

ZAE Weiergewan

11, rue Edmond Reuter

L-5326 Contern

Luxembourg

Phone: +352 2454 2566

Fax: +352 2454 3566

Website: www.iee.lu

This e-mail may contain trade secrets or privileged, undisclosed or otherwise confidential information. If you are not the intended recipient and have received this e-mail in error, you are hereby notified that any review, copying or distribution of it is strictly prohibited. Please inform us immediately and destroy the original transmittal from your system. Thank you for your co-operation.

Nov 21, 2008, 5:34:23 AM

Szabolcs wrote:

> Hello,

>

> I would like to know if the two bugs mentioned here (and in other

> messages) are still present in 7.0:

>

> http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_frm/thread/265f5619682b67e/0188372330785945


I'd like to know if the ability to set a RevolutionAxis (which was

present in the old SurfaceOfRevolution, and is sorely missing in

RevolutionPlot3D) has been added to RevolutionPlot3D.

--

Helen Read

University of Vermont

Nov 21, 2008, 5:32:35 AM

The Tally[] problem is solved. This I checked with a prerelease version

of Mathematica 7 I got at this year's Mathematica users conference.


Michael

Szabolcs wrote:

Nov 22, 2008, 6:10:10 AM

Yes, there is a new RevolutionAxis option in Version 7.

David Park

djm...@comcast.net

http://home.comcast.net/~djmpark

Nov 22, 2008, 6:09:59 AM

Hi Alexei,

> I would like to put a trivial though an important question to you. For

> some time I try to find problems to be best solved*

> using Mathematica, and where parallel computing would be necessary. So

> far I did not succeed in finding them. This tells

> me that I do not understand something important. I would be grateful, if

> you could give few examples. Very frankly,

> this my mail is not aimed to state a uselessness of parallelization. In

> contrast, I need few realistic examples to make my mind and

> begin understanding, if I may need using a parallelization myself.

>

>

> * Just to explain the point: molecular dynamics simulation for instance,

> often requires parallelization, but here Fortran

> of C would be probably methods of choice, rather than Mathematica. In

> analytical calculations in contrast, Mathematica

> would be the method of choice, but in this case I do not see how

> parallelization may be benefited from.

I think in the above you are implicitly making at least two

presuppositions that are not well justified, and probably plain wrong:

1) That whether or not an algorithm can successfully be parallelized

has anything to do with it being analytical or not.

Whether an algorithm can take advantage of parallelism depends only on

whether there are parts of the algorithm that can be executed in

parallel without suffering too much communication overhead. It has

nothing to do with what those parts do: numerics, analytics, or

anything else. Trivial examples of easily parallelizable algorithms

are those called "embarrassingly parallel", where you do completely

independent calculations in parallel.

One example I know of analytical calculations that can profit from

"embarrassing" parallelism is multiloop calculations in quantum field

theory, where you have to calculate many, sometimes thousands, of

Feynman diagrams _analytically_, and in the end sum all of these up.

Since each of these diagrams can be calculated independently of the

others, an "embarrassingly" parallel algorithm would just distribute

their calculation across the available processors and gain almost

linear speedup, since there is practically no communication overhead.
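In Mathematica 7 terms, that pattern can be sketched like this (a hedged illustration: `diagram` is a hypothetical stand-in for the analytic evaluation of a single Feynman diagram, not a real function):

```mathematica
(* Hypothetical stand-in: each "diagram" is an independent
   symbolic integral (this one evaluates to k!) *)
diagram[k_] := Integrate[x^k Exp[-x], {x, 0, Infinity}]

LaunchKernels[];
DistributeDefinitions[diagram];  (* make the definition visible on all kernels *)

(* Evaluate the independent terms in parallel, then sum them up *)
total = Total[ParallelMap[diagram, Range[0, 20]]]
```

Since the terms share no state, the work distributes with essentially no communication overhead, which is exactly the "embarrassing" case.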

2) That Mathematica is not useful for anything but analytic

calculations (e.g. not for numerical analysis), probably because it is too slow.

For every useful language/system/tool there is a trade-off between

execution speed, development time, code maintenance, and many other

aspects. Of course, when you neglect the amount of time and work you

have to invest, you will always be able to write code in C or Fortran

that is faster than Mathematica code. The same is true of assembly

code, which will always outperform C or Fortran code if decently

written. Both of these statements imply that the programmer knows his

business; otherwise Wolfram's developers or the authors of the compiler

in question will probably do a better job, even though their code must

be much more general.

There are many aspects that contribute to a well-reasoned decision

about which tool/system/language is best for a specific task, and pure

execution speed is in many cases one of the less important factors. If

I can receive a result by running 100 NDSolve-s in parallel within half

an hour, I would not even think about which ODE library to link into my

Fortran program to build something that would probably get the same

result in 1 minute -- once I had spent 2 days writing it (and another

week debugging that stupid off-by-one index). Let alone writing

specialized solver code that makes best use of the Cell processors of

the new supercomputer so it would run the same thing in 5 seconds.
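The 100-NDSolve scenario reads as a one-line parameter sweep (a sketch under invented assumptions -- the oscillator equation and the damping range `a` are made up for illustration):

```mathematica
LaunchKernels[];

(* Solve a family of damped oscillators, one parameter value per task;
   each NDSolve call is independent, so the tasks distribute cleanly *)
sols = ParallelTable[
   First@NDSolve[{y''[t] + a y'[t] + y[t] == 0, y[0] == 1, y'[0] == 0},
     y, {t, 0, 10}],
   {a, 0.1, 10, 0.1}];

Length[sols]  (* 100 independent solutions *)
```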

> ... and where parallel computing would be necessary

Usually parallel computing can not do more for you than speed up your

calculations. So when would parallel computing be _necessary_? I can

think of two cases:

1) the calculation times for a problem would be too long to wait for

(something in the order of years). For these cases to be solvable in

reasonable time you will need massive parallelism, that is you are

talking about probably 10-thousands of processors. Here it certainly

makes sense to spend some time coding to get maximal execution speed.

This is probably not the class of problems that Mathematica 7 really

addresses.

2) The result is only useful if it is calculated fast enough. For

example, a weather forecast for the next day is useless if it takes 4

days to compute. Reducing that to half a day through parallel computing

would make the same program useful, and I can see many cases where

parallelism would make Mathematica programs useful that would otherwise

be useless.

In all other cases parallelism just helps you do more in less time,

which is often a good enough reason to use it. If you can increase your

throughput by a factor of 4 with the 4 kernels Mathematica 7 lets you

use, you can probably make your boss quite happy...

hth,

albert

Nov 22, 2008, 6:09:37 AM

On Nov 21, 2008, at 4:34 AM, Helen Read wrote:

> Szabolcs wrote:

>> Hello,

>>

>> I would like to know if the two bugs mentioned here (and in other

>> messages) are still present in 7.0:

>>

>> http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_frm/thread/265f5619682b67e/0188372330785945

>

> I'd like to know if the ability to set a RevolutionAxis (which was

> present in the old SurfaceOfRevolution, and is sorely missing in

> RevolutionPlot3D) has been added to RevolutionPlot3D.

Yes.

http://reference.wolfram.com/mathematica/ref/RevolutionAxis.html

Brett

Nov 22, 2008, 6:09:27 AM

On Nov 21, 11:34 am, Helen Read <r...@math.uvm.edu> wrote:

> Szabolcs wrote:

> > Hello,

>

> > I would like to know if the two bugs mentioned here (and in other

> > messages) are still present in 7.0:

>

> >http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_...

>

> I'd like to know if the ability to set a RevolutionAxis (which was

> present in the old SurfaceOfRevolution, and is sorely missing in

> RevolutionPlot3D) has been added to RevolutionPlot3D.


Hello Helen,

RevolutionPlot3D does have a RevolutionAxis option in Mathematica

6.0.3. Does it differ in any way from the RevolutionAxis option of

SurfaceOfRevolution?

Try this:

RevolutionPlot3D[Cos[x], {x, 0, Pi}, RevolutionAxis -> {1, 1, 0}]

Nov 22, 2008, 6:13:31 AM

I hope the problem where Exporting or Save As... for a Plot3D to pdf

format (Windows platform) resulted in HUGE file sizes has been

addressed in 7.


--JD

Nov 22, 2008, 6:13:52 AM

Hello,

Can someone explain why Mathematica is going directly to version 7.0?

Version 6.0 was a big improvement, but it was not very polished. I have

had some problems using it, especially with the documentation system

and some Java issues. I am wondering whether, with this new version, we

will have another big improvement but will still have to cope with the

same kinds of problems.

On Nov 21, 8:32 am, Michael Weyrauch <michael.weyra...@gmx.de> wrote:

> The Tally[] problem is solved. This I checked with a prerelease version

> of Mathematica 7 I got at this year's Mathematica users conference.

>

> Michael

>

> Szabolcs schrieb:

>

> > Hello,

>

> > I would like to know if the two bugs mentioned here (and in other

> > messages) are still present in 7.0:

>

> >http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_...

Nov 22, 2008, 6:14:25 AM

On Nov 21, 11:32 am, Michael Weyrauch <michael.weyra...@gmx.de> wrote:

> The Tally[] problem is solved. This I checked with a prerelease version

> of Mathematica 7 I got at this year's Mathematica users conference.

>


What about the other bug (the eigenvalue problem), linked from the

same thread I mentioned?

I copied the (wrong) results from Mathematica 6 here:

In[1]:= mat = {{-6, 0, -Sqrt[3], 0, 0, Sqrt[3], 0, 0, 0, 0, 0, 0, 0,

0}, {0, -6, 0, -Sqrt[3], 0, 0, Sqrt[3], 0, 0, 0, 0, 0, 0,

0}, {-Sqrt[3], 0, -4, 2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4),

2 Sqrt[2/3], 0, 0, Sqrt[3], 0, 0, 0, 0, 0, 0}, {0, -Sqrt[3],

2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), -4/3, -(2 Sqrt[2])/3, 0, 0, 0,

Sqrt[3], 0, 0, 0, 0, 0}, {0, 0, 2 Sqrt[2/3], -(2 Sqrt[2])/3, 7/3,

0, 0, 0, 0, Sqrt[3], 0, 0, 0, 0}, {Sqrt[3], 0, 0, 0, 0, -4, 0,

2 (-1/(4 Sqrt[3]) + Sqrt[3]/4), 0, 0, 2 Sqrt[2/3], 0, 0, 0}, {0,

Sqrt[3], 0, 0, 0, 0, -4, 0, 2 (-1/(4 Sqrt[3]) + Sqrt[3]/4), 0, 0,

2 Sqrt[2/3], 0, 0}, {0, 0, Sqrt[3], 0, 0,

2 (-1/(4 Sqrt[3]) + Sqrt[3]/4), 0, -14/3,

2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), 2 Sqrt[2/3], (2 Sqrt[2])/3, 0,

0, 0}, {0, 0, 0, Sqrt[3], 0, 0, 2 (-1/(4 Sqrt[3]) + Sqrt[3]/4),

2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), -2, -(2 Sqrt[2])/3,

0, (2 Sqrt[2])/3, 0, 0}, {0, 0, 0, 0, Sqrt[3], 0, 0,

2 Sqrt[2/3], -(2 Sqrt[2])/3, -7/3, 0, 0,

2 (1/(3 Sqrt[2]) + (2 Sqrt[2])/3), Sqrt[10/3]}, {0, 0, 0, 0, 0,

2 Sqrt[2/3], 0, (2 Sqrt[2])/3, 0, 0, -16/3,

2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), 2 Sqrt[2/3], 0}, {0, 0, 0, 0,

0, 0, 2 Sqrt[2/3], 0, (2 Sqrt[2])/3, 0,

2 (-1/(4 Sqrt[3]) + (3 Sqrt[3])/4), -8/3, -(2 Sqrt[2])/3, 0}, {0,

0, 0, 0, 0, 0, 0, 0, 0, 2 (1/(3 Sqrt[2]) + (2 Sqrt[2])/3),

2 Sqrt[2/3], -(2 Sqrt[2])/3, 1/2,

2 (-Sqrt[5/3]/16 - Sqrt[15]/16)}, {0, 0, 0, 0, 0, 0, 0, 0, 0,

Sqrt[10/3], 0, 0, 2 (-Sqrt[5/3]/16 - Sqrt[15]/16), 7/2}};

In[2]:= mat === Conjugate@Transpose[mat]

Out[2]= True

(mat is Hermitian so we expect real eigenvalues.)

In[3]:= N@Eigenvalues[mat]

Out[3]= {-9.41358 + 0.88758 I, -9.41358 - 0.88758 I, -7.37965 +

2.32729 I, -7.37965 - 2.32729 I, -4.46655 + 2.59738 I, -4.46655 -

2.59738 I, 4.36971, 3.21081, -2.32456 + 2.10914 I, -2.32456 -

2.10914 I, 2.04366+ 0.552265 I,

2.04366- 0.552265 I, -0.249588 + 1.29034 I, -0.249588 - 1.29034 I}

In[4]:= Eigenvalues[N[mat]]

Out[4]= {-9.09122, -7.41855, -7.41855, -7.2915, 4.33734, -4., -4., \

3.2915, -3.24612, -2.38787, -2.38787, 1.80642, 1.80642,

9.21707*10^-16}

Nov 24, 2008, 4:07:10 AM

Mathematica 7.0 (Windows 32 bit). For your mat:

N@Eigenvalues[mat] // InputForm

{-9.091215416949623, -7.4185507188738455, -7.4185507188738455,

-7.291502622129181, 4.337337307188519, -4., -4., 3.2915026221291814,

-3.2461218902388955, -2.387873132949261, -2.387873132949261,

1.8064238518231066, 1.8064238518231066, 0.}

--

Nov 24, 2008, 4:07:53 AM

It seems that the sizes of the resulting .PDFs are more reasonable now,

but there are still problems with strange artifacts appearing. If the

exported graphic is to look more like it does on screen, the "use

bitmap representation" option helps the look immensely, but it

obviously leads to more jaggies and to much larger file sizes.


Nov 24, 2008, 4:08:46 AM

Nice. This will make my students happy. Now they won't have to stand on

their heads and interchange x and y and set ViewVertical->{-1,0,0} to

revolve around the x-axis.

Nov 24, 2008, 4:08:35 AM

Wow, they sure snuck it in. It was missing prior to 6.0.3, and it's

nowhere to be found in the Documentation for 6.0.3.

Nov 24, 2008, 4:08:24 AM

Yes, that bug is gone too. (For that example.)

mat === Conjugate@Transpose[mat]

True

nEigen = N@Eigenvalues[mat]

{-9.09122, -7.41855, -7.41855, -7.2915, 4.33734, -4., -4., 3.2915, \

-3.24612, -2.38787, -2.38787, 1.80642, 1.80642, 0.}

eigenN = Eigenvalues[N[mat]]

{-9.09122, -7.41855, -7.41855, -7.2915, 4.33734, -4., -4., 3.2915, \

-3.24612, -2.38787, -2.38787, 1.80642, 1.80642, -9.93145*10^-16}

nEigen - eigenN // Chop

{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}

Bobby

Nov 24, 2008, 4:10:53 AM

"Szabolcs" <szho...@gmail.com> wrote in message

news:gg8pih$k4r$1...@smc.vnet.net...

It seems to be fixed in M7:

In[36]:= $Version

Out[36]= 7.0 for Microsoft Windows (32-bit) (November 10, 2008)

In[32]:= mat = {{-6, 0, -Sqrt[3], 0, 0, Sqrt[3], 0,

0, 0, 0, 0, 0, 0, 0}, {0, -6, 0,

-Sqrt[3], 0, 0, Sqrt[3], 0, 0, 0, 0,

0, 0, 0}, {-Sqrt[3], 0, -4,

2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/

4), 2*Sqrt[2/3], 0, 0, Sqrt[3], 0,

0, 0, 0, 0, 0}, {0, -Sqrt[3],

2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/

4), -4/3, -(2*Sqrt[2])/3, 0, 0, 0,

Sqrt[3], 0, 0, 0, 0, 0},

{0, 0, 2*Sqrt[2/3], -(2*Sqrt[2])/3,

7/3, 0, 0, 0, 0, Sqrt[3], 0, 0, 0,

0}, {Sqrt[3], 0, 0, 0, 0, -4, 0,

2*(-(4*Sqrt[3])^(-1) + Sqrt[3]/4), 0,

0, 2*Sqrt[2/3], 0, 0, 0},

{0, Sqrt[3], 0, 0, 0, 0, -4, 0,

2*(-(4*Sqrt[3])^(-1) + Sqrt[3]/4), 0,

0, 2*Sqrt[2/3], 0, 0},

{0, 0, Sqrt[3], 0, 0,

2*(-(4*Sqrt[3])^(-1) + Sqrt[3]/4), 0,

-14/3, 2*(-(4*Sqrt[3])^(-1) +

(3*Sqrt[3])/4), 2*Sqrt[2/3],

(2*Sqrt[2])/3, 0, 0, 0},

{0, 0, 0, Sqrt[3], 0, 0,

2*(-(4*Sqrt[3])^(-1) + Sqrt[3]/4),

2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/

4), -2, -(2*Sqrt[2])/3, 0,

(2*Sqrt[2])/3, 0, 0}, {0, 0, 0, 0,

Sqrt[3], 0, 0, 2*Sqrt[2/3],

-(2*Sqrt[2])/3, -7/3, 0, 0,

2*(1/(3*Sqrt[2]) + (2*Sqrt[2])/3),

Sqrt[10/3]}, {0, 0, 0, 0, 0,

2*Sqrt[2/3], 0, (2*Sqrt[2])/3, 0, 0,

-16/3, 2*(-(4*Sqrt[3])^(-1) +

(3*Sqrt[3])/4), 2*Sqrt[2/3], 0},

{0, 0, 0, 0, 0, 0, 2*Sqrt[2/3], 0,

(2*Sqrt[2])/3, 0,

2*(-(4*Sqrt[3])^(-1) + (3*Sqrt[3])/

4), -8/3, -(2*Sqrt[2])/3, 0},

{0, 0, 0, 0, 0, 0, 0, 0, 0,

2*(1/(3*Sqrt[2]) + (2*Sqrt[2])/3),

2*Sqrt[2/3], -(2*Sqrt[2])/3, 1/2,

2*(-Sqrt[5/3]/16 - Sqrt[15]/16)},

{0, 0, 0, 0, 0, 0, 0, 0, 0,

Sqrt[10/3], 0, 0, 2*(-Sqrt[5/3]/16 -

Sqrt[15]/16), 7/2}};

In[33]:= mat === Conjugate[Transpose[mat]]

Out[33]= True

In[34]:= N[Eigenvalues[mat]]

Out[34]= {-9.091215416949623, -7.4185507188738455,

-7.4185507188738455, -7.291502622129181,

4.337337307188519, -4., -4.,

3.2915026221291814, -3.2461218902388955,

-2.387873132949261, -2.387873132949261,

1.8064238518231066, 1.8064238518231066,

0.}

In[35]:= Eigenvalues[N[mat]]

Out[35]= {-9.091215416949622, -7.4185507188738455,

-7.418550718873844, -7.291502622129181,

4.337337307188519, -4.000000000000002,

-3.999999999999999, 3.2915026221291814,

-3.246121890238896, -2.387873132949261,

-2.3878731329492604, 1.8064238518231066,

1.8064238518231046,

-2.8189256280805394*^-16}

Nasser

Nov 24, 2008, 6:45:14 AM

Hi Alexei

There are 2 possible justifications for parallel computing:

1) Just because it's there, and it's an interesting set of problems

to tackle. But it's mainly 'there' because:

2) Serial execution on a single CPU is too slow (or, as a variant,

your problem requires so much memory that you need to use the address

space of more than one process(or))

When is serial execution on a single CPU too slow? Well, that's the

question you have to answer for your own problems and practices. Do you

want to wait 3 days for a serial answer, or spend 3 weeks parallelising

your program in order to wait 1 hour? It depends on how often you are

going to run the program, I suspect.

The questions you raise about parallel Mathematica and parallel

Fortran or C are interesting ones too. I do a fair bit of Mathematica

programming but my day job is really Fortran programming for solving

problems in computational electromagnetics. This is one of those

fields where parallel computing is firmly entrenched; solving large

inverse problems just needs lots of CPUs and (usually) lots of memory.

For me, the availability of parallel Mathematica, and a desktop with 2

quad core processors, changes some of the variables in the equations

which determine, for me, whether or not a program should be written in

Mathematica or in Fortran. However, I have a strong preference for

slow programs easily-written in Mathematica, easily integrated with

graphics and all the rest.

I expect you'll get a lot of other views on this one.

Regards

Mark Westwood


Nov 25, 2008, 7:15:45 AM

But...

$Version

6.0 for Microsoft Windows (32-bit) (May 21, 2008)

Options[RevolutionPlot3D][[51]]

RevolutionAxis -> {0, 0, 1}

Helen Read wrote:

> Szabolcs wrote:

> Wow, they sure snuck it in. It was missing prior to 6.0.3, and it's

> nowhere to be found in the Documentation for 6.0.3.

>

> --

> Helen Read

> University of Vermont

>

--

Dec 4, 2008, 7:15:23 AM

Alexei Boulbitch wrote:

> Dear MatheGroup members,

>

> I would like to put a trivial though an important question to you. For

> some time I try to find problems to be best solved* using Mathematica, and

> where parallel computing would be necessary. So far I did not succeed in

> finding them. This tells me that I do not understand something

> important...

Parallelism is used when performance is important. In general, Mathematica

is used when performance is largely unimportant. Consequently, they are

almost mutually exclusive.

For example, you would need to parallelize the following Mathematica program

perfectly across 700,000 CPU cores to obtain performance comparable to the

equivalent OCaml program:

delta = Sqrt[$MachineEpsilon];

RaySphere[o_, d_, c_, r_] :=

Block[{v, b, disc, t1, t2},

v = c - o;

b = v.d;

disc = Sqrt[b^2 - v.v + r^2];

t2 = b + disc;

If[Im[disc] != 0 || t2 <= 0, \[Infinity],

t1 = b - disc;

If[t1 > 0, t1, t2]]

]

Intersect[o_, d_][{lambda_, n_}, Sphere[c_, r_]] :=

Block[{lambda2 = RaySphere[o, d, c, r]},

If[lambda2 >= lambda, {lambda, n}, {lambda2,

Normalize[o + lambda2 d - c]}]

]

Intersect[o_, d_][{lambda_, n_}, Bound[c_, r_, s_]] :=

Block[{lambda2 = RaySphere[o, d, c, r]},

If[lambda2 >= lambda, {lambda, n},

Fold[Intersect[o, d], {lambda, n}, s]]

]

neglight = N@Normalize[{1, 3, -2}];

nohit = {\[Infinity], {0, 0, 0}};

RayTrace[o_, d_, scene_] :=

Block[{lambda, n, g, p},

{lambda, n} = Intersect[o, d][nohit, scene];

If[lambda == \[Infinity], 0,

g = n.neglight;

If[g <= 0, 0,

{lambda, n} =

Intersect[o + lambda d + delta n, neglight][nohit, scene];

If[lambda < \[Infinity], 0, g]]]

]

Create[level_, c_, r_] :=

Block[{obj = Sphere[c, r]},

If[level == 1, obj,

Block[{a = 3*r/Sqrt[12], Aux},

Aux[x1_, z1_] := Create[level - 1, c + {x1, a, z1}, 0.5 r];

Bound[c,

3 r, {obj, Aux[-a, -a], Aux[a, -a], Aux[-a, a], Aux[a, a]}]]]]

scene = Create[1, {0, -1, 4}, 1];

Main[level_, n_, ss_] :=

Block[{scene = Create[level, {0, -1, 4}, 1]},

Table[

Sum[

RayTrace[{0, 0, 0},

N@Normalize[{(x + s/ss/ss)/n - 1/2, (y + Mod[s, ss]/ss)/n - 1/2,

1}], scene], {s, 0, ss^2 - 1}]/ss^2, {y, 0, n - 1},

{x, 0, n - 1}]]

AbsoluteTiming[Export["image.pgm", Graphics@Raster@Main[9, 512, 4]]]

--

Dr Jon D Harrop, Flying Frog Consultancy Ltd.

http://www.ffconsultancy.com/?u

Dec 5, 2008, 5:30:49 AM

Hello,

> Parallelism is used when performance is important. In general, Mathematica

> is used when performance is largely unimportant. Consequently, they are

> almost mutually exclusive.

This is a strange statement, which I hardly understand. First of all, in

many of my numerical calculations performance is important (e.g. solving

large sets of coupled differential equations) and Mathematica compares

rather well with compiled languages (in a few cases I can compare with

compiled code which does the same calculations as my Mathematica code)

Within such calculations linear algebra routines are often used, which

were parallelized in Mathematica even before Version 7, and, of course,

it makes a big difference whether your computer has just one or several

processors available. So compared to compiled languages with programs

that are not parallelized (which is often the case), Mathematica may

even have an advantage.

>

> For example, you would need to parallelize the following Mathematica program

> perfectly across 700,000 CPU cores to obtain performance comparable to the

> equivalent OCaml program:

>

I don't know OCaml. Therefore, could you elaborate a bit on how this

tremendous difference comes about and under what conditions?

Thanks Michael

Dec 8, 2008, 6:21:43 AM

Michael Weyrauch wrote:

> Hello,

>

>> Parallelism is used when performance is important. In general,

>> Mathematica is used when performance is largely unimportant.

>> Consequently, they are almost mutually exclusive.

>

> This is a strange statement, which I hardly understand. First of all, in

> many of my numerical calculations performance is important (e.g. solving

> large sets of coupled differential equations) and Mathematica compares

> rather well with compiled languages (in a few cases I can compare with

> compiled code which does the same calculations as my Mathematica code)

>

> Within such calculations often linear algebra routines are used, which

> where parallalized in Mathematica even before Version 7, and, of course

> it makes a big difference if your Computer has just one or more

> processors available. So compared to compiled languages with programs

> that are not parallelized (which is often the case) Mathematica may even

> have an advantage.


Yes, and that is true of any task where Mathematica already bundles a decent

implementation that you can simply invoke. However, it is not true in

general.

>> For example, you would need to parallelize the following Mathematica

>> program perfectly across 700,000 CPU cores to obtain performance

>> comparable to the equivalent OCaml program:

>

> I don't know OCaml.

This is not specific to OCaml. Here are equivalent implementations of that

program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and all well

over 100,000x faster than the Mathematica I gave (and many are more concise

as well!):

http://www.ffconsultancy.com/languages/ray_tracer/

> Therefore, could you elaborate a bit how this tremendous difference comes

> about and under what conditions??

Mathematica programs are only competitively performant when the bottleneck

is one of Mathematica's internal routines (FFT, Schur decomposition,

regular expressions etc.). Optimizing Mathematica code largely entails

reimplementing your code in such a way that stress is moved onto these

internal functions even if that is objectively inappropriate (e.g. in the

case of the recent example on this list with the subject "Floating-Point

Computing").

If you cannot express a solution to your problem in such a way then

Mathematica's performance is likely to be dire, often up to a thousand

times slower than a compiled language. The program I gave hits several of

Mathematica's inefficiencies and, consequently, it takes days to compute

instead of seconds.

I highly recommend learning when these performance issues are likely to

arise because you can save a huge amount of time by developing from scratch

in a performant language when you know performance will be important.

Dec 9, 2008, 6:55:37 AM

Jon Harrop wrote:

> Michael Weyrauch wrote:

>> Hello,

>>

>>> Parallelism is used when performance is important. In general,

>>> Mathematica is used when performance is largely unimportant.

>>> Consequently, they are almost mutually exclusive.

>> This is a strange statement, which I hardly understand. First of all, in

>> many of my numerical calculations performance is important (e.g. solving

>> large sets of coupled differential equations) and Mathematica compares

>> rather well with compiled languages (in a few cases I can compare with

>> compiled code which does the same calculations as my Mathematica code)

>>

>> Within such calculations often linear algebra routines are used, which

>> were parallelized in Mathematica even before Version 7, and, of course,

>> it makes a big difference whether your computer has one or several

>> processors available. So compared to compiled languages with programs

>> that are not parallelized (which is often the case) Mathematica may even

>> have an advantage.

>

> Yes, and that is true of any task where Mathematica already bundles a decent

> implementation that you can simply invoke. However, it is not true in

> general.

>

>>> For example, you would need to parallelize the following Mathematica

>>> program perfectly across 700,000 CPU cores to obtain performance

>>> comparable to the equivalent OCaml program:

>> I don't know OCaml.

>

> This is not specific to OCaml. Here are equivalent implementations of that

> program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and all well

> over 100,000x faster than the Mathematica I gave (and many are more concise

> as well!):

>

> http://www.ffconsultancy.com/languages/ray_tracer/

I do not understand this claim that the speed difference is 5 orders of

magnitude. Some experiments indicate 3, perhaps.

>> Therefore, could you elaborate a bit how this tremendous difference comes

>> about and under what conditions??

>

> Mathematica programs are only competitively performant when the bottleneck

> is one of Mathematica's internal routines (FFT, Schur decomposition,

> regular expressions etc.). Optimizing Mathematica code largely entails

> reimplementing your code in such a way that stress is moved onto these

> internal functions even if that is objectively inappropriate (e.g. in the

> case of the recent example on this list with the subject "Floating-Point

> Computing").

>

> If you cannot express a solution to your problem in such a way then

> Mathematica's performance is likely to be dire, often up to a thousand

> times slower than a compiled language. The program I gave hits several of

> Mathematica's inefficiencies and, consequently, it takes days to compute

> instead of seconds.

>

> I highly recommend learning when these performance issues are likely to

> arise because you can save a huge amount of time by developing from scratch

> in a performant language when you know performance will be important.

If you have code that is readily written in a compiled language, it is

certainly the case that it will run faster there than in an interpreted

language. Probably much faster. So yes, that would be the route to take.

Some modest improvements to the original Mathematica code, though, bring

it to maybe 2 orders of magnitude of the OCaml speed. In the variant

below, I make sure everything is a machine real number or machine integer,

whichever is appropriate. I use Compile on RaySphere. There are a few

other tweaks for general efficiency (some might apply to the OCaml

implementation, I'm not sure).

delta = Sqrt[$MachineEpsilon];

inf = 100000.;

neglight = Normalize[{1., 3., -2.}];

nohit = {inf, {0., 0., 0.}};

Clear[RaySphere, Intersect, Create, RayTrace, Main];

RaySphere = With[{inf = inf}, Compile[

{{c, _Real, 1}, {o, _Real, 1}, {d, _Real, 1}, {r, _Real}},

Module[{v = c - o, b, disc, t},

b = v.d;

disc = b^2 - v.v + r;

If[disc < 0., inf,

disc = Sqrt[disc];

If[b > disc, b - disc,

t = b + disc;

If[t < 0., inf, t]]]]]];

Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_]] := Block[

{lambda2 = RaySphere[c, o, d, r]},

If[lambda2 >= lambda, ln, {lambda2, c}]]

Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_, s_]] := Block[

{lambda2 = RaySphere[c, o, d, r]},

If[lambda2 >= lambda, ln,

Fold[Intersect[o, d, #1, #2] &, ln, s]

]

]

RayTrace[o_, d_, scene_] := Block[

{lambda, n, g, p},

{lambda, n} = Intersect[o, d, nohit, scene];

If[lambda >= inf, 0.,

n = Normalize[o + d*lambda - n];

g = n.neglight;

If[g <= 0., 0.,

{lambda, n} =

Intersect[o + lambda d + delta n, neglight, nohit, scene];

If[lambda < inf, 0., g]]]]

Create[level_, c_, r_] :=

Block[{obj = List[c, r^2]},

If[level == 1, obj,

Block[{a = 3*r/Sqrt[12], Aux},

Aux[x1_, z1_] := Create[level - 1, c + {x1, a, z1}, 0.5 r];

List[c,

9*r^2, {obj, Aux[-a, -a], Aux[a, -a], Aux[-a, a], Aux[a, a]}]]]]

prepareTable = Compile[{{n, _Integer}, {s, _Integer}},

With[{nss = N[s]},

Flatten[

Table[Normalize[{(x + s1/nss)/n - .5, (y + s2/nss)/n - .5,

1.}], {s1, 0, s - 1}, {s2, 0, s - 1}, {y, 0, n - 1}, {x, 0,

n - 1}], 1]

]];

Main[level_, n_, ss_] :=

Block[{scene = Create[level, {0., -1., 4.}, 1.], nss = N[ss]},

scene = MapAll[Developer`ToPackedArray, scene];

Total[Map[RayTrace[{0., 0., 0.}, #, scene] &,

prepareTable[n, ss], {3}], 1]/nss^2]

In[22]:= Timing[ff512 = Main[9, 512, 4];]

Out[22]= {1542.27, Null}
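[Editor's note: for readers following along outside Mathematica, the RaySphere kernel above is the standard quadratic ray-sphere intersection test. A rough Python sketch of the same logic, assuming this reading of the code (in particular, that the last argument is the squared radius, since Create stores r^2):]

```python
import math

INF = 100000.0  # same finite sentinel for "no hit" as the Mathematica code

def ray_sphere(c, o, d, r2):
    """Distance along the unit ray (origin o, direction d) to the sphere
    with center c and squared radius r2, or INF on a miss."""
    v = [ci - oi for ci, oi in zip(c, o)]
    b = sum(vi * di for vi, di in zip(v, d))      # v . d
    disc = b * b - sum(vi * vi for vi in v) + r2  # discriminant of the quadratic
    if disc < 0.0:
        return INF                                # ray misses the sphere
    disc = math.sqrt(disc)
    if b > disc:
        return b - disc                           # nearer root, in front of o
    t = b + disc
    return INF if t < 0.0 else t                  # farther root, or behind o

# Unit sphere centered 4 units down the z axis, ray along +z:
print(ray_sphere([0.0, 0.0, 4.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], 1.0))  # 3.0
```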

At this point, it might make sense to use ParallelMap instead of Map, in

Main.

Daniel Lichtblau

Wolfram Research

Dec 10, 2008, 4:49:18 AM

Daniel Lichtblau wrote:

> Jon Harrop wrote:

>> This is not specific to OCaml. Here are equivalent implementations of

>> that program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and all

>> well over 100,000x faster than the Mathematica I gave (and many are more

>> concise as well!):

>>

>> http://www.ffconsultancy.com/languages/ray_tracer/

>

> I do not understand this claim that the speed difference is 5 orders of

> magnitude. Some experiments indicate 3, perhaps.

> Jon Harrop wrote:

>> This is not specific to OCaml. Here are equivalent implementations of

>> that program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and all

>> well over 100,000x faster than the Mathematica I gave (and many are more

>> concise as well!):

>>

>> http://www.ffconsultancy.com/languages/ray_tracer/

>

> I do not understand this claim that the speed difference is 5 orders of

> magnitude. Some experiments indicate 3, perhaps.

The code I gave is over 5 orders of magnitude slower by my measurements.

> Some modest improvements to the original Mathematica code, though, bring

> it to maybe 2 orders of magnitude of the OCaml speed.

That is a huge improvement. Thank you.

> In the variant

> below, I make sure everything is a machine real number or machine integer,

> whichever is appropriate. I use Compile on RaySphere. There are a few

> other tweaks for general efficiency (some might apply to the OCaml

> implementation, I'm not sure).

I have a few questions about this. :-)

> delta = Sqrt[$MachineEpsilon];

> inf = 100000.;

This is nasty. Why is infinity not represented internally by the floating

point number infinity in Mathematica?

> neglight = Normalize[{1., 3., -2.}];

> nohit = {inf, {0., 0., 0.}};

>

> Clear[RaySphere, Intersect, Create, RayTrace, Main];

>

> RaySphere = With[{inf = inf}, Compile[

What is the purpose of "With[{inf = inf}, ..]" here?

> {{c, _Real, 1}, {o, _Real, 1}, {d, _Real, 1}, {r, _Real}},

> Module[{v = c - o, b, disc, t},

> b = v.d;

> disc = b^2 - v.v + r;

> If[disc < 0., inf,

> disc = Sqrt[disc];

> If[b > disc, b - disc,

> t = b + disc;

> If[t < 0., inf, t]]]]]];

>

> Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_]] := Block[

> {lambda2 = RaySphere[c, o, d, r]},

> If[lambda2 >= lambda, ln, {lambda2, c}]]

Defer calculation of the normal. Fair enough but it is only done once per

pixel.

> Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_, s_]] := Block[

> {lambda2 = RaySphere[c, o, d, r]},

> If[lambda2 >= lambda, ln,

> Fold[Intersect[o, d, #1, #2] &, ln, s]

> ]

> ]

No currying and a named subpattern, ok.

> RayTrace[o_, d_, scene_] := Block[

> {lambda, n, g, p},

> {lambda, n} = Intersect[o, d, nohit, scene];

> If[lambda >= inf, 0.,

> n = Normalize[o + d*lambda - n];

> g = n.neglight;

> If[g <= 0., 0.,

> {lambda, n} =

> Intersect[o + lambda d + delta n, neglight, nohit, scene];

> If[lambda < inf, 0., g]]]]

>

> Create[level_, c_, r_] :=

> Block[{obj = List[c, r^2]},

> If[level == 1, obj,

> Block[{a = 3*r/Sqrt[12], Aux},

> Aux[x1_, z1_] := Create[level - 1, c + {x1, a, z1}, 0.5 r];

> List[c,

> 9*r^2, {obj, Aux[-a, -a], Aux[a, -a], Aux[-a, a], Aux[a, a]}]]]]

>

> prepareTable = Compile[{{n, _Integer}, {s, _Integer}},

> With[{nss = N[s]},

> Flatten[

> Table[Normalize[{(x + s1/nss)/n - .5, (y + s2/nss)/n - .5,

> 1.}], {s1, 0, s - 1}, {s2, 0, s - 1}, {y, 0, n - 1}, {x, 0,

> n - 1}], 1]

> ]];

So you precompute a matrix of all the ray directions using a compiled

function.
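[Editor's note: in Python terms, that precomputation amounts to something like the following sketch, flattened to a single list here for simplicity rather than the nested table that Main maps over at level {3}:]

```python
import math

def prepare_table(n, s):
    """All supersampled ray directions for an n x n image with s x s
    subsamples per pixel, mirroring the prepareTable Compile above."""
    dirs = []
    for s1 in range(s):
        for s2 in range(s):
            for y in range(n):
                for x in range(n):
                    v = [(x + s1 / s) / n - 0.5,
                         (y + s2 / s) / n - 0.5,
                         1.0]
                    norm = math.sqrt(sum(c * c for c in v))
                    dirs.append([c / norm for c in v])  # unit-length direction
    return dirs

table = prepare_table(4, 2)
print(len(table))  # n^2 * s^2 = 64 directions
```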

> Main[level_, n_, ss_] :=

> Block[{scene = Create[level, {0., -1., 4.}, 1.], nss = N[ss]},

> scene = MapAll[Developer`ToPackedArray, scene];

> Total[Map[RayTrace[{0., 0., 0.}, #, scene] &,

> prepareTable[n, ss], {3}], 1]/nss^2]

>

> In[22]:= Timing[ff512 = Main[9, 512, 4];]

> Out[22]= {1542.27, Null}

That is orders of magnitude faster than my version but I am not sure why. I

suspect the main performance improvement comes from the use of packed

arrays but I haven't been able to reproduce it in my own code yet.

> At this point, it might make sense to use ParallelMap instead of Map, in

> Main.

Replacing Map with ParallelMap slows it down, even with

Method -> "CoarsestGrained". How can I get a speedup from my 8 cores?! :-)

Many thanks,

Dec 11, 2008, 3:42:50 AM

[My comments are interspersed with the questions. --dl]

Jon Harrop wrote:

> Daniel Lichtblau wrote:

>> Jon Harrop wrote:

>>> This is not specific to OCaml. Here are equivalent implementations of

>>> that program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and all

>>> well over 100,000x faster than the Mathematica I gave (and many are more

>>> concise as well!):

>>>

>>> http://www.ffconsultancy.com/languages/ray_tracer/

>> I do not understand this claim that the speed difference is 5 orders of

>> magnitude. Some experiments indicate 3, perhaps.

>

> The code I gave is over 5 orders of magnitude slower by my measurements.

Offhand I do not know what would account for this. I ran it in

Mathematica 7 (a 64 bit Linux machine) on

Main[3, 52, 4]

and observed something around 7-10 times slower than what I coded. I did

not try it on larger scenes, so possibly there is something going wrong

in the recursion. Certainly any of these codes will scale linearly with

the number of pixels (so quadratically with n), so my using a dimension

of 52 instead of 512 here should not matter for assessment of relative

speeds.
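[Editor's note: the scaling argument is easy to quantify. The primary-ray count grows as n^2 times ss^2, so assuming the per-ray cost is constant for a fixed scene, a 52x52 render does roughly 1/97th the work of a 512x512 one. A sketch of the arithmetic:]

```python
def relative_work(n1, ss1, n2, ss2):
    """Ratio of primary-ray counts between two renders:
    n^2 pixels times ss^2 subsamples each."""
    return (n2 * n2 * ss2 * ss2) / (n1 * n1 * ss1 * ss1)

print(round(relative_work(52, 4, 512, 4), 1))  # ~96.9x more rays at 512x512
print(relative_work(512, 1, 512, 4))           # 16.0x going from ss=1 to ss=4
```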

>> Some modest improvements to the original Mathematica code, though, bring

>> it to maybe 2 orders of magnitude of the OCaml speed.

>

> That is a huge improvement. Thank you.

>

>> In the variant

>> below, I make sure everything is a machine real number or machine integer,

>> whichever is appropriate. I use Compile on RaySphere. There are a few

>> other tweaks for general efficiency (some might apply to the OCaml

>> implementation, I'm not sure).

>

> I have a few questions about this. :-)

>

>> delta = Sqrt[$MachineEpsilon];

>> inf = 100000.;

>

> This is nasty. Why is infinity not represented internally by the floating

> point number infinity in Mathematica?

Mathematica does not use machine double infinities, or bignum infinities

for that matter. Generally when those are encountered in a computation

e.g. with dedicated machine arithmetic, an exception is thrown and

Mathematica attempts to work around it e.g. with software higher precision.

This is not to say there is no reason to have a machine double infinity

in Mathematica. I can say, however, that putting one in would involve

considerable work, and I think it would give but small improvement in

speed or functionality.

>> neglight = Normalize[{1., 3., -2.}];

>> nohit = {inf, {0., 0., 0.}};

>>

>> Clear[RaySphere, Intersect, Create, RayTrace, Main];

>>

>> RaySphere = With[{inf = inf}, Compile[

>

> What is the purpose of "With[{inf = inf}, ..]" here?

This is to inject the evaluated form into the Compile. Else it is a

nuisance to get that form, because Compile has the HoldAll attribute.

>> {{c, _Real, 1}, {o, _Real, 1}, {d, _Real, 1}, {r, _Real}},

>> Module[{v = c - o, b, disc, t},

>> b = v.d;

>> disc = b^2 - v.v + r;

>> If[disc < 0., inf,

>> disc = Sqrt[disc];

>> If[b > disc, b - disc,

>> t = b + disc;

>> If[t < 0., inf, t]]]]]];

>>

>> Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_]] := Block[

>> {lambda2 = RaySphere[c, o, d, r]},

>> If[lambda2 >= lambda, ln, {lambda2, c}]]

>

> Defer calculation of the normal. Fair enough but it is only done once per

> pixel.

It makes a small but measurable difference to speed. Helps a bit more

than you indicate, because the original code uses Normalize inside

Intersect, where it might be done as many times as lambda keeps

decreasing for a given originating call (to Intersect, with the entire

scene).

>> Intersect[o_, d_, ln : {lambda_, n_}, List[c_, r_, s_]] := Block[

>> {lambda2 = RaySphere[c, o, d, r]},

>> If[lambda2 >= lambda, ln,

>> Fold[Intersect[o, d, #1, #2] &, ln, s]

>> ]

>> ]

>

> No currying and a named subpattern, ok.

>

>> RayTrace[o_, d_, scene_] := Block[

>> {lambda, n, g, p},

>> {lambda, n} = Intersect[o, d, nohit, scene];

>> If[lambda >= inf, 0.,

>> n = Normalize[o + d*lambda - n];

>> g = n.neglight;

>> If[g <= 0., 0.,

>> {lambda, n} =

>> Intersect[o + lambda d + delta n, neglight, nohit, scene];

>> If[lambda < inf, 0., g]]]]

>>

>> Create[level_, c_, r_] :=

>> Block[{obj = List[c, r^2]},

>> If[level == 1, obj,

>> Block[{a = 3*r/Sqrt[12], Aux},

>> Aux[x1_, z1_] := Create[level - 1, c + {x1, a, z1}, 0.5 r];

>> List[c,

>> 9*r^2, {obj, Aux[-a, -a], Aux[a, -a], Aux[-a, a], Aux[a, a]}]]]]

>>

>> prepareTable = Compile[{{n, _Integer}, {s, _Integer}},

>> With[{nss = N[s]},

>> Flatten[

>> Table[Normalize[{(x + s1/nss)/n - .5, (y + s2/nss)/n - .5,

>> 1.}], {s1, 0, s - 1}, {s2, 0, s - 1}, {y, 0, n - 1}, {x, 0,

>> n - 1}], 1]

>> ]];

>

> So you precompute a matrix of all the ray directions using a compiled

> function.

>

>> Main[level_, n_, ss_] :=

>> Block[{scene = Create[level, {0., -1., 4.}, 1.], nss = N[ss]},

>> scene = MapAll[Developer`ToPackedArray, scene];

>> Total[Map[RayTrace[{0., 0., 0.}, #, scene] &,

>> prepareTable[n, ss], {3}], 1]/nss^2]

>>

>> In[22]:= Timing[ff512 = Main[9, 512, 4];]

>> Out[22]= {1542.27, Null}

>

> That is orders of magnitude faster than my version but I am not sure why. I

> suspect the main performance improvement comes from the use of packed

> arrays but I haven't been able to reproduce it in my own code yet.

Yes, packed arrays, which among other things avoid ALL exact arithmetic

in places you do not want it. Also compiling RaySphere gives, by itself,

a factor of two or more in overall speed gain.

>> At this point, it might make sense to use ParallelMap instead of Map, in

>> Main.

>

> Replacing Map with ParallelMap slows it down, even with

> Method -> "CoarsestGrained". How can I get a speedup from my 8 cores?! :-)

>

> Many thanks,

I only made the suggestion, without trying it, because I have no actual

familiarity with ParallelMap. But one possibility: did you use

DistributeDefinitions on all the functions underneath that Map? If not,

that could certainly make it slower rather than faster.

Daniel Lichtblau

Wolfram Research

Dec 12, 2008, 6:57:39 AM

On Wednesday 10 December 2008 16:22:15 Daniel Lichtblau wrote:

> Jon Harrop wrote:

> > Daniel Lichtblau wrote:

> >> Jon Harrop wrote:

> >>> This is not specific to OCaml. Here are equivalent implementations of

> >>> that program in C++, Java, OCaml, SML, Scheme, Lisp and Haskell and

> >>> all

> >>> well over 100,000x faster than the Mathematica I gave (and many are

> >>> more concise as well!):

> >>>

> >>> http://www.ffconsultancy.com/languages/ray_tracer/

> >>

> >> I do not understand this claim that the speed difference is 5 orders of

> >> magnitude. Some experiments indicate 3, perhaps.

> >

> > The code I gave is over 5 orders of magnitude slower by my measurements.

>

> Offhand I do not know what would account for this. I ran it in

> Mathematica 7 (a 64 bit Linux machine) on

>

> Main[3, 52, 4]

>

> and observed something around 7-10 times slower than what I coded.

You'll get completely different performance results if you lower the first

argument because it changes the scene being rendered. You're only rendering

a handful of spheres with those arguments whereas the original rendered

over 80,000 spheres.
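[Editor's note: that sphere count follows from the Create recursion above. Each call produces one sphere plus, below the top level, four half-size children, so count(L) = 1 + 4*count(L-1) with count(1) = 1. A quick check of Jon's figure, assuming this reading of Create:]

```python
def sphere_count(level):
    """Spheres built by Create[level, ...]: one sphere per call,
    plus four recursive children until level == 1."""
    return 1 if level == 1 else 1 + 4 * sphere_count(level - 1)

print(sphere_count(3))  # 21 spheres for Main[3, 52, 4]
print(sphere_count(9))  # 87381 spheres for Main[9, 512, 4] -- "over 80,000"
```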

With 9,512,1 your version is 170x faster than my code, which took 26,000

seconds here. The original arguments 9,512,4 would take 16x longer still,

meaning your code is still 640x slower than the fastest compiled languages.

That is the five orders of magnitude I was referring to.

> I did

> not try it on larger scenes, so possibly there is something going wrong

> in the recursion. Certainly any of these codes will scale linearly with

> the number of pixels (so quadratically with n), so my using a dimension

> of 52 instead of 512 here should not matter for assessment of relative

> speeds.

I'd recommend cutting ss down to 1 before reducing the resolution (i.e. use

3,200,1 instead of 3,50,4) but there is still the chance you'll remove

thousands of tiny spheres from the image if you render a big scene, so the

resolution can have a dramatic effect on performance at very low

resolutions.

> >> inf = 100000.;

> >

> > This is nasty. Why is infinity not represented internally by the

> > floating

> > point number infinity in Mathematica?

>

> Mathematica does not use machine double infinities, or bignum infinities

> for that matter. Generally when those are encountered in a computation

> e.g. with dedicated machine arithmetic, an exception is thrown and

> Mathematica attempts to work around it e.g. with software higher

> precision.

>

> This is not to say there is no reason to have a machine double infinity

> in Mathematica. I can say, however, that putting one in would involve

> considerable work, and I think it would give but small improvement in

> speed or functionality.

Surely that is not the case here, where Compile'd code is intended to handle

infinities often? Or does Compile already handle infinity as a

machine-precision float?

> > What is the purpose of "With[{inf = inf}, ..]" here?

>

> This is to inject the evaluated form into the Compile. Else it is a

> nuisance to get that form, because Compile has the HoldAll attribute.

I see. :-)

> > Defer calculation of the normal. Fair enough but it is only done once

> > per

> > pixel.

>

> It makes a small but measurable difference to speed. Helps a bit more

> than you indicate, because the original code uses Normalize inside

> Intersect, where it might be done as many times as lambda keeps

> decreasing for a given originating call (to Intersect, with the entire

> scene).

True. Hierarchical bounds checks mean that lambda cannot be decreased many

times, though. I should quantify exactly how often that occurs.

Perhaps a more interesting solution would be to wrap the result in Hold[..]

to get laziness?

> >> In[22]:= Timing[ff512 = Main[9, 512, 4];]

> >> Out[22]= {1542.27, Null}

> >

> > That is orders of magnitude faster than my version but I am not sure

> > why.

> > I suspect the main performance improvement comes from the use of packed

> > arrays but I haven't been able to reproduce it in my own code yet.

>

> Yes, packed arrays, which among other things avoids ALL exact arithmetic

> in places you do not want it. Also compiling RaySphere gives, by itself,

> a factor of two or more in overal speed gain.

I had already tried Compile but never got anywhere near that performance. I

suspect your packed arrays and precalculated normalized vectors do the rest.

> >> At this point, it might make sense to use ParallelMap instead of Map,

> >> in Main.

> >

> > Replacing Map with ParallelMap slows it down, even with

> > Method -> "CoarsestGrained". How can I get a speedup from my 8 cores?!

> > :-)

> >

> > Many thanks,

>

> I only made the suggestion, without trying it, because I have no actual

> familiarity with ParallelMap. But one possibility: did you use

> DistributeDefinitions on all the functions underneath that Map? If not,

> that could certainly make it slower rather than faster.

I've managed to get rid of all the errors by calling DistributeDefinitions

with the symbol names of all functions *and* the local "scene" variable

right before the ParallelMap. However, this still doesn't give any

performance improvement (and it only spawns 4 kernels for my 8 cores).

Might make a nice parallelism demo for Mathematica 7 if you can get it

working...

Dec 14, 2008, 7:32:28 AM

Jon Harrop wrote:

<snip>

> I've managed to get rid of all the errors by calling DistributeDefinitions

> with the symbol names of all functions *and* the local "scene" variable

> right before the ParallelMap. However, this still doesn't give any

> performance improvement (and it only spawns 4 kernels for my 8 cores).

Jon,

You seem to believe that Mathematica 7.0 uses as many cores as are available

on a given machine (the marketing documentation is quite misleading in

this respect).

However, the default license allows only up to 4 cores to be used (of

course, one can upgrade one's license by paying an additional fee).

The number of cores can also be restricted by the user. Check the tab

"Parallel" in the "Preferences" dialog box (goto menu "Edit" ->

"Preferences" at the bottom) [1].

Regards,

- Jean-Marc

[1] _Configuring and Monitoring_,

http://reference.wolfram.com/mathematica/ParallelTools/tutorial/ConfiguringAndMonitoring.html

Dec 15, 2008, 7:48:41 AM

Jean-Marc Gulliet wrote:

> Jon Harrop wrote:

>> I've managed to get rid of all the errors by calling

>> DistributeDefinitions with the symbol names of all functions *and* the

>> local "scene" variable right before the ParallelMap. However, this still

>> doesn't give any performance improvement (and it only spawns 4 kernels

>> for my 8 cores).

>

> Jon,

>

> You seem to believe that Mathematica 7.0 uses as many cores as are available

> on a given machine (the marketing documentation is quite misleading in

> this respect).

>

> However, the default license allows only up to 4 cores to be used (of

> course, one can upgrade one's license by paying an additional fee).

Yes. I believe I succeeded in parallelizing the ray tracer across 4 of my 8

cores but it gave no speedup. Specifically, I had 1 core at 100% usage,

three more at 20% usage and performance was degraded significantly.

For reference, my final timings of our implementations are:

420,000s My Mathematica

2,500s Daniel's Mathematica

3.9s My serial OCaml

1.4s My parallel OCaml on 8 cores

So the performance difference really was five orders of magnitude as my

preliminary tests had indicated.
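[Editor's note: as a sanity check, the ratios implied by the timing table above work out as claimed; a small sketch:]

```python
# Final timings from the table above, in seconds.
original_mma   = 420_000.0  # Jon's Mathematica
tuned_mma      = 2_500.0    # Daniel's Mathematica
serial_ocaml   = 3.9
parallel_ocaml = 1.4

print(round(original_mma / serial_ocaml))       # 107692: about five orders of magnitude
print(round(tuned_mma / serial_ocaml))          # 641: the "640x slower" figure
print(round(serial_ocaml / parallel_ocaml, 1))  # 2.8: OCaml's speedup on 8 cores
```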
