Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

[Caml-list] Announce: glome-0.2 (ocaml-based raytracer)

23 views

Skip to first unread message

Jim Snow

unread,

Jan 11, 2007, 8:28:57 PM1/11/07

to caml...@yquem.inria.fr

I've been working on a raytracer for awhile, and recently decided to
remove a lot of experimental code that doesn't work well anyways and
release the rest under the gpl version 2. Currently, glome renders some
of the scenes from the standard procedural database
(http://www.acm.org/tog/resources/SPD/). I thought that, aside from the
practical utility of generating pretty pictures, some people on this
list might be interested in using it to benchmark the quality of code
generated by various versions of the ocaml compiler.

Supported primitives are spheres and triangles. It uses a kd-tree as an
acceleration structure. There is limited joystick support (moving works
fine, but turning can have unexpected results) for those patient enough
to tolerate the low framerates.

I use lablgl for screen output, but there aren't any other libraries
required outside of the standard ocaml distribution.

I'm not a very experienced ocaml programmer, so I'm sure there are some
things I'm doing inefficiently just because I don't know better. I
welcome any suggestions that would make my code faster, or reduce the
memory footprint of my scene representation.

There is a discussion thread about glome over at ompf.org:
http://ompf.org/forum/viewtopic.php?t=336

Source code download is here: http://syn.cs.pdx.edu/~jsnow/glome/

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Jon Harrop

unread,

Jan 12, 2007, 7:22:37 AM1/12/07

to caml...@yquem.inria.fr

On Friday 12 January 2007 01:24, Jim Snow wrote:
> I've been working on a raytracer for awhile, and recently decided to
> remove a lot of experimental code that doesn't work well anyways and
> release the rest under the gpl version 2. Currently, glome renders some
> of the scenes from the standard procedural database
> (http://www.acm.org/tog/resources/SPD/). I thought that, aside from the
> practical utility of generating pretty pictures, some people on this
> list might be interested in using it to benchmark the quality of code
> generated by various versions of the ocaml compiler.

I have altered the code to be more idiomatic OCaml, although it is still very
not-OCaml. I've removed OOP from the hot path and virtual function dispatch
has been replaced with pattern matches.

http://www.ffconsultancy.com/temp/glome.tar.bz2

The code is now 1390LOC instead of 1746 (20% shorter). Performance is also
better. Building the Kd-tree is down from 7.0s to 6.3s.

I have many suggestions for what to do next:

1. Use records instead of float arrays: stronger type inference, more concise,
purely functional.

2. Get rid of almost all mutation. The core ray tracer has no reason to use
mutation and all those refs and assignments are confusing and probably slow.

3. Restructure the program again: put independent definitions related to
triangles in Triangle, put related definitions like the intersection routine
in Intersect.

Primarily, the program is far too verbose and convoluted. As an algorithm, ray
tracing is very functional in nature. I think the functionality provided by
this program could be achieved in half as many lines of code. It could also
be a lot faster.

> Supported primitives are spheres and triangles. It uses a kd-tree as an
> acceleration structure. There is limited joystick support (moving works
> fine, but turning can have unexpected results) for those patient enough
> to tolerate the low framerates.
>
> I use lablgl for screen output, but there aren't any other libraries
> required outside of the standard ocaml distribution.

Rather than rendering dots, you could generate a polygon mesh. To make things
more interesting, you could include the depth value in the mesh, so when you
rotate the scene it gets distorted by OpenGL without needing to ray trace
anything.

> I'm not a very experienced ocaml programmer, so I'm sure there are some
> things I'm doing inefficiently just because I don't know better. I
> welcome any suggestions that would make my code faster, or reduce the
> memory footprint of my scene representation.

My impression is that you are optimising prematurely. Get the program <1/2 the
size that it is before you even think about optimising anything. You're doing
all sorts of manual resource allocation and mutation thinking that it will
make things faster when, I think, it just makes the program unnecessarily
complicated.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

Jim Snow

unread,

Jan 15, 2007, 5:36:22 AM1/15/07

to Jon Harrop

Jon Harrop wrote:
> On Friday 12 January 2007 01:24, Jim Snow wrote:
>
>> I've been working on a raytracer for awhile, and recently decided to
>> remove a lot of experimental code that doesn't work well anyways and
>> release the rest under the gpl version 2. Currently, glome renders some
>> of the scenes from the standard procedural database
>> (http://www.acm.org/tog/resources/SPD/). I thought that, aside from the
>> practical utility of generating pretty pictures, some people on this
>> list might be interested in using it to benchmark the quality of code
>> generated by various versions of the ocaml compiler.
>>
>
> I have altered the code to be more idiomatic OCaml, although it is still very
> not-OCaml. I've removed OOP from the hot path and virtual function dispatch
> has been replaced with pattern matches.
>
> http://www.ffconsultancy.com/temp/glome.tar.bz2
>
> The code is now 1390LOC instead of 1746 (20% shorter). Performance is also
> better. Building the Kd-tree is down from 7.0s to 6.3s.
>
>

Sorry I'm a bit slow about replying; I was off trying to implement an
nlogn kd-tree compiler. Your version seems to have sped up the
raytracing by about 10%. However, I think I am going to stick with my
approach for the time being for the sake of maintainability; I don't
think putting all the ray-intersection code together in one
mutually-recursive is going to make the program easy to modify in the
future. I am tempted though. I might also give recursive modules a try.

(For those just joining us, my dilemma is thus: my raytracer defines
ray-intersection tests for a number of types of geometry. Aside from my
conventional primitives, triangles and spheres, I also have a number of
more abstract primitives like groups (a container for primitives, so I
can treat, say, a collection of triangles as if it were one triangle)
and kdtrees (semantically similar to a group, but with an axis-aligned
binary space partitioning scheme). In order for this latter type to
work correctly, they need to have a ray-intersection function that calls
the ray-intersection functions of their contained objects. Contained
objects may also be groups or kdtrees, hence the necessity of mutual
recursion. Due to the lack of mutual recursion across source files, I
had resorted to using objects; all primitives inherit from a base type
that supports a ray-intersection test. Unfortunately, this incurs
noticeable overhead. Jon Harrop's solution was to write one big
recursive ray-intersection test that pattern matches on the type of
supplied primitve, and evaluates the proper test.)

> I have many suggestions for what to do next:
>
> 1. Use records instead of float arrays: stronger type inference, more concise,
> purely functional.
>

I did try this after looking at your ray tracer; however, this did not
significantly affect performance, except in cases where I needed to
access vectors like an array (with an integer index), and none of the
tricks I could think of to do that were as fast as plain array access.
This created a bottleneck in my kd-tree traversal code (where
high-performance ray tracers tend to a significant portion, if not most,
of their time).

See http://ompf.org/forum/viewtopic.php?p=2709&highlight=#2709

> 2. Get rid of almost all mutation. The core ray tracer has no reason to use
> mutation and all those refs and assignments are confusing and probably slow.
>
>

If you mean the place where I pass a "traceresult" record into each
rayintersection test, I agree that that is definitely ugly, but at the
time it gained me a noticeable performance increase (I think it was
around 5% or so.). However, that was under a different workload; that
was before I implemented the kdtree and I was doing many more ray-object
intersections. I was also optimizing for a different set of
ray-intersection tests that returned more information, and I was keeping
the results around much longer (I couldn't throw them away until the
whole image was done rendering). I didn't want them to sit around so
long they get promoted to the major heap, only to get garbage collected
much later. It's possible those concerns aren't valid for the way the
raytracer currently works. I might try reverting that optimization one
of these days and seeing what happens.

If assignment to mutable fields bothers you, I suggest you avert your
eyes from the clr.ml file. I don't really have a good excuse for that
code, other than it seemed like a good idea at the time and I haven't
gotten around to re-writing it.

> 3. Restructure the program again: put independent definitions related to
> triangles in Triangle, put related definitions like the intersection routine
> in Intersect.
>
> Primarily, the program is far too verbose and convoluted. As an algorithm, ray
> tracing is very functional in nature. I think the functionality provided by
> this program could be achieved in half as many lines of code. It could also
> be a lot faster.
>

You're right that the program could be cleaner and much shorter, but I'm
relatively new to functional programming, and I haven't figured out all
the shortcuts. When I'm feeling lazy and I see something that seems
like it should be a loop I'll usually use a loop, whereas I could have
used recursion and saved myself some typing. I like that ocaml doesn't
force you to program in a particular way.

I also don't think "lines of code" is always a good way of measuring
code quality. Oop, for instance, adds a lot of cruft (which is one
reason I dislike java; I don't like being forced to do all that typing),
but I used it because it gave me mutual recursion without having to
stick all my mutually recursive functions together in one file, and
therefore I can group my code into smaller, more manageable units. (I
won't dispute that your version is faster.)

>> Supported primitives are spheres and triangles. It uses a kd-tree as an
>> acceleration structure. There is limited joystick support (moving works
>> fine, but turning can have unexpected results) for those patient enough
>> to tolerate the low framerates.
>>
>> I use lablgl for screen output, but there aren't any other libraries
>> required outside of the standard ocaml distribution.
>>
>
> Rather than rendering dots, you could generate a polygon mesh. To make things
> more interesting, you could include the depth value in the mesh, so when you
> rotate the scene it gets distorted by OpenGL without needing to ray trace
> anything.
>
>

Hm, I'll bet you'd like to know what the 2/3 of the code I didn't
publicly release does :)
(It doesn't do quite what you suggest, but I do draw the final image to
the screen as an adaptive triangle mesh (using the ROAM algorithm).)

Your bunny renderer looks interesting. It's been on my to-do list to
contrive some way to load more interesting datasets than the standard
procedural database.

brogoff

unread,

Jan 15, 2007, 1:37:37 PM1/15/07

to Jim Snow

On Mon, 15 Jan 2007, Jim Snow wrote:
> Jon Harrop wrote:
> > On Friday 12 January 2007 01:24, Jim Snow wrote:
> >> I've been working on a raytracer for awhile, and recently decided to
> >> remove a lot of experimental code that doesn't work well anyways and
> >> release the rest under the gpl version 2.

[...snip...]

> > I have altered the code to be more idiomatic OCaml, although it is still very
> > not-OCaml. I've removed OOP from the hot path and virtual function dispatch
> > has been replaced with pattern matches.
> >

> Sorry I'm a bit slow about replying; I was off trying to implement an
> nlogn kd-tree compiler. Your version seems to have sped up the
> raytracing by about 10%. However, I think I am going to stick with my
> approach for the time being for the sake of maintainability; I don't
> think putting all the ray-intersection code together in one
> mutually-recursive is going to make the program easy to modify in the
> future. I am tempted though. I might also give recursive modules a try.

I haven't coded a ray tracer in a long time, and the one I did was for a
college class, but my recollection is that even in C (the implementation
language I used) the design used classes/objects for the primitives, so
that one could add new primitives and the only piece of code that would
need modification would be the interpreter of scene description files.
I think using classes for that is the right approach. I'm sure you could
do it in a more coreish ML fashion without even using recursive modules,
say by emulating the OO in C approach with ML records of functions, but
it won't be any faster and will be uglier, since the class system
provides a kind of generalized polymorphic record.

This is a nice example for discussing the merits of OO features, and less
complex than the expression problem. A competitive "non-OO" approach
should provide easy extensibility along the same axes as the OO one.
I admit I don't see the need for cross file recursion here.

-- Brian

Nathaniel Gray

unread,

Jan 17, 2007, 6:07:53 PM1/17/07

to Jim Snow

On 1/15/07, Jim Snow <js...@cs.pdx.edu> wrote:

> Jon Harrop wrote:
> >
> > I have altered the code to be more idiomatic OCaml, although it is still very
> > not-OCaml. I've removed OOP from the hot path and virtual function dispatch
> > has been replaced with pattern matches.

> > [snip]

> >
> Sorry I'm a bit slow about replying; I was off trying to implement an
> nlogn kd-tree compiler. Your version seems to have sped up the
> raytracing by about 10%. However, I think I am going to stick with my
> approach for the time being for the sake of maintainability; I don't
> think putting all the ray-intersection code together in one
> mutually-recursive is going to make the program easy to modify in the
> future. I am tempted though. I might also give recursive modules a try.
>
> (For those just joining us, my dilemma is thus: my raytracer defines
> ray-intersection tests for a number of types of geometry. Aside from my
> conventional primitives, triangles and spheres, I also have a number of
> more abstract primitives like groups (a container for primitives, so I
> can treat, say, a collection of triangles as if it were one triangle)
> and kdtrees (semantically similar to a group, but with an axis-aligned
> binary space partitioning scheme). In order for this latter type to
> work correctly, they need to have a ray-intersection function that calls
> the ray-intersection functions of their contained objects. Contained
> objects may also be groups or kdtrees, hence the necessity of mutual
> recursion. Due to the lack of mutual recursion across source files, I
> had resorted to using objects; all primitives inherit from a base type
> that supports a ray-intersection test. Unfortunately, this incurs
> noticeable overhead. Jon Harrop's solution was to write one big
> recursive ray-intersection test that pattern matches on the type of
> supplied primitve, and evaluates the proper test.)

I wonder if you really need the mutual recursion. You can often avoid
mutual recursion by using closures. Instead of, say, a list of
objects with an isect (intersect) method you can use a list of
closures. Here's my silly example (you'll have to pardon my ignorance
of the domain):

(* Some "isectables" *)
type sphere = point3 * float * color
let isect_sphere sphere ray = ...

type triangle = point3 * point3 * point3 * color
let isect_triangle tri ray = ...

(* A group is just a list of closures that, when applied to a ray,
isect their contained geometry *)
type group = (ray -> unit) list
let isect_group group ray = List.iter (fun x -> x ray) group

let s = make_ray ... in
let t1 = make_triangle ... in
let s1 = make_sphere ... in
let group1 = [(isect_sphere s1); (isect_triangle t1)] in
isect_group group ray

I haven't benchmarked, but I think you should get better results than
if you were using virtual method dispatch in an inner loop.

Cheers,
-n8

--
>>>-- Nathaniel Gray -- Caltech Computer Science ------>
>>>-- Mojave Project -- http://mojave.cs.caltech.edu -->

Jon Harrop

unread,

Jan 17, 2007, 6:17:41 PM1/17/07

to caml...@yquem.inria.fr

On Wednesday 17 January 2007 23:01, you wrote:
> I haven't benchmarked, but I think you should get better results than
> if you were using virtual method dispatch in an inner loop.

I had already implemented that recursion trick but I accidentally sent my
response to Jim alone without spamming the list. Here's my latest source:

http://www.ffconsultancy.com/temp/glome.tar.bz2

I'm compiling with:

ocamlopt -inline 100 -unsafe -noassert -I +lablGL lablgl.cmxa
lablglut.cmxa -dtypes vec.ml clr.ml material.ml types.ml sphere.ml
triangle.ml group.ml kdtree.ml solid.ml camera.ml light.ml scene.ml trace.ml
glome.ml -o glome

Balls 14, Jim's then mine:

kdtree build time: 0.772883
with gc: 0.786881
total :2.393636

kdtree build time: 0.645902
with gc: 0.689895
total :2.331646

Tetra, Jim's then mine:

kdtree build time: 6.934946
with gc: 6.952944
total :0.429935

kdtree build time: 6.31004
with gc: 6.360033
total :0.396939

So my implementation is ~8% faster. That isn't significant, but the reduction
in source code of 25% is!

The code will be smaller and faster with records for vectors rather than
arrays. More importantly, we can rid of that hideous -unsafe. I'll do it ASAP
but I'm really busy writing more books on FPLs. ;-)

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

_______________________________________________

Jim Snow

unread,

Jan 18, 2007, 5:35:56 AM1/18/07

to Jon Harrop

New version (0.3) of glome is up: now with a new more elegant (and
presumably asymptotically optimal but still abysmally slow) kdtree
compiler, which unfortunately segfaults on large models. (Thanks to the
"marshalling limits" thread, I discovered that the size of the kdtree I
can construct depends on the size of the stack. I shall have to look
into this further.) It is also free of objects - I used Jon Harrop's
suggestion to implement mutual recursion by passing the generic
rayintersection function as an argument to any of the rayintersection
functions that need to be recursive. (It seems to me like an odd
solution, but I think I can live with it.) Mutation has been removed
from much of the rayintersection code, (also at Jon's suggestion) which
actually sped things up noticeably (about 15% or so).

Webpage is here: http://syn.cs.pdx.edu/~jsnow/glome/

Jon Harrop wrote:
>
> It didn't the last time I looked. Using "include" instead of "open" is often
> faster, probably for that reason.
>
>
I'll have to experiment with that and see what happens.
>> There are some hybrid renderers that do just that. There are some
>> reasons not to do that, though; for instance, ray tracing scales better
>> on big models (see, for instance,
>> http://www.openrt.de/Gallery/OliverDeussen_Sunflowers/Images/sunflowers_2.j
>> pg).
>>
>
> That simply isn't true. You can use trees with OpenGL and get the same
> asymptotic efficiency whilst also being 100x faster in real terms.
>
> I've written a planet renderer that adaptively tesselates a 2^80-triangle
> planet in real-time for OpenGL.
>
> I've written a 2D vector graphics engine that adaptively tesselates PostScript
> paths into triangles so you can fly around 1Gb PDF images in real time.
>
> If what you said was true, that wouldn't have been possible.
>
>
Perhaps I should be more specific about exactly what it is that is
scaling. With level-of-detail schemes (which could apply to ray-tracing
as well as GL), you can render datasets of enormous complexity, provided
you aren't trying to render it all at the same time. Your planet demo
looks very interesting, but it looks like your polygon counts at any
particular moment aren't very high. If you add some realistic
vegetation, the high polgygon counts would slow things down quite a
bit. OpenGL scales linearly with the number of triangles it has to
draw; ray-tracers scale logarithmically. You can avoid some of the
memory overhead of large scenes by using instancing, but GL still has to
draw every single triangle.

Ray-tracing has its own costs; sorting an acceleration structure, for
instance, can be very slow. Also, they currently only surpass the
performance of traditional polygon renderers on very complex scenes.
For most current rendering problems, it makes more sense to use GL right
now. But as computers get faster, and real-time global illumination
starts to become feasible, ray tracing is likely to look very
appealing. This is my opinion; you are free to disagree.

>
> Ray tracing is simply a bad way to render images, unless they
> are closeups of reflective spheres.
>
>
Opinions vary. So do datasets and application requirements.

>
>> So, I switched over to objects. This reduced
>> memory use a little, I think, but didn't help much. It did make things a
>> little slower, though. There's some more detailed discussion over at
>> ompf.org: http://ompf.org/forum/viewtopic.php?t=336
>>
>
> What is the memory use of my version like?
>
About 1.5 gigs for the 800k triangle level 4 sphereflake, same as my
version 0.2. I think the memory consumption is elsewhere. Most of the
memory gets consumed as the kdtree is being built.
> Apart from the texture mapping bugs, check out these screenshots of my planet
> demo. Now write me a ray tracer that can do that...
>
>
I doubt that whatever level-of-detail algorithms you employ in any way
preclude the use of raytracing, it would just be rather slow. (The
OpenRT people, for instance, are working on a drop-in OpenGL
replacement, and the XFRT are working on an OpenRT replacement that is
actually open.) Now write me an OpenGL app that can render this
correctly: http://graphics.ucsd.edu/~henrik/images/gbox.jpg :)

Nathaniel Gray wrote:
>
> I wonder if you really need the mutual recursion. You can often avoid
> mutual recursion by using closures. Instead of, say, a list of
> objects with an isect (intersect) method you can use a list of
> closures.

That's more or less what my original implementation did. I switched to
objects because I wasn't sure if closures were allocating space
efficiently. Then I switched to my current implementation because
calling object methods is slow (as evidenced by the results presented in
the "Benchmarking different dispatch types" thread). In the end, I
don't think it made a big difference - I'm just not intersecting with
very many primitives per ray. Every little bit helps, though.

Jon Harrop

unread,

Jan 18, 2007, 9:07:56 AM1/18/07

to caml...@yquem.inria.fr

On Thursday 18 January 2007 10:29, Jim Snow wrote:
> > What is the memory use of my version like?
>
> About 1.5 gigs for the 800k triangle level 4 sphereflake, same as my
> version 0.2. I think the memory consumption is elsewhere. Most of the
> memory gets consumed as the kdtree is being built.

I hit a shortcoming of OCaml's here. For a ray tracer, single-precision floats
are fine. But in OCaml you can't mix them with other types in a data
structure to save memory, you just use a pointer to a BigArray. I think this
is a justification for having float32 in the language.

I'll take the rest of the discussion back off list...

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

_______________________________________________

marinezpei...@gmail.com

unread,

May 3, 2012, 8:20:51 PM5/3/12

to caml...@yquem.inria.fr

0 new messages