Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re: [Caml-list] Array copying in OCaml

5 views
Skip to first unread message

Dr. Thomas Fischbacher

unread,
Jul 24, 2008, 12:38:41 PM7/24/08
to Raj Bandyopadhyay, caml...@yquem.inria.fr

Raj Bandyopadhyay wrote:

> I have an application which copies a lot of (small) OCaml arrays using
> the Array library (Array.sub and Array.blit) functions. This is turning
> out to be extremely expensive.
>
> Is there any general way/trick to reduce the cost of this kind of
> operation? I haven't found a way not to copy as much, because the
> program semantics seems to demand it.

Are you sure there is no way delaying copying to the latest possible
time (e.g. by introducing an intermediate array to which you copy data
and which you use until you know for sure that you will have to hold
on to your copy)?

--
best regards,
Thomas Fischbacher
t.fisc...@soton.ac.uk


_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Christophe TROESTLER

unread,
Jul 24, 2008, 1:25:55 PM7/24/08
to ra...@rice.edu, caml...@yquem.inria.fr
On Thu, 24 Jul 2008 11:31:50 -0500, Raj Bandyopadhyay wrote:
>
> I have an application which copies a lot of (small) OCaml arrays
> using the Array library (Array.sub and Array.blit) functions. This
> is turning out to be extremely expensive.
>
> Is there any general way/trick to reduce the cost of this kind of
> operation? I haven't found a way not to copy as much, because the
> program semantics seems to demand it.

That all depends on the set of operations you want to support on these
arrays. For example, if all you want is access, concatenation and
slices, then an approach like the Rope library [1] may be envisioned.
What is the problem you want to solve?

Cheers,
C.

[1] http://ocaml-rope.sourceforge.net/

Adrien

unread,
Jul 24, 2008, 3:47:36 PM7/24/08
to Raj Bandyopadhyay, caml...@yquem.inria.fr
You can compile with the -unsafe flag. If you're using arrays a lot,
it can easily speed your program by 50%. If your program is array
based, the speedup can be even more important.

Don't assume it could magically make your program behave in a weird
way. When you're trying to access an element which is outside the
bounds of the array, ocaml rises an exception. With the -unsafe flag,
no exception will be risen and you will have no chance to catch it.
Consequently if you're out of bounds, your programm will die. That's
why it is called "unsafe". If you're sure you won't be out of bounds
(or are not trying to catch any exception), you can safely use
-unsafe.

If you don't want to affect globally your program, you can also use
Array.unsafe_get or Arra.unsafe_set. I don't know the full list
though.


---

Adrien Nader

2008/7/24 Raj Bandyopadhyay <ra...@rice.edu>:
> Hi


>
> I have an application which copies a lot of (small) OCaml arrays using the
> Array library (Array.sub and Array.blit) functions. This is turning out to
> be extremely expensive.
>
> Is there any general way/trick to reduce the cost of this kind of operation?
> I haven't found a way not to copy as much, because the program semantics
> seems to demand it.
>

> Thanks
> Raj

Raj Bandyopadhyay

unread,
Jul 27, 2008, 10:04:22 PM7/27/08
to caml...@yquem.inria.fr
As a follow up to this: I got a 3x performance improvement by writing my
own version of the Array.sub function in C, patterned after the
caml_make_vect function in the OCaml compiler.

I don't think that C is the main reason for the improvement, it's more
fundamental:

1) The OCaml library version of Array.sub creates an array,
***initializes it***, and then copies to it. The initialization is quite
unnecessary, and algorithmically makes the function about twice as
expensive as it should be.

2) From reading OCaml source code, it looks like caml_initialize is a
much cheaper function to use than caml_modify due to GC issues. Yet, the
OCaml library version uses caml_modify by initializing the target array
with a default value first, instead of using the source array to
initialize. I guess this is why on profiling, caml_modify shows up as
really expensive, paired with lots of GC calls.

I do believe that this calls for a better version of the Array.sub
function in the next version of OCaml. Here is my C code below (I am
only using it for an array of records, so the code is specialized to
that). I would be glad to submit a more general version if people want
it (and someone tells me where to submit it).

Thanks!
Raj


CAMLprim value caml_my_array_sub(value source, value ofs, value len)
{
CAMLparam3 (source,ofs,len);
CAMLlocal1 (res);
mlsize_t size, wsize, i,offset;

size = Long_val(len);
offset = Long_val(ofs);
if (size == 0)
{
res = Atom(0);
}
else
{
if (size > Max_wosize) caml_invalid_argument("Array.sub");
if (size < Max_young_wosize)
{
res = caml_alloc_small(size, 0);
for (i = 0; i < size; i++)
{
Field(res, i) = Field(source,i+offset);
}
}
else
{
caml_minor_collection();
res = caml_alloc_shr(size, 0);
for (i = 0; i < size; i++)
{
caml_initialize(&Field(res, i), Field(source,i+offset));
}
res = caml_check_urgent_gc (res);
}
}
CAMLreturn (res);
}


Adrien wrote:
> You can compile with the -unsafe flag. If you're using arrays a lot,
> it can easily speed your program by 50%. If your program is array
> based, the speedup can be even more important.
>
>
>

Richard Jones

unread,
Jul 28, 2008, 3:52:46 AM7/28/08
to Raj Bandyopadhyay, caml...@yquem.inria.fr
On Sun, Jul 27, 2008 at 09:03:58PM -0500, Raj Bandyopadhyay wrote:
> that). I would be glad to submit a more general version if people want
> it (and someone tells me where to submit it).

That's quite surprising. Here's where you would submit bugs &
patches:

http://caml.inria.fr/mantis/view_all_bug_page.php

Rich.

--
Richard Jones
Red Hat

Raj Bandyopadhyay

unread,
Jul 28, 2008, 1:34:34 PM7/28/08
to caml...@yquem.inria.fr
Thanks Richard!

I added my C code to array.c (in the development version which I got
from CVS), and added an external declaration in array.ml. However, I now
get the following compiler error:

File "_none_", line 1, characters 0-1:
Error: Error while linking boot/stdlib.cma(Array):
The external function `caml_general_array_sub' is not available
make[1]: *** [ocamlc] Error 2
make[1]: Leaving directory `/home/rajb/research/software/sources/ocaml'
make: *** [world] Error 2

Obviously this is because of incompatibility in the 'boot' version vs
the current version of the compiler. How do I get around this? I have
never worked on the OCaml compiler before, so I am definitely missing
something.

Richard Jones

unread,
Jul 28, 2008, 5:11:47 PM7/28/08
to Raj Bandyopadhyay, caml...@yquem.inria.fr
On Mon, Jul 28, 2008 at 12:34:17PM -0500, Raj Bandyopadhyay wrote:
> I added my C code to array.c (in the development version which I got
> from CVS), and added an external declaration in array.ml. However, I now
> get the following compiler error:

It's hard to say. Do you have a suggested patch against the compiler
we can look at & test?

Rich.

--
Richard Jones
Red Hat

_______________________________________________

Michel Mauny

unread,
Jul 29, 2008, 5:37:19 AM7/29/08
to Raj Bandyopadhyay, OCaml
Having a look at the comment "Hard bootstrap how-to" in the toplevel
Makefile of the OCaml distribution may help.

-- MM

--
Michel Mauny

ENSTA
(+33) 1 4552 5388 (ENSTA)
(+33) 1 3963 5796 (INRIA)

0 new messages