Is access to Bigarrays slower when they are passed to functions? If so, is it
fixable, or is there some workaround?
I stumbled upon this while coding a school assignment ((not too ;) fast
multiplication of large integers).
1)
let a = Array.make 1000000 0 in
for i = 0 to 999999 do
  a.(i) <- a.(i) + i;
done;;
2)
let a = Array.make 1000000 0 in
let rec loop a i =
  if i <= 999999 then begin
    a.(i) <- a.(i) + i;
    loop a (i + 1)
  end in
loop a 0;;
3)
open Bigarray;;
let a = Array1.create int c_layout 1000000 in
Array1.fill a 0;
for i = 0 to 999999 do
  a.{i} <- a.{i} + i
done;;
4)
open Bigarray;;
let a = Array1.create int c_layout 1000000 in
Array1.fill a 0;
let rec loop a i =
  if i <= 999999 then begin
    a.{i} <- a.{i} + i;
    loop a (i + 1)
  end in
loop a 0;;
-Richard
Pure caml use of bigarray is generally slower.
http://caml.inria.fr/archives/200110/msg00148.html
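A commonly cited reason for the slowdown in case 4 is that "loop" is
polymorphic in the bigarray's kind and layout, so the native-code compiler
cannot specialize the a.{i} accesses and may fall back to the generic
runtime path. The usual workaround is a type annotation on the bigarray
argument. The sketch below is only an illustration of that idea: the names
loop_generic and loop_annotated are made up here, and how much the
annotation helps depends on the compiler version and target.

```ocaml
open Bigarray

(* Kind and layout are left polymorphic: the inferred type is
   (int, 'b, 'c) Array1.t, so the element representation is not
   statically known inside this function. *)
let rec loop_generic a i n =
  if i < n then begin
    a.{i} <- a.{i} + i;
    loop_generic a (i + 1) n
  end

(* Same loop, with the bigarray type spelled out so the compiler
   knows the kind and layout at this access site. *)
let rec loop_annotated (a : (int, int_elt, c_layout) Array1.t) i n =
  if i < n then begin
    a.{i} <- a.{i} + i;
    loop_annotated a (i + 1) n
  end

let n = 1_000_000

let a = Array1.create int c_layout n
let () = Array1.fill a 0; loop_annotated a 0 n

let b = Array1.create int c_layout n
let () = Array1.fill b 0; loop_generic b 0 n
```

Both loops compute the same result; only the amount of static type
information available at each access differs.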
An aside (not all facts are checked):
Bigarrays (of floats, at least) can have a slight edge over normal arrays.
To get maximal speed in inner loops, data needs to be naturally
aligned. OCaml does nothing to enforce this for non-big arrays. Bigarrays, on
the other hand, are mmapped (4k on IA32), so you get perfectly aligned data
for free. I was thinking that maybe Array could be extended with
make[create]_aligned, for a speed/space tradeoff.
--
mailto:ma...@pulsesoft.com
Additionally, it would also be nice to have a specialized "create"
function for (naturally unboxed) float arrays such that they need not
be initialized with a given float value. This may be beneficial for
algorithms that allocate work space whose contents are not necessarily
fully needed but are filled on demand.
Regards,
Markus Mottl
--
Markus Mottl mar...@oefai.at
Austrian Research Institute
for Artificial Intelligence http://www.oefai.at/~markus
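For readers on a recent compiler: later versions of the standard library
(4.03 and up, to the best of my knowledge) provide Array.create_float,
which allocates a float array without initializing it. A minimal sketch of
the fill-on-demand use described above, with the names n and work chosen
here purely for illustration:

```ocaml
(* Work space allocated without initialization: the contents are
   unspecified until each cell is written. *)
let n = 8
let work = Array.create_float n

(* Fill the cells on demand before reading them. *)
let () =
  for i = 0 to n - 1 do
    work.(i) <- float_of_int i *. 0.5
  done
```

Reading a cell before writing it yields an arbitrary float, so this is
only safe when the algorithm writes every cell it later reads.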
I did this once to be able to interface with Fortran libs on Sparc32,
and I still have a patch (against ocaml-2.01) lying around. It was
actually quite thoroughly tested, but it's probably not very tidy. The
main gotcha iirc was getting output_value/input_value to preserve
alignment :).
Yes. That message explained it very well :)
Thanks!
-Richard
As William Chester said, allocating 8-aligned arrays isn't really
hard, but keeping them 8-aligned across copying collection,
compaction, and structured I/O is quite a pain.
My experiments indicate that the lack of alignment on float arrays
(or more precisely the fact that they are 4-aligned instead of
8-aligned) has negligible impact on performance for the IA32 (Pentium)
and PowerPC processors, but non-negligible for SPARC and MIPS.
And of course on a 64-bit architecture the problem goes away because
everything in the Caml heap is then 8-aligned. Since I expect IA32
and PowerPC to remain dominant until we massively switch to 64-bit
processors, there's no urgent need to do something about float array
alignment.
- Xavier Leroy
> malc writes:
> > To get maximal speed of the inner loops data needs to be naturally
> > aligned. OCaml does nothing to enforce it for non-big arrays. Bigarrays on
> > the other hand are mmaped(4k on IA32) and you get perfectly aligned data
> > for free. I was thinking that maybe Array can be extended with
> > make[create]_aligned, for speed/space tradeoff.
>
> As William Chester said, allocating 8-aligned arrays isn't really
> hard, but keeping them 8-aligned across copying collection,
> compaction, and structured I/O is quite a pain.
>
> My experiments indicate that the lack of alignment on float arrays
> (or more precisely the fact that they are 4-aligned instead of
> 8-aligned) has negligible impact on performance for the IA32 (Pentium)
> and PowerPC processors, but non-negligible for SPARC and MIPS.
> And of course on a 64-bit architecture the problem goes away because
> everything in the Caml heap is then 8-aligned. Since I expect IA32
> and PowerPC to remain dominant until we massively switch to 64-bit
> processors, there's no urgent need to do something about float array
> alignment.
IA32 is now a much bigger family, and unlucky owners of AMD 7th-generation
machines, such as myself, do pay a price for unaligned double-precision
float accesses.
--
mailto:ma...@pulsesoft.com