I'd like to announce the first pre-alpha release of libsndfile-ocaml
which is available here:
http://www.mega-nerd.com/tmp/libsndfile-ocaml.tgz
The ocamldoc generated docs are here:
http://www.mega-nerd.com/libsndfile/Ocaml/Sndfile.html
At this stage, basic reading from and writing to a file works.
Once I have received some feedback on what I have so far, I
intend to complete wrapping of the rest of the libsndfile API
on an as-needed basis.
Feedback please :-).
Cheers,
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"Every time an American goes to a gas station, he is sending money
to America's enemies." -- http://www.meforum.org/article/653
_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
Thanks a lot!
I haven't had the time to really test it, but it seems to work nicely.
Wouldn't it be interesting to optionally create a bigarray instead of an
array?
Best wishes to all
San
> Thanks a lot!
>
> haven't had the time to really test it, but it seems to work nicely.
> Wouldn't it be interesting to optionally create a bigarray instead of an
> array ?
Wow, I've just looked up the Ocaml bigarray module. I didn't even
know it existed. I'll take a look at it. It may be a better solution
than what I have now.
I'm curious about your comment about optional creation of bigarrays.
Does anyone have an example of this kind of optional usage of bigarrays?
Cheers,
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"I'm too fucking busy, or vice versa" -- Dorothy Parker
> I'm curious about your comment about optional creation of bigarrays.
> Does anyone have an example of this kind of optional usage of bigarrays?
Ok, found an example in the OCaml Cairo bindings. I'll add it to the
libsndfile bindings.
Cheers,
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"Capitalism is an art form, an Apollonian fabrication to rival nature.
It is hypocritical for feminists and intellectuals to enjoy the pleasures
and conveniences of capitalism while sneering at it. Everyone born into
capitalism has incurred a debt to it."
-- Camille Paglia
> Vu Ngoc San wrote:
>
> > haven't had the time to really test it, but it seems to work nicely.
> > Wouldn't it be interesting to optionally create a bigarray instead of an
> > array ?
>
> Wow, I've just looked up the Ocaml bigarray module. I didn't even
> know it existed. I'll take a look at it. It may be a better solution
> than what I have now.
After a more detailed look at this I don't see any advantage to
using the bigarray module.
The C API for libsndfile allows reading data as short, int, float or
double. During reads, libsndfile automatically converts from the
internal data format of the file to the data format requested by the
user.
The trouble interfacing with Ocaml is that Ocaml doesn't support
16 bit shorts, 32 bit ints or 32 bit floats. The only data type that
makes any sense is the Ocaml float type (C double).
So, I'd like to ask, if I provide a bigarray interface, how would it
be used?
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"There are two kinds of large software systems: those that
evolved from small systems and those that don't work."
-- Seen on slashdot.org, then quoted by amk
> The trouble interfacing with Ocaml is that Ocaml doesn't support
> 16 bit shorts, 32 bit ints or 32 bit floats. The only data type that
> makes any sense is the Ocaml float type (C double).
>
> So, I'd like to ask, if I provide a bigarray interface, how would it
> be used?
From the manual:
type ('a, 'b) kind
To each element kind is associated a Caml type, which is the type of
Caml values that can be stored in the big array or read back from it.
This type is not necessarily the same as the type of the array elements
proper: for instance, a big array whose elements are of kind float32_elt
contains 32-bit single precision floats, but reading or writing one of
its elements from Caml uses the Caml type float, which is 64-bit double
precision floats.
The abstract type ('a, 'b) kind captures this association of a Caml type
'a for values read or written in the big array, and of an element kind
'b which represents the actual contents of the big array. The following
predefined values of type kind list all possible associations of Caml
types with element kinds:
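As a small illustration of the manual text above, the following sketch stores samples as 32-bit floats and signed 16-bit integers while reading and writing them as ordinary OCaml `float` and `int` values. The names `singles` and `shorts` are made up for this example, and it assumes an OCaml version where `Bigarray` is part of the standard library.

```ocaml
open Bigarray

(* Stored as 32-bit single precision, accessed as OCaml float (64-bit). *)
let singles = Array1.create float32 c_layout 2

(* Stored as signed 16-bit integers, accessed as OCaml int. *)
let shorts = Array1.create int16_signed c_layout 2

let () =
  singles.{0} <- 0.5;   (* exactly representable in single precision *)
  singles.{1} <- 0.1;   (* rounded to the nearest single when stored *)
  shorts.{0} <- 32767;
  shorts.{1} <- (-32768);
  (* The element kind, not the OCaml type, decides the storage size. *)
  Printf.printf "singles: %d bytes, shorts: %d bytes\n"
    (Array1.size_in_bytes singles) (Array1.size_in_bytes shorts)
```

The value read back from `singles.{1}` is no longer equal to the double `0.1`, which is the precision cost of the smaller element kind.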
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
> type ('a, 'b) kind
> To each element kind is associated a Caml type, which is the type of
> Caml values that can be stored in the big array or read back from it.
> This type is not necessarily the same as the type of the array elements
> proper: for instance, a big array whose elements are of kind float32_elt
> contains 32-bit single precision floats, but reading or writing one of
> its elements from Caml uses the Caml type float, which is 64-bit double
> precision floats.
But why is that any better than the existing Sndfile read method
which already returns an array of OCaml floats? See:
http://www.mega-nerd.com/libsndfile/Ocaml/Sndfile.html
which has:
val sf_read : sndfile_t -> float array -> int
val sf_write : sndfile_t -> float array -> int
(well actually sndfile_t has been changed to Sndfile.t).
Since it is already possible to read Ocaml floats (which are normalised
to the range [-1.0, 1.0]) why would anyone want to read any other data
type?
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"Do I do everything in C++ and teach a course in advanced swearing?"
-- David Beazley at IPC8, on choosing a language for teaching
Two reasons I can think of[*]: (a) to avoid copying, (b) to make an
exact reproduction (without the conversion to and from float).
Rich.
[*] I haven't looked at the libsndfile code so I've no idea if they're
correct, but hey it's Sunday...
--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Internet Marketing and AdWords courses - http://merjis.com/courses - NEW!
Merjis blog - http://blog.merjis.com - NEW!
I don't claim it is .. just answering your question, which
was about how to use an array of, for example, shorts.
> See:
>
> http://www.mega-nerd.com/libsndfile/Ocaml/Sndfile.html
>
> which has:
>
> val sf_read : sndfile_t -> float array -> int
> val sf_write : sndfile_t -> float array -> int
>
> (well actually sndfile_t has been changed to Sndfile.t).
>
> Since it is already possible to read Ocaml floats (which are normalised
> to the range [-1.0, 1.0]) why would anyone want to read any other data
> type?
Performance or space issues?
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
> On Sun, Dec 31, 2006 at 03:23:05PM +1100, Erik de Castro Lopo wrote:
> > Since it is already possible to read Ocaml floats (which are normalised
> > to the range [-1.0, 1.0]) why would anyone want to read any other data
> > type?
>
> Two reasons I can think of[*]: (a) to avoid copying, (b) to make an
> exact reproduction (without the conversion to and from float).
Well the amount of copying is the same whether I use bigarray or a
standard Ocaml float array so (a) is irrelevant.
Point (b) does make some sense in that someone might want to open a file,
seek to position A, read data from position A to position B, writing the
read data to a new file. An important criterion might well be that the
copied section of data be identical in source and destination files.
However, using Ocaml float data this would actually be the case.
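A rough check of why a 16-bit sample can survive a trip through an OCaml float: dividing by a power of two is exact in double precision, so scaling back recovers the original integer. The scale factor of 32768 in both directions and the helper names are assumptions for illustration; libsndfile's own conversion may use different constants.

```ocaml
(* Assumed scale factor: 2^15 in both directions (not taken from libsndfile). *)
let scale = 32768.0

(* Hypothetical helpers: short -> normalised float and back. *)
let float_of_short (s : int) : float = float_of_int s /. scale
let short_of_float (x : float) : int = int_of_float (Float.round (x *. scale))

(* Try every possible 16-bit value and check it comes back unchanged. *)
let round_trip_is_exact () =
  let ok = ref true in
  for s = -32768 to 32767 do
    if short_of_float (float_of_short s) <> s then ok := false
  done;
  !ok

let () =
  Printf.printf "16-bit round trip exact: %b\n" (round_trip_is_exact ())
```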
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"Hamas: Islam will conquer US and Britain."
-- http://www.pmw.org.il/LatestBulletins.htm#b220606
> > Since it is already possible to read Ocaml floats (which are normalised
> > to the range [-1.0, 1.0]) why would anyone want to read any other data
> > type?
>
> Performance or space issues?
It depends on what is being done. For any signal processing algorithms,
anything other than floats normalised to [-1.0, 1.0] is a huge pain in
the neck. Any performance increases that might be achieved using ints
would be squandered by the extra processing required to deal with scaling
issues and Ocaml's 31 bit ints.
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
Saying Python is easier than C++ is like saying that turning a
light switch on or off is easier than operating a nuclear reactor.
This could be interesting for read-only access or in-place modifications?
Another very interesting feature of bigarrays is the memory mapping of a
file as a big array, very useful to work with BIG files.
Matt
> Is it really the case? I thought that it was possible to create a
> bigarray wrapping a C array without
> copying data. I do not know how to achieve this for float arrays.
When reading files libsndfile always does at least one copy; from
the disk to the array supplied by the caller. This single copy
only occurs if the data in the file is the same format
and endian-ness as the format requested by the caller. When the data
formats are not the same two copies are required; from the disk to
a buffer internal to libsndfile and then a copy/data conversion
to the buffer supplied by the caller.
The above doesn't change regardless of whether the caller supplies
an Ocaml float array or a bigarray.
In addition, I also regard the most common case to be the one where
a data conversion takes place between the file format and the format
requested by the caller.
> This could be interesting for read-only access or in-place modifications?
I don't see how this would be different for a float array vs a bigarray.
> Another very interesting feature of bigarrays is the memory mapping of a
> file as a big array, very useful to work with BIG files.
Firstly, libsndfile doesn't use mem-mapping because the most common case
is where the disk format is different from the format requested by the
caller. Secondly I
consider a big file to be one containing say an hour of multiple channels
(say 8) of 32 bit float data at high sample rates (say 96kHz). That file
is:
96000 * 60 * 60 * 8 * 4 bytes =>
11059.200 Mbytes =>
11.059 Gbytes
Nobody is going to load the whole of that file into memory at once. Instead,
the most sensible and most general approach is to load it in chunks.
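The chunked approach can be sketched as a loop that reuses one buffer. Everything here is a stand-in: `make_fake_reader` only imitates the shape of a reader like `sf_read` partially applied to a file (fill a `float array`, return the number of items read, 0 at the end), and `fold_chunks` and `peak_and_count` are hypothetical helper names.

```ocaml
(* Hypothetical stand-in for a reader bound to an open file: it fills the
   buffer and returns how many items were written, 0 at end of data. *)
let make_fake_reader (total : int) : float array -> int =
  let pos = ref 0 in
  fun buf ->
    let n = min (Array.length buf) (total - !pos) in
    for i = 0 to n - 1 do
      (* Synthetic ramp with values in [-1.0, 1.0). *)
      buf.(i) <- float_of_int ((!pos + i) mod 200 - 100) /. 100.0
    done;
    pos := !pos + n;
    n

(* Process the stream chunk by chunk, reusing a single buffer. *)
let fold_chunks ~read ~chunk_size ~init ~f =
  let buf = Array.make chunk_size 0.0 in
  let rec loop acc =
    let n = read buf in
    if n = 0 then acc else loop (f acc buf n)
  in
  loop init

(* Example use: peak absolute value and number of samples seen. *)
let peak_and_count read =
  fold_chunks ~read ~chunk_size:1024 ~init:(0.0, 0)
    ~f:(fun (peak, count) buf n ->
      let p = ref peak in
      for i = 0 to n - 1 do
        p := Float.max !p (Float.abs buf.(i))
      done;
      (!p, count + n))

let () =
  let peak, count = peak_and_count (make_fake_reader 10_000) in
  Printf.printf "read %d samples, peak %.2f\n" count peak
```

Memory use stays at one chunk regardless of how long the file is.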
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"I saw `cout' being shifted "Hello world" times to the left
and stopped right there." -- Steve Gonedes
Why not? That's tiny compared to available address space on a 64
bit machine, and personal computers have heaps
of free address space.
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
> On Tue, 2007-01-02 at 06:58 +1100, Erik de Castro Lopo wrote:
> > That file
> > is:
> >
> > 96000 * 60 * 60 * 8 * 4 bytes =>
> > 11059.200 Mbytes =>
> > 11.059 Gbytes
> >
> > Nobody is going to load the whole of that file into memory at once.
>
> Why not? That's tiny compared to available address space on a 64
> bit machine, and personal computers have heaps
> of free address space
Ok, so someone writes a simple application that loads the whole file
into memory and then plays it. Unfortunately disk transfer speeds being
on the order of 100 MB/sec means that it's going to take 110 seconds to
load that file. That's bad!
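The figures in this exchange can be re-derived with a few lines. This sketch assumes the disk rate means 100 decimal megabytes per second and uses floats so the arithmetic also works on a 32-bit OCaml.

```ocaml
let sample_rate = 96_000.0        (* samples per second per channel *)
let seconds = 60.0 *. 60.0        (* one hour *)
let channels = 8.0
let bytes_per_sample = 4.0        (* 32 bit float data *)
let disk_rate = 100.0e6           (* assumed: 100 MB/s, decimal megabytes *)

let file_bytes = sample_rate *. seconds *. channels *. bytes_per_sample
let load_seconds = file_bytes /. disk_rate

let () =
  Printf.printf "%.3f GB, %.1f s to read sequentially\n"
    (file_bytes /. 1e9) load_seconds
```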
Obviously, the smart way to do it is to stream that file off disk in 100kB
chunks. That's what libsndfile is designed and optimised for.
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"C++ is a language strongly optimized for liars and people who
go by guesswork and ignorance." -- Erik Naggum
But a mem-mapped file shouldn't be loaded into physical memory until
it's accessed, and then only the page that has the data, right?
-e
> But a mem-mapped file shouldn't be loaded into physical memory until
> it's accessed, and then only the page that has the data, right?
But libsndfile doesn't use mem-mapping to access files, only read/write.
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"I could never learn to use C++, because of the completely
overwhelming desire to redesign the language every time I tried
to use it, but this is the normal, healthy reaction to C++."
-- Erik Naggum
> On Tue, 2007-01-02 at 06:58 +1100, Erik de Castro Lopo wrote:
>> That file
>> is:
>>
>> 96000 * 60 * 60 * 8 * 4 bytes =>
>> 11059.200 Mbytes =>
>> 11.059 Gbytes
>>
>> Nobody is going to load the whole of that file into memory at once.
>
> Why not? That's tiny compared to available address space on a 64
> bit machine, and personal computers have heaps
> of free address space.
I had to deal with big files and OCaml too and Erik's approach
sounds good to me.
On 64 bit machines you may mmap huge files, but you can't on 32-bit
machines. I ran into trouble with files > 700MB. Maybe you could mmap
smaller blocks, but this isn't possible with the current
implementation of bigarrays mmap (since you need to mmap with an
offset). Furthermore mmap behaves a bit differently on different operating
systems.
Measurements show that mmap doesn't give a big (or any) speed
up. For the OS the advantage is that no swap space needs to be
reserved.
Christoph Bauer
San
> I was thinking of using libsndfile in combination with the ocaml-gsl
> (gnu scientific library), and the latter uses bigarray, afaik.
Now that is a useful data point.
However, I do notice that the Gsl_vector module has functions:
val of_array : float array -> vector
val to_array : vector -> float array
I also notice that the Gsl_vector functions are all of type:
(float, Bigarray.float64_elt, Bigarray.c_layout) Bigarray.Array1.t
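To show what such conversions involve, here is a sketch using only the standard `Bigarray` module. The `vector` alias and the two functions are local stand-ins with the same shape as the signatures quoted above; they are not the Gsl_vector implementations.

```ocaml
open Bigarray

(* Local alias with the same shape as the type quoted above. *)
type vector = (float, float64_elt, c_layout) Array1.t

(* Both directions copy the data element by element. *)
let of_array (a : float array) : vector = Array1.of_array float64 c_layout a
let to_array (v : vector) : float array =
  Array.init (Array1.dim v) (fun i -> v.{i})

let () =
  let samples = [| 0.0; 0.5; -0.25 |] in
  let v = of_array samples in
  v.{1} <- 1.0;  (* the bigarray is a copy, so [samples] is unchanged *)
  Printf.printf "array: %g, bigarray: %g\n" samples.(1) (to_array v).(1)
```

Since each conversion is a full copy, a reader that fills a bigarray directly would save one pass over the data when the consumer expects a bigarray.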
> I am just
> wondering whether then it would be appropriate to have bigarrays from
> libsndfile.
I was particularly interested if there was any utility to providing
functions for accessing shorts or ints. So far no one has come up with
a need for these.
It does however seem that it may be useful to provide access to the
data via a bigarray of Bigarray.float64_elt elements.
Cheers,
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"Java is, in many ways, C++--." -- Michael Feldman
Ocaml-vorbis and ocaml-mad take strings as input. Reading data from
libsndfile as string would allow a straightforward use of these libs
together.
One might argue that having vorbis and mad work on float arrays would
be better. That's a point. But on the other hand, the current
datatypes avoid some conversions -- at least on the vorbis/mad side.
My 2 cents.
--
David
> Mmapping the file doesn't require pre-loading it; it's loaded
> on demand by the paging system. Still, some files might not be mappable,
> depending on the OS and device they're on.
As I have already stated twice in this thread, libsndfile does not
mmap files. It just reads and/or writes :-).
BTW, anyone mmapping files on Linux for performance reasons should
be aware that Linus himself doesn't think mmap will have any
performance improvement over read:
http://www.cs.helsinki.fi/linux/linux-kernel/2001-40/1661.html
Someone has benchmarked mmap vs read/write and found mmap lacking:
http://lkml.org/lkml/2002/3/13/38
Interestingly, mmap is also slower than read on FreeBSD:
http://lists.freebsd.org/pipermail/freebsd-questions/2004-June/050245.html
http://lists.freebsd.org/pipermail/freebsd-questions/2004-June/050265.html
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"If you think C++ is not overly complicated, just what is a
protected abstract virtual base pure virtual private destructor
and when was the last time you needed one?" -- Tom Cargill
> Ocaml-vorbis and ocaml-mad take strings as input.
Personally, I consider that a mistake :-).
I've looked a little at these interfaces, say:
http://savonet.sourceforge.net/ocamldoc/ocaml-vorbis/Vorbis.html
and just picking a single function:
val encode_buffer : encoder -> string -> string
Encode a wav buffer into ogg. WARNING: the size of the input buffer
should be less than or equal to 1024 bytes.
and there is no information about how the data is to be stored in the
input buffer. As someone reasonably knowledgeable in audio processing
and audio file formats, I assume that the input string is pairs of
bytes, with each pair being a low byte and a high byte of a 16 bit
C short, with the endian-ness specified when the encoder is created.
So, the above interface you have created is perfectly adequate if all
you want to do is read from one file and write to another file. If
you want to do more than that; say read from a file, process the data
and then write to another file, then this interface is a pain in the
neck because you need to convert from the string data to an array of
int or float, process and then convert back to a string.
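The conversion step described above might look like the sketch below. It assumes little-endian signed 16-bit samples and a scale factor of 32768, neither of which is documented for the Vorbis interface quoted here; the function names are hypothetical.

```ocaml
(* Decode little-endian signed 16-bit samples packed in a string
   (the byte order is an assumption). *)
let floats_of_string (s : string) : float array =
  Array.init (String.length s / 2) (fun i ->
    let lo = Char.code s.[2 * i] and hi = Char.code s.[2 * i + 1] in
    let v = lo lor (hi lsl 8) in
    let v = if v >= 0x8000 then v - 0x10000 else v in
    float_of_int v /. 32768.0)

(* Encode floats back to the same layout, clipping to the 16-bit range. *)
let string_of_floats (a : float array) : string =
  let b = Bytes.create (2 * Array.length a) in
  Array.iteri (fun i x ->
    let v = int_of_float (Float.round (x *. 32768.0)) in
    let v = max (-32768) (min 32767 v) land 0xFFFF in
    Bytes.set b (2 * i) (Char.chr (v land 0xFF));
    Bytes.set b (2 * i + 1) (Char.chr (v lsr 8))) a;
  Bytes.to_string b

let () =
  (* Four samples: 0, 16384, -32768 and 32767. *)
  let raw = "\x00\x00\x00\x40\x00\x80\xff\x7f" in
  let processed = Array.map (fun x -> x *. 0.5) (floats_of_string raw) in
  Printf.printf "%d bytes out\n" (String.length (string_of_floats processed))
```

Every processing step on a string-based interface needs this pair of conversions around it, which is the inconvenience being described.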
> Reading data from
> libsndfile as string would allow a straightforward use of these libs
> together.
That is an argument that libsndfile should add read_string/write_string
methods. I am yet to be convinced.
> One might argue that having vorbis and mad work on float arrays would
> be better. That's a point. But on the other hand, the current
> datatypes avoid some conversions -- at least on the vorbis/mad side.
Conversion from short to float and from float to short (done right) is
very, very cheap in comparison to the vorbis/mad encoding and decoding.
I would argue that you would be unable to measure the difference between
decoding to string and decoding to float array because the noise of
other factors like disk access, paging and other activity on the system
would swamp the conversion time.
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"To me C++ seems to be a language that has sacrificed orthogonality
and elegance for random expediency." -- Meilir Page-Jones
Just a quick and somewhat interesting observation: Linus assumes that the
read buffer is page-aligned in this comparison, something you cannot
expect in most higher-level languages (of course you can arrange that in
libsndfile - did you?). That reminds me that the OS guys live in another
world. Would be interesting which function wins if you compare Unix.read
(which does not care about alignment, and does an extra copy) with
Bigarray.mmap.
Gerd
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
ge...@gerd-stolpmann.de http://www.gerd-stolpmann.de
Phone: +49-6151-153855 Fax: +49-6151-997714
------------------------------------------------------------
Thanks,
PKE.
--
Pål-Kristian Engstad (eng...@naughtydog.com), Lead Programmer, ICE
team, Naughty Dog, Inc., 1601 Cloverfield Blvd, 6000 North,
Santa Monica, CA 90404, USA. Ph.: (310) 633-9112.
"Most of us would do well to remember that there is a reason Carmack
is Carmack, and we are not Carmack.",
Jonathan Blow, 2/1/2006, GD Algo Mailing List
Right, this is an unwarranted assumption. Furthermore, for certain
applications with complex access patterns I'd expect it to be way easier for
the user to use mmap efficiently. Even assuming that "read" were always
faster, implementing your own complex buffer management to cache data
efficiently for particular access patterns may not be easy. Unless your
name is Linus, of course... ;-)
> That reminds me that the OS guys live in another
> world. Would be interesting which function wins if you compare Unix.read
> (which does not care about alignment, and does an extra copy) with
> Bigarray.mmap.
>
I use mmap in a fileserver for performance reasons. Compared to using
I/O-channels it requires, if I remember correctly, only about 50% of the
CPU-time, but OCaml-channels have to do an extra copy from the channel
buffer to the user buffer so this is not quite a fair comparison. I'd still
expect mmap to reduce overall CPU-time over "read" for larger files due to
less data copying between kernel buffers and user space.
I have observed that mmap does not pay for small files (my cutoff point is
8192 bytes), possibly due to some setup overhead for memory mappings. Total
time is not strongly affected, because we are generally I/O-bound, but if
you run the fileserver on the same machine as applications, the lower
CPU-time is a noticeable advantage.
There is one caveat though regarding mmap with OCaml on 32-bit platforms: the
GC has a bug which prevents it from reclaiming bigarrays aggressively
enough. Though there is plenty of RAM (the kernel need not keep mapped
files in memory), the process might run out of address space. I hope this
bug (0004108) will be fixed in the next release.
Regards,
Markus
--
Markus Mottl http://www.ocaml.info markus...@gmail.com