
[Caml-list] Buffer.add_channel hangs


Stas Miasnikou

Mar 17, 2010, 4:27:23 PM
to caml...@yquem.inria.fr
Hi,

OCaml 3.11.1, OpenBSD 4.6, i386.

I am trying to read a whole file by doing:

let read_file_bin name =
  let ic = open_in_bin name in
  let b = Buffer.create 1024 in
  (try Buffer.add_channel b ic max_int with _ -> ()); (* <-- HERE *)
  close_in ic;
  Array.init (Buffer.length b) (fun i -> int_of_char (Buffer.nth b i))

but it hangs on the marked line. Am I doing something wrong?

Stas

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Martin Jambon

Mar 17, 2010, 7:20:29 PM
to Stas Miasnikou, caml...@yquem.inria.fr
Stas Miasnikou wrote:
> Hi,
>
> OCaml 3.11.1, OpenBSD 4.6, i386.
>
> I am trying to read whole file by doing:
>
> let read_file_bin name =
> let ic = open_in_bin name in
> let b = Buffer.create 1024 in
> (try Buffer.add_channel b ic max_int with _ -> ()); (* <-- HERE *)
> close_in ic;
> Array.init (Buffer.length b) (fun i -> int_of_char (Buffer.nth b i))
>
> but it hangs on the line marked. Am I doing something wrong?

The problem is max_int and the fact that Buffer.add_channel and Buffer.resize
do not check for this possibility:

let add_channel b ic len =
  if b.position + len > b.length then resize b len;
  really_input ic b.buffer b.position len;
  b.position <- b.position + len

Something like the following would be better:

let add_channel b ic len =
  if len < 0 || len > Sys.max_string_length then
    invalid_arg "Buffer.add_channel";
  ...

Since you uncovered this problem, please kindly submit a proper bug report at
http://caml.inria.fr/mantis

(and figure out what to do if the file is larger than 16MB on 32-bit systems)


Of course, you can see from the implementation of the Buffer module that a
string of your maximum length is created no matter what, which you surely want
to avoid especially on 64-bit systems where Sys.max_string_length is very large.
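
One way to avoid requesting the maximum length up front is to read the
file in fixed-size chunks, so the buffer only grows as data arrives. A
minimal sketch, assuming OCaml 4.02 or later (for the Bytes module);
read_file_chunked and chunk_size are names made up for this
illustration, not part of the standard library:

```ocaml
(* Read a whole file in fixed-size chunks. The buffer grows only as
   data actually arrives, so no huge allocation is requested up front.
   read_file_chunked and chunk_size are hypothetical names. *)
let read_file_chunked ?(chunk_size = 4096) name =
  let ic = open_in_bin name in
  let buf = Buffer.create chunk_size in
  let chunk = Bytes.create chunk_size in
  let rec loop () =
    (* input returns 0 only when the end of the file is reached *)
    let n = input ic chunk 0 chunk_size in
    if n > 0 then begin
      Buffer.add_subbytes buf chunk 0 n;
      loop ()
    end
  in
  (* Close the channel even if reading fails, then re-raise. *)
  (try loop () with e -> close_in_noerr ic; raise e);
  close_in ic;
  Buffer.contents buf
```

The result is still an ordinary string, so the 16MB limit on 32-bit
systems applies to the total size; only the up-front allocation is
avoided.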

Martin

--
http://mjambon.com/

Goswin von Brederlow

Mar 17, 2010, 10:53:28 PM
to Stas Miasnikou, caml...@yquem.inria.fr
Stas Miasnikou <stas.mi...@gmail.com> writes:

> Hi,
>
> OCaml 3.11.1, OpenBSD 4.6, i386.
>
> I am trying to read whole file by doing:
>
> let read_file_bin name =
> let ic = open_in_bin name in
> let b = Buffer.create 1024 in
> (try Buffer.add_channel b ic max_int with _ -> ()); (* <-- HERE *)
> close_in ic;
> Array.init (Buffer.length b) (fun i -> int_of_char (Buffer.nth b i))
>
> but it hangs on the line marked. Am I doing something wrong?
>
> Stas

For the problem see the other mail.

For a better solution I suggest you look at the Bigarray module. You can
mmap your file as an int8_unsigned array and have your read_file function
done in a single step.

MfG
Goswin

Stas Miasnikou

Mar 18, 2010, 2:47:26 AM
to Martin Jambon, caml...@yquem.inria.fr
On 3/18/10, Martin Jambon <martin...@ens-lyon.org> wrote:

> Stas Miasnikou wrote:
>> OCaml 3.11.1, OpenBSD 4.6, i386.
>>
>> I am trying to read whole file by doing:
>>
>> let read_file_bin name =
>> let ic = open_in_bin name in
>> let b = Buffer.create 1024 in
>> (try Buffer.add_channel b ic max_int with _ -> ()); (* <-- HERE *)
>> close_in ic;
>> Array.init (Buffer.length b) (fun i -> int_of_char (Buffer.nth b i))
>>
>> but it hangs on the line marked. Am I doing something wrong?
>
> The problem is max_int and the fact that Buffer.add_channel and
> Buffer.resize
> do not check for this possibility:
>
> let add_channel b ic len =
> if b.position + len > b.length then resize b len;
> really_input ic b.buffer b.position len;
> b.position <- b.position + len
>
> Something like the following would be better:
>
> let add_channel b ic len =
> if len < 0 || len > Sys.max_string_length then
> invalid_arg "Buffer.add_channel";
> ...

Oh, never thought OCaml has bugs! ;-)

> Since you uncovered this problem, please kindly submit a proper bug report
> at
> http://caml.inria.fr/mantis

I've submitted it.

> (and figure what to do if the file is larger than 16MB on 32-bit systems)
>
> Of course, you can see from the implementation of the Buffer module that a
> string of your maximum length is created no matter what, which you surely
> want
> to avoid especially on 64-bit systems where Sys.max_string_length is very
> large.

Erm... yes. I think I'll follow the other advice and use Bigarray.

Stas

Fabrice Le Fessant

Mar 18, 2010, 5:05:26 AM
to Stas Miasnikou, caml...@yquem.inria.fr
Hi,

Maybe you should just use (in_channel_length ic) to get the size of the
file beforehand, so that you can directly create a string of that
size instead of a Buffer.t?
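
A minimal sketch of that idea, assuming OCaml 4.02 or later (for
really_input_string) and a regular file whose length fits in a string.
The function name and the int array result are taken from the original
post; the error handling is an addition for illustration:

```ocaml
(* Read a whole regular file by asking for its length first, then
   reading exactly that many bytes in one call. *)
let read_file_bin name =
  let ic = open_in_bin name in
  let contents =
    try
      let len = in_channel_length ic in
      really_input_string ic len
    with e ->
      (* Close the channel before propagating the error. *)
      close_in_noerr ic;
      raise e
  in
  close_in ic;
  (* Same result type as in the original post: one int per byte. *)
  Array.init (String.length contents) (fun i -> Char.code contents.[i])
```

On a 32-bit system a file longer than Sys.max_string_length cannot be
read this way; the sketch lets the resulting exception propagate to the
caller rather than hiding it.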

Regards,
--Fabrice

--
Fabrice LE FESSANT
Chercheur, Equipe ASAP
(As Scalable As Possible)
http://www.lefessant.net/

INRIA-Futurs, Bat P - 112
Parc Orsay Université
2-4, rue Jacques Monod
F-91893 Orsay Cedex, FRANCE

Stas Miasnikou

Mar 18, 2010, 1:52:29 PM
to caml...@yquem.inria.fr
On 3/17/10, Stas Miasnikou <stas.mi...@gmail.com> wrote:
> OCaml 3.11.1, OpenBSD 4.6, i386.
>
> I am trying to read whole file by doing:
>
> let read_file_bin name =
> let ic = open_in_bin name in
> let b = Buffer.create 1024 in
> (try Buffer.add_channel b ic max_int with _ -> ()); (* <-- HERE *)
> close_in ic;
> Array.init (Buffer.length b) (fun i -> int_of_char (Buffer.nth b i))
>
> but it hangs on the line marked. Am I doing something wrong?

More on this, when doing:

let read_file_bin name =
  let ic = open_in_bin name in
  let b = Buffer.create 1024 in
  (try Buffer.add_channel b ic (n + 100) with _ -> ());
  close_in ic;
  print_int (Buffer.length b); print_newline ();
  Array.init (Buffer.length b) (fun i -> int_of_char (Buffer.nth b i))

With a file of length n (65536) I get nothing read, i.e. after
reading, Buffer.length b returns 0. Can anyone check this, so I know
whether this is an issue with OCaml itself or with my OpenBSD 4.6 port of it?

Stas Miasnikou

Mar 20, 2010, 4:29:25 AM
to caml...@yquem.inria.fr
On 3/18/10, Stas Miasnikou <stas.mi...@gmail.com> wrote:
> On 3/17/10, Stas Miasnikou <stas.mi...@gmail.com> wrote:
>> OCaml 3.11.1, OpenBSD 4.6, i386.
>>
>> I am trying to read whole file by doing:
>>
>> let read_file_bin name =
>> let ic = open_in_bin name in
>> let b = Buffer.create 1024 in
>> (try Buffer.add_channel b ic max_int with _ -> ()); (* <-- HERE *)
>> close_in ic;
>> Array.init (Buffer.length b) (fun i -> int_of_char (Buffer.nth b i))
>>
>> but it hangs on the line marked. Am I doing something wrong?
>
> More on this, when doing:
>
> let read_file_bin name =
> let ic = open_in_bin name in
> let b = Buffer.create 1024 in
> (try Buffer.add_channel b ic (n + 100) with _ -> ());
> close_in ic;
> print_int (Buffer.length b); print_newline ();
> Array.init (Buffer.length b) (fun i -> int_of_char (Buffer.nth b i))
>
> With file length equal to n (65536) I get nothing read, i.e. after
> reading Buffer.length b returns 0. Can anyone check this, so I know
> whether this is OCaml or my OpenBSD 4.6 port of it issue?

Aha, this behaviour is documented, mea culpa.
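
For reference, the documented behaviour is that Buffer.add_channel
raises End_of_file when the channel holds fewer characters than
requested, and the catch-all "with _ -> ()" above hides that. A small
sketch that makes the outcome visible, assuming OCaml 4.02 or later
(for "match ... with exception"); add_channel_outcome is a made-up
helper name:

```ocaml
(* Report what Buffer.add_channel does for a given requested length,
   instead of swallowing every exception.
   add_channel_outcome is a hypothetical helper for illustration. *)
let add_channel_outcome name n =
  let ic = open_in_bin name in
  let b = Buffer.create 1024 in
  let outcome =
    match Buffer.add_channel b ic n with
    | () -> "ok"
    | exception End_of_file -> "End_of_file"
    | exception Invalid_argument _ -> "Invalid_argument"
  in
  close_in ic;
  outcome
```

Whether the buffer keeps the characters read before End_of_file depends
on the OCaml version, so it is safer not to rely on Buffer.length after
a failed call.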
