Stream API extension


to...@rastageeks.org

Aug 21, 2008, 9:12:49 AM
to bitstring
Hi all !

Following the stream enhancement proposal --
http://code.google.com/p/bitstring/issues/detail?id=1 -- I believe we
can start the discussion here.

With Sam, we had an idea which was similar to yours. Assuming that the
bitstring API is forward-only, which already seems to be the case, a
streamed bitstring would be created by providing a reading function.
Then the bitstring would be a string reference that could be extended
when needed.
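
A minimal sketch of that idea, under stated assumptions: the names reader, streamed, create, ensure and reader_of_string are invented for illustration and are not part of the bitstring library, and the sketch uses Bytes and Buffer from a current OCaml standard library rather than the mutable strings of 2008.

```ocaml
(* Hypothetical sketch: none of these names exist in the bitstring library. *)

(* A reader stores at most [len] bytes into [buf] starting at [ofs] and
   returns the number of bytes read; 0 means end of stream (like Unix.read). *)
type reader = Bytes.t -> int -> int -> int

type streamed = {
  read : reader;
  data : Buffer.t;  (* everything read so far; it only ever grows *)
}

let create read = { read; data = Buffer.create 4096 }

(* Make sure at least [bits] bits are buffered, pulling more data when
   needed. Returns false if the stream ends first. *)
let ensure s bits =
  let chunk = Bytes.create 4096 in
  let rec loop () =
    if Buffer.length s.data * 8 >= bits then true
    else
      let n = s.read chunk 0 (Bytes.length chunk) in
      if n <= 0 then false
      else begin
        Buffer.add_subbytes s.data chunk 0 n;
        loop ()
      end
  in
  loop ()

(* A reader over an in-memory string that returns at most 3 bytes per call,
   to imitate short reads from a socket or pipe. *)
let reader_of_string src =
  let pos = ref 0 in
  fun buf ofs len ->
    let n = min (min len 3) (String.length src - !pos) in
    Bytes.blit_string src !pos buf ofs n;
    pos := !pos + n;
    n
```

Because the API is forward-only, the buffer never has to be rewound; a real implementation would probably also drop bytes that have already been matched.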

Possibly, one could also provide a "seek" function if seeking
performs better than reading.

Then, I believe this has to be validated against an actual
implementation. I will try to look at this when I have time (probably
not before the end of next week).


Romain

Richard Jones

Aug 21, 2008, 9:47:11 AM
to bits...@googlegroups.com

I'm a bit concerned about the performance impact and the fact that as
far as I can see the very basic & simple bitstring type would need to
change.

type bitstring = string * int * int

I'm using this tuple directly in quite a few places. It's also neat
and understandable for the common case. Because it doesn't contain
any functions, it can be marshalled too.

Isn't the best way to do this to add another type? Let's call it a
'bitstring_stream'. This would have to contain function callbacks
and whatever else is needed to handle streams.
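
As a rough illustration of the two-type approach (the record fields and to_bitstring below are assumptions made up for this sketch, not a proposed API):

```ocaml
(* The existing type: data, offset in bits, length in bits. It holds no
   closures, which is why Marshal can serialise it. *)
type bitstring = string * int * int

(* Hypothetical stream type; the field names are invented for illustration. *)
type bitstring_stream = {
  refill : unit -> string option;  (* next chunk, or None at end of stream *)
  mutable pending : string;        (* bytes fetched but not yet consumed *)
}

(* Take a snapshot of the buffered data as an ordinary bitstring, so that
   existing code working on the tuple keeps working unchanged. *)
let to_bitstring (s : bitstring_stream) : bitstring =
  (s.pending, 0, String.length s.pending * 8)
```

Keeping the callbacks in a separate record leaves the plain tuple untouched, so code that marshals it or pattern-matches on it directly is unaffected.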

The bitmatch operator cannot work directly with both a bitstring and a
bitstring_stream, so it needs some syntactic change, eg:

bitmatch_stream bits_ext with
| ...

and this allows us to generate slightly different code too. Thus no
performance penalty in the common case.

By the way, I also have another, kind of related request, for memory
maps, but I'll put that into another thread.

Rich.

--
Richard Jones
Red Hat

to...@rastageeks.org

Aug 21, 2008, 10:02:51 AM
to bitstring
Indeed, a separate type could be a good idea.

Also, a different matching primitive would be good, since a matching
that needs to collect more data could possibly hold up the whole
processing, so it's semantically good to make the difference visible
to the programmer.

It would also be possible to manually grab the needed bits and then
use a bitstring matching.
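
A small sketch of that manual route, assuming the data comes from an in_channel; grab is a hypothetical helper, not a bitstring function:

```ocaml
type bitstring = string * int * int

(* Hypothetical helper: read exactly [nbytes] bytes from [ic] and wrap them
   as a plain bitstring, or return None if the stream ends first. *)
let grab ic nbytes : bitstring option =
  match really_input_string ic nbytes with
  | s -> Some (s, 0, nbytes * 8)
  | exception End_of_file -> None

(* The result is an ordinary bitstring, so the existing operator applies,
   for instance (field names are only an example):
     bitmatch bits with
     | { tag : 8; len : 8 } -> ...                                        *)
```

The blocking read is then explicit in the program, which is the same point the separate matching primitive is meant to make.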

Samuel Mimram

Aug 21, 2008, 10:05:45 AM
to bits...@googlegroups.com
to...@rastageeks.org wrote:
> Also, a different matching primitive would be good, since a matching
> that needs to collect more data could possibly hold up the whole
> processing, so it's semantically good to make the difference visible
> to the programmer.

The question is then how much of the code can be factored between the
two ways of matching. Richard, do you have an idea on this point?

Cheers,

Samuel.

to...@rastageeks.org

Aug 28, 2008, 8:03:39 PM
to bitstring
Hi all !

I am trying to start this..

For now, I have the following types:

type stream_stub =
  { read : string -> int -> int -> int;
    skip : (int -> int) option;
    rewind : (int -> int) option
  }
and stream_bitstring = stream_stub * Buffer.t

I think it's good to mimic the Unix and Pervasives read APIs.
I then wrap things that way:

let read (stub,buf) tmp ofs len =
  let n = stub.read tmp ofs len in
  (* stub.read stores the data at offset ofs in tmp, not at 0 *)
  if n > 0 then
    Buffer.add_substring buf tmp ofs n;
  n

and

let bitstring_of_stream (stub,buf) =
  Buffer.contents buf, 0, Buffer.length buf lsl 3

However, I don't know what to do with the length of streams. Indeed,
read functions usually count in bytes, while bitstring lengths are in
bits.

I think it is better to count in bytes for streams. What do you
think?
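
Whichever unit the stream API uses, the conversion is small. A sketch with two hypothetical helpers (bits_of_bytes and bytes_needed are not bitstring functions):

```ocaml
(* Length in bits of [n] whole bytes. *)
let bits_of_bytes n = n lsl 3

(* Number of whole bytes that must be read to cover [bits] bits,
   rounding up when [bits] is not a multiple of 8. *)
let bytes_needed bits = (bits + 7) lsr 3
```

For example, a pattern that needs 12 bits requires bytes_needed 12 = 2 bytes from the reader, which then make bits_of_bytes 2 = 16 bits available to the matcher.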


Romain

to...@rastageeks.org

Aug 29, 2008, 9:21:54 AM
to bitstring
Hummm...

Sorry for noise, but I'm still evaluating the possibilities..
I now believe it's much more handy to use the Stream core module:
type bitstream = string Stream.t

Romain

Richard Jones

Aug 31, 2008, 4:29:31 AM
to bits...@googlegroups.com
On Fri, Aug 29, 2008 at 06:21:54AM -0700, to...@rastageeks.org wrote:
> Sorry for noise, but I'm still evaluating the possibilities..
> I now believe it's much more handy to use the Stream core module:
> type bitstream = string Stream.t

The hard part isn't defining the type, it's modifying the bitmatch
operator so it understands streams.

Richard Jones

Aug 21, 2008, 10:50:08 AM
to bits...@googlegroups.com

You'd definitely want to share code. The key would be to refactor
output_bitmatch[1] so that as much is shared as possible ...

It's a 400+ line monster function so I don't envy whoever gets to sit
down and write the patch.

Rich.

[1] http://code.google.com/p/bitstring/source/browse/trunk/pa_bitstring.ml#431

Romain Beauxis

Sep 22, 2008, 5:22:10 AM
to bits...@googlegroups.com
On Thursday 21 August 2008 16:50:08, Richard Jones wrote:
> You'd definitely want to share code.  The key would be to refactor
> output_bitmatch[1] so that as much is shared as possible ...
>
> It's a 400+ line monster function so I don't envy whoever gets to sit
> down and write the patch.

(Sorry for the delayed answer.)

Yes, indeed I came to that conclusion too.

Although I'm still considering the idea of writing another camlp4 extension to
implement my lazy string idea, I will try to have a look at output_bitmatch's
code when I have more time.

Probably not in the near future, since I have a lot of stuff to finish for
the end of this year...


R.
