Amid much rejoicing, my Haskell version of protocol-buffers is now
released (version 0.2.9).
What is this for? What does it do? Why?
Shorter answer: It generates Haskell data types that can be converted to
and from lazy ByteStrings that interoperate with Google's generated code in
C++/Java/Python.
It is a pure Haskell re-implementation of the Google code at
http://code.Google.com/apis/protocolbuffers/docs/overview.html
which is "...a language-neutral, platform-neutral, extensible way of
serializing structured data for use in communications protocols, data storage,
and more."
Google's project produces C++, Java, and Python code. This one produces
Haskell code.
The release tarball (with 3 Haskell packages inside, see README in source) is at
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/protocol-buffers
The darcs repository has moved to
http://darcs.haskell.org/packages/protocol-buffers2/
You will also need a recent ghc compiler, the "binary" package and the
"utf8-string" package from hackage.haskell.org (same site as mentioned
above).
The source compiles to 3 things:
1) the package "protocol-buffers" with the library API
2) the package "protocol-buffers-descriptor" with the
descriptor.proto code
3) The 'hprotoc' executable which is a command line program similar
to 'protoc'.
The "examples" sub-directory in the code has the Haskell version of
the "addressbook.proto" example and is compatible with Google's
similar example code.
The code generated from unittest.proto (and unittest_import.proto)
includes messages TestAllTypes and TestAllExtensions which have been
extensively tested by QuickCheck to ensure they can be wire encoded
and decoded (see the "tests" sub-directory in the code).
The user API, as exported by Text.ProtocolBuffers, converts messages
to and from the lazy ByteString type. Messages can also be merged,
and their defaults accessed, via the MessageAPI type class.
The messages in Haskell are just regular data types and are thus
immutable. Required fields are simple record fields, optional fields
are Maybe, and repeated fields are Seq (from Data.Sequence).
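To make that mapping concrete, here is a minimal hand-written sketch of what such types could look like for something in the spirit of addressbook.proto. The type and field names (Person, PhoneNumber, ident, addPhone and so on) are hypothetical and chosen for illustration only; this is not the output of hprotoc.

```haskell
import Data.Int (Int32)
import Data.Sequence (Seq)
import qualified Data.Sequence as Seq

-- Hypothetical stand-ins for generated message types (not hprotoc output).
data PhoneNumber = PhoneNumber
  { number    :: String        -- required field: plain record field
  , phoneType :: Maybe Int32   -- optional field: Maybe
  } deriving (Show, Eq)

data Person = Person
  { name   :: String           -- required
  , ident  :: Int32            -- required
  , email  :: Maybe String     -- optional
  , phones :: Seq PhoneNumber  -- repeated field: Seq
  } deriving (Show, Eq)

-- Messages are immutable, so "changing" one is an ordinary record update
-- that returns a new value and leaves the original untouched.
addPhone :: PhoneNumber -> Person -> Person
addPhone p person = person { phones = phones person Seq.|> p }
```

Appending with Seq's (|>) does not copy the whole sequence, which is one plausible reason to prefer Seq over a list for repeated fields.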
Extensions are supported via Key data that allows access to the
extension fields. Extensible messages contain an opaque ext'field
entry of type ExtField that holds the map of extension field values.
The user API can serialize a message as the usual series of fields.
It can also write a length prefix to create delimited messages, or
write a wire tag with any field number before the length and message
data. This last form looks like a field on the wire, and there is a
special API call to read back just the one message and its field
number. This last API is similar to the one that is part of the C# API.
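For readers unfamiliar with the wire format, the two framings can be sketched as follows. This is a standalone illustration of the protocol buffers encoding (base-128 varints, wire type 2 for length-delimited data), not the library's own code; the names varint, delimited and tagged are made up for this example.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8, Word64)

-- Base-128 varint: seven payload bits per byte, least significant group
-- first, with the high bit set on every byte except the last.
varint :: Word64 -> [Word8]
varint n
  | n < 0x80  = [fromIntegral n]
  | otherwise = (fromIntegral (n .&. 0x7F) .|. 0x80) : varint (n `shiftR` 7)

-- Delimited message: varint length prefix followed by the message bytes.
delimited :: L.ByteString -> L.ByteString
delimited body = L.pack (varint (fromIntegral (L.length body))) `L.append` body

-- Tagged message: a wire tag (field number plus wire type 2, meaning
-- length-delimited) written before the length and the message bytes.
tagged :: Word64 -> L.ByteString -> L.ByteString
tagged fieldNumber body = L.pack (varint key) `L.append` delimited body
  where
    key = (fieldNumber `shiftL` 3) .|. 2
```

Here body stands for an already serialized message, so tagged 1 body produces bytes that a reader would see as field 1 of some enclosing message.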
No benchmarks have been run yet. Any suggestions?
Unsupported for the moment is loading and storing "unknown" fields.
It can be added sooner if someone has a use for this.
Unsupported indefinitely is code generation for Services and Methods.
I have yet to look into how this is presented in the other languages.
The API to read a single message field, as mentioned above, might be
extended to read any type instead of just messages.
optional clever_quote {
<autrijus> Perl: "Easy things are easy, hard things are possible"
<autrijus> Haskell: "Hard things are easy, the impossible just
happened"
}
Cheers!
Chris Kuklewicz
_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
I had learned the trick before the documentation was updated. It seems I have
used a very unreliable trick. And the "use castToSTUArray" suggested
alternative is a really poor one since I am not using arrays at all.
Who can suggest a way to cast from Float to Word32 and Double to Word64 using
ghc? The actual task is that I need to write out the Float as a little endian
sequence of four bytes and also be able to read it back in. The writing and
reading are done in Put and Get monads to ByteString (from the "binary" package).
The alloca/poke/peek workaround I have looks like

    castWord32ToFloat :: Word32 -> Float
    castWord32ToFloat x = unsafePerformIO $
      alloca $ \p -> poke p x >> peek (castPtr p)

    castFloatToWord32 :: Float -> Word32
    castFloatToWord32 x = unsafePerformIO $
      alloca $ \p -> poke p x >> peek (castPtr p)

The unsafeCoerce trick that is no longer working looks like:

    castWord64ToDouble :: Word64 -> Double
    castWord64ToDouble (W64# w) = D# (unsafeCoerce# w)

    castDoubleToWord64 :: Double -> Word64
    castDoubleToWord64 (D# d) = W64# (unsafeCoerce# d)

Any ideas? Or is the alloca trick the only way to do this?
Chris
Ian Lynagh wrote:
> Hi Chris,
>
> On Sun, Sep 21, 2008 at 05:37:33PM +0100, Chris Kuklewicz wrote:
>> Also, I tried two tricks:
>> (D# x) <-> (W64# x) which works fine
>> (F# x) <-> (W32# x) which produced garbage, so I had to replace it with
>> alloca/poke/peek.
>
> This isn't supported, and I suspect is the cause of the -fasm problems.
>
> Please see
> http://hackage.haskell.org/trac/ghc/ticket/2209
> for more details and suggested alternative.
>
>
> Thanks
> Ian
I think alloca-like hacks are really the wrong direction and are
asking for trouble.
You are trying to translate between platform-dependent native floats,
and IEEE floats in a specified platform-independent binary format
for Google. So use encodeFloat/decodeFloat - fast primitives in
GHC - on the native side, and a hand-written Binary instance for
the exact format you need on the Google side.
My opinion, YMMV.
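One way this could look for a 32-bit float is sketched below. It is only an illustration of the decodeFloat approach, under the assumption that Float has a 24-bit mantissa as on GHC; the name floatBits is hypothetical, and every NaN is collapsed to a single quiet NaN pattern.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word32)

-- Build the IEEE 754 single-precision bit pattern from decodeFloat,
-- assuming decodeFloat yields a 24-bit mantissa m with x == m * 2^e.
floatBits :: Float -> Word32
floatBits x
  | isNaN x      = 0x7FC00000                  -- one canonical quiet NaN
  | isInfinite x = sign .|. 0x7F800000
  | m == 0       = sign                        -- +0.0 or -0.0
  | biased <= 0  = sign .|. fromIntegral (am `shiftR` (1 - biased))  -- denormal
  | otherwise    = sign
                   .|. (fromIntegral biased `shiftL` 23)
                   .|. fromIntegral (am - 0x800000)  -- drop the hidden bit
  where
    (m, e) = decodeFloat x
    am     = abs m
    biased = e + 150          -- 127 (exponent bias) + 23 (mantissa shift)
    sign :: Word32
    sign   = if x < 0 || isNegativeZero x then 0x80000000 else 0
```

The resulting Word32 can then be written with whatever byte order the wire format needs. The reverse direction would split the fields apart again and rebuild the value with encodeFloat.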
Regards,
Yitz
Monday, September 22, 2008, 2:48:16 PM, you wrote:
> used a very unreliable trick. And the "use castToSTUArray" suggested
> alternative is a really poor one since I am not using arrays at all.
castSTUArray does the same as your code, only in the ST monad, so you can
skip the unsafePerformIO trick.

If you don't know, ST is a subset of the IO monad with a limited set of
operations guaranteed not to have side effects. So,

    cvt x = unsafePerformIO $
      alloca $ \place -> do
        poke place x
        peek (castPtr place)

and

    cvt x = runST $ do
      place <- newArray_ (0,0)
      writeArray place 0 x
      cast <- castSTUArray place
      readArray cast 0

generate almost the same code (the only difference is the kind of memory
allocated)
--
Best regards,
Bulat mailto:Bulat.Z...@gmail.com
castSTUArray is the way GHC does it - the idea is to allocate a small
array, store the Float/Double in it, cast the type of the array to
Word32 or whatever, and then read out the contents.
It's more or less equivalent to the peek/poke solution, except that it
doesn't need unsafePerformIO.
GHC's code is here (look for floatToWord):
http://darcs.haskell.org/ghc/compiler/cmm/PprC.hs
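A self-contained sketch of that approach, combined with the little-endian Put/Get from the "binary" package that the original question asked about, might look like this. The helper names (cast, putFloat32le, getFloat32le and so on) are made up for this example, and it assumes a recent "array" package where castSTUArray is exported from Data.Array.Unsafe (older versions export it from Data.Array.ST).

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.ST (ST, runST)
import Data.Array.ST (MArray, STUArray, newArray, readArray)
import Data.Array.Unsafe (castSTUArray)
import Data.Binary.Get (Get, getWord32le, runGet)
import Data.Binary.Put (Put, putWord32le, runPut)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word32)

-- Store the value in a one-element unboxed array, reinterpret the array's
-- element type, and read the same bytes back out as the other type.
cast :: (MArray (STUArray s) a (ST s), MArray (STUArray s) b (ST s))
     => a -> ST s b
cast x = newArray (0 :: Int, 0) x >>= castSTUArray >>= flip readArray 0

floatToWord32 :: Float -> Word32
floatToWord32 x = runST (cast x)

word32ToFloat :: Word32 -> Float
word32ToFloat w = runST (cast w)

-- Write and read a Float as four little-endian bytes.
putFloat32le :: Float -> Put
putFloat32le = putWord32le . floatToWord32

getFloat32le :: Get Float
getFloat32le = fmap word32ToFloat getWord32le

encodeFloat32le :: Float -> L.ByteString
encodeFloat32le = runPut . putFloat32le

decodeFloat32le :: L.ByteString -> Float
decodeFloat32le = runGet getFloat32le
```

The same cast helper should work for Double and Word64, since both have unboxed array instances of the same element size; the Put/Get pair would then use putWord64le and getWord64le.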
Cheers,
Simon