[Caml-list] Announce: the Piqi project


Anton Lavrik

Jul 10, 2010, 6:52:00 PM
to caml...@inria.fr

It is my pleasure to announce the Piqi project.

Piqi is a set of languages and tools for working with structured data.
It includes:

- High-level data representation and data definition languages.

- Binary encoding for compact and portable data representation.

- Tools for validating, pretty-printing and converting data between
different formats.

- Mappings to various programming languages allowing programs to
serialize and deserialize data in a portable manner.

- Open-source implementation licensed under the terms of Apache v2.0.

Piqi implements native mapping for OCaml. Mappings for other languages (C++,
Python, Java, Go, etc.) are supported through Google Protocol Buffers.

Here are some details.

1. Data representation language (Piq)

Piq is a text-based language that allows humans to conveniently read,
write and edit structured data.

Piq has a concise and powerful syntax:
- No syntax noise compared to XML.
- Reasonable amount of parentheses compared to S-expressions.
- Comments.

Piq supports the following data literals:
- Unicode strings, booleans, integer and floating point numbers
(including "infinity" and "NaN");
- binary strings (byte arrays);
- verbatim text.

2. Data definition language (Piqi)

Piqi is a data definition language for Piq and its encodings. It stands
for "Piq Interfaces".

Piqi supports definition of records, variants (similar to OCaml's
polymorphic variants), lists and type aliases on top of a rich set of
built-in data types.

Piqi modules can reuse definitions from other modules using OCaml-like
imports and includes.

Piqi type definitions are extensible, which allows data schema evolution
while maintaining backward and forward compatibility.
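
To make the compatibility claim concrete, here is a minimal sketch of the
general idea as it works in Protocol Buffers-style formats: readers skip
fields they do not know and treat missing optional fields as absent. The
representation of a message as a list of (field code, value) pairs, the
field codes 1 and 2, and the modules `V1` and `V2` are assumptions made
for this example, not Piqi's actual implementation.

```ocaml
(* A decoded message is modelled as a list of (field code, value) pairs.
   This representation is hypothetical and chosen only for illustration. *)

(* Version 1 of the schema: a record with a single required field. *)
module V1 = struct
  type t = { name : string }

  let encode (p : t) : (int * string) list = [ (1, p.name) ]

  (* Looks up only field 1; any other field is silently ignored. *)
  let decode (fields : (int * string) list) : t =
    { name = List.assoc 1 fields }
end

(* Version 2 adds an optional field under a new field code. *)
module V2 = struct
  type t = { name : string; email : string option }

  let encode (p : t) : (int * string) list =
    (1, p.name) :: (match p.email with Some e -> [ (2, e) ] | None -> [])

  (* Field 2 may be missing in data written by V1, so it decodes to None. *)
  let decode (fields : (int * string) list) : t =
    { name = List.assoc 1 fields; email = List.assoc_opt 2 fields }
end

(* Forward compatibility: an old reader accepts data from a new writer. *)
let old_reads_new =
  V1.decode (V2.encode { V2.name = "Ann"; email = Some "ann@example.org" })

(* Backward compatibility: a new reader accepts data from an old writer. *)
let new_reads_old = V2.decode (V1.encode { V1.name = "Bob" })
```

The key design point is that fields are identified by stable codes rather
than by position, so adding an optional field does not disturb existing
readers or writers.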

3. Piqi-OCaml mapping

Piqi implements mapping of Piqi modules to OCaml modules and Piqi type
definitions to OCaml type definitions using `piqic` -- Piq interface
compiler.

The `piqic ocaml` command takes a Piqi module and produces an `.ml` file
containing:
- OCaml module which corresponds to the source Piqi module;
- OCaml type definitions;
- functions for serializing and de-serializing OCaml values.

Piqi types are naturally mapped to OCaml types:
- bool, string, float are mapped to the corresponding OCaml types
- various integer types are mapped to OCaml's int, int32 or int64
- binaries are mapped to strings
- variants are mapped to OCaml's polymorphic variants
- records are mapped to OCaml's records where each OCaml record is
put in its own recursive sub-module. This mapping method provides a
workaround for OCaml's flat record label namespace and it is actually
very convenient when used with special CamlP4 macros.
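
As a rough illustration of the record and variant mapping described
above, the following hand-written sketch shows the general shape such
OCaml code could take. The module names, field names and the `shape_x`
helper are hypothetical; this is not actual `piqic ocaml` output.

```ocaml
(* Each record lives in its own recursive sub-module, so the labels
   x and y can be reused without clashing. The "= Point" self-reference
   is allowed because the signatures contain only type definitions. *)
module rec Point : sig
  type t = { x : int; y : int }
end = Point

and Circle : sig
  type t = { x : int; y : int; radius : float }
end = Circle

(* A variant is represented as an OCaml polymorphic variant. *)
type shape = [ `point of Point.t | `circle of Circle.t | `empty ]

(* Labels are disambiguated by qualifying them with the module name. *)
let shape_x (s : shape) : int option =
  match s with
  | `point p -> Some p.Point.x
  | `circle c -> Some c.Circle.x
  | `empty -> None

let p = { Point.x = 1; y = 2 }
let c = { Circle.x = 3; y = 4; radius = 1.5 }
```

Without the sub-modules, two record types sharing the labels `x` and `y`
in one flat namespace would shadow each other, which is the problem the
mapping works around.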

Piqi allows adding custom type mappings. For example, it is easy to add
support for OCaml's nativeint, bigint or any other OCaml type (not sure
about parametric types, though).

To get a feel for how it all looks, here are a couple of examples:

- Piqi type specification written in Piqi (i.e. self-specification):


- This is how the above specification is mapped to OCaml (NOTE: this is
a hand-written spec which is used during the bootstrapping stage, the
real one can be produced by running `piqic ocaml`):


- Piqi self-specification (and Piqi intermediate language) represented
as OCaml data structures:


4. Google Protocol Buffers compatibility and mapping

Piqi is type- and binary-compatible with Protocol Buffers:
- Piqi modules and types (defined in `.piqi` files) can be converted to
Google Protocol Buffers type specifications (`.proto` files), and the
other way around.

- Piqi uses the same binary encoding as the one used by Protocol
Buffers.
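
For readers unfamiliar with that encoding: the Protocol Buffers wire
format stores integers as base-128 "varints", seven bits per byte with
the high bit marking continuation. The sketch below is a minimal
stand-alone illustration of that scheme; the function names
`encode_varint` and `decode_varint` are hypothetical and this is not
code from the Piqi runtime library.

```ocaml
(* Encode a non-negative int as a base-128 varint, least significant
   group first. Every byte except the last has its high bit set. *)
let encode_varint (n : int) : string =
  if n < 0 then invalid_arg "encode_varint: negative input";
  let buf = Buffer.create 10 in
  let rec loop n =
    if n < 0x80 then Buffer.add_char buf (Char.chr n)
    else begin
      Buffer.add_char buf (Char.chr ((n land 0x7f) lor 0x80));
      loop (n lsr 7)
    end
  in
  loop n;
  Buffer.contents buf

(* Decode a varint starting at [pos]. Returns the value and the position
   of the first byte after it. Overflow of OCaml's int is not checked. *)
let decode_varint (s : string) (pos : int) : int * int =
  let rec loop acc shift pos =
    if pos >= String.length s then
      invalid_arg "decode_varint: truncated input";
    let b = Char.code s.[pos] in
    let acc = acc lor ((b land 0x7f) lsl shift) in
    if b land 0x80 = 0 then (acc, pos + 1)
    else loop acc (shift + 7) (pos + 1)
  in
  loop 0 0 pos
```

For example, 300 is encoded as the two bytes 0xAC 0x02, and decoding
those bytes from position 0 yields 300 with the next position being 2.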

5. Some implementation details

One interesting property of Piqi is that its language implementation
takes its own high-level specification, written in Piqi, and parses the
language into an OCaml intermediate representation without any
hand-written parsing rules.

This mechanism allows easy extension of the Piqi language. When adding new
features, there is no need to design new syntax elements, update parsing
code or transform the AST into the intermediate language. Also, new
extensions are typically transparent to the core Piqi implementation.

6. Project status

Piqi is a late-stage prototype. The implementation is fully functional
and most of its functionality is utilized inside the project because of
Piqi's bootstrapped architecture.

The OCaml mapping -- both generated code and the runtime library -- hasn't
been optimized for performance. There's plenty of room for optimization;
it just hasn't been a priority.

Piqi has been tested with OCaml 3.11 on i386 Squeeze and amd64 Lenny
Debian Linux. I haven't tried building it on Windows yet.

7. Acknowledgements

I would like to express my great appreciation to the OCaml language authors
and the OCaml community.

It would be extremely hard, if possible at all, to design and implement Piqi
as a hobby project using any other programming languages and tools I'm
aware of.

Piqi implementation relies on the following excellent software components:

- OCaml and Camlp4 -- http://caml.inria.fr

- ulex -- Unicode lexer generator, http://alain.frisch.fr/soft.html

- easy-format -- Pretty-printing library,

- OCamlMakefile -- Automated compilation of complex OCaml-projects,

- open-in -- Camlp4 syntax extension, http://alain.frisch.fr/soft.html

Piqi source code is available on github: http://github.com/alavrik/piqi

For more information, examples and documentation please visit http://piqi.org

I will be happy to answer OCaml-related questions in this topic branch. For
all other questions, suggestions and comments there is the Piqi Google
group: http://groups.google.com/group/piqi

Your feedback is highly appreciated!


Caml-list mailing list. Subscription management:
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
