some possible additions to piqi + a bugfix


Koen De Keyser

Jan 29, 2014, 10:33:57 AM
To: pi...@googlegroups.com
Hi all,

I've been experimenting quite a bit with Piqi lately and have made some additions to the OCaml code generation. I'd like to discuss whether it would be useful to integrate them into the main Piqi.

- Generation of constructors: at the moment, piqic-ocaml only generates default constructors which take no arguments. Typically, when I have defined a type, I define a "create" function which takes all of the record fields as arguments. I've added a --gen-constructors option to piqi which does this for any record type, e.g. for the person type from the address book example, it generates the following code:

let create_person name id email phone =
   {
     Person.name = name;
     Person.id = id;
     Person.email = email;
     Person.phone = phone;
     Person.piqi_unknown_pb = [];
  }

- Generation of separate modules for each record type with consistent naming: every type generated by piqi has more or less the same functionality: a generate/parse call (serialization) and a (default) constructor. However, the default function names include the type name, which makes using the generated code as the input module for a functor impractical, as one has to write an adaptor module that maps the function names chosen by piqi to those expected by the functor. I've added a --gen-module-per-record option that adds a module for every record type, each with a uniform interface.

E.g. for that same person type, it generates:

    module Person =
      struct
        type t = Person.t
        let pickle = gen_person
        let unpickle = parse_person
        let create = create_person
      end

- Generate setters / getters: adds a set and get function for every record field to the module:

    module Person =
      struct
        type t = Person.t     
        let pickle = gen_person
        let unpickle = parse_person
        let create = create_person
        let name t = t.Person.name
        let set_name t value = { (t) with Person.name = value; }
        let id t = t.Person.id
        let set_id t value = { (t) with Person.id = value; }
        let email t = t.Person.email
        let set_email t value = { (t) with Person.email = value; }
        let phone t = t.Person.phone
        let set_phone t value = { (t) with Person.phone = value; }    
      end
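
To illustrate the functor use case behind the per-record modules, here is a minimal sketch. The PICKLABLE signature, the Store functor and the string-based stand-in for Person are all hypothetical names made up for illustration; the real generated functions work on Piqi's own buffer types rather than on plain strings.

```ocaml
(* Hypothetical uniform interface that a functor might expect. *)
module type PICKLABLE = sig
  type t
  val pickle : t -> string
  val unpickle : string -> t
end

(* Stand-in for a generated per-record module; the real one would forward
   to gen_person / parse_person instead of this toy string encoding. *)
module Person = struct
  type t = { name : string; id : int }
  let pickle p = Printf.sprintf "%d:%s" p.id p.name
  let unpickle s =
    let i = String.index s ':' in
    { id = int_of_string (String.sub s 0 i);
      name = String.sub s (i + 1) (String.length s - i - 1) }
end

(* A functor that works for any module with the uniform interface,
   without a hand-written adaptor module per type. *)
module Store (M : PICKLABLE) = struct
  let table : (string, string) Hashtbl.t = Hashtbl.create 16
  let put key v = Hashtbl.replace table key (M.pickle v)
  let get key =
    match Hashtbl.find_opt table key with
    | Some s -> Some (M.unpickle s)
    | None -> None
end

module Person_store = Store (Person)
```

Because every per-record module exposes the same names, Store can be applied to any of them directly.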

I am aware that some of these changes overlap with the functionality provided by the syntax extension (Module#... style). One of the reasons I prefer not to use a syntax extension is that it doesn't seem to play well with the Merlin tool, which I use extensively when writing OCaml code.

I've implemented these changes on https://github.com/amplidata/piqi/tree/generate_constructors_and_modules . Let me know if you think you'd want to merge something back to the main piqi branch.

In addition, I ran into a small bug in piqic-ocaml when using the --pp option, which uses camlp4o to pretty-print the generated code. In some cases, it would output a binary AST instead of an .ml file. Also, it depends on /bin/sh. I've tried to fix both of these issues on https://github.com/amplidata/piqi/tree/pretty_printing_fix and I'll send a pull request for this.

thanks,

Koen

Anton Lavrik

Jan 31, 2014, 6:17:24 AM
To: pi...@googlegroups.com
On Wed, Jan 29, 2014 at 7:33 AM, Koen De Keyser <koen.d...@gmail.com> wrote: 

- Generation of constructors: at the moment, piqic-ocaml only generates default constructors which take no arguments. Typically, when I have defined a type, I define a "create" function which takes all of the record fields as arguments. I've added a --gen-constructors option to piqi which does this for any record type, e.g. for the person type from the address book example, it generates the following code:

let create_person name id email phone =
   {
     Person.name = name;
     Person.id = id;
     Person.email = email;
     Person.phone = phone;
     Person.piqi_unknown_pb = [];
  }

So far, I've been using default functions for that. For example,

let x = Person.default_person () in
{x with
    name = name;
    id = id;
    ...
}

The approach with defaults seems to be more flexible and robust. First, we can override only those fields that we care about. Second, we can refer to them by name instead of relying on the order of fields in the record definition and the total number of fields. Both the order and the number of fields can change over time, which will break the code in your case. Besides, it is easy to get confused when a large record has a lot of positional arguments.

I've seen some people using labeled arguments for that. For example, the above example could be expressed as

let x = Person.make_person ~name ~id ...

This is a bit more concise compared to using default functions. We could do something like that.
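
A minimal sketch of what such a labeled constructor could look like. The record below is a simplified stand-in for the generated Person type (the real field types differ), and this make_person, defined at top level here, is hypothetical rather than something piqic-ocaml generates today.

```ocaml
(* Simplified stand-in for the generated record type. *)
module Person = struct
  type t = {
    name : string;
    id : int;
    email : string option;
    phone : string list;
  }
end

(* Hypothetical labeled constructor: required fields become labeled
   arguments, optional and repeated fields become optional arguments.
   The trailing unit lets callers omit the optional arguments. *)
let make_person ~name ~id ?email ?(phone = []) () =
  { Person.name; id; email; phone }

(* Labeled arguments can be passed in any order. *)
let ann = make_person ~id:1 ~name:"Ann" ()
let bob = make_person ~name:"Bob" ~id:2 ~email:"bob@example.com" ()
```

Compared with the default-and-override style, the call site is a single expression, and leaving out a required field is a type error at compile time.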

 
- Generation of separate modules for each record type with consistent naming: every type generated by piqi has more or less the same functionality: a generate/parse call (serialization) and a (default) constructor. However, the default function names include the type name, which makes using the generated code as the input module for a functor impractical, as one has to write an adaptor module that maps the function names chosen by piqi to those expected by the functor. I've added a --gen-module-per-record option that adds a module for every record type, each with a uniform interface.

E.g. for that same person type, it generates:

    module Person =
      struct
        type t = Person.t
        let pickle = gen_person
        let unpickle = parse_person
        let create = create_person
      end

I can see how this can be useful for functors. I'm wondering if it would be reasonable to also generate modules for other non-record user-defined types such as variants?

Also, if we decide to do it, I think we should keep naming consistent and use "gen" and "parse" instead of "pickle" and "unpickle".


- Generate setters / getters: adds a set and get function for every record field to the module:

    module Person =
      struct
        type t = Person.t     
        let pickle = gen_person
        let unpickle = parse_person
        let create = create_person
        let name t = t.Person.name
        let set_name t value = { (t) with Person.name = value; }
        let id t = t.Person.id
        let set_id t value = { (t) with Person.id = value; }
        let email t = t.Person.email
        let set_email t value = { (t) with Person.email = value; }
        let phone t = t.Person.phone
        let set_phone t value = { (t) with Person.phone = value; }    
      end

I am aware that some of these changes overlap with the functionality provided by the syntax extension (Module#... style). One of the reasons I prefer not to use a syntax extension is that it doesn't seem to play well with the Merlin tool, which I use extensively when writing OCaml code.

I have just realized that I should have removed the syntax extensions a long time ago. I kept them around for backward compatibility with OCaml versions older than 3.12. I don't think this is relevant any more. Since 3.12, "let open Module in" is part of the language and we can use the standard "Module.({...})" syntax instead of the "Module#{}" syntax implemented by the extension.
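
For reference, a small sketch of the standard local-open forms mentioned here. The Person module is a simplified stand-in for a generated one.

```ocaml
(* Simplified stand-in for a generated record module. *)
module Person = struct
  type t = { name : string; id : int }
end

(* Local open applied to a record expression: fields need no qualification. *)
let ann = Person.({ name = "Ann"; id = 1 })

(* The equivalent "let open ... in" form. *)
let bob =
  let open Person in
  { name = "Bob"; id = 2 }

(* Local open also covers functional update and field access. *)
let renamed = Person.({ ann with name = "Anna" })
let ann_id = Person.(ann.id)
```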

So, would you still prefer to use setters and getters instead of the standard field access syntax?

By the way, the use of modules may become even less relevant with the release of OCaml 4.01, as it (finally!) introduces a standard mechanism for record field disambiguation.
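
A short sketch of how type-directed disambiguation (available since OCaml 4.01) lets two records share a field name within one scope. The types are made up for illustration.

```ocaml
(* Two record types in the same scope that share the field "name". *)
type person = { name : string; id : int }
type company = { name : string; employees : person list }

(* The type annotation tells the type checker which record is meant. *)
let ann : person = { name = "Ann"; id = 1 }
let acme : company = { name = "Acme"; employees = [ ann ] }

(* Field access is resolved from the known type of the argument. *)
let person_name (p : person) = p.name
let company_name (c : company) = c.name
```

Before 4.01, the later definition of name would simply shadow the earlier one, which is why generated code places each record in its own module.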


In addition, I ran into a small bug in piqic-ocaml when using the --pp option, which uses camlp4o to pretty-print the generated code. In some cases, it would output a binary AST instead of an .ml file. Also, it depends on /bin/sh. I've tried to fix both of these issues on https://github.com/amplidata/piqi/tree/pretty_printing_fix and I'll send a pull request for this.

Thank you for catching this!


Anton 

Koen De Keyser

Feb 5, 2014, 11:27:33 AM
To: pi...@googlegroups.com
Hi,

Sorry for the delay in getting back to this:



- Generation of constructors: at the moment, piqic-ocaml only generates default constructors which take no arguments. Typically, when I have defined a type, I define a "create" function which takes all of the record fields as arguments. I've added a --gen-constructors option to piqi which does this for any record type, e.g. for the person type from the address book example, it generates the following code:

let create_person name id email phone =
   {
     Person.name = name;
     Person.id = id;
     Person.email = email;
     Person.phone = phone;
     Person.piqi_unknown_pb = [];
  }

So far, I've been using default functions for that. For example,

let x = Person.default_person () in
{x with
    name = name;
    id = id;
    ...
}

The approach with defaults seems to be more flexible and robust. First, we can override only those fields that we care about. Second, we can refer to them by name instead of relying on the order of fields in the record definition and the total number of fields. Both the order and the number of fields can change over time, which will break the code in your case. Besides, it is easy to get confused when a large record has a lot of positional arguments.

I've seen some people using labeled arguments for that. For example, the above example could be expressed as

let x = Person.make_person ~name ~id ...

This is a bit more concise compared to using default functions. We could do something like that.

One of the main reasons I prefer explicit constructors (i.e. the create or make function) and getter/setter functions is that they let me hide the inner details of the records. In my use case, there are additional relations between record fields, or constraints on them, that cannot be expressed in the proto/piqi description and that would not be enforced if other layers of the application had direct access to the record. In practice, I end up needing a layer on top of the generated code that enforces these constraints and relations. About 90% of this interface might be plain pass-through to the generated code, so it helps me not to have to write manual forwarding code for that 90%. One can accomplish this by including the generated module-per-type and then narrowing down the signature, so that only a selected subset of the generated functionality is exposed. A small number of manually written getters/setters/constructors then implement the extra relations and constraints, overriding the auto-generated ones, alongside a couple of additional functions.
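
The include-and-narrow pattern described above might look roughly like this. Generated_person is a hand-written stand-in for a generated module-per-type, and the constraint (positive id, non-empty name) is a made-up example, not something from my actual code.

```ocaml
(* Stand-in for a generated per-record module with getters and setters. *)
module Generated_person = struct
  type t = { name : string; id : int; email : string option }
  let create name id email = { name; id; email }
  let name t = t.name
  let set_name t value = { t with name = value }
  let id t = t.id
  let set_id t value = { t with id = value }
  let email t = t.email
  let set_email t value = { t with email = value }
end

(* Hand-written layer: the signature hides the record and set_id, and the
   constructor and set_name return options so constraints can be checked. *)
module Person : sig
  type t
  val create : string -> int -> string option -> t option
  val name : t -> string
  val id : t -> int
  val email : t -> string option
  val set_name : t -> string -> t option
  val set_email : t -> string option -> t
end = struct
  include Generated_person

  (* Hypothetical constraint: ids are positive and names are non-empty. *)
  let valid name id = id > 0 && name <> ""

  (* These definitions shadow the included, unchecked versions. *)
  let create name id email =
    if valid name id then Some (Generated_person.create name id email)
    else None

  let set_name t value =
    if valid value t.id then Some (Generated_person.set_name t value)
    else None
end
```

Only create and set_name are written by hand here; name, id, email and set_email are passed through from the generated module unchanged.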

I can imagine that in some other use cases the fields of the record are totally independent though, so I can see the use of the "record field name" based options as well. The explicit constructor/getter/setter functions are probably most useful within the module-per-type approach, so maybe the create/make function should simply be defined within that type-specific module and not in the main module?

I am also not a frequent user of named/optional arguments in OCaml, but that is more of a personal preference. I haven't run into a lot of cases of getting the arguments for these constructors wrong, but going the named-argument route might indeed be a good option here.

- Generation of separate modules for each record type with consistent naming: every type generated by piqi has more or less the same functionality: a generate/parse call (serialization) and a (default) constructor. However, the default function names include the type name, which makes using the generated code as the input module for a functor impractical, as one has to write an adaptor module that maps the function names chosen by piqi to those expected by the functor. I've added a --gen-module-per-record option that adds a module for every record type, each with a uniform interface.

E.g. for that same person type, it generates:

    module Person =
      struct
        type t = Person.t
        let pickle = gen_person
        let unpickle = parse_person
        let create = create_person
      end

I can see how this can be useful for functors. I'm wondering if it would be reasonable to also generate modules for other non-record user-defined types such as variants?

Also, if we decide to do it, I think we should keep naming consistent and use "gen" and "parse" instead of "pickle" and "unpickle".

 
You are absolutely right that the names should be as close as possible to the existing choices like gen/parse. I used pickle/unpickle because that was the interface expected by some code that I wanted to test it with. Also, if we add a module-per-type, it would indeed be best to do this for all types for which it makes sense, such as variants.

best regards,

Koen


Anton Lavrik

Feb 6, 2014, 4:49:01 AM
To: pi...@googlegroups.com
On Wed, Feb 5, 2014 at 8:27 AM, Koen De Keyser <koen.d...@gmail.com> wrote:

I've seen some people using labeled arguments for that. For example, the above example could be expressed as

let x = Person.make_person ~name ~id ...

This is a bit more concise compared to using default functions. We could do something like that.

One of the main reasons I prefer explicit constructors (i.e. the create or make function) and getter/setter functions is that they let me hide the inner details of the records. In my use case, there are additional relations between record fields, or constraints on them, that cannot be expressed in the proto/piqi description and that would not be enforced if other layers of the application had direct access to the record. In practice, I end up needing a layer on top of the generated code that enforces these constraints and relations. About 90% of this interface might be plain pass-through to the generated code, so it helps me not to have to write manual forwarding code for that 90%. One can accomplish this by including the generated module-per-type and then narrowing down the signature, so that only a selected subset of the generated functionality is exposed. A small number of manually written getters/setters/constructors then implement the extra relations and constraints, overriding the auto-generated ones, alongside a couple of additional functions.

This is helpful. It looks like setters/getters can be useful. One thing: it is good practice to always prefix user-defined names. In this case, I suggest we use get_ for getters and set_ for setters.

I can imagine that in some other use cases the fields of the record are totally independent though, so I can see the use of the "record field name" based options as well. The explicit constructor/getter/setter functions are probably most useful within the module-per-type approach, so maybe the create/make function should simply be defined within that type-specific module and not in the main module?

Sure. I think I like "make" better. It is shorter.
 
I am also not a frequent user of named/optional arguments in OCaml, but that is more of a personal preference. I haven't run into a lot of cases of getting the arguments for these constructors wrong, but going the named-argument route might indeed be a good option here.

Well, on the one hand, there's no point having something in Piqi if it doesn't support schema evolution. This is a key idea behind the project. Another one is sharing application-level protocol definitions. Overall, positional arguments are not a good fit here, as they are not descriptive enough (especially with many fields, which is not uncommon) and are prone to change (reordering, adding new fields, deprecating old fields).
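
To make the schema-evolution point concrete, here is a small sketch with two hypothetical versions of the same generated code. The modules V1 and V2 and their make_person functions are invented for illustration only.

```ocaml
(* Version 1 of a hypothetical generated module. *)
module V1 = struct
  type person = { name : string; id : int }
  let make_person ~name ~id () = { name; id }
end

(* Version 2 adds an optional field and lists the fields in another order. *)
module V2 = struct
  type person = { id : int; email : string option; name : string }
  let make_person ?email ~name ~id () = { id; email; name }
end

(* The same call-site text type-checks against both versions, because
   arguments are matched by label and the new field is optional. *)
let p1 = V1.make_person ~name:"Ann" ~id:1 ()
let p2 = V2.make_person ~name:"Ann" ~id:1 ()
```

With a positional constructor, the field reordering in V2 would change the expected argument order, and the added field would change the arity, so existing call sites would need to be revisited.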

Personally, it feels weird if record value construction and destructuring are treated differently with regard to the question of whether to use labels or not. If one wants to use labels for field access and pattern matching why not use labels for construction as well? Perhaps the reason we don't use labels in programming APIs more extensively is because people often prefer brevity over versatility and descriptiveness. Besides, naming things alone is hard enough. Coming up with short descriptive names can be a pain.

On the other hand, for the Piq format, it is possible to specify which input fields can be positional and which must always be labeled (this is controlled by the .piq-positional property). It works remarkably well for the DSLs I'm using Piq for. Unfortunately, in OCaml, one is forced to decide up front which function arguments are positional and which ones are labeled. This means the approach can't be applied directly, but I can see how something like that could be useful if the goal is brevity.

Anton