Is a decoder the right level of abstraction for parsing JSON?

Kasey Speakman

Sep 28, 2016, 7:34:40 PM
to Elm Discuss
Today I found my team struggling over decoding some fixed-structure JSON. Search this list for "decode" to see further examples.

Even though I understand decoders well enough to use them, I still don't want to use them. For one thing, I don't deal with fuzzy JSON. I have type aliases that match the JSON structures. Decoder code is entirely redundant with the information in the type alias. Secondly, writing a decoder is demonstrably beginner unfriendly. (Read the search results or watch a beginner struggle with it to see for yourself.) So writing a decoder is painful both conceptually and because it's redundant work.
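
To make the redundancy concrete, here is a minimal sketch of a hand-written decoder sitting next to its type alias. It assumes the Elm 0.19 `elm/json` API (`Decode.field`, `Decode.map3`), which postdates the 0.17 syntax used elsewhere in this thread, and the `User` record is a hypothetical example rather than anyone's real model.

```elm
module UserDecoder exposing (User, userDecoder)

import Json.Decode as Decode exposing (Decoder)


-- Hypothetical record; the field names and types are stated once here...
type alias User =
    { name : String
    , email : String
    , id : Int
    }


-- ...and then stated a second time in the decoder.
userDecoder : Decoder User
userDecoder =
    Decode.map3 User
        (Decode.field "name" Decode.string)
        (Decode.field "email" Decode.string)
        (Decode.field "id" Decode.int)
```

Every field appears twice, and the order of the `Decode.field` lines has to match the order of the fields in the alias, which is the maintenance burden described above.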

In the shorter term, my prevailing alternative to decoding in 0.17 is just to use ports to launder the JSON into clean Elm types, since ports are automatically decoded. But notably Date will not pass through ports. Not a frequent need for me but if I did need a Date type, another type alias with date as string or int would be required. Then that would have to be converted to the real record. It's even more redundant work (albeit simpler), so I'll avoid if possible.

In the longer term, the solution that jumps out at me is the compiler could generate decoders for marked type aliases automatically. (I assume ports work this way already.) And decoding is still there for fuzzy scenarios.

Anyway, I saw in one of the Elm-conf videos Evan say that if I have issues, I should say something. So there it is. The other main issue we have looks to get fixed next version (changes to files not being recompiled).

Duane Johnson

Sep 28, 2016, 8:02:29 PM
to elm-d...@googlegroups.com

On Wed, Sep 28, 2016 at 5:34 PM, Kasey Speakman <kjspe...@gmail.com> wrote:
In the longer term, the solution that jumps out at me is the compiler could generate decoders for marked type aliases automatically. (I assume ports work this way already.) And decoding is still there for fuzzy scenarios.

That sounds like a neat idea. I'm trying to imagine how it would be implemented. Would you pass a type in to a decode function? e.g.

```elm
type alias User =
    { name : String
    , email : String
    , birthday : Date
    , id : Int
    }

user1Json = """[
    {"name": "Duane", "email": "duane....@gmail.com", "birthday": "19800101T000000Z", "id": 42},
    {"name": "Kasey Speakman", "email": "ka...@idontknow.com", "birthday": "19810202T000000Z", "id": 43}
]"""

-- the following is currently invalid Elm
user1 = decode(List User, user1Json)
```


James Wilson

Sep 29, 2016, 5:48:31 AM
to Elm Discuss
users : Maybe (List User)
users =
    decode userJson

one could just make use of type inference to know what to decode into. Might mean another special pseudo-typeclass (like comparable, appendable) called decodable. Then decode (might need renaming) would have a signature like:

decode : Value -> Maybe decodable


I have thought about this as well, also contemplating the port option and also finding the amount of boilerplate you have to write to decode into records a little annoying. I would be tempted to also provide something like

decodeWithOptions : DecoderOptions -> Value -> decodable

where options could include a function to transform keys into record keys.

Rupert Smith

Sep 29, 2016, 9:40:38 AM
to Elm Discuss
On Thursday, September 29, 2016 at 12:34:40 AM UTC+1, Kasey Speakman wrote:
In the longer term, the solution that jumps out at me is the compiler could generate decoders for marked type aliases automatically. (I assume ports work this way already.) And decoding is still there for fuzzy scenarios.

+1 to the idea. 

I have already resorted to code genning data type, encoder and decoder for REST interfaces, for which I hold a model describing the data that the interface talks (think Swagger definition). 

Some scenarios that would need to be covered:

Optional fields:

Some use Maybe, some set a default value like "". Maybe is a purer approach but I can appreciate the sheer convenience of "" too.

Recursion:

I have data structures where a record can hold another of the same type (think parent/child relationship in a tree), or where one record can hold an instance of another that can hold an instance of the original giving rise to mutual recursion. To solve that I just always use a singleton union wrapper type:

type User = 
    User 
    { name : String
    , email : String
    , birthday : Date
    , id : Int
    }

It's a PITA, but one that using codegen for the encoder and decoder goes some way towards making less of a hassle.

Rupert Smith

Sep 29, 2016, 11:25:25 AM
to Elm Discuss
On Thursday, September 29, 2016 at 2:40:38 PM UTC+1, Rupert Smith wrote:
Some scenarios that would need to be covered:

Optional fields:

In one of my JSON payloads, I have an optional list of 'roles'. This might be the empty list, it might be an explicit 'null', or it might be missing altogether. To decode this I had to do:
 
        |: (("roles" := maybeNull (Decode.list roleDecoder))
                |> withDefault Nothing
           )

phew. I suppose I could write my own helper function here called 'maybeNullOrMissing'.
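
A helper like that could be sketched as follows. This assumes the Elm 0.19 `elm/json` API rather than the 0.17 operators shown above; the name `maybeNullOrMissing` comes from the post, but this particular implementation is only one possible reading of it.

```elm
module OptionalField exposing (maybeNullOrMissing)

import Json.Decode as Decode exposing (Decoder)


-- Decode a field that may be missing, explicitly null, or present.
-- Missing and null both become Nothing. A field that is present but
-- has the wrong shape still fails, so real errors are not swallowed.
maybeNullOrMissing : String -> Decoder a -> Decoder (Maybe a)
maybeNullOrMissing name decoder =
    Decode.maybe (Decode.field name Decode.value)
        |> Decode.andThen
            (\found ->
                case found of
                    Nothing ->
                        -- the field is not there at all
                        Decode.succeed Nothing

                    Just _ ->
                        -- the field exists: accept null or a valid value
                        Decode.field name (Decode.nullable decoder)
            )
```

With this in place the roles field would read something like `maybeNullOrMissing "roles" (Decode.list roleDecoder)`, where `roleDecoder` is the decoder from the snippet above.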

No quotes:

Strings don't seem to decode without quotes. I had an integer value that I wanted to treat as a string: it's an id, so I consider it opaque, and treating it as a string gives maximum flexibility. Since there were no quotes I had to do this:

        |: ("id" := Decode.int |> Decode.map toString)
 
Which would break if the server ever did return a non-integer string in quotes.
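
One way to guard against that, sketched here with the Elm 0.19 `elm/json` API (an assumption, since the thread uses 0.17 syntax), is to accept either representation. The name `idDecoder` is hypothetical.

```elm
module IdDecoder exposing (idDecoder)

import Json.Decode as Decode exposing (Decoder)


-- Treat the id as an opaque String whether the server sends
-- a quoted string ("42") or a bare integer (42).
idDecoder : Decoder String
idDecoder =
    Decode.oneOf
        [ Decode.string
        , Decode.map String.fromInt Decode.int
        ]
```

It would be used as `Decode.field "id" idDecoder`. `Decode.oneOf` tries the decoders in order, so a quoted id is taken as-is and an unquoted integer is converted.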

Rupert Smith

Sep 29, 2016, 11:35:06 AM
to Elm Discuss
On Thursday, September 29, 2016 at 4:25:25 PM UTC+1, Rupert Smith wrote:
Some scenarios that would need to be covered:

I also have a little problem with the 'roles' encode to solve. At the moment I have:

        , ( "roles"
          , case model.roles of
                Just roles ->
                    roles |> List.map roleEncoder |> Encode.list

                Nothing ->
                    Encode.null
          ) 

which will mean 'null' is output when the roles are Nothing. Which is not what I really want, as an explicit null will be interpreted by the server as meaning 'set the roles to null', whereas what I really want is no roles field set at all which the server will interpret as 'don't change the roles, they are not being defined by this request'.

But Encode.object takes a List (String, Value) and returns a Value, and it's easiest for me to write this as an entry in a list rather than build up the list conditionally. I suppose I could output a list of Maybe (String, Value) and then filter out the Nothings to arrive at a list containing only the fields that are present.
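
The filtering idea can be sketched like this, again assuming the Elm 0.19 `elm/json` API (where `Encode.list` takes the element encoder as its first argument). The `User` record and `encodeUser` are hypothetical, and roles are plain strings here to keep the example self-contained.

```elm
module UserEncoder exposing (User, encodeUser)

import Json.Encode as Encode exposing (Value)


type alias User =
    { name : String
    , roles : Maybe (List String)
    }


-- Build the field list as Maybe entries, then drop the Nothings,
-- so an absent value produces no key at all instead of "roles": null.
encodeUser : User -> Value
encodeUser user =
    Encode.object
        (List.filterMap identity
            [ Just ( "name", Encode.string user.name )
            , Maybe.map
                (\roles -> ( "roles", Encode.list Encode.string roles ))
                user.roles
            ]
        )
```

Each entry stays a single item in the list, which keeps the shape of the original encoder while leaving the key out when the value is Nothing.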

I suppose what I am trying to point out is that auto-generating encoders and decoders requires dealing with a lot of different conventions, depending on how you want it done. If I were using something like Jackson in Java, the behaviour would be controlled by a set of flags when building the parsers, like "allowNoQuotes", "allowMissingFields", and so on. I think any attempt at auto-generating would need to look at a good number of common variations in how JSON is treated and provide options to deal with them, in order to be useful rather than overly restrictive.

Rupert Smith

Sep 29, 2016, 11:36:07 AM
to Elm Discuss
On Thursday, September 29, 2016 at 4:35:06 PM UTC+1, Rupert Smith wrote:
I also have a little problem with the 'roles' encode to solve. At the moment I have:

        , ( "roles"
          , case model.roles of
                Just roles ->
                    roles |> List.map roleEncoder |> Encode.list

                Nothing ->
                    Encode.null
          ) 

which will mean 'null' is output when the roles are Nothing.

Another option would be if Encode provided 'missing : Value', which means skip that field, but it doesn't. 

Eduardo Cuducos

Sep 29, 2016, 12:54:17 PM
to Elm Discuss
This idea makes a lot of sense to me. We, as developers, could go “automagically” from JSON to Model if they match — and the Json.Decode will still be there if one needs to parse a JSON differently.

--
You received this message because you are subscribed to the Google Groups "Elm Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elm-discuss...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Riley Eynon-Lynch

Sep 29, 2016, 4:46:15 PM
to Elm Discuss
We have 7k lines of Elm in production, and we stopped using it because of this pain in the boundary between Elm & JS. The examples from NRI show lots of local, elm-internal logic without much API chatter or JS interop. Our app has tons of API chatter (realtime collaboration is a feature), and we have 40k lines of JS in our main SPA that any newly-elmed component would have to talk to.

We found that of our 7k loc, over 1k was in decoders & encoders!

Martin DeMello

Sep 29, 2016, 5:30:40 PM
to elm-d...@googlegroups.com
I've been thinking about this recently because I have some OCaml code I want to compile to JavaScript and call from Elm, and doing it via JSON seems unnecessarily complicated (the Elm code already uses the same algebraic datatypes the OCaml code does). It's definitely easier to insist that the encoding is strictly conforming (generating is easier than consuming, after all), so I would be very interested in any work along those lines.

martin


Francesco Orsenigo

Sep 29, 2016, 6:55:25 PM
to Elm Discuss

This should make things easier: https://github.com/eeue56/json-to-elm

Once the AST can be exported and manipulated (a feature that is decently high in priority) then decoders/encoders/generators can be generated automatically.
This should happen as soon as elm-format gets AST export https://github.com/avh4/elm-format/issues/236

Maxwell Gurewitz

Sep 29, 2016, 7:34:35 PM
to Elm Discuss
It seems to me if we as users had a way to iterate over Records, or convert them to dictionaries, then we could create a function for converting them into JSON strings.

Rupert Smith

Sep 30, 2016, 4:29:56 AM
to Elm Discuss
On Thursday, September 29, 2016 at 11:55:25 PM UTC+1, Francesco Orsenigo wrote:

This should make things easier: https://github.com/eeue56/json-to-elm

That definitely helped to get me started. It does not cover all the use cases I described previously though - and I think those use cases are fairly common.

If there were a built-in conversion, I would like to see it able to handle any JSON that can be described by json-schema, with some extra flags to allow leniency around quotes, missing fields, extra fields and so on. Otherwise the built-in conversion would be too restrictive to be useful.

Extra fields are another use case I did not mention before: I guess you just put a Dict in your record and any fields not explicitly named as record fields get put in there.
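
As a rough sketch of that idea, assuming the Elm 0.19 `elm/json` API, the whole object can be decoded a second time as a Dict and the named keys removed. The `Account` record and its field names are hypothetical.

```elm
module ExtraFields exposing (Account, accountDecoder)

import Dict exposing (Dict)
import Json.Decode as Decode exposing (Decoder, Value)


type alias Account =
    { name : String
    , extra : Dict String Value
    }


-- Decode the named field normally, then decode the same object as a
-- Dict of raw values and remove the keys already captured by name.
accountDecoder : Decoder Account
accountDecoder =
    Decode.map2 Account
        (Decode.field "name" Decode.string)
        (Decode.dict Decode.value
            |> Decode.map (Dict.remove "name")
        )
```

The extra values stay as undecoded `Value`s here; turning them into typed data would still need a decoder per key.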

Also, a standard conversion between JSON and nested Dicts would be useful; it's essentially working with the JSON as untyped data. (It would need to handle unquoted numbers as strings.)

A tool that would really help me right now would be a json-schema-to-elm converter.

Rupert Smith

Sep 30, 2016, 5:06:00 AM
to Elm Discuss
On Thursday, September 29, 2016 at 9:46:15 PM UTC+1, Riley Eynon-Lynch wrote:
We have 7k lines of Elm in production, and we stopped using it because of this pain in the boundary between Elm & JS. The examples from NRI show lots of local, elm-internal logic without much API chatter or JS interop. Our app has tons of API chatter (realtime collaboration is a feature), and we have 40k lines of JS in our main SPA that any newly-elmed component would have to talk to.

We found that of our 7k loc, over 1k was in decoders & encoders!

I am finding this too. I have spent the last 2 weeks mostly dealing with the REST API and JSON conversion, though this is my first attempt at doing this, so I am also dealing with the learning curve.

Some further thoughts...

I have seen a few posts on Elm around protocol buffers. If you are talking to an API but using Elm rather than JS, then the advantage of talking JSON as the native tongue of the client seems lost, so why bother with JSON at all? After all, it's a verbose and stringy data interchange format.

So if another pseudo type class called 'decodable' were introduced, it would be helpful if it were able to work with different types of encoders/decoders, not just for JSON. It would be fantastic if one could get easy conversion to and from JSON, protocol buffers, XML, AMQP message format, and so on. Perhaps there might even be a special native module type that allows new ones to be written.

Andrew Radford

Oct 1, 2016, 10:36:37 AM
to Elm Discuss
Not sure if it has been mentioned but there is this effort to create decoders from swagger definitions:


Still a work in progress, but looks promising.

Kasey Speakman

Oct 3, 2016, 11:52:46 AM
to Elm Discuss
@DuaneJohnson. I believe the compiler currently looks for ports and makes decoders for them. Then when you call a particular port or sub, the decoder is called under the covers. Something similar could happen for a Json.Decode.auto method, which could infer the type based on the tagger function it is invoked with. But it would probably have to be raised to the level of a language keyword like with `port` so its usage could be controlled enough that the compiler could find it.

@JamesWilson. I like what you're saying there. One slight tweak I'd make is to use a Result instead of Maybe. I'd like to know which field failed to decode in an error message.
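
For comparison, hand-written decoders already report failures through a Result. A minimal sketch, assuming the Elm 0.19 `elm/json` API (`Decode.decodeString` and `Decode.errorToString`); the `describe` function is hypothetical.

```elm
module DecodeResult exposing (describe)

import Json.Decode as Decode


-- Run a decoder over a JSON string and turn either outcome into text.
-- On failure the error value records where in the JSON decoding stopped.
describe : String -> String
describe json =
    case Decode.decodeString (Decode.field "id" Decode.int) json of
        Ok id ->
            "decoded id " ++ String.fromInt id

        Err err ->
            Decode.errorToString err
```

An automatic `decode` that returned a Result in the same way could carry the failing field in its error, which a Maybe cannot.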

@RupertSmith. I can think of one other case: decoding the same JSON to two different models. However, I think if you have a lot of custom rules, decoders would have to do. I don't have recursive fields, and I decode optional fields to Maybe. If I have multiple related Maybes, then I usually change my data model to link them together. Some reasonable defaults would fit 95% of my use cases, and I could use decoders for the one-offs.

@RileyEynon-Lynch. That's kind of what I was afraid of in the long term... decoders just being ongoing busy work. Coming from other languages, even back-end languages, decoding JSON just isn't a thing. Even on the back-end, I will set up a reflection-based deserializer with general error handling (e.g. Newtonsoft for .NET) and I'm done. The ongoing work is just maintaining the data types, which I must do anyway. The back-end deserializer does take some initial testing to work out the kinks, but that work has an end, after which there is no ongoing maintenance.

PSA: There is also a code-gen tool to create decoders from example JSON.

http://noredink.github.io/json-to-elm/

(A little late to the party because I forgot to subscribe to updates.)

Rupert Smith

Oct 4, 2016, 4:47:21 AM
to Elm Discuss
On Monday, October 3, 2016 at 4:52:46 PM UTC+1, Kasey Speakman wrote:
Coming from other languages, even back-end languages, decoding JSON just isn't a thing. Even on the back-end, I will set up a reflection-based deserializer with general error handling (e.g. Newtonsoft for .NET) and I'm done.

That is basically what I am trying to say: make it not a thing for Elm too, but also equal in its capabilities to the reflection-based JSON handling of .NET, Java, whatever...

Vojtěch Král

Oct 4, 2016, 7:40:08 AM
to Elm Discuss


On Thursday, September 29, 2016 at 1:34:40 AM UTC+2, Kasey Speakman wrote:
In the shorter term, my prevailing alternative to decoding in 0.17 is just to use ports to launder the JSON into clean Elm types, since ports are automatically decoded.

I have the same problem. Would someone be so kind as to give an example of how to launder the JSON via ports like this? Do you just run the data through a port, or do you do ajax in JS?
Thanks!

Kasey Speakman

Oct 4, 2016, 10:36:37 AM
to Elm Discuss
I haven't done this yet as we're developing the UI first as things take shape and faking the API calls with Process.sleep + Task.perform. So below is just a sketch.

Conceptually, you could use `Http.getString` (or construct a request and use `Http.send`) in Elm to get the JSON, then send that through a port to JavaScript. You'd need at least one outgoing port, and an incoming port for every result type.

port deserializeJson : String -> String -> Cmd msg

port studentsLoaded : (List Student -> msg) -> Sub msg
port coursesLoaded : (List Course -> msg) -> Sub msg

subscriptions : Model -> Sub Msg
subscriptions model =
    Sub.batch
        [ studentLoaded StudentsLoaded
        , courseLoaded CoursesLoaded
        ]


On the outgoing port, you could pass both the JSON and the incoming port name as a string. Then in the JS, you could wire it once like this:

    app.ports.deserializeJson.subscribe(function(port, json) {
        // TODO can throw, so catch error and send to an error port
        var object = JSON.parse(json);
        app.ports[port].send(object);
    });

Once you get the HTTP response string from a particular request, you send it through the outgoing port with the correct return path.

deserializeJson "studentLoaded" json

It looks like after initial setup, each deserialization would take ~3 lines of wiring code. So it might not be worth it for small objects. But again this is just a sketch that I haven't tried to run.

Kasey Speakman

Oct 4, 2016, 10:39:05 AM
to Elm Discuss
Sorry, subscriptions should be:

subscriptions : Model -> Sub Msg
subscriptions model =
    Sub.batch
        [ studentsLoaded StudentsLoaded
        , coursesLoaded CoursesLoaded
        ]

Also, this does not work for types with Date or union members (limitations of ports).


Kasey Speakman

Oct 4, 2016, 10:40:00 AM
to Elm Discuss
Ehh, another missed plural.

deserializeJson "studentsLoaded" json



Vojtěch Král

Oct 4, 2016, 11:37:41 AM
to Elm Discuss
Awesome! I'll try it out...
Thanks!
Vojtech


Kasey Speakman

Nov 3, 2016, 12:43:01 PM
to Elm Discuss
I first started using Elm's recommended codec-ing scheme. I found that if I needed to round-trip an entity to an API, I needed 4 representations of the same entity. One on the client and one on server is given. But then 1 encoder representation and 1 decoder representation... both of which spell out the same information in the type alias. So in order to change the entity, I have 4 places that need maintenance. No thanks.

Abusing ports, I arrived at a similar solution to what I explained previously in this thread for decoding. But since encoding is also monkey work, I wanted to abuse ports to launder that too. However, the process to do that was herky-jerky. A port/sub round trip for encoding, then an Elm HTTP call, then a port/sub round trip for decoding. Instead, I abandoned the Elm HTTP library altogether. I added in a fetch polyfill (whatwg-fetch) and run the HTTP call in JS.


Notes

The JS fetch code could be generalized/improved even more, but this is all I need for calling my own APIs. Once tweaked and running for its intended use, it will rarely need to be touched.

Outgoing ports start with 'api' so that the startup code knows to wire them up. If Elm would let me, I would define only one outgoing port as `port apiRequest : Request a -> Cmd msg`, but that doesn't compile.

Overhead per API method is around 5 lines of code. One outgoing port, 2 incoming ports (1 success, 1 fail), and 2 for adding incoming ports to subscriptions list.

Most importantly, if my messages change, I only have 2 places to update them: client model and server model. No encoder/decoder maintenance.

Using strings to represent incoming ports is not ideal, but they are private and shouldn't need much maintenance.

Just starting to explore this, so maybe there is some blocking situation I haven't discovered yet.

OvermindDL1

Nov 3, 2016, 1:16:25 PM
to Elm Discuss
I did precisely the same in https://github.com/OvermindDL1/elm-jsphoenix too, to minimize the duplication of work. That work also increased when Elm removed the ability to extend record types into a new type, and it still lacks the ability to move Dicts across ports, so those two things still add some overhead. Even so, (ab)using ports saved a ton of work for the end user.

Kasey Speakman

Feb 18, 2017, 3:22:32 AM
to Elm Discuss
An update. I'm using the ports approach to deal with JSON (and also sending the HTTP request). For several months, I've had a small but critical app in production using that. Another project in development is too. In the process, I have run across two additional caveats with this approach.
  1. Ports don't convert undefined properties to Maybe.Nothing. It's an open issue from Jan 2016.
    For a Maybe property, the JSON must have the property present and set to null. Otherwise error. This is particularly annoying when you store the data as JSON and pass it back to the client as-is. To work around this issue, I either have to waste space storing nulls in the database or waste (CPU/response) time server-side to inject nulls in the response.

  2. Cmd.map can't be used with this method.
    Using Http module, you can use Cmd.map to take some data from the request and give it to the response Msg. Using ports, you can't do that. I've noticed this when the data is easy to provide for a request, but after the response comes back it is less convenient to dig out of the model (e.g. behind a case statement).

Neither of these is a blocker for me, just a nuisance. It still beats maintaining codecs.

I've seen rumblings about tools for code-gen'ing JSON codecs for us (maybe elm-format? There also exists elm-swagger, but I don't use swagger.). I dunno though. Where possible, I tend to avoid code-gen because it's for handling a really tedious problem. And if the code-gen fails, then I have to handle a really tedious problem. (XSD/WSDL flashbacks.)

All it would really take for a profound QoL improvement are a couple of "special" functions on Http that handle data exactly like ports do now... just saying.

Wyatt Benno

Feb 19, 2017, 9:36:56 AM
to Elm Discuss
True that. Elm needs to play nicely with JSON and the other front-end interop that is expected at this point: API calls, JSON formats, drop-down options. These are all taken for granted in the front-end world, but in Elm we need to write a lot of code just to make sure the code never fails. I wish both aspects of Elm could just go together.

Kasey Speakman

Mar 17, 2017, 2:52:30 PM
to Elm Discuss
Another update. I figured out how to get the encoders and decoders out of the ports using a native module. So I no longer have to actually send things across ports (still have to declare them). I can also now use Elm's Http module.

Here's a gist of what it takes:

Noah Hall

Mar 18, 2017, 12:43:04 PM
to elm-d...@googlegroups.com
http://json2elm.com. Auto-generating JSON "codecs" (decoders and encoders) has existed for more than a year already.

Kasey Speakman

Mar 18, 2017, 12:57:34 PM
to elm-d...@googlegroups.com
Yes, but it requires me to keep decoder/encoder files, and fiddle with them every time my types change. What a waste of time.

Using this method, I don't have to do anything extra when types change. Nor keep extra files.
