error handling


Ashish Agarwal

Jan 19, 2014, 10:57:20 AM
to Biocaml
Our other conversation brought up the issue of error handling with polymorphic variants. I'm starting a separate thread to discuss this too. After 2 years of using polymorphic variants a lot, I'm questioning the viability of the approach. I do really like the precision it provides, but it has also been very cumbersome to maintain. I think we should consider moving to exceptions and/or Or_error. This could accelerate development quite a bit by simplifying the API. Any thoughts on this?

Sebastien Mondet

Jan 19, 2014, 12:39:44 PM
to bio...@googlegroups.com

Just to be precise:
- Using exceptions or Or_error.t does not really simplify development; it just "delays" part of it. Handling, and especially documenting, errors gets pushed to later, as we can see in Core/Async's big lack of documentation.

The only problem that I've really had with polymorphic-variant-based errors is printing them while wanting to get rid of Camlp4. Hopefully, with OCaml 4.02.0, there will be no need for Camlp4 anymore (extension points, and maybe even runtime types) :)

Also note that exception types, strings, and/or Or_error.t can all be used together anyway; we just need to agree on the actual names we give to them:
('a, [> `Exn of exn | `Failure of string | `Core_or_error of Or_error.t ]) Result.t
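A minimal sketch of how such a combined error type could be used. It relies on the standard library's result type (OCaml 4.03 or later) rather than Core's Result.t, and the helpers of_exn, fail_with and parse_positive are hypothetical names chosen for illustration only.

```ocaml
(* Sketch only: Stdlib's [result] stands in for Core's [Result.t];
   all function names below are hypothetical. *)

(* Run [f], turning any raised exception into an [`Exn] error. *)
let of_exn (f : unit -> 'a) : ('a, [> `Exn of exn ]) result =
  try Ok (f ()) with e -> Error (`Exn e)

(* Wrap a plain error message as a [`Failure] error. *)
let fail_with (msg : string) : ('a, [> `Failure of string ]) result =
  Error (`Failure msg)

(* Both kinds of error end up in one open error type:
   (int, [> `Exn of exn | `Failure of string ]) result *)
let parse_positive (s : string) =
  match of_exn (fun () -> int_of_string s) with
  | Error _ as e -> e
  | Ok n when n > 0 -> Ok n
  | Ok _ -> fail_with "expected a positive integer"
```

Because the error type is open, further tags can be added by other functions without touching these definitions.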







--
You received this message because you are subscribed to the Google Groups "biocaml" group.
To unsubscribe from this group and stop receiving emails from it, send an email to biocaml+u...@googlegroups.com.
To post to this group, send email to bio...@googlegroups.com.
Visit this group at http://groups.google.com/group/biocaml.
To view this discussion on the web visit https://groups.google.com/d/msgid/biocaml/CAMu2m2%2Buz_c%3D9xH76%2B6YCqhsnSVLc3EmcDOWUQQXZyFB3QqKLg%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.

Malcolm Matalka

Jan 19, 2014, 12:44:36 PM
to bio...@googlegroups.com
A bit of a rehash from the other thread but:

While polymorphic variants have their limitations, so far I've found
them to be the most powerful tool OCaml has for error handling. But I
haven't used them at significant scale. Do you have a specific use case
where they fall apart? I just think it's so powerful to be able to add
a new error case and let the compiler tell you every place you need to
handle it. It seems like a lot to give up to me.

Disclaimer: I'm not a user of biocaml so my opinions should be taken
with a grain of salt.

Philippe Veber

Jan 20, 2014, 5:06:48 PM
to Biocaml



2014/1/19 Sebastien Mondet <sebastie...@gmail.com>


> Just to be precise:
> - using exceptions or Or_error.t does not really simplify development, it just "delays" part of it (handling and especially documenting errors gets pushed for later, we see that in Core/Async's big lack of documentation)

I must say that, as a user, I also appreciate these thorough error types of yours. So I too would rather stick with them.
 

> The only problem that I've really had with polymorphic-variant-based errors is printing them while wanting to get rid of Camlp4, hopefully with ocaml 4.02.0, there will be no need for camlp4 anymore (extension points, and maybe even runtime types) :)
>
> Also note that exception types, string, and/or Or_error.t are or can be used all together anyway, we just need to agree on the actual names we give to them:
> ('a, [> `Exn of exn | `Failure of string | `Core_or_error of Or_error.t ]) Result.t

I don't feel that would be so great. Actually I'm quite happy with the way it is: having detailed error types, either in a Result.t or in an exception, now looks to me like a nice strategy. The only modification I'd suggest is to allow the exception versions to be as precise as the Result.t versions. Let's see that on an example from the Fastq module:


val in_channel_to_item_stream : ?buffer_size:int -> ?filename:string -> in_channel ->
  (item, [> Error.parsing]) Result.t Stream.t
(** Parse an input-channel into a stream of [item] results. *)

val in_channel_to_item_stream_exn:
  ?buffer_size:int -> ?filename:string -> in_channel -> item Stream.t
(** Returns a stream of [item]s.
[Stream.next] will raise [Error _] in case of any error. *)

While the Result.t version provides an Error.parsing value in case of an error, the exception version raises an exception carrying an Error.t, which also contains variants unrelated to parsing. I would rather have several exception definitions to match the most precise types:

exception Parsing_error of Error.parsing

Would that be ok?
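To make the proposal concrete, here is a rough sketch. The Error.parsing type and the parse_header functions below are hypothetical stand-ins, not Biocaml's actual Fastq definitions, and the standard library's result type is assumed in place of Core's Result.t.

```ocaml
(* Hypothetical stand-in for Fastq's error types; the real ones differ. *)
module Error = struct
  type parsing =
    [ `Incomplete_input of int | `Malformed_header of int * string ]
end

(* One exception whose argument is exactly the precise error type. *)
exception Parsing_error of Error.parsing

(* Toy Result-returning parser: a header line must start with '@'. *)
let parse_header ~line (s : string) : (string, [> Error.parsing ]) result =
  if String.length s = 0 then Error (`Incomplete_input line)
  else if s.[0] <> '@' then Error (`Malformed_header (line, s))
  else Ok (String.sub s 1 (String.length s - 1))

(* The _exn version is derived from the Result.t version, so both
   expose the same Error.parsing values. *)
let parse_header_exn ~line s =
  match parse_header ~line s with
  | Ok name -> name
  | Error e -> raise (Parsing_error e)
```

Deriving the _exn function from the Result.t one keeps the two styles in sync with a single error type to maintain.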
 







Ashish Agarwal

Jan 21, 2014, 10:05:14 AM
to Biocaml
Let's define:
PV: indicating errors with a Result.t with polymorphic variants for the error type
EXN: indicating errors by raising exceptions

And now let's consider the similarities and differences.

PV works because polymorphic variants are an extensible data type. You can keep adding values to it, thus allowing composition of errors. EXN works for the same reason. Whenever you define an exception, you've added a value to the single global `exn` type.
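As an illustration of that composition, here is a small sketch using the standard library's result type (an assumption; the thread uses Core's Result.t). The function names and error tags are hypothetical.

```ocaml
(* Two hypothetical steps, each with its own error tag. *)
let parse_int s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error (`Not_an_int s)

let check_quality q =
  if q >= 0 && q <= 93 then Ok q else Error (`Quality_out_of_range q)

(* Monadic bind on [result]: an error short-circuits the rest. *)
let ( >>= ) r f = match r with Ok x -> f x | Error e -> Error e

(* The error types are merged by inference:
   string -> (int, [> `Not_an_int of string
                    | `Quality_out_of_range of int ]) result *)
let quality_of_string s = parse_int s >>= check_quality
```

No declaration lists the combined error type; it is inferred from the two steps, which is what makes the approach compose.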

Both PV and EXN allow you to be as precise as you want about the error type. You can define as fine-grained a value as you want, e.g. the variant `Sequence_qualities_mismatch of string and, analogously, the exception Sequence_qualities_mismatch of string.

In PV, you lose scope. You don't know which module the variant `Foo comes from. Values of type exn include a module path.

PV requires you to consider errors while EXN allows you to ignore them. Actually, I disagree with this. The PV approach forces us to use the Result monad, and the entire point of that is to systematically ignore errors. You can avoid the monad when you do want to handle an error, but you can also use try ... with when you want to handle an exception. I don't think PV leads to better error handling, except perhaps indirectly due to the next point.

PV leads to compiler-enforced documentation about the possible errors. EXN requires manual diligence to document errors in comments. To me, this is the main difference between the two approaches. Compiler-enforced documentation isn't a clear win. You get crazy error messages. The API is more painful since your attention is taken too much by errors, which you usually want to ignore (although this might be resolved by a custom documentation generator).

Maintaining PV errors is harder than with EXN. Right now, we define them all in an Error sub-module, which is annoying. Every time you define a new variant, you have to remember to add it there. Within that module, you might have to refactor the multiple type definitions. On the other hand, if you don't provide these type definitions, then your specific variant name will get propagated to other signatures. Thus, if you change the name, you'll have to change it in many places or your code won't compile. All this for an error that is almost always ignored. With EXN, you can define precise errors, rename them freely, and it'll almost never require any other change. You just have to grep the source code to make sure you didn't use it somewhere, which you almost never will have.

Now responses to others' comments:

> using exceptions or or_error.t does not really simplify development, it just "delays" part of it

I agree, sort of. If you never have to handle the error, then the delay is permanent, i.e. you never have to worry about it. Finite delays may also be good. At least you make faster progress at first. Later we can add more precise types if we realize they matter for some part.


> Do you have a specific use case where they fall apart?

Imagine a command line app. The overall function takes in command line arguments and either directly or indirectly calls multiple functions. Every single one of those functions and the command line parsing can have errors. If you use the PV approach to the extreme, every single one of those errors is a variant in your output type. Thus, you literally have an output type that is pages long.


> I just think it's so powerful to be able to add a new error case and let the compiler tell you every place you need to handle it.

As I say above, I don't think you get this. You only get this if you handled the errors in the first place, but I claim you almost never do. You almost always use a Result monad and then the compiler doesn't tell you anything. This is an important point. If the situation is such that the error should usually be considered, I'm all in favor of returning a variant. But then we probably don't want it in a Result.t. For example, Pipe.read returns [ `Eof | `Ok of 'a] and Map.find returns 'a option. Both make sense to me.
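A small sketch of that last style, where the "unusual" case is part of the ordinary return value and the caller has to match on it. The read function below is a hypothetical, synchronous imitation of the Pipe.read return type over a list-backed queue; it is not Async's API.

```ocaml
(* Hypothetical reader over a list-backed queue. *)
let read (q : 'a list ref) : [ `Eof | `Ok of 'a ] =
  match !q with
  | [] -> `Eof
  | x :: rest ->
    q := rest;
    `Ok x

(* The caller matches on both cases; there is no monad hiding `Eof. *)
let rec drain q acc =
  match read q with
  | `Eof -> List.rev acc
  | `Ok x -> drain q (x :: acc)
```

Here forgetting the `Eof case is reported by the compiler at the call site, because the value is matched directly rather than threaded through a bind.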

> I would rather have several exception definitions to match the most precise types:
>
> exception Parsing_error of Error.parsing
>
> Would that be ok?

In the limit, we could have an exception for each error variant. That actually makes sense to me, but I'm concerned about maintenance.

Thanks for all the discussion. I hope we can keep discussing a bit more, and then we should make a decision.




Philippe Veber

Jan 21, 2014, 3:38:16 PM
to Biocaml
I mostly agree with what you say, apart from the following comments.

 
> PV requires you to consider errors while EXN allows you to ignore them. Actually, I disagree with this. The PV approach forces us to use the Result monad, and the entire point of that is to systematically ignore errors. You can avoid the monad when you do want to handle an error, but you can also use try ... with when you want to handle an exception. I don't think PV leads to better error handling, except perhaps indirectly due to the next point.

That's not completely fair: when you decide to handle errors with the Result monad, the compiler does an exhaustiveness check, while nothing prevents you from forgetting a case in a try ... with ... expression. I think that's the reason why the functions of Fastq can only raise one exception, so that you won't forget any case.
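A short sketch of that check, with hypothetical types and function names and the standard library's result type assumed in place of Core's Result.t.

```ocaml
(* Hypothetical error type for a toy line checker. *)
type parsing = [ `Empty_line of int | `Bad_char of int * char ]

let check_line ~line (s : string) : (string, [> parsing ]) result =
  if s = "" then Error (`Empty_line line)
  else
    match String.index_opt s ' ' with
    | Some _ -> Error (`Bad_char (line, ' '))
    | None -> Ok s

(* Handling the result without a catch-all case: every error tag
   must be listed here for the function to type-check cleanly. *)
let describe ~line s =
  match check_line ~line s with
  | Ok s -> "ok: " ^ s
  | Error (`Empty_line n) -> Printf.sprintf "line %d is empty" n
  | Error (`Bad_char (n, c)) ->
    Printf.sprintf "line %d: unexpected character '%c'" n c
```

If a third tag were added to parsing, describe would no longer compile until the new case is handled, whereas a try ... with over several exceptions would keep compiling silently.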

 

> PV leads to compiler enforced documentation about the possible errors. EXN requires manual diligence to document errors in comments. To me, this is the main difference in the two approaches. Compiler enforced documentation isn't a clear win. You get crazy error messages. The API is more painful since your attention is taken too much by errors, which you usually want to ignore (although this might be resolved by a custom documentation generator).

That was my initial concern, two years ago: it's certainly sad, but OCaml's type system does not handle errors natively. As the type system is expressive enough, one can still stuff the return type of functions with other information, but that is always to the detriment of concision/readability, both in the API type and in the implementation. I personally think that it is better to stick to the style the language was designed for (that is, exceptions in OCaml), but that is certainly a matter of taste and depends on the kind of applications you're writing.


> Maintaining PV errors is harder than with EXN. Right now, we define them all in an Error sub-module, which is annoying. Every time you define a new variant, you have to remember to add it there. Within that module, you might have to refactor the multiple type definitions. On the other hand, if you don't provide these type definitions, then your specific variant name will get propagated to other signatures. Thus, if you change the name, you'll have to change it in many places or your code won't compile. All this for an error that is almost always ignored. With EXN, you can define precise errors, rename them freely, and it'll almost never require any other change. You just have to grep the source code to make sure you didn't use it somewhere, which you almost never will have.

I don't follow you here: if you want precise error descriptions, you'll need some sort of variants, and hence type definitions for them. These definitions are better grouped together (in a module or not, but it doesn't really hurt to use one) than scattered. Oh wait, now I get it: if you change the name of the error type then it has to be changed in the signature of all ('a,'b) Result.t returning functions (which mostly ignore the error anyway). Is that it?
 



> > I would rather have several exception definitions to match the most precise types:
> >
> > exception Parsing_error of Error.parsing
> >
> > Would that be ok?
>
> In the limit, we could have an exception for each error variant. That actually makes sense to me, but I'm concerned about maintenance.

No, I didn't mean going that far, and this would be harmful: a function should not be able to raise more than a couple of different exceptions, and ideally only one, because the type checker is not there to help you with exhaustiveness checks. IMO, the best design for a module is not a single exception for the module (because then functions that do very different things have to share the same error type), but at most one exception (or two) per function. For instance, a Fastq.read_exn function can only raise Parser_error of Error.parsing; that way, when you want to do error handling:

try Fastq.read_exn x
with
| Fastq.Parser_error e -> (
  match e with
  | `foo bar -> ...
  | `baz -> ...
)

and there the compiler is watching your steps: if you add a new error variant, you'll be warned that some cases are not covered. On the contrary

try Fastq.read_exn x
with
  | Fastq.Error_foo bar -> ...
  | Fastq.Error_baz -> ...

would be a real pain: if a new exception is created you are on your own to find places where to augment pattern matches.
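A runnable sketch of the first pattern, under stated assumptions: Fastq_sketch, its parsing type and read_exn are hypothetical stand-ins for the Fastq module, and the tags are capitalised only for readability.

```ocaml
(* Hypothetical stand-in for the Fastq module discussed above. *)
module Fastq_sketch = struct
  type parsing = [ `Foo of string | `Baz ]

  (* A single exception whose argument is the closed variant type. *)
  exception Parser_error of parsing

  let read_exn (s : string) : string =
    if s = "" then raise (Parser_error `Baz)
    else if s.[0] <> '@' then raise (Parser_error (`Foo s))
    else s
end

let describe s =
  try "read " ^ Fastq_sketch.read_exn s
  with Fastq_sketch.Parser_error e -> (
    (* [e] has the closed type [parsing], so this inner match is
       checked for exhaustiveness by the compiler. *)
    match e with
    | `Foo bad -> "malformed record: " ^ bad
    | `Baz -> "empty input")
```

Adding a tag to parsing makes the inner match non-exhaustive, so the compiler flags it; the outer try ... with itself is still unchecked.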
 

Thanks for all the discussion. I hope we can keep discussing a bit more, and then we should make a decision.

Having divergent opinions on the topic, we have chosen to be liberal and support both styles up to now. In practice it's mostly you guys who made the effort (I haven't contributed much over the past year), so I'd say that's also mostly your call. My personal opinion would be to at least keep the exception-style functions, with one exception or two at most per function, and a polymorphic variant type as the argument of the exception. If we keep supporting both styles, this polymorphic variant type can then be used in ('a,'b) Result.t returning functions. While I still think the Result monad is too much hassle for me, I do appreciate the rigorous error handling in your parser modules, which has been a real lesson for me (and sometimes a much appreciated help in my own programs!).

cheers,
ph.
 

Ashish Agarwal

Jan 21, 2014, 3:47:59 PM
to Biocaml
On Tue, Jan 21, 2014 at 3:38 PM, Philippe Veber <philipp...@gmail.com> wrote:

> That's not completely fair: when you decide to handle errors with the Result monad, the compiler does an exhaustivity check, while nothing prevents you from forgetting a case in a try ... with ... expression.

True. Your later proposal to use polymorphic variants as the argument of exceptions would sort of provide this too.


> Oh wait, now I get it: if you change the name of the error type then it has to be changed in the signature of all ('a,'b) Result.t returning functions (which mostly ignore the error anyway). Is that it?

Yes, that's what I meant.


> My personal opinion would be to at least keep the exception-style functions, with one exception or two at most per function, polymorphic variant type as argument of the exception. If we keep supporting both styles, this polymorphic variant type can then be used in ('a,'b) Result.t returning functions. While I still think the Result monad is too much hassle for me, I do appreciate the rigorous error handling in your parser modules, which has been a real lesson for me (and sometimes a much appreciated help in my own programs!).

Okay, so it sounds like now you're in favor of EXN over PV. However, you're suggesting to use polymorphic variants as the argument of exceptions. That might be a happy middle ground.

Sebastien Mondet

Jan 21, 2014, 4:01:05 PM
to bio...@googlegroups.com
The problem there is that Printexc.to_string (which is Exn.to_string in Core) does not really work with complex exception arguments.

    # exception Error of [ `bouh | `bah ];;
    exception Error of [ `bah | `bouh ]
    # Printexc.to_string (Error `bouh);;
    - : string = "Error(-1055159968)"

So to print decent error messages you still need to match over the argument (the only "pain point" to me with "PV")
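One possible way to do that matching once and reuse it is sketched below. It assumes nothing beyond the standard library's Printexc.register_printer; the error type name and error_to_string are hypothetical.

```ocaml
type error = [ `bouh | `bah ]

exception Error of error

(* The hand-written match over the argument: this is the part that
   has to be maintained for every new tag. *)
let error_to_string : error -> string = function
  | `bouh -> "bouh"
  | `bah -> "bah"

(* Register a printer so that Printexc.to_string uses the match above
   for this exception and falls back to the default for all others. *)
let () =
  Printexc.register_printer (function
    | Error e -> Some (Printf.sprintf "Error(%s)" (error_to_string e))
    | _ -> None)
```

The match in error_to_string is exhaustive over a closed type, so a newly added tag is at least reported by the compiler there.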




Malcolm Matalka

Jan 21, 2014, 4:06:13 PM
to bio...@googlegroups.com
> As I say above, I don't think you get this. You only get this if you
> handled the errors in the first place, but I claim you almost never
> do.

This is true of young projects but IME, as projects age their error
handling becomes more and more important because of correctness.

Exn's are nice in this case because you can keep on adding new error
handling to your function and it's not an API change the compiler cares
about, so it's not a backwards incompatible change. But at the same
time, that's also the weakness from my perspective, since as an API
consumer I probably want to be made aware of such changes.

I disagree that the point of the Result.t monad is to avoid handling
errors. Instead, it lets someone draw a clear line between the
happy-path and the error handling. But I don't have a large codebase to
say how well it scales.

Ashish Agarwal

Jan 24, 2014, 11:39:34 AM
to Biocaml
We have about a 50/50 split in favor of continuing with using Result.t's vs exceptions, and I personally change my mind about this all the time. Thus, I'm inclined to not make a change, i.e. keep using Result.t's with polymorphic variants. I'll do what I can to make the PV approach easier to maintain. Follow along with the work on the Fastq module if you want to see what I'm doing.

