Going on with
https://groups.google.com/forum/#!topic/ats-lang-users/q0sY4IdTMbg , I proved each character ranges which valid UTF-8 string can represents. It was proved based on the UTF-8 specification re‑written for ATS. I though proving expected properties derived from the specification, was a way to check the specification it-self, and indeed I could catch two typo errors this way (two misspelled constants, so two constants substituted for another).
That's already nice. What's nice too, is that it was more easy than I was afraid it would be; the constraint solver just pleasantly surprised me, I was not expecting so much of it. Basically, I just told it where about to look around, and it found it-self. The proof for each range is short. Here is an example (for UTF-8 strings starting with a head byte in 0xF1..0xF3):
#define UTF8_F1_LOWER 0x40000
#define UTF8_F1_UPPER 0xFFFFF
prfn utf8_f1_range
{c:int} (pf:UTF8_F1(4, c)):
[UTF8_F1_LOWER <= c; c <= UTF8_F1_UPPER] void =
case+ pf of
| UTF8_F1_4(pfc3, _) =>
let
prval UTF8_F1_3(pfc2, _) = pfc3
prval UTF8_F1_2(pfc1, _) = pfc2
prval UTF8_F1_1(FIRST) = pfc1
in end
This lead to a question: what can be safely assumed from the constraint solver? Are its capacities, more or less specified, even informally?
This would be nice to know, as I know there were sometimes issues with Isabelle/HOL, when from a version to another, some proof were failing and needed to be rewritten. I think the point is easily seen, and that's why I have this question.
for the story, now I will attempt to prove each value in each range are indeed decodable from UTF-8 data. I still have no idea of how I will do :-D