Constraint solver: what can safely be assumed about its capacity?

Yannick Duchêne

unread,

Dec 12, 2015, 5:12:22 PM12/12/15

to ats-lang-users

Going on with https://groups.google.com/forum/#!topic/ats-lang-users/q0sY4IdTMbg , I proved each character ranges which valid UTF-8 string can represents. It was proved based on the UTF-8 specification re‑written for ATS. I though proving expected properties derived from the specification, was a way to check the specification it-self, and indeed I could catch two typo errors this way (two misspelled constants, so two constants substituted for another).

That's already nice. What's nice too, is that it was more easy than I was afraid it would be; the constraint solver just pleasantly surprised me, I was not expecting so much of it. Basically, I just told it where about to look around, and it found it-self. The proof for each range is short. Here is an example (for UTF-8 strings starting with a head byte in 0xF1..0xF3):

#define UTF8_F1_LOWER 0x40000
#define UTF8_F1_UPPER 0xFFFFF

prfn utf8_f1_range
  {c:int} (pf:UTF8_F1(4, c)):
    [UTF8_F1_LOWER <= c; c <= UTF8_F1_UPPER] void =
  case+ pf of
  | UTF8_F1_4(pfc3, _) =>
    let
      prval UTF8_F1_3(pfc2, _) = pfc3
      prval UTF8_F1_2(pfc1, _) = pfc2
      prval UTF8_F1_1(FIRST) = pfc1
    in end

This lead to a question: what can be safely assumed from the constraint solver? Are its capacities, more or less specified, even informally?

This would be nice to know, as I know there were sometimes issues with Isabelle/HOL, when from a version to another, some proof were failing and needed to be rewritten. I think the point is easily seen, and that's why I have this question.

for the story, now I will attempt to prove each value in each range are indeed decodable from UTF-8 data. I still have no idea of how I will do :-D

gmhwxi

unread,

Dec 12, 2015, 5:25:31 PM12/12/15

to ats-lang-users

The core of the constraint-solver in ATS is based on linear integer programming.
The solver adds to its core some heuristics for handling non-linear terms (involving
multiplication). It is a very primitive solver and also a very stable one.

I suggest that you try the Z3 solver for solving constraints generated from typechecking
ATS code. It is quite good.

Barry Schwartz

unread,

Dec 12, 2015, 5:39:41 PM12/12/15

to 'Yannick Duchêne' via ats-lang-users

'Yannick Duchêne' via ats-lang-users <ats-lan...@googlegroups.com> skribis:

> Going on with
> https://groups.google.com/forum/#!topic/ats-lang-users/q0sY4IdTMbg , I
> proved each character ranges which valid UTF-8 string can represents.

Heh, I just wanted to note I was writing UTF-8 stuff in ATS at the
same time. :)

No particular connection, just a cute coincidence.

Yannick Duchêne

unread,

Dec 12, 2015, 5:49:24 PM12/12/15

to ats-lang-users

Le samedi 12 décembre 2015 23:25:31 UTC+1, gmhwxi a écrit :

The core of the constraint-solver in ATS is based on linear integer programming.
The solver adds to its core some heuristics for handling non-linear terms (involving
multiplication). It is a very primitive solver and also a very stable one.

So basically, modulo the computation time (the problem it said to be NP-hard), whenever I can reduce a numeric problem to linear integer expressions, it's OK (I searched the web, and it seems standard), I can count on ATS alone.

For the non-linear expressions, what are the additions? is there is reasonable picture of the kind of expressions?

I suggest that you try the Z3 solver for solving constraints generated from typechecking
ATS code. It is quite good.

If I need it and am stuck otherwise, I will try it.

Yannick Duchêne

unread,

Dec 12, 2015, 5:52:12 PM12/12/15

to ats-lang-users

Le samedi 12 décembre 2015 23:39:41 UTC+1, Barry Schwartz a écrit :

Heh, I just wanted to note I was writing UTF-8 stuff in ATS at the
same time. :)

No particular connection, just a cute coincidence.

Citizens of the world, love Unicode, and some of them, additionally UTF-8 :-p .

Barry Schwartz

unread,

Dec 12, 2015, 6:24:36 PM12/12/15

to 'Yannick Duchêne' via ats-lang-users

'Yannick Duchêne' via ats-lang-users <ats-lan...@googlegroups.com> skribis:

Oh, definitely. Latin-1 can’t handle Esperanto, and UTF-8 is both
stupendous and greatly able to benefit from ATS. (Compare UTF-7.)

gmhwxi

unread,

Dec 12, 2015, 6:31:39 PM12/12/15

to ats-lang-users

On Saturday, December 12, 2015 at 5:49:24 PM UTC-5, Yannick Duchêne wrote:

Le samedi 12 décembre 2015 23:25:31 UTC+1, gmhwxi a écrit :

The core of the constraint-solver in ATS is based on linear integer programming.
The solver adds to its core some heuristics for handling non-linear terms (involving
multiplication). It is a very primitive solver and also a very stable one.

So basically, modulo the computation time (the problem it said to be NP-hard), whenever I can reduce a numeric problem to linear integer expressions, it's OK (I searched the web, and it seems standard), I can count on ATS alone.

For the non-linear expressions, what are the additions? is there is reasonable picture of the kind of expressions?

It is a bit hard to describe the added heuristics. Basically, some kind of algebra involving polynormials. For instance,

(m+n)*(m-n) = m*m - n*n

Here is an example that can probably give you more feel:

http://www.ats-lang.org/SERVER/MYCODE/Patsoptaas_serve.php?mycode_url=https://raw.githubusercontent.com/githwxi/ATS-Postiats/master/doc/EXAMPLE/ARITH/tally-of-powers.dats

Yannick Duchêne

unread,

Dec 13, 2015, 7:41:22 PM12/13/15

to ats-lang-users

Does it compute constants expressions as constants when it encounters any?

I got a funny case:

    prfn f1 (): [UTF8_E0_UPPER / TB_BASE == 0x3F] void = ()
    prval () = f1()
    prfn f2 (): [temp <= UTF8_E0_UPPER / TB_BASE] void = ()
    stadef b2 = (temp % TB_BASE) + TB_OFS
    prval () = f2()
    prfn f3 (): [TB_E0_LO <= b2; b2 <= TB_E0_HI] void = ()

Don't need more to see the picture. If I comment out the two first line, it fails on f3. If I delete the two first line and replace `UTF8_E0_UPPER / TB_BASE`, dividing two constants, by its constant value, 0x03, then it does not fails. Seems I have to tell it the value of a constants expressions.

I'm surprised, because I though getting constants from constant expressions, was part of the basics (while keeping the original expression too, as it may be helpful).

Also, is there a more convenient notation for this kind of things? Something I would name “note” and “recall”? #define are not OK with that, ATS don't want this:

#define NOTE(name, formula) prfn name (): formula void = ()
#define RECALL(name) prval () = name()

gmhwxi

unread,

Dec 13, 2015, 7:59:41 PM12/13/15

to ats-lang-users

I would like to test it.

Where is TB_BASE declared?

Yannick Duchêne

unread,

Dec 13, 2015, 8:10:28 PM12/13/15

to ats-lang-users

Le lundi 14 décembre 2015 01:59:41 UTC+1, gmhwxi a écrit :

I would like to test it.

Where is TB_BASE declared?

TB_BASE is a #define. I turned multiple things into #define, as I don't like magic numbers, while I agree, there are case where it's more easy to know the real value.

In the attached file, at line 255. Just above, I copied/pasted the defines to help me.

UTF_8.dats

gmhwxi

unread,

Dec 13, 2015, 9:38:39 PM12/13/15

to ats-lang-users

I took a look at the implementation of constraint-solving in ATS.
I don't really know what the cause of the problem is. It seems that
integer division (/) is not treated as a primitive function; this means
that

UTF8_E0_UPPER/TB_BASE is not simplified to 0x3F.
I am pretty sure that this problem does not exist
if you use Z3.

Anyway, you may want to try the following coding style
(instead of introducing functions like f1(), f2(), ...

   prval () =
   prop_verify_and_add
     {UTF8_E0_UPPER/TB_BASE==0x3F}()
   // end of [prval]
   prval () =
   prop_verify_and_add
     {temp <= (UTF8_E0_UPPER/TB_BASE)}()
   // end of [prval]

On Sunday, December 13, 2015 at 7:41:22 PM UTC-5, Yannick Duchêne wrote:

Yannick Duchêne

unread,

Dec 13, 2015, 10:20:37 PM12/13/15

to ats-lang-users

Le lundi 14 décembre 2015 03:38:39 UTC+1, gmhwxi a écrit :

I took a look at the implementation of constraint-solving in ATS.
I don't really know what the cause of the problem is. It seems that
integer division (/) is not treated as a primitive function; this means
that UTF8_E0_UPPER/TB_BASE is not simplified to 0x3F. I am pretty sure that this problem does not exist if you use Z3.

I know, but as long I can rely on ATS only, I prefer to rely on ATS only. Any way, I will give it a try.

prval () = prop_verify_and_add {UTF8_E0_UPPER/TB_BASE==0x3F}() // end of [prval] prval () = prop_verify_and_add {temp <= (UTF8_E0_UPPER/TB_BASE)}() // end of [prval]

Reminds me something. Was not in my notes…

Not a keyword, so instead I will at least make the syntax colorizer, colorize it as “support function”.

Now, I remember there was other in the same family. Will search the files to see where it is defined and what come along with it.

Thanks for the tip, it's already cleaner.

Reply all

Reply to author

Forward