Stadef and stacst questions: am I missing Skolen constants?

55 views
Skip to first unread message

Yannick Duchêne

unread,
Dec 12, 2015, 10:06:45 PM12/12/15
to ats-lang-users
I'm not a logic guru, but my intuition tells me what I'm looking for, is what was named Skolem variables or Skolem constants, in Isabelle/HOL (and not only known of the latter).

Don't be to afraid of the two samples, the last comment will tell what to look at mainly.

Here is something I wanted to write:

prfn utf8_c2_set
 
{c:int; UTF8_C2_LOWER <= c; c <= UTF8_C2_UPPER} ():
    UTF8_C2
(2, c) =
  let
    stacst b1
:hb_C2
    stadef c1
:int = b1 - 0xC0
    stacst b2
:tb_80
    stadef c2
:int = (c1 * 64) + (b2 - 0x80)
    prval pfd1
:DATA(1, b1) = FIRST
    prval pfc1
:UTF8_C2(1, c1) = UTF8_C2_1(pfd1)
    prval pfd2
:DATA(2, b2) = NEXT(pfd1)
    prval pfc2
:UTF8_C2(2, c2) = UTF8_C2_2(pfc1, pfd2)
 
in pfc2 end



It fails, with many errors, including I can't use subset sorts with `stacst`, and I had to write it this way instead, a way with which I very unhappy (sorry) and I'm not even really sure it means what I intended:

prfn utf8_c2_set
 
{c2:int; UTF8_C2_LOWER <= c2; c2 <= UTF8_C2_UPPER}
 
{b1:hb_C2; c1:int; b2:tb_80}
 
{c1 == b1 - 0xC0; c2 == (c1 * 64) + (b2 - 0x80)}
 
(): UTF8_C2(2, c2) =
  let
    prval pfd1
:DATA(1, b1) = FIRST
    prval pfc1
:UTF8_C2(1, c1) = UTF8_C2_1(pfd1)
    prval pfd2
:DATA(2, b2) = NEXT(pfd1)
    prval pfc2
:UTF8_C2(2, c2) = UTF8_C2_2(pfc1, pfd2)
 
in pfc2 end



Mainly, look at the `stacst` and `stadef` in the first version: I had to move it all as constraints outside of the body, while it should be constraints on local constants.

Well, obviously there are constraints to follow to invoke the desired constructors. These are the constraints on the building block showing an `UTF8_C2(2, c)` is feasible for all `c` in `UTF8_C2_LOWER` to `UTF8_C2_UPPER`. But these constraints should not appear there, they should not look like restricting the extend of the intended proof.

This kind of “inside out” issue when there are implicit existential quantifications, reminds me some things for which I often get replied talking about so called Skolem constants.

Still seeking for a solution, but I feel blocked.

gmhwxi

unread,
Dec 12, 2015, 10:17:16 PM12/12/15
to ats-lang-users

To get the skolem constant, please do

prval [b1:int] pfd1:DATA(1, b1) = FIRST

If you show me the type of FIRST, I could probably say more.

gmhwxi

unread,
Dec 12, 2015, 10:19:15 PM12/12/15
to ats-lang-users

Use this instead:

prval [b1:int] (pfd1:DATA(1, b1)) = FIRST

Yannick Duchêne

unread,
Dec 12, 2015, 10:28:04 PM12/12/15
to ats-lang-users


Le dimanche 13 décembre 2015 04:17:16 UTC+1, gmhwxi a écrit :

To get the skolem constant, please do

prval [b1:int] pfd1:DATA(1, b1) = FIRST


I will try this (will tell in a moment). I though about this way, but wrote the quantification between `:` and `DATA`
 
Thanks, Professor Xi :-P

If you show me the type of FIRST, I could probably say more.

I was afraid to post a too long snippet. Any way, I will probably try to documentation this self‑assignment and make it available on GitHub. I believe it can an interesting sample for beginners, on a concrete case.
 

Yannick Duchêne

unread,
Dec 12, 2015, 10:28:55 PM12/12/15
to ats-lang-users


Le dimanche 13 décembre 2015 04:19:15 UTC+1, gmhwxi a écrit :

Use this instead:

prval [b1:int] (pfd1:DATA(1, b1)) = FIRST

OK, I will try both. 

Yannick Duchêne

unread,
Dec 12, 2015, 10:35:42 PM12/12/15
to ats-lang-users


Le dimanche 13 décembre 2015 04:19:15 UTC+1, gmhwxi a écrit :

Use this instead:

prval [b1:int] (pfd1:DATA(1, b1)) = FIRST

I get the same errors I always get: “the identifier [hb_C2] is expected to refer to a sort (instead of a subset sort)”.

`b1` can't be just `int`, there must be more constraints on it, and I'm always facing “no subset sort allowed” issues.

I should write:

prval [b1:hb_C2] pfd1:DATA(1, b1) = FIRST
prval pfc1
:UTF8_C2(1, c1) = UTF8_C2_1(pfd1)


Because:

| {b:hb_C2; i == 1} UTF8_C2_1(i, b - 0xC0) of BYTE

That's this constraint I can't express on the static constant `b1`. Similarly with a `b2` which has it's own constraint too.

gmhwxi

unread,
Dec 12, 2015, 10:40:34 PM12/12/15
to ats-lang-users
There is a lot missing information.

How is hb_C2 defined?
What is the type of FIRST?

Yannick Duchêne

unread,
Dec 12, 2015, 10:50:09 PM12/12/15
to ats-lang-users


Le dimanche 13 décembre 2015 04:40:34 UTC+1, gmhwxi a écrit :
There is a lot missing information.

How is hb_C2 defined?
What is the type of FIRST?

Modulo the #define use, it stands for this:

sortdef hb_C2 = {b:int | 0xC2 <= b && b <= 0xDF}


And for FIRST:

dataprop DATA (i:int, b:int) =
| FIRST(1, b)
| NEXT(i, b) of [pb:int] DATA(i-1, pb)



I'm finally attaching the whole source, may be that's better.



That's also an idea I had:

prfn utf8_c2_set
 
{c:int; UTF8_C2_LOWER <= c; c <= UTF8_C2_UPPER} ():
    UTF8_C2
(2, c) =

  let
    prval pfd1
:DATA(1, b1) = FIRST
    prval pfd2
:DATA(2, b2) = NEXT(pfd1)
    prval pf
=
      sif
(0xC2 <= b1 && b1 <= 0xDF) && (0x80 <= b2 && b2 <= 0xBF) then
        let
          prval pfc1
:UTF8_C2(1, c1) = UTF8_C2_1(pfd1)

          prval pfc2
:UTF8_C2(2, c2) = UTF8_C2_2(pfc1, pfd2)
       
in pfc2 end

     
else
       
// Don't know what to put here
       
// This won't be a valid UTF8_C2(2, c)
       
// There is no constnt meaning “Impossible”
 
in pf end


 
UTF_8.dats

gmhwxi

unread,
Dec 12, 2015, 11:32:45 PM12/12/15
to ats-lang-users

I suggest:

sortdef byte = {a:nat | a <= 0xFF}

dataprop DATA
(int, int) =
| {b:byte} FIRST(1, b)
| {i:int;i >= 2}{b,pb:byte} NEXT(i, b) of DATA(i-1, pb)

The following code typechecks:


prfn
utf8_c2_set
{c:int;
 UTF8_C2_LOWER
<= c;
 c
<= UTF8_C2_UPPER}
(
) : UTF8_C2(2, c) =
let
  stadef b1
= c/64+0xC0
  stadef b2
= c%64+0x80
  prval pfd1
= FIRST{b1}()
in
  UTF8_C2_2
(UTF8_C2_1(pfd1), NEXT{2}{b2,b1}(pfd1))
end

Yannick Duchêne

unread,
Dec 12, 2015, 11:46:50 PM12/12/15
to ats-lang-users


Le dimanche 13 décembre 2015 05:32:45 UTC+1, gmhwxi a écrit :


The following code typechecks:


Indeed.

What means `FIRST{b1}()` and `NEXT{2}{b2,b1}(pfd1)` ? I remember I use to know it a long time ago, but I forgot. May be something I missed which prevented me from discovering a solution.

I studying your version of `utf8_c2_set`.

Yannick Duchêne

unread,
Dec 12, 2015, 11:58:05 PM12/12/15
to ats-lang-users


Le dimanche 13 décembre 2015 05:32:45 UTC+1, gmhwxi a écrit :

I suggest:

sortdef byte = {a:nat | a <= 0xFF}

dataprop DATA
(int, int) =
| {b:byte} FIRST(1, b)
| {i:int;i >= 2}{b,pb:byte} NEXT(i, b) of DATA(i-1, pb)


I tested something: if I say  `dataprop DATA (i:int, b:int)` it fails. I did not suspect this was so different to name the arguments there. I though it was just a kind of default declaration. It seems to have some other consequences.

gmhwxi

unread,
Dec 13, 2015, 12:45:59 AM12/13/15
to ats-lang-users
If you write DATA(i:int, b:int), then the quantifier {i:int;b:int} is
automatically added to the front of the type of each constructor
associated with DATS.

Yannick Duchêne

unread,
Dec 13, 2015, 1:06:25 AM12/13/15
to ats-lang-users


Le dimanche 13 décembre 2015 06:45:59 UTC+1, gmhwxi a écrit :
If you write DATA(i:int, b:int), then the quantifier {i:int;b:int} is
automatically added to the front of the type of each constructor
associated with DATS.

I see. It seems to also prevents from being able to pass some constraint using `{…}` .

The same can be achieved using an explicit type, as in this variant I just tested, not using `{…}`:

prfn utf8_c2_set
 
{c:int; UTF8_C2_LOWER <= c; c <= UTF8_C2_UPPER} ():
    UTF8_C2
(2, c) =

  let
    stadef b1
= c / 64 + 0xC0
    stadef b2
= c % 64 + 0x80

    prval pfd1
:DATA(1, b1) = FIRST
    prval pfd2
:DATA(2, b2) = NEXT(pfd1)

    prval pfc1
= UTF8_C2_1(pfd1)
    prval pfc2
= UTF8_C2_2(pfc1, pfd2)
 
in pfc2 end


I'm keeping this in my notes.

Yannick Duchêne

unread,
Dec 13, 2015, 2:59:25 AM12/13/15
to ats-lang-users


Le dimanche 13 décembre 2015 04:06:45 UTC+1, Yannick Duchêne a écrit :
Mainly, look at the `stacst` and `stadef` in the first version: I had to move it all as constraints outside of the body, while it should be constraints on local constants.

In the correct version, stadef b1 and b2 are created from c. My way of trying this was failing because I could not define stacst with a subsort. Stadef neither allows a subsort, while it can holds indirectly constrained values, as with the two expressions, so that's not because some things can't support constraints.

What is the rational for not allowing subsort and only base sort, at multiple places? Is this an implementation reason or a logical reason? Is there really no way to add an explicit constraint to a stasct and a stadef?

Yannick Duchêne

unread,
Dec 13, 2015, 3:41:13 AM12/13/15
to ats-lang-users


Le dimanche 13 décembre 2015 04:19:15 UTC+1, gmhwxi a écrit :

Use this instead:

prval [b1:int] (pfd1:DATA(1, b1)) = FIRST

This construct is OK when the type says so:

typedef t = [i:int] int(i)
val
[i:int] (a:t) = 0:t

gmhwxi

unread,
Dec 13, 2015, 7:34:12 AM12/13/15
to ats-lang-users

ATS is a stratified system.

There are sort-checking and type-checking.

Sort-checking does not involve any constraint-solving.

Subsorts can only be used together with quantification as a form of convenience.

Not supporting general use of subsorts is first a design decision. However, this decision
has a big impact on implementation; it greatly simplifies implementation.

Yannick Duchêne

unread,
Dec 13, 2015, 11:27:58 AM12/13/15
to ats-lang-users


Le dimanche 13 décembre 2015 13:34:12 UTC+1, gmhwxi a écrit :

Subsorts can only be used together with quantification as a form of convenience.

This means one is always able to do without subset sorts? And there is no other place, except quantifier expression where it can be used?

To come back on the thread's title, it seems to me the Skolem constants are `stacst`: a fixed arbitrary value of a given type (or sort, here)

To quickly comment (for readers) on the real world case presented in this thread, when I wanted to do something like …

// Just a picture, can't really do this, not allowed.
stacst b1
: {b:int | b >= 0xC2 && b < 0xDF}
stacst b2
: {b:int | b >= 0x80 && b < 0xBF}

… I was wrong any way, because these are arbitrary and unrelated values, which could only be OK to prove the other way: this from of UTF-8 string may generate all values in a given character range. A total of 0x780 different values [(0xDF-0xC2+1)*(0xBF-0x80+1)] starting from UTF8_C2_LOWER indeed fills all values up to UTF8_C2_UPPER [UTF8_C2_LOWER+(0x800-1)].

But the proof function was saying the other way: given all values in a character range, they can always be represented with this form of UTF-8 string. And for this, a relation between b1 and b2 was required, as c imposes this relation. So your way was the only one possible for this proof functions:

stadef b1 = c / 64 + 0xC0 // Ends to be in 0xC2 .. 0xDF
stadef b2
= c % 64 + 0x80 // Ends to be in  0x80 .. 0xBF
// Same ranges, however with a relation between both!


This long comment was for readers, I know I'm not learning you anything you did not know :-P 

gmhwxi

unread,
Dec 13, 2015, 11:37:34 AM12/13/15
to ats-lang-users
For the record, you can introduce a Skolem constant b1 in this way:

extern
prfun
lemma
(
// argless
) : [b:int | b >= 0xC2 && b < 0xDF] void

prval
[b1:int] () = lemma()

The assumption is that you are able to prove 'lemma' first. However, I don't think this is
what you need in this case.

Yannick Duchêne

unread,
Dec 13, 2015, 3:18:32 PM12/13/15
to ats-lang-users


Le dimanche 13 décembre 2015 17:27:58 UTC+1, Yannick Duchêne a écrit :
[…]


But the proof function was saying the other way: given all values in a character range, they can always be represented with this form of UTF-8 string. And for this, a relation between b1 and b2 was required, as c imposes this relation. So your way was the only one possible for this proof functions:
[…]

I tried to continue to investigate on my initial idea, and there is more. I won't tell all of of what get trough my mind, but it can be summarized.

Given this:
dataprop P =
| P1

dataprop Q
=
| Q1 of P

One can say “if you have a P, you can get a Q”:
prfn pf (p:P): Q = Q1(p)

But although it could intuitively make sense, there is no way to say something like “if you have a Q, you could got a P”:
// For the picture, not possible!
prfn pf
(q:Q): [p:P; q == Q1(p)] void

Investigating more, I believe that's also this kind of impasse, I was in.

gmhwxi

unread,
Dec 13, 2015, 4:21:11 PM12/13/15
to ats-lang-users

From Q to P:

prfn Q2P (pf: Q): P =
  let val Q1
(pf2) = pf in pf2 end

Yannick Duchêne

unread,
Dec 13, 2015, 5:24:23 PM12/13/15
to ats-lang-users


Le dimanche 13 décembre 2015 22:21:11 UTC+1, gmhwxi a écrit :

From Q to P:

prfn Q2P (pf: Q): P =
  let val Q1
(pf2) = pf in pf2 end


Yes, thanks. And case+ when there are multiple ways for deriving Q from P:
prfn Q2P (pf:Q): P =

 
case+ pf of
 
| Q1(pf2) => pf2

You also proved I need sleeping …

Reply all
Reply to author
Forward
0 new messages