`dataprop`: is requiring multiple premisses instead of a single one possible?

Yannick Duchêne

unread,

Dec 11, 2015, 8:56:09 PM12/11/15

to ats-lang-users

The source below is not the entire source, it's just what's required to see. The comments comes next to the sample.

dataprop INPUT (i:int, b:int) =
| FIRST(1, b)
| NEXT(i, b) of [pb:int] INPUT(i-1, pb)

sortdef hb00 = {b:int | 0x00 <= b && b <= 0x7F}
sortdef hbC2 = {b:int | 0xC2 <= b && b <= 0xDF}
/* … there is more here … */

sortdef tb80 = {b:int | 0x80 <= b && b <= 0xBF}
/* … there is more here … */

stadef added (c:int, b:int) = (c * 64) + (b - 0x80)

dataprop UTF8_00 (i:int, c:int) =
| {b:hb00; i == 1} UTF8_00_1(i, b) of INPUT(i, b)

dataprop UTF8_C2 (i:int, c:int) =
| {b:hbC2; i == 1} UTF8_C2_1(i, b - 0xC0) of INPUT(i, b)
| {b:tb80; i == 2} UTF8_C2_2(i, added(c, b)) of UTF8_C2(i-1, c) and INPUT(i, b)

/* … there is more here … */

The issue is on the line where I wanted to say:

[…] of UTF8_C2(i-1, c) and INPUT(i, b)

I get a syntax error here.

I would like to require two proofs to derive `UTF8_C2_2`. I don't know it I just don't know the syntax or if it's not feasible at all.

Note for those who want the understand the sample:

The i of INPUT is the index in an input stream abd the b, is the byte read at that index. The sort definitions, introduces bytes which belongs to some ranges: hb00 and hbC2 are two head bytes (coming first) and tb80 is a tail byte (coming after a head byte). UTF8_00 is for ASCII character in an UTF-8 stream, and UTF8_C2 is for two bytes character whose head byte belongs to 0xC2..0xDF. I'm using the data from http://www.unicode.org/versions/Unicode7.0.0/ch03.pdf at table 3-7.

I first written the specification for the decoding only, and it was OK. Then I wanted to express the relation between the decoding process and the order of bytes from a stream, and there, I failed due to an error which is either a syntax error or an impossible construct, as I don't know and that's the purpose of this thread.

.

Yannick Duchêne

unread,

Dec 11, 2015, 9:12:58 PM12/11/15

to ats-lang-users

Le samedi 12 décembre 2015 02:56:09 UTC+1, Yannick Duchêne a écrit :

I get a syntax error here.

I would like to require two proofs to derive `UTF8_C2_2`. I don't know it I just don't know the syntax or if it's not feasible at all.

I could guess it myself:

[…] of (UTF8_C2(i-1, c), INPUT(i, b))

This seems is OK.

Well, I don't mind if I posted this thread and guess so quickly right after, that may be an interesting example for readers, who knows… and perhaps an opportunity for comments.

Yannick Duchêne

unread,

Dec 12, 2015, 1:02:26 PM12/12/15

to ats-lang-users

Le samedi 12 décembre 2015 02:56:09 UTC+1, Yannick Duchêne a écrit :

dataprop INPUT (i:int, b:int) = | FIRST(1, b) | NEXT(i, b) of [pb:int] INPUT(i-1, pb)

[…]

Just to comment on this, I feel it's better to say:

dataprop DATA (i:int, b:int) =
| FIRST(1, b)
| NEXT(i, b) of [pb:int] DATA(i-1, pb)
| NO_DATA

`DATA` more means it may be an input, as much as an output (the decoding specification, specifies the encoding as much), and even not necessarily a stream, it may be an array, or some set of distinct variables. The only important thing, is that a byte at i makes sense and really means a byte at i, only if there was/is a byte at i-1 (for i greater than 1).

I'm just not happy with this:

[…] of [pb:int] DATA(i-1, pb)

… as `pb` (previous byte) is not necessary and could be a placeholder, and I don't how to express it or if it's feasible.

I also feel I will have a question about this dataprop and linear types (consumed proofs), but that's for later.

Yannick Duchêne

unread,

Dec 12, 2015, 1:19:17 PM12/12/15

to ats-lang-users

Le samedi 12 décembre 2015 19:02:26 UTC+1, Yannick Duchêne a écrit :

I'm just not happy with this:
[…] of [pb:int] DATA(i-1, pb)

… as `pb` (previous byte) is not necessary and could be a placeholder, and I don't how to express it or if it's feasible.

I also feel I will have a question about this dataprop and linear types (consumed proofs), but that's for later.

What about this…

dataprop EXISTS (i:int) =
| NONE
| FIRST(1)
| NEXT(i) of EXISTS(i-1)
and DATA (i:int, b:int) =
| ITEM(i, b) of EXISTS(i)

… however, perhaps too much convoluted

Yannick Duchêne

unread,

Dec 12, 2015, 1:32:17 PM12/12/15

to ats-lang-users

Le samedi 12 décembre 2015 19:19:17 UTC+1, Yannick Duchêne a écrit :

… however, perhaps too much convoluted

More than convoluted: unsafe. it does not mean the same, as it allows to derive anything at a given index.

Proof by an example:

extern praxi first(): (EXISTS(1))
prval pf:DATA(1, 0) = ITEM(first())
prval pf:DATA(1, 1) = ITEM(first())

… ouch, can say anything about the value of a byte at i.

While with the previous one:

dataprop DATA (i:int, b:int) =
| FIRST(1, b)
| NEXT(i, b) of [pb:int] DATA(i-1, pb)
| NO_DATA

extern praxi first(): (DATA(1, 23))
prval pf:DATA(1, 23) = first()
// prval pf:DATA(1, 45) = … can't, no way, and that's expected.

That would be a good case for a tutorial.

gmhwxi

unread,

Dec 12, 2015, 5:33:59 PM12/12/15

to ats-lang-users

I would suggest that you declare DATA as an abstract prop.
Instead of introducing constructors like FIRST and NEXT, you can introduce
axioms.

Reply all

Reply to author

Forward