The source below is not the entire source, it's just what's required to see. The comments comes next to the sample.
dataprop INPUT (i:int, b:int) =
| FIRST(1, b)
| NEXT(i, b) of [pb:int] INPUT(i-1, pb)
sortdef hb00 = {b:int | 0x00 <= b && b <= 0x7F}
sortdef hbC2 = {b:int | 0xC2 <= b && b <= 0xDF}
/* … there is more here … */
sortdef tb80 = {b:int | 0x80 <= b && b <= 0xBF}
/* … there is more here … */
stadef added (c:int, b:int) = (c * 64) + (b - 0x80)
dataprop UTF8_00 (i:int, c:int) =
| {b:hb00; i == 1} UTF8_00_1(i, b) of INPUT(i, b)
dataprop UTF8_C2 (i:int, c:int) =
| {b:hbC2; i == 1} UTF8_C2_1(i, b - 0xC0) of INPUT(i, b)
| {b:tb80; i == 2} UTF8_C2_2(i, added(c, b)) of UTF8_C2(i-1, c) and INPUT(i, b)
/* … there is more here … */
The issue is on the line where I wanted to say:
[…] of UTF8_C2(i-1, c) and INPUT(i, b)
I get a syntax error here.
I would like to require two proofs to derive `UTF8_C2_2`. I don't know it I just don't know the syntax or if it's not feasible at all.
Note for those who want the understand the sample:
The i of INPUT is the index in an input stream abd the b, is the byte read at that index. The sort definitions, introduces bytes which belongs to some ranges: hb00 and hbC2 are two head bytes (coming first) and tb80 is a tail byte (coming after a head byte). UTF8_00 is for ASCII character in an UTF-8 stream, and UTF8_C2 is for two bytes character whose head byte belongs to 0xC2..0xDF. I'm using the data from
http://www.unicode.org/versions/Unicode7.0.0/ch03.pdf at table 3-7.
I first written the specification for the decoding only, and it was OK. Then I wanted to express the relation between the decoding process and the order of bytes from a stream, and there, I failed due to an error which is either a syntax error or an impossible construct, as I don't know and that's the purpose of this thread.
.