Shen and nested datatypes

234 views
Skip to first unread message

alpha...@orange.fr

unread,
Dec 12, 2021, 1:44:14 PM12/12/21
to Shen


Hello fellow-SHEN-ers,

First i must say that i am an hobbist ocaml+coq programmer.
Thus, of course, what i will say is from an ocaml+coq programmer point of view.

To validate the assertion that shen typing is strictly more powerful than ocaml typing i have decided to demonstrate nested datatypes in shen.
The datatype bush example is from "Nested Datatypes", Bird & Meertens 1998.
The ocaml snippet reads like this :


type 'a bush =
   | L
   | R of 'a * 'a bush bush

let rec fold : 'a. ('a -> 'b -> 'b) -> 'a bush -> 'b -> 'b =
   fun f ab z -> match ab with
   | L -> z
   | R (a,abb) -> f a (fold (fold f) abb z)

let size ab = fold (fun _ n -> n + 1) ab 0


The equivalent shen snippet reads like this :


\* shen version *\
(datatype bush

   _____________
   [] : (bush A);

   L : A;
   R : (bush (bush A));
   ===================
   [L R] : (bush A);)

(define fold
   {(A --> B --> B) --> bush A --> B --> B}
   F [] Z -> Z
   F [A ABB] Z -> (F A (fold F) (ABB Z)))

\* how many A in a bush A data ? *\
(define size
   AB -> fold (/. _ N (+ N 1)) AB 0)
Exception in shen.constructor-error: [lambda _ [lambda N [+ N 1]]] is not a legitimate constructor



That is a somewhat daunting message for a beginner like me.
Thus now i wish :
* either a correction that solves my case (happy)
* or an acknowledgement that ML typing is sadly not strictly lesser than the shen one

Regards,

- damien

Bruno Deferrari

unread,
Dec 12, 2021, 1:46:49 PM12/12/21
to qil...@googlegroups.com
Hello Damien. You are missing a parentheses here. Should read:

(define size
  AB -> (fold (/. _ N (+ N 1)) AB 0))
 


That is a somewhat daunting message for a beginner like me.
Thus now i wish :
* either a correction that solves my case (happy)
* or an acknowledgement that ML typing is sadly not strictly lesser than the shen one

Regards,

- damien

--
You received this message because you are subscribed to the Google Groups "Shen" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+un...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/qilang/f519531a-5dce-4eef-b65c-fdeb16247449n%40googlegroups.com.


--
BD

alpha...@orange.fr

unread,
Dec 12, 2021, 1:56:40 PM12/12/21
to Shen
Hello Bruno,

Thanks. This the most obscure " parenthese  missing" message i have ever seen.
Shen is definitely impressive.

Mark Tarver

unread,
Dec 12, 2021, 4:09:14 PM12/12/21
to Shen

Your original definition actually balances in brackets so there is no missing parenthesis
and hence no parenthesis message.

However 

(define size
   AB -> fold (/. _ N (+ N 1)) AB 0)

has as the first rule

AB -> fold

and the second rule

(/. _ N (+ N 1)) AB 0

This was not your intent.

(/. _ N (+ N 1)) corresponds to what Shen would expect as the first part of the next rule.  
But this has no recognised constructor and hence your message.   

I'd advise writing (/. X N (+ N 1)).

Regarding your size function it does need a type if your want to run it under typed mode.

(define fold
   {(A --> B --> B) --> bush A --> B --> B}
   F [] Z -> Z
   F [A ABB] Z -> (F A (fold F) (ABB Z)))

The part in red needs to be parenthesised.  

This will take you forward some way.

Mark




alpha...@orange.fr

unread,
Dec 12, 2021, 5:10:02 PM12/12/21
to Shen
Well, this is an even worse puzzle than i expected because i have found a counter example for the ocaml code.

# size (R(L,R(R(R(9,L),L),L)));;
- : int = 2

Where obviously the correct answer is 1.
Any help ocaml or shen is warmly welcomed.

- damien

Mark Tarver

unread,
Dec 12, 2021, 7:23:27 PM12/12/21
to Shen
I salute your heroism in trying to run a program which doesn't work in a language you know 
in a language you have just begun to learn.  That takes cojones.  ;))

I wouldn't assume to know exactly whats going on here but at a guess I suppose you want to represent
a kind of recursively nested data type using a polytype type operator.  This might edge you closer.

(datatype bush

  __________________
  [] : (bush A);
 
  X : A; Y : (bush (bush A));
  ===========================
  [X | Y] : (bush A);)  

(12+) [1 [1 [2]]]
[1 [1 [2]]] : (bush number)

(13+) [1 [1 [2 [3]]]]
[1 [1 [2 [3]]]] : (bush number)

  
I'll leave the OCaml stuff to those who know.

Mark

alpha...@orange.fr

unread,
Dec 12, 2021, 7:34:06 PM12/12/21
to Shen
stupid me
(R(L,R(R(R(9,L),L),L)))  has type (bush (bush (bush int))) thus we count how many (bush (bush int)) and they are 2 :
L and  R(R(9,L),L)

Now i have a problem with the corrected fold :

(2+) (define fold
   {(A --> B --> B) --> (bush A) --> B --> B}
   F [] Z -> Z
   F [A ABB] Z -> (F A (fold (fold F)) ABB Z))
Exception in shen.t*-rules: type error in rule 2 of fold

Mark Tarver

unread,
Dec 12, 2021, 7:50:56 PM12/12/21
to Shen

That's an interesting version of Shen you've got; the error messages are somewhat different in wording, but ok,
you have a type error.

The bush datatype is quite interesting.  One interesting higher-order function is mapleaves which maps over the 
leaves of a bushy datatype.

(18+)  (define mapleaves
         {(A --> B) --> (bush A) --> (bush B)}
         _ [] -> []
         F [X | Bush] -> [(F X) | (mapleaves (mapleaves F) Bush)])
(fn mapleaves) : ((A --> B) --> ((bush A) --> (bush B)))

(19+) (mapleaves (+ 1) [1 [1 [2 [3]]]])
[2 [2 [3 [4]]]] : (bush number)


Perhaps this will inspire your experiments.

And so to bed, to sleep, perchance to dream.

Mark


 

alpha...@orange.fr

unread,
Dec 13, 2021, 6:29:06 AM12/13/21
to Shen
I use windows Shen 22.2 port 0.24 ported by Bruno Deferrari and i just discovered the online Book of Shen 4th edition.

(6+) (define map

   {(A --> B) --> (bush A) --> (bush B)}
   _ [] -> []
   F [A | ABB] -> [(F A) | (map (map F) ABB)])
Exception in shen.t*-rules: type error in rule 2 of map


I think, using a much simpler example, i can guess why shen does not type.

the ocaml version is :

# let rec polyrec : 'b. 'a -> 'b -> 'c = fun x y -> polyrec x x;;
val f : 'a -> 'b -> 'c = <fun>

the shen 22.2 version is :

(10+) (define polyrec
   {A --> B --> C}
   X Y -> (polyrec X X))
Exception in shen.t*-rules: type error in rule 1 of polyrec

(11+) (define polyrec
   {A --> A --> C}
   X Y -> (polyrec X X))
polyrec : (A --> (A --> C))

The ocaml type is more general than the shen type.
It seems that ocaml support polymorphic recursion whereas shen does not.

Mark Tarver

unread,
Dec 13, 2021, 6:31:02 AM12/13/21
to Shen
(0+) (define polyrec
   {A --> B --> C}
   X Y -> (polyrec X X))
(fn polyrec) : (A --> (B --> C))

no problem at all.

Mark

Mark Tarver

unread,
Dec 13, 2021, 7:23:13 AM12/13/21
to Shen
Also do not call it 'map' which is a system function.

M.

nha...@gmail.com

unread,
Dec 13, 2021, 6:25:44 PM12/13/21
to Shen
Related to this, here is what I'm using for an option/nullable datatype:

(datatype none

           _______________________
                  ? : (? A);

                    X : A;
               (collapse A A*);
        _____________________________
               (? X) : (? A*);

\\ Debatable if this rule should be in
\\        _____________________________
\\             X : (mode (? A) -);


                      !;
               (collapse X X*);
    ______________________________________
        (mode (collapse (? X) X*) -);

                (unify X X*);
         ____________________________
          (mode (collapse X X*) -);)


(let X (? 1) (? (? X)))  \\ 1 : (? number)
(lambda X (? (? X)))     \\ #<procedure> : (A --> (? A))
(? (? (? 1)))            \\ 1 : (? number)
?                        \\ ? : (? A)

Mark Tarver

unread,
Dec 13, 2021, 6:34:28 PM12/13/21
to qil...@googlegroups.com
The last time I saw something like that it had been run over.

Off to sleep.

M.

You received this message because you are subscribed to a topic in the Google Groups "Shen" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/qilang/O1yo_3WRUUI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to qilang+un...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/qilang/2c482344-9444-4154-b3ac-9553f9d5f9bbn%40googlegroups.com.

Mark Tarver

unread,
Dec 13, 2021, 6:43:45 PM12/13/21
to qil...@googlegroups.com
Anyhow I think you were doing something along the lines ....

(22+) (define fold

   {(A --> B --> B) --> (bush A) --> B --> B}
   F [] Z -> Z
   F [A | ABB] Z -> (F A (fold (fold F) ABB Z)))
(fn fold) : ((A --> (B --> B)) --> ((bush A) --> (B --> B)))

(23+) (fold (fn +)  [1 [1 [2 [3]]]] 0)
7 : number

(24+) (fold (/. X N (+ N 1))  [1 [1 [2 [3]]]] 0)
4 : number


That's it from me tonight.  Others in different time zones can take over.  Night all.

Mark

You received this message because you are subscribed to a topic in the Google Groups "Shen" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/qilang/O1yo_3WRUUI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to qilang+un...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/qilang/9a792e68-c1c1-4317-a28e-689620b19368n%40googlegroups.com.

Mark Tarver

unread,
Dec 13, 2021, 10:22:11 PM12/13/21
to qil...@googlegroups.com
Seriously - what's it supposed to do? ;)

M.

Bruno Deferrari

unread,
Dec 13, 2021, 10:42:40 PM12/13/21
to qil...@googlegroups.com
The code is incomplete (the `?` function for example is missing, and so are the rules to get an `A` from a `(? A)` ), but it implements nullable types. When a parameter type is `(? A)`, the function accepts either a value of type `A`, or the null value `?`.

although I think I like Neal's version better (especially the collapsing at the type level, because it matches how the values behave).

The boxed, non-collapsing alternative would be https://github.com/tizoc/shen-batteries/blob/master/maybe/maybe.shen which is like option types in ML and Haskell. I actually prefer this representation, something about nullables make me feel uneasy (I think it is the collapsing, makes it not compose well).



--
BD

Mark Tarver

unread,
Dec 13, 2021, 11:45:14 PM12/13/21
to Shen
The ocaml type is more general than the shen type.
It seems that ocaml support polymorphic recursion whereas shen does not. 

Basically 22.2 is running on a kernel that is about 10 years old - it corresponds to the first implementation of Shen
that was written in 2011 with some debugging and enhancements.  This is what you are using.

The S series kernels are a complete rebuild.  The issue is dealt with and footnoted on https://shenlanguage.org/TBoS/tbos_463.html.

Mark



On Monday, 13 December 2021 at 11:31:02 UTC Mark Tarver wrote:

Mark Tarver

unread,
Dec 14, 2021, 12:19:23 AM12/14/21
to Shen
Ah, OK - educational.  

 When a parameter type is `(? A)`, the function accepts either a value of type `A`, or the null value `?`.

Couldn't you represent this as 

X : A;
________
X : (- (? A));

_____
?  : (? A);

I guess the collapsing part is where the complexity lies in Neal's code.

Mark

Mark Tarver

unread,
Dec 14, 2021, 12:53:20 AM12/14/21
to Shen

I've been reading about nullable  types in Wikipedia  https://en.wikipedia.org/wiki/Nullable_type

Back in happier times when Willi walked the earth, he invented an object which he assigned to all types

__________
<any> : A;

He wanted <any> because for certain procedures he needed to return a nullary result whatever the type of the
result was.   I said rightly he couldn't do that w.o. propagating type errors in his program since his <any> could
be multiplied or concatenated with equal validity.    He was most reluctant to give up this treasured object.

But what he really wanted was an object of type (? A) and to return ?.  To be useful you need some dynamic test 
for determining ?.

(define  not-nullary?
    {A --> boolean}
     X -> (not (== X ?)))

And an L rule

X : A >> P;
_______________________________________
(not-nullary? X) : verified, X : (? A) >> P;

Thank you chaps for introducing that to me.   I really need to go back to sleep.

Mark

nha...@gmail.com

unread,
Dec 14, 2021, 4:00:32 AM12/14/21
to Shen
Nullable types have some stigma left over from C I think. Which can be seen in posts like this https://rpeszek.github.io/posts/2021-01-17-maybe-overuse.html (He got flamed for the article apparently, but there is an element of truth to it)

The "billion dollar mistake" in C is much worse because the type system there doesn't distinguish from (? A) and A; plus a pointer can become null at any time, or be cast from an integer, or the arithmetic can overflow etc.

Most of C's problems with NULL don't really apply here. However, nullable types tend to have a contagion effect when used inside a record, since it forces every single use of that field to require conditional handling.

'Maybe' vs 'nullable' is situational, since sometimes you want the nested values (just nothing) : (maybe (maybe A)) and sometimes you don't. It is similar to the situation that arises where you have to collapse a list of empty lists into [].

There is another type similar to nullable which has both an empty element and an annihilating element. This is sometimes useful for representing an insert/update/delete operation in a single function.

It would be better if the empty element '?' above was uniquely generated so it doesn't overlap with the value of '?' as a symbol.

This is also useful in practice (where the second branch of 'and' can depend on the verified status of the first branch):

(datatype verified-types

                             X : boolean;
                     X : verified >> Y : boolean;
               ________________________________________
                         (and X Y) : boolean;

                   X : verified, Y : verified >> P;
    ______________________________________________________________
                      (and X Y) : verified >> P;

                           Test : boolean;
                      Test : verified >> A : T;
                                B : T;
                    _____________________________
                          (if Test A B) : T;)

Mark Tarver

unread,
Dec 14, 2021, 6:47:38 AM12/14/21
to Shen
The "billion dollar mistake" in C is much worse because the type system there doesn't distinguish from (? A) and A; plus a pointer can become null at any time, or be cast from an integer, or the arithmetic can overflow etc.

That is bad indeed.

It would be quite useful to specify a type checker for C as I think the type theory could be interesting.

Mark

alpha...@orange.fr

unread,
Dec 14, 2021, 10:04:50 AM12/14/21
to Shen
Thanks Mark Tarver.
If my miseries are due to an outdated Shen version then where can i find a state-of-the-art Shen binary for windows 10 ?

Mark Tarver

unread,
Dec 14, 2021, 10:14:52 AM12/14/21
to qil...@googlegroups.com
Go to the Download page and load the latest SBL/Shen binary for Windows.

Mark

--
You received this message because you are subscribed to a topic in the Google Groups "Shen" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/qilang/O1yo_3WRUUI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to qilang+un...@googlegroups.com.

alpha...@orange.fr

unread,
Dec 14, 2021, 10:56:56 AM12/14/21
to Shen
Thanks Mark.

I have used the example from the "Nested Datatypes", Bird & Meertens 1998 paper :

(8+) [
 4
 [ 8 [5] [[3]] ]
 [ [7] [] [[[7]]] ]
 [ [ [] [[0]] ] ]
]
[4 [8 [5] [[3]]] [[7] [] [[[7]]]] [[[] [[0]]]]] : (bush number)

And everything is OK.

alpha...@orange.fr

unread,
Dec 14, 2021, 11:56:14 AM12/14/21
to Shen

I must say i am really impressed.
It took many years by an army of ocaml researchers/programmers to tackle nested datatypes and generalized algebraic datatypes. That makes improbable that one person can go the sequent calculus way and succeed in a lifetime.  

In Functional Programming forums ocaml/coq people say Shen is inadequate because typing eventually does not terminate. I disagree.

My current personal project is writing a CC-BY-SA ocaml book.
Thus i am rather ocaml oriented at the moment.
My next personal project is writing an Interactive Fiction authoring tool for text adventures.
I will then seriously consider Shen as a programming language.

nha...@gmail.com

unread,
Dec 14, 2021, 3:38:27 PM12/14/21
to Shen
I haven't found the non-termination to be a practical issue at all really, especially when developing interactively with a REPL. It isn't much worse than saying computer programs don't work because you can never really be sure if they will halt.

Someone also claimed earlier that Prolog doesn't scale for type checking. The scaling is on a per function basis, so I would imagine most human written functions to be nowhere near big enough to cause a speed issue.
I tested a 1000 line nested version of the below function and it loaded and typechecked in 8.625 secs with 2,130,042 inferences. This was in a long running REPL session with a 11795 command backlog and plenty of type rules declared.

(define slow
  { number --> number }
  X -> (let X (lambda X X)
          (let X (lambda X X)
             (let X (lambda X X)
                (let X (lambda X X)
                   1)))))


I guess people occasionally see functions 17,000 lines long in imperative languages, but that must be very rare in a functional programming language with pattern matching.

Also, things like the principle type property don't seem to matter much in practice, and I would not want to sacrifice any features for it. Demanding that the user annotate their toplevel functions with a type signature isn't a large burden, and it is often recommended for documentation purposes anyway. Maybe principle types may have a positive effect on development speed (and probably error messages), but it is hard to say.

The best criticism against Shen's type system is that error messages are bad and trace debugging with spy is slow and tedious. The debugger in SWI prolog is much better, and something like that could probably replace what we have now.

Mark Tarver

unread,
Dec 15, 2021, 1:48:17 AM12/15/21
to Shen
On Tuesday, 14 December 2021 at 20:38:27 UTC nha...@gmail.com wrote:
I haven't found the non-termination to be a practical issue at all really, especially when developing interactively with a REPL. It isn't much worse than saying computer programs don't work because you can never really be sure if they will halt.

Someone also claimed earlier that Prolog doesn't scale for type checking. The scaling is on a per function basis, so I would imagine most human written functions to be nowhere near big enough to cause a speed issue.
I tested a 1000 line nested version of the below function and it loaded and typechecked in 8.625 secs with 2,130,042 inferences. This was in a long running REPL session with a 11795 command backlog and plenty of type rules declared.

(define slow
  { number --> number }
  X -> (let X (lambda X X)
          (let X (lambda X X)
             (let X (lambda X X)
                (let X (lambda X X)
                   1)))))


I guess people occasionally see functions 17,000 lines long in imperative languages, but that must be very rare in a functional programming language with pattern matching.

One of the first programmers in 'Shen' technology was Andrew Adams back in 1992-1994 who programmed in SEQUEL.   Back then 
the standard was the SPARC I which clocked 15MHz.  He basically rewrote the entire Boyer-Moore ATP in SEQUEL which was about
6,000 lines of code (much shorter than the Lisp implementation) and won an M.Sc. with distinction.  I used to watch his program INDUCT
loading and it was 

boyer-moore * symbol -> boolean  
<pause for a second or two>
run-top-level * symbol symbol -> boolean

So you actually had time to go off and make a coffee while loading.   The SC compiler technology was way more basic than in Shen.  
I'd say something like 15 LIPS/MHz compared to 220 LIPS/MHz now.  Neal's example, had it been computable then (doubtful), would
have taken between 2-3 hours of computation.   

Conceiving that SC could actually be practically used by programmers was really quite outlandish.  You had to be a bit moon-struck to even attempt to do it
and even today most programmers don't understand the potential.   


Also, things like the principle type property don't seem to matter much in practice, and I would not want to sacrifice any features for it. Demanding that the user annotate their toplevel functions with a type signature isn't a large burden, and it is often recommended for documentation purposes anyway. Maybe principle types may have a positive effect on development speed (and probably error messages), but it is hard to say.

The best criticism against Shen's type system is that error messages are bad and trace debugging with spy is slow and tedious. The debugger in SWI prolog is much better, and something like that could probably replace what we have now.


On Tuesday, December 14, 2021 at 4:56:14 PM UTC alpha...@orange.fr wrote:

I must say i am really impressed.
It took many years by an army of ocaml researchers/programmers to tackle nested datatypes and generalized algebraic datatypes. That makes improbable that one person can go the sequent calculus way and succeed in a lifetime.  

In Functional Programming forums ocaml/coq people say Shen is inadequate because typing eventually does not terminate. I disagree.

My current personal project is writing a CC-BY-SA ocaml book.
Thus i am rather ocaml oriented at the moment.
My next personal project is writing an Interactive Fiction authoring tool for text adventures.
I will then seriously consider Shen as a programming language.

Sounds good.  TBoS due to its structure bearing role is a bit of a tome.    Another book would be free of the responsibility of defining the language standard.
Quite happy to place this on the front page news bar.

Mark

nha...@gmail.com

unread,
Oct 18, 2022, 6:14:42 PM10/18/22
to Shen
Today was the first time I've had issues with the type checker scaling poorly with large functions.

It was with a large typed 'defcc' function. Does the type checker not check each rule/branch independently here? 

Mark Tarver

unread,
Oct 18, 2022, 9:05:45 PM10/18/22
to Shen
No; but what does your definition look like?

Mark

nha...@gmail.com

unread,
Oct 19, 2022, 12:35:47 AM10/19/22
to Shen
I've split the definitions into smaller pieces which makes the problem less bad, but the file still takes 3,000,000 inferences to type check.

The type of the defcc is 
{ (list any) ==> (expr prolog) }

90% of the rules of the defcc function do nothing but reconstruct the parameters, which is necessary for the type system to verify that the expression is an (expr prolog). 

The defcc definition just restates the Shen prolog grammar, which includes all of KLambda + 10 Shen Prolog primitives.

It requires this dodgy rule to help type check it:

  (datatype @grammar-help@
    ______________________________________________________________
       X : (str (list any) B) >> X : (- (str (list (list A)) B));)

'Any' is defined just below, and (expr prolog) is defined via a macro that generates the types and some other boilerplate functions.

The full file is shown at the bottom.

 (datatype @any@ \\ NOTE: mutating via 'any' is unsound...

       __________________________________________________
                      eval : (any --> any);

     _______________________________________________________
                   intern : (string --> any);

     ______________________________________________________
                  macroexpand : (any --> any);

     _______________________________________________________
           <-address : (absvector --> number --> any);

                    ________________________
                      X : T >> X : (- any);

                            X : any;
                            Y : any;
                   ___________________________
                      (cons X Y) : (- any);


                           X : symbol;
                    ________________________
                          X : (- any);

                           X : number;
                    ________________________
                          X : (- any);

                           X : string;
                    ________________________
                          X : (- any);

                          X : boolean;
                    ________________________
                          X : (- any);

                    ________________________
                          [] : (- any);

                         X : (list any);
                 ______________________________
                          X : (- any);

                       X : any >> Y : any;
                _________________________________
                     (lambda X Y) : (- any);

                            X : any;
                _________________________________
                      (freeze X) : (- any);

                           X : symbol;
                  ____________________________
                      (value X) : (- any);

         ______________________________________________
             declare : (symbol --> any --> symbol);

         ______________________________________________
      update-lambda-table : (symbol --> number --> symbol);
                                )



(defexpr _ir prolog (string symbol number boolean)

   \\ System
   [fn X] :: ((X : (expr prolog))) 

   \\ Vars
   [0 X]       :: ((X : symbol))
   [1 X]       :: ((X : symbol))
   [receive X] :: ((X : symbol))
   [protect X] :: ((X : symbol))

   \\ Prolog
   [call X]    :: ((X : (expr prolog))) 
   [fork | XS] :: ((XS : (list (expr prolog))))
   [!]         :: ()
   [return X]  :: ((X : (expr prolog)))
   [when X]    :: ((X : (expr prolog)))
   [is X Y]    :: ((X : (expr prolog)) (Y : (expr prolog)))
   [is! X Y]   :: ((X : (expr prolog)) (Y : (expr prolog)))
   [bind X Y]  :: ((X : (expr prolog)) (Y : (expr prolog)))
   [var? X]    :: ((X : (expr prolog)))
   [findall X Y Z] :: ((X : (expr prolog))
                       (Y : (expr prolog))
                       (Z : (expr prolog)))

   \\ KL

   \\ Boolean Operations
   [if X Y Z]  :: ((X : (expr prolog)) (Y : (expr prolog)) (Z : (expr prolog)))
   [and X Y]   :: ((X : (expr prolog)) (Y : (expr prolog)))
   [or X Y]    :: ((X : (expr prolog)) (Y : (expr prolog)))
   [cond | XS] :: ((XS : (list (expr prolog))))

   \\ Lists
   [cons X Y]  :: ((X : (expr prolog)) (Y : (expr prolog)))
   [hd X]      :: ((X : (expr prolog)))
   [tl X]      :: ((X : (expr prolog)))
   [cons? X]   :: ((X : (expr prolog)))
   []          :: ()

   \\ Symbols
   [intern X]    :: ((X : (expr prolog)))

   \\ Strings
   [pos X Y]    :: ((X : (expr prolog)) (Y : (expr prolog)))
   [tlstr X]    :: ((X : (expr prolog)))
   [cn X Y]     :: ((X : (expr prolog)) (Y : (expr prolog)))
   [str X]      :: ((X : (expr prolog)))

   [string? X]      :: ((X : (expr prolog)))
   [n->string X]    :: ((X : (expr prolog)))
   [string->n X]    :: ((X : (expr prolog)))

   \\ Assignments
   [set X Y]    :: ((X : (expr prolog)) (Y : (expr prolog)))
   [value X]    :: ((X : (expr prolog)))

   \\ Error Handling
   [simple-error X] :: ((X : (expr prolog)))
   [trap-error X Y]    :: ((X : (expr prolog)) (Y : (expr prolog)))
   [error-to-string X] :: ((X : (expr prolog)))

   \\ Generic
   [defun X Y Z]    :: ((X : symbol)
                        (Y : (list symbol))
                        (Z : (expr prolog)))

   [lambda X Y]    :: ((X : symbol)
                       (Y : (expr prolog)))

   [let X Y Z]    :: ((X : symbol)
                      (Y : (expr prolog))
                      (Z : (expr prolog)))

   [= X Y]    :: ((X : (expr prolog)) (Y : (expr prolog)))

   [eval-kl X]    :: ((X : (expr prolog)))

   [freeze X]    :: ((X : (expr prolog)))

   [type X Y]    :: ((X : (expr prolog)) (Y : (expr prolog)))

   \\ Vector

   [absvector X]    :: ((X : (expr prolog)))
   [address-> X Y Z]    :: ((X : (expr prolog)) (Y : (expr prolog)) (Z : (expr prolog)))
   [<-address X Y]    :: ((X : (expr prolog)) (Y : (expr prolog)))
   [absvector? X]    :: ((X : (expr prolog)))

   \\ IO

   [write-byte X Y] :: ((X : (expr prolog)) (Y : (expr prolog)))
   [read-byte X]    :: ((X : (expr prolog)))
   [open X Y]       :: ((X : (expr prolog)) (Y : (expr prolog)))
   [close X]        :: ((X : (expr prolog)))

   \\ Time
   [get-time X Y]       :: ((X : (expr prolog)) (Y : (expr prolog)))

   \\ Number
   [+ X Y] :: ((X : (expr prolog)) (Y : (expr prolog)))
   [- X Y] :: ((X : (expr prolog)) (Y : (expr prolog)))
   [* X Y] :: ((X : (expr prolog)) (Y : (expr prolog)))
   [/ X Y] :: ((X : (expr prolog)) (Y : (expr prolog)))
   [> X Y] :: ((X : (expr prolog)) (Y : (expr prolog)))
   [< X Y] :: ((X : (expr prolog)) (Y : (expr prolog)))
   [>= X Y] :: ((X : (expr prolog)) (Y : (expr prolog)))
   [<= X Y] :: ((X : (expr prolog)) (Y : (expr prolog)))
   [number? X] :: ((X : (expr prolog)))

   \\ Apply
   [call-static X | XS] :: ((X : symbol) (XS : (list (expr prolog))))
   [call-dynamic X | XS] :: ((X : (expr prolog)) (XS : (list (expr prolog))))
   )

-----------------


(module (_pl
         (import-all _ir)
         (import-all expr)
         (import-all any)
         (import-all values)
         (import-qualified list)

         (private-import [atom? register-prolog-function linearize])

         (export function [prolog-compile
                           <head>
                           <wildcard>
                           <hterm1>
                           <hterm2>
                           <hterm>
                           <inners>
                           <inner2>
                           <inner3>
                           <inner>
                           <outer>
                           <prolog-operator>
                           <math-operator>
                           <symbol-operator>
                           <string-operator>
                           <vector-operator>
                           <list-operator>
                           <system-operator>
                           <boolean-operator>
                           <assignment-operator>
                           <generic-operator>
                           <exception-operator>
                           <atomic>
                           <body>
                           <clauses>
                           <clause>
                           <sc>
                           <binder>
                           <binders>
                           <name>
                           <defprolog>])

         (local symbol [@grammar-help@])

         (local function [arity-check
                          linearize-rule equate equivalences
                          pac-h
                          prolog-keyword?
                          semicolon?
                          compile-clauses]))


 (datatype @grammar-help@
    ______________________________________________________________
     X : (str (list any) B) >> X : (- (str (list (list A)) B));)

 (define semicolon?
   { A --> boolean }
   X -> (== X ;))

 (define arity-check
   { symbol --> (list (expr #defprolog.rule)) --> number }
   _ [[#defprolog.rule H B]] -> (length H)
   F [[#defprolog.rule H B] | Clauses] -> (do (pac-h F (length H) Clauses)
                                              (length H)))


 (define pac-h
   { symbol --> number --> (list (expr #defprolog.rule)) --> boolean }
   _ _ [] -> true
   F N [[#defprolog.rule H _] | Clauses] -> (if (= N (length H))
                                                (pac-h F N Clauses)
                                                (error "arity error in prolog procedure ~A~%" F)))

 (define equate
   { symbol --> symbol
            --> (expr prolog) }

   Y X -> (if (value shen.*occurs*) [is! Y X] [is Y X]))

 (define equivalences
   { (list (expr prolog)) --> (pair symbol (list symbol))
                          --> (list (expr prolog)) }

   \\ Tail [Symbol|[E]]    -> Tail  \\ TODO: don't generate useless equivalences
   Tail [Symbol|ES] -> (list.map-append (equate Symbol) ES Tail))



 (define linearize-rule
   { (expr #defprolog.rule) --> (expr #defprolog.rule) }
   [#defprolog.rule PS A] -> (let* (Eqv PS*) (list.foldl-map (/. B A (linearize B A)) PS [])
                                 (let A* (list.foldl (fn equivalences) Eqv A)
                                    [#defprolog.rule PS* A*])))

 (define prolog-keyword?
   { A --> boolean }
   X -> (element? X [; <--]) where (symbol? X)
   _ -> false)

 (defcc <head>
   { (list any) ==> (list (expr prolog.pattern)) }
   <hterm> <head> := [<hterm> | <head>];
   <e> := [];)

 (defcc <wildcard> \\ TODO: this causes prolog compiler to complain about singleton variables
   { (list any) ==> symbol }
   X := (gensym (protect Y))  where (== X _);)

 (defcc <hterm1>
   { (list any) ==> (expr prolog.pattern) }
   <hterm>;)

 (defcc <hterm2>
   { (list any) ==> (expr prolog.pattern) }
   <hterm>;)


 (defcc <hterm>
   { (list any) ==> (expr prolog.pattern) }

   <wildcard> := <wildcard>;
   X := X where (string? X);
   X := X where (number? X);
   X := X where (boolean? X);
   X := X where (and (symbol? X)
                     (not (prolog-keyword? X)));

   [] := [];
   \\ X := X  where (= X (intern ":")); TODO: is ':' not a symbol?

   [cons <hterm1> <hterm2>] := [cons <hterm1> <hterm2>];
   [+ <hterm>] := [+m <hterm>];
   [- <hterm>] := [-m <hterm>];

   [mode <hterm> +] := [+m <hterm>];
   [mode <hterm> -] := [-m <hterm>];)


 (defcc <inners>
   { (list any) ==> (list (expr prolog)) }
   <inner> <inners> := [<inner> | <inners>];
   <e> := [];)


 (defcc <inner2>
   { (list any) ==> (expr prolog) }
   <inner>;)

 (defcc <inner3>
   { (list any) ==> (expr prolog) }
   <inner>;)

 (defcc <binder>
   { (list any) ==> symbol }
   X := X where (symbol? X);)

 (defcc <name>
   { (list any) ==> symbol }
   X := X where (symbol? X);)

 (defcc <binders>
   { (list any) ==> (list symbol) }
   <binder> <binders> := [<binder> | <binders>];
   <e> := [];)


 (defcc <prolog-operator>
   { (list any) ==> (expr prolog) }

   [call <inner>] := [call <inner>];
   [fork <inners>] := [fork | <inners>];
   [return <inner>] := [return <inner>];
   [when <inner>] := [when <inner>];
   [is <inner> <inner2>] := [is <inner> <inner2>];
   [is! <inner> <inner2>] := [is! <inner> <inner2>];
   [bind <inner> <inner2>] := [bind <inner> <inner2>];
   [var? <inner>] := [var? <inner>];
   [findall <inner> <inner2> <inner3>] := [findall <inner> <inner2> <inner3>];)

 (defcc <math-operator>
   { (list any) ==> (expr prolog) }
   \\ Number
   [+ <inner> <inner2>] := [+ <inner> <inner2>];
   [- <inner> <inner2>] := [- <inner> <inner2>];
   [* <inner> <inner2>] := [* <inner> <inner2>];
   [/ <inner> <inner2>] := [/ <inner> <inner2>];
   [> <inner> <inner2>] := [> <inner> <inner2>];
   [< <inner> <inner2>] := [< <inner> <inner2>];
   [>= <inner> <inner2>] := [>= <inner> <inner2>];
   [<= <inner> <inner2>] := [<= <inner> <inner2>];
   [number? <inner>] := [number? <inner>];)

 (defcc <vector-operator>
   { (list any) ==> (expr prolog) }

   [absvector <inner>] := [absvector <inner>];
   [address-> <inner> <inner2> <inner3>] := [address-> <inner> <inner2> <inner3>];
   [<-address <inner> <inner2>] := [<-address <inner> <inner2>];
   [absvector? <inner>] := [absvector? <inner>];)

 (defcc <list-operator>
   { (list any) ==> (expr prolog) }

   [hd <inner>] := [hd <inner>];
   [tl <inner>] := [tl <inner>];
   [cons <inner> <inner2>] := [cons <inner> <inner2>];
   [cons? <inner>] := [cons? <inner>];)

 (defcc <symbol-operator>
   { (list any) ==> (expr prolog) }
   [intern <inner>] := [intern <inner>];)

 (defcc <string-operator>
   { (list any) ==> (expr prolog) }
   [pos <inner> <inner2>] := [pos <inner> <inner2>];
   [tlstr <inner>] := [tlstr <inner>];
   [cn <inner> <inner2>] := [cn <inner> <inner2>];
   [str <inner>] := [str <inner>];
   [string? <inner>] := [string? <inner>];
   [n->string <inner>] := [n->string <inner>];
   [string->n <inner>] := [string->n <inner>];)

 (defcc <system-operator>
   { (list any) ==> (expr prolog) }
   \\ IO
   [write-byte <inner> <inner2>] := [write-byte <inner> <inner2>];
   [read-byte <inner>] := [read-byte <inner>];
   [open <inner> <inner2>] := [open <inner> <inner2>];
   [close <inner>] := [close <inner>];

   \\ Time
   [get-time <inner> <inner2>] := [get-time <inner> <inner2>];)


 (defcc <boolean-operator>
   { (list any) ==> (expr prolog) }

   [if <inner> <inner2> <inner3>] := [if <inner> <inner2> <inner3>];
   [and <inner> <inner2>] := [and <inner> <inner2>];
   [or <inner> <inner2>] := [or <inner> <inner2>];
   [cond <inners>] := [cond | <inners>];)


 (defcc <assignment-operator>
   { (list any) ==> (expr prolog) }

   [set <inner> <inner2>] := [set <inner> <inner2>];
   [value <inner>] := [value <inner>];)

 (defcc <exception-operator>
   { (list any) ==> (expr prolog) }

   [simple-error <inner>] := [simple-error <inner>];
   [trap-error <inner> <inner2>] := [trap-error <inner> <inner2>];
   [error-to-string <inner>] := [error-to-string <inner>];)

 (defcc <generic-operator>
   { (list any) ==> (expr prolog) }

   [defun <name> [<binders>] <inner>] := [defun <name> <binders> <inner>];
   [lambda <binder> <inner>] := [lambda <binder> <inner>];
   [let <binder> <inner> <inner2>] := [let <binder> <inner> <inner2>];
   [= <inner> <inner2>] := [= <inner> <inner2>];
   [eval-kl <inner>] := [eval-kl <inner>];
   [freeze <inner>] := [freeze <inner>];
   [type <inner> <inner2>] := [type <inner> <inner2>];)


 (defcc <atomic>
   { (list any) ==> (expr prolog) }

   [0 X] := [0 X] where (symbol? X);
   [1 X] := [1 X] where (symbol? X);
   [receive X] := [receive X] where (symbol? X);
   [protect X] := [protect X] where (symbol? X);

   <wildcard> := <wildcard>;
   [] := [];
   X := X where (string? X);
   X := X where (number? X);
   X := X where (boolean? X);
   X := X where (symbol? X);)

 (defcc <inner>
   { (list any) ==> (expr prolog) }

   <math-operator>       := <math-operator>;
   <symbol-operator>     := <symbol-operator>;
   <string-operator>     := <string-operator>;
   <vector-operator>     := <vector-operator>;
   <list-operator>       := <list-operator>;
   <system-operator>     := <system-operator>;
   <boolean-operator>    := <boolean-operator>;
   <assignment-operator> := <assignment-operator>;
   <generic-operator>    := <generic-operator>;
   <exception-operator>  := <exception-operator>;
   <atomic>              := <atomic>;
   <outer>               := <outer>;)


 (defcc <outer>
   { (list any) ==> (expr prolog) }

   <prolog-operator> := <prolog-operator>;

   \\ System
   [fn <inner>] := [fn <inner>];

   \\ Apply
   [[protect <inner>] <inners>] := [call-static <inner> | <inners>]
     where (symbol? <inner>);

   [<inner> <inners>] := (if (and (symbol? <inner>) (not (variable? <inner>)))
                             [call-static <inner> | <inners>]
                             [call-dynamic <inner> | <inners>]);

   )


 (defcc <body>
   { (list any) ==> (list (expr prolog)) }
   ! <body> := [[!] | <body>];
   <outer> <body> := [<outer> | <body>];
   <e> := [];)

 (defcc <clauses>
   { (list any) ==> (list (expr #defprolog.rule)) }
   <clause> <clauses> := [<clause> | <clauses>];
   <!> := (if (empty? <!>) [] (error "Prolog syntax error here:~% ~R~% ..." <!>));)


 (defcc <clause>
   { (list any) ==> (expr #defprolog.rule) }
   <head> <-- <body> <sc> := [#defprolog.rule <head> <body>];)


 (defcc <sc>
   { (list A) ==> symbol }
   X := X where (and (symbol? X)
                     (semicolon? X));)


 (define compile-clauses
   { symbol --> (list (expr #defprolog.rule))
            --> (expr #defprolog) }
   F Clauses -> (let Arity      (arity-check F Clauses)
                     LeftLinear (list.map (fn linearize-rule) Clauses)
                   (do
                    (register-prolog-function F Arity)
                    [#defprolog F Arity | LeftLinear])))

 (defcc <defprolog>
   { (list any) ==> (expr #defprolog) }
   F <clauses> := (compile-clauses F <clauses>)
     where (symbol? F);)

 (define prolog-compile
   { any --> (expr #defprolog) }
   [defprolog | XS] -> (compile (/. X (<defprolog> X)) XS)
     where (list? XS))

 )

Mark Tarver

unread,
Oct 19, 2022, 7:28:55 PM10/19/22
to Shen
I cannot run this unfortunately because it uses (macro driven?) constructions
outside the language definition and stlib.

Mark

Mark Tarver

unread,
Oct 22, 2022, 7:45:32 AM10/22/22
to Shen
| Does the type checker not check each rule/branch independently here? 

The present version does not - it generates a Shen function with one rule.  
I have a beta version YACC here which does that and compiles n YACC rules
into n Shen rules using non-deterministic Shen with <- and fail-if.  It's also 60 
lines shorter than the current version.

Bad news is that it does not result in fewer inferences on the test file but 
more.

Mark


Mark Tarver

unread,
Oct 22, 2022, 8:08:28 AM10/22/22
to Shen
For example your <inner>  of 14 loc.

(defcc <inner>
   { (list any) ==> (expr prolog) }
   <math-operator>       := <math-operator>;
   <symbol-operator>     := <symbol-operator>;
   <string-operator>     := <string-operator>;
   <vector-operator>     := <vector-operator>;
   <list-operator>       := <list-operator>;
   <system-operator>     := <system-operator>;
   <boolean-operator>    := <boolean-operator>;
   <assignment-operator> := <assignment-operator>;
   <generic-operator>    := <generic-operator>;
   <exception-operator>  := <exception-operator>;
   <atomic>              := <atomic>;
   <outer>               := <outer>;)

compiles under the current Shen to 173 loc

(define <inner>
 V40241 ->
 (let Result
  (let Parse<math-operator>
   (
    (fn <math-operator>) V40241)
   (if
    (shen.parse-failure? Parse<math-operator>)
    (shen.parse-failure)
    (shen.comb
     (shen.in-> Parse<math-operator>)
     (type
      (shen.<-out Parse<math-operator>)
      (expr prolog)))))
  (if
   (shen.parse-failure? Result)
   (let Result
    (let Parse<symbol-operator>
     (
      (fn <symbol-operator>) V40241)
     (if
      (shen.parse-failure? Parse<symbol-operator>)
      (shen.parse-failure)
      (shen.comb
       (shen.in-> Parse<symbol-operator>)
       (type
        (shen.<-out Parse<symbol-operator>)
        (expr prolog)))))
    (if
     (shen.parse-failure? Result)
     (let Result
      (let Parse<string-operator>
       (
        (fn <string-operator>) V40241)
       (if
        (shen.parse-failure? Parse<string-operator>)
        (shen.parse-failure)
        (shen.comb
         (shen.in-> Parse<string-operator>)
         (type
          (shen.<-out Parse<string-operator>)
          (expr prolog)))))
      (if
       (shen.parse-failure? Result)
       (let Result
        (let Parse<vector-operator>
         (
          (fn <vector-operator>) V40241)
         (if
          (shen.parse-failure? Parse<vector-operator>)
          (shen.parse-failure)
          (shen.comb
           (shen.in-> Parse<vector-operator>)
           (type
            (shen.<-out Parse<vector-operator>)
            (expr prolog)))))
        (if
         (shen.parse-failure? Result)
         (let Result
          (let Parse<list-operator>
           (
            (fn <list-operator>) V40241)
           (if
            (shen.parse-failure? Parse<list-operator>)
            (shen.parse-failure)
            (shen.comb
             (shen.in-> Parse<list-operator>)
             (type
              (shen.<-out Parse<list-operator>)
              (expr prolog)))))
          (if
           (shen.parse-failure? Result)
           (let Result
            (let Parse<system-operator>
             (
              (fn <system-operator>) V40241)
             (if
              (shen.parse-failure? Parse<system-operator>)
              (shen.parse-failure)
              (shen.comb
               (shen.in-> Parse<system-operator>)
               (type
                (shen.<-out Parse<system-operator>)
                (expr prolog)))))
            (if
             (shen.parse-failure? Result)
             (let Result
              (let Parse<boolean-operator>
               (
                (fn <boolean-operator>) V40241)
               (if
                (shen.parse-failure? Parse<boolean-operator>)
                (shen.parse-failure)
                (shen.comb
                 (shen.in-> Parse<boolean-operator>)
                 (type
                  (shen.<-out Parse<boolean-operator>)
                  (expr prolog)))))
              (if
               (shen.parse-failure? Result)
               (let Result
                (let Parse<assignment-operator>
                 (
                  (fn <assignment-operator>) V40241)
                 (if
                  (shen.parse-failure? Parse<assignment-operator>)
                  (shen.parse-failure)
                  (shen.comb
                   (shen.in-> Parse<assignment-operator>)
                   (type
                    (shen.<-out Parse<assignment-operator>)
                    (expr prolog)))))
                (if
                 (shen.parse-failure? Result)
                 (let Result
                  (let Parse<generic-operator>
                   (
                    (fn <generic-operator>) V40241)
                   (if
                    (shen.parse-failure? Parse<generic-operator>)
                    (shen.parse-failure)
                    (shen.comb
                     (shen.in-> Parse<generic-operator>)
                     (type
                      (shen.<-out Parse<generic-operator>)
                      (expr prolog)))))
                  (if
                   (shen.parse-failure? Result)
                   (let Result
                    (let Parse<exception-operator>
                     (
                      (fn <exception-operator>) V40241)
                     (if
                      (shen.parse-failure? Parse<exception-operator>)
                      (shen.parse-failure)
                      (shen.comb
                       (shen.in-> Parse<exception-operator>)
                       (type
                        (shen.<-out Parse<exception-operator>)
                        (expr prolog)))))
                    (if
                     (shen.parse-failure? Result)
                     (let Result
                      (let Parse<atomic>
                       (
                        (fn <atomic>) V40241)
                       (if
                        (shen.parse-failure? Parse<atomic>)
                        (shen.parse-failure)
                        (shen.comb
                         (shen.in-> Parse<atomic>)
                         (type
                          (shen.<-out Parse<atomic>)
                          (expr prolog)))))
                      (if
                       (shen.parse-failure? Result)
                       (let Result
                        (let Parse<outer>
                         (
                          (fn <outer>) V40241)
                         (if
                          (shen.parse-failure? Parse<outer>)
                          (shen.parse-failure)
                          (shen.comb
                           (shen.in-> Parse<outer>)
                           (type
                            (shen.<-out Parse<outer>)
                            (expr prolog)))))
                        (if
                         (shen.parse-failure? Result)
                         (shen.parse-failure) Result)) Result)) Result)) Result)
) Result)) Result)) Result)) Result)) Result)) Result)) Result)) Result)))

Whereas under the experimental model we get 159 loc

(define <inner>
{(str (list any) C40273) --> (str (list any) (expr prolog))}
Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<math-operator>
   (<math-operator> Stream)
   (if
    (parse-failure? Parse<math-operator>)
    (parse-failure)
    (comb
     (in-> Parse<math-operator>)
     (<-out Parse<math-operator>)))))
     
 Stream <-
 (fail -if
  (/. X
   (parse-failure? X))
  (let Parse<symbol-operator>
   (<symbol-operator> Stream)
   (if
    (parse-failure? Parse<symbol-operator>)
    (parse-failure)
    (comb
     (in-> Par se<symbol-operator>)
     (<-out Parse<symbol-operator>)))))
     
 Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<string-operator>
   (<string-operator> Stream)
   (if
    (parse-failure? Parse<string-operator>)
    (parse-failure)
    (comb
     (in-> Parse<string -operator>)
     (<-out Parse<string-operator>)))))
     
 Stream <-
 (fail-if
  (/. X
   (parse-f ailure? X))
  (let Parse<vector-operator>
   (<vector-operator> Stream)
   (if
    (parse-fa ilure? Parse<vector-operator>)
    (parse-failure)
    (comb
     (in-> Parse<vector-operator >)
     (<-out Parse<vector-operator>)))))
     
 Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<list-operator>
   (<list-operator> Stream)
   (if
    (parse-failure? Parse< list-operator>)
    (parse-failure)
    (comb
     (in-> Parse<list-operator>)
     (<-out Parse<list-operator>)))))
     
 Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<syste m-operator>
   (<system-operator> Stream)
   (if
    (parse-failure? Parse<system-operator>)
    (parse-failure)
    (comb
     (in-> Parse<system-operator>)
     (<-out Parse<system-opera tor>)))))
     
 Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<boolean-operator>
   (<boolean-operator> Stream)
   (if
    (parse-failure? Parse<boolean-operator>)
    (pa rse-failure)
    (comb
     (in-> Parse<boolean-operator>)
     (<-out Parse<boolean-operator>)))))
     
Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<assignment-operator>
   (<assignment-operator> Stream)
   (if
    (parse-failure? Parse<assignment-operator>)
    (parse-failure)
    (comb
     (in-> Parse<assignment-operator>)
     (<-out Parse<assignment-operator>)))))
     
 Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<generic -operator>
   (<generic-operator> Stream)
   (if
    (parse-failure? Parse<generic-operator>)
    (parse-failure)
    (comb
     (in-> Parse<generic-operator>)
     (<-out Parse<generic-op erator>)))))
     
Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<exception-operator>
   (<exception-operator> Stream)
   (if
    (parse-failure? Parse<exception-opera tor>)
    (parse-failure)
    (comb
     (in-> Parse<exception-operator>)
     (<-out Parse<exception-operator>)))))
     
Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<atomi c>
   (<atomic> Stream)
   (if
    (parse-failure? Parse<atomic>)
    (parse-failure)
    (comb
     (in-> Parse<atomic>)
     (<-out Parse<atomic>)))))
     
Stream <-
 (fail-if
  (/. X
   (parse-failure? X))
  (let Parse<outer>
   (<outer> Stream)
   (if
    (parse-failure? Parse<outer>)
    (parse-failure)
    (comb
     (in-> Parse<outer>)
     (<-out Parse<outer>)))))

     
 _ -> (parse-failure))

Note that the object YACC code is still about an order of magnitude larger than 
the source code, so you get a lot of work for the type checker.

Mark

nha...@gmail.com

unread,
Oct 22, 2022, 2:34:42 PM10/22/22
to Shen
My example with <inner> is the refactored version - the original had all those sub-clauses merged into <inner>, so it was really quite slow.

Anyway, what is the benefit of type checking the generated yacc code? Seems better to type check closer to the input source for performance and user experience reasons (stepping through compiler boilerplate with 'spy' sucks). 

There are several things that seem to point to Shen needing an extra compilation phase (I wanted to add it to my port, but I think it will break compatibility with Shen too much, and I really want to resist the urge to fork).

What I'm thinking is to have the macro layer desugar all top-level forms ('defprolog', 'datatype', 'defcc', and 'define') into four different (more lispy) intermediate representations. The typechecker then gets applied directly to these new representations, then shen->kl dispatches to the correct backend compiler for each top-level form (you could also allow the user to extend this via a similar mechanism to defmacro).    

The benefits of this are:
- T* is simplified, and probably faster, because it doesn't have to call yacc to parse 'define'
- Generating datatypes and functions with macros becomes cleaner.
- The currying code is maybe simplified, slightly, because functions are already parsed into a more convenient format.
- Yacc code can be type checked nearer it's original form, before all the low level boilerplate is generated.
- Toplevel forms don't disappear after macro expansion (this is a nuisance when generating .kl files from .shen during bootstrapping, particularly if you want to export datatype and macro definitions as well). 
- The prolog backend can be swapped out easier (what I'm working on atm). 

Mark Tarver

unread,
Oct 22, 2022, 3:57:06 PM10/22/22
to qil...@googlegroups.com
Anyway, what is the benefit of type checking the generated yacc code? Seems better to type check closer to the input source for performance and user experience reasons (stepping through compiler boilerplate with 'spy' sucks). 

The problem is that you have to prove the correctness of the type checking process because you are effectively checking a 
different language.  

Mark

--
You received this message because you are subscribed to a topic in the Google Groups "Shen" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/qilang/O1yo_3WRUUI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to qilang+un...@googlegroups.com.

Mark Tarver

unread,
Oct 23, 2022, 2:02:02 PM10/23/22
to Shen
| There are several things that seem to point to Shen needing an extra compilation phase (I wanted to add it to my port, but I think it will break compatibility with Shen too much, and I really want to 
| resist the urge to fork).

As far as that goes, I don't mind if people want to fork, the original is established and anyway there
are a number of pre-S kernels that are not updated.   I just think people need to understand what is
involved in forking.  In your case what you need to do, if you want to create a type checking procedure
for YACC, is to produce a math'l proof that your algorithm gives the same results as the current version.
In that event your work would lean on my  correctness results in TBoS.

Any other course requires a formal semantics for YACC and proofs of subject reduction etc. of the kind 
in TBoS and these are not trivial.

M.

nha...@gmail.com

unread,
Oct 24, 2022, 3:28:08 AM10/24/22
to Shen
I was thinking that even if the fork was source-code level incompatible with Shen, it would still be compatible at the KL backend level, so code could still interoperate easily.

The downside would be having a duplicate implementation of Shen included, but your implementation is less than 4k LOC and has no external dependencies so it isn't really so much extra bloat.

Picking a good time to fork, minimising the work required to merge future features of Shen, may be a bit tricky too. What is the current roadmap for Shen?

Mark Tarver

unread,
Oct 24, 2022, 6:44:35 AM10/24/22
to qil...@googlegroups.com
I need to complete Yggdrasil.  

M.

You received this message because you are subscribed to the Google Groups "Shen" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+un...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/qilang/8f75e9cd-e057-4304-be24-2a1a51b66f9cn%40googlegroups.com.

nha...@gmail.com

unread,
Oct 24, 2022, 7:51:30 AM10/24/22
to Shen
Btw, why can't you change the yacc type system to something like this:

(defcc b 
  { (list symbol) --> (yacc number) }
  hi := 1;)

(defcc a
  { (list symbol) --> (yacc number) }
  <b> := (+ <b> 1);)

Parameter <b> gets introduced as type 'number' instead of (yacc number) when type checking the right hand side. The result in tail position on the right hand side gets implicitly lifted from A to (yacc A). 

'compile' then has type { (list A) --> (yacc B) --> B }.

The type of the input stream is constant, the return type of each branch of a yacc function are all the same, and backtracking can't effect the return type.

Some of the monadic parsers in Haskell have similar types as above.

Message has been deleted

nha...@gmail.com

unread,
Oct 24, 2022, 7:56:36 AM10/24/22
to Shen
 Sorry I mean compile is { ((list A) --> (yacc B)) --> (list A) --> B }

Mark Tarver

unread,
Oct 24, 2022, 8:29:00 AM10/24/22
to qil...@googlegroups.com
As far as the road map goes, after Yggdrasil and some other utilities wrt formal methods, a lot of my effort will go into the 
education channel.  So I'll be doing more speaking and less coding.

Btw, why can't you change the yacc type system to something like this:
 
(defcc <b> 
  { (list symbol) --> (yacc number) }
  hi := 1;)

Well <b> isn't a function that takes a list of symbols for one thing.  As TBoS says, the input to a YACC function is a YACC structure

M.







nha...@gmail.com

unread,
Oct 24, 2022, 9:37:17 AM10/24/22
to Shen
Why do any of the implementation details matter from the perspective of defcc though? The type signature for <b> isn't under any obligation to match whatever underlying code is generated from the compiler.

Maybe ((list symbol) --> (yacc number)) isn't good, because it's not opaque enough to the outside world, but ((list symbol) ==> number) like you have now is fine.

The trick is to not expand it to (str A B) and leave it undefined. The only thing the user can then do with a ((list symbol) ==> number) object is call 'compile' on it.


Now, for practical purposes, do you really need to care about anything other than: 
1.) The symbol <b> has type 'number' in the body of <a>
2.) That the result of the right hand side of := matches the output type (number).
3.) The symbol <b> has the appropriate yacc type ((list symbol) ==> number) on the left hand side of :=  

Mark Tarver

unread,
Oct 24, 2022, 11:12:56 AM10/24/22
to Shen
| The trick is to not expand it to (str A B) and leave it undefined.

But as said above, in order to do this (==> being primitive) you need a different type checking algorithm and proof of correctness of aforesaid.   

M.

Mark Tarver

unread,
Oct 24, 2022, 11:41:26 AM10/24/22
to Shen
I'm not saying that this is not doable btw; I'm just saying it has to be done.
The best (= easiest) method IMO is as I set out above.

M.

Mark Tarver

unread,
Oct 24, 2022, 8:03:27 PM10/24/22
to qil...@googlegroups.com
What exactly does the program do?

Mark

--
You received this message because you are subscribed to a topic in the Google Groups "Shen" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/qilang/O1yo_3WRUUI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to qilang+un...@googlegroups.com.

Mark Tarver

unread,
Oct 24, 2022, 9:17:38 PM10/24/22
to Shen
Just off to bed.  You might find this useful.

Basically the type algorithm you want is this:

Given a YACC function f of type (list A) ==> B; f is well-typed iff
every rule R of f is well typed.

R is well typed iff for every item i in the syntax (before :=)

Part I

1. If I is a variable then i is well typed and assume I : A in proving part II.
2. If I is a constant prove i : A.
3. If I is a non-terminal g assume (in-> g) : (list A) provided g is assigned
   (list A) ==> C in the program.
4. If i is a list then recurse the above rules within i and show i : A.

Given the assumptions from part I type check the semantics (after :=)
by

1. Replacing each non-terminal G in the semantics S by (<-out G) to produce S*.
2  Assume (<-out G) : C provided g is assigned
   (list A) ==> C in the program and thus .....
   
Part II
   
3. Prove S* : B from the assumptions listed.

Finally

Proof Obligation: show that the above will always produce the same result as the
default Shen implementation - which is type secure by the proofs in TBoS.  
Hence in the presence of such a proof the above is also type secure.

On Monday, 24 October 2022 at 14:37:17 UTC+1 nha...@gmail.com wrote:
Reply all
Reply to author
Forward
0 new messages