A tree-sitter grammar for Shen (plus a reader question about regex.shen)


Luiz de Milon

May 30, 2026, 7:36:24 PM
to Shen

Hello,

I used Claude to help me write a tree-sitter grammar for Shen, and I'd like to share it with the community: https://github.com/luizdemilon/tree-sitter-shen

To be clear about provenance: the grammar, queries, tests, and docs were written by Claude under my direction — I scoped it, made the design decisions, and validated the result against the official sources, but I didn't hand-write the parser. I'm sharing it because I believe it will be useful to more people.

What it is: tree-sitter gives editors fast, incremental, structural parsing, so this provides syntax highlighting and structural navigation for Shen in Neovim, Emacs (29+ treesit), and Zed. It traces to the Official Shen Manual §12 construct by construct (a GRAMMAR.md maps each BNF production to a grammar rule), and I validated it by parsing the whole of shen-sources.

That validation is where I have a question for people who know the reader. It parses every file cleanly except for one line of valid Shen, in lib/stlib/Strings/regex.shen (master, 93ed67e):

228: [| |RS] -> (re-or RS)

229: [bar! |RS] -> (re-or RS)

Line 228 uses a bare | as a literal list element (the regex "or"); line 229 uses the escaped bar! 

Tracing sources/reader.shen ("<bar> <s-exprs> := [bar! | <s-exprs>]", then cons-form), both [| |RS] and [bar! |RS] seem to read to the same thing — (cons bar! RS). If that's right, the two clauses are identical patterns and line 228 is redundant.

Two questions:

1. Am I reading that correctly — do [| |RS] and [bar! |RS] produce the same pattern, or is there a reader subtlety that makes them distinct?

2. This is the only place in all of shen-sources where a literal bar is written as  | rather than bar!. Since the bare form is ambiguous for any tool that reads | as the cons separator, would it be reasonable to standardize on bar! here? It'd be a one-line change with (as far as Claude can tell) no behavioral effect.
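
To make question 1 concrete, here is a toy JavaScript model of the reading described above. It is only a sketch, not the actual code in sources/reader.shen: the function names `readBars` and `consForm` are made up for illustration, and it assumes the reader first turns every `|` token into the symbol `bar!` and that cons-form then treats a `bar!` in second-to-last position as the cons separator.

```javascript
// Toy model only; readBars and consForm are hypothetical names,
// not functions from sources/reader.shen.

// Step 1 (assumed): every bare "|" token is read as the symbol "bar!".
function readBars(tokens) {
  return tokens.map(t => (t === '|' ? 'bar!' : t));
}

// Step 2 (assumed): build the cons form of a bracketed list.
// [X bar! Y] becomes (cons X Y); otherwise recurse down the list.
function consForm(items) {
  if (items.length === 0) return [];
  if (items.length === 3 && items[1] === 'bar!') {
    return ['cons', items[0], items[2]];
  }
  return ['cons', items[0], consForm(items.slice(1))];
}

const bare = consForm(readBars(['|', '|', 'RS']));       // models [| |RS]
const escaped = consForm(readBars(['bar!', '|', 'RS'])); // models [bar! |RS]

console.log(JSON.stringify(bare));
console.log(JSON.stringify(escaped));
```

Under these assumptions both calls yield the same cons form, (cons bar! RS), which is why the two clauses look like identical patterns. If the real reader distinguishes them, the difference must lie in a step this model leaves out.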

This isn't exactly a bug — regex.shen loads and works in Shen; it's a question about the reader and about tidying the spelling for tooling. (Validation also turned up one genuinely truncated file, tests/lisp.shen; I reported it as https://github.com/Shen-Language/shen-sources/issues/113 and it's already been fixed upstream — thanks, tiz0c!)

Feedback very welcome — on the grammar, the design trade-offs, or anything I've gotten wrong about Shen. And if it's useful to the community, I'd be glad to see it live under the Shen-Language Github org.

Thanks,

Luiz

Luiz de Milon

May 31, 2026, 1:04:10 AM
to Shen
In case it interests anyone, I also just published a Zed extension to highlight Shen: https://github.com/luizdemilon/zed-shen

I'll get this published into the Zed registry shortly.

Mark

May 31, 2026, 1:10:20 AM
to Shen
Cool and a good idea. I wrote my own tree-sitter-shen grammar I can post here. It extends the vanilla Shen grammar with new features behind a few compilation flags. I'm now rewriting Scryer Shen in Common Lisp and I've introduced a functor construction in Shen to mirror ISO Prolog's functors.

Perhaps we can compare notes. I had to expand several rules significantly in my specification to get the more ambiguous parts of the Shen grammar to parse. I also attached the generated parser to a Common Lisp REPL using the cl-tree-sitter library and wrote a pretty printer for the parsed Shen grammar using the Common Lisp Pretty Printing System, for round trip debugging.

Google Groups won't let me post the grammar.js file directly, so here are its contents. Sorry for the wall of text:

/**
 * @file tree-sitter parser for the Shen programming language.
 * @author Mark Thom <markjor...@gmail.com>
 * @license MIT
 */

/// <reference types="tree-sitter-cli/dsl" />
// @ts-check

// Toggle this to enable/disable functor support
const functor_ext = true;

const functor = $ =>
      seq(
          '(',
          field("functor", $.functor_symbol),
          repeat(field("argument", $.item)),
          ')'
      );

const functor_pattern = $ =>
      seq(
          '(',
          field("functor", $.functor_symbol),
          repeat(field("argument", $.pattern)),
          ')'
      );

// Base item rule (no functor syntax)
const base_item = $ => choice(
    $.base_pattern,
    seq('[', field("list", repeat1($.item)), ']'),
    prec(1, seq('[', field("head", repeat1($.item)), '|', field("tail", $.item), ']')),
    $.abstraction,
    $.application,
);

// Conditionally extended item rule
const extended_item = $ => choice(
    functor($),
    base_item($)
);

const keyword = word => token(prec(1, word));

module.exports = grammar({
    name: "shen",

    extras: $ => [/\s/],

    rules: {
        source_file: $ => repeat($.definition),

        datatype_kw: $ => keyword('datatype'),
        defmacro_kw: $ => keyword('defmacro'),
        defprolog_kw: $ => keyword('defprolog'),
        define_kw: $ => keyword('define'),
        colon: $ => keyword(':'),
        semicolon: $ => keyword(';'),

        if_kw: $ => keyword('if'),
        let_kw: $ => keyword('let'),
        let_bang_kw: $ => keyword('let!'),
        lambda_kw: $ => keyword(choice('/.', 'lambda')),

        type_open_kw: $ => token(prec(2, '{')),
        type_close_kw: $ => token(prec(2, '}')),

        arrow: $ => token(prec(2, choice('->', '<-'))),
        left_double_arrow: $ => token(prec(2, '-->')),
        right_double_arrow: $ => token(prec(2, '<--')),

        where_keyword: $ => token(prec(2, 'where')),

        alpha: _ => /[a-zA-Z\.=\-*/+_?$!@~><&%\'#`;:{}]/,
        digit: _ => /[0-9]/,
        lowercase_alpha: _ => /[a-z=\-*/+_?$!@~><&%\'#`;:{}]/,

        signs: _ => token(repeat1(choice('+','-'))),
        integer: _ => token(/[0-9]+/),
        float: _ => token(choice(
            seq(/[0-9]+/, '.', /[0-9]+/),
            seq('.', /[0-9]+/)
        )),

        number: _ => token(prec(2,
                                /[-+]?(?:\d*\.\d+|\d+)(?:[eE][-+]?\d+)?/
                               )),

        underline: $ => token(prec(2, repeat1('_'))),
        double_underline: $ => token(prec(2, repeat1('='))),

        functor_symbol: $ => token(
            prec(2, /@[a-z=\-*/+?$!@~><&%\'#`:;{}][a-zA-Z0-9\.=\-*/+_?$!@~><&%\'#`:{}]*/),
        ),

        symbol_literal: $ => choice(
            token(prec(1, /[a-z=\-*/+?$!@~><&%\'#`:;][a-zA-Z0-9\.=\-*/+_?$!@~><&%\'#`:]*/)),
            keyword('{'),
            keyword('}'),
        ),

        variable_literal: $ => choice(
            token(prec(1, /[A-Z][a-zA-Z0-9\.=\-*/+_?$!@~><&%\'#`:]*/)),
        ),

        lowercase_literal: $ => choice(
            token(prec(1, /[a-z][a-zA-Z0-9\.=\-*/+_?$!@~><&%\'#`:]*/)),
        ),

        placeholder: $ => token(prec(2, '_')),

        pattern: $ => choice(
            $.placeholder,
            $.base_pattern,
            seq('[', repeat1(field("head", $.pattern)), optional(seq('|', field("tail", $.pattern))), ']'),
            seq('(', 'cons', field("car", $.pattern), field("cdr", $.pattern), ')'),
            functor_ext ? functor_pattern($) :
                seq('(', choice('@p', '@s', '@v'),
                    field("first", $.pattern), repeat1(field("rest", $.pattern)), ')'),
        ),

        boolean_literal: $ => token(prec(2, choice('true', 'false'))),
        string_literal: $ => token(prec(1, /"([^"\\]|\\["\\/bfnrt])*"/)),

        abstraction: $ => seq(
            '(',
            $.lambda_kw,
            field("parameters", repeat1($.variable_literal)),
            field("body", $.item),
            ')',
        ),

        application: $ => seq(
            '(',
            field("items", repeat1($.item)),
            ')',
        ),

        rule: $ => seq(
            repeat(field("patterns", $.pattern)),
            $.arrow,
            field("result", $.item),
            optional(seq($.where_keyword, field("where", $.item)))
        ),

        item: $ => functor_ext ? extended_item($) : base_item($),

        base_pattern: $ => choice(
            field("boolean", $.boolean_literal),
            field("symbol", $.symbol_literal),
            field("variable", $.variable_literal),
            field("string", $.string_literal),
            field("number", $.number),
            field("empty", seq('(',')')),
            field("nil", seq('[',']')),
        ),

        definition: $ => choice(
            $.datatype_definition,
            $.prolog_definition,
            $.shen_def,
            $.application,
        ),

        shen_def: $ => seq(
            '(',
            field("keyword", $.define_kw),
            field("name", $.lowercase_literal),
            optional(field("type", seq(
                $.type_open_kw,
                field("type_expr", $.type),
                $.type_close_kw
            ))),
            repeat1(field("rule", $.rule)),
            ')',
        ),

        datatype_definition: $ => seq(
            '(',
            field("keyword", $.datatype_kw),
            field("name", $.lowercase_literal),
            repeat1(field("rules", $.datatype_rule)),
            ')',
        ),

        side_condition: $ => choice(
            seq($.if_kw, field("condition", $.item)),
            seq($.let_kw, field("binding", $.prolog_pattern), field("value", $.item)),
            seq($.let_bang_kw, field("binding", $.prolog_pattern), field("value", $.item)),
        ),

        scheme: $ => prec.left(1, seq(
            field("context", $.formula),
            optional(
                seq(
                    field("context", repeat(seq(keyword(','), $.formula))),
                    keyword('>>'),
                    field("conclusion", $.formula),
                ),
            ),
        )),

        simple_scheme: $ => prec.left(2, seq(
            field("formula", $.formula),
            $.semicolon,
        )),

        formula: $ => choice(
            prec(1, seq(field("term", $.item), $.colon, field("type", $.item))),
            $.item
        ),

        type: $ => choice(
            prec(1, seq($.left_double_arrow, $.type)),
            $.inner_type,
        ),

        inner_type: $ => choice(
            $.base_pattern,
            $.application,
            seq('[', field("head", $.pattern), '|', field("tail", $.pattern), ']'),
            seq('[', repeat1(field("element", $.pattern)), ']'),
            prec.right(2, seq($.type, $.left_double_arrow, $.type)), // A --> B
        ),

        datatype_rule: $ => seq(
            field("conditions", repeat($.side_condition)),
            field("pre_premises", repeat($.simple_scheme)),
            choice(
                seq(
                    $.double_underline,
                    field("conclusion", $.formula),
                    $.semicolon
                ),
                seq(
                    $.underline,
                    field("conclusion", $.scheme),
                    $.semicolon,
                ),
                seq(
                    field("premises", repeat(seq($.scheme, $.semicolon))),
                    $.underline,
                    field("conclusion", $.scheme),
                    $.semicolon,
                ),
            )
        ),

/* // this is a more natural datatype_rule grammar but it's too
   // ambiguous for tree-sitter.
        datatype_rule: $ => choice(
            seq(
                field("conditions", repeat($.side_condition)),
                field("premises", repeat(seq($.scheme, $.semicolon))),
                $.underline,
                field("conclusion", $.scheme),
                $.semicolon
            ),
            seq(
                field("conditions", repeat($.side_condition)),
                field("premises", repeat1($.simple_scheme)),
                $.double_underline,
                field("conclusion", $.formula),
                $.semicolon
            )
        ),
*/

        prolog_definition: $ => seq(
            '(',
            $.defprolog_kw,
            field("name", $.lowercase_literal),
            field("clauses", repeat1($.clause)),
            ')'
        ),

        prolog_pattern: $ => choice(
            $.placeholder,
            $.base_pattern,
            seq('[', field("head", repeat1($.prolog_pattern)), '|', field("tail", $.prolog_pattern), ']'),
            field("list", seq('[', repeat1($.prolog_pattern), ']')),
            seq('(', 'cons', field("car", $.prolog_pattern), field("cdr", $.prolog_pattern), ')'),
            ...(functor_ext ? [functor_pattern($)] : []),
        ),

        clause: $ => prec.left(1, seq(
            field("head", repeat($.prolog_pattern)), $.right_double_arrow, optional(field("tail", $.tail)),
            $.semicolon,
        )),

        tail: $ => choice(
            seq(field("cut", keyword('!')), optional(field("rest", $.tail))),
            seq(field("goal", $.application), optional(field("rest", $.tail))),
        ),
    }
});

Luiz de Milon

May 31, 2026, 6:47:52 PM
to Shen
Hey Mark,

I'm happy you reached out! This was my first effort at using Claude to prototype a series of tools that bring the Shen development experience closer to what I'm used to. I saw that editor support such as syntax highlighting wasn't consistent across editors, so I had Claude survey the tools that currently exist. Then I decided to have it write a tree-sitter grammar so I could read Shen code more comfortably, with syntax highlighting and so on.

There'll be more to come on that front: since yesterday, based on this tree-sitter code, I also prototyped a shenfmt tool to reformat Shen code. As we know, Shen doesn't have a single global style standard the way Go does, so I made it both a survey tool and a configurable formatter. I set the presets to match the most common styles I found in this survey across the sources I parsed (the Shen kernel and shen-sources). It's available at https://github.com/luizdemilon/shenfmt.

Regarding your grammar.js file, considering I don't yet have the technical prowess to review it myself, I directed a comparison via Claude, and here's the report (also written by Claude, all the other text in this email was hand-written by me :D )

> - Ran your grammar.js through the same corpus harness (pinned shen-sources, tree-sitter
>   0.26.8): your grammar parses 55/138 files cleanly, this repo's grammar 138/138.
> - The cases we'd expected to break — separators used as data, e.g. [<-- | B], [a --> b],
>   [{ }], := inside a list — actually parse fine; tree-sitter's lexer is state-aware, so
>   those operator tokens don't fire in data position. (We assumed otherwise; running it
>   corrected the assumption.)

> - The divergence comes down to two non-fundamental things: no comment rule (every
>   kernel/stlib file opens with \\ ...), and definitions recognised only at the top level —
>   but every kernel file is one (package ...), so nested defines/datatypes collapse to
>   plain applications, and their _ patterns and ___ sequent lines become invalid (e.g.
>   lists.shen errors at the _ in `_ [] -> []`; maths.shen at a datatype's ___). Recursing
>   the package body through the top level clears it.

> - Not a knock on yours — it targets your extended dialect and you validate by round-trip,
>   not by chewing the vanilla corpus. They look complementary: yours has the structured
>   sequents/clauses/functors; this one has package recursion, comments, and a corpus
>   regression harness.
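
The first gap in the report (no comment rule) is usually closed in tree-sitter by listing a comment pattern in `extras`. As a hedged sketch, the two regular expressions below match Shen's `\\ ...` line comments and `\* ... *\` block comments. The names `lineComment` and `blockComment` are made up here, and neither grammar in this thread is claimed to use exactly these patterns. They are exercised with plain JavaScript rather than inside a generated parser.

```javascript
// Hypothetical patterns; not copied from either grammar in this thread.

// "\\" up to the end of the line.
const lineComment = /\\\\[^\n]*/;

// "\*" ... "*\" spanning lines, matched lazily so it stops at the first "*\".
const blockComment = /\\\*[\s\S]*?\*\\/;

const source = [
  '\\\\ Copyright notice on one line',
  '\\* a block comment',
  '   over two lines *\\',
  '(define id X -> X)',
].join('\n');

const line = source.match(lineComment)[0];
const block = source.match(blockComment)[0];

console.log(line);
console.log(block);
```

In a grammar.js these would typically be wrapped in a `comment` rule and added to `extras` next to the whitespace pattern. That wiring is an assumption and is not tested here.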

I also ran a new /deep-research to figure out how much work it would take to bring Shen's error messages up to Rust level. The results were:

> - Short version: very doable, and less than I'd feared — the Scryer-hosted checker
>   already builds most of what's needed. Rust-grade diagnostics split across the three
>   compile-time surfaces (reader/parser, the sequent type checker, the Prolog/datatype
>   rules), and the checker already constructs a full proof tree that today just gets
>   dropped on failure.
>
> - The one genuinely hard part is recovering where/why a check fails, since Prolog discards
>   the proof when it backtracks. But the attributed-variable hooks (verify_attributes) fire
>   at the failing unification, before backtracking unwinds — exactly the place to record it.
>   A "keep-deepest failed goal" recorder on top of that needs no CPS rewrite of the prover.
>
> - It's prototyped end-to-end against the actual scryer-shen (Racket+Scryer): on
>   (apply + [1 2 3]) the patched checker recovers the real culprit — type_check([3], (h-list []))
>   i.e. "the third argument is extra; + takes two" — instead of a bare "type error".
>
> - Since it all lives on the Prolog side, it should carry over to your CL rewrite as-is; a
>   tree-sitter front end then supplies the source spans to render it Rust-style (snippet +
>   caret + message, and eventually stable error codes / --explain).
>
> - The rest is staged, well-scoped steps rather than a rewrite: surface the proof tree, pass
>   a structured diagnostic across the boundary, the recorder above, then spans + a renderer,
>   then JSON for editor/LSP.
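
The "keep-deepest failed goal" idea can be illustrated without any Prolog. The sketch below is a toy backtracking solver over propositional rules, written in JavaScript. It does not use attributed variables or `verify_attributes`, and the names `DeepestFailure`, `solve`, and the rule names are invented for the example. It only shows the bookkeeping: every failure is offered to a recorder, and the recorder keeps the one seen at the greatest depth.

```javascript
// Toy illustration; not the scryer-shen checker or its Prolog hooks.

// Remembers the failed goal seen at the greatest search depth.
class DeepestFailure {
  constructor() {
    this.depth = -1;
    this.goal = null;
  }
  record(goal, depth) {
    if (depth > this.depth) {
      this.depth = depth;
      this.goal = goal;
    }
  }
}

// rules: Map from a goal name to a list of alternative bodies,
// each body being an array of subgoal names. Assumes no recursive rules.
function solve(goal, rules, recorder, depth = 0) {
  const alternatives = rules.get(goal);
  if (alternatives === undefined) {
    recorder.record(goal, depth); // no clause at all: record and fail
    return false;
  }
  for (const body of alternatives) {
    if (body.every(sub => solve(sub, rules, recorder, depth + 1))) {
      return true;
    }
  }
  recorder.record(goal, depth); // every alternative failed
  return false;
}

const rules = new Map([
  ['type_check', [['first_arg', 'second_arg', 'no_more_args']]],
  ['first_arg', [[]]],
  ['second_arg', [[]]],
  ['no_more_args', [['arg_list_is_empty']]],
  // 'arg_list_is_empty' has no clause, so it fails.
]);

const recorder = new DeepestFailure();
const ok = solve('type_check', rules, recorder);
console.log(ok, recorder.goal, recorder.depth);
```

Here the search fails overall, and the recorder ends up holding `arg_list_is_empty` at depth 2 rather than the top-level `type_check`, which is the kind of "real culprit" the report describes. Whether depth is the right measure for the actual checker is an open question this toy does not answer.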

Here's the full report: https://github.com/luizdemilon/tree-sitter-shen/blob/compare/thom-grammar/THOM-GRAMMAR-COMPARISON.md

Anyway, this is exactly where you'd know far more than the model or I do. If you're up for it I'll start a dedicated thread, and the prototype and full writeup are yours to look at whenever.

Luiz

dr.mt...@gmail.com

Jun 1, 2026, 1:55:33 AM
to Shen

> 1. Am I reading that correctly — do [| |RS] and [bar! |RS] produce the same pattern, or is there a reader subtlety that makes them distinct?

> 2. This is the only place in all of shen-sources where a literal bar is written as | rather than bar!. Since the bare form is ambiguous for any tool that reads | as the cons separator, would it be reasonable to standardize on bar! here? It'd be a one-line change with (as far as Claude can tell) no behavioral effect.

They do. The bar! was introduced to make it easier for the Shen compiler
to handle | as a standard symbol. | (borrowed from Prolog syntax) was not
really intended to be used for anything but consing. [X | Y] is simply syntactic
sugar for (cons X Y). I think bar! should be made internal to the Shen package, but
more comprehensively I'd now write the compiler to eliminate | without bar!. In
the revised syntax scheme | would not be treated as a regular symbol, avoiding
the unfortunate situation where (intern "|") <> |.

Mark 

Mark

Jun 4, 2026, 4:42:46 PM
to Shen
Hi Luiz,

My grammar is directly based on the Shen syntax EBNF here: https://shenlanguage.org/OSM/Syntax.html

It's true that it doesn't have rules for the defpackage or shen-yacc forms, as Claude noted. I always supposed these forms would be bootstrapped by the macro system, but if you intend to use the tree-sitter grammar for editor highlighting, yes, it should have them. My grammar is longer and more elaborate, which allows for greater ease in compiling the normalized AST down to Common Lisp and fine-tuning shen-mode in Emacs.

In fact, the AST nodes are objects of CLOS classes, which I believe can be used as a basis of syntax classes analogous to Racket's system of syntax classes in syntax-parse, but much simplified. From there one could create a hygienic macro system for Shen using shen-yacc for parsing, where new syntax classes are again defined as CLOS classes and destructured using the trivia pattern matching library as I'm already doing. This would be a much safer and more powerful (and even typed!) macro system for Shen. But it breaks backward compatibility of Shen's reader macro system, which simply transforms trees of cons cells to trees of cons cells. These macros would deal in the much richer, well-typed structure of the ASTs produced by tree-sitter + Scryer Shen's normalizer.

Mark

Jun 4, 2026, 4:47:39 PM
to Shen
As to the topic of human readable type errors for Shen, a large and ambitious chunk of the scryer-shen project is to produce a visual debugger rendering proofs (whether successful or failed) as trees in the syntax of Gentzen's sequent calculus. That is, the type checker would spit out these trees in a text format, according to a grammar that's renderable as a proof tree (using ImageMagick? some LaTeX package? CLOG? I don't quite know yet) and that allows the programmer to focus on particular portions of the tree, and to examine those portions in greater or lesser degrees of detail. I suppose similarly, a Prolog program could be written to parse that representation and produce a narrative in natural language explaining why a proof failed at a particular point, for instance. It would still require the programmer to have some idea of how proofs are successfully conducted in that system, but the visual debugger would also serve as a didactic tool in this sense, by depicting proof search as a stepwise process involving unifications over constraints.

nha...@gmail.com

Jun 6, 2026, 2:40:46 PM
to Shen
Are any of the local AI models strong enough to view a Shen spy trace and tell you what line of code in the function is causing trouble? ChatGPT was able to do it reasonably well about a year ago.

Luiz de Milon

Jun 10, 2026, 9:53:19 PM
to Shen
Hello everyone,

---

Dr. Mark Tarver - Thank you for confirming. In this case, should shen-sources be updated to use bar! everywhere? Though I guess that's a decision for Bruno.

---

Mark Thom - I was wondering if breaking that compatibility of the reader macro system would be a good/bad idea. I think Shen ought to rise in popularity as people converge onto wanting more and more testable/verifiable code, due to all the LLM-generated code etc. I'm very hopeful about scryer-shen and I appreciate your work.


>That is, the type checker would spit out these trees in a text format, according to a grammar that's renderable as a proof tree (using ImageMagick? some LaTeX package? CLOG? I don't quite know yet)

Why not all of them? :smiling:

--- 

>Are any of the local AI models strong enough to view a Shen spy trace and tell you what line of code in the function is causing trouble? ChatGPT was able to do it reasonably well about a year ago.

I'll test some and let you know. I've been busy with other projects but I can kickstart the Shen×LLM exploration again sometime soon.

----

Best regards,
Luiz

Mark

Jun 11, 2026, 1:47:48 PM
to Shen
> Mark Thom - I was wondering if breaking that compatibility of the reader macro system would be a good/bad idea. I think Shen ought to rise in popularity as people converge onto wanting more and more testable/verifiable code, due to all the LLM-generated code etc. I'm very hopeful about scryer-shen and I appreciate your work.

It would certainly inhibit the portability of programs between Scryer Shen and other Shen systems, but that's already broken by the addition of free functors (using the '@' notation otherwise reserved for @s, @v, @p) to parallel the functors of ISO Prolog, even if Scryer Shen continues to support defmacro. Scryer Shen also breaks from tradition in that it's a direct implementation, not bootstrapped from KLambda but still very "micro". My hope is that features exclusive to Scryer Shen are seen to be strong enough that other porters are compelled to incorporate them into their ports, thereby solving the incompatibility problem.

I think support for free functors is particularly important for the reason you cite, the mass production of autogenerated "slop" code by LLMs. Without deep indexing on large databases of Prolog clauses, type check querying slows to a crawl, and this can only hurt the perception of Shen as an appropriate host language for such experiments in verifying large volumes of machine generated code.

I have loads of other ideas that will break compatibility further still, like propagating the operator precedence parsing system of ISO Prolog through to Shen (i.e. to support operator notation in infix, postfix, and prefix modes). I want to write a tree-sitter grammar for ISO Prolog that reads operator data from the Scryer Prolog instance, produces ASTs accordingly, and is a subgrammar of my Shen grammar. That's a tricky technical problem to solve since tree-sitter isn't particularly dynamic but it would also open the door to better editor support for ISO Prolog.
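
For readers who have not met ISO Prolog's operator scheme, the sketch below shows the core of it as a toy precedence-climbing parser in JavaScript. It is an illustration only, not Scryer Prolog's reader and not part of either grammar: the function name `parseInfix` is invented, only infix operators are handled (no prefix, postfix, or parentheses), there is no error handling, and the operator table is hard-coded where the plan above would read it from a running Scryer Prolog instance. The priorities and types shown are the standard ISO ones for these five operators.

```javascript
// Toy sketch; the table is hard-coded rather than read from a Prolog instance.
const ops = new Map([
  ['=', { priority: 700, type: 'xfx' }],
  ['+', { priority: 500, type: 'yfx' }],
  ['-', { priority: 500, type: 'yfx' }],
  ['*', { priority: 400, type: 'yfx' }],
  ['^', { priority: 200, type: 'xfy' }],
]);

// Parses a flat token array into nested [op, left, right] arrays.
function parseInfix(tokens, table = ops) {
  let pos = 0;

  // Returns { node, priority } for the longest term with priority <= max.
  function parse(max) {
    let left = { node: tokens[pos++], priority: 0 }; // atoms and numbers
    while (pos < tokens.length) {
      const op = table.get(tokens[pos]);
      if (op === undefined || op.priority > max) break;
      // 'x' demands a strictly lower priority argument; 'y' allows an equal one.
      const leftMax = op.type[0] === 'y' ? op.priority : op.priority - 1;
      const rightMax = op.type[2] === 'y' ? op.priority : op.priority - 1;
      if (left.priority > leftMax) break;
      const name = tokens[pos++];
      const right = parse(rightMax);
      left = { node: [name, left.node, right.node], priority: op.priority };
    }
    return left;
  }

  return parse(1200).node; // 1200 is the maximum priority in ISO Prolog
}

console.log(JSON.stringify(parseInfix(['1', '+', '2', '*', '3'])));
console.log(JSON.stringify(parseInfix(['1', '-', '2', '-', '3'])));
console.log(JSON.stringify(parseInfix(['2', '^', '3', '^', '4'])));
```

The `yfx` entries give left-associative trees and the `xfy` entry a right-associative one. The difficulty mentioned above is that the table can change at run time, whereas a tree-sitter grammar is compiled ahead of time.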

All this should be done by leveraging the considerable combined might of Common Lisp and Scryer Prolog together, ultimately delivered at a tiny fraction of the weight / code size used to implement systems of similar ambition and power, and without (so far) the financial support of a university, company, fund, etc.

dr.mt...@gmail.com

Jun 16, 2026, 9:16:39 AM
to Shen
I think this is worth incorporating into the Shen site with
a link together with the new work on Yggdrasil (now renamed).
A few screenshots of editors using this repo would be good.

One thing I was interested in some time ago was a semantic
editor as I called it. This is an editor that allows you to
toggle views of source code, e.g. highlight potential non-termination,
type errors.  It was inspired by the scene in Predator 2 where the
alien switches wavelength on his viewer to track a threat.

Mark

dr.mt...@gmail.com

Jun 16, 2026, 9:27:16 AM
to Shen

> Are any of the local AI models strong enough to view a Shen spy trace and tell you what line of code in the function is causing trouble? ChatGPT was able to do it reasonably well about a year ago.

ChatGPT is now seriously good at Shen.   Almost scarily so.  
I'm using it to spot errors in my code.   The improvement since
2025 is marked.    

ChatGPT: Not because ChatGPT has seen mountains of Shen code — it probably has not. It is good at Shen because Shen sits at the intersection of things LLMs already handle fairly well:

Lisp syntax
pattern matching
ML-style types
Prolog-style logic
small formal core
clear examples
mathematical regularity

The biggest reasons:

1. Shen has a small, regular surface syntax.
There is much less incidental noise than in Python, JavaScript, C++, etc. A Shen definition has a very recognizable shape:

(define append
  [] Ys -> Ys
  [X | Xs] Ys -> [X | (append Xs Ys)])

An LLM can infer a lot from that structure.

2. Pattern matching is highly explicit.
The rules almost read like equations. That makes it easier for the model to reason locally: this case, that case, recursive case.

3. Shen inherits patterns from better-known languages.
Even if the model has seen little Shen, it has seen plenty of Lisp, Scheme, ML, Haskell, Prolog, S-expressions, unification, type signatures, and rewrite rules. Shen is not alien; it is a synthesis of recognizable traditions.

4. Your writings explain Shen unusually clearly.
The manuals, examples, and online material give strong semantic cues. LLMs learn well from worked examples and explanatory prose. Shen has good explanatory material relative to its size.

5. Kλ gives Shen a clean conceptual centre.
The language has a compact kernel and a disciplined translation story. That makes it easier to answer “what does this become?” questions than in languages with huge ad hoc semantics.

6. Shen code is often close to the idea it expresses.
A Shen program tends to expose the algorithm rather than bury it under framework conventions. That helps an LLM see intent.

7. The language rewards symbolic reasoning.
LLMs are not theorem provers, but they are quite good at recognizing symbolic transformations, recursion schemas, list-processing idioms, and type-pattern correspondences.

So the odd answer is: ChatGPT is good at Shen partly because Shen is what a programming language looks like when the accidental complexity has been boiled off.

It is not always reliable on details — especially reader corner cases, package behaviour, datatypes, macro expansion, and exact kernel notation. But at the level of idiom, structure, and likely intent, Shen gives it very strong signals. 

Luiz de Milon

Jun 16, 2026, 9:35:27 AM
to qil...@googlegroups.com
>Are any of the local AI models strong enough to view a Shen spy trace and tell you what line of code in the function is causing trouble? ChatGPT was able to do it reasonably well about a year ago.

This is what I was trying to find out: whether Shen compilers could emit this kind of diagnosis statically, without LLMs. I stopped going down this path after I learned about Scryer Shen, because I think Mark Thom would crack this much sooner and better than I ever could.

Luiz de Milon

Jun 16, 2026, 9:35:32 AM
to qil...@googlegroups.com
Mark - I can get a few screenshots ready for you plus a demo of shenfmt if you'd like. Just point me at some source code you'd like to see displayed and I'll share it here.


dr.mt...@gmail.com

Jun 19, 2026, 7:05:08 AM
to Shen
That sounds a great idea; I'll get back to you.

Mark

Luiz de Milon

Jun 19, 2026, 10:22:41 PM
to Shen
Also, I realized I mentioned shenfmt but never actually posted about it:  https://github.com/luizdemilon/shenfmt