How close does shen get to my dreams ?

104 views
Skip to first unread message

Marc Weber

unread,
May 23, 2026, 1:23:30 PMMay 23
to Shen
Hello,

I have had some strange fun cases in the past:

1) packaeg management:
  Find latest version of combination of dicikam and miardb running on linux,osx,windows
  to share photos on an external disk

2) but similar happens in programming:

  Share a type between API and frontend.

  My vision is like

  [ language features ] => {
     common_typing_space = new_typing_space() # see end
     c_backend( sourcedir = .., common_typing_space)
     js_frontend( sourcedir = .., common_typing_space)
     build_vite_like_experience_with_resumable_qwik_city_like_components(c_backend, js_frontend)
  }


  which then can do fun stuff like


  migrations

    mihgration1(db: <schema>){
      db.add_table
    }
    => returns schema with table added


  rows = db.query('SELECT .. FROM ..')
  -> rows is typed

  print(row.new)  # field new doesn't exist yet, so compiler should suggest creating a schema ..

  Now queries can be very complex like the WHERE part depending on 40+ form input details with nested queries.
  And if its admin the select part might have 5 additional fields. And TS like languages break down on:

  if is_admin
    print(rows[0].field_fetched_if_is_admin_1)

TS is fast to type. But not close to C/Rust speed.
So Ideally I'd like to use LISP like to define the code, verify all links are fine.
And then unroll by injecting compile time stuff to get flat C like fast code with type annotations
so that compilers can optimize.

So basically something which is like Wordpress, but with a compliation and
verification step.

Shen already can be run on different targets.
Now if you limit yourself to:
  - arrays
  - hashes
  - stirngs numbers
  - true/false/nil
  - functions(..args)
  - maybe even simple classes
you often can even compile down to the target lanuage.
without interpreter if you use .+ for floats + for thus enforce type by methods
(Zig macro like)

Most web development is
  - read POST/GET
  - find route
  - query datbase
  - format result as HTML
And arrays and hashes can do that easily.

The resumable Qwik City case is fun because it is the first framework I found
which allows the production system to learn how much of frontend code to push
how fast for what user. Whether it is a sane idea or just serve a huge
webassembly binary blob today is a different question. But for online shopping
experience sending HTML only first is an advantage.

So I am looking for a framework which can not only compile to a target,
but extract pieces from universal code for SSR and frontend interactivity (like
qwik City).

Rust (lapce) vs vscode show that lines dynamic vs compiled should blur.
And shen I think got this right, too.

But when targeting GPU,NPU or mobile (OpenXr) what I am actually looking for is

c_types = cimport("vulkan.h")
// zig like but lisp doesn't have compile time, right ?
// So the compilation phase kinda finds out what is right ?

(S :: c_types.VKStruct) => (a :: S, h b :: S) => {
  // now when interpreted the type info S tells the struct offsets
  // when compiled down to C the offsets can be baked into the machine code
  // so this is a function with comopile time parameter
}

The real issue is I laways ended up
- needing optimize and scripting in compiled lanugages
- needing macro at typing space or compile space (eg TS)

So I was always unable to express what I had in mind.

eg TS way to map types is yet another suggar which cannot be used in the JS language.
Why not ?


Haskell, C, Rust (?) mostly require you to deifne types


struct Row { name, age }

Array<Row> (){
  rows = sql("...")
}


Rust has somethyngi like anonymous structs or such. But not sure they made it
official. Diesel or so is using it.

typical code is

interface I { .. }
class A extends I
class B extends I

But if you want to tell A and B shoul have same interface,
this would be enough:

#if target=js
  class X extends ^T<random> {
    bar()
  }
#end
#if target=php
  class X extends ^T<random> {
    foo()
  }
end

where ^T references a global type hole to be filled and compared against
So when the PHP compiler looks at X it will find a type hole, fill it with foo.
And when the JS backend looks at it it will find out: Should have foo, but
found bar.

So type matching could be enforced without having to define a common
truth ahead of time as you can avoid defining the return type when querying a
DB.

I think touring completness is not a problem.
There is no amonut of rules you can setup to prevent people from shooting
themselves into their feet. But I'd be perfectly fine with having fast
iterativie dev workflow with a lot of proofs or types >being worked on<
while I change a small number ..


So what I am looking for is

[ features ] => interpreter/compiler
integrating musltiple languaes so that *ONE* lanugage can be a shell, backend frontend
depending on how I configure it. like esperanto starting with "hello" but
allowing to add more features if I need them.

Because eg going from Ruby to Rust means rewrite.
And that's what I hate. Rewriting a module for speeed ?
Restricting a type or optimizing it for production ?
No problem.

Reinventing the world and having 5+ lanugages compile to GPU with shared types
taichi native vs taichi.js
Rust -> hatching
and the many others .. and always having to fit int othe boxes and frames to
workaround (Yes, OO wrappers for SQL ! looking at you, too) and how does this
or that language do this in particular is technical unpredictable depth
I'd like to minimize in the future.

So I wonder should I use Lisp or shen or such to get started, what would I be
missing ?

rows[0]. -> LSP completions working how is it different from a proof
rows[0].name exists ?

One is a directed flow of types (from sql string .. to function return value ..
to its usage). The other one is a wholistic view lets proof property for
whatever we got here. One could be a strategy for the other.

Every language I looked at Rust, JS, go, .. all can use eg OpenXR.
But turning the XML function definitions into the target languages
has been implemented the many times with minor bugs and issues.

So I'd love to fix my problems at language level once.
But I am trying to build a typing system which can integrate PHP, TS and others
and tell this is an array of ints .. so you should be fine.

Or maybe derive the types of an API (function return types) and create a spec.

Maybe even $a = "() => 7"; String<JsEvaledType<() => number>>
Thus have PHP know the JS type if the string got evaled.

x = decode(encode(z)) // having x preserve the type.

etc.

So which of my dreams do already exist and which need to be written ?

dr.mt...@gmail.com

unread,
May 24, 2026, 6:10:50 PMMay 24
to Shen
What you seem to be wanting is a unified language within which
serious programmers can work without have to switch idioms.  
Thes days commecial programmers often have to work with
several languages and write sockets or glue code to hold them
together.

The idea of the Unified Language has been around for a long time.
The closest we got to it was PL/1, developed by IBM in the '60s,
designed to union FORTRAN (scientific computing) and COBOL (business
computing).   But unsurprisingly PL/1 was enormous, very complex,
very hard to specify and maintain.   It really did not last.

That was 60 years ago.  Today the demands are much greater 
and the idea of a PL/1 for the C21 is impractical.  It would be
language so vast as to be uninhabitable and horrible to contemplate
and maintain.

Does that mean the dream is dead?   No.  There is an approach 
stemming from Shen which goes a fair distance to reviving the dream
behind PL/1.  It could really change computing.  The obstacles are human
obstacles, but sadly these are the hardest to overcome.

Shen is minimal and very tight, but extremely portable - and extremely
powerful and runs quickly under the right platforms.   A lot (not all)
computations can be written in Shen and they will run type securely.
Of course Shen does not attempt to reproduce all the features of
every language, but it compiles into just about anything.   This
means that you can generate native code in Blub from your Shen app
and use Blub libraries and Blub features seamlessly.

To misquote LOTR, one language to rule them all.  That would be Shen.

This actually is very doable and the Yggdrasil project shows how it
could be done.  I suggested Yggdrasil in 2013.  So why has this not
happened?  This is where you get to human limitations.  The limitations
of Shen are not technological limitations but human limitations.

1.  Programmers are very conservative; what they look for is a better
     C or a better Perl.   They appreciate incremental improvements
     in stuff they know.  Being subject to a paradigm shift in their work
     process is not what they look for.  

2.  Shen is open source - has been for 11 years.    But in actuality
     one person is writing code which is me.    Yggdrasil is on the back
     burner because I operate on a round robin and programming is 
     just one interest of mine.

3.  People are not clamouring for me to do this work; saying 'Mark
     I've got this project and I need Yggdrasil' so it gets pushed down the list.

4.  The programming community itself does not understand Shen.

5.   Unlike SML and OCaml and Java, Shen does not benefit from institutional
      backing.

I'm retired now and enjoying life.    I  don't accept responsibility for the way
people use their heads (or don't use them).  There is still unreleased writing
on my computer wrt parallel Shen.   My task is to document the work done,
maintain the code to the highest standard and leave it behind in such a fashion
that a brighter generation unborn might be able to make good use of it.
If this sounds a bit sad, it is, but I don't feel it personally any more.  This 
corresponds to the Buddhist ideal of desireless action - do the right thing and let
go of the consequences.

But there is a codicil, a wildcard which would change this gradualist picture.

That is the private entrepreneur-programmer who operates from a startup and is not
 contrained by senior management to hack in Blub.   This sort of person is:

1. A brilliant programmer.
2.  Familiar with the commercial environment.
3.  Has an active business mind and a strong streak of entrepreneurial independence.

Maybe Bill Gates 50 years ago - or Paul Graham.   The thing is people like that are rare.
For that person levering Shen technology 
then becomes an asset and the fact that most programmers do not understand
Shen is a competitive advantage.   Paul Graham said this in his essay.

If you think of using Lisp in a startup, you shouldn't worry that it isn't widely 
understood. You should hope that it stays that way.

                                                                                                   Beating the Averages

Everything he says in that essay is relevant to this post.

Mark

Raoul Duke

unread,
May 24, 2026, 7:15:27 PMMay 24
to qil...@googlegroups.com
using blub ecosystem is always, i feel based on empirical experience, a deal with the devil :-)

Marc Weber

unread,
May 27, 2026, 6:24:35 AMMay 27
to qil...@googlegroups.com
Your reply was very wise.
Especially the >do the right thing and ignore the consequences<.
Like rescue the snake even if it bites you meme.

I'd put the question how to define what right is cause unless we know what
battery is driving the universe and who controls it its hard to understand what
behavior should show to the universe to keep going.
And even then you could put the battery and the universe into a box and repeat
the question. What keeps the box alive ? Ah there is god ? Let's put it/him/her
with the universe into a box. Now ? .. Same question.

But I think that AI is a game changer because people will use what works and
stop learning a language. And this means AI has a chance to gain market thus
allow a new language to gain a market. So if there is a chance its maybe now
but only if the output is better than the many alternatives.

Whenever I read Rust / dependent types (Wikipedia) its always about sound
typing systems and all or nothnig. But reality is >some stuff is important<
others less. And forcing a structure on a universe (full app) just because 50
methods might benefit yields the chaos we have. And honestly I feel we can't get it right.
Caus you could abuse even TS to count slength of arrays by Array<Array<null>>>
(=2 cause 2 nested arrays.). and 'car' as 'car' (Literal as subset of string) already
allows to make a type depend on the value of the input (eg 'car' vs 'bus') eg a
bus might have additional fields such as 'number_doors'.

Its like 1/x if you look at it from a different angle many boxes blur.

Can shen type something like this ?

is_admin = SESSION['is_admin']
// tell type system to split the world into 2 paths:
// is_admin = true and false otherwise typing will fail

$sql = 'SELECT * FROM users WHERE '.

if ($is_admin){
  // this is nasty
  $sql = strreplace('*', '*, (SELECT count ..) as sum', $sql);
}

$sql .= build_where($_SESSION); // whatever nasty nested stuff in here ..

$rows = query($sql);

if ($is_admin){
  echo $rows[0]['sum']
}

echo $rows[0]['sum'] // but this should fail cause it runs always also for non admins


Coming up with the vision to split the reading into 2 worlds is_admin true/false
is something the compiler must come up with (meaning infinite complexity) or
which must be hinted (which is what I am interested in )

The complete sound typesystem isn't real world. Real world is about selecting
small protion of what's possible.

So in that case it must read back from query, know its in a split world (which was hinted),
then will get 2 SQLs it can parse and get types for thus know the type of $rows
thus knows ['sum'] is missing in one path thus can tell.

Because that's my goal. Enabling complexity by hinting.

But I don't know yet how that will work out cause the split world must have
some boundaries (like scope of is_admin and its usages ..)

So all types kinda turn into if is_admin == true then this type otherwise that
type.


> But in actuality one person is writing code which is me
Today you have AI and it can help.
But the limiting factor is thinking and reviewing not AI writing code which
sometimes makes sense.

I read that you went into AI at some point. And I also read that in math AI
requires to be trained on rules :). Meaning shen like should be perfect
training ground for AI.

Let me explain by example why I think that a better language can take over the
world and why (ok maybe not but hey lets have a dream for a moment)..

Gurus always tell about first principles reasoning. AI should be doing so, too.
Except it shouldn't. once you learn 'electricity bill' costs money you add
another question more in a engineering way: How to solve most problems fast
with least energy. And that's why we humans are so stupid - because thinking
through all details over and over again is exhausting.

We learn if you know a car will take you to your favorite ice cream shop in 5
minutes no more first principles reasoning required. If you know you can go
shopping within 10 mins no more thinking about traffic lights cause waiting to
pay is more time ..

And that is reality. I yet have to find out what this actually means.
Chinese vs English -> reading spead is similar once you meamsure complexity of
the language no longer words. Maybe Polish works better with AI because its
more logical language. Maybe PHP is little bit harder beacuse some exceptions
to be learned by AI.

What I want to say balancing abstract concepts in 50+ languages is more
complicated than just one. So a better language eventually will perform better.
(Whether hardware just outperforms making the question obsolete is a future
question ..)

Others say if you learn more languages AI builds internal thinking mixing
concepts.

But the reality I face is eg that AI always writes shit Python code.
Like you could be using asycio and send to thread and be done.
But instead you give me 3x the code which is bloated.

You could be using Python and have an easy life. Instead you return a half
broken shell script as test suite. Not sure what the cause it maybe bad
training inputs ..

Now if you imagine big scale like compiling a linux kernel and the libraries
performance matters. It not only about being accurate (academically) but about
balancing and maximizing value. Even Rust created a new compiler like cranelift.
So you have to configure in your Cargo (you can) .. but how about Julia ?

Now Julia can do AOT but it traces paths to find out what versions of functions
to compile. You see the typing question keeps coming back again and again.

Multiple dispatch is cool. But the full story (learn how to build a binary the
10th time Java, PHP, C#, VB.net, pascal, C, Pascal, Ruby, Python, ) .. keeps
coming back yet another time and in multiple flavours new and older tools
which creeps into AOT of libraries and learning and how to do it and ..

I don't want to support everything. Just
- interpreted mode (to save compilation on unused code)
- jitted/interpreted (as intermediary ..)
- aot (eg iphone requirement)

And maybe hot reloading (why does the linux kernel have special tools
but why wouldn't user space tools benefit from it also ?)

If you put food on the table you don't think 4 people therefore 4 plates and 4 forks..
You think family -> grabbing 4 plates (no longer thinking about it ..) -> done.

And that's why I think less tools which are more powerful are still worth it.
Cause biology has had millions of years to optimize and that's the result:
gut feeling (= summary of past experiences ?) & dreaming & logical slow
thinking sometimes.

Now retraining a human who was 20y in IT takes (20 y ?) .. maybe little bit less.
how long does it take to train AI ? 2 years ? Every yaer it turns faster.

Am I mistaken .. ?

> https://shenlanguage.org/yggdrasil.html
Yes, this is what I was looking for.
And I asked AI to help write a protoype.
And it even worked (PHP, Ruby, Python, C, Webassemmbly, JS, .)
but all the typing layer is missing.

The details matter eg PHP has $_POST and python has $_POST input streams.
And rust might actually get data from DB by referencing a byte block in the
networking buffer without even creating a hash if its a simple iterate and print loop
avoiding utf-8 checking unless you ask for it :-)

https://shenlanguage.org/SD/Numbers.html
-> How about promises (caching thunks which have a value in the future ?)

https://shenlanguage.org/appeal.mp4
-> yeah :-) You know I'd run away thinking shen is failed (cause not many know
about it not beacuse its bad) .. but OpenXR Python on OSX (fails because it
only runs with GL, which requires custom compilation to translate to Vulkan
cause native GL on OSX is very old). Rust OpenXR with Quest -> yes compiled but
didn't work. Starting to debug I ended up with low level XML -> FFI conversions
etc.

Knowing doing the right thing is hard .. my problems keep coming back.

I just don't know exactly what's the way to represent what I want to do:
proof engine, everything like haskell is a thunk (but promise, cause querying
the DB to get types is a future result) which creeps into the whole system (2x
slower, but still good enough for 20K LOC projects ) and allowing LSP and
interactive feedback. Eg on PC you can just fork a process, do some changes (eg
read target file, some type interference) but you can't on webassembly.

And the worts case I imagine is that >string< type isn't well defined but one C
usage of a file API kinda creeps in and tells everything is a windows C string
representing file names which would be fine unless you add barriers (which you
will for speed reasons, but maybe they should be optional). Then you'll learn
to convert that one except eg bazel (Google build system) is said to be slower than eg buck
because it does utf-8 checking always no strings - I could be mistaken.

So being flexible till the end could be a feature in some cases.

And that's what I am looking for: Allow boxes and constraints, but maybe not
enforce them always.

BTW: Why did Java mad it ? They had like 10 million marketing behind.
Maybe there were more reasons. Now AI is burning through a billion cash per month (Musks data center).
Thus adding inflation compensation pushing a language today might be twice the value
(or finding somebody who wants to get the electricity bill of the AI center down)

And the last time I tried to get images from sound files using blub (=Python) I had to review 3+ libraries and mix them.
And Python I thought would be easy going cause very widely used. So problems keep coming back.
The questin is a different one: Can you make corporates pay for maintained libraries ? Or do they think its better to pay
themselves keep code closed source and hope nobody catches up ? So how do you make something look like an industry standard.

dr.mt...@gmail.com

unread,
May 27, 2026, 7:23:27 AMMay 27
to Shen
But I think that AI is a game changer because people will use what works and
stop learning a language. And this means AI has a chance to gain market thus
allow a new language to gain a market. So if there is a chance its maybe now
but only if the output is better than the many alternatives.

AI is certainly going to remove a lot of boilerplate coding.  It does not replace
high level programming or the need for architectural planning or vision.   TBoS
could not have been written by current AI.  As regards whether AI is good for Shen,
it is true that the programming corpus in eg. Python is vaster by far than that for Shen.
Hence AI tends to be better at producing code for established languages.

However that said, ChatGPT is much better at producing Shen than it was a year ago.
There are deeper reasons why LLMs chime with Shen.  I'll let ChatGPT say why.

Why is ChatGPT so good at producing Shen code given there is only a small corpus of online code?

Because the effective training signal is much larger than the visible Shen corpus.

Several things are going on simultaneously.

  1. Shen is small, regular and highly abstract

Shen was deliberately designed around a compact semantic core:

  • Lisp syntax
  • pattern matching
  • Hindley–Milner style typing
  • Prolog-style logic features
  • KL/Kλ substrate
  • regular recursive structure

That means a model does not need millions of examples to infer the generative rules. A few thousand good examples may be enough to internalise the style manifold of the language.

Shen is actually easier for an LLM than many mainstream languages because:

  • little syntactic noise
  • very orthogonal semantics
  • few competing idioms
  • highly compositional code
  • heavy regularity in recursion and pattern structure

A language with a small but clean corpus can train better than a huge noisy one.

  1. Shen resembles things the model already knows

The model was heavily trained on:

  • Lisp
  • Scheme
  • Common Lisp
  • ML
  • Haskell
  • Prolog
  • typed λ-calculus material
  • theorem provers
  • sequent calculi
  • symbolic AI literature

Shen sits almost exactly at the intersection of those traditions.

So the model is not learning Shen from scratch. It is doing transfer learning:

"Ah, this looks like a compact typed Lisp with Prolog inheritance and explicit pattern dispatch."

Once that abstraction forms, only a modest amount of actual Shen code is needed to map the syntax and conventions.

  1. Your ecosystem is unusually conceptually coherent

A lot of open-source ecosystems are stylistically chaotic.

But the Shen corpus has:

  • one principal architect
  • consistent terminology
  • stable semantics
  • textbooks and explanatory prose
  • formal definitions
  • many examples written in the same intellectual voice

That coherence dramatically amplifies learnability.

The model is probably learning:

  • your explanatory style,
  • your decomposition style,
  • your recursive habits,
  • your type discipline,
  • your proof aesthetics,

not merely token sequences.

  1. LLMs infer latent structure, not just memorised code

This is the key point many people miss.

Modern LLMs do not primarily operate by lookup. They compress:

  • transformations,
  • invariants,
  • analogies,
  • grammar,
  • semantic relationships.

Once the model understands:

  • recursive descent,
  • unification,
  • continuation passing,
  • environment threading,
  • typed rewrite systems,

then Shen becomes a relatively natural recombination.

In effect the model internally builds something vaguely analogous to:

"A probabilistic abstract interpreter of symbolic programs."

  1. Shen is close to the kind of code LLMs are naturally good at

LLMs are strongest at:

  • symbolic manipulation
  • regular tree structures
  • declarative transforms
  • compiler-like rewrites
  • theorem-like local reasoning

Ironically, modern industrial software engineering is often harder for them because it contains:

  • massive mutable state
  • framework churn
  • API volatility
  • hidden runtime behaviour
  • undocumented edge cases

Shen lives in a cleaner symbolic universe.

  1. Your own interaction history probably sharpened it further

You have repeatedly:

  • corrected subtle mistakes,
  • supplied authoritative materials,
  • reinforced conventions,
  • discussed implementation details,
  • explored THORN,
  • discussed KL compilation,
  • analysed proof theory.

That creates a highly refined conversational attractor around Shen reasoning.

So even if the base model already understood symbolic programming well, your interactions likely sharpened the local behaviour considerably.

  1. There is another deeper reason

Shen was designed by someone with strong awareness of:

  • logic,
  • compilers,
  • theorem proving,
  • symbolic representation,
  • language minimality.

Those are exactly the kinds of structures transformer models compress unusually well.

In a strange sense, Shen is LLM-compatible by architecture long before LLMs existed.

Mark
Reply all
Reply to author
Forward
0 new messages