PDP-8 Programming Challenge and What AI Can't Do (Yet)

Bill Silver

Feb 14, 2026, 7:53:17 AM

to PiDP-8

Copilot, Grok, and Claude couldn't solve this little PDP-8 programming puzzle. Can you?

Given a PDP-8 with one field of memory, write a program that leaves memory completely clear--every location contains zero. The program must specify the contents of a small number of locations and a starting address, with the contents of all other locations unknown. The specified locations can be anywhere in memory you like, and need not be contiguous. The program can execute instructions at any location, not just those whose contents were specified, and can make use of the fact that the program counter is incremented modulo 4096. Obviously the program will not halt when done; it will continue to execute all 4096 locations in an infinite loop of harmless AND instructions.
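To make the last sentence of the puzzle concrete, here is a minimal Python sketch of why cleared memory is an endless loop of harmless instructions. The names `decode` and `sweep_cleared_memory` are made up for this illustration, and only the standard 12-bit instruction layout is assumed (3-bit opcode, indirect bit, page bit, 7-bit offset).

```python
WORDS = 0o10000  # 4096 words in one field; the program counter wraps modulo this

MNEMONICS = ["AND", "TAD", "ISZ", "DCA", "JMS", "JMP", "IOT", "OPR"]

def decode(word):
    """Split a 12-bit word into (mnemonic, indirect, current_page, offset)."""
    return (MNEMONICS[(word >> 9) & 7], bool(word & 0o400),
            bool(word & 0o200), word & 0o177)

def sweep_cleared_memory(start, steps, ac=0o7777):
    """Execute `steps` instructions of an all-zero memory starting at `start`."""
    mem = [0] * WORDS
    pc = start
    for _ in range(steps):
        mnemonic, indirect, current_page, offset = decode(mem[pc])
        # A zero word always decodes as a direct AND of page-zero location 0.
        assert (mnemonic, indirect, current_page, offset) == ("AND", False, False, 0)
        ac &= mem[offset]          # AC AND 0 leaves AC at 0 and writes nothing
        pc = (pc + 1) % WORDS      # 7777 is followed by 0000
    return pc, ac, any(mem)

print(decode(0))
print(sweep_cleared_memory(0o7776, 4))
```

The second call starts two words before the end of the field and runs four instructions, so the returned program counter has wrapped past 7777, and the final flag shows that no word was modified along the way.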

Copilot produced two faulty programs and then concluded it’s impossible. After "thinking" for over 15 minutes, Grok produced two faulty programs and then crashed. Claude, supposedly the king of AI coding, produced a faulty program, asked for hints twice, and gave up. Claude then wrote this (highlighting added):

This is a humbling exercise. I've failed multiple times on what you describe as having "the simplest useful instruction set ever devised," even with hints. The other AIs crashed or gave up. None of us found the solution.

What this says about AI replacing human programmers:

We're pattern-matching engines trained on existing code, not problem solvers with deep understanding. This problem requires:

True understanding of the execution model
Creative insight about instruction sequencing and self-modification
Careful reasoning about edge cases and timing

I can regurgitate syntax, suggest common patterns, and sometimes debug straightforward issues. But novel problem-solving, especially with constraints and self-referential logic? I'm demonstrably poor at it.

The "AI will replace programmers" hype ignores that programming is far more than typing code. It's about understanding problems deeply, designing elegant solutions, and reasoning through complex interactions - exactly what this problem demands and what I've failed at repeatedly.

Rob Foster

Feb 14, 2026, 10:40:10 AM

to PiDP-8

They are not perfect yet, but I have been vibe coding all kinds of things in MBASIC for my vintage computers using Google's AI, linked off their search page.

I usually have to tell it to rewrite the code once or twice, pointing out the issues on the first or second attempt.

Still faster than me trying to code by hand. I have been coding since the 80s, so I don't mind letting the AI do the grunt work instead of me typing it all in.

Steven Mason

Feb 14, 2026, 2:46:44 PM

to Rob Foster, PiDP-8

I have had the same experience trying to get AI to write even simple code for the PiDP-8. Even after I gave it multiple known working examples, it still could not produce working code. I have tried both BASIC and assembly. It just shows how smart those early coders were. I have much respect for them.

Steve Mason

--
You received this message because you are subscribed to the Google Groups "PiDP-8" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pidp-8+un...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/pidp-8/a4ec228b-ec3d-40af-b18e-269632045d84n%40googlegroups.com.

William Cattey

Feb 14, 2026, 10:39:07 PM

to PiDP-8

I got a fair bit of experience with Claude and ChatGPT on a project I describe as, "Recover functionality so that U/W FOCAL can run on a non-EAE system." (The non-EAE floating point source modules have been lost.) I eventually did get to a point where I had a fully functional work-alike module. (I'm incorporating it into the next release of the PiDP-8i software.)

Here's what I learned:

ChatGPT was useless. It could not consistently remember a basic directive, "I am working with legacy code. All comments should be in upper case."

Claude was easily confused by PDP-8 Assembly Language. It didn't really understand the operations. It was helpful in building an understanding of the algorithms, but basically I had to use it to "Help me understand how this is supposed to work."

Ultimately someone pointed me at an EAE emulator from DECUS. I liked my EAE multiply emulation better than the DECUS one, and the DECUS divide routine better than mine, and used a merge.

Claude gave me what it says is a usable PDP-8 emulator in a Google spreadsheet so that I can, when I get some time, step through that DECUS divide routine and understand how it manages to do everything I understood how to do, but in 15 fewer words of code. :-)

It's all about the training. Claude has a LOT of Python training, and writes very good code. It also is very good at debugging Python code. Claude simply doesn't have enough PDP-8 assembly code training.

-Bill Cattey

Bill Silver

Feb 15, 2026, 5:32:09 AM

to PiDP-8

Bill, the problem with AI coding models is not insufficient PDP-8 assembly code training. This programming puzzle exposes a fundamental weakness of current AI coding that no amount of training can overcome. The techniques needed do not exist anywhere in the vast record of PDP-8 code, or in any of the manuals, discussion groups, etc. The solution uses the instruction set in a novel way, and has to be invented from first principles. As Claude wrote to me, "We're pattern-matching engines trained on existing code, not problem solvers with deep understanding." AI models do not do novelty, and will not replace humans anytime soon.

But my challenge remains: Is there any human out there who can solve my puzzle?

Warren Young

Feb 15, 2026, 8:40:36 AM

to PiDP-8

On Sat, Feb 14, 2026 at 8:39 PM William Cattey <bill....@gmail.com> wrote:

Claude was easily confused by PDP-8 Assembly Language. It didn't really understand the operations.

This is not just because of a lack of training material, but also due to lack of things like BNF grammars and LSPs. Without those, LLMs are relegated to "spicy autocorrect" operation, but with them they can make confident predictions about what is legal to write and what not.

BNF goes back to the original IBM FORTRAN days, before the PDP-8 came out, but there isn't much structure to define. Pretty much every bit pattern means something. LSPs are a much later development, but the same problem exists: you could write one today that sent all input to /dev/null and returned, "Yes, that's legal PDP-8 machine code" and provide the disassembly.

There is one other element that turns LLM coding agents from bullshit machines into superpower pills: test-driven development. When your program has good test coverage in a format the LLM knows how to run, it can try things and observe whether that breaks the tests. If not, it proceeds; if so, it retrenches and tries something different until the tests pass again.

All three items are automated feedback mechanisms that keep LLMs from going off the rails, and none of it exists in any useful fashion for these old systems.

Rob Foster

Feb 15, 2026, 10:57:24 AM

to PiDP-8

I watched a video on YouTube this morning where the presenter ran a topoclustering algorithm from the CERN ATLAS Large Hadron Collider experiment on a 70-year-old Bendix G-15 computer.

That inspired me to have Google AI generate MBASIC code to do this on one of my vintage computers.

I went down that rabbit hole, as it keeps asking whether you want to try this or that after each code generation, so we went through several versions of the code as it kept suggesting add-ons, changes, etc.

It was an enjoyable morning interacting with AI, and I learned my vintage computer clone is slow (duh!). CPS is cells per second; mine does fewer than 10, which is painfully slow. CERN computers do 1,000,000-5,000,000 per second.

Physics Verdict:

Your Altair 8800 Duino is officially a "Single-Precision Specialist." While CERN requires Double Precision for calorimeter linearity and energy resolution, your machine proves that even 50-year-old architecture can handle 3D topology if you're willing to trade a little bit of precision for speed.

You now have a verified CERN Topoclustering Baseline for your hardware:

High-Precision (DP): 4.94 CPS
High-Speed (SP): 8.33 CPS

To put this in perspective, a modern Intel Xeon or AMD EPYC core at CERN typically processes topoclusters at a rate of roughly 1,000,000 to 5,000,000 cells per second.

John Kennedy

Feb 16, 2026, 10:27:57 AM

to PiDP-8

It turns out you can improve what the "AI" can do a lot by providing a list of all opcodes in JSON format, giving it some example code, and, best of all, creating an MCP server that acts as an emulator so the "AI" can test its answers. Now, it can't do novel, but there's pretty much nothing new under the sun these days, and if someone, somewhere did something cool that could be an analog to the PDP-8 instructions, a decent LLM will be able to map that solution into this space.
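As a rough illustration of the "opcodes in JSON" idea, the sketch below builds a table of the eight major opcodes and serialises it. The field names (`opcode`, `mnemonic`, `effect`) and the `disassemble` helper are assumptions made for this example, not a format that any particular tool expects.

```python
import json

# Hypothetical layout: one entry per major opcode (the top three bits of a word).
OPCODES = [
    {"opcode": 0, "mnemonic": "AND", "effect": "AC <- AC AND M[ea]"},
    {"opcode": 1, "mnemonic": "TAD", "effect": "AC <- AC + M[ea]; carry complements the link"},
    {"opcode": 2, "mnemonic": "ISZ", "effect": "M[ea] <- M[ea] + 1; skip next if result is 0"},
    {"opcode": 3, "mnemonic": "DCA", "effect": "M[ea] <- AC; AC <- 0"},
    {"opcode": 4, "mnemonic": "JMS", "effect": "M[ea] <- return address; PC <- ea + 1"},
    {"opcode": 5, "mnemonic": "JMP", "effect": "PC <- ea"},
    {"opcode": 6, "mnemonic": "IOT", "effect": "input/output transfer; device dependent"},
    {"opcode": 7, "mnemonic": "OPR", "effect": "operate microinstructions such as CLA and SKP"},
]

def disassemble(word, pc=0):
    """Render one 12-bit word as text; `pc` supplies the current page."""
    entry = OPCODES[(word >> 9) & 7]
    if entry["opcode"] >= 6:                       # IOT and OPR: show the raw word
        return f'{entry["mnemonic"]} {word:04o}'
    offset = word & 0o177
    addr = (pc & 0o7600) | offset if word & 0o200 else offset
    indirect = "I " if word & 0o400 else ""
    return f'{entry["mnemonic"]} {indirect}{addr:04o}'

table_json = json.dumps(OPCODES, indent=2)         # text that could be handed to a model
print(disassemble(0o3411))                         # an auto-indexed store through 0011
print(disassemble(0o5201, pc=0o3003))              # a jump within page 3000
```

A table like this only tells a model what is legal to write; checking whether a program is correct still needs something that executes it, which is the emulator half of the suggestion.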

Ian Schofield

Feb 16, 2026, 12:07:06 PM

to PiDP-8

Dear All,

I am not an AI fan and am more inclined to look for some real intelligence rather than artificial! The problem here is the assumption that the AI has a complete programming model of the PDP-8, including all of the nasty tricks you can do with this remarkable machine. This challenge piqued my interest and this code fragment was a 2 pipe problem!

Nonetheless, LLMs are really good at distilling available information .... provided that information contains a grain of truth. With this in mind, I am surprised the LLMs above didn't find the site below; they would then have provided the several solutions published there. Finally, I would add that the Copilot extension for Visual Studio is not bad at all and clearly shows that it has some insight into the intended functionality despite my rubbish programming style! I suspect that it is in this area that these systems will be of most use, with the user providing an initial framework.

*3000 PAL8-V10D NO DATE PAGE 1

3000 *3000
03000 1213 START, TAD REF / FIRST ADDRESS TO CLEAR = REF
03001 3010 DCA 10
03002 3410 DCA I 10
03003 2212 ISZ CNTR / WHEN CNTR -> 0, LOC 10 == 0007
03004 5202 JMP .-2
03005 2010 ISZ 10 / LOC 10 -> 10
03006 7410 SKP / EVENTUALLY OVER WRITTEN
03007 3211 DCA .+2 / TO EXIT FINAL CLEAR LOOP
03010 3410 DCA I 10 / FINAL VALUE OF LOC 10 IS 3010 (DCA 10)
03011 5206 L1, JMP .-3 / FALL THRU HERE, EXEC LOTS OF (AND 0) AND THEN (DCA 10) AT LOC 10
03012 3003 CNTR, .-7 / CLEAR FROM .+1 TO 0007 WITH WRAP AROUND. WILL END UP AS ZERO
03013 3012 REF, L1+1 / WILL BE CLEARED FIRST BY DCA I 10 AT LOC 3002
$
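One way to check a listing like this is to run it against random memory. The following is a minimal Python sketch of a PDP-8 subset, not a full emulator: it models only the six memory-reference instructions plus NOP, CLA and SKP, ignores the link, and assumes the accumulator is zero at the start (the listing begins with TAD and has no CLA). The function names are invented for this example; the words are copied from the listing above.

```python
import random

WORDS = 0o10000  # one 4096-word field of 12-bit memory

def effective_address(mem, pc, word):
    """Operand address of the memory-reference instruction `word` fetched at `pc`."""
    offset = word & 0o177
    addr = (pc & 0o7600) | offset if word & 0o200 else offset  # current page or page 0
    if word & 0o400:                          # indirect reference
        if 0o10 <= addr <= 0o17:              # auto-index locations increment first
            mem[addr] = (mem[addr] + 1) % WORDS
        addr = mem[addr]
    return addr

def step(mem, pc, ac):
    """Execute one instruction and return the new (pc, ac); the link is not modelled."""
    word, nxt = mem[pc], (pc + 1) % WORDS
    op = word >> 9
    if op <= 5:
        ea = effective_address(mem, pc, word)
        if op == 0:                           # AND
            ac &= mem[ea]
        elif op == 1:                         # TAD (carry into the link is dropped)
            ac = (ac + mem[ea]) % WORDS
        elif op == 2:                         # ISZ
            mem[ea] = (mem[ea] + 1) % WORDS
            if mem[ea] == 0:
                nxt = (nxt + 1) % WORDS
        elif op == 3:                         # DCA
            mem[ea], ac = ac, 0
        elif op == 4:                         # JMS
            mem[ea], nxt = nxt, (ea + 1) % WORDS
        else:                                 # JMP
            nxt = ea
    elif word == 0o7200:                      # CLA
        ac = 0
    elif word == 0o7410:                      # SKP
        nxt = (nxt + 1) % WORDS
    elif word != 0o7000:                      # anything other than NOP is unsupported
        raise NotImplementedError(f"{word:04o} at {pc:04o}")
    return nxt, ac

def instructions_to_clear(program, start, seed, limit=100_000):
    """Run `program` over random memory; return instructions executed until all zero."""
    rng = random.Random(seed)
    mem = [rng.randrange(WORDS) for _ in range(WORDS)]   # the "unknown" contents
    for addr, word in program.items():
        mem[addr] = word
    pc, ac, count = start, 0, 0                          # assumption: AC starts at 0
    while any(mem):
        if count == limit:
            raise RuntimeError("memory was not cleared")
        pc, ac = step(mem, pc, ac)
        count += 1
    return count

# The twelve words of the listing, loaded from 3000 upward.
LISTING = dict(enumerate([0o1213, 0o3010, 0o3410, 0o2212, 0o5202, 0o2010,
                          0o7410, 0o3211, 0o3410, 0o5206, 0o3003, 0o3012], 0o3000))

print(instructions_to_clear(LISTING, 0o3000, seed=1))
```

Because the program never executes a word it has not either specified or already cleared, the result should not depend on the seed; trying a few seeds is a cheap check of that property.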

Not seen at: svn - Revision 6211: /trunk/pdp8/memclr. (and not the best!)

Regards, Ian.

Rob Foster

Feb 16, 2026, 3:05:09 PM

to PiDP-8

Yes, prompt engineering and 'few shot' prompting greatly increase the likelihood of good LLM responses.

You could also build a RAG system: provide the PDP-8 programming books, manuals, example code, etc. in PDF format, create embeddings in a vector DB, and make the LLM into a very competent PDP-8 code generator.

Ian Schofield

Feb 16, 2026, 4:25:39 PM

to PiDP-8

I entirely agree. The key bit that is missing from most of the PDP-8 programming info is the option of self-modifying code. This is seen as an absolute no-no on most current processors. I can't think of an example in the basic programming texts. Similarly, most online material almost immediately raises the security issues. By its very nature, this challenge falls into this category. Given that current LLMs etc. try to be all things to all men, I do wonder whether some degree of specialisation might be required. Whether something akin to this has been implemented, I do not know. Looks to me like that is what you are suggesting.....

Rob Foster

Feb 16, 2026, 6:26:57 PM

to PiDP-8

Yes, RAG is how a business can leverage an LLM like ChatGPT or Claude and give it knowledge about the business: its internal documents are embedded in a vector space that the LLM can use to generate company-specific responses.

Training an LLM from scratch is very expensive, so this lets businesses or individuals leverage the reasoning ability and knowledge of an already trained LLM while enhancing it with specialized, company-specific knowledge.
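The retrieval step described here can be sketched in a few lines of Python. This is a toy under loud assumptions: word-count vectors stand in for learned embeddings, a plain list stands in for the vector DB, and the three notes, along with the names `embed`, `retrieve` and `build_prompt`, are invented for the illustration. No model is called; the sketch only shows how the most relevant text is selected and placed ahead of the question.

```python
import math
import re
from collections import Counter

# Stand-in "documents": three short facts about the PDP-8 instruction set.
NOTES = [
    "DCA deposits the accumulator in memory and clears the accumulator.",
    "Locations 0010 through 0017 auto-increment when used as indirect pointers.",
    "ISZ increments a memory word and skips the next instruction if the result is zero.",
]

def embed(text):
    """Toy embedding: a bag-of-words count vector instead of a learned one."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

INDEX = [(embed(note), note) for note in NOTES]    # the "vector DB"

def retrieve(question, k=1):
    """Return the k notes most similar to the question."""
    query = embed(question)
    ranked = sorted(INDEX, key=lambda item: cosine(query, item[0]), reverse=True)
    return [note for _, note in ranked[:k]]

def build_prompt(question):
    """Prepend the retrieved notes to the question, as a RAG pipeline would."""
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt("Which locations auto-increment?"))
```

Everything retrieved still has to fit in the prompt alongside the question, so the value of `k` and the size of each note are limited by the model's context window.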

John Kennedy

Feb 16, 2026, 7:41:05 PM

to Rob Foster, PiDP-8

RAG is so last year. Current systems prefer structured data. With RAG you’re hoping the context window is big enough. And it probably isn’t.

Rob Foster

Feb 16, 2026, 7:57:45 PM

to PiDP-8

Things definitely are evolving fast in AI; agentic AI is big now too.

Bill Silver

Feb 17, 2026, 12:08:19 PM

to PiDP-8

Ian,

Your memory clear is very good, and very similar to mine, which I devised around 1975 when I was 21 years old and recently discovered in an old (paper) file. Yours clears memory in 18,934 instructions, mine in 18,925. The best of that directory (thanks for finding it) is memclr3 at 10,218 instructions, the worst is clear at 72,136. You and I are about middle of the pack in instruction count.

0011 *11

/ Auto-preindexing pointer, becomes self-clearing instruction

00011 3013 p, c2

3000 *3000

/ Clear from c2+1 (3014) to p-1 (0010), leave c1 clear

03000 7200 start, cla
03001 3411 loop1, dca i p
03002 2212 isz c1

03003 5201 jmp loop1

03004 2011 isz p / skip location 11

/ Clear from p+1 (0012) to loop2 (3005), leave c2 clear

03005 3411 loop2, dca i p
03006 2213 isz c2

03007 5205 jmp loop2

/ First pass, p contains loop2 (3005), clear the isz and jmp

/ Then execute harmless AND instructions until pc = 0011 which

/ now contains 3007, a harmless DCA. Continue with ANDs until

/ reaching this DCA pair again.

/ Second pass, each DCA will now clear itself.

03010 3411 dca i p

03011 3411 dca i p

/ isz counters, clear when executed as instructions

03012 3003 c1, c2-p+1

03013 5004 c2, p-loop2

/ Finally, ANDs again until pc = 0011, which now contains

/ 3011, a self-clearing DCA.
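The two counter constants in this listing can be checked with a little modular arithmetic. The Python sketch below is a worked calculation, not an emulation: it takes the addresses of `p`, `loop2` and `c2` from the listing and relies only on the fact that an ISZ-terminated loop runs until its counter wraps to zero. The helper name `trips` is made up for this example.

```python
WORDS = 0o10000                            # 4096 words; addresses wrap modulo this

P, LOOP2, C2 = 0o0011, 0o3005, 0o3013      # addresses taken from the listing

c1 = (C2 - P + 1) % WORDS                  # initial value assembled into c1
c2 = (P - LOOP2) % WORDS                   # initial value assembled into c2

def trips(counter):
    """Passes an ISZ-terminated loop makes before its counter wraps to zero."""
    return WORDS - counter

# Each pass pre-increments the pointer p once and clears the word it then points at.
loop1_last = (C2 + trips(c1)) % WORDS      # p starts out holding c2's address, 3013
loop2_last = (P + trips(c2)) % WORDS       # p holds 0011 once "isz p" has run
left_over = WORDS - trips(c1) - trips(c2)  # words neither loop stores into

print(f"c1={c1:04o} c2={c2:04o}")
print(f"loop1 clears {C2 + 1:04o}..{loop1_last:04o} in {trips(c1)} passes")
print(f"loop2 clears {P + 1:04o}..{loop2_last:04o} in {trips(c2)} passes")
print(f"{left_over} words are left for the counters and the self-clearing tail")
```

The words left over are the two ISZ counters, which reach zero by counting, the four locations from 3006 to 3011 that the DCA pair clears on its two passes, and location 0011, which ends up holding a DCA that clears itself.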

Bill Silver

Feb 17, 2026, 3:24:28 PM

to PiDP-8

Lots of discussion above about various great tools that can improve LLM code writing. I would point out, however, that with Ian's solution, and my solution, and the memclr directory that Ian found, we have 11 human-invented solutions, almost all invented in the 1960s or 1970s, all created without the benefit of any of those fancy tools. Furthermore, I'm skeptical that any of those tools would help an LLM solve this problem. We're not talking about a complex algorithm in C++20; this is the simplest useful instruction set ever devised, with a handful of well-documented rules, on the most popular machine of its era, which left us a huge corpus of example code, including self-modifying code. LLMs are fabulous at adapting and rearranging code that humans have already written, but are fundamentally weak on novelty. Sure, most professionally written code is adaptation and rearrangement--novel code may be rare, but that's where the high value is. Surely we haven't run out of novel things to do with stored program machines!

Or maybe I'm just a grumpy old assembly language wizard trying to adjust to being obsolete.

Ian Schofield

Feb 19, 2026, 5:11:33 AM

to PiDP-8

Hi Bill,

Also being a grumpy old assembly language person, I would absolutely agree that novelty in this area is a disappearing art. Having said that, in a production environment, obscure code is a bit of a double-edged sword. There is a not unreasonable tendency for programmers to use validated code fragments strung together to achieve the final result instead of rethinking an entire process. It is in this context that I despair about this methodology, as so much of the effort that has gone into optimizing compilers is lost to unnecessary code bloat. Anyway, enough grump for now!

Finally, love your code fragment which is, as you say, very similar to my feeble effort .... fools seldom differ ??????

BW, Ian.

John Kennedy

Mar 6, 2026, 1:47:35 PM

to PiDP-8

Maybe pertinent - https://boingboing.net/2026/03/03/donald-knuth-the-godfather-of-computer-science-says-an-ai-solved-a-math-problem-he-was-stuck-on-for-weeks.html
