HLL on 6502

D Finnigan

Oct 2, 2021, 10:15:22 PM

From https://llvm-mos.org/wiki/Findings:

Several common assumptions about the MOS 6502 processor, and C compilers
targeting it, are now refuted by the current work.

First, the assumption that a modern compiler framework, such as LLVM, cannot
be targeted towards an old 8-bit CPU such as the 6502. LLVM's new GlobalISel
architecture can very well be targeted to the 6502, and it can indeed
produce superior code, if permitted to do so.

Second, the assumption that because the 6502 is "stackless," and has few
registers, it is not a good host for C.

Regarding stacks, while it's true that the standard C runtime model is quite
hostile to 6502 performance, the C standard provides broad latitude for
alternative models that behave in all points "as if" they were the C model.
In the broad space of possible alternatives, we've found a collection of
techniques that broadly preserve C standard compatibility while emitting
very high quality code. To put it another way, we go to great lengths to
emit code that operates "as if there were stacks", without using stacks at
all. LLVM's sophistication facilitates this; the analyses required are quite
intricate, but most of them are slight modifications to data structures
already available in LLVM.
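
To make the "as if" idea concrete, here is a minimal sketch in C. It is not llvm-mos output; it only assumes a function that can never be active more than once at a time, so its locals can live at fixed addresses. The names scale_stack, scale_static, scale_x2 and scale_sum are hypothetical.

```c
#include <stdint.h>

/* Conventional model: x2 and sum are automatic variables, which a
   straightforward 6502 compiler would keep on a software stack. */
uint16_t scale_stack(uint8_t x, uint8_t y)
{
    uint16_t x2 = (uint16_t)(x * 2);
    uint16_t sum = (uint16_t)(x2 + y);
    return sum;
}

/* Hand-written equivalent of what a compiler may do once it has shown
   that the function cannot be re-entered: the same locals at fixed
   addresses, with no stack traffic. (Illustrative names only.) */
static uint16_t scale_x2;
static uint16_t scale_sum;

uint16_t scale_static(uint8_t x, uint8_t y)
{
    scale_x2 = (uint16_t)(x * 2);
    scale_sum = (uint16_t)(scale_x2 + y);
    return scale_sum;
}
```

Both functions return the same value for every input, so a caller cannot tell which storage model was used. That is the latitude the "as if" rule gives.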

Regarding registers, the original 6502 designers were well aware of the
6502's register limitations, and so provided a bunch of zero-page addressing
modes to compensate. We present these to LLVM as registers, which takes our
backend from "alien nightmare" to "ugly duckling", not unlike x86 or AVR.
Normal register allocation techniques apply, since 6502 instructions treat
different zero page locations identically. While A, X, and Y are a bit
unusual, they're not any worse than x86, and LLVM's register allocator is
fully capable of handling them, even if the relationship is a bit strained.
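
One reason zero page can stand in for a register file is that the 6502's indirect indexed mode, (zp),Y, only accepts a pointer stored in zero page. The following is a small C model of that addressing mode, offered as a sketch rather than as anything taken from llvm-mos. The mem array and the helper names lda_indirect_y and sum_bytes are assumptions made for illustration.

```c
#include <stdint.h>

/* The 64 KiB address space of a 6502, modelled as a flat array. */
static uint8_t mem[65536];

/* Model of LDA (zp),Y: read a 16-bit little-endian pointer stored at
   zero-page address zp, add Y, and load the byte at the result.
   The pointer fetch wraps around inside zero page. */
uint8_t lda_indirect_y(uint8_t zp, uint8_t y)
{
    uint16_t base = (uint16_t)(mem[zp] | (mem[(uint8_t)(zp + 1)] << 8));
    return mem[(uint16_t)(base + y)];
}

/* Sum len bytes starting at addr, keeping the pointer in a zero-page
   pair, much as a pointer held in a register would be used. */
uint16_t sum_bytes(uint8_t zp, uint16_t addr, uint8_t len)
{
    uint16_t total = 0;
    mem[zp] = (uint8_t)(addr & 0xFF);              /* pointer low byte  */
    mem[(uint8_t)(zp + 1)] = (uint8_t)(addr >> 8); /* pointer high byte */
    for (uint8_t y = 0; y < len; y++)
        total += lda_indirect_y(zp, y);
    return total;
}
```

Any zero-page pair works equally well as the pointer, which is the property that lets ordinary register allocation choose among them.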

Third, that "simpler is better" for producing a performant compiler for
8-bit targets. llvm-mos's architecture and design choices are not at all
simple. I haven't counted, but I think llvm-mos is doing about 100 passes
through the code, about 8 of which are specific to the MOS 6502.

Fourth, that the 6502 is implicitly some sort of "special" architecture, and
it therefore requires special compilers, linkers, binary file formats, etc.
We treat the 6502, ultimately, as just another target within the LLVM
framework, and as such it benefits from all the industry-standard
ELF-compatible file formats.

Fifth, that because the 6502 is a small target, it requires a smaller
compiler and smaller tools. This assumption never really made sense anyway.
In fact the opposite is true: if you want to do advanced codegen for the
6502, you need a really intelligent (and large) compiler and toolchain
framework, not a small one. The state of the art of optimization has
advanced leaps and bounds in the past three decades, and the poor old 6502
has received none of those benefits, until the current work.

Sixth, that peephole optimization produces the best codegen for the 6502. In
fact, llvm-mos gets the most benefit out of 6502-specific optimizations
relatively early in the LLVM machine function pass pipeline, and the code it
produces (in small tests) is quite efficient, even without any 6502-specific
peephole optimizer at all. "The more clothes you put on during the day, the
more you have to take off at night." One high-level instruction can become a
big block of 6502, so a single high-level optimization that removes it can
prevent a thousand cases from being needed to handle it later.
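
As a rough illustration of "one high-level instruction can become a big block of 6502", the sketch below writes out in C the byte-by-byte work that a single 32-bit addition implies on an 8-bit CPU. It is a model of the arithmetic, not actual compiler output, and the function names are hypothetical.

```c
#include <stdint.h>

/* What the programmer writes: one 32-bit addition. */
uint32_t add32(uint32_t a, uint32_t b)
{
    return a + b;
}

/* What an 8-bit CPU has to do: four byte-wide additions chained
   through the carry flag (on a 6502, a CLC followed by four
   LDA/ADC/STA groups). */
uint32_t add32_bytewise(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    unsigned carry = 0;
    for (int i = 0; i < 4; i++) {
        unsigned ab = (a >> (8 * i)) & 0xFFu;
        unsigned bb = (b >> (8 * i)) & 0xFFu;
        unsigned s = ab + bb + carry;
        result |= (uint32_t)(s & 0xFFu) << (8 * i);
        carry = s >> 8;
    }
    return result;
}
```

If an earlier pass folds the addition to a constant or proves it dead, the whole chain disappears at once, and no later peephole rule has to recognize it.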

Seventh, that because the 6502 is small, it requires some sort of
specialized language (in the David A. Wheeler sense
<https://dwheeler.com/6502/> ) in order to generate performant code. As it
turns out, anything that generates LLVM IR, can be a candidate for running
on the 65xx series of processors. Making "a version of C" that is easy to
compile just shifts the burden of writing a modern compiler, back onto the
user, in the form of limited functionality. LLVM will likely be able to
handle compiling other languages to the 6502, at some point in the future.
Rust support has already been proven, but there are no problems in principle
with lowering many more languages to the 6502.

Oliver Schmidt

Oct 3, 2021, 4:32:14 PM

Hi,

I'm the maintainer of the cc65 compiler, so you are welcome to consider my
statements biased, but anyway...

>Several common assumptions about the MOS 6502 processor, and C compilers
>targeting it, are now refuted by the current work.

I don't know where the term "common assumptions" comes from. However,
I know that not a single one of them is common among the cc65
developers.

Especially I personally would actually be happy if an
_actively_maintained_ LLVM based 6502 compiler would outperform cc65
allowing cc65 to retire.

However, a _LOT_ of work went into the elaborated, highly optimized
and well tested cc65 C libraries for many 6502 targets. I personally
would really like to see that work being leveraged by a cc65
successor.

>To put it another way, we go to great lengths to
>emit code that operates "as if there were stacks", without using stacks at
>all. LLVM's sophistication facilitates this; the analyses required are quite
>intricate, but most of them are slight modifications to data structures
>already available in LLVM.

This is in fact a great example of what I wrote above: cc65 of course
doesn't have the LLVM capabilities to analyse whether a function needs to
be reentrant (making a stack necessary) or not. But the cc65 developers
understood very well a long time ago that it is preferable to not have
a stack at all. Therefore the user can manually decide whether a function
needs to be reentrant or not:

https://cc65.github.io/doc/cc65.html#option-static-locals
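
The trade-off behind that option can be sketched in portable C, using the static keyword as a stand-in for what the linked cc65 documentation describes for locals. This is an illustration under that assumption, not cc65-specific code, and the function names are hypothetical.

```c
/* Automatic locals: each recursive call gets its own n_saved and rest,
   so the function is reentrant. */
unsigned sum_to_auto(unsigned n)
{
    unsigned n_saved = n;
    unsigned rest;
    if (n == 0)
        return 0;
    rest = sum_to_auto(n - 1);
    return rest + n_saved;
}

/* Static locals: every call shares one n_saved and one rest, so the
   inner calls overwrite values the outer call still needs. */
unsigned sum_to_static(unsigned n)
{
    static unsigned n_saved;
    static unsigned rest;
    n_saved = n;
    if (n == 0)
        return 0;
    rest = sum_to_static(n - 1);
    return rest + n_saved;
}
```

Here sum_to_auto(3) yields 6, while sum_to_static(3) yields 0 because the innermost call leaves n_saved at 0. Static storage is only safe for functions that are never re-entered, which is the decision the user has to make by hand.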

Regards,
Oliver

Kent Dickey

Oct 4, 2021, 12:26:22 AM

In article <sjd40c$4r3$1...@solani.org>, Oliver Schmidt <ol...@web.de> wrote:
>Hi,
>
>I'm the maintainer of the cc65 compiler, so you are welcome to consider my
>statements biased, but anyway...
>
>>Several common assumptions about the MOS 6502 processor, and C compilers
>>targeting it, are now refuted by the current work.

I'd like to make this a little meta, if that's OK.

On one hand, I like to try to not discourage people from doing fun things
with Apple IIs. So even if someone says something like "you are all idiots,
I've discovered the best way to do X on an Apple II", I try not to say
anything. I mean, I could shout them down and be mean. But then I would be
making Apples unfun. When comp.sys.apple2 had a few people doing that about
20 years ago, everyone was always upset, and it was very much not fun.

On the other hand, I know it's disheartening to be the one taking the brunt
of comments like this. I wrote the KEGS emulator, and man, some people
really talk a lot of crap about emulators. The vague putdowns and general
condescending comments are actually the most annoying to me. Complaints
about things that don't actually work are not offensive at all.

So what's the best way to not discourage new work, but also not to let
the authors of existing work which is being talked down feel maligned?

As for the new LLVM 6502 compiler, I could find no example output--in fact, the
"license" to download basically says benchmarking is not currently allowed.
I'm unsure if they can actually deliver on their promises since the ratio of
talk to results makes me suspicious.

Whereas lots of 6502 compilers have delivered very impressive results--since
6502 is NOT a great target for C. Everyone makes trade-offs. I am glad cc65
exists since lots of people have used it to deliver useful programs.

Kent

D Finnigan

Oct 4, 2021, 2:20:47 AM

Kent Dickey wrote:
>
> I am glad cc65 exists since lots of people have used it
> to deliver useful programs.
>

Yeah, I definitely think that cc65 is the more mature 6502 C compiler, also
considering the community built around it and sample code & tutorials
available.

Oliver Schmidt

Oct 4, 2021, 4:39:02 AM

Hi,

>Yeah, I definitely think that cc65 is the more mature 6502 C compiler, also
>considering the community built around it and sample code & tutorials
>available.

That's not the point - not even close to it. cc65 is of course mature.
And llvm-mos is of course the very opposite right now as it has just
started.

The point is the attitude of claiming that everybody else is plain
stupid.

Regards,
Oliver

laserac...@gmail.com

Oct 5, 2021, 8:38:09 AM

I do not have anything to contribute technical wise to the convo, but several years ago I wrote my own Applesoft to 6502 compiler, "Idiot Compiler". For integer computations, and for the 'touchy-feely' ability to manipulate the program structure inline prior to compile (like inserting your BASIC and asm code together), it's been quite a powerful tool for me. But this whole Apple II thing is about the fun you experience while creating your projects. Create the tools that help you eke out more performance from the machine, in your own unique way/mindset. It may be better, it may be slightly worse than what's already out there. But certainly, programming is tough enough; putting down other people's prior work when they had the same ambition and drive isn't a cool thing to be doing.

John Byrd

Oct 8, 2021, 11:47:23 AM

Hi Oliver, I wrote most of the FAQ entry that you're quoting, so I might have some thoughts on your responses.

> Especially I personally would actually be happy if an
> _actively_maintained_ LLVM based 6502 compiler would outperform cc65
> allowing cc65 to retire.

It's still early days, but we're seeing some positive results in terms of benchmarks -- see for example https://llvm-mos.github.io/llvm-test-suite/dev/bench/ .

> However, a _LOT_ of work went into the elaborated, highly optimized
> and well tested cc65 C libraries for many 6502 targets. I personally
> would really like to see that work being leveraged by a cc65
> successor.

So would I, but ca65 and cl65 have their own fairly specific macro languages and linker languages. So when someone asks me for "cc65 compatibility," I confess that I'm not exactly sure what that means.

Anyway, I'd like to encourage you to spend some time looking over the behavior of llvm-mos at some point -- we've done a passable job so far of creating a standards compliant and relatively stable, if not ideally performing, C compiler. It passes the gcc torture test suite, excepting only the floating point tests.
We're still very early in terms of providing per platform support, which we would agree is one of cc65's core strengths.

I hope that cc65 and llvm-mos can benefit from one another's strengths, and that they can each continue to leverage what they are both good at. Please feel free to review what we're doing, and give any comments or opinions that you might see fit. You are also welcome to jump into llvm-mos with improvements, bug fixes, and the like.

Sincerely,

John Byrd

David Schmenk

Oct 8, 2021, 7:13:46 PM

Although I applaud the work being done to support the 6502 in the LLVM environment, the response to the "common assumptions" feels a little disingenuous to me. I don't feel that these "common assumptions" are in fact common assumptions so much as just the reality of a 46-year-old CPU that hasn't had any real commercial value in decades.

Most of the "common assumptions" are in fact the result of a majority of the HLLs being self hosted. Simpler, smaller, specialized languages and optimizers are the only way to feasibly implement a self hosted programming environment. There is no way LLVM is going to run self hosted on a 64K, 1 MHz 6502 (PLASMA does). I hope that made you chuckle.

And all of the 6502 languages have been written and supported by either an individual or small team. LLVM has been in active development by a worldwide team of developers targeting very diverse targets from 8 bit CPUs to GPUs. It is a very large and sophisticated toolset enabled by a decade of open source development and powerful hardware.

So I am very interested (been following the progress on 6502.org) and looking forward to the results of the LLVM-MOS development effort. Until the LLVM-MOS is capable of delivering binaries of useful programs, i.e. not benchmarks, it may be a little presumptuous to claim just how good the final result will be even if it is very encouraging. I just felt compelled to respond that the "common assumptions" are a result of the times and environmental constraints, and that we now have a modern compiler toolset to experiment with approaches that haven't been possible before.

John Byrd

Oct 8, 2021, 9:38:29 PM

On Friday, October 8, 2021 at 4:13:46 PM UTC-7, David Schmenk wrote:

> Until the LLVM-MOS is capable of delivering binaries of useful programs, i.e. not benchmarks, it may be a little presumptuous to claim just how good the final result will be even if it is very encouraging.

Perhaps we must define the utility of a compiler then. We've chosen to benchmark improvements to LLVM-MOS, by well-known benchmarks and compliance tests that C compilers already use. For example, LLVM-MOS currently passes the gcc torture tests, save only for those that require floating point support, with compilation flags at -O0, -O2, and -O3. We run the torture tests daily on the most recent LLVM-MOS build.

There's tons of stuff remaining to be done, which is why we're seeking programmers with compiler experience to join us on the Slack group as mentioned on the front page of llvm-mos.org , and to start working on bugs that we've already written up on github.

David Schmenk

Oct 9, 2021, 12:58:42 PM

On Friday, 8 October 2021 at 18:38:29 UTC-7, John Byrd wrote:
> On Friday, October 8, 2021 at 4:13:46 PM UTC-7, David Schmenk wrote:
>
> > Until the LLVM-MOS is capable of delivering binaries of useful programs, i.e. not benchmarks, it may be a little presumptuous to claim just how good the final result will be even if it is very encouraging.
> Perhaps we must define the utility of a compiler then.

No, I think many, if not most reading, understand the utility of a compiler. This sounds a little defensive and perhaps now you understand how your "common assumptions" come across. Those "common assumptions" are in fact hard-learned lessons given the constraints of the times, technology, environment and/or manpower. It's not as if everyone used the same playbook in coming up with unique solutions to a hard problem.

> We've chosen to benchmark improvements to LLVM-MOS, by well-known benchmarks and compliance tests that C compilers already use. For example, LLVM-MOS currently passes the gcc torture tests, save only for those that require floating point support, with compilation flags at -O0, -O2, and -O3. We run the torture tests daily on the most recent LLVM-MOS build.
>

That would be appropriate to judge the baseline functionality of the compiler.

> There's tons of stuff remaining to be done, which is why we're seeking programmers with compiler experience to join us on the Slack group as mentioned on the front page of llvm-mos.org , and to start working on bugs that we've already written up on github.

Indeed, and I believe everyone is excited to see the results of all your hard work. Years ago I looked at supporting a 6502 backend to LLVM and before that, GCC. It was a significant learning curve and an amount of time I no longer wanted to invest. I scratched my compiler itch with PLASMA's compiler/VM/JIT. So I commend you on your effort and look forward to your advances. Maybe it will inspire some of us to re-engage.

John Byrd

Oct 17, 2021, 1:02:23 PM

On Monday, October 4, 2021 at 1:39:02 AM UTC-7, Oliver Schmidt wrote:
> The point is the attitude of claiming that everybody else is plain
> stupid.

Thanks for pointing out my arrogance. I apologize for that. Daniel has gone through and edited my original Findings post to keep it more technical and less arrogant.

The truth is, like cc65, we have managed to make positive strides by engaging members of the 6502 community. So I'll make a special effort in the future to avoid coming off too arrogant.

Here is a discussion of bringing llvm-mos and cc65 closer together, in our bug tracking system. I'd encourage other perspectives from mine and Daniel's on what cc65 compatibility might mean, for a more mature version of llvm-mos. It *is* possible, but first we would have to clearly define what cc65 compatibility might mean for our toolset.

https://github.com/llvm-mos/llvm-mos/issues/29

Sincerely,

John Byrd

Oliver Schmidt

Oct 18, 2021, 1:34:21 PM

Hi John,

>> The point is the attitude of claiming that everybody else is plain
>> stupid.

>Daniel has gone through and edited my original Findings post to keep it more
>technical and less arrogant.

Great :-)

>The truth is, like cc65, we have managed to make positive strides by engaging
>members of the 6502 community. So I'll make a special effort in the future
>to avoid coming off too arrogant.

Yep, I imagine that llvm-mos might benefit from contributions coming
from (former?) cc65 contributors...

>Here is a discussion of bringing llvm-mos and cc65 closer together, in our
>bug tracking system. I'd encourage other perspectives from mine and Daniel's
>on what cc65 compatibility might mean, for a more mature version of llvm-mos.
>It *is* possible, but first we would have to clearly define what cc65
>compatibility might mean for our toolset.
>
>https://github.com/llvm-mos/llvm-mos/issues/29

I've added my pretty and/or too lengthy two cents,
Oliver