Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Sudoku puzzle solver for Apple II

43 views
Skip to first unread message

Michael J. Mahon

unread,
Jul 12, 2006, 6:05:59 AM7/12/06
to
Scott Hemphill and I recently collaborated to create a Sudoku puzzle
solver for Apple II computers. The machine language solver is Scott's,
with some adaptations by me, and I wrote the interactive Applesoft
front-end.

SUDOKU runs on any Apple II with 80-column firmware and a 65C02
processor, which includes the IIc, the IIc+, the Enhanced //e, and
the IIgs.

It is amazingly fast! Most Sudoku solvers run on modern PCs thousands
of times faster than a 1MHz Apple II, and take seconds to solve a
puzzle, but Scott's solver is so time and space efficient that it
solves puzzles in seconds *running on an Apple II*!

Check it out at:

http://members.aol.com/mjmahon/Sudoku.html

It is available both as a ShrinkIt disk archive and as a .dsk image.

Enjoy!

(BTW, the execution profile referred to in the paper has not yet
been uploaded, so for the time being, that's a 404. ;-)

-michael

Fast Sudoku solving for Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design
tool--and it is seriously underused."

Paul Schlyter

unread,
Jul 12, 2006, 8:13:07 AM7/12/06
to
In article <XtudnVI15qUVVCnZ...@comcast.com>,

Michael J. Mahon <mjm...@aol.com> wrote:
>Scott Hemphill and I recently collaborated to create a Sudoku puzzle
>solver for Apple II computers. The machine language solver is Scott's,
>with some adaptations by me, and I wrote the interactive Applesoft
>front-end.
>
>SUDOKU runs on any Apple II with 80-column firmware and a 65C02
>processor, which includes the IIc, the IIc+, the Enhanced //e, and
>the IIgs.
>
>It is amazingly fast! Most Sudoku solvers run on modern PCs thousands
>of times faster than a 1MHz Apple II, and take seconds to solve a
>puzzle,

I don't believe that! Check out this Sudoku solver:
http://homepage.ntlworld.com/valleyway/solver.html

It's written in Javascript, which of course is a quite inefficient
programming language - yet it needs less than half a second to solve
a sudoku, on a 4+ year old PC running at a moderate 1.8 GHz clock speed.

You must have encountered some really sloppily written sudoku solvers
for the PC if they required several seconds to solve a sudoku....


> but Scott's solver is so time and space efficient that it
>solves puzzles in seconds *running on an Apple II*!
>
>Check it out at:
>
>http://members.aol.com/mjmahon/Sudoku.html
>
>It is available both as a ShrinkIt disk archive and as a .dsk image.

...but why does it require a 65C02 ? Couldn't it have been written
in plain 6502 assembly language?


>Enjoy!
>
>(BTW, the execution profile referred to in the paper has not yet
>been uploaded, so for the time being, that's a 404. ;-)
>
>-michael
>
>Fast Sudoku solving for Apple II's!
>Home page: http://members.aol.com/MJMahon/
>
>"The wastebasket is our most important design
>tool--and it is seriously underused."


--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/

BLuRry

unread,
Jul 12, 2006, 10:44:20 AM7/12/06
to
> I don't believe that! Check out this Sudoku solver:
> http://homepage.ntlworld.com/valleyway/solver.html
>
> It's written in Javascript, which of course is a quite inefficient
> programming language - yet it needs less than half a second to solve
> a sudoku, on a 4+ year old PC running at a moderate 1.8 GHz clock speed.

Javascript isn't all that bad. :-P Sure, it's not exactly suited for
realtime applications, but once the code has been interpreted and all
data structures are created, it's pretty decent actually. I've found
it to be not much slower than, say, Perl. Some things in the language
are actually a lot faster than you think! Dynamically-created data
structures are not a strong-point of javascript when speed is concerned
though.

http://www.jorendorff.com/articles/javascript/speed-test.html

I don't see enough people rag on the speed of, say, Ruby or Python.
:-D

> ...but why does it require a 65C02 ? Couldn't it have been written
> in plain 6502 assembly language?

*sigh* Why are there so many anti-65c02 enthusiasts out there? Can't
you afford a free emulator that can handle the opcodes or shell out $6
for a new 65c02 processor? But, seriously, I have a strong preference
for 65c02 opcodes in my code as well becase it's a slight bit more
convienent. Why limp when you can walk? (or use 65816 and go for a
jog ;-)

-B

BLuRry

unread,
Jul 12, 2006, 11:11:08 AM7/12/06
to

Michael J. Mahon wrote:
> Scott Hemphill and I recently collaborated to create a Sudoku puzzle
> solver for Apple II computers. The machine language solver is Scott's,
> with some adaptations by me, and I wrote the interactive Applesoft
> front-end.

Holy algorithm, Batman! That is cool stuff. Very excellent algorithm
design! A few moot points to consider if you're a total perfectionist:

- Could it be possible to support arrow keys?

- Could it use mousetext for the display, since it's already singled
out to use 65c02, and thus mainly enhanced //'s anyway?

- Maybe make some boxes to uneditable (e.g. keep them hilighted) so
that the original numbers can remain unaltered.

Whether you even care about these suggestions is totally up to you. ;-)
Awesome stuff, guys! Keep up the great work!

Michael Black

unread,
Jul 12, 2006, 11:18:36 AM7/12/06
to
What's the point of having a program for a relatively new game for
a computer that hasn't been made in almost twenty years, but then
you are assuming the most interest will be in running it on an emulator?

At that point, you aren't talking about some functionality of an old
computer, you are doing it for no good reason.

And once you move away from emulators, you do start limiting the
computers that can run the program when you use a 65C02. In other
words, you may be limiting who can run this program on an actual Apple
II.

But the question still remains, why is it so important to use
the few instructions that are unique to the 65C02? They may be nice
additions, but lots of people survived without them.

Michael

Scott Hemphill

unread,
Jul 12, 2006, 11:46:04 AM7/12/06
to
pau...@saaf.se (Paul Schlyter) writes:

> In article <XtudnVI15qUVVCnZ...@comcast.com>,
> Michael J. Mahon <mjm...@aol.com> wrote:
> >Scott Hemphill and I recently collaborated to create a Sudoku puzzle
> >solver for Apple II computers. The machine language solver is Scott's,
> >with some adaptations by me, and I wrote the interactive Applesoft
> >front-end.
> >
> >SUDOKU runs on any Apple II with 80-column firmware and a 65C02
> >processor, which includes the IIc, the IIc+, the Enhanced //e, and
> >the IIgs.
> >
> >It is amazingly fast! Most Sudoku solvers run on modern PCs thousands
> >of times faster than a 1MHz Apple II, and take seconds to solve a
> >puzzle,
>
> I don't believe that! Check out this Sudoku solver:
> http://homepage.ntlworld.com/valleyway/solver.html

I don't know about _most_ Sudoku solvers, but there are a lot of slow ones
out there. And in terms of taking seconds, I think Michael was talking
about difficult puzzles, not easy ones. Our most frequently used test
puzzle was one we called "evil01". I loaded this into the solver you
mentioned and it took about 4 seconds on my 3GHz machine. Here's the
puzzle:

.2.......
...6....3
.74.8....
.....3..2
.8..4..1.
6..5.....
....1.78.
5....9...
.......4.

> It's written in Javascript, which of course is a quite inefficient
> programming language - yet it needs less than half a second to solve
> a sudoku, on a 4+ year old PC running at a moderate 1.8 GHz clock speed.

I don't know how inefficient Javascript is. Is this a good speed? My
C program solves the evil01 puzzle in about 2 milliseconds.

> You must have encountered some really sloppily written sudoku solvers
> for the PC if they required several seconds to solve a sudoku....

Yup, if by sloppy you mean not choosing the most efficient algorithms.

> > but Scott's solver is so time and space efficient that it
> >solves puzzles in seconds *running on an Apple II*!
> >
> >Check it out at:
> >
> >http://members.aol.com/mjmahon/Sudoku.html
> >
> >It is available both as a ShrinkIt disk archive and as a .dsk image.
>
> ...but why does it require a 65C02 ? Couldn't it have been written
> in plain 6502 assembly language?

It requires a 65C02 because that's the way I initially wrote the
assembly language solver. As it underwent various transformations
I started migrating it towards the 6502, but we decided to keep the
speed and release the code. It's public domain, with all sources
provided, so anyone is free to modify it for the 6502. Furthermore,
we didn't put in all the speed tweaks we thought of, and Michael has
thoughtfully mentioned some of those improvements on his website if
anyone wants to pick up where we left off.

> >Enjoy!
> >
> >(BTW, the execution profile referred to in the paper has not yet
> >been uploaded, so for the time being, that's a 404. ;-)

(I've sent it to Michael, so I imagine it will appear there soon.)

Scott
--
Scott Hemphill hemp...@alumni.caltech.edu
"This isn't flying. This is falling, with style." -- Buzz Lightyear

BLuRry

unread,
Jul 12, 2006, 2:03:32 PM7/12/06
to
> What's the point of having a program for a relatively new game for
> a computer that hasn't been made in almost twenty years, but then
> you are assuming the most interest will be in running it on an emulator?

A program is a program... I write lots of programs that run within a
emulator for a non-existant machine and make good money doing it. ;-)
Applewin + soduku < 1mb of download. This is still within the realm of
feasibility when you look at how big most games are these days.

> At that point, you aren't talking about some functionality of an old
> computer, you are doing it for no good reason.

Is there ever any "good" reason for playing games? I do crazy things
with apple //'s because, well, I can. And because I was frustrated I
couldn't at an earlier age in life when I had a real //. I'm sure that
others have their own reasons that equally valid. ;-)

> And once you move away from emulators, you do start limiting the
> computers that can run the program when you use a 65C02.

Well, kind of. It's easier to find apple //e's and apple //c's on
ebay, and altogether there were more of those produced than older ]['s.
I agree for the purpose of total nostalgia that you would want
complete portability (which is why I rewrote a2gameserver to support
old ][ computers as well) but ultimately it is harder to find an older,
non 65c02 or 65816 based apple computer than a legacy 6502 machine.

> In other words, you may be limiting who can run this program on an actual Apple II.

I agree that one should be sensitive to that if there was tangible
penalty for limiting the audience. Such as, if this were a commercial
product being sold. But for free, I'll take what I can get! :-)

> But the question still remains, why is it so important to use
> the few instructions that are unique to the 65C02? They may be nice
> additions, but lots of people survived without them.

I think that the extended opcodes were used in developing the algorithm
to make it as efficient as possible. MJ Mahon's page has a pretty
awesome amount of detail around how this thing evolved TSB was one of
the opcodes used at some point, of which there is no easy equivilant
without storing the accumulator somewhere, loading a value, performing
the bit operations, etc... Would be pretty messy in 6502, and
definately less optimal. There are also a lot of table lookups, which
may have been optimized with the extra opcodes (I'm guessing -- I
haven't read thru all the source though it is fascinating work)

Paul Schlyter

unread,
Jul 12, 2006, 4:43:38 PM7/12/06
to
In article <1152715460.7...@h48g2000cwc.googlegroups.com>,
BLuRry <brendan...@gmail.com> wrote:

>> I don't believe that! Check out this Sudoku solver:
>> http://homepage.ntlworld.com/valleyway/solver.html
>>
>> It's written in Javascript, which of course is a quite inefficient
>> programming language - yet it needs less than half a second to solve
>> a sudoku, on a 4+ year old PC running at a moderate 1.8 GHz clock speed.
>
> Javascript isn't all that bad. :-P Sure, it's not exactly suited for
> realtime applications, but once the code has been interpreted and all
> data structures are created, it's pretty decent actually.

...you mean when the program has reached the end of execution?
Interpretation doesn't stop until the program reaches its end -- unless
the "interpreter" actually is a semi-compiler....

> I've found it to be not much slower than, say, Perl.

It probably won't be much slower than interpreted BASIC either....

> Some things in the language are actually a lot faster than you think!
> Dynamically-created data structures are not a strong-point of javascript
> when speed is concerned though.

Interesting --- because dynamically created data structures must be
handled at runtime, even in compiled languages.



> http://www.jorendorff.com/articles/javascript/speed-test.html
>
> I don't see enough people rag on the speed of, say, Ruby or Python.
> :-D
>
>> ...but why does it require a 65C02 ? Couldn't it have been written
>> in plain 6502 assembly language?
>
> *sigh* Why are there so many anti-65c02 enthusiasts out there?

I'm not anti-65C02, but my good ol' Apple II+ is. And there's little
I can do about that, unfortunately.

> Can't you afford a free emulator that can handle the opcodes

I can afford a modern computer, but I see little point in using it
for running some "lobotomy software", such as emulators for older,
less capable, computers.

> or shell out $6 for a new 65c02 processor?

In the early 1980's I shelled out $50 for a 65C02! That's probably
some $250 in today's money .... however I had little use for it.
I tried it in my Apple II+, which just locked up. I also tried it on
some Apple IIe's, and there it worked fine. So the CPU worked OK.
I still have it.....

> But, seriously, I have a strong preference for 65c02 opcodes in my
> code as well becase it's a slight bit more convienent. Why limp
> when you can walk? (or use 65816 and go for a jog ;-)

...if so, why not fly with a modern 32-bit or 64-bit CPU ? ;-)

> -B

OK, you program for your own enjoyment, not for the utility of others.
All fine and well, we all choose why we do what we do -- but it's good
to know that it's no use trying your software on Apple II's earlier
than the IIe....

Paul Schlyter

unread,
Jul 12, 2006, 4:43:38 PM7/12/06
to
In article <m364i29...@jade.local>,
Scott Hemphill <hemp...@alumni.caltech.edu> wrote:

:-) ....this shows the importance of carefully specifying the initial conditions
in your benchmark. I just typed in some numbers until I got a unique solution,
and used that as a test case.

>> It's written in Javascript, which of course is a quite inefficient
>> programming language - yet it needs less than half a second to solve
>> a sudoku, on a 4+ year old PC running at a moderate 1.8 GHz clock speed.
>
> I don't know how inefficient Javascript is. Is this a good speed? My
> C program solves the evil01 puzzle in about 2 milliseconds.
>
>> You must have encountered some really sloppily written sudoku solvers
>> for the PC if they required several seconds to solve a sudoku....
>
> Yup, if by sloppy you mean not choosing the most efficient algorithms.

That's indeed part of sloppiness!


>>> but Scott's solver is so time and space efficient that it
>>>solves puzzles in seconds *running on an Apple II*!
>>>
>>>Check it out at:
>>>
>>>http://members.aol.com/mjmahon/Sudoku.html
>>>
>>>It is available both as a ShrinkIt disk archive and as a .dsk image.
>>
>> ...but why does it require a 65C02 ? Couldn't it have been written
>> in plain 6502 assembly language?
>
> It requires a 65C02 because that's the way I initially wrote the
> assembly language solver. As it underwent various transformations
> I started migrating it towards the 6502, but we decided to keep the
> speed and release the code.

APproximately how much slower do you think a 6502 version would have
been? 10% slower? Twice as slow? 10 times as slow?

> It's public domain, with all sources
> provided, so anyone is free to modify it for the 6502. Furthermore,
> we didn't put in all the speed tweaks we thought of, and Michael has
> thoughtfully mentioned some of those improvements on his website if
> anyone wants to pick up where we left off.

What assembler did you use?



>>>Enjoy!
>>>
>>>(BTW, the execution profile referred to in the paper has not yet
>>>been uploaded, so for the time being, that's a 404. ;-)
>
> (I've sent it to Michael, so I imagine it will appear there soon.)
>
> Scott

--

Michael J. Mahon

unread,
Jul 12, 2006, 4:44:33 PM7/12/06
to
Paul Schlyter wrote:
> In article <XtudnVI15qUVVCnZ...@comcast.com>,
> Michael J. Mahon <mjm...@aol.com> wrote:
>
>>Scott Hemphill and I recently collaborated to create a Sudoku puzzle
>>solver for Apple II computers. The machine language solver is Scott's,
>>with some adaptations by me, and I wrote the interactive Applesoft
>>front-end.
>>
>>SUDOKU runs on any Apple II with 80-column firmware and a 65C02
>>processor, which includes the IIc, the IIc+, the Enhanced //e, and
>>the IIgs.
>>
>>It is amazingly fast! Most Sudoku solvers run on modern PCs thousands
>>of times faster than a 1MHz Apple II, and take seconds to solve a
>>puzzle,
>
>
> I don't believe that! Check out this Sudoku solver:
> http://homepage.ntlworld.com/valleyway/solver.html
>
> It's written in Javascript, which of course is a quite inefficient
> programming language - yet it needs less than half a second to solve
> a sudoku, on a 4+ year old PC running at a moderate 1.8 GHz clock speed.
>
> You must have encountered some really sloppily written sudoku solvers
> for the PC if they required several seconds to solve a sudoku....

That may be--I haven't actually tried a bunch of PC Sudoku solvers.

If they are also "tidy" enough, perhaps one of these faster (and
hopefully small) solvers could also have been ported to the Apple II.

I probably should have been satisfied simply to say that it's fast. ;-)

>>but Scott's solver is so time and space efficient that it
>>solves puzzles in seconds *running on an Apple II*!
>>
>>Check it out at:
>>
>>http://members.aol.com/mjmahon/Sudoku.html
>>
>>It is available both as a ShrinkIt disk archive and as a .dsk image.
>
>

> ....but why does it require a 65C02 ? Couldn't it have been written


> in plain 6502 assembly language?

I usually enforce that restriction on my own code, and am seldom much
constrained by it. However, in this case, a few 65C02 instructions
relieve register pressure in inner loops, making a quite measurable
difference in performance (but still less than a factor of 2, so a
6502 version is quite practical).

I had it on my list to do a 6502 version, but thought it would be
interesting to get this version out for people to play with.

Note that it also depends on the 80-column firmware control characters,
so that pretty well pushes users toward 65C02 machines. The only one
that is "popular" and left out is the unenhanced //e. ;-(

>>Enjoy!
>>
>>(BTW, the execution profile referred to in the paper has not yet
>>been uploaded, so for the time being, that's a 404. ;-)

This has now been uploaded, so the profile is there. ;-)

-michael

Fast Sudoku solver for Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design

tool--and it's seriously underused."

Michael J. Mahon

unread,
Jul 12, 2006, 4:57:41 PM7/12/06
to
Michael Black wrote:

<snip>

> And once you move away from emulators, you do start limiting the
> computers that can run the program when you use a 65C02. In other
> words, you may be limiting who can run this program on an actual Apple
> II.

Well, I began the limitation by exploiting the 80-column firmware
(in 40-column mode ;-) to aloow control characters to clear to end
of page, etc. That could also be re-done, but I left it for later
because I figured that most people had access to machines that had
the firmware--and a 65C02 as well (though I realize that the
unenhanced //e doesn't).

> But the question still remains, why is it so important to use
> the few instructions that are unique to the 65C02? They may be nice
> additions, but lots of people survived without them.

I don't think that it is, as a rule.

In this case, I looked at the inner loops, where every cycle counts,
and couldn't see any "free" or even "inexpensive" way to do away with
the 65C02isms. There are only a few, like LDA (ptr) when Y is busy with
some other useful value. Having to save it and set it to 0, then
restore it, seems a little expensive if you're trying to go fast.

Almost all other cases, like BRA and INA, can be eliminated when the
time comes without penalty, and I've already removed some.

But until I bring myself to remove *all* 65C02 ops, and take the
time penalty, there's little reason to fix a few.

So--I agree with you in principle, but speed and "time to market"
were more important to me as I'm about to leave town for a week. ;-)

-michael

Fast Sudoku solver for Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design

tool--and it's seriously underused."

Michael J. Mahon

unread,
Jul 12, 2006, 5:04:20 PM7/12/06
to
BLuRry wrote:
> Michael J. Mahon wrote:
>
>>Scott Hemphill and I recently collaborated to create a Sudoku puzzle
>>solver for Apple II computers. The machine language solver is Scott's,
>>with some adaptations by me, and I wrote the interactive Applesoft
>>front-end.
>
>
> Holy algorithm, Batman! That is cool stuff. Very excellent algorithm
> design! A few moot points to consider if you're a total perfectionist:
>
> - Could it be possible to support arrow keys?

The arrow keys are already supported--did you try them?

> - Could it use mousetext for the display, since it's already singled
> out to use 65c02, and thus mainly enhanced //'s anyway?

That was both "over the top" and more of a mess in the Applesoft
program. Maybe someone else would like to try it?

Frankly, I kind of like its ugly little text face. ;-)

> - Maybe make some boxes to uneditable (e.g. keep them hilighted) so
> that the original numbers can remain unaltered.

Not sure I know what you mean here... After a complete solution, the
grid is returned to its original state. Do you mean provide protection
of the user from himself? ;-)

When I change something I didn't intend to, I just reload the original
puzzle (which I save as soon as I've entered it).

> Whether you even care about these suggestions is totally up to you. ;-)
> Awesome stuff, guys! Keep up the great work!

Thanks! It was great fun working together!

-michael

Fast Sudoku solver for Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design

tool--and it's seriously underused."

Michael J. Mahon

unread,
Jul 12, 2006, 5:14:53 PM7/12/06
to
Michael J. Mahon wrote:

> But until I bring myself to remove *all* 65C02 ops, and take the
> time penalty, there's little reason to fix a few.

I should also note that it is currently ProDOS-based, and is
distributed with a 65C02 version, 2.0.3, so that would also
need to be changed.

A DOS 3.3 version would not be hard (though I used a
subdirectory for the puzzles for convenience).

Michael J. Mahon

unread,
Jul 12, 2006, 5:18:35 PM7/12/06
to
Michael J. Mahon wrote:

> (BTW, the execution profile referred to in the paper has not yet
> been uploaded, so for the time being, that's a 404. ;-)

The profile of the solver on "evil01" is there now. ;-)

-michael

Fast Sudoku solver for Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design

tool--and it's seriously underused."

Michael J. Mahon

unread,
Jul 12, 2006, 5:23:13 PM7/12/06
to
Paul Schlyter wrote:

> APproximately how much slower do you think a 6502 version would have
> been? 10% slower? Twice as slow? 10 times as slow?

Certainly less than twice as slow. I'll find out what the current
speed impact would be...

But, as noted, there are other things about the program(s) that
would keep it from running on your ][+, too--all fixable with
some more effort...

-michael

Fast Sudoku solver for Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design

tool--and it's seriously underused."

Michael J. Mahon

unread,
Jul 12, 2006, 6:34:12 PM7/12/06
to
Michael J. Mahon wrote:
> Paul Schlyter wrote:
>
>> APproximately how much slower do you think a 6502 version would have
>> been? 10% slower? Twice as slow? 10 times as slow?
>
>
> Certainly less than twice as slow. I'll find out what the current
> speed impact would be...

Paul's message prompted me to re-look at where 65C02 ops are used, and
they are all quite easy to eliminate--most with no or negligible time
impact.

The only one that has a measurable impact is the LDA (p) in setbox,
and that can be eliminated by just doing an LDA bit rather than a TXA,
which frees up the X register to hold 0, so that the LDA (p) can become
LDA (p,x). (I don't get to use that addressing mode much--even if it is
with a constant 0 index. ;-)

The net impact is +1 cycle in a 166K loop, for a total time cost of 0.17
seconds!

So I think I'll make the changes and replace ProDOS with an earlier
version, so unenhanced //e's will run the program, too.

BTW, this is a good example of why it's worth reevaluating earlier
decisions later in the game. I had taken a quick look at this a
week ago, and, without really trying, didn't see an obvious fix.
Now it all seems pretty easy. ;-)

So I guess this is another confirmation of my belief that the 65C02
extensions were largely unnecessary.

> But, as noted, there are other things about the program(s) that
> would keep it from running on your ][+, too--all fixable with
> some more effort...

There's still that 80-column firmware issue... Hmmm...

Chris Alaimo

unread,
Jul 12, 2006, 6:34:22 PM7/12/06
to
By your logic, no one should ever write an Apple II program using 65c02
language because then not every A2 owner can run it. How many Apple II
enthusiasts do you know that don't have AT LEAST one 65c02-equipped
Apple in their arsenal? I think I only have one Apple II that DOESN'T
have a 65c02 in it (or a processor capable of running 65c02 code).

Chris

Michael J. Mahon

unread,
Jul 12, 2006, 6:43:46 PM7/12/06
to
Chris Alaimo wrote:
> By your logic, no one should ever write an Apple II program using 65c02
> language because then not every A2 owner can run it. How many Apple II
> enthusiasts do you know that don't have AT LEAST one 65c02-equipped
> Apple in their arsenal? I think I only have one Apple II that DOESN'T
> have a 65c02 in it (or a processor capable of running 65c02 code).

Although it may sound strange, I think I might try to make exactly
that argument.

In the vast majority of cases, the 65C02 operations bring almost no
advantage to the resourceful programmer, while limiting the "market"
for his product by a significant fraction.

Only when rigid space or time constraints require it would I plan
on using 65C02 ops, and I can count the possible cases on one hand.

It seems to me that the 65C02 extensions were "too little, too late".
They provided a very small advantage in very few cases and at the
significant cost of creating a completely avoidable compatibility
problem.

When an architecture, warts and all, has been around long enough to
be widely exploited, and everyone has learned to live with it, it
doesn't make any sense to make *minor* improvements and break
unconditional compatibility.

As long as there are *any* 6502 Apples out there, that is an overriding
reason to write to the 6502, especially since the cost for doing so is
so low.

In the case of the Sudoku solver, a 30-second full solution is
increased to 30.17 seconds to remain completely compatible!

-michael

Fast Sudoku solver for Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design

tool--and it's seriously underused."

BLuRry

unread,
Jul 12, 2006, 7:40:09 PM7/12/06
to
> > - Could it be possible to support arrow keys?
> The arrow keys are already supported--did you try them?
It didn't work for some strange reason. Maybe applewin doesn't like my
laptop (of course I am using a beta release... who knows?)

> > - Could it use mousetext for the display, since it's already singled
> > out to use 65c02, and thus mainly enhanced //'s anyway?
> That was both "over the top" and more of a mess in the Applesoft
> program. Maybe someone else would like to try it?

I wouldn't mind doing that part. Yeah, it's ugly but I've done
mousetext from applesoft basic once before.

> Frankly, I kind of like its ugly little text face. ;-)

Oh, it's not that bad. The +'s and |'s and -'s give it a leet-looking
appearance that is familar and reminiscent of the earlier ][ programs.
Or .NFO files. I'm working on a 40-col umn DHGR font and a display
method to render a 40-col screen to the DHGR page, reminscient of the
beagle bros and paul lutus type programs that make the hires screen
emulate the text screen. this might also be an option for a minimally
painful look-n-feel overhaul if I ever finish it (free time has been a
little fleeting lately)

> > - Maybe make some boxes to uneditable (e.g. keep them hilighted) so
> > that the original numbers can remain unaltered.
>
> Not sure I know what you mean here... After a complete solution, the
> grid is returned to its original state. Do you mean provide protection
> of the user from himself? ;-)

Yes, sorry. That was pitiful grammar on my part. Reloading the puzzle
makes sense. However, I've found that keeping the original numbers
"locked" and hilighted is somewhat useful when I'm trying to figure out
where I messed up or determine which numbers are definately parts of
the solution. Once you're 75% into the puzzle it makes it harder to
backtrack otherwise -- similarly with the numbers you enter and are
completely sure of. But, like I said, this is relatively moot.

Bryan Parkoff

unread,
Jul 12, 2006, 8:01:27 PM7/12/06
to

"Michael J. Mahon" <mjm...@aol.com> wrote in message
news:XtudnVI15qUVVCnZ...@comcast.com...

Michael,

I understand that space saving memory and efficient optimization on
6502/65C02 processor is best choice. How do you assume that optimized
Sudoku solving writted in x86 assembly and 6502/65C02 assembly might have
the impact speed? Is it possible that x86 assembly runs at 1GHz might be
faster or almost close to the same speed to 6502/65C02 assembly at 1MHz?
One paragraph of your website states that optimization is evil. I guess
that many programmers and designer of processor are not smart enough to
study space saving memory and efficient optimization when they should try to
write algothrims in smaller functions.
I expect them to spend extra time and money to write advanced x86
assembly rather than high level language so they can write PC Operating
System and games. They can outperform better. It is wasting time and money
to develop modern x86 processors by trying to improve and increase the speed
while they forget all the optimization. It is too much dependant on speed
without optimization so software may operate little fair or poor speed.
How could you predict if improved processor under the design of 6502 may
run at 1GHz with 32 bits and 64 bits? They may have better space saving
memory and efficient speed than x86 processors. If it does exist,
programmers may expect modern emulator projects to be written in 6502
assembly might be slower than x86 processor like RISC processor. Do you
expect?

Bryan Parkoff


Scott Hemphill

unread,
Jul 12, 2006, 10:40:51 PM7/12/06
to
pau...@saaf.se (Paul Schlyter) writes:

> In article <m364i29...@jade.local>,
> Scott Hemphill <hemp...@alumni.caltech.edu> wrote:
>
> > pau...@saaf.se (Paul Schlyter) writes:

[snip]

> >> You must have encountered some really sloppily written sudoku solvers
> >> for the PC if they required several seconds to solve a sudoku....
> >
> > Yup, if by sloppy you mean not choosing the most efficient algorithms.
>
> That's indeed part of sloppiness!

To be fair, some of these programs were explicitly written with the
idea of using the same techniques that humans use to solve the
puzzles, and some use this feature to assign a puzzle difficulty that
is supposed to correspond in some degree to the difficulty a human
would have.

If you are only interested in computer solution, however, and you want
to be able to solve any puzzle, then you have to implement backtracking.
Backtracking is so simple for computers (but painful for humans) that
it makes sense for it to play a major role in computer solution.

[snip]

> APproximately how much slower do you think a 6502 version would have
> been? 10% slower? Twice as slow? 10 times as slow?

Apparently under 1% slower, and it looks like Michael might put out a
6502 version in the next couple of days.

[snip]

> > It's public domain, with all sources
> > provided, so anyone is free to modify it for the 6502. Furthermore,
> > we didn't put in all the speed tweaks we thought of, and Michael has
> > thoughtfully mentioned some of those improvements on his website if
> > anyone wants to pick up where we left off.
>
> What assembler did you use?

Merlin-8.

Michael J. Mahon

unread,
Jul 12, 2006, 10:46:31 PM7/12/06
to
BLuRry wrote:
>>>- Could it be possible to support arrow keys?
>>
>>The arrow keys are already supported--did you try them?
>
> It didn't work for some strange reason. Maybe applewin doesn't like my
> laptop (of course I am using a beta release... who knows?)
>
>
>>>- Could it use mousetext for the display, since it's already singled
>>>out to use 65c02, and thus mainly enhanced //'s anyway?
>>
>>That was both "over the top" and more of a mess in the Applesoft
>>program. Maybe someone else would like to try it?
>
>
> I wouldn't mind doing that part. Yeah, it's ugly but I've done
> mousetext from applesoft basic once before.

Be forewarned--I'm thinking about having a small M/L routine just
move the screen image into place--is that compatible with mousetext?

And this is kind of going in the wrong direction, since I'm trying
to make it universal, so it runs on *all* Apple II's...

>>Frankly, I kind of like its ugly little text face. ;-)
>
>
> Oh, it's not that bad. The +'s and |'s and -'s give it a leet-looking
> appearance that is familar and reminiscent of the earlier ][ programs.
> Or .NFO files. I'm working on a 40-col umn DHGR font and a display
> method to render a 40-col screen to the DHGR page, reminscient of the
> beagle bros and paul lutus type programs that make the hires screen
> emulate the text screen. this might also be an option for a minimally
> painful look-n-feel overhaul if I ever finish it (free time has been a
> little fleeting lately)

After you use it for awhile, the framing characters drop out of
consciousness entirely, and only the numbers are there.

BTW, animating this screen in HGR or DHR would have big speed effects!

>>>- Maybe make some boxes to uneditable (e.g. keep them hilighted) so
>>>that the original numbers can remain unaltered.
>>
>>Not sure I know what you mean here... After a complete solution, the
>>grid is returned to its original state. Do you mean provide protection
>>of the user from himself? ;-)
>
>
> Yes, sorry. That was pitiful grammar on my part. Reloading the puzzle
> makes sense. However, I've found that keeping the original numbers
> "locked" and hilighted is somewhat useful when I'm trying to figure out
> where I messed up or determine which numbers are definately parts of
> the solution. Once you're 75% into the puzzle it makes it harder to
> backtrack otherwise -- similarly with the numbers you enter and are
> completely sure of. But, like I said, this is relatively moot.

Um, editing the puzzle after solution has started can't be done.
You have to re-start the solver.

BLuRry

unread,
Jul 12, 2006, 11:49:54 PM7/12/06
to
> BTW, animating this screen in HGR or DHR would have big speed effects!

Yeah, I know. I remember some of the good ol' softdisk programs being
a big sluggish. But if optimized (e.g. only update the characters that
have actually changed) it wouldn't be all that unbearable.

> Um, editing the puzzle after solution has started can't be done.
> You have to re-start the solver.

I meant more along the lines of this: http://www.websudoku.com/

(the original provided numbers of the puzzle are not editable, sort of
a way to prevent you from shooting your own feet.)

aiia...@gmail.com

unread,
Jul 12, 2006, 11:56:23 PM7/12/06
to
BLuRry wrote:
> However, I've found that keeping the original numbers
>"locked" and hilighted is somewhat useful when I'm trying to figure out
>where I messed up or determine which numbers are definately parts of
>the solution.

I haven't tried running the program yet...

but you could:

1)inverse the original puzzle
2)inverse numbers you input
3)surround original puzzle digits with - characters or something


Rich

Michael J. Mahon

unread,
Jul 13, 2006, 1:48:50 AM7/13/06
to

OK, I've done the solver, and in 6502 code, it's 0.85% slower
than the 65C02 version! Looks like a no-brainer... ;-)

Michael J. Mahon

unread,
Jul 13, 2006, 4:21:49 AM7/13/06
to
BLuRry wrote:
>>BTW, animating this screen in HGR or DHR would have big speed effects!
>
>
> Yeah, I know. I remember some of the good ol' softdisk programs being
> a big sluggish. But if optimized (e.g. only update the characters that
> have actually changed) it wouldn't be all that unbearable.

No, it would still be unbearable.

The animation that SUDOKU does is not 15fps or 30fps. It's doing
about 525 screen updates per second! And if you skip any of them,
you have to look at all 81 digits, because you don't know which ones
have changed! So anything you hope to make up in the straightaway
you will lose in the curves. ;-)

This is a problem that hardware character generators were invented
to solve, and every Apple II has one of its very own. ;-)

Every 'setbox' and 'unsetbox' places and removes a digit from the
screen memory, and each such operation requires 37 cycles. All but
9 of those are spent just getting the screen address to update, and
only 9 are spent changing the screen memory. If you have to update
7-8 bytes of HGR memory, or 14-16 bytes of DHR memory, with a pattern
that will take over a dozen cycles to set up addressing for, things
will get s-l-o-w-e-r.

(BTW, an accelerator doesn't help as much as you'd like, since all
the screen stores have to run at 1MHz.)

>>Um, editing the puzzle after solution has started can't be done.
>>You have to re-start the solver.
>
>
> I meant more along the lines of this: http://www.websudoku.com/
>
> (the original provided numbers of the puzzle are not editable, sort of
> a way to prevent you from shooting your own feet.)

Ah. I suppose this could be useful for experimentation with a puzzle.

I never thought of the program as a "paper" for solving puzzles by
hand, only as a machine solver... To be used as a "coach", a very
different interface and "coaching" functions would be appropriate.

You might want to write such a program. ;-)

It wouldn't be too hard to add uneditable boxes to the BASIC program,
just use a different representation for the "givens", like inverse text
(though it looks a little sloppy in the grid, since inverse digits tend
to merge with the major dividing "lines"). And the solver would have to
ignore that distinction by masking the grid entries--no problem.

Have at it! But you might want to wait a couple of days to let me
get a new version re-converged.

Frankly, I personally don't have much interest in making this program
more like some other program--as I'm sure you can understand. ;-)

Think of this program more like a "prover"--it tells you whether a
puzzle is inconsistent, has a unique solution, or has many solutions--
but it doesn't help you design puzzles or decide what to do next.

Michael J. Mahon

unread,
Jul 13, 2006, 4:24:58 AM7/13/06
to

Yep, 1)'s my first idea. The only snag is that inverse characters
come in contact with the inverse "major" grid lines, and so tend
to "attach" themselves, which is unattractive.

My real issue is that I don't see the purpose, or the purpose is
at odds with our original design intent.

Michael J. Mahon

unread,
Jul 13, 2006, 4:47:31 AM7/13/06
to

Surely you jest!

x86 code would be approximately as efficient in terms of space and ops
required to get to a solution, but would execute those ops about 6000
times faster! In fact, Scott's compiled C code for the algorithm solves
the 'evil01' puzzle in about 2 milliseconds, while a 1MHz Apple II takes
about 20 seconds--so there's a factor of 10,000.

> One paragraph of your website states that optimization is evil. I guess
> that many programmers and designer of processor are not smart enough to
> study space saving memory and efficient optimization when they should try to
> write algothrims in smaller functions.

"*Premature* optimization is the root of all evil" is a well-known
software engineering aphorism. It refers to the fact that a well-
structured but inefficent program can be optimized later without much
difficulty, but a program that has become twisted and "spaghettified"
in an effort to speed it up a little is very difficult to improve--and
most big improvements are algorithmic or data structure related, not
just coding hacks.

The meaning of the aphorism is to refrain from optimizing until you have
a reasonably well-structured program actually running. Then you can
measure it and determine exactly which 10% of it actually makes a real
difference in its efficiency. Optimization isn't bad, but optimizing
before you know what to optimize or why it matters is *very* bad. It's
a waste of time and it hardens otherwise malleable code.

> I expect them to spend extra time and money to write advanced x86
> assembly rather than high level language so they can write PC Operating
> System and games. They can outperform better. It is wasting time and money
> to develop modern x86 processors by trying to improve and increase the speed
> while they forget all the optimization. It is too much dependant on speed
> without optimization so software may operate little fair or poor speed.

As you know from my previous posts, I really lament the passing of the
age of craftsmanship in most software projects. But I also recognize
that the vast majority of code that is written is executed very seldom
if at all (think about the percentage of code that is error recovery
code). It is unreasonable to expect that most of the large applications
that we use today would be written in assembly language. (However, I
would like to see measurement-targeted dynamic optimization flourish--
maybe in another decade...)

> How could you predict if improved processor under the design of 6502 may
> run at 1GHz with 32 bits and 64 bits? They may have better space saving
> memory and efficient speed than x86 processors. If it does exist,
> programmers may expect modern emulator projects to be written in 6502
> assembly might be slower than x86 processor like RISC processor. Do you
> expect?

There could hardly be a more apples and oranges comparison. The 6502
architecture is very different from the x86 architecture in *many* ways.
The efficiency of a computer architecture is not an absolute, but is
completely relative to the nature of the workload that it will process,
and the cost point which it must satisfy.

For two very similar architectures, or for two implementations of the
same architecture, relative evaluations are much simpler, but are still
based on predetermined workload samples.

In the case of the Sudoku solver, I think that the empirical ratio of
performance between a 1MHz 6502 and a 3GHz 4-way superscalar x86 is in
rough agreement with their expected op throughput. Of course, the 6502
requires much less hardware to achieve its performance, and could, in
principle, run 20-50 times faster with essentially the same logic design
in modern technology, but that is all speculation.

-michael

Fast Sudoku solver for Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design

tool--and it's seriously underused."

David Empson

unread,
Jul 13, 2006, 9:08:34 AM7/13/06
to
Scott Hemphill <hemp...@hemphills.net> wrote:

> pau...@saaf.se (Paul Schlyter) writes:
>
> > In article <XtudnVI15qUVVCnZ...@comcast.com>,
> > Michael J. Mahon <mjm...@aol.com> wrote:
> > >It is amazingly fast! Most Sudoku solvers run on modern PCs thousands
> > >of times faster than a 1MHz Apple II, and take seconds to solve a
> > >puzzle,
> >
> > I don't believe that! Check out this Sudoku solver:
> > http://homepage.ntlworld.com/valleyway/solver.html
>
> I don't know about _most_ Sudoku solvers, but there are a lot of slow ones
> out there. And in terms of taking seconds, I think Michael was talking
> about difficult puzzles, not easy ones. Our most frequently used test
> puzzle was one we called "evil01". I loaded this into the solver you
> mentioned and it took about 4 seconds on my 3GHz machine. Here's the
> puzzle:
>
> .2.......
> ...6....3
> .74.8....
> .....3..2
> .8..4..1.
> 6..5.....
> ....1.78.
> 5....9...
> .......4.

That's certainly a tough one. I regularly do the puzzles in the local
paper, and this one took me about 1h10m, which is twice as long as the
typical time I take for a "diabolical" puzzle.

Only one speculative move and backtrack required, but I had to stretch
my usual techniques somewhat to avoid guesswork near the beginning.

The Sudoku program on my PDA took about ten seconds to verify it was
solvable (it usually takes well under a second).

--
David Empson
dem...@actrix.gen.nz

Scott Hemphill

unread,
Jul 13, 2006, 9:23:34 AM7/13/06
to
dem...@actrix.gen.nz (David Empson) writes:

If you think that one's tough, try this one. It's much harder, at least
for my algorithm. I don't know where I got it, but I saved it under the
filename "hints17-2". I have four files of the form "hints17-x", all
very tough, but this one is the worst:

..1.2.7..
.5.....9.
...4.....
.8...5...
.9.......
....6...2
..2......
..6.....5
.....9.83

Hmmm. I just noticed that there are only 17 givens in this puzzle, which
is the minimum number known to be required. (What I mean is that although
no one has proved a minimum number, there are no known puzzles with
as few as 16 givens that result in a unique solution.) That might account
for the "17" in the filename.

BLuRry

unread,
Jul 13, 2006, 9:39:53 AM7/13/06
to

Michael J. Mahon wrote:
> BLuRry wrote:
> >>BTW, animating this screen in HGR or DHR would have big speed effects!
> >
> >
> > Yeah, I know. I remember some of the good ol' softdisk programs being
> > a big sluggish. But if optimized (e.g. only update the characters that
> > have actually changed) it wouldn't be all that unbearable.
>
> No, it would still be unbearable.
>
> The animation that SUDOKU does is not 15fps or 30fps. It's doing
> about 525 screen updates per second! And if you skip any of them,
> you have to look at all 81 digits, because you don't know which ones
> have changed! So anything you hope to make up in the straightaway
> you will lose in the curves. ;-)

AH! I see what you mean now -- I didn't realize it was doing screen
updates while searching for the solution. It's a little too fast to
catch that, you see? I think it must have been doing all its
calculations during one VBL cycle. ;-) I was talking more about just
the UI that you interact with to solve the puzzle.

> (BTW, an accelerator doesn't help as much as you'd like, since all
> the screen stores have to run at 1MHz.)

That and having a large font table would not have enough cache locality
to be very efficient unless you happen to be storing the same character
everywhere. Either way, I wouldn't bet on an accelorator making things
much better -- and also I wouldn't want to expect the average apple //
user to have one.

-B

BLuRry

unread,
Jul 13, 2006, 9:44:09 AM7/13/06
to
> Yep, 1)'s my first idea. The only snag is that inverse characters
> come in contact with the inverse "major" grid lines, and so tend
> to "attach" themselves, which is unattractive.

There's always a simple solution, like surrounding the numbers with
asterisks or periods or some other character that "shades" the cell.

> My real issue is that I don't see the purpose, or the purpose is
> at odds with our original design intent.

well, that is very fair. As a prover, this program more than meets the
requirements, and very ellegantly for that matter. I can understand
why you wouldn't want to duplicate features of other suducko solver
programs out there, too much UI, and a total pain to write in basic.

I wouldn't mind rewriting the ui to be fancier, but honestly I have
other projects backlogged that I promised myself to work on first (e.g.
get a2gameserver to work properly for //e's, rewrite my raycaster
engine for double-hires and fix bugs in it, etc)

-B

David Empson

unread,
Jul 13, 2006, 9:44:12 AM7/13/06
to
Scott Hemphill <hemp...@hemphills.net> wrote:

> If you think that one's tough, try this one. It's much harder, at least
> for my algorithm. I don't know where I got it, but I saved it under the
> filename "hints17-2". I have four files of the form "hints17-x", all
> very tough, but this one is the worst:
>
> ..1.2.7..
> .5.....9.
> ...4.....
> .8...5...
> .9.......
> ....6...2
> ..2......
> ..6.....5
> .....9.83

Not even symmetrical. (Shudder.)

Interestingly my PDA was able to verify this one much faster. I don't
have time to look at it now, but I'll get to it in a few days.

--
David Empson
dem...@actrix.gen.nz

Paul Schlyter

unread,
Jul 13, 2006, 12:14:01 PM7/13/06
to
In article <NpidnWmquZS-5ijZ...@comcast.com>,

Michael J. Mahon <mjm...@aol.com> wrote:

A 6502 assembly language programmer using 65C02 specific ops would be
compared to one who uses an original 6502 and there utilizes the
undocumented 6502 ops which certainly will fail on a 65C02. And for
the same reason: marginal speed and space advantages.

--

Paul Schlyter

unread,
Jul 13, 2006, 12:14:01 PM7/13/06
to
In article <1152743662.1...@p79g2000cwp.googlegroups.com>,
Chris Alaimo <cpal...@gmail.com> wrote:

> By your logic, no one should ever write an Apple II program using 65c02
> language because then not every A2 owner can run it. How many Apple II
> enthusiasts do you know that don't have AT LEAST one 65c02-equipped
> Apple in their arsenal?

I know myself pretty well... :-) ...and I have only one Apple II: my
very first personal computer, bought in January 1980, before the 65C02
even existed. Perhaps you think my posession of only one single Apple II
disqualifies me as an "Apple II enthusiast" ????

Michael J. Mahon

unread,
Jul 13, 2006, 2:05:48 PM7/13/06
to
Paul Schlyter wrote:
> In article <NpidnWmquZS-5ijZ...@comcast.com>,
> Michael J. Mahon <mjm...@aol.com> wrote:
>
> A 6502 assembly language programmer using 65C02 specific ops would be
> compared to one who uses an original 6502 and there utilizes the
> undocumented 6502 ops which certainly will fail on a 65C02. And for
> the same reason: marginal speed and space advantages.

An interesting comparison.

While quite different in circumstance (since, at least early in the
cycle one might assume that the legitimate extended variant of an old
standard will itself become the "standard"), I do agree that the results
of such "standard-splintering" events is quite similar. They fragment
the market with potentially negative results.

An example I am too familiar with is Appleworks. By version 3.0, it
was the most widely used integrated package in the world, and had
become a "platform" for applications in its own right. Then, as the
Apple II market was losing momentum, two new versions were introduced
(4 and 5), and, while they never became the dominant version, they did
serve to splinter the Appleworks market, so that the cohesion and
synergy was effectively lost. It was all downhill from there.

Of course, the Appleworks community was destined to decline anyway
as the hardware platform fell behind, but the Appleworks "improvements"
actually hastened the disintegration of its community.

In a world where models become obsolete and are either retired or
replaced by newer models, the older standard simply vanishes from
the stage, and the "evolution with backward compatibility" model
works well enough.

But in the case of retrocomputing, older models *never* disappear,
so writing for the lowest common denominator *where practical*
should be the rule.

Michael J. Mahon

unread,
Jul 13, 2006, 2:39:19 PM7/13/06
to

Scott and I have both noticed that the time for a backtrack solution
depends very strongly on the order in which the boxes are scanned for
a trial placement.

This puzzle is on the SUDOKU disk as VEVIL (Very evil ;-). I also added
the 180-degree rotation of this puzzle as RVEVIL, where the solver will
effectively scan the grid in the reverse order. The solver takes 3.5
times longer to solve the rotated puzzle!

Scott did an experiment in which the solver scanned all the "evil"
puzzles in reverse (actually, upper left to lower right) and in a
majority of cases, the solution times changed by a factor of two
or more.

The solver currently scans "backward", from lower right to upper
left, for efficiency and coding convenience.

BTW, I'm glad that the 1MHz Apple II is hanging in there with your
PDA in solution speed. ;-)

Michael J. Mahon

unread,
Jul 13, 2006, 2:55:01 PM7/13/06
to
BLuRry wrote:
> Michael J. Mahon wrote:

>>The animation that SUDOKU does is not 15fps or 30fps. It's doing
>>about 525 screen updates per second! And if you skip any of them,
>>you have to look at all 81 digits, because you don't know which ones
>>have changed! So anything you hope to make up in the straightaway
>>you will lose in the curves. ;-)
>
>
> AH! I see what you mean now -- I didn't realize it was doing screen
> updates while searching for the solution. It's a little too fast to
> catch that, you see? I think it must have been doing all its
> calculations during one VBL cycle. ;-) I was talking more about just
> the UI that you interact with to solve the puzzle.

I'd forgotten that you were running it in an emulator!

Try it slowed down to 1MHz and you'll see the animation.

>>(BTW, an accelerator doesn't help as much as you'd like, since all
>>the screen stores have to run at 1MHz.)
>
>
> That and having a large font table would not have enough cache locality
> to be very efficient unless you happen to be storing the same character
> everywhere. Either way, I wouldn't bet on an accelorator making things
> much better -- and also I wouldn't want to expect the average apple //
> user to have one.

That's a good point. The 8KB Zip Chip cache can hold essentially all
of the code and data now.

Michael J. Mahon

unread,
Jul 13, 2006, 2:57:52 PM7/13/06
to
BLuRry wrote:
>>Yep, 1)'s my first idea. The only snag is that inverse characters
>>come in contact with the inverse "major" grid lines, and so tend
>>to "attach" themselves, which is unattractive.
>
>
> There's always a simple solution, like surrounding the numbers with
> asterisks or periods or some other character that "shades" the cell.

The current screen layout is only two lines per grid line (2x9=18),
so an inverse character is always going to contact an adjacent
inverse "line".

>>My real issue is that I don't see the purpose, or the purpose is
>>at odds with our original design intent.
>
>
> well, that is very fair. As a prover, this program more than meets the
> requirements, and very ellegantly for that matter. I can understand
> why you wouldn't want to duplicate features of other suducko solver
> programs out there, too much UI, and a total pain to write in basic.
>
> I wouldn't mind rewriting the ui to be fancier, but honestly I have
> other projects backlogged that I promised myself to work on first (e.g.
> get a2gameserver to work properly for //e's, rewrite my raycaster
> engine for double-hires and fix bugs in it, etc)

If you do get a chance, go for it! It's not like we're drowning in
new software for the Apple II. ;-)

David Empson

unread,
Jul 13, 2006, 5:26:20 PM7/13/06
to
David Empson <dem...@actrix.gen.nz> wrote:

That was a doddle. I did it before I went to sleep. It took me about 15
minutes, though I did need one guess. The previous "evil101" was much
harder.

This "hints17-x" one might be harder for a simple brute force solution,
but not if it is solved logically.

--
David Empson
dem...@actrix.gen.nz

Paul Schlyter

unread,
Jul 13, 2006, 5:43:37 PM7/13/06
to
In article <yvOdnYZ_3_3jFivZ...@comcast.com>,

Quite true!

Anyone got a sudoku solver for the ENIAC? I'd like to try to run it on
this ENIAC simulator: http://page.mi.fu-berlin.de/~zoppke/eniac/

An Apple II emulator for the ENIAC will be ok too...

:-)


> -michael
>
> Fast Sudoku solver for Apple II's!
> Home page: http://members.aol.com/MJMahon/
>
> "The wastebasket is our most important design
> tool--and it's seriously underused."

aiia...@gmail.com

unread,
Jul 13, 2006, 8:37:25 PM7/13/06
to

Michael J. Mahon wrote:
> Scott Hemphill and I recently collaborated to create a Sudoku puzzle
> solver for Apple II computers.

Very neat! I had no idea you could get a solver like this
to run so fast on the IIe!

It's fun to watch the solver go over the more complex puzzles...
watch where it's thinking..

Rich

BLuRry

unread,
Jul 14, 2006, 12:06:43 AM7/14/06
to
> I'd forgotten that you were running it in an emulator!
>
> Try it slowed down to 1MHz and you'll see the animation.

Actually, I never speed up an emulator, except disk i/o. I noticed
numbers appear on the screen, but the solver is too fast for me to
really see every little step it takes, unless I load one of the evil
puzzles. Man this thing is fast. Still astounded by what you guys
pulled off. Have a good week!

-B

Anton Treuenfels

unread,
Jul 14, 2006, 12:32:09 AM7/14/06
to

"Michael J. Mahon" <mjm...@aol.com> wrote in message
news:GL2dnfZbpopeQyjZ...@comcast.com...

> Michael J. Mahon wrote:
> > Michael J. Mahon wrote:
> >
> >> Paul Schlyter wrote:
> >>
> >>> APproximately how much slower do you think a 6502 version would have
> >>> been? 10% slower? Twice as slow? 10 times as slow?
> >>
> >>
> >>
> >> Certainly less than twice as slow. I'll find out what the current
> >> speed impact would be...
> >
> >
> > Paul's message prompted me to re-look at where 65C02 ops are used, and
> > they are all quite easy to eliminate--most with no or negligible time
> > impact.
> >
> > The only one that has a measurable impact is the LDA (p) in setbox,
> > and that can be eliminated by just doing an LDA bit rather than a TXA,
> > which frees up the X register to hold 0, so that the LDA (p) can become
> > LDA (p,x). (I don't get to use that addressing mode much--even if it is
> > with a constant 0 index. ;-)

Speaking of 'setbox', I'm surprised that you used a lookup table to
determine screen addresses but did not use one for the 'other' array. If I
understand the code correctly, you could replace the
'best*24+other_base_address' calculation with a pre-computed value. Also you
could get rid of three '-1' values in 80 rows of 'other', for a net savings
of 80 bytes to boot.

Heck, you could get rid of that last -1 value in each row as well: add $80
to the last entry, so that the ASL instruction in the main loop of 'setbox'
sets the carry for that entry, clears it for all the others. Then after
you've set the bit of interest, break the loop if the carry flag is set.
That also saves one LDA instruction at the top of the loop (20 executions
instead of 21).

And then you might consider breaking the 'setbox' loop in two loops, one for
setting the low byte and one setting the 9th bit. That would eliminate the
redundant test executed inside the loop: just test it once to decide which
loop to execute. Of course all the loop control code would be duplicated and
the only differences would be the few instructions that actually set bits,
but you just saved a bunch of space by implementing that 'others' lookup
table.

And of course you don't need 65C02 instructions to do any of this :)

Something like this:

...
lda best
asl
tax
lda otherptr,x
sta p
lda otherptr+1,x
sta p+1
ldx #20-1
lda bit
beq high_bit_loop

:1 txa
tay
lda (p),y
asl
tay
lda (bits),y
ora bit
sta (bits),y
dex
bpl :1
rts

high_bit_loop:

txa
tay
lda (p),y
asl
tay
iny
lda (bits),y
ora #$01
sta (bits),y
dex
bpl high_bit_loop
rts

...to take advantage of the fact that there are two indirect pointers
already. Can use the X-register for counting directly,
and don't need to put any kind of special coded values in the 'other' table
at all.

And if that works, why not store the neighbor square# times two in the
'other' table and eliminate the ASL instruction in each loop?

Oh, I gotta stop now...

- Anton Treuenfels


Michael J. Mahon

unread,
Jul 14, 2006, 6:26:05 AM7/14/06
to
Anton Treuenfels wrote:
> "Michael J. Mahon" <mjm...@aol.com> wrote in message
> news:GL2dnfZbpopeQyjZ...@comcast.com...
>
>>Michael J. Mahon wrote:
>>
>>>Michael J. Mahon wrote:
>>>
>>>
>>>>Paul Schlyter wrote:
>>>>
>>>>
>>>>>APproximately how much slower do you think a 6502 version would have
>>>>>been? 10% slower? Twice as slow? 10 times as slow?
>>>>
>>>>
>>>>
>>>>Certainly less than twice as slow. I'll find out what the current
>>>>speed impact would be...
>>>
>>>
>>>Paul's message prompted me to re-look at where 65C02 ops are used, and
>>>they are all quite easy to eliminate--most with no or negligible time
>>>impact.
>>>
>>>The only one that has a measurable impact is the LDA (p) in setbox,
>>>and that can be eliminated by just doing an LDA bit rather than a TXA,
>>>which frees up the X register to hold 0, so that the LDA (p) can become
>>>LDA (p,x). (I don't get to use that addressing mode much--even if it is
>>>with a constant 0 index. ;-)
>
>
> Speaking of 'setbox', I'm surprised that you used a lookup table to
> determine screen addresses but did not use one for the 'other' array. If I
> understand the code correctly, you could replace the
> 'best*24+other_base_address' calculation with a pre-computed value. Also you
> could get rid of three '-1' values in 80 rows of 'other', for a net savings
> of 80 bytes to boot.

I think the simple answer is that the pieces of code were written by
different people at different times. ;-)

> Heck, you could get rid of that last -1 value in each row as well: add $80
> to the last entry, so that the ASL instruction in the main loop of 'setbox'
> sets the carry for that entry, clears it for all the others. Then after
> you've set the bit of interest, break the loop if the carry flag is set.
> That also saves one LDA instruction at the top of the loop (20 executions
> instead of 21).

Switching to count control for the loop is a big win.

> And then you might consider breaking the 'setbox' loop in two loops, one for
> setting the low byte and one setting the 9th bit. That would eliminate the
> redundant test executed inside the loop: just test it once to decide which
> loop to execute. Of course all the loop control code would be duplicated and
> the only differences would be the few instructions that actually set bits,
> but you just saved a bunch of space by implementing that 'others' lookup
> table.
>
> And of course you don't need 65C02 instructions to do any of this :)

Right. I've already eliminated the 65C02 instructions, and one of the
paths not (yet?) taken was splitting the loop.

It's all good...I'm working on the solver as we "speak".

There's no need to do an OR in the "=9" case, since the only values
that byte can take on are 0 and 1. So LDA #1; STA (bits),y will do
fine. This, together with other changes, gets this loop down to 18
cycles per iteration--too bad that it's much less frequent than the
other case. ;-)

> Oh, I gotta stop now...

Please don't! ;-)

BTW, as I was thinking about the additional ops in the 65C02, I realized
how much more useful it would have been to provide more symmetry between
the X and Y registers. For example, the register juggling in the above
loop would be unnecessary if you could post-index by X as well as Y.

Kind of makes you wonder who designed the extensions...

We can avoid this juggling by modifying the addresses of the "neighbor
index" loads, outside the loop. This allows X to be used directly as
an index. When combined with storing the index already doubled, it
drops the "<9" loop to 22 cycles per iteration, resulting in a net
saving of about 260 cycles per 'setbox' call. In our benchmark
puzzle, it's called about 8300 times, for a saving of about 2.3 seconds
out of 30, or 7.6%. Though I prefer factors of two, I'll take it! ;-)

-michael

Even faster Sudoku solver for *all* Apple II's soon!

Michael J. Mahon

unread,
Jul 14, 2006, 6:28:46 AM7/14/06
to

When you try one that takes more than 5 seconds, the animation will
amuse you. It's actually fun to watch it trying various permutations.
;-)

You have a good one, too!

Michael J. Mahon

unread,
Jul 14, 2006, 6:29:17 AM7/14/06
to

Glad you like it, Rich!

A2Pro

unread,
Jul 14, 2006, 4:10:19 PM7/14/06
to
Hey, Mike!

Michael J. Mahon wrote:
> Glad you like it, Rich!

You have the REMs at the end of line numbers 410 and 420 backwards.

Willi

Michael J. Mahon

unread,
Jul 14, 2006, 5:08:56 PM7/14/06
to

Oops--thanks!

A2Pro

unread,
Jul 14, 2006, 8:40:04 PM7/14/06
to
Hi!

Michael J. Mahon wrote:


> A2Pro wrote:
> > You have the REMs at the end of line numbers 410 and 420 backwards.
>
> Oops--thanks!

The end of "sudoku.solver.s" has the line 'bitmap ds 0'. What is
its actual size? I think it's 162 * 2 but I thought I'd better check
with you.

Willi

Anton Treuenfels

unread,
Jul 14, 2006, 11:47:12 PM7/14/06
to

"Michael J. Mahon" <mjm...@aol.com> wrote in message
news:oridnYySprSi7CrZ...@comcast.com...

> Anton Treuenfels wrote:
> > "Michael J. Mahon" <mjm...@aol.com> wrote in message
> > news:GL2dnfZbpopeQyjZ...@comcast.com...
> >
> >>Michael J. Mahon wrote:

> It's all good...I'm working on the solver as we "speak".
>
> There's no need to do an OR in the "=9" case, since the only values
> that byte can take on are 0 and 1. So LDA #1; STA (bits),y will do
> fine. This, together with other changes, gets this loop down to 18
> cycles per iteration--too bad that it's much less frequent than the
> other case. ;-)
>
> > Oh, I gotta stop now...
>
> Please don't! ;-)
>

Ah, ya talked me into it...but my next idea is a bit more work...

> BTW, as I was thinking about the additional ops in the 65C02, I realized
> how much more useful it would have been to provide more symmetry between
> the X and Y registers. For example, the register juggling in the above
> loop would be unnecessary if you could post-index by X as well as Y.
>
> Kind of makes you wonder who designed the extensions...

Fred Bowen and friends designed the 65CE02, which had a Z-register that acts
like the Y-registers and 'usurps' the non-indexed indirect instructions of
the 65C02. So

lda (zp)

is really

lda (zp),z

but performs identically to a 65C02 if you never change the Z-register.

I may add the 65CE02 instructions to my assembler just because I think
they're neat - better than the W65C16S, for instance.

> We can avoid this juggling by modifying the addresses of the "neighbor
> index" loads, outside the loop. This allows X to be used directly as
> an index. When combined with storing the index already doubled, it
> drops the "<9" loop to 22 cycles per iteration, resulting in a net
> saving of about 260 cycles per 'setbox' call. In our benchmark
> puzzle, it's called about 8300 times, for a saving of about 2.3 seconds
> out of 30, or 7.6%. Though I prefer factors of two, I'll take it! ;-)

Self-modifying code? How about this instead:

ldx best
lda bit
beq set_high_byte_bit:

set_low_byte_bit::

ldy other00,x


lda (bits),y
ora bit
sta (bits),y

ldy other01,x


lda (bits),y
ora bit
sta (bits),y

... ; and so on (loop unrolling)
ldy other19,x


lda (bits),y
ora bit
sta (bits),y

rts

set_high_byte_bit:

lda #$01
ldy other00,x
iny
sta (bits),y
ldy other01,x
iny
sta (bits),y
... ; same unrolled loop
ldy other19,x
iny
sta (bits),y
rts

The key here is that the 'other' table has to be re-arranged (which is where
most of the work involved occurs). Instead of 81 20-byte tables, use 20
81-byte tables. The first table holds the first 'other' cell for all 81
cells, the second table the second, and so on. The 'best' value indexes the
cell we want to find the 'others' of directly, once for each of the 20
tables.

No address calculation, no counting, no loop overhead, no self-modifying
code - and it's pure 6502 (of course!).

- Anton Treuenfels


Sean Fahey

unread,
Jul 15, 2006, 1:54:41 AM7/15/06
to
This thread is cross-posted to three different groups which isn't really
necessary. Just about every Apple II person who reads one of the
comp.sys.apple2 groups reads all the others.

Michael J. Mahon

unread,
Jul 16, 2006, 2:06:00 AM7/16/06
to

Nice. Another index register is a very powerful addition.

Transposing the array--always an interesting strategy to evaluate
on a 6502.

> No address calculation, no counting, no loop overhead, no self-modifying
> code - and it's pure 6502 (of course!).

I notice that I have an aversion to self-modifying code, which is
justified in some instances. But going for speed when all code is
in RAM is not one of those instances. Code modification is quite
manageable when done in a structured way, and when the modifications
are done orders of magnitude less often than the modified addresses
are used.

BTW, I had no bugs related to the address modification, though I did
find an assembler bug related to incorrectly chosing a zero-page variant
for an instruction with a non-zero page operand of the form "xxx+0*0".

I added a table to multiply an index register by two into the other
index register--4 cycles and doesn't clobber the A reg--no biggie,
but handy in avoiding saves/reloads.

We considered unrolling early in the game, but the space used is pretty
high for the benefit, especially when compared with what can be done
with modified addresses. I've taken the latter path, in 'select',
'setbox', and the 'bits' save on recursion (not so important), and
got a total speed improvement of a little more than 18%.

Frankly, I'm not sure it was worth the work!

I'll look at the timing of your version of 'setbox'.

I also put the screen painting routine into the solver, and that makes
it much snappier on a 1MHz machine. And I added a "clear bottom 3
lines" routine so I wouldn't need the 80-column firmware's "clear to
end of screen" control character.

I should have the new version up by Monday, after some more testing.

-michael

Fast Sudoku solver for Apple II's!

Michael J. Mahon

unread,
Jul 16, 2006, 2:07:54 AM7/16/06
to

I agree, Sean--it was just all followup to the announcement.

After this, I'll reply only in 'programmer'.

-michael

Fast Sudoku solver for Apple II's!

Michael J. Mahon

unread,
Jul 16, 2006, 2:12:29 AM7/16/06
to

As it says in the comment ( ;-), the worst case recursion is 81 levels.
That won't practically happen, especially with the very effective
"expensive" recursion avoidance that Scott used. The deepest I have
seen it go is 16 levels.

In the latest version, since space is not cramped, I traded space for
some time saving by page-aligning the 'bits' stack, and making each
"frame" a full page, wasting 94 bytes per level, but saving address
modification time.

0 new messages