# Sudoku puzzle solver

68 views

### Marcel Hendrix

Aug 11, 2005, 8:24:05 AM8/11/05
to
A simple game.

-marcel
-- -----------
(*
* LANGUAGE : ANS Forth with extensions
* PROJECT : Forth Environments
* DESCRIPTION : Solver a for (simple) numeric puzzle
* CATEGORY : Example
* AUTHOR : Marcel Hendrix
* LAST CHANGE : August 11, 2005, Marcel Hendrix
*)

NEEDS -miscutil
NEEDS -fsl_util

REVISION -sudoku "Ã„Ã„Ã„ SUDOKU solver Version 1.00 Ã„Ã„Ã„"

PRIVATES

DOC
(*
Insert the digits 1..9 into each horizontal line, in each vertical column
and in all marked 3 x 3 squares. Each digit may only appear once per line,
column and per square.

9 6 _ | 3 _ _ | 8 _ _
8 5 _ | _ 1 _ | _ _ 9
_ _ 3 | _ 9 6 | _ 2 _
------+-------+------
1 _ _ | _ _ _ | _ _ 2
_ _ _ | 2 7 _ | 1 8 _
6 _ 9 | _ 8 4 | _ _ 3
------+-------+------
_ _ 6 | 7 _ _ | _ 5 _
_ _ _ | _ 2 8 | 3 _ 4
_ 4 _ | _ _ 5 | _ 6 _

Optimization has no use, this is too fast already.
*)
ENDDOC

-- Utility, count bits.
: #BITS ( u -- #bits )
0 SWAP
BEGIN TUCK
WHILE 1+ SWAP DUP 1- AND
REPEAT NIP ;

9 INTEGER ARRAY rows{ PRIVATE
9 INTEGER ARRAY cols{ PRIVATE
9 INTEGER ARRAY boxes{ PRIVATE

-- Puzzle descriptor
9 9 INTEGER MATRIX puzzle{{ PRIVATE

9 6 0 3 0 0 8 0 0
8 5 0 0 1 0 0 0 9
0 0 3 0 9 6 0 2 0
1 0 0 0 0 0 0 0 2
0 0 0 2 7 0 1 8 0
6 0 9 0 8 4 0 0 3
0 0 6 7 0 0 0 5 0
0 0 0 0 2 8 3 0 4
0 4 0 0 0 5 0 6 0

: R,C->IX ( row col -- box# ) 3 / SWAP 3 / 3 * + ; PRIVATE
: IX->R,C ( box# -- row col ) 3 /MOD 3 * SWAP 3 * ; PRIVATE

: .NUM ( u -- ) #10 1 DO DUP I 2^x AND IF I . ENDIF LOOP DROP ; PRIVATE

MARKER -nonce

-- compute and store bitmasks for each row
: stuff-row ( row -- )
LOCAL row
0 9 0 DO puzzle{{ row I }} @ DUP IF 2^x OR ELSE DROP ENDIF LOOP rows{ row } ! ; PRIVATE

-- compute and store bitmasks for each column
: stuff-col ( col -- )
LOCAL col
0 9 0 DO puzzle{{ I col }} @ DUP IF 2^x OR ELSE DROP ENDIF LOOP cols{ col } ! ; PRIVATE

-- compute and store bitmasks for each box
: stuff-box ( box -- accu )
DUP IX->R,C LOCALS| col row box |
0 3 0 DO 3 0 DO puzzle{{ row J + col I + }}
@ DUP IF 2^x OR ELSE DROP ENDIF
LOOP
LOOP
boxes{ box } ! ; PRIVATE

:NONAME ( -- )
9 0 DO I stuff-row LOOP
9 0 DO I stuff-col LOOP
9 0 DO I stuff-box LOOP ; EXECUTE -nonce

-- When we find out that NUM can be put at (row, col) without violating restrictions,
-- we must update all 3 bitmasks for that position.
: PUPDATE ( num row col -- )
LOCALS| col row num |
rows{ row } DUP @ num OR SWAP !
cols{ col } DUP @ num OR SWAP !
boxes{ row col R,C->IX } DUP @ num OR SWAP !
num LOWEST-BIT puzzle{{ row col }} ! ; PRIVATE

-- Test which numbers can be put at (row,col) without violating the restrictions.
-- Returns a bitmap where each high bit corresponds to a number from 1..9.
: POSSIBLE ( row col -- u )
LOCALS| col row |
puzzle{{ row col }} @ ?DUP IF DROP 0 EXIT ENDIF
rows{ row } @
cols{ col } @ OR
boxes{ row col R,C->IX } @ OR ( all the number that are already taken )
INVERT ( those yet free )
%1111111110 AND ( mask of 0 and those higher than 9 ) ; PRIVATE

: TRY ( verbosity -- flag )
0 LOCALS| #changes verbose? |
9 0 DO 9 0 DO J I POSSIBLE DUP #BITS
1 = IF verbose? IF CR ." only one possibility at ("
J 0 .R &, EMIT I 0 .R
." ) = " DUP .NUM
ENDIF
( u) J I PUPDATE 1 +TO #changes
ELSE DROP
ENDIF
LOOP
LOOP
#changes ; PRIVATE

: (SUDOKU) ( verbosity -- )
LOCAL verbose?
print-width #digits 9 TO print-width 1 TO #digits
CR ." The problem = " puzzle{{ }}print
CR TIMER-RESET
BEGIN verbose? TRY 0= UNTIL
.ELAPSED
CR ." The solution = " puzzle{{ }}print
TO #digits TO print-width ; PRIVATE

: SUDOKU FALSE (SUDOKU) ; ( just do it )
: SUDOKU+ TRUE (SUDOKU) ; ( print intermediates )

:ABOUT CR ." Try: SUDOKU | SUDOKU+ " ;

DEPRIVE

(* End of Source *)

### Robert Spykerman

Aug 11, 2005, 9:43:41 PM8/11/05
to

Marcel Hendrix wrote:
> A simple game.

More code to dissect :) Thanks Marcel!

Mental note: Must really look at FSL.

Not looked at the algorithm yet (because believe it or not I was
actually trying to write one of these myself!)

Allegedly Paul Hsieh ( http://www.azillionmonkeys.com/ ) has a pretty
it's in c, but again am not looking at it till I finish mine...

Cheers

Robert

### Robert Spykerman

Aug 30, 2005, 3:31:56 AM8/30/05
to
Marcel Hendrix wrote:
> A simple game.
... snip...

Did this (below) trying to learn forth. Now I can look at your source
code and study it, Marcel!

Anyway, I would like to trouble the lot of you to see if you can
improve in anywhich way on what I have done.

Any other algorithms, I'm interested too.

Thanks all of ye,

Robert

And, if I am obviously doing something REALLY unforthlike or wrong,

\ 'Brute Force' Sudoku Solver in Forth.
\ No special extensions were used.
\ Tested on in win32forth, VFX and Swift (evaluation).
\ No locals were harmed during this experiment ;)
\
\ Version: 1300 30052005 - Robert Spykerman
\ email: robspyke_nospam@iprimus_no_spam.com.au
\ (delete the obvious)

\ ---------------------
\ Variables
\ ---------------------

\ Still haven't figured out FSL and the oodles of stuff
\ in it so the following will have to do ;)

create sudokugrid

0 C, 9 C, 0 C, 0 C, 0 C, 4 C, 0 C, 0 C, 7 C,
0 C, 0 C, 0 C, 0 C, 0 C, 7 C, 9 C, 0 C, 0 C,
8 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C,

4 C, 0 C, 5 C, 8 C, 0 C, 0 C, 0 C, 0 C, 0 C,
3 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 2 C,
0 C, 0 C, 0 C, 0 C, 0 C, 9 C, 7 C, 0 C, 6 C,

0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 4 C,
0 C, 0 C, 3 C, 5 C, 0 C, 0 C, 0 C, 0 C, 0 C,
2 C, 0 C, 0 C, 6 C, 0 C, 0 C, 0 C, 8 C, 0 C,

create sudoku_row 9 cells allot
create sudoku_col 9 cells allot
create sudoku_box 9 cells allot

\ ---------------------
\ Logic
\ ---------------------
\ Basically : 1. Parses grid to see if current numbers on it valid
\ 2. ... By putting them all in sets of numbers, row
\ column, and box.
\ 3. Then iterates thru' empty spaces. When hits one,
\ tries a number not already conflicting with
\ what's in the sets.
\ 4. If try succeeds, puts said number into grid and
\ Tries to see if further numbers fit.
\ End Condition : No spaces left
\ 5. If nothing fits space, recursive word returns
\ false...
\ 6. Removes failed numbers from grid and sets,
\ Tries again till end condition...

\ Grid Related

: xy 9 * + ; \ x y -- offset ;
: getrow 9 / ;
: getcol 9 mod ;
: getbox dup getrow 3 / 3 * swap getcol 3 / + ;

\ Puts and gets numbers from/to grid only
: setnumber sudokugrid + c! ; \ n position --
: getnumber sudokugrid swap + c@ ;

: cleargrid sudokugrid 81 0 do dup i + 0 swap c! loop drop ;

\ --------------
\ Set related: sets are sudoku_row, sudoku_col, sudoku_box

\ ie x y -- ; adds x into bitmap y
: addbits_row 1 rot lshift swap cells sudoku_row + dup @ rot or swap !
;
: addbits_col 1 rot lshift swap cells sudoku_col + dup @ rot or swap !
;
: addbits_box 1 rot lshift swap cells sudoku_box + dup @ rot or swap !
;

\ ie x y -- ; remove number x from bitmap y
: removebits_row 1 rot lshift swap cells sudoku_row + dup @ rot invert
and swap ! ;
: removebits_col 1 rot lshift swap cells sudoku_col + dup @ rot invert
and swap ! ;
: removebits_box 1 rot lshift swap cells sudoku_box + dup @ rot invert
and swap ! ;

\ clears all bitsmaps to 0
: clearbitmaps 9 0 do i cells
0 over sudoku_row + !
0 over sudoku_col + !
0 swap sudoku_box + !
loop ;

\ Adds number to grid and sets
: addnumber \ number position --
2dup setnumber
;

\ Remove number from grid, and sets
: removenumber \ position --
dup getnumber swap
2dup getrow removebits_row
2dup getcol removebits_col
2dup getbox removebits_box
nip 0 swap setnumber
;

\ gets bitmap at position, ie
\ position -- bitmap

: getrow_bits getrow cells sudoku_row + @ ;
: getcol_bits getcol cells sudoku_col + @ ;
: getbox_bits getbox cells sudoku_box + @ ;

\ position -- composite bitmap (or'ed)
: getbits
dup getrow_bits
over getcol_bits
rot getbox_bits or or
;

\ Try tests a number in a said position of grid
\ Returns true if it's possible, else false.
: try \ number position -- true/false
over 1 swap lshift
over getbits and 0= -ROT 2drop
;

\ --------------
: parsegrid \ Parses Grid to fill sets.. Run before solver.
sudokugrid
81 0 do
dup i + c@
dup if
dup i try if
else
unloop drop drop FALSE exit
then
else
drop
then
loop
drop
TRUE
;

\ MAIN SOLVER
\ Recursively called. Will unwind when end condition
\ reached i.e. when NO MORE SPACES in the grid
\ Returns t/f depending if current 'fork' successful.
\ Iterates thru' grid.
\ If position is occupied, continues to next.
\ If not occupied, try and recurse

\ Morespaces manually checks for spaces ...
\ Obviously this can be optimised to a count,
\ Will be faster that way....

: morespaces?
0 81 0 do sudokugrid i + c@ 0= if 1+ then loop ;

: solver
morespaces? IF \ main exit condition
81 0 do \ Let's find some empty spaces
sudokugrid i + c@ IF
ELSE \ empty space found !
10 1 do
i j try \ try fitting numbers in...
if
recurse
if
unloop unloop TRUE exit
else
j removenumber
then
then
loop
FALSE leave \ none of 1-9 fit, bugger...
\ false trail....
THEN
loop
ELSE
TRUE \ NO More spaces, we're thru!
THEN
;

\ main entry to solver routine
: startsolving
clearbitmaps \ reparse bitmaps and reparse grid
parsegrid \ just in case..
solver
AND
;

\ ---------------------
\ Display Grid
\ ---------------------
\
\ Prints grid nicely, takes address of grid as argument
\
: .sudokugrid
CR CR
sudokugrid
81 0 do
dup i + c@ . ." "
i 1+
dup 3 mod 0= if
dup 9 mod 0= if
CR
dup 27 mod 0= if
dup 81 < if ." ------+-------+------" CR then
then
else
." ! "
then
then
drop
loop
drop
CR
;

\ ---------------------
\ Higher Level Words
\ ---------------------

: add \ n x y --
xy 2dup
try if
.sudokugrid
else
CR ." Not a valid move. " CR
2drop
then
;

: rm
xy removenumber
.sudokugrid
;

: clearit
cleargrid
clearbitmaps
.sudokugrid
;

: solveit
CR CR
startsolving
if
." Solution Found " CR .sudokugrid
else
." No Solution Found " CR CR
then
;

: showit .sudokugrid ;

: help
CR
." Type clearit ; to clear grid " CR
." 1-9 x y add ; to add 1-9 to grid at x y (0 based) " CR
." x y rm ; to remove number at x y " CR
." showit ; redisplay grid " CR
." solveit ; to solve " CR
." help : for help " CR
CR
;

\ ---------------------
\ Execution starts here
\ ---------------------

: godoit
CR
clearbitmaps
parsegrid if
CR ." Grid in source valid. "
else
CR ." Warning: Grid in source invalid. "
then
.sudokugrid
help
;

godoit

\ Possible optimisations I see :
\ 1. morespaces? end-condtion checking - use a counter.
\ 2. Avoid recursion altogether? I can't picture this,
\ but I know someone's done it. I think Paul Hsieh has...
\ (www.azillionmonkeys.com)
\ 3. optimise 3/ and like as discussed already...
\ 4. Better idioms and factoring (that's where you come in! ;)

### Anton Ertl

Aug 30, 2005, 4:06:57 AM8/30/05
to
"Robert Spykerman" <robert.s...@gmail.com> writes:
>\ 'Brute Force' Sudoku Solver in Forth.
>\ No special extensions were used.
>\ Tested on in win32forth, VFX and Swift (evaluation).

Also tested on Gforth and iForth. Works without changes. I love
portable programs.

Timings (user time on 2.26GHz Pentium 4):
0.910s time gforth-fast sodoku.fs -e "solveit bye"
1.080s time iforth "include sodoku.fs solveit bye"

I guess iForth hit another cache consistency hurdle.

gforth ans-report.fs sodoku.fs -e "print-ans-report bye"
prints:

The program uses the following words
from CORE :
Create c, cells allot : * + ; / mod dup swap c! c@ DO i LOOP drop rot
lshift @ or ! invert and over 2dup 0= 2drop IF ELSE unloop EXIT THEN
1+ j recurse LEAVE cr . ." <
from CORE-EXT :
nip false true
from BLOCK-EXT :
\
from non-ANS :
-rot

Strangely, iForth does not complain about the -ROT.

Hmm, given that -ROT seems to be a de-facto standard, it's an obvious
candidate for formal standardization.

I eliminated the -ROT and put the result on

http://www.complang.tuwien.ac.at/forth/programs/sodoku.fs

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.complang.tuwien.ac.at/forth/ansforth/forth200x.html
EuroForth 2005: http://www.complang.tuwien.ac.at/anton/euroforth2005/

### Robert Spykerman

Aug 30, 2005, 5:27:59 AM8/30/05
to

Anton Ertl wrote:
> "Robert Spykerman" <robert.s...@gmail.com> writes:
> >\ 'Brute Force' Sudoku Solver in Forth.
> >\ No special extensions were used.
> >\ Tested on in win32forth, VFX and Swift (evaluation).
>
> Also tested on Gforth and iForth. Works without changes. I love
> portable programs.
>
>
>
> Timings (user time on 2.26GHz Pentium 4):
> 0.910s time gforth-fast sodoku.fs -e "solveit bye"
> 1.080s time iforth "include sodoku.fs solveit bye"
>
> I guess iForth hit another cache consistency hurdle.

I get about 0.33 secs average on VFX 3.7 (best 0.25 worst 0.5)

This may be worth trying. Assuming P4, assuming cache line size of 1024
(am I right?)

Does this improve iForth?

Robert

... code ....

create buf0 1024 allot \ ensure the data is not between cache lines

create sudokugrid

0 C, 9 C, 0 C, 0 C, 0 C, 4 C, 0 C, 0 C, 7 C,
0 C, 0 C, 0 C, 0 C, 0 C, 7 C, 9 C, 0 C, 0 C,
8 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C,

4 C, 0 C, 5 C, 8 C, 0 C, 0 C, 0 C, 0 C, 0 C,
3 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 2 C,
0 C, 0 C, 0 C, 0 C, 0 C, 9 C, 7 C, 0 C, 6 C,

0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 4 C,
0 C, 0 C, 3 C, 5 C, 0 C, 0 C, 0 C, 0 C, 0 C,
2 C, 0 C, 0 C, 6 C, 0 C, 0 C, 0 C, 8 C, 0 C,

create buf1 1024 allot

create sudoku_row 9 cells allot

create buf2 1024 allot

create sudoku_col 9 cells allot

create buf3 1024 allot

create sudoku_box 9 cells allot

create buf4 1024 allot

### webs...@gmail.com

Aug 30, 2005, 6:27:01 AM8/30/05
to
Robert Spykerman wrote:
> \ 2. Avoid recursion altogether? I can't picture this,
> \ but I know someone's done it. I think Paul Hsieh has...
> \ (www.azillionmonkeys.com)

Well, the compromise I made for not doing recursion, is that I cannot
*prove* that my algorithm always solves every sudoku. I must say that
I have not ever encountered one that my program cannot solve (including
a lot of the "insanely difficult" and many "starting with only 17
filled cells" boards), however, personally, I can't guarantee that it
solves every possible sudoku board.

My algorithm essentially mixes constraint based "simulate how a human
would do it" with one "trial and error" level running on top of the
constraint routines. And basically solves any sudoku puzzle instantly.

See the problem is that if I were to implement a fully recursive
solution, I still would never see it go deeper than one level since
none of the sample sudoku's I've seen needs it. So I wouldn't be able
to test it.

I don't know anything about forth, so unforunately, I can't comment on

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

### Robert Spykerman

Aug 30, 2005, 11:35:07 AM8/30/05
to
webs...@gmail.com wrote:

... snip ...

> My algorithm essentially mixes constraint based "simulate how a human
> would do it" with one "trial and error" level running on top of the
> constraint routines. And basically solves any sudoku puzzle instantly.

> I don't know anything about forth, so unforunately, I can't comment on

... snip ...

Hmmm... Okay, I'd better sit down and try and solve a couple more to
get a better idea how a human does it. I've only ever finished just
one!!! (and it took me like an hour, with a pencil!, which I take it to
be not what the pros like to use)Or better yet, have a look at your

By the way, dare I ask,

How fast does your algorithm crunch a sudoku, say the one I gave in my
source ?

My brute force method is not very efficient, as you may have noted , it
takes about .33 secs on a PIV 2.6/800 i865 stock Dell. Made approx
250,000 recursive calls... ( should have measured the depth, will try
that next )

Of course, the more I look at my code, the more I see things to be
improved upon ... But 250,000 recursive calls is still 250,000
recursive calls ...

Also, how fast does a brute force C do it in ?

Cheers

Robert

### Anton Ertl

Aug 30, 2005, 2:05:20 PM8/30/05
to
"Robert Spykerman" <robert.s...@gmail.com> writes:

>
>Anton Ertl wrote:
>> Timings (user time on 2.26GHz Pentium 4):
>> 0.910s time gforth-fast sodoku.fs -e "solveit bye"
>> 1.080s time iforth "include sodoku.fs solveit bye"
>>
>> I guess iForth hit another cache consistency hurdle.
>
>I get about 0.33 secs average on VFX 3.7 (best 0.25 worst 0.5)
>
>This may be worth trying. Assuming P4, assuming cache line size of 1024
>(am I right?)

Well, it's not cache lines (they are 64bytes in L1 and 128 bytes in
L2), but consistency regions, and yes, they are 1024 bytes.

>Does this improve iForth?

Well I just put "1024 allot" before and after the data block, and the
iForth time changed to 0.79s. Putting such lines between the data
items did not help further.

Looking at the Number of instructions executed, iForth is quite a lot
better than Gforth, but it does not translate into a similar speed
advantage; I guess that there is still some cache or pipelining issue
somewhere.

gforth-fast iforth
1966432598 786191491 instructions
2102876916 1830296296 Proc cycles

### Anton Ertl

Aug 30, 2005, 2:21:11 PM8/30/05
to
webs...@gmail.com writes:
>Well, the compromise I made for not doing recursion, is that I cannot
>*prove* that my algorithm always solves every sudoku. I must say that
>I have not ever encountered one that my program cannot solve (including
>a lot of the "insanely difficult" and many "starting with only 17
>filled cells" boards), however, personally, I can't guarantee that it
>solves every possible sudoku board.

AFAIK it's a design goal for Sodoku puzzles that they can be solved
without backtracking. So, a somewhat sophisticated solver should be
able to solve published Sodoku puzzles without backtracking; however,
I am pretty sure that Sodoku puzzles with unique solutions could be
designed that do not have this property (i.e., that require
backtracking unless the solver is lucky and guesses right at all
choice points).

### Marcel Hendrix

Aug 30, 2005, 6:06:55 PM8/30/05
to
"Robert Spykerman" <robert.s...@gmail.com> writes Re: Sudoku puzzle solver
[..]

> Anyway, I would like to trouble the lot of you to see if you can
> improve in anywhich way on what I have done.

Looks good to me. I don't like your indentation style, though.
IMHO the logic e.g. here is very hard to follow:

> : parsegrid \ Parses Grid to fill sets.. Run before solver.
> sudokugrid
> 81 0 do
> dup i + c@
> dup if
> dup i try if
> else
> unloop drop drop FALSE exit
> then
> else
> drop
> then
> loop
> drop
> TRUE
> ;

\ Parses Grid to fill sets.. Run before solver.

: parsegrid ( -- bool )
sudokugrid
81 0 DO
dup I + C@
dup IF
dup I try
ELSE drop drop FALSE UNLOOP EXIT
ENDIF
ELSE drop
ENDIF
LOOP drop
TRUE ;

Again, IMHO !

When I run your algorithm on the present development version of iForth.
on a 3 GHz P4ht, I get:

FORTH> solveit ( no inlining )

Solution Found, 0.593 seconds elapsed.

This can be improved by inlining some of the words, because iForth refuses
to do that automatically:

FORTH> solveit ( judicious inlining )

Solution Found, 0.338 seconds elapsed.

To find out which words to inline I used the iForth lprof.frt profiler
utility. Here's an excerpt from the linecount listing:

[254393 calls to morespaces?, exclusive/total = 1.6%, cumulative/total = 1.6%]
254393 * : morespaces?
254393 * 0 81 0 do sudokugrid i + c@ 0= if 1+ then loop ;
|
[254393 calls to solver, exclusive/total = 13.7%, cumulative/total = 26.7%]
254393 * : solver
254393 * morespaces? IF \ main exit condition
254392 * 81 0 do \ Let's find some empty spaces
10146740 * sudokugrid i + c@
10146740 * 0= IF \ empty space found !
254392 * 10 1 do
2289268 * i j try \ try fitting numbers in...
2289268 * if
254392 * recurse
254392 * if
61 * unloop unloop TRUE exit
| else
254331 * j removenumber
254331 * then
254331 * then
2289207 * loop
254331 * FALSE leave \ none of 1-9 fit, bugger...
| \ false trail....
| THEN
9892348 * loop
254331 * ELSE
1 * TRUE \ NO More spaces, we're thru!
1 * THEN
254332 * ;

FORTH> 0 10 scol ( judicious inlining, use lprof.frt )

____________ NAME ______________.__ #CALLED ___ EXCL __ TOTAL
setnumber 508743 2.3% 2.3%
morespaces? 254393 6.2% 6.2%
solver 254393 40.8% 21.8%
getnumber 254331 1.2% 1.2%
removebits_row 254331 1.2% 1.2%
removebits_col 254331 1.2% 1.2%
removebits_box 254331 1.2% 0.5% ok

The output from SCOL above shows that after inlining the bulk of the
time is spent in SOLVER. What to do follows from the dynamic linecount
output above that. You have a loop that is executed 10 million times.

-marcel

### Robert Spykerman

Aug 30, 2005, 10:12:56 PM8/30/05
to

Marcel Hendrix wrote:
> "Robert Spykerman" <robert.s...@gmail.com> writes Re: Sudoku puzzle solver
> [..]
>
> > Anyway, I would like to trouble the lot of you to see if you can
> > improve in anywhich way on what I have done.

...

> Looks good to me. I don't like your indentation style, though.
> IMHO the logic e.g. here is very hard to follow:

...

> \ Parses Grid to fill sets.. Run before solver.
> : parsegrid ( -- bool )
> sudokugrid
> 81 0 DO
> dup I + C@
> dup IF
> dup I try
> ELSE drop drop FALSE UNLOOP EXIT
> ENDIF
> ELSE drop
> ENDIF
> LOOP drop
> TRUE ;

Ok, point taken. Nice wide indents are nicer. I really must look at a
more sophisticated code editor then. Will check out the editors you
guys were talking about sometime earlier.

> When I run your algorithm on the present development version of iForth.
> on a 3 GHz P4ht, I get:
>
> FORTH> solveit ( no inlining )
>
> Solution Found, 0.593 seconds elapsed.
>
> This can be improved by inlining some of the words, because iForth refuses
> to do that automatically:
>
> FORTH> solveit ( judicious inlining )
>
> Solution Found, 0.338 seconds elapsed.
>
> To find out which words to inline I used the iForth lprof.frt profiler
> utility. Here's an excerpt from the linecount listing:
>
> [254393 calls to morespaces?, exclusive/total = 1.6%, cumulative/total = 1.6%]
> 254393 * : morespaces?
> 254393 * 0 81 0 do sudokugrid i + c@ 0= if 1+ then loop ;

Geez. You've got a really sweet set of tools, Marcel.

In retrospect, I think I could have done better that as a word, I guess
I could have approached it differently...

VFX appears to inlines almost everything to solver, which is nice.
(except the call to MOD, which I have currently left not as a code word
in the one I posted, so that it would not be implementation specific.
Actually my posts earlier about rapid muls and divides were all because
of this current experiment).

> ____________ NAME ______________.__ #CALLED ___ EXCL __ TOTAL
> setnumber 508743 2.3% 2.3%
> morespaces? 254393 6.2% 6.2%
> solver 254393 40.8% 21.8%
> getnumber 254331 1.2% 1.2%
> removebits_row 254331 1.2% 1.2%
> removebits_col 254331 1.2% 1.2%
> removebits_box 254331 1.2% 0.5% ok
>
> The output from SCOL above shows that after inlining the bulk of the
> time is spent in SOLVER. What to do follows from the dynamic linecount
> output above that. You have a loop that is executed 10 million times.

Yeah, that's not nice. I thought I could do better... Hmm the figures
don't quite add up to 100%. Where's the other time hiding in?

You really have a nice set of tools. They are iForth specific, I take
it?

Anyway, I have an alternative, which I cooked up early this morning,
but at least from initial testing, does not appear any faster... Any
ideas why?

Replace morespaces? and solver with below...

: getspace \ basically finds the next empty space and
-1 \ puts index on datastack
BEGIN \ is there a better way ?
1+ \ if out of bounds, mins it.
dup 80 min \ (allocating a bigger 82 byte array and
sudokugrid + c@ \ dropping the min speeds it up. I have
0= \ to try a cmovcc min for fun later)
over 80 > \ 2 conditional checks...
\ If space, return index
OR \ if outofbounds return out of bounds
UNTIL \ (no more spaces)
;

: solver
getspace \ gets next space by calling above
dup 80 > if \ checks for out of bounds
\ (exit condition)
drop TRUE exit
then
10 1 do \ the 'BIG' loop ;)
i over try if
\ removenumber
recurse if \ and try would be ideal

\ spots to try
drop unloop TRUE EXIT \ and speed up.
else \ in a true recursive
\ solver, I don't
dup removenumber \ reckon I can get rid
\ of the big loop
then \ or 250,000+ calls for
then \ this puzzle.
loop
drop FALSE
;

Still not much faster than the original. I think it's because there may
just be a couple more conditional checks than my previous example.
Removing them, ie the min, and having a slightly larger char array will
help.

Or it could be I am just not using forth correctly ;)

Am still studying your code Marcel, it's got a lot of interesting words
that I am unfamiliar with. How can I get it to run under VFX or
win32forth so I can better understand it?

Anyway, I've got to go do some other more 'real' work ;)

Robert

### Robert Spykerman

Aug 30, 2005, 10:29:20 PM8/30/05
to

Anton Ertl wrote:
...

> AFAIK it's a design goal for Sodoku puzzles that they can be solved
> without backtracking. So, a somewhat sophisticated solver should be
> able to solve published Sodoku puzzles without backtracking; however,
> I am pretty sure that Sodoku puzzles with unique solutions could be
> designed that do not have this property (i.e., that require
> backtracking unless the solver is lucky and guesses right at all
> choice points).

To take it to extremes, the one below is definitely solvable, albeit,
either by computer or very slowly by human hand ;)

I am not sure if there is more than one solution to this, but it would
stand to reason at least from the distance I am looking at it, there
might.

1 0 0 ! 0 0 0 ! 0 0 0
0 0 0 ! 0 0 0 ! 0 0 0
0 0 0 ! 0 0 0 ! 0 0 0
------+-------+------
0 0 0 ! 0 0 0 ! 0 0 0
0 0 0 ! 0 4 0 ! 0 0 0
0 0 0 ! 0 0 0 ! 0 0 0
------+-------+------
0 0 0 ! 0 0 0 ! 0 0 0
0 0 0 ! 0 0 0 ! 0 0 0
0 0 0 ! 0 0 0 ! 0 0 9

ok
solveit

Solution Found

1 2 3 ! 4 5 6 ! 7 9 8
4 5 6 ! 7 8 9 ! 1 2 3
7 8 9 ! 1 2 3 ! 4 5 6
------+-------+------
2 1 4 ! 3 6 5 ! 9 8 7
3 6 7 ! 9 4 8 ! 2 1 5
5 9 8 ! 2 1 7 ! 3 6 4
------+-------+------
6 3 1 ! 5 9 4 ! 8 7 2
9 4 5 ! 8 7 2 ! 6 3 1
8 7 2 ! 6 3 1 ! 5 4 9

Elapsed Time: 0
Calls : 447 ok

Sometimes, the more you have on the grid, the harder it is the solver
has to work... i.e. in the puzzle in my source, it calls solver not 447
times but 250,000+ times..

Coupled with a random number generator though, one could very easily
cobble up a sudoku puzzle generator, although I wonder if humans would
like the output. It would have to be tweaked I guess, specified range
of numbers in boxes, as opposed to just shotgunning a whole whack of
numbers onto the grid randomly.

By the way, there was a little bug in the high level console word I
wrote " add ". It did not check to see if a cell was empty before
adding it to the puzzle. Here's a fix.. This is not called by the
solver so it fortunately does not add to the execution time.

: checkifoccupied \ offset -- t/f
sudokugrid + c@
;

: add \ n x y --
xy 2dup

dup checkifoccupied if
dup removenumber
then

try if
.sudokugrid
else
CR ." Not a valid move. " CR
2drop
then
;

Thanks

Robert

### Robert Spykerman

Aug 30, 2005, 10:37:54 PM8/30/05
to

Anton Ertl wrote:
> "Robert Spykerman" <robert.s...@gmail.com> writes:
....
> > .... Assuming P4, assuming cache line size of 1024

> >(am I right?)
>
> Well, it's not cache lines (they are 64bytes in L1 and 128 bytes in
> L2), but consistency regions, and yes, they are 1024 bytes.
>
> Well I just put "1024 allot" before and after the data block, and the
> iForth time changed to 0.79s. Putting such lines between the data
> items did not help further.

Okay, that's the RIGHT way to do it ;)

> Looking at the Number of instructions executed, iForth is quite a lot
> better than Gforth, but it does not translate into a similar speed
> advantage; I guess that there is still some cache or pipelining issue
> somewhere.
>
> gforth-fast iforth
> 1966432598 786191491 instructions
> 2102876916 1830296296 Proc cycles

Well, I guess SOLVER is actively reading and writing to the array in
quick succession. I am not sure how much of the writes to the array
thru' the cache actually get written back to main memory, this would be
a big bottle neck I guess.

With Ghz processors now I remember someone saying something to the like
that in one cycle, an electron can only travel several inches (that
sort of magnitude anyway, if one sits down to calculate).

It is a very sobering thought.

With that one wonders if it is physically possible to design CPU's
which cycle many orders higher....

Anyway I digress

Robert

### webs...@gmail.com

Aug 31, 2005, 12:26:07 AM8/31/05
to
Robert Spykerman wrote:
> webs...@gmail.com wrote:
>
> ... snip ...
> > My algorithm essentially mixes constraint based "simulate how a human
> > would do it" with one "trial and error" level running on top of the
> > constraint routines. And basically solves any sudoku puzzle instantly.
>
> > I don't know anything about forth, so unforunately, I can't comment on
>
> ... snip ...
>
> Hmmm... Okay, I'd better sit down and try and solve a couple more to
> get a better idea how a human does it. I've only ever finished just
> one!!! (and it took me like an hour, with a pencil!, which I take it to
> be not what the pros like to use) Or better yet, have a look at your

> source code which I do believe I did download sometime earlier.

There are 4 basic (self discovered) rules I use to solve sudoku's by
hand and in my program. From what I can determine from looking at
various web resources on sudoku, there don't appear to be any other
rules that are not covered by the ones I use.

> By the way, dare I ask,
>
> How fast does your algorithm crunch a sudoku, say the one I gave in my
> source ?

I don't know! I put a timer around it, but it just keeps spitting back

0 to me. I think its well under a million cycles. The program, I'm
sure, is totally dominated by IO (reading the input, and producing the
output) which itself is so trivially small. I think in the end, it
just doesn't cost anything, because the "n" in this case is just 81.
So the problem is just too small.

You should also keep in mind that my program does not only just solve
sudokus but it will tell you if you have multiple solutions or an
impossible starting grid. For example, the 1, 4, 9 puzzle you give in
another post, has multiple solutions according to my program (there is
a conjecture out there that you need at least 17 filled in squares to
have a valid unique sudoku puzzle). So it would be more interesting,
say, to use my program as a back end for a sudoku puzzle *creator*.
Then we could ask how many puzzles (correct and incorrect) it solves in

a second.

> My brute force method is not very efficient, as you may have noted ,
> it takes about .33 secs on a PIV 2.6/800 i865 stock Dell. Made approx
> 250,000 recursive calls... ( should have measured the depth, will try
> that next )
>
> Of course, the more I look at my code, the more I see things to be
> improved upon ... But 250,000 recursive calls is still 250,000
> recursive calls ...
>
> Also, how fast does a brute force C do it in ?

No idea! I just naturally assumed that such an approach would be too
slow (O(9^(81-17)) steps? forget it) and thus never tried it.

### webs...@gmail.com

Aug 31, 2005, 12:34:46 AM8/31/05
to
Anton Ertl wrote:
> webs...@gmail.com writes:
> >Well, the compromise I made for not doing recursion, is that I cannot
> >*prove* that my algorithm always solves every sudoku. I must say
> >that I have not ever encountered one that my program cannot solve
> >(including a lot of the "insanely difficult" and many "starting with
> >only 17 filled cells" boards), however, personally, I can't
> >guarantee that it solves every possible sudoku board.
>
> AFAIK it's a design goal for Sodoku puzzles that they can be solved
> without backtracking.

Some puzzle authors definately adhere to this. That way you really
can solve the newspaper version with a pencil without tearing to
shreds from erasing it. But others do not adhere to this condition.

> [...] So, a somewhat sophisticated solver should be

> able to solve published Sodoku puzzles without backtracking;

Well, *I* am able to solve any one that doesn't require backtracking,
and even some that do. I'm not sure I would call it sophistication ...

:)

> [...] however,

> I am pretty sure that Sodoku puzzles with unique solutions could be
> designed that do not have this property (i.e., that require
> backtracking unless the solver is lucky and guesses right at all
> choice points).

The archive I give, gives an example of a publicized puzzle (i.e., I
got it from a website that archives such puzzles that I think appeared
in a newspaper) that almost surely requires backtracking.

### The Beez'

Aug 31, 2005, 6:04:45 AM8/31/05
to

Anton Ertl schreef:

> "Robert Spykerman" <robert.s...@gmail.com> writes:
> >\ 'Brute Force' Sudoku Solver in Forth.
> >\ No special extensions were used.
> >\ Tested on in win32forth, VFX and Swift (evaluation).
>
> Also tested on Gforth and iForth. Works without changes. I love
> portable programs.

So do I, although I had to make a few minor changes to make it run on
4tH, since , creates a read-only grid:

\ 4tH routine to fill the grid
: >grid ( a1 n1 n2 --)
9 chars * sudokugrid + swap ( a1 a2 n1)
0 do ( a1 a2)
over i chars + c@ dup is-digit ( a1 a2 c f)
if [char] 0 - over i chars + c! then
loop ( a1 a2)
drop drop ( --)
;

\ Fill the grid the 4tH way
s" 090004007" 0 >grid
s" 000007900" 1 >grid
s" 800000000" 2 >grid
s" 405800000" 3 >grid
s" 300000002" 4 >grid
s" 000009706" 5 >grid
s" 000000004" 6 >grid
s" 003500000" 7 >grid
s" 200600080" 8 >grid

Timings on a P4 2.8 MHz, 2.6 kernel Linux machine:
real 0m7.431s
user 0m7.395s

Well, I don't think FICL, pForth or AtLast will be faster.. ;-)

Hans Bezemer

### Marcel Hendrix

Aug 31, 2005, 6:16:16 PM8/31/05
to
"Robert Spykerman" <robert.s...@gmail.com> writes Re: Sudoku puzzle solver

> Hmm the figures

> don't quite add up to 100%. Where's the other time hiding in?

The full list, when nothing is inlined. The numbers in EXCL add up,
approximately, to 100%.

FORTH> 1 30 scol ( sort on column 1, highest 30 or less)

____________ NAME ______________.__ #CALLED ___ EXCL __ TOTAL

getbits 2289288 14.4% 5.4%
getbox 2798031 13.7% 22.2%
solver 254393 12.8% 26.7%
getcol 5596062 9.0% 9.0%
try 2289288 7.7% 13.2%
getrow 5596062 7.6% 7.6%
getrow_bits 2289288 6.9% 10.0%
getbox_bits 2289288 6.9% 25.1%
getcol_bits 2289288 6.8% 4.0%
removenumber 254331 3.8% 8.5%
morespaces? 254393 1.7% 1.7%
setnumber 508743 0.6% 0.6%
removebits_row 254331 0.4% 0.4%
removebits_col 254331 0.3% 0.3%
removebits_box 254331 0.3% 0.3%
getnumber 254331 0.3% 0.3%
.sudokugrid 1 0.1% 0.1%
solveit 1 0.0% 99.9%
parsegrid 1 0.0% 0.0%
clearbitmaps 1 0.0% 0.0%
startsolving 1 0.0% 99.8%
xy 0 0.0% 0.0%
cleargrid 0 0.0% 0.0%
rm 0 0.0% 0.0%
clearit 0 0.0% 0.0%

-marcel

PS: Here's another easy on (2 ms on my machine)

0 C, 0 C, 5 C, 0 C, 4 C, 0 C, 0 C, 0 C, 2 C,
8 C, 0 C, 0 C, 0 C, 0 C, 5 C, 0 C, 0 C, 0 C,
6 C, 2 C, 0 C, 0 C, 0 C, 0 C, 0 C, 4 C, 1 C,

0 C, 0 C, 9 C, 3 C, 7 C, 0 C, 0 C, 5 C, 0 C,
4 C, 0 C, 0 C, 2 C, 0 C, 0 C, 1 C, 0 C, 7 C,
3 C, 8 C, 0 C, 0 C, 0 C, 6 C, 4 C, 0 C, 0 C,

0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 5 C,
0 C, 0 C, 1 C, 0 C, 0 C, 9 C, 0 C, 0 C, 0 C,
0 C, 0 C, 0 C, 7 C, 6 C, 0 C, 8 C, 0 C, 3 C,

Note that the numbers are suspiciously similar to your example.
Might there be a rule to convert a working sudoku into similar ones
(that all those newpaper puzzle companies are using)?

-marcel

### robert spykerman

Sep 1, 2005, 4:53:01 AM9/1/05
to
A BETTER SOLVER ENGINE...

Improved solving engine - uses a bit of intelligence as well as
recursion, thanks to all of you, who suggested a more intelligent
approach.

The new solver finds a grid-position most likely to yield a good guess
by looking at the number sets first, instead of just blindly thumping
numbers in from start to end.

458 calls to solver versus 250,000+ initially...
Win32forth hesitated a couple of seconds on the old one.
Now it doesn't. Wow...

Marcel, I haven't figured out your code yet, does yours do it in a
similar way?

\ ------------- SAMPLE RUN ( full source comes after)
((

PUZZLE
0 9 0 ! 0 0 4 ! 0 0 7
0 0 0 ! 0 0 7 ! 9 0 0
8 0 0 ! 0 0 0 ! 0 0 0
------+-------+------
4 0 5 ! 8 0 0 ! 0 0 0
3 0 0 ! 0 0 0 ! 0 0 2
0 0 0 ! 0 0 9 ! 7 0 6
------+-------+------
0 0 0 ! 0 0 0 ! 0 0 4
0 0 3 ! 5 0 0 ! 0 0 0
2 0 0 ! 6 0 0 ! 0 8 0

\ OLD solver:
solveit

Solution Found

5 9 1 ! 2 8 4 ! 3 6 7
6 4 2 ! 3 5 7 ! 9 1 8
8 3 7 ! 9 6 1 ! 4 2 5
------+-------+------
4 7 5 ! 8 2 6 ! 1 9 3
3 6 9 ! 7 1 5 ! 8 4 2
1 2 8 ! 4 3 9 ! 7 5 6
------+-------+------
7 5 6 ! 1 9 8 ! 2 3 4
9 8 3 ! 5 4 2 ! 6 7 1
2 1 4 ! 6 7 3 ! 5 8 9

Elapsed Time: 547 msec
Depth : 61
Calls : 254393
ok

\ NEW solver:
solveit

Solution Found

5 9 1 ! 2 8 4 ! 3 6 7
6 4 2 ! 3 5 7 ! 9 1 8
8 3 7 ! 9 6 1 ! 4 2 5
------+-------+------
4 7 5 ! 8 2 6 ! 1 9 3
3 6 9 ! 7 1 5 ! 8 4 2
1 2 8 ! 4 3 9 ! 7 5 6
------+-------+------
7 5 6 ! 1 9 8 ! 2 3 4
9 8 3 ! 5 4 2 ! 6 7 1
2 1 4 ! 6 7 3 ! 5 8 9

Elapsed Time: 15 msec
Depth : 61
Calls : 458

))

\ ------------- SOURCE
\ Sudoku Solver in Forth.

\ No special extensions were used.
\ Tested on in win32forth, VFX and Swift (evaluation).

\ No locals were harmed during this experiment.
\
\ Version: 1900 01092005 - Robert Spykerman

\ email: robspyke_nospam@iprimus_no_spam.com.au
\ (delete the obvious)

\ ---------------------
\ Variables
\ ---------------------

create sudokugrid

0 C, 9 C, 0 C, 0 C, 0 C, 4 C, 0 C, 0 C, 7 C,
0 C, 0 C, 0 C, 0 C, 0 C, 7 C, 9 C, 0 C, 0 C,
8 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C,

4 C, 0 C, 5 C, 8 C, 0 C, 0 C, 0 C, 0 C, 0 C,
3 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 2 C,
0 C, 0 C, 0 C, 0 C, 0 C, 9 C, 7 C, 0 C, 6 C,

0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 0 C, 4 C,
0 C, 0 C, 3 C, 5 C, 0 C, 0 C, 0 C, 0 C, 0 C,
2 C, 0 C, 0 C, 6 C, 0 C, 0 C, 0 C, 8 C, 0 C,

create sudoku_row 9 cells allot

create sudoku_col 9 cells allot

create sudoku_box 9 cells allot

1024 allot \ just to be sure there is no cache issue.

\ ---------------------
\ Logic
\ ---------------------
\ Basically :
\ Grid is parsed. All numbers are put into sets, which are
\ implemented as bitmaps (sudoku_row, sudoku_col, sudoku_box)
\ which represent sets of numbers in each row, column, box.
\ only one specific instance of a number can exist in a
\ particular set.
\
\ SOLVER is recursively called
\ SOLVER looks for the next best guess using FINDNEXTSPACE
\ tries this trail down... if fails, backtracks... and tries
\ again.
\

\ Grid Related

\ algorithm from c.l.f circa 1995 ? Will Baden
: countbits ( number -- bits )
[ HEX ] DUP 55555555 AND SWAP 1 RSHIFT 55555555 AND +
DUP 33333333 AND SWAP 2 RSHIFT 33333333 AND +
DUP 0F0F0F0F AND SWAP 4 RSHIFT 0F0F0F0F AND +
[ DECIMAL ] 255 MOD
;

\ Try tests a number in a said position of grid
\ Returns true if it's possible, else false.
: try \ number position -- true/false
over 1 swap lshift
over getbits and 0= -ROT 2drop
;

\ --------------
: parsegrid \ Parses Grid to fill sets.. Run before solver.

sudokugrid \ to ensure all numbers are parsed into sets/bitmaps

81 0 do
dup i + c@
dup if
dup i try if
else
unloop drop drop FALSE exit
then
else
drop
then
loop
drop
TRUE
;

\ Morespaces? manually checks for spaces ...
\ Obviously this can be optimised to a count var, done initially
\ a 'spaces' variable.

: morespaces?
0 81 0 do sudokugrid i + c@ 0= if 1+ then loop ;

: findnextmove \ -- n ; n = index next item, if -1 finished.

-1 10 \ index prev_possibilities --
\ err... yeah... local variables, kind of...

81 0 do
i sudokugrid + c@ 0= IF
i getbits countbits 9 swap -

\ get bitmap and see how many possibilities
\ stack diagram:
\ index prev_possibilities new_possiblities --

2dup > if
\ if new_possibilities < prev_possibilities...
nip nip i swap
\ new_index new_possibilies --

else \ else prev_possibilities < new possibilities, so:

drop \ new_index new_possibilies --

then
THEN
loop
drop
;

\ findnextmove returns index of best next guess OR returns -1
\ if no more guesses. You then have to check to see if there are
\ spaces left on the board unoccupied. If this is the case, you
\ need to back up the recursion and try again.

: solver
findnextmove
dup 0< if
morespaces? if
drop false exit
else
drop true exit
then
then

10 1 do

i over try if

recurse if
drop unloop TRUE EXIT
else
dup removenumber
then
then
loop

drop FALSE
;

\ SOLVER

: startsolving
clearbitmaps \ reparse bitmaps and reparse grid
parsegrid \ just in case..
solver
AND
;

\ ---------------------
\ Display Grid
\ ---------------------
\
\ Prints grid nicely

\
: .sudokugrid
CR CR
sudokugrid
81 0 do
dup i + c@ . ." "
i 1+
dup 3 mod 0= if
dup 9 mod 0= if
CR
dup 27 mod 0= if
dup 81 < if ." ------+-------+------" CR then
then
else
." ! "
then
then
drop
loop
drop
CR
;

\ ---------------------
\ Higher Level Words
\ ---------------------

: checkifoccupied \ offset -- t/f
sudokugrid + c@
;

: add \ n x y --
xy 2dup

dup checkifoccupied if
dup removenumber
then

: rm
xy removenumber
.sudokugrid
;

: clearit
cleargrid
clearbitmaps
.sudokugrid
;

: showit .sudokugrid ;

godoit

\ ------------- END SOURCE

### robert spykerman

Sep 1, 2005, 5:00:24 AM9/1/05
to
On Thu, 01 Sep 2005 18:53:01 +1000, robert spykerman
<robspyke_nospam@no_spam_iprimus.com.au_no_spam> wrote:

Oh, I should say, the timings I quoted in the new engine are based on
VFX 3.70 build 1889, 2.6/800 PIV (i865 Dell).

>A BETTER SOLVER ENGINE...
...

### Marcel Hendrix

Sep 1, 2005, 7:45:57 AM9/1/05
to
robert spykerman <robspyke_nospam@no_spam_iprimus.com.au_no_spam> writes Re: Sudoku puzzle solver

> Oh, I should say, the timings I quoted in the new engine are based on
> VFX 3.70 build 1889, 2.6/800 PIV (i865 Dell).

>> Elapsed Time: 15 msec
>> Depth : 61
>> Calls : 458

That's strange! With an old version of iForth (only using
3, not 4 free registers), it runs in 11 ms on my university's
3 GHz PIV. I didn't inline anything, just did a straight
copy-and-paste from the newsreader. With full inlining, that
figure drops to 8 ms. Apparently VFX is having problems.

Must try this at home on the development version...

I don't think your algorithm is differently from mine. I didn't
use backtracking because I underestimated Sudoku's complexity (as
I did with the intelligence of Times' readers, apparently).
I simply verified the horizontal, vertical and box possibilities of
each cell. If that leaves only a single possibility, try another
pass until completion. That actually works for the Sudokus in a
local newspaper (I hasten to say that I only read the left-over
copies that I find in the train to work ;-)

Try these ones, they'll need all of Paul's rules:

( Dutch low-brow newspaper )

9 6 0 3 0 0 8 0 0
8 5 0 0 1 0 0 0 9
0 0 3 0 9 6 0 2 0
1 0 0 0 0 0 0 0 2
0 0 0 2 7 0 1 8 0
6 0 9 0 8 4 0 0 3
0 0 6 7 0 0 0 5 0

0 0 0 0 2 8 3 0 4
0 4 0 0 0 5 0 6 0

( 20 minutes for a human being )
2 0 0 0 3 0 7 0 6
0 0 8 4 0 0 0 0 0
6 0 0 0 9 0 0 3 0
0 0 0 0 7 0 0 5 0
8 0 9 3 0 5 1 0 2
0 4 0 0 6 0 0 0 0
0 6 0 0 2 0 0 0 8
0 0 0 0 0 7 4 0 0
3 0 2 0 5 0 0 0 9

( 2 hours for a human being )
0 0 0 0 0 3 5 0 0
0 0 0 7 0 0 4 0 0
7 0 3 0 0 6 0 1 0
0 1 0 0 0 5 0 0 6
8 0 0 0 0 0 0 0 2
4 0 0 3 0 0 0 8 0
0 5 0 4 0 0 1 0 8
0 0 8 0 0 9 0 0 0
0 0 9 8 0 0 0 0 0

( > 2 hours for a human being, maybe impossible )
9 2 0 0 0 0 0 0 8
0 8 0 0 0 0 0 5 1
0 0 1 5 0 0 3 0 0
0 0 0 9 0 7 8 0 0
0 0 0 0 0 0 0 0 0
0 0 3 6 0 2 0 0 0
0 0 6 0 0 4 7 0 0
5 7 0 0 0 0 0 8 0
8 0 0 0 0 0 0 9 3

( 45 minutes for a human being )
0 0 6 0 5 0 0 0 0
0 7 0 0 3 9 1 0 0
0 8 0 0 0 0 0 3 0
0 0 0 0 0 2 5 1 8
0 0 0 0 0 0 0 0 0
7 5 9 8 0 0 0 0 0
0 6 0 0 0 0 0 7 0
0 0 2 5 9 0 0 4 0
0 0 0 0 6 0 3 0 0

-marcel

### Marcel Hendrix

Sep 1, 2005, 3:55:39 PM9/1/05
to
m.a.m....@tue.nl (Marcel Hendrix) wrote Re: Sudoku puzzle solver

robert spykerman <robspyke_nospam@no_spam_iprimus.com.au_no_spam> writes Re: Sudoku puzzle solver

>> Oh, I should say, the timings I quoted in the new engine are based on
>> VFX 3.70 build 1889, 2.6/800 PIV (i865 Dell).

>>> Elapsed Time: 15 msec
>>> Depth : 61
>>> Calls : 458

> That's strange! With an old version of iForth (only using
> 3, not 4 free registers), it runs in 11 ms on my university's
> 3 GHz PIV. I didn't inline anything, just did a straight
> copy-and-paste from the newsreader. With full inlining, that
> figure drops to 8 ms. Apparently VFX is having problems.

> Must try this at home on the development version...

FORTH> solveit ( inline version; times in microseconds )

Solution Found in 2,142 microseconds.

5 9 1 ! 2 8 4 ! 3 6 7

6 4 2 ! 3 5 7 ! 9 1 8

8 3 7 ! 9 6 1 ! 4 2 5
------+-------+------
4 7 5 ! 8 2 6 ! 1 9 3

3 6 9 ! 7 1 5 ! 8 4 2

1 2 8 ! 4 3 9 ! 7 5 6
------+-------+------
7 5 6 ! 1 9 8 ! 2 3 4
9 8 3 ! 5 4 2 ! 6 7 1
2 1 4 ! 6 7 3 ! 5 8 9

Something is definitely wrong with your VFX. Didn't you exclude
the time to print the solution? Or has VFX's timer still a 10 ms
resolution problem?

-marcel

### robert spykerman

Sep 1, 2005, 7:56:42 PM9/1/05
to
On Thu, 1 Sep 2005 19:55:39 GMT, m...@iae.nl (Marcel Hendrix) wrote:

>m.a.m....@tue.nl (Marcel Hendrix) wrote Re: Sudoku puzzle solver
>
>robert spykerman <robspyke_nospam@no_spam_iprimus.com.au_no_spam> writes Re: Sudoku puzzle solver
>>> Oh, I should say, the timings I quoted in the new engine are based on
>>> VFX 3.70 build 1889, 2.6/800 PIV (i865 Dell).
>
>>>> Elapsed Time: 15 msec

...
>
>FORTH> solveit ( inline version; times in microseconds )
>
>Solution Found in 2,142 microseconds.

That's 2 microseconds? Wow... Or did you mean 2 milliseconds

...

>Something is definitely wrong with your VFX. Didn't you exclude
>the time to print the solution? Or has VFX's timer still a 10 ms
>resolution problem?
>
>-marcel

I think it's the resolution. I called GetTickCount. And I only did one
run of the puzzle and did not get an average, so it's probably not a
representative time.

Run varies with GetTickCount at the default granularity (millisecs).
Sometimes it gives me 0 millisecs, sometimes 10-15...

I think there's a way to adjust the granularity of GetTickCount,
something along the lines of let's see... would this be right?

Anyway, I'm way behind in things I have to do in my 'other' life ;) So
I'm going to duck and keep my head under for a while (yeah, how likely
is that?)

Cheers

Robert

### robert spykerman

Sep 1, 2005, 8:28:25 PM9/1/05
to
On Thu, 1 Sep 2005 11:45:57 GMT, m.a.m....@tue.nl (Marcel Hendrix)
wrote:

...

>Try these ones, they'll need all of Paul's rules:

Hmm.. I might take a look at Paul's page now that I've got a better
solver and see how he did it. My current solver seems to solve the
puzzles you pose quite easily, by the looks of things. Of course, I
have a 'backtracker'.

I have not been able to get your sudoku solver to run under VFX yet -
that will be my next chore, to dissect your code, it seems wonderfully
concise.

At first glance, I note you used bitmaps too, I guess that was the
only sensible way to do it, even I managed to think of it ;)

I have a different bit counter though (I have to say I managed to find
my bit counter word by googling). I wonder which is more efficient
given today's architectures.. I must say the one I got scores a
penalty on the MOD. OTOH, you have a loop in #bitcount.

I used a char array mainly because I was wondering if it was better
for the compiler to generate MOVZX's type code to read bytes without
having to shift left by 2 the index ( 4 * ) to get the offset for a
32bit word sized array. (You might remember a recent thread where I
was wondering out aloud about whether char arrays or int arrays were
faster...)

Robert

PS - as you can see, I think I have timer resolution issues...

>( Dutch low-brow newspaper )
> 9 6 0 3 0 0 8 0 0
> 8 5 0 0 1 0 0 0 9
> 0 0 3 0 9 6 0 2 0
> 1 0 0 0 0 0 0 0 2
> 0 0 0 2 7 0 1 8 0
> 6 0 9 0 8 4 0 0 3
> 0 0 6 7 0 0 0 5 0
> 0 0 0 0 2 8 3 0 4
> 0 4 0 0 0 5 0 6 0

solveit

Solution Found

9 6 4 ! 3 5 2 ! 8 1 7
8 5 2 ! 4 1 7 ! 6 3 9
7 1 3 ! 8 9 6 ! 4 2 5
------+-------+------
1 8 7 ! 5 6 3 ! 9 4 2
4 3 5 ! 2 7 9 ! 1 8 6
6 2 9 ! 1 8 4 ! 5 7 3
------+-------+------
3 9 6 ! 7 4 1 ! 2 5 8
5 7 1 ! 6 2 8 ! 3 9 4
2 4 8 ! 9 3 5 ! 7 6 1

Elapsed Time : 0 msecs

>( 20 minutes for a human being )
> 2 0 0 0 3 0 7 0 6
> 0 0 8 4 0 0 0 0 0
> 6 0 0 0 9 0 0 3 0
> 0 0 0 0 7 0 0 5 0
> 8 0 9 3 0 5 1 0 2
> 0 4 0 0 6 0 0 0 0
> 0 6 0 0 2 0 0 0 8
> 0 0 0 0 0 7 4 0 0
> 3 0 2 0 5 0 0 0 9

solveit

Solution Found

2 9 1 ! 5 3 8 ! 7 4 6
7 3 8 ! 4 1 6 ! 2 9 5
6 5 4 ! 7 9 2 ! 8 3 1
------+-------+------
1 2 6 ! 8 7 9 ! 3 5 4
8 7 9 ! 3 4 5 ! 1 6 2
5 4 3 ! 2 6 1 ! 9 8 7
------+-------+------
4 6 7 ! 9 2 3 ! 5 1 8
9 1 5 ! 6 8 7 ! 4 2 3
3 8 2 ! 1 5 4 ! 6 7 9

Elapsed Time : 0 msecs
ok

>( 2 hours for a human being )
> 0 0 0 0 0 3 5 0 0
> 0 0 0 7 0 0 4 0 0
> 7 0 3 0 0 6 0 1 0
> 0 1 0 0 0 5 0 0 6
> 8 0 0 0 0 0 0 0 2
> 4 0 0 3 0 0 0 8 0
> 0 5 0 4 0 0 1 0 8
> 0 0 8 0 0 9 0 0 0
> 0 0 9 8 0 0 0 0 0

solveit

Solution Found

9 8 4 ! 2 1 3 ! 5 6 7
5 6 1 ! 7 9 8 ! 4 2 3
7 2 3 ! 5 4 6 ! 8 1 9
------+-------+------
2 1 7 ! 9 8 5 ! 3 4 6
8 3 6 ! 1 7 4 ! 9 5 2
4 9 5 ! 3 6 2 ! 7 8 1
------+-------+------
6 5 2 ! 4 3 7 ! 1 9 8
1 7 8 ! 6 5 9 ! 2 3 4
3 4 9 ! 8 2 1 ! 6 7 5

Elapsed Time : 0 msecs
ok

>( > 2 hours for a human being, maybe impossible )
> 9 2 0 0 0 0 0 0 8
> 0 8 0 0 0 0 0 5 1
> 0 0 1 5 0 0 3 0 0
> 0 0 0 9 0 7 8 0 0
> 0 0 0 0 0 0 0 0 0
> 0 0 3 6 0 2 0 0 0
> 0 0 6 0 0 4 7 0 0
> 5 7 0 0 0 0 0 8 0
> 8 0 0 0 0 0 0 9 3

solveit

Solution Found

9 2 5 ! 1 7 3 ! 6 4 8
3 8 7 ! 4 6 9 ! 2 5 1
4 6 1 ! 5 2 8 ! 3 7 9
------+-------+------
6 5 4 ! 9 1 7 ! 8 3 2
2 1 8 ! 3 4 5 ! 9 6 7
7 9 3 ! 6 8 2 ! 5 1 4
------+-------+------
1 3 6 ! 8 9 4 ! 7 2 5
5 7 9 ! 2 3 1 ! 4 8 6
8 4 2 ! 7 5 6 ! 1 9 3

Elapsed Time : 16 msecs
ok

>( 45 minutes for a human being )
> 0 0 6 0 5 0 0 0 0
> 0 7 0 0 3 9 1 0 0
> 0 8 0 0 0 0 0 3 0
> 0 0 0 0 0 2 5 1 8
> 0 0 0 0 0 0 0 0 0
> 7 5 9 8 0 0 0 0 0
> 0 6 0 0 0 0 0 7 0
> 0 0 2 5 9 0 0 4 0
> 0 0 0 0 6 0 3 0 0
>
>-marcel

solveit

Solution Found

4 3 6 ! 1 5 8 ! 2 9 7
2 7 5 ! 6 3 9 ! 1 8 4
9 8 1 ! 7 2 4 ! 6 3 5
------+-------+------
6 4 3 ! 9 7 2 ! 5 1 8
1 2 8 ! 3 4 5 ! 7 6 9
7 5 9 ! 8 1 6 ! 4 2 3
------+-------+------
5 6 4 ! 2 8 3 ! 9 7 1
3 1 2 ! 5 9 7 ! 8 4 6
8 9 7 ! 4 6 1 ! 3 5 2

Elapsed Time : 0 msecs

### Stephen Pelc

Sep 2, 2005, 8:41:43 PM9/2/05
to
On Thu, 1 Sep 2005 11:45:57 GMT, m.a.m....@tue.nl (Marcel Hendrix)
wrote:

>robert spykerman <robspyke_nospam@no_spam_iprimus.com.au_no_spam> writes Re: Sudoku puzzle solver

>
>> Oh, I should say, the timings I quoted in the new engine are based on
>> VFX 3.70 build 1889, 2.6/800 PIV (i865 Dell).
>
>>> Elapsed Time: 15 msec
>>> Depth : 61
>>> Calls : 458
>
>That's strange! With an old version of iForth (only using
>3, not 4 free registers), it runs in 11 ms on my university's
>3 GHz PIV. I didn't inline anything, just did a straight
>copy-and-paste from the newsreader. With full inlining, that
>figure drops to 8 ms. Apparently VFX is having problems.

Putting the timer around StartSolving on my P4 2.8GHz WXPpro box
yields 3 ms on VFX Forth 3.80 build 1921. The timer resolution is
1 ms. I suspect that Rob's results are dominated by text display
time.

Stephen
P.S. VFX Forth v3.8 will be released 1 Oct 2005. Major new
stuff is a web services interface that can talk to Excel,
MS Project and so on using SOAP and XML. Plenty of other
new stuff, including (Rob?) a Sudoku solver.

--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691

### Robert Spykerman

Sep 3, 2005, 3:00:26 AM9/3/05
to

Stephen Pelc wrote:
...

> Putting the timer around StartSolving on my P4 2.8GHz WXPpro box
> yields 3 ms on VFX Forth 3.80 build 1921. The timer resolution is
> 1 ms. I suspect that Rob's results are dominated by text display
> time.

A good guess but what I actually did was call GetTickCount before and
after calling STARTSOLVING and got the difference. It's only one
difference so it's crude I know, but nope, the calls did not AFAIR
include the text output.

I'm actually at work now - I can't quite check out why I am getting
intermittent differences, calls return 0 msecs or 15 msecs... If it did
include the text output I reckon I wouldn't get 0 msecs. Or would I?

> Stephen
> P.S. VFX Forth v3.8 will be released 1 Oct 2005. Major new
> stuff is a web services interface that can talk to Excel,
> MS Project and so on using SOAP and XML. Plenty of other
> new stuff, including (Rob?) a Sudoku solver.

Your solver with a windoze GUI ? :)

That's my next thing, to wrap it up in a gui... But that means reading
Petzold. I've long been putting that off. What's the best way to get a
GUI up using VFX, Stephen?

I've been thinking of just extracting the machine code/inline asm
(suitably modified) for the solver engine and chucking it into a GUI
builder program like Delphi to get the GUI up.

I have just visited Paul's Pages (the UK site, not Paul Hsieh) and the
rules look really interesting and useful for manual/human solving. I do
realize now that my routine doesn't allow generation of proper human
compatible puzzles as it exists. I don't know if the extra logic to
implement these rules will make any difference to execution speed
though....

By what you're saying, am I right in presuming you've got a solver
routine running, Stephen?

As a digression... the most fun I've had with sudoku's is actually
writing the solver... Now that I believe I have one that doesn't
require 250,000+ calls just to solve a puzzle, I have to say my
enthusiasm is somewhat sated... I might just try to run the current
version of my algo on an Apple ][ just for the heck of it (Did I ever

By the way, Stephen, just a suggestion, do you remember the little
discussion we were talking about - IIRC, the EDI user locals pointer? I
am wondering if it would be an idea to put a switch or something in the
code generator to enable EDI to be used to cache stack items IF local
vars are not used? It may be a silly unworkable idea, but I thought I

Cheers

Robert

### Marcel Hendrix

Sep 3, 2005, 10:46:17 AM9/3/05
to
steph...@mpeforth.com (Stephen Pelc) writes Re: Sudoku puzzle solver
[..]

> Putting the timer around StartSolving on my P4 2.8GHz WXPpro box
> yields 3 ms on VFX Forth 3.80 build 1921. The timer resolution is
> 1 ms. I suspect that Rob's results are dominated by text display
> time.

Must be.

I whacked at Robert's source as hard as I could (without actually improving
the algorithm), and below are the results for 8 different grids. Each
grid is solved 1000 times and the times are averaged.

FORTH> speedthem ( 3 GHz PIVht )
0.496 milliseconds (originally 4.36 ms for the computer)
0.488 milliseconds (45 minutes human)
2.087 milliseconds (2 hours human)
0.464 milliseconds (2 hours for a human, maybe impossible)
0.041 milliseconds (unknown source)
0.080 milliseconds (Paul Hsieh example #1)
0.058 milliseconds (Paul Hsieh example #2)
0.058 milliseconds (Paul Hsieh example #3) ok

Changes involve the suggestion already done by Robert to count spaces
differently, some inlining, an unbeatable COUNTBITS, and unrolling.
I used the development version of iForth (4 free registers).

The grids used by Paul Hsieh are apparently particularly easy.

-marcel
-- ---------------------------------------------

These are the 8 grids used:

: rg 9 0 DO BL <WORD> >FLOAT DROP F>S C, LOOP ;

CREATE grid0
rg 0 9 0 0 0 4 0 0 7
rg 0 0 0 0 0 7 9 0 0
rg 8 0 0 0 0 0 0 0 0

rg 4 0 5 8 0 0 0 0 0
rg 3 0 0 0 0 0 0 0 2
rg 0 0 0 0 0 9 7 0 6

rg 0 0 0 0 0 0 0 0 4
rg 0 0 3 5 0 0 0 0 0
rg 2 0 0 6 0 0 0 8 0
," originally 4.36 ms for the computer"

CREATE grid1
rg 0 0 6 0 5 0 0 0 0
rg 0 7 0 0 3 9 1 0 0
rg 0 8 0 0 0 0 0 3 0

rg 0 0 0 0 0 2 5 1 8
rg 0 0 0 0 0 0 0 0 0
rg 7 5 9 8 0 0 0 0 0

rg 0 6 0 0 0 0 0 7 0
rg 0 0 2 5 9 0 0 4 0
rg 0 0 0 0 6 0 3 0 0
," 45 minutes human"

CREATE grid2
rg 9 2 0 0 0 0 0 0 8
rg 0 8 0 0 0 0 0 5 1
rg 0 0 1 5 0 0 3 0 0

rg 0 0 0 9 0 7 8 0 0
rg 0 0 0 0 0 0 0 0 0
rg 0 0 3 6 0 2 0 0 0

rg 0 0 6 0 0 4 7 0 0
rg 5 7 0 0 0 0 0 8 0
rg 8 0 0 0 0 0 0 9 3
," 2 hours human"

CREATE grid3
rg 0 0 0 0 0 3 5 0 0
rg 0 0 0 7 0 0 4 0 0
rg 7 0 3 0 0 6 0 1 0

rg 0 1 0 0 0 5 0 0 6
rg 8 0 0 0 0 0 0 0 2
rg 4 0 0 3 0 0 0 8 0

rg 0 5 0 4 0 0 1 0 8
rg 0 0 8 0 0 9 0 0 0
rg 0 0 9 8 0 0 0 0 0
," 2 hours for a human, maybe impossible"

CREATE grid4
rg 2 0 0 0 3 0 7 0 6
rg 0 0 8 4 0 0 0 0 0
rg 6 0 0 0 9 0 0 3 0

rg 0 0 0 0 7 0 0 5 0
rg 8 0 9 3 0 5 1 0 2
rg 0 4 0 0 6 0 0 0 0

rg 0 6 0 0 2 0 0 0 8
rg 0 0 0 0 0 7 4 0 0
rg 3 0 2 0 5 0 0 0 9
," unknown source"

CREATE grid5
rg 9 0 0 0 1 0 6 0 0
rg 0 0 0 8 0 0 0 0 7
rg 6 3 0 0 0 7 2 0 0

rg 0 0 0 0 0 0 5 7 4
rg 0 0 0 2 4 3 0 0 0
rg 0 4 1 0 0 0 0 0 0

rg 0 0 9 6 0 4 8 3 1
rg 4 0 0 0 0 8 0 0 0
rg 0 0 8 0 2 0 0 0 9
," Paul Hsieh example #1"

CREATE grid6
rg 0 0 9 6 0 0 1 0 0
rg 0 0 0 0 0 0 0 2 4
rg 6 0 0 0 0 2 0 9 0

rg 3 0 0 4 2 0 0 0 0
rg 0 8 0 3 1 0 0 0 7
rg 2 0 4 0 0 0 0 1 8

rg 0 0 0 7 0 4 8 0 2
rg 5 4 0 0 3 0 0 0 1
rg 7 9 0 5 0 0 0 0 0
," Paul Hsieh example #2"

CREATE grid7
rg 0 5 0 0 0 8 6 0 0
rg 7 0 0 5 4 0 0 0 9
rg 0 1 0 0 6 0 0 0 3

rg 6 0 0 0 0 0 0 0 0
rg 0 3 0 0 0 0 0 8 0
rg 0 0 0 0 0 0 0 0 5

rg 9 0 0 0 3 0 0 1 0
rg 4 0 0 0 7 6 0 0 8
rg 0 0 1 8 0 0 0 7 0
," Paul Hsieh example #3"

### Marcel Hendrix

Sep 3, 2005, 6:18:54 PM9/3/05
to
> steph...@mpeforth.com (Stephen Pelc) writes Re: Sudoku puzzle solver
> [..]
>> Putting the timer around StartSolving on my P4 2.8GHz WXPpro box
>> yields 3 ms on VFX Forth 3.80 build 1921. The timer resolution is
>> 1 ms. I suspect that Rob's results are dominated by text display time.

> Must be.

> I whacked at Robert's source as hard as I could (without actually improving

^^^^^^^^^^^^^^^^^^
> the algorithm)
[..]

> FORTH> speedthem ( 3 GHz PIVht )
> 0.496 milliseconds (originally 4.36 ms for the computer)
> 0.488 milliseconds (45 minutes human)
> 2.087 milliseconds (2 hours human)
> 0.464 milliseconds (2 hours for a human, maybe impossible)
> 0.041 milliseconds (unknown source)
> 0.080 milliseconds (Paul Hsieh example #1)
> 0.058 milliseconds (Paul Hsieh example #2)
> 0.058 milliseconds (Paul Hsieh example #3) ok

I was wrong.

FORTH> speedthem ( 3 GHz PIVht )

0.194 milliseconds (originally 4.36 ms for the computer)
0.196 milliseconds (45 minutes human)
0.799 milliseconds (2 hours human)
0.182 milliseconds (2 hours for a human, maybe impossible)
0.020 milliseconds (unknown source)
0.037 milliseconds (Paul Hsieh's example #1)
0.029 milliseconds (Paul Hsieh's example #2)
0.027 milliseconds (Paul Hsieh's example #3) ok

-marcel

### jza...@mail.com

Sep 4, 2005, 11:34:08 AM9/4/05
to
Here are some speedups to the original code in Robert Spykerman's
sudokusolver code. Some are trivial, but a few are significant.
These changes are mere code replacements and don't affect the
algorithm, which I'm still studying, before I try to see if
there are algorithmic optimizations. Just for fun *WHEN* I have time.

\ Original code
: getnumber ( position -- n ) sudokugrid swap + c@ ;

\ Modified code
: getnumber ( position -- n ) sudokugrid + c@ ;

-------------------------------------------------------
\ Original code

: cleargrid sudokugrid 81 0 do dup i + 0 swap c! loop drop ;

\ Modified code
: cleargrid 81 0 do 0 sudokugrid i + c! loop ;
-------------------------------------------------------

\ Original code
: try ( number position -- t|f)

over 1 swap lshift
over getbits and 0= -ROT 2drop
;

\ Modified code
: try ( number position -- t|f )
getbits swap 1 swap lshift and 0=
\ getbits 1 rot lshift and 0 =
;
-------------------------------------------------------

\ The original version checks the whole grid to create
\ a count of the number of remaining spaces in the grid.
\ This is unnecessary. All you need to do is check to see if
\ any space exists, and terminate search if this is true.
\ Since morespaces? is used in solver this creates a
\ significant speedup by not going thru the full grid
\ everytime.

: morespaces? ( -- n|0) \ Original code

0 81 0 do sudokugrid i + c@ 0= if 1+ then loop ;

: morespaces? ( -- t|f) \ Modified code
81 0 do sudokugrid i + c@ 0= if unloop TRUE exit then loop FALSE ;

Jabari Zakiya

### robert spykerman

Sep 5, 2005, 9:48:51 AM9/5/05
to
On 4 Sep 2005 08:34:08 -0700, jza...@mail.com wrote:

>Here are some speedups to the original code in Robert Spykerman's
>sudokusolver code. Some are trivial, but a few are significant.
>These changes are mere code replacements and don't affect the
>algorithm, which I'm still studying, before I try to see if
>there are algorithmic optimizations. Just for fun *WHEN* I have time.

..

Thanks for that... If any of ye are free, DO tear this code apart.
It's been a useful exercise posting up here, seeing those glaring ...
ermm.. inelegancies :)

By the way, there is a bit of a slight logical flaw in the
FINDNEXTMOVE word. Marcel may already have picked it up, I suspect...
But it's got to do with this fragment:

...
81 0 do
i sudokugrid + c@ 0= IF
i getbits countbits 9 swap -

...

I didn't see this earlier but when FINDNEXTMOVE hits a 0 with NO LEGAL
MOVES, ie the composite bitmap (or set union, call it what you want)
has all 9 members in it, ie FULL, ie 111111110, the bitcount is 9 and
hence 9 - 9 = 0.

What I meant to do here was get the empty space with the smallest
composite bitmap still able to accommodate a new member, ie with a
bitcount of 8. So 9-8 = 1. Or if that doesn't exist, get the next best
one, hopefully one with a bitcount of 7, ie 9-7 = 2.

The rest of FINDNEXTMOVE actually computes the emptyspace index with
the smallest result of 9-BITCOUNT.

So you can see that 9-9 (for a 'FULL' space) is actually less than
9-8=1 for the space I ideally want to start solving at..

That's not a real issue, because the SOLVER word compensates for this
by trying it and backtracking, and it would appear that this ugliness
does not cause much of a speedbump... it only happens really 'deep'.

One could modify the offending fragment, but so far, all the
modifications I have tried seem to result in slower code because it
involves either a compare or a sequence of expensive instructions...
ie..

Caveat: I have yet to time this properly with a proper timer and not
GetTickCount...

Fragment 1: minus with 8 instead of 9... And strip sign bit.
: findnextmove

-1 10 \ index prev_possibilities --

\ initial start values...

81 0 do
i sudokugrid + c@ 0= IF

i getbits countbits 8 swap - \ gets no. of possibilities
\$7FFFFFFF AND \ strips sign bit for -1 (ie 8-9)

2dup > if
\ if new_possibilities < prev_possibilities...
nip nip i swap
\ new_index new_possibilies --
else \ else prev_possibilities < new possibilities, so:
drop \ new_index new_possibilies --
then
THEN
loop
drop

;.

or Fragment 2: \ use unsigned compare...
: findnextmove

-1 10 \ index prev_possibilities --

\ initial start values...

81 0 do
i sudokugrid + c@ 0= IF

i getbits countbits 8 swap - \ gets no. of possibilities
2dup U> if \ UNSIGNED compare

\ if new_possibilities < prev_possibilities...
nip nip i swap
\ new_index new_possibilies --
else \ else prev_possibilities < new possibilities, so:
drop \ new_index new_possibilies --
then
THEN
loop
drop

;.

The above 2 fragments appear to run slower than just letting the
SOLVER mark out the occassional bad trail and backtrack...

But logically for peace of mind, the code looks nicer mainly because
it works slightly more correctly....

So that's one other word I have put at the back of my mind to try and
optimise...

Oh, I probably could have factorised MORESPACES? into above as well
with a few modifications...

Robert

### Rod Oakford

Sep 13, 2005, 5:25:08 PM9/13/05
to
Hello,

I have written an application to play and solve sudoku puzzles using
Win32Forth. (See http://groups.yahoo.com/group/win32forth/messages).
I have been reading this thread with interest and implemented a basic
brute force method to find a solution which was fast enough for my
purposes - less than 100 milliseconds for most puzzles on a P700.

I would also like to find a way to determine the number of solutions a
puzzle has and create random new puzzles with a unique solution. Any
suggestions would be appreciated.

Regards,
Rod

### Robert Spykerman

Sep 15, 2005, 2:22:54 AM9/15/05
to
Hiya Rod

...

> I would also like to find a way to determine the number of solutions a
> puzzle has and create random new puzzles with a unique solution. Any
> suggestions would be appreciated.

As Marcel suggested, I might suggest looking at:

http://www.paulspages.co.uk/sudoku/

While there is no source available, if you were looking for a puzzle
generator, IIRC there's a lot of food for thought there on how the
human approach is, and by running his applet you may get some ideas...

He describes the human algos... which add more rules to the algo I used
personally myself in my solver, but I am not sure that implementing
extra rules is going to help execution speed, at least from the point
of view of my circumstances, very much... ( I have to say I haven't had
the time to experiment very much though in the past 2 weeks ).

My algo assumes backtracking is likely and accommodates for it. Having
not solved many sudokus, the lazy git that I am, I had initially
assumed that sudoku players had to backtrack most times.

As I understand it, apparently the puzzles that are usually printed in
the press are solvable in most instances with no backtracking - or
perhaps just one backtrack step.

I believe finding puzzles with unique solutions may not be such a big
issue

Rather, the bigger issue may be designing puzzles that humans will
actually WANT to solve. I think for this, it may be that a min of 18-20
squares need numbers in, and in the right places.

I guess that in itself is another puzzle - which may be itself
something worth pursuing next. But at the moment I've a to-do list that
actually recurses, with no apparent sign of unwinding. Bother...

Back to uniqueness.. Hmmm... I believe, but cannot prove yet, that if
one has reached a threshold of a certain number of numbers on the grid,
the solution will have to be uniqute.

But I'm no math/group theory whizz. By any stretch of the imagination.

Cheers

Robert

### Rod Oakford

Sep 19, 2005, 4:01:31 PM9/19/05
to

Hi Robert,

I have found a way to determine the number of possible solutions a
puzzle has:- simply by preventing the solver routine from exiting when
a solution has been found and incrementing a counter instead until all
spaces have been tried with all possible numbers that are allowed. I
was able to confirm that the following puzzle (from a Sky One
competition) has 1905 solutions in just a few seconds. I have added
this feature to my program to check the number of solutions possible
though in most cases puzzles set in newspapers will only have one
taken rises very quickly. As for making random puzzles, it is going to
take much more work especially to be able to determine the level of
dificulty as far as a human solver is concerned.

506020903
008000500
000000000
600285009
000903000
800761004
000000000
004000300
201050607

This sudoku puzzle has 1905 different solutions even though 26 numbers
are given.

Rod