Micro-optimizations, indeed.
The big guns are avoiding redundant computation, particularly expensive
operations like trig functions and square roots.
"The fastest computation is the one avoided."
Among the more significant micro-optimizations is replacing frequently
referenced literal constants, particularly multi-digit ones, with short
variables.
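The win comes because the interpreter stores a numeric literal as its ASCII
text and has to re-convert it to floating point every time the line executes,
whereas a variable reference is just a table lookup. A trivial invented
example:

   100 FOR I = 1 TO 1000 : X = X + 3.14159 : NEXT

runs noticeably faster as

   10 P = 3.14159 : REM literal parsed exactly once, here
   100 FOR I = 1 TO 1000 : X = X + P : NEXT

ISTR the variable table is searched linearly in order of creation, so it also
pays to create the hottest variables first.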
ISTR that FOR pushes the loop-top address on a stack so NEXT can jump
straight back, whereas GOTO and GOSUB have to search the line-number chain
for their targets. So for FOR/NEXT loops there's nothing to gain by
positioning or line-number fiddling.
It would be interesting to profile this program to see what fraction of its
execution time is spent in various regions.
This doesn't have to be elaborate--if some region is taking 30% of the time,
then on average three out of every ten samples will land in that region.
Taking just 20
to 30 samples will instantly reveal the approximate proportion of time
spent in the BASIC interpreter, as opposed to the time spent in graphics or
trig ROM routines.
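(A back-of-the-envelope check, mine, nothing rigorous: with n samples, the
observed fraction for a region whose true share is p wobbles by about
sqrt(p*(1-p)/n). At p = 0.3 and n = 25 that's roughly 0.09, so a genuine 30%
hot spot will show up as anywhere from about 20% to 40% of the samples --
sloppy, but plenty good enough to find the big peaks.)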
Such profiling requires only a ROM memory map and an emulator with a
debugger option. For example, in AppleWin F7 enters the debugger, revealing
the PC address. Another F7 returns to normal execution. So pressing F7,
recording the PC, and pressing F7 again yields one sample; wait a few
seconds and repeat for the next.
Repeat this 20 to 30 times, then use the ROM map to see what each sample
was doing, and group the results. Presto--the truth about where the
execution time is going!
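A tally might come out looking something like this (numbers invented purely
for illustration):

   Region (from the ROM map)      Samples
   -----------------------------  -------
   Applesoft interpreter loop          11
   FP math / trig ROM routines          9
   Hi-res graphics routines             4
   Everything else                      1
                                  -------
                                       25

Eleven of 25 samples in the interpreter itself would say the tokenized BASIC
is the main cost; a pile in the trig routines would say go after the
redundant SIN/COS calls first.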
Find the top two or three "peaks" and see if any optimization is possible
to reduce them. Everything else is essentially "noise".
I once sped up a prototype compiler by a factor of three in about ten
minutes by using this approach--stopping the machine every few seconds
during a compile and writing the PC on the back of a punched card. (Now you
know approximately when this was!)
Fully three-quarters of the samples were in the lexical scanner, and a brief code
inspection revealed a glaring (but pretty) inefficiency that could be fixed
with two lines of code!
A slight variation on this technique that is more BASIC-oriented is to
sample the current line number instead of the PC, which can be
semi-automated in the AppleWin debugger by setting a memory "mini-window" on
the appropriate zero page bytes. (Of course, if there is a lot of
computation per line, this may not provide enough detail.)
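For the curious (this is from memory, so take it with salt): ISTR Applesoft
keeps the current line number at $75-$76 (117-118 decimal), low byte first,
so that's where to point the mini-window. A line can even be made to print
its own number as a sanity check:

   500 PRINT PEEK(117) + 256 * PEEK(118) : REM prints 500 if I have it right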
Optimization should be about getting the large improvements first, before
even considering fooling with trivialities. A 1% improvement that adds
complexity or hurts transparency is usually a bad tradeoff. (The exception
is where that 1% makes the difference between something being possible and
impossible. ;-)