I found a cheat sheet from Beagle Bros that lists useful ROM subroutine
calls. One of these (at $F3F2) is purported to clear the
high-resolution screen that is currently being used, but when I try to
use it, I can't seem to get it to work.
Here is the code that I am trying:
        ORG  $1000
TXTCLR  EQU  $C050      ; text mode off (graphics on)
MIXCLR  EQU  $C052      ; mixed mode off
HISCR   EQU  $C054      ; use HIRES screen 1
HIRES   EQU  $C057      ; HIRES mode on
CLEAR   EQU  $F2F3
        LDY  #$00       ; set graphics modes
        STY  TXTCLR
        STY  MIXCLR
        STY  HIRES
        STY  HISCR
        JSR  CLEAR      ; CLEAR the screen
        TYA
        BRK
Am I doing something wrong here? The screen is still filled with
random data after execution. (I wrote a subroutine to clear the
screen, but the built-in version is probably much faster)
I've got my mind set on re-creating pong, and I noticed that there
were some other useful-looking calls that I might be able to use,
including Hi-Res Collision-Check ($E6). Where might I find more on
these calls and how to use them?
If anyone might direct me to some more information about this, it
would be greatly appreciated! (I just ordered a book called "Assembly
Language for Applesoft Programmers" that should give me a little bit
more info, but I'm impatient, hah )
Page zero location $E6 must be preset to the graphics page you
are clearing: $20 for HGR1 and $40 for HGR2.
$F2F3 clears the current page to black, but you can set the page
to any color by placing a "color byte" in the A register and calling
$F3F4, or by setting $1C to the desired color byte and calling $F3F6.
> I've got my mind set on re-creating pong, and I noticed that there
> were some other useful-looking calls that I might be able to use,
> including Hi-Res Collision-Check ($E6). Where might I find more on
> these calls and how to use them?
The collision counter is incremented when a DRAW or XDRAW changes a
screen 1 to a 0, so it will be greater than zero after the "draw" if
the shape "collided" with something non-black.
(I've frequently used it to "read" the hi-res screen a pixel at a time
by XDRAWing a single-pixel shape and checking the collision counter.)
> If anyone might direct me to some more information about this, it
> would be greatly appreciated! (I just ordered a book called "Assembly
> Language for Applesoft Programmers" that should give me a little bit
> more info, but I'm impatient, hah )
What you really want is "All About Applesoft" from Call-A.P.P.L.E.,
and a commented listing of the Applesoft ROMs is also very helpful.
I believe that both are available on the web, though I find myself using
the Applesoft disassembly produced by Sourcerer (on the back side of
Merlin 8) most often.
-michael
NadaNet 3.0 for Apple II parallel computing!
Home page: http://home.comcast.net/~mjmahon/
"The wastebasket is our most important design
tool--and it's seriously underused."
> > CLEAR EQU $F2F3
<snip>
> Page zero location $E6 must be preset to the graphics page you
> are clearing: $20 for HGR1 and $40 for HGR2.
>
> $F2F3 clears the current page to black, but you can set the page
> to any color by placing a "color byte" in the A register and calling
> $F3F4, or by setting $1C to the desired color byte and calling $F3F6.
I'm no assembly programmer, but looking at the Beagle chart (and A2
FAQ), it would seem you both have made the same typo - isn't the
correct address to call $F3F2, *not* $F2F3?
Cheers,
Mike
I recommend doing a search for a book called "Hi-Res Graphics and
Animation using Assembly Language" by Leonard Malkin which also
touches on double Hi-Res for the Apple IIc. It pretty much covers all
the animation drawing routines and has some good routines that cover
counters and collision as well.
Someone may have it in .pdf format. If not, I may decide to sit down
and scan it sometime. All 313 pages. :)
Rob
If you're more interested in learning assembly language I would steer
clear of using any routines in the ROM and write your own. They will
be faster and you will learn more about assembly language by doing so.
Of course studying the ROMs for coding tips and tricks is a good idea.
When I was a kid I started with the excellent Graph Paper articles:
http://www.atarimagazines.com/creative/index/index.php?author=David+Lubar
Only a few are able to be read there. I'd recommend "A bit of a shift"
and "The graph paper" (the final article). Hopefully someone has links
to the complete articles.
Cheers,
Nick.
Ooh! Yes please! :-D
Cheers,
Nick.
You're absolutely right. I read the later entry points off the listing,
but copied the first from the question--they are all adjacent. ;-)
And the "clear the hi-res page" routine is, in fact, one of the slower
Applesoft ROM routines!
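Putting the two fixes from this thread together (preset $E6, and call
$F3F2 rather than $F2F3), the original program might look like the
sketch below. It assumes the Applesoft ROM is switched in; the HPAG
label is just a name chosen here for location $E6, and the final RTS
replaces the original TYA/BRK so the routine returns to its caller.

```asm
        ORG  $1000
TXTCLR  EQU  $C050      ; text mode off (graphics on)
MIXCLR  EQU  $C052      ; mixed mode off
HISCR   EQU  $C054      ; use HIRES screen 1
HIRES   EQU  $C057      ; HIRES mode on
HPAG    EQU  $E6        ; page used by the ROM hi-res routines (name assumed)
CLEAR   EQU  $F3F2      ; corrected address (not $F2F3)

        LDY  #$00       ; set graphics modes
        STY  TXTCLR
        STY  MIXCLR
        STY  HIRES
        STY  HISCR
        LDA  #$20       ; $20 = HGR1, $40 = HGR2
        STA  HPAG       ; tell the ROM which page to clear
        JSR  CLEAR      ; clear that page to black
        RTS
```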
The Monitor ROM ($F800..$FFFF) is by Woz and Allen Baum, and is a fine
tutorial in assembly language techniques. The Applesoft ROM, not so
much.
I'm gonna go buy a lottery ticket now, 'cause it's a rare day indeed
when I catch you out.... ;-)
Cheers,
Mike
Hmmm.... Title fight between Woz & Microsoft: Microsoft is KO'ed in the
first 30 seconds of the first round... ;-)
Cheers,
Mike
I have this book. The explanations are good but the code is mediocre.
--
Paul Santa Maria
Maumee, Ohio USA
The ROM routines by Woz and Baum were written to optimize size, not
speed.
The Hi-Res routines in Applesoft were written by Woz. See the
Programmer's Aid #1 manual for assembly source code.
My standard way of clearing the hi-res screen is:
        LDA #0          ;FILL BYTE
        LDY #PAGE       ;PAGE=$20, $40, or $60
        STY PTR+1
        LDY #0          ;INIT INDEX
        STY PTR
        LDX #$20        ;PAGE COUNT
a       STA (PTR),Y
        INY
        BNE a
        INC PTR+1
        DEX
        BNE a
        RTS
If I need speed, then I use
; clear hi-res page 1
        LDA #0          ;FILL BYTE
        LDY #0          ;INIT INDEX
a       STA $2000,Y
        STA $2100,Y
        STA $2200,Y
        STA $2300,Y
        ...
        STA $3E00,Y
        STA $3F00,Y
        INY
        BNE a
        RTS
Very nice unrolling, Paul! That must be pretty speedy! Is it
possible to avoid screen holes without losing too much performance?
-Brendan
Insert "CPY #$F8" directly after the INY and you should be done.
The memory gain is negligible (unless you absolutely want to
preserve the screen holes) but it's in fact a bit speedier!
While you need additional cycles for the comparison:
(CPY #)= 2 cycles x 248 runs = 496 cycles
you save more cycles because you avoid eight index runs:
(STA $aaaa,Y) = 5 cycles x 32 instructions x 8 runs = 1280 cycles
(INY) = 2 cycles x 8 runs = 16 cycles
(BNE with branch) = 3 cycles x 8 runs = 24 cycles
total = 1320
Cycles gained: 1320 - 496 = 824
Branches over page boundaries ignored in both cases and the
final BNE with no branch (2 cycles) happens in both variants.
bye
Marcus
The "CPY #$F8" does not work because screen holes happen only once
every $400 bytes, so the program needs a little more coding. Screen
holes are from
$23F8 - $23FF
$27F8 - $27FF
$2BF8 - $2BFF
$2FF8 - $2FFF
$33F8 - $33FF
$37F8 - $37FF
$3BF8 - $3BFF
$3FF8 - $3FFF
Here is a program that I use to save the screen holes:
        LDY #PAGE       ;PAGE=$20, $40, or $60
        STY PTR+1
        LDY #0
        STY PTR
        LDX #$20
a       LDA #0
b       STA (PTR),Y
        INY
        BNE b
        INC PTR+1
        DEX
        BEQ d
        TXA
        AND #3
        CMP #3
        BNE a
        LDA #0
c       STA (PTR),Y
        INY
        CPY #$F8
        BCC c
        INC PTR+1
        DEX
        BNE b
d       RTS
As you can see, only the one page out of 4 would get slowed down due
to the comparison.
Rob
Here is the corrected version
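        LDY #PAGE       ;PAGE=$20, $40, or $60
        STY PTR+1
        LDY #0
        STY PTR
        LDX #$20        ;PAGES LEFT TO CLEAR
a       LDA #0
b       STA (PTR),Y     ;CLEAR A FULL PAGE
        INY
        BNE b
        INC PTR+1
        DEX
        TXA
        AND #3
        CMP #1          ;IS THE NEXT PAGE THE 4TH OF ITS GROUP?
        BNE a
        LDA #0
c       STA (PTR),Y     ;CLEAR $00-$F7 ONLY
        INY
        CPY #$F8
        BCC c
        LDY #0          ;RESET INDEX FOR THE NEXT FULL PAGE
        INC PTR+1
        DEX
        BNE a
        RTS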
I'm not sure about this. Aren't there screen holes at 78-7F as well
as F8-FF? Assuming my memory's right, I can come up with three options.
If you have tons of code space, you can do the thoughtless thing
and just double the loop length:
        LDA #0
        LDY #$77
a       STA $2000,Y
        STA $2080,Y
        ...
        STA $3F00,Y
        STA $3F80,Y
        DEY
        BPL a
Or you can do something like what Calibrator said but skip over the
middle hole along the way:
        LDA #0
        LDY #0
a       STA $2000,Y
        ...
        STA $3F00,Y
        INY
        CPY #$F8
        BEQ end
        CPY #$78
        BNE a
        LDY #$80
        BNE a
end
Or involve the X register and run the same loop twice with different
index values but the same exit condition:
        LDA #0
        LDY #$77
b       LDX #$77
a       STA $2000,Y
        STA $2100,Y
        ...
        STA $3F00,Y
        DEY
        DEX
        BPL a
        CPY #$7F
        BEQ end
        LDY #$F7
        BNE b
end
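For reference, the third option written out in full for hi-res page 1
might look like the sketch below. The "..." in the listing above is
expanded to all 32 stores; the "done" label and the closing RTS are
additions made here, and the listing has not been run on real hardware.
The loop body is 100 bytes, so both backward branches stay within the
6502's 128-byte branch range.

```asm
* Clear hi-res page 1 to black, skipping $xx78-$xx7F and $xxF8-$xxFF
        LDA #0          ;FILL BYTE
        LDY #$77        ;FIRST PASS: OFFSETS $77 DOWN TO $00
b       LDX #$77        ;120 ITERATIONS PER PASS
a       STA $2000,Y
        STA $2100,Y
        STA $2200,Y
        STA $2300,Y
        STA $2400,Y
        STA $2500,Y
        STA $2600,Y
        STA $2700,Y
        STA $2800,Y
        STA $2900,Y
        STA $2A00,Y
        STA $2B00,Y
        STA $2C00,Y
        STA $2D00,Y
        STA $2E00,Y
        STA $2F00,Y
        STA $3000,Y
        STA $3100,Y
        STA $3200,Y
        STA $3300,Y
        STA $3400,Y
        STA $3500,Y
        STA $3600,Y
        STA $3700,Y
        STA $3800,Y
        STA $3900,Y
        STA $3A00,Y
        STA $3B00,Y
        STA $3C00,Y
        STA $3D00,Y
        STA $3E00,Y
        STA $3F00,Y
        DEY
        DEX
        BPL a           ;LEAVE THE PASS WHEN X WRAPS TO $FF
        CPY #$7F        ;Y=$7F ONLY AFTER THE SECOND PASS
        BEQ done
        LDY #$F7        ;SECOND PASS: OFFSETS $F7 DOWN TO $80
        BNE b           ;ALWAYS TAKEN
done    RTS
```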
Why do you want to preserve the different screen holes of the HGR
screens?
I understand that when you deal with the text screen as useful data is
recorded there but for high resolution pages where you will probably
use page flipping, sprites, page erasing, etc. then speed is required
and Paul's second algorithm is fine.
If you want to save room, use the routine in ROM. If you want speed,
forget about the screen holes.
antoine
According to the Apple IIe TechNote #10 you are right!
This TN handles the IIe-card for the Mac LC and I quote:
---
Notes:
1. The "Screen-Hole" areas in the above address ranges do
not trap.
These are the $xx78-7F and $xxF8-FF address ranges in the
display areas.
---
This makes sense as
- 240 bytes of each page x 32 pages = 7680 bytes
and
- 40 bytes per hires-line x 192 lines = 7680 bytes!
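With holes at both $xx78-7F and $xxF8-FF, a compact pointer-based clear
that leaves every hole untouched could be sketched as follows. This is
not from any post in the thread: the choice of $06/$07 for PTR is an
assumption (any free zero-page pair would do), and it trades speed for
size compared with the unrolled loops.

```asm
PTR     EQU  $06        ;ASSUMED-FREE ZERO-PAGE PAIR $06/$07
PAGE    EQU  $20        ;$20 = HGR1, $40 = HGR2

        LDA  #PAGE
        STA  PTR+1
        LDA  #0         ;POINTER LOW BYTE, ALSO THE FILL BYTE
        STA  PTR
        LDX  #$20       ;32 PAGES OF 256 BYTES
c       LDY  #$77
a       STA  (PTR),Y    ;OFFSETS $77 DOWN TO $00
        DEY
        BPL  a          ;Y WRAPS TO $FF AND ENDS THE LOOP
        LDY  #$F7
b       STA  (PTR),Y    ;OFFSETS $F7 DOWN TO $80
        DEY
        BMI  b          ;Y REACHES $7F AND ENDS THE LOOP
        INC  PTR+1      ;NEXT PAGE
        DEX
        BNE  c
        RTS
```

Each page gets 120 + 120 = 240 stores, which matches the 240 visible
bytes per page worked out above.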
> Assuming my memory's right, I can come up with three options.
>
> If you have tons of code space, you can do the thoughtless thing
> and just double the loop length:
>
> LDA #0
> LDY #$77
> a STA $2000,Y (5)
> STA $2080,Y (5)
> ...
> STA $3F00,Y (5)
> STA $3F80,Y (5)
> DEY (2)
> BPL a (2/3)
You'd need a branch over a "JMP a" combo at the end as the
unrolled loop would be larger than the branch range...
(64 STA instructions with 3 bytes each = 192 bytes)
About 39600 cycles in total with a JMP and not counting
the register setup at the beginning of the routine - but about
200 bytes of code (400 bytes if you need both hires pages)...
> Or you can do something like what Calibrator said but skip
> over the middle hole along the way:
Sounds interesting!
> LDA #0
> LDY #0
> a STA $2000,Y (5)
> ...
> STA $3F00,Y (5)
> INY (2)
> CPY #$F8 (2)
> BEQ end (2/3)
> CPY #$78 (2)
> BNE a (2/3)
> LDY #$80 (2)
> BNE a (2/3)
> end
The 32 STAs run effectively 240 times x 5 cycles = 38400.
Add to that the stuff beginning with the INY and you'll get:
1 x 7 cycles ($F8 condition)
1 x 15 cycles ($78 condition)
238 x 11 cycles (first BNE successful)
= 2640 cycles
Which results in a total of 41040 cycles.
(I'm probably wrong with the loops but it should be close
enough ;-)
Paul's "standard unrolled version" uses a total of 42239 cycles,
though, as the loop runs 256 times.
With my $F8 check (which still fills about half the holes) it
would need about 40919 cycles...
> Or involve the X register and run the same loop twice with different
> index values but the same exit condition:
>
> LDA #0
> LDY #$77
> b LDX #$77
> a STA $2000,Y
> STA $2100,Y
> ...
> STA $3F00,Y
> DEY (2)
> DEX (2)
> BPL a (2/3)
> CPY #$7F (2)
> BEQ end (2/3)
> LDY #$F7 (2)
> BNE b (always 3)
> end
I hate cycle counting in nested loops - especially when
decrementing! ;-)
The a-loop runs 240 times = 38400+1679 = 40079
Special case #1 (Y=$7F) = +5 cycles (once)
Special case #2 (Y=$FF after DEY) = +11 cycles (once)
= total 40095 cycles (I hope...)
About 2000 cycles faster than Paul's routine (5%) - not very
eye-friendly but perhaps the best trade-off, isn't it?
bye
Marcus
I would do it for speed reasons (see my answer to Mark) but
Rob (ict@ccess) apparently uses them for embedded code.
See this thread:
http://groups.google.com/group/comp.sys.apple2/browse_frm/thread/6764c2d36dd29d52#
> I understand that when you deal with the text screen as useful data is
> recorded there but for high resolution pages where you will probably
> use page flipping, sprites, page erasing, etc. then speed is required
> and Paul's second algorithm is fine.
If speed is very important you would use a line table anyway
and except for the page erasing I don't see why one would
access the screen holes with a sprite routine (or anything
else using the line table) at all.
> If you want to save room, use the routine in ROM. If you want speed,
> forget about the screen holes.
It's always a trade-off of course but if speed is paramount
I claim that bytes written to screen holes are wasted cycles.
bye
Marcus
Yes, I thought to do a similar check after my post, namely:
40 x 192 = 7680 for the graphics
64 x 8 = 512 for the holes
Total = 8192: check!
> You'd need a branch over a "JMP a" combo at the end as the
> unrolled loop would be larger than the branch range...
> (64 STA instructions with 3 bytes each = 192 bytes)
Good point. I did say the method was thoughtless :-)
>> Or you can do something like what Calibrator said but skip
>> over the middle hole along the way:
>
> Which results in a total of 41040 cycles.
>
> Paul's "standard unrolled version" uses a total of 42239 cycles,
> though, as the loop runs 256 times.
>
> With my $F8 check (which still fills about half the holes) it
> would need about 40919 cycles...
I wasn't approaching this from the angle of skipping holes to improve
speed, but as a way of doing a fast clear while preserving the screen
holes in case they were being used. Just a silly exercise really.
So I'm not too surprised that this method turns out slower.
>> Or involve the X register and run the same loop twice with different
>> index values but the same exit condition:
>
> The a-loop runs 240 times = 38400+1679 = 40079
> Special case #1 (Y=$7F) = +5 cycles (once)
> Special case #2 (Y=$FF after DEY) = +11 cycles (once)
> = total 40095 cycles (I hope...)
>
> About 2000 cycles faster than Paul's routine (5%) - not very
> eye-friendly but perhaps the best trade-off, isn't it?
Ah, so this is actually quicker! I've never tried it so I have no idea
how it performs visually.
Thanks for doing the cycle counts!
Mark
It seems I incorrectly assumed that Microsoft had written the
Applesoft routines, my bad. Thanks for setting me straight. It must be
my inner desire to blame Microsoft for everything bad in this
world.. ;-)
Cheers,
Mike
They are all "ugly" IMHO as they aren't working the screen line
by line. Totally irrelevant if you use both pages and draw on the
invisible page, though.
With "eye-friendly" I meant the source code itself ;-)
bye
Marcus
Actually, Microsoft did write Applesoft--with the exception of a
few Apple II specials, like lo- and hi-res graphics, paddles, etc.
The Applesoft interpreter is a bit uneven in its quality. Some
tradeoffs seem to have been carefully thought out, but parts of
it look like a hurried conversion from another microprocessor.
My guess is that the programmer was not very familiar with
6502 coding techniques going into the project...a situation
that was quite common in the early days.
The routine in the Applesoft ROM can be used to clear any screen to
any "color" (including some striped/dithered patterns). To support
this it complements the low 7 bits of A on each horizontally adjacent
byte to preserve the "color meanings" for both even and odd bytes.
And it does the complementing in a *subroutine* instead of in-line--
a clear case of optimizing for space regardless of speed.
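As a rough sketch of using that capability from assembly, based on the
entry points given earlier in this thread ($F3F4 with the color byte in
A, and $E6 preset to the page): the labels HPAG and HFILL are names
made up here, and the Applesoft ROM is assumed to be switched in.

```asm
HPAG    EQU  $E6        ;PAGE USED BY THE ROM ROUTINES (NAME ASSUMED)
HFILL   EQU  $F3F4      ;FILL PAGE WITH COLOR BYTE IN A (NAME ASSUMED)

        LDA  #$20       ;$20 = HGR1, $40 = HGR2
        STA  HPAG
        LDA  #$7F       ;ALL SEVEN PIXEL BITS SET, HI-BIT OFF
        JSR  HFILL
        RTS
```

A byte with all seven pixel bits set should fill the page with white.
A byte such as $2A is the kind of value the complementing is for, since
adjacent bytes then alternate between $2A and $55.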
It's also interesting that the capability of "clearing" a screen to
a color other than Black1 (hi-bit off) is not used in Applesoft!
This suggests to me that the incorporation of Woz's Programmer's Aid #1
hi-res routines was done not by Woz, but by some delegate, maybe even
a Microsoft employee. ;-)
Aha! So my title fight metaphor earlier in this thread may still be
valid! I knew it!! ;-)
Cheers,
Mike
A very different visual effect can be achieved by replacing
the single INY with three INY instructions. This tweak is
courtesy of Bruce Artwick (the flight simulator guy).
On Thu, 9 Jul 2009, Michael J. Mahon wrote:
> The Applesoft interpreter is a bit uneven in its quality. Some
> tradeoffs seem to have been carefully thought out, but parts of
> it look like a hurried conversion from another microprocessor.
It was... it was ported over from the Z80.
-uso.
Or perhaps even the 8080...but some ports are done with a much more
subtle adaptation to the target architecture. In the case of 6502
Microsoft BASIC, it seems that it was more important that the port
be completed quickly.
It's interesting to consider the "path not taken", which would
have been Woz completing the integration of floating-point into
Integer BASIC.
(Of course, this would have resulted in Apple BASIC being quite
different from most micro BASICs.)
On Thu, 9 Jul 2009, Michael J. Mahon wrote:
> lyricalnanoha wrote:
>>
>>
>> On Thu, 9 Jul 2009, Michael J. Mahon wrote:
>>
>>> The Applesoft interpreter is a bit uneven in its quality. Some
>>> tradeoffs seem to have been carefully thought out, but parts of
>>> it look like a hurried conversion from another microprocessor.
>>
>> It was... it was ported over from the Z80.
>
> Or perhaps even the 8080...but some ports are done with a much more
> subtle adaptation to the target architecture. In the case of 6502
> Microsoft BASIC, it seems that it was more important that the port
> be completed quickly.
>
> It's interesting to consider the "path not taken", which would
> have been Woz completing the integration of floating-point into
> Integer BASIC.
>
> (Of course, this would have resulted in Apple BASIC being quite
> different from most micro BASICs.)
And more like Atari BASIC, not? IIRC, that was done by the DOS 3.1
people.
-uso.