Hires screen clear

John Brooks

Oct 21, 2015, 1:54:00 AM

On Tuesday, October 20, 2015 at 9:22:53 AM UTC-7, fadden wrote:
> On Monday, October 19, 2015 at 7:50:09 PM UTC-7, jamesiw...@gmail.com wrote:
> > I know this is an old thread, but I just thought I'd post my solution to this problem, in case anyone would like to see it.
>
> Since anything worth doing is worth overdoing...
>
> Discussion:
> https://github.com/fadden/fdraw/blob/master/docs/manual.md#notes
>
> Screen clear code, written two different ways (size vs. speed):
> https://github.com/fadden/fdraw/blob/master/FDRAW.S#L271
>
> The hi-res screen is a quirky beast... which is what makes it so much fun to play with.
>
> It's always fun to solve something then see how other people have solved it. Having gone through the process yourself, you appreciate the subtleties in the other solutions. For example: nice job switching the graphics screen on *after* the clear -- looks much better that way.


I like it! Here's another approach:

*-------------------------------------------------
* Fast, small, right-to-left hires screen clear
* By John Brooks 10/20/2015
* Size: 136 bytes
* Speed: ~5.4 cycles per hires byte. Unrolled 32x.
* >5x faster than ROM HGR
* Inputs: A=color byte for even columns
* Y=color byte for odd columns
*-------------------------------------------------

Clear
sta :_EvenColor+1
sty :_OddColor+1

ldy #3*40-1 ;3x 40-byte rows in a half-page: 128/40=3
:OddColorClc clc ;c=0 on odd columns
:_OddColor lda #$DD ;MOD color for odd columns
:ClearEven
]Offset equ 31*$400
lup 32 ;# of pages in hires screen: 8k/256
sta ]Offset/$2000*$100+]Offset&$1fff+$2000,y
]Offset equ ]Offset-$400
--^

bcs :NextHalfPage
:_EvenColor
lda #$DD ;MOD color for even columns
dey
sec ;c=1 during even columns
bcs :ClearEven ;always
:NextHalfPage
tya
bmi :NextRow
adc #$80+1-1 ;$80=to 2nd half-page. +1 goes back to odd column. -1 due to c=1
tay
bmi :OddColorClc ;always
:NextRow
adc #$80+1-40-1 ;$80=to 1st half-page. +1 goes back to odd column. -40=next row. -1 due to c=1
tay
bpl :OddColorClc ;branch if in same page
adc #3*40-2 ;3*40=# of rows in a half-page. -2 decrements to prev odd/even column
tay
cmp #2*40 ;loop if we're not done with all 40 bytes in all rows
bcs :OddColorClc

rts


And here's the Merlin32 listing:

+-----------------------+-------------------------------------------------------------------
| Address Object Code | Source Code
+-----------------------+-------------------------------------------------------------------
| 00/8000 | *-------------------------------------------------
| 00/8000 | * Fast, small, right-to-left hires screen clear
| 00/8000 | * By John Brooks 10/20/2015
| 00/8000 | * Size: 136 bytes
| 00/8000 | * Speed: ~5.4 cycles per hires byte. Unrolled 32x.
| 00/8000 | * >5x faster than ROM HGR
| 00/8000 | * Inputs: A=color byte for even columns
| 00/8000 | * Y=color byte for odd columns
| 00/8000 | *-------------------------------------------------
| 00/8000 |
| 00/8000 | org $300
| 00/0300 | Clear
| 00/0300 : 8D 6E 03 | sta __EvenColor+1
| 00/0303 : 8C 0A 03 | sty __OddColor+1
| 00/0306 |
| 00/0306 : A0 77 | ldy #3*40-1 ;3x 40-byte rows in a half-page: 128/40=3
| 00/0308 : 18 | _OddColorClc clc ;c=0 on odd columns
| 00/0309 : A9 DD | __OddColor lda #$DD ;MOD color for odd columns
| 00/030B | _ClearEven
| 00/030B | ]Offset equ 31*$400
| 00/030B : 99 00 3F | sta 31744/$2000*$100+31744&$1fff+$2000,y
| 00/030E : 99 00 3B | sta 30720/$2000*$100+30720&$1fff+$2000,y
| 00/0311 : 99 00 37 | sta 29696/$2000*$100+29696&$1fff+$2000,y
| 00/0314 : 99 00 33 | sta 28672/$2000*$100+28672&$1fff+$2000,y
| 00/0317 : 99 00 2F | sta 27648/$2000*$100+27648&$1fff+$2000,y
| 00/031A : 99 00 2B | sta 26624/$2000*$100+26624&$1fff+$2000,y
| 00/031D : 99 00 27 | sta 25600/$2000*$100+25600&$1fff+$2000,y
| 00/0320 : 99 00 23 | sta 24576/$2000*$100+24576&$1fff+$2000,y
| 00/0323 : 99 00 3E | sta 23552/$2000*$100+23552&$1fff+$2000,y
| 00/0326 : 99 00 3A | sta 22528/$2000*$100+22528&$1fff+$2000,y
| 00/0329 : 99 00 36 | sta 21504/$2000*$100+21504&$1fff+$2000,y
| 00/032C : 99 00 32 | sta 20480/$2000*$100+20480&$1fff+$2000,y
| 00/032F : 99 00 2E | sta 19456/$2000*$100+19456&$1fff+$2000,y
| 00/0332 : 99 00 2A | sta 18432/$2000*$100+18432&$1fff+$2000,y
| 00/0335 : 99 00 26 | sta 17408/$2000*$100+17408&$1fff+$2000,y
| 00/0338 : 99 00 22 | sta 16384/$2000*$100+16384&$1fff+$2000,y
| 00/033B : 99 00 3D | sta 15360/$2000*$100+15360&$1fff+$2000,y
| 00/033E : 99 00 39 | sta 14336/$2000*$100+14336&$1fff+$2000,y
| 00/0341 : 99 00 35 | sta 13312/$2000*$100+13312&$1fff+$2000,y
| 00/0344 : 99 00 31 | sta 12288/$2000*$100+12288&$1fff+$2000,y
| 00/0347 : 99 00 2D | sta 11264/$2000*$100+11264&$1fff+$2000,y
| 00/034A : 99 00 29 | sta 10240/$2000*$100+10240&$1fff+$2000,y
| 00/034D : 99 00 25 | sta 9216/$2000*$100+9216&$1fff+$2000,y
| 00/0350 : 99 00 21 | sta 8192/$2000*$100+8192&$1fff+$2000,y
| 00/0353 : 99 00 3C | sta 7168/$2000*$100+7168&$1fff+$2000,y
| 00/0356 : 99 00 38 | sta 6144/$2000*$100+6144&$1fff+$2000,y
| 00/0359 : 99 00 34 | sta 5120/$2000*$100+5120&$1fff+$2000,y
| 00/035C : 99 00 30 | sta 4096/$2000*$100+4096&$1fff+$2000,y
| 00/035F : 99 00 2C | sta 3072/$2000*$100+3072&$1fff+$2000,y
| 00/0362 : 99 00 28 | sta 2048/$2000*$100+2048&$1fff+$2000,y
| 00/0365 : 99 00 24 | sta 1024/$2000*$100+1024&$1fff+$2000,y
| 00/0368 : 99 00 20 | sta 0/$2000*$100+0&$1fff+$2000,y
| 00/036B |
| 00/036B : B0 06 | bcs _NextHalfPage
| 00/036D | __EvenColor
| 00/036D : A9 DD | lda #$DD ;MOD color for even columns
| 00/036F : 88 | dey
| 00/0370 : 38 | sec ;c=1 during even columns
| 00/0371 : B0 98 | bcs _ClearEven ;always
| 00/0373 | _NextHalfPage
| 00/0373 : 98 | tya
| 00/0374 : 30 05 | bmi _NextRow
| 00/0376 : 69 80 | adc #$80+1-1 ;$80=to 2nd half-page. +1 goes back to odd column. -1 due to c=1
| 00/0378 : A8 | tay
| 00/0379 : 30 8D | bmi _OddColorClc ;always
| 00/037B | _NextRow
| 00/037B : 69 58 | adc #$80+1-40-1 ;$80=to 1st half-page. +1 goes back to odd column. -40=next row. -1 due to c=1
| 00/037D : A8 | tay
| 00/037E : 10 88 | bpl _OddColorClc ;branch if in same page
| 00/0380 : 69 76 | adc #3*40-2 ;3*40=# of rows in a half-page. -2 decrements to prev odd/even column
| 00/0382 : A8 | tay
| 00/0383 : C9 50 | cmp #2*40 ;loop if we're not done with all 40 bytes in all rows
| 00/0385 : B0 81 | bcs _OddColorClc
| 00/0387 |
| 00/0387 : 60 | rts
+-----------------------+-------------------------------------------------------------------
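The operand expression inside the `lup 32` block is easier to follow when evaluated by hand. Below is a small Python sketch (an illustration, not the author's code) that assumes the assembler evaluates `]Offset/$2000*$100+]Offset&$1fff+$2000` strictly left to right, which is consistent with the object code in the listing above. The helper name `store_base` is made up for this example.

```python
def store_base(offset):
    """Evaluate ]Offset/$2000*$100+]Offset&$1fff+$2000 left to right."""
    value = offset // 0x2000   # which 8K multiple the offset falls in: 0..3
    value *= 0x100             # turn that into a bump of $000..$300
    value += offset            # add the raw offset back in
    value &= 0x1FFF            # wrap into an 8K window
    value += 0x2000            # rebase onto the hi-res page at $2000
    return value

# ]Offset starts at 31*$400 and steps down by $400, 32 times.
bases = [store_base(k * 0x400) for k in range(31, -1, -1)]
print(" ".join(f"${b:04X}" for b in bases))
```

Under that assumption the 32 bases cover every 256-byte page from $2000 to $3F00 exactly once, in groups of eight stores that are $400 apart.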

-JB
@JBrooksBSI

mwillegal

Oct 21, 2015, 11:59:22 AM
For what it's worth, here's the code used in the Brain Board to clear the hi-res screen. We needed the clear-line routine because this application emulates low-res text mode on the hi-res page, and the scroll function has to clear the bottom line of text on the screen.

regards,
Mike Willegal

;;; Clears hires page 1

LDX #0

CLEAR2
JSR CLEAR_LINE
INX
CPX #48
BNE CLEAR2
;; page cleared



;
; clear line - X contains line #
;
CLEAR_LINE
LDA PG1ROWS,x ; target (was last source)
STA TRGHIGH
inx
LDA PG1ROWS,x ; target
STA TRGLOW

LDA #8
STA CNT3
JMP CL4.1
;
; adjust address to next line of pixels
;
CL4
LDA #$4
CLC
ADC TRGHIGH
STA TRGHIGH
CL4.1
LDY #39
lda #$0
;clear the 40 bytes that make up a line of pixels
CL5
STA (TRGLOW),y
DEY
BPL CL5 ; repeat for the 40 bytes that make up a line of pixels
DEC CNT3
BNE CL4 ; done with this line of pixels - go to next line of pixels
RTS

PG1ROWS HEX 4000 4080 4100 4180 4200 4280 4300 4380 4028 40A8 4128 41A8 4228 42A8 4328 43A8 4050 40D0 4150 41D0 4250 42D0 4350 43D0
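The `PG1ROWS` table follows the usual hi-res interleave, so it can be regenerated from a formula. The Python sketch below is only an illustration; the function names are hypothetical, and the $4000 base is simply taken from the table as posted.

```python
PAGE_BASE = 0x4000  # base address used by the table as posted

def text_row_base(row, page_base=PAGE_BASE):
    """Address of the first pixel line of text row 0..23."""
    return page_base + 0x80 * (row % 8) + 0x28 * (row // 8)

def pixel_line_addresses(row, page_base=PAGE_BASE):
    """The 8 pixel lines of a text row sit $400 apart (the ADC of #$4 to the high byte)."""
    base = text_row_base(row, page_base)
    return [base + 0x400 * line for line in range(8)]

table = [text_row_base(r) for r in range(24)]
posted = [int(word, 16) for word in
          "4000 4080 4100 4180 4200 4280 4300 4380 "
          "4028 40A8 4128 41A8 4228 42A8 4328 43A8 "
          "4050 40D0 4150 41D0 4250 42D0 4350 43D0".split()]
print(table == posted)
```

`pixel_line_addresses(row)` lists the eight 40-byte runs that `CLEAR_LINE` walks for one text row.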

fadden

Oct 21, 2015, 12:45:29 PM
On Tuesday, October 20, 2015 at 10:54:00 PM UTC-7, John Brooks wrote:
> * Fast, small, right-to-left hires screen clear
> * By John Brooks 10/20/2015

Nice! I especially like the odd/even trick.

qkumba

Oct 21, 2015, 2:11:08 PM
Any reason to not use X to hold the even colour? It saves 4 bytes, because you omit the store and replace LDA with TXA, for the same cycle count.

qkumba

Oct 21, 2015, 2:29:44 PM
$379 can also branch to __OddColor instead of _OddColorClc to save 2 cycles per iteration, because the ADC at $376 always clears the carry.

qkumba

Oct 21, 2015, 2:54:25 PM
and save 2 bytes by changing $306 to LDA, and placing a TAY between _OddColorClc and __OddColor (and $379 branches to the TAY instead). Then the three TAYs at $378, $37D, and $382 go away.

qkumba

Oct 21, 2015, 8:53:51 PM
+-----------------------+-------------------------------------------------------------------
| Address Object Code | Source Code
+-----------------------+-------------------------------------------------------------------
| 00/8000 | *-------------------------------------------------
| 00/8000 | * Fast, small, right-to-left hires screen clear
| 00/8000 | * By John Brooks 10/20/2015
| 00/8000 | * Updated by Peter Ferrie 10/21/2015
| 00/8000 | * Size: 130 bytes
| 00/8000 | * Speed: ~5.4 cycles per hires byte. Unrolled 32x.
| 00/8000 | * >5x faster than ROM HGR
| 00/8000 | * Inputs: A=color byte for even columns
| 00/8000 | * X=color byte for odd columns
| 00/8000 | *-------------------------------------------------
| 00/8000 |
| 00/8000 | org $300
| 00/0300 | Clear
| 00/0300 : 8D 6B 03 | sta __EvenColor+1
| 00/0303 |
| 00/0303 : A9 77 | lda #3*40-1 ;3x 40-byte rows in a half-page: 128/40=3
| 00/0305 : 18 | _OddColorClc clc ;c=0 on odd columns
| 00/0306 : A8 | _OddColorTay tay
| 00/0307 : 8A | __OddColor txa ;color for odd columns
| 00/0308 | _ClearEven
| 00/0308 | ]Offset equ 31*$400
| 00/0308 : 99 00 3F | sta 31744/$2000*$100+31744&$1fff+$2000,y
| 00/030B : 99 00 3B | sta 30720/$2000*$100+30720&$1fff+$2000,y
| 00/030E : 99 00 37 | sta 29696/$2000*$100+29696&$1fff+$2000,y
| 00/0311 : 99 00 33 | sta 28672/$2000*$100+28672&$1fff+$2000,y
| 00/0314 : 99 00 2F | sta 27648/$2000*$100+27648&$1fff+$2000,y
| 00/0317 : 99 00 2B | sta 26624/$2000*$100+26624&$1fff+$2000,y
| 00/031A : 99 00 27 | sta 25600/$2000*$100+25600&$1fff+$2000,y
| 00/031D : 99 00 23 | sta 24576/$2000*$100+24576&$1fff+$2000,y
| 00/0320 : 99 00 3E | sta 23552/$2000*$100+23552&$1fff+$2000,y
| 00/0323 : 99 00 3A | sta 22528/$2000*$100+22528&$1fff+$2000,y
| 00/0326 : 99 00 36 | sta 21504/$2000*$100+21504&$1fff+$2000,y
| 00/0329 : 99 00 32 | sta 20480/$2000*$100+20480&$1fff+$2000,y
| 00/032C : 99 00 2E | sta 19456/$2000*$100+19456&$1fff+$2000,y
| 00/032F : 99 00 2A | sta 18432/$2000*$100+18432&$1fff+$2000,y
| 00/0332 : 99 00 26 | sta 17408/$2000*$100+17408&$1fff+$2000,y
| 00/0335 : 99 00 22 | sta 16384/$2000*$100+16384&$1fff+$2000,y
| 00/0338 : 99 00 3D | sta 15360/$2000*$100+15360&$1fff+$2000,y
| 00/033B : 99 00 39 | sta 14336/$2000*$100+14336&$1fff+$2000,y
| 00/033E : 99 00 35 | sta 13312/$2000*$100+13312&$1fff+$2000,y
| 00/0341 : 99 00 31 | sta 12288/$2000*$100+12288&$1fff+$2000,y
| 00/0344 : 99 00 2D | sta 11264/$2000*$100+11264&$1fff+$2000,y
| 00/0347 : 99 00 29 | sta 10240/$2000*$100+10240&$1fff+$2000,y
| 00/034A : 99 00 25 | sta 9216/$2000*$100+9216&$1fff+$2000,y
| 00/034D : 99 00 21 | sta 8192/$2000*$100+8192&$1fff+$2000,y
| 00/0350 : 99 00 3C | sta 7168/$2000*$100+7168&$1fff+$2000,y
| 00/0353 : 99 00 38 | sta 6144/$2000*$100+6144&$1fff+$2000,y
| 00/0356 : 99 00 34 | sta 5120/$2000*$100+5120&$1fff+$2000,y
| 00/0359 : 99 00 30 | sta 4096/$2000*$100+4096&$1fff+$2000,y
| 00/035C : 99 00 2C | sta 3072/$2000*$100+3072&$1fff+$2000,y
| 00/035F : 99 00 28 | sta 2048/$2000*$100+2048&$1fff+$2000,y
| 00/0362 : 99 00 24 | sta 1024/$2000*$100+1024&$1fff+$2000,y
| 00/0365 : 99 00 20 | sta 0/$2000*$100+0&$1fff+$2000,y
| 00/0368 |
| 00/0368 : B0 06 | bcs _NextHalfPage
| 00/036A | __EvenColor
| 00/036A : A9 DD | lda #$DD ;MOD color for even columns
| 00/036C : 88 | dey
| 00/036D : 38 | sec ;c=1 during even columns
| 00/036E : B0 98 | bcs _ClearEven ;always
| 00/0370 | _NextHalfPage
| 00/0370 : 98 | tya
| 00/0371 : 30 04 | bmi _NextRow
| 00/0373 : 69 80 | adc #$80+1-1 ;$80=to 2nd half-page. +1 goes back to odd column. -1 due to c=1
| 00/0375 : 30 8F | bmi _OddColorTay ;always
| 00/0377 | _NextRow
| 00/0377 : 69 58 | adc #$80+1-40-1 ;$80=to 1st half-page. +1 goes back to odd column. -40=next row. -1 due to c=1
| 00/0379 : 10 8A | bpl _OddColorClc ;branch if in same page
| 00/037B : 69 76 | adc #3*40-2 ;3*40=# of rows in a half-page. -2 decrements to prev odd/even column
| 00/037D : C9 50 | cmp #2*40 ;loop if we're not done with all 40 bytes in all rows
| 00/037F : B0 84 | bcs _OddColorClc
| 00/0381 |
| 00/0381 : 60 | rts
+-----------------------+-------------------------------------------------------------------

Antoine Vignau

Oct 21, 2015, 9:06:57 PM
Nice code, John,
av

mmphosis

Oct 21, 2015, 11:57:07 PM
I'll wade in. This is from an old posting:

http://macgui.com/usenet/?group=2&id=22543#msg

NEW

0 HIMEM: 5608

1DATA5532563300922139021390213932135045238263536503037252627503845615305817312092293902932835845478875645836575037148845550351083966132653706165276445377489621322135821322334213502130051520282036

2DATAQLNZQLNZQAQQDSAQRDSAQQDSAQVDSAQXCKDNAPFXANAQXXANANXNXNXQXNXCQXKXNZQXKXNZQCQODMAQODMAUUYQXCKATAUVQXCKMNXQXKANXUAUTCQXKENXUMUSQJHSAHFQZTXRDFQZTAOCQZTAOBQHDSAPDSAQADSANZNZKDSAQZDSAXZQZOZXZUAXZJ

3 READ L$: READ H$

4 FOR I = 1 TO LEN (L$)

5POKE767+I,10*(ASC(MID$(H$,I,1))-65)+VAL(MID$(L$,I,1))

6 NEXT

RUN

HGR : CALL 768: CALL 5608
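Line 5 packs each machine-code byte into one digit from L$ and one letter from H$. As a rough illustration, here is the same decode in Python, applied only to the first eight characters of each DATA string; the function name `decode` is invented for this sketch.

```python
def decode(low_digits, high_letters):
    """Mirror line 5: 10*(ASC(MID$(H$,I,1))-65)+VAL(MID$(L$,I,1))."""
    return bytes(10 * (ord(h) - 65) + int(l)
                 for l, h in zip(low_digits, high_letters))

# First eight characters of each DATA string from the post.
first = decode("55325633", "QLNZQLNZ")
print(first.hex(" "))
```

Those eight bytes disassemble as LDA $73 / STA $FC / LDA $74 / STA $FD, so the generator appears to start by copying the HIMEM pointer (set to 5608 on line 0) into zero page.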

John Brooks

Oct 22, 2015, 12:46:51 AM
On Wednesday, October 21, 2015 at 9:45:29 AM UTC-7, fadden wrote:
I cleaned up the next row logic and used zero page for inputs so the caller can just JSR without having to load registers. These changes saved 3 bytes.

Combined with Peter's improvements, the size has dropped from 136 bytes to 127 bytes!

*-------------------------------------------------
* Fast, small, right-to-left hires screen clear
* By John Brooks 10/21/2015
* with improvements by Peter Ferrie 10/21/2015
* Size: 127 bytes
* Speed: ~5.38 cycles per hires byte. Unrolled 32x.
* >5x faster than ROM HGR
* Inputs: $CE=color byte for even columns
* $CF=color byte for odd columns
*-------------------------------------------------

org $300

ClearColorEven equ $ce
ClearColorOdd equ $cf

Clear ldx ClearColorEven
lda #3*40-1 ;3x 40-byte rows in a half-page: 128/40=3
:ClearOddClc clc ;c=0 on odd columns
:ClearOdd tay
lda ClearColorOdd ;color for odd columns
:ClearEven
]Offset equ 31*$400
lup 32 ;# of pages in hires screen: 8k/256
sta ]Offset/$2000*$100+]Offset&$1fff+$2000,y
]Offset equ ]Offset-$400
--^

bcs :NextRow

txa ;color for even columns
dey
sec ;c=1 during even columns
bcs :ClearEven ;always
:NextRow
tya
adc #$80+1-1 ;$80=to 2nd half-page. +1 goes back to odd column. -1 due to c=1
bcc :ClearOdd
sbc #40 ;-40=next row to clear
bpl :ClearOddClc ;branch if in same page
adc #3*40-2 ;3*40=# of rows in a half-page. -2 decrements to prev odd/even column
cmp #2*40 ;loop if we're not done with all 40 bytes in all rows
bcs :ClearOddClc

rts
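To see that the `:NextRow` arithmetic really visits every visible byte once and skips the screen holes, the control flow can be modelled in a few lines. This Python sketch is an illustration under the assumption that it mirrors the carry behaviour correctly (ADC adds the carry, SBC with carry set subtracts exactly 40); `visit_order` is a made-up name.

```python
def visit_order():
    """Return the Y offsets in the order the loop stores to them."""
    order = []
    a = 3 * 40 - 1                 # lda #3*40-1
    while True:
        y = a                      # tay (carry is clear whenever we get here)
        order.append(y)            # 32x sta ...,y with the odd-column color
        y -= 1                     # dey
        order.append(y)            # 32x sta ...,y with the even-column color (c=1)
        total = y + 0x80 + 1       # tya / adc #$80 with c=1
        a, carry = total & 0xFF, total >> 8
        if carry == 0:             # bcc :ClearOdd -> same row, other half-page
            continue
        a -= 40                    # sbc #40 with c=1 -> previous 40-byte row
        if a >= 0:                 # bpl :ClearOddClc
            continue
        a = ((a & 0xFF) + 3 * 40 - 2) & 0xFF   # adc #3*40-2 with c=0 after the borrow
        if a < 2 * 40:             # cmp #2*40 / bcs not taken -> rts
            return order

order = visit_order()
print(len(order), 32 * len(order))
```

Each offset is applied to all 32 pages by the unrolled stores, so 240 offsets account for the 7680 visible bytes; offsets $78-$7F and $F8-$FF (the screen holes) never appear.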


Here's the Merlin32 listing:

-----------------------+-------------------------------------------------------------------
Address Object Code | Source Code
-----------------------+-------------------------------------------------------------------
00/8000 | *-------------------------------------------------
00/8000 | * Fast, small, right-to-left hires screen clear
00/8000 | * By John Brooks 10/21/2015
00/8000 | * and improvements by Peter Ferrie 10/21/2015
00/8000 | * Size: 127 bytes
00/8000 | * Speed: ~5.38 cycles per hires byte. Unrolled 32x.
00/8000 | * >5x faster than ROM HGR
00/8000 | * Inputs: $CE=color byte for even columns
00/8000 | * $CF=color byte for odd columns
00/8000 | *-------------------------------------------------
00/8000 |
00/8000 | org $300
00/0300 |
00/0300 | ClearColorEven equ $ce
00/0300 | ClearColorOdd equ $cf
00/0300 |
00/0300 : A6 CE | Clear ldx {$ce}
00/0302 : A9 77 | lda #3*40-1 ;3x 40-byte rows in a half-page: 128/40=3
00/0304 : 18 | _ClearOddClc clc ;c=0 on odd columns
00/0305 : A8 | _ClearOdd tay
00/0306 : A5 CF | lda {$cf} ;color for odd columns
00/0308 | _ClearEven
00/0308 | ]Offset equ 31*$400
00/0308 : 99 00 3F | sta 31744/$2000*$100+31744&$1fff+$2000,y
00/030B : 99 00 3B | sta 30720/$2000*$100+30720&$1fff+$2000,y
00/030E : 99 00 37 | sta 29696/$2000*$100+29696&$1fff+$2000,y
00/0311 : 99 00 33 | sta 28672/$2000*$100+28672&$1fff+$2000,y
00/0314 : 99 00 2F | sta 27648/$2000*$100+27648&$1fff+$2000,y
00/0317 : 99 00 2B | sta 26624/$2000*$100+26624&$1fff+$2000,y
00/031A : 99 00 27 | sta 25600/$2000*$100+25600&$1fff+$2000,y
00/031D : 99 00 23 | sta 24576/$2000*$100+24576&$1fff+$2000,y
00/0320 : 99 00 3E | sta 23552/$2000*$100+23552&$1fff+$2000,y
00/0323 : 99 00 3A | sta 22528/$2000*$100+22528&$1fff+$2000,y
00/0326 : 99 00 36 | sta 21504/$2000*$100+21504&$1fff+$2000,y
00/0329 : 99 00 32 | sta 20480/$2000*$100+20480&$1fff+$2000,y
00/032C : 99 00 2E | sta 19456/$2000*$100+19456&$1fff+$2000,y
00/032F : 99 00 2A | sta 18432/$2000*$100+18432&$1fff+$2000,y
00/0332 : 99 00 26 | sta 17408/$2000*$100+17408&$1fff+$2000,y
00/0335 : 99 00 22 | sta 16384/$2000*$100+16384&$1fff+$2000,y
00/0338 : 99 00 3D | sta 15360/$2000*$100+15360&$1fff+$2000,y
00/033B : 99 00 39 | sta 14336/$2000*$100+14336&$1fff+$2000,y
00/033E : 99 00 35 | sta 13312/$2000*$100+13312&$1fff+$2000,y
00/0341 : 99 00 31 | sta 12288/$2000*$100+12288&$1fff+$2000,y
00/0344 : 99 00 2D | sta 11264/$2000*$100+11264&$1fff+$2000,y
00/0347 : 99 00 29 | sta 10240/$2000*$100+10240&$1fff+$2000,y
00/034A : 99 00 25 | sta 9216/$2000*$100+9216&$1fff+$2000,y
00/034D : 99 00 21 | sta 8192/$2000*$100+8192&$1fff+$2000,y
00/0350 : 99 00 3C | sta 7168/$2000*$100+7168&$1fff+$2000,y
00/0353 : 99 00 38 | sta 6144/$2000*$100+6144&$1fff+$2000,y
00/0356 : 99 00 34 | sta 5120/$2000*$100+5120&$1fff+$2000,y
00/0359 : 99 00 30 | sta 4096/$2000*$100+4096&$1fff+$2000,y
00/035C : 99 00 2C | sta 3072/$2000*$100+3072&$1fff+$2000,y
00/035F : 99 00 28 | sta 2048/$2000*$100+2048&$1fff+$2000,y
00/0362 : 99 00 24 | sta 1024/$2000*$100+1024&$1fff+$2000,y
00/0365 : 99 00 20 | sta 0/$2000*$100+0&$1fff+$2000,y
00/0368 |
00/0368 : B0 05 | bcs _NextRow
00/036A |
00/036A : 8A | txa ;color for even columns
00/036B : 88 | dey
00/036C : 38 | sec ;c=1 during even columns
00/036D : B0 99 | bcs _ClearEven ;always
00/036F | _NextRow
00/036F : 98 | tya
00/0370 : 69 80 | adc #$80+1-1 ;$80=to 2nd half-page. +1 goes back to odd column. -1 due to c=1
00/0372 : 90 91 | bcc _ClearOdd
00/0374 : E9 28 | sbc #40 ;-40=next row to clear
00/0376 : 10 8C | bpl _ClearOddClc ;branch if in same page
00/0378 : 69 76 | adc #3*40-2 ;3*40=# of rows in a half-page. -2 decrements to prev odd/even column
00/037A : C9 50 | cmp #2*40 ;loop if we're not done with all 40 bytes in all rows
00/037C : B0 86 | bcs _ClearOddClc
00/037E |
00/037E : 60 | rts
-----------------------+-------------------------------------------------------------------

-JB
@JBrooksBSI

fadden

Oct 22, 2015, 1:11:47 AM
On Wednesday, October 21, 2015 at 9:46:51 PM UTC-7, John Brooks wrote:
> * >5x faster than ROM HGR

FWIW, I once came up with 271121 cycles for the code at $f3f6 (62454) if clearing to black or white, 328465 if clearing to color.

qkumba

Oct 22, 2015, 1:17:51 AM
> 00/036F : 98 | tya
> 00/0370 : 69 80 | adc #$80+1-1 ;$80=to 2nd half-page. +1 goes back to odd column. -1 due to c=1
> 00/0372 : 90 91 | bcc _ClearOdd

I considered this one, but somehow thought that it would be slower, when in fact it is 1 cycle faster on every second iteration.
We could run even faster if we reversed the carry, like this:

00/0300 : A6 CE | Clear ldx {$ce}
00/0302 : A9 77 | lda #3*40-1 ;3x 40-byte rows in a half-page: 128/40=3
00/0304 : 38 | _ClearOddSec sec ;c=1 on odd columns
00/0305 : A8 | _ClearOdd tay
00/0306 : A5 CF | lda {$cf} ;color for odd columns
...
00/0368 : 90 05 | bcc _NextRow
00/036A |
00/036A : 8A | txa ;color for even columns
00/036B : 88 | dey
00/036C : 18 | clc ;c=0 during even columns
00/036D : 90 99 | bcc _ClearEven ;always
00/036F | _NextRow
00/036F : 98 | tya
00/0370 : 69 81 | adc #$80+1 ;$80=to 2nd half-page. +1 goes back to odd column. No -1 here because c=0
00/0372 : 90 90 | bcc _ClearOddSec
00/0374 : E9 28 | sbc #40 ;-40=next row to clear
00/0376 : 10 8D | bpl _ClearOdd ;branch if in same page
00/0378 : 69 76 | adc #3*40-2 ;3*40=# of rows in a half-page. -2 decrements to prev odd/even column
00/037A : C9 50 | cmp #2*40 ;loop if we're not done with all 40 bytes in all rows
00/037C : B0 87 | bcs _ClearOdd

John Brooks

Oct 22, 2015, 1:59:07 AM
Awesome hires clear, though it is not exactly small (or practical). But it's wicked fast once it gets going.

To summarize the awesomeness:

1) 564 byte text file initiates the screen clear via DOS exec (or I guess you could type it in)

2) Exec file creates a basic program which then uses a neat 2-chars-per-byte encoding scheme to poke 190 bytes of machine code into $300-$3BD

3) The code at $300 then generates this:

15e8: ldx #$AA
15ea: ldy #$D5
15ec: stx $2000
15ef: sty $2001
...
1ffa: stx $3512
1ffd: jmp $4000

4000: sty $3513
...
8fe9: stx $3ff6
8fec: sty $3ff7
8fef: rts

A 23k function to clear the 7.5k hires screen!

The screen clear function at $15e8 times out at almost exactly 4 cycles per byte.
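The size and speed figures can be roughly reproduced from standard 6502 instruction sizes and timings. The Python sketch below assumes one absolute store per visible byte (7680 of them, which fits the last store landing on $3FF7) plus the two immediate loads, the one JMP over the hi-res page, and the RTS shown above.

```python
VISIBLE_BYTES = 32 * 240      # 32 pages x 240 non-hole bytes = 7680

# (size in bytes, time in cycles) for each instruction used
STORE_ABS = (3, 4)            # STX / STY absolute
LOAD_IMM = (2, 2)             # LDX / LDY immediate
JMP_ABS = (3, 3)
RTS = (1, 6)

size = 2 * LOAD_IMM[0] + VISIBLE_BYTES * STORE_ABS[0] + JMP_ABS[0] + RTS[0]
cycles = 2 * LOAD_IMM[1] + VISIBLE_BYTES * STORE_ABS[1] + JMP_ABS[1] + RTS[1]
print(size, cycles, round(cycles / VISIBLE_BYTES, 3))
```

Under those assumptions the stores dominate everything else, which is why the routine lands so close to 4 cycles per byte and about 23k of code.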

Large and in charge!

Nice job mmphosis!

-JB


John Brooks

Oct 22, 2015, 2:29:17 AM
Yikes. That's pretty crummy performance.

My 32x unrolled clear is about 41,300 cycles which is slightly faster than the small clear in fdraw (I don't clear screen holes). And while my clear doesn't have the venetian blind effect, it does have a slight comb effect which can be seen as it clears right-to-left. The comb effect is due to clearing half pages in two passes.

I suspect an improved approach would be to clear 1/2 the screen with a 32x unroll which included half-pages, then mod those addresses for the 2nd half of the screen and repeat.
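For reference, the figures quoted in this thread can be compared directly. This is only arithmetic on numbers already posted (the ~41,300-cycle estimate and fadden's two ROM counts); the constant names are chosen for this sketch.

```python
VISIBLE_BYTES = 7680          # 192 rows x 40 bytes, screen holes excluded
UNROLLED_CYCLES = 41_300      # estimate for the 32x unrolled clear
ROM_BLACK_WHITE = 271_121     # fadden's count for the ROM routine, black or white
ROM_COLOR = 328_465           # fadden's count when clearing to a color

per_byte = UNROLLED_CYCLES / VISIBLE_BYTES
speedup_bw = ROM_BLACK_WHITE / UNROLLED_CYCLES
speedup_color = ROM_COLOR / UNROLLED_CYCLES
print(f"{per_byte:.2f} cycles per byte")
print(f"{speedup_bw:.1f}x vs ROM (black/white), {speedup_color:.1f}x vs ROM (color)")
```

Both ratios come out comfortably above the ">5x faster than ROM HGR" claim in the routine's header.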

-JB

mmphosis

Oct 22, 2015, 2:32:01 AM
And, I've heard that Applesoft BASIC is slow! ;)

REM GENERATE CODE $0300 (768)
REM FOR FAST, BIG, TOP TO BOTTOM
REM HIRES SCREEN CLEAR ROUTINE
REM SIZE: $BE (190)
REM INPUT: PAGE $E6 (230)

REM CLEAR HIRES SCREEN $15E8 (5608)
REM SIZE: $5A09 (23049) HGR
REM SPEED: 4 CYCLES PER HIRES BYTE. UNROLLED ALL THE WAY!
REM INPUTS:
REM EVEN BYTE $15E9 (5609)
REM ODD BYTE $15EB (5611)

HGR : CALL 768: REM GENERATE CODE FOR HGR
POKE5609,0:POKE5611,0:CALL5608:REM BLACK
POKE5609,85:POKE5611,42:CALL5608:REM VIOLET
POKE5609,42:POKE5611,85:CALL5608:REM GREEN
POKE5609,213:POKE5611,170:CALL5608:REM BLUE
POKE5609,170:POKE5611,213:CALL5608:REM ORANGE
POKE5609,127:POKE5611,127:CALL5608:REM WHITE
HGR2 : CALL 768 : REM GENERATE CODE FOR HGR2


John Brooks

Oct 23, 2015, 9:26:14 PM

And here's a tiny IIGS version which is over 2x faster:

*-------------------------------------------------
* IIGS Fast, small, top-down hires screen clear
* By John Brooks 10/23/2015
* Size: 54 bytes
* Speed: ~0.92 uSec per hires byte (Bank 0 shadowing off)
* ~1.86 uSec per hires byte (Bank 0 shadowing on)
* ~2.34 cycles per hires byte. Unrolled 20x.
* Inputs: Y=clear color word: YL=even columns, YH=odd
*-------------------------------------------------

org $300

mx %00
Clear lda #$2000+40-1 ;Start at top row's last byte
tsx
sei
:NewStride clc
:Loop1024 tcs
lup 20 ;40 bytes per row = 20 words
phy
--^
adc #1024
cmp #8*1024+$2000 ;8x rows, 1024 byte stride
bcc :Loop1024

sbc #8*1024-128 ;Back up 8 rows & step to next half-page
bit #8*128 ;Have we done 8x half-pages?
beq :NewStride

sbc #8*128-40 ;Step to next row in the half-page
bit #128 ;See if we've done all 3 rows in all half-pages
beq :NewStride

txs
cli
rts
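The stack-pointer walk can be checked the same way. The Python sketch below is an illustrative model, not the author's code: it assumes a 16-bit PHY writes its high byte at S and its low byte at S-1 before dropping S by 2, and that carry is set going into both SBCs, as the preceding CMP and SBC leave it. `iigs_clear_addresses` is a hypothetical name.

```python
def iigs_clear_addresses():
    """Return every address written, in the order the routine writes it."""
    written = []
    a = 0x2000 + 40 - 1                    # lda #$2000+40-1
    while True:                            # :NewStride
        while True:                        # :Loop1024
            s = a                          # tcs
            for _ in range(20):            # 20x phy, two bytes each
                written.extend((s, s - 1))
                s -= 2
            a += 1024                      # adc #1024 with carry clear
            if a >= 8 * 1024 + 0x2000:     # cmp / bcc :Loop1024 falls through
                break
        a -= 8 * 1024 - 128                # back up 8 lines, next half-page
        if (a & (8 * 128)) == 0:           # bit #8*128 / beq :NewStride
            continue
        a -= 8 * 128 - 40                  # next 40-byte row in the half-page
        if (a & 128) == 0:                 # bit #128 / beq :NewStride
            continue
        return written

written = iigs_clear_addresses()
print(len(written), hex(min(written)), hex(max(written)))
```

In this model the routine touches 7680 distinct addresses between $2000 and $3FF7 and never writes to a screen hole.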

Merlin32 listing:

-----------------------+-------------------------------------------------------------------
Address Object Code | Source Code
-----------------------+-------------------------------------------------------------------
00/8000 | *-------------------------------------------------
00/8000 | * IIGS Fast, small, top-down hires screen clear
00/8000 | * By John Brooks 10/23/2015
00/8000 | * Size: 54 bytes
00/8000 | * Speed: ~0.92 uSec per hires byte (Bank 0 shadowing off)
00/8000 | * ~1.86 uSec per hires byte (Bank 0 shadowing on)
00/8000 | * ~2.34 cycles per hires byte. Unrolled 20x.
00/8000 | * Inputs: Y=clear color word: YL=even columns, YH=odd
00/8000 | *-------------------------------------------------
00/8000 |
00/8000 | org $300
00/0300 |
00/0300 | mx %00
00/0300 : A9 27 20 | Clear lda #$2000+40-1 ;Start at top row's last byte
00/0303 : BA | tsx
00/0304 : 78 | sei
00/0305 : 18 | _NewStride clc
00/0306 : 1B | _Loop1024 tcs
00/0307 : 5A | phy
00/0308 : 5A | phy
00/0309 : 5A | phy
00/030A : 5A | phy
00/030B : 5A | phy
00/030C : 5A | phy
00/030D : 5A | phy
00/030E : 5A | phy
00/030F : 5A | phy
00/0310 : 5A | phy
00/0311 : 5A | phy
00/0312 : 5A | phy
00/0313 : 5A | phy
00/0314 : 5A | phy
00/0315 : 5A | phy
00/0316 : 5A | phy
00/0317 : 5A | phy
00/0318 : 5A | phy
00/0319 : 5A | phy
00/031A : 5A | phy
00/031B : 69 00 04 | adc #1024
00/031E : C9 00 40 | cmp #8*1024+$2000 ;8x rows, 1024 byte stride
00/0321 : 90 E3 | bcc _Loop1024
00/0323 |
00/0323 : E9 80 1F | sbc #8*1024-128 ;Back up 8 rows & step to next half-page
00/0326 : 89 00 04 | bit #8*128 ;Have we done 8x half-pages?
00/0329 : F0 DA | beq _NewStride
00/032B |
00/032B : E9 D8 03 | sbc #8*128-40 ;Step to next row in the half-page
00/032E : 89 80 00 | bit #128 ;See if we've done all 3 rows in all half-pages
00/0331 : F0 D2 | beq _NewStride
00/0333 |
00/0333 : 9A | txs
00/0334 : 58 | cli
00/0335 : 60 | rts
-----------------------+-------------------------------------------------------------------

-JB
@JBrooksBSI