NASCOM BASIC with interrupt driven Tx/Rx ACIA code

phillip.stevens

Nov 22, 2016, 6:55:54 AM

to RC2014-Z80

I've had some time spare (waiting for parts... still), so I thought it would be worthwhile adding my newly written Tx/Rx interrupt ACIA code to the existing NASCOM BASIC port. So, I did it.

NASCOM_BASIC_4.7 - ACIA Tx/Rx Interrupt Driven

The code implements 96 Byte buffers for both Tx and Rx directions. This should permit downloading whole lines of code before pausing to prevent overrun.

There are ROMs for both 32k RAM and 56k RAM versions of the RC2014 in the repository.

Tested working on real hardware.

I'll add some test data, and performance numbers in a follow up post.

Enjoy,

Phillip

phillip.stevens

Nov 24, 2016, 5:15:20 AM

to RC2014-Z80

Following up with some simple testing of the Microsoft Basic PRINT function, it seems that there is a net loss of throughput, mainly due to the amount of time spent processing the interrupt.

Here's the original Searle Tx code in action, which just uses busy wait polling to transmit characters.

Note that the INT line is not touched, because the Tx function continually polls the ACIA status byte to determine when it may load another character.

It takes about 0.6ms to transmit the characters.

With my interrupt based Tx code, you can see that the INT line is triggered to signal when a character can be transmitted, and some processing time elapses in the interrupt code determining that no character needs to be received, before the transmission can begin.

We have to do it this way because there is only one shared interrupt for the ACIA.

With the interrupt driven Tx code it takes 0.76ms to transmit the characters.

I thought OK, perhaps it is taking more time to transmit the characters, but during this time the processor can be doing something productive other than idly polling the ACIA status byte. So, overall we're going to win.

If we were able to do productive process things whilst the ACIA did its work unaided, we would see that more sets of characters could be transmitted.

Specifically, the rate at which sets of characters are transmitted should increase.

But, unfortunately, that doesn't seem to be the case: the transmission rate has actually decreased from 929Hz to 720Hz. So we're not winning there either.

So that leads me to conclude that the Z80 processor at 7.68MHz is just too slow to take advantage of an interrupt based 115,200 baud serial transmission capability.
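
As a rough budget for that conclusion, the sketch below works out how long one character occupies the wire and how many T states the CPU gets in that time. It assumes 8N1 framing (10 bits per character) and takes the clock as a parameter. The 7,372,800 Hz value in the usage lines is my assumption, not a figure from this thread, and the function names are made up for illustration.

```python
def char_time_us(baud, bits_per_char=10):
    """Time on the wire for one character, in microseconds (8N1 = 10 bits)."""
    return bits_per_char * 1_000_000 / baud


def t_states_per_char(clock_hz, baud, bits_per_char=10):
    """CPU T states that elapse while one character is shifted out."""
    return clock_hz * bits_per_char / baud


if __name__ == "__main__":
    baud = 115_200
    clock_hz = 7_372_800  # assumed CPU clock, not taken from the thread
    print(f"one character:    {char_time_us(baud):.1f} us")
    print(f"seven characters: {7 * char_time_us(baud) / 1000:.3f} ms")
    print(f"T states per character: {t_states_per_char(clock_hz, baud):.0f}")
```

Under those assumptions a 7 character message needs about 0.61ms on the wire, which is in line with the roughly 0.6ms measured for the polled code, and everything the interrupt does per character has to fit inside a few hundred T states.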

Unless, I've got this wrong, and I'm missing something obvious?

Any thoughts? Let me know please.

phillip.stevens

Nov 26, 2016, 7:10:55 AM

to RC2014-Z80

Following up with some simple testing of the Microsoft Basic PRINT function, it seems that there is a net loss of throughput, mainly due to the amount of time spent processing the interrupt.

I've had a close look at the interrupt code, and have made some minor optimisations by removing stack operations.

https://github.com/feilipu/NASCOM_BASIC_4.7

But, the transmission rate has actually decreased from 929Hz to 720Hz.

With the optimisations, the rate has increased to 729Hz. But it is still slower than Tx polling.

So that leads me to conclude that the Z80 processor at 7.68MHz is just too slow to take advantage of an interrupt based Tx at 115,200 baud.

With that in mind, I've reduced the Tx buffer substantially to 15 Bytes.

Also, I've increased the Rx buffer to allow me to paste in 239 Bytes of Basic program code with one action.

This works around the fact that the RTS signal is not respected by my USB interface or console program.

So, I'm keeping the changes as worthwhile improvements.

Jan S

Nov 27, 2016, 3:46:40 AM

to RC2014-Z80

I've done a "rewrite" of the original "Nascom Basic Manual" (attached). Please update the manual as you add/remove functionality to the BASIC code :-)

/Jan

Nascom Basic Manual Rewrite.docx

Scott Lawrence

Nov 27, 2016, 11:40:43 AM

to rc201...@googlegroups.com

Has anyone hooked back in to the load and save routine hooks that have been removed? I haven't had a chance to look into it yet, but I want to hook in my sd card interface to it. ;)

Sent from my fancy-schmancy phone.

On Nov 27, 2016, at 3:46 AM, Jan S <futt...@gmail.com> wrote:

I've done a "rewrite" of the original "Nascom Basic Manual" (attached). Please update the manual as you add/remove functionality to the BASIC code :-)

/Jan

--
You received this message because you are subscribed to the Google Groups "RC2014-Z80" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rc2014-z80+...@googlegroups.com.
To post to this group, send email to rc201...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/rc2014-z80/5bab447f-d1d5-4682-baa4-ddd1f4c20b7c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

<Nascom Basic Manual Rewrite.docx>

Jan S

Nov 27, 2016, 1:48:43 PM

to RC2014-Z80

Scott,
Will you be able to see what you need to see out of this source?: http://www.nascomhomepage.com/lang/8kbasic.asm
:)

Spencer Owen

Nov 29, 2016, 4:49:25 PM

to rc201...@googlegroups.com

Hi Jan,

Thanks for the link to that. Yes, that has indeed got the CLOAD and CSAVE routines in it :-)

Scott,

The Nascom 2 used a 6402 UART on port 1 and 2 and that UART would communicate through a bunch of analog & digital stuff to make the appropriate beepy stuff. It looks like it should be easy* enough to recreate something that works in a similar way but talks to an Arduino (or similar) connected to an SD Card and to reintroduce the appropriate code back in to BASIC.

Spencer - Noting that lots of things "should be easy" :-)

Scott Lawrence

Nov 29, 2016, 4:55:30 PM

to rc201...@googlegroups.com

That was my thought. ;)

-s

--

Scott Lawrence
yor...@gmail.com

phillip.stevens

Jan 16, 2017, 7:11:15 AM

to RC2014-Z80

Perhaps it's a personal thing, but when something is left not quite right, I have to come back to it to make it right.

And so in that spirit I've had another crack at the interrupt driven Tx code I wrote back in November.

On Saturday, 26 November 2016 23:10:55 UTC+11, phillip.stevens wrote:

Following up with some simple testing of the Microsoft Basic PRINT function, it seems that there is a net loss of throughput, mainly due to the amount of time spent processing the Tx interrupt.

https://github.com/feilipu/NASCOM_BASIC_4.7

But, the transmission rate has actually decreased from 929Hz to 720Hz.

With the optimisations, the rate has increased to 729Hz. But it is still slower than Tx polling at 929Hz.

So that leads me to conclude that the Z80 processor at 7.68MHz is just too slow to take advantage of an interrupt based Tx at 115,200 baud.

The major change in logic was to use the ACIA itself as the first two bytes of the Tx buffer. By that I mean checking whether the RAM Tx buffer is empty and, if so, writing the Tx byte directly to the ACIA register when it can accept one. That way, the first two bytes can be transmitted without any interrupt or polling. Subsequent bytes are then fed into the buffer, and are serviced by the ACIA interrupt.
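
To make that logic concrete, here is a small model of the Tx path. It is only a sketch: the class and function names are invented for illustration, the real code is Z80 assembly in the repository, and the model assumes the ACIA transmitter is double buffered (a data register feeding a shift register), which is where the "first two bytes" come from.

```python
from collections import deque


class AciaModel:
    """Toy ACIA transmitter: a data register in front of a shift register."""

    def __init__(self):
        self.shift = None  # byte currently being shifted out
        self.tdr = None    # byte waiting in the transmit data register
        self.sent = []     # bytes that have fully left the shifter

    def tdr_empty(self):
        return self.tdr is None

    def write(self, byte):
        # An idle shifter takes the byte straight away, freeing the register.
        if self.shift is None:
            self.shift = byte
        else:
            self.tdr = byte

    def finish_character(self):
        """One character time passes and the shifter completes its byte."""
        if self.shift is not None:
            self.sent.append(self.shift)
        self.shift, self.tdr = self.tdr, None


def putc(acia, ring, byte, ring_size=15):
    """Program side: write directly if possible, otherwise queue the byte."""
    if not ring and acia.tdr_empty():
        acia.write(byte)
        return "direct"
    if len(ring) >= ring_size:
        return "full"
    ring.append(byte)
    return "buffered"


def tx_interrupt(acia, ring):
    """Interrupt side: feed the ACIA from the RAM buffer when it has room."""
    if ring and acia.tdr_empty():
        acia.write(ring.popleft())


def send(message, ring_size=15):
    """Queue a short message, then let character times and interrupts run."""
    acia, ring = AciaModel(), deque()
    # Messages longer than ring_size + 2 would need a retry loop, omitted here.
    how = [putc(acia, ring, b, ring_size) for b in message]
    while acia.shift is not None:
        acia.finish_character()
        tx_interrupt(acia, ring)
    return how, bytes(acia.sent)


if __name__ == "__main__":
    print(send(b"HELLO"))
```

In this model the first two bytes of a burst go straight to the ACIA and only the remainder touches the RAM buffer, so a short message costs few or no interrupts.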

This change effectively recovers all of the transmission rate losses against the Tx polling method, back to 922Hz for this test message, whilst retaining the fact that the CPU is never in busy-wait mode.

The image shows that the timing has returned to the former rate, and that twice (in a 7 byte transmission) the CPU had to use the Tx buffer. The CPU is slowed down generating the CR and LF characters, so the Tx buffer is emptied before they are output.

This can be seen in greater detail when a longer text string is output, as the CPU is regularly interrupted to produce the next character.

There remains a 239 byte Rx buffer, so substantial pieces of BASIC (three full length lines of code) can be pasted into the RC2014 before the buffer overflows.

The Tx buffer is only 15 bytes, but that seems a sufficient trade off.

I've put the code in Github in the same place, my NASCOM BASIC Repository.

Enjoy.

phillip.stevens

Feb 4, 2017, 4:22:18 PM

to RC2014-Z80

So I'm having another go at this, because there is still room for improvement. Let's work on efficiency.

On Monday, 16 January 2017 23:11:15 UTC+11, phillip.stevens wrote:

Perhaps it's a personal thing, but when something is left not quite right, I have to come back to it to make it right.
And so in that spirit I've had another crack at the interrupt driven Tx code I wrote back in November.

On Saturday, 26 November 2016 23:10:55 UTC+11, phillip.stevens wrote:
Following up with some simple testing of the Microsoft Basic PRINT function, it seems that there is a net loss of throughput, mainly due to the amount of time spent processing the Tx interrupt.

https://github.com/feilipu/NASCOM_BASIC_4.7

As the ACIA code has to run during an interrupt, it is important to make it as efficient as possible.

This means reducing cycles (or T States) wherever possible.

Let's look at the standard Searle Rx code below, and how it increments the Rx ring buffer.

notFull: LD HL,(serInPtr)
         INC HL
         LD A,L ; Only need to check low byte because buffer<256 bytes
         CP (serBuf+SER_BUFSIZE) & $FF
         JR NZ, notWrap
         LD HL,serBuf
notWrap: LD (serInPtr),HL

The serInPtr contains the address of the byte where we will insert the next received byte. We increment this 16 bit address, but only test the lower byte assuming we have a buffer smaller than 256 bytes. The test is generated from the base address of the serBuf plus its size. This means that we can locate the serBuf on any memory location, and still have our test work correctly.

If the test is passed, i.e. we haven't incremented past the end of the buffer, then we jump to notWrap. Otherwise we reset our pointer to the base address of serBuf.
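
The same pointer logic can be modelled in a few lines to check that the low byte comparison really does work at any buffer address. This is a sketch only: the function name is mine, and the base address and size in the usage lines are hypothetical values chosen so that the buffer straddles a page boundary.

```python
def searle_advance(ptr, buf_base, buf_size):
    """Advance a 16 bit ring pointer, comparing only the low byte at the end.

    Mirrors INC HL / LD A,L / CP low(end) / LD HL,serBuf and is valid for
    buffers smaller than 256 bytes.
    """
    ptr = (ptr + 1) & 0xFFFF                          # INC HL
    end_low = (buf_base + buf_size) & 0xFF            # (serBuf+SER_BUFSIZE) & $FF
    if (ptr & 0xFF) == end_low:                       # LD A,L / CP n
        ptr = buf_base                                # LD HL,serBuf
    return ptr


if __name__ == "__main__":
    base, size = 0x80F8, 16  # hypothetical buffer crossing a page boundary
    ptr = base
    for _ in range(size + 1):
        print(hex(ptr))
        ptr = searle_advance(ptr, base, size)
```

Because the buffer is shorter than 256 bytes, every address inside it has a distinct low byte, so the single 8 bit compare is enough to detect the end.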

Let's work out how many cycles this takes. But first, set aside the code we have to run irrespective of any improvement (the pointer load and store), and look at the normal case, where the pointer does not wrap.

         INC HL                          ;  6 T States
         LD A,L                          ;  4
         CP (serBuf+SER_BUFSIZE) & $FF   ;  7
         JR NZ, notWrap                  ; 12
                                         ; 29 T States Total

29 T States is our base line from Grant.

How do I do?

        inc hl
        ld a, l
        cp (serRxBuf + SER_RX_BUFSIZE) & $FF
        jr nz, rxa_no_wrap

Well, looks pretty much the same at this stage. 29 T States for the base line.

If we make some assumptions about the location and size of our buffer, we can probably do a better job.

Let's place the buffer on a 256 byte page boundary, and limit the buffer size options to powers of 2, e.g. 2^4, 2^7, etc.

Then we can optimise a little.

        inc l                       ; move the Rx pointer, just low byte along  -  4
        ld a, SER_RX_BUFSIZE-1      ; load the mask, one less than size         -  7
        and l                       ; range check                               -  4
        ld l, a                     ; return the low byte to l                  -  4
                                    ;                                           - 19 T States Total

So we've been able to reduce the code time significantly, with some simple assumptions when establishing the buffer.
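
Here is a quick model of the masked increment, to show which assumptions it leans on. It is a sketch with a made-up function name and a hypothetical base address. It assumes the buffer starts on a 256 byte page boundary (low address byte of zero) and that the value ANDed with the low byte is the buffer size minus one.

```python
def masked_advance(ptr, buf_size):
    """Advance only the low byte of the pointer and wrap it with a mask.

    Assumes buf_size is a power of two no larger than 256, and that the
    buffer base has a low address byte of zero (page aligned).
    """
    assert 0 < buf_size <= 256 and buf_size & (buf_size - 1) == 0
    low = (ptr + 1) & 0xFF           # inc l (8 bit increment)
    low &= buf_size - 1              # and with the mask, size - 1
    return (ptr & 0xFF00) | low      # ld l, a (high byte is never touched)


if __name__ == "__main__":
    base, size = 0x8100, 16  # hypothetical page aligned buffer
    ptr = base
    for _ in range(size + 1):
        print(hex(ptr))
        ptr = masked_advance(ptr, size)
```

Note that the mask has to be the size minus one (for example $0F for a 16 byte buffer). ANDing with the size itself would keep only a single bit of the pointer. With a size of 256 the mask is $FF and the AND does nothing, which is the special case discussed next.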

But, what if we add one more assumption? That the size of the buffer is exactly 256 bytes, which is a reasonable size for the Rx buffer, anyway.

If we do this the range check falls away, leaving just the increment, because the 8 bit increment rolls over automatically.

inc l ; move the Rx pointer, just low byte along - 4

What we have now is a piece of code that used to take 29 T States and now takes just 4 T States.
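
For anyone who wants to check the sums, the tally below adds up the three variants from the standard Z80 instruction timings (the conditional JR is counted as taken, which is the normal no-wrap case). The dictionary layout and the variant labels are just my way of organising it.

```python
# Z80 T states per instruction; JR cc costs 12 when the jump is taken.
T_STATES = {
    "inc hl": 6,
    "ld a,l": 4,
    "cp n": 7,
    "jr nz (taken)": 12,
    "inc l": 4,
    "ld a,n": 7,
    "and l": 4,
    "ld l,a": 4,
}

VARIANTS = {
    "any address, size < 256": ["inc hl", "ld a,l", "cp n", "jr nz (taken)"],
    "page aligned, power of 2": ["inc l", "ld a,n", "and l", "ld l,a"],
    "page aligned, exactly 256": ["inc l"],
}


def total_t_states(instructions):
    """Sum the T states of a straight-line instruction sequence."""
    return sum(T_STATES[name] for name in instructions)


if __name__ == "__main__":
    for label, instructions in VARIANTS.items():
        print(f"{label:28s} {total_t_states(instructions):2d} T states")
```

That is a saving of 25 T states on every received byte for the 256 byte case, paid for by fixing the buffer's size and alignment.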

Enjoy.

PianoMatt

Feb 8, 2017, 10:28:17 AM

to RC2014-Z80

Which interrupt mode are you using to talk to the ACIA? I've just been reading about Mode 2 and the 8 bit vectors and it opens up possibilities for sharing the interrupt line with other devices.

phillip.stevens

Feb 8, 2017, 4:30:26 PM

to RC2014-Z80

Which interrupt mode are you using to talk to the ACIA?

IM 1. It always vectors to $0038. See the code starting with:

.ORG 0038H

I've just been reading about Mode 2 and the 8 bit vectors and it opens up possibilities for sharing the interrupt line with other devices.

Yes, but I don't know if the INT line can be shared on the RC2014 hardware. I think there needs to be a kind of "daisy chain" thing implemented to make IM 2 feasible.

Also the peripheral needs to place its address on the data bus during the interrupt, which adds a significant degree of complexity. But, I haven't read up on all the details, so it might be much easier than it sounds.
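
For reference, this is roughly how the Z80 turns that byte into a handler address in IM 2. It is a sketch: the function name, the table location, the vector bytes and the handler addresses are all hypothetical. The CPU uses the I register as the high byte and the byte from the data bus as the low byte of a pointer into a table, then reads a 16 bit handler address from there.

```python
def im2_handler_address(memory, i_reg, bus_byte):
    """Return the address the Z80 calls for an interrupt in mode 2.

    The table pointer is I (high byte) combined with the byte supplied by
    the interrupting device (low byte). The handler address stored there
    is little-endian.
    """
    pointer = ((i_reg & 0xFF) << 8) | (bus_byte & 0xFF)
    low = memory[pointer]
    high = memory[(pointer + 1) & 0xFFFF]
    return (high << 8) | low


if __name__ == "__main__":
    memory = bytearray(65536)
    # Hypothetical vector table at $8000 with two devices.
    memory[0x8004:0x8006] = bytes([0x34, 0x12])  # device A -> $1234
    memory[0x8006:0x8008] = bytes([0x00, 0x20])  # device B -> $2000
    print(hex(im2_handler_address(memory, 0x80, 0x04)))
    print(hex(im2_handler_address(memory, 0x80, 0x06)))
```

With IM 1 every interrupt lands at $0038, so a shared handler would have to poll each device's status to find out who asked.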
