Round Robbin Scheduler

Interocitor Steve

unread,

Oct 29, 2022, 11:12:05 PM10/29/22

to retro-comp

Hi Guys, it has been a while.

I saw the thread on Z80 protected mode. On a related topic, I have been thinking about a simple scheduler to get some sort of multi-tasking support. Below is a scheme using a clock on the NMI pin to advanve scheduling ticks. There is no priority in this scheme, except for a solo PRI-ority state.

Task Control Blocks (TCB) are used to store application states. Only one task can be active at a time. The NMI will periodically interrupt the CPU, save the current state in memory (in the TCB), and move on to the next task using previously saved data.

The TCB address and other NMI variables would be at fixed addresses. There would be no relocatable code. This will keep things simple and quick.

This scheme depends upon updating the NMI saved state with previously saved TCB data to effect an alternate return from the NMI to go tothe next scheduled task. I assume this can be done, but I am not sure.

I welcome comments and questions.

=Steve.

ROundRobbin.pdf

Mark T

unread,

Oct 30, 2022, 1:41:31 AM10/30/22

to retro-comp

Would it be easier to save the registers and status to the stack, so only the stackpointer is saved in the tcb?

Phillip Stevens

unread,

Oct 30, 2022, 6:13:49 AM10/30/22

to retro-comp

Mark T wrote:

Would it be easier to save the registers and status to the stack, so only the stackpointer is saved in the tcb?

Interocitor Steve wrote:
Hi Guys, it has been a while.

I saw the thread on Z80 protected mode. On a related topic, I have been thinking about a simple scheduler to get some sort of multi-tasking support. Below is a scheme using a clock on the NMI pin to advanve scheduling ticks. There is no priority in this scheme, except for a solo PRI-ority state.

Task Control Blocks (TCB) are used to store application states. Only one task can be active at a time. The NMI will periodically interrupt the CPU, save the current state in memory (in the TCB), and move on to the next task using previously saved data.

The TCB address and other NMI variables would be at fixed addresses. There would be no relocatable code. This will keep things simple and quick.

This scheme depends upon updating the NMI saved state with previously saved TCB data to effect an alternate return from the NMI to go tothe next scheduled task. I assume this can be done, but I am not sure.

I welcome comments and questions.

Just following on from Mark's comment, the FreeRTOS TCB only has a few items recorded.

typedef struct TaskControlBlock_t
{
volatile StackType_t * pxTopOfStack; /*< Points to the location of the last item placed on the task's stack. THIS MUST BE THE FIRST MEMBER OF THE TCB STRUCT. */
ListItem_t xStateListItem; /*< The list that the state list item of a task is reference from denotes the state of that task (Ready, Blocked, Suspended ). */
ListItem_t xEventListItem; /*< Used to reference a task from an event list. */
UBaseType_t uxPriority; /*< The priority of the task. 0 is the lowest priority. */
StackType_t * pxStack; /*< Points to the start of the stack. */
char pcTaskName[ configMAX_TASK_NAME_LEN ]; /*< Descriptive name given to the task when created. Facilitates debugging only. */

} TCB_t;

The whole state of the machine is pushed onto the Task's stack in a Timer interrupt, before recording the location of the last item pushed into the TCB in the pxTopOfStack.
That's very portable way to do it. As who knows how many registers you might need to save?

I'm sure are lots of ideas to crib from the FreeRTOS Kernel that you can add to your scheduler, and as the majority of it is in C it is pretty easy to read.

Most of the code in the source portable directory is related to other machines. The scheduler and core code is really just a few files.

Cheers, Phillip

Tom Storey

unread,

Oct 30, 2022, 3:20:38 PM10/30/22

to retro-comp

FreeRTOS is something I have done a lot of work with in recent years, even creating a port for the 68k (have yet to push it into the community maintained ports repo, but I'm working on a project at the moment in which I'll make final refinements before doing so).

I think it's a great little RTOS to use, and provides great insight into OS mechanics. It could also serve as great inspiration if you're looking to implement your own (RT)OS as it's got all of the basic components you need to do so.

--
You received this message because you are subscribed to the Google Groups "retro-comp" group.
To unsubscribe from this group and stop receiving emails from it, send an email to retro-comp+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/retro-comp/42eb041e-366f-475b-b18c-ec177ca12caan%40googlegroups.com.

Dr .Steve Mastrianni

unread,

Oct 30, 2022, 3:51:16 PM10/30/22

to retro-comp

Years back I wrote several round-robin type kernels, long before the advent of things like VxWorks and Vrtx. I did them mainly for embedded applications, most would never have a storage device, keyboard or display. Non-preemptive and cooperative time-sliced scheduling is easy, but anything bad that happens during the slice can render the system unusable. A single I/O wait can starve other processes or threads - even waiting on a single bit. The slices have to be large enough to be fair, yet not too large to starve others. Some of the 'processes' required real time data acquisition - if you weren't there for the data it would pass you by. These simple schedulers quickly grew to accommodate a priority architecture, real and protected mode, then finally to handle NMIs and preemptive processing. Even without memory allocation, these bare bones kernels quickly grew more and more complex. Maintaining state required fairly complex structures and before I knew it, initializing and maintaining those tables introduced a significant overhead. Writing code meant avoid things like malloc and calloc, and the stack became a frequent trouble spot. However, they did work, and we deployed many of these in critical systems for avionics, warfare, and telecommunications. Some are probably still in use.

More modern OS's have added memory allocation and locking, semaphores, IPC, and other features. However, IMHO I don't think that a Z80 has the architecture and processor features necessary to support such a system you describe, at least in a meaningful way as to provide some sort of value. There are much better alternatives.

--Steve

Interocitor Steve

unread,

Oct 30, 2022, 3:54:53 PM10/30/22

to retro-comp

Thanks for the comments guys.

Mark, On pushing everything to the stack. Will this work when I need to switch to another task? I will need to sketch this out to see. You may be right. I am worrying about other variables that apps may write to the stack. Should every app have its own stack? Maybe so? This seems obvious now that I have asked the question.

Phillip and t...@snnap, RTOS looks like more than I need. I am thinking a round-robin scheduler will meet my needs. I don't want the overhead of a real time OS. That beimg said, I am planning to use the PRI state to sieze the system resources for time critial operations when needed. The PRI task state basically turns off the scheduler. I will look at FreeRTOS and see what I can use.

=Steve.

Interocitor Steve

unread,

Oct 30, 2022, 4:12:06 PM10/30/22

to retro-comp

Hi Steve.ma,

Your post came in as I was writing my post above.

I used VRTX on an embedded system that collected telemetry from various devices. I appreciate you description of the development of OS's. I have a similar perspective. My plan of attack is (1) to have relatively large slices of time (0.25 secs or larger) and (2) to write apps with the knowledge of the limitation of the scheduler.

But you have me thinking. Maybe NMI should not be used for the scheduler as this could interrupt other lower level interrupts and switch tasks at the wrong time. So maybe the scheduler is the lowest priority interrupt. Then it can only run when no other interrupts are being processed. It would be safer, I think, to only switch tasks when things are relatively quiet. That way, there is nothing real-time about the scheduler. What do you think?

=Steve.

Tadeusz Pycio

unread,

Oct 30, 2022, 4:45:12 PM10/30/22

to retro-comp

On this topic, it is worth reading the RTM-Z80.

Alan Cox

unread,

Oct 30, 2022, 8:06:07 PM10/30/22

to retro-comp

For Fuzix I just stack the registers then switch bank and stack
pointers as needed. The only really exciting bit is getting the
restore right so that you never take an interrupt mid restore with an
invalid stack pointer or bank. On Z80 that's easy, on 6502 it's a bit
exciting.

You don't want to be using NMI - as you'll need to avoid NMI to use
floppy disks, and NMI has a lot of other problems around repeat NMI
whilst in an NMI, plus NMI is needed for single step debug and debug
buttons etc. On a banked system NMI is near impossible to use for
anything else, and if you want to run CP/M code it's also not possible
to use NMI.

Phillip Stevens

unread,

Oct 30, 2022, 9:36:59 PM10/30/22

to retro-comp

Interocitor Steve wrote:

My plan of attack is (1) to have relatively large slices of time (0.25 secs or larger) and (2) to write apps with the knowledge of the limitation of the scheduler.

IMHO, either you run the scheduler with a fast Tick (perhaps 1ms or faster) and use the Scheduler for everything. Or do as you propose and run the scheduler to take care of the "soft" real time tasks like the user interface, serial ports with hardware buffers, or other relatively slow processes.

But you have me thinking. Maybe NMI should not be used for the scheduler as this could interrupt other lower level interrupts and switch tasks at the wrong time. So maybe the scheduler is the lowest priority interrupt. Then it can only run when no other interrupts are being processed. It would be safer, I think, to only switch tasks when things are relatively quiet. That way, there is nothing real-time about the scheduler. What do you think?

Admittedly only hobby experience, but if something needs to happen at very regular intervals, then I'd prefer to use a hardware timer to trigger it at a higher priority than the Scheduler. I built a synthesiser on an Arduino 8-bit AVR, and the audio output to the MCP4822 DAC (up to 44kHz) was done with a hardware timer, and the User Interface (screen and serial I/O) were done using the (lazy) 15ms Tick RTOS.

Phillip and Tom, RTOS looks like more than I need. I am thinking a round-robin scheduler will meet my needs. I don't want the overhead of a real time OS. That beimg said, I am planning to use the PRI state to sieze the system resources for time critial operations when needed. The PRI task state basically turns off the scheduler. I will look at FreeRTOS and see what I can use.

One of the useful things might be the handling of context save and restore, which you'll need no matter what. There is a little bit of bit trickery to get the correct interrupt enable status stored in the context, irrespective of whether you're restoring state from normal operations, or from within an interrupt. Normally you just read the i register (in a CMOS Z80) and restore based on the Odd/Even bit, but from within an interrupt you need to fake a set interrupt enable bit.

P.

Alan Cox

unread,

Oct 31, 2022, 12:35:18 AM10/31/22

to Phillip Stevens, retro-comp

> IMHO, either you run the scheduler with a fast Tick (perhaps 1ms or faster) and use the Scheduler for everything. Or do as you propose and run the scheduler to take care of the "soft" real time tasks like the user interface, serial ports with hardware buffers, or other relatively slow processes.

Or you write your scheduler so that it's smart enough not to task
switch except when needed. A lot of older operating systems used a mix
of
- Simple priority queues (often with a basic 'get dumped in the lowest
priority if you use a whole timeslice)
- Interrupt handlers not task switches used whenever possible to
reduce switching cost
- When swapping so you always run a swapped in task a bit before it
can get swapped out even if there are higher priority candidates who
are swapped - so you always make progress

Fuzix also pulls an extra trick in that it doesn't actually task
switch when there is no runnable task but sits in the current task
until the point something is runnable and if that one thing is the old
task (usual case by far) then it just carries on running it. In a
typical usage case of single user running say Zork, there will be 2 or
3 context switches between shell starting the game and exiting it an
hour later 8)

An RTOS is a bit different because you can only do so much in
interrupt context because of the need to handle priority inversions.
Z80 is horrible for this because of the register count, although
reserving the alt registers for the actual irq paths helps a bit. On
the plus side you don't have to play zero page allocation games like
the 6502.

>> But you have me thinking. Maybe NMI should not be used for the scheduler as this could interrupt other lower level interrupts and switch tasks at the wrong time. So maybe the scheduler is the lowest priority interrupt. Then it can only run when no other interrupts are being processed. It would be safer, I think, to only switch tasks when things are relatively quiet. That way, there is nothing real-time about the scheduler. What do you think?

Scheduler is usually for a non-RTOS the lowest priority. If you have
an interrupt controller you don't want to lose serial characters to a
task switch. With the Z80 you are usually stuck with a single real
priority unlike say 8085.

> One of the useful things might be the handling of context save and restore, which you'll need no matter what. There is a little bit of bit trickery to get the correct interrupt enable status stored in the context, irrespective of whether you're restoring state from normal operations, or from within an interrupt. Normally you just read the i register (in a CMOS Z80) and restore based on the Odd/Even bit, but from within an interrupt you need to fake a set interrupt enable bit.

It's usually quicker to just track your interrupt state in software.
That also allows for a bunch of other optimizations, and works on the
NMOS parts as well. Be warned several compilers have supposed
"interrupt" support but don't understand NMOS parts. Trying to do
clever stuff with the I register gets really messy as if you support
any kind of interrupt nesting.

Alan

Tom Storey

unread,

Oct 31, 2022, 6:36:11 AM10/31/22

to Interocitor Steve, retro-comp

FreeRTOS gives each task its own stack, so when switching to another task you push the current tasks "context" onto its stack, then adjust the SP to point to the bottom of the next tasks stack and pop its context back into the CPU registers.

The same basic switching in mechanism is used to get a task started in the first place. The (empty) stack is populated with whatever default values should be popped into the CPU registers when the task starts up and the address of the first instruction. To make it run you switch the task in (set SP to the bottom of its stack), pop its register values into the CPU regs, and simply RET. This way you only have to start the first task, and switches between subsequent tasks happen "naturally" as if they had been switched out before.

FreeRTOS also recommends running the tick timer interrupt as the absolute lowest priority interrupt. Tasks are only run once they are "ready", that is they have just been started, or if they were previously waiting on something that has now occurred, or perhaps was pre-emptively switched out for another task to run and no other tasks need to run so it can just continue on (sounds a bit like what Alan described). Otherwise if the "ready" queue(s) are empty, then you just sit in the idle loop waiting for one to become ready.

--

You received this message because you are subscribed to the Google Groups "retro-comp" group.
To unsubscribe from this group and stop receiving emails from it, send an email to retro-comp+...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/retro-comp/5347bb85-ffba-4326-8115-2d67c71be296n%40googlegroups.com.

Richard Deane

unread,

Oct 31, 2022, 2:42:32 PM10/31/22

to Interocitor Steve, retro-comp

I remember back to the early days of DOS where a task switcher (and GemXM) could in fact be more useful than a multitasker, at least from the plain user experience. DRDOS had a good switcher in TaskMax (could also be multitasking)

I would find a task switcher handy under CP/M and maybe less stressful on Z80 CPU resource, however implicit in a task switcher is the ability to switch user interface so it could end up being terminal specific, ( save and restore current screen and keyboard buffer/status), unless it used an RSX type of add-in to intercept and keep current screen images, would need access to banked memory for screen buffers, but you'd want more than 64k system if you were doing multi-user multi-tasking anyway.

Food for thought.

I like the FabGL implementation of Multitasking - Multisession CP/M 3 (Plus) with ESP32

Richard

--
You received this message because you are subscribed to the Google Groups "retro-comp" group.
To unsubscribe from this group and stop receiving emails from it, send an email to retro-comp+...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/retro-comp/7a100c12-e649-46b3-89ea-e82429bb702an%40googlegroups.com.

Alan Cox

unread,

Oct 31, 2022, 3:56:33 PM10/31/22

to Richard Deane, retro-comp

On Mon, 31 Oct 2022 at 18:42, Richard Deane <rwd...@gmail.com> wrote:
>
> I remember back to the early days of DOS where a task switcher (and GemXM) could in fact be more useful than a multitasker, at least from the plain user experience. DRDOS had a good switcher in TaskMax (could also be multitasking)
>
> I would find a task switcher handy under CP/M and maybe less stressful on Z80 CPU resource, however implicit in a task switcher is the ability to switch user interface so it could end up being terminal specific, ( save and restore current screen and keyboard buffer/status), unless it used an RSX type of add-in to intercept and keep current screen images, would need access to banked memory for screen buffers, but you'd want more than 64k system if you were doing multi-user multi-tasking anyway.

VPMM did this on the Gemini. It did have some platform specific code
but didn't need banked memory for the the saves as it could write
state to disk but did use Nascom banking in order to get a near 64K
TPA. It could also run ISIS and UCSD Pascal sessions as well as CP/M
which was a neat trick.

http://81.98.24.96/xbeaver/xbeaver_software_vault.html

ladislau szilagyi

unread,

Nov 1, 2022, 12:11:00 AM11/1/22

to retro-comp

Hi,

I'm maybe just a bit late...but...

Take a look at RTM/Z80: ( https://github.com/Laci1953/RTM-Z80 )

Basic RTM/Z80 characteristics and features:
• May be run directly under CP/M 2.2 on any vintage Z80 based computer as a .COM executable program, or may be run on any simulated/emulated Z80 system, or on any Z80 retro/home brew computer (e.g. RC2014).
• It’s very easy to build applications that use RTM/Z80. Using the HiTech vintage C compiler, assembler and linker, the application can be quickly compiled, assembled and linked with the RTM/Z80 library, in a minimal number of steps.
• It’s easy to configure. It can be quickly tailored according to the user options. Can be built as an object code library or can be ported to a (E)EPROM + RAM configuration.
• It’s quite small. In its minimal version, it requires less than 4KB ROM plus 8K RAM. However, RTM/Z80 can be scaled-up easily: if your RC2014 hardware is provided with a 512KB ROM + 512KB RAM memory module, then RTM/Z80 can be configured to run very large multitasking applications (with total code & read-only data size up to 512KB) capable to allocate/deallocate very large amounts of read-write data (up to 460KB)
• It’s fast enough. The following measurements are valid for a Z80 CPU running at 7.3728 MHz:
o Task switching time (switching from a task executing a semaphore “signal” to another task “waiting” for the semaphore) is under 160 microseconds.
o Allocate/deallocate a block of dynamic memory takes under 130 microseconds
o Run/stop task operations are executed in less than 220 microseconds
• Its functions are easy to be used (no complicated C or assembler data structures need to be initialized)
• Provides 8KB of RAM dynamic memory, enabling users to allocate/deallocate blocks of memory of variable sizes, from 16 bytes to 4 Kbytes.
• Users can define and run up to 32 concurrent tasks. The highest priority task gains CPU access.
• RTM/Z80 is basically a “cooperative” multitasking system; however, round-robin pre-emptive scheduling is possible, and can be switched on/off using the API
• Semaphores can be used to control access to common resources. Semaphores are implemented as “counting” semaphores. There is no limit regarding the number of semaphores that can be used.
• Queues and mailboxes can be used for inter-task communication. Messages can be sent and received between tasks. There is no limit regarding the number of queues or mailboxes that can be used.
• Timers can be used to delay/wait for variable time intervals (multiple of 5 milliseconds). There is no limit regarding the number of timers that can be used.
• Interrupt driven I/O drivers are used to perform I/O requests targeted to serial hardware devices (console, printer, etc.) for baud rates up to 115.2 K

There is available a detailed RTM/Z80 manual...

Ladislau

Message has been deleted

Interocitor Steve

unread,

Nov 1, 2022, 12:57:59 AM11/1/22

to retro-comp

Here is my scheme. Comments welcome.

In my soultion, every task has its own stack. This is key. Otherwise things would get clobbered.

Starting
Every task is assigned a preassigned stack pointer
and an initial start address.
These are hardcoded at compile time.
The TCB's are populated at link time with this info.

At startup, the scheduler runs first and sets up the Mode 2 low pri ticker.

Interrupts are enabled.

An idle loop is run until the first tock comes in.

At startup, the scheduler starts with Task 1.
The current stack return info is over written with Task 1 TCB PC and SC
The task count is bumped to Task 2
Interrupts are enabled.
A RETI is issued and Task 1 runs.

A scheduler tick occurs.
The scheduler sees it is time for Task 2.
The scheduler saves the Task 1 return data into the Task 1 TCB for later use.
The Task 2 PCB SC and PC are jammed onto the stack.
The task count is bumped to Task 3
Interrupts are enabled.
A RETI is issued and Task 2 runs.

A scheduler tick occurs.
. . . repeat the above for all tasks and loop.

o Registers will need to be saved, too.
o Prime registers (alt) could be dedicated to the scheduler.
o This might be useful for debugging.

o Note: A task could hog the system with a HOG state.
In that case, the scheduler would simply return and allow the task to continue.
The hogging task will need to change the state to allow the scheduler to proceed.

As the scheduler is the lowest priority, other higher pri. ints. will be serviced.
But I think interrupts may need to be disable when the scheduler monkeys with stacks, - not sure

Tom Storey

unread,

Nov 1, 2022, 4:22:59 AM11/1/22

to Interocitor Steve, retro-comp

On Tue, 1 Nov 2022 at 04:52, Interocitor Steve <ZO...@gladucalled.com> wrote:

But I think interrupts may need to be disable when the scheduler monkeys with stacks, - not sure

FreeRTOS disables interrupts during particularly sensitive code paths.

It also has a concept of "critical sections", which are blocks of code where interrupts should be disabled (usually up to a certain level if your CPU supports it - this allows other higher priority interrupts which do not interact with the FreeRTOS API to continue to run unimpeded).

There are basically two macros to enter and exit a critical section respectively, and both of which maintain a counter so that you can nest critical sections. When you enter a critical section, if the counter is 0 then interrupts are disabled, and the counter is always incremented regardless. When you exit a critical section the counter is decremented, and only once it reaches 0 are interrupts re-enabled.

Tasks can use these macros as well as the kernel, and naturally it is recommended to keep such sections as short as possible. I guess you can use it to implement "kind of atomic operations".

Interocitor Steve

unread,

Nov 1, 2022, 12:00:37 PM11/1/22

to retro-comp

Hello T...,

I had forgotten the counter part of this. Thanks for the tip.

I seem to also remember using test-and-set instructions, too. I forget if the Z80 has this instruction. But they were part of testing something without having an interrupt seek in and change something just after you read it. The test and set code could be used to avoid the trap when you read something, get an interrupt, and then use the flags to continue. This can't happen with Z80 interrupts exactly, but it could happen with task switching when shared variables are used. I don't remember the details of shared variables on a Z80.

Most of my experience was with OS32, a real-time OS on Interdata mini-computers with an MMU, multiple processor states, 256 levels of interrupts, and three system states. Of course, the OS was already written and well documented so all I had to do was read it and work the practice.

I am striving for simplicity here with the Z80.

Thanks, =Steve.

Mark T

unread,

Nov 1, 2022, 1:46:37 PM11/1/22

to retro-comp

Inc (hl) and dec (hl) are atomic as far as interupts are concerned, but not for dma.

Steve Mastrianni

unread,

Nov 1, 2022, 1:57:14 PM11/1/22

to Interocitor Steve, retro-comp

Problems you’re still likely to encounter are in dealing with spin locks, resources, race conditions, shared memory contention, and deadlocks that can only be restarted with hardware intervention.

It’s a great exercise for learning the nuances of multitasking but beyond just doing it for fun, I’d be inclined to ask ‘why?’ There are much better architectures that could provide you with what you’re trying to develop. The 80c286 would give you a nice starting point where you could take advantage of the ring architecture, real and protect mode, and most importantly exception handling - a ‘must’ for any multitasker.

I do enjoy the discussion. Programmers today seem to be writing cloud native apps, JavaScript, python and the like with little knowledge of the underlying architecture that enables all of that higher level functionality.

--Steve

Dr. Steve Mastrianni

Sent from my iPhone

On Nov 1, 2022, at 12:00 PM, Interocitor Steve <ZO...@gladucalled.com> wrote:

--
You received this message because you are subscribed to a topic in the Google Groups "retro-comp" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/retro-comp/U-DfKSDRVek/unsubscribe.
To unsubscribe from this group and all its topics, send an email to retro-comp+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/retro-comp/cd565cdd-bbce-4c6e-830a-c51445d5dd61n%40googlegroups.com.

Mark T

unread,

Nov 1, 2022, 7:00:40 PM11/1/22

to retro-comp

I would guess you want to handle mutexes in the task scheduler rather than inside each task, to avoid starting a task that doesn’t currently hold the mutex. As there might be more than one shared resource, with separate mutexes this might mean that the task control block needs to be able to handle a list of mutexes. I’m only guessing as I haven’t studied this type of system, just interested in how it might work.

Phillip Stevens

unread,

Nov 1, 2022, 7:25:05 PM11/1/22

to retro-comp

Mark T wrote:

Inc (hl) and dec (hl) are atomic as far as interupts are concerned, but not for dma.

Steve wrote:
I seem to also remember using test-and-set instructions, too. I forget if the Z80 has this instruction. But they were part of testing something without having an interrupt seek in and change something just after you read it. The test and set code could be used to avoid the trap when you read something, get an interrupt, and then use the flags to continue. This can't happen with Z80 interrupts exactly, but it could happen with task switching when shared variables are used. I don't remember the details of shared variables on a Z80.

For the Z80 the most commonly used "test and set" or MUTEX is the SRA (HL) instruction. It is prepared by loading memory with $FE. The first thread to "take" it will get a clear carry bit which can then be tested (at leisure) and acted upon. All future attempts to "take" it will get a set carry bit which is used to signal "NO". To "give" the mutex back the memory location is loaded with LD (HL),$FE which sets things up for the next attempt.

The operation needs to affect both memory (outside the CPU), and the CPU state. It is unaffected by whether interrupts are enabled or disabled.

That is the mechanism that we used to put a MUTEX inside HBIOS for RomWBW, so that it could support FreeRTOS on the Z180 SCZ180*.

The FreeRTOS Scheduler, on the other hand only runs with interrupts disabled. There are too many different architectures supported to rely on mutex tricks available on one or other machines, that may not be available on others.

*HBIOS is complicated and non-reentrant, and there are still some edge cases where things don't work as expected.

So FreeRTOS on the SCZ180 can't pass the heavy (eg 10+ task) system tests. I don't know why, and it is difficult to debug.

Phillip.

Mark T

unread,

Nov 1, 2022, 10:01:16 PM11/1/22

to retro-comp

Inc (hl) and dec (hl) also updates both memory and status. Initialization would be to set to ff. Inc would change memory to 0 and set zero flag, dec would reset to ff. If inc fails to set zero flag then its followed by dec.

I think this might be safer than ld (hl),$fe.

Phillip Stevens

unread,

Nov 1, 2022, 11:15:13 PM11/1/22

to retro-comp

Mark wrote:

Inc (hl) and dec (hl) also updates both memory and status. Initialization would be to set to ff. Inc would change memory to 0 and set zero flag, dec would reset to ff. If inc fails to set zero flag then its followed by dec.

Unfortunately, inc (hl) and dec (hl) leave a small hole in the exclusion armour. That is if they are hit 256 times between inc (hl) and dec (hl), then their exclusion rolls over and you get two processes with the zero flag. Which is bad. So it relies on you being quick with the test and reset to minimise the hole, and it relies on something else bad (i.e. getting hit 256 times in the hole) never happening. IMHO, you can’t rely on something bad never happening.

I think this might be safer than ld (hl),$fe.

It doesn’t matter how ugly your release of the exclusion is, as long as it is done atomically. And this method is atomic (except by DMA).

P.

Interocitor Steve

unread,

Nov 2, 2022, 1:29:51 AM11/2/22

to retro-comp

Hi Steve.ma,

I am mostly dabbling with this for fun. I know how to do and have done it the right way. But I have never dealt with a round-robin scheduler. I want so see what I can do with this. Basically, I have some time consuming math do to (quaternions) and I don't want to bog down the system waiting for computational results. The only resource I expect to share is terminal I/O, and maybe not even that if I use a separate graphics display for the quaternion stuff and an ASCII terminal for other app stuff. There is no preemptive requirement and signaling is not time critical.

=Steve.

On Tuesday, November 1, 2022 at 1:57:14 PM UTC-4 steve.ma...@gmail.com wrote:

Interocitor Steve

unread,

Nov 2, 2022, 2:04:43 AM11/2/22

to retro-comp

Phillip and Mark,

Thanks for the MUTEX advise. I will need something for signaling. Phillip, thanks for the 256 tip. This problem would be a bear to find in real life.

I may make use to the alternate registers for each task signal needs and for scheduler use, just a thought.

As this is round-robin, it will be OK for the scheduler not to catch a signal/semaphore right away. The next loop through the TCBs will be soon enough. As I said, there are no real time requirements placed upon the scheduler. The word "scheduler" inside large operating systems usually means the code that evaluates changing priorities and I/O events and decides what to do next. This is not that. The scheduler I am designing is more like a main() loop in a Z80 program. My goal is to get rid of some main() loops and to let time-consuming, but not time-critical "subroutines," i.e. calls(), run "in the background" while other functions of the system continue on.

=Steve.

Interocitor Steve

unread,

Nov 2, 2022, 2:06:36 AM11/2/22

to retro-comp

Also,

I am developing a WS2812 board and driver. I hope not to have to use DMA. I don't think I will need it. Besides, I haven't built a DMA board. But, are you guys suggesting that DMA can mess with CPU flags? =Steve.

Mark T

unread,

Nov 2, 2022, 3:57:05 AM11/2/22

to retro-comp

Busrq on z80 can release the bus at the end of any bus cycle, so if you had two or more processors accessing shared memory then the atomic instructions are no longer atomic. One processor might read the value of (hl) but then another processor could gain access to the bus to read the same mutex before the first processor has chance to write the updated value. Probably only a risk in a multi z80 system with shared memory.

Tom Storey

unread,

Nov 2, 2022, 5:35:50 AM11/2/22

to Interocitor Steve, retro-comp

DMA cant modify CPU flags - CPU flags belong to the CPU.

You received this message because you are subscribed to the Google Groups "retro-comp" group.

To unsubscribe from this group and stop receiving emails from it, send an email to retro-comp+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/retro-comp/23a4131f-d48b-4a93-b8eb-bb5e19bb7824n%40googlegroups.com.

Alan Cox

unread,

Nov 2, 2022, 7:23:42 AM11/2/22

to Tom Storey, Interocitor Steve, retro-comp

On Wed, 2 Nov 2022 at 09:35, Tom Storey <t...@snnap.net> wrote:
>
> DMA cant modify CPU flags - CPU flags belong to the CPU.

What can happen on dual Z80 though is that bits of the two shifts or
increments happen interleaved so that an operation is effectively
lost. The flags may be local but what they are computed from is not.
The Z280 addresses this as with a cache itt's theoretically useful
dual processor, although I've never heard of any such Z280 combos.

On RC2014 the only case I know where it matters is the dual port
shared memory card but that has an inbuilt interrupt facility in each
direction and is intended to be used as a messaging interface anyway.

Otherwise you need some kind of external hardware for most 8bit
machines to do this sort of stuff (or a rather inefficient algorithmic
solution). Later WDC 65C02 added a pin for bus locking
Read/Modfy/Write instructions (akin to LOCK on an 8086/286). Not sure
if 68xx ever got one.

Alan

Douglas Miller

unread,

Nov 2, 2022, 8:16:12 AM11/2/22

to retro-comp

If you're talking about multi-CPU systems (SMP), you will probably need to implement some sort of specialized hardware atomics. I've seen this done as a small bank of specialized "registers" that, if operated upon in a consistent manner, provide an SMP-safe "test-and-set" operator. Unless the CPU provides some sort of "bus lock" operation (or like the PowerPC "reserve" and "test"), you'll need to implement something external. Note the having a cache would actually complicate the situation, as you have to worry about cache coherency. DMA can also be a problem, although it seems unlikely that one would put a mutex/lock in an area that is being used for DMA.

But for single-CPU systems, you just need to ensure that the current task/process does not get preempted while operating the mutex. This could be using DI/EI or could be using a flag to avoid getting dispatched, or even a flag that prevents the dispatcher from doing EI. MP/M used all of those, depending on the situation. If the mutex requires a system call to operate, you should have control over that (in the OS).

Alan Cox

unread,

Nov 2, 2022, 10:32:02 AM11/2/22

to retro-comp

On Wed, 2 Nov 2022 at 12:16, Douglas Miller <durga...@gmail.com> wrote:
>
> If you're talking about multi-CPU systems (SMP), you will probably need to implement some sort of specialized hardware atomics

Z80/Z180 are in-order machines so you can use the Bakery algorithm.
Just something I've not needed to play with because I use the dual
port memory card as a message passing bridge instead. Once you get to
newer processors it gets scary complex as you say.

Alan

Bill Shen

unread,

Nov 2, 2022, 11:02:35 AM11/2/22

to retro-comp

>WS2812B board and driver
Now that's a real world application that's interesting. Assuming you want a decent size panel, says 64x64 and run animated graphic and sound. So one Z80 banging on the WS21812B, one Z80 working on sound, one Z80 dealing with mass storage, and multiple Z80 working on the math of animate graphic. Multiple Z80 processors display animated graphic on color LED array with cool sound, now that's a project. How about BadApple!! for demo?
Bill
PS, I did "game of life" with 3 Z80 driving I2C OLED display, but no sound and tiny display, boring!
https://groups.google.com/g/retro-comp/c/A98YbLhjHyI/m/xu6vTwEuBAAJ

Mark T

unread,

Nov 2, 2022, 2:25:45 PM11/2/22

to retro-comp

A single bus cycle atomic test and set on the z80 could use in r,(c), or in a,(nn), with an edge triggered register eg 74hct574 registering the value of A8..15 and providing the previous value as input. This then means processors would need to share the bus for IO, which adds all sorts of problems with interupts unless only one processor is allowed to respond.

Douglas Miller

unread,

Nov 2, 2022, 5:29:30 PM11/2/22

to retro-comp

So a "lock" routine would then load the "set" value into B and perform the IN instruction, and test for "not set".

Multi-CPU is actually a pretty broad category. I keep thinking SMP, where the CPUs share the bus (which is pretty difficult here, especially for N > 2). I'm not sure what kind of multi-tasking OS you'd have for isolated/dedicated CPUs. I guess it's a good thing we're talking about single-CPU multi-processing (not processor) OS.

Douglas Miller

unread,

Nov 2, 2022, 5:44:34 PM11/2/22

to retro-comp

You might want to have a look at the MP/M documentation. Their various structures (process, queues, etc) give one a good idea of how things worked. Having recently done some debugging of MP/M (tracing code paths on a simulator), I gained more insight into how some things are handled. Like most OSes, MP/M even has the "Idle" process, which is what runs when nothing else is ready. I'm assuming you are implementing various system calls for mutexing, etc, such that the OS can more easily manage putting processes to sleep and waking them.

As I recall, on entry into the "kernel" MP/M saves the user SP and switches to the system stack. Then, or at the same time, it temporarily changes the SP to point to the process descriptor register area and pushes registers (saving them). Of course, MP/M was written to be 8080 compatible, and only performs Z80 instructions to save the extra registers. You can fully optimize for Z80.

On using the alternate registers for the system, that might be possible in your environment if you are laying down the law from the start. CP/M and MP/M did not, and so some apps used the alternate registers and some BIOSes did too (which does not work well together). Of course, those instructions aren't protected and so any app *could* disturb them.

On Tuesday, November 1, 2022 at 11:00:37 AM UTC-5 Interocitor Steve wrote:

...

Tadeusz Pycio

unread,

Nov 2, 2022, 6:02:38 PM11/2/22

to retro-comp

You could try using the built-in features in the Z280, it has mechanisms for protection as well as working with multiple CPUs through shared memory. It is an ideal candidate for this purpose.

Mark T

unread,

Nov 2, 2022, 6:28:44 PM11/2/22

to retro-comp

I think multi processor with shared memory is an interesting challenge on the z80, I’ve had a few iterations at how this might work.

Unlike the INS8060 which is very slow and has significant gaps between memory cycles that allow another processor to get access to shared memory, the z80 is almost continuosly on the bus unless you try to use the refresh cycles but its not easy to synch these between processors. I think it might work to have separate ram on each z80 as a cache with a write though to all memory for any write instructions. Using wait to hold the z80 until it can access all memory didn’t seem feasible, it looked like it might work latching address and data on every write cycle, and generating busrq early enough to prevent the next cycle starting. Busrq then requests the bus of all processors and writes the latched data to all memories. The write cycle has to be detected before write enable is asserted, using memrq without read enable.

Interocitor Steve

unread,

Nov 3, 2022, 12:18:12 AM11/3/22

to retro-comp

On Wednesday, November 2, 2022 at 5:44:34 PM UTC-4 Douglas Miller wrote:

Like most OSes, MP/M even has the "Idle" process, which is what runs when nothing else is ready. I'm assuming you are implementing various system calls for mutexing, etc, such that the OS can more easily manage putting processes to sleep and waking them.

Nope! Every process gets an equal slice of CPU time. There is no task waiting or putting a task to sleeping (except when the scheduler switches to the next task). No task will ever end. It might spin a loop during its slice of time waiting for a signal from another task. But that's it. Nor is there an idle process. This is much simpler.

=Steve.

Douglas Miller

unread,

Nov 3, 2022, 8:52:59 AM11/3/22

to retro-comp

Interesting. It sounds like you are talking about an "embedded" system and not a general-purpose OS. So, not really any need for a "process state" as all processes are always "runnable". Obviously, the spinning wastes the time-slice, especially when spinning on another process. But, it is simple. I can't think of how it would be any more prone to deadlock or livelock than typical MP/M would be (obviously, any process that spins with interrupts disabled is a potential problem). Just more latency.

Interocitor Steve

unread,

Nov 4, 2022, 10:41:33 AM11/4/22

to retro-comp

I took a look at FreeTROS and it is very complete, with 163 different functions() and a 400 page manual. Does it all though.

Interocitor Steve

unread,

Nov 4, 2022, 10:56:33 AM11/4/22

to retro-comp

Douglas,

It just struck me. Yes! This is for embedded system projects. I hadn't thought of it that way exactly, but the embedded system perspective is perfect. Thanks for the clarity. Now I need a name for it, like Loop the Tasks Scheduler (LTS), and leave the term "Operating System" out of it. Thanks for the help! =Steve.

Mark T

unread,

Nov 4, 2022, 1:54:15 PM11/4/22

to retro-comp

I was searching to see if there were any programming languages aimed at real time systems and found this link from wikipedia that might be worth a look. I thought it was interesting for its time in history, prior to multitasking operating systems being so common.

http://async.org.uk/Hugo.Simpson/MASCOT-3.1-Manual-June-1987.pdf

Phillip Stevens

unread,

Nov 5, 2022, 2:14:05 AM11/5/22

to retro-comp

Mark wrote:

I was searching to see if there were any programming languages aimed at real time systems and found this link from wikipedia that might be worth a look. I thought it was interesting for its time in history, prior to multitasking operating systems being so common.
http://async.org.uk/Hugo.Simpson/MASCOT-3.1-Manual-June-1987.pdf

First thought when reading the MASCOT intro. "Wonder if they knew about ADA, over the pond?"

Spoiler, yes they did and it is referenced.

Also, I wonder if there's any connection to the Erlang language which is used in big telecom systems? https://www.erlang.org

I used to work one of Ericsson's competitors, and we used CHILL for our systems, and I do remember i80168 processors being used extensively. https://en.wikipedia.org/wiki/CHILL