HexLoadr - with step by step instructions

phillip.stevens

Feb 11, 2017, 7:45:00 AM
to RC2014-Z80
I've spent a bit of time untangling the arbitrary programme HEX loading story,
and have assembled a bit of "how to" mainly so I don't forget what to do.
I count 11 steps in the process, and that is more than I have fingers.

I also did a bit of a rework of some of the code,
and put some tools together to check for loading success.


Thanks and credit to Filippo, Dave, Jason, Scott for making it easy.

Just a small point. My RC2014 is still poorly, so I don't actually know whether it works.
The system is working well on my other machine though, so it must be pretty close.

Once I can prove that everything is working, I'll pull it over to the RC2014 repository.

Enjoy

Phillip

phillip.stevens

Feb 13, 2017, 5:42:31 AM
to RC2014-Z80
Well, as usual there's always just a little more to be done, and so there's another iteration on the HexLoadr.

This time I've optimised the code further, and used the saved bytes to add in support for the Extended Segment Address (ESA) Record Type.

The ESA effectively extends the address range of the Intel HEX format to 20 bits (1MB). This is useful for Z180 processors with an integrated MMU. The upper byte of the ESA is written to the BBR of the Z180. So the HexLoadr program is now built with some Z180 instructions to load and store the internal Z180 registers, but these instructions are not executed unless the ESA Record Type is found in the HEX file being read (which should be never when dealing with a pure Z80 CPU).
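
As a rough illustration of how an ESA record widens the address range, here is a minimal Python sketch (this is not HexLoadr's Z80 code). It assumes the standard Intel HEX layout, where record type 02 carries a 16-bit segment value that is shifted left by 4 bits and added to the 16-bit offset of each following data record. The function names and the example records are made up for this illustration.

```python
def parse_record(line):
    """Split one Intel HEX line into (record_type, offset, data)."""
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:].strip())
    count, offset, rtype = raw[0], (raw[1] << 8) | raw[2], raw[3]
    data = raw[4:4 + count]          # the trailing checksum byte is not checked here
    return rtype, offset, data


def physical_addresses(lines):
    """Yield (20-bit address, data) for each data record, applying any ESA."""
    segment = 0
    for line in lines:
        rtype, offset, data = parse_record(line)
        if rtype == 0x02:            # Extended Segment Address record
            segment = (data[0] << 8) | data[1]
        elif rtype == 0x00:          # data record
            yield ((segment << 4) + offset) & 0xFFFFF, data
        elif rtype == 0x01:          # end of file
            break


example = [
    ":020000021000EC",       # ESA: segment 0x1000, so the base is 0x10000
    ":0430000001020304C2",   # 4 data bytes at offset 0x3000
    ":00000001FF",           # end of file
]
for address, data in physical_addresses(example):
    print(f"{address:05X}: {data.hex()}")
```

In this example the upper byte of the segment value is 0x10, which is the byte that would end up in the BBR according to the description above.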

And, now that my RC2014 is working again, this version of HexLoadr has been tested on the RC2014.
Instructions in the README.

Enjoy

Phillip

phillip.stevens

Feb 18, 2017, 9:31:53 AM
to RC2014-Z80
Well, as usual there's always just a little more to be done, and so there's another iteration on the HexLoadr.

I've been using the HexLoadr quite a lot to help me write Z80 assembly programs.
At the moment I'm testing the Am9511A APU, for example.

But, the process to poke the HexLoadr code into Basic is quite slow, compared to the time to load the actual assembly HEX code.
So I decided to include the HexLoadr into my Nascom Basic initialisation code to speed up the process.
Now, I've done that for the RC2014 version too.

The attached HEX file, if burned to a ROM, provides an added option "H" to load a HEX file over the serial terminal from RESET.
When the HEX file is finished, Basic is warm-started normally.

Usually, I'll do a cold start before I load my assembly program, if I need to reserve some space for it at the top of the 32k RAM space.

But, if you're using the 56k RAM module, then just set your assembly program origin somewhere between 0x2000 and 0x7FFF, because this space is ignored by 32k Basic.

To load the HEX program you can use the slowprint.py program, or actually just Linux cat works fine for me because the assembly HexLoadr version is very fast.
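
For anyone who does need pacing, here is a minimal Python sketch of the idea. It is not the actual slowprint.py, which I have not reproduced here; the function name and the delay parameters are assumptions for illustration. It writes the HEX text to any writable binary object one character at a time, with optional pauses.

```python
import io
import time


def send_paced(hex_text, port, char_delay=0.0, line_delay=0.0):
    """Write hex_text to port, optionally pausing after each character and line."""
    sent = 0
    for line in hex_text.splitlines(keepends=True):
        for ch in line:
            port.write(ch.encode("ascii"))
            sent += 1
            if char_delay:
                time.sleep(char_delay)
        port.flush()
        if line_delay:
            time.sleep(line_delay)
    return sent


# Demonstration against an in-memory buffer instead of a real serial device.
demo_port = io.BytesIO()
count = send_paced(":00000001FF\n", demo_port)
print(count, demo_port.getvalue())
```

With a real board, `port` would be the already-configured serial device opened in binary mode (the device path depends on your system), and with both delays left at zero this behaves much like cat.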

The final step before using the newly loaded program is to configure the USR(x) jump location to point to the origin of your program.
The location of the USR(x) jump is at 0x8123, and the actual jump address is at 0x8124 and 0x8125.

WRKSPC  .EQU    8120H           ; <<<< BASIC Work space origin
USR     .EQU    WRKSPC+3H       ; "USR (x)" jump

For example if your assembly origin is 0x3000, then these are the Basic commands to set the USR(x) jump correctly.
POKE &h8124, &h00
POKE &h8125, &h30
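
If you'd rather not split the address by hand, this small Python sketch generates the two commands. It relies only on what is stated above: the jump address lives at 0x8124 (low byte) and 0x8125 (high byte). The helper name is made up for this example.

```python
USR_JUMP_OPERAND = 0x8124   # low byte of the USR(x) jump address, per the post


def usr_pokes(origin):
    """Return the two POKE commands that point USR(x) at origin."""
    if not 0 <= origin <= 0xFFFF:
        raise ValueError("origin must fit in 16 bits")
    low, high = origin & 0xFF, origin >> 8
    return [
        f"POKE &h{USR_JUMP_OPERAND:04X}, &h{low:02X}",
        f"POKE &h{USR_JUMP_OPERAND + 1:04X}, &h{high:02X}",
    ]


for command in usr_pokes(0x3000):
    print(command)
```

The Z80 stores addresses low byte first, so for an origin of 0x3000 the 0x00 goes to 0x8124 and the 0x30 goes to 0x8125.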

I hope this is useful, and saves some time.

Enjoy,

Phillip


nascom32k.hex

Spencer Owen

Feb 18, 2017, 1:27:58 PM
to rc201...@googlegroups.com
I had a little bit of time earlier today and gave this a quick test. It works great :-)

(Sadly, my assembly code doesn't work quite so great, but I discovered what was going wrong much much quicker than I did yesterday)

Also, just a quick tip: instead of having to poke two values in, I used
DOKE &h8124, &hA000
(My code was located at 0xA000 - but it's a little bit more readable this way)

Cheers

Spencer 

phillip.stevens

Feb 18, 2017, 6:07:22 PM
to RC2014-Z80
I had a little bit of time earlier today and gave this a quick test. It works great :-)

Good to hear it is useful for you.

I realised I was trying to chop down an asm tree with a sharpened butter knife,
when what I really needed was to build an axe.
 
Also, just a quick tip: instead of having to poke two values in, I used
DOKE &h8124, &hA000
(My code was located at 0xA000 - but it's a little bit more readable this way)

Yes. Thanks absolutely. I'll fix my README.
Standing too close to the tree to see the wood (to continue the analogy).

Phillip

phillip.stevens

Feb 18, 2017, 9:00:24 PM
to RC2014-Z80
A small erratum.
The final checksum byte was being ignored, which caused an error after HexLoadr had exited successfully.
So I've fixed that now too.
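
For reference, here is a minimal Python sketch of the checksum rule (again, not the HexLoadr code itself). It assumes the standard Intel HEX convention that the last byte of every record is the two's complement of the sum of the bytes before it, and that this includes the end-of-file record. The function names are made up for this example.

```python
def split_record(line):
    """Return (payload_bytes, checksum_byte) for one Intel HEX record."""
    raw = bytes.fromhex(line.strip().lstrip(":"))
    return raw[:-1], raw[-1]


def checksum_ok(line):
    """True when the checksum byte is the two's complement of the payload sum."""
    payload, checksum = split_record(line)
    return (-sum(payload)) & 0xFF == checksum


EOF_RECORD = ":00000001FF"
print(len(EOF_RECORD), checksum_ok(EOF_RECORD))
print(checksum_ok(":00000001FE"))
```

The end-of-file record is 11 characters long, and the last two are its checksum, so a loader that stops reading once it has seen record type 01 leaves those two characters unread.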

New HEX file attached.

I've uploaded my code and the HexLoadr HEX file to RC2014 Github.

Enjoy.

Phillip


nascom32k_hexloadr.hex

Ed Brindley

Mar 12, 2017, 8:51:28 AM
to RC2014-Z80

Hi Phillip,

Many thanks for this, it's great and will definitely speed up my development process!

One thing I've seen, though, is that selecting cold reset prints a load of garbage to the serial port; pressing the reset button a second time then brings you to BASIC correctly. (See attached screenshot.)

Have you seen this behaviour? I tried assembling it myself and using your hex file; both behaved the same.

Thanks,

Ed
Screenshot_20170312_122007.png

David Hardingham

Mar 12, 2017, 1:37:08 PM
to RC2014-Z80

Do you have a large capacitor between the 5V and GND pins? I've seen this when a large capacitor keeps supplying power to the RC2014 after the 5V supply has been removed.


phillip.stevens

Mar 12, 2017, 10:14:21 PM
to RC2014-Z80

Many thanks for this, it's great and will definitely speed up my development process!


You're welcome. I'm glad that it has been useful.

I've been using it to port the Lawrence Livermore Labs Floating Point Library, and having a fast Hex Loader has been invaluable as it is a very large Hex upload.
I made a short video, which shows the Hex upload speed in action at the 24 second mark.

One thing I've seen, though, is that selecting cold reset prints a load of garbage to the serial port; pressing the reset button a second time then brings you to BASIC correctly. (See attached screenshot.)


Have you seen this behaviour? I tried assembling myself and using your hex file, both behaved the same.


I had not previously seen the behaviour you mention. Typically I power my RC2014 from the FTDI (or actually Prolific) serial adaptor.

But, since you have mentioned it, I've tried to power my RC2014 via the external barrel adaptor (with the FTDI power jumper removed), and then tried replugging the power without removing the FTDI adaptor. And, yes, I can get similar behaviour to what you're seeing.

I'm guessing (since I've not debugged it yet) that the serial Rx buffer count is getting a corrupted value, and that this value causes the Rx routine to spit out similarly corrupted buffer contents until the count decrements to zero. It is odd, because the ACIA counters are zeroed before the startup options are presented.


I've uploaded my code and the HexLoadr HEX file to RC2014 Github.

If I get a chance, I'll have a look at what might be causing it. Though, as David H. points out, it is probably a transitory power issue.

Cheers, Phillip

Thomas Riesen

Mar 13, 2017, 5:38:17 PM
to RC2014-Z80
Hi Phillip,
What do you do with the AM9511A? What's the intention?
I've only taken the first steps with this APU... unfortunately I haven't yet worked out all the secrets of this chip.
Regards
Thomas

phillip.stevens

Mar 13, 2017, 6:41:34 PM
to RC2014-Z80
What do you do with the AM9511A? What's the intention?
I've only taken the first steps with this APU... unfortunately I haven't yet worked out all the secrets of this chip.

I've built a Z180 based board, supporting the AM9511A APU.
Partly for historical enjoyment. Partly because it is actually still faster than "modern" Z80 devices.

For example, from the Am9511/Am9512 Floating Point Processor Manual by Steven Cheng, we have comparison tables.

On average the Am9511A APU (at 1.966MHz) produces a hardware floating point divide in 165.9 cycles (of a 2MHz 8080 processor).
Converted to my Am9511A implementation (at 2.304MHz), we have the equivalent in 141.5 cycles of the 2MHz 8080.
Converted to best case modern Z180 terms (overclocked to 36.864MHz) this is 2,609 CPU cycles to return a hardware floating point divide.
To produce an equivalent software floating point divide, using the LLL floating point library, requires 13,080 cycles.

This means that floating point on the 40 year old AM9511A APU is still 5.0 times faster than an overclocked Z180.
Sweet!
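
The conversion above can be reproduced with a few lines of Python. This sketch assumes the figures were obtained by simple clock scaling (my reading of the steps, not something the manual states), and all the constants are taken from the numbers quoted in this post.

```python
APU_REF_CLOCK_MHZ = 1.966      # APU clock used in the manual's comparison table
APU_MY_CLOCK_MHZ = 2.304       # APU clock on the Z180 board
REF_CPU_CLOCK_MHZ = 2.0        # the 8080 clock the manual counts cycles in
Z180_CLOCK_MHZ = 36.864        # overclocked Z180

DIVIDE_8080_CYCLES_REF = 165.9        # hardware divide, in 2MHz 8080 cycles
SOFTWARE_DIVIDE_Z180_CYCLES = 13080   # LLL library divide, in Z180 cycles

# A faster APU clock means proportionally fewer reference cycles per divide.
cycles_8080 = DIVIDE_8080_CYCLES_REF * APU_REF_CLOCK_MHZ / APU_MY_CLOCK_MHZ
# The same elapsed time, counted in Z180 cycles instead of 2MHz 8080 cycles.
cycles_z180 = cycles_8080 * Z180_CLOCK_MHZ / REF_CPU_CLOCK_MHZ
speedup = SOFTWARE_DIVIDE_Z180_CYCLES / cycles_z180

print(f"{cycles_8080:.1f} cycles of the 2MHz 8080")
print(f"{cycles_z180:.0f} Z180 cycles")
print(f"{speedup:.1f} times faster")
```

The results land within rounding of the 141.5, 2,609 and 5.0 figures quoted above.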

A A

Mar 13, 2017, 7:44:22 PM
to RC2014-Z80


On Monday, March 13, 2017 at 4:41:34 PM UTC-6, phillip.stevens wrote:
On average the Am9511A APU (at 1.966MHz) produces a hardware floating point divide in 165.9 cycles (of a 2MHz 8080 processor).
Converted to my Am9511A implementation (at 2.304MHz), we have the equivalent in 141.5 cycles of the 2MHz 8080.
Converted to best case modern Z180 terms (overclocked to 36.864MHz) this is 2,609 CPU cycles to return a hardware floating point divide.
To produce an equivalent software floating point divide, using the LLL floating point library, requires 13,080 cycles.

This means that floating point on the 40 year old AM9511A APU is still 5.0 times faster than an overclocked Z180.
Sweet!

I ran some quick simulations using a program that divided 1000 random floats.

math48 (float package inside z88dk supporting a 48-bit float type):  ~10900 Z80 cycles per divide
IAR z80 (32-bit float type; I did not simulate it, but made a rough estimate from its whetstone performance relative to z88dk):  ~8700 Z80 cycles per divide

The Z180 is more cycle-efficient, needing maybe 3/4 of the cycles?  So about 8175 Z180 cycles and 6525 Z180 cycles for the above.

It still doesn't catch the AM9511A.
I am pretty sure that these soft float packages could still be sped up but there are very few volunteers for that :P
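
As a quick check of that scaling, here is a small Python sketch. The 3/4 factor is the guess made above, and the 2,609 cycle hardware divide figure is the one quoted from Phillip's post; the helper name is made up for this example.

```python
Z180_EFFICIENCY = 0.75          # the guess above: a Z180 needs ~3/4 of the Z80 cycles
APU_DIVIDE_Z180_CYCLES = 2609   # hardware divide figure quoted from the earlier post

Z80_CYCLES_PER_DIVIDE = {"math48": 10900, "IAR z80": 8700}


def to_z180(z80_cycles):
    """Scale a Z80 cycle count to an estimated Z180 cycle count."""
    return z80_cycles * Z180_EFFICIENCY


for name, cycles in Z80_CYCLES_PER_DIVIDE.items():
    z180_cycles = to_z180(cycles)
    ratio = z180_cycles / APU_DIVIDE_Z180_CYCLES
    print(f"{name}: {z180_cycles:.0f} Z180 cycles, {ratio:.1f}x the APU figure")
```

Both software estimates stay well above the hardware divide figure.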

phillip.stevens

Mar 23, 2017, 8:30:08 AM
to RC2014-Z80
It has been a few weeks since I worked on this program.

Mainly because I've been using it to play around with an APU device so it has been working exactly as intended, with nothing to improve.

However, since I was doing some code refactoring for other purposes, I've moved that work here too.
Hopefully, the code rearrangement makes it easier for me to manage more complex drivers, in preparation for z88dk integration.

Effectively, there is no change to the HexLoadr function.

But, I have put in the option to disconnect any of the RST jumps using a jump table. I'm guessing this will be useful for someone.

The table below describes the locations. I've only built this function for the "32k Basic" version, because I use the space from 0x2000 to 0x7FFF to load the programs I write. That avoids the need to interact with the normal Basic memory initialisation sequence during cold boot.

To redirect the program flow following any of the RST, INT, or NMI inputs, you only need to write the address of your program to the corresponding location in the table below.
Some NULL vectors are also provided for convenience; they do the correct RET, RETI, or RETN, depending on the origin of the interrupt.

    Z80 VECTOR ADDRESS TABLE

Label          Value      Label          Value      Label          Value
--------------------      --------------------      --------------------

NULL_RET_ADDR  0040       NULL_INT_ADDR  0060       NULL_NMI_ADDR  0062

RST_08_ADDR    8002       RST_10_ADDR    8006       RST_18_ADDR    800A
RST_20_ADDR    800E       RST_28_ADDR    8012       RST_30_ADDR    8016
INT_00_ADDR    801A       INT_NMI_ADDR   801E
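
As a sketch of how the table might be used from Basic, this Python snippet prints the DOKE command for a given vector. It assumes each listed location holds a 16-bit address that can be written with DOKE, as suggested earlier in the thread for the USR(x) jump; I have not verified that for every vector. The handler address 0x3000 is just an example, and the helper name is made up.

```python
VECTORS = {
    "RST_08_ADDR": 0x8002, "RST_10_ADDR": 0x8006, "RST_18_ADDR": 0x800A,
    "RST_20_ADDR": 0x800E, "RST_28_ADDR": 0x8012, "RST_30_ADDR": 0x8016,
    "INT_00_ADDR": 0x801A, "INT_NMI_ADDR": 0x801E,
}
NULL_INT_ADDR = 0x0060   # NULL vector for INT, from the table


def redirect(label, handler):
    """Return the DOKE command that stores handler at the named vector."""
    if not 0 <= handler <= 0xFFFF:
        raise ValueError("handler must fit in 16 bits")
    return f"DOKE &h{VECTORS[label]:04X}, &h{handler:04X}"


print(redirect("RST_08_ADDR", 0x3000))          # send RST 08 to code at 0x3000
print(redirect("INT_00_ADDR", NULL_INT_ADDR))   # park INT on the NULL vector
```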

I've uploaded my code and the HexLoadr HEX file to RC2014 Github.

The instructions for using HexLoadr haven't changed, and they're at the README in Github.

Enjoy, Phillip
nascom32k_hexloadr.hex