Does the 65CE02 decimal bug exist in the C65?


Paul Gardner-Stephen

Nov 17, 2014, 1:57:39 PM11/17/14
to c65gs-de...@googlegroups.com
Interesting developments as some folks reverse engineer the 65CE02, and find that it has a bug in decimal mode.

If anyone here has access to a working C65 (I can't remember who does), those folks would appreciate getting the results of running a simple test on it. Let me know if you or someone you know might be willing to do this.

Thanks,
Paul.

Wayne Sander

Nov 18, 2014, 9:14:05 PM11/18/14
to c65gs-de...@googlegroups.com, pa...@servalproject.org
Hi Paul,

I don't have a working C65, but I do have a VIC-20 with a 65CE02 chip in it.  It works quite well; a little too well, actually.  Some games have been rendered unplayable due to the increase in speed.  I'd like to see if I can recreate the error.  How do I do the test?

If you'd like to help me design an adapter to use my 4510 chip in my VIC-20 or one of my PETs, I'd be willing to give that a try.  Building a 6510 adapter for the C64 would require a few extra chips and I'd like to keep it simple.

Wayne

Paul Gardner-Stephen

Nov 18, 2014, 9:38:02 PM11/18/14
to Wayne Sander, c65gs-de...@googlegroups.com
Hello,

Okay, let's start with testing the 65CE02 in your VIC-20.

Run the following and tell me what number it outputs:

10 FOR I = 6144 TO 6154
20 READ A: POKEI,A:NEXT
30 DATA 248,169,65,56,233,8,141,0,28,216,96
40 SYS 6144: ?PEEK(7168)

If the CPU has no decimal-mode bug, it will print 51; otherwise it will print something else.
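To see what the test is checking, the arithmetic can be sketched in Python. This is a simplified model of decimal-mode SBC for valid BCD operands with the carry set (no incoming borrow), not a model of the chip's actual nibble-adjust circuitry:

```python
def bcd_to_int(x):
    """Interpret a byte as binary-coded decimal, e.g. 0x41 -> 41."""
    return (x >> 4) * 10 + (x & 0x0F)

def int_to_bcd(x):
    """Encode a decimal value 0..99 as a BCD byte, e.g. 33 -> 0x33."""
    return ((x // 10) << 4) | (x % 10)

# Correct decimal-mode SBC: $41 - $08 with the carry set.
correct = int_to_bcd((bcd_to_int(0x41) - bcd_to_int(0x08)) % 100)

# Buggy behaviour: plain binary subtraction, decimal mode ignored.
buggy = (0x41 - 0x08) & 0xFF

print(correct, buggy)  # prints: 51 57
```

A correct CPU stores $33, which BASIC prints as 51; a CPU that ignores decimal mode stores the plain binary difference $39, which prints as 57.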

We can then move onto thinking about how to test your 4510.
(Is it just the VIC-III and DMAgic that your C65 is missing?)

Paul.



--
You received this message because you are subscribed to the Google Groups "C65GS Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to c65gs-developm...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Wayne Sander

Nov 18, 2014, 10:08:57 PM11/18/14
to c65gs-de...@googlegroups.com, wa...@sander.com, pa...@servalproject.org
Hi Paul,

The VIC-20 with the 65CE02 chip outputs the number 57.  I ran the program on a PET4008 and a C64 and they both report 51.  I'm confused though.  It looks like the program is subtracting 8 from 65 in decimal, isn't 57 the correct answer?

Wayne

Daniel England

Nov 22, 2014, 7:12:44 PM11/22/14
to c65gs-de...@googlegroups.com, wa...@sander.com, pa...@servalproject.org


On Wednesday, 19 November 2014 13:08:57 UTC+10, Wayne Sander wrote:
Hi Paul,

The VIC-20 with the 65CE02 chip outputs the number 57.  I ran the program on a PET4008 and a C64 and they both report 51.  I'm confused though.  It looks like the program is subtracting 8 from 65 in decimal, isn't 57 the correct answer?

Wayne


Hey Wayne!

If you disassemble  the code you get:

1800    SED       ;Enter decimal mode
1801    LDA #$41  ;Load the accumulator with initial value
1803    SEC       ;Set carry flag (so there is no "borrow" for SBC)
1804    SBC #$08  ;Subtract an immediate value from .A
1806    STA $1C00 ;Store the result
1809    CLD       ;Leave decimal mode
180A    RTS       ;Return (to BASIC)


When you run the BASIC program, it should report the value 51 if everything went well.  The values are a little difficult to understand because of the different base types and modes at play.

Firstly, to understand the value reported, you have to convert it to hexadecimal (base 16): BASIC reports the value in decimal (base 10), and 51 decimal is $33.  Also, you have to remember that in decimal mode the numbers aren't treated as though they are in hex (base 16); each nibble is taken at face value, as a base 10 digit (binary-coded decimal).

So, the program is performing this equation (all numbers in decimal, base 10):

41 - 8 = 33


You can see that this is the correct operation.  #$41 in decimal mode is 41 decimal.  #$33 in decimal mode is 33 decimal.

There is a bug in the 65CE02, and therefore possibly in the 4502 as well.  If you play around with the figures and the value output by the 65CE02 (57), you will see that decimal mode is in a big mess, forgetting to perform conversion in one place and performing inappropriate conversion in another.  It seems it treats the #$41 as a hexadecimal value and then converts the result to hexadecimal before it is written.  So it would be performing the following equation:

65 - 8 = 57


Since we are printing the result as decimal (converting it from hexadecimal), we would see the value 57 (the 65CE02 wrote the value #$39).  So, to my mind, there are actually two bugs at play here.  First, the #$41 value is not treated as a decimal-mode value; second, the resultant decimal-mode value is converted to hex when stored.

It's quite strange.  There must be some mix-up of the signals being fed into the ALU in the 65CE02: perhaps a short, cross-talk, or signals that are simply swapped somehow.  I hope this clears it up a little for you!

I should note that technically, interrupts should be disabled at the start of the test routine and then re-enabled at the end to prevent potential problems.

The information on this page might be helpful:  <http://www.6502.org/tutorials/decimal_mode.html>  It also has some more thorough test routines.

On another topic (to save me creating another thread), you mentioned that you are successfully using MESS as a C65 emulator.  Which version of MESS are you using?  I downloaded the most recent version and it is, at least for me, broken.  There is documentation to support this situation (saying that core features were changed, which resulted in the 6502 and its descendants being broken).  I have used it successfully myself, although that was years ago and with a very old version, now.  Can you tell me where you got your version of MESS from?

Thanks!


Daniel.

Wayne Sander

Nov 22, 2014, 8:34:34 PM11/22/14
to Daniel England, c65gs-de...@googlegroups.com, pa...@servalproject.org
Hi Daniel,

I replied to Paul separately and he cleared everything up for me. The problem was, I used a tiny disassembler that was written in BASIC to look at the code. It didn’t do the conversions to hex, so I saw the program as:

6144 SED
6145 LDA #65
6147 SEC
6148 SDB #08
6150 STA 7168
6153 CLD
6154 RTS

So, in the midst of all these decimal to hex conversions and non-conversions, I got lost. I’m usually not this impaired. Honestly. :)

Thanks for taking the time to respond though.

> Since we are printing the result as decimal (converting it from hexadecimal) we would see the value 57 (the 65CE02 wrote the value #$39). So, to my mind, there are actually two bugs at play here. First, the #$41 value is not treated as a decimal mode value and second, the resultant decimal mode value is converted to hex when stored.

I think they’re just two aspects of the same bug. SBC only works in hex mode. It’s as if the SED never happened.

> On another topic (to save me creating another thread) you mentioned that you are successfully using MESS as a C65 emulator. Which version of MESS are you using? I downloaded the most recent version and it is, at least for me, broken. There is documentation to support this situation (saying that core features were changed which resulted in the 6502 and descendants being broken). I have used it successfully myself although that was years ago and a very old version, now. Can you tell me where you got your version of MESS from?

I’ve never used MESS so it must be someone else you’re thinking about. However, I do have a 4510 chip. If someone would like to help me build an adaptor to plug it into my VIC-20, C64 or even one of my PETs I’ll gladly play along and run whatever tests you’d like. I’ve got plenty of 84 pin PLCC sockets and a brand new roll of solder.

Wayne

Daniel England

Nov 22, 2014, 9:08:16 PM11/22/14
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org


On Sunday, 23 November 2014 11:34:34 UTC+10, Wayne Sander wrote:
Hi Daniel,

I replied to Paul separately and he cleared everything up for me.  The problem was, I used a tiny disassembler that was written in BASIC to look at the code.  It didn’t do the conversions to hex, so I saw the program as:

6144 SED
6145 LDA #65
6147 SEC
6148 SDB #08
6150 STA 7168
6153 CLD
6154 RTS

So, in the midst of all these decimal to hex conversions and non-conversions, I got lost.  I’m usually not this impaired.  Honestly.  :)

Hehe...  SBC not SDB...  :)

I love it when I have those slightly skewed moments, too...  Hmm...

 
I think they’re just two aspects of the same bug.  SBC only works in hex mode.  It’s as if the SED never happened.

Actually, yes, you could be right.  Maybe I overcomplicated the problem.  I do that sometimes.  I'd like to know what the value would be if it were SBC #$10 or something.  The circuitry is apparently there but doesn't function correctly?  If it is being used, then there would have to be multiple problems; still, it may not be being used at all.  Would you be able to perform this variation of the test?

It's very strange, because CSG patented the decimal mode circuitry in the 6502 in the first place.  Oh Commodore...  *sigh*

 
I’ve never used MESS so it must be someone else you’re thinking about.  However, I do have a 4510 chip.  If someone would like to help me build an adaptor to plug it into my VIC-20, C64 or even one of my PETs I’ll gladly play along and run whatever tests you’d like.  I’ve got plenty of 84 pin PLCC sockets and a brand new roll of solder.

Oh yes.  Sorry about that.  I read two posts around the same time, one of which was from you, and got confused about who wrote them.  I'll contact the correct person now.

I'd imagine that the 4510 won't work in the C64 because of the IO ports at $0000/$0001 and the PLA dependency.  It will probably work in the VIC-20?  I read someone say that the 65CE02 is pin compatible with a 65C02, but the 4510 has a lot more going on and I don't know enough to say how to make it work in the VIC-20.  It could be a problem if the 4510 relies on the operation of its internal 6526/8520s, and that's all I know.  I could read, speculate and experiment, but with a part that rare and valuable you'd want to know for sure that it will be okay.  Speaking of which, are you sure your 4510 is functional?  It's a shame you don't have the other ICs in your C65.

It's very interesting that this hasn't been found before.  If the problem still exists in the 4510, then even Commodore hadn't tested its 65CE02 very well.  Fascinating.


Daniel.

Wayne Sander

Nov 30, 2014, 1:32:17 AM11/30/14
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Hi Daniel,



On Saturday, November 22, 2014 7:08:16 PM UTC-7, Daniel England wrote:


On Sunday, 23 November 2014 11:34:34 UTC+10, Wayne Sander wrote:
Hi Daniel,

I replied to Paul separately and he cleared everything up for me.  The problem was, I used a tiny disassembler that was written in BASIC to look at the code.  It didn’t do the conversions to hex, so I saw the program as:

6144 SED
6145 LDA #65
6147 SEC
6148 SDB #08
6150 STA 7168
6153 CLD
6154 RTS

So, in the midst of all these decimal to hex conversions and non-conversions, I got lost.  I’m usually not this impaired.  Honestly.  :)

Hehe...  SBC not SDB...  :)

I love it when I have those slightly skewed moments, too...  Hmm...

Argh.  :)
 
I'd imagine that the 4510 won't work in the C64 because of the IO ports at $0000/$0001 and the PLA dependency.  It will probably work in the VIC20?  I read someone say that the 65CE02 is pin compatible with a 65C02 but the 4510 has a lot more going on and I don't know enough to say anything about how to make it work in the VIC20.  It could be a problem if the 4510 relies on the operation of its internal 6526/8520s and that's all I know.  I could read, speculate and experiment but with a part that rare and valuable you'd want to know for sure that it will be okay.  Speaking of which, are you sure your 4510 is functional?  Its a shame you don't have the other ICs in your C65.

Its very interesting stuff, that this hasn't been found before.  If the problem still exists in the 4510 then even Commodore hadn't tested its 65CE02 very well.  Fascinating.

I just spent the last several days studying and modifying some boards to test my 4510 in my VIC-20 (these are reproductions of the C65 motherboard and C65 widget board, not the real thing).  It was just easier to use what I had lying around than to build something else.  Here's a photo of the setup.  Unfortunately, I cannot get it to work, so someone else is going to have to test the 4510 decimal mode.  I really have no idea why it's not working.  I hope my chip isn't dead, but I have a sneaking suspicion it might be.  It would also explain why Commodore cannibalized parts from it before it made its way into my hands.

Wayne
VIC4510.jpg

Daniel England

Nov 30, 2014, 4:08:35 AM11/30/14
to c65gs-de...@googlegroups.com
Wayne,

That's a very pretty looking VIC20.

Shame you couldn't get it to work.  How were you powering the board?  I wonder if you can use that widget board to detect some kind of life in the 4510 if you just try powering it up on its own?

Would you be able to run that test again as you did the first time, but make it do an SBC #$10 for me (replacing the 8 in the DATA with 16)?  Just out of curiosity...

All I can add is that the emulation in MESS reports the correct value, but that's probably of no consequence.  Hmm...  I thought I had better check the source code for MESS in case it knows about the issue, and it doesn't: the 65CE02 emulation in MESS knows nothing of how decimal mode is broken and just performs exactly the same thing as the 6502 does.

IMHO, the correct thing to do for Paul's 65GS02 is to implement the correct functionality.  Since decimal mode is/was known to be broken in the 65CE02, anything expecting to run on it shouldn't be using decimal mode, so it should be okay to do it properly in the 65GS02.


Daniel.

Wayne Sander

Dec 1, 2014, 11:04:58 AM12/1/14
to c65gs-de...@googlegroups.com
Hi Daniel,


That's a very pretty looking VIC20.

Thanks, I picked it up on eBay for almost nothing.  It was really grimy but it cleaned up very nicely.  It's my first VIC.  My parents almost bought me one in the summer of 1982, but a neighbor told me to hold out for the C64.  I got a C64 in October 1982.  I still have it, actually.
 
Shame you couldn't get it to work.  How were you powering the board?  I wonder if you can use that widget board to detect some kind of life in the 4510 if you just try powering it up on its own?

I was powering the board from the VIC-20.  The correct voltages and signals were going to the correct pins as far as I could tell.  The 4510 doesn't output a PH2 signal like the 6502 does, but I tested the VIC-20 by feeding it PH0 instead and it worked fine.  PH0 and PH2 are almost in sync and the timing in my VIC-20 doesn't seem to be that picky.  At the very least I should have gotten a scrambled startup screen using a bad clock, but I got nothing.  I'm still trying to figure out what went wrong.

Would you be able to run that test again as you did the first time but make it so that it does a SBC $10 for me (replacing the 8 in the data to 16?)?  Just out of curiosity...

Subtracting 16 from 65 equals 49.  However, it's an inconclusive test: 65-16=49, but also $41-$10=$31, which prints as 49.  The results are identical, which might be why no one ever caught this before.  There's no borrow between the nibbles, so both modes work out the same.  If you've got another example I'd be happy to run it.
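Wayne's point can be checked mechanically. A rough sketch (mine, with `needs_no_borrow` a made-up helper name): binary and BCD subtraction of valid BCD bytes produce the same result byte exactly when neither nibble has to borrow, which is why $41-$10 cannot expose the bug:

```python
def needs_no_borrow(a, b):
    """True when subtracting BCD byte b from BCD byte a borrows in
    neither nibble, i.e. when the plain binary difference already
    equals the BCD difference."""
    return (a & 0x0F) >= (b & 0x0F) and (a >> 4) >= (b >> 4)

# $41 - $10: no borrow in either nibble, so both modes give $31 (49).
print(needs_no_borrow(0x41, 0x10))  # True  -> test is inconclusive

# $41 - $08: the low nibble must borrow, so the two modes diverge.
print(needs_no_borrow(0x41, 0x08))  # False -> this case exposes the bug
```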

IMHO, the correct thing to do for Paul's 65GS02 is to implement the correct functionality.  Since it is/was known to be broken in the 65CE02, anything expecting to be run on it shouldn't be using it so it should be okay to do it properly in the 65GS02.

 I agree.  Paul's vision is the future and shouldn't be constrained by the past.  I'm very excited by this project.

Wayne

Paul Gardner-Stephen

Dec 1, 2014, 6:58:48 PM12/1/14
to Wayne Sander, c65gs-de...@googlegroups.com
Hello,

Sorry to hear your 4510 doesn't seem to be working.  If it proves really dead, I'll have to make you up an FPGA-based replacement.

As for the GS4510 in the C65GS, at present it implements the 6502 SBC decimal mode functionality; however, I am planning to add a flag in the hypervisor to allow switching between 65CE02 and 6502 behaviour, so that both C64 and C65 compatibility can be upheld.

If someone can fully characterise SBC in decimal mode on a 65CE02, that would be extremely helpful at this point.

Paul.


Daniel England

Dec 1, 2014, 11:49:17 PM12/1/14
to c65gs-de...@googlegroups.com
Heyas!

Attached is an archive containing supposedly comprehensive decimal mode testing routines and BASIC program test harnesses for the C64 and VIC20.

I wrote the programs based on the information at: <http://www.6502.org/tutorials/decimal_mode.html>.  See the readme.txt file for more info.

I modified the final test routine a little as noted in the readme and wrote the BASIC test harnesses.  I have tested them on VICE but I haven't been able to test failure.  All of the source code is included.  Feel free to modify as you desire!  

The routines can be modified to test 6502, 65c02 or 65816 specific behaviours but they only test for full 6502 compliance (including undocumented behaviours) as compiled.

The programs use almost all of the user program area on an unexpanded VIC20!  *Whew*  They follow the VICE testing guidelines in that they change the border colour to red upon failure.

I hope this helps!


Daniel.
6502TestDMode.zip

Wayne Sander

Dec 3, 2014, 12:49:23 AM12/3/14
to c65gs-de...@googlegroups.com
Hi Daniel,

Here's a screenshot of my VIC-20 with 65CE02 running your program.  Looks like it failed (but we kind of knew that already). :)

Wayne
Decimal Test.JPG

Daniel England

Dec 3, 2014, 1:17:20 AM12/3/14
to c65gs-de...@googlegroups.com
Wayne,

Hey, cool.  You typed my program in and ran it?

Do you have a 1541 connected to your VIC20?  Do you have some way of transferring image files from your PC to your VIC20?  How much memory does your VIC20 have?

I ask because I'm wondering how it would be best to deliver programs for your use.

I have been thinking about how to improve the program.  I think it should ask which of the three "compliance levels" (6502, 65c02 or 65816) to test and patch the JSR address before executing.  Also, I think Paul wants a complete characterisation which pretty much means we need to store the results for every operation.  That would take a lot of memory and we'd need to save the results to disk too, probably.

If there is some ability to transfer image files (rather than have you type the programs in) then I don't need to have the BASIC program install the routine, I can just attach it as binary data to the end.

I hadn't actually programmed on the VIC20 before.  This is so exciting, and it's so cool to see someone else run one of my programs on a real machine!

Thank you!


Daniel.

Wayne Sander

Dec 3, 2014, 1:53:30 AM12/3/14
to c65gs-de...@googlegroups.com
Hi Daniel,

Actually I copied the D64 to my 1541U2 cartridge, plugged that into my C64 and plugged the serial cable into the VIC-20.  I used the C64 to mount the disk image and the VIC-20 thought it was a regular 1541.  I probably could have used my old SD2IEC thing but I have no idea where it is.  I lost track after I got the 1541U2.

My VIC-20 is unexpanded.  I picked up a 24k expansion board for free, but only 500 bytes of it register, and I'm not convinced that it's stable.  It's a project for down the road; I have to hunt down some new SRAMs for it.

However, I do have plenty of 8k EPROMs, a UV eraser, a programmer and a few old 8k cartridge boards.

So I guess you can send whatever you like.  If it's a program, it'll have to fit in the unexpanded memory space.  If you want to write something that I can burn on an EPROM, you've got more space to work with.

Wayne

Daniel England

Dec 3, 2014, 2:07:32 AM12/3/14
to c65gs-de...@googlegroups.com
Wayne,

Very clever way of using the 1541U2.  They are so neat.  I wish I had one.

Hmm...  Well, I don't think I'll worry about the EPROMs.   I don't know that I could really test it properly before I give it to you.

It's a shame your expansion board is broken.

On the other hand, you do have a way of transferring data to and from your PC with the 1541U2.  That will allow me to store the results to the disk and we can then get them back again.  Instead of writing the data to memory, I can just write it to disk.  You have VICE as well (for its c1541 program)?

I think I know what to do for the next version.  Unfortunately, to make it easier on myself, the way I plan to do it will probably make the total run time go out to more like an hour rather than a minute for the full test.  Is that all right?


Daniel.

Daniel England

Dec 3, 2014, 2:10:42 AM12/3/14
to c65gs-de...@googlegroups.com
Oh!  I forgot to ask.  What sort of PC do you have?  IBM compatible Windows, Linux or Apple Mac something?


Daniel.

Daniel England

Dec 3, 2014, 7:55:31 AM12/3/14
to c65gs-de...@googlegroups.com
Hmm...

Whoops!  I need about 2MB of storage to store a comprehensive break-down of the results.

I've rewritten the BASIC program and redesigned the ASM code so that you can choose which version to test (6502, 65c02 or 65816), and also made it so that every combination is tested and can be analysed even if there is an error.  Basically, I've made the ASM code into something like a driver and loop through the individual tests with the BASIC program.

I've made it so that all of the test data is written to an output file as the tests are done.  I just wrote it as quickly as possible so currently the format is CSV.  However, that means I can only run the first 300 or so tests (out of 131072).  Unfortunately, even if I write the data as a binary stream, I'm still going to need much more space than is available.

I could possibly output only the result and combination data, but I would still need to pack the data into 18-bit packets and would require something like 160KB in total.  I don't really want to do buffering and disk writing in the ASM code; the strange alignment is a pain.  I could potentially write the data as separate streams, but I think we're still going to have a problem with block sizes.
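For reference, the 131072 figure appears to fall out of simple counting: one SBC test per combination of accumulator value, operand value and incoming carry state. A back-of-envelope sketch (my reading of the figures, not code from the thread):

```python
# One SBC test per combination of inputs:
accumulator_values = 256  # every possible byte in .A
operand_values = 256      # every possible immediate operand
carry_states = 2          # carry set or clear

tests = accumulator_values * operand_values * carry_states
print(tests)  # prints: 131072

# Even storing just one result byte per test needs 128 KB,
# far beyond what an unexpanded VIC20 can hold in RAM:
print(tests // 1024)  # prints: 128  (KB at one byte per result)
```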

A potential solution is to prompt the user to change disks as required.  I could add this quite easily.  Given that we have a virtually unlimited supply (because they are virtual), I think this might be the best option so long as having about 15 disks of result data is acceptable.  The only problem is that the program is then going to require hand-holding and it will take quite some time to run.

I guess I need to know from you Paul how comprehensive you need this test to be and how much of the available resultant data you need.  Also, I need to know from you Wayne, how committed you'd be to getting those results.

I won't post the new code right now unless you guys want it.  I'll wait to hear back.


Daniel.

Wayne Sander

Dec 3, 2014, 10:48:40 PM12/3/14
to c65gs-de...@googlegroups.com
Hi Daniel,

My preferred machine is a MacBook Pro running OS X Mavericks.  I do have a Windows 8 laptop as well.  And I have VICE on my Mac.

I'm game to run whatever you want me to run.  If it takes a few hours and a few dozen disk swaps, that's fine.

Wayne

Daniel England

Dec 4, 2014, 4:19:50 AM12/4/14
to c65gs-de...@googlegroups.com
Argh!!

I'm rewriting the program again to allow not only compliance level selection but also test type selection (specific case, comprehensive, standard compliance).  Also, I'm adding in the disk swapping logic and streaming the full data as binary.  I think it's the only way to get the information required; otherwise we won't learn anything of any interest, so I went ahead with it.  The data will require 30 disks to store it all, though, according to my calculations.

Unfortunately, the BASIC code blew out to a (not overly, but not sparsely, packed) 180 lines of code.  With the ASM routines as well, that's almost 3 times as much memory as I actually have to work with.  For a laugh, I loaded it in anyway and was highly amused by the screen turning to garbage and the machine behaving very strangely afterwards.

I broke the BASIC code into two programs, one for the init and menu and another for the main routines and the ASM driver, and compressed them a bit so they fit in memory fine.  I was testing them and working out some bugs, and got to the point where it looked like it would work and...

? ILLEGAL QUANTITY ERROR
 
Oh dear...  I was sure that variables weren't supposed to change after a LOAD.  Well...  that's what I remember from my C64, anyway.  I did some research: the VIC20 BASIC stores its variables after the end of the code, so if you load in a program that's bigger than the last one, the variables get clobbered.  Maybe that happens on a C64 too; it's just so long ago that I can't recall.

After thinking about it, I'm not sure if the VIC20 BASIC is going to start clobbering my appended ASM routines, as well.  It didn't seem to from the tests I did yesterday but I should make sure.

Anyway, I'm a bit frustrated now.  I think I'll experiment with forcing the VIC20 BASIC to do something sensible by changing the pointers.  I could transfer only the most essential data over in the ZP (but it seems even more clogged than on the C64) or the cassette buffer (not sure if it's used for disk loading), but having all of the initialisation in the main code, as well as the memory required for the data, is effectively wasted space which I can ill afford.

I'll have a play with it tomorrow and see what I come up with, unless someone can give me some advice.


Daniel.

PS:  Sorry Wayne, I seem to have tried to post at about the same time you did and it made my session go very strange.  I was going to write a program to recompose the data into something readable and give it to you to run as well, but I think it might be easier if you just give me the resultant data and I reformat it.  I might have something for you tomorrow.  It really will take quite a while to process.

Daniel England

Dec 5, 2014, 1:32:21 AM12/5/14
to c65gs-de...@googlegroups.com
*SIGH*

Oops, have to turn off caps lock!

I've bitten the bullet and I'm rewriting the program (again...) entirely in ASM this time.

I'll post the program and source when I'm done.


Daniel.

Wayne Sander

Dec 5, 2014, 12:07:41 PM12/5/14
to Daniel England, c65gs-de...@googlegroups.com
Hi Daniel,

I just purchased an 8k RAM cartridge on eBay.  It was cheaper than hunting down SRAMs.  So if you can wait a week until it arrives, you won’t have to rewrite anything.

Wayne



Daniel England

Dec 6, 2014, 12:40:17 AM12/6/14
to c65gs-de...@googlegroups.com
Wayne,

*lol*

I've almost finished writing the ASM-only version.  Actually, it's been a really good experience.  It's been a long, long time since I've done any assembly programming of any magnitude, and I'm being much more critical of efficiency in terms of instructions and memory use, as well as program structure.  I can see how things I would normally do in a higher-level language really are inefficient.  Guaranteed execution paths are so much easier to see and realise in assembly, too.

Anyway, I was just popping by to give a quick update of my progress.  For the core functionality, I have only the disk writing left to implement.  Everything else is finished and pretty thoroughly tested.  My main problems have been strange regressions and not thinking about things in the most simple terms.

I'm really happy with the overall design, structure and utility of the program and the presentation is very nice.  I've made sure to comment the code as much as I can right now and I've tried to optimise the execution speed as much as possible without spending hours on it specifically.

So, the program can do a standard compliance test now.  It is, though, rather slow.  If I turn off the user interface updates during the testing, the speed could be improved significantly, I think.  I'll make it an option if it's required, but seeing the progress might be important enough to keep it.  A BASIC version of the program would be even slower, so I think it's a good thing that I committed to the ASM version in the end.

If we need the "run a specific test case" functionality, I'll need to implement getting the input from the user.  I've had to be careful about not using decimal mode, which is routinely used when handling decimal number input and output.  It currently displays hex data for the progress updates, and I think I will need to get the user to input hex numbers, too.

When it's finished, it should even have a whole page (256 bytes) of memory free on the unexpanded VIC-20!  I might need you to do a small "acceptance test" on the program before we can put it into real use.

I should post back in two hours or so.


Daniel.


PS:  I decided that it's really too exciting not to let you have a look at this in-progress version, so I've attached it.  Hehe!  :)

TestDMode_vic20_asm_beta.zip

Daniel England

Dec 6, 2014, 4:31:43 AM12/6/14
to c65gs-de...@googlegroups.com
*oOPS*

Ooh...

I had some other commitments I'd forgotten about so I'm getting back to the program now.

I want to add a splash screen but just a few lines of text and a routine to show them consumes a page and a half of memory!  *Eeii*  I'm having to look at some size optimisations.


Daniel.

Paul Gardner-Stephen

Dec 6, 2014, 5:20:44 AM12/6/14
to Daniel England, c65gs-de...@googlegroups.com
Hello all,

Great to see the progress going on here.

In terms of the data that is useful to me: a list of the inputs that produce differing results between the 65CE02 and 6502, so that I can confirm that it is only the decimal mode behaviour that differs between them.  I can then (when I am back in the right country) look at implementing the behaviour switch in the hypervisor.

Paul.


Daniel England

Dec 6, 2014, 12:52:43 PM12/6/14
to c65gs-de...@googlegroups.com
Paul,

Hmm...  I'm a little concerned that these tests we're running aren't going to tell us much in the end, because we'll end up seeing a whole string of failures and little else.  Still, it will be interesting to see for whatever it is...  There is a problem, though: the tests seem to assume some kind of consistent behaviour, be it correct or incorrect.  They don't really compute a result for every possible function, and just target cases of interest.  They do exhaustively test the undocumented behaviour of the 6502, though.  I think we may have to tailor the tests later for more in-depth results.  The work here will pave the way, I guess.  The resultant data sets are huge and the testing process takes a long time.  Having a stable test harness that we're used to using will help in the future.  I'm really quite proud of the program structure and presentation style guidelines.

As an update, I have finished the core functionality, and tested it, too.  Gosh, it's surprising how long tracking down the cause of a problem takes when you've accidentally typed "CLD" instead of "CLC"...  It's taken me a lot longer than I anticipated, because I've had to refactor the code a lot to reduce its size, which meant I had to test and debug it all from the start.  Also, testing the disk handling routines took a very, very long time because of how slow it is.  Even at 12 times the speed, it's sloooooowwwwww...  And I had to run it over and over and over...  Oh boy...

Presently, it will take 31 disks, and I predict about 20 minutes or so per disk on a real machine.  That's about 10 hours for the entire test set.  I'm seeing what I can do to improve this situation, but the main time consumption is in writing to the disk.  The 1541U2 might respond faster than a 1541...  With all of the "like a real machine" settings on in VICE, I notice that there are often long delays while the emulated drive flushes its buffers.

As it stands, I have exactly 8 bytes of memory free, but I haven't done the result report screen, nor am I doing any error checking on the disk activity.  Input for specific case testing is also unimplemented.

The code that creates the in-progress statistics is a big target for me to change.  The code itself is large and it's also very slow.  I want to replace it with a graph.  I've figured out a way to do it that should be fast and should also have a small memory footprint.  It should give the user a nice fuzzy feeling, to boot.  Ha!

Also, I'm too used to using flow control with blocks and scope and single exit points.  I've done a few things in the code that could be optimised by using RTS more judiciously.  I need to try to use the stack more "efficiently" - as in much more than I am - because it's really just a waste of a potential 128 bytes of buffer memory.  I have to review how I'm using data tables for lookups.  I wish I could use the ZP, but the KERNAL and BASIC consume it all, and I'd need to know how they use it more intimately in order to start re-purposing it.  The other problem is of course that you need to have the data in your user area before you can put it somewhere else, anyway.  I could use a loader and compression schemes...  I do want a readme...  It's all tricky with such a small amount of RAM to work with.

The other way to speed things up would be to use better disk control logic, but with 8 bytes free right now, I have no idea how I'd open up the memory to have a "speed loader" (writer) present as well.  I don't even know of any for the VIC-20.  I suppose it might be possible to wedge it into the SP.  The other thing is that I'm not sure how the 1541U2 will respond to it, especially when also connected to the C64.  I'm also only writing data in 19-byte "packets".  I'll need to investigate what's possible.

Wayne says he has bought an expansion RAM cart.  Oh boy, the things I could do with that 8KB!  I was so depressed to see how just having my name and a splash screen chewed up a whole page.  190-something bytes!  Crazy.  I'm too used to typing a lot.  Perhaps we do need that expansion...

Anyway, I'm going to try to get this program finished "tonight"...  I can't believe it's 3:30am already.  Err...  almost 4.  Proofreading takes so long.


Daniel.

Daniel England

Dec 6, 2014, 7:36:42 PM
to c65gs-de...@googlegroups.com
*Whoops* I had to delete that post 'cause I forgot a source file.

Original post as follows:


*Ack*

Well, I've nearly killed myself, but I was determined to get a test release done today that was stable enough for use.  Here it is!

I don't anticipate any problems and I think it should be fine to be put into service.  There are only 12 bytes or so free, so I'm not quite able to do things as nicely as I'd like just at the moment, but I don't think it's an issue for the time being.  More info in the readme.

Let me know what you think and how it goes.

I love the "throbber" driver.  So neat.  The source is provided.  I really need some help with optimisation.  Another 48 bytes would allow me to do better result reporting.  Maybe another 64 on top of that and I could do the specific case functionality.  I'm considering dividing the program into a splash-and-readme launcher and the program proper.  The splash screen uses so much memory...

Anyway, have fun!  I'm going to bed.


Daniel.
6502TestDModeV2beta.zip

Daniel England

Dec 7, 2014, 4:48:01 PM
to c65gs-de...@googlegroups.com
Oh boy!  I'm so excited!

I wanted to create a kind of standard or framework for doing any tests if I was going to put the effort into writing something like the DMode test in ASM so it could be reused.  

Something Paul said, maybe a suggestion, about wanting to do more testing made me think.  I was thinking I was probably going to have to split off the splash screen and I want a readme, too.  I was probably going to have to write a loader...

Then I thought, "what if I can make the test app a real modular framework?"

Since I was going to have to make a front-end loader and possibly a help text viewer app and since I wanted them all to be based on the same framework, why not try to make it so that the front-end actually installs the services and provides a basic library of routines for use by modules that it can load and unload?

I wasn't sure it was really going to be feasible given the amount of memory, but I've been working on it today and so far it's looking great.  I call it the "Test Centre".  I have put together the core and a preliminary version of the DMode Test as a module (I call them "TMO"s - Test Module Objects) and so far there is plenty of memory!

I have to finish it up and test the module loading but its looking really promising.  I will "embed" an "About" TMO with the core and once it is cleared the core will detect that it needs to load the "Menu" TMO.  That should allow you to browse the other available TMOs and launch one or view the help texts.  Once a TMO exits back to the core, the Menu can be reloaded or the application terminated if signalled by the TMO.

Like I say, so far a real-world test case should be possible, so it's looking promising.  I've been careful to make sure that the source is clean, too.  It's all very elegant, really!

I hope it works out and you guys like it!  I'll let you know how it goes.


Daniel.

Daniel England

Dec 7, 2014, 11:36:15 PM
to c65gs-de...@googlegroups.com
Hmm...  A bit of a disappointment.  I was investigating how reasonable it would be to use string compression since the strings consume so much memory.

I came up with a fairly standard system of 6-bit characters with 2 bits for flags (word substitution and control function flags).  I use a "dictionary" to hold lists of pointers, one list for words and another for the strings.  These pointer lists are probably the most inefficient part.

Unfortunately, it doesn't seem to be worthwhile.  Because I have cut back on the verbosity (you just can't afford to remind the user of something they ought to know), the number of bytes saved isn't significant versus the code required to do the decompression.  Some things would actually become a little simpler (referring to a string by dictionary pointer and index) but on the whole it's not practical.

So far the compression would only save me some 65 bytes when the DMode test module is loaded.  I need a fair bit more than that for the decompression code.  

Hmm...  Looking at it again, if I change the dictionary to use lists of lengths then I save about 150 bytes instead.  It's starting to be interesting, but probably still not quite worth it.

I guess I could investigate a brute force, bit-wise compression but I doubt it would really help, either.  Doing it in a way that would allow me to reconstruct arbitrary string reference tables wouldn't be feasible, I'd think.

I thought it might still be worth it if I could support relocation of the string data when the module is loaded, but I'd need about 1.5 to 2 pages (~384 to 512 bytes) of memory and it isn't available, and the core probably won't ever utilise any kind of heap allocation logic.

The string compression feature might be worth reinvestigating more seriously if we want to make use of the expansion RAM and make the modules more verbose.  I guess now that I've re-thought the pointer lists (I don't know why I did it that way except for the fact I'd already been using pointer tables) I should double check because it might be close.


Daniel.

Paul Gardner-Stephen

Dec 8, 2014, 4:56:10 AM
to Daniel England, c65gs-de...@googlegroups.com
Hello,

Sounds like you have been busy!

For string compression, take a look at SMAZ as an example of a very simple but effective string compressor.  Also consider something as simple as using a nybble to encode each character if it is among the 15 most common characters, and otherwise indicating that a whole byte follows.  This should get you pretty close to 50% savings for common text, with a very small routine for decompression.
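As a sketch of the idea (Python for clarity; the real decompressor would be a short 6502 loop, and the 15-entry table here is made up — in practice it would be built from the actual character frequencies of the string set):

```python
# Nybble-packing sketch: values $1..$F index the 15 most common characters;
# $0 escapes a raw byte stored as two further nybbles.  COMMON is a
# hypothetical table; \r doubles as the string terminator, so strings
# must not themselves contain \r.
COMMON = b"\rETAOINSHRDLCU "  # 15 entries, indexed by nybble value - 1

def encode(text: bytes) -> bytes:
    nybs = []
    for ch in text + b"\r":                       # terminate with \r
        i = COMMON.find(ch)
        if i >= 0:
            nybs.append(i + 1)                    # common char: one nybble
        else:
            nybs.extend((0, ch >> 4, ch & 0x0F))  # $0 escape + raw byte
    if len(nybs) % 2:
        nybs.append(0)                            # pad; never read past \r
    return bytes((hi << 4) | lo for hi, lo in zip(nybs[::2], nybs[1::2]))

def decode(packed: bytes) -> bytes:
    nybs = [n for b in packed for n in (b >> 4, b & 0x0F)]
    out, i = bytearray(), 0
    while True:
        n, i = nybs[i], i + 1
        if n == 0:                                # escaped literal byte
            ch, i = (nybs[i] << 4) | nybs[i + 1], i + 2
        else:
            ch = COMMON[n - 1]
        if ch == 0x0D:                            # terminator reached
            return bytes(out)
        out.append(ch)
```

The decoder needs no dictionary beyond the 15-byte table, which is why the 6502 version can stay so small.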

Back to the tests, I think the only results we need to log are those that differ to what is expected for a 6502 or our understanding of the 65CE02 -- i.e., only log remarkable data.

Paul.


Daniel England

Dec 8, 2014, 8:33:28 PM
to c65gs-de...@googlegroups.com, pa...@servalproject.org
Paul,

> Sounds like you have been busy!

A little bit!  I'm able to allocate some time to it at the moment so I thought I'd throw myself into it.
 
 
> For string compression, take a look at SMAZ as an example of a very simple but effective string compressor.  Also consider something as simple as using a nybl to encode each character if it is in the most common 15 characters, else indicates that a whole byte follows.  This should get you pretty close to 50% savings for common text, and with a very small routine for decompression.

Okay, I'll have a look.

I originally did want to use nybble-wise compression for the strings because it was the most logical thing to do, but I would have needed another program (probably a PC-side one) to generate the tables.  I decided I wanted to use something that would be easier to do by hand.  The only problem is that the method I use won't achieve real results unless the string data is large enough with enough unique words, or there is enough memory to have a large default lexicon.  Neither of these situations will occur.  I think I need to invest some time in building a string table generation tool.

I actually did start writing such a tool during my initial investigations.  It was really tough going back to high-level, PC-side coding after the 8-bit assembly stint.  It made me realise just how much high-level languages lack a "real respect" for the complexity of the things you are writing.  The fact that an embedded icon file in your program uses more memory than is available to a 6502 CPU is also depressing...  What has the world come to?  I was eager to get back into the assembly! :)

 
> Back to the tests, I think the only results we need to log are those that differ to what is expected for a 6502 or our understanding of the 65CE02 -- i.e., only log remarkable data.

It is a good idea.  I will add the ability to write only the failed test results to disk, but it will be most of them anyway.  I'll make it an option so we can still generate a baseline dataset, but that would probably best be generated on an emulator.
 

Thanks for the encouragement!


Daniel.

Daniel England

Dec 9, 2014, 9:37:01 AM
to c65gs-de...@googlegroups.com
Hey all!

I thought you might want an update about my string compression.

SMAZ was definitely out of the question because it requires a large, always-in-memory "lexicon".  I decided to implement my own nybble compression scheme, which I've called "NybMP".  It's probably been done before, but oh well.

I've created a tool that will accept string data using a "BasText" method that I've extended to include all of the significant characters.  It can then generate a "dictionary" of the compressed strings.  I'll include an example of the output below.

It's an interesting problem because of the small amount of text and because you tend to keep the strings to "only the most important information", which means that they are relatively dissimilar in content.

After reviewing the output, I think the overhead is a little high and the misses too many.  You can see that the ratio is not great.  I think I will try allowing for up to two "lexicons" (if the first nybble is $0 then get the next nybble; if that is not $0 then use the second lexicon, otherwise get the next...).  I'm wondering how many unique characters we will ever expect to have...  I could try making the number of lexicons flexible and chosen to best suit the situation (misses and overhead).  I don't think it would add too much to the code if I'm clever enough.

Anyway, the amount of memory saved in the strings themselves is not staggering but it is probably enough.  In addition to this, I am able to implement some other things more efficiently so there are other savings on top of it.

The "misses" value is only the number of unique characters not included in the lexicon.  I should probably change that to be the total number of characters stored as whole bytes.  The "overhead" measurement rates the total number of extra nybbles needed for whole-byte characters as a percentage of the total data size.  I should include alignment nybbles in this, and then we will really see the dilemma.

The strange justification of the data is to break it up by the record structure (given below) and by individual strings, and to keep the line lengths reasonable.

Here is the output:


;===============================================================================
;Data for string dictionary: DMDTST
;===============================================================================
;Generated on: 09DEC2014 @ 23:51:13
;
;-------------------------------------------------------------------------------
;Original text strings
;-------------------------------------------------------------------------------
;   $00:  "{CLEAR}{CYAN}DECIMAL MODE TEST{RETURN}"
;   $01:  "{CYAN}6502{RETURN}"
;   $02:  "{CYAN}65C02{RETURN}"
;   $03:  "{CYAN}65816{RETURN}"
;   $04:  "{CYAN}COMPLIANCE{RETURN}"
;   $05:  "{CYAN}COMPREHENSIVE{RETURN}"
;   $06:  "{CYAN}SPECIFIC CASE{RETURN}"
;   $07:  "{BLUE} {RETURN}"
;   $08:  "{BLUE}LEVEL SELECT: {RETURN}"
;   $09:  "{BLUE}  <{CYAN}1{BLUE}> - 6502{RETURN}"
;   $0A:  "{BLUE}  <{CYAN}2{BLUE}> - 65C02{RETURN}"
;   $0B:  "{BLUE}  <{CYAN}3{BLUE}> - 65816{RETURN}"
;   $0C:  "{BLUE}TYPE SELECT: {RETURN}"
;   $0D:  "{BLUE}  <{CYAN}1{BLUE}> - COMPLIANCE{RETURN}"
;   $0E:  "{BLUE}  <{CYAN}2{BLUE}> - COMPREHENSIVE{RETURN}"
;   $0F:  "{BLUE}  <{CYAN}3{BLUE}> - SPECIFIC CASE{RETURN}"
;   $10:  "{BLUE}  <{CYAN}B{BLUE}> - BACK{RETURN}"
;   $11:  "{BLUE}  <{CYAN}Q{BLUE}> - QUIT{RETURN}"
;   $12:  "{BLACK}INSERT DISK #"
;   $13:  "S0:TESTDMODE"
;
;-------------------------------------------------------------------------------
;Compression lexicon
;-------------------------------------------------------------------------------
;   $ 1:  {SPACE}         43
;   $ 2:  E               25
;   $ 3:  {BLUE}          19
;   $ 4:  C               18
;   $ 5:  {RETURN}        18
;   $ 6:  {CYAN}          15
;   $ 7:  S               13
;   $ 8:  I               12
;   $ 9:  T               9
;   $ A:  -               8
;   $ B:  6               8
;   $ C:  <               8
;   $ D:  >               8
;   $ E:  L               7
;   $ F:  M               7
;
;-------------------------------------------------------------------------------
;Summary
;-------------------------------------------------------------------------------
;Used: 15, misses: 24, overhead: 38.31%
;Original size: 316, compressed size: 268, ratio: 16.22%
;
;-------------------------------------------------------------------------------
;Dictionary data
;-------------------------------------------------------------------------------
DMDTST WORD    DMDTST + $25
       BYTE    $20,$45,$1F,$43,$0D,$9F,$53,$49,$54,$2D,$36,$3C,$3E,$4C,$4D
       BYTE    $0F,$15,$1C,$23,$2D,$3B,$46,$48,$52,$5E,$6B,$78,$83,$93,$A7,$B8
       BYTE    $C4,$CF,$DC,$E7
       BYTE    $09,$36,$04,$42,$48,$F0,$41,$E1,$F0,$4F,$04,$42,$19,$27,$95
       BYTE    $6B,$03,$50,$30,$03,$25
       BYTE    $6B,$03,$54,$03,$00,$32,$50
       BYTE    $6B,$03,$50,$38,$03,$1B,$50
       BYTE    $64,$04,$FF,$05,$0E,$80,$41,$04,$E4,$25
       BYTE    $64,$04,$FF,$05,$00,$52,$20,$48,$20,$4E,$78,$05,$62,$50
       BYTE    $67,$05,$02,$48,$04,$68,$41,$40,$41,$72,$50
       BYTE    $31,$50
       BYTE    $3E,$20,$56,$2E,$17,$2E,$24,$90,$3A,$15
       BYTE    $31,$1C,$60,$31,$3D,$1A,$1B,$03,$50,$30,$03,$25
       BYTE    $31,$1C,$60,$32,$3D,$1A,$1B,$03,$54,$03,$00,$32,$50
       BYTE    $31,$1C,$60,$33,$3D,$1A,$1B,$03,$50,$38,$03,$1B,$50
       BYTE    $39,$05,$90,$50,$21,$72,$E2,$49,$03,$A1,$50
       BYTE    $31,$1C,$60,$31,$3D,$1A,$14,$04,$FF,$05,$0E,$80,$41,$04,$E4,$25
       BYTE    $31,$1C,$60,$32,$3D,$1A,$14,$04,$FF,$05,$00,$52,$20,$48,$20,$4E
       BYTE    $78,$05,$62,$50
       BYTE    $31,$1C,$60,$33,$3D,$1A,$17,$05,$02,$48,$04,$68,$41,$40,$41,$72
       BYTE    $50
       BYTE    $31,$1C,$60,$42,$3D,$1A,$10,$42,$04,$14,$04,$B5
       BYTE    $31,$1C,$60,$51,$3D,$1A,$10,$51,$05,$58,$95
       BYTE    $09,$08,$04,$E7,$20,$52,$91,$04,$48,$70,$4B,$10,$23
       BYTE    $70,$30,$03,$A9,$27,$90,$44,$F0,$4F,$04,$42
;-------------------------------------------------------------------------------
;
;

The format for the dictionary is:

StringDataPtr: Pointer;
Lexicon:       array[0..14] of Byte;
StringLens:    array of Byte;    //One for each of the strings - a variable number, hence the pointer
StringData:    array of Byte;    //The compressed strings.  Each string is individually limited to whole
                                 //bytes so that they start on byte alignment.

I should consider adding an info byte for "compression method".  I could use this for the lexicon count as well.  I should think about adding a byte to indicate how many strings there are, for sanity checking, but that's probably just me being weird.  The programmer knows what they are doing!  Hmm...
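To make the layout concrete, here is a sketch (in Python, purely illustrative) of how a reader would walk that record to find the nth compressed string.  The little-endian word order and the assumption that the leading pointer targets the start of the string data are inferred from the listing above:

```python
# Record layout (from the listing): 2-byte pointer to the string data,
# 15-byte lexicon, one length byte per string, then the packed strings.
def locate_string(record: bytes, base: int, n: int):
    """Return (offset, length) of compressed string n within `record`.

    `base` is the record's assembly-time address, since the leading word
    holds an absolute pointer (DMDTST + $25 in the listing above).
    """
    data_ptr = record[0] | (record[1] << 8)   # little-endian pointer word
    data_off = data_ptr - base                # offset of packed string data
    lengths = record[17:data_off]             # skip pointer + 15-byte lexicon
    start = data_off + sum(lengths[:n])       # sum lengths of earlier strings
    return start, lengths[n]
```

For the DMDTST record above, data_off works out to $25: 2 pointer bytes, 15 lexicon bytes, and 20 length bytes for the 20 strings ($00-$13).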


Daniel.

Daniel England

Dec 10, 2014, 1:44:28 AM
to c65gs-de...@googlegroups.com
Well, I'm having so much fun.

I wasn't happy with the results so I've been investigating ways of improving them.

I've explored a few methods, but I'm almost settled on one now.  It's a combination of all the ideas I've been looking at.  So far, for the same test data I showed above, I've improved the ratio to 27.55%.  That's a 69% improvement.  The code required should still be rather small.  I need to check that it's going to be flexible enough with smaller data sets and won't require complex code to handle.

At the moment I am going to use the 191-byte block of memory at $0334 to $03FB for decompressing the string data.  Most of it is the cassette buffer (from $033C) and, looking at the KERNAL disassembly, the area above it (from $0334) is unused.  It's not identified for any purpose in the Programmer's Reference Guide (just strangely marked as '??') nor in any memory maps I have.  Unfortunately, the method I want to use will probably require all of it in order to keep the code size down.  There will also be some "thrashing" when a change between string "dictionaries" is required, such as when you want the "core" to display system messages, making the calls slow, unless I can work around the problem.

The other problem is that for the method I'm currently investigating, it would be fairly complicated to produce an automated compressor.  I'm currently doing it by hand.  I'll have to look into that as well.

I really didn't think that data set analysis was my sort of thing at all, but I wasn't able to sleep properly for thinking so much about data relationships, set mapping, bit crunching, sequence encoding, absolute ranges of sample sets...  Oh dear...  I think my brain is melting, and it's so hot, too.


Daniel.

Daniel England

unread,
Dec 12, 2014, 8:59:26 AM12/12/14
to c65gs-de...@googlegroups.com
Okay...

I've all but finished the new version of my NybMP compression format.  I just have to tidy the compressor output generation and implement the 6502 decompression code.

Unfortunately, I've determined that the method I wanted to implement would probably be too costly in terms of the code required for decompression.  I wanted to use nybbles for some length information and some flags to signal different compression methods for difficult cases.  I found I could get better compression ratios doing that, but now that I have more of an idea of how it all works, I am pretty sure it would be too hard to write the code to handle all of those variations and complexity just while seeking in the data.

With the algorithm as it stands I am getting a compression ratio of 21.69% out of the same sample data, up from 16.22%.  The code size increase for handling this method should be almost negligible since I am reusing much of the same technique just recursively in effect (well, deterministically I suppose you could say).

It has taken a while to make the compressor program because it's fairly complicated to do, although simple in concept.  It's probably terribly inefficient and I have to refactor the code for better structure.  I had to build in some overly complicated debugging so that I could see what the code was doing and better visualise the method I needed to implement (literally, it showed me what it was doing with the data).

Now that I have what I want pretty much worked out, once I have implemented the decompression I'll look again at getting that extra 6%.

I'll make an update when I have some better output.


Daniel.

Paul Gardner-Stephen

Dec 12, 2014, 4:51:46 PM
to Daniel England, c65gs-de...@googlegroups.com
Hello,

I had a go at writing a simple compression scheme while flying home.  I use just a table of the 15 most common characters, and about a further 50 bytes of code to support decompression - so about 64 bytes all up.  It is saving about 30% of string lengths.  Other characters take 12 bits to encode (4-bit escape code + 8-bit character code).

The code is on the computer I used on the plane, so I don't have it at hand right now.  It might be just as quick to reimplement it.

Paul.


Daniel England

Dec 12, 2014, 8:14:37 PM
to c65gs-de...@googlegroups.com
Paul,

How are you calculating that amount?  Also, how are the strings to be indexed or how do you seek a particular string in a table or list of strings?

I need a string table.  Presently, without a compression scheme, I use two lists of bytes, one for the lo and another for the hi values for the pointers and index these lists by string number to get the address of the string and then pass the address to the BASIC print PChar routine.

However, with the compression, I have a list of length offsets which I index by string number and use then to seek in the string data itself to start the decompression to a buffer and pass the buffer pointer to the BASIC print PChar routine.

For my first pass of the NybMP algorithm, I store a list of the most frequently used characters (up to 15) for that string list, as well as the offsets list and a pointer to the start of the string data.  It achieves only, roughly, a 16% compression ratio.  I should note that that version is obsolete now and I am getting a better than 21% ratio with the new version.

The ratio is calculated as per the common method for compression:  the ratio of size delta to original size.  So, (original size - compressed size) / original size * 100.
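As a tiny worked example of that formula (Python, with made-up sizes, not figures from the actual test data):

```python
def ratio(original: int, compressed: int) -> float:
    # Compression ratio as used in this thread:
    # (original size - compressed size) / original size * 100
    return (original - compressed) / original * 100.0
```

So 0% means no savings at all, and 100% would mean the compressed data takes no space.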

I could, I suppose, get rid of the offset list and pass the decompression routine a pointer that I get from a label.  However, how would I know when to terminate the string?  I would need to keep {NULL} ($00) as one of the dictionary characters, reducing its effectiveness even further since it will match only once per string, or store it as 3 nybbles, which means it uses more data area than the lengths do (a single byte per string).  Even hard-coding the value, using generated macros perhaps, we are still talking about some kind of load instruction using an extra 2 bytes every time a string is referred to.

I could get rid of the string data pointer but it would make handling the block of strings more difficult, requiring about 6 lines more of LDA/STA code at least.

I could use a standard character dictionary and not have to include it with the string data, but this will reduce the effectiveness of the compression.  If I don't include it in the above sample, the rate improves to a theoretical 20.57%, but there is no way that in reality the ratio will be anywhere near that if I actually compress the string data using a standard character dictionary instead of the generated, optimal one.

In fact, I tried using the list of most frequent characters from the wiki page: <http://en.wikipedia.org/wiki/Letter_frequency>.  I included space since they say it is the most common, and in my own tests this proves to be true.  The characters then are: {SPACE}, E, T, A, O, I, N, S, H, R, D, L, C, U, M.  Compressing the test data above using this character set gave a ratio of 2.85%.  This is a terrible result, most likely due to the fact that punctuation characters (as noted on the wiki) are the fourth most commonly occurring, and because we are effectively using two sets of punctuation in the test data with the colour codes.

For my new implementation, I began by producing tables of character affinity, or association by proximity.  I used a bit in the nybble to indicate whether it was a "control nybble"; otherwise it was a character nybble.  If it was a character nybble, it was an index into the current character table.  If it was a control nybble, it could either indicate a character embedded in the next byte of the stream or switch the current character table to another one.  This seemed promising to me at first but in the end didn't work, because of the limited number of characters that can be stored in the tables using only a nybble for indexing and control.  I even "optimised" the compression process with a "peep hole" that tried to find the best table for the upcoming characters.  Still, the results were terrible.

I went back to my original idea of using substitution flagging and instead of using a whole byte as I did at first, used only a nybble.  So now I have a bit in the nybble to indicate if it is a word substitution or character substitution.  If it is a character substitution it might also indicate a character directly in the stream.  This is actually working out well.

The problems really are that we have so few bits to work with as indexes using nybbles in the data stream.  Additionally, we have no memory for extensive substitution or dictionary tables.  Also, the strings are themselves relatively dissimilar.

I did some analysis of my test data using the PC "zip" and "7z" methods.  I created a file with the equivalent test data in it.  I named the file "t" to reduce the impact of file names stored in the archive, since I couldn't be bothered creating a program to produce a custom, data-only stream.  I couldn't set the dictionary size anywhere near equivalent to what we are working with: the "zip" method used 32KB, the "7z" method 64KB.  My new NybMP has an effective dictionary size of 135 bytes.  I could control the word size (16 bytes) so that it is the same as my method, but it wasn't going to make any difference given the dictionary sizes they use.

The "zip" method internally gave a ratio of 47.11%, the "7z" method internally 46.10%.  However, the overheads were huge for these methods, with 136 bytes of header for the "zip" method and 100 bytes for the "7z" one.  Understandably, some of this overhead is for the file description, but that would be small compared to the actual internal method signatures and checksums which they insist on having.  In the end, the "zip" method gave a final ratio of 1.03% and "7z" 12.20%.  Looking at these figures, I'm actually pleased with my NybMP method.

I would like to know more about your methods.


Daniel.

Paul Gardner-Stephen

Dec 12, 2014, 8:50:08 PM
to Daniel England, c65gs-de...@googlegroups.com
On Sat, Dec 13, 2014 at 11:44 AM, Daniel England <mewpo...@hotmail.com> wrote:
> Paul,
>
> How are you calculating that amount?

The same way as you, i.e., 0% means no savings, and 100% means the string requires zero bytes.
 
> Also, how are the strings to be indexed or how do you seek a particular string in a table or list of strings?

You need the pointer to the start of the string.  I haven't optimised that part yet, and am currently just using absolute,Y addressing mode, but could easily switch to (zp),Y which would shave a couple of bytes as well as allowing arbitrary string addresses. 

> I need a string table.  Presently, without a compression scheme, I use two lists of bytes, one for the lo and another for the hi values for the pointers and index these lists by string number to get the address of the string and then pass the address to the BASIC print PChar routine.

If the compressed strings all fit in a page you can use 1 byte each instead of 2 as addresses.  If more than one page, I would split them into sets that fit in a page each.

Alternatively, you could just use a string number in the whole table, and count delimiter characters encountered, and only begin printing when you have reached the correct string, and stop after the delimiter for that string.  This would trade some CPU time in return for being able to index up to 256 strings using just one byte.
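A sketch of that delimiter-counting lookup (Python for illustration; the 6502 version would be a short scan loop, and the choice of carriage return as the delimiter is just an assumption here):

```python
DELIM = 0x0D  # assumed delimiter byte between chained strings

def nth_string(table: bytes, n: int) -> bytes:
    """Scan the chained string table, counting delimiters, and return
    string n; trades CPU time for needing only a 1-byte string index."""
    out, seen = bytearray(), 0
    for ch in table:
        if ch == DELIM:
            if seen == n:
                break                # end of the wanted string
            seen += 1                # skip past an earlier string
        elif seen == n:
            out.append(ch)           # we're inside string n: emit it
    return bytes(out)
```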
 
> However, with the compression, I have a list of length offsets which I index by string number and use then to seek in the string data itself to start the decompression to a buffer and pass the buffer pointer to the BASIC print PChar routine.

See above, this can be completely avoided.
 
> For my first pass of the NybMP algorithm, I store a list of the most frequently used characters (up to 15) for that string list as well as the offsets list and a pointer to the start of the string data.  It achieves only, roughly, 16% compression ratio.  I should note that version is obsolete now and I am getting a better than 21% ratio with the new version.

This sounds in the right ball-park for this kind of compression.
 
> The ratio is calculated as per the common method for compression:  the ratio of size delta to original size.  So, (original size - compressed size) / original size * 100.
>
> I could, I suppose, get rid of the offset list and pass a pointer to the decompression routine that I get from a label.  However, how would I know when to terminate the string?  I would need to keep {NULL} ($00) as one of the dictionary characters reducing its effectiveness even further since it will match only once per string or store it as 3 nybbles which means it uses more data area than the lengths do (single byte per string).  Even hard coding the value using generated macros perhaps, we are still talking about some kind of load instruction using an extra 2 bytes every time a string is referred to.

If NULL (or CR as I use in my scheme) occurs once per string, it is by definition a fairly frequent character, and probably deserves to be in the table, unless there is a character that occurs more often.  Just let the probabilities of occurrence govern this decision -- it is almost always the right decision for compression.
 
> I could get rid of the string data pointer but it would make handling the block of strings more difficult, requiring about 6 lines more of LDA/STA code at least.

See my above comments about skipping through delimiters.  Printing a string would reduce to:

LDA #string number
JSR print_compressed_string

Also, chaining the strings this way saves you half a byte per string on average, because you can't waste the last nybble of each string anymore.
 
> I could use a standard character dictionary and not have to include it with the string data but this will reduce the effectiveness of the compression.  If I don't include it in the above sample, the rate is improved to a theoretical 20.57% but there is no way that in reality the ratio is going to be anywhere near that if I actually compress the string data using a standard character dictionary instead of the generated, optimal one.

I don't think there is a good way to get away from including the letter frequencies for the strings you are encoding.  The list needs to be there anyway, and only requires 15 bytes.  But you should choose the characters to put in there based on absolute frequency.
 

In fact, I tried using the list of most frequent characters from the wiki page: <http://en.wikipedia.org/wiki/Letter_frequency>.  I included space since they say it is the most common and in my own tests this proves to be true.  The characters then are: {SPACE}, E, T, A, O, I, N, S, H, R, D, L, C, U, M.  Compressing the test data above using this character set gave a ratio of 2.85%.  This is a terrible result due, most likely, to the fact that punctuation characters (as noted on the wiki) are the fourth most commonly occurring characters and because we are effectively using two sets of punctuation in the test data with the colour codes.

See above -- count the occurrences of every character you use, and put the entries in the common character list based on that.
 
For my new implementation, I began by producing tables of character affinity or association by proximity.  I used a bit in the nybble to indicate whether it was a "control nybble"; otherwise it was a character nybble.  If it was a character nybble, it was an index into the current character table.  If it was a control nybble, it could either indicate a character embedded in the next byte of the stream or switch the current character table to another one.  This seemed promising to me at first but in the end didn't work because of the limited number of characters that can be stored in the tables using only a nybble for indexing and control.  I even "optimised" the compression process with a "peep hole" that tried to find the best table for the upcoming characters.  Still, the results were terrible.

What might make sense is to have separate lists for character prediction following vowels and non-vowels, but not sure if it will help.

The other thing that might make sense, and it all depends on the character frequencies, is to make the common character lookup multi-stage:

($1-F) = 15 most common characters.
$0 + ($1-$F) = 15 next most common characters.
$0 + $0 + ($1-F) = 15 next most common characters.

and so on, until you have covered all characters that occur.
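A sketch of how that multi-stage lookup could decode, where $1-$F index the current table and each $0 escapes to the next one (the tables and stream here are invented, not Daniel's actual dictionary):

```python
# Multi-stage common-character lookup: $1-$F index the current table,
# each $0 nybble escapes to the next table.  Tables are illustrative.
TABLES = [
    " ETAOINSHRDLCUM",   # 15 most common characters
    "FGPWYBVKXJQZ.,:",   # next 15 most common, and so on
]

def decode(nybbles):
    out, level = [], 0
    for n in nybbles:
        if n == 0:
            level += 1           # $0: fall through to the next table
        else:
            out.append(TABLES[level][n - 1])
            level = 0            # back to the most-common table
    return "".join(out)

print(decode([2, 0, 1, 3]))  # prints EFT
```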
 
I went back to my original idea of using substitution flagging and instead of using a whole byte as I did at first, used only a nybble.  So now I have a bit in the nybble to indicate if it is a word substitution or character substitution.  If it is a character substitution it might also indicate a character directly in the stream.  This is actually working out well.

The problems really are that we have so few bits to work with as indexes using nybbles in the data stream.  Additionally, we have no memory for extensive substitution or dictionary tables.  Also, the strings are themselves relatively dissimilar.

Yes, string-matching based compression is going to be problematic.

You could consider bi-grams, tri-grams and n-grams as "common characters", and have a table with shifted characters as the terminator for each, and then have a "lookup n-gram" routine based on the "common n-gram" indicated in the stream.
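A rough sketch of that n-gram table idea, using the high bit as the "shifted character" terminator for each entry (the n-grams themselves are made up for illustration):

```python
# Pack common n-grams into one table, with the high bit (a "shifted
# character" in PETSCII terms) marking the last byte of each entry.
NGRAMS = b""
for g in ("THE", "ING", " SE", "LECT"):
    NGRAMS += g[:-1].encode() + bytes([ord(g[-1]) | 0x80])

def lookup_ngram(index):
    pos = 0
    skipped = 0
    while skipped < index:          # skip over earlier entries
        if NGRAMS[pos] & 0x80:
            skipped += 1
        pos += 1
    out = []
    while True:                     # copy entry `index`
        c = NGRAMS[pos]
        out.append(chr(c & 0x7F))
        if c & 0x80:
            return "".join(out)
        pos += 1

print(lookup_ngram(1))  # prints ING
```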
 
I did some analysis of my test data using the PC "zip" and "7z" methods.  I created a file with the equivalent test data in it.  I named the file "t" to reduce the impact of the file name being stored in the archive, since I couldn't be bothered creating a program to produce a custom, data-only stream.  I couldn't set the dictionary size anywhere near equivalent to what we are working with: the "zip" method used 32KB, the "7z" method 64KB.  My new NybMP has an effective dictionary size of 135 bytes.  I could set the word size to 16 bytes so that it matched my method, but that wasn't going to make any difference given the dictionary sizes they use.  The "zip" method internally gave a ratio of 47.11%, the "7z" method internally 46.10%.  However, the overheads were huge for these methods, with 136 bytes of header for the "zip" method and 100 bytes for the "7z" one.  Understandably, some of this overhead is for the file description, but that would be small compared to the actual internal method signatures and checksums which they insist on having.  In the end, the "zip" method gave a final ratio of 1.03% and "7z" 12.20%.  Looking at these figures, I'm actually pleased with my NybMP method.

Yes, compressing short strings is hard.
Search for SMAC (Short Message Arithmetic Coder) for an example of how to do this, and the complexity trade-off that is incurred as a result.
 
I would like to know more about your methods.
 
I have covered most in the comments above.
If you can send me your string table, I can make some estimates of how well it can be compressed using the multi-stage n-gram based method I describe.

Paul.

Paul Gardner-Stephen

Dec 12, 2014, 9:04:41 PM
to Daniel England, c65gs-de...@googlegroups.com
Looking at your code: you are using 16 bytes for hex string conversion lookup table.  You can save 6 bytes by removing the table, and adding the following routine:

TOHEX
cmp #$0a
bcs TOHEXhigh
adc #$30
rts
TOHEXhigh
adc #$36
rts

Then use:

; SETUP THE FILE NAME STRINGS

UTILSFS LDA     TSTBFDH

        JSR TOHEX

        STA     TSTDSK0

        STA     TSTFIL0

        STA     TSTFIL1

        

        LDA     TSTBFDL

        JSR TOHEX

        STA     TSTDSK0+1

        STA     TSTFIL0+1

        STA     TSTFIL1+1
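For anyone following along, the same arithmetic in Python shows why the ADC #$36 branch works: CMP leaves the carry set for values >= $0A, so the ADC effectively adds $37:

```python
# The arithmetic behind the TOHEX routine above: BCS is taken when the
# carry is set (value >= $0A), and ADC with carry set adds one extra,
# so $36 + carry = $37 maps $0A-$0F onto 'A'-'F'.
def to_hex(n):
    if n < 0x0A:
        return chr(n + 0x30)       # carry clear: '0'..'9'
    return chr(n + 0x36 + 1)       # carry set adds one more: 'A'..'F'

print("".join(to_hex(n) for n in range(16)))  # prints 0123456789ABCDEF
```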

Now to look at your table of strings for compression potential ...

Paul.

Daniel England

Dec 12, 2014, 9:27:47 PM
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Paul,

How are you calculating that amount?

The same way as you, i.e., 0% means no savings, and 100% means the string requires zero bytes.

Hmm..  But you were saying 30%...  Even 50% at first.

 
 Also, how are the strings to be indexed or how do you seek a particular string in a table or list of strings?

You need the pointer to the start of the string.  I haven't optimised that part yet, and am currently just using absolute,Y addressing mode, but could easily switch to (zp),Y which would shave a couple of bytes as well as allowing arbitrary string addresses. 

Right.  But you need to "attach" to the data in memory.  Somehow, the pointer needs to be stored outside of where it is used.  You can probably use ZP during the decompression but the pointer still needs to be somewhere else in memory or loaded into registers for it to be used.

 
I need a string table.  Presently, without a compression scheme, I use two lists of bytes, one for the lo and another for the hi values for the pointers and index these lists by string number to get the address of the string and then pass the address to the BASIC print PChar routine.

If the compressed strings all fit in a page you can use 1 byte each instead of 2 as addresses.  If more than one page, I would split them into sets that fit in a page each.

Hmm..  Yes, I thought about this too but the strings don't and won't fit into a single page.  Requiring that the strings be aligned to pages potentially wastes memory and probably lots of it.

 
Alternatively, you could just use a string number in the whole table, and count delimiter characters encountered, and only begin printing when you have reached the correct string, and stop after the delimiter for that string.  This would trade some CPU time in return for being able to index up to 256 strings using just one byte. 
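In outline, the delimiter-counting lookup would work like this (a Python sketch with invented strings, not the actual table):

```python
# Find string number `n` in one chained, NULL-delimited table by
# counting $00 bytes, so a single byte can index up to 256 strings.
TABLE = b"FIRST\x00SECOND\x00THIRD\x00"

def get_string(n):
    pos = 0
    while n:                        # skip `n` delimiters
        if TABLE[pos] == 0:
            n -= 1
        pos += 1
    end = TABLE.index(0, pos)       # stop at this string's delimiter
    return TABLE[pos:end].decode()

print(get_string(1))  # prints SECOND
```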

This would be very slow and still require either 16-bit math or storage/retrieval of multiple pointers.  Also, the only delimiter that you could possibly use is {NULL}.  {RETURN} is not guaranteed at the end of a string.

 
However, with the compression, I have a list of length offsets which I index by string number and use then to seek in the string data itself to start the decompression to a buffer and pass the buffer pointer to the BASIC print PChar routine.

See above, this can be completely avoided. 

I'm still not sure I agree.

 
For my first pass of the NybMP algorithm, I store a list of the most frequently used characters (up to 15) for that string list as well as the offsets list and a pointer to the start of the string data.  It achieves only, roughly, 16% compression ratio.  I should note that version is obsolete now and I am getting a better than 21% ratio with the new version.

This sounds in the right ball-park for this kind of compression. 

This is a little frustrating because 16% is a lot different to 30% like you suggest is possible with a simple nybble-wise dictionary substitution.
 

If NULL (or CR as I use in my scheme) occurs once per string, it is by definition a fairly frequent character, and probably deserves to be in the table, unless there is a character that occurs more often.  Just let the probabilities of occurrence govern this decision -- it is almost always the right decision for compression.

This is an incorrect assumption.  Other characters will possibly appear multiple times and in many strings, meaning their total usage counts are much higher than the number of strings.

You could force {NULL} into the dictionary, but doing so will potentially mean another character with an actually higher incidence count will be lost, increasing the size of the compressed data.

I think that storing string lengths is still the best trade-off for speed, code size, data size and ease of use.

 
Also, chaining the strings this way saves you 1/2 a byte per string on average, because you can't waste the last nybble in strings anymore.

Actually, the rate at which there is an alignment nybble is more like 30%, not 50%.  What you suggest still requires page alignment, probably wasting more memory than these byte alignment nybbles unless you optimise your code by hand and can use the other space (ignoring the problem of having to work with multiple pointers).
 

I don't think there is a good way to get away from including the letter frequencies for the strings you are encoding.  The list needs to be there anyway, and only requires 15 bytes.  But you should choose the characters to put in there based on absolute frequency.

Which is what I was doing.  The frequency information is provided in the compression output I posted.


What might make sense is to have separate lists for character prediction following vowels and non-vowels, but not sure if it will help.

The other thing that might make sense, and it all depends on the character frequencies, is to make the common character lookup multi-stage:

($1-F) = 15 most common characters.
$0 + ($1-$F) = 15 next most common characters.
$0 + $0 + ($1-F) = 15 next most common characters.

and so on, until you have covered all characters that occur.

Looked at these and the results were not promising.

 
I went back to my original idea of using substitution flagging and instead of using a whole byte as I did at first, used only a nybble.  So now I have a bit in the nybble to indicate if it is a word substitution or character substitution.  If it is a character substitution it might also indicate a character directly in the stream.  This is actually working out well.

The problems really are that we have so few bits to work with as indexes using nybbles in the data stream.  Additionally, we have no memory for extensive substitution or dictionary tables.  Also, the strings are themselves relatively dissimilar.

Yes, string-matching based compression is going to be problematic.

You could consider bi-grams, tri-grams and n-grams as "common characters", and have a table with shifted characters as the terminator for each, and then have a "lookup n-gram" routine based on the "common n-gram" indicated in the stream.

I guess this is what I've ended up doing.
 

Yes, compressing short strings is hard.
Search for SMAC (Short Message Arithmetic Coder) for an example of how to do this, and the complexity trade-off that is incurred as a result.

I'll have a look.

 
If you can send me your string table, I can make some estimates of how well it can be compressed using the multi-stage n-gram based method I describe

The test data was in the output I listed but I'll give it to you, here.  You'll need to convert the "BasTxt" codes into the proper Petscii ones.  I'll give you a list to make it easier.

{CLEAR}{CYAN}DECIMAL MODE TEST{RETURN}
{CYAN}6502{RETURN}
{CYAN}65C02{RETURN}
{CYAN}65816{RETURN}
{CYAN}COMPLIANCE{RETURN}
{CYAN}COMPREHENSIVE{RETURN}
{CYAN}SPECIFIC CASE{RETURN}
{BLUE} {RETURN}
{BLUE}LEVEL SELECT: {RETURN}
{BLUE}  <{CYAN}1{BLUE}> - 6502{RETURN}
{BLUE}  <{CYAN}2{BLUE}> - 65C02{RETURN}
{BLUE}  <{CYAN}3{BLUE}> - 65816{RETURN}
{BLUE}TYPE SELECT: {RETURN}
{BLUE}  <{CYAN}1{BLUE}> - COMPLIANCE{RETURN}
{BLUE}  <{CYAN}2{BLUE}> - COMPREHENSIVE{RETURN}
{BLUE}  <{CYAN}3{BLUE}> - SPECIFIC CASE{RETURN}
{BLUE}  <{CYAN}B{BLUE}> - BACK{RETURN}
{BLUE}  <{CYAN}Q{BLUE}> - QUIT{RETURN}
{BLACK}INSERT DISK #
S0:TESTDMODE


$93 {CLEAR} 
$9C {CYAN}
$0D {RETURN}
$1F {BLUE}
$90 {BLACK} 
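For what it's worth, the frequency-driven table selection discussed earlier can be derived mechanically from data like this (only a few of the strings are reproduced here, using the escape codes from the legend above):

```python
# Derive the 15-entry common-character table from the string data
# itself.  Only a sample of the strings is included, with the legend's
# codes ($93 CLEAR, $9C CYAN, $0D RETURN, $1F BLUE) as raw bytes.
from collections import Counter

STRINGS = [
    "\x93\x9cDECIMAL MODE TEST\x0d",
    "\x9c6502\x0d",
    "\x1f  <\x9c1\x1f> - 6502\x0d",
]

counts = Counter("".join(STRINGS))
table = [ch for ch, _ in counts.most_common(15)]
print(table[0] == " ")  # prints True: space is the most frequent here
```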
 

Daniel.

Daniel England

Dec 12, 2014, 9:32:24 PM
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Paul,

 
Looking at your code: you are using 16 bytes for hex string conversion lookup table.  You can save 6 bytes by removing the table, and adding the following routine:

TOHEX
cmp #$0a
bcs TOHEXhigh
adc #$30
rts
TOHEXhigh
adc #$36
rts

Then use:

; SETUP THE FILE NAME STRINGS

UTILSFS LDA     TSTBFDH

        JSR TOHEX

        STA     TSTDSK0

        STA     TSTFIL0

        STA     TSTFIL1

        

        LDA     TSTBFDL

        JSR TOHEX

        STA     TSTDSK0+1

        STA     TSTFIL0+1

        STA     TSTFIL1+1

Now to look at your table of strings for compression potential ...

Paul.


*lol*

Yes, it's terribly unoptimised code.  Don't look too closely 'cause you'll see a memory move that uses an instruction more than it needs to.

Anyway, I coded those numeric conversion routines very quickly.  I knew I was going to have to do a decimal value to string conversion as well without the ability to use decimal mode so I just implemented them both as a look-up into the same table.  I had intended on using the same code for both hex and decimal conversion.  Don't know if I did or not.  Shocking really.  I am working on some of these problems for the next version.

Thanks for reading my code though, I definitely need the help!


Daniel.

Paul Gardner-Stephen

Dec 12, 2014, 9:32:45 PM
to Daniel England, c65gs-de...@googlegroups.com
Looking at savings from rephrasing strings without losing clarity of meaning:

Change "APPROX. 30 BLANK DISKS" to "ABOUT 30 BLANK DISKS" to save 2 bytes.

Consider removing "BLANK " above to save a further 6 bytes.

Change "QUITE SOME TIME" to "A LONG TIME" to save 4 bytes.

Consider merging "LEVEL SELECT: " and "TYPE SELECT: " into a single "SELECT: " or "PICK: " to save 19 or 21 bytes.

Replace all strings like $1F,"  <",$9F,"B",$1F,"> ",$2D," with a single routine that prints $1F,"  <",$9F,{customcharacter},$1F,"> ",$2D.  This will save a lot of bytes in a lot of strings.  This could be done with a template string that gets modified, e.g.:

TXTOPTB   .BYTE $1F,"  <",$9F
TXTOPTV   .BYTE "A"
TXTOPTA   .BYTE $1F,"> ",$2D," ",$00

PRTOPT
sta TXTOPTV
ldy #$00
PRTOPTNC
lda TXTOPTB,y
beq PRTOPTDN   ; stop at the $00 terminator (flags after JSR $FFD2 are not reliable)
jsr $FFD2
iny
bne PRTOPTNC
PRTOPTDN
jmp PRTCSTR

i.e., 19 bytes for the routine, plus 12 for the string template = 31 bytes.
Then each option can be printed using the following 7 bytes:

lda #'B'
ldx #STRBACK
JSR PRTOPT

Where PRTCSTR is a routine that prints a compressed string indexed by a single byte.  If I get a chance I will revisit my code and cook up such a routine.

I have just read your latest email, so some of the above is not taking your comments in that message into account yet.

Paul.

Paul Gardner-Stephen

Dec 12, 2014, 9:38:04 PM
to Daniel England, c65gs-de...@googlegroups.com
Hello,

No problem.  Compression can only help so much; it is generally better to remove the redundancy first if you can, because the compression scheme will almost invariably be less intelligent than the programmer.  Substitution of words with shorter or lower Scrabble-scoring synonyms is a good place to start, thus my suggestions in the previous email.

Regarding the speed of printing strings if using the chained NULL terminated compressed string scheme I propose, I think you will find that the CPU time for the messages is insignificant.  My decompression routine takes only a few rasters on a C64 to decode ~20 bytes.  Multiply that by the ~40 messages you have, and it is still of the order of 1 message per frame, and that assumes that you don't put the most commonly used messages first, and errors that terminate last.

I need a nap now, but will try to generate a decompression routine that can lookup strings in this way when I get the chance.  I'll also test the performance of my existing compressor on your string table to see if it still yields ~30%.

Finally, don't be worried about imperfection in your code, and don't worry about apologising for it.  Just present what you can, and let the community work with you to improve it.  Similarly, don't unnecessarily delay release under an open-source license: they are designed to allow fast collaboration on imperfect code!

Paul.

--

Daniel England

Dec 12, 2014, 10:03:06 PM
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
*rofl*


On Saturday, 13 December 2014 12:32:45 UTC+10, Paul Gardner-Stephen wrote:
Looking at savings from rephrasing strings without losing clarity of meaning:

Change "APPROX. 30 BLANK DISKS" to "ABOUT 30 BLANK DISKS" to save 2 bytes.

Umm...  The "non technical" language is foreign to me.  I prefer the way "approximately" sounds to "about".  It presents a more authoritative personality on behalf of the program.

 
Consider removing "BLANK " above to save a further 6 bytes.

You can't really because technically, the disks have to be blank (except for the sole condition that they were used for DMode test data before).  I guess I could change the code to do a quick format on them instead but I didn't want to do that for saving execution time (quick format is still slow).

 
Change "QUITE SOME TIME" to "A LONG TIME" to save 4 bytes.

Sorry, to my mind, "a long time" does not say "a very, very, so very, long time".  I'm in the habit of being as informative as I can with messages I present to the user.  "Quite some time" is a more accurate, although still "polite" way of saying, "you'll be here all day and probably most of tomorrow, too".

 
Consider merging "LEVEL SELECT: " and "TYPE SELECT: " into a single "SELECT: " or "PICK: " to save 19 or 21 bytes.

Again, I'm just in the habit of providing as much information as I can.  The "Level select" and "Type select" tell you what options you are configuring with your selection.  I don't think this should be changed in order to allow the user to understand what it is they are doing.

Programs should be as informative as possible and avoid terse or overly specific terminology in order to be well written.  I do find it a bit of a strain in such confined memory.  I write a message I want to produce and, oops, those four words are longer than a line.  Rephrase!  You can't really afford to remind the user of what they ought to know.  Still, I don't think it's an excuse for being difficult to work with or immediately understand.

 
Replace all strings like $1F,"  <",$9F,"B",$1F,"> ",$2D," with a single routine that prints $1F,"  <",$9F,{customcharacter},$1F,"> ",$2D.  This will save a lot of bytes in a lot of strings.  This could be done with a template string that gets modified, e.g.:

Yes, this is potentially a good idea except that I print all of the strings in one call rather than setting each of them up and printing them independently.  Bulk printing saves quite a bit of code.  Anyway, the pros and cons can be analysed after the string compression is done.  Perhaps it will still be better to store them once and modify and call to print multiple times.  I'm not entirely sure off the top of my head.  The BASIC print PChar routine does not preserve the registers as far as I'm aware.  You'd need to preserve the pointer before you call or load it again afterwards.  I think you are probably right on this one, however.


Daniel.

Daniel England

Dec 12, 2014, 10:22:16 PM
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
 
Hello

Welcome home!  You said you'd been on a flight back.  I hope the trip was a pleasant one.

 
No problem.  Compression can only help so much; it is generally better to remove the redundancy first if you can, because the compression scheme will almost invariably be less intelligent than the programmer.  Substitution of words with shorter or lower Scrabble-scoring synonyms is a good place to start, thus my suggestions in the previous email.

I just can't do certain things.  The program has to present itself in an appropriate way and inform the user, as completely as possible, of what it is they are expected to do.

 
Regarding the speed of printing strings if using the chained NULL terminated compressed string scheme I propose, I think you will find that the CPU time for the messages is insignificant.  My decompression routine takes only a few rasters on a C64 to decode ~20 bytes.  Multiply that by the ~40 messages you have, and it is still of the order of 1 message per frame, and that assumes that you don't put the most commonly used messages first, and errors that terminate last.

One per frame??  But I'm having to print some 10 - 12 lines (or more) per screen.  The BASIC print PChar routine is also slow.  I'm sure that this will result in times of more than half a second to display a screen.  Possibly even three quarters of a second.  That will definitely make the program seem sluggish.  

*sigh*  It doesn't matter I suppose.  I should see the validity of the recommendation.

 
I need a nap now, but will try to generate a decompression routine that can lookup strings in this way when I get the chance.  I'll also test the performance of my existing compressor on your string table to see if it still yields ~30%.

Okay, I'd love to see your results.

 
Finally, don't be worried about imperfection in your code, and don't worry about apologising for it.  Just present what you can, and let the community work with you to improve it.  Similarly, don't unnecessarily delay release under an open-source license: they are designed to allow fast collaboration on imperfect code!

Hmm...  Yes, I'm probably too used to working in a more commercial environment and think of the community too much as a client.  I'm thinking of the final product as being undelivered until properly accepted, and that the final product is then open-source.  I guess I needn't have attached copyright to the code from the start, but I think that really, even if you don't, in some way it is owned by the original author, and if they haven't specifically released it as public domain then it is proprietary.  I didn't want the code to be left dangling with no ownership, no one known to have cared for it.  I do say that it is, for all intents and purposes, LGPL...  I hate having to put all of the legalese at the start of each file.  *Meh*

Umm..  Maybe what I should be doing is working with the code in a public hosting with an LGPL license right from the start.  Would you like these sorts of programs to be stored with your C65GS project or should I create a new one?


Daniel.

Daniel England

Dec 13, 2014, 12:20:45 AM
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Paul,

Further to this, you actually can only save 2 bytes because technically, since you are using ADC, you need to save the processor status, clear the decimal mode and restore the status before exit.

This might seem to be overkill but if the routine is going into what is becoming my testing core then it would have to do this.  

Optimising for size rather than speed...  I was intending on using the conversion table for printing the results to the screen upon failure but that code (to print the results) had to be pulled at the last minute in order to get the program to fit in the available RAM.  Since I was going to have to convert about 16 numbers, the routine would be called 32 times.  Still not really significant overhead...

Looking at the alternatives to handle the results conversion we have:

;-------------------------------------------------------------------------------
;Test data location and count
TSTDAT  =       $1E00
TSTCNT  =       $0C

;Routine to do the string updating
TSTSTOR RTS


;-------------------------------------------------------------------------------
;Look-up table method

TBLCNV  BYTE    "0123456789ABCDEF"

TSTCNV1 LDX     #TSTCNT
@1      LDA     TSTDAT,X
        PHA
        AND     #$0F
        TAY
        LDA     TBLCNV,Y
        JSR     TSTSTOR
        PLA
        AND     #$F0
        LSR    
        LSR
        LSR
        LSR
        TAY
        LDA     TBLCNV,Y
        JSR     TSTSTOR
        DEX
        BNE     @1
        RTS
;-------------------------------------------------------------------------------

;-------------------------------------------------------------------------------
; Calculation method

TST2HEX PHP
        CLD
        CMP     #$0A
        BCS     @1
        ADC     #$30
        PLP
        RTS
@1      ADC     #$36
        PLP
        RTS

TSTCNV2 LDX     #TSTCNT
@1      LDA     TSTDAT,X
        PHA
        AND     #$0F
        JSR     TST2HEX
        JSR     TSTSTOR
        PLA
        AND     #$F0
        LSR    
        LSR
        LSR
        LSR
        JSR     TST2HEX
        JSR     TSTSTOR
        DEX
        BNE     @1
        RTS


It's fascinating.  Even though the calculation method looks more complicated, it's actually smaller by a fair amount and, most importantly, it saves .Y for use by the store routine to index into the strings.  It allows other routines to be smaller, as well.  Initially, it might be slower, but simplification in the TSTSTOR routine might even make up some of the time.

I find this really remarkable because I'm so used to programming in high-level languages.  The look-up table method is the method I used because it's the easiest to write, requiring the fewest lines by quite a few, and is "nominally faster".  So, when I'm writing in a high-level language, it would seem to be the most efficient method for quick development.

I really have developed some bad habits.  I used to think about this stuff but I don't any more.  Mostly, I've become more concerned with getting the job done rather than what I'm effectively making the machine live with.  Getting back into assembly coding really is working out to be a good thing for my technical skills.


Daniel.

Paul Gardner-Stephen

Dec 13, 2014, 4:35:08 AM
to Daniel England, c65gs-de...@googlegroups.com
On Sat, Dec 13, 2014 at 1:33 PM, Daniel England <mewpo...@hotmail.com> wrote:
*rofl*

On Saturday, 13 December 2014 12:32:45 UTC+10, Paul Gardner-Stephen wrote:
Looking at savings from rephrasing strings without losing clarity of meaning:

Change "APPROX. 30 BLANK DISKS" to "ABOUT 30 BLANK DISKS" to save 2 bytes.

Umm...  The "non technical" language is foreign to me.  I prefer the way "approximately" sounds to "about".  It presents a more authoritative personality on behalf of the program.

"About" is a synonym for "approximately"; the main difference is that more people understand "about" than "approximately", and as is often true in such cases, the more understandable word is also shorter.  

My experience as an academic has taught me that people often think that technical language should use as many long words as possible, because it sounds "more important" or some variation on that.  The opposite is true, because technical matter should be as clear as possible, and as hard as possible to misunderstand.  

If your text needs big words to sound authoritative, then it is possible you are relying on appearances rather than substance.

The reason we still see long and often vague words and phrases in academic literature is that people learn this bad habit as students by reading the text full of long and vague words produced by the previous generation of academics.
 
 
Consider removing "BLANK " above to save a further 6 bytes.

You can't really because technically, the disks have to be blank (except for the sole condition that they were used for DMode test data before).  I guess I could change the code to do a quick format on them instead but I didn't want to do that for saving execution time (quick format is still slow).

I have no real objection to this, although to me, it is implied that the disks must be empty.  What is the probability that someone will put a full disk in instead of an empty one?  If you feel it is substantially above zero, then by all means "BLANK" should be retained.
  
Change "QUITE SOME TIME" to "A LONG TIME" to save 4 bytes.

Sorry, to my mind, "a long time" does not say "a very, very, so very, long time".  I'm in the habit of being as informative as I can with messages I present to the user.  "Quite some time" is a more accurate, although still "polite" way of saying, "you'll be here all day and probably most of tomorrow, too".
 
The choice is yours. If you know what order of magnitude it takes, you could change it to "HOURS", which is even shorter.  It really depends how much space you need to save.
 
Consider merging "LEVEL SELECT: " and "TYPE SELECT: " into a single "SELECT: " or "PICK: " to save 19 or 21 bytes.

Again, I'm just in the habit of providing as much information as I can.  The "Level select" and "Type select" tell you what options you are configuring with your selection.  I don't think this should be changed in order to allow the user to understand what it is they are doing.
 
Again, up to you, but my feeling is that the extra information is redundant, because it is clear what the options are in each case.  If you feel that people could make a mistake if offered only "SELECT:", then the extra word is justified.

Programs should be as informative as possible and avoid terse or overly specific terminology in order to be well written.  I do find it a bit of a strain in such confined memory.  I write a message I want to produce and, oops, those four words are longer than a line.  Rephrase!  You can't really afford to remind the user of what they ought to know.  Still, I don't think it's an excuse for being difficult to work with or immediately understand.
 
Programmes should allow people to use them correctly the first time, and with the least surprise.  This is effectively what you are saying, but viewed from a different angle.  It is, however, an angle that often allows for removing redundancy and verbosity from user information.
 
Replace all strings like $1F,"  <",$9F,"B",$1F,"> ",$2D," with a single routine that prints $1F,"  <",$9F,{customcharacter},$1F,"> ",$2D.  This will save a lot of bytes in a lot of strings.  This could be done with a template string that gets modified, e.g.:

Yes, this is potentially a good idea except that I print all of the strings in one call rather than setting each of them up and printing them independently.  Bulk printing saves quite a bit of code.  Anyway, the pros and cons can be analysed after the string compression is done.  Perhaps it will still be better to store them once and modify and call to print multiple times.  I'm not entirely sure off the top of my head.  The BASIC print PChar routine does not preserve the registers as far as I'm aware.  You'd need to preserve the pointer before you call or load it again afterwards.  I think you are probably right on this one, however.
 
You could add an escape code to the printing routine that prints the option prefix part, without having to split the strings up.

Anyway, these are all just suggestions to save memory should you need it.

Paul Gardner-Stephen

unread,
Dec 13, 2014, 4:39:08 AM12/13/14
to Daniel England, c65gs-de...@googlegroups.com
On Sat, Dec 13, 2014 at 1:52 PM, Daniel England <mewpo...@hotmail.com> wrote:
 
Hello

Welcome home!  You said you'd been on a flight back.  I hope the trip was a pleasant one.

 
No problem.  Compression can only help so much; it is generally better to remove the redundancy first if you can, because the compression scheme will almost invariably be less intelligent than the programmer.  Substitution of words with shorter or lower Scrabble-scoring synonyms is a good place to start, hence my suggestions in the previous email.

I just can't do certain things.  The program has to present itself in an appropriate way and inform the user, as completely as possible, of what it is they are expected to do.

 
Regarding the speed of printing strings when using the chained NUL-terminated compressed-string scheme I propose, I think you will find that the CPU time for the messages is insignificant.  My decompression routine takes only a few rasters on a C64 to decode ~20 bytes.  Multiply that by the ~40 messages you have, and it is still of the order of 1 message per frame, and that assumes that you don't put the most commonly used messages first and the error messages that terminate a run last.
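The chained scheme could be modelled like this — a Python sketch only, where the token dictionary, the $80+ token range and the message strings are my own illustrative assumptions, not the actual compressor:

```python
# Toy model of the proposed scheme: messages stored back to back, each
# NUL-terminated, with common words replaced by single-byte tokens ($80+).
# Fetching message N means walking past N terminators, as on the 6502.
DICT = ["SELECT", "DISK", "PRESS", "TIME"]   # hypothetical common words

def compress(messages):
    out = bytearray()
    for msg in messages:
        for i, word in enumerate(DICT):
            msg = msg.replace(word, chr(0x80 + i))   # substitute token byte
        out += msg.encode('latin-1') + b'\x00'       # NUL-terminate and chain
    return bytes(out)

def fetch(blob, index):
    """Walk the chained NUL-terminated strings to message `index`, expanding tokens."""
    pos = 0
    for _ in range(index):
        pos = blob.index(0, pos) + 1
    end = blob.index(0, pos)
    return ''.join(DICT[b - 0x80] if b >= 0x80 else chr(b)
                   for b in blob[pos:end])

msgs = ["LEVEL SELECT: ", "TYPE SELECT: ", "INSERT BLANK DISK AND PRESS RETURN"]
blob = compress(msgs)
assert all(fetch(blob, i) == m for i, m in enumerate(msgs))
assert len(blob) < sum(len(m) for m in msgs)         # actually smaller
```

The walk past N terminators is what makes lookup cost grow with the message index, which is why putting the most common messages first matters.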

One per frame??  But I'm having to print some 10 - 12 lines (or more) per screen.  The BASIC print PChar routine is also slow.  I'm sure that this will result in times of more than half a second to display a screen.  Possibly even three quarters of a second.  That will definitely make the program seem sluggish.  

*sigh*  It doesn't matter I suppose.  I should see the validity of the recommendation.

Again it comes down to a time/space trade-off.  If you can avoid the trade-off, do so.  But also, to assess the trade-off you need to know the actual cost.  My calculation is likely to be pessimistic.  I think that you would find the performance to be quite fine. But without testing, we won't know.
 
 
I need a nap now, but will try to generate a decompression routine that can lookup strings in this way when I get the chance.  I'll also test the performance of my existing compressor on your string table to see if it still yields ~30%.

Okay, I'd love to see your results.

The code is on my wife's computer, which she is using to watch Dr. Who right now, so I can't get to it at present.  I can easily re-implement it, however, at least to the point of providing the compression performance data.
  
Finally, don't be worried about imperfection in your code, and don't worry about apologising for it.  Just present what you can, and let the community work with you to improve it.  Similarly, don't unnecessarily delay release under an open-source license: they are designed to allow fast collaboration on imperfect code!

Hmm...  Yes, I'm probably too used to working in a more commercial environment and think of the community too much as a client.  I'm thinking of the final product as being undelivered until properly accepted, and that the final product is then open-source.  I guess I needn't have attached copyright to the code from the start, but I think that really, even if you don't, in some way it is owned by the original author, and if they haven't specifically released it into the public domain then it is proprietary.  I didn't want the code to be left dangling with no ownership, no one known to have cared for it.  I do say that it is, for all intents and purposes, LGPL...  I hate having to put all of the legalese at the start of each file.  *Meh*

The normal method in open-source work is to put the GPL/LGPL/whatever header on the file at the outset, and just make it explicit.
 

Umm..  Maybe what I should be doing is working with the code in a public hosting with LGPL license right from the start.  Would you like these sort of programs to be stored with your C65GS project or should I create a new one?

This is what I would recommend. Get yourself a github.com account, create a repository and put your code there. Get used to using git (there are lots of tutorials), as it will pay dividends in the long term.

Paul.

Paul Gardner-Stephen

unread,
Dec 13, 2014, 4:49:56 AM12/13/14
to Daniel England, c65gs-de...@googlegroups.com
Hello,

On Sat, Dec 13, 2014 at 3:50 PM, Daniel England <mewpo...@hotmail.com> wrote:
Paul,

Further to this, you actually can only save 2 bytes because technically, since you are using ADC, you need to save the processor status, clear the decimal mode and restore the status before exit.

There is an even more efficient way to do it using decimal mode:


But it boils down to:
; Convert a hex digit ($00-$0F) to ASCII ('0'-'9' or 'A'-'F')
HEX2ASC SED             ;Enter BCD mode
        CLC             ;Ensure the carry is clear
        ADC #$90        ;Produce $90-$99 (C=0) or $00-$05 (C=1)
        ADC #$40        ;Produce $30-$39 or $41-$46
        CLD             ;Leave BCD mode
If you want to PHP/PLP you can, but I would suggest that you should explicitly control the D flag in your tests so that it can be avoided.  In any case, you just need the PHP and PLP once in the surrounding routine, so the 2 bytes are amortised over the two calls, giving an effective cost of just 1 extra byte.

So we need 8 bytes for the HEX2ASC routine, plus 2 bytes to save/restore the processor flags: 10 bytes in total, for a saving of 6 bytes.
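The trick can be checked exhaustively off the 6502.  Here is a Python model of the two decimal-mode ADC steps (following the documented NMOS BCD adjust for valid inputs, and ignoring flag quirks — it models a correct CPU, not the buggy 65CE02 silicon):

```python
def adc_decimal(a, operand, carry):
    """Model a 6502 ADC in decimal (BCD) mode, returning (result, carry_out).

    Follows the documented NMOS behaviour for valid BCD inputs; undocumented
    flag oddities are deliberately ignored here.
    """
    lo = (a & 0x0F) + (operand & 0x0F) + carry
    hi = (a & 0xF0) + (operand & 0xF0)
    if lo > 0x09:
        lo += 0x06              # low-nibble decimal adjust
        hi += 0x10              # carry into the high nibble
    carry_out = 0
    if hi > 0x90:
        hi += 0x60              # high-nibble decimal adjust
        carry_out = 1
    return ((hi & 0xF0) | (lo & 0x0F)), carry_out

def hex2asc(nibble):
    """Mirror of HEX2ASC: SED, CLC, ADC #$90, ADC #$40, CLD."""
    a, c = adc_decimal(nibble, 0x90, 0)   # $90-$99 (C=0) or $00-$05 (C=1)
    a, c = adc_decimal(a, 0x40, c)        # $30-$39 or $41-$46
    return a

print(''.join(chr(hex2asc(n)) for n in range(16)))  # prints 0123456789ABCDEF
```

Running it over all sixteen inputs confirms the two-ADC sequence lands exactly on '0'-'9' and 'A'-'F'.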
:)

Daniel England

unread,
Dec 13, 2014, 5:26:52 AM12/13/14
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Paul,

 
Umm...  The "non technical" language is foreign to me.  I prefer the way "approximately" sounds to "about".  It presents a more authoritative personality on behalf of the program.

"About" is a synonym for "approximately", the main difference is that more people understand "about" than "approximately", and is often true in such cases, the more understandable word is also shorter.  

My experience as an academic has taught me that people often think that technical language should use as many long words as possible, because it sounds "more important" or some variation on that.  The opposite is true, because technical matter should be as clear as possible, and as hard as possible to mis-understand.  

If your text needs big words to sound authoritative, then it is possible you are relying on appearances rather than substance.

The reason we still see long and often vague words and phrases in academic literature is that people learn this bad habit as students by reading the text full of long and vague words produced by the previous generation of academics.

I should upsub your goodthink and make use your newspeak to doubleplusgood my speakwrite.

*rofl*

I think there are layers of meaning which could be appropriate in setting a mood or context. 

 
Consider removing "BLANK " above to save a further 6 bytes.

You can't really, because technically the disks have to be blank (except for the sole condition that they were used for DMode test data before).  I guess I could change the code to do a quick format on them instead, but I didn't want to do that because of the execution time (a quick format is still slow).

I have no real objection to this, although to me, it is implied that the disks must be empty.  What is the probability that someone will put a full disk in instead of an empty one?  If you feel it is substantially above zero, then by all means "BLANK" should be retained.

I want to make sure the user appreciates what is happening.
 

  
Change "QUITE SOME TIME" to "A LONG TIME" to save 4 bytes.

Sorry, to my mind, "a long time" does not say "a very, very, so very, long time".  I'm in the habit of being as informative as I can with messages I present to the user.  "Quite some time" is a more accurate, although still "polite" way of saying, "you'll be here all day and probably most of tomorrow, too".
 
The choice is yours. If you know what order of magnitude it takes, you could change it to "HOURS", which is even shorter.  It really depends how much space you need to save.

Actually, yes.  You are correct here.  I wasn't sure when I wrote it just _how_ long it would be other than it would be _much_ more than "very".  I think it probably would be better to revise this for accuracy since more information is now available.  "Quite some time" should be "about 10 hours" but that only saves 1 character.

 
Consider merging "LEVEL SELECT: " and "TYPE SELECT: " into a single "SELECT: " or "PICK: " to save 19 or 21 bytes.

Again, I'm just in the habit of providing as much information as I can.  The "Level select" and "Type select" tell you what options you are configuring with your selection.  I don't think this should be changed in order to allow the user to understand what it is they are doing.
 
Again, up to you, but my feeling is that the extra information is redundant, because it is clear what the options are in each case.  If you feel that people could make a mistake if offered only "SELECT:", then the extra word is justified.

I think they are necessary here to put the options in context of some meaning which they wouldn't have otherwise. 

  
You could add an escape code to the printing routine that prints the option prefix part, without having to split the strings up.

Yes, the standard option selections are in the core, so I think I will make a standard routine to print a menu, given strings for labelling the options. 

 
Anyway, these are all just suggestions to save memory should you need it.

I appreciate all of the input!


Daniel. 

Daniel England

unread,
Dec 13, 2014, 5:30:51 AM12/13/14
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
 
Again it comes down to a time/space trade-off.  If you can avoid the trade-off, do so.  But also, to assess the trade-off you need to know the actual cost.  My calculation is likely to be pessimistic.  I think that you would find the performance to be quite fine. But without testing, we won't know.

I'm working on something now but I'm really displeased with the code size at the moment.
 

The code is on my wife's computer, which she is using to watch Dr.Who right now, so I can't get to it at present.  I can easily re-implement it, however, at least to the point of providing the compression performance data.

It's okay, whenever you get to it.


This is what I would recommend. Get yourself a github.com account, create a repository and put your code there. Get used to using git (there are lots of tutorials), as it will pay dividends in the long term.

I have been looking into which free hosting to use for some other projects, too.  I have only used SVN and CVS (and some custom systems) in the past.  The programming tools I use don't have Git support, only SVN, so I was tending towards SourceForge.  But I will have a look at Git and see what happens.  I have downloaded the tools and a GUI now.


Daniel England

unread,
Dec 13, 2014, 5:34:48 AM12/13/14
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
 
There is an even more efficient way to do it using decimal mode:


But it boils down to:
; Convert a hex digit ($00-$0F) to ASCII ('0'-'9' or 'A'-'F')
HEX2ASC SED             ;Enter BCD mode
        CLC             ;Ensure the carry is clear
        ADC #$90        ;Produce $90-$99 (C=0) or $00-$05 (C=1)
        ADC #$40        ;Produce $30-$39 or $41-$46
        CLD             ;Leave BCD mode
If you want to PHP/PLP you can, but I would suggest that you should explicitly control the D flag in your tests so that it can be avoided.  In any case, you just need the PHP and PLP once in the surrounding routine, so the 2 bytes are amortised over the two calls, giving an effective cost of just 1 extra byte.

So in total we need 8 bytes for the HEX2ASC routine, plus 2 bytes to save/restore processor flags, so 10 in total, saving 6 bytes in total.


Hmm.  We can't rely on the decimal mode working correctly in this program, and have to make sure it's turned off.  I take your meaning about turning it off in the calling code, but I'm not sure that this should apply to a "library function".  Perhaps I will leave it up to the test module to determine how to do the translations and keep the core simpler with fewer "purely utility" functions.

Paul Gardner-Stephen

unread,
Dec 13, 2014, 5:44:29 AM12/13/14
to Daniel England, c65gs-de...@googlegroups.com
In the long run, git is much better than SVN or CVS, in particular for open-source development where you have simultaneous work going on from a loose team.  Mercurial is the only other real contender.

Paul.

Paul Gardner-Stephen

unread,
Dec 13, 2014, 5:53:46 AM12/13/14
to Daniel England, c65gs-de...@googlegroups.com
On Sat, Dec 13, 2014 at 8:56 PM, Daniel England <mewpo...@hotmail.com> wrote:
Paul,

 
Umm...  The "non technical" language is foreign to me.  I prefer the way "approximately" sounds to "about".  It presents a more authoritative personality on behalf of the program.

"About" is a synonym for "approximately", the main difference is that more people understand "about" than "approximately", and is often true in such cases, the more understandable word is also shorter.  

My experience as an academic has taught me that people often think that technical language should use as many long words as possible, because it sounds "more important" or some variation on that.  The opposite is true, because technical matter should be as clear as possible, and as hard as possible to mis-understand.  

If your text needs big words to sound authoritative, then it is possible you are relying on appearances rather than substance.

The reason we still see long and often vague words and phrases in academic literature is that people learn this bad habit as students by reading the text full of long and vague words produced by the previous generation of academics.

I should upsub your goodthink and make use your newspeak to doubleplusgood my speakwrite.

*rofl*

:)

Of course, this is also a good example of where "simplifying" text is a false goal.  However, done well, it is very effective. Aviation English is a good example of it done well (but of course not perfectly).

Paul.
 

Daniel England

unread,
Dec 29, 2014, 4:20:39 AM12/29/14
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Guys,

After improving the code as per the suggestions and wading through the licensing guff, I have finally published the TestDMode program to GitHub.  Currently there is only a version for the VIC20 but that's all that is required at this point.

You can access it at:  <https://github.com/M3wP/6502-Test-DMode>

Comments and suggestions are most welcome and you can, of course, participate in the development.


Daniel.

Daniel England

unread,
Jan 5, 2015, 1:47:26 AM1/5/15
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Guys,

Wayne got back to me with the test data results that he collected after running the TestDMode program on his VIC 20 with his 65CE02 CPU installed.

I wrote a program to convert the test data into a CSV file so it will be easier to analyse.  I also wrote some batch files to extract the result files from the test disks, but those are Windows-only.

The program is up on GitHub.  You'll need Lazarus/FPC to compile it.  It should work all right on any platform but I haven't checked it yet.

I've attached the CSV file for your convenience. 

A quick look through reveals results that I wasn't expecting.  It seems that the 65CE02 will fail for some ADC and some SBC instructions but not consistently.  It seems that there really could be some scrambled logic in there but it will take someone more fluent in reading the results than me to tell exactly what is going on.


Daniel.
TestDModeResults.7z

Daniel England

unread,
Jan 5, 2015, 7:06:59 AM1/5/15
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Forwarded from Paul because he forgot to reply to the group:


Hello Daniel,

Have you forwarded a copy of the results to the 6502 hackers list?  I think that they might be interested.  Also, I presume you have confirmed that it isn't a bug in your program, i.e., by taking one of the strange results and writing a very simple program that just tests those inputs, so that you can be sure it is a real problem with the silicon.

Paul.

Daniel England

unread,
Jan 5, 2015, 7:15:51 AM1/5/15
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Paul,

Yes, I was just thinking about how I should do that, forward the results to the 6502.org forums, too.

You are right though, I should try to verify the results.

I wanted to implement the specific test request functionality in the test program for this purpose, but now that I'm thinking about it, maybe verification should be an independent program, written independently.

I did run the tests very many times on the VICE vic20 emulator and was able to confirm that all of the tests passed.  I could potentially implement the C64 version and run it on the SCPU version of the C64 emulator to test how it performs on a 65816 (although emulated).

I have contacted Bruce Clark and have had a few communications with him about the program, but I am unsure whether he has checked the code.

I was very careful when I modified the code, to be sure that I didn't disturb any of the test logic.

I'll have to have a think about it.  I don't have a VIC 20 myself but I do have some C128s and possibly a working C64.  I'm not sure if any of my drives still work and I don't have a u1541-II or such device.

Maybe the best thing would be to post a request to the 6502.org forums for help in verifying that the program does work and/or for someone to double check our results.

I'll make the post either later on tonight or tomorrow morning.


Daniel.

Daniel England

unread,
Jan 5, 2015, 7:38:22 AM1/5/15
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
*Doh!*

I just realised that I can modify and recompile VICE such that it ignores the P_D flag, convert the results and do a diff to see if the results are the same or not.

I'll do that now.  VICE takes a while to build, though.


Daniel.

Daniel England

unread,
Jan 5, 2015, 9:33:39 AM1/5/15
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
*sigh*

Well, it seems there are problems with VICE writing to the disk in warp mode.  I did notice some glitches in my first run of testing, where the data was written to the screen instead of the disk.  I thought I had fixed it by making sure that interrupts were disabled while writing (I suspected the problem might be in the KERNAL itself), but it seems that the problem is actually in VICE, because the data can still get corrupted.

I am re-running the tests with the modified VICE (no decimal mode support at all in the CPU) at the regular speed.

I did notice, though, that I got a much larger number of errors than Wayne did.  I wonder if he ran the tests at the 6502 reference "conformance level".  I'll have to ask.  We need it done at that conformance level to test against all of the 6502 behaviours, including the undocumented ones.  If he ran it at another conformance level, then the tests are actually different, to account for the fact that those CPUs don't support the undocumented behaviours or fix the V/N flag oddities.

Also, as for verifying the results, I've remembered that the code does check itself as it runs.  It performs the calculations using regular binary mode, converts the values into decimal-mode values, and performs the calculation again.  If the results don't match, it fails the test.  So we probably don't need to do any other verification.
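That cross-check idea can be sketched as follows — a Python illustration of the approach described, not the actual TestDMode code, and the function names are made up:

```python
def bcd_to_int(b):
    """Decode a packed-BCD byte, e.g. $59 -> 59."""
    return (b >> 4) * 10 + (b & 0x0F)

def int_to_bcd(n):
    """Encode 0-99 as packed BCD, e.g. 59 -> $59."""
    return ((n // 10) << 4) | (n % 10)

def check_adc(a, b, carry, observed, observed_carry):
    """Cross-check one decimal-mode ADC against plain binary arithmetic.

    `observed`/`observed_carry` would come from the CPU under test; here we
    just demonstrate the comparison with known-good and known-bad values.
    """
    total = bcd_to_int(a) + bcd_to_int(b) + carry
    expected = int_to_bcd(total % 100)
    expected_carry = 1 if total >= 100 else 0
    return (observed, observed_carry) == (expected, expected_carry)

# $25 + $48 should give $73 with no carry on a correct CPU...
assert check_adc(0x25, 0x48, 0, 0x73, 0)
# ...while a part returning the binary sum $6D would be flagged as a failure.
assert not check_adc(0x25, 0x48, 0, 0x6D, 0)
```

Because the expected value is derived from ordinary binary arithmetic, the check never depends on the decimal-mode hardware it is testing.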

I'll contact Wayne and then get back to you with the comparisons.


Daniel.

Paul Gardner-Stephen

unread,
Jan 5, 2015, 4:15:48 PM1/5/15
to Daniel England, c65gs-de...@googlegroups.com
All sounds good. However, when presenting amazing results one is wise to also go to amazing lengths to verify.  The case of the scientists who thought that they had found a roughly earth-sized exo-planet comes to mind.  Independent verification revealed that they had not corrected for the motion of the telescope on the earth around the sun.  The only way to be sure to have excluded such systematic errors is to test in a different way.  You don't need to test every possible result, just one or two of the ones that behave differently than expected.

Paul.

--

Daniel England

unread,
Jan 6, 2015, 3:58:35 AM1/6/15
to c65gs-de...@googlegroups.com, mewpo...@hotmail.com, pa...@servalproject.org
Okay...

I've made some changes to the conversion tool to better support doing comparisons/diffs.  That is, if you can get a diff engine that doesn't try to be smart and insist on claiming that the first five line differences are missing lines.  I certainly couldn't.

I've just ended up importing the CSV files into Calc (a spreadsheet) and then copy/pasting one alongside the other.  I think I might make some scripts for Calc to highlight differences/merge sheets (so you put the results on different sheets and it will diff and merge into another sheet).  I'll see...
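One way around the "smart" line-diff problem is to match rows on the test-identifier columns instead of by position.  A sketch of that keyed comparison (the column names here are hypothetical, not the actual TestDMode CSV layout):

```python
import csv
import io

def diff_results(csv_a, csv_b, key_fields=("op", "a", "b", "carry")):
    """Pair rows from two result CSVs by test identity, returning the
    (key, row_a, row_b) triples that differ. Keying on the test inputs
    means row order and missing lines can't confuse the comparison."""
    def load(text):
        return {tuple(r[k] for k in key_fields): r
                for r in csv.DictReader(io.StringIO(text))}
    rows_a, rows_b = load(csv_a), load(csv_b)
    return [(k, rows_a[k], rows_b[k])
            for k in rows_a if k in rows_b and rows_a[k] != rows_b[k]]

good = "op,a,b,carry,result\nADC,25,48,0,73\nSBC,50,25,1,25\n"
bad  = "op,a,b,carry,result\nADC,25,48,0,6D\nSBC,50,25,1,25\n"
assert [k for k, _, _ in diff_results(good, bad)] == [("ADC", "25", "48", "0")]
```

The same keyed merge could be done in a Calc script, but it is only a few lines in any language with a CSV reader.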

So, I now have some reference results where the decimal-mode logic has effectively been pulled out of the 6502 by modifying VICE, recompiling it and running the test program.  Gosh, that took a while.  C/C++ is notoriously slow to compile, and the program itself takes ages when you can't run it in warp mode (warp mode corrupts the output).  

I must say that I hate working with C/C++ and the build chain is hideous.  They really aren't 3rd gen languages, IMHO.  They are the reason that such terminology has died.  It is a masquerade, performed by a string of programs massaging the code, that presents itself as a 3rd gen environment/language.  Working with it is arcane.  *blech*

Anyway, I think I should also note that the VICE team seems to have hacked in the DTV support by replacing whole files.  I couldn't get changes in 6510core.c to "take".  I had to modify 6510dtvcore.c to get my changes to appear even though I was compiling xVIC which has nothing to do with the DTV.  I think that's pretty poor form but I also think that working with C/C++ makes your brain function in bizarre ways that are entirely unnatural.  Hence the constant problems in programs with buffer overrun security problems, fatal exceptions and all that nonsense.  If they were using a real programming language, none of those problems would exist today.  Okay, okay...  Rant over.

I now just have to wait to hear back from Wayne as to whether or not he used the correct conformance level in his tests.  It seems to me now, examining the results at the level of detail I can now manage, that he probably didn't, which would explain why the results weren't what I was expecting to begin with.

I won't post the results I have just yet, because I think it would be better to have them side by side in Calc for analysis.  I also haven't synced my changes to the conversion tool yet.  I just need to make sure it's all tidy, have a break and then make the commits.


Daniel.