Long-term Management of Disk Images and Scanned Docs

Mark Garlanger

Jul 12, 2023, 11:54:58 AM
to se...@googlegroups.com

Great discussion in the other thread, but the subject is important enough to have its own thread. The archive.org collection approach is definitely something we should investigate further. 

A few points I'd like to add: we should consider not only what we need now, but also what we think people in the future will want. One thing I have been doing lately is taking a photo of each disk as well. This option was added in AppleSauce, and it seems like a nice thing to include. Also, we should try to use existing popular tools and formats if they do what we need. For example, the ImageDisk program and its IMD format are widely used for soft-sectored disks.

Here are some of the key points from the other thread. If I missed any, please add them.

On Wednesday, July 12, 2023 at 6:23:14 AM UTC-6 Glenn Roberts wrote:
  

so I’d be interested in discussion of the best way to maintain our growing library of disk images.  I’m looking to release another 80-100 that I’ve recently scanned. I know Mark says he has a large collection that he hasn’t had time to scan/index.

 

Our largest current library was originated by Les Bird in what he calls the “Application Archive”

https://sebhc.github.io/sebhc/software.html#Application_Archive

 

including a complete index of all the disks

sebhc.github.io/sebhc/software/H8DCATALOG.HTML

 

these are stored in H8D format – just a linear dump of the sectors on the disk – in 100K, 200K and 400K formats.  For my own purposes this has been sufficient for reproducing physical disks or loading into my “jukebox”.
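As a rough illustration of the "linear dump" idea, the sketch below describes an H8D file from its size alone. It assumes 256-byte sectors and 10 sectors per track, which is consistent with the 100K size quoted above but is not stated in this thread; the function name is hypothetical. The size cannot distinguish how a 200K image divides between sides and tracks, so only the combined count is reported.

```python
SECTOR_BYTES = 256        # assumption: data bytes per sector
SECTORS_PER_TRACK = 10    # assumption: sectors per track


def h8d_summary(size_bytes: int) -> dict:
    """Describe an H8D image using nothing but its file size."""
    track_bytes = SECTOR_BYTES * SECTORS_PER_TRACK
    if size_bytes <= 0 or size_bytes % track_bytes != 0:
        raise ValueError(f"{size_bytes} bytes is not a whole number of tracks")
    return {
        "sectors": size_bytes // SECTOR_BYTES,
        # tracks multiplied by sides; the split between the two is not recoverable here
        "track_sides": size_bytes // track_bytes,
        "kib": size_bytes // 1024,
    }


for size in (102_400, 204_800, 409_600):
    print(size, h8d_summary(size))
```

A check like this could flag truncated or padded files before they go into an archive, since any size that is not a whole number of tracks raises an error.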

 

Les has also taken the time to pull out and index the Heath User Group disks that we have:

https://sebhc.github.io/sebhc/software.html#HUG_Application_Library

 

I have also contributed to Les’ collection from my own personal library, plus “rescues” that I’ve done.  If I added right, there are 842 disk images in this library.

 

So this is quite a collection but it’s largely uncurated – it’s just a dump of a bunch of disk images, albeit with a master index.  I have found a great many useful programs in this collection.

 

 

Mark has a carefully structured/curated collection on his site:

Software Library (garlanger.com)

 

and has captured manuals and archival copies of the disks (e.g. .H17DISK format)

 

I personally think this two-tiered system has worked well. Both collections serve a purpose. I suggest we all work with Mark to make sure he has copies of any master disks and manuals that we have.

 

More recently Les has created a more structured/curated collection in the SEBHC “wiki”

https://github.com/sebhc/sebhc/wiki

 

these are all great, but it can be a bit confusing knowing where to look for what.

 

Creating a GitHub site for disk images perhaps offers other value and I’m sure a case can be made. Maybe we keep multiple libraries going to see which are more useful over time, then consolidate.

 

Curious what others think.  Lacking any other guidance or decisions I will probably create a 10th volume in the Les/SEBHC archive to capture my latest scans (and will get copies to Mark of any special packages or commercial products that we don’t have captured there).  Les’ tools make this easy to do and easy to generate a master index of the disks.

 


On Jul 12, 2023, at 8:44 AM, J.B. Langston <jb.la...@gmail.com> wrote:

so I’d be interested in discussion of the best way to maintain our growing library of disk images.  I’m looking to release another 80-100 that I’ve recently scanned. I know Mark says he has a large collection that he hasn’t had time to scan/index.

I was thinking about uploading my stuff to archive.org.  They already have a large collection of other retro media, but I checked and they have very little related to Heathkit. They allow creating collections, so we could create "Heathkit 8-bit Software" and "Heathkit Documentation" collections and then collaboratively add things to them. They also have a DMCA exemption so it might be safer from a copyright perspective to upload things there rather than self-hosting them.

Here are the instructions for setting up a collection: https://help.archive.org/help/collections-a-basic-guide/

There is a requirement to have a minimum of 50 items to request a collection, so someone would need to upload some images first, and then request the collection.

You could have a look at Apple II Library: The 4am Collection as an example of a good collection.

Also for an example of a document library, see Sextant Magazine.  I really like their online viewer as well.

-------
Glenn:

And while I’m thinking about scanning old disks… just a reminder that we have an obligation to remove personally identifiable information (PII).  Back in the day lots of people put their checking accounts, wills, other personal information on their computers (I actually found stuff like this in some of the disks I scanned).  Someone here recently reminded us that some folks even etched their social security number on things (remember that?)

 

So some due diligence is in order before we post anything scanned from a private collection…

 --------

Gene:


> Not aware of any tools for this ?
>
In the past, I've just run the disk image through "strings", looking for
PII stuff.  It's worked well.
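A minimal Python sketch of the same idea, for anyone without the "strings" utility: it pulls runs of printable ASCII out of a disk image and flags those that look like a US social security number. The function names and the single pattern are illustrative assumptions, not an existing tool; real PII (names, account numbers, letters) still needs a human to read the output.

```python
import re

# Runs of 4 or more printable ASCII bytes, the same default minimum that strings uses.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# Illustrative pattern only: NNN-NN-NNNN, the usual written form of a US SSN.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def printable_strings(data: bytes):
    """Yield (offset, text) for each printable run in the image."""
    for match in PRINTABLE_RUN.finditer(data):
        yield match.start(), match.group().decode("ascii")


def flag_possible_pii(data: bytes):
    """Return the printable runs that contain an SSN-like pattern."""
    return [(offset, text) for offset, text in printable_strings(data)
            if SSN_LIKE.search(text)]


sample = b"\x00\x01Hello world\x00\xffSSN 123-45-6789\x00ab\x00"
for offset, text in flag_possible_pii(sample):
    print(f"offset {offset}: {text}")
```

One limitation: text stored with the high bit set, or a string split across non-adjacent sectors, will not show up as a single printable run.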
--------

Doug:


Are these HDOS images? I created some tools in Java for accessing HDOS images, for listing/extracting files, plus expanded my "format" image-maker to create and populate HDOS images. None of these allows you to alter (delete files from) an existing HDOS image, but you could extract files and then create a new image that excludes files you don't want. In case that's your last option...

For CP/M images, "cpmtools" should work - once you know the right diskdef to use.

------
Mark:

One of the command line tools in the heath-imager repo - https://github.com/mgarlanger/heath-imager/blob/master/src/cmd/h17d_extract_files.cpp, will extract all files for HDOS and CP/M disks. It tries to determine if it is an HDOS, CP/M, or Dual formatted disk and then extracts accordingly. The actual HDOS and CP/M handling is done in files in the lib directory, so those could potentially be used in other programs.



J.B. Langston

Jul 13, 2023, 8:14:41 PM
to SEBHC
On the subject, I have finished imaging all the disks (all 153 of them) with an apparent 100% success rate.  I am planning to upload them to archive.org after I validate some of them.  I will probably hold off on the non-originals until I verify there are no personal files on them.  I suppose I could scan the disk image of the original though the idea of individually scanning disks doesn't sound particularly fun after feeding 153 of them to a floppy drive.

Glenn Roberts

Jul 13, 2023, 9:41:56 PM
to se...@googlegroups.com
Since you’re taking the lead, do you have any thoughts on how to organize these? And index files to search them? What format do you plan to use?

Thx!

Sent from my iPad

--
You received this message because you are subscribed to the Google Groups "SEBHC" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sebhc+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sebhc/8b425631-e916-4adc-bde3-c14012efa9c9n%40googlegroups.com.

J.B. Langston

Jul 13, 2023, 10:40:15 PM
to SEBHC
I will probably convert them to h8d since that seems to be the prevailing standard at the moment. I will keep the h17disks in case the metadata is needed later. The Internet Archive keeps metadata about individual artifacts (author, year, format, etc.).  Usually multiple disks for a single release would be a single entry (e.g. if there was a multi-disk distribution like HDOS or CP/M). I will send over a link to the first one I post.

Mark Garlanger

Jul 13, 2023, 11:31:48 PM
to se...@googlegroups.com
Doesn't archive.org support multiple formats? I know for scanned documents, they typically have it in a wide range of formats. The sextant magazines are available in many formats including:
  • abbyy gz
  • comic book zip
  • daisy
  • epub
  • full text
  • item tile
  • kindle
  • pdf
  • single page processed jp2 zip
Shouldn't be a problem to have both h8d and h17disk v1 (and later to add v2). The h17disk image has not just metadata, but includes the sector header information, which can vary depending on the type of disk (CP/M vs. HDOS). With h8d, any program that is trying to use it in an application like an emulator will have to try to infer which OS the disk is for, in order to generate the proper headers needed to properly run.

Mark


Glenn Roberts

Jul 14, 2023, 5:43:15 AM
to se...@googlegroups.com
Once you have a collection of H8Ds it would be worth using Les’ H8DUtility to generate a master listing of the contents of all the disks.

I can do that if you like. Without this it is very hard to find what you’re looking for.

Assuming this all works as planned we should lay out an organizational system that can grow

Might be good to have different collections for commercial software, HUG, user disks, etc.  Also give some thought to how to separate CP/M and HDOS.

We probably should archive the UCSD pSystem disks here too (we have little or no commercial or HUG software for these).

Think about how to keep the disks and associated manuals or instructions together or at least cross linked.

We should also capture the cassette OS and documentation here.

This will be great once we’ve been able to upload and organize everything. It feels like our best shot at assuring perpetuity. Everything else we’ve done has relied on sites we personally own/manage.

Sent from my iPad


J.B. Langston

Jul 17, 2023, 11:40:54 AM
to SEBHC
Mark,

It does support multiple artifacts/formats per entry so it should be no problem to upload both h17 and h8d files.

I was trying to use your emulator to validate that some of the disk images boot before uploading them, but I'm getting ?Boot Error from all the disk images I try.  I am using the attached config.hdos file, which is basically your config from the README with a few of the lines commented out and the disk image names changed.  I have my V89_CONFIG variable set to point to this file:

$ env | grep V89
V89_CONFIG=/home/jblang/VirtualH89/config.hdos

The first disk image is what I assume should be the bootable disk for HDOS2.0 downloaded from https://sebhc.github.io/sebhc/software/HDOS/HDOS_2-0.zip

h17_disk1 = /home/jblang/HDOS/HDOS_2-0_Issue_#50-06-00_890-64.h8d

I also tried it with h17_disk1 set to each of the other two disks as well in case that was not the right one, and none of them boot.  Not sure what I'm missing... any pointers?

Thanks,
J.B.
config.hdos

J.B. Langston

Jul 17, 2023, 11:50:23 AM
to SEBHC
Once you have a collection of H8Ds it would be worth using Les’ H8DUtility to generate a master listing of the contents of all the disks.

I was going to verify a few of the disks in an emulator before I upload them but I'm running into some difficulties. I emailed Mark separately about that.
 
I can do that if you like. Without this it is very hard to find what you’re looking for.

The h17d_extract_files command from Mark's heath-imager tool also generates a 0_index.info file that I was going to include in the description.  An HDOS index looks like this:

Disk info
  Serial Number: 0
  Single-Sided
  40-Track

   Label:        'HDOS 2.0 Issue #50.06.00 (Copyright(C) Heath Co 1980) 890-64'

Name    .Ext    Size      Date          Flags
HDOS    .SYS    31      03-Nov-80       SLWC
HDOSOVL0.SYS    26      03-Nov-80       SLWC
HDOSOVL1.SYS    11      03-Nov-80       SLWC
SYSCMD  .SYS    12      03-Nov-80       SLW
PIP     .ABS    19      03-Nov-80       SLW
SY      .DVD    10      03-Nov-80       SL
DK      .DVD    15      03-Nov-80       SL
ERRORMSG.SYS    11      03-Nov-80       SW
SET     .ABS    12      03-Nov-80       SW
FLAGS   .ABS    4       03-Nov-80       SW
ONECOPY .ABS    20      03-Nov-80       SW
EDIT    .ABS    17      03-Nov-80       W
BASIC   .ABS    42      03-Nov-80       W
PATCH   .ABS    11      03-Nov-80       W
INIT    .ABS    29      03-Nov-80       W
SYSGEN  .ABS    21      03-Nov-80       W
TEST17  .ABS    32      03-Nov-80       W
TEST47  .ABS    29      03-Nov-80       W
SYSHELP .DOC    3       03-Nov-80       SW
HELP    .       2       03-Nov-80       SW
RGT     .SYS    1        No-Date        SLWC
GRT     .SYS    1        No-Date        SLWC
DIRECT  .SYS    18       No-Date        SLW

  23 Files, Using 377 Sectors (0 Free)

And a CP/M index looks like this:

 ASM.COM         8K
 ASSIGN.COM      2K
 BIOS.SYS        5K
 CONFIGUR.COM   14K
 DDT.COM         5K
 DUP.COM         5K
 ED.COM          7K
 FORMAT.COM      6K
 LOAD.COM        2K
 MOVCPM17.COM   11K
 PIP.COM         8K
 STAT.COM        6K
 SUBMIT.COM      2K
 SYSGEN.COM      2K
 XSUB.COM        1K
Free space:       6K

If the H8DUtility output is accepted as the standard then I can use that instead.
 
Assuming this all works as planned we should lay out an organizational system that can grow

Might be good to have different collections for commercial software, HUG, user disks, etc.  Also give some thought to how to separate CP/M and HDOS.

I think we can keep it simple for now and once we get about 50 disk images uploaded start the Heathkit 8-Bit Software collection. We can request additional collections once the volume of disk images grows enough to justify it. Entries on archive.org can belong to multiple collections so that shouldn't be an issue.
 
We probably should archive the UCSD pSystem disks here too (we have little or no commercial or HUG software for these).

I have no experience with UCSD Pascal and it wasn't included in any of the disks in my lot, so I will leave that to someone else.
 
Think about how to keep the disks and associated manuals or instructions together or at least cross linked.

Agreed. Once a piece of software and its associated manual have both been uploaded, we should update the description of each to point to the other.
 
We should also capture the cassette OS and documentation here.

I'll leave that to someone else too. I have the cassette OS manuals and one cassette that appears to be software of some kind, but I have no cassette player to play it on.  I am sure that the main manual has been captured by someone before.
 
This will be great once we’ve been able to upload and organize everything. It feels like our best shot at assuring perpetuity. Everything else we’ve done has relied on sites we personally own/manage.

Yes, I think the Internet Archive is a great resource and I support them yearly. I have used it to find a lot of old books about Z80 programming, C64 books, etc., so it will be great to have some Heathkit stuff there too.

Mark Garlanger

Jul 17, 2023, 2:57:30 PM
to se...@googlegroups.com
Hey J.B.,

  The emulator actually uses yet another format, which I have been calling h17raw. It is 320 bytes per sector and 3200 bytes per track: basically just a raw dump of the full track data, which I use internally in the emulator. I planned to pull in the code from the heath-imager to add support for h17disk, but haven't gotten to that. There is a program in the cmd directory of the heath-imager to create these h17raw images: h17d_raw. This program takes two parameters: first, the filename of the h17disk image, and then the filename to write the raw image to.
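The 320-byte sector and 3200-byte track figures imply 10 sectors per track, so file sizes and sector positions in an h17raw image can be worked out with simple arithmetic. The sketch below assumes tracks are stored in order from track 0 with sectors in order inside each track, which the thread does not state; the function names are hypothetical.

```python
RAW_SECTOR_BYTES = 320    # from the description: bytes per sector
RAW_TRACK_BYTES = 3200    # from the description: bytes per track
SECTORS_PER_TRACK = RAW_TRACK_BYTES // RAW_SECTOR_BYTES


def raw_image_size(tracks: int, sides: int = 1) -> int:
    """Expected size in bytes of an h17raw image with the given geometry."""
    return tracks * sides * RAW_TRACK_BYTES


def raw_sector_offset(track: int, sector: int) -> int:
    """Byte offset of a sector, assuming sequential track and sector order."""
    if not 0 <= sector < SECTORS_PER_TRACK:
        raise ValueError(f"sector must be 0..{SECTORS_PER_TRACK - 1}")
    if track < 0:
        raise ValueError("track must be non-negative")
    return track * RAW_TRACK_BYTES + sector * RAW_SECTOR_BYTES


print(SECTORS_PER_TRACK)
print(raw_image_size(40))
print(raw_sector_offset(1, 2))
```

Under these assumptions a 40-track single-sided h17raw image is 128,000 bytes, noticeably larger than a 100K H8D of the same disk, so comparing file sizes is a quick hint as to which format a given file is in.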

Another thing you may run into, when booting an HDOS disk, like an unmodified distribution disk, is that it requires you to press the spacebar a few times for it to determine the baud rate of the terminal attached to it. The boot process will stop until the space bar has been pressed a few times.

Let me know if you run into any other issues.

Mark


glenn.f...@gmail.com

Jul 17, 2023, 9:14:39 PM
to se...@googlegroups.com

The format is fine but I believe what’s needed is a single file with all the contents.  This sounds like a simple shell scripting problem.
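A single combined file could be built along these lines. This is a sketch in Python rather than shell, and it assumes each disk's files were extracted into their own directory containing a 0_index.info file; that directory layout and the function name are assumptions, not something the tools are known to enforce.

```python
from pathlib import Path


def build_master_index(root, out_file, index_name="0_index.info") -> int:
    """Concatenate every per-disk index under root into one text file.

    Each section is headed by the disk's directory relative to root.
    Returns the number of index files that were combined.
    """
    root = Path(root)
    count = 0
    with open(out_file, "w", encoding="utf-8") as out:
        # Sorted so the master index is stable from one run to the next.
        for index_path in sorted(root.rglob(index_name)):
            disk = index_path.parent.relative_to(root)
            body = index_path.read_text(encoding="utf-8", errors="replace")
            out.write(f"===== {disk} =====\n")
            out.write(body.rstrip() + "\n\n")
            count += 1
    return count
```

Calling build_master_index("extracted", "master_index.txt") would then give one plain-text file that can be searched with any editor or with grep.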

 

This is one area where the H8DUtility is nice – just put all the H8Ds in a single directory and it’ll automatically generate a full set of directories for all disks.

 

Hopefully once you get something started others of us can add on and build out the collection.

 

  • Glenn


J.B. Langston

Jul 18, 2023, 11:55:09 PM
to SEBHC
OK, that worked. I have uploaded the HDOS 2.0 3 disk set to archive.org to get us started!  Happy to hear any feedback on the formatting or conventions used.

Mark Garlanger

Jul 19, 2023, 12:20:10 AM
to se...@googlegroups.com
Looking good. One thing I did notice is that in the topics you listed z80; although the H89 had the Z80, the H8 came with the 8080, and HDOS was compatible with it.

Also, should we include the type of software? For this one, that would be "operating system". We could have other types like "games", "word processor", "programming language", etc.

J.B. Langston

Jul 19, 2023, 12:24:18 AM
to se...@googlegroups.com
I can add a tag for 8080 and operating-system.  I haven't looked at the way other platforms are tagged, but I will look at how the Apple II stuff is done.  I'm also trying to upload image scans of the disks, but they are taking forever to upload.  The images are 5MB each (hilarious considering the size of the data on the disk), so I should probably reduce the resolution and try again.


Mark Garlanger

Jul 19, 2023, 8:01:34 AM
to se...@googlegroups.com
I had been resizing my photos of the disks to about 1000x1000. I think it was coming out to about 300k per image.


J.B. Langston

Jul 19, 2023, 8:12:10 PM
to se...@googlegroups.com
That's what I ended up doing and it finally finished. Images just seem to take longer though, maybe because they are generating thumbnails.

Side question: on your emulator, I tried to run ywing fighter and I'm getting letters instead of the graphical characters... is there some trick to enabling those?

Mark Garlanger

Jul 19, 2023, 9:26:23 PM
to se...@googlegroups.com
For the letters issue, I'm guessing you are using HDOS and a distribution disk that hasn't been configured for lower-case support. By default, HDOS assumes the worst (that the terminal does not support lower case or backspace). If the letters you are seeing are upper-case, then that would be the issue: in graphics mode, only the lower-case letters are graphic symbols. So you will need to enter a few set commands on the command line.
  • set tt: bks
  • set tt: nomlo
  • set tt: nomli
The first one lets the backspace work. The second two prevent mapping lower case to upper case on output and input respectively. It seems like only one is needed, but I don't remember which, and for completeness I always do both. The emulator should auto-save the disks you are using to something like "saveA.tmpDisk" (not positive what extension I had set in the latest GitHub version). That file is just a raw image, which the emulator supports, so you can rename your configured HDOS disk to hdos.h17raw and then update the emulator config file to use it. Then you won't have to make the changes every time you boot.

Another thing I usually do is to set the floppy drives to 6ms to speed it up slightly. You can do that with "set sy0: step 6" and repeat for the other drives. Being a virtual drive, it could even be 0, but HDOS has a lower limit of 6.







J.B. Langston

Jul 19, 2023, 11:49:30 PM
to se...@googlegroups.com
I did figure out the backspace one after reading through the HDOS docs, but it didn’t occur to me that the graphics would be related to lower case. I will try this tomorrow. Thanks for the advice! This and the other stuff about the disk format would be great wiki content on your repo, or even just in the README. I’m happy to do a PR.

Mark Garlanger

Jul 19, 2023, 11:57:00 PM
to se...@googlegroups.com
Sounds great, I love to get help on the project.

J.B. Langston

Jul 28, 2023, 9:24:09 AM
to SEBHC
I have started putting together a set of notes based on my explorations of the Heathkit via your emulator, and reading the docs and listings I have.


It's still very much a work in progress, but you are welcome to take anything you find useful.

glenn.f...@gmail.com

Jul 28, 2023, 4:03:59 PM
to se...@googlegroups.com

JB: this is great. Thanks for taking the time. 

 

Not sure how aware you are of the various resources we have: for example, full source code for HDOS (not just PDFs but assemblable ASCII), scans of all the manuals, ROM listings, etc.  I’m not trying to discourage your web collection at all, just to make it easier for you to find the info if you’re not familiar with it.  It’s admittedly a bit scattershot, but we can guide you to what you need.

 

  • Glenn

 

 

 


J.B. Langston

Jul 28, 2023, 6:18:56 PM
to se...@googlegroups.com
Yes, I have seen the HDOS text sources on Mark's website, as well as the listing and doc PDFs. The reason I created this doc, though, is that the Heathkit docs are IMO extremely wordy, and facts like "What are all the port numbers for the devices in a typical H89?" are not consolidated in one place. I wanted a very condensed reference for the things I want to be able to look up quickly.

Glenn Roberts

Jul 28, 2023, 7:36:06 PM
to se...@googlegroups.com
Great! We also have most of the group’s materials at SEBHC.ORG.

thanks!

Sent from my iPad


