I am sure there are debates to be made for 512 vs 1024 directory entries on an 8MB drive. I have reached a practical limit that will require working around the DIR limit, which can be done.
Has anyone investigated the impacts to existing tools in RomWBW that might be assuming 512 DIR entries for the 8MB drives?
Has anyone created a compatible disk image for RomWBW with 1024 DIR entries?
Also, is it possible to create a compile-time configuration switch for this in CBIOS?
None of the existing tools would be impacted. They would all just work. The issue is the CBIOS DPB.

> Has anyone created a compatible disk image for RomWBW with 1024 DIR entries?

I have. It is trivial to do.
> cpmcp -f rc2014v1-16MB USER.CPM 0:*.* ~/Desktop/cpm_drives/user.cpm/
> cpmcp -f rc2014-16MB USER.CPM ~/Desktop/cpm_drives/user.cpm/*.* 0:
I am very open to thoughts from this group. Do people think I should just make the change and warn everyone?
diskdef rc2014-16MB
seclen 512
tracks 1024
sectrk 32
blocksize 4096
maxdir 1024
skew 0
boottrk -
os 2.2
end
Given that I am mostly running CPM3, 1024 seems more than adequate unless we move to a 16MB slice size, and then 2048 does make sense. I am very happy with 8MB and 1024.
Phillip Stevens wrote: I’ve heard that here before, iirc. That CP/M 2.2 can’t support 16MB drives.
Is there a BDOS corner case you’re aware of?
I’ve used 16 MB for a while with nothing obvious breaking.
Is there something you can point out?
Douglas Miller wrote: It has to do with the math being performed. In the CP/M 2.2 BDOS it uses a 16-bit integer (register pair) to calculate the ARECORD (absolute record number). Since records are 128 bytes, that is 65536*128 = 8MB. That is the largest "drive" that can be addressed. If you've been using a 16MB drive on 2.2, you probably have not tried to allocate blocks beyond the 8MB boundary and have just been lucky. Or else you've been overwriting/corrupting older files and not noticed.
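To make the arithmetic concrete, here is a small hosted C sketch (illustrative only, not CP/M or RomWBW source) of the 16-bit ARECORD ceiling and of what wrapping past it would mean:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* CP/M 2.2 BDOS keeps the absolute record number in a 16-bit register pair */
    uint32_t records   = 65536UL;             /* records 0..65535 are addressable */
    uint32_t rec_bytes = 128UL;               /* one CP/M logical record */
    printf("max drive size = %lu bytes\n",
           (unsigned long)(records * rec_bytes));   /* 8388608 = 8MB */

    /* a record just past the boundary wraps back to the start of the drive */
    uint16_t arecord = (uint16_t)65536UL;     /* truncates to 0 */
    printf("record 65536 held in 16 bits = %u\n", arecord);
    return 0;
}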
diskdef rc2014-8MB
seclen 512
tracks 512
sectrk 32
blocksize 4096
maxdir 2048
skew 0
boottrk -
os 2.2
end
...
If I'm not mistaken (???), as long as the seclen and tracks are contiguous data, there is no real dependency on these numbers when calculating the LBA, as long as each BIOS calculates the right record number in terms of LBA? The only real issue is the blocksize and maxdir numbers (and of course skew).

P.
Correct. In fact, it's actually a huge waste of CPU in the case of devices like this that can/do use LBA. Since the BDOS starts with the ARECORD, all you really need to do is compute "ARECORD >> 2" to convert to 512B blocks (LBA), then issue that to the controller. But instead, the BDOS goes through a divide operation by SPT, and sends the quotient and remainder to the BIOS as track and sector, where they get recombined into LBA.
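As a rough illustration (hypothetical helper names, not the RomWBW or DRI source), both paths land on the same LBA; the divide/recombine route just does extra work on the way:

#include <stdint.h>

#define SPT 32u   /* 128-byte records per track, an example DPB value */

/* direct: 4 records of 128 bytes fit in each 512-byte sector */
static uint32_t lba_direct(uint16_t arecord) {
    return (uint32_t)arecord >> 2;
}

/* what actually happens: BDOS divides by SPT, the BIOS recombines track/sector */
static uint32_t lba_via_bios(uint16_t arecord) {
    uint16_t track  = arecord / SPT;   /* value handed to SETTRK */
    uint16_t sector = arecord % SPT;   /* value handed to SETSEC */
    return ((uint32_t)track * SPT + sector) >> 2;
}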
...
Phillip, does that fit your objectives reasonably?

None of this answers the question of backward compatibility, but that is a different topic...

Thanks,
Wayne
First, I'll just say that I don't want to modify BDOS, but happy to accommodate doing things such that it could be done.
Second, it is turning out to be a little troublesome to use 2048 directory entries in CP/M 3. So, it is kind of a choice between hashed directories of 1024 entries and unhashed directories of 2048 entries.
I did a little test (very unscientific). I copied all the files in the current zsdos drive to other user areas until I ran out of space. I ran out of space on the drive well before using up the 1024 directory entries. I generally think the file sizes represent a typical usage scenario and I think it indicates 1024 entries is pretty workable. Thoughts?
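For a rough sanity check on why 1024 entries tends to be enough on an 8MB volume, here is a back-of-envelope calculation in C (it assumes the 4K block size used above and 16-bit block pointers, i.e. 8 pointers per directory entry; it is not taken from any CP/M source):

#include <stdio.h>

int main(void) {
    const unsigned long drive_bytes  = 8UL * 1024 * 1024;   /* 8MB data area */
    const unsigned long block_bytes  = 4096UL;               /* BLS */
    const unsigned long ptrs_per_dir = 8UL;                  /* 16 map bytes / 2-byte pointers */
    const unsigned long maxdir       = 1024UL;

    unsigned long per_entry  = ptrs_per_dir * block_bytes;   /* up to 32KB mapped per entry */
    unsigned long break_even = drive_bytes / maxdir;         /* 8KB */

    printf("one directory entry maps up to %lu bytes\n", per_entry);
    printf("with %lu entries, the disk fills before the directory whenever the\n"
           "average file size is at least %lu bytes\n", maxdir, break_even);
    return 0;
}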
Taking everything into account, I am currently thinking this:
diskdef wbw_hd0
seclen 512
tracks 511
sectrk 32
blocksize 4096
maxdir 1024
skew 0
boottrk 1
os 2.2
end

This is very close to what you suggested Phillip. However, it reduces maxdir to 1024. I am open to making it 2048 if that is preferred by most folks, with the understanding that we lose directory hashing in CP/M 3. The other change is that tracks has been reduced by one so that a boottrk can be carved out of the 8MB space. This will make a slice exactly 8MB. That will really help the math required to use slices (which is not inside BDOS).
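A minimal sketch of the slice arithmetic this makes possible (names are illustrative, not from the RomWBW source): with each slice exactly 8MB, the starting LBA of slice n reduces to a shift.

#include <stdint.h>

#define SECTOR_BYTES  512u
#define SLICE_BYTES   (8UL * 1024 * 1024)              /* exactly 8MB per slice */
#define SLICE_SECTORS (SLICE_BYTES / SECTOR_BYTES)     /* 16384 sectors */

/* LBA of the first sector of a given slice, from the start of the medium */
static uint32_t slice_base_lba(uint8_t slice) {
    return (uint32_t)slice * SLICE_SECTORS;            /* equivalent to slice << 14 */
}

With the boot track at the front of each 512-track region, no additional offset arithmetic is needed beyond this base.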
diskdef wbw_hdn
seclen 512
tracks 512
sectrk 32
blocksize 4096
maxdir 1024
skew 0
boottrk -
os 2.2
end
Phillip, does that fit your objectives reasonably?
...
Not an expert, but isn't it a bit unnecessary to do hashing for data stored on a modern SD or SSD? They have their own error correction and wear levelling algorithms at the physical layer, which present corrected bits on their SPI or IDE interfaces. If the hashing is for other reasons then ok, but if it is purely to protect against disk errors I'd call CP/M 3 directory hashing obsolete. (Note this isn't saying that file hashing for data transmission, security or proof of identity is unnecessary. Otherwise I'd be calling out block-chain too.)
...Cheers, P.
Phillip Stevens wrote: Not an expert, but isn't it a bit unnecessary to do hashing for data stored on a modern SD or SSD? They have their own error correction and wear levelling algorithms at the physical layer, which present corrected bits on their SPI or IDE interfaces. If the hashing is for other reasons then ok, but if it is purely to protect against disk errors I'd call CP/M 3 directory hashing obsolete. (Note this isn't saying that file hashing for data transmission, security or proof of identity is unnecessary. Otherwise I'd be calling out block-chain too.)
The hashing I'm talking about is CP/M 3 "directory hashing", which means that the hash table allows the BDOS to quickly locate what directory entries it needs for a given file (even empty ones), without doing *any* I/O. Even on a fast IDE/CF device, I think that is a significant boost.
A) By decreasing the SPT from 256 to 32 you will increase the number of subtract loops done in the BDOS by a factor of 8. You might want to be certain that the simplification of the BIOS really offsets that overhead. I would wonder if a little shifting and combining in the BIOS is really worse than those 8x loops. I've not seen the BIOS code, so maybe it is not optimized as much as it could.
Douglas Miller wrote: A) By decreasing the SPT from 256 to 32 you will increase the number of subtract loops done in the BDOS by a factor of 8. You might want to be certain that the simplification of the BIOS really offsets that overhead. I would wonder if a little shifting and combining in the BIOS is really worse than those 8x loops. I've not seen the BIOS code, so maybe it is not optimized as much as it could be.

That is a significant impact, and something worth taking into account. Given that the only objective is to get the 8 MB drive mapped contiguously onto the LBA address space, it makes no physical difference how sectrk or tracks are defined. Now that you've brought in the division (or subtract loops) in the BDOS, it makes sense to maximise sectrk to a reasonably large number, rather than 32.
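A quick way to see the factor-of-8 effect, as a plain C model of a subtract-style division (this is only a toy model, not the actual BDOS routine; the CP/M SPT values are 4x the host 512-byte sectors per track):

#include <stdint.h>
#include <stdio.h>

/* count how many subtractions a divide-by-repeated-subtraction would take */
static unsigned split(uint16_t arecord, uint16_t cpm_spt) {
    unsigned loops = 0;
    while (arecord >= cpm_spt) {   /* one subtraction per track stepped over */
        arecord -= cpm_spt;
        loops++;
    }
    return loops;
}

int main(void) {
    /* a record near the top of an 8MB drive (65535 records of 128 bytes) */
    printf("host SPT 32  (CP/M SPT 128) : %u loops\n", split(65535u, 128u));   /* ~511 */
    printf("host SPT 256 (CP/M SPT 1024): %u loops\n", split(65535u, 1024u));  /* ~63  */
    return 0;
}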
A question. IIRC sectors count from 1 (rather than 0), right? If so, then it would be "messy" to have 256 as the sectrk, unless there was a decrement or similar. Otherwise, if sectors count from 0, then it would be ok (and much more efficient) to have 256, as the bit shifting LBA code is easier if anything. This is something I've never accounted for, as I didn't study the BDOS code previously (obviously an oversight).
The BDOS calls the BIOS SECTRN function with a 0-based logical sector number (16-bit, in BC). Then, the BDOS calls SETSEC with whatever value was returned by SECTRN (HL -> BC). So, the BIOS has full control (unless I've missed something) over whether the sector is 0-based or 1-based.
sectran: ;translate passed from BDOS sector number BC
ld h,b
ld l,c
ret
diskdef wbw_hdn
seclen 512
tracks 64
sectrk 256
blocksize 4096
maxdir 1024
skew 0
boottrk -
os 2.2
end
# RAM disk in Tiny68K
# BLS of 4096
# 1024 sectors per track
# 63 tracks
diskdef t68kram
seclen 128
tracks 63
sectrk 1024
blocksize 4096
maxdir 512
skew 0
boottrk 1
os 2.2
end
Douglas,
Interesting comment about what the BDOS does with the sector & track calculation. From day one my CF disk definition has used 1024 for sectrk; it is mainly to help me with the sector/track calculation in the BIOS. I have not thought about what the BDOS does with it. So does it help or hurt the overall calculation with the fairly large sectrk value?
Bill
PS, I have track offset of 1 and that does eat up 128K of CF memory, but I use the first track to store bootstrap, monitor, and CP/M BIOS/BDOS/CCP.
...
...
And the general definition looks more like this... right?
diskdef wbw_hdn
seclen 512
tracks 64
sectrk 256
blocksize 4096
maxdir 1024
skew 0
boottrk -
os 2.2
end

OR (for CP/M 2.2)
diskdef wbw_hdn
seclen 512
tracks 64
sectrk 256
blocksize 4096
maxdir 2048
skew 0
boottrk -
os 2.2
end
So, a lot of interesting ideas... I am going to try and consolidate them and provide some thoughts:

- Putting system track at end of slice instead of start

It is a creative idea and I see why it would optimize the math required for calculating sector offsets for slices. However, I don't think I am going to get comfortable with this. There are many tools and documents that expect system tracks to be at the start. The system tracks value in the DPB is not an offset, it is a quantity. I am worried about code breakage and confusion. Sorry Phil.

- Number of directory entries

I'm really struggling with this one. I like the idea of maxing it out (2048) just to make sure I never deal with this again. However, in every real-world test I can come up with, 1024 seems to be entirely sufficient for an 8MB hard disk. I'm not sure I got Douglas' heuristic right because I come up with a number much less than 1024. However, as long as the average file size is >= 8K, you will not run out of directory entries. And I dislike precluding directory hashing. Probably inclined to use 1024 directory entries unless someone can come up with a plausible real-world example that would need more.

- Use of an fdisk partition table

Anna is right that it would help with data integrity. This is really not that hard and is an elegant solution, but there are a couple of downsides. Technically, it means doing a sector I/O on every disk login. It also means doing a 32-bit addition on every sector seek. The bigger concern is operational. A user would be required to run fdisk before they could use any disk slices. Use of fdisk seems to be a stumbling point for many users. I would like to hear more thoughts on the pros and cons of this.

- Sectors per track

As long as it is a power of 2, the math is as optimal as it can be. I acknowledge that a higher SPT reduces the iterations in the BDOS divide. However, I have seen some code that will choke on values of 256 or higher because they assume that a single byte is sufficient to store the value. I currently like 32 -- just feels about right.

- Skew

I don't use skew on any of my formats and really don't ever plan to. So, BDOS will always consider the first sector to be zero. I don't think there is a lot more to discuss there.
diskdef wbw_hdn
seclen 512
tracks 512
sectrk 32
blocksize 4096
maxdir 1024
...
I'll do some testing / quantification on an RC2014 BIOS soon to see how much difference 32 & 512 makes vs 256 & 64. Just for interest. I imagine it will pay out when the drive is quite full?
...
Anything using partition tables expects that space to be usable and free. Fuzix for example uses those blocks. So if you do that you can no longer build hybrid fdisk/ROMWBW disks safely, because the first 63 sectors are the boot area (legacy) and 2048 (modern). I wonder if that's why you originally created a 128K system area?
- Sectors per track
As long as it is a power of 2, the math is as optimal as it can be. I acknowledge that a higher SPT reduces the iterations in the BDOS divide. However, I have seen some code that will choke on values of 256 or higher because they assume that a single byte is sufficient to store the value. I currently like 32 -- just feels about right.
I'll do some testing / quantification on an RC2014 BIOS soon to see how much difference 32 & 512 makes vs 256 & 64. Just for interest. I imagine it will pay out when the drive is quite full?
The one open question for me is whether it is possible to differentiate hd0, with a boot track, from hd1, ..., hdn with no boot track?

hd0 == system drive
hdn == data drive

That would allow systems that don't use a boot track to use the full 512 tracks of an 8 MB slice. And that would make the BIOS math much easier (= faster) for those hdn drives.
diskdef wbw_hdn
seclen 512
tracks 512
sectrk 32
blocksize 4096
maxdir 1024
skew 0
boottrk -
os 2.2
end
Any slice of a hard disk can be booted by RomWBW and it has turned out to be a very useful capability because that is what allows me to have a "combo" disk that will boot any of 5 OSes. So, I am not inclined to make it only the first slice.
- Sectors per track
As long as it is a power of 2, the math is as optimal as it can be. I acknowledge that a higher SPT reduces the iterations in the BDOS divide. However, I have seen some code that will choke on values of 256 or higher because they assume that a single byte is sufficient to store the value. I currently like 32 -- just feels about right.

I'll do some testing / quantification on an RC2014 BIOS soon to see how much difference 32 & 512 makes vs 256 & 64. Just for interest. I imagine it will pay out when the drive is quite full?

That will be interesting to hear back on.
I can easily (and will) ensure a system area of 64 sectors. How important is it to accommodate the modern standard of 2048 sectors? I can do that, but would certainly not want to carve 1MB out of the 8MB. I would need to rethink how to handle that.
I'll do some testing / quantification on an RC2014 BIOS soon to see how much difference 32 & 512 makes vs 256 & 64. Just for interest. I imagine it will pay out when the drive is quite full?
Well, I haven't done testing, but I have done a code review.

Based on the example BIOS implementation (provided by DRI, which I've followed closely), the chkuna (check unallocated) routine within the write routine does a single-byte comparison with the CPMSPT constant to work out whether the next track is required. This CPMSPT constant is going to be 4x the configured host sectors per track. If the host sectors per track is 32, then CPMSPT will be 128, which will work. Anything more than a HSTSPT of 32, and the single-byte comparison won't work any more.

And at that point you start heading out on your own tack with your own BIOS implementation. The CPMSPT is held in two bytes, so it can be larger, and the BDOS expects that it may be larger, so the issue is simply within the BIOS.

But my BIOS can't handle anything larger than a HSTSPT of 32. Other BIOS implementations may be similar. And if I'm remembering correctly, that's why I ended up using a HSTSPT of 32 too. Thinking about this has reminded me that 32 wasn't a coincidence.

Anyway, I think that's totally done now.
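A tiny C illustration of the constraint just described (the names CPMSPT and HSTSPT follow the DRI deblocking example; the snippet itself is only a model, not BIOS code):

#include <stdint.h>
#include <stdio.h>

#define HSTSPT 64u                /* host 512-byte sectors per track */
#define CPMSPT (HSTSPT * 4u)      /* CP/M 128-byte sectors per track = 256 */

int main(void) {
    /* chkuna-style code keeps the sector count in one byte, so a compare
       against CPMSPT silently becomes a compare against 0 once CPMSPT hits 256 */
    uint8_t limit = (uint8_t)CPMSPT;
    printf("CPMSPT = %u, seen through an 8-bit compare = %u\n",
           (unsigned)CPMSPT, (unsigned)limit);
    return 0;
}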
b> a:pip random5.txt=random4.txt
...
Perhaps there's another test that can be suggested to expose more improvement?

Cheers, Phillip
Phillip Stevens wrote: Perhaps there's another test that can be suggested to expose more improvement?
Douglas Miller wrote:
I suspect you'd see the greatest improvement wherever there is minimal overhead for I/Os. That would probably be CP/M 3 PIP (or any multi-sector I/O on CP/M 3 with a BIOS that leverages that). In that case, the I/Os go directly from the device to/from PIP memory (no buffering or deblocking causing extra mem-mem copying). While it's pretty obvious that the BDOS division will be faster, realizing a noticeable difference in "real life" will depend on a lot of things.
As another example, I just checked in the same code changes on a Z180 with the same PPIDE and SSD, and there the 1MB disk to disk copy takes 15 seconds before and after the change.
15 seconds for a megabyte PIP is really fast for CP/M 2.2. Have you tried CP/M 3 PIP?
With my hardware (22MHz Z80, 8-bit CF interface, CF disk, and 256 sectors/track), a CP/M 3 PIP of a megabyte file is 14 seconds but CP/M 2.2 PIP is 37 seconds.
I think the PPIDE is pretty good at getting "performance" numbers, especially driving an SSD. The Z180 is 36MHz with 1 memory and 2 I/O wait states, so that's not particularly special. The BIOS is simple too, with COMMON memory disk I/O routines. No banking or otherwise to get in the way. So that helps too.

I haven't built a CP/M 3 version yet, I'm afraid. I haven't found an application that requires CP/M 3, so there's little incentive. Still looking for a round 'tuit.
As another example, I just checked in the same code changes on a Z180 with the same PPIDE and SSD, and there the 1MB disk to disk copy takes 15 seconds before and after the change. I can't hand time a difference. There the CPU is less of a bottleneck, so the divide doesn't have as much impact.
The RC2014 Z80 doesn't have DMA, but using an unrolled LDI (rather than LDIR) saves 532 cycles per CP/M sector and shaves another 2 seconds off the time, now at 65 seconds for the same 1MB copy on a full drive. Since one has about 4x the CPU clock speed of the other, the 4x performance difference is only natural.
It is the PPIDE interface causing the delay. Twiddling the RD and WR lines and management of two byte transfers is complex and slow (relatively). Just using the DMAC for the CP/M <-> host copy.
> Of course, the cost of unrolling the LDIR is that you lose another 256+ bytes of TPA.
It is not fully unrolled, just 32 LDI instructions.
And it fitted in slack space. So no loss really. Luckily. :-)
P.
Unless I am missing something here, the 512 byte copy on a CF adapter takes 18000 cycles without DMA (that's just the drive set up and block transfer with LDI in and out). You do 2048 of these for the transfer, plus some metadata; let's say 10% as a reasonable transfer count cost. That is 40 million cycles, or a bit over 5 seconds for the Z80 RC2014 board with LDI. You are taking 65 seconds, so the raw disk transfer is only about 5% of where your time is actually going.
/* declarations inferred from the assembly listing below (BUFFER_SIZE = 0x1000) */
#define BUFFER_SIZE 4096
static char buffer[BUFFER_SIZE];
FILE *In, *Out;        /* opened elsewhere */
size_t br, bw;

for (;;)
{
    br = fread(buffer, sizeof(char), BUFFER_SIZE, In);
    if (br == 0) break; // eof or error
    bw = fwrite(buffer, sizeof(char), BUFFER_SIZE, Out);
    if (bw != br) break; // error or disk full
}
667 00AB l_main_00114:
668 00AB 21 00 00 ld hl,_buffer
669 00AE E5 push hl
670 00AF 21 01 00 ld hl,0x0001
671 00B2 E5 push hl
672 00B3 21 00 10 ld hl,0x1000
673 00B6 E5 push hl
674 00B7 2A 00 10 ld hl, (_In)
675 00BA E5 push hl
676 00BB CD 00 00 call _fread
677 00BE F1 pop af
678 00BF F1 pop af
679 00C0 F1 pop af
680 00C1 F1 pop af
681 00C2 7C ld a,h
682 00C3 4D ld c,l
683 00C4 47 ld b,a
684 00C5 B5 or a, l
685 00C6 28 1E jr Z,l_main_00107
686 00C8 C5 push bc
687 00C9 21 00 00 ld hl,_buffer
688 00CC E5 push hl
689 00CD 21 01 00 ld hl,0x0001
690 00D0 E5 push hl
691 00D1 21 00 10 ld hl,0x1000
692 00D4 E5 push hl
693 00D5 2A 02 10 ld hl, (_Out)
694 00D8 E5 push hl
695 00D9 CD 00 00 call _fwrite
696 00DC F1 pop af
697 00DD F1 pop af
698 00DE F1 pop af
699 00DF F1 pop af
700 00E0 C1 pop bc
701 00E1 AF xor a,a
702 00E2 ED 42 sbc hl,bc
703 00E4 28 C5 jr Z,l_main_00114
704 00E6 l_main_00107:
1309 F133 ide_rdblk2:
1310 F133 16 48 ld d,__IO_IDE_DATA|__IO_IDE_RD_LINE ; 7
1311 F135 ED 51 out (c),d ; 12 and assert read pin
1312 F137 0E 20 ld c,__IO_PIO_IDE_LSB ; 7 drive lower lines with lsb
1313 F139 ED A2 ini ; 16 read the lower byte (HL++)
1314 F13B 0C inc c ; 4 drive upper lines with msb
1315 F13C ED A2 ini ; 16 read the upper byte (HL++)
1316 F13E 0C inc c ; 4 drive control port
1317 F13F 16 08 ld d,__IO_IDE_DATA ; 7
1318 F141 ED 51 out (c),d ; 12 deassert read pin
1319 F143 10 EE djnz ide_rdblk2 ; 13 keep iterative count in b
1383 F180 ide_wrblk2:
1384 F180 16 28 ld d,__IO_IDE_DATA|__IO_IDE_WR_LINE ; 7
1385 F182 ED 51 out (c),d ; 12 and assert write pin
1386 F184 0E 20 ld c,__IO_PIO_IDE_LSB ; 7 drive lower lines with lsb
1387 F186 ED A3 outi ; 16 write the lower byte (HL++)
1388 F188 0C inc c ; 4 drive upper lines with msb
1389 F189 ED A3 outi ; 16 write the upper byte (HL++)
1390 F18B 0C inc c ; 4 drive control port
1391 F18C 16 08 ld d,__IO_IDE_DATA ; 7
1392 F18E ED 51 out (c),d ; 12 deassert write pin
1393 F190 10 EE djnz ide_wrblk2 ; 13 keep iterative count in b
The RC2014 Z80 doesn't have DMA, but using an unrolled LDI (rather than LDIR) saves 532 cycles per CP/M sector and shaves another 2 seconds off the time, now at 65 seconds for the same 1MB copy on a full drive.
SPT Empty Full - using PIP
32 46 sec 69 sec
256 42 sec 65 sec
In other words you are watching the wrong ball - there is a very large per block constant (for a given CPU speed) in your experiment, which is telling you that most of the overhead is somewhere else.
SPT Empty Full - using C program z88dk classic stdio
256 53 sec 80 sec
1. I agree - if you are using CP/M 2.2 then your raw disk performance doesn't matter at modern speeds (it was a problem with 1980s hard disks, hence CP/M 3 fixing it) and you are correct that PPIDE is about as fast as CF, because the bottleneck simply isn't the device.

2. You need to fix the big overhead in the core code.

All this looks like what I've seen on other devices. Bitbang SD cards feel fine with CP/M despite being at best a 20K/second raw transfer rate.
I thought C arguments on the stack were in reverse order to those in the example. Is that standard C order?
Mark
...
And, I still can't find the issue in the BDOS core code. It is not BIOS, and it is not sector deblocking. And, there's nothing else left. That will have to be a problem for another rainy afternoon.

Phillip
What is RomWBW? =Steve.
RomWBW provides a complete software system for a wide variety of hobbyist Z80/Z180 CPU-based systems produced by these developer communities:
General features include:
Mark T wrote: I was aware that C didn’t specify the order of evaluation, but I always thought that the location on the stack was consistent.
I guess I was just lucky when I used variable argument passing similar to printf, probably only worked with the compiler I used at the time. Or maybe the calling convention was declared, it was a long time ago.
Usually, each compiler has its own calling convention. With the compiler being constant, the calling convention should be also. Some might have variable rules for number of arguments vs. register/stack location. But I think 8080/Z80 compilers always used the stack, albeit possibly in a different order.
The equipment is one standard 7.3MHz RC2014 Plus with 64KB RAM and one of Spencer's new IDE Modules (it makes no difference whether it is Spencer's or Ed's Module, but Spencer just launched the new IDE Module available on Tindie).

To get to the bottom of this, I've done a test using a C program and the z88dk IDE drivers. It takes 21.5 seconds to copy a 1048576 Byte file (throughput is double that, as the same disk is being read and written).
SPT Empty - using C program z88dk with ChaN FATFS onto FAT32
256 21 sec
The raw BIOS time is 98 cycles per 2 bytes, or 25088 cycles per 512 Byte sector. Written another way, it can do a raw 1MB read or write in 6.9689 seconds, into the system hstbuf. So, the 21.5 seconds for the 1MB copy consists of 14 seconds for the raw data transfer and 7 seconds of housekeeping. Not too bad.
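Just to make that arithmetic reproducible, a small C check (it assumes the standard 7.3728MHz RC2014 clock, which the "7.3MHz" above presumably refers to):

#include <stdio.h>

int main(void) {
    const double clock_hz      = 7372800.0;   /* standard RC2014 clock, assumed */
    const double cycles_per_2b = 98.0;        /* measured BIOS inner loop, from above */
    const double bytes         = 1048576.0;   /* 1MB test file */

    double cycles = bytes / 2.0 * cycles_per_2b;               /* ~51.4 million cycles */
    printf("raw 1MB transfer ~= %.4f s\n", cycles / clock_hz); /* ~6.97 s */
    return 0;
}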
But, how do we look from inside CP/M?
I've tested the four situations: SPT of 32, SPT of 256, an empty 8MB drive, and a full 8MB drive. The results look like this, using PIP to do the file copy.
SPT Empty Full - using PIP to standard BDOS
32 46 sec 69 sec
256 42 sec 65 sec
In other words you are watching the wrong ball - there is a very large per block constant (for a given CPU speed) in your experiment which is telling you that most of the overhead is somewhere else.

In another post I cast derogatory remarks about PIP, and said that I was sure that it would be soaking up the 21 seconds that are getting lost here. So I wrote a simple C program, using the z88dk CP/M stdio functions (from the classic library), to do a file copy.
The result was not pretty. My CP/M C program is much worse than PIP at doing file copies.
Next time, I'll keep quiet. PIP is very good.
SPT Empty Full - using C program z88dk classic stdio
256 53 sec 80 sec
So where did that 20 to 30 seconds of "fat" get added onto the CP/M file access? Was it coming from the deblocking algorithm in my BIOS?
Well no. The unrolled LDI 32 version takes about 4.8 seconds for the transfer, and the LDIR version takes about 6 seconds for the transfer. That leaves a remaining 14 seconds getting lost inside the BDOS and PIP code, somewhere.
EDIT. I've written a simple assembly program CP.COM that replicates the PIP fast copy algorithm. It is a little faster than PIP (about 2 to 5%), and it shows that PIP is very efficient.
This demonstrates that the overhead comes from within the BDOS.
SPT Empty - using assembly calls to standard BDOS
256 39 sec
And, I still can't find the issue in the BDOS core code. It is not BIOS, and it is not sector deblocking. And, there's nothing else left. That will have to be a problem for another rainy afternoon.
Maybe the code generated from C could be modified. There would be no need to pop the arguments for the first call and then push them again for the second.
SPT Empty - using PIP to NZ-COM BDOS
256 47 sec
On Friday, April 24, 2020 at 7:46:18 AM UTC-7, Jim McGinnis wrote: I am sure there are debates to be made for 512 vs 1024 on an 8MB drive. I have reached a practical limit that will require working around the DIR limit. Which can be done.

Actually, it is not much of a debate. It absolutely should have been 1024 originally. The choice of 512 goes back over a decade to when RomWBW came to life, and was used simply because it was compatible with the other work being done at the time.

I am very open to thoughts from this group. Do people think I should just make the change and warn everyone?

-Wayne
1024 directory entries makes clear sense for most people. I have avoided problems by defining more disks (most of my ROMWBW systems have 16 drives configured) so I can separate projects onto different disks and no one drive is overloaded.

However, if the default number of directory entries is changed, can I suggest that a ROMWBW configuration tag be created that allows the existing DPB layout to be used. There are a lot of systems out there using the current setup which will be locked out of newer ROMWBW revisions and features until they find a way to reload their drives in the new format.
1) The single OS prebuilt images are missing the prepended "hdnew_prefix.bin" and will not work correctly. The "combo" prebuilt image does have the prepended bin file and works flawlessly to boot each slice.
echo.
echo Building New Hard Disk Images...
echo.
call BuildDisk.cmd cpm22 wbw_hdnew ..\cpm22\cpm_wbw.sys
call BuildDisk.cmd zsdos wbw_hdnew ..\zsdos\zsys_wbw.sys
call BuildDisk.cmd nzcom wbw_hdnew ..\zsdos\zsys_wbw.sys
call BuildDisk.cmd cpm3 wbw_hdnew ..\cpm3\cpmldr.sys
call BuildDisk.cmd zpm3 wbw_hdnew ..\cpm3\cpmldr.sys
call BuildDisk.cmd ws4 wbw_hdnew
if exist ..\BPBIOS\bpbio-ww.rel call BuildDisk.cmd bp wbw_hdnew
copy hdnew_prefix.bin ..\..\Binary\
echo.
echo Building Combo Disk (new format) Image...
copy /b hdnew_prefix.bin + ..\..\Binary\hdnew_cpm22.img + ..\..\Binary\hdnew_zsdos.img + ..\..\Binary\hdnew_nzcom.img + ..\..\Binary\hdnew_cpm3.img + ..\..\Binary\hdnew_zpm3.img + ..\..\Binary\hdnew_ws4.img ..\..\Binary\hdnew_combo.img
echo.
echo Building Jim Disk (new format) cpm3 Image...
copy /b hdnew_prefix.bin + ..\..\Binary\hdnew_cpm3.img ..\..\Binary\hdnew_jim.img
echo.
echo Building final (1024 DIR) cpm22 Image...
copy /b hdnew_prefix.bin + ..\..\Binary\hdnew_cpm22.img ..\..\Binary\hdnew_cpm22_1024.img
echo Building final (1024 DIR) cpm3 Image...
copy /b hdnew_prefix.bin + ..\..\Binary\hdnew_cpm3.img ..\..\Binary\hdnew_cpm3_1024.img
echo Building final (1024 DIR) zsdos Image...
copy /b hdnew_prefix.bin + ..\..\Binary\hdnew_zsdos.img ..\..\Binary\hdnew_zsdos_1024.img
echo Building final (1024 DIR) nzcom Image...
copy /b hdnew_prefix.bin + ..\..\Binary\hdnew_nzcom.img ..\..\Binary\hdnew_nzcom_1024.img
echo Building final (1024 DIR) zpm3 Image...
copy /b hdnew_prefix.bin + ..\..\Binary\hdnew_zpm3.img ..\..\Binary\hdnew_zpm3_1024.img