sigrok-mfm mfm decoder


Rasz

Oct 21, 2025, 7:19:38 AM
to MFM Discuss
Hello everyone, I'm Rasz_pl

First TLDR version:
https://github.com/raszpl/sigrok-mfm is a working MFM decoding plugin for the open-source Sigrok/PulseView/DSView logic analyzer software. It lets one visualize the data and helps with recovery/analysis. You don't need a logic analyzer to use it: the attached file lets one convert a single track from any transition (.tr) file into a .vcd for loading into PulseView.

Long version:
I have always been interested in magnetic storage. Some time ago I stumbled on majenkotech streaming his attempts at building an MFM emulator on Twitch. VODs:
https://www.youtube.com/watch?v=LrI0blg4r7I
https://www.youtube.com/watch?v=RbrRLh-PsS8
https://www.youtube.com/watch?v=rvOJJcxFEO4
https://www.youtube.com/watch?v=U6rS2Gz0gZA
https://www.youtube.com/watch?v=or5SL96_X1A
https://www.youtube.com/watch?v=11bu10Ehw30
https://www.youtube.com/watch?v=P_xAgsFjWCo
https://www.youtube.com/watch?v=_Gxs4f0QJdI

He briefly looked at David C. Wiens' 2017 PulseView MFM decoder https://www.sardis-technologies.com/ufdr/pulseview.htm but couldn't get it running. I managed to google a fix, and this is how everything started.


I started playing with https://bitsavers.org/projects/hd_samples/ . I whipped up a quick Python .tr converter and I'm successfully decoding MFM dumps:
test.py "\hd_samples\ST-278R\MFM_1-1_interleave_17sect\ST21M_MFM_615-6.tr" -t6 -d track0
This dumps track 6 into the track0.vcd file. VCD was the easiest format I could find; the converter just adds up the pulse intervals from the transition file. Next I run that file through sigrok-cli:
sigrok-cli.exe -D -i test\track0.vcd -P mfm:header_crc_bits=32:header_crc_poly=0x41044185:header_crc_init=0:data_crc_poly=0x41044185:data_crc_init=0:report=DAM:report_qty=18 -A mfm=fields:reports
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=0, len=128
mfm-1: CRC OK 99B7F53E
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK C8F97415
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=1, len=128
mfm-1: CRC OK C8B30F3A
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK D3349D6B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=2, len=128
mfm-1: CRC OK 3BBE0136
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=3, len=128
mfm-1: CRC OK 6ABAFB32
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=4, len=128
mfm-1: CRC OK 9CA05CAB
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=5, len=128
mfm-1: CRC OK CDA4A6AF
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=6, len=128
mfm-1: CRC OK 3EA9A8A3
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=7, len=128
mfm-1: CRC OK 6FAD52A7
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=8, len=128
mfm-1: CRC OK 9398A614
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=9, len=128
mfm-1: CRC OK C29C5C10
mfm-1: Sync pattern 14 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=10, len=128
mfm-1: CRC OK 3191521C
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=11, len=128
mfm-1: CRC OK 6095A818
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=12, len=128
mfm-1: CRC OK 968F0F81
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=13, len=128
mfm-1: CRC OK C78BF585
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=14, len=128
mfm-1: CRC OK 3486FB89
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=15, len=128
mfm-1: CRC OK 6582018D
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=16, len=128
mfm-1: CRC OK 8DE9536A
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=254, len=128
mfm-1: CRC OK FBEABA85
mfm-1: Sync pattern 14 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK 28E9D789
mfm-1: Summary: IAM=0, IDAM=18, DAM=18, DDAM=0, CRC_OK=36, CRC_err=0, EiPW=0, CkEr=0, OoTI=10/79324
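
For anyone who wants to check a run like the one above without reading every line, the annotations are easy to tally. A minimal Python sketch, assuming the output keeps exactly the wording shown above; `summarize` is a made-up helper name, not part of sigrok-mfm:

```python
import re

# Matches lines such as "mfm-1: ID Record: cyl=0, sid=0, sec=16, len=128"
ID_RE = re.compile(r"ID Record: cyl=(\d+), sid=(\d+), sec=(\d+), len=(\d+)")


def summarize(lines):
    """Return ([(cyl, sid, sec, len), ...], number of 'CRC OK' lines)."""
    ids = []
    crc_ok = 0
    for line in lines:
        m = ID_RE.search(line)
        if m:
            ids.append(tuple(int(g) for g in m.groups()))
        elif "CRC OK" in line:
            crc_ok += 1
    return ids, crc_ok


# A few lines copied from the run above.
sample = """\
mfm-1: ID Record: cyl=0, sid=0, sec=16, len=128
mfm-1: CRC OK 8DE9536A
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: ID Record: cyl=0, sid=0, sec=254, len=128
mfm-1: CRC OK FBEABA85
""".splitlines()

ids, crc_ok = summarize(sample)
print("sector numbers:", [sec for _, _, sec, _ in ids], "CRC OK lines:", crc_ok)
```

Piping the real sigrok-cli output into something like this makes the odd sec=254 entry stand out immediately.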

Now the inevitable questions:
- why is the Seagate ST11/21M controller claiming sector length 128 in the sector header while recording 512 bytes?
- why is the Seagate ST11/21M controller recording 18 sectors per track? :o
- why is the 18th sector so weird? :) It's zoomed in on pulseview_idrecord.png. On every track it has the same sector=254 field, and looking at the flux transitions you can tell its sync pattern starts with no gap (or rather after a 1-byte gap2), with no room for rewriting, almost like it was meant to be read-only.
- what's on the first tracks 0-5? They all look like RLL, not that I would know what RLL looks like :) but they look exactly like what I get from
test.py "D:\_code\disk mfm\MfmDecoder\hd_samples\ST-278R\RLL_1-1_interleave_26sect\ST21R_RLL_615-6.tr" -t6 -d track0
Funnily enough, tracks 0-5 of ST21R_RLL_615-6.tr are almost completely empty, filled with 400ns pulses. What's going on, are those magical service-area reserved tracks? Why would the ST21M store RLL-encoded data on those even when explicitly set to MFM mode?
Attaching my super janky test.py converter in case someone wants to play with sigrok using transition files as the source. Pro tip: once you load one .vcd file into PulseView and set up the MFM decoder, you don't have to repeat the process. You can keep re-running test.py with a different -t parameter but the same -d track0 and then click Reload in PulseView.
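
The conversion itself is simple in principle. A rough sketch of the idea in Python (this is not the attached test.py; it assumes the transition intervals have already been read out of the .tr file as a list of tick counts, and the pulse width and signal name are arbitrary choices for illustration):

```python
def transitions_to_vcd(deltas, tick_ns=5, pulse_ticks=1):
    """Render flux transition intervals as a one-signal VCD document.

    deltas      -- intervals between successive transitions, in sample ticks
                   (each must be larger than pulse_ticks)
    tick_ns     -- tick length; 5 ns corresponds to a 200 MHz count rate
    pulse_ticks -- width of the pulse drawn at each transition (assumption)
    """
    out = [
        "$timescale 1 ns $end",
        "$scope module track $end",
        "$var wire 1 ! rd_data $end",   # '!' is the VCD identifier code
        "$upscope $end",
        "$enddefinitions $end",
        "#0",
        "0!",
    ]
    t = 0
    for d in deltas:
        t += d                                   # accumulate absolute time
        out.append(f"#{t * tick_ns}")
        out.append("1!")                         # pulse starts at the transition
        out.append(f"#{(t + pulse_ticks) * tick_ns}")
        out.append("0!")                         # and ends one tick later
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Two transitions, 200 ns and 300 ns apart at 5 ns per tick.
    print(transitions_to_vcd([40, 60]), end="")
```

The timescale is kept at 1 ns and the tick counts are multiplied out, since VCD timescales are restricted to 1, 10 or 100 of a unit.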

Too bad the .tr files I have looked at so far are all freshly formatted, so there is no data showing what the flux transitions look like after a few rewrites.

ps: I also managed to compile mfm_util for windows using MSYS and hacking away incompatible bits
- commented out pread/pwrite so most functionality apart from analyze is broken
- used __builtin_ffsll in crc_ecc.c:148
 span = fls(syndrome) - __builtin_ffsll(syndrome) + 1;
- added O_BINARY flag in emu_tran_file.c:752
 fd = open(fn, O_RDONLY| O_BINARY);
- removed -lrt flag in makefile
With those changes:
D:\_code\disk mfm\mfm-master\mfm>mfm_util.exe -a -t "D:\_code\disk mfm\MfmDecoder\hd_samples\ST-278R\MFM_1-1_interleave_17sect\ST21M_MFM_615-6.tr"
Original decode arguments: --heads 6 --cylinders 615 --sector_length 512 --retries 50,4 --drive 1
Found matching format Seagate_ST11MB: good count difference 0
Number of heads 6 number of sectors 17 first sector 0
Interleave (not checked): 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Command line to read disk:
--format Seagate_ST11MB --sectors 17,0 --heads 6 --cylinders 615 --header_crc 0x0,0x41044185,32,6 --data_crc  0x0,0x41044185,32,6 --sector_length 512 --retries 50,4 --drive 1

I needed that to learn the CRC polynomial of the Seagate ST21M controller; I didn't know the ST11M would use the same one.
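
The mfm_util CRC arguments map fairly directly onto the decoder options used earlier in this post. A small sketch of that translation, assuming the four comma-separated fields are initial value, polynomial, bit length and ECC span (my reading of the output above, not checked against the mfm_util source); `sigrok_crc_options` is a made-up helper, and writing the init value in hex is also an assumption:

```python
def sigrok_crc_options(header_crc, data_crc):
    """Build a sigrok-cli -P option string from mfm_util style CRC arguments.

    Each argument looks like "0x0,0x41044185,32,6"; the field order
    (init, poly, bits, ECC span) is an assumption.
    """
    parts = []
    for name, arg in (("header", header_crc), ("data", data_crc)):
        init, poly, bits, _ecc_span = (int(x, 0) for x in arg.split(","))
        parts += [
            f"{name}_crc_bits={bits}",
            f"{name}_crc_poly={poly:#x}",
            f"{name}_crc_init={init:#x}",
        ]
    return "mfm:" + ":".join(parts)


if __name__ == "__main__":
    print(sigrok_crc_options("0x0,0x41044185,32,6", "0x0,0x41044185,32,6"))
```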
test.py

David Gesswein

Oct 21, 2025, 9:25:13 AM
to mfm-d...@googlegroups.com
On Tue, Oct 21, 2025 at 03:21:14AM -0700, Rasz wrote:
> Hello everyone Im Rasz_pl
>
Hi

> My ultimate goal is adding RLL support.
>
There is someone working on adding RLL support to my code. They are getting
pretty close to releasing the code.
https://github.com/dgesswein/mfm/issues/8

> Now the inevitable questions:
> - why is Seagate ST11/21M controller claiming sector length 128 in sector
> header while recording 512 bytes?

The various controller manufacturers mostly used their own unique sector header
format. Looks like I have 93 supported formats currently. Some just differ in
sector size or CRC polynomial; others have more significant differences.

Western Digital uses the 2 bit field to mark sector size. Seagate doesn't seem to.

You are going to have to decide what you want to do about all the different formats
for your code.

> - why is Seagate ST11/21M controller recording 18 sectors per track? :o

Good question. Many controllers have no documentation on the track format. I haven't
seen documentation on Seagate controllers. I was assuming it was for having a spare sector
for error handling.

> - why is 18th sector so weird? :) Its zoomed in on pulseview_idrecord.png
> <https://raw.githubusercontent.com/raszpl/sigrok-mfm/main/doc/pulseview_idrecord.png>
> On every track it has same sector=254 field and looking at flux transitions
> you can tell sync pattern starts with no gaps (or rather after 1 byte gap2)
> with no room for rewriting almost like it was meant to be read only.
>
I have an example of a similar format where on a few tracks the 254 sector is not the last.
Looks like it reformats the track and puts the 254 sector over the bad spot.

> - whats on the first tracks 0-5? They all look like RLL, not that I would
> know how RLL looks like
>
Some controllers take some portion of the start of the disk for storing information such
as the disk geometry. Likely it's that, and since it's an RLL controller they recorded it in
RLL format.

Rasz

Oct 22, 2025, 11:31:46 PM
to mfm-d...@googlegroups.com
On Tue, Oct 21, 2025 at 3:25 PM David Gesswein <d...@pdp8online.com> wrote:
> There is someone working on adding RLL support to my code. They are getting
> pretty close to releasing the code.
> https://github.com/dgesswein/mfm/issues/8

Yes, I saw that, but no code has been posted yet; I'll try to code something myself too.

> The various controller manufacturers mostly used their own unique sector header
> format. Looks like I have 93 supported formats currently. Some just differ in
> sector size or CRC polynomial; others have more significant differences.

I was hoping they more or less stuck to same format for headers :/ oh well

> You are going to have to decide what you want to do about all the different formats
> for your code.

Supporting different kinds and sizes of polynomials was easy enough to
add. Whole new formats might not be worth the hassle; I would have to
somehow make the state machine
https://github.com/raszpl/sigrok-mfm/blob/6e26f6669a07d8a5c9dfb72d58a4de7d52b172a8/mfm/pd.py#L1106
dynamic/reconfigurable on the fly. I haven't looked at how your code
handles this yet.

> Good question. Many controllers have no documentation on the track format. I haven't
> seen documentation on Seagate controllers. I was assuming it was for having a spare sector
> for error handling.

> I have an example of a similar format where on a few tracks the 254 sector is not the last.
> Looks like it reformats the track and puts the 254 sector over the bad spot.

Looking at a few other MFM disk track dumps, it seems like there was
always enough space for 18 sectors (assuming ~922us each with all the gaps),
and some vendors even left a sector-sized free hole instead of spacing
out their 17 sectors. What a huge waste, 2.5MB just left there unused
on 42MB models. Looking at the PC BIOS tables
https://aeb.win.tue.nl/linux/hdtypes/hdtypes-3.html 18 sectors was
never a thing on that platform.
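
Back-of-the-envelope check of those numbers, as a small Python sketch. The 3600 RPM spindle speed is an assumption (typical for drives of this class); the ~922us per sector and the 42MB / 17 sector figures are the ones quoted above:

```python
RPM = 3600            # assumption: typical spindle speed for these drives
SECTOR_US = 922       # per-sector time including gaps, from the estimate above
CAPACITY_MB = 42      # drive capacity when formatted with 17 sectors per track
SECTORS_USED = 17

rev_us = 60_000_000 / RPM                  # one revolution in microseconds
sectors_fit = int(rev_us // SECTOR_US)     # whole sectors that fit per track
extra_mb = CAPACITY_MB / SECTORS_USED      # capacity of one more sector per track

print(f"revolution: {rev_us:.0f} us, sectors that fit: {sectors_fit}")
print(f"one extra sector per track is worth about {extra_mb:.2f} MB")
```

18 x 922us comes to about 16.6 ms, just under the 16.67 ms revolution, and one extra sector on every track works out to roughly 2.5MB.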

> Some controllers take some portion of the start of the disk for storing information such
> as the disk geometry. Likely it's that, and since it's an RLL controller they recorded it in
> RLL format.

6 tracks seems like a huge waste again. It's wasted potential wherever
one looks. Not to mention the lack of zoned recording all the way to the early
nineties, even after switching to IDE :(

Do you happen to have any ESDI captures? There are tons of IBM PS/2s
with unobtainable, dying ESDI drives and stupid MCA precluding easy
alternative storage options. Good opportunity for emulation. ESDI,
while faster at up to 20Mbit (no idea what speed the PS/2s are using),
provides a reference clock, so it should be much easier to emulate.

The main use case for the sigrok-mfm PulseView decoder is to help developers
or to assist in advanced data recovery. Personally I find staring at a graphical
representation of something easier than just long strings of hex :)

--
Who logs in to gdm? Not I, said the duck.

Al Kossow

Oct 22, 2025, 11:42:32 PM
to mfm-d...@googlegroups.com
On 10/22/25 8:33 PM, Rasz wrote:

> Do you happen to have any ESDI captures?

The data separator is on the drives themselves.
AFAIK the on-disk format was always RLL.

BBMDonut

Oct 23, 2025, 1:35:22 AM
to MFM Discuss
On Wednesday, October 22, 2025 at 10:31:46 PM UTC-5 citi...@gmail.com wrote:
> On Tue, Oct 21, 2025 at 3:25 PM David Gesswein <d...@pdp8online.com> wrote:
>> There is someone working on adding RLL support to my code. They are getting
>> pretty close to releasing the code.
>> https://github.com/dgesswein/mfm/issues/8
>
> Yes, I saw that, but no code has been posted yet; I'll try to code something myself too.


Hi there! My apologies for the lack of a push, as life has been a tad chaotic as of late. I'm
going through the upgrades and doing some basic sanity checks/cleanup before I push
and submit the PR. Mr. Gesswein and I have been refining some of the changes, so I'm also
working to get his latest updates integrated into my work as well. Once things have settled,
not only will the initial RLL support be in place (currently working for WD1004-27X/
WD1006V-SR2/basically anything compatible with the WD5011 controller IC from what I gather),
but the separate modifications for utilizing external stepper controllers have been pulled in as
well. As of now, I have only tested the 4-wire Sparkfun controller, along with the ST-238R head
motor (as that is the combination I have been working to use for recovery).

Using this updated roll-up build, I have managed to dump the data from Usagi Electric's EDS
PC drive (the ST-238R mentioned above). I believe Mr. Gesswein has been able to use the 
stepper update to pull from MFM drives, however I'm not 100% certain what combinations of
other tests he's tried as of yet. I know that we did identify some issues with the pinout
differences between revisions of the MFM emulator and hooking up the Sparkfun PCB, so
that will be documented as well (on the bright side, the new pinout should be universal across
emulator revisions).

I should have more time to work on it over the next few nights with any luck, but regardless, I'll
go ahead and push the WIP to my GitHub fork by the weekend just to be safe (with the standard
"there be dragons here" warning for anyone wanting to experiment before the official release). 
From that point, the other two major efforts that would need to be worked out are adding support
to the emulation engine itself for RLL formats, along with adding new controller types. One of the
annoying issues - aside from the difference in bit timings - is the fact that different vendors (and
possibly different controller ICs from the same vendor) can all encode RLL with slight differences
(whereas MFM's flux-to-binary translation was pretty standard across the board). It's just a few
tweaks to some lookup tables, but still annoying nonetheless...
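
To illustrate the lookup-table point, here is a minimal Python sketch of table-driven RLL(2,7) translation. The table is the commonly published IBM-style (2,7) mapping and is an assumption here, not necessarily what any given controller uses; supporting another vendor would mean swapping the dictionary. The function names are made up for this example.

```python
# Commonly published IBM-style RLL(2,7) table: data bits -> channel bits.
# Assumption for illustration; vendor tables can differ.
RLL27 = {
    "11": "1000",
    "10": "0100",
    "000": "000100",
    "010": "100100",
    "011": "001000",
    "0011": "00001000",
    "0010": "00100100",
}
RLL27_INV = {code: data for data, code in RLL27.items()}


def _translate(bits, table):
    """Greedy prefix match; works because both key sets are prefix-free."""
    out, word = [], ""
    for b in bits:
        word += b
        if word in table:
            out.append(table[word])
            word = ""
    if word:
        raise ValueError(f"leftover bits do not form a table entry: {word!r}")
    return "".join(out)


def rll_encode(data_bits):
    return _translate(data_bits, RLL27)


def rll_decode(channel_bits):
    return _translate(channel_bits, RLL27_INV)


if __name__ == "__main__":
    channel = rll_encode("11111110")          # data byte 0xFE
    print(channel, rll_decode(channel))
```

Every pair of adjacent 1s in the channel stream ends up separated by 2 to 7 zeros, which is where the (2,7) name comes from.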

David Gesswein

Oct 23, 2025, 9:33:31 AM
to mfm-d...@googlegroups.com
On Thu, Oct 23, 2025 at 05:33:08AM +0200, Rasz wrote:
> I havent looked at your code handling this yet.
>
The decoding code is just brute force: each format has a few lines of code to
pull out the correct information, and wildly different formats get separate routines
for finding the sector start etc.
At the start I didn't know enough about how they varied to do something cleaner, and
once I did, I didn't have the energy to change it.

Ext2emu, which does the reverse conversion, was written later, so it does use a data
table to define the transformation. Not all formats are in it.

> Do you happen to have any ESDI captures? There are tons of IBM PS/2s
> with unobtainable, dying ESDI drives and stupid MCA precluding easy
> alternative storage options. Good opportunity for emulation. ESDI,
> while faster at up to 20Mbit (no idea what speed the PS/2s are using),
> provides a reference clock, so it should be much easier to emulate.
>
I don't. I do know someone working on an ESDI emulator who has gotten it far enough to
capture some data. I'll let him know this thread exists.

Al Kossow

Oct 23, 2025, 9:40:45 AM
to mfm-d...@googlegroups.com
On 10/23/25 6:33 AM, David Gesswein wrote:

> I don't. I do know someone working on an ESDI emulator who has gotten it far enough to
> capture some data. I'll let him know this thread exists.
>


"cw" on the cctalk discord mostly working as of Aug.
"ESDI emulator update: writing apparently works, everything is still in ddr memory. I think I still have writing issues though, so I might
need to find another testing program"

Reece Pollack

Oct 23, 2025, 11:53:59 AM
to MFM Discuss
Hi folks! I'm the "someone" David mentions above. 

As someone else in this thread mentioned, with ESDI the data separator is on the drive, not the controller. The formatting of data on the media itself is left entirely to the data encoder/separator on the drive; the controller sends and receives only the data bit stream synchronized with a separate clock. Thus there is no way to capture the raw flux transitions as you can with ST-506 (MFM) drives, so no way to see whether the media is encoded RLL-2/7 or something else entirely.

This makes it both easier and harder to read a vintage ESDI drive. It's easier because there's no need to analyze the flux transitions to extract the clock and data; you get data and clock as differential signals. It's harder because you have to monitor the Index and Sector/AMD signals to identify where data starts and ends, and there is no indication of where the write splices are.
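
As a rough illustration of the "easier" half, recovering bits from a capture of the clock and data lines is just sampling one on the edges of the other. A minimal Python sketch, assuming both lines were captured as equal-length lists of 0/1 samples and that data is valid on the rising clock edge (the edge choice is an assumption, not taken from the ESDI spec); `sample_nrz` is a made-up name:

```python
def sample_nrz(clock, data):
    """Return the data line's value at every rising edge of the clock line."""
    bits = []
    prev = clock[0]
    for c, d in zip(clock[1:], data[1:]):
        if prev == 0 and c == 1:   # rising edge: latch the data line
            bits.append(d)
        prev = c
    return bits


if __name__ == "__main__":
    clock = [0, 1, 0, 1, 0, 1, 0, 1]
    data = [1, 1, 0, 0, 1, 1, 1, 1]
    print(sample_nrz(clock, data))
```

Finding where a sector starts and ends in that bit stream is the part that still needs the Index and Sector signals.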

Reece Pollack

Oct 23, 2025, 1:24:35 PM
to MFM Discuss
On Wednesday, October 22, 2025 at 11:31:46 PM UTC-4 citi...@gmail.com wrote:

> Do you happen to have any ESDI captures? There are tons of IBM PS/2s
> with unobtainable, dying ESDI drives and stupid MCA precluding easy
> alternative storage options. Good opportunity for emulation. ESDI,
> while faster at up to 20Mbit (no idea what speed the PS/2s are using),
> provides a reference clock, so it should be much easier to emulate.

As David mentioned, I'm developing an ESDI emulator so I have a bit of familiarity with the technology.

ESDI is specified to run up to 24 MHz. Since the data clock is provided by the drive, it must be used with a controller that will support whatever speed the drive provides.  10 MHz drives are most common, and the PS/2 model 80 I've captured data from has a 10 MHz drive in it.

I'm told there's a SCSI adapter for MCA bus. This may be an alternative for those who don't want to deal with ESDI. There is also a disk that goes in a PS/2 model 50, 55, or 70 that some people say is an ESDI drive. I've looked at this drive and it is a direct MCA bus connect rather than ESDI.

-Reece

Alison Telford

Oct 23, 2025, 6:32:13 PM
to mfm-d...@googlegroups.com
Hi all -- if it's useful, I had posted a snippet about using David's MFM emulator on PS/2 machines -- I did get it to work. There are also SCSI adapters available (these I find work well and are straightforward) and the really cool McIDE product from ZZXIO.

Here's the blurb for what it's worth: (does not address the ESDI issue)

 

Rasz

Oct 27, 2025, 7:27:49 AM
to mfm-d...@googlegroups.com
On Thu, Oct 23, 2025 at 7:35 AM BBMDonut <bbmd...@gmail.com> wrote:
> On Wednesday, October 22, 2025 at 10:31:46 PM UTC-5 citi...@gmail.com wrote:
> Yes I saw that, but no code posted yet, ill try to code something myself too.
>
>
> Hi there! My apologies for the lack of a push

Absolutely no need for any apologies.

> going through the upgrades and doing some basic sanity checks/cleanup before I push

I was hoping you were uploading your progress to your own GitHub; all
I needed was a beachhead, a place to start :) Instead I had to push that
rock uphill on my own :)

>One of the
> annoying issues - aside from the difference in bit timings - is the fact that different vendors (and
> possibly different controller ICs from the same vendor) can all encode RLL with slight differences
> (whereas MFM's flux-to-binary translation was pretty standard across the board). It's just a few
> tweaks to some lookup tables, but still annoying nonetheless...

From what I have gathered so far, it comes down to:
1. Different RLL tables.
WD uses its own; Seagate uses the same IBM one found in the SSI 32D535 ENDEC datasheet.
2. Different magic markers.
All I could find so far was a patent mentioning 0x8B
(Clipboard03.png), which was a fail, and another patent mentioning the
Adaptec AIC-270 ENDEC and 0x5EAx (Clipboard02.png), another fail.
I ended up brute forcing everything around the illegal 0b00100000001001 and
finally got somewhere, hopefully (rll_maybe.png attachment). This is
WD1003V-SR1_RLL_820-6.tr. I'm still not sure if it's right, as the header
looks too short: only 5 bytes of any data, 3 + CRC?
ps: I like how in both SR1_RLL_820-6.tr and WD1003V-SR1_RLL_820-6.tr
you can clearly see with the naked eye that the controllers were really bad at
write pre-compensation. The shortest 200ns transitions shrink to ~185ns
when not surrounded by other short gaps :)
rll_maybe.png
Clipboard03.png
Clipboard02.png
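
On the ~185ns observation: that much shift is still harmless to a decoder that rounds each interval to the nearest whole channel-bit cell. A tiny Python sketch of that rounding, assuming a channel clock of twice the data rate (10 MHz for 5 Mbit/s MFM, 15 MHz for 7.5 Mbit/s RLL); `cells` is a made-up helper and a real decoder would track the clock with a PLL rather than use fixed rounding:

```python
def cells(interval_ns, channel_rate_hz):
    """Round a transition interval to the nearest number of channel-bit cells."""
    return int(interval_ns * channel_rate_hz / 1e9 + 0.5)


if __name__ == "__main__":
    # MFM at 5 Mbit/s: 100 ns cells, legal intervals are 2, 3 and 4 cells.
    print([cells(ns, 10e6) for ns in (200, 300, 400)])
    # RLL(2,7) at 7.5 Mbit/s: ~66.7 ns cells, shortest legal interval is 3 cells.
    print(cells(200, 15e6), cells(185, 15e6))
```

A nominal 200ns RLL interval and the squeezed 185ns one both land on 3 cells, so the pulse is still classified correctly, just with less margin.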

Rasz

Oct 27, 2025, 7:40:46 AM
to mfm-d...@googlegroups.com
On Thu, Oct 23, 2025 at 7:24 PM Reece Pollack <reece....@gmail.com> wrote:
> ESDI is specified to run up to 24 MHz. Since the data clock is provided by the drive, it must be used with a controller that will support whatever speed the drive provides. 10 MHz drives are most common, and the PS/2 model 80 I've captured data from has a 10 MHz drive in it.

Could I bother you for a fragment of one of the dumps? One or two tracks.
I'm very interested in seeing the timings and what the data looks like
in flight. I might try adding it to the PulseView decoder just for fun.

Rasz

Oct 27, 2025, 7:42:59 AM
to mfm-d...@googlegroups.com
Yes, I'm interested in seeing the format of the NRZ data between drive and
controller.

BBMDonut

Oct 27, 2025, 9:26:23 AM
to MFM Discuss
On Monday, October 27, 2025 at 6:27:49 AM UTC-5 citi...@gmail.com wrote:
> On Thu, Oct 23, 2025 at 7:35 AM BBMDonut <bbmd...@gmail.com> wrote:
>> On Wednesday, October 22, 2025 at 10:31:46 PM UTC-5 citi...@gmail.com wrote:
>> Yes I saw that, but no code posted yet, ill try to code something myself too.
>>
>> Hi there! My apologies for the lack of a push
>
> Absolutely no need for any apologies.
>
>> going through the upgrades and doing some basic sanity checks/cleanup before I push
>
> I was hoping you were uploading your progress to your own GitHub; all
> I needed was a beachhead, a place to start :) Instead I had to push that
> rock uphill on my own :)

In all honesty, I'm horrible about remote pushes to my projects much of the time. I blame
having multiple local backups plus a tinge of perfectionism (I get weird about others
seeing the "in progress" ugliness that sometimes comes with prototyping/R&D). I'll get over
it eventually.

That being said, it's a tough time at the moment to get anything set up just yet. Some life 
things are going on outside of the project that have taken much of my free time as of late.
Hopefully things resolve within the next week or so, and I'll be back to getting this thing ready
to go.

I'll rip off the band-aid, however, and try to get the "ugly" version pushed to a branch tonight.
I didn't realize anyone would actually be looking at it at this point. On a side note, there is a
Python script that was added to the contrib/ folder from another developer that can be used
to do a "quick and dirty" parse on a raw transition dump for the WD RLL table at least - I hope
that it's able to help out in the meantime. Kudos to PHK for contributing that, as it helped
me to get over some of my final issues.



>> One of the
>> annoying issues - aside from the difference in bit timings - is the fact that different vendors (and
>> possibly different controller ICs from the same vendor) can all encode RLL with slight differences
>> (whereas MFM's flux-to-binary translation was pretty standard across the board). It's just a few
>> tweaks to some lookup tables, but still annoying nonetheless...
>
> From what I have gathered so far, it comes down to:
> 1. Different RLL tables.
> WD uses its own; Seagate uses the same IBM one found in the SSI 32D535 ENDEC datasheet.
> 2. Different magic markers.
> All I could find so far was a patent mentioning 0x8B
> (Clipboard03.png), which was a fail, and another patent mentioning the
> Adaptec AIC-270 ENDEC and 0x5EAx (Clipboard02.png), another fail.
> I ended up brute forcing everything around the illegal 0b00100000001001 and
> finally got somewhere, hopefully (rll_maybe.png attachment). This is
> WD1003V-SR1_RLL_820-6.tr. I'm still not sure if it's right, as the header
> looks too short: only 5 bytes of any data, 3 + CRC?


Looking at the screenshot of rll_maybe.png, I see a little bit of the issue. For some
reason, the parsing looks alright until it gets to the CRC section. The gaps between
the 05 and 08, and the 08 and A6 look a little off. When running the rest of the
header through the right CRC function, the checksum bytes should come out to
4F DE. Using the entire header byte list of "A1 FE 01 48 05 4F DE", the CRC check 
zeroes out as expected. Though something else doesn't seem to be sitting right
in the header, now that I'm looking at it. It's possible that the mapping is different, but
for the WD controller breakout I have, I'm seeing the "48" showing that the reading
would be from head 9, with a sector size of 1024 bytes... The rest would be from
cylinder 1, sector 5/6 (depending on zero-based or one-based, of course). I could have
a bad lookup table, which is possible - I'll double-check the datasheet to make sure
I'm reading it right.
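
The CRC check described above can be reproduced offline with a few lines of Python. This is a generic bitwise sketch (not code from either project), using the CRC-16/IBM-3740 parameters: polynomial 0x1021, initial value 0xFFFF, MSB first, no final XOR. `crc_msb_first` is a made-up name.

```python
def crc_msb_first(data, poly, width, init):
    """Bitwise MSB-first CRC with no reflection and no final XOR."""
    top = 1 << (width - 1)
    mask = (1 << width) - 1
    crc = init & mask
    for byte in data:
        crc ^= byte << (width - 8)          # feed the byte into the top bits
        for _ in range(8):
            crc = ((crc << 1) ^ poly if crc & top else crc << 1) & mask
    return crc


if __name__ == "__main__":
    header = bytes.fromhex("A1FE014805")
    crc = crc_msb_first(header, 0x1021, 16, 0xFFFF)
    print(f"header CRC: {crc:04X}")
    # Appending the CRC bytes and running the same function gives zero.
    full = header + crc.to_bytes(2, "big")
    print("residue:", crc_msb_first(full, 0x1021, 16, 0xFFFF))
```

The same function takes wider polynomials by changing `poly`, `width` and `init`, which is all that differs between controllers that only vary in their CRC parameters.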

This site was handy in helping to get the header decoding working for me:
https://crccalc.com/?crc=A1FE0148054FDE&method=CRC-16/IBM-3740&datatype=hex&outtype=hex

As far as the ECC used for confirming/correcting the data section (A1F8), the contributed
Python script mentioned above has the algorithm in place.

Rasz

Oct 27, 2025, 10:21:37 AM
to mfm-d...@googlegroups.com
On Mon, Oct 27, 2025 at 2:26 PM BBMDonut <bbmd...@gmail.com> wrote:
> In all honesty, I'm horrible about remote pushes to my projects much of the time. I blame
> having multiple local backups plus a tinge of perfectionism

I know what you mean :)

> I'll rip off the band-aid, however, and try to get the "ugly" version pushed to a branch tonight.

No need to hurry, I just got it on my own :o :D wooo /happy dance.
It took a lot of garbage logging, but I got there in the end. An example of
the horrors and crimes I committed in the process:

pll_shift b1001001001001001001001001000000100000001 8 5957
self.lock_count 70 6064
pll_shift b1001001001001001001001000000100000001001 3 6064
self.lock_count 71 6101
pll_shift b100100100100100100000010000000100100001 5 6101
byte_synced 4 6170
pll_shift b1001001001001000000100000001001000010001 4 6170
pll_shift b10010010000001000000010010000100010001 4 6223
pll_shift b1001000000100000001001000010001000100001 5 6277
pll_shift b1000000100000001001000010001000100001001 3 6344
RLL_2 b10001000100001001 10001000100001001
RLL_TABLE[pattern] 11 11 4 1000
RLL_TABLE[pattern] 1111 11 8 1000
RLL_TABLE[pattern] 111111 11 12 1000
RLL_TABLE[pattern] 11111110 10 16 0100
RLL_shift b1000000100000001001000010001000100001001 11111110 1 -16 6344
annotate_bits_new 0xfe special.clock
byte_ 1 0
byte_ 0
byte_ 0 0
byte_ 0
byte_ 1 0
byte_ 0
byte_ 0 0
byte_ 0
byte_ 1 0
byte_ 0
byte_ 0 0
byte_ 0
byte_ 0 0
byte_ 1
byte_ 0 0
byte_ 0
pll_shift b100000001001000010001000100001001001 3 6383
pll_shift b100000001001000010001000100001001001001 3 6423
pll_shift b1001000010001000100001001001001001 3 6463
pll_shift b1001000010001000100001001001001001000001 6 6502
RLL_2 b1001001001000001 1001001001000001
RLL_TABLE[pattern] 000 000 6 100100
RLL_TABLE[pattern] 000000 000 12 100100
pll_shift b10001000100001001001001001000001000001 6 6584
pll_shift b1000100001001001001001000001000001000001 6 6663
RLL_2 b1000001000001 0001000001000001
RLL_TABLE[pattern] 000000010 010 6 000100
RLL_shift b1000100001001001001001000001000001000001 00000001 10 -18 6663
annotate_bits_new 0x1 False
byte_ 0 -2
byte_ 0
byte_ 1 -2
byte_ 0
byte_ 0 -2
byte_ 1
byte_ 0 -2
byte_ 0
byte_ 0 -2
byte_ 0
byte_ 0 -2
byte_ 1
byte_ 0 -2
byte_ 0
byte_ 0 -2
byte_ 0
pll_shift b100001001001001001000001000001000001001 3 6744
pll_shift b1001001001001000001000001000001001001 3 6783
RLL_2 b1000001001001 0001000001001001
RLL_TABLE[pattern] 0010 010 6 000100
RLL_TABLE[pattern] 0010010 010 12 000100

not having a real debugger when working with sigrok decoders sucks super bad :(

> Looking at the screenshot of rll_maybe.png, I see a little bit of the issue. For some
> reason, the parsing looks alright until it gets to the CRC section. The gaps between
> the 05 and 08, and the 08 and A6 look a little off. When running the rest of the
> header through the right CRC function, the checksum bytes should come out to
> 4F DE. Using the entire header byte list of "A1 FE 01 48 05 4F DE"

The previous screenshot was still misaligned :) but now I think I got it!
WD1003V-SR1_RLL_820-6.tr, Track 10: Cylinder 1, Head 4, first sector
on the track:
0xA1 0xFE 0x1 0x24 0x1 = CRC 0x411D

Now I'll try to decode the DAM and verify its CRC.

> This site was handy in helping to get the header decoding working for me:
> https://crccalc.com/?crc=A1FE0148054FDE&method=CRC-16/IBM-3740&datatype=hex&outtype=hex

I use this one: https://www.sunshine2k.de/coding/javascript/crc/crc_js.html
It allows custom polynomials and custom initial values.

Rasz

Oct 28, 2025, 3:58:28 AM
to mfm-d...@googlegroups.com
On Mon, Oct 27, 2025 at 3:23 PM Rasz <citi...@gmail.com> wrote:
> got it on my own :o :D wooo /happy dance
>
> now ill try to decode DAM and verify crc

D:\_code\disk mfm\sidecat\test>test.py "D:\_code\disk mfm\MfmDecoder\hd_samples\ST-278R\RLL_1-1_interleave_26sect\WD1003V-SR1_RLL_820-6.tr" -t0 -d track0
File Type: 1 (Transition), Major Version: 2, Minor Version: 2
Offset to first track: 121 bytes
Track header size: 12 bytes
Number of cylinders: 820
Number of heads: 6
Number of tracks (cylinders*heads): 4920
Transition count rate: 200,000,000 Hz
Command line: --heads 6 --cylinders 820 --sector_length 512 --retries 50,4 --drive 1
Note:
Start time from index: 0 ns
Header CRC: read bbfa068c, computed bbfa068c
Track 0: Cylinder 0, Head 0
Data bytes: 77356
Track CRC: read ee8a3234, computed ee8a3234
Writing D:\_code\disk mfm\sidecat\test\track0.vcd

D:\_code\disk mfm\sidecat\test>sigrok-cli.exe -D -i "D:\_code\disk
mfm\sidecat\test\track0.vcd" -P
mfm:data_rate=7500000:encoding=RLL:header_bytes=3:data_crc_bits=56:data_crc_poly=0x140a0445000101:report=DAM:report_qty=26
-A mfm=fields:reports
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=1, len=512
mfm-1: CRC OK BAE9
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK 226506C50A78BD
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=2, len=512
mfm-1: CRC OK 8A8A
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK 36B8CBF4C5926E
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=3, len=512
mfm-1: CRC OK 9AAB
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=4, len=512
mfm-1: CRC OK EA4C
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=5, len=512
mfm-1: CRC OK FA6D
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=6, len=512
mfm-1: CRC OK CA0E
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=7, len=512
mfm-1: CRC OK DA2F
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=8, len=512
mfm-1: CRC OK 2BC0
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=9, len=512
mfm-1: CRC OK 3BE1
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=10, len=512
mfm-1: CRC OK B82
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=11, len=512
mfm-1: CRC OK 1BA3
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=12, len=512
mfm-1: CRC OK 6B44
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=13, len=512
mfm-1: CRC OK 7B65
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=14, len=512
mfm-1: CRC OK 4B06
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=15, len=512
mfm-1: CRC OK 5B27
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=16, len=512
mfm-1: CRC OK B8F9
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=17, len=512
mfm-1: CRC OK A8D8
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=18, len=512
mfm-1: CRC OK 98BB
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=19, len=512
mfm-1: CRC OK 889A
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=20, len=512
mfm-1: CRC OK F87D
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=21, len=512
mfm-1: CRC OK E85C
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=22, len=512
mfm-1: CRC OK D83F
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=23, len=512
mfm-1: CRC OK C81E
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=24, len=512
mfm-1: CRC OK 39F1
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=25, len=512
mfm-1: CRC OK 29D0
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=26, len=512
mfm-1: CRC OK 19B3
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Summary: IAM=0, IDAM=26, DAM=26, DDAM=0, CRC_OK=52, CRC_err=0,
EiPW=0, CkEr=0, OoTI=55/75902

WD RLL controller decoding is working :) The only dilemma I have right
now is what to do with Sync Marks. I expected it to work like in
FM/MFM dumps, where A1 is always in the same spot no matter what, but here
the magic 100000001001 moves around?! On the same track it can be any of
these three possibilities:
... 3 3 3 7 8 3 5
100100100100000010000000100100001
... 3 3 3 5 8 3 5
1001001001000010000000100100001
... 3 3 3 3 8 3 5
10010010010010000000100100001

I resigned myself to syncing on 8 3 5 and rewinding one bit before
decoding the actual data.
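The relation between the spacing lists and the cell strings above can be sketched in a few lines. This is only an illustration; `intervals_to_bits` and `find_mark` are hypothetical helper names, not functions from the decoder. Each spacing n becomes n-1 zeros followed by a one, and the sync search simply looks for the 8 3 5 run regardless of how long the spacing in front of it is.

```python
def intervals_to_bits(intervals):
    """Expand pulse spacings (counted in bit cells) into a string of cells.

    The string starts with the pulse that opens the first interval.
    """
    bits = "1"
    for n in intervals:
        bits += "0" * (n - 1) + "1"
    return bits


def find_mark(intervals, mark=(8, 3, 5)):
    """Return the index where `mark` starts in `intervals`, or -1."""
    mark = list(mark)
    for i in range(len(intervals) - len(mark) + 1):
        if list(intervals[i:i + len(mark)]) == mark:
            return i
    return -1


if __name__ == "__main__":
    for lead_in in (7, 5, 3):
        track = [3, 3, 3, lead_in, 8, 3, 5]
        print(intervals_to_bits(track), "mark at index", find_mark(track))
```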
I don't know what to display in place of the Sync Mark for the user. I
found only one patent, which suggests the illegal 100000001001 is
created by omitting one pulse from 100010001001, which is supposed to be
5EAx. That doesn't work here :( No matter where I inject the
additional pulse, whatever decodes in that spot leaves one dangling 0
bit :|
Oh well, moving on to the Seagate RLL dump, maybe working on that one will
give me some ideas.
rll_working.png

BBMDonut

Oct 28, 2025, 8:58:11 AM
to MFM Discuss
This is pretty much the same way I handled it on my end, so it's good to have
confirmation that other people see it this way as well.

As for the variable length on the sync marks, that's been causing me some
craziness too - I've been looking at getting the RLL support added to the
hardware emulator itself, but need to dig a little more into the track layout
descriptors to see where/how those fields can be defined. The data sheet
for the 50C12 controller IC (the closest thing I can find to my 5011 on the 
WD1004-27X) seems to validate the variable-length fields from what I can
tell (see attached image).


> I dont know what to display in place of Sync Mark for the user. I
> found only one patent where they suggest the illegal 100000001001 is
> created by omitting impulse from 100010001001 which is supposed to be
> 5EAx. That doesnt work here :( no matter where I would inject
> additional impulse whatever decodes in that spot leaves one dangling 0
> bit :|
> Oh well, moving on to Seagate RLL dump, maybe working on that one will
> give me some ideas.

Awesome work! Another idea I had, but have zero time to work on, is something
similar to the HxC Floppy Emulator Toolkit but for MFM/RLL HDD images
(or expanding out the existing tool to support them, if possible). I've been wanting
to see graphical representations of the disk data ever since I started working with
the MFM Emulator. 

wd50c12_soft_sector_format.png

David Gesswein

Oct 28, 2025, 9:42:16 AM
to mfm-d...@googlegroups.com
On Tue, Oct 28, 2025 at 05:58:11AM -0700, BBMDonut wrote:
Lots of good fun deleted

> I've been looking at getting the RLL support added to the
> hardware emulator itself, but need to dig a little more into the track
> layout
> descriptors to see where/how those fields can be defined.
>

The emulator is storing the raw MFM and hopefully soon RLL bit patterns. The
only decoding it is doing is converting the transition timing to 1's and 0's.
Any further decoding is done with mfm_util. I picked that so it doesn't need to
know much about the format though it wastes space.

Rasz

Oct 28, 2025, 11:03:38 AM
to mfm-d...@googlegroups.com
On Tue, Oct 28, 2025 at 1:58 PM BBMDonut <bbmd...@gmail.com> wrote:

>The data sheet for the 50C12 controller IC

Oh that is a nice datasheet, I havent seen it before.

> Awesome work! Another idea I had, but have zero time to work on, is something
> similar to the HxC Floppy Emulator Toolkit but for MFM/RLL HDD images
> (or expanding out the existing tool to support them, if possible). I've been wanting
> to see graphical representations of the disk data ever since I started working with
> the MFM Emulator.

https://github.com/davidgiven/fluxengine is the closest I could find.
It could be extended with HDD support, and maybe its low-level
display could add the same per-flux-transition labels as the PulseView
decoder plugin I adopted.

Got the Seagate decoding working now :D Reading patents finally paid
off - Seagate does use 0x5EA1 as the sync marker for headers and
0xDEA1 for data! At least leaves no ambiguity what to display in UI.
In contrast WD 50C12 datasheet just states illegal 1000 0000 1001 0000
but doesnt explicitly mention anything about it randomly sliding
around up to 4 bits away from sync pattern :|
"Since each address mark should be preceded by approximately 12 bytes
of zeros, when a sequence of zeros is detected by the activation of
DRUN, Read Gate is activated and read data is examined until either an
address mark is detected or a non-zero byte which is not an address
mark is detected."
"If a non-zero non-address mark byte is detected, read gate is dropped
for at least 2 byte times, allowing the phase lock loop to
resynchronize with the write clock, before inspecting DRUN input
again."
2 bytes is not 4 bits, so what is going on :| Figures 5 and 6, on the other
hand, read to me as if after syncing the PLL the controller just looks for
an AM in a window of 2 or 3 bytes, without saying what they mean by bytes
(32-46 clocks?).

Btw Seagate uses 32bit CRC in RLL mode, but writes full 56 bits with
last 2 bytes zeroed before starting GAP3, so even more waste :) and of
course there are 27 sectors on every track again, like there was in
MFM mode, with last one unused/unusable (no GAP2 before data = cant
safely write to it) :|

Rasz

Nov 9, 2025, 11:29:45 PM
to MFM Discuss
On Tue, Oct 21, 2025 at 1:19 PM Rasz <citi...@gmail.com> wrote:
> My ultimate goal is adding RLL support.

Hardcoded RLL decoding was easy. Making a single universal decoder able to handle multiple formats required some thinking. The hardest part was figuring out how to name and display sync marks :-) I settled on mangling the invalid 0b100000001001 into 0b100010001001 and just displaying whatever is there as part of the sync mark. That way the WD one turns into a nice 0xF0 combined with header ((val & 0xF4) == 0xF4) and 0xF0F8 data.
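The mangling step described here amounts to putting one extra pulse back into the 8-cell gap, so that the 8 3 spacing becomes 4 4 3. A minimal sketch on a string of cells, assuming the cells are already available as text; `patch_sync_mark` is a hypothetical name and the real decoder may well do this on pulse intervals instead:

```python
ILLEGAL = "100000001001"   # pattern as found on disk
PATCHED = "100010001001"   # same cells with one pulse re-inserted


def patch_sync_mark(cells: str) -> str:
    """Replace the first illegal sync pattern with its patched form."""
    return cells.replace(ILLEGAL, PATCHED, 1)


if __name__ == "__main__":
    # Sync train followed by the 8 3 5 spacing from the earlier post.
    cells = "10010010010010000000100100001"
    print(cells)
    print(patch_sync_mark(cells))
```

Only the first match is replaced, on the assumption that the caller hands in one sync region at a time.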

>sigrok-cli.exe -D -i "D:\_code\disk mfm\sigrok-mfm\test\hdd_rll_wd1003.sr" -P mfm:data_rate=7500000:encoding=RLL_WD:data_crc_bits=56:data_crc_poly=0x140a0445000101:header_bytes=3:report=DAM:report_qty=26 -A mfm=reports

mfm-1: Summary: IAM=0, IDAM=26, DAM=26, DDAM=0, CRC_OK=52, CRC_err=0, EiPW=0, CkEr=0, OoTI=55/75902
pulseview_headers_hdd_rll_wd1003.png

Seagate is weird. I spent a few days just wondering how to handle it. It does produce the nice 5EA1 promised in the patents, but only if you start decoding while still in the 001001001 sync pattern train; without it that nice 0x95EA100 (01001001000100100010) turns into 0x1EA1 (00010010001000100100) for headers and 0xDEA1F8 for data. I just display those 0x1EA1/0xDEA1F8 as the sync mark. For some reason Seagate (and, I later learned, probably more vendors) decided their 0x1EA1 sync mark is unique enough to not bother with a proper ID mark. The ID Record starts right after 0x1EA1, forcing me to rewrite my decoding state machine to make it more flexible.

>sigrok-cli.exe -D -i "D:\_code\disk mfm\sigrok-mfm\test\hdd_rll_st21r.sr" -P mfm:data_rate=7500000:encoding=RLL_SEA:header_crc_bits=32:header_crc_init=0:header_crc_poly=0x41044185:data_crc_init=0:data_crc_poly=0x41044185:report=DAM:report_qty=27 -A mfm=reports
mfm-1: Summary: IAM=0, IDAM=27, DAM=27, DDAM=0, CRC_OK=54, CRC_err=0, EiPW=0, CkEr=0, OoTI=82/44854
pulseview_headers_hdd_rll_st21r.png

That Seagate complication turned into a blessing when I opened \hd_samples\ST-278R\RLL_1-1_interleave_26sect\ACB-2372_RLL_820-6.tr today and was able to support it in less than 5 minutes :o It was just a matter of defining a few fields:
encoding.RLL_Adaptec: { # (2,7) RLL Adaptec ACB-237x
    'table': RLL_IBM_R,
    'cells_allowed': (3, 4, 5, 6, 7, 8),
    'sync_pattern': 3,
    'sync_marks': [[4, 3, 8, 3, 4], [5, 6, 8, 3, 4], [8, 3, 4]],
    'shift_index': [18, 18, 18],
    'IDsync_mark': [0xA1],
    'IDData_mark': [0xA0],
    'nop_mark': [0x1E, 0x5E, 0xDE],
    'pb_state': state.sync_mark,
},

and it magically worked almost 100% :o

>sigrok-cli.exe -D -i "D:\_code\disk mfm\sidecat\test\track1.vcd" -P mfm:data_rate=7500000:encoding=RLL_Adaptec:header_crc_init=0:report=DAM:report_qty=26 -A mfm=fields:reports
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=1, len=0
mfm-1: CRC OK F380

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error D9912681
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=14, len=1024
mfm-1: CRC OK 9359

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=2, len=1024
mfm-1: CRC OK D634

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error D4AC9E77
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=15, len=0
mfm-1: CRC OK C0AE

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=3, len=0
mfm-1: CRC OK 2489

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=16, len=0
mfm-1: CRC OK 72A9

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=4, len=0
mfm-1: CRC OK BD1E

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=17, len=0
mfm-1: CRC OK 4198

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=5, len=0
mfm-1: CRC OK 8E2F

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=18, len=0
mfm-1: CRC OK 14CB

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=6, len=0
mfm-1: CRC OK AB9B

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=19, len=0
mfm-1: CRC OK 775F

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes

mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=7, len=128
mfm-1: CRC OK 19A2

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=20, len=128
mfm-1: CRC OK 4F82

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes

mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=8, len=128
mfm-1: CRC OK 99C

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=21, len=128
mfm-1: CRC OK 7CB3
mfm-1: Sync pattern 12 bytes

mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes

mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=9, len=128
mfm-1: CRC OK 3AAD

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=22, len=128
mfm-1: CRC OK 29E0

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes

mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=10, len=128
mfm-1: CRC OK 6FFE

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=23, len=128
mfm-1: CRC OK 1AD1

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes

mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=11, len=128
mfm-1: CRC OK 5CCF

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=24, len=128
mfm-1: CRC OK AEF

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes

mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=12, len=128
mfm-1: CRC OK C558

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=25, len=128
mfm-1: CRC OK 39DE

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes

mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=13, len=128
mfm-1: CRC OK F669

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Sync pattern 9 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=26, len=128
mfm-1: CRC OK 6C8D

mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC error CD3E79B9
mfm-1: Summary: IAM=0, IDAM=26, DAM=26, DDAM=0, CRC_OK=26, CRC_err=26, EiPW=0, CkEr=0, OoTI=59/44350
pulseview_headers_hdd_rll_Adaptec.png

Looks like it uses a 48-bit data ECC that I don't know/support yet, and the header len flipping between 0, 128 and 1024 makes it obvious the header is in a different format that I also don't know yet :)


Question: Who is the owner of those awesome and super helpful /hd_samples/ in https://bitsavers.org/projects/hd_samples/ ? I'm asking because I would love to attach one track of each to my test samples library https://github.com/raszpl/sigrok-disk?tab=readme-ov-file#available-test-sample-files and ship it with the decoder.
pulseview_headers_hdd_rll_wd1003.png
pulseview_headers_hdd_rll_st21r.png
pulseview_headers_hdd_rll_Adaptec.png

David Gesswein

Nov 10, 2025, 8:26:00 PM
to mfm-d...@googlegroups.com
On Sun, Nov 09, 2025 at 08:29:45PM -0800, Rasz wrote:
>
>
> Question: Who is the owner of those awesome and super helpful /hd_samples/
> in https://bitsavers.org/projects/hd_samples/ ? Im asking because I would
> love to attach one track of each to my test samples
> https://github.com/raszpl/sigrok-disk?tab=readme-ov-file#available-test-sample-files
> library and ship it with the Decoder.

Al Kossow. I don't think he would have issue with that since its public.
Would be nice to credit bitsavers.

Al Kossow

Nov 10, 2025, 8:43:10 PM
to mfm-d...@googlegroups.com
that's fine with me. it is what they are there for

Rasz

Nov 12, 2025, 4:08:20 AM
to mfm-d...@googlegroups.com
On Tue, Nov 11, 2025 at 2:43 AM Al Kossow <a...@bitsavers.org> wrote:
> that's fine with me. it is what they are there for

Thank you sir for sampling those drives! Your captures were simply
indispensable in implementing RLL decoding.

I stopped being lazy and found the original thread on VCFED
https://forum.vcfed.org/index.php?threads/rll-drive-sampling-project.1209575/
I was thinking about moving the discussion of the samples there, but here
might be a better place for technical stuff + my post is hanging in moderation.

ST-251\MFM_1-1_interleave_17sect\AMS1100M4_820-6.tr
>Bad block set on cyl 622, head 1, sector 1
>ECC Corrections on cylinder 622 head 1: 9(5)

Sector 1 looks fine. Sector 9, on the other hand, has two massive holes:
680ns followed by 910ns

>1 sectors corrected with ECC. Max bits in burst corrected 5

Im dubious about that. There should be neat train of 400ns pulses
here, instead we get
mfm-1: 395ns
mfm-1: 395ns
mfm-1: 370ns
mfm-1: 405ns
mfm-1: 680ns out-of-tolerance leading edge
mfm-1: 910ns out-of-tolerance leading edge
mfm-1: 430ns
mfm-1: 425ns
mfm-1: 400ns

1590ns adds to ~9 bad bits? and its not like that first 680ns lands
anywhere near a good spot. Can ECC really rescue that? :o If thats
possible then I would love to implement it in sigrok-disk :)

Later same ST-251 on Everex EV-346 using exact same Cirrus Logic
CL-SH260 controller:
ST-251\MFM_1-1_interleave_17sect\EV346_MFM_820-6.tr
>ECC Corrections on cylinder 622 head 1: 9(6)
>1 sectors corrected with ECC. Max bits in burst corrected 6

sector 7 at ~same spot:
mfm-1: 200ns s1644024 - 1644064
mfm-1: 375ns s1644064 - 1644139
mfm-1: 265ns s1644139 - 1644192
mfm-1: 290ns s1644192 - 1644250
mfm-1: 375ns s1644250 - 1644325
mfm-1: 240ns s1644325 - 1644373
mfm-1: 415ns s1644373 - 1644456
mfm-1: 325ns s1644456 - 1644521
mfm-1: 315ns s1644521 - 1644584

Right now my decoder is doing the stupid thing of not flagging
out-of-tolerance pulses when they are still inside allowed 200-400ns
window and instead just guesses whats most probable at the time (PLL
is tracking window size). 240ns becomes 200ns, 265ns becomes 300ns,
375ns becomes 400ns etc. With that naive approach whole sector 9
decodes perfectly with no ECC corrections required.

It looks like Everex Format simply got lucky landing in less bad
magnetic spots and pulse just got slightly moved around instead of
outright missing transitions like the AMS one?

David Gesswein

Nov 16, 2025, 8:55:24 PM
to mfm-d...@googlegroups.com
On Wed, Nov 12, 2025 at 10:08:01AM +0100, Rasz wrote:
>
> ST-251\MFM_1-1_interleave_17sect\AMS1100M4_820-6.tr
> >Bad block set on cyl 622, head 1, sector 1
> >ECC Corrections on cylinder 622 head 1: 9(5)
>
> Sector 1 looks fine, Sector 9 on the other hand has two massive 680ns
> followed by 910ns holes
>
> >1 sectors corrected with ECC. Max bits in burst corrected 5
>
> Im dubious about that. There should be neat train of 400ns pulses
> here, instead we get
> mfm-1: 395ns
> mfm-1: 395ns
> mfm-1: 370ns
> mfm-1: 405ns
> mfm-1: 680ns out-of-tolerance leading edge
> mfm-1: 910ns out-of-tolerance leading edge
> mfm-1: 430ns
> mfm-1: 425ns
> mfm-1: 400ns
>
> 1590ns adds to ~9 bad bits? and its not like that first 680ns lands
> anywhere near a good spot. Can ECC really rescue that? :o If thats
> possible then I would love to implement it in sigrok-disk :)
>
I created extracted data file with and without ECC correction.
Without correction that decoded to ... AA AA 80 AA AA ...
With correction the 0x80 became 0xAA. So ECC changed binary 00000 to 10101 or
5 bit correction. Some of the ECC codes were claimed to correct up to 11 bits.
If you have a single error burst they do well. If you have multiple separate
errors they can do false corrections where the bits changed make the CRC match
but its not really the original data. I reduced the correction length based on
polynomial to try to reduce miscorrections at expense of not correcting all
errors that could be fixed.


> Later same ST-251 on Everex EV-346 using exact same Cirrus Logic
> CL-SH260 controller:
> ST-251\MFM_1-1_interleave_17sect\EV346_MFM_820-6.tr
> >ECC Corrections on cylinder 622 head 1: 9(6)
> >1 sectors corrected with ECC. Max bits in burst corrected 6
>
> sector 7 at ~same spot:
> mfm-1: 200ns s1644024 - 1644064
> mfm-1: 375ns s1644064 - 1644139
> mfm-1: 265ns s1644139 - 1644192
> mfm-1: 290ns s1644192 - 1644250
> mfm-1: 375ns s1644250 - 1644325
> mfm-1: 240ns s1644325 - 1644373
> mfm-1: 415ns s1644373 - 1644456
> mfm-1: 325ns s1644456 - 1644521
> mfm-1: 315ns s1644521 - 1644584
>
> Right now my decoder is doing the stupid thing of not flagging
> out-of-tolerance pulses when they are still inside allowed 200-400ns
> window and instead just guesses whats most probable at the time (PLL
> is tracking window size). 240ns becomes 200ns, 265ns becomes 300ns,
> 375ns becomes 400ns etc. With that naive approach whole sector 9
> decodes perfectly with no ECC corrections required.
>
Mine decodes sector 7 without error. Its sector 9 where I have decode
error.

Mine decodes it as A5 B0 05 A5 which should be A5 A5 A5 A5. ECC corrects.

I see big gap in deltas in sector 9 which is where I did a correction.
Delta in ns
295
385
200
370
290
255
1035 (at delta 26870 of track, 8276160ns)
340
305

Rasz

Nov 17, 2025, 6:18:02 AM
to mfm-d...@googlegroups.com
On Mon, Nov 17, 2025 at 2:55 AM David Gesswein <d...@pdp8online.com> wrote:
>
> On Wed, Nov 12, 2025 at 10:08:01AM +0100, Rasz wrote:
> > mfm-1: 405ns
> > mfm-1: 680ns out-of-tolerance leading edge
> > mfm-1: 910ns out-of-tolerance leading edge
> > mfm-1: 430ns
> >
> > 1590ns adds to ~9 bad bits?

> I created extracted data file with and without ECC correction.
> Without correction that decoded to ... AA AA 80 AA AA ...
> With correction the 0x80 became 0xAA. So ECC changed binary 00000 to 10101 or
> 5 bit correction.

Thank you, I get it now. glitch is
01 00 00 00 10 00 00 00 01 = 10000000 with bad clock
we are fixing it back to
01 00 01 00 01 00 01 00 01 = 10101010
in reality only 5 data bits were ever flipped
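The two cell strings above can be checked mechanically. Below is a small sketch that splits MFM cells into (clock, data) pairs, returns the data bits, and lists the pairs whose clock bit breaks the MFM rule (clock is 1 only between two 0 data bits). `mfm_decode` is a name invented for this example, and the first pair's clock is not checked because its predecessor is unknown.

```python
def mfm_decode(cells: str):
    """Return (data_bits, bad_pairs) for a string of MFM cells.

    cells:     '0'/'1' characters, spaces allowed, starting on a clock cell
    data_bits: the second cell of every pair, as a string
    bad_pairs: indices of pairs whose clock cell violates the MFM rule
    """
    cells = cells.replace(" ", "")
    pairs = [cells[i:i + 2] for i in range(0, len(cells) - 1, 2)]
    data = [int(p[1]) for p in pairs]
    bad = []
    for i in range(1, len(pairs)):
        # MFM writes a clock pulse only between two consecutive 0 data bits.
        expected = 1 if data[i - 1] == 0 and data[i] == 0 else 0
        if int(pairs[i][0]) != expected:
            bad.append(i)
    return "".join(str(b) for b in data), bad


if __name__ == "__main__":
    print(mfm_decode("01 00 00 00 10 00 00 00 01"))  # glitched read
    print(mfm_decode("01 00 01 00 01 00 01 00 01"))  # what was written
```

The glitched string decodes to data, but most of its clock cells are wrong, which is the "bad clock" part of the remark above.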

In my defense I got the flu (or covid, hard to tell nowadays) last
week and have been loopy ever since. I feel like LLM, yesterday I
started sector decoder rewrite after a dream I had only to realize
half way Im hallucinating non existent states :|

> > sector 7 at ~same spot:
> > mfm-1: 375ns s1644064 - 1644139
> > mfm-1: 265ns s1644139 - 1644192
> > mfm-1: 290ns s1644192 - 1644250
> > mfm-1: 375ns s1644250 - 1644325
> > mfm-1: 240ns s1644325 - 1644373

> Mine decodes sector 7 without error. Its sector 9 where I have decode
> error.

7 was a typo :(. I meant 9, sample numbers are correct. My decoder
arrives at A5A5A5 by plowing thru this doing the stupid
self.halfbit_cells = round(pulse_ticks / self.halfbit)
where halfbit is the PLL's estimate and pulse_ticks is the time elapsed since the last pulse.
I probably should apply some healthy threshold here like I do when
training on sync sequence. I think I saw some discussion about smart
approaches at https://github.com/kristomu/flux-analyze or
https://github.com/davidgiven/fluxengine
maybe this one https://github.com/davidgiven/fluxengine/issues/385
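One simple form such a threshold could take is sketched below: round to the nearest cell count as before, but also report pulses that land too far from the centre of a cell or outside the allowed spacings. This is only an assumption about what a "healthy threshold" might look like; `quantize`, the 0.3 cell tolerance and the 100ns half-bit are illustrative values, not taken from sigrok-disk.

```python
def quantize(pulse_ns, halfbit_ns=100.0, tolerance=0.3, allowed=(2, 3, 4)):
    """Round a pulse spacing to whole half-bit cells and flag doubtful ones.

    Returns (cells, suspicious). A pulse is suspicious when the rounded
    cell count is not a legal MFM spacing, or when the pulse sits more
    than `tolerance` cells away from the nearest whole cell count.
    """
    ratio = pulse_ns / halfbit_ns
    cells = round(ratio)
    off_centre = abs(ratio - cells) > tolerance
    return cells, (cells not in allowed) or off_centre


if __name__ == "__main__":
    for ns in (240, 265, 375, 400, 680, 910):
        print(ns, quantize(ns))
```

With these numbers 240ns and 265ns still decode as 2 and 3 cells but get flagged, while 375ns passes as a clean 4.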

> I see big gap in deltas in sector 9 which is where I did a correction.
> Delta in ns
> 255
> 1035 (at delta 26870 of track, 8276160ns)
> 340
> 305

1035 at 8276160ns? in
\hd_samples\ST-251\MFM_1-1_interleave_17sect\EV346_MFM_820-6.tr
Cylinder 622, Head 1?
hmm lets check
test.py "D:\_code\disk
mfm\MfmDecoder\hd_samples\ST-251\MFM_1-1_interleave_17sect\EV346_MFM_820-6.tr"
-t3733 -d track1
Track 3733: Cylinder 622, Head 1
Data bytes: 54717
Track CRC: read 2ac1d10c, computed 2ac1d10c
Track 3734: Cylinder 622, Head 1
Data bytes: 54717
Track CRC: read d7daaea5, computed d7daaea5
Track 3735: Cylinder 622, Head 1
Data bytes: 54719
Track CRC: read 94905ed2, computed 94905ed2
Track 3736: Cylinder 622, Head 1
Data bytes: 54719
Track CRC: read a99acfa4, computed a99acfa4
what the hell is going on? Track 3737 finally flips to Cylinder 622, Head 2

So in this dump Track 3733 had no CRC errors, but Track 3734 does
contain 1035ns glitch you are talking about, and also 3735, while 3733
and 3736 decode fine. and all of those contain same Cylinder 622, Head
1? or am I disordered again
I dont remember seeing anything about possibility of doubled tracks in
.tr dumps, not that I looked that hard into source as the format
seemed very straightforward. Was this triggered by Emu hardware
detecting error and trying few reads? but why would first try be ok?
weird

David Gesswein

Nov 17, 2025, 9:51:29 PM
to mfm-d...@googlegroups.com
On Mon, Nov 17, 2025 at 12:17:49PM +0100, Rasz wrote:
> So in this dump Track 3733 had no CRC errors, but Track 3734 does
> contain 1035ns glitch you are talking about, and also 3735, while 3733
> and 3736 decode fine. and all of those contain same Cylinder 622, Head
> 1? or am I disordered again
>
I see CRC errors on sector 9 in all 4 tracks with my decoder.

The first 3 track decode as 0xa5,0xb0,0x05,0xa5
The last track decodes as 0xa5,0xa0,0x85,0xa5 which is correctable in the
6 bit ECC limit so it corrected and retries stop.
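The burst lengths mentioned in this thread can be reproduced by XORing the bytes as read with the corrected bytes and measuring the distance from the first to the last differing bit. A minimal sketch (`burst_span` is a hypothetical helper, not part of mfm_util or sigrok-disk):

```python
def burst_span(read: bytes, fixed: bytes) -> int:
    """Length in bits of the smallest window covering all differing bits."""
    if len(read) != len(fixed):
        raise ValueError("both byte strings must have the same length")
    diff = int.from_bytes(read, "big") ^ int.from_bytes(fixed, "big")
    if diff == 0:
        return 0
    lowest = (diff & -diff).bit_length() - 1   # position of lowest set bit
    return diff.bit_length() - lowest


if __name__ == "__main__":
    want = bytes([0xA5, 0xA5, 0xA5, 0xA5])
    print(burst_span(bytes([0xA5, 0xB0, 0x05, 0xA5]), want))  # first 3 reads
    print(burst_span(bytes([0xA5, 0xA0, 0x85, 0xA5]), want))  # last read
```

The first read needs a window wider than 6 bits and the last one fits exactly, which lines up with the retry behaviour described above.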

For the first 3 it doesn't have the really large delta but does have a
number of short deltas which pulls the decoding off so it thinks there
are 2 zeros between ones when it should be 3 zeros. Takes it a little while
to recover so further bits are wrong.

When I have free time it may be worth comparing how the decoders compare on real
marginal disks.

Rasz

Nov 19, 2025, 1:53:27 PM
to mfm-d...@googlegroups.com
On Tue, Nov 18, 2025 at 3:51 AM David Gesswein <d...@pdp8online.com> wrote:
> I see CRC errors on sector 9 in all 4 tracks with my decoder.

ah of course, that is obvious in hindsight :) so my bruteforce method
just guesses correctly

> For the first 3 it doesn't have the really large delta but does have a
> number of short deltas which pulls the decoding off

somehow (I try not to think about it too hard, I didn't tune those,
just pulled them out of old course material) a PI PLL with
`pll_kp` PLL: PI Filter proportional constant (Kp). : `0.5`
`pll_ki` PLL: PI Filter integral constant (Ki). : `0.0005`
and bruteforce matching works great on those
I didnt scan all of the samples, but just random checking few tracks I
stumbled on only one where those parameters didnt work out of the box.
MattisLind https://forum.vcfed.org/index.php?threads/rll-drive-sampling-project.1209575/post-1209655
hdd_rll_ACB4070.sr has some really funky pulses on Track 4444:
Cylinder 740, Head 4 ~574500ns
mfm-1: 200ns s114864 - 114904
mfm-1: 420ns s114904 - 114988
mfm-1: 440ns s114988 - 115076
mfm-1: 210ns s115076 - 115118
mfm-1: 400ns s115118 - 115198
mfm-1: 475ns s115198 - 115293
mfm-1: 190ns s115293 - 115331
mfm-1: 400ns s115331 - 115411
mfm-1: 465ns s115411 - 115504
mfm-1: 210ns s115504 - 115546
mfm-1: 380ns s115546 - 115622

420ns 440ns :-) amazingly pll_kp=1 is able to recover even that. I
don't have polys for ACB4070 yet to verify the whole dump automagically,
but bumping Kp does nudge it back on track to decoding 0x6C 0x6C 0x6C
for the whole sector.
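
For readers who have not met the loop filter that Kp and Ki feed, here is a minimal sketch of a PI-filtered software PLL turning pulse intervals into bit-cell counts. This is not the plugin's code: the function name `pll_decode`, the exact phase/period update rule and the 100 ns example cell are assumptions made for illustration. Only the default Kp and Ki values come from the post above.

```python
def pll_decode(intervals_ns, cell_ns, kp=0.5, ki=0.0005):
    """Assign a whole number of bit cells to each pulse interval.

    Hypothetical PI loop: kp corrects the phase immediately, ki slowly
    trims the cell period from the accumulated phase error.
    """
    period = float(cell_ns)  # current estimate of one bit cell
    offset = 0.0             # phase error carried over from the last edge
    integ = 0.0              # running sum (integral) of the phase error
    cells_out = []
    for dt in intervals_ns:
        pos = offset + dt                        # edge position on the PLL clock
        cells = max(1, int(pos / period + 0.5))  # nearest whole cell count
        err = pos - cells * period               # positive: the edge came late
        integ += err
        offset = (1.0 - kp) * err                # proportional term: phase pull
        period = cell_ns + ki * integ            # integral term: frequency trim
        cells_out.append(cells)
    return cells_out


# MFM at 5 Mbit/s: 100 ns cells, legal intervals are 2, 3 or 4 cells.
print(pll_decode([200, 310, 390, 205, 295, 400], 100))
```

In this model kp=1 drops the carried-over phase error completely, so every interval is judged on its own, while smaller values let one bad edge bias the next decision. Whether that matches what the real decoder does on the ACB4070 track is not something this sketch can show.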

> When I have free time it may be worth comparing how the decoders compare on real marginal disks.

I think I saw a collection of very bad floppy captures on
https://github.com/kristomu/flux-analyze and/or
https://github.com/davidgiven/fluxengine together with some converters
between formats (I didn't know kryoflux format is or was a raw sqlite
database dump :o). I will look into that after taking a small break.
Going through all of Al's samples, figuring formats out, categorizing and
cataloging took me a while but I got there in the end! Updated list
with parameters I was able to decipher at
https://github.com/raszpl/sigrok-disk?tab=readme-ov-file#available-test-sample-files
Only a few surprises.
1. Seagate controller in mfm mode. Both ST-251 and ST-278R dumps are
only 32MB (3690 tracks) and start at Track 6. Weird. What I initially
assumed to be a service area are actually remnants of previous formats
Al had performed :) The ST-251 one was done right after ACB2372,
and the ST-278R one after OMTI8240 :]
I only figured this out right now while writing this reply, another
victim of my flu hallucinations.

2. OMTI polynomial 0x0104c981 init 0xd4d7ca20 works only for OMTI 8240
data. The header is also clearly 32bit but the poly doesn't fit.
OMTI 8247 uses 48bit ECC so no go. I think OMTI was calling their
48bit ECC a trade secret in datasheets/prospects :) will try guessing
it later with RevEng.

3. Adaptec ACB-4070 uses 32bit CRC in RLL mode plus those wobbly
readings. One would expect better from a SCSI product.

4. DTC-7287 remains a mystery

5. Python still sucks. Being limited to Python 3.4 doubly so. Enums
are a joke. Quick benchmark of enums before I rewrote the thing,
Python 3.4:
# Direct dict Access d['key']: 3.3710 seconds
# Direct dict Enum Access d[enum.key]: 157.9954 seconds
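
For anyone who wants to repeat the comparison on their own Python version, a minimal sketch of such a micro-benchmark follows. It is not the script that produced the numbers above: the `Field` enum, the dict contents and the iteration count are made up for illustration, and absolute timings will differ from machine to machine.

```python
import enum
import timeit


class Field(enum.Enum):
    # Hypothetical field types, only here to have something to look up.
    SYNC = 0
    IDAM = 1
    DAM = 2


by_name = {'DAM': 'data mark'}
by_enum = {Field.DAM: 'data mark'}


def bench(n=200000):
    """Time n dict lookups keyed by a string and by an Enum member."""
    t_str = timeit.timeit(lambda: by_name['DAM'], number=n)
    # Field.DAM is looked up on the class at every call; that attribute
    # access, rather than the dict lookup itself, is the likely cost.
    t_enum = timeit.timeit(lambda: by_enum[Field.DAM], number=n)
    return t_str, t_enum


if __name__ == '__main__':
    t_str, t_enum = bench()
    print('str key : %.4f s' % t_str)
    print('enum key: %.4f s' % t_enum)
```

Binding the member to a local or module-level name once (`DAM = Field.DAM`) and indexing with that name is the usual way to take the class attribute lookup out of a hot loop.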

In the end I managed to slightly (~10%) improve on original plugin
code, but its still abysmal:
fdd_fm.sr 1.0028 seconds
fdd_mfm.sr 1.4010 seconds
hdd_mfm_RQDX3.sr 2.2875 seconds
hdd_mfm_RQDX3.sr 2.2776 seconds
hdd_mfm_AMS1100M4.sr 1.6939 seconds
hdd_mfm_WD1003V-MM2.sr 2.2507 seconds
hdd_mfm_WD1003V-MM2_int.sr 2.0691 seconds
hdd_mfm_EV346.sr 2.0916 seconds
hdd_rll_ST21R.sr 2.3799 seconds
hdd_rll_WD1003V-SR1.sr 2.8801 seconds
Python, when you absolutely, positively want to execute hundreds of
thousands of CPU opcodes per line of code, accept no substitutes!

All this with a fever, I need a break!

Rasz

2:45 AM (17 hours ago)
to mfm-d...@googlegroups.com
<borat voice> Great success!

I finally added Binary Output to the decoder
https://github.com/raszpl/sigrok-disk/tree/main?tab=readme-ov-file#binary-output
idraw - ID Records (Header contents)
dataraw - Data Records, order as on track
iddata - combined ID + Data Records, order as on track
idcrc - whole ID Records including Address Mark and CRC, useful for
reverse engineering the Header CRC
datacrc - whole Data Records including Address Mark and CRC, useful for
reverse engineering the Data CRC

I was thinking how to add those formats
tr - dgesswein/mfm transitions file format
ex - dgesswein/mfm extract file format
but a decoder plugin seems unsuitable for it - it can't generate the header
without knowing what's coming later on the wire. Best I can do is
iddata (combined ID + Data Records, order as on track) for easy
external processing into disk.img (de-interleaving etc).

Binary Output allowed me to use slick RevEng one liners
>sigrok-cli -D -i samples\hdd_rll_OMTI8247.sr -P mfm:data_rate=7500000:format=RLL_OMTI:header_size=4 -B mfm=idcrc | xxd -p -c 8 | paste -sd' ' - | xargs ./reveng -w 16 -F -s
>width=16 poly=0x1021 init=0x7107

>sigrok-cli -D -i samples\hdd_rll_OMTI8247.sr -P mfm:data_rate=7500000:format=RLL_OMTI:data_crc_size=48:sector_size=512 -B mfm=datacrc | split -b 520 -d --additional-suffix=.bin - tmp_chunk_ & reveng.exe -w 48 -F -s -f tmp_chunk_*.bin & rm tmp_chunk_*.bin
>width=48 poly=0x181814503011 init=0x6062ebbf22b4
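
The `xxd -p -c 8 | paste -sd' '` part of the first one-liner only cuts the binary dump into fixed-size records and hex-encodes each one. A minimal Python sketch of that step is below, for systems without those tools. `records_to_hex` is a hypothetical helper, not part of sigrok-disk or RevEng, and unlike `xxd` it drops a trailing partial record.

```python
def records_to_hex(blob, record_len):
    """Split a binary dump into fixed-size records and hex-encode each one."""
    if record_len <= 0:
        raise ValueError('record_len must be positive')
    usable = len(blob) - len(blob) % record_len  # ignore a trailing partial record
    return [blob[i:i + record_len].hex() for i in range(0, usable, record_len)]


# Two 4-byte records plus two leftover bytes that get dropped.
print(' '.join(records_to_hex(bytes(range(10)), 4)))
```

The space-joined strings can then be handed to `reveng -s` as arguments, the same way `xargs` does in the one-liner above.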

The OMTI super duper trade secret 48bit ECC is 0x181814503011; curiously
it's also used by Adaptec! Makes me think there might have been some
cooperation going on, or maybe corporate espionage ;-) The difference is in
the inits. Adaptec 48bit init is 0x010000000000 while OMTI loves weird
ones (obfuscation?): 48bit ECC 0x6062ebbf22b4, 16bit header 0x7107. I
also rediscovered the weird OMTI MFM inits 0x2605fb9c 0xd4d7ca20 (which
David already had for a long time in
https://github.com/dgesswein/mfm/blob/4aa2c1c72277b306eca2fd49ec0203f7184369ec/mfm/inc/mfm_decoder.h#L469).
So far Western Digital are the only controllers/dumps using 56bit ECC.

Running RevEng on Adaptec dumps let me spot small bug I had in
RLL_Adaptec format.

I also managed to speed up super slow python decoding of ~3 seconds
per RLL track to still super slow ~2 seconds. Full decoder benchmark
on 10 year old 4GHz cpu
https://raw.githubusercontent.com/raszpl/sigrok-disk/refs/heads/main/benchmarks/tests.txt
Decoding methods benchmark
https://github.com/raszpl/sigrok-disk/blob/main/benchmarks/decode_bench.py
RLL binary shifts unrolled : 1.609 seconds -> 0.18 MiB/s
MFM SWAR local : 0.654 seconds -> 0.14 MiB/s
4GHz combined with Python can't reach the real time decoding speed of 40
year old hardware powered by an 8051 running at ~833kHz (10MHz internally
divided by 12) :)

The only big thing left is figuring out most of the non standard
Header formats and getting my hands on some ESDI/SMD dumps.

Al Kossow

2:55 AM (17 hours ago)
to mfm-d...@googlegroups.com
On 12/3/25 11:45 PM, Rasz wrote:

> Omti super duper trade secret 48bit ECC is 0x181814503011, curiously
> its also used by Adaptec! making me think there might have been some
> cooperation goin on, or maybe corporate espionage ;-)
Most ECC for disks came from Glover when he was a consultant who eventually wrote

http://bitsavers.org/components/cirrusLogic/Practical_Error-Correction_Design_For_Engineers_2ed_1991.pdf

Rasz

3:30 AM (16 hours ago)
to mfm-d...@googlegroups.com
On Thu, Dec 4, 2025 at 8:55 AM Al Kossow <a...@bitsavers.org> wrote:
> Most ECC for disks came from Glover when he was a consultant who eventually wrote
>
> http://bitsavers.org/components/cirrusLogic/Practical_Error-Correction_Design_For_Engineers_2ed_1991.pdf

3350 code 48bit x48 + x36 + x35 + x23 + x21 + x15 + x13 + x8 + x2 + 1
0x1800a0a105
3370 makes my head hurt
0x140a0445 mentioned as good crc32 + huge table of 32bit polys in octal form

OMTI (or was it Adaptec?) was adamant about the proprietary trade secret
nature of their special poly. Google doesn't seem to have seen
"0x181814503011" x48 + x44 + x43 + x36 + x35 + x28 + x26 + x22 + x20 +
x13 + x12 + x4 + 1 anywhere yet. Makes me happy.
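
Converting between the hex form and the x-power form by hand is error-prone, so here is a small sketch that does it. The helper names `poly_terms` and `poly_string` are made up for this example. It assumes the usual "normal" hex notation, where the top x^width term is implicit and not stored in the value.

```python
def poly_terms(poly, width):
    """List the exponents of a CRC polynomial given in normal hex form.

    The x^width term is implicit in the hex value, so it is added here.
    """
    return [width] + [bit for bit in range(width - 1, -1, -1) if (poly >> bit) & 1]


def poly_string(poly, width):
    """Format the exponents as 'x48 + x44 + ... + 1'."""
    return ' + '.join('1' if e == 0 else 'x%d' % e for e in poly_terms(poly, width))


print(poly_string(0x181814503011, 48))
print(poly_string(0x1800a0a105, 48))
```

Both hex values quoted in this post expand to exactly the term lists written next to them.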

David Gesswein

8:59 AM (11 hours ago)
to mfm-d...@googlegroups.com
On Thu, Dec 04, 2025 at 08:45:17AM +0100, Rasz wrote:
> <borat voice> Great success!
>
> I was thinking how to add those formats
> tr - dgesswein/mfm transitions file format
>
This is the time between positive edge transitions in data sample rate clocks
encoded slightly to reduce file size.

> ex - dgesswein/mfm extract file format
>
This is similar to your dataraw but the sectors are arranged by the sector
number field in the header. Most emulators want any interleave removed
from the disk image.
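
As a rough illustration of that reordering, the sketch below sorts one track's sectors by the sector number from their headers. The `(sector_number, data)` tuple layout and the name `deinterleave` are assumptions for the example, not the actual extract file code or the decoder's iddata layout.

```python
def deinterleave(track_records):
    """Return sector payloads ordered by the sector number in each header.

    track_records: list of (sector_number, data_bytes) in on-track order.
    This tuple layout is hypothetical.
    """
    seen = {}
    for sector, data in track_records:
        seen.setdefault(sector, data)  # keep the first copy of a sector
    return b''.join(seen[s] for s in sorted(seen))


# Sectors as found on a track with interleave, plus one repeated read.
track = [(0, b'A'), (2, b'C'), (1, b'B'), (0, b'X')]
print(deinterleave(track))
```

A real converter would also have to deal with missing sectors and CRC failures, which this sketch ignores.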

For completeness the last format is the emulator file. It is the data
converted to binary but mfm encoding isn't decoded. Mainly only for my emulator
though at least one computer emulator can use this format.

> but decoder plugin seems unsuitable for it - cant generate header
> without knowing whats coming later on the wire.
>
I think the only header needing info you may not have is the one at the start
that has the total number of heads and cylinders. You can always write an empty
header and seek back in the file at the end and write the header.

Was RevEng able to reverse the 48 bit ECC in an acceptable time or did you
find another way to extract it?

Rasz

5:53 PM (2 hours ago)
to mfm-d...@googlegroups.com
On Thu, Dec 4, 2025 at 2:59 PM David Gesswein <d...@pdp8online.com> wrote:
> I think the only header you need info you may not have is the one at the start
> that has the total number of heads and cylinders. You can always write an empty
> header and seek back in the file at the end and write the header.

Cant really seek back because Sigrok doesnt inform Decoder about end
of data :(. All you can do is emit small chunks in real time while
decoding. This is also why I have such a stupid elaborate Report
generation system - user has to configure two options:

`report` Display report after encountering specified field type.
**Default**: `no` **Values**: `no`, `IAM` (Index Mark), `IDAM` (ID
Address Mark), `DAM` (Data Address Mark), `DDAM` (Deleted Data Address
Mark)

`report_qty` Number of Marks (specified above) between reports. This
is a workaround for lack of sigrok/pulseview capability to signal
end_of_capture.
**Default**: `9` **Example**: `9` for floppies, `17` for MFM hdd, `26`
for RLL drives
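
The idea behind those two options, sketched in a few lines: since no end-of-capture event arrives, count marks and emit a summary every `report_qty` of them. The class name `ReportCounter` and the report text are invented for this example and do not mirror the plugin's internals.

```python
class ReportCounter:
    """Emit a summary every `qty` marks, since no end-of-capture event exists."""

    def __init__(self, qty=9):
        self.qty = qty
        self.seen = 0
        self.ok = 0

    def mark(self, crc_ok):
        """Record one mark; return a report string every qty marks, else None."""
        self.seen += 1
        self.ok += 1 if crc_ok else 0
        if self.seen < self.qty:
            return None
        report = '%d/%d sectors OK' % (self.ok, self.seen)
        self.seen = self.ok = 0  # start counting the next batch
        return report


counter = ReportCounter(qty=3)
for crc_ok in (True, False, True, True):
    line = counter.mark(crc_ok)
    if line:
        print(line)
```

The obvious weakness, as noted above, is that the user has to know the sector count per track in advance.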

Have to pass report=DAM:report_qty=17 for a nice Report every 17 sectors :|
I think I'll just bundle an extended version of my tr_to_vcd.py script to
convert Sigrok decoder Binary output back to .tr/.ex files. Combined
with the $69 slogic16u3 Logic Analyser EEVblog Dave recently showed (and
failed to figure out how PulseView works hehe)
https://www.eevblog.com/forum/blog/eevblog-1720-mailbag-ussr-bonanza-logic-training-$69-logic-analyzer/msg6101005/
https://wiki.sipeed.com/hardware/en/logic_analyzer/slogic16u3/Introduction.html
this might be the cheapest off the shelf tool to get new dumps from
people without a full BeagleBone Emulator.

> Was RevEng able to reverse the 48 bit ECC in an acceptable time or did you find another way to extract it?

Literally just that one-liner gives immediate reply :o RevEng is
amazing. My post
https://sourceforge.net/p/reveng/discussion/general/thread/f4602ab513/
Worked like magic and even helped me learn about some properties of
CRCs I havent really thought about:

- I had a bug in my RLL_Adaptec format. Adaptec doesn't use the normal
0xA1; instead its 0xA1 is straight up Header mark while 0xA0 does what
0xA1 normally does. The bug was that I was interpreting 0xA0 as Data_mark (aka
start decoding data immediately after) instead of IDData_mark (aka
next byte tells us if it's header or data) because I missed that the byte
following 0xA0 was always F8 :) This resulted in interpreting F8 as the
first byte of data, followed by 511 bytes of actual sector data,
followed by the last data byte interpreted as the first byte of CRC etc...
| F8 + 511 data | 1 data + 6 CRC bytes | 00 00 00 00 00 ...

- above made me think Adaptec uses 56bit ECC :) The way I am
guessing CRC size is by looking at trailing repeating bytes so it did
look like 56bit at the time

- Adaptec controller doing the very stupid thing of filling empty
space after checksums with 00 bytes!
Much earlier I already learned obvious in hindsight neat property of
CRC going over its own checksum:
Header 0xa1 0xfe 0x00 0x20 0x01 0xBA 0xE9 0x00
Interpreted as 3 byte payload CRC = 0xBAE9 = OK
Interpreted as 4 byte payload CRC = 0xE900 = also OK :-)
0 as the filler byte is not a great idea. Floppies adhered to a sensible
4E. With hard drives it was anyone's game with 33, FF, but also plenty
of 0.
0 at the end can make you think the CRC is ok when it's not :o
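
The shifted-CRC effect from the last bullet can be reproduced in a few lines. The sketch assumes a plain MSB-first CRC-16 with poly 0x1021 and init 0xFFFF, which is a guess and not necessarily what that header used, so the computed CRC is not expected to be 0xBAE9. The property itself holds for any poly and init as long as there is no reflection and no final XOR.

```python
def crc16(data, poly=0x1021, init=0xFFFF):
    """Bitwise MSB-first CRC-16 with no reflection and no final XOR."""
    reg = init
    for byte in data:
        reg ^= byte << 8
        for _ in range(8):
            reg = ((reg << 1) ^ poly) if reg & 0x8000 else (reg << 1)
            reg &= 0xFFFF
    return reg


header = bytes([0xA1, 0xFE, 0x00, 0x20, 0x01])
crc = crc16(header)
hi, lo = crc >> 8, crc & 0xFF
stored = header + bytes([hi, lo]) + b'\x00\x00\x00'  # 00 filler after the CRC

# Reading one byte too many as payload: the "CRC" becomes lo followed by 00,
# which is exactly what sits on disk when the filler is 00.
shifted = crc16(stored[:6])
assert shifted == (lo << 8)
assert stored[6:8] == bytes([shifted >> 8, shifted & 0xFF])

# Running the CRC over its own checksum, and over any 00 filler, leaves 0.
assert crc16(stored[:7]) == 0
assert crc16(stored) == 0
```

With a non-zero filler such as 4E the second comparison fails, which is why a wrong payload length shows up immediately on floppy formats.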

All three combined into a weird situation where I told reveng to
reverse engineer 7 byte polynomial by feeding it
| F8 + 511 data | 1 data + 6 CRC bytes |
and it STILL guessed 0x181814503011 but with some nonsense init
(accounting for the superfluous F8 at the beginning). I fed that back
to Sigrok and numbers didnt work :) this made me look harder, notice
this poly is only 6 byte long and that in turn allowed me to spot my
bug. Super neat that reveng guessed correct 48bit poly even when told
its going to be 56bit :)