sigrok-mfm mfm decoder


Rasz

Oct 21, 2025, 7:19:38 AM
to MFM Discuss
Hello everyone, I'm Rasz_pl

First TLDR version:
https://github.com/raszpl/sigrok-mfm is a working MFM decoder plugin for the open-source Sigrok/PulseView/DSView logic analyzer software. It lets you visualize the data and helps with recovery/analysis. You don't need a logic analyzer to use it: the attached file converts one track from any .transition file into a .vcd file for loading into PulseView.

Long version:
I have always been interested in magnetic storage. Some time ago I stumbled on majenkotech streaming his attempts at an MFM emulator on Twitch. VODs:
https://www.youtube.com/watch?v=LrI0blg4r7I
https://www.youtube.com/watch?v=RbrRLh-PsS8
https://www.youtube.com/watch?v=rvOJJcxFEO4
https://www.youtube.com/watch?v=U6rS2Gz0gZA
https://www.youtube.com/watch?v=or5SL96_X1A
https://www.youtube.com/watch?v=11bu10Ehw30
https://www.youtube.com/watch?v=P_xAgsFjWCo
https://www.youtube.com/watch?v=_Gxs4f0QJdI

He briefly looked at David C. Wiens' 2017 PulseView MFM decoder https://www.sardis-technologies.com/ufdr/pulseview.htm but couldn't get it running. I managed to google a fix, and that is how everything started.


I started playing with https://bitsavers.org/projects/hd_samples/ . I whipped up a quick Python .tr converter and I'm successfully decoding MFM dumps:
test.py "\hd_samples\ST-278R\MFM_1-1_interleave_17sect\ST21M_MFM_615-6.tr" -t6 -d track0
This dumps track 6 into the file track0.vcd. VCD was the easiest format I could find: the converter just adds up the pulse intervals from the transition file. Next I run that file through sigrok-cli:
sigrok-cli.exe -D -i test\track0.vcd -P mfm:header_crc_bits=32:header_crc_poly=0x41044185:header_crc_init=0:data_crc_poly=0x41044185:data_crc_init=0:report=DAM:report_qty=18 -A mfm=fields:reports
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=0, len=128
mfm-1: CRC OK 99B7F53E
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK C8F97415
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=1, len=128
mfm-1: CRC OK C8B30F3A
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK D3349D6B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=2, len=128
mfm-1: CRC OK 3BBE0136
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=3, len=128
mfm-1: CRC OK 6ABAFB32
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=4, len=128
mfm-1: CRC OK 9CA05CAB
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=5, len=128
mfm-1: CRC OK CDA4A6AF
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=6, len=128
mfm-1: CRC OK 3EA9A8A3
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=7, len=128
mfm-1: CRC OK 6FAD52A7
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=8, len=128
mfm-1: CRC OK 9398A614
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=9, len=128
mfm-1: CRC OK C29C5C10
mfm-1: Sync pattern 14 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=10, len=128
mfm-1: CRC OK 3191521C
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=11, len=128
mfm-1: CRC OK 6095A818
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=12, len=128
mfm-1: CRC OK 968F0F81
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=13, len=128
mfm-1: CRC OK C78BF585
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=14, len=128
mfm-1: CRC OK 3486FB89
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=15, len=128
mfm-1: CRC OK 6582018D
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=16, len=128
mfm-1: CRC OK 8DE9536A
mfm-1: Sync pattern 11 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK F6E6FE4B
mfm-1: Sync pattern 10 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=254, len=128
mfm-1: CRC OK FBEABA85
mfm-1: Sync pattern 14 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK 28E9D789
mfm-1: Summary: IAM=0, IDAM=18, DAM=18, DDAM=0, CRC_OK=36, CRC_err=0, EiPW=0, CkEr=0, OoTI=10/79324
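
For anyone curious what the .tr to .vcd step boils down to, here is a minimal sketch. It is not the attached test.py. It assumes the flux intervals have already been pulled out of the transition file as a list of nanosecond deltas, and it emits one short high pulse per transition. The function name `intervals_to_vcd`, the signal name `rd` and the 50 ns pulse width are made up for illustration.

```python
def intervals_to_vcd(intervals_ns, pulse_ns=50, signal="rd"):
    """Return VCD text with one short high pulse per flux transition.

    intervals_ns: time between consecutive transitions, in nanoseconds
                  (assumed to be already decoded from the .tr file).
    pulse_ns:     width of the emitted pulse (arbitrary illustration value).
    """
    lines = [
        "$timescale 1 ns $end",
        "$scope module track $end",
        f"$var wire 1 ! {signal} $end",   # "!" is the VCD id code of the signal
        "$upscope $end",
        "$enddefinitions $end",
        "#0",
        "0!",
    ]
    t = 0
    for delta in intervals_ns:
        if delta <= pulse_ns:
            raise ValueError("interval shorter than the pulse width")
        t += delta                         # running sum = absolute timestamp
        lines += [f"#{t}", "1!", f"#{t + pulse_ns}", "0!"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Three MFM-like intervals: 200, 300 and 400 ns.
    print(intervals_to_vcd([200, 300, 400]))
```

The only real work is the running sum: a transition file stores deltas, while VCD wants absolute timestamps.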

Now the inevitable questions:
- Why does the Seagate ST11/21M controller claim a sector length of 128 in the sector header while recording 512 bytes?
- Why does the Seagate ST11/21M controller record 18 sectors per track? :o
- Why is the 18th sector so weird? :) It is shown zoomed in on pulseview_idrecord.png. On every track it has the same sector=254 field, and looking at the flux transitions you can tell its sync pattern starts with no gap (or rather after a 1-byte gap2), leaving no room for rewriting, almost as if it was meant to be read-only.
- What's on the first tracks, 0-5? They all look like RLL (not that I would know what RLL looks like :) ), but they look exactly like what I get from
test.py "D:\_code\disk mfm\MfmDecoder\hd_samples\ST-278R\RLL_1-1_interleave_26sect\ST21R_RLL_615-6.tr" -t6 -d track0
Funnily enough, in ST21R_RLL_615-6.tr tracks 0-5 are almost completely empty, filled with 400ns pulses. What's going on, are those magical reserved service-area tracks? Why would the ST21M store RLL-encoded data on them even when explicitly set to MFM mode?
Attaching my super janky test.py converter in case someone wants to play with sigrok using transition files as a source. Pro tip: once you load one .vcd file into PulseView and set up the MFM decoder, you don't have to repeat the process. You can keep re-running test.py with a different -t parameter but the same -d track0 and then click Reload in PulseView.

Too bad the .tr files I have looked at so far are all freshly formatted, with no data that would show what the flux transitions look like after a few rewrites.

PS: I also managed to compile mfm_util for Windows using MSYS by hacking away the incompatible bits:
- commented out pread/pwrite so most functionality apart from analyze is broken
- used __builtin_ffsll in crc_ecc.c:148
 span = fls(syndrome) - __builtin_ffsll(syndrome) + 1;
- added O_BINARY flag in emu_tran_file.c:752
 fd = open(fn, O_RDONLY| O_BINARY);
- removed -lrt flag in makefile
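
As an aside on the crc_ecc.c line above, the span it computes is just the distance between the highest and lowest set bits of the syndrome, inclusive. A small Python sketch of the same arithmetic, assuming fls() and ffs() follow the usual C convention of 1-based bit indices (the helper names below mirror the C ones but are otherwise my own):

```python
def ffs(x: int) -> int:
    """1-based index of the least significant set bit, 0 if x == 0."""
    return (x & -x).bit_length()


def fls(x: int) -> int:
    """1-based index of the most significant set bit, 0 if x == 0."""
    return x.bit_length()


def burst_span(syndrome: int) -> int:
    """Number of bit positions covered by the error burst, ends included."""
    return fls(syndrome) - ffs(syndrome) + 1


if __name__ == "__main__":
    # Bits 35 and 40 set: the burst covers 6 bit positions.
    print(burst_span((1 << 40) | (1 << 35)))
```

Python integers are arbitrary precision, so the 32-bit versus 64-bit question that makes the `ll` variant matter in C does not come up here.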
With those changes:
D:\_code\disk mfm\mfm-master\mfm>mfm_util.exe -a -t "D:\_code\disk mfm\MfmDecoder\hd_samples\ST-278R\MFM_1-1_interleave_17sect\ST21M_MFM_615-6.tr"
Original decode arguments: --heads 6 --cylinders 615 --sector_length 512 --retries 50,4 --drive 1
Found matching format Seagate_ST11MB: good count difference 0
Number of heads 6 number of sectors 17 first sector 0
Interleave (not checked): 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Command line to read disk:
--format Seagate_ST11MB --sectors 17,0 --heads 6 --cylinders 615 --header_crc 0x0,0x41044185,32,6 --data_crc  0x0,0x41044185,32,6 --sector_length 512 --retries 50,4 --drive 1

I needed that to learn the CRC polynomial of the Seagate ST21M controller; I didn't know the ST11M would use the same one.
test.py

David Gesswein

Oct 21, 2025, 9:25:13 AM
to mfm-d...@googlegroups.com
On Tue, Oct 21, 2025 at 03:21:14AM -0700, Rasz wrote:
> Hello everyone Im Rasz_pl
>
Hi

> My ultimate goal is adding RLL support.
>
There is someone working on adding RLL support to my code. They are getting
pretty close to releasing the code.
https://github.com/dgesswein/mfm/issues/8

> Now the inevitable questions:
> - why is Seagate ST11/21M controller claiming sector length 128 in sector
> header while recording 512 bytes?

The various controller manufacturers mostly used their own unique sector header
format. Looks like I have 93 supported formats currently. Some just differ in
sector size or CRC polynomial; others have more significant differences.

Western Digital uses the 2-bit field to mark sector size. Seagate doesn't seem to.

You are going to have to decide what you want to do about all the different formats
for your code.

> - why is Seagate ST11/21M controller recording 18 sectors per track? :o

Good question. Many controllers have no documentation on the track format. I haven't
seen documentation on Seagate controllers. I was assuming it was to have a spare sector
for error handling.

> - why is 18th sector so weird? :) Its zoomed in on pulseview_idrecord.png
> <https://raw.githubusercontent.com/raszpl/sigrok-mfm/main/doc/pulseview_idrecord.png>
> On every track it has same sector=254 field and looking at flux transitions
> you can tell sync pattern starts with no gaps (or rather after 1 byte gap2)
> with no room for rewriting almost like it was meant to be read only.
>
I have an example of a similar format where on a few tracks the 254 sector is not the last.
It looks like the controller reformats the track and puts the 254 sector over the bad spot.

> - whats on the first tracks 0-5? They all look like RLL, not that I would
> know how RLL looks like
>
Some controllers take some portion of the start of the disk for storing information such
as the disk geometry. Likely it's that, and since it's an RLL controller they recorded it in
RLL format.

Rasz

Oct 22, 2025, 11:31:46 PM
to mfm-d...@googlegroups.com
On Tue, Oct 21, 2025 at 3:25 PM David Gesswein <d...@pdp8online.com> wrote:
> There is someone working on adding RLL support to my code. They are getting
> pretty close to releasing the code.
> https://github.com/dgesswein/mfm/issues/8

Yes, I saw that, but no code has been posted yet; I'll try to code something myself too.

> The various controller manufacturers mostly used there own unique sector header
> format. Looks like I have 93 supported formats currently. Some are just different
> sector size or CRC polynomials other have more significant differences.

I was hoping they more or less stuck to the same format for headers :/ oh well

> You are going to have to decide what you want to do about all the different formats
> for your code.

Supporting different kinds and sizes of polynomials was easy enough to
add. Whole new formats might not be worth the hassle: I would have to
somehow make the state machine
https://github.com/raszpl/sigrok-mfm/blob/6e26f6669a07d8a5c9dfb72d58a4de7d52b172a8/mfm/pd.py#L1106
dynamic/reconfigurable on the fly. I haven't looked at how your code
handles this yet.

> Good question. Many controllers have no documentation on the track format. Haven't
> see documentation on Seagate controllers. I was assuming it was for having spare sector
> for error handling.

> I have an example of similar format where a few tracks the 254 sector not the last.
> Looks like it reformats the track and puts the 254 sector over the bad spot.

Looking at a few other MFM disk track dumps, it seems there was
always enough space for 18 sectors (assuming ~922us per sector with all
the gaps), and some vendors even left a sector-sized hole free instead of
spacing out their 17 sectors. What a huge waste: 2.5MB just left unused
on 42MB models. Judging by the PC BIOS tables at
https://aeb.win.tue.nl/linux/hdtypes/hdtypes-3.html 18 sectors was
never a thing on that platform.
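
A quick back-of-the-envelope check of those numbers, as a sketch. The ~922us sector time and the 42MB / 17-sector figures are from the paragraph above; the 3600 RPM spindle speed and the 5 Mbit/s MFM data rate are my assumptions (typical for ST-506 class drives), and the function name is made up.

```python
RPM = 3600                # assumption: typical ST-506 class spindle speed
DATA_RATE_BPS = 5_000_000 # assumption: standard MFM data rate
SECTOR_TIME_US = 922      # from the post: one sector including all gaps


def sectors_per_revolution(rpm: float, sector_time_us: float) -> int:
    """How many whole sector slots fit into one disk revolution."""
    rev_us = 60.0 / rpm * 1e6
    return int(rev_us // sector_time_us)


# Raw byte budget of one sector slot at the assumed data rate.
slot_bytes = SECTOR_TIME_US * 1e-6 * DATA_RATE_BPS / 8

# One spare sector per track on a drive formatted with 17 sectors/track.
unused_mb = 42 / 17

if __name__ == "__main__":
    print("sector slots per revolution:", sectors_per_revolution(RPM, SECTOR_TIME_US))
    print("bytes per sector slot:", slot_bytes)
    print("capacity of an 18th sector on a 42MB drive: %.2f MB" % unused_mb)
```

Under those assumptions one revolution takes about 16667us, which is where the room for an 18th slot comes from.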

> Some controllers take some portion of the start of the disk for storing information such
> as the disk geometry. Likely its that and since its a RLL controller they recorded it in
> RLL format.

6 tracks seems like a huge waste again. It's wasted potential wherever
one looks. Not to mention the lack of zoned recording all the way into
the early nineties, even after the switch to IDE :(

Do you happen to have any ESDI captures? There are tons of IBM PS/2s
with unobtainable, dying ESDI drives, and the stupid MCA bus precludes
easy alternative storage options. A good opportunity for emulation. ESDI,
while faster at up to 20Mbit (no idea what speed the PS/2s use),
provides a reference clock, so it should be much easier to emulate.

The main use case for the sigrok-mfm PulseView decoder is to help developers
or to assist in advanced data recovery. Personally I find staring at a graphical
representation of something easier than staring at long strings of hex :)

--
Who logs in to gdm? Not I, said the duck.

Al Kossow

Oct 22, 2025, 11:42:32 PM
to mfm-d...@googlegroups.com
On 10/22/25 8:33 PM, Rasz wrote:

> Do you happen to have any ESDI captures?

The data separator is on the drives themselves.
AFAIK the on-disk format was always RLL.

BBMDonut

Oct 23, 2025, 1:35:22 AM
to MFM Discuss
On Wednesday, October 22, 2025 at 10:31:46 PM UTC-5 citi...@gmail.com wrote:
> On Tue, Oct 21, 2025 at 3:25 PM David Gesswein <d...@pdp8online.com> wrote:
> > There is someone working on adding RLL support to my code. They are getting
> > pretty close to releasing the code.
> > https://github.com/dgesswein/mfm/issues/8
>
> Yes I saw that, but no code posted yet, ill try to code something myself too.


Hi there! My apologies for the lack of a push, as life has been a tad chaotic as of late. I'm
going through the upgrades and doing some basic sanity checks/cleanup before I push
and submit the PR. Mr. Gesswein and I have been refining some of the changes, so I'm also
working to get his latest updates integrated into my work as well. Once things have settled,
not only will the initial RLL support be in place (currently working for WD1004-27X/
WD1006V-SR2/basically anything compatible with the WD5011 controller IC from what I gather),
but the separate modifications for utilizing external stepper controllers have been pulled in as
well. As of now, I have only tested the 4-wire Sparkfun controller, along with the ST-238R head
motor (as that is the combination I have been working to use for recovery).

Using this updated roll-up build, I have managed to dump the data from Usagi Electric's EDS
PC drive (the ST-238R mentioned above). I believe Mr. Gesswein has been able to use the 
stepper update to pull from MFM drives, however I'm not 100% certain what combinations of
other tests he's tried as of yet. I know that we did identify some issues with the pinout
differences between revisions of the MFM emulator and hooking up the Sparkfun PCB, so
that will be documented as well (on the bright side, the new pinout should be universal across
emulator revisions).

I should have more time to work on it over the next few nights with any luck, but regardless, I'll
go ahead and push the WIP to my GitHub fork by the weekend just to be safe (with the standard
"there be dragons here" warning for anyone wanting to experiment before the official release). 
From that point, the other two major efforts that would need to be worked out are adding support
to the emulation engine itself for RLL formats, along with adding new controller types. One of the
annoying issues - aside from the difference in bit timings - is the fact that different vendors (and
possibly different controller ICs from the same vendor) can all encode RLL with slight differences
(whereas MFM's flux-to-binary translation was pretty standard across the boards). It's just a few
tweaks to some lookup tables, but still annoying nonetheless...

David Gesswein

Oct 23, 2025, 9:33:31 AM
to mfm-d...@googlegroups.com
On Thu, Oct 23, 2025 at 05:33:08AM +0200, Rasz wrote:
> I havent looked at your code handling this yet.
>
The decoding code is just brute-force code that, for each format, has a few lines to
pull out the correct information and, for wildly different formats, separate routines
for finding the sector start etc.
At the start I didn't know enough about how the formats varied to do something cleaner, and
once I knew enough I didn't have the energy to change it.

Ext2emu, which does the reverse conversion, was written later, so it does use a data
table to define the transformation. Not all formats are in it.

> Do you happen to have any ESDI captures? There is tons of IBM PS/2
> with unobtainable dying ESDI drives and stupid MCA precluding easy
> alternative storage options. Good opportunity for emulation. ESDI
> while faster at up to 20Mbit (no idea what speed the PS/2s are using)
> provides reference clock so should be much easier to emulate.
>
I don't. I do know someone working on an ESDI emulator who has gotten it far enough to
capture some data. I'll let him know this thread exists.

Al Kossow

Oct 23, 2025, 9:40:45 AM
to mfm-d...@googlegroups.com
On 10/23/25 6:33 AM, David Gesswein wrote:

> I don't. I do know someone working on an ESDI emulator who has gotten it far enough to
> capture some data. I'll let him know this thread exists.
>


"cw" on the cctalk discord mostly working as of Aug.
"ESDI emulator update: writing apparently works, everything is still in ddr memory. I think I still have writing issues though, so I might
need to find another testing program"

Reece Pollack

Oct 23, 2025, 11:53:59 AM
to MFM Discuss
Hi folks! I'm the "someone" David mentions above. 

As someone else in this thread mentioned, with ESDI the data separator is on the drive, not the controller. The formatting of data on the media itself is left entirely to the data encoder/separator on the drive; the controller sends and receives only the data bit stream synchronized with a separate clock. Thus there is no way to capture the raw flux transitions as you can with ST-506 (MFM) drives, so no way to see whether the media is encoded RLL-2/7 or something else entirely.

This makes it both easier and harder to read a vintage ESDI drive. It's easier because there's no need to analyze the flux transitions to extract the clock and data; you get data and clock as differential signals. It's harder because you have to monitor the Index and Sector/AMD signals to identify where data starts and ends, and there is no indication of where the write splices are.

Reece Pollack

Oct 23, 2025, 1:24:35 PM
to MFM Discuss
On Wednesday, October 22, 2025 at 11:31:46 PM UTC-4 citi...@gmail.com wrote:
> Do you happen to have any ESDI captures? There is tons of IBM PS/2
> with unobtainable dying ESDI drives and stupid MCA precluding easy
> alternative storage options. Good opportunity for emulation. ESDI
> while faster at up to 20Mbit (no idea what speed the PS/2s are using)
> provides reference clock so should be much easier to emulate.

As David mentioned, I'm developing an ESDI emulator so I have a bit of familiarity with the technology.

ESDI is specified to run up to 24 MHz. Since the data clock is provided by the drive, it must be used with a controller that will support whatever speed the drive provides.  10 MHz drives are most common, and the PS/2 model 80 I've captured data from has a 10 MHz drive in it.

I'm told there's a SCSI adapter for MCA bus. This may be an alternative for those who don't want to deal with ESDI. There is also a disk that goes in a PS/2 model 50, 55, or 70 that some people say is an ESDI drive. I've looked at this drive and it is a direct MCA bus connect rather than ESDI.

-Reece

Alison Telford

Oct 23, 2025, 6:32:13 PM
to mfm-d...@googlegroups.com
Hi all -- if it's useful, I had posted a snippet about using David's MFM emulator on PS/2 machines -- did get it to work. There are also scsi adapters available (these I find work well and are straightforward) and the really cool McIDE product from ZZXIO. 

Here's the blurb for what it's worth: (does not address the ESDI issue)

 

Rasz

Oct 27, 2025, 7:27:49 AM
to mfm-d...@googlegroups.com
On Thu, Oct 23, 2025 at 7:35 AM BBMDonut <bbmd...@gmail.com> wrote:
> On Wednesday, October 22, 2025 at 10:31:46 PM UTC-5 citi...@gmail.com wrote:
> Yes I saw that, but no code posted yet, ill try to code something myself too.
>
>
> Hi there! My apologies for the lack of a push

Absolutely no need for any apologies.

> going through the upgrades and doing some basic sanity checks/cleanup before I push

I was hoping you were uploading your progress to your own github, all
I needed was a beachhead, a place to start :) instead had to push that
rock uphill on my own :)

>One of the
> annoying issues - aside from the difference in bit timings - is the fact that different vendors (and
> possibly different controller ICs from the same vendor) can all encode RLL with slight differences
> (whereas MFM's flux-to-binary translation was pretty standard across the boards). It's just a few
> tweaks to some lookup tables, but still annoying nonetheless...

From what I have gathered so far, the differences are:
1. Different RLL tables.
WD uses its own; Seagate uses the same IBM one found in the SSI 32D535 ENDEC datasheet.
2. Different magic markers.
All I could find so far was a patent mentioning 0x8B
(Clipboard03.png), which was a fail, and another patent mentioning the
Adaptec AIC-270 ENDEC and 0x5EAx (Clipboard02.png), another fail.
I ended up brute-forcing everything around the illegal 0b00100000001001 and
finally got somewhere, hopefully (rll_maybe.png attachment). This is
WD1003V-SR1_RLL_820-6.tr. I'm still not sure it's right, as the header
looks too short: only 5 bytes of data, 3 + CRC?

PS: I like how in both SR1_RLL_820-6.tr and WD1003V-SR1_RLL_820-6.tr
you can clearly see with the naked eye that the controllers were really bad at
write pre-compensation. The shortest 200ns transitions shrink to ~185ns
when not surrounded by other short gaps :)
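
To put a number on that shrinkage, here is a small sketch that snaps a measured interval to the nearest whole number of code-bit cells and reports the leftover error. It assumes a rate-1/2 code, so the cell rate is twice the data rate: 15 MHz for 7.5 Mbit/s RLL(2,7), 10 MHz for 5 Mbit/s MFM. The function name `to_cells` is made up for illustration.

```python
def to_cells(interval_ns: float, cell_rate_hz: float):
    """Snap a flux interval to whole code-bit cells.

    Returns (cell_count, error_ns), where error_ns is how far the measured
    interval is from the ideal length (negative means too short).
    """
    cell_ns = 1e9 / cell_rate_hz
    n = round(interval_ns / cell_ns)
    return n, interval_ns - n * cell_ns


if __name__ == "__main__":
    RLL_CELL_RATE = 15e6   # assumption: 7.5 Mbit/s data, two code bits per data bit
    for measured in (200, 185, 400):
        n, err = to_cells(measured, RLL_CELL_RATE)
        print(f"{measured} ns -> {n} cells, error {err:+.1f} ns")
```

With a cell of about 66.7 ns, a 185 ns interval still rounds to 3 cells, but it has used up roughly 15 ns of the ~33 ns margin before it would be misread as 2 cells (which RLL(2,7) does not allow anyway).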
rll_maybe.png
Clipboard03.png
Clipboard02.png

Rasz

Oct 27, 2025, 7:40:46 AM
to mfm-d...@googlegroups.com
On Thu, Oct 23, 2025 at 7:24 PM Reece Pollack <reece....@gmail.com> wrote:
> ESDI is specified to run up to 24 MHz. Since the data clock is provided by the drive, it must be used with a controller that will support whatever speed the drive provides. 10 MHz drives are most common, and the PS/2 model 80 I've captured data from has a 10 MHz drive in it.

Could I bother you for a fragment of one of the dumps? One or two tracks.
I'm very interested in seeing the timings and what the data looks like
in flight. I might try adding it to the PulseView decoder just for fun.

Rasz

Oct 27, 2025, 7:42:59 AM
to mfm-d...@googlegroups.com
Yes, I'm interested in seeing the format of the NRZ data between drive and
controller.

BBMDonut

Oct 27, 2025, 9:26:23 AM
to MFM Discuss
On Monday, October 27, 2025 at 6:27:49 AM UTC-5 citi...@gmail.com wrote:
> On Thu, Oct 23, 2025 at 7:35 AM BBMDonut <bbmd...@gmail.com> wrote:
> > On Wednesday, October 22, 2025 at 10:31:46 PM UTC-5 citi...@gmail.com wrote:
> > Yes I saw that, but no code posted yet, ill try to code something myself too.
> >
> >
> > Hi there! My apologies for the lack of a push
>
> Absolutely no need for any apologies.
>
> > going through the upgrades and doing some basic sanity checks/cleanup before I push
>
> I was hoping you were uploading your progress to your own github, all
> I needed was a beachhead, a place to start :) instead had to push that
> rock uphill on my own :)

In all honesty, I'm horrible about remote pushes to my projects much of the time. I blame
having multiple local backups plus a tinge of perfectionism (I get weird about others
seeing the "in progress" ugliness that sometimes comes with prototyping/R&D). I'll get over
it eventually.

That being said, it's a tough time at the moment to get anything set up just yet. Some life 
things are going on outside of the project that have taken much of my free time as of late.
Hopefully things resolve within the next week or so, and I'll be back to getting this thing ready
to go.

I'll rip off the band-aid, however, and try to get the "ugly" version pushed to a branch tonight.
I didn't realize anyone would actually be looking at it at this point. On a side note, there is a
Python script that was added to the contrib/ folder from another developer that can be used
to do a "quick and dirty" parse on a raw transition dump for the WD RLL table at least - I hope
that it's able to help out in the mean time. Kudos to PHK for contributing that, as it helped
me to get over some of my final issues.

> > One of the
> > annoying issues - aside from the difference in bit timings - is the fact that different vendors (and
> > possibly different controller ICs from the same vendor) can all encode RLL with slight differences
> > (whereas MFM's flux-to-binary translation was pretty standard across the boards). It's just a few
> > tweaks to some lookup tables, but still annoying nonetheless...
>
> from what I gathered by now its
> 1 using different rll tables.
> WD is using WD, Seagate same IBM one found in SSI 32D535 ENDEC datasheet.
> 2 different magic Markers
> All I could find so far was some patent mentioning 0x8B
> (Clipboard03.png) - that was fail, and another patent mentioning
> Adaptec AIC-270 ENDEC and 0x5EAx (Clipboard02.png) - another fail.
> Ended up brute forcing everything around illegal 0b00100000001001 and
> finally got somewhere hopefully (rll_maybe.png attachment). This is
> WD1003V-SR1_RLL_820-6.tr Still not sure if its right as the header
> looks too short, only 5 bytes of any data, 3 + CRC?


Looking at the screenshot of rll_maybe.png, I see a little bit of the issue. For some
reason, the parsing looks alright until it gets to the CRC section. The gaps between
the 05 and 08, and the 08 and A6 look a little off. When running the rest of the
header through the right CRC function, the checksum bytes should come out to
4F DE. Using the entire header byte list of "A1 FE 01 48 05 4F DE", the CRC check 
zeroes out as expected. Though something else doesn't seem to be sitting right
in the header, now that I'm looking at it. It's possible that the mapping is different, but
for the WD controller breakout I have, I'm seeing the "48" showing that the reading
would be from head 9, with a sector size of 1024 bytes... The rest would be from
cylinder 1, sector 5/6 (depending on zero-based or one-based, of course). I could have
a bad lookup table, which is possible - I'll double-check the datasheet to make sure
I'm reading it right.
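
The CRC check described above can be reproduced with a few lines of Python. This is a minimal sketch assuming CRC-16/IBM-3740 parameters (polynomial 0x1021, initial value 0xFFFF, MSB first, no final XOR); the function name `crc_msb_first` is made up for illustration, and the routine is deliberately bit-at-a-time rather than table-driven.

```python
def crc_msb_first(data: bytes, poly: int, width: int, init: int) -> int:
    """Bitwise, non-reflected CRC with no final XOR.

    poly is given without its implicit top bit, e.g. 0x1021 for x^16+x^12+x^5+1.
    """
    mask = (1 << width) - 1
    top = 1 << (width - 1)
    crc = init & mask
    for byte in data:
        crc ^= byte << (width - 8)         # feed the next byte in at the top
        for _ in range(8):
            if crc & top:
                crc = ((crc << 1) ^ poly) & mask
            else:
                crc = (crc << 1) & mask
    return crc


if __name__ == "__main__":
    header = bytes.fromhex("A1FE014805")
    print(hex(crc_msb_first(header, 0x1021, 16, 0xFFFF)))
    # Running the CRC over header plus its checksum leaves a zero remainder.
    print(crc_msb_first(bytes.fromhex("A1FE0148054FDE"), 0x1021, 16, 0xFFFF))
```

Because width, polynomial and initial value are parameters, the same routine can be pointed at wider polynomials, assuming they follow the same convention of omitting the top bit.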

This site was handy in helping to get the header decoding working for me:
https://crccalc.com/?crc=A1FE0148054FDE&method=CRC-16/IBM-3740&datatype=hex&outtype=hex

As far as the ECC used for confirming/correcting the data section (A1F8), the contributed
Python script mentioned above has the algorithm in place.

Rasz

Oct 27, 2025, 10:21:37 AM
to mfm-d...@googlegroups.com
On Mon, Oct 27, 2025 at 2:26 PM BBMDonut <bbmd...@gmail.com> wrote:
> In all honesty, I'm horrible about remote pushes to my projects much of the time. I blame
> having multiple local backups plus a tinge of perfectionism

I know what you mean :)

> I'll rip off the band-aid, however, and try to get the "ugly" version pushed to a branch tonight.

No need to hurry, I just got it on my own :o :D wooo /happy dance.
It took a lot of garbage logging but I got there in the end. Example of
the horrors and crimes I committed in the process:

pll_shift b1001001001001001001001001000000100000001 8 5957
self.lock_count 70 6064
pll_shift b1001001001001001001001000000100000001001 3 6064
self.lock_count 71 6101
pll_shift b100100100100100100000010000000100100001 5 6101
byte_synced 4 6170
pll_shift b1001001001001000000100000001001000010001 4 6170
pll_shift b10010010000001000000010010000100010001 4 6223
pll_shift b1001000000100000001001000010001000100001 5 6277
pll_shift b1000000100000001001000010001000100001001 3 6344
RLL_2 b10001000100001001 10001000100001001
RLL_TABLE[pattern] 11 11 4 1000
RLL_TABLE[pattern] 1111 11 8 1000
RLL_TABLE[pattern] 111111 11 12 1000
RLL_TABLE[pattern] 11111110 10 16 0100
RLL_shift b1000000100000001001000010001000100001001 11111110 1 -16 6344
annotate_bits_new 0xfe special.clock
byte_ 1 0
byte_ 0
byte_ 0 0
byte_ 0
byte_ 1 0
byte_ 0
byte_ 0 0
byte_ 0
byte_ 1 0
byte_ 0
byte_ 0 0
byte_ 0
byte_ 0 0
byte_ 1
byte_ 0 0
byte_ 0
pll_shift b100000001001000010001000100001001001 3 6383
pll_shift b100000001001000010001000100001001001001 3 6423
pll_shift b1001000010001000100001001001001001 3 6463
pll_shift b1001000010001000100001001001001001000001 6 6502
RLL_2 b1001001001000001 1001001001000001
RLL_TABLE[pattern] 000 000 6 100100
RLL_TABLE[pattern] 000000 000 12 100100
pll_shift b10001000100001001001001001000001000001 6 6584
pll_shift b1000100001001001001001000001000001000001 6 6663
RLL_2 b1000001000001 0001000001000001
RLL_TABLE[pattern] 000000010 010 6 000100
RLL_shift b1000100001001001001001000001000001000001 00000001 10 -18 6663
annotate_bits_new 0x1 False
byte_ 0 -2
byte_ 0
byte_ 1 -2
byte_ 0
byte_ 0 -2
byte_ 1
byte_ 0 -2
byte_ 0
byte_ 0 -2
byte_ 0
byte_ 0 -2
byte_ 1
byte_ 0 -2
byte_ 0
byte_ 0 -2
byte_ 0
pll_shift b100001001001001001000001000001000001001 3 6744
pll_shift b1001001001001000001000001000001001001 3 6783
RLL_2 b1000001001001 0001000001001001
RLL_TABLE[pattern] 0010 010 6 000100
RLL_TABLE[pattern] 0010010 010 12 000100

not having a real debugger when working with sigrok decoders sucks super bad :(
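
Stripped of the PLL and sync handling, the table lookups traced in the log above come down to a greedy prefix match on the code bits. A minimal sketch: the first four table entries are read straight off the `RLL_TABLE[pattern]` lines in the log, while the last three are assumed from commonly published (2,7) tables and may not match this controller. The names `RLL_DECODE` and `rll_decode` are made up and this is not the code in pd.py.

```python
# Code-bit group -> data bits.
RLL_DECODE = {
    "1000": "11",         # seen in the log
    "0100": "10",         # seen in the log
    "100100": "000",      # seen in the log
    "000100": "010",      # seen in the log
    "001000": "011",      # assumption: from published (2,7) tables
    "00100100": "0010",   # assumption
    "00001000": "0011",   # assumption
}


def rll_decode(code_bits: str):
    """Decode a string of '0'/'1' code bits.

    Returns (data_bits, leftover_code_bits). Decoding stops at the first
    position where no table entry matches, e.g. an incomplete trailing group.
    """
    out = []
    pos = 0
    while pos < len(code_bits):
        for length in (4, 6, 8):           # the table is prefix-free
            group = code_bits[pos:pos + length]
            if group in RLL_DECODE:
                out.append(RLL_DECODE[group])
                pos += length
                break
        else:
            break                          # nothing matched: stop here
    return "".join(out), code_bits[pos:]


if __name__ == "__main__":
    data, rest = rll_decode("10001000100001001")   # first RLL_2 pattern in the log
    print(data, hex(int(data, 2)), repr(rest))
```

Fed the first `RLL_2` pattern from the log, this yields the same 11, 11, 11, 10 groups that the log assembles into 0xfe, with the trailing 1 left over for the next group.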

> Looking at the screenshot of rll_maybe.png, I see a little bit of the issue. For some
> reason, the parsing looks alright until it gets to the CRC section. The gaps between
> the 05 and 08, and the 08 and A6 look a little off. When running the rest of the
> header through the right CRC function, the checksum bytes should come out to
> 4F DE. Using the entire header byte list of "A1 FE 01 48 05 4F DE"

The previous screenshot was still misaligned :) but now I think I've got it!
WD1003V-SR1_RLL_820-6.tr, Track 10: Cylinder 1, Head 4, first sector
on the track:
0xA1 0xFE 0x1 0x24 0x1 = CRC 0x411D
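
For reference, those five bytes can be unpacked as follows. This is a sketch under a loudly stated assumption: the field layout (cylinder high bits folded into the FE/FF/FC/FD mark byte, a 2-bit size code in bits 6-5 of the head byte, bad-block flag in bit 7) is the WD1010-family convention as I understand it, not something confirmed in this thread, and the names `parse_wd_header`, `SECTOR_SIZES` and `CYL_HIGH` are made up.

```python
# Assumed WD1010-style encodings; verify against the datasheet before relying on them.
SECTOR_SIZES = {0b00: 256, 0b01: 512, 0b10: 1024, 0b11: 128}
CYL_HIGH = {0xFE: 0, 0xFF: 1, 0xFC: 2, 0xFD: 3}


def parse_wd_header(hdr: bytes) -> dict:
    """Split a 5-byte ID record (A1, mark, cyl low, head/size, sector) into fields."""
    if len(hdr) != 5 or hdr[0] != 0xA1 or hdr[1] not in CYL_HIGH:
        raise ValueError("not a recognised ID record")
    return {
        "cyl": (CYL_HIGH[hdr[1]] << 8) | hdr[2],
        "head": hdr[3] & 0x0F,                      # assumption: 4-bit head field
        "size": SECTOR_SIZES[(hdr[3] >> 5) & 0x3],
        "bad_block": bool(hdr[3] & 0x80),
        "sector": hdr[4],
    }


if __name__ == "__main__":
    print(parse_wd_header(bytes([0xA1, 0xFE, 0x01, 0x24, 0x01])))
```

Under that layout the bytes above come out as cylinder 1, head 4, sector 1, which agrees with the track position quoted from the file.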

Now I'll try to decode the DAM and verify its CRC.

> This site was handy in helping to get the header decoding working for me:
> https://crccalc.com/?crc=A1FE0148054FDE&method=CRC-16/IBM-3740&datatype=hex&outtype=hex

I use this one: https://www.sunshine2k.de/coding/javascript/crc/crc_js.html
It allows custom polynomials and custom initial values.

Rasz

Oct 28, 2025, 3:58:28 AM
to mfm-d...@googlegroups.com
On Mon, Oct 27, 2025 at 3:23 PM Rasz <citi...@gmail.com> wrote:
> got it on my own :o :D wooo /happy dance
>
> now ill try to decode DAM and verify crc

D:\_code\disk mfm\sidecat\test>test.py "D:\_code\disk mfm\MfmDecoder\hd_samples\ST-278R\RLL_1-1_interleave_26sect\WD1003V-SR1_RLL_820-6.tr" -t0 -d track0
File Type: 1 (Transition), Major Version: 2, Minor Version: 2
Offset to first track: 121 bytes
Track header size: 12 bytes
Number of cylinders: 820
Number of heads: 6
Number of tracks (cylinders*heads): 4920
Transition count rate: 200,000,000 Hz
Command line: --heads 6 --cylinders 820 --sector_length 512 --retries 50,4 --drive 1
Note:
Start time from index: 0 ns
Header CRC: read bbfa068c, computed bbfa068c
Track 0: Cylinder 0, Head 0
Data bytes: 77356
Track CRC: read ee8a3234, computed ee8a3234
Writing D:\_code\disk mfm\sidecat\test\track0.vcd

D:\_code\disk mfm\sidecat\test>sigrok-cli.exe -D -i "D:\_code\disk mfm\sidecat\test\track0.vcd" -P mfm:data_rate=7500000:encoding=RLL:header_bytes=3:data_crc_bits=56:data_crc_poly=0x140a0445000101:report=DAM:report_qty=26 -A mfm=fields:reports
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=1, len=512
mfm-1: CRC OK BAE9
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK 226506C50A78BD
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=2, len=512
mfm-1: CRC OK 8A8A
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK 36B8CBF4C5926E
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=3, len=512
mfm-1: CRC OK 9AAB
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=4, len=512
mfm-1: CRC OK EA4C
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=5, len=512
mfm-1: CRC OK FA6D
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=6, len=512
mfm-1: CRC OK CA0E
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=7, len=512
mfm-1: CRC OK DA2F
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=8, len=512
mfm-1: CRC OK 2BC0
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=9, len=512
mfm-1: CRC OK 3BE1
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=10, len=512
mfm-1: CRC OK B82
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=11, len=512
mfm-1: CRC OK 1BA3
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=12, len=512
mfm-1: CRC OK 6B44
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=13, len=512
mfm-1: CRC OK 7B65
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=14, len=512
mfm-1: CRC OK 4B06
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=15, len=512
mfm-1: CRC OK 5B27
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=16, len=512
mfm-1: CRC OK B8F9
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=17, len=512
mfm-1: CRC OK A8D8
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=18, len=512
mfm-1: CRC OK 98BB
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=19, len=512
mfm-1: CRC OK 889A
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=20, len=512
mfm-1: CRC OK F87D
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=21, len=512
mfm-1: CRC OK E85C
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=22, len=512
mfm-1: CRC OK D83F
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=23, len=512
mfm-1: CRC OK C81E
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=24, len=512
mfm-1: CRC OK 39F1
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=25, len=512
mfm-1: CRC OK 29D0
mfm-1: Sync pattern 8 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Sync pattern 8 bytes
mfm-1: ID Address Mark
mfm-1: ID Record: cyl=0, sid=0, sec=26, len=512
mfm-1: CRC OK 19B3
mfm-1: Sync pattern 9 bytes
mfm-1: Data Address Mark
mfm-1: Data Record
mfm-1: CRC OK DA409DE590BC21
mfm-1: Summary: IAM=0, IDAM=26, DAM=26, DDAM=0, CRC_OK=52, CRC_err=0, EiPW=0, CkEr=0, OoTI=55/75902
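
For readers who want to reproduce the .tr to .vcd step in the log above, a rough sketch of turning a list of transition intervals into a VCD file. This is not the actual test.py. It assumes the intervals are already extracted as integer counts of the 200 MHz transition clock reported above, and it models each transition as a short pulse whose width (pulse_ns) is an arbitrary choice. The function and signal names are made up for illustration.

```python
def deltas_to_vcd(deltas, rate_hz=200_000_000, pulse_ns=20, signal="rd_data"):
    """Build VCD text with one short pulse per flux transition.

    deltas: intervals between transitions, in counts of rate_hz (assumed input).
    Assumes rate_hz divides 1e9 evenly and every interval is longer than pulse_ns.
    """
    ns_per_count = 1_000_000_000 // rate_hz   # 5 ns per count at 200 MHz
    lines = [
        "$timescale 1 ns $end",
        "$scope module track $end",
        f"$var wire 1 ! {signal} $end",
        "$upscope $end",
        "$enddefinitions $end",
        "#0",
        "0!",
    ]
    t = 0
    for d in deltas:
        t += d * ns_per_count                  # accumulate intervals into time
        lines += [f"#{t}", "1!", f"#{t + pulse_ns}", "0!"]
    return "\n".join(lines) + "\n"


# Three transitions, 40 counts (200 ns) apart.
print(deltas_to_vcd([40, 40, 40]))
```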

WD RLL controller decoding is working :) The only dilemma I have right
now is what to do with Sync Marks. I expected it to work like in
FM/MFM dumps, where A1 is always in the same spot no matter what, but here
the magic 100000001001 moves around?! On the same track it can be one of
these three possibilities:
... 3 3 3 7 8 3 5
100100100100000010000000100100001
... 3 3 3 5 8 3 5
1001001001000010000000100100001
... 3 3 3 3 8 3 5
10010010010010000000100100001

I resigned myself to syncing on 8 3 5 and rewinding one bit before decoding
the actual data.
I don't know what to display in place of the Sync Mark for the user. I
found only one patent, where they suggest the illegal 100000001001 is
created by omitting an impulse from 100010001001, which is supposed to be
5EAx. That doesn't work here :( No matter where I inject an
additional impulse, whatever decodes in that spot leaves one dangling 0
bit :|
Oh well, moving on to the Seagate RLL dump, maybe working on that one will
give me some ideas.
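
The "sync on 8 3 5" idea can be sketched as follows: convert the cell-level bit string into distances between consecutive 1 bits, then search for the 8 3 5 run regardless of what precedes it. The function names are illustrative and this is not the decoder's actual code. Where exactly decoding resumes after the mark (the one-bit rewind) is left to the caller.

```python
def run_lengths(bits: str) -> list[int]:
    """Distances, in cells, between consecutive '1' bits."""
    ones = [i for i, b in enumerate(bits) if b == "1"]
    return [b - a for a, b in zip(ones, ones[1:])]


def find_sync(bits: str, mark=(8, 3, 5)):
    """Return (start, end) indexes of the '1' bits that open and close the
    first run of intervals equal to `mark`, or None if it is not present."""
    ones = [i for i, b in enumerate(bits) if b == "1"]
    gaps = [b - a for a, b in zip(ones, ones[1:])]
    n = len(mark)
    for i in range(len(gaps) - n + 1):
        if tuple(gaps[i:i + n]) == tuple(mark):
            return ones[i], ones[i + n]
    return None


# The three variants seen on the same track.
samples = [
    "100100100100000010000000100100001",
    "1001001001000010000000100100001",
    "10010010010010000000100100001",
]
for s in samples:
    print(run_lengths(s), find_sync(s))
```

In all three samples the 8 3 5 run is found even though the interval just before it is 7, 5 or 3 cells, which is why matching on the run itself is more robust than expecting the mark at a fixed offset from the preamble.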
rll_working.png

BBMDonut

Oct 28, 2025, 8:58:11 AM
to MFM Discuss
This is pretty much the same way I handled it on my end, so it's good to have
confirmation that other people see it this way as well.

As for the variable length on the sync marks, that's been causing me some
craziness too - I've been looking at getting the RLL support added to the
hardware emulator itself, but need to dig a little more into the track layout
descriptors to see where/how those fields can be defined. The data sheet
for the 50C12 controller IC (the closest thing I can find to my 5011 on the 
WD1004-27X) seems to validate the variable-length fields from what I can
tell (see attached image).


> I dont know what to display in place of Sync Mark for the user. I
> found only one patent where they suggest the illegal 100000001001 is
> created by omitting impulse from 100010001001 which is supposed to be
> 5EAx. That doesnt work here :( no matter where I would inject
> additional impulse whatever decodes in that spot leaves one dangling 0
> bit :|
> Oh well, moving on to Seagate RLL dump, maybe working on that one will
> give me some ideas.

Awesome work! Another idea I had, but have zero time to work on, is something
similar to the HxC Floppy Emulator Toolkit but for MFM/RLL HDD images
(or expanding the existing tool to support them, if possible). I've been wanting
to see graphical representations of the disk data ever since I started working with
the MFM Emulator.

wd50c12_soft_sector_format.png

David Gesswein

Oct 28, 2025, 9:42:16 AM
to mfm-d...@googlegroups.com
On Tue, Oct 28, 2025 at 05:58:11AM -0700, BBMDonut wrote:
Lots of good fun deleted

> I've been looking at getting the RLL support added to the
> hardware emulator itself, but need to dig a little more into the track
> layout descriptors to see where/how those fields can be defined.

The emulator is storing the raw MFM and hopefully soon RLL bit patterns. The
only decoding it is doing is converting the transition timing to 1's and 0's.
Any further decoding is done with mfm_util. I picked that so it doesn't need to
know much about the format though it wastes space.

Rasz

Oct 28, 2025, 11:03:38 AM
to mfm-d...@googlegroups.com
On Tue, Oct 28, 2025 at 1:58 PM BBMDonut <bbmd...@gmail.com> wrote:

>The data sheet for the 50C12 controller IC

Oh, that is a nice datasheet, I haven't seen it before.

> Awesome work! Another idea I had, but have zero time to work on, is something
> similar to the HxC Floppy Emulator Toolkit but for MFM/RLL HDD images
> (or expanding out the existing tool to support them, if possible). I've been wanting
> to see graphical representations of the disk data ever since I started working with
> the MFM Emulator.

https://github.com/davidgiven/fluxengine is the closest I could find.
It could be extended with HDD support, and maybe its low-level
display could add the same per-flux-transition labels as the PulseView
decoder plugin I adopted.

Got the Seagate decoding working now :D Reading patents finally paid
off: Seagate does use 0x5EA1 as the sync marker for headers and
0xDEA1 for data! At least that leaves no ambiguity about what to display in the UI.
In contrast, the WD 50C12 datasheet just states the illegal 1000 0000 1001 0000
pattern but doesn't explicitly mention anything about it randomly sliding
around up to 4 bits away from the sync pattern :|
"Since each address mark should be preceded by approximately 12 bytes
of zeros, when a sequence of zeros is detected by the activation of
DRUN, Read Gate is activated and read data is examined until either an
address mark is detected or a non-zero byte which is not an address
mark is detected."
"If a non-zero non-address mark byte is detected, read gate is dropped
for at least 2 byte times, allowing the phase lock loop to
resynchronize with the write clock, before inspecting DRUN input
again."
2 bytes is not 4 bits, so what is going on? :| Figures 5 and 6, on the other
hand, read to me as if, after syncing the PLL, the controller just looks for an AM in
a window of 2 or 3 bytes, without saying what they mean by bytes
(32-46 clocks?).

Btw, Seagate uses a 32-bit CRC in RLL mode but writes the full 56 bits, with
the last 2 bytes zeroed, before starting GAP3, so even more waste :) And of
course there are 27 sectors on every track again, like there were in
MFM mode, with the last one unused/unusable (no GAP2 before data = can't
safely write to it) :|