I used a CPLD to decode the address logic and to control the reset line through a register, rather than using jumpers/pianos. The ATF1504 (in a PLCC socket) was probably overkill, and I could probably get away with an ATF22V10 now that I've figured out (almost reliably) how to program them from a Raspberry Pi (I'm currently struggling with the ATF16V8C). The ATF22V10's built-in flip-flops aren't useful here, though, because they are all tied to the same external clock pin, so any registers seem to have to be made from combinatorial logic. But the full 8-bit address decode works well in a single IC.
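A rough Python model of the decode-plus-register idea, for anyone who wants to play with it before writing equations. This is only a behavioural sketch, not the actual CPLD source: the addresses 0x58 and 0x5A, the `io_strobe_n` signal, and all function names are hypothetical, and it assumes the PCF8584 takes two adjacent addresses with A0 passed straight through to the chip.

```python
# Behavioural model only: addresses and signal names are hypothetical.
PCF8584_BASE = 0x58   # assumed: chip occupies BASE and BASE+1 (A0 goes to the chip)
RESET_REG = 0x5A      # assumed: one-bit reset control register


def decode(addr: int, io_strobe_n: int) -> dict:
    """Full 8-bit decode: every address bit takes part, so there are no mirrors."""
    addr &= 0xFF
    active = io_strobe_n == 0                 # assumed active-low I/O strobe
    return {
        "pcf_cs": active and (addr & 0xFE) == PCF8584_BASE,
        "reset_sel": active and addr == RESET_REG,
    }


def latch(q: int, d: int, enable: bool) -> int:
    """Transparent latch from AND/OR feedback: Q = (D AND EN) OR (Q AND NOT EN).

    Real hardware usually also wants the redundant term (D AND Q) so the
    output does not glitch when EN changes; it is omitted in this model.
    """
    en = 1 if enable else 0
    return (d & en) | (q & (1 - en))


def write_cycle(q: int, addr: int, data: int) -> int:
    """Model one I/O write: data bit 0 is latched only when the register is selected."""
    sel = decode(addr, io_strobe_n=0)["reset_sel"]
    return latch(q, data & 1, sel)
```

Calling `write_cycle(0, RESET_REG, 1)` sets the modelled reset bit, and a write to any other address leaves it unchanged, which is the behaviour the clocked flip-flops on a 22V10 can't give you without a dedicated clock.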
One afterthought: whilst I had a reset line to the PCF8584, I didn't extend it to the I2C peripherals. Also, I have only prototyped with a 5V I2C bus and no interrupts so far, which makes the driver code somewhat busy. Now that there have been some discussions on the forum about interrupt line daisy chaining, I might revise my board.
My specific problem with the dummy writes is that the first full cycle through either the master-transmitter or master-receiver flow ended with the I2C bus not being released by the PCF8584. An undocumented dummy write to the PCF8584 register seemed to clear the issue, and a new cycle could then start. I suspect a timing issue, as others have reported not seeing this problem (although with slightly different setups).
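In outline, the workaround looks something like the Python sketch below. It is a model of the flow, not driver code: `FakePcf8584` is a made-up stand-in so the logic can be run on a desk, and `dummy_reg` / `dummy_value` are deliberately left as parameters because which register and value to use is not pinned down here. The STOP control word 0xC3 and the BB status bit are taken from my reading of the PCF8584 datasheet.

```python
S1 = 1            # A0 = 1 selects the control/status register S1
CTRL_STOP = 0xC3  # PIN | ESO | STO | ACK: generate a STOP condition
STAT_BB = 0x01    # status bit 0 (BB, active low): reads 1 when the bus is free


class FakePcf8584:
    """Hypothetical stand-in that keeps the bus 'busy' until an extra write arrives."""

    def __init__(self):
        self.writes = []
        self.released = False

    def write(self, reg, value):
        # In this fake, any write that follows the STOP acts as the dummy write.
        if self.writes and self.writes[-1] == (S1, CTRL_STOP):
            self.released = True
        self.writes.append((reg, value))

    def read(self, reg):
        return STAT_BB if self.released else 0x00


def bus_free(dev, polls):
    """Poll the status register up to `polls` times for BB = 1 (bus free)."""
    return any(dev.read(S1) & STAT_BB for _ in range(polls))


def stop_and_release(dev, dummy_reg, dummy_value, polls=100):
    """Issue STOP; if the bus stays busy, try the dummy write and poll again."""
    dev.write(S1, CTRL_STOP)
    if bus_free(dev, polls):
        return True
    dev.write(dummy_reg, dummy_value)   # the undocumented workaround
    return bus_free(dev, polls)
```

On real hardware `dev` would wrap the actual port reads and writes, and the poll count would need tuning to the bus clock.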
I used the MCP23008 port expander as a test device on my PCB.
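For reference, a small Python sketch of the bytes a master-transmitter cycle would carry to drive the MCP23008 as an output port. The helper names are hypothetical and this only builds the frames; it doesn't talk to the PCF8584. It assumes the expander's address pins A2..A0 are tied low unless `a210` says otherwise.

```python
MCP23008_BASE = 0x20  # 7-bit I2C address with A2..A0 tied low
IODIR = 0x00          # direction register: 1 = input, 0 = output
GPIO = 0x09           # port register


def write_frame(reg, value, a210=0):
    """Bytes for one register write: address byte (R/W = 0), register, value."""
    addr7 = MCP23008_BASE | (a210 & 0x07)
    return bytes([(addr7 << 1) | 0, reg & 0xFF, value & 0xFF])


def output_frames(pattern):
    """Set all eight pins to outputs, then drive `pattern` onto them."""
    return [write_frame(IODIR, 0x00), write_frame(GPIO, pattern)]
```

For example, `output_frames(0xAA)` gives the two three-byte frames needed to light alternate LEDs on the port.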