In terms of functionality, not much has happened. It's still lacking interrupts, and programs have to be preloaded into RAM. I could perhaps find time to attempt a primitive implementation of the HW support for interrupts, but realistically I can only borrow a couple of hours from the family. And judging from the coughing coming from the youngest kid, I don't expect much time to work on this (or sleep!) at all in the coming days. And it would still leave me with figuring out the SW side.
So I have instead focused on bringing down the size of the CPU itself, mostly for fun. It's currently clocking in at 461 LUTs. Some parts have been hand-optimized with a ton of Karnaugh maps, while other parts are in great need of a rewrite, so there is still some room to bring this number down with some more work.
It's not a very good CPU, but I think it has some use. Since it's not using any RAM for microcode, it could potentially be useful in applications where on-chip RAM is a scarce resource (e.g. as a control plane CPU in a data capturing/streaming application).
There is also a reasonably clear separation between control logic and data path, so it would be fun to make it configurable for different bit widths to see the area/speed trade-offs. At least 1, 2 and 4 bits would be fun to explore. Any higher and I don't think it makes any sense against an architecture purpose-built for 8/16/32.
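To get a feel for the speed half of that trade-off, here is a rough Python model of an adder that works through a word one slice at a time. It is only a sketch and not the actual data path of this CPU: the function name `serial_add`, the 16-bit word size and the "one slice per clock" timing are all assumptions made for illustration.

```python
def serial_add(a, b, word_bits=16, slice_bits=1):
    """Add two word_bits-wide values, slice_bits at a time.

    Hypothetical model of a narrow ALU: each loop iteration stands in
    for one clock cycle in which the ALU sees one slice of each operand.
    Returns (sum modulo 2**word_bits, carry_out, cycles).
    """
    if word_bits % slice_bits != 0:
        raise ValueError("slice_bits must divide word_bits")
    mask = (1 << slice_bits) - 1
    result = 0
    carry = 0
    cycles = 0
    for shift in range(0, word_bits, slice_bits):
        # Add the current slices plus the carry saved from the previous one.
        total = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (total & mask) << shift
        carry = total >> slice_bits  # 0 or 1, kept for the next slice
        cycles += 1
    return result, carry, cycles


if __name__ == "__main__":
    for width in (1, 2, 4):
        total, carry, cycles = serial_add(0x1234, 0x0FFF, 16, width)
        print(f"{width}-bit slices: sum={total:#06x} carry={carry} cycles={cycles}")
```

With the assumed 16-bit word, the add takes 16, 8 and 4 cycles for 1, 2 and 4-bit slices. The area side (how many LUTs each width costs) can't be modelled like this and would have to come from real synthesis runs.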
Still super interested to see what kind of design choices everyone else has made, and of course a bit sad that I won't make the deadline. But it's ok. Failure is always an option :)