UMDKv2 specific board first thoughts (in response to Chris's e-mail)


Rafael Gama

Mar 26, 2012, 6:17:14 PM
to umdkv...@googlegroups.com
Hello,

This post contains the first conversation I had with Chris about the design of the UMDKv2-specific board.

As Chris requested, I've posted it here to make the discussion available to everyone.


---------- Forwarded message ----------
From: Rafael Gama <rafael...@gmail.com>
Date: 2012/3/26
Subject: Re: [MakeStuff] Comment: "UMDKv2 Progressing Again..."
To: ch...@makestuff.eu


That's really great news! I apologise in advance for the long email:

Right now I'm in the process of creating the platform files
(.ucf, .xst, .ut & .batch) for a board with a single LX9 FPGA, to see
how easily the existing design fits. One potential problem is I've used
all 102 I/O pins on the FPGA (see attached entity declaration).
 
I will take a close look at this file later (really busy right now). But it's always a problem to fit the needed signals onto the available FPGA pins. :)
 
I am not bothering with /AS, CLK or /C_CE from the cartridge slot
because I don't think they're needed:

* I want to clock everything in the FPGA at 48MHz (i.e avoid the ugly
and actually pointless clock-doubling to 96MHz in the current design),
so I need to treat the address, data and control signals as asynchronous
inputs anyway, so there's no need to use the MD's CLK.


I am a little afraid of an asynchronous design. I am just paranoid about metastability issues. That comes from my experience with my old CPLD-based project.
Maybe it's a good idea to leave CLK (the 7.67 MHz one, right?) routed to the FPGA.

* The C_CE line is just for convenience and can be regenerated from the
address lines; it's asserted when the lower 4MiB is accessed (i.e the
cartridge address space).
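
As a rough illustration of the point above, the decode can be modelled in a few lines of Python. This is only a sketch: the name `c_ce_asserted` and the use of a byte address are assumptions made for the example, and the 4MiB boundary is taken from the description above.

```python
CART_SPACE_BYTES = 4 * 1024 * 1024  # lower 4MiB, per the description above


def c_ce_asserted(byte_address: int) -> bool:
    """Return True when an address falls inside the cartridge space.

    Hypothetical model: the real line is active-low, so True here means
    the regenerated /C_CE would be driven low.
    """
    address = byte_address & 0xFFFFFF  # keep the 24 address bits of the 68000
    return address < CART_SPACE_BYTES
```

In hardware terms this is the same as checking that the top two address bits (A23 and A22) are both low, so regenerating the signal inside the FPGA should cost very little logic.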

I've attached to the e-mail some old threads (from the segaextreme forum) concerning DRAMs and the megadrive cartridge signals. Sorry if you've already seen them, or if you already know this information, but it's part of the research I did while developing my old CPLD system.

In those threads there is some speculation about using /C_CE and /OE (called /CAS_0 in the threads) for a refresh method built into the megadrive. Anyway, if this information is correct, it would work only for FPM/EDO asynchronous DRAMs. But maybe relying on these signals is a good way to find the 'no activity window' where we can fit a refresh cycle.

As you can see, I am being paranoid. That's just because my CPLD design didn't work ;). In fact, I know that the UMDKv2 works with a CellularRAM (PSRAM) that incorporates a transparent self-refresh mechanism. So, if you think that we can refresh the SDRAM with no problems, just ignore these thoughts.
 

* The /AS is a weird one. It toggles on most cartridge memory cycles,
but NOT during plays of sampled sounds (e.g the "Sega" ditty on Sonic
1). Since its intent can be achieved just as easily and more reliably
(i.e for ALL cycle types) with the /OE, UDSW & LDSW lines, it is not
necessary.

Yes. From what I've learned from my old project and some research, three devices access ROM on the megadrive:

1) 68000
2) Z80 (the sound CPU, or the main CPU when in Mark III compatibility mode, i.e. Sega Master System mode)
3) DMA

Devices 2) and 3) won't generate /AS (the address strobe comes only from the 68k). So, we have to rely on /OE to catch non-68k reads.
And yes, we must rely on LDSW to catch 68k writes.

Indeed, it's ok to leave /AS out of the list.


I have included /DTACK because it would be good to have the freedom to
locate the FPGA's registers in address ranges where there is no
automatic DTACK generation.

Totally agreed.
 

I have dropped the mdBufOE_out signal from the original design because
it was just connected to ground anyway (thus enabling the three
5V<->3.3V level shifters). We can just leave the level shifters always
running, as long as they default to the MD->FPGA direction so as to
avoid bus contention on the MD side.

Totally agreed (just thought the same thing yesterday while designing the interface to my FPGA kit).
 

I have also removed the global reset_in signal because since the FPGA
registers start in a known state it seems a bit superfluous.

Agreed. 


What do you think? Have I missed anything?

It would be good to drive 2 more cartridge signals: 

1) /CART_IN - for Sega CD hardware simulation purposes. It's an active-low input to the megadrive. We don't need to pass it through a level translator: just use an open-drain output on the FPGA, as it's pulled up on the megadrive.
2) /M3 - for Sega Master System mode purposes. It's an active-low input to the megadrive. We don't need to pass it through a level translator: same as above.

I am finishing the schematic of the interface to my FPGA kit (it's a Digilent Spartan-3E Starter Kit). I will send it to you as soon as I finish it. I've placed some jumpers that select the routing between several megadrive signals and a single port on the FPGA. It was the way I found to be able to experiment with any megadrive signal I want, even with only a few ports available on the FPGA.


My thinking was to proceed like this:

* Get the platform files done for the LX9 FPGA, see how the resource
utilisation looks.
* Re-write the memory controller to clock at 48MHz instead of 96MHz
(which was then divided by two again to generate the RAM clock...duh).
* Re-assess the resource-utilisation and start doing the schematic and
PCB layout.
* Order enough components for ten boards. I have a friend in Shanghai
who has agreed to accept delivery from Chinese component distributors
and then forward them to me as a gift, so we avoid the UK's extortionate
import taxes.
* Finish PCB layout and send gerbers off to the manufacturer.
* Port the memory controller to a real SDRAM (as opposed to the
pseudo-SDRAM on the Nexys2). I have a working SDRAM to play with here:
http://www.makestuff.eu/wordpress/?p=2363. It should be fairly
straightforward to interleave one "DMA" memory cycle between every
MegaDrive cycle, so reads from SD-card can be done without the need for
the 68K to move data, and so the host can read and write cartridge
memory without having to halt the MD as the current design does.
* When boards and components arrive, solder one of them, spend a couple
of days testing it to make sure it works, then solder a few more boards,
test them and send them out to collaborators.

That's really great. 

The next step with my FPGA kit is getting its SDRAM working. I'm using this project (http://opencores.org/project,sdram_controller) as a reference design. Xilinx also has an IP core for that, but I don't know if it is free (I'm going to check later today).
 

Again, what do you think? Is there something in particular you'd like to
work on? You mentioned schematics, but I'd also be very grateful for
help with PCB layout and routing. There are a few layout issues,
especially given the two-layer PCB, like maintaining the signal
integrity of the high-speed (480MHz) USB signals, and making sure the
FPGA has a solid ground and enough decoupling capacitors. Luckily the
actual signal routing should be straightforward, being in general just
wide bus connections between the FPGA and adjacent components (SDRAM,
level-shifters, FX2LP, SD-card slot).

I can help with PCB layout and routing. I am used to working with Altium Designer (that's what I use at my job).
But to be honest, I have only a little experience with PCB routing.

It would be great if one side of the board had a solid ground plane. I think that's a robust way
to avoid signal-integrity problems.

For the USB signals, we could look at a successful reference design. I collaborated on a PCIe project where we were really lost in the routing of the signals from the PCIe transceivers to the board fingers. Solution: reproduce the same physical routing as the PCIe development kit we had. I know it sounds kind of funny, but it's robust :)

Although I don't feel like an experienced HDL developer, it's my main activity.

 
Also do you have any objection to keeping the project open-source
(GPL'd)? And do you have a development platform preference? In general I
try to make my code build on Windows, Linux and MacOSX, but I have not
yet built this particular project (i.e the GDB backend, etc) on any
platform other than i686 Linux.

It's great that this project will be open-source.
I mostly develop under Windows. Sometimes I switch to Linux and try porting things from Windows. I have no strong preference, but I am used to working with FPGA tools under Windows, so it's more comfortable to do everything there; that way I don't have to keep switching between operating systems during development. Anyway, I know that those FPGA tools are released for Linux too, so I don't mind trying them.


Personally I'm concentrating on a cartridge that is as cheap as possible
but which achieves these two goals:

* When powered up standalone (i.e without a host computer attached), the
MegaDrive will run a menu program from SD-card, allowing you to choose a
game by scrolling through the list using a standard MD controller.
* When connected to a host computer (Windows, Linux or MacOSX) you can
load code over USB and connect a GDB-based debugger to do
single-stepping and breakpoints, etc.

That's pretty awesome :)
 

Are there any bits of functionality beyond that which you would like to
see?

You mentioned you found it difficult in your CPLD-based project to meet
the timing requirements of VDP DMA accesses. Do you know what these
timing requirements actually are, quantitatively? I guess I've been
working under the assumption that any accesses to the cartridge memory
will have similar timing requirements to those of a real 68000. So far
that assumption has served me well, but in all honesty I don't know
whether I'm one hundred metres from the cliff edge or just one metre
from it, if you know what I mean!


I can't tell you quantitatively. But let me try explaining how my old system works and then present my theory concerning the DMA.

Below is a pseudo-description of the hardware behaviour:

There is a timer that sets a flag every refresh period (~15us).

If (timer flag) and (mega drive access) then do a read cycle followed by a refresh cycle (that's a special cycle supported by FPM/EDO asynchronous DRAMs); else
If (timer flag) then do a refresh cycle; else
If (mega drive access) then do a read cycle.

Only clear the timer flag once the refresh cycle has been performed, so that a refresh requested during a mega-drive-only access is still carried out afterwards.
 
:end 
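
To make the priority scheme described above easier to reason about, here is a small Python model of it. It is only a sketch of the behaviour as described, not the CPLD logic itself: the names `next_cycle` and `simulate` are made up for illustration, time is counted in abstract steps rather than microseconds, and the combined case is tested first so that it is reachable.

```python
def next_cycle(refresh_pending: bool, md_access: bool) -> str:
    """Pick the DRAM cycle for one arbitration step (hypothetical model)."""
    if refresh_pending and md_access:
        return "read+refresh"  # the special FPM/EDO read-then-refresh cycle
    if refresh_pending:
        return "refresh"
    if md_access:
        return "read"
    return "idle"


def simulate(accesses, refresh_every):
    """Run the arbiter over a list of per-step access flags.

    accesses: iterable of bools, True when the console reads the cartridge.
    refresh_every: steps between timer flags (stands in for ~15us).
    Returns the list of cycles chosen, one per step.
    """
    pending = False
    cycles = []
    for step, md_access in enumerate(accesses, start=1):
        if step % refresh_every == 0:
            pending = True  # the timer sets the flag
        cycle = next_cycle(pending, md_access)
        if "refresh" in cycle:
            pending = False  # cleared only once a refresh was performed
        cycles.append(cycle)
    return cycles
```

For example, `simulate([True, False, True, True], 2)` gives `['read', 'refresh', 'read', 'read+refresh']`. Note that this model assumes a combined cycle always fits in one step; it says nothing about whether the real timing allows that during back-to-back reads.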

So, at first sight, the system behaviour is good. But if you leave a game running for 4 hours, it will surely crash. An incremental data loss happens.
I think it happens because DMA cycles don't leave any space between the reads, so my logic is not able to insert a refresh cycle during the whole DMA access.
Also, a friend of mine wrote a megadrive demo that does not use DMA; it just sits in a loop playing music and doing some graphic effects. That demo ran for 24 hours straight.

Other thoughts: my old CPLD system was built on a breadboard, with big wires and bad grounding - a real mess. At least it had decoupling capacitors everywhere :)

Today I have a 100 MHz digital oscilloscope, not that incredible bandwidth, but I hope I can get some quantitative results with it.



Regards,

Rafael. 
 


threads.rar

Chris McClelland

Mar 26, 2012, 7:33:48 PM
to umdkv...@googlegroups.com
> I am a little afraid of an asynchronous design. I am just paranoid with
> metastability issues. It comes from my experience with my old project
> based on the CPLD. Maybe it's just a good idea to leave CLK (the
> 7.67 MHz one, right?) routed to the FPGA.

We can't escape the *potential* for metastability issues, because the USB interface's clock (ifclk_in@48MHz) is not synchronised to that of the MD (mdCl...@7.6MHz). To begin with I was thinking the same thing, that the master clock for the system should be a clock-multiplied 7.6MHz clock (mostly because at the time I was using an asynchronous EPP-style interface on the USB side). But as soon as you multiply a clock, the signals which are synchronous to the original clock become effectively asynchronous with respect to the multiplied clock anyway. My code seems to work reliably (i.e games run for many days without crashing) with one level of flip-flop to synchronise the async 68000 signals to the 48MHz ifclk.
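
As a rough aid to the discussion above, the following Python sketch models only the latency of a flip-flop synchroniser, i.e. how many fast-clock edges pass before a change on an asynchronous input appears at the output. It does not model metastability itself. The function name `synchronise` and the reset value of 1 (an active-low strobe that starts deasserted) are assumptions made for the example.

```python
def synchronise(samples, stages=1):
    """Pass a sampled async signal through `stages` flip-flops.

    samples: the input level seen at each rising edge of the fast clock.
    Returns the level at the last flop's output after each edge.
    """
    flops = [1] * stages  # assumed reset state: strobe deasserted (high)
    out = []
    for level in samples:
        # Shift register: each flop takes the previous flop's old value.
        flops = [level] + flops[:-1]
        out.append(flops[-1])
    return out


# Fast-clock periods per console clock, using the figures quoted above.
SAMPLES_PER_MD_CLOCK = 48.0 / 7.6  # roughly 6.3
```

With one stage the output simply follows the sampled input (`synchronise([1, 0, 0, 1])` gives `[1, 0, 0, 1]`); with two stages it lags by one edge (`[1, 1, 0, 0]`). At roughly six 48MHz edges per console clock, an extra stage costs a small fraction of a bus cycle.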

If we had plenty of spare I/O, I wouldn't be arguing, but it's pretty tight. The good news is I did manage to reclaim two redundant signals from the USB interface, so we've now got two spare signals to play with. We should allocate them wisely!


> I've attached to the e-mail some old threads(from the segaextreme
> forum) concerning DRAMS and the megadrive cartridge signals. Sorry
> if you've already seen them, or if you already know that information.
> But it's part of the research I did while developing my old CPLD system.

No, I wasn't aware of a lot of this info, thanks for posting it!

> As you can see, I am being paranoid. That's just because my CPLD
> design didn't work ;). In fact, I know that the UMDKv2 works with a
> CellularRAM (PSRAM) that incorporates a transparent self refresh
> mechanism. So, if you think that we are able to refresh the SDRAM
> with no problems, just ignore these thoughts.

I'm confident I can build an SDRAM controller with a similar latency to that of the Nexys2's PSRAM, but I don't intend to leave it to hope. Before we have boards made, I want to see measurements for the minimum width of /OE, and the worst-case address and data stability with respect to /OE, so we can plan whether it's feasible to interleave FPGA DMA accesses with the MD's accesses. In the worst-case scenario we just drop back to the software bus arbitration scheme the Nexys2 prototype uses - it's hacky, but it works.
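
Once those measurements exist, the feasibility question reduces to a clock budget. The Python sketch below shows the kind of check involved; the function names are made up, and the numbers used in the example (a 500ns window and 8 clocks per SDRAM access) are placeholders, not measured values.

```python
import math


def clocks_available(window_ns: float, clock_mhz: float) -> int:
    """Whole FPGA clock periods that fit inside a time window."""
    return math.floor(window_ns * clock_mhz / 1000.0)


def can_interleave(window_ns, clock_mhz, clocks_per_access, accesses=2):
    """True if `accesses` back-to-back memory accesses fit in the window.

    Two accesses means one for the console plus one FPGA DMA access.
    """
    needed = clocks_per_access * accesses
    return clocks_available(window_ns, clock_mhz) >= needed
```

With the placeholder figures, `clocks_available(500, 48)` is 24, so two 8-clock accesses would fit while two 13-clock accesses would not. The real answer depends entirely on the measured window and on the controller's actual access latency.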

> From what I've learned with my old project and some research,
> three devices access ROM on megadrive:
>
> 1) 68000
> 2) Z80 (the sound cpu or the main cpu when in 'mark III compatibility
> mode - sega master system mode') 
> 3) DMA

Can you suggest a way for me to test with the latter two? I do a lot of testing with Sonic 1, but that appears to only do non-68K accesses during the sampled sound played at startup. Are there any ROM images which I can try that make particularly heavy use of VDP DMA for example?


> 1) /CART_IN - for Sega CD hardware simulation purposes. It's an
> active-low input to megadrive. We don't need to pass it thru a level
> translator: just use an open drain output on the FPGA as it's pulled up on megadrive.
> 2)  /M3 - for Sega Master System mode purposes. It's an active-low input to
> megadrive. We don't need to pass it thru a level translator: same as above.

Surely it's not safe to have a pin pulled up to 5V on a 3.3V FPGA? Presumably we can achieve the same thing with a transistor with its base connected via a resistor to a FPGA pin, like I did for /RESET? I was going to do the same with /DTACK. Will a transistor switch fast enough?

Also, with /CART_IN and /M3 we're back up to 102 I/Os - 100% utilisation!  :-)


> For the USB signals, we may look at some successful reference design.

Agreed. I'm not saying it's an insurmountable problem, just something to put on the checklist. I managed to get 480Mbit/s working across a 10cm trace on a home-etched PCB just by ensuring the D+ and D- lines never went through vias and had unbroken ground-plane on the other side of the board.


> So, at first sight, system behavior is good. But if you leave a game
> running for 4 hours, it will surely crash. An incremental data loss happens.

Hmm...something weird is definitely going on. The 68000 datasheet says that the address and data strobes remain asserted for 240ns at 8MHz, a little less than half the 500ns bus cycle time. It's a very long time by today's standards. Am I being naive in assuming that the 68000 datasheet is still valid for the MegaDrive? I guess I've been assuming that:

* There will only be one memory access (whether 68K, VDP or Z80) in any given bus cycle.
* A bus cycle lasts at least four 7.6MHz clocks, or ~525ns.
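
The arithmetic behind the second assumption is easy to check. The short Python sketch below converts a clock count to nanoseconds; the helper name `cycles_to_ns` is made up for the example, and the 7.67MHz figure is the one mentioned earlier in the thread for the console clock.

```python
def cycles_to_ns(clocks: int, clock_mhz: float) -> float:
    """Duration of `clocks` periods of a clock running at `clock_mhz`."""
    return clocks * 1000.0 / clock_mhz


for mhz in (8.0, 7.67, 7.6):
    print(f"4 clocks at {mhz}MHz = {cycles_to_ns(4, mhz):.1f}ns")
```

Four clocks come to 500.0ns at 8MHz, about 521.5ns at 7.67MHz and about 526.3ns at 7.6MHz, so the ~525ns figure holds for either reading of the console clock.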


> Today I have a 100 MHz digital oscilloscope, not that incredible
> bandwidth, but I hope I can get some quantitative results with it.


Awesome! I wish I had something like that!

Chris McClelland

Mar 26, 2012, 7:51:40 PM
to UMDKv2 Developers

Forgot to mention:

> The next step with my FPGA kit is getting its SDRAM working. I got
> that (http://opencores.org/project,sdram_controller) project as a
> reference design. Also, Xilinx has an IP core for that. But I don't
> know if it is free. (Just going to check it later today).

Feel free to try the code I wrote to get me started with my SDRAM
controller for the little daughterboard I made:

http://www.makestuff.eu/wordpress/?p=2363

> So, at first sight, system behavior is good. But if you leave a game
> running for 4 hours, it will surely crash. An incremental data loss
> happens.

Remember that those old DRAMs were power-hungry, with lots of power
spikes too! Even with good decoupling, I imagine it would be pretty
difficult to build a DRAM system on Veroboard that was 100% reliable.

Rafael Gama

Mar 27, 2012, 1:29:06 PM
to umdkv...@googlegroups.com

2012/3/26 Chris McClelland <proph...@gmail.com>

We can't escape the *potential* for metastability issues, because the USB interface's clock (ifclk_in@48MHz) is not synchronised to that of the MD (mdCl...@7.6MHz). To begin with I was thinking the same thing, that the master clock for the system should be a clock-multiplied 7.6MHz clock (mostly because at the time I was using an asynchronous EPP-style interface on the USB side). But as soon as you multiply a clock, the signals which are synchronous to the original clock become effectively asynchronous with respect to the multiplied clock anyway. My code seems to work reliably (i.e games run for many days without crashing) with one level of flip-flop to synchronise the async 68000 signals to the 48MHz ifclk.

If we had plenty of spare I/O, I wouldn't be arguing, but it's pretty tight. The good news is I did manage to reclaim two redundant signals from the USB interface, so we've now got two spare signals to play with. We should allocate them wisely! 
 

Yes, you're right.
 
I'm confident I can build an SDRAM controller with a similar latency to that of the Nexys2's PSRAM, but I don't intend to leave it to hope. Before we have boards made, I want to see measurements for the minimum width of /OE, and the worst-case address and data stability with respect to /OE, so we can plan whether it's feasible to interleave FPGA DMA accesses with the MD's accesses. In the worst-case scenario we just drop back to the software bus arbitration scheme the Nexys2 prototype uses - it's hacky, but it works.

 
I will try doing those measurements. Measuring data stability with respect to /OE is going to be a little more tricky. Anyway, I will have to wait until the weekend; unfortunately, it's my only spare time.
 
Can you suggest a way for me to test with the latter two? I do a lot of testing with Sonic 1, but that appears to only do non-68K accesses during the sampled sound played at startup. Are there any ROM images which I can try that make particularly heavy use of VDP DMA for example?

 
That's a good question for my friend who helped me with the CPLD project. In fact, the idea of building a DRAM-based cartridge was his. He has just joined the e-mail group.


Surely it's not safe to have a pin pulled up to 5V on a 3.3V FPGA? Presumably we can achieve the same thing with a transistor with its base connected via a resistor to a FPGA pin, like I did for /RESET? I was going to do the same with /DTACK. Will a transistor switch fast enough?

Please correct me if I am wrong, but I always thought that it would not be a problem, since an open-drain output on the FPGA is used - just like what is described in this topic: http://www.edaboard.com/thread134623.html

In my mind, even inside an FPGA, an open-drain output is a transistor which has its SOURCE connected to GND, its DRAIN connected to the FPGA port and its GATE controlled by the resulting logic.

1) When the transistor's gate is turned on: a current flows through its channel, limited by the pull-up resistor. That current flows between the 5V supply and GND through the pull-up. After the pull-up resistor the voltage is almost zero, since the channel resistance is very low. That's driving the line low, or asserting the line. The only thing to note is that the pull-up resistor must limit the current to an acceptable level for the FPGA. There is no issue if the voltage level on the GATE is less than the voltage driving the pull-up (e.g. 3V3 and 5V respectively).

2) When the transistor's gate is turned off: there is no current flowing through its channel. Thus, the transistor's DRAIN is open, or in a high-impedance state. The voltage that appears after the pull-up resistor is (almost) 5V, as there is no current flowing through the transistor - the line is deasserted.

For the above circuit to work properly, both the 3V3 and 5V supplies must share the same GND (and we have that).

So, that transistor works just like the one that we would put outside the FPGA. I think that the discrete 'outside' transistor is good for safety purposes, since it's the one that is going to blow if the pull-up resistor doesn't limit the current enough. Anyway, we know the pull-up resistor on the megadrive.
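
The two states described above can be put into numbers with Ohm's law. The Python sketch below is only an illustration: the function name is made up, and the component values in the example (a 4.7k pull-up and 25 ohms of channel resistance) are assumed for the sake of the arithmetic, not taken from the megadrive schematic or any FPGA datasheet.

```python
def open_drain_levels(v_pullup, r_pullup_ohm, r_on_ohm):
    """Return (v_low, i_sink_amps, v_high) for an open-drain output.

    v_low:  pin voltage with the transistor on (line asserted).
    i_sink: current the transistor has to sink in that state.
    v_high: pin voltage with the transistor off (line deasserted).
    """
    i_sink = v_pullup / (r_pullup_ohm + r_on_ohm)
    v_low = i_sink * r_on_ohm  # voltage divider across the channel
    v_high = v_pullup  # no current flows, so no drop across the pull-up
    return v_low, i_sink, v_high


# Assumed example values only.
v_low, i_sink, v_high = open_drain_levels(5.0, 4700.0, 25.0)
print(f"low: {v_low * 1000:.1f}mV, sink: {i_sink * 1000:.2f}mA, high: {v_high:.1f}V")
```

With those assumed values the on-state current is about 1mA and the low level is a few tens of millivolts. In the off state the pin itself sits at the full pull-up voltage, so that figure is the one to compare against the FPGA's absolute maximum pin rating.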


Also, with /CART_IN and /M3 we're back up to 102 I/Os - 100% utilisation!  :-)

Great! But wait... 100% again :( 
I still have to look at the list of signals routed to the FPGA.


Agreed. I'm not saying it's an insurmountable problem, just something to put on the checklist. I managed to get 480Mbit/s working across a 10cm trace on a home-etched PCB just by ensuring the D+ and D- lines never went through vias and had unbroken ground-plane on the other side of the board.

 
That's really great. So, I think we won't have problems with the USB differential lines.
 

Hmm...something weird is definitely going on. The 68000 datasheet says that the address and data strobes remain asserted for 240ns at 8MHz, a little less than half the 500ns bus cycle time. It's a very long time by today's standards. Am I being naive in assuming that the 68000 datasheet is still valid for the MegaDrive? I guess I've been assuming that:

* There will only be one memory access (whether 68K, VDP or Z80) in any given bus cycle.
* A bus cycle lasts at least four 7.6MHz clocks, or ~525ns.

I agree with the two points above, and I think there is nothing wrong with assuming the 68000 datasheet timing for the MegaDrive. We just can't forget that the DMA accesses are going to be faster than the 68000 accesses.
 

Rafael Gama

Mar 27, 2012, 1:39:38 PM
to umdkv...@googlegroups.com
2012/3/26 Chris McClelland <proph...@gmail.com>

Forgot to mention:


Feel free to try the code I wrote to get me started with my SDRAM
controller for the little daughterboard I made:

http://www.makestuff.eu/wordpress/?p=2363

 
I will surely try that.  

Remember that those old DRAMs were power-hungry, with lots of power
spikes too! Even with good decoupling, I imagine it would be pretty
difficult to build a DRAM system on Veroboard that was 100% reliable.

Yes, that's a really good point. Maybe it's a good idea to start from the assumption that the main problem in the CPLD project was signal integrity.

Chris McClelland

Mar 27, 2012, 2:38:35 PM
to UMDKv2 Developers
Thanks for your replies, I really appreciate it, especially on
weekdays when you're short of time.

> When the transistor's gate is turned off: there is no current flowing
> thru its channel. Thus, the transistor's DRAIN is open or in high
> impedance state. The voltage that appears after the pull-up resistor
> is (almost) 5V, as there is no current flowing thru the transistor -
> the line is deasserted.

I was hopeful for a while ("a way to avoid external transistors, yay!"),
but I think the problem is not the current but the voltage. Whether or
not the output is open-drain, the pin itself remains connected directly
to the input buffer of the IOB. The absolute maximum ratings prohibit
the pins from going above about 4V. I assume that means 5V will sooner
or later destroy the input buffer. Maybe that's OK because those pins
will never be inputs, but we can't be sure the damage won't spread
beyond the input buffer. Unfortunately it's not clear whether there are
clamping diodes to protect the IOBs from over-voltage. See page 44 of
ug381.pdf[1]. Let's be paranoid and keep the external transistors for
now. I guess one problem is the switching speed of the external
transistors; it's not such a problem for RESET and CART_IN because they
don't need to be super fast. But DTACK needs to switch pretty quickly.

> We just can not forget that the DMA accesses are going to be faster
> than the 68000 accesses.

I can see how DMA reads can be *more frequent* than 68000 reads, but I
can't see that they can be faster without breaking the cycle timing
specified by the 68000 datasheet. I don't know, maybe it *does* break
the 68000 timing and I just haven't noticed (i.e maybe it stops
asserting DTACK or even stops the 68K's clock whilst it asserts C_OE a
few times for some DMA accesses, but for shorter periods than the 240ns
spec'd by Motorola). I could do with a ROM to test it. Something that
just sits in a loop whilst saturating the bus with DMA cycles. Then I
can measure the shortest period during which C_OE is asserted, and the
shortest interval between consecutive assertions.
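
For what it's worth, here is a Python sketch of how those two figures could be extracted from a captured trace. The capture format is an assumption made for the example (a list of 0/1 levels of the active-low strobe taken at a fixed sample period), and the function name `strobe_stats` is made up.

```python
def strobe_stats(samples, sample_period_ns):
    """Find the shortest low pulse and the shortest gap between pulses.

    samples: 0/1 levels of an active-low strobe at a fixed sample rate.
    Returns (min_low_ns, min_gap_ns); an entry is None if never observed.
    Runs touching the start or end of the capture are ignored because
    their true length is unknown.
    """
    runs = []  # run-length encoding: [level, length] pairs
    for level in samples:
        if runs and runs[-1][0] == level:
            runs[-1][1] += 1
        else:
            runs.append([level, 1])
    interior = runs[1:-1]
    lows = [n for level, n in interior if level == 0]
    gaps = [n for level, n in interior if level == 1]
    min_low = min(lows) * sample_period_ns if lows else None
    min_gap = min(gaps) * sample_period_ns if gaps else None
    return min_low, min_gap
```

The resolution is limited by the sample period, so the capture rate would need to be well above the 48MHz FPGA clock for the results to be useful when planning the controller timing.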

Chris

On Mar 27, 6:29 pm, Rafael Gama <rg...@desconstruindo.eng.br> wrote: