[Alhambra2media] PSRAM Controller


charli va

Nov 28, 2025, 7:10:13 PM
to fpga-wars-explora...@googlegroups.com

Hi everyone!

Here is the first version of the controller I’ve been working on for the PSRAM, which I’ve now refocused for Jesus's shield.

https://github.com/cavearr/psram_phy

I think if we start developing things for the shield, we should probably open separate threads to make it easier to follow.

Back to the PSRAM: this is going to give us superpowers! We have 8 MB of memory on the shield, and based on my tests, access speed is reasonably good. If we put a BRAM cache in front of it... things get very interesting for many applications.

The module I’m sharing supports:

  • SINGLE Mode: 1 word of 32 bits, 1 bit at a time.

  • QUAD Mode: 4 bits at a time.

  • BURST Mode (SINGLE and QUAD): I’ve set this up like a DMA with a destination BRAM. Basically, you tell it "copy X bytes from this BRAM address to the PSRAM" (or vice versa), and it is very fast.
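To put the SINGLE versus QUAD difference in numbers, here is a minimal sketch that counts only the clock cycles of the data phase for one 32-bit word. It deliberately ignores the command, address, and dummy cycles, and the function name `data_clocks` is just an illustrative helper, not something taken from the repository.

```python
def data_clocks(word_bits: int, bits_per_clock: int) -> int:
    """Clock cycles needed to shift word_bits when bits_per_clock move per cycle."""
    return -(-word_bits // bits_per_clock)  # ceiling division on integers


# Data phase of one 32-bit word in each mode (command/address/dummy not counted).
single = data_clocks(32, 1)  # SINGLE mode: 1 bit per clock
quad = data_clocks(32, 4)    # QUAD mode: 4 bits per clock
print(f"SINGLE: {single} clocks, QUAD: {quad} clocks")
```

So the data phase alone is four times shorter in QUAD mode; the real gain per access is smaller once the fixed overhead cycles are added.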

I’ve also added CDC (Clock Domain Crossing) support. This means the system can run at one clock speed and the PSRAM at another. I’ve tested with non-integer ratios, for example, the system at 12 MHz and the PSRAM at 80 MHz or 100 MHz.

This is the first version and it's pure Verilog. Once it stabilizes a bit more, I’ll make some blocks for Icestudio.

The tests I’m sharing are simple; the code is fresh out of the oven and might have bugs. However, testing gave me a feeling of robustness. I don't know if that's because it took a lot of effort to get it running and I debugged it so much that it feels solid, or if it really turned out robust XD.

In any case, I think it’s a great first step to do cool things with the shield. I’ll keep working on it this weekend and try more complex stuff (maybe a video framebuffer), so bugs will likely pop up. But here it is in case anyone is curious or wants to start testing.

If you don't have the shield but find the idea interesting: I had actually been working on this for a few weeks before Jesus's shield came up (planetary alignment!). I was using a home-made prototype with a Lyontek PSRAM and a generic SMD breakout board. I’m attaching photos; it’s very easy to build, just plug it into a breadboard and play.


IMG_2571.JPG
IMG_2569.JPG

Happy FPGA weekend!

Jo mo

Nov 29, 2025, 4:49:08 AM
to FPGAwars: explorando el lado libre
Carlos,

Your GitHub page on this subject is very well documented!
I still have to look into it, but for my "XGA graphics" project, where I am quickly running out of BRAM (on my ECP5 FPGA), your BRAM caching strategy could be interesting!
In the end I am planning to store "long term" images/sprites/fonts on an SD card, so I need to find the right way of reading them (maybe caching them in DDR2 RAM, then in BRAM, ...).
I need to look at this more deeply!

Thanks for sharing, my friend, and sorry for being slightly off topic ;-)

Jesus Arias

Nov 29, 2025, 1:09:27 PM
to FPGAwars: explorando el lado libre
Hi Carlos,
Very nice interface, with everything one could need included (an independent fast clock is a big feature ;). I'll have to try it.

And a question too... Is there a way to configure the number of dummy cycles for quad reads? I thought it was always 6.

And thinking about the failures of my simpler interface: when executing code from the PSRAM, #CE is high for only one cycle between reads. That is about 40 ns at 25 MHz, which is less than the specified 50 ns. Maybe this is the whole problem, because in other read/write tests some instructions are executed from other memories and #CE remains high for a longer time. This looks like a smoking gun ;)
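The timing argument above can be checked with a few lines of arithmetic. This is only a sketch: the 50 ns minimum and the 25 MHz clock are the figures quoted in this post, and the helper names `ce_high_ns` and `min_ce_high_cycles` are made up for illustration.

```python
NS_PER_S = 1_000_000_000


def ce_high_ns(clock_hz: int, cycles: int) -> float:
    """Time #CE stays high, in ns, when it is held high for `cycles` clock cycles."""
    return cycles * NS_PER_S / clock_hz


def min_ce_high_cycles(clock_hz: int, min_high_ns: int = 50) -> int:
    """Smallest whole number of cycles whose duration is at least min_high_ns."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (min_high_ns * clock_hz + NS_PER_S - 1) // NS_PER_S


clock = 25_000_000  # 25 MHz, as in the post
print(ce_high_ns(clock, 1))        # one cycle of #CE high
print(min_ce_high_cycles(clock))   # cycles needed to satisfy the 50 ns minimum
```

At 25 MHz a single cycle gives 40 ns, so two cycles are needed to meet the 50 ns minimum.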

Nice day

charli va

Nov 29, 2025, 1:40:39 PM
to fpga-wars-explora...@googlegroups.com
Hi Jesús and Joaquim! I hope the module is very useful to us.

Just keep in mind that it's brand new with minimal testing. Yesterday I found a bug while documenting and testing, and it ruined my afternoon! XD

The QUAD_DUMMY_CYCLES setting is one of the most complex aspects of the module, and I needed to fine-tune it empirically through testing and the logic analyzer, since the documentation is inaccurate. In the Readme file, you can find how to configure it, along with a table of the values I needed in my tests to make it work at certain frequencies (the dummy cycles specified in the documentation don't work).

If you find any bugs, let me know. I haven't been able to make any progress today, but tomorrow I hope to create some more complex tests. If you put it in Larva, an interface like your flashcache but with the PSRAM behind it could be killer.

Captura de pantalla 2025-11-29 a las 19.31.06.png
Have a nice weekend!


Jesus Arias

Dec 1, 2025, 9:51:52 AM
to FPGAwars: explorando el lado libre
Hi,
I just want to confirm that the minimum high time for #CE was indeed the source of my problems. Now each read takes one cycle longer, #CE stays high for two cycles or more (≥80 ns), and I'm able to execute code stored in the PSRAM. It is slow, but it runs. (The updated code is on GitHub.)

Along the same lines, the AC characteristics of the PSRAM also specify a maximum #CE low time of 8 µs. That's something new, not found in Flash devices. It may be related to the internal refresh of the DRAM, but in any case it can be a constraint for burst reads. (In my current cache code, a line fill reads 128 bytes and requires 270 cycles, meaning the clock has to run at 33.75 MHz or faster to meet this spec.)
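As a quick check of that constraint, here is a small sketch using only the numbers from this post (8 µs maximum #CE low time, 270 cycles per line fill). The function names are hypothetical helpers for illustration.

```python
MAX_CE_LOW_NS = 8_000  # 8 us maximum #CE low time, as quoted above


def min_clock_mhz(burst_cycles: int, max_ce_low_ns: int = MAX_CE_LOW_NS) -> float:
    """Slowest clock (MHz) at which a burst of burst_cycles fits in max_ce_low_ns."""
    return burst_cycles * 1_000 / max_ce_low_ns


def max_burst_cycles(clock_hz: int, max_ce_low_ns: int = MAX_CE_LOW_NS) -> int:
    """Longest burst, in whole clock cycles, that fits in max_ce_low_ns."""
    return clock_hz * max_ce_low_ns // 1_000_000_000


print(min_clock_mhz(270))            # minimum clock for a 270-cycle line fill
print(max_burst_cycles(25_000_000))  # cycles available per burst at 25 MHz
```

With a 25 MHz clock only 200 cycles fit inside the 8 µs window, so a 270-cycle line fill would have to be split into shorter bursts or run from a faster clock.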

And a question regarding your sources. "addr" is 21 bits wide, and it is stated to be a byte address:

Signals:
addr[20:0] - 21-bit byte address (2MB addressable)
...

This limits the PSRAM size to 2 Mbytes. Should it be a word address? That seems logical, as there are also 4 write strobes, one per byte (totaling 8 Mbytes). Even if this is the case, it would be advisable to add another bit to the address input in order to support 16 Mbyte chips with 24-bit addresses.
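The sizes in question work out as follows. This is a sketch of the arithmetic only; `addressable_bytes` and `split_byte_address` are hypothetical helpers, and the word/byte-lane split assumes 32-bit words with one strobe per byte, as described above.

```python
MIB = 1024 * 1024


def addressable_bytes(addr_bits: int, bytes_per_location: int) -> int:
    """Total bytes reachable with addr_bits address lines."""
    return (1 << addr_bits) * bytes_per_location


def split_byte_address(byte_addr: int) -> tuple[int, int]:
    """Split a linear byte address into (word address, byte lane 0..3)."""
    return byte_addr >> 2, byte_addr & 0b11


print(addressable_bytes(21, 1) // MIB)  # 21-bit byte address
print(addressable_bytes(21, 4) // MIB)  # 21-bit word address, 4 bytes per word
print(addressable_bytes(22, 4) // MIB)  # 22-bit word address, 4 bytes per word
print(split_byte_address(0x7FFFFF))     # last byte of an 8 MB device
```

A 22-bit word address covers the same range as a 24-bit byte address, which is why one extra address bit is enough for 16 Mbyte chips.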

Nice day

charli va

Dec 1, 2025, 10:26:56 AM
to fpga-wars-explora...@googlegroups.com
What a great step forward! Being able to execute from PSRAM is already a huge leap, even if we're going slowly for now XD.

Indeed, the addressing is by byte group (word). It might be a bit unclear, and it could be simpler to use linear byte addresses and calculate the corresponding byte, but I found that too complex.

To be honest, reducing the address bus width was a last-minute "mistake." I'll update it later and let you know. I had a problem with corrupted bits on reads, and it was all due to the dummy cycles. During testing, something curious happened: once I had everything working, testing single SPI in burst mode exposed a problem with global buffer usage. It's possible that the serialization creates such a long path that nextpnr isn't able to optimize it with the iCE40's resources. In the end, since this mode doesn't make sense for this kind of memory, I abandoned it. However, while trying to get it working, I attempted to reduce the logic to see if I could get it to route, and one of the tweaks involved the bus widths (probably a bad idea), and that's where I left it.

I'll fix it and take the opportunity to close some other tests I'm currently working on as soon as they're up and running.

Thank you very much for using it and providing feedback.

Good afternoon!
