Strange issue with SWU QSPI writes on zynqmp


Ulrich Teichert

Aug 22, 2025, 6:39:05 AM
to swupdate
Hi,

I'm having a strange issue when writing into a QSPI flash on a zynqmp with
swupdate. At first glance this looked like either a hardware or a driver issue,
but now I'm totally mystified - perhaps someone has an idea where to look.

The situation is like this: we have a working swupdate process which writes a bootloader
into our QSPI flash on various ARM64 boards with zynqmp SoCs:

root@esp-TE-0820[SD]:~# mtd_debug info /dev/mtd0
mtd.type = MTD_NORFLASH
mtd.flags = MTD_CAP_NORFLASH
mtd.size = 4194304 (4M)
mtd.erasesize = 131072 (128K)
mtd.writesize = 1
mtd.oobsize = 0
regions = 0

It works well on a 5.10 Linux kernel, but when we use the same swupdate running on
6.1, some words, bytes and bits are set to 0 in the QSPI after an update. The puzzling
fact is that these faults are exactly the same every time and *on different boards*:

- the word at position 0xA582 is 0x0000 where it should be 0x10B8,
- the word at position 0xE582 is 0x0000 where it should be 0x2130,
- the byte at position 0x25001 is 0x00 where it should be 0xE0.

That is reproducible every time and on different hardware. If I write the same data
into the QSPI with flashcp, there are no such differences - every time and on every board.

Is that a QSPI driver issue, despite the fact that this happens at exactly the same
positions over and over again? As it works on 5.10, we can rule out hardware issues,
but I've never seen anything like that before.

Thanks for any hint,
Uli
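
Editor's sketch: the fault positions quoted above can be collected mechanically by comparing the bootloader image with a readback of the flash. The Python below is a minimal illustration, assuming the flash contents have already been read from /dev/mtd0 into a file. The names `Mismatch` and `find_mismatches` are hypothetical and not part of swupdate or mtd-utils. It also flags whether a wrong byte only has bits cleared relative to the expected value, which is the pattern reported in this post.

```python
from typing import Iterator, NamedTuple


class Mismatch(NamedTuple):
    offset: int         # byte offset into the image
    expected: int       # byte value in the reference image
    actual: int         # byte value read back from flash
    only_cleared: bool  # True if 'actual' has no bit set that 'expected' lacks


def find_mismatches(expected: bytes, actual: bytes) -> Iterator[Mismatch]:
    """Yield every byte position where the readback differs from the image.

    Only the common length of both buffers is compared.
    """
    for offset, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            # (got & ~want) keeps the bits that are 1 in the readback but 0
            # in the image; if there are none, bits were only cleared.
            yield Mismatch(offset, want, got, (got & ~want) == 0)


def format_mismatch(m: Mismatch) -> str:
    """Render one mismatch in the style used in the post."""
    kind = "bits cleared only" if m.only_cleared else "bits set unexpectedly"
    return (f"0x{m.offset:06X}: is 0x{m.actual:02X}, "
            f"should be 0x{m.expected:02X} ({kind})")
```

Reading both files in binary mode and printing `format_mismatch(m)` for every result gives a list that can be compared across boards and kernel versions.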

ayoub...@googlemail.com

Aug 22, 2025, 6:53:07 AM
to swupdate
Hi,

I had similar issues on the ZynqMP Ultra-Scale, probably DTS config issues(?)

Try writing the bootloader to your MTD device with the flashcp command, to be sure the issue is not related to swupdate.

Best

Ulrich Teichert

Aug 25, 2025, 5:51:19 AM
to swupdate
Hi,

On Friday, August 22, 2025 at 12:53:07 PM UTC+2 [del]@googlemail.com wrote:
> I had similar issues on the ZynqMP Ultra-Scale, probably DTS config issues(?)

Yes, from 5.10 to 6.1 a lot of DT properties for the QSPI have been changed (and *not* documented by
Xilinx...) as we found out. But that's fixed in our DT now and:

> try to write the bootloader using flashcp command to your mtd device to be sure there is no issue related to swupdate.

As I wrote in my post, I *can* write the QSPI with flashcp without problems and I can boot from QSPI afterwards,
that's why I am confused.

TIA,
Uli

ayoub...@googlemail.com

Aug 26, 2025, 11:15:55 AM
to swupdate
okay,

As I wrote, I had some issues with the QSPI flash when migrating from the 4.19 kernel to 6.1, but they were not related to swupdate (after flashcp the board couldn't boot).
After I fixed the QSPI settings in the DTS, it worked as it should (both swupdate and flashcp).

As a workaround, I suggest using a flashcp-based Lua external handler.

best regards

Stefano Babic

Aug 26, 2025, 11:21:02 AM
to ayoub...@googlemail.com, swupdate
On 8/26/25 17:15, 'ayoub...@googlemail.com' via swupdate wrote:
> okay,
>
> As I wrote, from my side were some issues with QSPI flash when migrating
> from 4.19 Kernel to 6.1 but it was not related to swupdate (after
> flashcp the board couldn't boot)
> After I fixed the QSPI settings in DTS it worked as it should (both
> swupdate and flashcp).
>
> As workaround I suggest you to use flashcp based lua external handler.

Well, but this means ignoring the problem instead of finding the cause. The
issue may just be triggered by SWUpdate, and it can still happen later
while the software is running. If you are simply running out of time, you
can close your eyes and do this; otherwise it would be better to find the
real cause.

Stefano


ayoub...@googlemail.com

Aug 26, 2025, 11:33:41 AM
to swupdate

Hi Stefano,

You’re right, but I’ve also encountered similar issues in the past, not with QSPI flash, but with NAND flash (and I wasn’t the only one):

https://groups.google.com/g/swupdate/c/VGOQ1XaPJXE/m/budL2opjAAAJ

There definitely seems to be something buggy in swupdate. I spent some time investigating it back then but couldn’t really pin it down.

When the board becomes unbootable, the recovery process is unfortunately quite laborious. And as you can imagine, customers are not willing to pay more if a workaround is available :-(


Best

Stefano Babic

Aug 26, 2025, 11:43:46 AM
to ayoub...@googlemail.com, swupdate
Hi Ayoub,

On 8/26/25 17:33, 'ayoub...@googlemail.com' via swupdate wrote:
> Hi Stefano,
>
> You’re right, but I’ve also encountered similar issues in the past, not
> with QSPI flash, but with NAND flash (and I wasn’t the only one):
>
> https://groups.google.com/g/swupdate/c/VGOQ1XaPJXE/m/budL2opjAAAJ
>
> There definitely seems to be something buggy in swupdate. I spent some
> time investigating it back then but couldn’t really pin it down.

Well, SWUpdate is using the API provided by the kernel, or rather what
is provided by mtd-utils and libmtd. There is nothing specific to SWUpdate in that.

>
> When the board becomes unbootable, the recovery process is unfortunately
> quite laborious. And as you can imagine, customers are not willing to
> pay more if  a workaround is available :-(

Sure, customers do not think about the future... but we could write books
about quick and dirty tricks...

Stefano

Ayoub Zaki

Aug 26, 2025, 12:47:32 PM
to swupdate


On Tue, Aug 26, 2025, 17:43 Stefano Babic <stefan...@swupdate.org> wrote:
> Well, SWUpdate is using the API provided by the kernel, or better what
> is provided by mtd-utils and libmtd. There is not some specific in SWUpdate.

I know, and this is why I'm puzzled: why does it work with the mtd-utils commands, i.e. nandwrite and flashcp, but not within swupdate?

Ulrich Teichert

Aug 27, 2025, 3:29:22 AM
to swupdate
Hi,

On Tuesday, August 26, 2025 at 6:47:32 PM UTC+2 Ayoub Zaki wrote:
> > Well, SWUpdate is using the API provided by the kernel, or better what
> > is provided by mtd-utils and libmtd. There is not some specific in SWUpdate.
>
> I know, this is why I'm puzzled why it does work using mtd-utils commands i.e nandwrite and flashcp and not within swupdate?

Right, that's what puzzled me as well. After thinking a bit, I used dd to write the same data into /dev/mtd0
(without setting the block size!) and the same faults showed up at the same positions! I think that proves
that it is the QSPI kernel driver which can't cope with byte-sized writes, don't you think?

TIA,
Uli
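
Editor's sketch: dd without an explicit bs= uses 512-byte blocks, so the experiment above effectively changes the size of each write() that reaches the MTD layer. To repeat it with other sizes, something like the following Python could be used. It is a minimal illustration only. `write_in_chunks` is a hypothetical helper, not an mtd-utils or swupdate function, and on a NOR device the target area is assumed to have been erased beforehand (for example with flash_eraseall), because a plain write does not erase.

```python
import os


def write_in_chunks(image: bytes, target_path: str, chunk_size: int) -> int:
    """Write 'image' to 'target_path' with write() calls of at most chunk_size bytes.

    Returns the number of write() calls that were issued.
    Assumption: target_path already exists (e.g. /dev/mtd0 or a test file).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    calls = 0
    # Unbuffered file descriptor, so each chunk becomes its own write() call.
    fd = os.open(target_path, os.O_WRONLY)
    try:
        for start in range(0, len(image), chunk_size):
            chunk = image[start:start + chunk_size]
            written = 0
            # os.write() may write fewer bytes than requested; loop until done.
            while written < len(chunk):
                written += os.write(fd, chunk[written:])
                calls += 1
    finally:
        os.close(fd)
    return calls
```

Running it once with a chunk size of 512 and once with a larger size, followed by a readback comparison each time, would show whether the faults depend on the write size.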


Stefano Babic

Aug 27, 2025, 3:49:41 AM
to Ulrich Teichert, swupdate
Hi Uli,
That proves that SWUpdate is just the trigger. I guess you can change
the block size (try 16 KB, as SWUpdate uses), and depending on the size the
bug shows up or not. But it surely says that there is something in the
kernel.

Best regards,
Stefano
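
Editor's sketch: as a small piece of arithmetic on the numbers already in this thread, the Python below places the three fault positions from the first post relative to a 16 KiB chunk (the size mentioned above) and to the 128 KiB erase block from the mtd_debug output. `locate` is a hypothetical helper. Whether the pattern means anything is not established in the thread.

```python
from typing import Tuple

FAULT_OFFSETS = [0xA582, 0xE582, 0x25001]  # positions reported in the first post
CHUNK = 16 * 1024                          # 16 KiB, the size mentioned for SWUpdate
ERASE_BLOCK = 131072                       # mtd.erasesize from mtd_debug info


def locate(offset: int, unit: int) -> Tuple[int, int]:
    """Return (index of the unit containing 'offset', offset inside that unit)."""
    return divmod(offset, unit)


for off in FAULT_OFFSETS:
    chunk_idx, in_chunk = locate(off, CHUNK)
    block_idx, in_block = locate(off, ERASE_BLOCK)
    print(f"0x{off:05X}: 16K chunk {chunk_idx} +0x{in_chunk:04X}, "
          f"erase block {block_idx} +0x{in_block:05X}")
```

The first two positions are exactly 0x4000 (16 KiB) apart and therefore sit at the same offset, 0x2582, inside consecutive 16 KiB chunks. The third sits at 0x1001 inside its chunk, so it does not follow the same spacing.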

Ayoub Zaki

Aug 27, 2025, 6:10:06 AM
to Stefano Babic, Ulrich Teichert, swupdate
This is the DTS QSPI setting that worked for me with the 6.1 Xilinx kernel:

&qspi {
    u-boot,dm-pre-reloc;
    status = "okay";
    is-dual = <1>;
    is-stacked = <1>;
    num-cs = <2>;
    has-io-mode = <1>;
    qspi-mode = <1>;
    flash@0 {
        compatible = "jedec,spi-nor";
        reg = <0x0>;
        spi-tx-bus-width = <1>;
        spi-rx-bus-width = <4>;
        parallel-memories = /bits/ 64 <0x4000000 0x4000000>; /* 128MB */
        spi-max-frequency = <50000000>;
        #address-cells = <1>;
        #size-cells = <1>;

        partition@0x00000000 {
            label = "boot";
            reg = <0x00000000 0x00A00000>;
        };
        partition@0x01000000 {
            label = "boot-recovery";
            reg = <0x00A00000 0x00A00000>;
        };
    };
};

Maybe you can give it a try and see if that solves your problem?

--
You received this message because you are subscribed to the Google Groups "swupdate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to swupdate+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/swupdate/59793de6-7f65-4930-a2fc-d3b7c33ff752%40swupdate.org.

Ulrich Teichert

Aug 27, 2025, 7:25:48 AM
to swupdate
Hi,

On Wednesday, August 27, 2025 at 12:10:06 PM UTC+2 Ayoub Zaki wrote:
> this is my DTS QSPI Setting I successfully worked with for 6.1 xilinx Kernel:

There were some differences from my DTS, but unfortunately your settings did
not solve the issue, not even when I completely replaced my node with yours.

In my DT, is-dual and is-stacked are missing because these properties are not in use
in 6.1 anymore, I think. qspi-mode = 1 is the default, if I am not mistaken.

More interestingly, I have flash@0 { reg = <0>, <1>; spi-tx-bus-width = <4>;
spi-max-frequency = <108000000>; } set, but changing it had no effect.
 
> Maybe you can give it a try if that solves your problem ?

It was worth the try, thanks for your help,
CU,
Uli

Ulrich Teichert

Aug 29, 2025, 5:01:43 AM
to swupdate
Hi,

I've noticed another odd behaviour: the bug is "sticky" on the QSPI. When I do:

    swupdate => QSPI faulty => dd into /dev/mtd0 => QSPI faulty

But if the process is:

    flash_eraseall => dd into /dev/mtd0 => QSPI OK

I can even dd multiple times into /dev/mtd0 without erasing the flash and the QSPI
state stays as is: either it's faulty or it's OK. Rebooting in between does not change a thing.

Mysterious to me,
CU,
Uli
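
Editor's sketch: one possible reading of the "sticky" behaviour, assuming that a plain write to /dev/mtd0 programs the NOR cells without erasing them. Programming NOR flash can only turn bits from 1 to 0, and only an erase brings them back to 1. The toy model below illustrates that rule in Python. The functions `program` and `erase` are hypothetical and only mimic the bit behaviour; they are not a driver or libmtd API. The model accounts for why a faulty state survives a further dd, but not for how the zeros get there in the first place.

```python
ERASED = 0xFF  # value of an erased NOR flash byte


def program(cells: bytearray, data: bytes) -> None:
    """Program 'data' over 'cells' without erasing: bits can only go 1 -> 0."""
    for i, value in enumerate(data):
        cells[i] &= value


def erase(cells: bytearray) -> None:
    """Erase: every bit goes back to 1."""
    for i in range(len(cells)):
        cells[i] = ERASED


# Two example bytes taken from the expected word 0x10B8 (byte order arbitrary).
image = bytes([0x10, 0xB8])

faulty = bytearray([0x00, 0x00])   # state observed after the swupdate run
program(faulty, image)             # "dd into /dev/mtd0" without erasing
print(faulty.hex())                # the zeros stay, nothing can set bits again

good = bytearray(image)            # state after a correct write
program(good, image)               # writing the same data again changes nothing
print(good.hex())

erase(faulty)                      # "flash_eraseall"
program(faulty, image)             # then dd
print(faulty.hex())                # now matches the image
```

In this model both observed sequences come out as reported: writing over a faulty area leaves it faulty, writing the same data over a good area leaves it good, and erase followed by a write gives the correct contents.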