On 2020-06-23 10:48, Andrew Ruthven wrote:
> My expectation is that other modules which are pulled in for the build
> have caused the breakage or change in behaviour.
>
> Perhaps the build process should use versioned dependencies on other
> modules? If that is possible of course.

The "library" dependencies (for want of a better description) are
mostly git submodules, so effectively pinned that way. But because
litex-buildenv and the hdmi2usb git repos ended up somewhat linked
(litex-buildenv was forked from the hdmi2usb one), I think some "build
dependency" changes were made for other reasons in litex-buildenv, and
then got pulled back into hdmi2usb to try to keep the two in sync.
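
One way to check whether any submodule pin moved between a "known good"
and a "bad" checkout is to save the output of `git submodule status` in
each and compare the two. The following is only a rough sketch: the
function names are made up for illustration, and it only covers
submodules, not the conda packages or download scripts.

```python
def parse_submodule_status(text):
    """Map submodule path -> commit hash from `git submodule status` output."""
    pins = {}
    for line in text.splitlines():
        # Each line is "<flag><sha1> <path> (<describe>)"; the flag is a
        # space, '-', '+' or 'U', none of which can appear in a hex hash.
        fields = line.strip().lstrip("+-U").split()
        if len(fields) >= 2:
            pins[fields[1]] = fields[0]
    return pins


def changed_pins(good_text, bad_text):
    """Return {path: (good_hash, bad_hash)} for every pin that differs."""
    good = parse_submodule_status(good_text)
    bad = parse_submodule_status(bad_text)
    return {
        path: (good.get(path), bad.get(path))
        for path in sorted(set(good) | set(bad))
        if good.get(path) != bad.get(path)
    }
```

A submodule that exists in only one of the two checkouts shows up with
None on the other side.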

The tools were effectively pinned by being packaged in conda, and by
the download scripts often pinning a specific revision. But I suspect
those too got updates pulled in via litex-buildenv changes.

The main upstream dependencies (litex related things) have had a *lot* of
refactoring in the last 6-9 months, which means any given update of those
dependencies pulls in quite a bit of change. (I spent quite a while
debugging one fallout of that in litex-buildenv on another small board.)

Ultimately I think the lack of a good regression test setup for the
Opsis (beyond "it builds", which I think is all the cloud CI can do)
probably means these issues are discovered rather late (and "production
uses", eg, at conferences, often end up sticking with a "known good"
release if they're being prepped on a tight time schedule). The
relative lack of developer attention certainly doesn't help (eg, Tim who
started the project has about 10 other projects, and what seems like 3
full time jobs as well!).

If the "breaking revision" is a one line change that seems unrelated,
the most likely cause is that the build pushed something else to be laid
out differently, which rippled through to the behaviour due to
insufficient constraints. And those bugs are hard enough to figure out in
software, let alone hardware :-)

Off the top of my head my best guess for what to do to debug it would be
to try to figure out how the video stream (over USB) is corrupted, and
see if there's some pattern in that effect that might give a hint to how
it's going wrong (eg, dropped bits/bytes, duplicated bits/bytes,
reordering, etc). But I have neither the time, nor the hardware, to do
anything about that myself.
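
As a rough sketch of what I mean by looking for a pattern: assuming you
can feed in a known test pattern and save both the expected bytes and
the bytes actually captured over USB, something like the following
would count dropped, duplicated, inserted and replaced bytes. The
function name and categories are made up for illustration, and
difflib is slow on large inputs, so run it on a short slice (eg, a
single scanline) rather than a whole frame.

```python
from difflib import SequenceMatcher


def summarise_corruption(expected: bytes, captured: bytes) -> dict:
    """Count how `captured` differs from `expected`, by kind of damage."""
    summary = {"dropped": 0, "duplicated": 0, "inserted": 0, "replaced": 0}
    matcher = SequenceMatcher(None, expected, captured, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            # Bytes present in the reference but missing from the capture.
            summary["dropped"] += i2 - i1
        elif tag == "insert":
            extra = captured[j1:j2]
            n = len(extra)
            # Extra bytes that repeat their immediate neighbours are
            # treated as duplicates rather than as unrelated garbage.
            before = captured[max(0, j1 - n):j1]
            after = captured[j2:j2 + n]
            if extra in (before, after):
                summary["duplicated"] += n
            else:
                summary["inserted"] += n
        elif tag == "replace":
            # Covers both corrupted values and local reordering.
            summary["replaced"] += max(i2 - i1, j2 - j1)
    return summary
```

If one category dominates (eg, everything is "dropped" in runs of the
same length), that at least narrows down where to look.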

Ewen