On Thu, 24 Feb 2022 18:50:50 +0530, vijai kumar
<vijaikumar....@gmail.com> wrote:
> Hi Henning,
>
> On Tue, Feb 22, 2022 at 8:01 PM Henning Schild
> <henning...@siemens.com> wrote:
> >
> > Hey Vijai,
> >
> > On Tue, 22 Feb 2022 16:04:36 +0530, vijai kumar
> > <vijaikumar....@gmail.com> wrote:
> >
> > > Problem:
> > > --------
> > > We could have several CI jobs running in parallel on different
> > > nodes. One might want to consolidate and build a single base-apt
> > > from the debs/deb-srcs of all these builds.
> >
> > Can you go into more detail? I do not yet get the problem.
>
> runner 1 (Germany) -> Building de0-nano
> runner 2 (India) -> Building qemuarm
> runner 3 (US) -> Building qemuamd64
>
>
> All these builds are running on different servers.
> If we want to create a single base-apt from all these servers, we
> need to copy their debs/deb-srcs/base-apt over to a common server
> and then create a consolidated repo.
But why would you want to do that? I mean, I get why you would want
to store everything in the same location, but not why it should be
one repo.
Maybe to save some space on sources and arch:all packages ... but
there are ways of deduplicating on the filesystem or block level.
You are just risking a weird local "all" package not being so "all"
after all ... false sharing.
> This involves moving that data around.
Yes, if it is one central storage place. No matter whether it is one
"repo" or many "repos" in e.g. separate folders.
> The problem can be avoided if we have a single piece of metadata,
> produced by all these builds, with details of all the packages each
> build used.
> Basically a manifest of the build. This manifest can later be used
> to recreate the repo, which can then be hosted for these jobs.
We have a manifest for "image content" which is already fed into
clearing. It is a bill of materials and nothing else; it cannot be
used to rebuild.
Even if you had all the metadata, you would still need to store
sources and binaries somewhere reliable; whether that is central or
distributed is another story.
Pointers to anything on the internet (including all Debian repos)
will at some point stop working. So if "exact rebuilding" in a "far
away future" is what you want, mirroring is what you will need.
Partial mirroring based on base-apt, even with sources, will be shaky
and you will find yourself digging in snapshots again. But it will
work.
In the worst case you will not want an "exact rebuild" but a "fix
backported rebuild", which means you will need all build-deps
mirrored ... recursively. In fact any "package relationship", maybe
even a Conflicts, might become rebuild-relevant.
A partial mirror will not cut it; rather take a full one, so you do
not need to care about which bits to ignore and do not risk
forgetting anything.
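Just to illustrate what a full mirror could look like, here is a
rough sketch with apt-mirror (paths, suite and components are
placeholders, not a recommendation; debmirror or a plain rsync of a
mirror would do just as well):

  # /etc/apt/mirror.list
  set base_path /srv/debian-mirror
  deb http://deb.debian.org/debian bullseye main contrib non-free
  deb-src http://deb.debian.org/debian bullseye main contrib non-free
  clean http://deb.debian.org/debian

  # then run "apt-mirror" periodically, e.g. from cron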
The ideal way would be to eventually liberate snapshots from their
throttling; the short-term way is to spend some bucks on some buckets
(S3).
> Having metadata and recreating the repo is one way. There might be
> other ways as well.
I am afraid you likely cannot recreate anything if you do not keep
everything yourself or at a place you trust (snapshots?).
There have been several threads on that topic already, including how
one could help make snapshot work for debootstrap and co., coming
from reproducible builds and Qubes OS [1] [2].
If you dig deeper you will find many people offering help and
funding, but for some reason things still seem "stuck".
On top we could maybe see if we can establish something like
snapshots inside Siemens. But I guess something outside and open to
anyone would be much better.
[1] https://groups.google.com/g/isar-users/c/X9B5chyEWpc/m/nVXwZuIRBAAJ
[2] https://www.qubes-os.org/news/2021/10/08/reproducible-builds-for-debian-a-big-step-forward/
> That is where we thought about the --print-uris option of apt. It
> basically gives you the complete URL of the package, which we can
> download using wget.
> A manifest containing all the packages ever used by the build, with
> their complete URLs, could easily be used for several purposes, like
> clearing input, repo regeneration etc.
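For reference, the quoted idea could look roughly like this on the
command line (an untested sketch; "hello" is just a placeholder
package):

  # ask apt for the full URIs it would fetch and hand them to wget
  apt-get install --reinstall --print-uris -qq hello \
    | cut -d"'" -f2 \
    | wget -i -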
Maybe we can find valid reasons to extend the manifests. But URLs to
packages seem almost redundant: knowing the package names and
versions and all sources.list entries, one can generate these URLs
for any mirror, and picking just one of many mirrors would be
limiting.
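To back that up: assuming the standard Debian pool layout, the URL
can be derived from the metadata alone, roughly like this (a shell
sketch; all values are placeholders, and note that epochs are
stripped from .deb filenames):

  # pool/<component>/<prefix>/<source>/<package>_<version>_<arch>.deb
  MIRROR=http://deb.debian.org/debian
  COMP=main SRC=hello PKG=hello VER=2.10-2 ARCH=amd64
  # prefix is the first letter of the source name,
  # or "lib" plus one letter for lib* sources
  case "$SRC" in lib*) PREFIX=${SRC:0:4} ;; *) PREFIX=${SRC:0:1} ;; esac
  echo "$MIRROR/pool/$COMP/$PREFIX/$SRC/${PKG}_${VER}_${ARCH}.deb"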
And maybe there are valid reasons for having manifests even for
buildchroots. But the problem there is that they change all the time
while we still use one buildchroot. We see packages being added as
build deps all the time, but also removed when build deps conflict.
> I don't think sstate can help here. I might be wrong though.
I guess sstate will not help. It means even more storage and more
storage syncing between runners if you want to share.
Henning