[GSoC] Template Manager: Interactions w/ Repos

114 views
Skip to first unread message

WillyPillow

unread,
Jun 30, 2020, 7:13:56 PM6/30/20
to qubes...@googlegroups.com
So to add the repo interaction functionality to the template manager, I've
spent a while reading the docs of `python-dnf`. However, it occurred to me that
as one should be able to use the manager from both dom0 and mgmtVMs, I have to
deal with both `qubes-dom0-update` and `dnf`.

As such, I'm thinking that perhaps invoking the CLI tools -- mostly `dnf
{download,repoquery}`[^1] and the `qubes-dom0-update` equivalents [^2] --
directly might be an easier way of approaching this?

Somewhat related question: is it reasonable to assume that mgmtVMs have network
access? Or do we have to use a mechanism similar to `qubes-dom0-update`
and allow downloading from another updateVM?

[^1]: Interestingly, the output of `dnf repoquery` seems to be
machine-readable.
[^2]: These actions may have to be added.

Thanks,
WillyPillow

> https://blog.nerde.pw/
>
> PGP fingerprint = 6CCF 3FC7 32AC 9D83 D154 217F 1C16 C70E E7C3 1C84
>
> Protonmail PGP = D02D CEFF ACE5 5A7B FF5D 871E 4004 1CB1 F52B 127E
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Marek Marczykowski-Górecki

unread,
Jun 30, 2020, 11:10:53 PM6/30/20
to WillyPillow, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Tue, Jun 30, 2020 at 07:21:23PM +0000, WillyPillow wrote:
> So to add the repo interaction functionality to the template manager, I've
> spent a while reading the docs of `python-dnf`. However, it occurred to me that
> as one should be able to use the manager from both dom0 and mgmtVMs, I have to
> deal with both `qubes-dom0-update` and `dnf`.
>
> As such, I'm thinking that perhaps invoking the CLI tools -- mostly `dnf
> {download,repoquery}`[^1] and the `qubes-dom0-update` equivalents [^2] --
> directly might be an easier way of approaching this?

Somehow I don't like this idea. Mostly because I consider
`qubes-dom0-update` tool a big hack that is very fragile to extend. The
dom0<->VM interface there is very loosely defined as "options like for
dnf, with few exceptions", but it happens that depending on yum/dnf
version inside of the VM, this may not also work that well - especially
if the VM is Debian-based[^3], then you get only yum+yumdownloader which
does some things differently (and also have different output format).
This flexibility in options is nice for the user, because they can use
qubes-dom0-update almost as a drop-in replacement for dnf, with all the
features. But not that good for integrating into another script as you
don't have standardized interface and switching
template/version/something can mess things up.

I recommend writing a simple set of qrexec services, which may
internally use `dnf repoquery`, but have defined interface. As for
handling dom0 and mgmt vm case - it isn't that different, in mgmt vm you
can simply call /etc/qubes-rpc/* script directly[^4], skipping qrexec
transport and keep the interface the same.

In case of managing templates, I think those services can be much
simpler than full qubes-dom0-update. If I haven't missed anything, you
need two:
1. list available templates _in a defined format_ (perhaps with some
limiting options, like which repository, or a patter for template name).
2. download a template

Since you don't really care about dependencies for template rpm package,
you don't need to send rpmdb and similar data from dom0. You just need
repository configuration (or maybe just repository url?).
As for downloading - if the service would download directly to stdout,
instead of to file that is later send over another qrexec call, you can
get nice progress indicator directly in template manager (instead of
non-machine readable bar rendered by dnf). It should be quite easy - ask
dnf for the URL, then call `curl`.

> Somewhat related question: is it reasonable to assume that mgmtVMs have network
> access? Or do we have to use a mechanism similar to `qubes-dom0-update`
> and allow downloading from another updateVM?

While in practice some MgmtVMs will have network access, I don't think
we should assume that. For example, sys-gui which is kind of MgmtVM do
not need network access (and by default it doesn't have).

> [^1]: Interestingly, the output of `dnf repoquery` seems to be
> machine-readable.
> [^2]: These actions may have to be added.

[^3]: Debian mgmt vm would be an issue regardless, as dnf nor python-dnf
is not packaged there yet (work in progress).

[^4]: One thing to be careful about - if such service would execute
qrexec call back to the caller (like qubes-download-dom0-updates.sh
does), then this also would require a special case. But this "callback"
approach can be avoided.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl77/rQACgkQ24/THMrX
1yzroQf/TRJy1e85XGN94zAOaZOWauQPGCwN/t9xiobMvNXZCBo82QipzlzfyplQ
HfYiDgSYQHosF3J0xf5R7bjnWsBnUkrejg3MHaX8xpBiaok3+IlCPrQ3nphBzbvb
pt1YwwTZQhpO7yAbnU5FWVGi/YoWUI23KzTdEB0kuiOi5BmpiD7uo1ib/aUhKNZC
Wmrc8CpiJRiKDdqXEN/40zlr7mBU70zYZldgQfa3u64eHmv2BShu5IrnuAo1pqb9
fBm4Nz0DiFSflTlQqm4WEROjufEtxV7/O0slanDTlWfwwnX+DMlmYRS7eUWvOsnw
i1yad1D2b7ZO+Nu+8JmnBFeSMXJtSA==
=lj7l
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 2, 2020, 2:57:45 PM7/2/20
to Marek Marczykowski-Górecki, qubes...@googlegroups.com
> 1. list available templates in a defined format (perhaps with some
> limiting options, like which repository, or a patter for template name).
>

> 2. download a template
>

> Since you don't really care about dependencies for template rpm package,
> you don't need to send rpmdb and similar data from dom0. You just need
> repository configuration (or maybe just repository url?).
> As for downloading - if the service would download directly to stdout,
> instead of to file that is later send over another qrexec call, you can
> get nice progress indicator directly in template manager (instead of
> non-machine readable bar rendered by dnf). It should be quite easy - ask
> dnf for the URL, then call `curl`.

I see. I've opened a PR containing a PoC of this on `core-agent-linux`. (BTW,
I've also opened PRs on the previously modified repos.) Additionally, the
design doc [gist] has been updated to describe the protocol (and include a
small progress tracker :) ).

[gist]: https://gist.github.com/WillyPillow/b8a643ddbd9235a97bc187e6e44b16e4

> > Somewhat related question: is it reasonable to assume that mgmtVMs have network
> > access? Or do we have to use a mechanism similar to `qubes-dom0-update`
> > and allow downloading from another updateVM?
>

> While in practice some MgmtVMs will have network access, I don't think
> we should assume that. For example, sys-gui which is kind of MgmtVM do
> not need network access (and by default it doesn't have).

Perhaps then we can have an option that lets the user decide whether qrexec or
DNF from the mgmtVM itself should be used? (With the default being the former.)

> > [^1]: Interestingly, the output of `dnf repoquery` seems to be
> > machine-readable.
> > [^2]: These actions may have to be added.
>

> [^3]: Debian mgmt vm would be an issue regardless, as dnf nor python-dnf
> is not packaged there yet (work in progress).

As long as the VM running the qrexec server is Fedora-based, I suppose this
would not be a big problem?

> [^4]: One thing to be careful about - if such service would execute
> qrexec call back to the caller (like qubes-download-dom0-updates.sh
> does), then this also would require a special case. But this "callback"
> approach can be avoided.

publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Marek Marczykowski-Górecki

unread,
Jul 2, 2020, 5:28:33 PM7/2/20
to WillyPillow, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Besides comments I've left there, overall looks good.
I'm not sure if splitting epoch, version and release into separate
fields makes much sense - qvm-template would treat all three together as
a "version" anyway. But it shouldn't hurt either.

> (BTW,
> I've also opened PRs on the previously modified repos.) Additionally, the
> design doc [gist] has been updated to describe the protocol (and include a
> small progress tracker :) ).

I haven an idea, maybe we can try to use github projects feature to
track this?
I've created one here:
https://github.com/orgs/QubesOS/projects/1

And added all pull requests.

> [gist]: https://gist.github.com/WillyPillow/b8a643ddbd9235a97bc187e6e44b16e4
>
> > > Somewhat related question: is it reasonable to assume that mgmtVMs have network
> > > access? Or do we have to use a mechanism similar to `qubes-dom0-update`
> > > and allow downloading from another updateVM?
> >
>
> > While in practice some MgmtVMs will have network access, I don't think
> > we should assume that. For example, sys-gui which is kind of MgmtVM do
> > not need network access (and by default it doesn't have).
>
> Perhaps then we can have an option that lets the user decide whether qrexec or
> DNF from the mgmtVM itself should be used? (With the default being the former.)

Yes, that also could be an option.

> > > [^1]: Interestingly, the output of `dnf repoquery` seems to be
> > > machine-readable.
> > > [^2]: These actions may have to be added.
> >
>
> > [^3]: Debian mgmt vm would be an issue regardless, as dnf nor python-dnf
> > is not packaged there yet (work in progress).
>
> As long as the VM running the qrexec server is Fedora-based, I suppose this
> would not be a big problem?

Yes, currently this is a requirement. Once DNF will be in Debian, it
should "just work" there too.

> > [^4]: One thing to be careful about - if such service would execute
> > qrexec call back to the caller (like qubes-download-dom0-updates.sh
> > does), then this also would require a special case. But this "callback"
> > approach can be avoided.
>
> Thanks,
> WillyPillow
>
> > https://blog.nerde.pw/
> >
> > PGP fingerprint = 6CCF 3FC7 32AC 9D83 D154 217F 1C16 C70E E7C3 1C84
> >
> > Protonmail PGP = D02D CEFF ACE5 5A7B FF5D 871E 4004 1CB1 F52B 127E





- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl7+UXsACgkQ24/THMrX
1ywEhQf/SwgoXo3cJPPeAHH8waSYDqiW4Fv0JIU2KpfpZywcC4wdSP7eNnbWH3yz
3GdRMqAynSWl/+sD6rCFIy8Vj8iH6ElCNKEpDI1oSYQcQgia7lmrUsYwUms1s0IM
Tvozx8MJyHvEePEMnz1FoPM6h/ARapYh5WgfPoSXJ3zQ+xKr0fOjk3IcokVfpco9
69jUpdYMwYJA4iblsXMhBwI+GCG/MFBr3boi+QDGKZ+bappfaUuOpl+vEt9xgZqM
UWuvX3w9A0BffrG/sm1Qn1F2NjA0z970AtE8hZ/uQQ5vmC4WNtt7zSjFE0az/i9h
aThmAjL+DINVDpYJVuJPfQGCwx3zFA==
=tIJ0
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 3, 2020, 2:16:59 PM7/3/20
to Marek Marczykowski-Górecki, qubes...@googlegroups.com
Currently qvm-template does handle version and release separately. (Though I'll
have to include the epoch later.) The reason I did this is because AFAIK,
comparing RPM versions is not exactly trivial [^rpm], and storing them
separately allows me to use `rpm.labelCompare` to do the job.

[^rpm]: <https://blog.jasonantman.com/2014/07/how-yum-and-rpm-compare-versions/>

> > (BTW,
> > I've also opened PRs on the previously modified repos.) Additionally, the
> > design doc gist has been updated to describe the protocol (and include a
> > small progress tracker :) ).
>

> I haven an idea, maybe we can try to use github projects feature to
> track this?
> I've created one here:
> https://github.com/orgs/QubesOS/projects/1
>

> And added all pull requests.

Awesome! I've also edited the titles of the PRs to make them (& the board) more
clear.

> > > > Somewhat related question: is it reasonable to assume that mgmtVMs have network
> > > > access? Or do we have to use a mechanism similar to `qubes-dom0-update`
> > > > and allow downloading from another updateVM?
> >

> > > While in practice some MgmtVMs will have network access, I don't think
> > > we should assume that. For example, sys-gui which is kind of MgmtVM do
> > > not need network access (and by default it doesn't have).
> >

> > Perhaps then we can have an option that lets the user decide whether qrexec or
> > DNF from the mgmtVM itself should be used? (With the default being the former.)
>

> Yes, that also could be an option.
>

> > > > [^1]: Interestingly, the output of `dnf repoquery` seems to be
> > > > machine-readable.
> > > > [^2]: These actions may have to be added.
> >

> > > [^3]: Debian mgmt vm would be an issue regardless, as dnf nor python-dnf
> > > is not packaged there yet (work in progress).
> >

> > As long as the VM running the qrexec server is Fedora-based, I suppose this
> > would not be a big problem?
>

> Yes, currently this is a requirement. Once DNF will be in Debian, it
> should "just work" there too.

Also, after an initial version of repo handling is finished, should I put
`qvm-template` in `qubesadmin/tools` or somewhere else?
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Marek Marczykowski-Górecki

unread,
Jul 3, 2020, 3:41:40 PM7/3/20
to WillyPillow, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Fri, Jul 03, 2020 at 06:16:44PM +0000, WillyPillow wrote:
> On Friday, July 3, 2020 5:28 AM, Marek Marczykowski-Górecki <marm...@invisiblethingslab.com> wrote:
> > Besides comments I've left there, overall looks good.
> > I'm not sure if splitting epoch, version and release into separate
> > fields makes much sense - qvm-template would treat all three together as
> > a "version" anyway. But it shouldn't hurt either.
>
> Currently qvm-template does handle version and release separately. (Though I'll
> have to include the epoch later.) The reason I did this is because AFAIK,
> comparing RPM versions is not exactly trivial [^rpm], and storing them
> separately allows me to use `rpm.labelCompare` to do the job.
>
> [^rpm]: <https://blog.jasonantman.com/2014/07/how-yum-and-rpm-compare-versions/>

Ok.

> Also, after an initial version of repo handling is finished, should I put
> `qvm-template` in `qubesadmin/tools` or somewhere else?

Yes, that's exactly where the tool should land.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl7/iesACgkQ24/THMrX
1yy5fQf9ETJ8FQM43NwAZh3nys5Vz81Edxj6czHZTnRwJOqvlITJTqP5MoyCYbnE
A6yI0X+E/5TOy9bftzoiOTiL3BETCjAYm3kPX1qlxD8PQuj+PmBlpwvQLSC6Bz/c
6sHMGKgW8+nQF+yglNqZm9l1NznhK7QWpsq5/eNMSgZPzEtb2Rzez2LlDBgXXbxZ
RkS5SDnyViAdP2XxDOGKjy5hx1tpzWCf0knUryKZkU42pBMhRQkQPaOCY3s12w/f
Z3k+eV2sLWX43zatd8jTcI9F8oPkTIa4m99D5Yq7FH5sGfq11gtyhGK4K5HZwi2/
LMw6yCBj1u5aee/PrchGnM0YQMEBiA==
=PfKb
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 6, 2020, 2:38:36 PM7/6/20
to Marek Marczykowski-Górecki, qubes...@googlegroups.com
I'm almost finished with the list/download/install operations, and should be
able to commit the code in the following days. [^1] That being said, in
mgmtVMs, I currently don't have a means to get the repo config files, as they
seem to reside only in the qubes-release package. Similarly, I'm unsure how to
get the Qubes release from mgmtVMs. Perhaps a package similar to qubes-release
is needed in VMs? (Or is there a simpler solution?)

[^1]: Unfortunately this also means that I'm a bit behind on my original
schedule. Hopefully I'm able to catch up soon.
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Marek Marczykowski-Górecki

unread,
Jul 6, 2020, 8:57:47 PM7/6/20
to WillyPillow, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Mon, Jul 06, 2020 at 06:38:30PM +0000, WillyPillow wrote:
> I'm almost finished with the list/download/install operations, and should be
> able to commit the code in the following days. [^1] That being said, in
> mgmtVMs, I currently don't have a means to get the repo config files, as they
> seem to reside only in the qubes-release package.

Oh, that's right. It would be possible to split repo definitions into
separate sub-package, but it's a bit tricky, because we can't add the
whole qubes-release package into repository for a VM - that would break
the template (replace fedora-release with qubes-release).
And also, I don't think template repositories should be added to dnf in
vm.

Perhaps a better solution would be to create new separate package with
template repository definitions only and have it installable in both
dom0 and VM. And have those definitions outside of /etc/yum.repos.d, so
yum/dnf wouldn't use them normally (/usr/share/qubes/repo-templates?),
but only when called by qvm-template with options specifically pointing
at those files.

What do you think?

> Similarly, I'm unsure how to
> get the Qubes release from mgmtVMs. Perhaps a package similar to qubes-release
> is needed in VMs? (Or is there a simpler solution?)

There is /usr/share/qubes/marker-vm file for this reason. It isn't 100%
accurate, as you can restore template from older system on a newer one.
But I think it should be good enough for this use case. If you want to
manage templates on Qubes 4.1, use VM with packages from Qubes 4.1.

> [^1]: Unfortunately this also means that I'm a bit behind on my original
> schedule. Hopefully I'm able to catch up soon.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl8DyIIACgkQ24/THMrX
1ywBuQf+PKQsV5OK1w6GBYLXwNscYRKRpvzC+lBdug5GT09yXe7yawNXJpWz5OCJ
+ZMDwwcLcPyb64Ob29Hdf2LFILJVK1mBxUw8RcLNYto4xAHEYvJQr3VSnFW9Jbv2
7oxU9S3TrybPdswNR1DNlTyH/6Frq0ozHlKVYaZXfXA0KHZ3ZkaD6x/D91k8dI6M
vmTpPteKNN8DvXAjeh+r+VPxMiina93lnJLxfJPe7Euk/QCqtD8i7sneEeTrEl5n
MORbtuxz8OVNbjnCT3X+18ydsn3NES+GQ4nRfS9mp89Pijko9nzxVbP8w8+g4xS5
z6cWx9YIteCUgcPhkmZLdGIHGPGynw==
=NTs4
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 8, 2020, 2:57:11 PM7/8/20
to Marek Marczykowski-Górecki, qubes...@googlegroups.com
On Tuesday, July 7, 2020 8:57 AM, Marek Marczykowski-Górecki <marm...@invisiblethingslab.com> wrote:

> On Mon, Jul 06, 2020 at 06:38:30PM +0000, WillyPillow wrote:
>

> > I'm almost finished with the list/download/install operations, and should be
> > able to commit the code in the following days. [^1] That being said, in
> > mgmtVMs, I currently don't have a means to get the repo config files, as they
> > seem to reside only in the qubes-release package.
>

> Oh, that's right. It would be possible to split repo definitions into
> separate sub-package, but it's a bit tricky, because we can't add the
> whole qubes-release package into repository for a VM - that would break
> the template (replace fedora-release with qubes-release).
> And also, I don't think template repositories should be added to dnf in
> vm.
>

> Perhaps a better solution would be to create new separate package with
> template repository definitions only and have it installable in both
> dom0 and VM. And have those definitions outside of /etc/yum.repos.d, so
> yum/dnf wouldn't use them normally (/usr/share/qubes/repo-templates?),
> but only when called by qvm-template with options specifically pointing
> at those files.
>

> What do you think?

Originally I thought of creating another version of the package only containing
the repo definitions, just for VMs, but this is indeed a simpler solution. I'll
try to create such a package in the following days.

> > Similarly, I'm unsure how to
> > get the Qubes release from mgmtVMs. Perhaps a package similar to qubes-release
> > is needed in VMs? (Or is there a simpler solution?)
>

> There is /usr/share/qubes/marker-vm file for this reason. It isn't 100%
> accurate, as you can restore template from older system on a newer one.
> But I think it should be good enough for this use case. If you want to
> manage templates on Qubes 4.1, use VM with packages from Qubes 4.1.
>

> > [^1]: Unfortunately this also means that I'm a bit behind on my original
> > schedule. Hopefully I'm able to catch up soon.

I've committed an initial version of the {list,download,install,remove,clean}
operations to core-admin-client. Admittedly, I have yet to test it thoroughly
in mgmtVMs (most of the tests were done in dom0), though as the only major
difference should be the qrexec calls, I should be able to finish that soon.

The remaining operations (reinstall, ...etc) should also be easier to implement
as they are similar to the existing ones.
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Marek Marczykowski-Górecki

unread,
Jul 8, 2020, 11:11:41 PM7/8/20
to WillyPillow, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Wed, Jul 08, 2020 at 06:57:05PM +0000, WillyPillow wrote:
> On Tuesday, July 7, 2020 8:57 AM, Marek Marczykowski-Górecki <marm...@invisiblethingslab.com> wrote:
> > Perhaps a better solution would be to create new separate package with
> > template repository definitions only and have it installable in both
> > dom0 and VM. And have those definitions outside of /etc/yum.repos.d, so
> > yum/dnf wouldn't use them normally (/usr/share/qubes/repo-templates?),
> > but only when called by qvm-template with options specifically pointing
> > at those files.
> >
>
> > What do you think?
>
> Originally I thought of creating another version of the package only containing
> the repo definitions, just for VMs, but this is indeed a simpler solution. I'll
> try to create such a package in the following days.

Take a look at https://github.com/QubesOS/qubes-meta-packages
You can use this as a skeleton, or perhaps even add repo definition files
there (not sure about that).

> > > [^1]: Unfortunately this also means that I'm a bit behind on my original
> > > schedule. Hopefully I'm able to catch up soon.
>
> I've committed an initial version of the {list,download,install,remove,clean}
> operations to core-admin-client.

:)

> Admittedly, I have yet to test it thoroughly
> in mgmtVMs (most of the tests were done in dom0), though as the only major
> difference should be the qrexec calls, I should be able to finish that soon.

I have left some comments in that pull request.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl8GiuUACgkQ24/THMrX
1yzG1gf/ehMQsw0a/QIcET7hiXdAcVZG/ARjRwW2480zYLIhChUL8Jf744wIcJQs
+qX/f3RkIyjPOUGzvn1d0504DEscYyUOGck3u9GkJasESYZEP4Ex7m+gqhLjFDcr
mF4EEirTEtL4tUAUiN4Y5xqAuSuW5TojzXXf5SBeiHun2Ypz1nCPKTC2ex4/gkPH
R4IlAwoxyhXCDgZsHoXybCirUce4U4kgYA7hTlOytY3v/81JZEfHxBqbLLa4y1mc
lDHR+qrekIVhCYeKuYjp7WR71aomD/by8/PGRTPzmHrt5HERm5DFHihf4vjK/rtr
0ZxCB0lBlmNVKz+yveqtl3DjoWY5hw==
=v7af
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 16, 2020, 2:21:20 PM7/16/20
to Marek Marczykowski-Górecki, woju, qubes...@googlegroups.com
On Thursday, July 9, 2020 11:11 AM, Marek Marczykowski-Górecki <marm...@invisiblethingslab.com> wrote:

> On Wed, Jul 08, 2020 at 06:57:05PM +0000, WillyPillow wrote:
>

> > On Tuesday, July 7, 2020 8:57 AM, Marek Marczykowski-Górecki marm...@invisiblethingslab.com wrote:
> >

> > > Perhaps a better solution would be to create new separate package with
> > > template repository definitions only and have it installable in both
> > > dom0 and VM. And have those definitions outside of /etc/yum.repos.d, so
> > > yum/dnf wouldn't use them normally (/usr/share/qubes/repo-templates?),
> > > but only when called by qvm-template with options specifically pointing
> > > at those files.
> >

> > > What do you think?
> >

> > Originally I thought of creating another version of the package only containing
> > the repo definitions, just for VMs, but this is indeed a simpler solution. I'll
> > try to create such a package in the following days.
>

> Take a look at https://github.com/QubesOS/qubes-meta-packages
> You can use this as a skeleton, or perhaps even add repo definition files
> there (not sure about that).

An initial version of the package is now at
<https://github.com/WillyPillow/qubes-repo-templates>.

Also, for the completeness of this thread, here are some things I've completed
since the last email (as the discussions mostly happened on Github):

- <https://github.com/QubesOS/qubes-issues/issues/5946>
- {reinstall,{down,up}grade} operations in qvm-template (Discussion:
<https://github.com/QubesOS/qubes-core-admin-client/pull/145>; also contains
improvements to other operations)

Next, I suppose I will start implementing the remaining operations and finish
off the TODOs in qvm-template.
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Wojtek Porczyk

unread,
Jul 17, 2020, 5:56:24 AM7/17/20
to WillyPillow, Marek Marczykowski-Górecki, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Thu, Jul 16, 2020 at 06:21:12PM +0000, WillyPillow wrote:
> An initial version of the package is now at
> <https://github.com/WillyPillow/qubes-repo-templates>.

Just from cursory look, this also needs either PGP keys included or
a dependency on a package providing those keys. Right now (4.0) those are from
qubes-release and also needed in dom0 (esp. -primary are used for usual dom0
packages).

Maybe the -primary key and the key for siging ITL templates should be
separated? Would that be more convenient?


- --
pozdrawiam / best regards
Wojtek Porczyk
Graphene / Invisible Things Lab

I do not fear computers,
I fear lack of them.
-- Isaac Asimov
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEaO0VFfpr0tEF6hYkv2vZMhA6I1EFAl8RdcIACgkQv2vZMhA6
I1HHcxAAiiu/BwSl1Hx7eJAWZWHF0ax4V5umrnlCDi6MBI0giLaCml9kLyWh1JI9
FN6u7Aqi5+UUSLRgxf7kxWMo2b+W55dfvQp7/ZHD1yNMdr0jf84VxRSQGxfL5usG
HH2/wRe2jMJwkcnqtGZ4F/Jn5dIee/3jJDL60YV9lMuMJYmiI9cLeTxMiMEsweZL
ZI1rfKAUHYvF25mKiPxVfqxaMrnszljWmRlz+m0yKUtViRysFGi2pCvP8gyf6GZH
lPAyaE87MYZ6ASlAvOZrvKaVP0w/gyrdPBTGSSvy/1efHxymHwk1OJip6bIBJWSR
FiDqWeQQm5N13SmiHFWDG+j2KWxYcavecHImUIjRuzU7UL4wUEQFZYPYokNe4p8r
BlajA3VN+PWnxAqZVGJWigm5dxNZbENXdBrKVpJ4J4VlR5zWrL+BRqmexpxYca0J
LpuVqzB2whtaEJ0RuTuZs2/wbzsyuUSspNzGuAr9kLpH6N0iNde4MHG5D6FC/1Uz
8s3eplnAzTSSmFvQQ2kdJfFqjGXe17WrEJ/iOg1MzlhGfqUFrKXmyJ2RmK+2fZgV
zVSZh6E1MpH/QXCqLIc05RkLtpgzm4RFteZygMySZhjnLMmPsKzqlmEsSoevOrVp
2FYgl2xuNE/HfdcCDVDdoGToDUSuzEDuzF+U1kpJcy3oF9PjYvE=
=xrIj
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 18, 2020, 4:04:41 AM7/18/20
to wo...@invisiblethingslab.com, marm...@invisiblethingslab.com, qubes...@googlegroups.com
On Friday, July 17, 2020 5:56 PM, Wojtek Porczyk <wo...@invisiblethingslab.com> wrote:

> ​​On Thu, Jul 16, 2020 at 06:21:12PM +0000, WillyPillow wrote:
>
> > An initial version of the package is now at
> > https://github.com/WillyPillow/qubes-repo-templates.
>
> Just from cursory look, this also needs either PGP keys included or
> a dependency on a package providing those keys. Right now (4.0) those are from
> qubes-release and also needed in dom0 (esp. -primary are used for usual dom0
> packages).

I see. I'll fix this in a future commit.

> Maybe the -primary key and the key for siging ITL templates should be
> separated? Would that be more convenient?

I'm a bit unsure about this. Whether the -primary key or another key is used, isn't it the case that two files in two separate repos still need to be maintained anyway?
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Wojtek Porczyk

unread,
Jul 18, 2020, 3:02:53 PM7/18/20
to WillyPillow, marm...@invisiblethingslab.com, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Sat, Jul 18, 2020 at 08:04:35AM +0000, WillyPillow wrote:
> On Friday, July 17, 2020 5:56 PM, Wojtek Porczyk <wo...@invisiblethingslab.com> wrote:
> > Maybe the -primary key and the key for siging ITL templates should be
> > separated? Would that be more convenient?
>
> I'm a bit unsure about this. Whether the -primary key or another key is
> used, isn't it the case that two files in two separate repos still need to
> be maintained anyway?

They need, and that's the point. So they may be two different key as well, not
just a copy of the same key. We can leave the -primary key in qubes-relase
in dom0 and have Marek generate another key for ITL templates.

Keypairs are cheap [1], so unless I missed something, I'd say this is
preferable solution to two others, which would be just more complicated to
mainain:
- - having two copies of the same key (we risk they desynchronise),
- - we have another package just for the -primary key (more packages to
maintain).


[1] If there is sufficient automation around crypto, but the template build
environment is already automated (there are two of them, as reflected by
- -primary and -community keys), so this is non-issue.


- --
pozdrawiam / best regards
Wojtek Porczyk
Invisible Things Lab

I do not fear computers,
I fear lack of them.
-- Isaac Asimov
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEaO0VFfpr0tEF6hYkv2vZMhA6I1EFAl8TR1QACgkQv2vZMhA6
I1FBXw//dujVG+f+DQo066d5NJAHJhyzK6McNX4QcE5v6J0pExkEc8EcKU+edSdw
b7YNU5WV2cR+5kororOfPBfC0pEatPswnrGAcujiz/nEISzwnp7DeGy92fAdGZ8X
+6lAlhNrAhytTujp/xejEuCo8/8V61UcOSJbW/o7Zrun4LmOzk9/D/gEvspew9eR
mydnvKZQlNjOlNNzz3IO/6wZ/stROIW3MJl4IrNxnQGIR1CRAIU5M799SbIVZC8Y
jHOyI7n2dJ7AQuxI7hpP47E+ZGE1TErHIYjZSVxYlV795oMKeJUqUzI5YBpygGRm
U65c75AqZnsdF0Lh2Yv9tt+gd7A8LKw2yFEFtehY4DNmNId55/nMTG5b0X9Y5som
4SIGwloj20ZiHvRiSuZD0GnYG3GOJ64+5xYRwWmOVWLayAM3E3pbVu7O/mEjk05b
ytI4UQt+ls0qza00GdGKIKihyxcZuaDJCD08xpWYSxUZH8JxsdOL2QEs+p7UgcoT
2y+qXtiAtIQIE8+vJ+qi76mNmVfJ3KgkwZdkprXCNOzSZkxEtQDcpGUBd9z+slZT
QcaliyHe0HT7Th1n3Zueva043J/VIfVNWeNSKLHyf+52q3JPHgKSu7b/0VYlcOuy
B4iE92tBY1ZIL4MW2+sWSZx053LSpU0mY6XXbHSBl3ysfo/xG7E=
=1WOc
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 19, 2020, 2:20:16 PM7/19/20
to Wojtek Porczyk, marm...@invisiblethingslab.com, qubes...@googlegroups.com
On Sunday, July 19, 2020 3:02 AM, Wojtek Porczyk <wo...@invisiblethingslab.com> wrote:

> On Sat, Jul 18, 2020 at 08:04:35AM +0000, WillyPillow wrote:
>

> > On Friday, July 17, 2020 5:56 PM, Wojtek Porczyk wo...@invisiblethingslab.com wrote:
> >

> > > Maybe the -primary key and the key for siging ITL templates should be
> > > separated? Would that be more convenient?
> >

> > I'm a bit unsure about this. Whether the -primary key or another key is
> > used, isn't it the case that two files in two separate repos still need to
> > be maintained anyway?
>

> They need, and that's the point. So they may be two different key as well, not
> just a copy of the same key. We can leave the -primary key in qubes-relase
> in dom0 and have Marek generate another key for ITL templates.
>

> Keypairs are cheap [1], so unless I missed something, I'd say this is
> preferable solution to two others, which would be just more complicated to
> mainain:
>

> - having two copies of the same key (we risk they desynchronise),

Ah, I see what you mean. Taking this concern into consideration, having another
key for this may indeed be a better solution.

For the time being, the -primary and -community keys are placed in the package.
Swapping them out for dedicated keys in the future should be fairly easy if
needed.

> - we have another package just for the -primary key (more packages to
> maintain).
>

> [1] If there is sufficient automation around crypto, but the template build
> environment is already automated (there are two of them, as reflected by
> -primary and -community keys), so this is non-issue.
>


On another note, I'm wonder which fields are needed in the output of the "info"
operation. Comparing my WIP code to DNF, I currently do not have the
architecture [2], URL, licence, and description fields. This is due to
`qubes.TemplateSearch` not currently returning those fields.

For the packages in the official repos, those fields do not contain much
information (in particular, the description field contains the same information
as the summary), though I'm not sure if they might be useful in the future or
for unofficial templates.

One tricky thing is that the description may contain newlines, while `dnf
repoquery` does not escape them at all [3]. This may mean that another method
of querying the repo needs to be considered if the description is included. (Or
use unconventional characters/strings as separators. In particular, I can't
pass NULL characters in the arguments to DNF for obvious reasons.)

[2]: Probably not needed unless Qubes becomes available on other architectures.

[3]: Speaking of which, I'm also unsure what would happen if newlines appear
in, say, the summary field. Maybe I can conduct some experiments about this...
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Wojtek Porczyk

unread,
Jul 20, 2020, 4:52:38 AM7/20/20
to WillyPillow, marm...@invisiblethingslab.com, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Sun, Jul 19, 2020 at 06:20:06PM +0000, WillyPillow wrote:
> For the time being, the -primary and -community keys are placed in the package.
> Swapping them out for dedicated keys in the future should be fairly easy if
> needed.

Sure.

> On another note, I'm wonder which fields are needed in the output of the "info"
> operation. Comparing my WIP code to DNF, I currently do not have the
> architecture [2], URL, licence, and description fields. This is due to
> `qubes.TemplateSearch` not currently returning those fields.
>
> For the packages in the official repos, those fields do not contain much
> information (in particular, the description field contains the same information
> as the summary), though I'm not sure if they might be useful in the future or
> for unofficial templates.

I don't know, could you design that so that those can be added in the future
if need be? Those need to be understood and properly validated, because some
software might act upon that information. For example: Debian provides
a directory with licence texts, which allows for
/usr/share/common-licenses/$license, which smells path traversal.
Fedora's RPM guidelines is even worse, they support operators like "()",
"and", "or":
https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/

> [2]: Probably not needed unless Qubes becomes available on other architectures.

That's a possibility, Xen supports ARM and I think we'll see more
desktops/laptops on ARM in the future. But we currently don't have such plans
even on a roadmap.


> One tricky thing is that the description may contain newlines, while `dnf
> repoquery` does not escape them at all [3]. This may mean that another method
> of querying the repo needs to be considered if the description is included. (Or
> use unconventional characters/strings as separators. In particular, I can't
> pass NULL characters in the arguments to DNF for obvious reasons.)

Yes, another qrexec call is OK, provided this won't be too slow, i.e., to
display a list of 15 templates available you won't refresh the repo cache
15 times...

> [3]: Speaking of which, I'm also unsure what would happen if newlines appear
> in, say, the summary field. Maybe I can conduct some experiments about this...

Certainly.


- --
pozdrawiam / best regards
Wojtek Porczyk
Invisible Things Lab

I do not fear computers,
I fear lack of them.
-- Isaac Asimov
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEaO0VFfpr0tEF6hYkv2vZMhA6I1EFAl8VW04ACgkQv2vZMhA6
I1HeeQ//XxVfMG3n1MCMR4saAnLL9qSQdrle0KbbJd0NOV14q8eLFM/no+OiDmf7
TtxZukO6VTqrW1S4yNVoKpexAL9JfbUZTxCP3YyFuU6EFMpm1XjW1y5Io89v0bXg
bq2QCVmybiPsGIAcX//y8ug6ucplm79z0um1LMDOlnmfdnW5ktwH4aL56BknON8T
2FndPpFr/9Z7QSqpoSkYykLh/RWRZqKqfEcrHEzs5RLaCnU84mMCmUWQ4yuJwaKE
nceorgqMSBSPLQUQukjg8sW5NN1mhDxESpE29+8/Q59ilo6UsMRRpCJLUwu3oI9i
TUcp+hXhR4UaOBa5Z7IAe5Ne5cogCd1lw6kM3rdz0bvn45qYoJ8FJKBe7G4uibqT
+loM/IbP88fSl0+0sMvWANiMrIXyB/l7G7QZfY9XEAoae2TzgHPVHPgJ5t3wlA2p
EsK9kqUaQwc106u+Xh/vTt86K+KVY3/mfGUMV7gdrBaXNr37sy6HqapkdEAD70Bb
2lL/cXAw9TGhX48WULeA1nxaGncfQFMA4DYvJWaLIAsmDrxwdEk7dlSDO6j6VM8o
ntWYaLeHBEPt0VGVNf+j8WrlPXQ1faaOtAwR1UlX2sr37v1hzAUj07Tf+s9tnqbP
egZ7h2nWW3uJHRG54LRNeViPCLA9jdKLQ2Fw+j4cX7H+ZnuNhkE=
=Ivdc
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 21, 2020, 2:32:09 PM7/21/20
to Wojtek Porczyk, marm...@invisiblethingslab.com, qubes...@googlegroups.com
On Monday, July 20, 2020 4:52 PM, Wojtek Porczyk <wo...@invisiblethingslab.com> wrote:

> On Sun, Jul 19, 2020 at 06:20:06PM +0000, WillyPillow wrote:
>

> > On another note, I'm wonder which fields are needed in the output of the "info"
> > operation. Comparing my WIP code to DNF, I currently do not have the
> > architecture [2], URL, licence, and description fields. This is due to
> > `qubes.TemplateSearch` not currently returning those fields.
> > For the packages in the official repos, those fields do not contain much
> > information (in particular, the description field contains the same information
> > as the summary), though I'm not sure if they might be useful in the future or
> > for unofficial templates.
>

> I don't know, could you design that so that those can be added in the future
> if need be? Those need to be understood and properly validated, because some
> software might act upon that information. For example: Debian provides
> a directory with licence texts, which allows for
> /usr/share/common-licenses/$license, which smells path traversal.
> Fedora's RPM guidelines is even worse, they support operators like "()",
> "and", "or":
> https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/

To include the information, all that is needed is to include it in the
`queryformat` string of the qrexec call and read it in `qvm-template` (Aside
from dealing with the newline issue [4] -- mentioned below.)

As for validation, on our end, as we (AFAICT) only need to display the entry as
is and don't need to parse it, we should be fine.

For other programs parsing the information, IMO it should be up to the program
in question to sanitize its inputs. This is because the fields contain, by
design, arbitrary text, and my understanding of the linked article is that it's
merely a guideline for Fedora repos instead of RPMs in general.

> > One tricky thing is that the description may contain newlines, while `dnf repoquery` does not escape them at all [3]. This may mean that another method
> > of querying the repo needs to be considered if the description is included. (Or
> > use unconventional characters/strings as separators. In particular, I can't
> > pass NULL characters in the arguments to DNF for obvious reasons.)
>

> Yes, another qrexec call is OK, provided this won't be too slow, i.e., to
> display a list of 15 templates available you won't refresh the repo cache
> 15 times...

(Speaking of refreshing the repo cache, a `--refresh` option that forces this
may need to be added, either as an option to the existing qrexec calls or as
another call.)

Creating another qrexec call is an interesting idea, but I'm unsure if it's
really feasible performance-wise. In particular, running `dnf info` (which does
not refresh the cache by default) on any package in `qubes-template-itl` takes
almost a second on my machine.

What I meant by "another method" is actually an alternative to `dnf repoquery`.
The issue is that (AFAIK) DNF provides no methods other than `repoquery` to
obtain a machine-readable form of the information (short of using the API from
Python). However, it has the issue when dealing with newlines.

IMO, the easiest way of doing this in terms of code changes is to modify
`repoquery` so that it allows expanding `\0` as null characters. Currently, it
already does similar replacements with `\n` and `\t`, and adding `\0` should
only be a one-line change.

However, I imagine that it would not be ideal to maintain a patched version of
a package in VMs. Even if the patch gets merged upstream, it'll probably still
take a few months for it to land in the next Fedora release.

Other solutions I can think of right now are:

- Maintain a separate modified copy of `repoquery.py`. From a first glance, it
seems to be fairly self-contained, and the only internal thing it calls,
AFAICT, is for i18n, which we don't care about in this context.
- Write a simplified version of `repoquery` using the DNF API. Probably not too
hard but feels like reinventing the wheel.

> > [3]: Speaking of which, I'm also unsure what would happen if newlines appear
> > in, say, the summary field. Maybe I can conduct some experiments about this...
>

> Certainly.

Unfortunately, newlines in the summary field *does* break things. In
particular, besides malformed repo XML files, a malformed package with newlines
in the summary can also introduce erroneous newlines in the output of
`repoquery`.


Also, initial implementations of `qvm-template {info,search}` along with some
other changes have been added to [the PR to
core-admin-client](https://github.com/QubesOS/qubes-core-admin-client/pull/145).

[4]: Technically, there may also be issues with the colon separator we
currently use, though it's in essence the same issue.
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Wojtek Porczyk

unread,
Jul 22, 2020, 6:50:45 AM7/22/20
to WillyPillow, marm...@invisiblethingslab.com, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

That would be in ideal world, unfortunately many developers can't be bothered
to do it properly. That's why Qubes exists in the first place.

> This is because the fields contain, by
> design, arbitrary text, and my understanding of the linked article is that it's
> merely a guideline for Fedora repos instead of RPMs in general.

Can't we make our own guidelines? Like explicit whitelist of known licences
and maybe "other"? Linux templates are mostly under GPL, unikernels can have
other licences, but there aren't many unikernels. Unless I'm missing
something, known-good values would be easy to enumerate and check.

Also, some checking probably applies also to name and description: they
shouldn't have fancy characters, because UI toolkits might support things
like pango markup or whatever.
> [4]: Technically, there may also be issues with the colon separator we
> currently use, though it's in essence the same issue.

I wouldn't worry much about either newlines or additional colons generated on
the untrusted side of qrexec call. I'd worry about proper checking of the
untrusted input. Yes, I can make .rpm package with colon and newline in some
metadata fields. So what? The template manager on the trusted side should just
reject any malformed input from the untrusted side.

Now if it would be easy, you could also make some provisions to also reject
such malformed input inside the downloading vm, but it's not as critical as in
dom0. That's why I certainly wouldn't patch dnf, because effort to gains ratio
doesn't look promising, and I'd only fork the the repoquery if it would look
easy to maintain. That ':'.join(['%{something}']) formatstring looks good
enough for PoC.


> Also, initial implementations of `qvm-template {info,search}` along with some
> other changes have been added to [the PR to
> core-admin-client](https://github.com/QubesOS/qubes-core-admin-client/pull/145).

Acked, thanks.



- --
pozdrawiam / best regards
Wojtek Porczyk
Invisible Things Lab

I do not fear computers,
I fear lack of them.
-- Isaac Asimov
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEaO0VFfpr0tEF6hYkv2vZMhA6I1EFAl8YGfwACgkQv2vZMhA6
I1HWQQ//UtvsL3ACQClT/0eeTFA9qqrYsmgnRF/yOtIPZOB3FbNMklrwK70arfbE
RGK1Iq8697mBtrBH9yishkTiVKN7FeetW8jkLEaA1hLgLYTc+TuL8EgrXoDT0CoZ
cEDcFYTs4zX7rvmIOM0ZGq5JOSOQcRWf6ZNuRr0+GnpYNbYN83s7Fi40freogizk
HFzVRsx4BsC9nKLyE3nAAIcTUt0L/GhZ3lgRhUWTZdYRCrYYSFuF5dzAl3Aacovz
3pVjfgdLU3e0JAbFUTI0MRENtd+XIVHm5MBpqlM1zp1fMZNtDeRGSppqOPWxKU6H
JbrO9AXeiMdzKjdZlB6QZ81gNVBSgsQU2AtYLkvHcx8BxH4axzeQZMiUc3SyXLn2
AsMo0l0IBlj+mjF/FVxBJ3YDQfDS7r5T3ile1NEOvMATjn2C5AlkGAgUM+kERE5U
uobNqb8bPkNkAIL+U7uAox+CXqDMvQ4+InJk7wRKKbj4dPVPSA8KCCirxhF0AtpH
ub8ZWihh6QhtBU8G8By8fMAz4VvEV7oi0rZOYaIGrNoOGutP3KiPB66M7OKA2QG6
zAMdQ2aKn/ODxFncqj4WA2djGEB7ewIYCL4m57CUp1O7tXBukKxtH+GWSckdkx8p
8J5bQfuRGr6WwLkvdQUTYlMDVO8bk7nQ/zZ1OGuwMDGQj9/izgM=
=GlEM
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 23, 2020, 1:46:11 PM7/23/20
to Wojtek Porczyk, marm...@invisiblethingslab.com, qubes...@googlegroups.com
I see. A whitelist approach indeed seems to work in this case. The Fedora
article also links to a page containing a list of license short names we can
start from.

> Also, some checking probably applies also to name and description: they
> shouldn't have fancy characters, because UI toolkits might support things
> like pango markup or whatever.

Currently we're already blocking non-ASCII characters and escape sequences. For
names, constraining the character set to, e.g., something like
`[A-Za-z0-9_.+\-]` is pretty doable, though for the summary/description, I'm
unsure if anything more can be done without sacrificing usability, as, for
instance, it's not unimaginable for them to contain stuff like `<>`.

The roadmap also contains plans for `qvm-template` to provide machine-readable
output (as opposed to the current user-facing format). Maybe another way of
approaching this is to provide facilities for escaping the output for
downstream applications?
> > Unfortunately, newlines in the summary field does break things. In
> > particular, besides malformed repo XML files, a malformed package with newlines
> > in the summary can also introduce erroneous newlines in the output of
> > `repoquery`.
>

> > [4]: Technically, there may also be issues with the colon separator we
> > currently use, though it's in essence the same issue.
>

> I wouldn't worry much about either newlines or additional colons generated on
> the untrusted side of qrexec call. I'd worry about proper checking of the
> untrusted input. Yes, I can make .rpm package with colon and newline in some
> metadata fields. So what? The template manager on the trusted side should just
> reject any malformed input from the untrusted side.
>

> Now if it would be easy, you could also make some provisions to also reject
> such malformed input inside the downloading vm, but it's not as critical as in
> dom0. That's why I certainly wouldn't patch dnf, because effort to gains ratio
> doesn't look promising, and I'd only fork the the repoquery if it would look
> easy to maintain. That ':'.join(['%{something}']) formatstring looks good
> enough for PoC.

One issue is that from the qrexec client side it is basically impossible to
distinguish between the two. (Consider the case where a field contains
`xxx\na:b:c`.)

Security-wise, this is unlikely to cause issues as an entity that can do this
can probably modify the repo contents directly.

However, if the repo, by accident, does contain packages with, say, colons in
summaries, it may be an issue usability-wise as it's hard to give meaningful
error messages when things break.

There's also the original issue with descriptions (assuming that we don't omit
them), which contains newlines a lot of the time.

That being said, if we treat such errors as "repo errors" and leave to the repo
maintainers to ensure that the fields follow a certain format, then we can just
use a special character for the separator [5] and ban the character from the
fields.

[5]: The separator may also need to be placed at the end of the format string.

> > Also, initial implementations of `qvm-template {info,search}` along with some
> > other changes have been added to the PR tocore-admin-client.
>

> Acked, thanks.

I'm mainly working on the TODOs in the code currently. After this (and other
stuff in the discussion above) is finished, I'll probably continue to work on
the "Other Possible Features" in the design doc. Is there any particular order
you'd prefer me to proceed in?
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Wojtek Porczyk

unread,
Jul 24, 2020, 12:18:34 PM7/24/20
to WillyPillow, marm...@invisiblethingslab.com, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Thu, Jul 23, 2020 at 05:45:56PM +0000, WillyPillow wrote:
> One issue is that from the qrexec client side it is basically impossible to
> distinguish between the two. (Consider the case where a field contains
> `xxx\na:b:c`.)

If there are more colons that there are supposed to be, there is no need to
distinguish anything anymore, just error out for "malformed input" or
something.

In Python I like to do it with tuple assingment:

try:
field1, field2, field3 = untrusted_line.split(':')
except TypeError:
raise ParseError('error message')

It's as simple as that. The big advantage is that there aren't many ways to do
something wrong.

> Security-wise, this is unlikely to cause issues as an entity that can do this
> can probably modify the repo contents directly.

The point is, we don't know. The repo content is untrusted, and yes, attacker
can modify it. What counts is signature on RPM.

> However, if the repo, by accident, does contain packages with, say, colons in
> summaries, it may be an issue usability-wise as it's hard to give meaningful
> error messages when things break.

"Malformed input" is OK. If we break loudly, template maintainers (the honest
among them) won't publish such summary, because it will break.

> There's also the original issue with descriptions (assuming that we don't omit
> them), which contains newlines a lot of the time.
>
> That being said, if we treat such errors as "repo errors" and leave to the repo
> maintainers to ensure that the fields follow a certain format, then we can just
> use a special character for the separator [5] and ban the character from the
> fields.

Yes, and IIUC the current proposal is to have ':' as that special character.
Am I missing something?

> [5]: The separator may also need to be placed at the end of the format string.

I don't think so.


- --
pozdrawiam / best regards
Wojtek Porczyk
Graphene / Invisible Things Lab

I do not fear computers,
I fear lack of them.
-- Isaac Asimov
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEaO0VFfpr0tEF6hYkv2vZMhA6I1EFAl8bCdQACgkQv2vZMhA6
I1ElCw/8DBbV8tFs46g/7YKvGOES16ajMbV696vX4TP9rl/uJr8BZWQCu//lfJWe
XZkI4YzNn/ntL9Nb7OEcMDEPiWCMOAD86yl5mcVdYPkgDssFBBdF6hxITqyGQDfc
pYbL11v2L7P1EFWYIfrsJ8cLkQ70qgUPTc8beVGqP9DA/q2hdYnIEDdML1BWqXh6
2nWKbYAeaVj4jeWHvEjvMkvv/mLMfsyE7epZM1I2un7LbYFxBXp/+OKfmHO/+/kV
2c4xuGr0d/8IbtZsIYn+n61YfajE4idITdio3c2uxibN+FVmovWYdDeFRJSJS0FW
8iJOg/kc4nCjocNYh5CHK3HVF/geW/2GzAa/Bjip5FdnJFQNBtjlMfP2uh/2Mg2p
qyZGRYG2/cwZw67WQd/v5Wj0ZnDyyGjHdCUGo8EICIsf/fNG80Pp0gpwRwal6naZ
b5A2y4GrSBSVR85p8HNO9GROWGQW4ObQLJwQbmzav9KrSGCXq9EBF4XlQTNhaKw8
qHCi9YnPw1ph4bsailQRDB+AKOhgzo761Ne5XPjU0uFJlERz+CEadNs5c+Mjw3vv
nTXHANkqelDru8R4t5MJizcxBa29cq8WxS2hS5PC79zxAVPwfDjGVYTAnIOYAG0S
tJ5U9E4QQPDo2VzMH1XHcfM0h4K4Y4qA+kW33GP0n2xRSrh0zus=
=Dusq
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 25, 2020, 6:46:48 AM7/25/20
to Wojtek Porczyk, marm...@invisiblethingslab.com, qubes...@googlegroups.com
On Saturday, July 25, 2020 12:18 AM, Wojtek Porczyk <wo...@invisiblethingslab.com> wrote:

> On Thu, Jul 23, 2020 at 05:45:56PM +0000, WillyPillow wrote:
>

> > One issue is that from the qrexec client side it is basically impossible to
> > distinguish between the two. (Consider the case where a field contains
> > `xxx\\na:b:c`.)
>

> If there are more colons that there are supposed to be, there is no need to
> distinguish anything anymore, just error out for "malformed input" or
> something.
>

> In Python I like to do it with tuple assingment:
>

> try:
> field1, field2, field3 = untrusted_line.split(':')
> except TypeError:
> raise ParseError('error message')

By "distinguish between the two" I meant correct input and malformed input.
Sorry for the confusion.

The point I was trying to make in that example was that when there are colons
*and* newlines, there could be cases like the following:

```
field1 = 'a'
field2 = 'b'
field3 = 'c\nd:e:f'
encoded = ':'.join([field1, field2, field3]) # a single entry
encoded_list = '\n'.join([encoded] * 10) # simulate multiple entries
# on the other side...
decoded_list = encoded_list.split('\n') # becomes 20 rows
for untrusted_line in decoded_list:
field1, field2, field3 = untrusted_line.split(':') # this does not error out
```

> It's as simple as that. The big advantage is that there aren't many ways to do
> something wrong.
>

> > Security-wise, this is unlikely to cause issues as an entity that can do this
> > can probably modify the repo contents directly.
>

> The point is, we don't know. The repo content is untrusted, and yes, attacker
> can modify it. What counts is signature on RPM.
>

> > However, if the repo, by accident, does contain packages with, say, colons in
> > summaries, it may be an issue usability-wise as it's hard to give meaningful
> > error messages when things break.
>

> "Malformed input" is OK. If we break loudly, template maintainers (the honest
> among them) won't publish such summary, because it will break.

The issue is that it's not always possible to detect such errors, as elaborated
above. Of course, as long as we keep this "wart" explicit, I'm okay with having
this as an "undefined behavior". (After all, the repo listing *is* considered
untrusted anyway.)

> > There's also the original issue with descriptions (assuming that we don't omit
> > them), which contains newlines a lot of the time.
> > That being said, if we treat such errors as "repo errors" and leave to the repo
> > maintainers to ensure that the fields follow a certain format, then we can just
> > use a special character for the separator [5] and ban the character from the
> > fields.
>

> Yes, and IIUC the current proposal is to have ':' as that special character.
> Am I missing something?

The reason I'm bringing this up is twofold:

1. The idea of having the repo maintainer ensure such constraints have not been
explicitly brought up before AFAIK.
2. I'm unsure whether `:` is the best choice for this, as it's not an uncommon
character, and banning it might be undesirable. Something like ASCII 28-31
are possible alternatives.

> > [5]: The separator may also need to be placed at the end of the format string.
>

> I don't think so.

If there is no such character at EOLs, then `\n` becomes another "special
character", as stuff would break when doing the `.split('\n')` step. If, on the
other hand, there is such a character, we can be sure that every entry consists
of exactly k (some positive number) <separator>-seperated entries.
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

Wojtek Porczyk

unread,
Jul 26, 2020, 5:12:23 PM7/26/20
to WillyPillow, marm...@invisiblethingslab.com, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

- - Yes, this breaks something and is not distinguishable from template manager.
- - No, it's not malformed, because it conforms to specification.
- - It's OK from security perspective: You cannot cause installation of anything
the user didn't explicitly mean (by choosing an item on a list and/or
accepting a signing key).

To reiterate: in optimal case this should be avoided from the sending side,
but it't not a very high priority. It would be also OK if the tool and
specification did something like "only the first line with ':' skipped".

The worst case is something can be shown on the list of available templates
that actually cannot be installed.

> > It's as simple as that. The big advantage is that there aren't many ways to do
> > something wrong.
> >
>
> > > Security-wise, this is unlikely to cause issues as an entity that can do this
> > > can probably modify the repo contents directly.
> >
>
> > The point is, we don't know. The repo content is untrusted, and yes, attacker
> > can modify it. What counts is signature on RPM.
> >
>
> > > However, if the repo, by accident, does contain packages with, say, colons in
> > > summaries, it may be an issue usability-wise as it's hard to give meaningful
> > > error messages when things break.
> >
>
> > "Malformed input" is OK. If we break loudly, template maintainers (the honest
> > among them) won't publish such summary, because it will break.
>
> The issue is that it's not always possible to detect such errors, as elaborated
> above. Of course, as long as we keep this "wart" explicit, I'm okay with having
> this as an "undefined behavior". (After all, the repo listing *is* considered
> untrusted anyway.)

Sure.

> > > There's also the original issue with descriptions (assuming that we don't omit
> > > them), which contains newlines a lot of the time.
> > > That being said, if we treat such errors as "repo errors" and leave to the repo
> > > maintainers to ensure that the fields follow a certain format, then we can just
> > > use a special character for the separator [5] and ban the character from the
> > > fields.
> >
>
> > Yes, and IIUC the current proposal is to have ':' as that special character.
> > Am I missing something?
>
> The reason I'm bringing this up is twofold:
>
> 1. The idea of having the repo maintainer ensure such constraints have not been
> explicitly brought up before AFAIK.
> 2. I'm unsure whether `:` is the best choice for this, as it's not an uncommon
> character, and banning it might be undesirable. Something like ASCII 28-31
> are possible alternatives.

I think this should be one of the printable characters, so the output is easy
to read and audit. Though ':' might well not be the best character, but there
are others like '|' or even '!' (and I'd argue banning '!' from descriptions
would do good to avoid shouting at people).

> > > [5]: The separator may also need to be placed at the end of the format string.
> >
>
> > I don't think so.
>
> If there is no such character at EOLs, then `\n` becomes another "special
> character", as stuff would break when doing the `.split('\n')` step. If, on the
> other hand, there is such a character, we can be sure that every entry consists
> of exactly k (some positive number) <separator>-seperated entries.

Having separator at the end gains nothing.

> Thanks,
> WillyPillow

Thank you for this week! Tomorrow Marmarek returns, so he'll carry on.


- --
pozdrawiam / best regards
Wojtek Porczyk
Graphene / Invisible Things Lab

I do not fear computers,
I fear lack of them.
-- Isaac Asimov
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEaO0VFfpr0tEF6hYkv2vZMhA6I1EFAl8d8a8ACgkQv2vZMhA6
I1FZ7hAAjBwaVy2uZ9t4dxo9CfHya8VcMagCQoTkwrUpv5GgRzStCiC0SJNQVOgS
V2cS7waqb2pC8HSe5ku055MpGLJwA96MnSqZii1zhtqRqdo9MxEUc3oSK1WFI+MC
0XoGTEGbmlGjwdRoqeM464KOOk7gzNoqloZH0VLCyRDaWj97UR4I1HKh5FWV5alD
WXUeqHhdZnns7LeX8LSn8WIUWXfdSgMsB9APVNrsXlnPVddlKsSoGUNjEKEb5viS
MxRl6RHS5kCGZa4yy+09j5gxYeha/lCrrMpAK3/qhRx4bzAhc2nmbv1qf8xVW+G+
Mj0Sf5BgLDmCE9wxvbAQm2PIVaduF9G8Y5lwqYX7TTV4QC2zW71RJGJu7zlNYkpE
SkvNCH06zhY0zWsxZ/91KOafsMjajbaYfX0VrbXa3vVfanLy+iRJhZ3gIt3HLOOv
4hLZV0k/sy+ooqLXBwN9cE1CyYXVKVSGw403NV9yumFrUMhFuzehyPAdxOI6nCj6
2OV2luEKYrMSZWfaPHSk5b+JKplp39oaX6aHs0TBW2ahQmOhmrg4mnDchc9XwzOr
M4MuZkX+/qtgfKH6/RZeS6OKmKCmQEED7ibfPRgWf9pMgB2ERfDA1zrU4lJx2fSQ
FsFwjjByrE1+W247oniKoDpv61v6tU86yUjVS1klyxofc0/W2co=
=mHmD
-----END PGP SIGNATURE-----

marm...@invisiblethingslab.com

unread,
Jul 26, 2020, 5:56:52 PM7/26/20
to WillyPillow, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Additionally, the canonical way to build template rpms (that are then
included in the repo) is linux-template-builder (or perhaps some other
template builder in some future) and we can easily make sure all the
(benign) package metadata matches the (possibly limited) specification of the
qvm-template.
There is "Pop! OS" (yet another Ubuntu derivative), so '!' is not
unimaginable in the description. '|' IMO should be fine.

> > > > [5]: The separator may also need to be placed at the end of the format string.
> > >
> >
> > > I don't think so.
> >
> > If there is no such character at EOLs, then `\n` becomes another "special
> > character", as stuff would break when doing the `.split('\n')` step. If, on the
> > other hand, there is such a character, we can be sure that every entry consists
> > of exactly k (some positive number) <separator>-seperated entries.
>
> Having separator at the end gains nothing.
>
> > Thanks,
> > WillyPillow
>
> Thank you for this week! Tomorrow Marmarek returns, so he'll carry on.

I'm kind of back, still catching up on emails.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl8d/B0ACgkQ24/THMrX
1yxC2Af+Kj+vDDZVSdBGMtiWaaLSldhpFV/cZC7VerQ0RW83/VUaB9UXwN4FPn8e
TSOMrQ9iLNTn2roxbhyCIw8SxnYBMq6euXv71vSt0yHMyuJKstHWRzhZSuv5pUqy
ZTCyHPSaQvNjiBtLxRAD7RTR3535kNjaZX9d63o1Fd3ugF+xjZcOTaWj+s3rtiRo
5QBdZhNkbsFvr1ZGk4Ax+Pk93hlz6vEBsMOWsB3k7Pi49aDPkf+sclJHi81hdnSn
6Wus2J8hpm+xVrBoYyjYC1daEckEMBBSeswuFh/ubb/nbHSuXW1W5ob8NDWuTDlI
BWiySl29iHzTeXMjEIEJOAIITfyHNA==
=svh8
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 27, 2020, 6:30:20 AM7/27/20
to Wojtek Porczyk, marm...@invisiblethingslab.com, qubes...@googlegroups.com
On Monday, July 27, 2020 5:12 AM, Wojtek Porczyk <wo...@invisiblethingslab.com> wrote:

> On Sat, Jul 25, 2020 at 10:46:40AM +0000, WillyPillow wrote:
>

> > On Saturday, July 25, 2020 12:18 AM, Wojtek Porczyk wo...@invisiblethingslab.com wrote:
> >

> > > On Thu, Jul 23, 2020 at 05:45:56PM +0000, WillyPillow wrote:
> >

> > > > One issue is that from the qrexec client side it is basically impossible to
> > > > distinguish between the two. (Consider the case where a field contains
> > > > `xxx\\\\na:b:c`.)
> >

> > > If there are more colons that there are supposed to be, there is no need to
> > > distinguish anything anymore, just error out for "malformed input" or
> > > something.
> >

> > > In Python I like to do it with tuple assingment:
> >

> > > try:
> > > field1, field2, field3 = untrusted_line.split(':')
> > > except TypeError:
> > > raise ParseError('error message')
> >

> > By "distinguish between the two" I meant correct input and malformed input.
> > Sorry for the confusion.
> > The point I was trying to make in that example was that when there are colons
> > and newlines, there could be cases like the following:
> >

> > field1 = 'a'
> > field2 = 'b'
> > field3 = 'c\\nd:e:f'
> > encoded = ':'.join([field1, field2, field3]) # a single entry
> > encoded_list = '\\n'.join([encoded] * 10) # simulate multiple entries
> > # on the other side...
> > decoded_list = encoded_list.split('\\n') # becomes 20 rows
> > for untrusted_line in decoded_list:
> > field1, field2, field3 = untrusted_line.split(':') # this does not error out
> >

>

> - Yes, this breaks something and is not distinguishable from template manager.
> - No, it's not malformed, because it conforms to specification.
> - It's OK from security perspective: You cannot cause installation of anything
> the user didn't explicitly mean (by choosing an item on a list and/or
> accepting a signing key).

I agree. To be clear, I was only bringing that up from a UX perspective, i.e.,
not always able to detect the error.

(To be pedantic though, to deal with wildcards and stuff, all the templates on
the returned list *are* installed in some cases, so it is possible for some
unwanted templates to be installed currently. That being said, i) it's fairly
easy to check that the template names indeed matches the given patterns and ii)
an accepted signing key is still required, so no big issue here.)

> To reiterate: in optimal case this should be avoided from the sending side,
> but it't not a very high priority. It would be also OK if the tool and
> specification did something like "only the first line with ':' skipped".
>

> The worst case is something can be shown on the list of available templates
> that actually cannot be installed.

Again, I agree. Hence the analogy to undefined behavior below.

> > > It's as simple as that. The big advantage is that there aren't many ways to do
> > > something wrong.
> >

> > > > Security-wise, this is unlikely to cause issues as an entity that can do this
> > > > can probably modify the repo contents directly.
> >

> > > The point is, we don't know. The repo content is untrusted, and yes, attacker
> > > can modify it. What counts is signature on RPM.
> >

> > > > However, if the repo, by accident, does contain packages with, say, colons in
> > > > summaries, it may be an issue usability-wise as it's hard to give meaningful
> > > > error messages when things break.
> >

> > > "Malformed input" is OK. If we break loudly, template maintainers (the honest
> > > among them) won't publish such summary, because it will break.
> >

> > The issue is that it's not always possible to detect such errors, as elaborated
> > above. Of course, as long as we keep this "wart" explicit, I'm okay with having
> > this as an "undefined behavior". (After all, the repo listing is considered
> > untrusted anyway.)
>

> Sure.
>

> > > > There's also the original issue with descriptions (assuming that we don't omit
> > > > them), which contains newlines a lot of the time.
> > > > That being said, if we treat such errors as "repo errors" and leave to the repo
> > > > maintainers to ensure that the fields follow a certain format, then we can just
> > > > use a special character for the separator [5] and ban the character from the
> > > > fields.
> >

> > > Yes, and IIUC the current proposal is to have ':' as that special character.
> > > Am I missing something?
> >

> > The reason I'm bringing this up is twofold:
> >

> > 1. The idea of having the repo maintainer ensure such constraints have not been
> > explicitly brought up before AFAIK.
> >

> > 2. I'm unsure whether `:` is the best choice for this, as it's not an uncommon
> > character, and banning it might be undesirable. Something like ASCII 28-31
> > are possible alternatives.
> >

>

> I think this should be one of the printable characters, so the output is easy
> to read and audit. Though ':' might well not be the best character, but there
> are others like '|' or even '!' (and I'd argue banning '!' from descriptions
> would do good to avoid shouting at people).

I see :)

> > > > [5]: The separator may also need to be placed at the end of the format string.
> >

> > > I don't think so.
> >

> > If there is no such character at EOLs, then `\\n` becomes another "special
> > character", as stuff would break when doing the `.split('\\n')` step. If, on the
> > other hand, there is such a character, we can be sure that every entry consists
> > of exactly k (some positive number) <separator>-seperated entries.
>

> Having separator at the end gains nothing.

Maybe I'm missing something, but I don't see why this is the case.

My understanding of the situation is this: (I'm also attempting to summarize
the context for anyone else reading this; the directly relevant points start
from number 4.)

1. To allow for better features on the `search` and `info` operations, we want
to incorporate additional metadata like URL, description, ...etc in the
results of `qubes.TemplateSearch`.
2. Doing this requires us to rethink the output format. Previously, as the only
field that may contain `:` is the summary field, which i) does not contain
`\n` and ii) is the last field in the output format, parsing it like [commit]
works.
3. As the result of previous discussions, changing the separator from ':' to
something less common like '|', and having a guideline that bans the latter
from the package metadata is an acceptable solution.
4. However, the description field still may contain newlines [6], making
splitting the output into separate records not exactly trivial. For
instance, a description field containing `abc\ndef` will be broken up from the
`\n`, resulting in a description only containing `abc` and a broken row
containing `def` (assuming that the description is the last field). This causes
us to raise an exception for an otherwise conformant package.
(Yes, technically, we can take advantage of the fact that there are multiple
fields (i.e., those other than the description) that are guaranteed not to
contain newlines and place them at "strategic locations" to mitigate this
(e.g., `Name|...|Description` and something like `re.split('\n(?=[^\n|]*\\|)',
output)`). However, I feel that this is additional complexity just for saving a
character.)
5. As such, the idea is to include a separator character at the end of the
line. This allows us to know precisely where the description field ends, and
parse the entire output like the following:

``` py
def parse(output):
split = output.split('|')
return [split[i:i+NUM_FIELDS]
for i in range(0, len(split), NUM_FIELDS)]

```

[6]: Note that it is probably not desirable to ban newlines directly. Not only
does it serve semantic purposes, existing tools like rpmlint also explicitly
complain about description lines longer than 79 characters.

[commit]:
https://github.com/QubesOS/qubes-core-admin-client/pull/145/commits/efda82eaa2bcaebdbc1dfd1c5dce717be46ae63e

> > Thanks,
> > WillyPillow
>

> Thank you for this week! Tomorrow Marmarek returns, so he'll carry on.

Thanks for the awesome help!
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

WillyPillow

unread,
Jul 27, 2020, 6:36:04 AM7/27/20
to Wojtek Porczyk, marm...@invisiblethingslab.com, qubes...@googlegroups.com
This is something that I realized when experimenting with the `--refresh` flag
for force updating repo metadata: in essence, currently the metadata is
*always* refreshed as DNF detects that the mtime of the repo configuration has
changed.

This means that, while the correctness is fine, there may potentially be a
performance issue. Currently, the repositories are quite small in size, so
practically speaking the impact shouldn't be that large (which is also why I
did not realize the issue until recently), but I thought I'd still bring it up
nonetheless.

A low hanging fruit to somewhat reduce the impact is to allow multiple spec
strings in `qubes.TemplateSearch`, reducing the number of calls. (Currently,
we're invoking the qrexec call for every package spec the user passes in.)

To actually utilize the DNF cache, it seems that the following has to be done:

1. Check if the repo configuration has been changed (and the last update time
and `metadata_expire`) from the qvm-template side.
2. Invoke DNF with `--cacheonly` if we deem that the cache can be used.
3. Notably, DNF needs to be invoked with `sudo` as `--cacheonly` only uses the
system cache and not user-specific ones.

Alternatively, a hackish way is to cache the hash and mtime of the repo config
on the updateVM side and override the mtime if the hash matches.

Again, this is probably not very high-priority; just pointing out that the
issue exists.
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc

marm...@invisiblethingslab.com

unread,
Jul 27, 2020, 6:48:35 AM7/27/20
to WillyPillow, Wojtek Porczyk, qubes...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Mon, Jul 27, 2020 at 10:35:56AM +0000, WillyPillow wrote:
> This is something that I realized when experimenting with the `--refresh` flag
> for force updating repo metadata: in essence, currently the metadata is
> *always* refreshed as DNF detects that the mtime of the repo configuration has
> changed.
>
> This means that, while the correctness is fine, there may potentially be a
> performance issue. Currently, the repositories are quite small in size, so
> practically speaking the impact shouldn't be that large (which is also why I
> did not realize the issue until recently), but I thought I'd still bring it up
> nonetheless.
>
> A low hanging fruit to somewhat reduce the impact is to allow multiple spec
> strings in `qubes.TemplateSearch`, reducing the number of calls. (Currently,
> we're invoking the qrexec call for every package spec the user passes in.)

This indeed may be the simplest way, but I consider one spec - one call
a cleaner solution.

> To actually utilize the DNF cache, it seems that the following has to be done:
>
> 1. Check if the repo configuration has been changed (and the last update time
> and `metadata_expire`) from the qvm-template side.
> 2. Invoke DNF with `--cacheonly` if we deem that the cache can be used.
> 3. Notably, DNF needs to be invoked with `sudo` as `--cacheonly` only uses the
> system cache and not user-specific ones.
>
> Alternatively, a hackish way is to cache the hash and mtime of the repo config
> on the updateVM side and override the mtime if the hash matches.

Since the cache is stored in the UpdateVM, IMO storing repo config hash
there sounds like the right thing to do (you can store mtime in the
mtime of the repo config hash file). Then, if unchanged repo config ->
unchanged repo config mtime, then dnf should do the right thing on its
own (respect metadata_expire etc).

I wonder what dnf will say about different repo config path (because of
mktemp), have you checked?

> Again, this is probably not very high-priority; just pointing out that the
> issue exists.

Yes, I agree.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl8esPwACgkQ24/THMrX
1yzImAgAmN8UKqd8cNHhaoDVoNHS2ghjnAPLRwhZ//pYKoRtKTPJk3Q80Jt0CRzy
59QKeDWezybveq6zCecp+i1XmNoTZy9ATRZoX1j/DOyY/ni5FIPSI+RmpW4wcpdb
k+tda3EDn2f25vQvyZL4hgYjy2jvvbOr3CxoHQY3ucyJwZGPrNpXXtyOx06B8Xrs
VsJ/L5TS5cGXbM7iNuGMvogxGf8pisaXqdo4NiaxXJoS8O7C347Pb9YuJxtOAgur
JPr/4C77NKzM6bebezrTDHAz2XLsTfDeqTsISgso5OCsjSy/wAALHpXK1yCkDcrB
dH5flShL8BZPDCzilEXDdFNazyRs3Q==
=SuID
-----END PGP SIGNATURE-----

WillyPillow

unread,
Jul 28, 2020, 2:05:23 PM7/28/20
to marm...@invisiblethingslab.com, wo...@invisiblethingslab.com, qubes...@googlegroups.com
On Jul 27, 2020, 18:48, marm...@invisiblethingslab.com < marm...@invisiblethingslab.com> wrote:
> On Mon, Jul 27, 2020 at 10:35:56AM +0000, WillyPillow wrote:
> > This is something that I realized when experimenting with the `--refresh` flag
> > for force updating repo metadata: in essence, currently the metadata is
> > *always* refreshed as DNF detects that the mtime of the repo configuration has
> > changed.
> >
> > This means that, while the correctness is fine, there may potentially be a
> > performance issue. Currently, the repositories are quite small in size, so
> > practically speaking the impact shouldn't be that large (which is also why I
> > did not realize the issue until recently), but I thought I'd still bring it up
> > nonetheless.
> >
> > A low hanging fruit to somewhat reduce the impact is to allow multiple spec
> > strings in `qubes.TemplateSearch`, reducing the number of calls. (Currently,
> > we're invoking the qrexec call for every package spec the user passes in.)
> This indeed may be the simplest way, but I consider one spec - one call
> a cleaner solution.

I see.


> > To actually utilize the DNF cache, it seems that the following has to be done:
> >
> > 1. Check if the repo configuration has been changed (and the last update time
> > and `metadata_expire`) from the qvm-template side.
> > 2. Invoke DNF with `--cacheonly` if we deem that the cache can be used.
> > 3. Notably, DNF needs to be invoked with `sudo` as `--cacheonly` only uses the
> > system cache and not user-specific ones.
> >
> > Alternatively, a hackish way is to cache the hash and mtime of the repo config
> > on the updateVM side and override the mtime if the hash matches.
> Since the cache is stored in the UpdateVM, IMO storing repo config hash
> there sounds like the right thing to do (you can store mtime in the
> mtime of the repo config hash file). Then, if unchanged repo config ->
> unchanged repo config mtime, then dnf should do the right thing on its
> own (respect metadata_expire etc).

It seems like this can be done pretty easily. I'll
probably push this with other changes (e g.,
additional fields and separator change) I'm
testing right now.


> I wonder what dnf will say about different repo config path (because of
> mktemp), have you checked?

In my testing, DNF indeed does not seem to care
about the path, which is quite interesting.


Thanks,
WillyPillow

> https://blog.nerde.pw/
>
> PGP fingerprint = 6CCF 3FC7 32AC 9D83 D154 217F 1C16 C70E E7C3 1C84
>
> Protonmail PGP = D02D CEFF ACE5 5A7B FF5D 871E 4004 1CB1 F52B 127E

Sent from ProtonMail mobile
publickey - wp@nerde.pw - 0xD02DCEFF.asc
signature.asc
Reply all
Reply to author
Forward
0 new messages