Qubes 4.0.2 severe issue - dom0 kernel crash


Marek Marczykowski-Górecki

Jan 4, 2020, 5:06:36 AM
to qubes-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi,

The just released 4.0.2 has severe bug - dom0 kernel crashes on DISCARD
operation[1]
I have got this error on 4.19.x kernel while testing upcoming Qubes 4.1
too[2], but not while testing R4.0.2. Not exactly sure why it worked for
me on R4.0.2 (and -rc3), but the bug looks to affect many users. BTW the
same applies to the updated 4.19.89 kernel for current-testing (due to
holiday season, it hasn't migrated to current aka stable - thankfully).
The bug (according to kernel changelog) should be fixed in 4.19.90
(commit "block: fix single range discard merge"), but I haven't tested
it yet.

Anyway R4.0.2 is broken. When the fix is ready, I think we should
substitute it with a fixed version ASAP. But I'm not sure how to name
it:
- R4.0.3 - next point release, just earlier one
- R4.0.2.1 - point release of a point release, since the change is very
minimal

Any opinions?

[1] https://github.com/QubesOS/qubes-issues/issues/5553
[2] https://github.com/QubesOS/qubes-issues/issues/5529#issuecomment-569312158

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl4QY6UACgkQ24/THMrX
1yyu/QgAiJIzoYVM2EIbP5t8j6kIeG15lX/vHj8cRSnoSijrnln5WwGB5fliYHAU
fQuJ7MdAwfv4gsP2rPaP/kfplqzfyoGcCLHmFWMYmGMZbd7my9BD61Vnrnvr4qBG
S7DCYr5fjQK7k2nQer8IVUbePk7UNV66Cj6usGZhUy/2dJG9HSAfl1xS9hl/NBrg
Puz0aJBZWmtLNghckoqMDpI/0gmzah9isj42EDXIkDE/DhUnoR+z0EznvKb8Vo6I
Nh4QX/INTw0FyLSlUOROEmENqtTWcAJNozYbUMdNMvdJfDLpn59uss+ygwOXIEnY
rJ/qekxMiv3Sy43eewD6N+NBfQ2DpA==
=T0+E
-----END PGP SIGNATURE-----
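
For anyone wanting to check a running dom0 kernel against the version numbers above, here is a minimal sketch. It assumes only what the message states: the fix is expected in 4.19.90 (untested at the time of writing) and 4.19.89 is affected. The function name and the sample release strings are illustrative, and kernels outside the 4.19.y series are reported as unknown rather than guessed.

```python
import re

# Per the message above: the changelog suggests the fix landed in 4.19.90.
# This had not been tested when the message was written.
FIXED_IN = (4, 19, 90)

def has_discard_merge_fix(release):
    """Return True/False for 4.19.y releases, None when it cannot be told."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        return None
    version = tuple(int(part) for part in match.groups())
    # Only the 4.19 stable series is discussed in the thread; anything else
    # is treated as unknown instead of being assumed safe or broken.
    if version[:2] != FIXED_IN[:2]:
        return None
    return version >= FIXED_IN

# Illustrative release strings (the suffix format is an assumption).
affected = has_discard_merge_fix("4.19.89-1.pvops.qubes.x86_64")
fixed = has_discard_merge_fix("4.19.90")
unknown = has_discard_merge_fix("5.4.10")
```

On a live system the argument could come from `platform.release()` in the Python standard library.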

awokd

Jan 4, 2020, 5:20:59 AM
to qubes...@googlegroups.com
Marek Marczykowski-Górecki:

> Anyway R4.0.2 is broken. When the fix is ready, I think we should
> substitute it with a fixed version ASAP. But I'm not sure how to name
> it:
> - R4.0.3 - next point release, just earlier one
> - R4.0.2.1 - point release of a point release, since the change is very
> minimal
>
> Any opinions?

Everyone's got those, including myself! How about R4.0.2b? However, I
vaguely recall some code (in some of the update components maybe) that
parses Qubes versions. R4.0.3 might be more readily digested by that
code than alpha characters or additional sub-points.

--
- don't top post
Mailing list etiquette:
- trim quoted reply to only relevant portions
- when possible, copy and paste text instead of screenshots

dhorf-qrir...@hashmail.org

Jan 4, 2020, 5:44:58 AM
to qubes...@googlegroups.com
On Sat, Jan 04, 2020 at 10:20:02AM +0000, 'awokd' via qubes-devel wrote:
> > - R4.0.3 - next point release, just earlier one
> > - R4.0.2.1 - point release of a point release, since the change is very
> > minimal
> Everyone's got those, including myself! How about R4.0.2b? However, I
> vaguely recall some code (in some of the update components maybe) that
> parses Qubes versions. R4.0.3 might be more readily digested by that code
> than alpha characters or additional sub-points.

very much in favor of 4.0.3

4.0.2.1 adds to the confusion of users who reliably mixed up
"4.0.1" and "4.1" over the last year.
and 4.0.2b is at least as confusing for both users and tooling.
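
To illustrate the tooling concern, here is a minimal sketch of a strict, purely numeric version parser. It is hypothetical and not taken from any Qubes component; the regular expression and the function name are assumptions. A parser of this kind accepts "R4.0.3" but rejects both "R4.0.2.1" and "R4.0.2b".

```python
import re

# Hypothetical parser: optional leading "R", then major.minor with an
# optional .patch, all integers. Not the actual Qubes update code.
_RELEASE_RE = re.compile(r"R?(\d+)\.(\d+)(?:\.(\d+))?")

def parse_release(name):
    """Return (major, minor, patch) as ints, or None if the name does not fit."""
    match = _RELEASE_RE.fullmatch(name)
    if match is None:
        return None
    major, minor, patch = match.groups()
    # A missing patch component (as in "R4.1") is treated as 0.
    return (int(major), int(minor), int(patch or 0))

candidates = ["R4.0.1", "R4.1", "R4.0.2", "R4.0.3", "R4.0.2.1", "R4.0.2b"]
parsed = {name: parse_release(name) for name in candidates}
```

With this scheme "R4.0.3" compares greater than "R4.0.2" as a plain tuple, and "R4.0.1" and "R4.1" stay distinct, while the four-part and lettered names would each need special handling.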



Andrew David Wong

Jan 4, 2020, 7:37:58 AM
to Marek Marczykowski-Górecki, qubes-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 2020-01-04 4:06 AM, Marek Marczykowski-Górecki wrote:
> Hi,
>
> The just released 4.0.2 has severe bug - dom0 kernel crashes on DISCARD
> operation[1]
> I have got this error on 4.19.x kernel while testing upcoming Qubes 4.1
> too[2], but not while testing R4.0.2. Not exactly sure why it worked for
> me on R4.0.2 (and -rc3), but the bug looks to affect many users. BTW the
> same applies to the updated 4.19.89 kernel for current-testing (due to
> holiday season, it hasn't migrated to current aka stable - thankfully).
> The bug (according to kernel changelog) should be fixed in 4.19.90
> (commit "block: fix single range discard merge"), but I haven't tested
> it yet.
>
> Anyway R4.0.2 is broken. When the fix is ready, I think we should
> substitute it with a fixed version ASAP. But I'm not sure how to name
> it:
> - R4.0.3 - next point release, just earlier one
> - R4.0.2.1 - point release of a point release, since the change is very
> minimal
>
> Any opinions?
>
> [1] https://github.com/QubesOS/qubes-issues/issues/5553
> [2] https://github.com/QubesOS/qubes-issues/issues/5529#issuecomment-569312158
>

The semantic versioning standard recommended by semver.org would
suggest "R4.0.3".

- --
Andrew David Wong (Axon)
Community Manager, Qubes OS
https://www.qubes-os.org

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEZQ7rCYX0j3henGH1203TvDlQMDAFAl4QhxoACgkQ203TvDlQ
MDDCEA/8CTQP4Hke0S5/dxR8tRRiEB6aoYvlhe7Qlv+yiQGoaDaF1hepiEE2dSCx
SmTcUQzXIYFp4fusA43Prs+jR4kAU6FvABNnZw1E4esF3tBfvw9U89jO2jgwhQdm
R+JzMx0JE0azzS1BglrJs/TzyfGpoRZYn0UIYiz6v6/3z0b4xlVo2onpVst7oiLg
xbdRVQJ4+/BI9+CToCUXArd4Vs6qSHhksMKRytwMOZcaRAQUu7PmhaL3lj98bYuY
UmhyVZm1eeQIZhN8GFfHkb9/e62PaIItslg6rSO5jDK6Q/0vf33iAnQcuHP4VqdQ
VqpGzXZ6p5vWB03TDQyAzGNYTbX58MXh+k0jhEh8R5ZzV8zTSFHPXATG+J9oIbcM
Ue44gCRuQ/5Uns5RevjXXNHJQpy7f1OuFpLRdNvpoCg5GUuzSAYQOXUSIpch37hq
O2O7JLTvTitCacxoXPXYhA/Y2ximC8ErpSCYckrUZSgOP2raIIkFzPR17OnfyjTn
U7olLz7U5yBbHsaowiwAuTy2LMiRBga4ScXvSdjx5MnLQV8BVTikeVn358yThm/G
UHiD/CBdM06TvGbJtKOwTv9dFiGlwBJ4oCznoz9VbjYCrMWhSyxl8sTxLofdmFmv
fBqTewZ8MLAgq7+Z5fa67HDyQWtwEhPdatj5616V8ArHVetAINw=
=p67l
-----END PGP SIGNATURE-----


Chris Laprise

Jan 4, 2020, 9:28:49 AM
to Marek Marczykowski-Górecki, qubes-devel
On 1/4/20 5:06 AM, Marek Marczykowski-Górecki wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Hi,
>
> The just released 4.0.2 has severe bug - dom0 kernel crashes on DISCARD
> operation[1]
> I have got this error on 4.19.x kernel while testing upcoming Qubes 4.1
> too[2], but not while testing R4.0.2. Not exactly sure why it worked for
> me on R4.0.2 (and -rc3), but the bug looks to affect many users. BTW the
> same applies to the updated 4.19.89 kernel for current-testing (due to
> holiday season, it hasn't migrated to current aka stable - thankfully).
> The bug (according to kernel changelog) should be fixed in 4.19.90
> (commit "block: fix single range discard merge"), but I haven't tested
> it yet.
>
> Anyway R4.0.2 is broken. When the fix is ready, I think we should
> substitute it with a fixed version ASAP. But I'm not sure how to name
> it:
> - R4.0.3 - next point release, just earlier one
> - R4.0.2.1 - point release of a point release, since the change is very
> minimal
>
> Any opinions?
>
> [1] https://github.com/QubesOS/qubes-issues/issues/5553
> [2] https://github.com/QubesOS/qubes-issues/issues/5529#issuecomment-569312158

Can discards be disabled from the 4.0.2 installer? That could reduce the
urgency for a new release.

--

Chris Laprise, tas...@posteo.net
https://github.com/tasket
https://twitter.com/ttaskett
PGP: BEE2 20C5 356E 764A 73EB 4AB3 1DC4 D106 F07F 1886

Marek Marczykowski-Górecki

Jan 4, 2020, 9:39:25 AM
to Chris Laprise, qubes-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Not easily, besides 'discard' mount option (easy), we also have
'blkdiscard' call deeper in our scripts.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl4Qo5QACgkQ24/THMrX
1yyorgf/f1CqnVi4oB3VpCNsA7VCz8VVjp7sKYO+F8Q92txm/cxrAkbf4CPdGk0m
DWDeB698dq2sF7aR4cWdHg+wUA/MuEwZkf3qriTTxsjXpGWvpP+lw0O4d5PlNLg9
G4JhxWNhJV5IhFUreNvxrvag19zE80qX/+/cfmjkGqU3Vym1cpdhgDTecekl+JyX
IskXunialhiSGX+aIsK97/YAOhalUDUl/iyfkb39heSvQ3e4ZNCFa9/FTjQQCG6x
/LUq39P3Ivf2gRDq9zu7aGuyZqbVx+WW1MKXe32a8MeVnMrjdW688cCmQyEewb5D
IjOuBHEeY0ILH1g4c0PHY3EOqCdkHw==
=ap9R
-----END PGP SIGNATURE-----

Chris Laprise

Jan 4, 2020, 10:53:31 AM
to Marek Marczykowski-Górecki, qubes-devel
On 1/4/20 9:39 AM, Marek Marczykowski-Górecki wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> On Sat, Jan 04, 2020 at 09:28:45AM -0500, Chris Laprise wrote:
>> Can discards be disabled from the 4.0.2 installer? That could reduce the
>> urgency for a new release.
>
> Not easily, besides 'discard' mount option (easy), we also have
> 'blkdiscard' call deeper in our scripts.

I thought maybe preventing discards from being passed down to hw would
prevent the crash. For example, setting 'libata.force=noncqtrim' at boot.

That is, if the crash isn't occurring at a higher layer.
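
As a rough way to see where the 'discard' mount option quoted above is in effect, here is a minimal sketch that scans text in /proc/mounts format. The sample entries are made up for illustration, and this says nothing about the 'blkdiscard' calls in the scripts or about whether turning the option off would avoid the crash.

```python
def mounts_with_discard(mounts_text):
    """Return mount points whose option list contains 'discard'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts lines have: device, mount point, fs type, options, ...
        if len(fields) < 4:
            continue
        mount_point = fields[1]
        options = fields[3].split(",")
        # Exact match, so an option such as "nodiscard" is not counted.
        if "discard" in options:
            result.append(mount_point)
    return result

# Illustrative sample; device names are assumptions, not from the thread.
sample = """\
/dev/mapper/qubes_dom0-root / ext4 rw,relatime,discard 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
"""
found = mounts_with_discard(sample)
```

On a real system the input would be the contents of /proc/mounts.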

Brendan Hoar

Jan 4, 2020, 11:19:14 AM
to Chris Laprise, Marek Marczykowski-Górecki, qubes-devel
Is the crash happening in both domU and dom0?

Chris Laprise

Jan 4, 2020, 11:49:52 AM
to Brendan Hoar, Marek Marczykowski-Górecki, qubes-devel
On 1/4/20 11:19 AM, Brendan Hoar wrote:
> Is the crash happening in both domU and dom0?

Looking at the screenshot in issue 5553, I doubt it's happening in domUs.

Notice that 'dm_thin_pool' is in the call trace, so that indicates dom0.
Discard passdown is also involved, however I doubt it is in such a way
that my earlier suggestion would work. :(

Marek Marczykowski-Górecki

Jan 4, 2020, 1:40:37 PM
to Andrew David Wong, qubes-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Sat, Jan 04, 2020 at 06:37:47AM -0600, Andrew David Wong wrote:
> On 2020-01-04 4:06 AM, Marek Marczykowski-Górecki wrote:
> > Hi,
> >
> > The just released 4.0.2 has severe bug - dom0 kernel crashes on DISCARD
> > operation[1]
> > I have got this error on 4.19.x kernel while testing upcoming Qubes 4.1
> > too[2], but not while testing R4.0.2. Not exactly sure why it worked for
> > me on R4.0.2 (and -rc3), but the bug looks to affect many users. BTW the
> > same applies to the updated 4.19.89 kernel for current-testing (due to
> > holiday season, it hasn't migrated to current aka stable - thankfully).
> > The bug (according to kernel changelog) should be fixed in 4.19.90
> > (commit "block: fix single range discard merge"), but I haven't tested
> > it yet.
> >
> > Anyway R4.0.2 is broken. When the fix is ready, I think we should
> > substitute it with a fixed version ASAP. But I'm not sure how to name
> > it:
> > - R4.0.3 - next point release, just earlier one
> > - R4.0.2.1 - point release of a point release, since the change is very
> > minimal
> >
> > Any opinions?
> >
> > [1] https://github.com/QubesOS/qubes-issues/issues/5553
> > [2] https://github.com/QubesOS/qubes-issues/issues/5529#issuecomment-569312158
> >
>
> The semantic versioning standard recommended by semver.org would
> suggest "R4.0.3".

Ok, lets go with R4.0.3.

For the time being, I've reverted download links to R4.0.1 + R4.0.2-rc3.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl4Q3B8ACgkQ24/THMrX
1ywVaQf/Sty+kNURJQo7WO7QKRnrDaao4fITGp8jQ8pN/FIZ6gAtQo1AUyNIAwKD
43LJsD08WDk4V9XUOkY8Q0qt80USljOxPEpNl49OlpCwAbRoQHyZzF1vX2hyrXdU
esQqmtLOyg/WvpPyc7EP3TuaCvCgbgQj0kUyBP9qv2lPC/kFKF+yNjVC1kb4JxKT
4p3CG2GzXcMhTp0dW4uL8H67b3errzVT4ZFMpy4HXQuAhMnXRhuhburjdFj1HE5Q
4N9oc8cROuHB1f1CKX6yIVBI5zGcBL9gJpcxO92VaxG/Z7bNhMhFP/2V/93xw3Qh
qi9/Argkj5HDigUWA8kpH3Jk/j9IFw==
=b4Sz
-----END PGP SIGNATURE-----