On 02/18/2018 06:30 PM, Marek Marczykowski-Górecki wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> On Sun, Feb 18, 2018 at 01:10:44PM -0500, Chris Laprise wrote:
>> I'm thinking about posting a PR to have qubes-firewall raise errors whenever
>> a firewall script from qubes-firewall-user-script or qubes-firewall.d
>> returns an error code.
>>
>> The object is to provide a way to make the qubes-firewall service fail when
>> firewall scripts encounter an error. On failure, the result would be that
>> forwarding (or networking) is disabled and any units bound to qubes-firewall
>> would not run.
>>
>> Default behavior would be little different than it is now, given that shell
>> scripts are fault-tolerant. But script authors will have the option of using
>> "set -e" or "exit 1" etc. so the service goes into a failed state.
>
> Problems like crashed qubes-firewall are very annoying and it isn't easy
> to find where such service have crashed. Also, script exiting with
> non-zero code can happen for various reasons, including misusing "[
> condition ] && action" syntax. I've seen far to many errors like that.
The intention is for a script author who wants net enabled only if
firewall script runs exactly right... they can use "set -e". Otherwise
the service wouldn't be affected by an error.
As for finding errors when they occur, looking at journalctl is pretty
informative since related lines about failure are highlighted and twice
mention the method i.e. "self.run_user_script". I was wondering if the
service could also do a notify-send, or even call a qubes-rpc method
that merely informs about VM state in such a case.
Alternately, instead of failing the service could handle the error by
simply logging it and disabling forwarding for the proxyVM. Another
service (post-misc?) might display an informational popup about
forwarding state.
>
> But if the script author know what he/she is doing, having option to
> fail closed is a good idea. What about choosing en exit code that would
> cause the effect you propose? And let that not be 1. This could allow
> both: fail closed for conscious user, and harder to break the setup by
> stupid error. The idea is inspired by Restart*ExitStatus= settings of
> systemd.service and git bisect run.
I think this would put a burden on the script writer to improvise (and
not accidentally undermine) their own facsimile of try/catch so that
sending the special exit code is possible. In this case, the script
writer has already done (precariously) 95% of what is needed to prevent
the proxyvm from going online anyway. IOW
I'm asking about this because I started to put checks for firewall rules
in the qubes-tunnel startup code, but it seems kludgey to have a
non-firewall script check for specific rules. Its cleaner to just
enforce error checking in the first place.
-
Also... looking at how qubes-firewall fails at that point, its different
than the 3.2 behavior I remember where enabling of forwarding followed
in sequence after the user script. When the R4.0 service fails at
run_user_script the system continues with forwarding enabled which seems
suboptimal.