[GSoC 2026] Intro - System Health Monitor Project


Sahil Kumar

Mar 17, 2026, 8:28:27 AM
to qubes-devel

Hello Ben, Marta, Marek, and the Qubes Team,

My name is Sahil Kumar (GitHub: Sahilll10), and I am a Computer Science student. I am writing to express my interest in the System Health Monitor project for GSoC 2026 and to seek early feedback on my approach before drafting my formal proposal.

Over the past few weeks, I have been actively contributing to Qubes OS and familiarizing myself with its architecture:

  • qubes-core-admin-client#450 — merged by @marmarek
    Fixed failing tests introduced by commit 84edd42 by identifying missing per-VM registration in MockQube._add_to_vm_list. Used git bisect to isolate the issue and resolved all failures with a minimal fix.

  • qubes-desktop-linux-manager#300 — under review
    Implemented automatic file manager launch in DispVM on block device attach using qubes.StartApp + qubes-open-file-manager. Added a comprehensive test suite using MockQubesComplete.

  • qubes-desktop-linux-manager#303 — under review
    Fixed a UI issue where a hyperlink inside a GtkRadioButton was not clickable. Verified via OpenQA and local GTK mock testing.

Through this work, I have gained hands-on experience with PyGTK widgets, qrexec services, the Qubes Admin API, and MockQubesComplete-based testing.

Initial Understanding & Direction

From my exploration, system health signals in Qubes OS appear to be fragmented and not centrally surfaced. Issues such as VM crashes, resource pressure (memory/storage), and insecure USB configurations are either logged internally or exposed indirectly across different components.

My current approach is to:

  • Aggregate health signals from existing sources (e.g., update widget, qubesadmin API)

  • Design a centralized background service to monitor system state

  • Build a PyGTK tray widget that provides a concise overview and proactive notifications

  • Ensure unit test coverage using MockQubesComplete

  • Add integration tests and documentation

To ensure I am aligning with the intended design direction:

  • Should the monitor primarily poll qubesadmin API properties from dom0, or is it preferred to leverage the event system (qubesadmin.events) for a more event-driven design?

  • For detecting insecure USB configurations, is checking the presence of sys-usb and verifying whether any USB controller is assigned directly to dom0 via qubesadmin sufficient, or is there a more authoritative or recommended approach?

I want to make sure I align early with the expected design direction before formalizing my proposal.

Thank you for your time and guidance.

Best regards,
Sahil Kumar

GitHub: https://github.com/Sahilll10
Email: sahilkum...@gmail.com

Marek Marczykowski-Górecki

Mar 17, 2026, 8:52:10 PM
to Sahil Kumar, qubes-devel, Marta Marczykowska-Górecka
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Tue, Mar 17, 2026 at 05:28:27AM -0700, Sahil Kumar wrote:
>
>
> Hello Ben, Marta, Marek, and the Qubes Team,

Hello!

> My name is Sahil Kumar (GitHub: Sahilll10), and I am a Computer Science
> student. I am writing to express my interest in the *System Health Monitor*
> project for GSoC 2026 and to seek early feedback on my approach before
> drafting my formal proposal.
>
> Over the past few weeks, I have been actively contributing to Qubes OS and
> familiarizing myself with its architecture:
>
> -
>
> *qubes-core-admin-client#450* — merged by @marmarek
> Fixed failing tests introduced by commit 84edd42 by identifying missing
> per-VM registration in MockQube._add_to_vm_list. Used git bisect to isolate
> the issue and resolved all failures with a minimal fix.
> -
>
> *qubes-desktop-linux-manager#300* — under review
> Implemented automatic file manager launch in DispVM on block device
> attach using qubes.StartApp + qubes-open-file-manager. Added a
> comprehensive test suite using MockQubesComplete.
> -
>
> *qubes-desktop-linux-manager#303* — under review
> Fixed a UI issue where a hyperlink inside a GtkRadioButton was not
> clickable. Verified via OpenQA and local GTK mock testing.
>
> Through this work, I have gained hands-on experience with PyGTK widgets,
> qrexec services, the Qubes Admin API, and MockQubesComplete-based testing.
>
> Initial Understanding & Direction
>
> From my exploration, system health signals in Qubes OS appear to be
> fragmented and not centrally surfaced. Issues such as VM crashes, resource
> pressure (memory/storage), and insecure USB configurations are either
> logged internally or exposed indirectly across different components.
>
> My current approach is to:
>
> -
>
> Aggregate health signals from existing sources (e.g., update widget,
> qubesadmin API)
> -
>
> Design a *centralized background service* to monitor system state
> -
>
> Build a *PyGTK tray widget* that provides a concise overview and
> proactive notifications

Marta surely will have more to say here, but IIUC we try to not
introduce any more widgets - if anything, there is a plan to
consolidate them at some point. So, presenting such information in a
tray widget is IMO desirable, but it should combine it into one of
existing ones (possibly re-purposing it to have broader scope). For
example disk space widget is already monitoring some part of the
system, so maybe it can gain more features?

> -
>
> Ensure *unit test coverage using MockQubesComplete*
> -
>
> Add *integration tests and documentation*
>
> To ensure I am aligning with the intended design direction:
>
> -
>
> Should the monitor primarily *poll qubesadmin API properties from dom0*,
> or is it preferred to *leverage the event system (qubesadmin.events)*
> for a more event-driven design?

Definitely events are preferred when possible. It's also possible to add
new events where needed. But, given the purpose of this tool, it might
want to send occasional ping, to verify if connection to qubesd still
works.

> -
>
> For detecting insecure USB configurations, is checking the presence of
> sys-usb and verifying whether any USB controller is assigned directly to
> dom0 via qubesadmin sufficient, or is there a more authoritative or
> recommended approach?

That's basically it. Plus ensuring relevant system qubes are running.
Note that some users may have more than one USB qube, for different
controllers, and such case should also be handled properly.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAmm59zIACgkQ24/THMrX
1yyV2gf/XlPESIkqCuoW/9AFFNvUB0uP42NrXdbom10xlnsKQmlq1qnE7i/DfsAu
lv5Uuk4ObOjxnyPYTg+oEJJo+/Dljh+d13Uk43ue5OdEPTsppNo6fM9SjFCxp2fn
8iEv6E6CTxiCtySvxeCf9nZE8U0CzD9qAtZCw6EXFUOncKuHTDHGwYrx4nabWi7F
3aShSKhwd28jziMT4laFqOSftGOtfT2DsGQCbbIgnWEjCZnhtrGG8CZxc+qm3rv4
SXn0zZ9f4JAtcy3XziUJ2PWnQB5DazFJ0lFydqW+ycFzILBZ2UrC3iqdn4c2kQbD
ZBnjJQJgkljbzWDmeCfJ5m7WP+Hgwg==
=f+z/
-----END PGP SIGNATURE-----

Sahil Kumar

Mar 18, 2026, 3:14:00 PM
to qubes-devel

Hi Marek, Ben, Marta and the Qubes Team.

Thank you for the detailed feedback. I have done the research to align with each point you raised before finalizing my approach.

1) On the widget consolidation point:

Understood — no new tray icon. I traced the existing disk space widget (`qui/tray/disk_space.py`) thoroughly. The `DiskSpace` class already uses `Gtk.Application` + `Gtk.StatusIcon` and monitors pool and VM volume usage.

 My plan is to extend this class directly into a `SystemHealthMonitor`, adding new health checks alongside the existing disk checks — same icon infrastructure, broader scope.


2) On events vs polling:

I traced how `qui/devices/device_widget.py` uses `qubesadmin.events.EventsDispatcher` and confirmed the exact events available in `qubesadmin/events/__init__.py`:

- `domain-shutdown`, `domain-start-failed`, `domain-start` — already exist, usable directly

- `device-assign:pci`, `device-unassign:pci` — already exist, used in device_widget.py

- `property-set:*`, `feature-set:*` — already exist

The current `disk_space.py` uses only `GObject.timeout_add_seconds` — no event integration. My plan is to replace this with `EventsDispatcher` and retain one 120-second ping solely to verify the qubesd connection is alive, as you suggested.
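
A minimal sketch of the periodic liveness ping described above, kept independent of qubesadmin so it can run anywhere. The `ping` coroutine is a hypothetical stand-in for any cheap round trip to qubesd, and `liveness_watchdog`, `on_failure`, and `max_rounds` are illustrative names, not existing API:

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def liveness_watchdog(
    ping: Callable[[], Awaitable[None]],
    on_failure: Callable[[Exception], None],
    interval: float = 120.0,
    timeout: float = 10.0,
    max_rounds: Optional[int] = None,
) -> None:
    """Call ``ping`` every ``interval`` seconds and report failures.

    ``ping`` is a hypothetical coroutine standing in for a cheap qubesd
    round trip; it is not a real qubesadmin call.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        try:
            # A hung connection counts as a failure, hence the timeout.
            await asyncio.wait_for(ping(), timeout=timeout)
        except Exception as exc:  # timeout or connection error
            on_failure(exc)
        rounds += 1
        await asyncio.sleep(interval)
```

Event handlers would carry the normal workload; this loop only answers the question "is the connection still alive?", which events alone cannot answer when nothing is happening.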

 

3) On USB detection:

Rather than checking by VM name, I traced the authoritative approach via `qubesadmin/device_protocol.py` — USB controllers have PCI interface code `p0c03**` (confirmed at line 690: `PCI_USB = ("p0c03**",)`). The correct check is:

```
for dev in qapp.domains["dom0"].devices["pci"].get_assigned_devices():
    if any(str(iface).startswith("p0c03") for iface in dev.interfaces):
        # USB controller directly in dom0 = insecure
        pass  # placeholder: raise the warning here
```

This fires on startup and on `device-assign:pci` / `device-unassign:pci` events. Multiple USB qubes are handled correctly — the warning triggers only if a USB controller is in dom0, not based on how many USB qubes exist.

Additionally, I will check that relevant system qubes (sys-net, sys-firewall, USB qubes) are actually running — not just that USB controllers are correctly assigned. A properly configured sys-usb that has crashed is equally a security concern. This will use the domain-shutdown and domain-start-failed events combined with vm.features.get('servicevm', None) to identify and monitor all system qubes, not just USB ones.
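
The interface test on its own can be sketched without qubesadmin. This assumes each `*` in `p0c03**` stands for exactly one arbitrary character (an assumption, not confirmed in this thread), so the pattern covers every programming interface of PCI class 0x0c, subclass 0x03. The helper names and the sample interface strings are illustrative:

```python
PCI_USB = ("p0c03**",)


def matches_interface(pattern: str, interface: str) -> bool:
    """Return True if ``interface`` matches ``pattern``.

    Assumption: each ``*`` matches exactly one character.
    """
    if len(pattern) != len(interface):
        return False
    return all(p == "*" or p == c for p, c in zip(pattern, interface))


def is_usb_controller(interfaces) -> bool:
    """True if any interface string matches a USB controller pattern."""
    return any(
        matches_interface(pattern, str(iface))
        for iface in interfaces
        for pattern in PCI_USB
    )


# Illustrative values: 0c/03/30 is an xHCI controller, 02/00/00 is Ethernet.
print(is_usb_controller(["p0c0330"]))
print(is_usb_controller(["p020000"]))
```

Compared with a bare `startswith("p0c03")`, matching the full pattern also rejects truncated or malformed interface strings.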


4) On memory monitoring:

I confirmed `vm.get_mem()` calls `admin.vm.CurrentState` which returns `mem` in real time — no new upstream event is needed.

My plan is to re-check memory on each `domain-start` event and compare against `vm.maxmem`, following the `WARN_LEVEL` / `URGENT_WARN_LEVEL` threshold pattern already in disk_space.py. As a stretch goal, I would propose a `domain-memory-warning` event upstream in qubes-core-admin to make this fully reactive.


Proposed Deliverables:

1. Refactor `disk_space.py` to use `EventsDispatcher` + qubesd liveness ping

2. System VM crash detection — `domain-shutdown` + `domain-start-failed` handlers, identified via `servicevm` feature

3. Memory pressure monitoring — `vm.get_mem()` triggered on `domain-start`, per-VM thresholds

4. USB security check — PCI interface `p0c03**` detection in dom0, multi-USB-qube aware

5. Per-VM opt-out feature flag (`health-monitor-not-notify`), following the existing `disk-space-not-notify` + `NeverNotifyItem` pattern

6. Unit tests using `MockQubesComplete` for all new checks

7. Documentation
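
Deliverables 2 and 5 meet in one small decision: should a shutdown or failed start of this qube raise an alert? A rough sketch, using a stand-in `FakeVM` class instead of real qubesadmin objects and assuming that an empty or missing feature value means "off":

```python
from dataclasses import dataclass, field


@dataclass
class FakeVM:
    """Stand-in for a qubesadmin VM object; only what the sketch needs."""
    name: str
    features: dict = field(default_factory=dict)


def is_monitored_service_vm(vm) -> bool:
    """True if ``vm`` is a service qube that has not opted out of alerts.

    Assumption: feature values are strings, and an empty or missing
    value means the feature is disabled.
    """
    is_service_vm = bool(vm.features.get("servicevm", None))
    opted_out = bool(vm.features.get("health-monitor-not-notify", None))
    return is_service_vm and not opted_out
```

A `domain-shutdown` or `domain-start-failed` handler could call this first and return early for qubes that are not monitored.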

 

Question: For the widget consolidation — is `disk_space.py` (`DiskSpace` / `Gtk.StatusIcon`) the right entry point to extend, or do you and Marta have a different widget in mind as the consolidation target? I want to make sure I am building on the right foundation before formalizing the proposal, since this decision affects the architecture throughout.

 

Thank you

Best regards,

Sahil Kumar

GitHub: https://github.com/Sahilll10

Gmail: sahilkum...@gmail.com

Marek Marczykowski-Górecki

Mar 19, 2026, 11:17:23 AM
to Sahil Kumar, qubes-devel, Marta Marczykowska-Górecka
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Wed, Mar 18, 2026 at 12:14:00PM -0700, Sahil Kumar wrote:
> Hi Marek, Ben, Marta and the Qubes Team.
>
> Thank you for the detailed feedback. I have done the research to align with
> each point you raised before finalizing my approach.

Hi,

Generally looks good, but I have few remarks below:

> 1) On the widget consolidation point:
>
> Understood — no new tray icon. I traced the existing disk space widget
> (`qui/tray/disk_space.py`) thoroughly. The `DiskSpace` class already uses
> `Gtk.Application` + `Gtk.StatusIcon` and monitors pool and VM volume usage.
>
> My plan is to extend this class directly into a `SystemHealthMonitor`,
> adding new health checks alongside the existing disk checks — same icon
> infrastructure, broader scope.
>
>
> 2) On events vs polling:
>
> I traced how `qui/devices/device_widget.py` uses
> `qubesadmin.events.EventsDispatcher` and confirmed the exact events
> available in `qubesadmin/events/__init__.py`:
>
> - `domain-shutdown`, `domain-start-failed`, `domain-start` — already exist,
> usable directly
>
> - `device-assign:pci`, `device-unassign:pci` — already exist, used in
> device_widget.py
>
> - `property-set:*`, `feature-set:*` — already exist
>
> The current `disk_space.py` uses only `GObject.timeout_add_seconds` — no
> event integration. My plan is to replace this with `EventsDispatcher` and
> retain one 120-second ping solely to verify the qubesd connection is alive,
> as you suggested.

For disk space monitoring that requires occasional refresh, maybe
something similar to CPU/memory stats could be used? It also uses
events, but on a separate channel.

> 3) On USB detection:
>
> Rather than checking by VM name, I traced the authoritative approach via
> `qubesadmin/device_protocol.py` — USB controllers have PCI interface code
> `p0c03**` (confirmed at line 690: `PCI_USB = ("p0c03**",)`). The correct
> check is:
>
> ```
>
> for dev in qapp.domains["dom0"].devices["pci"].get_assigned_devices():
>
> if any(str(iface).startswith("p0c03") for iface in dev.interfaces):
>
> # USB controller directly in dom0 = insecure
>
> ```

The above isn't really checking if USB controller remained in dom0. The
correct check would be looking if any of the USB controller is _not_
assigned to some other domain. Devices not assigned anywhere remain in
dom0.

> This fires on startup and on `device-assign:pci` / `device-unassign:pci`
> events. Multiple USB qubes are handled correctly — the warning triggers
> only if a USB controller is in dom0, not based on how many USB qubes exist.
>
> Additionally, I will check that relevant system qubes (sys-net,
> sys-firewall, USB qubes) are actually running — not just that USB
> controllers are correctly assigned. A properly configured sys-usb that has
> crashed is equally a security concern. This will use the domain-shutdown
> and domain-start-failed events combined with vm.features.get('servicevm',
> None) to identify and monitor all system qubes, not just USB ones.
>
>
> 4) On memory monitoring:
>
> I confirmed `vm.get_mem()` calls `admin.vm.CurrentState` which returns
> `mem` in real time — no new upstream event is needed.
>
> My plan is to re-check memory on each `domain-start` event and compare
> against `vm.maxmem`, following the `WARN_LEVEL` / `URGENT_WARN_LEVEL`
> threshold pattern already in disk_space.py.As a stretch goal, I would
> propose a `domain-memory-warning` event upstream in qubes-core-admin to
> make this fully reactive.

This one is a little more complicated. Qmemman will distribute all
available memory to some domains. So, if you have a lot of memory,
usually "mem" will be equal to "maxmem" - which doesn't mean you run low
on memory.
What would be a better check is comparing "pref_mem" (amount of memory
domain reports it wants) against "mem" (actually assigned memory). The
problem is that "pref_mem" isn't exposed anywhere - its only known
internally inside qmemman. I see there is get_pref_mem() function in the
QubesVM class, but it looks unused...

> Proposed Deliverables:
>
> 1. Refactor `disk_space.py` to use `EventsDispatcher` + qubesd liveness ping
>
> 2. System VM crash detection — `domain-shutdown` + `domain-start-failed`
> handlers, identified via `servicevm` feature
>
> 3. Memory pressure monitoring — `vm.get_mem()` triggered on `domain-start`,
> per-VM thresholds
>
> 4. USB security check — PCI interface `p0c03**` detection in dom0,
> multi-USB-qube aware
>
> 5. Per-VM opt-out feature flag (`health-monitor-not-notify`), following the
> existing `disk-space-not-notify` + `NeverNotifyItem` pattern
>
> 6. Unit tests using `MockQubesComplete` for all new checks
>
> 7. Documentation
>
>
>
> Question: For the widget consolidation — is `disk_space.py` (`DiskSpace` /
> `Gtk.StatusIcon`) the right entry point to extend, or do you and Marta have
> a different widget in mind as the consolidation target? I want to make sure
> I am building on the right foundation before formalizing the proposal,
> since this decision affects the architecture throughout.

Marta, what do you think?
> --
> You received this message because you are subscribed to the Google Groups "qubes-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to qubes-devel...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/qubes-devel/d2ed60bd-b9a0-4a4f-87ed-d0982ced0119n%40googlegroups.com.


- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAmm8E3wACgkQ24/THMrX
1yyF5ggAkGpE+g+Z+2ddDkVhT0DtTjg64zoT/eBwe93KRQJVnmaNXUEfxoQGq/U2
29RghGgYG9aB3NOwuollTCmWL0F0F9V+Odt1v0bxrH+sUFp96MgssNb0NKQ4Rz0F
KR/GhCFvwRVKnRty+f9SI5DXuxSBjueVR9o3b5sm2wgeHp2UOYk1NTx223/e0cQR
yVZEL8wWgzCq95kBn5uy9STDsTxk/yK+NuJqso9kodRJiXaFUGDmV399FuXy1xYM
u3u/BOT3fJggnRGIXeI43KZNgXFyE2oHnDnATFoMMNPR0kWpYSNXi7vjglooJxHT
GQy1yi0IGnQL0+nfmg8mI3ke6w1VBg==
=cc02
-----END PGP SIGNATURE-----

Sahil Kumar

Mar 20, 2026, 6:26:51 AM
to qubes-devel

Hi Marek, Ben, Marta and the Qubes Team,

Thank you for the corrections. I researched all points carefully:

1)  On the Stats channel for disk space:

Confirmed. qui/tray/domains.py already uses a second EventsDispatcher(qapp, api_method="admin.vm.Stats") which emits vm-stats events with memory_kb and cpu_usage kwargs. I will use this same pattern — add a stats_dispatcher alongside the main dispatcher in the refactored widget, and hook disk space refresh to vm-stats events rather than a fixed timer.
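
Since `vm-stats` events arrive far more often than disk usage changes, the refresh would likely need throttling. A minimal sketch, assuming handlers are called as `handler(vm, event, **kwargs)`; `ThrottledRefresh` and its parameters are hypothetical names:

```python
import time


class ThrottledRefresh:
    """Run ``refresh`` at most once per ``min_interval`` seconds."""

    def __init__(self, refresh, min_interval=60.0, clock=time.monotonic):
        self._refresh = refresh
        self._min_interval = min_interval
        self._clock = clock  # injectable so tests can control time
        self._last_run = None

    def on_vm_stats(self, vm, event, **kwargs):
        """Event handler; returns True if a refresh actually ran."""
        now = self._clock()
        if (self._last_run is not None
                and now - self._last_run < self._min_interval):
            return False
        self._last_run = now
        self._refresh()
        return True
```

The handler ignores the event payload on purpose: the stats event only serves as a heartbeat that triggers the disk check.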

2) On the USB check:

You are right: my check was backwards. The correct approach is:

    # Get all PCI devices exposed by dom0
    for dev in qapp.domains["dom0"].devices["pci"].get_exposed_devices():
        # Check if it is a USB controller (interface p0c03**)
        if any(str(iface).startswith("p0c03") for iface in dev.interfaces):
            # Check if assigned/attached to any non-dom0 VM
            assigned_elsewhere = False
            for vm in qapp.domains:
                if vm.name == "dom0":
                    continue
                for assigned in vm.devices["pci"].get_dedicated_devices():
                    if assigned.port == dev.port:
                        assigned_elsewhere = True
                        break
            if not assigned_elsewhere:
                # USB controller remains in dom0 = insecure
                pass  # placeholder: raise the warning here

This fires on startup and on device-assign:pci / device-unassign:pci events.

Multiple USB qubes handled correctly — only warns if a USB controller has no non-dom0 assignment.
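
The same logic can be expressed as a set difference, which makes the multi-USB-qube case easy to test. This sketch uses plain dictionaries in place of qubesadmin objects; the function name, the input shapes, and the port identifiers are illustrative assumptions:

```python
def usb_controllers_left_in_dom0(exposed, assigned_by_vm):
    """Return ports of USB controllers not assigned to any non-dom0 qube.

    exposed:        dict mapping PCI port id -> list of interface strings
    assigned_by_vm: dict mapping qube name -> iterable of PCI port ids
    """
    usb_ports = {
        port
        for port, interfaces in exposed.items()
        if any(str(iface).startswith("p0c03") for iface in interfaces)
    }
    assigned_elsewhere = {
        port
        for vm_name, ports in assigned_by_vm.items()
        if vm_name != "dom0"
        for port in ports
    }
    # Whatever is not assigned elsewhere stays in dom0.
    return sorted(usb_ports - assigned_elsewhere)


exposed = {
    "00_14.0": ["p0c0330"],  # USB controller
    "00_1a.0": ["p0c0320"],  # second USB controller
    "00_19.0": ["p020000"],  # network card, ignored
}
print(usb_controllers_left_in_dom0(exposed, {"sys-usb": ["00_14.0"]}))
```

Collecting all assignments once avoids the nested loop over every qube for every controller.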

3)  On memory monitoring:

I searched the entire codebase thoroughly — pref_mem and get_pref_mem do not exist anywhere in qubesadmin. The memory_kb field in vm-stats is currently assigned memory, not preferred memory. Without pref_mem being exposed via the Admin API, meaningful pressure detection is not possible from the client side.

Would it be in scope for this GSoC project to propose exposing pref_mem via a new field in admin.vm.Stats or a new Admin API call in qubes-core-admin? 

Or would you prefer I scope memory monitoring out of core deliverables and list it as a stretch goal pending that upstream work?
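
For reference, the comparison Marek described (assigned "mem" against "pref_mem") could look like the sketch below once the preferred value is available to the client. The ratio thresholds and the function name are hypothetical, and `pref_mem_kb` is assumed to come from an API that does not exist yet:

```python
# Hypothetical thresholds: fraction of preferred memory actually assigned.
WARN_RATIO = 0.9
URGENT_RATIO = 0.75


def memory_pressure(mem_kb: int, pref_mem_kb: int) -> str:
    """Classify pressure from assigned vs. preferred memory.

    Note that ``mem == maxmem`` says nothing here: a qube is under
    pressure only when it gets less than it asks for.
    """
    if pref_mem_kb <= 0:
        return "unknown"
    ratio = mem_kb / pref_mem_kb
    if ratio < URGENT_RATIO:
        return "urgent"
    if ratio < WARN_RATIO:
        return "warn"
    return "ok"
```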


Should I now prepare the first draft of the proposal for GSoC 2026?


Thank you again.

Best regards, 

Sahil Kumar 

GitHub: https://github.com/Sahilll10

Gmail: sahilkum...@gmail.com

Marta Marczykowska-Górecka

Mar 20, 2026, 7:55:33 AM
to qubes...@googlegroups.com

Hi,

First things first: you sound a bit like you're writing with an LLM. I would point to https://doc.qubes-os.org/en/latest/introduction/contributing.html#using-ai-in-contributions: if you are using an LLM to contribute in ANY way, this MUST be disclosed. (Using an LLM to sound better also counts.)


> Question: For the widget consolidation — is `disk_space.py` (`DiskSpace` /
> `Gtk.StatusIcon`) the right entry point to extend, or do you and Marta have
> a different widget in mind as the consolidation target? I want to make sure
> I am building on the right foundation before formalizing the proposal,
> since this decision affects the architecture throughout.

> Marta, what do you think?

Why disk space in particular? It can work, not a bad idea, but I think this should be explained. I don't think it's that deep of a decision, though, because the widgets are pretty interchangeable in many ways (and there's too many of them :D). Ideal approach would take into account that Wayland has bad support for tray icons and offer a way to interact with whatever is the result that does not come from a tray icon, but this might be too complicated?

Best,

Marta


Best Regards,
Marta Marczykowska-Górecka
Qubes OS / Invisible Things Lab

Sahil Kumar

Mar 20, 2026, 8:47:01 AM
to qubes-devel

Hi Marta,

Yes, I have used an AI assistant to help me compose mailing list replies and to phrase things. I should have disclosed this from the start and I did not; that was wrong and I apologize. Going forward I will disclose this on every contribution where I have used an LLM.

My contributions to Qubes OS (running the commands, iterating, fixing the actual bugs) were all done by myself, and I am confident I can explain each and every commit I made.

On the widget question: I looked at all three tray widgets: disk_space.py, domains.py, and updates.py. All three use Gtk.StatusIcon with the same Gtk.Application base pattern, and all three already import gtk3_xwayland_menu_dismisser, which forces GDK_BACKEND=x11 for XWayland compatibility. So the Wayland limitation is shared across all of them equally.

Marek specifically suggested disk_space.py because it already monitors system state; that is the main reason I chose it. If there is a preferred widget from your side, I am happy to work with that instead. For Wayland-native support without tray icons, I could explore Gio.Notification for alerts as a complementary path, though I agree that may be a bit complicated.

Thank you.



Best regards, Sahil Kumar
GitHub: https://github.com/Sahilll10

Marta Marczykowska-Górecka

Mar 20, 2026, 10:53:00 AM
to qubes...@googlegroups.com

Hi,

Ok, as long as we understand each other. Otherwise, yeah, disk_space can work.

Best,

Marta


Sahil Kumar

Mar 21, 2026, 2:15:42 AM
to qubes-devel

Hi Marek, Marta, Ben and the Qubes Team,

As the deadline for GSoC project proposals approaches, here are a few questions related to my proposal for the System Health Monitor project:

Q1 (memory, blocking): Should exposing pref_mem via a new field in admin.vm.Stats or a new Admin API call be part of this GSoC project's scope? Or should memory monitoring be listed as a stretch goal?

Q2 (updates integration): The updates widget already handles pending updates via EventsDispatcher. Should the health monitor show update status as well, or only cover VM crashes, USB security, disk space, and memory?

Q3 (service architecture): All existing widgets are started via autostart .desktop files and share qubes-widget@.service. Should the health monitor follow this same pattern, or is a separate background service expected, since systemd is mentioned in the project description?

Q4 (new events): For system VM crash detection, domain-shutdown and domain-start-failed already exist. Are there any other events I should propose adding upstream in qubes-core-admin as part of this project, for example a `domain-memory-warning` event?

I would be very grateful for your suggestions and guidance.

Disclosure: There was no AI agent used in framing these questions.

Best regards

Ben Grande

Mar 22, 2026, 7:45:26 PM
to qubes-devel, Sahil Kumar
On 26-03-20 23:15:41, Sahil Kumar wrote:
> Hi  Marek, Marta , Ben and Qubes Team

Hi.

> As the deadline for project proposal for GSOC approaches , here are a few
> questions related to my proposal on System Health Monitor project :
>
> Q1 (memory -blocking): Should exposing pref_mem via a new field in
> admin.vm.Stats or a new Admin API call be part of this GSoC project's scope? Or
> should memory monitoring be listed as a stretch goal ?

I think you have read it, but in case you haven't, please read this
thread about the method to use:

https://github.com/QubesOS/qubes-core-agent-linux/pull/550

It should be in scope because monitoring qube memory is very important,
you can find plenty of issues and forum posts asking for that. It is
also one of the two reasons a VM stalls, CPU or memory hogs. I don't
think it should be a stretch goal, if I understood it correctly, it is
something extra, while I think it is very important.

> Q2 (updates integration): The updates widget already handles pending updates
> via EventsDispatcher. Should the health monitor show update status as well, or
> only cover :VM crashes, USB security, disk space, memory?

I don't understand where the "USB security" comes from on this thread.
Can you bring context? Anyway, this feels closer to #6663,
which doesn't need a complex health monitor, just some indication in the
global config. The first post on #2134 mentions that qubes with PCI
devices have a higher chance of crashing. Their crashes should be logged
(there is already a log file, I mean a different thing), containing the
date and reason (if possible) of the crash. Sometimes restarting them is
not the best solution, as it may not solve the problem, such as missing
no-strict-reset for example, so if the reason is known, it shouldn't try
to restart the qube.

I don't think it should show update status as well, there is already a
widget for that, you'd know if using it. I think you will need to have a
computer running Qubes to be able to develop this project. It would make
your proposal more accurate, as you'd notice pain points rather than
relying on 3rd party reports.

> Q3 (service architecture): All existing widgets are started via autostart
> .desktop files and share qubes-widget@.service. Should the health monitor
> follow this same pattern, or is a separate  background service expected as
> there is systemd mentioned in the project description?

Marta has some remarks about the use of a widget, see the following from
the mail you have forwarded:

> On Friday, March 20, 2026 at 5:25:33 PM UTC+5:30 Marta
> Marczykowska-Górecka wrote:
> Why disk space in particular? It can work, not a bad idea, but I
> think this should be explained. I don't think it's that deep of a
> decision, though, because the widgets are pretty interchangeable in
> many ways (and there's too many of them :D). Ideal approach would
> take into account that Wayland has bad support for tray icons and
> offer a way to interact with whatever is the result that does not
> come from a tray icon, but this might be too complicated?

I'd prefer if it was something that can be added to the system tray
that would open the application if not running or bring it up front if
running. It should also be possible to start it from the command line or
as a .desktop application.

About being a background service, I hope it can replace qui-domains.

About how I'd like such an application to look (I hope the UX lead
doesn't punch me): something similar to the KDE monitor:

https://apps.kde.org/plasma-systemmonitor/

Also try the application on KDE so you can interact with it and see what
it does.

But displaying the CPU, memory, disk usage, network, per qube, as well
as a global average. Showing historical data as graphs would also be
useful. I don't know if there is going to be enough time to do all of
this, but I am saying anyway to leave room for modifying things in the
future.

I added a lot of sub-issues to #2134, please take a look at it.

> Q4 (new events): For system VM crash detection: domain-shutdown and
> domain-start-failed already exist. Are there any other events I should propose
> adding upstream in qubes-core-admin as part of this project like for example a
> `domain-memory-warning event`?

Wouldn't hurt, although I'd ask for something that would have a generic
prefix, like "domain-health-warning:(memory|cpu|...)".
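
A small sketch of how a client could route events under such a generic prefix. The event name `domain-health-warning:<kind>` is only the suggestion from this message, not an existing event, and the helper names are hypothetical:

```python
HEALTH_PREFIX = "domain-health-warning:"


def parse_health_event(event: str):
    """Return the warning kind ('memory', 'cpu', ...) or None."""
    if not event.startswith(HEALTH_PREFIX):
        return None
    kind = event[len(HEALTH_PREFIX):]
    return kind or None


def dispatch_health_event(event: str, handlers: dict) -> bool:
    """Call the handler registered for the event's kind, if any."""
    kind = parse_health_event(event)
    handler = handlers.get(kind)
    if handler is None:
        return False
    handler(kind)
    return True
```

With one prefix, adding a new warning kind later only needs a new entry in the handler table rather than a new event name.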

--
Best regards,
Benjamin Grande
Invisible Things Lab

Sahil Kumar

Mar 24, 2026, 3:24:14 PM
to qubes-devel

Hi Marek, Marta, Ben and the Qubes Team,

Thank you for the detailed review and the guidance throughout this discussion.

https://docs.google.com/document/d/1wClotc_ct9y2h0F92BqaMKSusm91oVZMFICS39b5yew/edit?usp=sharing

I have now prepared the first draft of my GSoC proposal for the System Health Monitor project and would really appreciate your feedback.

I also wanted to confirm whether it is appropriate to include the following as stretch goals:

  • Host-level stats (admin.host.Stats):
    Adding an API for system-wide CPU and memory (e.g., total/free memory, dom0 CPU usage) to complement admin.vm.Stats, enabling global health insights.
  • Wayland-native notifications:
    Using Gio.Notification as the primary alert mechanism (compatible with both Wayland and X11), with the tray icon acting as a secondary summary interface.

Please let me know if these align well with the project expectations or if you would recommend adjusting their scope.

Disclaimer: I have used an AI agent to frame my thoughts and understanding in the proposal.

Thank you.

Best regards,
Sahil Kumar

Sahil Kumar

Mar 25, 2026, 11:06:20 AM
to qubes-devel
The PDF of the draft proposal for the System Health Monitor project is attached.

Best Regards 
Sahil Kumar
GSoC_2026_Sahil_Kumar_SystemHealthMonitor.pdf

Marek Marczykowski-Górecki

Mar 25, 2026, 10:58:23 PM
to Sahil Kumar, qubes-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi,

Regarding your proposal:
There are some minor details of the design that will need to be worked
out (like "avail_memory_kb" should be rather "used_memory_kb", or log
file location), but overall proposal looks quite well structured.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

On Wed, Mar 25, 2026 at 08:06:20AM -0700, Sahil Kumar wrote:
> The pdf of the Draft Proposal on System Health Monitor Project is attached
> herewith.
>
> Best Regards
> Sahil Kumar
> Github: https://github.com/Sahilll10
>
> On Wednesday, March 25, 2026 at 12:54:14 AM UTC+5:30 Sahil Kumar wrote:
>
> > Hi Marek, Marta, Ben and the Qubes Team,
> >
> > Thank you for detailed review and the guidance throughout this discussion.
> >
> >
> > https://docs.google.com/document/d/1wClotc_ct9y2h0F92BqaMKSusm91oVZMFICS39b5yew/edit?usp=sharing
> > <https://docs.google.com/document/d/1wClotc_ct9y2h0F92BqaMKSusm91oVZMFICS39b5yew/edit?usp=sharing>
> >
> > I have now prepared
> > the first draft of my GSoC proposal for the System Health Monitor project
> > and would really appreciate your feedback .
> >
> > I also wanted to confirm whether it is appropriate to include the
> > following as stretch goals:
> >
> > - Host-level stats (admin.host.Stats):
> > Adding an API for system-wide CPU and memory (e.g., total/free memory,
> > dom0 CPU usage) to complement admin.vm.Stats, enabling global health
> > insights.
> > - Wayland-native notifications:
> --
> You received this message because you are subscribed to the Google Groups "qubes-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to qubes-devel...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/qubes-devel/7153628f-66f9-4d30-bf92-a7b68a803879n%40googlegroups.com.



-----BEGIN PGP SIGNATURE-----

iQEyBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAmnEoMkACgkQ24/THMrX
1yxpuAf3X0wVcgAIkjXa3zYXn3HsgJsW1HBuuI5dlkSTNWTcRRzkabs/xqGm9TLv
111D2VFPC+RqOyzh4Puf1ENCuRW1Tp7mseTRiJ7ZB7cIy/+L1PzFNJ2nusnFqLZs
0r0PhTDN6NOKAhENh9sYMIwfIjiNdZ+z1af7+NzAu+mj8AojZZsf80TRkPdvQQPR
rXH8l8x6ddLfso8InvEtRc5+SFsEOD6i6LNwvNXMaaR7f6c8b3y0Eq1vx62QniKm
fDTqZwK1rn1lSJ0xeW74x3dnGJ6OLrYXNm82AmrNjqK7SEz0/PtimqnGiWZJt4bS
WZbNPy61LfT7e2KX/4Vk7549n5DG
=QHGg
-----END PGP SIGNATURE-----

Sahil Kumar

Mar 28, 2026, 6:07:33 AM
to qubes-devel

Hi Marek, Marta, Ben, and the Qubes Team,

I have updated the final draft to reflect your technical corrections:

  1. Memory Metric Update: Replaced avail_memory_kb with used_memory_kb across the API design and UI components, updating the xenstore extraction logic to calculate MemTotal - MemAvailable.

  2. Log File Standardization: Changed the system VM crash log location to /var/log/qubes/qui.system-health.log to better align with the existing Qubes OS component logging hierarchy.
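
A minimal sketch of the calculation in item 1, assuming the input is text in the /proc/meminfo format. The function name used_memory_kb and the sample values are illustrative only, and the actual format of the xenstore entry may differ from what is assumed here.

```python
def used_memory_kb(meminfo_text: str) -> int:
    """Return MemTotal - MemAvailable (in kB) from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    try:
        return fields["MemTotal"] - fields["MemAvailable"]
    except KeyError as exc:
        # Assumption: a missing field is treated as an error, not as zero.
        raise ValueError(f"missing meminfo field: {exc.args[0]}") from None


# Made-up sample values, for illustration only.
sample = (
    "MemTotal:        4000000 kB\n"
    "MemFree:          500000 kB\n"
    "MemAvailable:    1500000 kB\n"
)
print(used_memory_kb(sample))  # 2500000
```

Raising on a missing field, rather than reporting zero, keeps a VM that publishes no data from looking like a VM under no memory pressure.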

I have also added flowcharts and a prototype for better understanding.

I would be very grateful for your review of this draft proposal.

GSoC_2026_Sahil_Kumar_SystemHealthMonitor.pdf

Sahil Kumar

Mar 29, 2026, 4:05:08 AM
to qubes-devel
Hi Ben, Marek, Marta, and the Qubes Team,

As the deadline for proposal submission approaches, could you please review my updated draft proposal for the System Health Monitor project?

Best Regards
Sahil Kumar
GSoC_2026_Sahil_Kumar_SystemHealthMonitor.pdf

Marek Marczykowski-Górecki

Mar 29, 2026, 10:02:45 AM
to Sahil Kumar, qubes-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi,

For the proposal it looks fine now.

BTW, just as a reminder, we have a policy regarding disclosing AI usage:
https://doc.qubes-os.org/en/latest/introduction/contributing.html#using-ai-in-contributions

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

On Sat, Mar 28, 2026 at 03:07:33AM -0700, Sahil Kumar wrote:
>
>
> Hi Marek, Marta ,Ben and the Qubes Team
>
> I have updated the final draft to reflect your technical corrections:
>
> 1.
>
> Memory Metric Update*:* Replaced avail_memory_kb with used_memory_kb
> across the API design and UI components, updating the xenstore extraction
> logic to calculate MemTotal - MemAvailable.
> 2.
>
> Log File Standardization*:* Changed the system VM crash log location to
> To view this discussion visit https://groups.google.com/d/msgid/qubes-devel/1ff167d6-7951-452e-bfdf-ef146e7e6e48n%40googlegroups.com.



-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAmnJMP0ACgkQ24/THMrX
1yyTewf/RsB+pwInrqI5NUFxd2whdxP2bLChweaufX7h53mIbb4GIP9JZSaRya/H
BxoyoJVjcEnbnevHzhUBFUq3csZQd96douMnG5KfJqHuQIu8fhWay6ZuQ7gG+dp6
ozQNhQHG3n34Hx2jxEKXYUHO0DYa4rK1C85f3A9ZhS4mZmC6nSjGdn4Kzr/+k37/
Y5HrSwgI/pG1gr1VDfQLNqhqTB4/P+LCS12pxeu2henOhfNTQ8V0bI3JAGOyIxAB
2KqJnliBIVNxXQ0dOst/bYrggMFNOcdbyKnHFINxwoq26jDJ4ixs56lvKE72E4hW
mJOV0yoxCgPUWO/uLEjD3Qt0jaWqrA==
=mCNt
-----END PGP SIGNATURE-----

Sahil Kumar

Apr 6, 2026, 5:01:09 PM
to qubes-devel
Hi Ben, Marek, Marta, and the Qubes Team,

While reading through the codebase and the existing PRs related to the System Health Monitor, I decided to start some groundwork before the Pre-GSoC period begins. I have a few questions.

1) Marek suggested in PR #550 that used_memory_kb should be added to admin.vm.Stats by reading xenstore memory/meminfo.
However, Ben pointed out that this key does not exist for fixed-memory VMs such as sys-net and sys-usb.

Should dom0 create this xenstore key and grant write permission even for maxmem == 0 VMs, or should the value go to QubesDB instead where the VM can write freely without dom0 pre-creating anything?

2) Marek also suggested in #10594 that swap monitoring could be useful for detecting memory pressure. 
Is monitoring swap via the same xenstore/QubesDB path something worth adding alongside used_memory_kb, or should that be kept separate?


3) For the total system memory bar described in #8368: xc.physinfo() already provides total_memory and free_memory inside qubesd, but nothing exposes these values to clients yet.

Is a new admin.host.Stats method the right approach here, or is there a different design already being considered?
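
A rough sketch of what a client could compute for the memory bar in question 3, assuming a mapping with total_memory and free_memory entries like the one described above for xc.physinfo(). The function name host_memory_summary, the returned keys, and the sample numbers are hypothetical. This is not an existing Qubes API, and the units are whatever the input uses.

```python
def host_memory_summary(physinfo: dict) -> dict:
    """Derive used memory and a usage percentage from total/free values."""
    total = int(physinfo["total_memory"])
    free = int(physinfo["free_memory"])
    if total <= 0 or not 0 <= free <= total:
        raise ValueError("inconsistent memory values")
    used = total - free
    return {
        "total": total,
        "free": free,
        "used": used,
        # Percentage of total memory in use, rounded for display.
        "used_percent": round(100 * used / total, 1),
    }


# Made-up sample values standing in for real hypervisor data.
example = {"total_memory": 16777216, "free_memory": 4194304}
print(host_memory_summary(example)["used_percent"])  # 75.0
```

Keeping the calculation separate from the call that fetches the data means the same helper would work whichever API design is chosen.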

I would like to open draft PRs for both the upstream memory work and the host stats API, under your guidance.