As I said in my first post: I use Netbox as a source of truth to *configure* other services.
I don't want monitoring to poll known failed devices. Apart from wasting resources, I don't want any more alerts from a device which is known for sure to be out of service. Therefore, when I set the status of a device to "Failed" in Netbox, this automatically removes it from the monitoring configuration.
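A minimal sketch of that kind of filtering, not necessarily how any particular setup does it. It assumes the device list has already been fetched from Netbox's REST API (`/api/dcim/devices/`), where each device carries `status.value` and `primary_ip.address`, and that the monitoring side is Prometheus reading file-based service discovery. The function name `build_file_sd`, the port `9100`, the `device` label and the sample devices are all hypothetical, and only IPv4 addresses are handled.

```python
import json


def build_file_sd(devices, port=9100):
    """Build Prometheus file_sd target groups from Netbox-style device dicts.

    Only devices whose status is "active" are included, so marking a device
    as "failed" in Netbox drops it from the generated targets.
    The port is an assumption (9100 is the usual node_exporter port).
    """
    groups = []
    for dev in devices:
        # Netbox's REST API returns status as {"value": "...", "label": "..."}
        if dev["status"]["value"] != "active":
            continue
        primary_ip = dev.get("primary_ip")
        if not primary_ip:
            continue  # nothing to poll without a primary IP
        # primary_ip["address"] includes the prefix length, e.g. "192.0.2.10/24"
        address = primary_ip["address"].split("/")[0]
        groups.append({
            "targets": [f"{address}:{port}"],
            "labels": {"device": dev["name"]},
        })
    return groups


# Hypothetical sample data in the shape the Netbox API returns
devices = [
    {"name": "sw1", "status": {"value": "active"},
     "primary_ip": {"address": "192.0.2.10/24"}},
    {"name": "sw2", "status": {"value": "failed"},
     "primary_ip": {"address": "192.0.2.11/24"}},
]

print(json.dumps(build_file_sd(devices), indent=2))
```

Here `sw2` is omitted from the output because its status is "failed", so the monitoring platform never sees it until the status is set back to "active".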
> what do you do when device polling failed?
The monitoring system generates an alert, and I look at it. In the meantime monitoring keeps polling the device, because Netbox still says it's Active (and maybe it will come back). After a human has triaged the problem, *if* the issue is that the device itself is broken and will remain that way for a while, then I set the status to "Failed".
Just because a device fails to respond to monitoring doesn't mean that the device itself has failed. It could be the network connection to it, for example.
> if "NetBox intends to represent the desired state of a network", what purpose of "Failed" status?
In this example, the "desired" state of my monitoring service is only to poll devices which should be responding. That information is pushed from Netbox to the monitoring platform.
In other words, Netbox is like a control panel that represents the desired state of the entire network *and* the downstream systems which are controlled by it. It's not a portal onto data collected by other systems: I still need to look at the monitoring platform, analyze logs, etc. To build visibility of operational status you'll want something that generates dashboards, like Grafana, and something that gives you historical records of metrics and logs, like Prometheus and Loki.