Device 1 to Device 10 ---> BGPMIB: BGP stats are collected every 2 hours.
Each device has multiple MIBs (a variety of data) collected.
But we have a single Prometheus instance, which scrapes every 10 seconds and uses the default staleness timer of 5 minutes.
1. What is the best scrape interval for the above scenario?
a. Should it be 1 second?
b. Would that cause a performance issue?
2. We want to see the actual data points. This is causing an issue: for some of the most granular data, failures are notified only after 5 minutes, even though we expect notification within 1 minute.
3. Alerting based on the number of occurrences instead of a timer.
4. Resolving alerts based on the number of occurrences or on a duration.
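
To make points 3 and 4 concrete, the behaviour we are after is roughly the following. This is only a Python sketch of the logic, not Prometheus or Alertmanager configuration. The class name and the thresholds (3 failures to fire, 2 successes to resolve) are made-up placeholders, and counting consecutive samples is an assumption on our side.

```python
from dataclasses import dataclass


@dataclass
class CountBasedAlert:
    """Hypothetical illustration: fire/resolve on sample counts, not on a timer."""
    fire_after: int = 3      # assumed: consecutive failed samples needed to fire
    resolve_after: int = 2   # assumed: consecutive good samples needed to resolve
    firing: bool = False
    bad_streak: int = 0
    good_streak: int = 0

    def observe(self, failed: bool) -> bool:
        """Feed one scraped sample; return whether the alert is firing afterwards."""
        if failed:
            self.bad_streak += 1
            self.good_streak = 0
            if self.bad_streak >= self.fire_after:
                self.firing = True
        else:
            self.good_streak += 1
            self.bad_streak = 0
            if self.firing and self.good_streak >= self.resolve_after:
                self.firing = False
        return self.firing


alert = CountBasedAlert()
# Three failed samples in a row, then two good ones.
states = [alert.observe(s) for s in [True, True, True, False, False]]
print(states)  # [False, False, True, True, False]
```

With a 10 second scrape, such an alert would fire about 30 seconds after the first failed sample, rather than after a fixed 5 minute wait.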
Please help us by answering the above queries with a little more detail.
Regards,
Rajesh