On 18 November 2016 at 01:19, Meher Garda <meher...@gmail.com> wrote:
>
> How long does crash recovery take?
Usually only a few minutes, but that depends on your storage device.
As you have started Prometheus on a different server but on the same
data, it sounds like you are using some kind of network storage device.
That might explain the slowness in your case.
> Is there a way for prometheus to begin scraping and make new monitoring
> data available while crash recovery is ongoing?
No.
> I ask because since the last 2 hours while crash recovery is happening, prometheus
> is not listening on port 9090 and we basically have no monitoring available.
The immediate remedy is to start another server in parallel with the
same config but a fresh data directory. You won't have history, but
you get fresh metrics for the time being.
That's the good thing about pull-based monitoring: a new server starts
scraping the same targets right away, without any changes on the targets.
Also, in general, I would run every important Prometheus server twice,
with the same config and separate data directories, for redundancy.
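
As a rough sketch of the "same config, fresh data directory" setup, the
helper below builds the command line for a second instance. The flag
names are the ones I would expect for Prometheus 1.x (single dash,
-storage.local.path) and 2.x (double dash, --storage.tsdb.path); the
function name, the file name prometheus.yml, and the listen address
:9091 are assumptions for illustration only, so check them against your
own version and setup.

```python
import tempfile
from pathlib import Path


def prometheus_command(config_file, data_dir, listen_address=":9091", major_version=1):
    """Build the argv for a Prometheus instance with its own data directory.

    Hypothetical helper: it only assembles the command, it does not start anything.
    """
    if major_version >= 2:
        # Prometheus 2.x: double-dash flags, TSDB storage path.
        return [
            "prometheus",
            f"--config.file={config_file}",
            f"--storage.tsdb.path={data_dir}",
            f"--web.listen-address={listen_address}",
        ]
    # Prometheus 1.x: single-dash flags, local storage path.
    return [
        "prometheus",
        f"-config.file={config_file}",
        f"-storage.local.path={data_dir}",
        f"-web.listen-address={listen_address}",
    ]


if __name__ == "__main__":
    # A fresh, empty directory means there is nothing to recover on startup.
    fresh_dir = Path(tempfile.mkdtemp(prefix="prometheus-fresh-"))
    print(" ".join(prometheus_command("prometheus.yml", fresh_dir)))
```

A different listen address only matters if the second instance shares a
host with the one that is still recovering; on a separate machine the
default port works just as well.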
--
Björn Rabenstein, Engineer
http://soundcloud.com/brabenstein