I'm currently unclear what happens when a query runs and there are multiple remote_read endpoints configured, in addition to the local tsdb.

1. I can see there is an option "read_recent:" (default false), which implies that queries within the tsdb retention period are only served from the tsdb. For a simple query like "up" which doesn't specify any labels to match, how does it know not to query the remote_read endpoints? Does it just assume that the tsdb has a complete collection of all the metrics within the retention period? (However, I note that the label set in the remote storage may be different, due to write_relabel_configs.)

2. For queries outside of the retention period: does the same query get sent to all backends and the results merged? What about a simple query like "up" which does not specify any labels?

   Aside: I assume this would be filtered if:
   - metrics in each backend had a distinct set of extra labels added
   - each remote_read backend had a corresponding "required_matchers:" setting
   - the query itself specified the labels, e.g. up{backend="influxdb1"}

3. What happens if multiple remote_read backends contain the same timeseries? Are all the data points merged, or is one backend used in preference to another, and if so which one is chosen? Is this suitable for use as a failover mechanism?

4. What's the best practice here: is it considered OK to have the same timeseries with exactly the same labels in both the local tsdb and remote backend(s)? Or is it generally recommended to use write_relabel_configs to ensure that the remotely-stored timeseries are distinctly labelled, with a corresponding required_matchers on the remote_read?
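For concreteness, here is a minimal sketch of the kind of configuration I'm asking about (endpoint URLs and label values are made up for illustration):

```yaml
# prometheus.yml (fragment) -- hypothetical endpoints for illustration
remote_write:
  - url: "http://influxdb1.example.com:8086/api/v1/prom/write"
    write_relabel_configs:
      # drop some series before they reach remote storage, so the
      # remote label/series set differs from the local tsdb
      - source_labels: [__name__]
        regex: "go_.*"
        action: drop

remote_read:
  - url: "http://influxdb1.example.com:8086/api/v1/prom/read"
    read_recent: false   # default: don't query remote for data within local retention
  - url: "http://machinedb.example.com/read"
    required_matchers:
      backend: influxdb1   # only queried when the query carries this matcher
```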
Thanks,
Brian C.
--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/02f5e809-5615-40c7-96a9-e6b9bba09dd5%40googlegroups.com.
> 4. What's the best practice here: is it considered OK to have the same timeseries with exact same labels in both local tsdb and remote backend(s)?

That's how it's expected to work.

> Or is it generally recommended to use write_relabel_configs to ensure that the remotely-stored timeseries are distinctly labelled, with a corresponding required_matchers on the remote_read?

That's what external_labels is for.

required_matchers is for a remote read backend which isn't actually storage, and is instead doing something like running SQL queries or consulting a machine database. For those there's no need to send queries that aren't explicitly looking for that external integration.
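The external_labels point above can be sketched like this (label values are hypothetical):

```yaml
# prometheus.yml (fragment) -- hypothetical label values
global:
  external_labels:
    prometheus: prom-eu-1   # attached to samples sent via remote_write,
    region: eu-west         # so series from different Prometheus servers
                            # stay distinguishable in shared remote storage
```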
On Tuesday, 3 December 2019 14:15:48 UTC, Brian Brazil wrote:

> > 4. What's the best practice here: is it considered OK to have the same timeseries with exact same labels in both local tsdb and remote backend(s)?
>
> That's how it's expected to work.

Great, thanks - I won't worry about overlapping series :-)

> > Or is it generally recommended to use write_relabel_configs to ensure that the remotely-stored timeseries are distinctly labelled, with a corresponding required_matchers on the remote_read?
>
> That's what external_labels is for.

external_labels is a global setting, so if you have multiple remote_write backends then they'll all get the same labels - hence my question about merging between multiple remote_reads. But in practice, I expect I'll stick to a single remote_read.

> required_matchers is for a remote read backend which isn't actually storage, and is instead doing something like running SQL queries or consulting a machine database. For those there's no need to send queries that aren't explicitly looking for that external integration.

That's very interesting. I had been wondering before whether it's possible to synthesise static timeseries for machine labels. Currently I'm writing machine labels into a file, serving it out of an Apache docroot, and scraping that. It works fine, but if I add a new label, it's only available from that point in time forwards.

It hadn't occurred to me to use remote_read for that purpose. Are you aware of any existing integration that reads a static metric file and serves it as an infinite timeseries? Are there any gotchas, like caching of the remote_read results which might break it?
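The pattern I have in mind would look something like this on the Prometheus side -- gating a non-storage integration behind required_matchers, so only queries that explicitly ask for it are forwarded (endpoint URL and label are hypothetical):

```yaml
# prometheus.yml (fragment) -- hypothetical machine-metadata read endpoint
remote_read:
  - url: "http://machinedb.example.com/read"
    # Only queries whose selectors include source="machinedb" are sent
    # to this backend, e.g. machine_labels{source="machinedb"};
    # everything else is answered without touching it.
    required_matchers:
      source: machinedb
```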