Hi Aliaksandr,
Thanks for the valuable insights.
I shall take a look at bomb-squad. In the meantime, do you foresee any generic optimizations, such as using metric_relabel_configs or sample_limit, that could help reduce the cardinality?
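To make the question concrete, this is roughly the kind of scrape config I have in mind. It is only a sketch: the job name, target, metric name, label name, and limit value are placeholders, not taken from our real setup.

```yaml
scrape_configs:
  - job_name: example-app                    # placeholder job name
    # Per-scrape cap: if more samples than this remain after metric
    # relabeling, the whole scrape is treated as failed.
    sample_limit: 50000                      # placeholder value
    static_configs:
      - targets: ['app.example.com:9100']    # placeholder target
    metric_relabel_configs:
      # Drop an entire metric by name before it is ingested
      # (placeholder metric name).
      - source_labels: [__name__]
        regex: 'http_request_duration_seconds_bucket'
        action: drop
      # Remove a high-cardinality label from every series
      # (placeholder label name).
      - regex: 'request_id'
        action: labeldrop
```

My understanding is that labeldrop is only safe when the remaining labels still identify each series uniquely, and that sample_limit rejects an oversized scrape outright rather than trimming it, so please correct me if either assumption is wrong.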
Time series -
Currently we have close to 8 million time series in a single block, which is compacted every 2 hours.
Prometheus config -
RAM - 120 GB
CPU - 32 core CPU
Storage - 1 TB
Problem statement -
Once the RSS memory spikes above 110 GB, Prometheus crashes, which makes our system very unstable. We can't increase the resources any further, as we are already operating with the highest available config.
Any directions, approaches, or mechanisms would be highly appreciated.
Thanks in anticipation,
Dinesh