Hi
I have been facing an issue with our Prometheus (installed from Helm). After configuring remoteWrite towards our Metricbeat, we receive only about 5 minutes of metrics and then the connection stops. I can see that prometheus_remote_storage_shards is reaching the max shards, and in the logs of the operator I see:
2021-03-04T17:56:15.976Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880565
2021-03-04T17:56:25.976Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880575
2021-03-04T17:56:35.977Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880585
2021-03-04T17:56:45.976Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880595
2021-03-04T17:56:55.976Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880605
2021-03-04T17:57:05.976Z caller=dedupe.go:111 component=remote level=debug remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg=QueueManager.calculateDesiredShards samplesInRate=1684.132190739004 samplesOutRate=261.78113035956926 samplesKeptRatio=0.444250380874013 samplesPendingRate=486.3952368184191 samplesPending=44890.5820306793 samplesOutDuration=1.4510985561508776 timePerSample=0.005543174766484209 desiredShards=11.527856913093473 highestSent=1.614880565e+09 highestRecv=1.614880625e+09 integralAccumulator=395.5166384472137
2021-03-04T17:57:05.977Z caller=dedupe.go:111 component=remote level=debug remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg=QueueManager.updateShardsLoop lowerBound=4.8999999999999995 desiredShards=11.527856913093473 upperBound=9.1
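For what it is worth, the numbers in the two debug lines seem consistent with each other. Below is a small Python sketch that recomputes them. The three relationships and the 0.7x/1.3x bounds are assumptions inferred from the logged values, not taken from the Prometheus source, and the variable names are my own.

```python
# Values copied from the calculateDesiredShards debug line.
samples_in_rate = 1684.132190739004
samples_out_rate = 261.78113035956926
samples_kept_ratio = 0.444250380874013
samples_out_duration = 1.4510985561508776
integral_accumulator = 395.5166384472137

# Assumed relationships (inferred from the logged numbers, not from the source).
time_per_sample = samples_out_duration / samples_out_rate
pending_rate = samples_in_rate * samples_kept_ratio - samples_out_rate
desired_shards = time_per_sample * (samples_in_rate + integral_accumulator)

# Bounds copied from the updateShardsLoop debug line.
lower_bound = 4.8999999999999995
upper_bound = 9.1
# Assumption: the bounds are 0.7x and 1.3x the current shard count.
current_shards = round(upper_bound / 1.3)

print("timePerSample      :", time_per_sample)
print("samplesPendingRate :", pending_rate)
print("desiredShards      :", desired_shards)
print("current shards     :", current_shards)
print("wants more shards  :", desired_shards > upper_bound)
```

If these assumptions hold, the backlog grows by roughly 486 samples per second, and desiredShards (about 11.5) is above the upper bound of 9.1, so a reshard would be expected. It is skipped only because the last successful send is older than the threshold, which is what the warnings report.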
We have "ChartName": "prometheus-operator" and "helmChartVersion": "8.12.3", which installs Prometheus version 2.15.2, and the remote write config is:
remoteWrite:
  writeRelabelConfigs:
  - sourceLabels: [observability]
    regex: 'true'
    action: keep
  queueConfig:
    capacity: 3000
    maxSamplesPerSend: 1000
    maxShards: 2000
I have been reading a bit, and there are different views on the queueConfig parameters, but none of them has worked for our case. Has this happened to any of you?
Thanks in advance for your help
Ruben