At this point I am the only one running queries. When I have no targets defined, the memory usage seems to stay flat.
When I changed the following flags in non-prod, it seemed to stabilize the memory usage:
--storage.tsdb.max-block-duration 15d
--storage.tsdb.min-block-duration 1h
I will try copying the binaries and configs from prod to non-prod.
I am planning on looking at Thanos instead of an NFS mount. That is going to take some time.
I did add some file targets back in non-prod for a total of 900 checks, and Prometheus leveled out at about 22 GB.
Prod - TSDB Status - Head Stats
Number of Series=2 million
Number of Chunks=11 million
Number of Label Pairs=59k
Current Min Time=2021-10-15T16:00:00.006Z (1634313600006)
Current Max Time=2021-10-15T18:51:07.414Z (1634323867414)
Non-Prod - TSDB Status - Head Stats
Number of Series=82k
Number of Chunks=400k
Number of Label Pairs=2k
Current Min Time=2021-10-15T18:05:27.705Z (1634321127705)
Current Max Time=2021-10-15T18:50:58.200Z (1634323858200)
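As a quick sanity check on how much data each head block currently covers, the min/max times above can be subtracted. This is only a minimal Python sketch using the millisecond timestamps copied from the TSDB status output; the `head_window` helper name is my own and not part of Prometheus.

```python
from datetime import timedelta

# Head min/max times in Unix milliseconds, copied from the TSDB status output above.
heads = {
    "prod": (1634313600006, 1634323867414),
    "non-prod": (1634321127705, 1634323858200),
}


def head_window(min_ms: int, max_ms: int) -> timedelta:
    """Return the time span covered by the head block (hypothetical helper)."""
    return timedelta(milliseconds=max_ms - min_ms)


for name, (min_ms, max_ms) in heads.items():
    print(f"{name}: head covers {head_window(min_ms, max_ms)}")
```

With these numbers the prod head covers a little under 3 hours and the non-prod head about 45 minutes, so the two snapshots are not covering the same amount of time.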
Prod
Showing nodes accounting for 3939.82MB, 73.70% of 5345.97MB total
Dropped 292 nodes (cum <= 26.73MB)
Non-prod
Showing nodes accounting for 1.43GB, 91.54% of 1.56GB total
Dropped 133 nodes (cum <= 0.01GB)
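To put the two profiles on the same scale, the sampled heap totals can be divided by the head series counts. This is a rough sketch only: the series counts are the rounded values from the TSDB status above, I am assuming the pprof units are 1024-based, and the heap profile total is in-use heap rather than the full process memory, so the results are order-of-magnitude at best.

```python
# Rough heap-per-series estimate from the numbers pasted above.
# Assumptions: series counts are rounded, pprof "MB"/"GB" are 1024-based,
# and the profile total is sampled in-use heap, not process RSS.
KB_PER_MB = 1024
MB_PER_GB = 1024

profiles = {
    # name: (heap total in MB, head series count)
    "prod": (5345.97, 2_000_000),
    "non-prod": (1.56 * MB_PER_GB, 82_000),
}


def heap_kb_per_series(total_mb: float, series: int) -> float:
    """Return sampled heap per head series in KB (hypothetical helper)."""
    return total_mb * KB_PER_MB / series


for name, (total_mb, series) in profiles.items():
    print(f"{name}: ~{heap_kb_per_series(total_mb, series):.1f} KB of heap per series")
```

On these figures non-prod holds noticeably more heap per series than prod, which suggests the series count alone does not account for the non-prod usage.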