Azure file as storage backend


Greg Intive

Feb 26, 2020, 11:06:30 AM2/26/20
to Prometheus Users
Hi. 

We have an AKS cluster on Azure, which is monitored by an in-cluster Prometheus.
The Prometheus pod has persistent storage configured with Azure Disk as the storage backend (it is described here).

Some time ago we decided to use Azure Files as the storage backend instead (a basic description of which can be found here).

After some initial tests I decided to run it and migrate the data from the previous instance (by copying the old data directories).
So I started a new instance, copied some of the old data to the /data dir, restarted Prometheus, and everything was fine. Newly scraped data appeared on the dashboards alongside the old, migrated data.
Then I started to migrate another batch of data.
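
Roughly, the copy step looked like this (a minimal sketch; the helper name and paths are illustrative, not the exact commands used, and copying should only be done while the target Prometheus is stopped):

```shell
# Hypothetical sketch of the migration step described above.
# Copies ULID-named TSDB block directories from an old data dir
# into a new one, preserving attributes. Function name and the
# way blocks are matched are assumptions, not from this thread.
migrate_blocks() {
  src="$1"
  dst="$2"
  for block in "$src"/[0-9A-Z]*; do
    # Only copy block directories; skip stray files
    if [ -d "$block" ]; then
      cp -a "$block" "$dst/"
    fi
  done
}
```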

Everything was working fine, but about 1.5 hours after the start I noticed that the server was creating one new data directory per minute:

drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:00 01E20SAAAK4CC27H07QH3ZXN3C
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:00 01E20SAQWA1KVBCNDBRZ5KTVQ1
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:00 01E20SAYDGXC12DSXA7066NJ03
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:00 01E20SB6G921Y6T0ZFJTKNWDNX
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:00 01E20SBDYENAF1NE7YRDHHY34K
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:00 01E20SBP0VW47HWK00EREXW15A
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:00 01E20SBY3EN5JS0K32X9WTNXC2
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:01 01E20SC70QMKGF0QSSN25PESF0
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:01 01E20SCHR65MQRMPG58XG8JMT9
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:01 01E20SCVD861GEFYHR5695EQNH
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:01 01E20SD8C78SG806AN5X22DPHS
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:01 01E20SDK67WV2X82K8RNWAM1FD
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:01 01E20SDY6R4XVK6RJQA2PZH16H
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:02 01E20SE9X5ZG2WPV6E9RZ88JPK
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:02 01E20SEQG0T38AZK4G8SNZEAEG
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:02 01E20SF9AR8TJZE372ZRP06JYS
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:03 01E10T77TYT8YBJ7RH3R1CE88Z
drwxr-xr-x    2 nobody   nogroup          0 Feb 26 13:03 01E20SG2NKH38JM55SS8Y3N0M5


In the logs I saw something like this:

level=error ts=2020-02-26T13:01:58.895708548Z caller=db.go:341 component=tsdb msg="compaction failed" err="reload blocks: invalid block sequence: block time ranges overlap: [mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s, blocks: 12]: <ulid: 01E20SBDYENAF1NE7YRDHHY34K, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SAQWA1KVBCNDBRZ5KTVQ1, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SAYDGXC12DSXA7066NJ03, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SB6G921Y6T0ZFJTKNWDNX, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SAAAK4CC27H07QH3ZXN3C, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SBP0VW47HWK00EREXW15A, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SBY3EN5JS0K32X9WTNXC2, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SC70QMKGF0QSSN25PESF0, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SCHR65MQRMPG58XG8JMT9, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SCVD861GEFYHR5695EQNH, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SD8C78SG806AN5X22DPHS, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SDK67WV2X82K8RNWAM1FD, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>" level=info ts=2020-02-26T13:02:02.134855312Z caller=compact.go:443 component=tsdb msg="write block" mint=1582711200000 maxt=1582718400000 ulid=01E20SDY6R4XVK6RJQA2PZH16H  
...
(shortened for readability)

After some time this log entry is repeated multiple times, but with an ever-higher block count:

level=info ts=2020-02-26T14:02:24.079677532Z caller=compact.go:443 component=tsdb msg="write block" mint=1582711200000 maxt=1582718400000 ulid=01E20WWEN4Y1FW535M5VN1WZCZ
level=error ts=2020-02-26T14:03:40.144058489Z caller=db.go:341 component=tsdb msg="compaction failed" err="reload blocks: invalid block sequence: block time ranges overlap: [mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s, blocks: 95]: <ulid: 01E20TSHHXVCCEQZ7RMR1F0N8H, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SAQWA1KVBCNDBRZ5KTVQ1, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SAYDGXC12DSXA7066NJ03, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SB6G921Y6
T0ZFJTKNWDNX, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SBDYENAF1NE7YRDHHY34K, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SBP0VW47HWK00EREXW15A, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20TYZV0Y8FMM3WT1DY6FRKB, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SC70QMKGF0QSSN25PESF0, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SCHR65MQRMPG58XG8JMT9, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SCVD861GEFYHR5695EQNH, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SD8C78SG806AN5X22DPHS, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SDK67WV2X82K8RNWAM1FD, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SDY6R4XVK6RJQA2PZH16H, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SE9X5ZG2WPV6E9RZ88JPK, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SEQG0T38AZK4G8SNZEAEG, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SF9AR8TJZE372ZRP06JYS, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SG2NKH38JM55SS8Y3N0M5, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SGDREP1SQNRH738259AC5, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SGR6W3VCM0MH8QB6VTCPS, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SH2D4WAD4FM2DC6ZH0TAM, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SHEBKHMV1P7KP7B89FRYM, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SHWT3C6WA3TG09SB80H45, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SJ9NXN1M0AXS18R6X5CK2, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SJQ3Z3KD60YN6WQ805DNH, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SK3F800SK1MKN19PGGETZ, mint: 1582711200000, maxt: 
1582718400000, range: 2h0m0s>, <ulid: 01E20SKM5AM6FB00248HPRZ1W4, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SM1Z00J5S07ZMCP5V3JR1, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SMF3F3PEGJ8244CBBG784, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SMXWGCF7J2ZW510TE5QVW, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SND70FW39613K843ZVH4Z, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SNVH2ACCPASRWT8H5RZH5, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SPARSEFP8E9EX9461NE4M, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SPT29R98NC852BMQTNZ1Y, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SQAKQE8T41XS5Q70049FG, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SQV17F1F4PBJYSJ9TXVJZ, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SRCYKKG5G210SX9RC4RGA, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SAAAK4CC27H07QH3ZXN3C, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SSG76YSCXA7K9Q75CWDJA, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20ST49KD8029JGAH7Y090EX, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20STV260VAPK05DY5DES9PV, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SVGFTJPGA2YENCGW75QTF, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SX2EEP791A6JJMFMSFRDG, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20SZFX1AG9NYBNHV22XAVQP, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T0683AYR4J7PSC8YTMBFR, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T0TA05FFDNYMJ0W33PFGV, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T1DQGFZ0SA3FW60H6TY2D, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 
01E20T231MT7CTDQRHR7KR3A4Z, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T2VBWQR4D2T7C11V4HCGB, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T3H789XEW22454B17KHBF, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T46199V8E0WRWQ0GN3J69, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T4X0PXDM79TCW22A1ANJN, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T5K9DGF3TS09EYW5B3E9Q, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T6A0PFCT3JJS86MBQNNYS, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T71QNK9WBGT5QFDWMGYY2, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T7TEM7Y1TNKQYV8VPJW9H, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T8H90FC39T0G7JZG8E86Y, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20T99XASQ8MQTNVQRRPG9XZ, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20TA5JRJZZVP5AXJWTKCYHD, mint: 1582711200000, maxt: 1582718400000, range: 2h0m0s>, <ulid: 01E20TCS5H585B02X2SZCRFN9W, mint: 1582711200000, maxt:
...
(shortened for readability, but this is much longer than the first entry)

When I run the Prometheus server without copying any data, everything is fine. The copied data definitely don't overlap; they are from the beginning of this month.
The timestamps and the increasing number of blocks suggest that the compaction loop creates an additional directory with data, doesn't delete the old one, then creates yet another directory with compacted data... and so on.
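
One way to see the overlap directly is to dump each block's time range from its meta.json, for example with a rough helper like this (the name and the sed-based field extraction are just a sketch, assuming POSIX tools and the minTime/maxTime fields the TSDB writes):

```shell
# Hypothetical helper: list each TSDB block's ULID with its
# minTime/maxTime (pulled from meta.json with sed), sorted by
# minTime, so blocks covering the same 2h window line up together.
list_block_ranges() {
  datadir="$1"
  for meta in "$datadir"/*/meta.json; do
    [ -f "$meta" ] || continue
    ulid=$(basename "$(dirname "$meta")")
    mint=$(sed -n 's/.*"minTime"[[:space:]]*:[[:space:]]*\([0-9]*\).*/\1/p' "$meta" | head -n1)
    maxt=$(sed -n 's/.*"maxTime"[[:space:]]*:[[:space:]]*\([0-9]*\).*/\1/p' "$meta" | head -n1)
    echo "$ulid $mint $maxt"
  done | sort -k2,2n
}
```

Many lines with the identical mint/maxt pair would match the "same 2h window rewritten over and over" pattern in the error above.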

What could be the cause of this problem?



Simon Pasquier

Feb 27, 2020, 5:48:42 AM2/27/20
to Greg Intive, Prometheus Users
Which version of Prometheus are you using?

Greg Intive

Feb 27, 2020, 6:19:38 AM2/27/20
to Prometheus Users
Hi, 
it is: 

$ prometheus --version
prometheus, version 2.7.1 (branch: HEAD, revision: 62e591f928ddf6b3468308b7ac1de1c63aa7fcf3)
  build user:       root@f9f82868fc43
  build date:       20190131-11:16:59
  go version:       go1.11.5

Simon Pasquier

Mar 19, 2020, 10:19:28 AM3/19/20
to Greg Intive, Prometheus Users
I'd recommend running a more recent version of Prometheus. v2.7.1 is
more than a year old now.
