Prometheus failed to restore metrics from snapshots.


Louis Go

Apr 21, 2021, 8:38:44 PM
to promethe...@googlegroups.com
I want to export data from one Prometheus instance and import it into another.
I found three posts, and all of them said that simply putting the snapshots under storage.tsdb.path/snapshots on the new instance is enough.
However, I can't reproduce this.

I read these posts.
     > One post said that pointing storage.tsdb.path to the snapshots directory would work. I tried that, but it didn't work.

Please let me know what I am missing.

**What did you do?**
I use Docker and followed these steps.

1. Run the Prometheus container with `--web.enable-admin-api`.
2. Take a snapshot through the API: `$ curl -XPOST http://localhost:9090/api/v1/admin/tsdb/snapshot`
3. Use `docker cp` to copy the snapshots out, then check the directory size, which is ~250M.
$ du -sh snapshots
250M    snapshots
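
The steps above can be sketched as a few shell helpers. This is only an illustration of what I did, not a verified procedure: the function names, the container name, and the TSDB path argument are placeholders, and the JSON shape (`data.name`) is the response format the Prometheus HTTP API documentation gives for the snapshot endpoint.

```shell
#!/bin/sh
# Sketch of steps 2 and 3. All names and paths passed in are assumptions.

# Read the snapshot API's JSON response on stdin and print the snapshot name.
# Expected shape: {"status":"success","data":{"name":"<snapshot-name>"}}
snapshot_name() {
    grep -o '"name": *"[^"]*"' | head -n 1 | sed 's/.*"\([^"]*\)"$/\1/'
}

# Trigger a snapshot (requires --web.enable-admin-api) and print its name.
take_snapshot() {
    curl -s -XPOST "http://localhost:9090/api/v1/admin/tsdb/snapshot" | snapshot_name
}

# Copy one snapshot out of a container.
# $1: container name, $2: storage.tsdb.path inside the container,
# $3: snapshot name, $4: destination directory on the host
copy_snapshot() {
    docker cp "$1:$2/snapshots/$3" "$4"
}
```

For example, `name=$(take_snapshot)` followed by `copy_snapshot <container> <tsdb-path> "$name" ./snapshots`, where `<container>` and `<tsdb-path>` depend on the setup.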

I tried two ways to import the data:
1. Copy the snapshots into another Prometheus container.
2. Copy the snapshots to <host>/data/snapshots and mount <host>/data as Prometheus' storage.tsdb.path.
   Note: <host>/data is empty except for the snapshots directory.

My tests were done on 2021/4/22 and my data is from around 2021/4/16. All containers use the default retention time of 15 days.
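
A possible way to double-check that the copied snapshot really contains data from 4/16, and that it falls inside the 15-day retention window, is to read the `minTime` and `maxTime` fields (Unix milliseconds) from each block's `meta.json`. The sketch below assumes the layout `<snapshot-dir>/<block-id>/meta.json` and GNU `date`; the function names are made up for illustration.

```shell
#!/bin/sh
# Print the UTC time range covered by TSDB blocks in a snapshot directory.
# Assumes <snapshot-dir>/<block-id>/meta.json and GNU date (-d @seconds).

# $1: path to a block's meta.json; prints "<min> <max>" in UTC.
block_range() {
    # Take the first (top-level) minTime/maxTime; values are in milliseconds.
    min_ms=$(grep -o '"minTime": *[0-9][0-9]*' "$1" | head -n 1 | grep -o '[0-9][0-9]*$')
    max_ms=$(grep -o '"maxTime": *[0-9][0-9]*' "$1" | head -n 1 | grep -o '[0-9][0-9]*$')
    min=$(date -u -d "@$((min_ms / 1000))" +%Y-%m-%dT%H:%M:%SZ)
    max=$(date -u -d "@$((max_ms / 1000))" +%Y-%m-%dT%H:%M:%SZ)
    echo "$min $max"
}

# $1: snapshot directory; prints one "<block-id> <min> <max>" line per block.
snapshot_ranges() {
    for meta in "$1"/*/meta.json; do
        [ -f "$meta" ] || continue
        echo "$(basename "$(dirname "$meta")") $(block_range "$meta")"
    done
}
```

Running `snapshot_ranges snapshots/<snapshot-name>` would list each block with its time range, which can then be compared against the current date minus 15 days.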

**What did you expect to see?**
I used Grafana's "Explore" function to query the metric "up" and expected to see data for 4/16, but nothing shows.

**What did you see instead? Under which circumstances?**
  Containers set up with both way 1 and way 2 show only the metrics collected since they started on 2021/4/22; there are no metrics for 2021/4/16.
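
To rule out Grafana as the cause, the same check can be made directly against the Prometheus HTTP API with a range query for `up`. A minimal sketch, assuming Prometheus listens on localhost:9090 and that the timestamps are given in UTC; the helper name is hypothetical.

```shell
#!/bin/sh
# Build a query_range URL for the "up" metric.
# $1: Prometheus base URL, $2: start (RFC 3339), $3: end (RFC 3339)
up_range_url() {
    echo "$1/api/v1/query_range?query=up&start=$2&end=$3&step=60s"
}

# Usage (needs a running Prometheus, so it is left commented out here):
# curl -s "$(up_range_url http://localhost:9090 2021-04-16T00:00:00Z 2021-04-17T00:00:00Z)"
```

An empty `result` array in the response would mean Prometheus itself has no samples for that day, independent of Grafana.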

**Environment**

* System information:

My host is a VMware Player 16 virtual machine running Ubuntu 18.04.

Linux 5.4.0-70-generic x86_64

* Prometheus version:

I'm using the container version of Prometheus.
prometheus, version 2.25.2 (branch: HEAD, revision: bda05a23ada314a0b9806a362da39b7a1a4e04c3)
  build user:       root@de38ec01ef10
  build date:       20210316-18:07:52
  go version:       go1.15.10
  platform:         linux/amd64


* Prometheus configuration file:
```
global:
  scrape_interval: 10s
  scrape_timeout: 5s
  evaluation_interval: 15s
  external_labels:
    monitor: 'monitor'

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager:9093

rule_files:
  - "/prom_setup/alert.rules"
  # - "second.rules"
 
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets:
        - 192.168.41.164:9090

  - job_name: 'host_A'
    static_configs:
      - targets: ['192.168.41.164:9100']

  - job_name: 'container_A'
    static_configs:
      - targets: ['cadvisor:8080']
```

* docker-compose file:
``` related part for prometheus
    prometheus:
        container_name: promethues
        image: prom/prometheus
        privileged: true
        volumes:
            - ./prometheus.yml:/etc/prometheus/prometheus.yml

           # try to mount only snapshots to prometheus
            - ./prom_data/:/prometheus/data/
        command:
            - '--config.file=/etc/prometheus/prometheus.yml'
            - '--web.enable-lifecycle'

              # to enable snapshot  
            - '--web.enable-admin-api'
```

* File permissions on host

# data directory
drwxrwxrwx  5 lou  lou  4096 Apr 22 08:28 prom_data/

# snapshots directory
drwxrwxrwx  4 lou    lou      4096 Apr 22 08:28 snapshots/

$ du -sh snapshots
250M    snapshots

* Logs:

```  docker-compose up message. It seems prometheus never found these logs?
promethues      | level=info ts=2021-04-22T00:28:20.992Z caller=main.go:366 msg="No time or size retention was set so using the default time retention" duration=15d
promethues      | level=info ts=2021-04-22T00:28:20.993Z caller=main.go:404 msg="Starting Prometheus" version="(version=2.25.2, branch=HEAD, revision=bda05a23ada314a0b9806a362da39b7a1a4e04c3)"
promethues      | level=info ts=2021-04-22T00:28:20.993Z caller=main.go:409 build_context="(go=go1.15.10, user=root@de38ec01ef10, date=20210316-18:07:52)"
promethues      | level=info ts=2021-04-22T00:28:20.993Z caller=main.go:410 host_details="(Linux 5.4.0-70-generic #78~18.04.1-Ubuntu SMP Sat Mar 20 14:10:07 UTC 2021 x86_64 8fa848a981f9 (none))"
promethues      | level=info ts=2021-04-22T00:28:20.993Z caller=main.go:411 fd_limits="(soft=1048576, hard=1048576)"
promethues      | level=info ts=2021-04-22T00:28:20.993Z caller=main.go:412 vm_limits="(soft=unlimited, hard=unlimited)"
promethues      | level=info ts=2021-04-22T00:28:20.998Z caller=web.go:532 component=web msg="Start listening for connections" address=0.0.0.0:9090
promethues      | level=info ts=2021-04-22T00:28:21.003Z caller=main.go:779 msg="Starting TSDB ..."
promethues      | level=info ts=2021-04-22T00:28:21.008Z caller=head.go:668 component=tsdb msg="Replaying on-disk memory mappable chunks if any"
promethues      | level=info ts=2021-04-22T00:28:21.008Z caller=head.go:682 component=tsdb msg="On-disk memory mappable chunks replay completed" duration=4.448µs
promethues      | level=info ts=2021-04-22T00:28:21.008Z caller=head.go:688 component=tsdb msg="Replaying WAL, this may take a while"
promethues      | level=info ts=2021-04-22T00:28:21.008Z caller=head.go:740 component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
promethues      | level=info ts=2021-04-22T00:28:21.008Z caller=head.go:745 component=tsdb msg="WAL replay completed" checkpoint_replay_duration=24.518µs wal_replay_duration=292.774µs total_replay_duration=360.306µs
promethues      | level=info ts=2021-04-22T00:28:21.009Z caller=tls_config.go:191 component=web msg="TLS is disabled." http2=false
promethues      | level=info ts=2021-04-22T00:28:21.009Z caller=main.go:799 fs_type=EXT4_SUPER_MAGIC
promethues      | level=info ts=2021-04-22T00:28:21.010Z caller=main.go:802 msg="TSDB started"
promethues      | level=info ts=2021-04-22T00:28:21.010Z caller=main.go:928 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
promethues      | level=info ts=2021-04-22T00:28:21.011Z caller=main.go:959 msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml totalDuration=1.487753ms remote_storage=2.162µs web_handler=883ns query_engine=1.467µs scrape=420.502µs scrape_sd=113.37µs notify=17.922µs notify_sd=28.211µs rules=550.056µs
promethues      | level=info ts=2021-04-22T00:28:21.011Z caller=main.go:751 msg="Server is ready to receive web requests."
```
Warmest Regards

Louis Go