Error in querying historical data for one year


li yun

Aug 10, 2020, 10:28:43 PM
to Prometheus Users
When I query one year of historical data, errors occasionally occur that cause problems with Prometheus. The following is the error message:

level=info ts=2019-09-19T07:54:25.886041715Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1568851200000 maxt=1568858400000 ulid=01DN3QFYT850Z2SNWFCPM3DC30
net/http.(*persistConn).Read(0xc0041afc20, 0xc0044a1000, 0x1000, 0x1000, 0x40505d, 0x60, 0x0)
        /usr/local/go/src/net/http/transport.go:1825 +0x75
bufio.(*Reader).fill(0xc0046f30e0)
        /usr/local/go/src/bufio/bufio.go:100 +0x103
bufio.(*Reader).Peek(0xc0046f30e0, 0x1, 0xc004683800, 0x40cf18, 0x10, 0x2583100, 0x1)
        /usr/local/go/src/bufio/bufio.go:138 +0x4f
net/http.(*persistConn).readLoop(0xc0041afc20)
        /usr/local/go/src/net/http/transport.go:1978 +0x1a8
created by net/http.(*Transport).dialConn
        /usr/local/go/src/net/http/transport.go:1647 +0xc56

goroutine 4271 [select, 2 minutes]:
net/http.(*persistConn).writeLoop(0xc0041afc20)
        /usr/local/go/src/net/http/transport.go:2277 +0x11c
created by net/http.(*Transport).dialConn
        /usr/local/go/src/net/http/transport.go:1648 +0xc7b

goroutine 4380 [select, 2 minutes]:
, 0x0)
        /app/scrape/scrape.go:911 +0x129
        /app/scrape/scrape.go:422 +0x520

goroutine 4412 [select, 2 minutes]:
, 0x0)
        /app/scrape/scrape.go:911 +0x129
        /app/scrape/scrape.go:422 +0x520

Bjoern Rabenstein

Aug 11, 2020, 8:28:56 AM
to li yun, Prometheus Users
On 10.08.20 19:28, li yun wrote:
> I check the historical data for a year, and occasionally some errors may cause
> problems with Prometheus. The following is the error message
> [...]

The log entries shown have nothing to do with querying. It looks more
like your server crashed for some reason (perhaps OOMing, which might
be triggered by your query) and therefore dumped stack traces of all
the goroutines. The initial info-level log line is not a stack trace
and looks more like something that shows up during startup. Perhaps an
earlier crash left the TSDB files on disk in an unrecoverable state.

First thing I would check is Prometheus's memory consumption while
running the expensive query. You might need to give it more RAM.
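
One way to do that check is to read Prometheus's own process_resident_memory_bytes metric back through the HTTP API's /api/v1/query_range endpoint. The sketch below is only an illustration and makes several assumptions: that the server is reachable at http://localhost:9090, that it scrapes itself under job="prometheus", and that a one-hour window at a 15s step covers the expensive query. Adjust all three to the actual setup.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumptions: adjust the address and the job label to your own setup.
PROMETHEUS_URL = "http://localhost:9090"
QUERY = 'process_resident_memory_bytes{job="prometheus"}'


def build_range_url(base_url, query, start, end, step):
    """Build a /api/v1/query_range URL for the given PromQL expression."""
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return f"{base_url}/api/v1/query_range?{params}"


def peak_bytes(response):
    """Return the largest sample value in a query_range response, or None."""
    values = [
        float(value)
        for series in response["data"]["result"]
        for _timestamp, value in series["values"]
    ]
    return max(values) if values else None


def main():
    end = time.time()
    start = end - 3600  # look at the last hour
    url = build_range_url(PROMETHEUS_URL, QUERY, start, end, "15s")
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = json.load(resp)
    peak = peak_bytes(body)
    if peak is None:
        print("no samples returned; check the job label")
    else:
        print(f"peak RSS in the last hour: {peak / 2**20:.0f} MiB")


if __name__ == "__main__":
    main()
```

Run it shortly after the one-year query (or after the server has come back up) and compare the peak with the RAM available to the process. If the server dies between two scrapes, the last recorded sample may understate the real peak, so treat the number as a lower bound.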

--
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in