Scraping data format


Peter Zaitsev

Feb 1, 2016, 8:30:49 PM
to Prometheus Developers
Hi,

I've been reading this page:

This page specifies two formats for the data. When Prometheus scrapes an exporter, does it use the binary format, while a human viewing the endpoint gets the human-readable text?

Or does the server use the same text protocol as well, so that something else is needed to switch to binary?



--
Peter Zaitsev, CEO, Percona
Tel: +1 888 401 3401 ext 7360   Skype:  peter_zaitsev



Fabian Reinartz

Feb 1, 2016, 8:40:15 PM
to Peter Zaitsev, Prometheus Developers
Prometheus does content-type negotiation when scraping a target. The protobuf format is used if the target supports it (i.e. if it is supported by the client library the exporter/application was instrumented with). Otherwise we fall back to the text protocol. (There's even an ancient JSON format, which was deprecated so long ago that it's probably only relevant for a few remaining services inside of SoundCloud.)

If you are using a client library, you don't have to worry about the exposition format in general.
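
The negotiation described above can be sketched from the exporter's side. This is a minimal illustration, not the code Prometheus or any client library actually uses: the function names, the `supported` list, and the q-values in the sample header are assumptions made for the example. Only the two media types come from this thread.

```python
PROTOBUF = "application/vnd.google.protobuf"
TEXT = "text/plain"


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending preference."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # HTTP default weight when no q parameter is given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equally weighted types keep header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_format(accept_header, supported):
    """Pick the first acceptable format the exporter supports.

    `supported` is ordered by the exporter's own preference.
    Falls back to the text format when nothing matches.
    """
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type in supported:
            return media_type
    return TEXT


# Hypothetical scraper header: protobuf preferred, text as a fallback.
header = (
    "application/vnd.google.protobuf; "
    "proto=io.prometheus.client.MetricFamily; encoding=delimited; q=0.7, "
    "text/plain; version=0.0.4; q=0.3"
)
print(choose_format(header, [PROTOBUF, TEXT]))  # exporter supports both
print(choose_format(header, [TEXT]))            # text-only exporter
```

An exporter that only knows the text format simply never matches the protobuf entry, which is the fallback behaviour described in the message.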



Peter Zaitsev

Feb 1, 2016, 8:41:28 PM
to Fabian Reinartz, Prometheus Developers
Fabian, thanks. So when the Prometheus server scrapes node_exporter or mysqld_exporter, it will use the efficient binary protocol, right?


Fabian Reinartz

Feb 1, 2016, 8:45:32 PM
to Peter Zaitsev, Prometheus Developers
Yes, both of those use the Go client library, which supports the binary protocol.
So far, though, both the gzipped text format and the binary format seem to work just fine, and neither is the bottleneck for our sample ingestion rate.
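
A rough way to see why gzipped text holds up well: text exposition repeats metric names and label names on every line, which compresses easily. The sketch below builds a synthetic payload and gzips it with Python's standard library. The metric name, labels, and series count are made up for the illustration, and the ratio it prints says nothing about any real exporter.

```python
import gzip


def sample_exposition(n_series):
    """Build a synthetic text-format payload with one counter family."""
    lines = ["# TYPE demo_requests_total counter"]
    for i in range(n_series):
        # Hypothetical metric and labels, repeated for every series.
        lines.append(
            'demo_requests_total{instance="host%d",code="200"} %d' % (i, i * 7)
        )
    return ("\n".join(lines) + "\n").encode("utf-8")


raw = sample_exposition(5000)
compressed = gzip.compress(raw)

print("raw bytes:       ", len(raw))
print("gzipped bytes:   ", len(compressed))
print("compression ratio: %.1fx" % (len(raw) / len(compressed)))
```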

Ben Kochie

Feb 2, 2016, 4:54:26 AM
to Fabian Reinartz, Peter Zaitsev, Prometheus Developers
I had the idea yesterday, and am starting work[0] to see if we can get the Prometheus exposition format(s) published as an IETF standard. This would help with adoption by third-party software and hardware. For example, I would love to see hardware vendors, such as switch/router vendors, adopt Prometheus metrics in addition to SNMP. :-)

Brian Brazil

Feb 2, 2016, 5:02:42 AM
to Ben Kochie, Fabian Reinartz, Peter Zaitsev, Prometheus Developers
On 2 February 2016 at 09:54, Ben Kochie <sup...@gmail.com> wrote:
> I had the idea yesterday, and am starting work[0] to see if we can get the Prometheus exposition format(s) published as an IETF standard.  This would help with adoption by 3rd party software/hardware.  For example, I would love to see hardware vendors, like switch/router vendors, adopt prometheus metrics in addition to SNMP. :-)

Talking with some network people at FOSDEM, there's already something called OpenConfig that's doing this in this space. It seems that the data model will be very compatible with Prometheus, and that pull is coming.

Brian




Ben Kochie

Feb 2, 2016, 6:28:13 AM
to Fabian Reinartz, Peter Zaitsev, Prometheus Developers
Here are some test results for the various methods:

     Text: 1210kB 551ms (not used by prometheus unless the exporter has no gzip)
zlib Text:   83kB 105ms
   Binary:   80kB  74ms

Details:

$ curl -w "@curl-format.txt" -o /dev/null -s "http://mysql_exporter:9104/metrics"
      content_type: text/plain; version=0.0.4
     size_download: 1239878
    speed_download: 686542.000

   time_namelookup:  0.125
      time_connect:  0.143
   time_appconnect:  0.000
  time_pretransfer:  0.143
     time_redirect:  0.000
time_starttransfer:  1.255
                   ----------
        time_total:  1.806

$ curl -w "@curl-format.txt" -o /dev/null --compressed -s "http://mysql_exporter:9104/metrics"
      content_type: text/plain; version=0.0.4
     size_download: 113943
    speed_download: 85337.000

   time_namelookup:  0.061
      time_connect:  0.079
   time_appconnect:  0.000
  time_pretransfer:  0.079
     time_redirect:  0.000
time_starttransfer:  1.230
                   ----------
        time_total:  1.335

$ curl -w "@curl-format.txt" -o /dev/null --compressed -H 'Accept: application/vnd.google.protobuf; proto=io.prometheus.client.MetricFamily; encoding=delimited' -s "http://mysqld_exporter:9104/metrics"
      content_type: application/vnd.google.protobuf; proto=io.prometheus.client.MetricFamily; encoding=delimited
     size_download: 82083
    speed_download: 68003.000

   time_namelookup:  0.064
      time_connect:  0.082
   time_appconnect:  0.000
  time_pretransfer:  0.082
     time_redirect:  0.000
time_starttransfer:  1.133
                   ----------
        time_total:  1.207
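
The `encoding=delimited` parameter in the Accept header above means the body is a sequence of protobuf messages, each preceded by its length as a varint. The sketch below shows only that framing, so it needs no protobuf library; decoding each frame as a `MetricFamily` is left out, and the function names are hypothetical.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def write_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def split_delimited(buf):
    """Split a length-delimited stream into its raw message frames."""
    frames = []
    pos = 0
    while pos < len(buf):
        length, pos = read_varint(buf, pos)
        end = pos + length
        if end > len(buf):
            raise ValueError("truncated frame")
        frames.append(buf[pos:end])
        pos = end
    return frames


# Stand-in payloads; real frames would be serialized MetricFamily messages.
payloads = [b"a" * 3, b"b" * 300, b""]
stream = b"".join(write_varint(len(p)) + p for p in payloads)
print([len(f) for f in split_delimited(stream)])
```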

curl-format.txt:
      content_type: %{content_type}\n
     size_download: %{size_download}\n
    speed_download: %{speed_download}\n
\n
   time_namelookup:  %{time_namelookup}\n
      time_connect:  %{time_connect}\n
   time_appconnect:  %{time_appconnect}\n
  time_pretransfer:  %{time_pretransfer}\n
     time_redirect:  %{time_redirect}\n
time_starttransfer:  %{time_starttransfer}\n
                   ----------\n
        time_total:  %{time_total}\n
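
To put the three curl runs above side by side, the snippet below computes size ratios from the `size_download` and `time_total` values exactly as printed. It is only arithmetic on those single runs, not a benchmark: `time_total` includes name lookup and connection setup, and `time_starttransfer` is above 1.1 s in all three runs, so the timings mostly reflect how long the exporter took to start responding. The dictionary keys are labels chosen for this example.

```python
# Values copied from the curl output in this message.
measurements = {
    "text": {"bytes": 1239878, "total_s": 1.806},
    "text, --compressed": {"bytes": 113943, "total_s": 1.335},
    "protobuf, --compressed": {"bytes": 82083, "total_s": 1.207},
}


def size_ratio(baseline, other):
    """How many times smaller `other` is than `baseline` on the wire."""
    return measurements[baseline]["bytes"] / measurements[other]["bytes"]


for name, m in measurements.items():
    print(
        "%-24s %8d bytes  %.3f s  %5.1fx smaller than plain text"
        % (name, m["bytes"], m["total_s"], size_ratio("text", name))
    )
```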



Björn Rabenstein

Feb 2, 2016, 6:54:13 AM
to Ben Kochie, Fabian Reinartz, Peter Zaitsev, Prometheus Developers
On 2 February 2016 at 12:28, Ben Kochie <sup...@gmail.com> wrote:
>
> Here's some test results of various methods:
>
> Text: 1210kB 551ms (not used by prometheus unless the exporter has no gzip)
> zlib Text: 83kB 105ms
> Binary: 80kB 74ms

Note that this is only about the client side (technically: the server side ;).

Ingesting the text format is more expensive than ingesting protobuf.
I doubt we would have reached 500k+ samples/sec with the text
format, but ingestion rates that high rarely occur in practice
anyway, and other bottlenecks (like the number of time series)
will be hit much earlier.

--
Björn Rabenstein, Engineer
http://soundcloud.com/brabenstein
