Reporting latency metrics

Matías Giorgio

May 10, 2017, 6:42:48 PM
to Kong
Hi folks,

I've been running performance tests on our Kong cluster and I noticed that the documentation is somewhat incomplete or ambiguous in terms of what is being reported.

Kong reports 2 different latencies (i.e. upstream latency and latency). The documentation doesn't say which time units are used but I assumed they are milliseconds. Nevertheless, it is unclear what Kong calls "upstream". Is it first-byte or last-byte?

In order to make this discussion clearer, I drew the query lifecycle as follows so we can clearly express what is upstream latency, what is total latency, what is Kong latency (which is only reported as a header, not as a plugin's metric), and so on. See attachment.

The legend for the attachment is:
t0 Client sent 1st byte
t1 Kong received 1st byte from Client
t2 Kong received full message
t3 Kong finished processing / 1st byte sent to upstream
t4 Kong sent last byte to upstream
t5 Kong received 1st byte from upstream
t6 Kong received last byte from upstream
t7 Kong finished processing / 1st byte sent to Client
t8 Last byte sent to Client
t9 Client received full message

Could you folks please clarify which interval [ti,tj] corresponds to Kong latency, upstream latency, total latency, etc?

Maybe we can come up with a story to report multiple useful latency metrics with a clear understanding of what they precisely report.

Many thanks in advance.
[Attachment: Screen Shot 2017-05-10 at 7.40.17 PM.png]

Thibault Charbonnier

May 10, 2017, 7:45:12 PM
to kong...@googlegroups.com
Hi,

On 5/10/17 3:42 PM, Matías Giorgio wrote:
> Hi folks,
>
> I've been running performance tests on our Kong cluster and I noticed
> that the documentation is somewhat incomplete or ambiguous in terms of
> what is being reported.
>
> Kong reports 2 different latencies (i.e. upstream latency and latency).
> The documentation doesn't say which time units are used but I assumed

You're right, our fault. There is an open issue about it that was never
addressed:

https://github.com/Mashape/getkong.org/issues/159

> they are milliseconds. Nevertheless, it is unclear what Kong calls
> "upstream". Is it first-byte or last-byte?

It depends, due to Nginx internals. These values are indeed unclear,
or sometimes undocumented. I was expecting such a question to
surface some day...

> In order to make this discussion clearer, I drew the query lifecycle as
> follows so we can clearly express what is upstream latency, what is
> total latency, what is Kong latency (which is only reported as a header,
> not as a plugin's metric), and so on. See attachment.
>
> The legend for the attachment is:
> t_0 Client sent 1st byte
> t_1 Kong received 1st byte from Client
> t_2 Kong received full message
> t_3 Kong finished processing / 1st byte sent to upstream
> t_4 Kong sent last byte to upstream
> t_5 Kong received 1st byte from upstream
> t_6 Kong received last byte from upstream
> t_7 Kong finished processing / 1st byte sent to Client
> t_8 Last byte sent to Client
> t_9 Client received full message

Not entirely accurate, due to some Nginx internals. What actually
happens in the scenario of a successful, proxied request is closer to
something like:

1. Client opens connection
2. Kong executes SSL handshake
3. Client sends 1st byte of the request
4. Nginx receives 1st byte
5. Nginx has received all request headers
6. Kong executes its early Lua code and determines which API the
request should be routed to
7. Kong executes all necessary plugins on the request
8. (optional) Kong reads the request body (if a plugin needs it)
9. Kong finishes processing the request (all plugins have run)
10. Kong chooses a target from its load-balancing component
11. Kong sends request to upstream chosen by load-balancer
12. Kong has sent the last byte to the upstream (if the attempt fails,
back to load-balancing)
13. Nginx receives the upstream status line and headers
14. Kong processes upstream status line and headers if necessary
(plugins)
15. Nginx sends the upstream status line and headers to the client
16. Nginx receives 1st byte of response body from upstream
a. Nginx streams upstream response body to client
*OR*, if a plugin needs to:
b. Kong buffers the upstream response chunk by chunk
c. Kong sends the buffered upstream response down to the client at once
17. Kong executes its 'log' phase (last byte already received by client)

I might be forgetting about something here, but this is more or less
what happens on a successful request being proxied.

What Kong measures is:

- Kong-Proxy-Latency response header: tdiff(10 - 4)
This header's intent is to give a notion of how much time was
spent inside of Nginx and Kong before proxying a request
to upstream.

- Kong-Upstream-Latency response header: tdiff(14 - 10)
This header's intent is to give a notion of how much time Nginx
waited for a response from upstream. It is flawed, as it is
tracking time spent waiting from Lua-land (from Kong itself),
due to technical limitations (see the sketch after the list below).

And in the basic logging serializer, which all logging plugins currently use:

- latencies.kong: tdiff(10 - 6) + if (16.b/c) { tdiff(16.c - 14) }
The intent of this property is to give an idea of how much time
was spent inside of Kong, overall (before proxying, and during
the response streaming/buffering, until the last byte was
received from upstream and the last plugin ran on the response
body).

- latencies.proxy: This is identical to the Kong-Upstream-Latency
header. A value of -1 indicates the request was not proxied upstream,
and failed/was rejected by Kong (think: authentication failure)

- latencies.request: tdiff(17 - 4)
The intent here is to give a notion of how much time, total,
everything included, was spent inside of Kong.
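
To make the "measured from Lua-land" part concrete, here is a minimal,
hypothetical sketch of that technique (this is NOT Kong's actual code,
and the X-Example-Upstream-Latency header name is made up): record a
timestamp right before proxying, then compute the difference once the
upstream status line and headers arrive in the header_filter phase.

    -- Hypothetical sketch of Lua-land timing; not Kong's implementation.

    -- access phase: we are about to hand the request to the upstream
    ngx.ctx.proxy_started_at = ngx.now() * 1000  -- wall clock, in milliseconds

    -- header_filter phase: upstream status line and headers received
    local started = ngx.ctx.proxy_started_at
    if started then
      -- roughly the interval the Kong-Upstream-Latency header reports
      ngx.header["X-Example-Upstream-Latency"] = math.floor(ngx.now() * 1000 - started)
    end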

I hope this can bring some more insight as to what those values are
about, or try to be about. I also hope the explanation is clear enough;
this ended up longer than I expected...

We know they're not perfect, and I think we can improve them and how
they measure things, or at least document them better right now.

--
Thibault

Matías Giorgio

May 11, 2017, 3:10:31 PM
to Kong
Hey Thibault,

Thanks a lot for your thorough response. It does clarify things a lot.

It'd be really helpful if the Kong latency, currently reported only as a header, were also reported by plugins. That way, DevOps teams would be able to monitor the extra overhead added by Kong. Any plans to add it?

Thanks.

Robert Paprocki

May 11, 2017, 3:17:00 PM
to Matías Giorgio, Kong
They do, to a certain extent. See https://getkong.org/plugins/tcp-log/ (and other logging plugins):


"
A few considerations on the above JSON object:

latencies contains some data about the latencies involved:
  • proxy is the time it took for the final service to process the request
  • kong is the internal Kong latency that it took to run all the plugins
  • request is the time elapsed between the first bytes were read from the client and after the last bytes were sent to the client. Useful for detecting slow clients.
"



Matías

May 11, 2017, 3:26:08 PM
to rob...@cryptobells.com, Kong
Interesting. Thanks for pointing this out, Robert.

I'm using the statsd plugin, that's why I didn't see that.

The metrics reported by different plugins don't seem to be consistent from one plugin to another. That makes sense considering that many contributions come from the community. Nevertheless, maybe the folks at Kong could work on some sort of abstraction layer/framework to make sure that all plugins report the same metrics. Something like log4j and its different appenders, but for metrics.




Robert Paprocki

May 11, 2017, 3:29:38 PM
to Matías, Kong
On Thu, May 11, 2017 at 12:25 PM, Matías <matias....@gmail.com> wrote:
Interesting. Thanks for pointing this out, Robert.

I'm using the statsd plugin, that's why I didn't see that.

The metrics reported by different plugins don't seem to be consistent from one plugin to another. That makes sense considering that many contributions come from the community. Nevertheless, maybe the folks at Kong could work on some sort of abstraction layer/framework to make sure that all plugins report the same metrics. Something like log4j and its different appenders, but for metrics.


Yep, this is absolutely on our radar :D 

Thibault Charbonnier

May 11, 2017, 3:34:07 PM
to kong...@googlegroups.com
The goal of all the logging plugins has always been to accept a
"serializer" property.

A serializer is a function that just has to output a Lua table with any
properties it'd like the logging plugin to log. This is what the
basic-serializer is currently doing (and we plan on writing a few other
default ones).

As a user, you'd be able to write your own as well and use it instead of
the default ones (some users desperately want to log their request
bodies ;) ).
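
As a rough sketch only (the configurable serializer property doesn't
exist yet, so the exact signature below is a guess), such a user-written
serializer could simply be a module returning a function that builds the
table to log:

    -- Hypothetical custom serializer; names and signature are guesses.
    -- It returns the Lua table the logging plugin should serialize and ship.
    local function my_serializer(ngx)
      return {
        started_at = ngx.req.start_time() * 1000,   -- request start, in ms
        client_ip  = ngx.var.remote_addr,
        request    = {
          method = ngx.req.get_method(),
          uri    = ngx.var.request_uri,
          -- body = ngx.req.get_body_data(),  -- for those who really want it
        },
        latencies  = {
          request = (ngx.now() - ngx.req.start_time()) * 1000,  -- total ms so far
        },
      }
    end

    return my_serializer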

As you said, all plugins should use the same metrics, and even if the
statsd plugin works slightly differently, it could still accept input
from a serializer, instead of bypassing it as it is currently doing.

All of this is "TODO", of course :)

Contributions welcome!

--
Thibault

Matías Giorgio

May 11, 2017, 3:48:46 PM
to Kong
Thanks for your responses, folks. It is awesome to see how often you interact with the community!

I work at Medallia, and we use Kong as the core component for our internal API Gateway. Most of our main requirements are covered, and we manage to work around some other things. You know, monitoring is *super* important but, as we aren't in production yet, it's still more on the should side than the must side :)

We'll definitely get much more involved in terms of contributing stuff but we're defining that on our roadmap.

Thanks!


Thibault Charbonnier

May 11, 2017, 4:01:09 PM
to kong...@googlegroups.com
On 5/11/17 12:48 PM, Matías Giorgio wrote:
> Thanks for your responses, folks. It is awesome to see how often you
> interact with the community!

Our pleasure!

> I work at Medallia, and we use Kong as the core component for our
> internal API Gateway. Most of our main requirements are covered, and we
> manage to work around some other things. You know, monitoring is *super*
> important but, as we aren't in production yet, it's still more on the
> should side than the must side :)
>
> We'll definitely get much more involved in terms of contributing stuff
> but we're defining that on our roadmap.

Awesome! Let us know if we can help in any way.

--
Thibault