ffwd and some thoughts about the specification


John-John Tedro

Apr 24, 2014, 7:10:17 PM
to metr...@googlegroups.com
We at Spotify have been developing a metrics and event agent whose only purpose is to receive and forward metrics and events to various systems.

Find it on GitHub.

It just so happens that the internal representation for metrics matches really well with what is described in the Metrics 2.0 specification.

The format is specified in the docs, and the reference JSON protocol comes with a basic description of the internal representation. You have tags (a list of strings) and attributes (a map, string -> string). An attribute would be the equivalent of a Metrics 2.0 tag.
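As a rough illustration of that shape, here is a minimal Python sketch. The class and field names (`Metric`, `key`, `value`) and the helper `to_metrics20_tags` are hypothetical and chosen for this example only; they are not taken from the ffwd docs or its JSON protocol.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Metric:
    # Hypothetical field names, not ffwd's actual schema.
    key: str
    value: float
    tags: List[str] = field(default_factory=list)              # plain labels
    attributes: Dict[str, str] = field(default_factory=dict)   # string -> string


def to_metrics20_tags(metric: Metric) -> Dict[str, str]:
    """Treat every attribute as one Metrics 2.0 style key=value tag."""
    return dict(metric.attributes)


m = Metric(
    key="requests",
    value=42.0,
    tags=["http"],
    attributes={"host": "web1", "unit": "Req/M"},
)
print(to_metrics20_tags(m))
```

The bare string tags have no direct key=value counterpart in this sketch, so they are simply left out of the mapping.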

Check it out; I will probably add a plugin for the metrics write protocol in the near future.

I have two concerns with the specification.

The specification does not define escape sequences for control characters in tags, meaning that =, (space) and comma are effectively reserved and there is no way to work around that. This has bitten us before when clients have been acting as thin translation layers, allowing broken data to slip through. The spec should either strongly emphasize that these characters are reserved and should be dropped/ignored/whatnot, or define escape sequences so we can build strong libraries.
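To make the failure mode concrete, here is a small Python sketch. It assumes a simplified line format of space-separated key=value pairs, which is an assumption for illustration and not the exact wire format from the spec; `serialize_tags` and `parse_tags` are hypothetical helpers standing in for a thin translation layer.

```python
def serialize_tags(tags: dict) -> str:
    """Join tags as space-separated key=value pairs, with no escaping."""
    return " ".join(f"{k}={v}" for k, v in tags.items())


def parse_tags(line: str) -> dict:
    """Naively split on spaces, then on the first '=' of each token."""
    tags = {}
    for token in line.split(" "):
        key, _, value = token.partition("=")
        tags[key] = value
    return tags


original = {"what": "disk usage", "host": "web1"}
round_tripped = parse_tags(serialize_tags(original))

# The space inside "disk usage" is read as a tag separator, so the data
# that comes back is not the data that went in.
print(round_tripped)
```

Nothing raises an error here, which is exactly how broken data slips through unnoticed.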

The unit tag is mandatory and has a set of specifications. However, most metrics are rates over some period (requests/minute), and this has implications for further aggregations (e.g. finding the average of two time series).
From my experience so far, trying to standardize all available units, periods and aggregation types at the collector is a difficult endeavor; it could unnecessarily complicate the specification and surrounding systems.

Our approach is instead to present the end user with the available raw metric, and then in the appropriate UI add support for them to define its unit through the way the various time series are aggregated and presented.

Thanks for the initiative!

Dieter Plaetinck

Apr 27, 2014, 9:08:54 PM
to metr...@googlegroups.com
hey, the stuff you(plural)'ve been working on looks neat.

re: your first remark, yes, the equality sign and space are reserved characters in the wire protocol. note that a lot of protocols can be conceived to carry metrics2.0 data; right now the protocol that is in the spec is just a simple protocol that's backwards compatible with graphite's carbon protocol.  we might want to spec out more protocols later, it's just that i've been mainly interested in a proto that's backwards compatible with carbon. as for comma, i don't think that's an explicit control character in this protocol, although graph-explorer does assume tag keys don't contain commas (so that it can do "sum by server,mountpoint,type" etc).  I think this is a fair assumption and should probably be added to the spec.  Actually I think we can pretty safely disallow comma, equality and spaces in tag keys.
But in tag values they should probably be allowed, so in this protocol we would need to support escaping equality and space. does that make sense? anything else?
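One possible shape for this, sketched in Python: backslash-escape the equality sign, the space and the backslash itself in tag values only. The backslash scheme and all function names below are assumptions made for illustration; nothing like this is defined in the spec today, and the line format (space-separated key=value pairs) is simplified.

```python
RESERVED = ("\\", "=", " ")


def escape(value: str) -> str:
    """Prefix each reserved character in a tag value with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in value)


def unescape(value: str) -> str:
    """Drop the backslash in front of each escaped character."""
    out, i = [], 0
    while i < len(value):
        if value[i] == "\\" and i + 1 < len(value):
            i += 1  # skip the backslash, keep the character after it
        out.append(value[i])
        i += 1
    return "".join(out)


def split_unescaped(text: str, sep: str, maxsplit: int = -1) -> list:
    """Split on sep, ignoring separators that are backslash-escaped."""
    parts, current, i, splits = [], [], 0, 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair intact
            i += 2
            continue
        if ch == sep and (maxsplit < 0 or splits < maxsplit):
            parts.append("".join(current))
            current, splits = [], splits + 1
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def serialize(tags: dict) -> str:
    # Keys are assumed to contain no reserved characters, so only values
    # are escaped.
    return " ".join(f"{k}={escape(v)}" for k, v in tags.items())


def parse(line: str) -> dict:
    tags = {}
    for token in split_unescaped(line, " "):
        key, value = split_unescaped(token, "=", 1)
        tags[key] = unescape(value)
    return tags


tags = {"what": "disk usage", "expr": "a=b", "path": "C:\\tmp"}
assert parse(serialize(tags)) == tags
```

Because keys never need escaping under this assumption, a line with plain values looks identical to an unescaped one, which may help with carbon compatibility.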

Regarding units, I think it's one of the properties of the data that is very fundamental to understanding what the data means, so I think it should be mandatory to keep the unit associated to the data at all times, and update it when you process the data in a way that changes the unit.
as per the spec, you could use "Req/M" for requests/minute; as for finding the avg between two series (in the same unit) the unit wouldn't have to change.  (you could actually average two series without caring about the unit, and the aggregator could, if it detects a different unit (e.g. Req/M and Req/s), convert one of the two to the unit of the other, do the averaging, and then emit the resulting average series with the correct unit.  This is another reason why carrying the unit information is so useful; graph-explorer does things like this).  I don't understand what the complications you're referring to would be, would love to see an example.  I can imagine cases where there's semantic overlap between units (and more than one unit could be appropriate), but I think we can adjust the spec to deal with this if/when issues arise.
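A minimal Python sketch of that kind of unit-aware averaging, assuming units of the form "what/period" with only the "s" (second) and "M" (minute) suffixes. The helper names are hypothetical and this is not graph-explorer's actual code.

```python
# Seconds per period suffix; only the two suffixes discussed here.
SECONDS = {"s": 1, "M": 60}


def split_unit(unit: str):
    """Split 'Req/M' into ('Req', 'M')."""
    what, _, period = unit.partition("/")
    return what, period


def convert(points, from_unit: str, to_unit: str):
    """Rescale a rate series from one period to another."""
    from_what, from_period = split_unit(from_unit)
    to_what, to_period = split_unit(to_unit)
    if from_what != to_what:
        raise ValueError(f"cannot convert {from_unit} to {to_unit}")
    factor = SECONDS[to_period] / SECONDS[from_period]
    return [p * factor for p in points]


def average(a, a_unit: str, b, b_unit: str):
    """Average two series point by point, emitting the unit of the first."""
    b_converted = convert(b, b_unit, a_unit)
    return [(x + y) / 2 for x, y in zip(a, b_converted)], a_unit


series, unit = average([120.0, 60.0], "Req/M", [1.0, 3.0], "Req/s")
print(series, unit)
```

The second series is rescaled to requests per minute before averaging, and the result carries the unit of the first series.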

thanks for your feedback, and looking forward to discussing more.

Dieter

Dieter Plaetinck

Oct 7, 2014, 4:30:32 PM
to metr...@googlegroups.com


On Thursday, 24 April 2014 19:10:17 UTC-4, udoprog work wrote:

From my experience so far, trying to standardize all available units, periods and aggregation types at the collector is a difficult endeavor; it could unnecessarily complicate the specification and surrounding systems.

could you explain this more? perhaps give an example?