Trying to find an aggregation proxy/gateway

Martina Ferrari

Sep 8, 2023, 12:32:40 PM
to promethe...@googlegroups.com
Hi all,

This is a question that I have asked in the past in other channels, but
I never found a solution, so I am trying again in case somebody knows of
some tool or hack. I cannot be the only one with this need!

I have a NodeJS application that I have developed for which I really
need to add instrumentation. It is a one-shot application, not a daemon,
but it runs for a considerable amount of time, sometimes up to an hour,
and it is launched on demand by a job scheduler (separate Python app).

There is no fixed number of jobs running in parallel, so I cannot define
pseudo-instances for each job without exploding the metric cardinality.

I tried looking at the
https://github.com/zapier/prom-aggregation-gateway project, but it has
some limitations that make it unattractive: you cannot push metrics more
than once, as they are aggregated/summed again unless you clear your
metrics (which AFAIK the official clients do not support); and gauges
are unconditionally added together, rendering them useless in many cases
(e.g. timestamps).
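
To make the two limitations concrete, here is a toy model of a gateway that sums everything it receives. It is only an illustration of the behaviour described above, not the project's actual code; the class and metric names are made up.

```python
from collections import defaultdict


class SummingGateway:
    """Toy stand-in for a gateway that adds every pushed value to a total."""

    def __init__(self):
        self.totals = defaultdict(float)

    def push(self, metrics):
        # Every push is added on top of what is already stored,
        # regardless of the metric type.
        for name, value in metrics.items():
            self.totals[name] += value


gw = SummingGateway()

# One job pushes its cumulative counter twice during its run.
gw.push({"records_total": 10})
gw.push({"records_total": 25})  # the job has processed 25 records so far
# The gateway now reports 35, although only 25 records were processed.

# Two pushes of a "last success" timestamp gauge get summed as well.
gw.push({"last_success_timestamp_seconds": 1694170000})
gw.push({"last_success_timestamp_seconds": 1694173600})
# The stored value is the sum of both timestamps, which is meaningless.
```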

So far I have used mtail for parsing the logs and extracting the
information; with enough work and patience I can do almost everything I
need that way, but it is resource-intensive, tedious, and very error-prone.

The only alternative I can think of right now would be to write a new
Prometheus client library that sends RPCs to a custom aggregation daemon,
but I am worried about the runtime cost and the effort of writing it from
scratch.

Does anybody know of any solution to this problem? I am still surprised
this is not a more common issue!

Thanks in advance, Tina.

--
Martina Ferrari (Tina)

Chris Siebenmann

Sep 8, 2023, 2:45:16 PM
to promethe...@googlegroups.com, cks.prom...@cs.toronto.edu
> This is a question that I have asked in the past in other channels, but
> I never found a solution, so I am trying again in case somebody knows of
> some tool or hack, I cannot be the only one with this need!
>
> I have a NodeJS application that I have developed for which I really
> need to add instrumentation. It is a one-shot application, not a daemon,
> but it runs for a considerable amount of time, sometimes up to an hour,
> and it is launched on demand by a job scheduler (separate Python app).
>
> There is no fixed number of jobs running in parallel, so I cannot define
> pseudo-instances for each job without exploding the metric cardinality.

I believe that what you may want is the statsd exporter. The statsd
exporter can be used to accumulate running metrics over time from
whatever source feeds into it, with whatever labels you want to stick
on, and sending metric updates to it is pretty simple. If you want to
feed in ongoing metrics while the NodeJS application is running, you'd
presumably reset any in-application counters every time you send them to
the statsd exporter.
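
As a rough sketch of that reset-on-send idea (shown in Python for brevity; the same pattern would apply in NodeJS), the counter below keeps only what has accumulated since the last flush. The class name, metric name and `send` callback are all hypothetical and not part of any real library.

```python
class DeltaCounter:
    """Counter that reports only what accumulated since the last flush."""

    def __init__(self, name):
        self.name = name
        self._pending = 0

    def inc(self, amount=1):
        self._pending += amount

    def flush(self, send):
        """Hand the pending delta to `send` as a statsd counter line, then reset."""
        if self._pending:
            send(f"{self.name}:{self._pending}|c")
            self._pending = 0


# Collect the lines in a list instead of sending them over the network.
sent = []
files = DeltaCounter("job_files_processed")

files.inc(3)
files.flush(sent.append)
files.inc(2)
files.flush(sent.append)
files.flush(sent.append)  # nothing pending, so nothing is sent
```

Because each flush sends only the increment, the receiving side can keep adding the values up without double counting.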

This does have the drawback that you can't use the Prometheus client
libraries as-is, but you may be able to find statsd libraries instead
(and such libraries may already handle the counter reset issue). You
want libraries that support the extended statsd format with tags.
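
For illustration, this is roughly what a tagged update looks like on the wire when built by hand, assuming the DogStatsD-style tag syntax (`name:value|type|#key:value,...`) and a statsd exporter listening for UDP on localhost port 9125. The host, port, metric name and tags are assumptions to adjust for your setup, and a real statsd library would normally do this for you.

```python
import socket


def statsd_line(name, value, kind, tags=None):
    """Format one statsd update, with optional DogStatsD-style tags."""
    line = f"{name}:{value}|{kind}"
    if tags:
        # Sort the tags so the output is deterministic.
        tag_text = ",".join(f"{k}:{v}" for k, v in sorted(tags.items()))
        line += f"|#{tag_text}"
    return line


def send_udp(line, host="127.0.0.1", port=9125):
    """Fire-and-forget one line over UDP (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical counter update carrying two tags.
    update = statsd_line(
        "job_files_processed", 3, "c",
        {"job_type": "import", "queue": "nightly"},
    )
    send_udp(update)
```

The exporter would then turn the tags into Prometheus labels, so it is worth keeping them low-cardinality (a job type rather than a per-run job ID).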

A few years ago I wrote a web page on this, with the focus on one-shot
scripts:
https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusStatsdForMetricsUpdates

- cks