Hi all,
This is a question that I have asked in the past in other channels, but
I never found a solution, so I am trying again in case somebody knows of
some tool or hack. I cannot be the only one with this need!
I have developed a Node.js application to which I really need to add
instrumentation. It is a one-shot application, not a daemon,
but it runs for a considerable amount of time, sometimes up to an hour,
and it is launched on demand by a job scheduler (a separate Python app).
There is no fixed number of jobs running in parallel, so I cannot define
pseudo-instances for each job without exploding the metric cardinality.
I tried looking at the
https://github.com/zapier/prom-aggregation-gateway project, but it has
some limitations that make it unattractive: you cannot push metrics more
than once, as they are summed again on every push, unless you clear your
metrics (which AFAIK the official clients do not support); and gauges
are unconditionally added too, rendering them useless in many cases
(e.g. timestamps).
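To make the two limitations concrete, here is a toy model of a gateway
that adds every pushed value to what it already holds. It is only a
sketch of the behaviour described above, not the project's actual code;
the class and the metric names are invented for illustration.

```python
# Toy model of an additive aggregation gateway (hypothetical, for illustration).
from collections import defaultdict


class AdditiveGateway:
    """Adds every pushed value to the stored one, whatever the metric type."""

    def __init__(self):
        self.values = defaultdict(float)

    def push(self, metrics):
        for name, value in metrics.items():
            self.values[name] += value


gateway = AdditiveGateway()

# A client-side counter holds the cumulative total, so a second push from
# the same job re-adds what the first push already reported.
gateway.push({"records_processed_total": 100})
gateway.push({"records_processed_total": 150})  # the job is really at 150

# Two jobs each report the Unix time of their last successful run.
gateway.push({"last_success_timestamp_seconds": 1_700_000_000})
gateway.push({"last_success_timestamp_seconds": 1_700_000_600})

print(gateway.values["records_processed_total"])         # 250.0, not 150
print(gateway.values["last_success_timestamp_seconds"])  # 3400000600.0
```

The counter ends up over-counted, and the summed timestamps no longer
mean anything.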
So far I have used mtail for parsing the logs and extracting the
information; with enough work and patience I can do almost everything I
need that way, but it is resource-intensive, tedious, and very error-prone.
The only alternative I can think of right now would be to write a new
Prometheus client library that sends RPCs to a custom aggregation daemon,
but I am afraid of the runtime cost and of the effort of writing it from
scratch.
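As a rough sketch of what the core of such a daemon might look like,
assuming each job identifies itself with a job ID when pushing and that
gauges declare how they should be merged (both are assumptions, and
every name below is hypothetical), with the RPC transport and the
exposition endpoint left out:

```python
# Hypothetical sketch of an aggregation daemon's core; all names are made up.
from collections import defaultdict


class Aggregator:
    def __init__(self):
        self.counters = defaultdict(float)   # aggregated series to expose
        self.gauges = {}
        self._last_seen = defaultdict(dict)  # job_id -> last cumulative values

    def push_counter(self, job_id, name, cumulative):
        # Add only the increase since this job's previous push, so that
        # pushing repeatedly is safe.
        previous = self._last_seen[job_id].get(name, 0.0)
        self.counters[name] += max(cumulative - previous, 0.0)
        self._last_seen[job_id][name] = cumulative

    def push_gauge(self, name, value, mode="last"):
        # "last" overwrites; "max" suits timestamps such as last-success time.
        if mode == "max" and name in self.gauges:
            value = max(value, self.gauges[name])
        self.gauges[name] = value

    def job_done(self, job_id):
        # Forget per-job state; the aggregated totals are kept.
        self._last_seen.pop(job_id, None)


agg = Aggregator()
agg.push_counter("job-a", "records_processed_total", 100)
agg.push_counter("job-a", "records_processed_total", 150)
agg.push_counter("job-b", "records_processed_total", 40)
agg.push_gauge("last_success_timestamp_seconds", 1_700_000_600, mode="max")
agg.push_gauge("last_success_timestamp_seconds", 1_700_000_000, mode="max")
agg.job_done("job-a")

print(agg.counters["records_processed_total"])       # 190.0
print(agg.gauges["last_success_timestamp_seconds"])  # 1700000600
```

Per-job state lives only while the job runs, so the exposed series stay
at a fixed cardinality regardless of how many jobs run in parallel.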
Does anybody know of any solution to this problem? I am still surprised
this is not a more common issue!
Thanks in advance, Tina.
--
Martina Ferrari (Tina)