Filter metrics based on timestamp


Dhanunjaya Mitta

Oct 12, 2021, 5:39:04 AM
to Prometheus Users
Hello,
I have a question about querying Prometheus metrics. Is it possible to filter metrics based on a timestamp? For example, I have a number of workflows that are deployed at regular intervals. I would like to get the workflows that were deployed in the last hour, the last 6 hours, and so on. Please let me know of any suitable solutions.

Regards,
Dhanunjaya Mitta

Brian Candler

Oct 12, 2021, 9:55:12 AM
to Prometheus Users
What prometheus metric contains the timestamp?  Can you show examples of the relevant metrics?  Does each workflow generate a separate timeseries?

Dhanunjaya Mitta

Oct 12, 2021, 10:40:31 AM
to Prometheus Users
We don't have any default metric that exposes a timestamp; for that we use custom metrics. But that is beside the point. I am visualising the metrics in Grafana and would like to check which pods failed within a particular time period (for example, 15:30 to 17:00).

Brian Candler

Oct 12, 2021, 1:09:01 PM
to Prometheus Users
If you give specific examples of the metrics, we can suggest queries you could use with them.

If you don't, then the answer is "write a PromQL query which identifies the time periods of interest".

Dhanunjaya Mitta

Oct 13, 2021, 2:39:23 AM
to Prometheus Users
How can we write a PromQL query which identifies the time periods of interest? Could you provide an example?
You asked me for example metrics. Here is one:
argo_workflows_count gives the count of workflows that have failed, succeeded, are pending, etc. I would like to know how many workflows succeeded in the last hour. I hope this gives you an idea of what I am looking for.

Regards,
Dhanunjaya Mitta

Brian Candler

Oct 13, 2021, 4:07:36 AM
to Prometheus Users
You said your metrics are counter values.  If you want to know how much a counter has increased over the preceding hour, you can use

increase(argo_workflows_count[1h])
or:
argo_workflows_count - argo_workflows_count offset 1h

The first takes account of possible counter resets, but may give a value which is not exactly an integer.  It's a rate extrapolated over the period, so if you only have 10 minutes of data, it will work out the rate of increase over that time and then multiply by 60/T, where T is the time difference in minutes between the first and last data point (roughly).

The second will give an exact value for how much the counter has changed, but may go negative if the counter resets, and will give no answer if there was no value an hour ago.
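
To make the trade-off concrete, here is a minimal Python sketch of the two approaches applied to made-up counter samples. It is a simplification: the function names and sample values are hypothetical, and the reset-aware version only sums the deltas between samples, without the extrapolation to the window edges that Prometheus's increase() also performs.

```python
def increase_with_resets(samples):
    """Sum the deltas between consecutive samples, treating any drop as a counter reset."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        # After a reset the counter restarts from 0, so the new value is the whole delta.
        total += cur - prev if cur >= prev else cur
    return total


def plain_difference(samples):
    """Last value minus first value, like `metric - metric offset 1h`."""
    return samples[-1] - samples[0]


# Hypothetical counter samples across one hour; the process restarts after the value 12.
samples = [10, 11, 12, 0, 1, 3]

print(increase_with_resets(samples))  # 5.0
print(plain_difference(samples))      # -7
```

The reset-aware sum reports 5 new events, while the plain subtraction goes negative because the counter restarted inside the window.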

Dhanunjaya Mitta

Oct 13, 2021, 4:22:57 AM
to Prometheus Users
Hello Brian,
Thanks for your quick reply. Your solution partially satisfies my requirement. What if I would like to get the workflows which succeeded in the last hour (not just the count, but also the name of each workflow)?
argo_workflows_custom_duration_gauge_task is a custom metric that gives the status and run duration of each workflow. I would like to know which workflows succeeded in the past hour.

Regards,
Dhanunjaya Mitta

Brian Candler

Oct 13, 2021, 5:24:53 AM
to Prometheus Users
If you won't show actual examples of the metrics, then I'm unable to help you form queries on those metrics.

Metrics look like this:
metric_name{labels} value

Here are some examples of metrics:
node_uname_info{job="node",instance="nuc1",domainname="(none)",machine="x86_64",nodename="nuc1",release="5.4.0-80-generic",sysname="Linux",version="#90~18.04.1-Ubuntu SMP Tue Jul 13 19:40:02 UTC 2021"} 1
process_start_time_seconds{job="node",instance="nuc1"} 1.6283250991e+09

Show the metrics in that form, and I may be able to help.

Dhanunjaya Mitta

Oct 13, 2021, 5:44:49 AM
to Prometheus Users

argo_workflows_custom_duration_gauge_task{status!="Pending"}

These are the labels:
instanceName: vitam.wf.b.business-layer-master-b4vqj
service: argo-proxy-svc
workflow_name: vitam.wf.b.business-layer-master
namespace: itam-d-app-main
pod: argo-proxy-deploy-5f87dfb6d4-pg4qh
Prometheus: openshift-user-workload-monitoring/user-workload
status: Failed

This is the value: 67  

Brian Candler

Oct 13, 2021, 9:38:34 AM
to Prometheus Users
There's no timestamp in that metric, so it's not possible to answer the question you wanted, which was "show the jobs which terminated in the last hour". All the metric tells you is that the job took 67 seconds - not when it started or ended.

If there's a separate metric with the job start or end time, then it may be doable.

Dhanunjaya Mitta

Oct 13, 2021, 10:35:59 AM
to Prometheus Users
Hello Brian,
Metric: argo_workflows_custom_start_time_gauge_workflow

Labels:
timeStamp: 2021-10-11T08:48:11Z
instanceName: ddd.vitam.wf.s.ldb.part-9tmq5
service: argo-proxy-svc
workflowName: vitam.wf.s.ldb.part
namespace: itam-d-app-main
prometheus: openshift-user-workload-monitoring/user-workload
status: Succeeded

this is the timestamp generated for the above custom metric:
value: 20211011084811

Hope it will help.

Regards,
Dhanunjaya Mitta

Brian Candler

Oct 13, 2021, 11:16:31 AM
to Prometheus Users
> this is the timestamp generated for the above custom metric:
> value: 20211011084811

I've seen this before in another thread.  This is a useless metric.  You need to change it to be the number of seconds since epoch (it's a custom metric, so you can change it).

Then you can query it like this:

(time() - argo_workflows_custom_start_time_gauge_workflow) < 3600

to get all workflows which started in the last hour.
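
As a rough illustration of what that filter does, and of why the value has to be epoch seconds, here is a small Python sketch. The workflow names, the "now" instant and the helper function are made up for the example; only the 3600-second window and the 20211011084811 value come from the thread.

```python
from datetime import datetime, timezone


def started_within(start_times, now, window_seconds=3600):
    """Return the age in seconds of every series that started less than window_seconds ago."""
    return {
        name: now - started
        for name, started in start_times.items()
        if now - started < window_seconds
    }


# Hypothetical evaluation time, standing in for PromQL's time().
now = datetime(2021, 10, 11, 9, 30, 0, tzinfo=timezone.utc).timestamp()

# Hypothetical start times, expressed as seconds since the epoch.
start_times = {
    "wf-a": datetime(2021, 10, 11, 8, 48, 11, tzinfo=timezone.utc).timestamp(),
    "wf-b": datetime(2021, 10, 11, 7, 0, 0, tzinfo=timezone.utc).timestamp(),
}

print(sorted(started_within(start_times, now)))  # ['wf-a']

# With the YmdHMS-style value the subtraction is hugely negative, so the
# "< 3600" comparison is true no matter how old the workflow really is.
print(now - 20211011084811 < 3600)  # True
```

With a YmdHMS value the filter would therefore match every series, which is why the metric needs to change before the query is useful.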

Dhanunjaya Mitta

Oct 13, 2021, 11:37:54 AM
to Prometheus Users
Hello Brian,
We are using workflow.creationTimestamp.<STRFTIMECHAR> to get the start time metric value.
value: >-
            {{workflow.creationTimestamp.Y}}{{workflow.creationTimestamp.m}}{{workflow.creationTimestamp.d}}{{workflow.creationTimestamp.H}}{{workflow.creationTimestamp.M}}{{workflow.creationTimestamp.S}}
          
This produces the whole timestamp. If you can recommend anything other than this, it would be very helpful.

Regards,
Dhanunjaya Mitta

Stuart Clark

Oct 13, 2021, 12:06:28 PM
to Dhanunjaya Mitta, Prometheus Users
On 2021-10-13 16:37, Dhanunjaya Mitta wrote:
> Hello Brian,
> We are using workflow.creationTimestamp.<STRFTIMECHAR> to get the
> start time metric value.
>
> value: >-
>
> {{workflow.creationTimestamp.Y}}{{workflow.creationTimestamp.m}}{{workflow.creationTimestamp.d}}{{workflow.creationTimestamp.H}}{{workflow.creationTimestamp.M}}{{workflow.creationTimestamp.S}}
>
> This will produce the whole timestamp. Do you recommend anything other
> than this would be very helpful to me.
>

You need to be producing a number rather than a string. So you need to
choose whatever option gives the seconds since epoch value rather than
making a string out of YmdHMS string values.

--
Stuart Clark

Brian Candler

Oct 13, 2021, 1:41:57 PM
to Prometheus Users
Googling for "argos workflow.creationTimestamp" finds

workflow.creationTimestamp
Workflow creation timestamp formatted in RFC 3339 (e.g. 2018-08-23T05:42:49Z)

workflow.creationTimestamp.<STRFTIMECHAR>
Creation timestamp formatted with a strftime format character

Ergh.  I guess you have to ask them how to get this as an epoch time, and if it's not possible, make a feature request.

Harald Koch

Oct 13, 2021, 3:33:13 PM
to Prometheus Users
On Wed, Oct 13, 2021, at 13:41, Brian Candler wrote:
> Googling for "argos workflow.creationTimestamp" finds
>
> workflow.creationTimestamp
> Workflow creation timestamp formatted in RFC 3339 (e.g. 2018-08-23T05:42:49Z)
>
> workflow.creationTimestamp.<STRFTIMECHAR>
> Creation timestamp formatted with a strftime format character

If they use a reasonably modern STRFTIME implementation, the "%s" format directive returns seconds since the (1970) epoch.

--
Harald


Dhanunjaya Mitta

Oct 14, 2021, 2:37:25 AM
to Prometheus Users
@Harald, I have already tried %s, but it just gives the seconds part of the timestamp. It does not return the total time in seconds.

Dhanunjaya Mitta

Oct 14, 2021, 2:40:32 AM
to Prometheus Users
@Brian, I have checked this link and we are following exactly what is described in that section. I have also raised a question in their Slack channel, but I haven't received any response. Is there any alternative?

Regards,
Dhanunjaya Mitta

Dhanunjaya Mitta

Oct 14, 2021, 2:47:34 AM
to Prometheus Users
@Clark, we are already getting a number rather than a string. We have tried to get the total time in seconds but have been unable to find a solution.

Regards,
Dhanunjaya Mitta

Sandip Bhattacharya

Oct 14, 2021, 6:49:25 AM
to promethe...@googlegroups.com


On 13.10.21 21:32, Harald Koch wrote:
> On Wed, Oct 13, 2021, at 13:41, Brian Candler wrote:
>> Googling for "argos workflow.creationTimestamp" finds
>> https://argoproj.github.io/argo-workflows/variables/
>>
>> workflow.creationTimestamp
>> Workflow creation timestamp formatted in RFC 3339 (e.g. 2018-08-23T05:42:49Z)
>>
>> workflow.creationTimestamp.<STRFTIMECHAR>
>> Creation timestamp formatted with a strftime <http://strftime.org/> format character
>
> If they use a reasonably modern STRFTIME implementation, the "%s" format directive returns seconds since the (1970) epoch.

Seems the implementation is not that compliant.
https://github.com/argoproj/pkg/blob/50e2680bec730dc985bb2e50d02c1d383caa0145/strftime/strftime.go#L12

>>> value: >-
>>> {{workflow.creationTimestamp.Y}}{{workflow.creationTimestamp.m}}{{workflow.creationTimestamp.d}}{{workflow.creationTimestamp.H}}{{workflow.creationTimestamp.M}}{{workflow.creationTimestamp.S}}

if this is a proper Go template with functions from sprig or something similar, you may be able to get away with something horrible like:

value: >-
  {{ ((printf "%s-%s-%sT%s:%s:%s" workflow.creationTimestamp.Y workflow.creationTimestamp.m workflow.creationTimestamp.d workflow.creationTimestamp.H workflow.creationTimestamp.M workflow.creationTimestamp.S) | time).Unix }}

--
https://blog.sandipb.net
https://twitter.com/sandipb

Sandip Bhattacharya

Oct 14, 2021, 6:51:24 AM
to promethe...@googlegroups.com


On 14.10.21 12:49, Sandip Bhattacharya wrote:

> value: >-
>   {{ ((printf "%s-%s-%sT%s:%s:%s" workflow.creationTimestamp.Y workflow.creationTimestamp.m workflow.creationTimestamp.d workflow.creationTimestamp.H workflow.creationTimestamp.M workflow.creationTimestamp.S) | time).Unix }}
>

Missed the whole docs. Maybe something just like this will do.

value: >-
  {{ (workflow.creationTimestamp | time).Unix }}
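
Whether Argo's templating accepts an expression like this is not confirmed anywhere in the thread. Independently of the template engine, the conversion being aimed for can be sketched in Python. The helper names are hypothetical, and the YmdHMS value is assumed to be in UTC, as the matching timeStamp label (2021-10-11T08:48:11Z) suggests.

```python
from datetime import datetime, timezone


def rfc3339_to_epoch(text):
    """Parse a UTC RFC 3339 timestamp such as 2021-10-11T08:48:11Z into epoch seconds."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def ymdhms_to_epoch(value):
    """Convert a YmdHMS-style number such as 20211011084811 (assumed UTC) into epoch seconds."""
    parsed = datetime.strptime(str(value), "%Y%m%d%H%M%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


# Both inputs are the values posted earlier in the thread for the same workflow.
print(rfc3339_to_epoch("2021-10-11T08:48:11Z"))  # 1633942091
print(ymdhms_to_epoch(20211011084811))           # 1633942091
```

Either form yields the same number of seconds since the epoch, which is the kind of value the time() comparison needs.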

Brian Candler

Oct 14, 2021, 7:38:44 AM
to Prometheus Users
In the argo-workflow source:
metav1.ObjectMeta{CreationTimestamp: metav1.Time{Time: t1}}

which leads to

which leads to

which leads to

So it *is* a kubernetes wrapper around time.Time (well, it embeds time.Time).   But I'm not sure how to extract this value as an integer in a template - the argoproj people would be best placed to answer.  All the marshalling is in RFC3339 format.

The Argo strftime is here:

I didn't find where the magic accessors like .Y, .M, .D etc are implemented though.