Modify the Y-axis in the metrics explorer chart monitoring


Rakesh Sagitra

Feb 2, 2021, 1:03:29 AM
to Google Stackdriver Discussion Forum
Hello guys,
I am new here. I was happy to discover this forum! :)

I'm struggling to modify the y-axis of the Metrics Explorer chart. Please have a look at the attached screenshot. As you can see, 1221 is the response time, and the data on the y-axis is displayed according to the latency.

I am trying to measure responseTime from an endpoint and am using a logger that provides that data point in the JSON payload. As the graph shows, the axis is strangely distributed. I want the y-axis to show the range of possible response times, so that we can notice any spikes and create an alert.

In short, I want to achieve the following:

  • The y-axis should be displayed based on the API response time, such as the 1221 shown in the screenshot.
Dashboard monitoring.png
Logger_Metric.png

Summit Tuladhar

Feb 2, 2021, 11:57:35 AM
to Rakesh Sagitra, Google Stackdriver Discussion Forum
Hi Rakesh,

It looks like you are creating a logs-based metric and extracting a value out of logs and placing it as a label. Each label value creates a unique timeseries (separate line in the chart). It is not the value of the metric itself that is shown in the y-axis. 
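To illustrate the point about labels, here is a minimal Python sketch. It is not the Cloud Monitoring implementation, and the log entries and the label name are made up. It only shows why putting the value into a label gives one series per distinct value, with the y-axis carrying the count of matching entries rather than the response time.

```python
from collections import defaultdict

# Hypothetical log entries: (timestamp in seconds, responseTime in ms).
log_entries = [(0, 1221), (5, 87), (12, 87), (30, 430), (61, 1221), (75, 95)]


def series_with_label(entries):
    """Simulate a counter metric that stores responseTime as a label.

    Every distinct label value becomes its own time series, and the
    charted value is the number of matching log entries.
    """
    series = defaultdict(int)
    for _, response_time in entries:
        series[("Response_time", str(response_time))] += 1
    return dict(series)


labelled = series_with_label(log_entries)
print(len(labelled), "time series")
for labels, count in sorted(labelled.items()):
    print(labels, "-> y value (count):", count)
```

With real traffic almost every request has a different response time, so the number of series grows with the number of distinct values.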

To capture values from the logs and show it in the y-axis, you will need to use distribution logs-based metrics. Due to the unlimited number of log entries that you can ingest in a given period, you can only capture the statistics of the values as a distribution metric. See: https://cloud.google.com/logging/docs/logs-based-metrics/distribution-metrics

Regards,
Summit


Rakesh Sagitra

Feb 2, 2021, 12:39:51 PM
to Google Stackdriver Discussion Forum
Thanks for your response. I also tried creating a distribution metric but wasn't able to get the response time on the y-axis. I am getting different time series data rather than a response time.
Distribution_Metrics.png
Distibution_Metric_Explorer.png

Summit Tuladhar

Feb 2, 2021, 12:42:07 PM
to Rakesh Sagitra, Google Stackdriver Discussion Forum
Can you attach a screenshot? 
Also, make sure you are not adding the latency as a metric label, as that will create a lot of timeseries.

Rakesh Sagitra

Feb 2, 2021, 12:45:34 PM
to Google Stackdriver Discussion Forum
Please find attached the screenshots of the metric label and the Metrics Explorer.
Distribution_Metrics.png
Distibution_Metric_Explorer.png

Summit Tuladhar

Feb 2, 2021, 12:50:00 PM
to Rakesh Sagitra, Google Stackdriver Discussion Forum
It looks like you are adding a metric label called Response_time and setting it to jsonPayload.responseTime. This creates a new timeseries for each responseTime value, which is not what you want. You only need to extract the value so that it is captured as a distribution, without using any labels.

Since you cannot delete a label from an existing metric, please create a new distribution metric without any labels. Make sure you delete this bad metric as it will cause cardinality issues.
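As a rough sketch of what a label-free distribution metric could look like, here is a LogMetric body built as a Python dict. The field names follow the LogMetric resource of the Cloud Logging API as I understand it. The metric name, filter, unit, and bucket settings are assumptions for illustration, so please check them against the documentation before using them.

```python
import json

# Sketch of a LogMetric body for a distribution logs-based metric.
log_metric = {
    "name": "api_response_time",  # hypothetical metric name
    "description": "Distribution of jsonPayload.responseTime",
    "filter": "jsonPayload.responseTime>0",  # hypothetical filter
    # The value is extracted from the log entry, not stored as a label.
    "valueExtractor": "EXTRACT(jsonPayload.responseTime)",
    "metricDescriptor": {
        "metricKind": "DELTA",
        "valueType": "DISTRIBUTION",
        "unit": "ms",  # assumes the logger reports milliseconds
    },
    # Hypothetical bucket layout: 16 finite buckets, each twice as wide.
    "bucketOptions": {
        "exponentialBuckets": {
            "numFiniteBuckets": 16,
            "growthFactor": 2.0,
            "scale": 1.0,
        }
    },
    # Deliberately no "labelExtractors" key, so no per-value time series.
}

print(json.dumps(log_metric, indent=2))
```

Leaving out the label extractors is the important part. The bucket layout only affects how precisely percentiles can be estimated.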

 

Rakesh Sagitra

Feb 2, 2021, 1:13:45 PM
to Google Stackdriver Discussion Forum
I have removed the label and created a new metric. But I am still getting different time series rather than the response time on the y-axis. I have attached the screenshots below. Please have a look at them.
distribution_metric_explorer.png
distribution_metric.png

Summit Tuladhar

Feb 2, 2021, 1:25:19 PM
to Rakesh Sagitra, Brian Hurley, Google Stackdriver Discussion Forum
The chart is showing the 99th percentile of the distribution. A distribution metric buckets the captured values as per your bucketing configuration. It does not capture each unique value as a separate point, because the number of unique values that can be present in logs is large and unbounded.

To have points in regular intervals, the distribution metric aggregates all points in the interval and buckets the values to make it bounded. The percentiles are calculated based on the values in the buckets. This is useful for capturing a large number of values and measuring statistics about it instead of capturing each unique value as a data point. You can also visualize the chart as a heatmap.
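The bucketing and percentile idea can be sketched in a few lines of Python. This is a simplified illustration with made-up bucket bounds and values. The interpolation used here is an assumption, and Cloud Monitoring's exact calculation may differ.

```python
import bisect

# Hypothetical explicit bucket bounds and response times, both in ms.
bounds = [100, 200, 400, 800, 1600]
values = [87, 95, 120, 430, 1221, 87, 150, 300]


def bucket_counts(values, bounds):
    """Count values per bucket; bucket i covers [bounds[i-1], bounds[i])."""
    counts = [0] * (len(bounds) + 1)  # extra slot is the overflow bucket
    for v in values:
        counts[bisect.bisect_right(bounds, v)] += 1
    return counts


def estimate_percentile(counts, bounds, p):
    """Estimate the p-th percentile by linear interpolation in a bucket."""
    target = p / 100 * sum(counts)
    cumulative = 0
    for i, c in enumerate(counts):
        if c and cumulative + c >= target:
            lower = bounds[i - 1] if i > 0 else 0
            upper = bounds[i] if i < len(bounds) else bounds[-1]
            return lower + (upper - lower) * (target - cumulative) / c
        cumulative += c
    return float(bounds[-1])


counts = bucket_counts(values, bounds)
p50 = estimate_percentile(counts, bounds, 50)
p99 = estimate_percentile(counts, bounds, 99)
print("bucket counts:", counts)
print("estimated p50:", p50, "estimated p99:", p99)
```

In this sketch the estimated 99th percentile falls somewhere inside the 800 to 1600 ms bucket rather than exactly on 1221. That is why the chart shows a percentile estimate and not the raw logged value.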

Does this work for your use case of measuring response times?

There's also a feature request open to support gauge logs-based metrics to capture individual points, but that will have limitations on sampling intervals: https://issuetracker.google.com/issues/136239018



Rakesh Sagitra

Feb 3, 2021, 8:43:22 AM
to Google Stackdriver Discussion Forum
Can you please guide me on exactly how I can follow the approach you described?

I want a graph of responseTime over time for a certain endpoint. This endpoint has the data in its logs, but how can I visualize it? It does not have to be a line graph. It can be a bar chart or anything that allows us to see the spikes easily.

Summit Tuladhar

Feb 3, 2021, 9:46:11 AM
to Rakesh Sagitra, Google Stackdriver Discussion Forum
Logs-based metrics can only capture a distribution of the values from logs. You can visualize the distribution using a Heatmap chart type. If there has been an increase in response time, for example, you can visualize that in the heat map. You can also look at the various percentiles of the response times.

What you cannot do is capture individual values from logs as individual points with exact y-axis values. Metrics have ingestion limits on the number of points that can be ingested per timeseries per minute. Logs have no such limit. There could be thousands of data points in one second. That's why a distribution metric is required to measure the average, or certain percentiles over those values. Does that make sense?
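As a small illustration of that last point, the Python sketch below collapses many raw values into one summary per minute. The data is made up, and a real distribution metric stores bucket counts rather than just a count and a mean. The idea is the same though: many log values in, one point per interval out.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw log values: (seconds since start, responseTime in ms).
raw = [(1, 90), (2, 110), (30, 100), (61, 95), (62, 1221), (90, 105), (125, 98)]


def per_minute_summary(points):
    """Group raw values into one-minute intervals and summarize each one."""
    by_minute = defaultdict(list)
    for ts, value in points:
        by_minute[ts // 60].append(value)
    return {
        minute: {"count": len(vals), "mean": mean(vals)}
        for minute, vals in sorted(by_minute.items())
    }


summary = per_minute_summary(raw)
for minute, stats in summary.items():
    print("minute", minute, stats)
```

The spike to 1221 ms still shows up as a jump in the mean for the second minute, even though the individual point is no longer stored.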
