Couple of issues with Prometheus and Yet Another CloudWatch Exporter (YACE)


Niranjan Panch

Jan 9, 2025, 8:13:26 AM
to Prometheus Users
Issue#1 

My YACE config is shown below. Sometimes CPU utilization or another metric is reported as NaN (not a number), and I don't understand why, because the underlying CloudWatch data points are correct. What is wrong here?

apiVersion: v1alpha1
discovery:
  jobs:
    - type: AWS/RDS
      regions:
        - us-east-1
        - us-west-2
      roles:
        - roleArn: "arn:aws:iam::xxxxxxxxxx:role/yyyyy"
        - roleArn: "arn:aws:iam::xxxxxxxxxx:role/yyyyy"
        - roleArn: "arn:aws:iam::xxxxxxxxxx:role/yyyyy"
        - roleArn: "arn:aws:iam::xxxxxxxxxx:role/yyyyy"
      period: 60
      length: 60

      metrics:
        - name: CPUUtilization
          statistics: [Average]
        - name: DatabaseConnections
          statistics: [Average]
        - name: FreeableMemory
          statistics: [Average]
        - name: FreeStorageSpace
          statistics: [Average]
        - name: ReadThroughput
          statistics: [Average]
        - name: WriteThroughput
          statistics: [Average]
        - name: ReadLatency
          statistics: [Average]
        - name: WriteLatency
          statistics: [Average]
        - name: ReadIOPS
          statistics: [Average]
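
One thing that may be worth checking, as a sketch rather than a confirmed fix: with `length` equal to `period`, a request can land before CloudWatch has published the newest one-minute data point, and the YACE documentation says NaN is reported by default when CloudWatch returns no data. The fragment below assumes YACE's job-level `length`, `delay` and `nilToZero` options; the values 300 and 120 are illustrative guesses, not tested settings, and only one region and one metric are shown.

```yaml
apiVersion: v1alpha1
discovery:
  jobs:
    - type: AWS/RDS
      regions:
        - us-east-1
      period: 60        # statistic period in seconds, unchanged
      length: 300       # assumed value: look back further than a single period
      delay: 120        # assumed value: skip the newest, possibly unpublished, minutes
      nilToZero: false  # keep NaN for missing data; true would report 0 instead
      metrics:
        - name: CPUUtilization
          statistics: [Average]
```

Setting `nilToZero: true` would hide the NaN values but would also make a missing data point look like a real zero, which is probably not wanted for latency or CPU metrics.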


Issue#2

The exporter polls CloudWatch every 60 seconds, Prometheus scrapes the exporter every 20 seconds, and the Prometheus evaluation interval is also 20 seconds.
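
For reference, the Prometheus side of these intervals would look roughly like the fragment below. The job name and target host are placeholders, and port 5000 is assumed to be the exporter's default listen port.

```yaml
global:
  scrape_interval: 20s       # Prometheus scrapes the exporter every 20 seconds
  evaluation_interval: 20s   # rules are evaluated every 20 seconds

scrape_configs:
  - job_name: yace                         # placeholder job name
    static_configs:
      - targets: ["yace.example:5000"]     # placeholder host, assumed default port
```

With these values, each value the exporter fetches from CloudWatch (once per 60 seconds) is seen by roughly three consecutive Prometheus scrapes.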

My following rules are always stuck in the pending state. How do I adjust my configuration to make them work?

(aws_rds_write_latency_average{account_id!="",dimension_DBInstanceIdentifier!=""}) * 1000 > 20

avg_over_time(aws_rds_write_latency_average{account_id!="973732892259",dimension_DBInstanceIdentifier!=""}[10m]) * 1000 > 10
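
In rule-file form, the first expression might be wrapped as in the sketch below. The group name, alert name, `for` duration and label are hypothetical, since the actual rule definitions are not shown here. An alert with a `for` clause stays pending until the expression has returned a result at every evaluation for that whole duration, and a comparison against NaN returns nothing, so a single NaN sample is enough to reset the timer.

```yaml
groups:
  - name: rds-latency                # hypothetical group name
    rules:
      - alert: RDSWriteLatencyHigh   # hypothetical alert name
        expr: (aws_rds_write_latency_average{account_id!="",dimension_DBInstanceIdentifier!=""}) * 1000 > 20
        for: 5m                      # assumed; the real duration is not shown in this post
        labels:
          severity: warning          # hypothetical label
```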

Issue#3

Sometimes, when I manually compare the values in CloudWatch with those from the Prometheus exporter, I see fluctuations in the CloudWatch data points that Prometheus does not reflect.

For example, every few minutes CloudWatch reports a write latency of 80 ms, which then drops to 10 ms, stays there for 3-4 minutes, and goes back up. In the Grafana dashboard this fluctuation is not visible, and it shows 60-80 ms for the entire hour.

Please help me fix my configuration.

Regards,
Niranjan

