We're plugging Prometheus into DNS monitoring on a fairly large scale. One of our goals is to understand how many queries of various types and methods we're seeing from network ranges, as well as the autonomous system neighbors and origins associated with those networks. We are building a collector that will parse our DNS queries into time-series data and add labels that give us enough dimensions to perform the queries we want. We are taking BGP announcements, creating labels based on those announcements, and associating them with counters for various metrics, along with the additional ASN origin/transit labels.

Here is an example metric, which stores how many "A" record requests we receive, along with the labels associated with each entry:

  dns_num_query_type_A{env="prod", loc="ams", shost="res409", region="eu", origin-ip="10.3.2.0/24", origin-as="1239", transit-as="2914"}

env will have <10 values ("prod", "dev", "test", and a few more TBD)
loc will have ~180 values (three-letter city codes)
shost will have up to ~3,000 label values (six-character alphanumeric hostnames)
region will have ~8 values (global area, two-letter codes)
origin-ip will have up to ~700,000 label values, each a CIDR-notation network (e.g. 10.3.2.0/24)
origin-as will have up to ~60,000 label values (int64)
transit-as will have up to ~3,000 label values (int64)
My question is: what problems am I going to hit with Prometheus using such large label dimensions? Will this work at all?

Obviously, I'm concerned about the "origin-ip" and "origin-as" labels having so many possible values. There are a lot of devils in the details here which make the labels a lot less scary than they look (the set is not fully expanded dimensionally), but it's still a really big number.

I've read the warning (https://prometheus.io/docs/practices/naming/) about high cardinality in labels. I don't see any way to get away from it in this case. It's also not exactly "unbounded", since there is a maximum limit on each of the values, and the churn on many of the labels is extremely low.

Why are we doing this? The operational reasons I will leave out of this discussion, since that is an internal issue and you'll have to take my word on it. These metrics are mostly used for "top-10" type queries in Grafana, in various ways, to help us locate high-volume peers or transit partners, identify DDoS attacks, build white/black lists, and feed some of our other in-house tools that balance/direct traffic based on activity. We wish to be able to perform queries like: "show me the top 10 transit peers in Amsterdam sending us A record requests in the last 24 hours" (see the query sketch below). As far as I can see, this is possible with Prometheus, but I'd like to hear whether this is actually a workable plan or whether I need to move to some other datastore for our short-term monitoring. We've experimented with some data so far, but I always like to get opinions from people with battle scars before I attack an unknown problem.

Note: we're only going to be logging data that we see, so if there is no traffic from a particular network, we will remove that network from the scraped set of data after some interval. If we never see data from a particular network, we'll never log a value for it from any of our servers. This should keep the set of data we're pulling from each DNS resolver to a reasonable number, so we're not pulling a fully expanded set of data from anywhere. Each server would be handing back (as a rough example) 1,200 different origin-ip label values, and origin-as and transit-as would never be more than one per origin-ip entry (because at any one time there can only be one origin AS and one transit AS for a network at any site, even though those values may change over time). I guess what I mean is that origin-as and transit-as are NOT additional dimensions; they're just informational tagging.

I'm also considering dropping "shost" from the label list, since all hosts within a particular location (POP) will exhibit the same characteristics from an origin-ip/origin-as/transit-as mapping perspective, so that label may be redundant, adding unnecessary time-series divisions. See below for how "loc" and "region" won't cause additional dimensionality either.

The collection interval would be maybe 5 minutes, maybe more, depending on the insertion speed we see on an NVMe-equipped machine.
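[Editor's note: for concreteness, here is a minimal PromQL sketch of that "top 10 transit peers in Amsterdam" query, assuming the A-record metric above is a counter and that the labels are exposed with underscores rather than hyphens (Prometheus label names may only contain letters, digits, and underscores, so "transit-as" would need to be exposed as "transit_as"):

  # Top 10 transit ASes by A-record query volume in Amsterdam over 24h.
  # "transit_as" is the underscore spelling of the transit-as label above.
  topk(10,
    sum by (transit_as) (
      increase(dns_num_query_type_A{loc="ams"}[24h])
    )
  )

With the qtype-label form discussed further down, the same query would read topk(10, sum by (transit_as) (increase(dns_num_query{qtype="A", loc="ams"}[24h]))).]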
A separate discussion, though related to labels:

We also considered adding another label and breaking the metrics up by query type, instead of storing each type as its own metric, but we're not sure that's wise. Currently we have a metric per query type, but I suppose the collectors could be changed to express the type as a label instead of a separate metric name. This would really cram a lot of dimensionality into the time series, so it makes me a bit wary. What's the preferred method, and why? There are about 40 different DNS record types we'll want to monitor, which makes me lean towards making this a label, as per the best practices, but I'm still not confident of the decision.

Current method:

  dns_num_query_type_A{env="prod", ...}
  dns_num_query_type_AAAA{env="prod", ...}
  ...

Possible method:

  dns_num_query{qtype="A", env="prod", ...}
  dns_num_query{qtype="AAAA", env="prod", ...}
  ...
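[Editor's note: a rough illustration of why the label form tends to be easier to query, assuming the qtype-label metric shape and underscored label names, and a range wide enough to hold several samples:

  # All record types together, per location:
  sum by (loc) (rate(dns_num_query[5m]))

  # Busiest record types in one location:
  topk(10, sum by (qtype) (rate(dns_num_query{loc="ams"}[5m])))

  # With one metric name per type, the same questions need a metric-name
  # regex such as {__name__=~"dns_num_query_type_.+"}, which is clumsier
  # to write and to keep in sync as new record types appear.
]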
Another label-based question:

I have two labels, "region" and "loc", which are tightly linked. A location is always going to be in the same region; the relationship is never going to change. However, I found that there was no easy way to do summary lookups in my Grafana queries if I didn't have "region" - otherwise, I was writing huge, ugly regexps that specified every loc contained in a region. Is there an easier way to do this that doesn't involve creating a label that exists only for my querying convenience? It seems wasteful.
Lastly: I'm aware of the pcap-based methods and existing databases that can capture this information; I'm specifically asking about Prometheus tagging, and not alternate DNS logging in general.

JT
On 15 December 2016 at 20:34, <john...@gmail.com> wrote:

> env will have <10 values ... origin-ip will have up to ~700,000 label
> values ... origin-as will have up to ~60,000 label values ...
> transit-as will have up to ~3,000 label values
> [snip]

That's a cross-product of 6.8e20, which is many orders of magnitude more than a single Prometheus server can handle, or than can be fit into a modern computer.

You should aim to keep the total number of time series in a Prometheus server no higher than the tens of millions, as it tends to start running into difficulty around that point. Queries that touch more than about 1-10k time series can also be slow, depending on hardware.
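[Editor's note: while experimenting, a couple of hedged PromQL probes can help keep an eye on how many series the collector is actually producing (metric and label names as above, underscored):

  # Current number of series behind one metric:
  count(dns_num_query)

  # How many distinct origin networks are actually being reported:
  count(count by (origin_ip) (dns_num_query))
]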
> Interval on collection would be maybe 5 minutes, maybe more, depending on
> the speed of insertions that we see on an NVMe-equipped machine.
> [snip]

The longest sane scrape interval in Prometheus at the moment is about 2 minutes, due to staleness handling.
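[Editor's note: the practical consequence, sketched with the qtype-label metric assumed above: rate() needs at least two samples inside its range, so with a 5-minute collection interval short ranges come back empty, and instant lookups sit right at the edge of the staleness window.

  # With a 5-minute scrape interval this window will usually hold only one
  # sample, so rate() returns nothing for the series:
  rate(dns_num_query[5m])

  # Widening the range to several scrape intervals works, at the cost of a
  # much smoother (coarser) rate:
  rate(dns_num_query[15m])
]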
> There are about 40 different DNS record types we'll want to monitor, which
> makes me lean towards making this a label, as per the best practices, but
> I'm still not confident of the decision.
> [snip]

That's still the same cardinality either way, and it's exactly where a label should be used.
> Is there an easier way to do this that doesn't involve creating a label
> that exists only for my querying convenience? It seems wasteful.

You should be able to use a variant of the approach described at https://www.robustperception.io/exposing-the-software-version-to-prometheus/ : create a time series per (region, loc) pair, and then limit your matches with "and on (loc) (region_locs{region='eu'})". However, given your high cardinality, having the additional label is best, as it avoids fetching data for all regions.
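[Editor's note: a sketch of that join, assuming a hypothetical info-style series region_locs (value 1, one series per loc, exposed by the collector or a small static exporter) and the qtype-label metric form with underscored label names:

  # region_locs{region="eu", loc="ams"}  1
  # Per-location A-record rate, restricted to locations in the "eu" region,
  # without carrying a region label on the data itself:
  sum by (loc) (rate(dns_num_query{qtype="A"}[5m]))
    and on (loc)
  region_locs{region="eu"}
]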
Brian

> Lastly: I'm aware of the pcap-based methods and existing databases that can
> capture this information; I'm specifically asking about Prometheus tagging,
> and not alternate DNS logging in general.
>
> JT

--
Brian Brazil