Hello,
it's only working partly, I think. If I add the same target several times to the same job, Prometheus treats targets with the exact same naming as one.
This results in one target in Prometheus' web UI target list, and tcpdump confirms only one scrape per 60s.
If I use this I get 4 different namings for the same target, which results in 4 scrapes. However, with this at most 4 permutations are possible I think, and with plain http only 2.
scheme: https
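
Roughly what I mean, just as a sketch: the hostname and the four spellings are placeholders, and the relabeling is the usual pattern from the blackbox_exporter README:

  - job_name: blackbox-api               # hypothetical job name
    metrics_path: /probe
    params:
      module: [http_2xx]
    scrape_interval: 60s
    static_configs:
      - targets:
          - https://api.example.com       # four spellings of the same
          - https://api.example.com/      # endpoint, so Prometheus sees
          - https://api.example.com:443   # four distinct targets
          - https://api.example.com:443/
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115       # blackbox_exporter address, placeholder
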
And at least for me they do not spread out as evenly as I hoped, and in addition I now have 4 different instances.
Maybe I could fix this by relabeling the "instance" field, but this sounds as wrong as relabeling the "job".
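If I understand relabeling correctly, that fix would mean forcing one common instance name with an extra rule like this (the replacement value is made up, just to illustrate):

      - target_label: instance
        replacement: api.example.com      # hypothetical fixed instance name for all four spellings

And if all four duplicates then produce samples with identical labels, they would presumably just collide anyway.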
Back to your question:
"Does it really matter whether it was 20 seconds or 25 seconds?"
I don't know if this is relevant. It's a rare issue and I am in discussion with the vendor of the API/appliance. However, it could maybe give me some more indication if the API would respond after, let's say, 50s or 3 minutes.
If scrape_timeout is reached, the exporter sends a RST if I remember correctly, which is good to close the connection, but it will also close the connection to the API, and the API server then maybe just writes "client closed connection" or something similar to its log.
I don't know if it is really a problem if the answers of two parallel probes overlap (probe duration longer than the scrape interval), because the connections use different source ports and Prometheus allows "out-of-order" ingestion if I remember correctly.
Perhaps it could lead to many unclosed connections which consume memory. Let's say the interval is 1s and the timeout is 60s, then there could be 60 connections in parallel.
Maybe a scrape_timeout longer than scrape_interval could be handled like this:
scrape_interval: 15s
scrape_timeout: 60s
If scrape_timeout is longer than scrape_interval, check whether the probe finished before scrape_timeout and do the next scrape according to scrape_interval.
If scrape_duration is longer than scrape_interval but shorter than scrape_timeout, skip the next scrape(s) until the timeout is reached or the scrape succeeds.
However this would not allow parallel scrapes.
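
Very roughly, I picture something like this (only a sketch in Go, nothing to do with how Prometheus actually schedules scrapes; probe(), the durations and everything else are made up):

package main

import (
	"context"
	"fmt"
	"time"
)

// probe stands in for the real blackbox probe; here it just blocks for a
// while or until the scrape_timeout context expires.
func probe(ctx context.Context) {
	select {
	case <-time.After(20 * time.Second):
	case <-ctx.Done():
	}
}

func main() {
	const (
		scrapeInterval = 15 * time.Second
		scrapeTimeout  = 60 * time.Second
	)

	busy := make(chan struct{}, 1) // holds a token while a probe is in flight
	ticker := time.NewTicker(scrapeInterval)
	defer ticker.Stop()

	for range ticker.C {
		select {
		case busy <- struct{}{}: // no probe running: start the next scrape on schedule
			go func() {
				defer func() { <-busy }()
				ctx, cancel := context.WithTimeout(context.Background(), scrapeTimeout)
				defer cancel()
				probe(ctx)
			}()
		default: // previous probe still running: skip this tick, no parallel scrape
			fmt.Println("probe still running, skipping this scrape_interval tick")
		}
	}
}
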
Probably this is a rare scenario and debugging an API with blackbox_exporter was only an idea. I just wanted to ask if I am missing something :-)
Thanks for sharing your ideas.