Ping probe limits

scott.p...@gmail.com

Dec 19, 2018, 6:49:19 AM
to Cloudprober
I have been configuring the cloudprober ping probe against a large number of targets. Everything seems fine up to around 1k hosts. When I try to increase the number of hosts to around 2k, a number of the probes fail (sent != received). What is the expected limitation of the ping probe?

Manu Garg

Dec 19, 2018, 2:14:04 PM
to Scott Pettyjohn, Cloudprober
Hi Scott,

That's a good question. We haven't actually benchmarked the ping probe's performance. AFAICT, there are no hard limits. The data channel that probe results are put on holds 1000 items, but that should never cause sent != received; any backlog there can only result in missing data. It's possible that the container or machine that cloudprober runs on gets hot when you have more than 1000 targets, so cloudprober doesn't get scheduled frequently enough. What does cloudprober's CPU consumption look like? You can also try different interval/timeout combinations to see if that helps.
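
To get a rough feel for the load involved, here is a back-of-the-envelope sketch in Python. It only multiplies out the numbers; the function name is made up for illustration, and the figures of 2 packets per target every 2 s are taken from the default mentioned later in this thread, so adjust them to your own config.

```python
def aggregate_packet_rate(num_targets, packets_per_probe, interval_sec):
    """Estimate ICMP requests sent per second by a single ping probe.

    Hypothetical helper: plain arithmetic, not something cloudprober exposes.
    """
    return num_targets * packets_per_probe / interval_sec


# Assumed settings: 2 packets per target every 2 seconds.
for targets in (1000, 2000):
    rate = aggregate_packet_rate(targets, packets_per_probe=2, interval_sec=2.0)
    print(f"{targets} targets: about {rate:.0f} requests/s, plus as many replies")
```

Doubling the targets doubles the packet rate, and lengthening the interval lowers it proportionally, which is one reason a different interval/timeout combination might help.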

Also, I don't have any experience probing more than about 800 targets per probe. We usually shard our probing setup so that no single task does more than a certain amount of probing. Maybe you can give that a try: run two cloudprober instances, each probing a subset of the targets, and let the monitoring system combine the data.
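
One possible way to split the target list is sketched below in Python. This is an illustration only, not how any particular setup does its sharding: `shard_targets` is a hypothetical helper, and hashing hostnames with CRC32 is just one simple choice that keeps a host on the same shard from run to run.

```python
import zlib


def shard_targets(targets, num_shards):
    """Assign each hostname to exactly one shard using a stable hash.

    Hypothetical helper for generating per-instance target lists.
    """
    shards = [[] for _ in range(num_shards)]
    for host in targets:
        # CRC32 is deterministic across runs and machines,
        # unlike Python's built-in hash() for strings.
        index = zlib.crc32(host.encode("utf-8")) % num_shards
        shards[index].append(host)
    return shards


# Example hostnames are made up.
hosts = [f"host-{i}.example.com" for i in range(2000)]
for i, shard in enumerate(shard_targets(hosts, 2)):
    print(f"instance {i}: {len(shard)} targets")
```

Each resulting list would then go into the targets section of one cloudprober instance's config. Hash-based splitting is not perfectly even, so check the shard sizes before relying on it.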

Hope that helps.

Cheers,
Manu

--
You received this message because you are subscribed to the Google Groups "Cloudprober" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cloudprober...@googlegroups.com.
To post to this group, send email to cloud...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cloudprober/9bd51b0a-cf37-4e5e-b797-b11e1cde9f44%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Manu Garg
Creator of Page Notes & Pacparser
"Journey is the destination of life."

scott.p...@gmail.com

Jan 31, 2019, 9:31:07 AM
to Cloudprober
What could be causing the ping probe to fail on some targets consistently with “Duplicate reply”?

Manu Garg

Jan 31, 2019, 12:38:21 PM
to Scott Pettyjohn, Cloudprober
I am not sure. Does it happen continuously? Can you ping those targets directly, with the same configuration (interval, timeout)? If direct pings look fine, maybe there is an obscure bug somewhere in cloudprober (I say obscure because we run a lot of ping probes with a lot of targets and haven't come across a bug that looks like this; we see duplicate replies sometimes, but they are explainable).

Also, can you take a tcpdump to see what's going on? Are there really duplicates on the wire? Maybe cloudprober is sending duplicate requests (requests with the same ICMP ID and sequence number) due to some bug. Cloudprober uses the same ICMP ID throughout its lifetime and cycles through sequence numbers. So if you're sending 2 packets every 2s (the default), you won't see a sequence number repeat for 256s (about 4 min). That leaves a possibility: if your target occasionally takes a long time to respond, old replies may show up alongside the replies from the current probe cycle (hence duplicate replies).
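
Once you have a capture, a check along these lines could tell the two cases apart. This is a minimal Python sketch under stated assumptions: it expects the ICMP fields to have already been extracted from the capture into `(direction, icmp_id, seq)` tuples, a layout invented here for illustration, and `find_duplicates` is a hypothetical helper, not part of cloudprober or tcpdump.

```python
from collections import Counter


def find_duplicates(packets):
    """Return (icmp_id, seq) pairs seen more than once, split by direction.

    `packets` is an iterable of (direction, icmp_id, seq) tuples, where
    direction is "request" or "reply" (assumed, pre-parsed input).
    """
    counts = Counter(packets)
    duplicates = {"request": [], "reply": []}
    for (direction, icmp_id, seq), seen in counts.items():
        if seen > 1:
            duplicates[direction].append((icmp_id, seq))
    return duplicates


# Made-up sample: one reply arrives twice, no request is repeated.
sample = [
    ("request", 7, 1), ("reply", 7, 1), ("reply", 7, 1),
    ("request", 7, 2), ("reply", 7, 2),
]
print(find_duplicates(sample))
```

Duplicates under "request" would point at the sender, while duplicates only under "reply" would point at the network or the target. Run it on a capture window shorter than the sequence-number wrap period; otherwise legitimate reuse of a sequence number will be counted as a duplicate.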

Usually duplicate replies themselves are not the problem. They just indicate there is something wrong in the network or at the node.

(Taking a look at your config will also help debug this issue further.)
