> An important aspect of managing network performance is understanding
> congestion in a forwarding device. Monitoring queue depth is a
> practical way to understand whether buffers or queue resources are
> over- or under-utilised and to infer queuing delay.
I like using an outbound queue length, and during parts of my "day job" suggest just that. Alas, for some reason the SNMP folks decided to deprecate ifOutQLen from the interface MIB - something about "by the time the NMS sees it, it is already old". Of course, that holds true for just about anything queried via SNMP, and presumably for anything defined as a Gauge.
> Here are some suggestions for using sFlow to monitor queue length. Any
> comments?
>
> Exporting the queue depth experienced by a sampled packet, as an
> extended flow structure, is ideal and enables scalable analysis of
> queuing delay experienced by classes of traffic.
I suppose that depends on just where the packet is sampled. If it is sampled on "inbound", it will not yet have experienced any queuing the sampler can discern. If it is sampled on "outbound" past the queue (say on transmit completion), the information is already toast unless the packet descriptor was tagged with the queue length upon enqueuing. Only if it is sampled at the time of queuing for outbound can the frame be tagged with a queue depth without having to "remember" it.
Do the sFlow specs mandate or even suggest "where" the sample point should be?
> /* Queue length counters
> Histogram of queue lengths experienced by packets when they are
> enqueued (ie queue length immediately before packet is enqueued)
> thus giving the queue lengths experienced by each packet.
> Queue length is measured in segments occupied by the enqueued
> packets.
> Queue length counter records for each of the queues for a
> port must be exported with the generic interface counters
> record, if_counters, for the port.*/
>
> /* Queue length histogram counters
> opaque = counter_data; enterprise = 0; format = 1003 */
>
If one is including a queue length with counters, is there really a need to transmit an entire histogram? If there is indeed some randomization of the counter sampling interval (perhaps even if not), then presumably the collector can keep a histogram of the individual queue length values it has seen. Further, ostensibly the dropped stat is redundant with ifOutDiscards already present in the generic counters, no?
While all the world is not IP (more's the pity ?-), if an IP datagram encounters congestion, isn't an ECN bit supposed to be set these days? There is still something of a race condition between the setting of ECN and the sample point, but it doesn't require any further sFlow enhancements - just for the collector to check the sampled headers for ECN bits.
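To make the "check the sampled headers" idea concrete, here is a minimal Python sketch. It assumes the collector has already stripped any layer-2 header so that the buffer starts at the IPv4 header; the function names are made up for illustration and are not part of any sFlow library. The ECN field is the low two bits of the second header byte, and the value 0b11 is "Congestion Experienced" per RFC 3168.

```python
ECN_CE = 0b11  # "Congestion Experienced" codepoint (RFC 3168)


def ipv4_ecn(header: bytes) -> int:
    """Return the 2-bit ECN field from a raw IPv4 header.

    Assumes `header` starts at the IPv4 header (no Ethernet/VLAN bytes).
    """
    if len(header) < 20 or header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # Byte 1 is DSCP (upper 6 bits) + ECN (lower 2 bits).
    return header[1] & 0b11


def saw_congestion(header: bytes) -> bool:
    """True if the sampled packet carries the CE mark."""
    return ipv4_ecn(header) == ECN_CE
```

A packet that is ECN-capable but unmarked (ECT(0) or ECT(1)) returns False here, so only packets actually marked by a congested hop are counted.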
rick jones
The queue length needs to be captured at the point the packet is
enqueued, but it is possible that the packet could have been marked
for sampling on ingress. The only way that you can accurately
associate the queue length with the sampled packet is if the hardware
provides explicit support for the feature, but there are many possible
implementations.
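
As one possible illustration (a toy software model, not a description of any particular switch), the sketch below marks a packet for sampling on ingress and records the queue depth on its descriptor at the moment it is enqueued. The Descriptor and EgressQueue names are hypothetical, and depth is counted in packets here rather than in buffer segments.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    payload: bytes
    sampled: bool = False              # set on ingress by the sampler
    queue_depth: Optional[int] = None  # filled in at enqueue time


class EgressQueue:
    def __init__(self) -> None:
        self._q: deque = deque()

    def enqueue(self, d: Descriptor) -> None:
        if d.sampled:
            # Depth immediately before this packet joins the queue.
            d.queue_depth = len(self._q)
        self._q.append(d)

    def dequeue(self) -> Descriptor:
        return self._q.popleft()
```

Because the depth travels with the descriptor, the sample can be exported at any later point (for example on transmit completion) without the value going stale.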
>
> Do the sFlow specs mandate or even suggest "where" the sample point should be?
>
sFlow permits ingress, egress or bidirectional sampling. Switch
designers are free to choose the location for sampling that best fits
their forwarding architecture.
> If one is including a queue length with counters, is there really a need to transmit an entire histogram? If there is
> indeed some randomization of the counter sampling interval (perhaps even if not), then presumably the collector can
> keep a histogram of the individual queue length values it has seen.
The set of counters needs to be maintained on the switch since longer
queue lengths should be rare and you don't want to miss them. Exporting
the full set of counters in the histogram allows the sFlow analyzer to
see if any packets experienced queueing delays irrespective of the
polling intervals and packet rates.
> Further, ostensibly the dropped stat is redundant with ifOutDiscards already present in the generic counters, no?
Not necessarily. There may be 4 or 8 queues per interface, and you want
to know discards by queue. The sum of all the discards across all the
queues on the interface should add up to ifOutDiscards.
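
On the analyzer side, that relationship can serve as a sanity check. The sketch below is only an illustration: the function name and the sample numbers are invented, and in practice the per-queue and interface counters may be read at slightly different instants, so a small mismatch would not necessarily indicate an error.

```python
def check_discards(per_queue_drops, if_out_discards):
    """Compare summed per-queue drops with the interface-level counter.

    Returns (matches, total, worst_queue), where worst_queue is the index
    of the queue with the most drops.
    """
    total = sum(per_queue_drops)
    worst_queue = max(range(len(per_queue_drops)),
                      key=lambda i: per_queue_drops[i])
    return total == if_out_discards, total, worst_queue


# Eight queues on one port; almost all drops are in queue 0.
ok, total, worst = check_discards([120, 0, 3, 0, 0, 0, 0, 1], 124)
```

The per-queue view is what tells you that queue 0 is the one in trouble, which the single ifOutDiscards value cannot.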