Karl Auerbach (ka...@cavebear.com) wrote:
: A lot of companies do some very serious checking on the quality of their
: devices. Others don't. Sometimes a normally good company gets caught
: when they resell someone else's product.
: All in all, it's difficult for anyone to know the quality of the SNMP
: in the boxes they are buying.
I'm really having a hard time buying this. I don't do all that
much when I look at management stats, but I have never had any problem
spotting bad/clumsy/broken/kludgy implementations. I run a quick series
of sweeps using forms that I built up in Netview to look at response to
system and interfaces (if) group queries, then start looking at table dumps
using Netview's built-in MIB walker for the device-specific stuff. After
that, I focus on SETs against some key elements, again both generic and
device-specific.
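
As an illustration of the kind of quick sweep described above, here is a minimal Python sketch. It is not the Netview forms from the post. It assumes the responses from one agent have already been collected, by whatever tool, into a dict keyed by dotted OID string; the `sweep` function and that dict layout are hypothetical. The OIDs are the standard MIB-II system scalars, ifNumber, and the ifIndex column.

```python
# Standard MIB-II system group scalars (instance .0).
SYSTEM_SCALARS = {
    "sysDescr": "1.3.6.1.2.1.1.1.0",
    "sysObjectID": "1.3.6.1.2.1.1.2.0",
    "sysUpTime": "1.3.6.1.2.1.1.3.0",
    "sysContact": "1.3.6.1.2.1.1.4.0",
    "sysName": "1.3.6.1.2.1.1.5.0",
    "sysLocation": "1.3.6.1.2.1.1.6.0",
    "sysServices": "1.3.6.1.2.1.1.7.0",
}
IF_NUMBER = "1.3.6.1.2.1.2.1.0"
IF_INDEX_COLUMN = "1.3.6.1.2.1.2.2.1.1"


def sweep(responses):
    """Return a list of problems found in one agent's responses.

    `responses` is a hypothetical dict mapping dotted OID strings to the
    values the agent returned for them.
    """
    problems = []
    # Every system group scalar should be answered.
    for name, oid in SYSTEM_SCALARS.items():
        if oid not in responses:
            problems.append(f"missing {name}")
    # ifNumber should agree with the number of ifIndex rows in ifTable.
    rows = [oid for oid in responses if oid.startswith(IF_INDEX_COLUMN + ".")]
    if_number = responses.get(IF_NUMBER)
    if if_number is None:
        problems.append("missing ifNumber")
    elif if_number != len(rows):
        problems.append(f"ifNumber={if_number} but ifTable has {len(rows)} rows")
    return problems
```

An empty list means the agent passed this particular sweep and nothing more; SET tests and device-specific tables would still need their own checks.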
I built up the forms and figured out the generic SETs I needed
over a period of a month or so a couple of years ago. Total invested
time, maybe 25 hours. Running everything except the device-specific SETs
can be done in less than a day. Figuring out a new part of MIB-II
generally takes me a week or less.
Relying on standard MIBs means that I need maybe 6 to 10 business
days to develop a suite of tests that I can run against every
vendor's hardware when I'm doing a product evaluation. When the expected
payoff for my company is $100,000+, I think that is a worthwhile
investment of my time.
When a company plans to SELL us this hardware, I assume that they
expect a payoff measured in millions. If they want my business, I fully
expect that the company will have a product which can meet my admittedly
minimal tests. If they can't, I don't believe that they have a solution
(standards-based or proprietary) that I can regard as ready for
mission-critical applications. Nowadays, it's hard to find a network that ISN'T
mission critical. So, what market segment are they really trying to sell
to?
--
Jim Smilanich | "A man should be able to pilot a starship, plan an
jsm...@winternet.com | invasion, diaper a baby..... specialization is for
Winternet is my access | insects!" -- Lazarus Long
provider, so don't blame|
them for my opinions! |
> : All in all, it's difficult for anyone to know the quality of the SNMP
> : in the boxes they are buying.
>
> I'm really having a hard time buying this. I don't do all that
> much when I look at management stats, but I have never had any problem
> spotting bad/clumsy/broken/kludgy implementations. I run a quick series
> of sweeps using forms that I built up in Netview to look at response to
> system and interfaces (if) group queries, then start looking at table dumps
> using Netview's built-in MIB walker for the device-specific stuff. After
> that, I focus on SETs against some key elements, again both generic and
> device-specific.
A lot of agents are broken in ways that don't show up with quick sweeps
from well-behaved management stations.
One box I dealt with would crash after sysDescr had been read a few hundred
times. Others go wacko when things like the 32nd bit of a counter go on.
Others don't like sweeps of tables that start at an OID that is between
rows. Some can't handle OIDs with more than a dozen or so components.
Others can't handle a large number of simultaneous queries. Others can't
send to management stations that are on Class A networks, etc., etc. Some
are more subtle and don't tally events in all the related counters.
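
To make the "OID between rows" case concrete, here is a small Python sketch of what a correct get-next should do. It is a toy model, not any agent's real code: the table contents and the helper names `parse_oid` and `get_next` are made up for illustration. It relies on the fact that Python tuples compare lexicographically, which matches OID ordering.

```python
from bisect import bisect_right


def parse_oid(text):
    """Turn a dotted OID string into a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))


def get_next(sorted_oids, start):
    """Return the first OID strictly after `start`, or None at the end.

    `sorted_oids` must be sorted; `start` does not have to name an
    existing instance.
    """
    i = bisect_right(sorted_oids, start)
    return sorted_oids[i] if i < len(sorted_oids) else None


# Toy ifDescr column with rows for ifIndex 1, 2 and 5 only.
IF_DESCR = parse_oid("1.3.6.1.2.1.2.2.1.2")
TABLE = sorted(IF_DESCR + (index,) for index in (1, 2, 5))

# A sweep that starts at a non-existent row (ifIndex 3) should land on row 5.
between_rows = get_next(TABLE, IF_DESCR + (3,))

# So should a start OID with extra trailing components after row 2.
long_start = get_next(TABLE, IF_DESCR + (2, 0, 7))
```

A test harness could issue get-next requests from starting points like these and compare the agent's answers against a model of this kind.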
That kind of thing generally gets noticed 6 months or a year after someone
has bought the buggy product, and the blame usually gets assigned to the
new box on the block, not the one that really has the problems.
The big management platforms are highly patterned in their queries. It
often isn't until the management platform is upgraded or a new manager
arrives on the net that the query pattern changes enough to reveal the
problems.
One area where agents really fall on their faces is in SET analysis -- one
can often kill a device by giving it a set request that is inconsistent.
(Sort of like my old example of the agent that would accept a set request
that said "raise wheels" without first verifying that the aircraft is
airborne.)
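
A rough Python sketch of the check that the "raise wheels" agent was missing: validate the state that would result from the whole SET before committing any of it. The `apply_set` function, the `InconsistentSet` exception, and the `gear`/`airborne` variables are all hypothetical, chosen only to mirror the example above.

```python
class InconsistentSet(Exception):
    """Raised when a SET would leave the device in an invalid state."""


def apply_set(state, request):
    """Apply every binding in `request` to `state`, or none of them.

    Both arguments are plain dicts of variable name -> value.
    """
    # Build the state that would result, without touching the real one.
    candidate = dict(state)
    candidate.update(request)
    # Consistency rule from the example: no "gear up" unless airborne.
    if candidate["gear"] == "up" and not candidate["airborne"]:
        raise InconsistentSet("cannot raise wheels while on the ground")
    # Only commit once the whole request has passed validation.
    state.update(request)


on_ground = {"airborne": False, "gear": "down"}
try:
    apply_set(on_ground, {"gear": "up"})
    rejected = False
except InconsistentSet:
    rejected = True

in_flight = {"airborne": True, "gear": "down"}
apply_set(in_flight, {"gear": "up"})
```

The point of validating a copy first is that a rejected request leaves the device exactly as it was, instead of half-applied.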
--karl--
I'll grant you that my tests would not have caught those kinds of
bugs, and probably would not trigger the SET analysis bugs that you
mention. My tests were designed to quickly demonstrate whether a device
belonged on my short list or not. If it failed, I could drop it and
concentrate on the one(s) that passed.
I have run into similar bugs myself. I had a problem with
Frontier's NetScout where a COUNTER variable was not wrapping correctly
after a software upgrade. Frontier fixed the problem in the next patch
release. Granted, I was not happy with them for introducing the bug.
However, I didn't regard it as a fatal blow to our relationship.
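
For reference, this is roughly how a management station can compute the change in a 32-bit counter across a wrap. It is only a sketch: it assumes the counter wrapped at most once between the two polls and that the agent did not restart in between, and the function name is made up.

```python
COUNTER32_MODULUS = 2 ** 32


def counter32_delta(previous, current):
    """Change between two polls of a 32-bit counter, allowing one wrap."""
    return (current - previous) % COUNTER32_MODULUS


# Counter wrapped from near the top of its range back past zero.
wrapped = counter32_delta(4294967290, 5)       # 11

# Crossing into the top bit must not look like a negative jump.
top_bit = counter32_delta(2 ** 31 - 1, 2 ** 31)  # 1
```

An agent that keeps the value in a signed 32-bit integer, or that fails to roll over to zero, will produce deltas that disagree with this arithmetic, which is one way such a bug shows up on the console.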
If it had been the latest in a long string of problems, I might
have. If I had, I felt reasonably sure that I could go out and start
buying RMON probes from someone else and still manage them from the
NetScout console. Oh, the joys of standards-based tools! :-)