2^^128 is A Whole Lot of address space. I was (and am) of the
politically incorrect view that 2^^64 is enough if one doesn't
immediately spend 48 bits for a flat system identifier portion of the
address. If one does spend 48 bits that way but moves to a 16 byte
fixed length address, then one still has 10 bytes/20 nibbles/80 bits
of space for hierarchy. I suggest that IPng use hexadecimal notation
in its human readable form so that nibble subnet boundaries are easier
on us humans. I will then claim there are about 10-20 levels of
hierarchy available to make routing easier and more efficient. This
is not an unlimited number but I really think it is enough.
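The arithmetic above can be checked with a short sketch. It assumes, as one plausible reading of "10-20 levels", that each hierarchy level consumes either a whole byte or a single nibble; the names below are illustrative only.

```python
# Arithmetic from the paragraph above: a 16-byte fixed-length address with
# 48 bits spent on a flat system identifier.
ADDRESS_BITS = 16 * 8      # 128-bit address
SYSTEM_ID_BITS = 48        # flat system identifier portion

hierarchy_bits = ADDRESS_BITS - SYSTEM_ID_BITS
hierarchy_bytes = hierarchy_bits // 8
hierarchy_nibbles = hierarchy_bits // 4


def levels(bits_per_level: int) -> int:
    """Whole hierarchy levels available if each level uses bits_per_level bits."""
    return hierarchy_bits // bits_per_level


# Assumption: a level is one byte (coarse) or one nibble (fine).
print(hierarchy_bits, hierarchy_bytes, hierarchy_nibbles)
print(levels(8), levels(4))
```

Other level widths give other counts; the byte and nibble cases simply bracket the range quoted in the text.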
I think there is some convergence of some parts of the community
here. I just hope it isn't polarisation.
as he reaches for his asbestos... :-)
> the followup comments from several previously non-vocal router
> implementers are significant. The vast majority of those who spoke up on
> big-Internet clearly favoured fixed length addresses. ... the noteworthy
> fact is the amount of collective router implementation experience that
> prefers fixed length addresses.
Stop trying to manufacture a non-existent consensus. My impression is that
both sides have a large number of proponents. I see no rough consensus here.
As an aside, most of the people I think of as doing well in the business of
selling routers for money i) want variable length <whatevers>, and ii) say
they can build boxes that will forward them as fast as fixed length.
> I just hope it isn't polarisation.
Well, I dunno. I think there are people who have their feet sufficiently sunk
in concrete that they just aren't going to change their minds, no matter how
good a rational engineering argument is placed in front of them. I don't think
this counts. If a choice is forced, there would more likely be some very upset
people.
Noel
At last! A really DIFFICULT topic to debate!
In fact, I haven't ever seen a proposal that had pleasant human factors.
Ran's observation about predictability of boundaries is important, but we
still are left with a bloody long string. Also, I think most people do hex
badly. Mostly, we seem to need decimal.
Dave
+1 408 246 8253 (fax: +1 408 249 6205)
>Stop trying to manufacture a non-existent consensus. My impression is that
>both sides have a large number of proponents. I see no rough consensus here.
Ran wasn't. I saw no words in Ran's memo that used the word or alluded to
consensus. Ran stated what happened, that's all. That's all I read into
his message as input to Big-I, and I believe it was valid. It did not
state consensus as a fact.
>As an aside, most of the people I think of as doing well in the business of
>selling routers for money i) want variable length <whatevers>, and ii) say
>they can build boxes that will forward them as fast as fixed length.
Please substantiate this with names of vendors and who stated what, when,
and on what mail list. We just went through this on SIPP. Why not just
talk about the technical issues, and not just at an architecture level
but also at an engineering implementation level, where costs and
economics are real factors in decision making?
Anyone on Big-I who has not seen it also needs to see Bob Hinden's mail, which
provided another view on this entire subject.
> I just hope it isn't polarisation.
>Well, I dunno. I think there are people who have their feet sufficiently sunk
>in concrete that they just aren't going to change their minds, no matter how
>good a rational engineering argument is placed in front of them. I don't think
>this counts. If a choice is forced, there would more likely be some very upset
>people.
I sincerely hope this can be avoided.
/jim
There are really two cases to be distinguished here, variable length
with an 'expected' length that you can detect easily and treat as a fixed
length for the fast path, and truly variable length. Even the truly variable
length stuff can be handled with hardware assist, and hardware assist can
pretty much be assumed in high performance boxes anyway.
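The first case, an "expected" length that is detected cheaply and then handled like a fixed-length field, might look roughly like the sketch below. The wire layout (a one-byte length prefix followed by the address) and the names `EXPECTED_LEN` and `read_address` are assumptions made for illustration; nothing in the thread fixes an encoding. In Python the two paths cost about the same; the point is only the shape of the control flow.

```python
from typing import Tuple

EXPECTED_LEN = 8  # hypothetical "expected" address length, in octets


def read_address(buf: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Return (address, next_offset) for a length-prefixed address in buf."""
    if offset >= len(buf):
        raise ValueError("no length octet at offset")
    length = buf[offset]
    start = offset + 1
    if length == EXPECTED_LEN and start + EXPECTED_LEN <= len(buf):
        # Fast path: the length is the expected constant, so the address can
        # be treated exactly like a fixed-length field.
        return buf[start:start + EXPECTED_LEN], start + EXPECTED_LEN
    # Slow path: truly variable length, with an explicit bounds check.
    end = start + length
    if end > len(buf):
        raise ValueError("truncated address")
    return buf[start:end], end


packet = bytes([8]) + bytes(range(8)) + bytes([3, 1, 2, 3])
first, pos = read_address(packet)
second, pos2 = read_address(packet, pos)
```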
The win for the first kind of variable length thing is doubtful:
it lets you grow your EID/locator/whatever space with less pain. Size
of spaces seems to be less of an issue than 'how do you route efficiently
in a billion node network', though.
The win for the 2nd kind is that you can have arbitrarily long
locally assigned chunks, which is a joy and a wonder to behold.
I vote for administrative simplicity, if that means TLV encoded
variable length everythings in the header, with all the fields in
random order, that's ok. The ASICs to shatter a header into its component
atoms and look each atom up in a lookaside buffer will just need a few
more gates. Big Deal.
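To make "TLV encoded, fields in random order" concrete, here is a minimal software sketch of shattering such a header into its fields. The one-octet type, one-octet length layout and the type codes are hypothetical; the message does not define them.

```python
from typing import Dict

# Hypothetical type codes, for illustration only.
TYPE_SRC = 1
TYPE_DST = 2
TYPE_HOP_LIMIT = 3


def parse_tlv(header: bytes) -> Dict[int, bytes]:
    """Split a header of (type, length, value) triples into a type->value map."""
    fields: Dict[int, bytes] = {}
    i = 0
    while i < len(header):
        if i + 2 > len(header):
            raise ValueError("truncated type/length")
        field_type, length = header[i], header[i + 1]
        value = header[i + 2:i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated value")
        fields[field_type] = value
        i += 2 + length
    return fields


def tlv(field_type: int, value: bytes) -> bytes:
    """Encode one field as type, length, value."""
    return bytes([field_type, len(value)]) + value


src = tlv(TYPE_SRC, b"\xaa" * 8)
dst = tlv(TYPE_DST, b"\xbb" * 16)
hop = tlv(TYPE_HOP_LIMIT, b"\x40")
# Field order does not matter to the parser.
assert parse_tlv(src + dst + hop) == parse_tlv(hop + dst + src)
```

Every field costs a loop iteration and a bounds check here, which is the per-packet work the message proposes pushing into hardware.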
Hmm. Is this a host-side guys against a router guys thing?
TLV encoded random order headers would truly suck if you were trying
to get wire speed out of a Sun 3/50.
Andrew
> Hmm. Is this a host-side guys against a router guys thing?
>TLV encoded random order headers would truly suck if you were trying
>to get wire speed out of a Sun 3/50.
Andrew
and there are more hosts than routers by at least 2 orders of
magnitude, but the argument that the routers have to 'go faster'
may not hold nowadays, when the hosts do multimedia, but when only 1%
traffic is nonlocal...
jon
> and there are more hosts than routers by at least 2 orders of
> magnitude, but the argument that the routers have to 'go faster'
> may not hold nowadays, when the hosts do multimedia, but when only 1%
> traffic is nonlocal...
>
Can't let that go. I should be so lucky as to have 99% locality.
75% would be high here and multimedia will REDUCE it. One of the
main reasons we run a big Class B is router performance. We have
to keep MBONE traffic off the Class B precisely because it has
zero locality (not 1%, zero).
Routers and hosts must both go at wire speed. It's worse for routers
because they are connected to multiple wires.
Brian
Jon,
>but when only 1% traffic is nonlocal
I'd like to point out that the Internet is turning into the Information
market (and it is doing this quite successfully). In such an
environment the information is all over the net, and if the predominant
use of the net is for the information retrieval, then I don't see how
"only 1% traffic" is going to be nonlocal.
So, the argument that the routers "have to go faster" may be
more important in the future than it is today (as redistribution
between local and non-local traffic shifts towards more non-local
traffic).
Yakov.
yakov - (and brian made the same point)
i'm sorry, i disagree
in the short term you are right, and mbone&www growth support you, but
when domestic traffic takes off, and web caching is working right, and
the net is engineered better (more hierarchically), i
think you will find that it reverts to the telephony pattern
jon
Jon,
>in the short term you are right, and mbone&www growth support you.
Thanks.
>but when domestic traffic takes off, and web caching is working right,
>and the net is engineered better, I think you will find that it reverts
>to the telephony pattern.
If the majority of the Internet users are going to be consumers
(not such an unrealistic assumption), and if these consumers
are going to access the Internet through some dial-up capabilities
(also not such an unrealistic assumption), there will be AT LEAST
one router between the place that wants to access some information,
and the place where the information is stored.
The factors you mentioned would improve the average number of hops,
but I still think that most of the traffic is going to be nonlocal,
especially if "local" means "local to my house".
Yakov.
We are in violent agreement that ordinary users must not have to
know about address formats and that we absolutely must have very
solid auto-addressing and auto-configuration mechanisms so that plug
and play works from day 1.
My point is different, I think. The discussion in Chicago confirmed
my own experience which is that the "network managers" for connected
networks (i.e. not an isolated LAN) _must_ know about subnet masks and
variable length subnetting to do their job reasonably well. I don't
think that will change with IPng. Certainly the large interconnected
IPX networks that I know about have NetWare Network Admins who grok and
need to grok these things for IPX. I just want to make it _easier_ for
such a network admin to use his/her address space more effectively --
by making nibble boundaries easier to deal with.
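A small sketch of why nibble boundaries read more easily in hex: when the prefix length is a multiple of four bits, the mask is a run of `f` digits followed by zeros, with no mixed digit in the middle. The helper names are hypothetical, and the 32-bit examples are used only because they are short.

```python
def prefix_mask_hex(prefix_len: int, total_bits: int = 128) -> str:
    """Return the mask for a prefix of prefix_len bits as a hex string."""
    if not 0 <= prefix_len <= total_bits:
        raise ValueError("prefix length out of range")
    mask = ((1 << prefix_len) - 1) << (total_bits - prefix_len)
    return f"{mask:0{total_bits // 4}x}"


def on_nibble_boundary(prefix_len: int) -> bool:
    """True when the prefix ends exactly between two hex digits."""
    return prefix_len % 4 == 0


print(prefix_mask_hex(20, 32), on_nibble_boundary(20))  # boundary falls between digits
print(prefix_mask_hex(22, 32), on_nibble_boundary(22))  # boundary falls inside a digit
print(prefix_mask_hex(48))                              # 128-bit mask, 48-bit prefix
```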
>If the majority of the Internet users are going to be consumers
>(not such an unrealistic assumption), and if these consumers
>are going to access the Internet through some dial-up capabilities
>(also not such an unrealistic assumption), there will be AT LEAST
>one router between the place that wants to access some information,
>and the place where the information is stored.
>The factors you mentioned would improve the average number of hops,
>but I still think that most of the traffic is going to be nonlocal,
>especially if "local" means "local to my house".
architectural prediction:
Cable net has head end + video on demand server at head of each street.
which will also be multiprotocol router (and will also be a CIX)
then next level routers will be city-city, of which there will be
quite a few, since there will be many providers...
next ones will be country to country...again, there may be quite a few
(after all, just what is a "country" nowadays...)
many sites (homes) will be producers - for instance, with good
internet security, all my tax/health/bank/payroll/etc info will be _in
my home_, not in the government server...
locality of culture/language/physical travel for entertainment
will determine locality of interest in information too....
large information servers (video/libraries etc) will be replicated
etc etc
so it'll all go local again in about 10-15 years...
i assert!
jon
>> I'd like to point out that the Internet is turning into the Information
>> market ... In such an environment the information is all over the net,
> in the short term you are right ... but when domestic traffic takes off,
> ... and the net is engineered better (more hierarchically), i think you
> will find that it reverts to the telephony pattern
I don't believe this last statement. The usage patterns of the telephone
network are in large part related to *what* you can do with it; call up
another person and have a chat. The things you can do with the Internet are so
different, I expect future usage patterns will also look different. E.g, how
much traffic on the net is mailing lists and Netnews, all of which is
(basically) widely multicast?
Noel
Thank you for the following posting. I applaud your vision and fully
agree with your statements. However, for the poor unfortunates who have
to grope at this level (and both you and I have been there), I also wish
to support Ran's point that more adequate human factors need to be built in
as well. This is especially the case because I do not see the IETF moving
us into a "user friendly" world anywhere near quickly enough: where is
our autoregistration? End users want scalable and secure plug-and-play now.
Sincerely yours,
--Eric Fleischman
>although the ops folks reading sniffer traces and configuring router filters
>will want some "standard notation", i don't see the big problem here.
>autoconfigure, autoconfigure, then do it some more. why more than .001%
>of ipng users will be exposed to addresses, i do not know. if this is an
>accurate assumption, then why waste the energy? hex is fine, dotted
>dec is fine, but we potentially lose granularity to convenience as we
>have w/ ipv4.
>ask 100 netware "experts" what an ipx address looks like, or how many
>bytes a network id is. 90+ will draw complete blanks. ask 100 users, 100
>will draw blanks. let's straighten out the serverless/serverful
>autoconfiguration and dynamic dns registration problems and worry
>abt "dns name addresses" since this is the only thing that users
>and admins should be exposed to (generally speaking).
>the fact that thousands of users recognize dotted decimal or
>know something abt ipv4 address classes today is offensive.
>let's hide this convenient netlayer/routing address to the
>implementors and conveniently forget to document or display
>these warts on the end-systems.
_______________________________________________________________
J. Allard jal...@microsoft.com
Program Manager of TCP/IP Technologies work: (206)882-8080
Microsoft Corporation home: (206)860-8862
"On the Internet, nobody knows you're running Windows NT"
Bob
Jim,
>So are you saying that routers cannot keep up with wire speeds with
>fixed length addresses.
Not at all. What I've been saying is that when designing IPng we
need to worry as much about host performance implications
as about router performance implications. To put it differently,
we need a globally optimal solution that takes into account both
routers and hosts.
>I think its critical for this discussion if we have a statement
>from someone that fixed length addresses can or cannot attain wire speed.
I think we've already seen data from router vendors that wire speed
can be attained with both fixed and variable length addresses.
Yakov.
Whether IPng can solve the problem with fixed length addresses or
variable length addresses is the issue under discussion.
The tangents we enter should be technical and analytical in
nature, I think.
No, it's not a host guy vs router guy thing; it's a question of cost. If we can
develop a fixed length address that lasts for 35 years, why incur the
cost, in addition to all the other costs we face, to build IPng products?
I won't respond to the TLV encoding for the whole header as variables.
This could take us on another tangent. I write code for Host kernels
and variable addresses are not free and will not perform as fast as
fixed length addresses, unless we can gain performance in other areas,
which is possible and another cost.
/jim
Excuse my ignorance. However in your reply to Andrew you said:
>and there are more hosts than routers by at least 2 orders of
>magnitude, but the argument that the routers have to 'go faster'
>may not hold nowadays, when the hosts do multimedia, but when only 1%
>traffic is nonlocal...
I have no trouble understanding why routers need to quickly parse
addresses to forward packets to the correct next hop -- and do it very
quickly. However, even with real-time applications, I can not imagine why
an end system needs to parse addresses in a similar manner. That is, to get
an address I use a DNS name and obtain an address. The address is a unit.
When I use the address the address remains a unit. When I reply to an address,
the address still remains a unit. Sure, in real time applications
my software needs to do things lickety-split, but I don't see the need
for my host to parse these addresses as routers. Therefore, I don't
understand why my end system cares whether the address is fixed length or
variable, as long as it has the correct address and as long as it is able
to handle that size of address.
Now that I have displayed my ignorance, I would appreciate understanding
what I have overlooked that requires an end system to have equivalent address
parsing requirements and latencies as a router.
Sincerely yours,
--Eric Fleischman
Fine. I will not state what Ross said but please if you have not read
his response to Bill, please do.
Also do you insist on bringing up the past such as what happened when
they defined IPv4? I suggest we not get into that at this point until
we want to use history maybe to support architectural assumptions. I
could respond by saying there is a protocol in the market that supports
variable addresses that has not been wildly successful. But that's not a
good technical argument either.
/jim
=> and there are more hosts than routers by at least 2 orders of
=> magnitude, but the argument that the routers have to 'go faster'
=> may not hold nowadays, when the hosts do multimedia, but when only 1%
=> traffic is nonlocal...
>Can't let that go. I should be so lucky as to have 99% locality.
>75% would be high here and multimedia will REDUCE it. One of the
>main reasons we run a big Class B is router performance. We have
>to keep MBONE traffic off the Class B precisely because it has
>zero locality (not 1%, zero).
But you're not going to debate that:
1) There are more hosts in the world than routers?
2) That the majority of network traffic at most end user
sites that create the major revenue to warrant my job
as a network software engineer is local LAN traffic?
If so we then have to go back to some basic discussions and get
consensus on variants of folks' reality so we can move forward.
>Routers and hosts must both go at wire speed. It's worse for routers
>because they are connected to multiple wires.
I have heard no router code writer tell me that they cannot make fixed
length addresses go at wire speeds.
/jim
Noel
oh blimey, why do i always say the wrong thing
i did _not_ mean that everything would be lots of 1-1 calls like
telephones - i.e. the pattern of calls would have a different
texture, but would have the same _density_!
i meant that the assumption of hierarchical traffic will statistically
still be the case....
our _measurements_ show that netnews and mail are totally irrelevant
as a percentage...btw
also www & mbone traffic are only the way they are as i said, coz the
engineering of the www servers and the engineering of the mbone are
not cachized or pim-ized yet...
jon
So are you saying that routers cannot keep up with wire speeds with
fixed length addresses. Can you explain in detail the software code
bottleneck in implementations that would prevent this using a generic
routing code base or assists algorithms?
I agree the Information Highway is important to consider as market and
will alter the present market. In fact Bob Hinden has a complete
section very well articulated in the SIPP IPng White Paper he did on
this subject, which can be read whether you like SIPP or not. I sent it
around to some of marketing folks and they pretty much agreed with Bob's
projections of where the market will end up.
But none of this takes away from the importance of the host networking
paradigm that will support the applications used in this next generation
Information Highway market.
We don't want to rob Paul to pay Peter by giving to the hosts, routers,
operators, providers, or any entity in the network space and take away
from the other if we can do that in the IETF for IPng.
But I think its critical for this discussion if we have a statement from
someone that fixed length addresses can or cannot attain wire speeds?
They do today with IPv4 and if they don't is it a limit of the address?
/jim
Not all NEW devices will have the advantage of hardware assist either!
Also, it must be true hell trying to effectively compress variable
length addresses unless they are really fixed size. This is case one,
as you described above.
I believe that addresses need to stay small. The scheme where
addresses are 8 bytes or 16 bytes, with a bit indicating which they
are might prove useful for things like source routes, wireless (or low
bandwidth) links. The problem with this scheme will be administering
it. It's bad enough now for people to deal with 4 byte IP addresses,
what will happen if they are given 4, 8 and 16 byte addresses to deal
with? Esp. if we need new notations for each. And as bad as this will
get, variable addresses will be a total nightmare to administer.
-rich
Folks,
In the discussion on variable vs fixed length addresses,
I would be interested to see the following estimate.
Assume that a host vendor starts with a host
that supports IPv4. Then assume two alternative scenarios:
(a) need to change the host software to support IPng-A with
fixed length addresses (e.g. 16 octets)
(b) need to change the host software to support IPng-B with
variable length addresses (max to 16 octets, in increments of 8 octets)
Note that the *only* difference between IPng-A and IPng-B is that
in IPng-B the length of an address is encoded as part of the address
field.
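One possible shape for the IPng-B case is sketched below, purely to show the kind of extra code scenario (b) implies. The encoding is an assumption: the top bit of the first octet selects 8 or 16 octets, borrowing the "a bit indicating which they are" idea mentioned earlier in the thread. The question itself does not specify how the length is encoded, and the function names are hypothetical.

```python
from typing import List

LONG_FLAG = 0x80  # hypothetical: top bit of the first octet marks a 16-octet address


def address_length(first_octet: int) -> int:
    """Length in octets implied by the first octet of an address."""
    return 16 if first_octet & LONG_FLAG else 8


def split_addresses(field: bytes) -> List[bytes]:
    """Split a run of back-to-back self-describing addresses."""
    addresses = []
    i = 0
    while i < len(field):
        n = address_length(field[i])
        if i + n > len(field):
            raise ValueError("truncated address")
        addresses.append(field[i:i + n])
        i += n
    return addresses


short_addr = bytes([0x01]) + bytes(7)    # flag clear: 8 octets
long_addr = bytes([0x81]) + bytes(15)    # flag set: 16 octets
assert split_addresses(short_addr + long_addr) == [short_addr, long_addr]
```

With IPng-A the same split would be a fixed stride of 16 octets with no branch on the data.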
Given that software design, implementation and testing with scenario (a)
takes X man-years, how many man-years (relative to X) would it take
to do software design, implementation and testing with scenario (b)?
Yakov.
Andrew,
It looks like it is.
On a router you're not concerned with what happens with a packet above the
routing layer, and that is a totally different perspective than on a host,
where you are very concerned about how protocol engines will handle the packets
in the layers above routing - and big or variable addresses do not have a
negligible effect here at all - and how the users perceive the host's behavior
at the application level.
Hosts also do routing, but much more on an exceptional than a regular basis, and
yes, I am concerned about routing being done fast too, but the tricks that can
be played for routing at the hardware level are none.
Furthermore, on hosts the networking is just a background activity, which
has no importance whatsoever by itself, and therefore has to take the least amount
of time - ideally ZERO - so that enough CPU is left for the main things that
the host is meant to be used for.
Going back to variable addresses, you say "I know how to make variable length
things in packets go fast. Tony Li told me how!". I do not know what Tony may
have told you - but I can speculate.
I think it would be accurate to say that you can make the variable length
address processing go at various degrees of slow, or fast, whichever word may be
preferred, BUT NEVER AS FAST AS FIXED address processing of the same address size.
That's why I am in for fixed size.
To an extent, protocol headers are for protocol engines like machine
instructions for a CPU. In that respect, network protocol designers and
implementors can use or learn from the experience of computer architects,
and implementors.
Computer architecture technology of the last decade showed that faster machines
can be built with a well crafted, reduced set of fixed size machine instructions
(RISC), where the size of instructions is optimally chosen.
Alex
p.s. I used to like VAX (CISC) quite a bit.
> It's bad enough now for people to deal with 4 byte IP addresses, what will
> happen if they are given 4, 8 and 16 byte addresses to deal with? Esp. if
> we need new notations for each. And as bad as this will get, variable
> addresses will be a total nightmare to administer.
Really? People seem to manage DNS names (variable lengths, variable numbers
of levels) quite well.
Noel
> Also do you insist on bringing up the past such as what happened when
> they defined IPv4?
"A ship on the beach is a lighthouse to the sea"
- Dutch proverb, cited on the title page of the first chapter of "Mythical
Man-Month"
> I could respond by saying there is a protocol in the market that supports
> variable addresses that has not been wildly successful. But that's not a
> good technical argument either.
A complete design/product/technology succeeds or fails for a complex mix of
reasons. It's dangerous to draw inferences about one aspect from the overall
success or failure (which is not what I did with my reference to IPv4
addresses). I could just as easily say that CLNP/TP never caught on because
it wasn't a big enough step past TCP/IP.
(Note: I actually don't think that's the whole answer either as to why CLNP/TP
failed - there are many factors.)
Noel
unrealistic is fair, improbable is in my view perhaps more accurate.
Today businesses are the main networking users, and in the future more
and more will be networked. They will most likely be connected 24 hours
out of 24. Leisure or even appliance-type home networked activity
cannot measure up to that, because of the simple rule "you switch off
the light when you leave a room in your house, but you may not when you
leave your cubicle".
>are going to access the Internet through some dial-up capabilities
>(also not such an unrealistic assumption), there will be AT LEAST
>one router between the place that wants to access some information,
>and the place where the information is stored.
>
>The factors you mentioned would improve the average number of hops,
>but I still think that most of the traffic is going to be nonlocal,
>especially if "local" means "local to my house".
>
>Yakov.
Dialing-up is expensive and has few if any alternatives today, when
various service providers may be geographically far away, because this
is just starting. As "information" is a lucrative business, those
service providers will be cloned in every neighborhood,
and one's home could likely end on the same fiber subnet with no
need for routing.
Alex
> A complete design/product/technology succeeds or fails for a complex mix of
> reasons. It's dangerous to draw inferences about one aspect from the overall
One of the big reasons for failure is not ever getting it done in the
desire to reach perfection.
Bob
This is only the additional work related to referencing the addresses in the
header, which in fact is the least expensive. There are many other parts
that are more expensive, such as the address match, and the mechanisms related
to passing and storing the variable size address parameters - you always have to
pass two parameters rather than one, always store the length besides the pointer
or the address itself, etc...
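The "two parameters rather than one" point can be pictured with the sketch below. Python `bytes` objects already carry their own length, so the explicit length arguments here only model what a C-style host kernel interface would have to pass and store; the function names and the 16-octet buffer size are assumptions for illustration.

```python
FIXED_LEN = 16  # assumed fixed address size, in octets


def match_fixed(a: bytes, b: bytes) -> bool:
    """Fixed-length match: one parameter per address, length is implied."""
    return a[:FIXED_LEN] == b[:FIXED_LEN]


def match_variable(a: bytes, a_len: int, b: bytes, b_len: int) -> bool:
    """Variable-length match: each address travels with its length."""
    if a_len != b_len:
        return False  # extra comparison that the fixed case never needs
    return a[:a_len] == b[:b_len]


# Buffers sized for the maximum; only the first a_len/b_len octets are valid.
buf_a = bytes([1] * 8) + bytes(8)
buf_b = bytes([1] * 8) + bytes([9] * 8)

assert match_variable(buf_a, 8, buf_b, 8)        # same 8-octet address
assert not match_variable(buf_a, 16, buf_b, 16)  # differ as 16-octet addresses
assert not match_fixed(buf_a, buf_b)
```

This shows only the interface difference, not a measurement; how much it costs in practice is exactly what the thread is arguing about.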
>Optimized code can use tricks like the header compression I mentioned to get
>rid of even that.
I reject header compression from the start - I would consider that only
for exceptional cases and more like an act of desperation. In fact
the added cost of compressing/decompressing may not amortize the savings.
>Of course, if you're trying to lay the data down on a page
>boundary, it can mess you up, but if you only have one of those applications
>running at a time you can make it work.
Remember I am talking about hosts, where practically most applications are
networked, the file system is networked, the printing is networked, and so
on....
>Obviously, packet setup in the internetwork layer can be more expensive, e.g.
>if you can't use an unrolled loop to copy the addresses around.
Address match is even more expensive than packet setup, since it multiplies
at internet layer with number of routing table entries, and/or interface
addresses, and at transport layer with the number of sockets.
>are ways to deal with this; e.g. you can reuse packet buffers, eliminating the
>address copy completely. Etc, etc, etc.
I guess you refer to what one can do to optimize in general the protocol
engines. But this has been long done, so I do not want any additional cost
that does not have a sound and irrefutable justification.
Alex
>If the majority of the Internet users are going to be consumers
>(not such an unrealistic assumption), and if these consumers
>are going to access the Internet through some dial-up capabilities
>(also not such an unrealistic assumption), there will be AT LEAST
>one router between the place that wants to access some information,
>and the place where the information is stored.
For the Internet yes. But much of any vendor's sales will consist of
IPng in private companies in addition to the Internet business.
Let's not forget about that market segment in our discussion.
>The factors you mentioned would improve the average number of hops,
>but I still think that most of the traffic is going to be nonlocal,
>especially if "local" means "local to my house".
If it's cached it doesn't have to go to the network. There are no hops.
This is not figured out yet but will be. So what you're seeing now
is only temporary and will improve. Yes, setup will be major, but unless
a user is browsing through unrelated subjects with each mouse click,
eventually it will not require leaving the desktop, or at most
accessing a local cache server on a local wire. So let's not penalize
local traffic tomorrow for the incomplete solutions of today where
network traffic is used.
Plus I have not seen any multi-million dollar RFCs from any customers where
I could win the networking part of the bid with just mosaic. Multicast on
a local wire for IPv4 as a major piece is closer to the needs as far as
emerging technology in RFCs. Yes mosaic is definitely a customer want
but the technology behind it needs evolution, which is happening.
/jim
I am not going to debate your points, fortunately.
Brian
>--------- Text sent by bo...@zk3.dec.com follows:
Yakov,
This is a fair statement. Thanks.
I feel at times that to an extent this debate is like the
town hall meeting where the republican senator speaks about Clinton's
health plan...
Without taking away the known and acknowledged merits of variable
addresses, is anyone debating the statement that processing a
fixed address is simpler and faster than processing a variable address?
Alex
Jon> as far as i can see, if an address is 'big enough', there are
Jon> no pro's to having it variable,
Well, it all depends on whether 2^64 is a suitable definition of infinity..
it's kind of like lisp vs C for integers - in lisp, if a result won't fit in
a fixnum, it rolls over into a bignum; with C, the number just rolls over
and plays dead.
To take the analogy a step further, lisp compilers optimise for the
fixnum case and treat bignum overflow as an exceptional case; the same
approach could be applied to variable length addresses, with the
64 bit case optimised for, and larger addresses stuck somewhere
out of the way where they won't do any harm. I think this is what SIPP tries
to do, but I'm not 100% convinced yet (I haven't coded the routing header
stuff yet)
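The fixnum/bignum analogy can be shown in a few lines. Python integers promote to arbitrary precision much as the lisp integers described above do, while the masked addition imitates an unsigned 64-bit C integer wrapping around. The function names are illustrative only.

```python
MASK64 = (1 << 64) - 1  # largest value that fits in 64 bits


def add_c_style(a: int, b: int) -> int:
    """Add like an unsigned 64-bit C integer: the result wraps modulo 2**64."""
    return (a + b) & MASK64


def add_lisp_style(a: int, b: int) -> int:
    """Add like lisp: the result silently grows past 64 bits when it must."""
    return a + b


def fits_fast_path(value: int) -> bool:
    """True when a value fits the optimised 64-bit case."""
    return 0 <= value <= MASK64


wrapped = add_c_style(MASK64, 1)      # rolls over
promoted = add_lisp_style(MASK64, 1)  # grows
assert wrapped == 0
assert promoted == 1 << 64
assert fits_fast_path(wrapped) and not fits_fast_path(promoted)
```

Applied to addresses, `fits_fast_path` stands in for the check that keeps the common 64-bit case cheap and sends anything larger down a slower, exceptional path.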
Simon // Currently in Essex and enjoying the privilege of my first Labour
// member of anything..
Is this necessarily true? Would it not be possible for, say, a mobile
host (whose variable-length-locator-address is changing) to screw up
such header prediction? Or can we assume that the locator-address
*length* will remain constant even if the locator-address changes?
--Jim
I think this would add something like an engineering week to the
development cycle of a typical router, including testing. Add another
month or so if you want to build good hardware assist for this header
structure as opposed to IPv4, and add a few bucks per interface to the
cost since you'll need to get some gates, and gates aren't free.
It's probably quite a lot more effort to back-patch an existing
IPv4 implementation to an address format like this, though perhaps less
than the total effort required to reimplement from scratch. These estimates
are guesses at how much longer it would take to do an IPv4-funny-addresses
than to do an IPv4 implementation, from scratch.
As has been pointed out, at the host end it's more expensive,
since there are APIs and whatnot to take into consideration. Still,
the overall effort inherent purely in making one particular kind of
header field variable length is trivial.
Andrew
depends on who the official definer is. the criteria document (which
is waiting for one of the ipng directors to review it) says something
to the effect of:
"a state of the art, commercial grade, router must be able to
forward packets at speeds capable of fully utilizing common,
commercial grade, high-speed media"
'commercial grade' typically would mean something that is available
as a standard item, off the shelf, from one of the major router/media
vendors. 'state of the art' would imply the latest release of the
latest product...
we couldn't give a specific media or speed. that would imply that we
(craig partridge and i) know what the common media of the future will
be... also, we have to remember that the future holds things that no
one can predict. think of how this criterion would read if we were
doing it for ipv4 in the late 70s -- what was 'high speed' commercial
grade networking at the time? 9600 baud rs232 lines? :-)
we could say 'fddi' or 'oc12' or '155mb atm' or whatever... but what
will be common in 2004?
--
Frank Kastenholz
FTP Software
2 High Street
North Andover, Mass. USA 01845
(508)685-4000
> Could we keep the terms EID and Locators out of this discussion. I
> think it will confuse the technical tangents.
The problem is that when you say `address' I don't know if you're
using address as a locator or an EID. I am very likely to give
different answers to the fixed vs variable length question depending
on which I think you meant. Maybe I can extract from context which
concept you mean and maybe I'll get it wrong. Wouldn't it be easier
if we used distinct terms?
Dave
Alex,
>is anyone debating the statement that processing a fixed address
>is simpler and faster than processing a variable address.
Speaking for myself, I don't debate the above statement with respect
to "simpler" aspect. With respect to "faster" I would like to point
out that with fixed length addresses you *always* use the max length,
even if you don't need it (Christian made this point today in his
e-mail). On the other hand, with variable length addresses, I may
use shorter address (e.g. 8 octets), rather than always use max length
(e.g. 16 octets). That may have some impact on "faster" aspect.
However, when comparing two choices, (a) -- fixed, and (b) -- variable,
in my mind the most important point is NOT a qualitative statement
that (a) is "simpler and faster" than (b), but some numbers
that would provide an estimate of the difference. Would you agree
with this ?
Yakov.
Bob,
>>Given that software design, implementation and testing with scenario (a)
>>takes X man-years, how many man-years (relative to X) it would take to
>>do software design, implementation and testing with scenario (b) ?
>
>The issue is not whether it is possible to support variable length
>addresses, it is the cost for doing it worth the benefit gained. I believe
>fixed length addresses will last a very long time and I do not think
>that the added complexity of variable length addresses is worthwhile.
You still didn't answer the question I asked -- how many man-years
(relative to X) it would take to do software design implementation and
Noel:
Predictions work only if, when writing the code, you can predict that some
95+% of the time the address will be of a certain length.
For header processing, there's a huge penalty in going back for the memory
that contains the address info you missed, and there's a smaller but also
severe penalty for loading too much memory with excess info.
In routers, you can really optimize processor performance on the header
if you always know how big the header is.
In hosts, you can optimize both the header processing and (usually) the
buffer management if you always know how big things are.
Craig
Today, managing DNS is a different level than specifying IP addresses, network masks,
subnet masks, broadcast masks, etc... how many do DNS versus how many do the simple
stuff - a poll would show probably a ratio of 1/30 if not 1/50, or better, so I guess
I would concur with Richard and try to keep internet address handling at the current
level, rather than bringing it to the level of DNS.
Alex
If it should only be so simple. Unfortunately, it affects many things.
These include transport connection identification, pseudo checksum
calculations, routing, API's, naming systems, error handling,
documentation, training, MIB's, etc. Variable length addresses make all
of these more complex.
> Given that software design, implementation and testing with scenario (a)
> takes X man-years, how many man-years (relative to X) it would take
> to do software design, implementation and testing with scenario (b) ?
The issue is not whether it is possible to support variable length
addresses, it is the cost for doing it (complexity, performance, etc.)
worth the benefit gained. I believe fixed length addresses will last a
very long time and I do not think that the added complexity of variable
length addresses is worthwhile.
Bob
Dave,
>...neither is allowed to be slow.
We are in violent agreement. Let me repeat what I said
before -- when optimizing performance we need to look at a global
picture that includes both hosts AND routers.
Yakov.
Not at all. However, the additional cost in the high-performance path
is something like 4 instructions.
Seems like cheap insurance to me.
Tony Li
Demonstrably Clueless
Could we keep the terms EID and Locators out of this discussion. ... If we
can figure out whether IPng can solve the problem with fixed length
address or variable length address is the issue and discussion.
Unfortunately, Jim, not everyone agrees that this is the question. I.e. not
everyone thinks that IPng should look like IPv4, just with *either* larger, or
variable length, addresses! (I personally find *either* option unappealing.)
The chief difference between a locator and an address (and the reason we
defined a whole new term) is *not* splitting the transport identification from
the internetwork layer location function. Rather, it was because some people
were visualizing designs in which that location function is not carried in
every packet. Since many people had this strong mindset that an "address" is
in every packet, and is the field that routers look at to forward packets, it
led to very confusing discussions. Hence the explicit definition of "locator"
to mean a location-sensitive name which *is not* carried in every packet.
On the other hand, with variable length addresses, I may
use shorter address (e.g. 8 octets),
But will you?
rather than always use max length (e.g. 16 octets).
That may have some impact on "faster" aspect.
Yes, there's no question but that longer addresses will be
slower than shorter ones, provided they don't get too short
(shorter or longer than the natural word length will be slower).
However, making any kind of decision tends to be much slower
than straight line fixed code, unless there's special purpose
hardware, which may exist in some routers, but won't in anything
else. I can't see the difference in processing speed between 8
and 16 bytes possibly costing more than even a single
decision on which of those is actually in use, on any semi-modern
general purpose architecture.
However, when comparing two choices, (a) -- fixed, and
(b) -- variable, in my mind the most important point is NOT
a qualitative statement that (a) is "simpler and faster" than
(b), but some numbers that would provide an estimate of the
difference. Would you agree with this ?
Yes, I would - so how about providing the numbers? Since
fixed is clearly both simpler and faster, by some degree, I
think the burden of demonstrating that the difference is in
fact negligible enough not to matter should rest on those that
claim that.
Also remember when considering this that there are two decidedly
different arguments for variable length addresses - some people
claim to want them to save bytes, so the shortest addresses
possible can be transmitted, others want them so they can stick
in the longest addresses they can imagine - sometimes just for
the safety net (how do you produce a number for that benefit?)
and other times because they want to use truly extravagant
local routing (11 bytes of local addressing indeed!)
Those two cases probably need to be considered separately, as
one option may be to have a very short maximum, with shorter
options which would at least appease those who don't like
variable addresses because of their potential to become huge,
but wouldn't satisfy those who say they want shorter addresses
only because they want to get variable addressing in somehow so
they can really have long ones, but don't want to be heard
saying that.
kre
I'd like to point out that the Internet is turning into the Information
market (and it is doing this quite successfully). In such an
environment the information is all over the net, and if the predominant
use of the net is for the information retrieval, then I don't see how
"only 1% traffic" is going to be nonlocal.
Your challenge is reasonable. On the other hand, there is a remarkably
long and consistent history in both human and networking communication
to see high locality of reference. So, at some level, the answer to
your question is "because that's the way people use things".
On the other hand, debating which needs to be faster, routers or hosts ,
probably isn't very productive, since the demands of real-time mean that
neither is allowed to be slow.
d/
that is a totally different perspective than on a host, where you are very
concerned about how protocol engines will handle the packets in the layers
above routing - and big or variable addresses do not have a negligible
effect here at all
If you have a variable length internetwork header (be it from addresses or
whatever), the cost from that to upper layers, in the most general
implementation, seems to be i) an extra pointer register, and ii) two
instructions; one to extract the header length, and one to add it to
the old pointer, and store the result in the new pointer. (On some
architectures, the latter may turn into two instructions, I guess.)
Optimized code can use tricks like the header compression I mentioned to get
rid of even that. Of course, if you're trying to lay the data down on a page
boundary, it can mess you up, but if you only have one of those applications
running at a time you can make it work.
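A minimal sketch of the two-step cost described above, written in Python for readability rather than speed. The header layout is hypothetical and not taken from any IPng proposal: it assumes the first byte of the internetwork header carries the header length in 4-byte words.

```python
def upper_layer_offset(packet: bytes, base: int = 0) -> int:
    """Return the offset of the upper-layer header inside `packet`.

    Hypothetical layout (an assumption for illustration): the byte at
    `base` holds the internetwork header length in 4-byte words.
    """
    header_words = packet[base]        # step 1: extract the header length
    return base + header_words * 4     # step 2: add it to the old pointer


# A 12-byte header (3 words) followed by a 4-byte payload.
packet = bytes([3]) + bytes(11) + b"DATA"
payload = packet[upper_layer_offset(packet):]
```

With a fixed-length header the second step collapses to adding a constant, which is the saving being argued over.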
Obviously, packet setup in the internetwork layer can be more expensive, e.g.
if you can't use an unrolled loop to copy the addresses around. However, there
are ways to deal with this; e.g. you can reuse packet buffers, eliminating the
address copy completely. Etc, etc, etc.
To an extent, protocol headers are for protocol engines like machine
instructions for a CPU. In that respect, network protocol designers and
implementors can use or learn from the experience of computer architects,
and implementors. Computer architecture technology of the last decade
showed that faster machines can be built with a well crafted, reduced set
of fixed size machine instructions (RISC), where the size of instructions is
optimally chosen.
There's one important difference that affects how useful this design aphorism
is over here in networking. If I decided to change my CPU for one with a
different architecture, you can keep using yours. The same is not true of a
globally ubiquitous protocol (and I don't really believe that "design it to
evolve" is really going to buy us as much as people might hope).
Noel
---- Included message:
Cable net has head end + video on demand server at head of each street.
which will also be multiprotocol router (and will also be a CIX)
I believe the servers will be higher than at neighborhood and doubt there
will even be any caching at street or neighborhood, given the storage
requirements of video. (Yes, even for the one or two most popular
movies it probably isn't worth having a cache at each street or
neighborhood.) Head-End is probably the better bet. It's set up
for the operations and maintenance that is needed. Hence, there's
a server setup for some SET of neighborhoods.
d/
Parsing addresses in IPv4 matters if you want your host to go fast.
For example, `ttcp` numbers in the hosts I know about are faster for
TCP than for UDP because you don't have to worry about addresses for
TCP as much per packet with 4.3BSD-style TCP/IP.
Dealing with the sequence numbers and timers of the IPv4 TCP, IP, and
UDP headers is less work per byte than grabbing the address and using
it as a key to figure out what the packet is about.
I write host code for a living. Silicon Graphics' host code is not
the slowest in the world.
Vernon Schryver, v...@sgi.com.
I disagree.
All routers have to do is switch packets at wire speed; albeit multiple
wires.
Hosts (in which I include desktops as well as servers with lots of
networks connected to it), in addition to receiving and sending packets,
have to run applications that source and sink the data. These applications
do things like accessing disks, frame buffers, frame grabbers etc, all
of which consume precious CPU cycles.
Erik
We have a long history with one addressing model and we have none with
another. We do not really know much about living with variable length
addresses, using them efficiently, etc. A continuing, major issue in
the IPng debate is trying to be clear about the items that involve
risk. Anything that entails doing something in a style that is
significantly different from what we have done before entails some
risk, and possibly quite a lot.
The issue is not fixed-vs-variable, but why the heck we should believe
variable is such a win? The criticisms made by variable-advocates
are all true. Every attempt at predicting the right length has been
wrong. But at least we knew how to make fixed length work and work
well. Hand-waving doesn't make variable safe. If anything, it increases
the risk.
Dave
Please remember the very important fact that optimal size does not mean
maximum. Which is to say that a fixed size address will forcibly prevent many
liberals from using as much as they otherwise would with variable size.
>On the other hand, with variable length addresses, I may
>use shorter address (e.g. 8 octets), rather than always use max length
>(e.g. 16 octets). That may have some impact on "faster" aspect.
I think it is quite difficult to demonstrate the above with the current
variable size addresses, notably CLNP. From my own recent curiosity and
experience, which was stirred up by a majority of comments on this mailing
list, the fact that variable size invariably invites people to use the maximum
available size was verified. And this is statistically true not only in
networking - just look around or watch C-SPAN.
>However, when comparing two choices, (a) -- fixed, and (b) -- variable,
>in my mind the most important point is NOT a qualitative statement
>that (a) is "simpler and faster" than (b),
I think it is the very important starting point for discussion or evaluation
which seems to be openly avoided or bypassed by the variable address partisans.
>but some numbers that would provide an estimate of the difference.
>Would you agree with this ?
I am not sure what you mean. Hm,... I mean what would be considered a
satisfactory accurate number for an estimate. Do you mean,
someone to go and collect data, take short message/response oriented
applications, long message/response applications, unidirectional traffic
applications, bidirectional traffic applications, datagram oriented
applications, connection oriented applications, create environments with one
active communication and with multiple active communications, do PC (program
counter) sampling, and profiling, internal cache sampling, instructions and
instructions sequence timing calculations, look and compare machine code
generated by C, C++, or other compilers, or assemblers, list and compare
procedure calls, instruction sequences and procedures link constructions
brought in by linkers, on CISC, and RISC, on 32, and 64 bit architectures, on
routers, and hosts, on UNIX, (how many?) MS-DOS, Windows NT,....VM,
OS-2,...OVMS,... ?
Let me say just that if IPng does not give the system response and feel of
IPv4 then there will be a lot of trouble. We talked about this at length.
I know from years of squeezing out KB, and MB/sec, that good performance
and feel does not come easily and in big chunks, but you can lose it easily
and in big chunks.
When I have a large multiprocessor VAX or Alpha host with 600 incoming and
several tens or hundreds outgoing TELNET sessions on it, hundreds of SMTP mail
coming into and going out, file copying, printing on several printers, several
tens of workstations file systems mounted remotely, with several Ethernet, and
FDDI adapters - does this get any empathy with you? - or a desktop Alpha, with
several windows imported, in an intensive threads based network application
environment, with file systems mounted all over the place, I care a lot to have
every address match operation done as fast as possible, because the address
match done for source and destination may get multiplied with the number of
addresses the system has, with the number of routes it can route to, with the
source and the destination address of each TCP, or UDP socket, just to name one
mechanism that is affected by a change from fixed to variable address.
Adding to that that the address match must be serialized - the list must be
locked, providing access to only one processor, to prevent possible
simultaneous destructive access of the list from a different processor.
And adding to that the fact that all fixed size address based caches that were
built into IPv4 based TCP or UDP to speed up the searches will be broken, does
this mean anything to you?
Alex
This is an opportunity to mention one important possible advantage of 16 byte
long SIPP addresses that I think was ignored: specifying IPv4 addresses in the
"standard Internet '.' text notation", i.e. aaa.bbb.ccc.ddd !
This could be used during the IPv4 to IPng transition, and would make
applications significantly faster by eliminating the need to translate an IP
address text to binary, and will make network sniffing a lot easier, since
network addresses will be plain text !!!
(-:
Alex
Yakov,
Sorry if answers are out of the receiving order.
As one may infer, from my focusing in an earlier message, unlike others
that look at the development costs, I am looking at the production cost
of having variable addresses versus fixed addresses, in other words, how is
a machine and its users going to be affected by a new address format,
being used on the Internet.
Obviously a fixed address incurs only one change relative to IPv4 - size.
A variable address format incurs the additional and much more costly
change of processing variable addresses, in which the size is learned
dynamically.
The overall production cost can get higher than what may be considered
acceptable. However, the level of acceptance is hard to define.
Some customers scream if a new version of the software, although with
many more features than the previous one, is 2% slower. Others scream
at 5%, others at 10%, or more.
The user feel, the perception are important. New headers, new fields,
longer addresses, plus lack of experience in implementing and working
with the new protocol will most likely add to per packet processing cost,
and since that will be measured against IPv4 which has more than 10 or 15
years of continuous performance improvements, I want to minimize the risk
of having an unfavorable customer perception.
Alex
One of the big reasons for failure is not ever getting it done in the
desire to reach perfection.
Yup. However, that's not our problem here, I think. Our problem is far more a
major disagreement over how bold to be, technically. Not being bold enough can
also be a road to failure (and one I've seen from close up, sigh).
I really seriously think this reluctance to go to variable length <whatevers>
is as much an emotional reluctance to stray from what has been shown to work,
as it is anything else. This is understandable; being bold is difficult.
True, there are arguments about efficiency, etc, but what I think I am seeing
here is a pattern I've seen repeated over and over again in the IETF over the
years. Making radical changes is always uncomfortable, and efficiency (along
with complexity) is often one of the counterarguments.
However, you're right, we can't dither forever, fine-tuning. I can't speak for
everyone, but I'm certainly ready to move fairly expeditiously toward putting
together some "best guess" specs, and rolling it out. I just want to be pretty
bold in so doing.
Noel
Should I call this the "router guy" view?
Feel free to call it whatever you like.
You do not mention that for each packet you may do:
1. case dependent, at internet layer you multiply the address match with the
number of local host addresses, or/and with routing table entries.
2. At transport level, which 'router guys' ignore systematically, for each
packet received, with data or no data, two address matches are
performed for each socket that is in the list of sockets for that
particular transport, and is not the socket looked for.
Sorry, I'm lost here.
Back to your number 4, did you mean 4 machine instructions
or 4 instructions or lines of high level language?
Machine instructions. Oh, and I was considering a longest match
routing table lookup.
Let's see, assuming an imaginary simple architecture, and
machine language, and using BigTen specs for this example
of a two 16 byte addresses match:
mova a, r1 ;;; pointer to address1 in r1/not counted
mova b, r2 ;;; pointer to address2 in r2/not counted
$$$start_counting:
movb (r1), r3 ;1 load size of first address in register
You always have to load the first byte of the address, regardless of
what you're doing. And actually, we should load a full word (for your
favorite word size that's a multiple of 32 bits) here.
mask x, r3 ;2 mask unused bits
shift r3, 1 ;3 eliminate bit 0 - strict vs loose,
and align size
You can mask here and just remove the strict/loose bit. Leave the
lengths alone.
Aligning the length bits isn't important, as we aren't going to
_count_ them, instead we'll branch on them with a case statement....
But to continue:
movb (r2), r4 ;4 load size of second address in register
mask x, r4 ;5 mask unused bits
shift r4, 1 ;6 eliminate bit 0 - strict vs
loose, and align size
Again, loading is required anyhow, masking is due to strict/loose, and
there's no need to shift yet.
cmp r3, r4 ;7 compare the two - I suppose you don't
; want to go and compare two addresses
; that are not equal in size.
bneq $$$not_equal ;8 branch if not equal
Ok, if we got here, the lengths are equal. Doing the rest of the
compare makes sense. Note that you have to compare these two words
_anyhow_ so the above are not additional instructions. Let's get the
first byte off of the word:
movl r3, r5 ;1 Copy
shiftr 28, r5 ;2 Shift for convenient jumping
jump table[r5] ;3 Branch
Now, for each of the possible address lengths, you write an unrolled
loop. Note that this unrolled loop is _EXACTLY_ the same as you
would have had to do for the fixed address. Clever folks
just arrange the labels to jump into the same instruction stream at
the appropriate place...
I believe that a more complex architecture can probably combine the
copy and shift. And if your architecture is really simple, then the
indexed branch takes one more instruction. That's 4.
Tony Li
Speed Freak
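As a rough illustration of the jump-table approach described above: check the first words (which carry the length), then branch on the length code into a compare sized for that length. This is a sketch under assumptions, not the BigTen encoding. It assumes the top 4 bits of the first 32-bit word give the number of further 32-bit words, and the names `COMPARE_TABLE` and `addresses_match` are made up for the example.

```python
def _make_compare(n_words: int):
    """Build a comparer for the `n_words` 32-bit words after the first word."""
    end = 4 + 4 * n_words

    def compare(a: bytes, b: bytes) -> bool:
        return a[4:end] == b[4:end]

    return compare


# One entry per possible length code (hypothetical: top 4 bits of word 0).
COMPARE_TABLE = [_make_compare(n) for n in range(16)]


def addresses_match(a: bytes, b: bytes) -> bool:
    """Compare two length-prefixed addresses using a table dispatch."""
    first_a = int.from_bytes(a[:4], "big")
    first_b = int.from_bytes(b[:4], "big")
    if first_a != first_b:
        # Covers both "different lengths" and "different leading bits";
        # this compare is needed for fixed-length addresses as well.
        return False
    # Shift out everything but the length code and branch on it.
    return COMPARE_TABLE[first_a >> 28](a, b)
```

The only work beyond the fixed-length case is the shift and the indexed branch, which is where the count of roughly four instructions comes from.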
Yes, I forgot to add that hosts must go at wire speed while
only using a few percent of the CPU, I/O and memory bandwidth
to do so. But if you allow a host to spend 10% of its resources
on networking, you need a router of the same power to support
ten wires, to a first approximation.
Brian
>--------- Text sent by Erik Nordmark follows:
If it should only be so simple. Unfortunately, it affects many things.
These include ... pseudo checksum calculations
Do it once at connection setup (for the entire pseudo-header), and then store
it as a seed to the checksum algorithm. Etc, etc.
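A minimal sketch of the seeding trick, using the standard 16-bit one's-complement Internet checksum. It assumes the per-connection part of the pseudo-header is just the two addresses and that each has an even number of bytes. Any per-packet pseudo-header field (such as a length) would still have to be added per packet. The function names are illustrative.

```python
def ones_complement_sum(data: bytes, seed: int = 0) -> int:
    """16-bit one's-complement sum of `data`, continuing from `seed`."""
    if len(data) % 2:
        data += b"\x00"                    # pad odd-length input
    total = seed
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                     # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(segment: bytes, seed: int) -> int:
    """Checksum of pseudo-header plus segment, given the stored seed."""
    return ~ones_complement_sum(segment, seed) & 0xFFFF


# At connection setup: sum the addresses once, whatever their lengths.
src = bytes(range(8))                      # an 8-octet address
dst = bytes(range(8, 24))                  # a 16-octet address
seed = ones_complement_sum(src + dst)

# Per packet: only the segment itself is summed.
segment = b"hello world!"
per_packet = checksum(segment, seed)
```

Because the address bytes are summed only once, their length stops mattering on the per-packet path.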
Variable length addresses make all of these more complex.
Sure, but extra complexity is a fact of life as systems get bigger. It's
going to happen whether you like it or not.
The issue is not whether it is possible to support variable length
addresses, it is the cost for doing it (complexity, performance, etc.)
worth the benefit gained.
Exactly. It's a difficult question, unfortunately with no closed-form answer.
I believe fixed length addresses will last a very long time
Well, if you look at the history of computer systems, it's amazing how many
times fixed length things have turned out to be "not long enough". I doubt
even a 16-byte locator will run a worldwide data network for 30+ years. Note:
I am not saying it's too small to provide unique *identification*; my doubts
have more to do with its ability to hold the names that would result from a
hierarchically structured address system which made the routing in such a
system have acceptable scaling overhead.
I do not think that the added complexity of variable length addresses is
worthwhile.
Well, you and I sitting here saying "yes it is" and "no it isn't" isn't going
to do much good. You have any ideas on how to make forward progress?
Noel
For header processing, there's a huge penalty in going back for the memory
that contains the address info you missed, and there's a smaller but also
severe penalty for loading too much memory with excess info.
Craig, this is true of today's technology. What about 20 years down the road,
though?
In routers, you can really optimize processor performance on the header
if you always know how big the header is.
For a variety of reasons, including the fact that their memory-to-I/O
bandwidth ratios are skewed from normal processing requirements, I think
routers will always have significant hardware support for their job, so I
don't worry about routers much. I don't hear the major router vendors jumping
up and down and freaking out either...
In hosts, you can optimize both the header processing and (usually) the
buffer management if you always know how big things are.
Sigh. I'm not sure how I got stuck defending something (variable length fields
in every packet) that wouldn't be in any design I liked.
As far as I can tell, high-performance applications are *not* going to be
sending single packets. This means you've got a flow, and you can take the
darn locators out of the packet completely, fixed *or* variable length, making
the headers faster to set up (if you aren't recycling buffers).
Even if you do have single packets, I know how to forward single packets which
contain variable length locators fairly efficiently (using the New Datagram
Mode, for instance). From the user's perspective, speed-of-light round trip
delays for access to non-local data (16 msec from the US East Coast to the
West Coast, for example) will completely swamp any extra time spent setting up
the longer header.
I can't believe we're wasting all this time arguing about a few stupid
instructions. 20 years from now people will look back on this debate and
wonder what we were all using for brains. Cycles are cheap, and getting
cheaper.
Noel
Tony,
Should I call this the "router guy" view?
Those 'n' more instructions in an address match - which means we ignore all
the other special operations required for processing variable addresses - may
mean a lot.
You do not mention that for each packet you may do:
1. case dependent, at internet layer you multiply the address match with the
number of local host addresses, or/and with routing table entries.
2. At transport level, which 'router guys' ignore systematically, for each
packet received, with data or no data, two address matches are
performed for each socket that is in the list of sockets for that
particular transport, and is not the socket looked for.
...
Back to your number 4, did you mean 4 machine instructions
or 4 instructions or lines of high level language?
Let's see, assuming an imaginary simple architecture, and
machine language, and using BigTen specs for this example
of a two 16 byte addresses match:
mova a, r1 ;;; pointer to address1 in r1/not counted
mova b, r2 ;;; pointer to address2 in r2/not counted
$$$start_counting:
movb (r1), r3 ;1 load size of first address in register
mask x, r3 ;2 mask unused bits
shift r3, 1 ;3 eliminate bit 0 - strict vs loose, and align size
movb (r2), r4 ;4 load size of second address in register
mask x, r4 ;5 mask unused bits
shift r4, 1 ;6 eliminate bit 0 - strict vs loose, and align size
cmp r3, r4 ;7 compare the two - I suppose you don't
; want to go and compare two addresses
; that are not equal in size.
bneq $$$not_equal ;8 branch if not equal
Up to here this is 8 instructions to me, which I do not have for fixed
addresses. Let's continue.
$$$loop:
cmp (r1), (r2) ;1 compare first 4 bytes
bneq $$$not_equal ;2 branch if not equal
cmp 4(r1), 4(r2) ;3 compare second group of 4 bytes
bneq $$$not_equal ;4 branch if not equal
decl r3 ;5 decrement size multiple of 8
bleq $$$equal ;6 done if nothing more to do
addl #8, r1 ;7 point to next group
addl #8, r2 ;8 point to next group
brb $$$loop ;9 loop to do next 8 bytes
$$$equal:
;
; Total 8 + 9 (first path) + 6(second path) = 23
;
$$$not_equal:
Let's see, fixed addresses now.
mova a, r1 ;;; pointer to address1 in r1 / not counted
mova b, r2 ;;; pointer to address2 in r2 / not counted
$$$start_counting:
cmp (r1), (r2) ;1 compare first 4 bytes
bneq $$$not_equal ;2 branch if not equal
cmp 4(r1), 4(r2) ;3 compare second group of 4 bytes
bneq $$$not_equal ;4 branch if not equal
cmp 8(r1), 8(r2) ;5 compare third 4 bytes
bneq $$$not_equal ;6 branch if not equal
cmp 12(r1), 12(r2) ;7 compare fourth group of 4 bytes
bneq $$$not_equal ;8 branch if not equal
;
; total of 8 instructions
;
The difference according to this is 15 instructions.
Alex
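The tally above can be restated as a small worked calculation. The per-path counts are the ones given in the listing. Multiplying by the number of sockets searched follows the earlier point that two address matches are done per non-matching socket. The result is an upper bound, since a mismatch usually exits after the first compare. The function names are illustrative.

```python
# Instruction counts taken from the listing above.
VARIABLE_PATH = {"length check": 8, "first loop pass": 9, "second loop pass": 6}
FIXED_PATH = 8


def extra_per_match() -> int:
    """Extra instructions for one full 16-byte variable-length match."""
    return sum(VARIABLE_PATH.values()) - FIXED_PATH


def extra_per_packet(sockets_searched: int, matches_per_socket: int = 2) -> int:
    """Worst-case extra instructions per packet for a linear socket search."""
    return extra_per_match() * matches_per_socket * sockets_searched
```

For example, `extra_per_packet(600)` models a linear search past 600 sockets, the size of the TELNET load mentioned earlier in the thread.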
> If you have a variable length internetwork header (be it from addresses
> or whatever), the cost from that to upper layers ... seems to be i) an
> extra pointer register, and ii) two instructions
This is only the additional work related to referencing the addresses in
the header, which in fact is the least expensive.
No, it's the extra cost of getting to the upper layer header, which is
otherwise unchanged.
There are many other parts that are more expensive, such as the address
match
You mean checking to make sure the packet's for you? First, if you had EID's
and locators, you wouldn't need to check the locator (variable length), just
the EID (probably fixed length, and definitely shorter). Second, if you're
using a protocol with end-end checksums which include the EID, you can skip
checking the EID, and just assume the packet's for you; if it's a stray packet
on a port you happen to have live, it'll fail the end-end checksum.
and the mechanisms related to passing, and storing the variable size
address parameters - always have to pass two parameters rather than one,
always store the length, besides the pointer or the address itself, etc...
Why the devil are you doing that any time other than the initial open? You're
better off passing a pointer to the relevant connection block anyway, rather
than an "address" (fixed-length or otherwise). That's what you really need
anyway, and if you pass that directly, rather than something you have to look
up to turn into said pointer, you're ahead of the game there. (If you are
doing a multi-user OS, you have to do something different, but let's not get
into that rat-hole.)
> Optimized code can use tricks like the header compression I mentioned to
> get rid of even that.
I reject header compression from the start
Sorry, that was a mental typo; I meant to say "header prediction".
> Of course, if you're trying to lay the data down on a page boundary, it
> can mess you up, but if you only have one of those applications running
> at a time you can make it work.
Remember I am talking about hosts, where practically most applications are
networked, the file system is networked, the printing is networked, and so
on....
Just out of interest, do you try and put data down on page boundaries? Anyway,
it turns out that for high-performance applications, my assumption (which could
be wrong) is that most of these have a flow lying around, and if so, you can
drop the locators from the packets, making the headers fixed length.
> Obviously, packet setup in the internetwork layer can be more expensive,
> e.g. if you can't use an unrolled loop to copy the addresses around.
Address match is even more expensive than packet setup, since it multiplies
at internet layer with number of routing table entries, and/or interface
addresses
A fast TCP should contain a pointer to the relevant internetwork level route
cache entry anyway; that's faster than looking it up on every packet (and that
entry should contain a pointer to the level 2 address, for the same reason).
If you have EID's, most multi-interfaced hosts would have only one, getting rid
of *that* cost.
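A sketch of the pointer chain described above, with hypothetical structure and field names: the connection block holds a reference to its route cache entry, which in turn holds the level 2 address, so the output path follows two references instead of doing a per-packet lookup keyed on an address.

```python
from dataclasses import dataclass


@dataclass
class RouteCacheEntry:
    next_hop: bytes        # internetwork-level next hop (any length)
    link_address: bytes    # the "level 2" address for that next hop


@dataclass
class Connection:
    local_port: int
    remote_port: int
    route: RouteCacheEntry  # resolved once, at connection setup


def output_link_address(conn: Connection) -> bytes:
    """Per-packet path: no table search, just two references."""
    return conn.route.link_address
```

The length of the address only matters when `route` is first resolved, not on each packet sent.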
and at transport layer with the number of sockets.
Hmm. First, EID's change the equation anyway. Second, even without EID's,
there are tricks you can pull here. For instance, if the local/destination
port pair is unique (which you can determine at connection setup, and you can
mark as such at that time), then you only have to check those on incoming
packets (the end-end checksum will catch stray packets). If, on processing a
packet, you find an entry not so marked, then you have to do the full compare,
but that should be rare.
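
The unique-port-pair trick above could look roughly like the following C sketch. The structure layout, the `ports_unique` flag, the 16-byte `MAX_ADDR` bound and the function name `pcb_lookup` are all assumptions made for illustration; a linear list is used only to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ADDR 16                /* assumed upper bound on address length, bytes */

/* Hypothetical protocol control block. */
struct pcb {
    uint16_t lport, rport;         /* local and remote ports */
    uint8_t  raddr_len;            /* remote address length in bytes */
    uint8_t  raddr[MAX_ADDR];      /* remote address */
    int      ports_unique;         /* set at connection setup if no other
                                      pcb shares this port pair */
    struct pcb *next;
};

static struct pcb *pcb_lookup(struct pcb *list, uint16_t lport, uint16_t rport,
                              const uint8_t *raddr, uint8_t raddr_len)
{
    for (struct pcb *p = list; p != NULL; p = p->next) {
        if (p->lport != lport || p->rport != rport)
            continue;              /* cheap port test first */
        if (p->ports_unique)
            return p;              /* no address compare; rely on the
                                      end-end checksum to reject strays */
        if (p->raddr_len == raddr_len &&
            memcmp(p->raddr, raddr, raddr_len) == 0)
            return p;              /* the rare full compare */
    }
    return NULL;
}
```

Note that the fast path deliberately returns a pcb without looking at the address at all, so it is only safe if the checksum is verified before the data is accepted.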
> are ways to deal with this; e.g. you can reuse packet buffers,
> eliminating the address copy completely. Etc, etc, etc.
I guess you refer to what one can do to optimize in general the protocol
engines. But this has been long done
I would doubt this, actually.
so I do not want any additional cost that does not have a sound and
irrefutable justification.
Sure, we just disagree about what classifies as "a sound and irrefutable
justification".
Look, I understand the people who like efficiency; I've been there. I know all
about writing tense code. I've written more code than I can remember, knowing
what machine language the compiler was going to emit, and counting the memory
references as I wrote it, and trying to decide which variables to put in
registers. I wrote (in the dim past :-) a router in a HLL that had what was,
for that time, performance levels previously considered unapproachable without
assembler. Etc, etc, etc.
I'm trying to look at a bigger picture now, one that includes all costs,
including engineering, deployment, operation, etc, over the entire lifecycle
of the design, which is probably 30+ years! (If *anyone* thinks this is silly,
realize that IPv4 is already over 15 years, and it will be over 20 before we
are even starting to majorly transition off it.) Looked at from that
perspective, the flexibility and adaptability of separate mechanisms (for EID's
and locators) and variable lengths (for locators) are worth the costs.
Noel
>I have no trouble understanding why routers need to quickly parse
>addresses to forward packets to the correct next hop -- and do it very
>quickly. However, even with real-time applications, I can not imagine why
>an end system needs to parse addresses in a similar manner. That is, to get
>an address I use a DNS name and obtain an address.
ok, what about a ipng video on demand server or a transactions server
supporting 100,000 customers
it has to map addresses into TCP PCBs exactly as fast as the router
forwarding packets to it
it has to do a _full address match_ not just a prefix one.....unless
you believe the EID stuff (actually, even if you do, such a server
might well be multihomed on multiple provider nets, and
therefore have to choose its network interfaces carefully, and check
inbound and outbound match properly...
ok?
jon
Tony's reply missed some optimizations.
1. case dependent, at internet layer you multiply the address match with
the number of local host addresses
Just assume the packet is for you. If you don't find a matching port (port
pair for TCP), *then* check the destination to decide how to do the error
handling (no such port, or the packet was misdirected). (Also, this is a
tangent, but you can also fix this with an EID.)
or/and with routing table entries.
You should only be doing this lookup once, at connection setup time, and
storing a pointer to the resulting entry in the connection block.
2. At transport level, which 'router guys' ignore systematically
Look, if you get to gratuitously insult our professional competence, expect it
back. And believe me, I can give as well as I get. How many TCP's have you
written from scratch? (Hint: The answer for me is *not* "zero".)
for each packet received, with data or no data, two address matches are
performed for each socket that is in the list of sockets for that
particular transport, and is not the socket looked for.
First, check the ports first, not the addresses. You won't even get as far as
the addresses for most sockets. Second, if you find a single matching port,
don't even bother checking the addresses (on either end); the end-end checksum
(which includes the pseudo-header, and thus the addresses of both ends) will
catch stray packets which just happen to have the right ports in them.
(There, I've saved enough cycles for you to more than pay for variable length
addresses! :-)
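
To illustrate why the end-end checksum can stand in for the address compare, here is a hedged C sketch of an Internet-style ones'-complement checksum that covers both addresses plus the transport segment. The function names are hypothetical, and the pseudo-header is simplified to just the two addresses (the real TCP pseudo-header also covers the protocol number and length); how variable-length addresses would be laid out in a pseudo-header is an assumption here.

```c
#include <stddef.h>
#include <stdint.h>

/* Add a byte string to a running ones'-complement sum, 16 bits at a time. */
static uint32_t sum_bytes(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;   /* pad an odd trailing byte with zero */
    return sum;
}

/* Checksum over source address, destination address and segment.
 * Suitable for short inputs; very long inputs would need earlier folding. */
static uint16_t transport_checksum(const uint8_t *src, size_t src_len,
                                   const uint8_t *dst, size_t dst_len,
                                   const uint8_t *seg, size_t seg_len)
{
    uint32_t sum = 0;
    sum = sum_bytes(sum, src, src_len);
    sum = sum_bytes(sum, dst, dst_len);
    sum = sum_bytes(sum, seg, seg_len);
    while (sum >> 16)                 /* fold carries back into 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

A packet that arrives with the right ports but the wrong addresses will, in most cases, fail this check. A 16-bit sum is a probabilistic guard, not a guarantee, which is part of what the two sides of this thread are weighing.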
mov (r2), r4 ;4 load size of second address in register
mask x, r4 ;5 mask unused bits
shift r4, 1 ;6 eliminate bit 0 - strict vs loose, and align size
Assuming the second address is the one that's already on hand, these second
two instructions are totally gratuitous. The address should be stored
with these operations already performed.
Noel
Anything that entails doing something in a style that is significantly
different from what we have done before entails some risk, and possibly
quite a lot.
Sometimes doing only things that are not significantly different from what you
have done before is a bigger risk. The number of major companies that have
fallen on hard times through not changing to keep up with the fast-changing
world around them is legion. Still, these are broad generalities, and won't
help us decide here.
The criticisms made by variable-advocates are all true. Every attempt at
predicting the right length has been wrong. But at least we knew how to
make fixed length work and work well.
I'm sitting here with a pretty stunned look on my face; I can't believe you
said this. Looking for keys underneath the street-light indeed...
Noel
In my current count the difference is 8 - there is no need to load
in registers the first part of the address before the first compare
- but still obviously handcrafting makes a difference. This was fun...
Well, any way you slice it, you had to do the mask for the
strict/loose bit. This bit is completely orthogonal to the issue of
variable length addresses.
Certainly if we get rid of that bit and we do memory to memory
compares, then that saves four more instructions from your count.
For RISC, the unrolled loops have to be in proximity to ensure instruction
cache hits, while the existence of 'table' as data reference may incur data
cache misses.
Yup. Clever people might choose to arrange their jump table so that
there is no I-cache miss for the common lengths. Further cleverness
might avoid the data cache miss by doing math on the PC. It might
burn an extra instruction in this case, but still be faster.
Do you have a 'C' example that I could try with my compilers, and see
the machine code generated and difference?
Not offhand.
Tony
Tony,
What resulted is:
mova a, r1 ;;; pointer to address1 in r1/not counted
mova b, r2 ;;; pointer to address2 in r2/not counted
$$$start_counting:
movl (r1), r3 ;1 load first 4 bytes of address1 in register
mask x, r3 ;2 mask unused bit 0
movl (r2), r4 ;3 load first 4 bytes of address2 in register
mask x, r4 ;4 mask unused bit 0
cmpl r3, r4 ; compare the two
bneq $$$not_equal ; branch if not equal
movl r3, r5 ;5 Copy
shiftr 28, r5 ;6 Shift for convenient jumping
addl table, r5 ;7
jmp (r5) ;8 Branch
The unrolled loop is clever indeed - used it for checksum routines.
In my current count the difference is 8 - there is no need to load in registers
the first part of the address before the first compare - but still obviously
handcrafting makes a difference. This was fun...
The imaginary architecture breaks everything down into very simple operations,
to avoid the hiding effect of more complex instructions, which execute in the
cumulative time of the simple instructions they replace.
For RISC, the unrolled loops have to be in proximity to ensure instruction
cache hits, while the existence of 'table' as data reference may incur data
cache misses.
Do you have a 'C' example that I could try with my compilers, and see
the machine code generated and difference?
Alex
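
For readers who want something to feed a compiler, the following is one possible C rendering of the compare-then-jump-table sequence in the listing above. It is a sketch under assumptions that are not stated in the thread: the top nibble of the first 32-bit word is taken to be the count of further words, bit 0 is the strict/loose flag to be masked off, and the names `addr_equal` and `FLAG_MASK` are invented for the example.

```c
#include <stdint.h>

#define FLAG_MASK 0xFFFFFFFEu      /* clears bit 0 (strict/loose flag) */

/* Returns 1 if the two addresses are equal, 0 otherwise.
 * Assumption: top nibble of word 0 = number of words that follow. */
static int addr_equal(const uint32_t *a, const uint32_t *b)
{
    uint32_t w = a[0] & FLAG_MASK;
    if (w != (b[0] & FLAG_MASK))
        return 0;                  /* length nibble or leading bits differ */

    switch (w >> 28) {             /* a compiler may emit a jump table here */
    case 3: if (a[3] != b[3]) return 0; /* fall through */
    case 2: if (a[2] != b[2]) return 0; /* fall through */
    case 1: if (a[1] != b[1]) return 0; /* fall through */
    case 0: return 1;
    default: {                     /* uncommon long addresses: plain loop */
        for (uint32_t i = 1; i <= (w >> 28); i++)
            if (a[i] != b[i])
                return 0;
        return 1;
    }
    }
}
```

Because the length nibble is part of the first masked compare, a length mismatch is rejected before any further words are read. Whether a given compiler turns the `switch` into a jump table or a branch chain is exactly the kind of thing worth checking in the generated machine code.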
with the current variable size addresses, notable CLNP ... the fact that
variable size invariably invites people to use the maximum available size
was verified. And this is statistically true not only in networking
What?! Look at your host's DNS name; know what the maximum length of any DNS
field is? Does your US mail address completely fill the space in the address
cards which have the character positions marked out? What about your UNIX file
names?
NSAP addressing plans take the whole 20 bytes because 20 bytes is seen as
limited (particularly after you subtract the AFI, 6 bytes for the IEEE 802,
etc), so that people feel they have to sit down and carefully allocate the
space up front. If they had more space to play with *if they needed it*, and a
format which made it much easier to add space to things in the middle, nobody
would bother with pre-allocating all the space.
> However, when comparing two choices, (a) -- fixed, and (b) -- variable,
> in my mind the most important point is NOT a qualitative statement
> that (a) is "simpler and faster" than (b)
I think it is the very important starting point for discussion or
evaluation which seems to be openly avoided or bypassed by the variable
address partisans.
I get really tired of people assuming that anyone who doesn't share their
fixation on performance has no idea, or doesn't care, what the "real costs" of
their pet schemes are. I find it damned insulting, thank you. I've *been
there*, and I have tens of thousands of lines of code *still running* in
production in the Internet in commercial products, OK?
We're not bypassing anything, just looking at a little bigger picture than how
many instructions it's going to take this year.
When I have a large multiprocessor VAX or Alpha host ... I care a lot to
have every address match operation done as fast as possible, because the
address match done for source and destination may get multiplied with the
number of addresses the system has, with the number of routes it can route
to, with the source and the destination address of each TCP, or UDP socket,
just to name one mechanism that is affected by a change from fixed to
variable address.
You don't provide any details of how your databases are arranged. However,
clever coding (e.g. checking the simple things first, like comparing ports for
matches *before* checking the addresses; use of hashes or B-trees for
databases which can get very big, instead of linear linked lists), use of some
precomputed intermediate data (such as cached pointers) could probably improve
the situation here a lot.
Noel
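
As one illustration of the "hashes instead of linear linked lists" suggestion, here is a small C sketch of a control-block table hashed on the port pair, so that the variable-length address compare only runs within one bucket. The table size, the mixing constant, and all names (`struct hpcb`, `struct pcb_table`, `port_hash`, `pcb_insert`, `pcb_find`) are assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBUCKETS 256               /* assumed table size (power of two) */
#define MAX_ADDR 16                /* assumed upper bound on address length */

/* Hypothetical control block, chained within a hash bucket. */
struct hpcb {
    uint16_t lport, rport;
    uint8_t  raddr_len;
    uint8_t  raddr[MAX_ADDR];
    struct hpcb *next;
};

struct pcb_table {
    struct hpcb *bucket[NBUCKETS];
};

/* Hash on the fixed-length port pair only; addresses are not touched. */
static unsigned port_hash(uint16_t lport, uint16_t rport)
{
    uint32_t h = ((uint32_t)lport << 16) | rport;
    h ^= h >> 16;
    h *= 0x9E3779B1u;              /* multiplicative mixing constant */
    return (h >> 24) & (NBUCKETS - 1);
}

static void pcb_insert(struct pcb_table *t, struct hpcb *p)
{
    unsigned i = port_hash(p->lport, p->rport);
    p->next = t->bucket[i];
    t->bucket[i] = p;
}

static struct hpcb *pcb_find(const struct pcb_table *t,
                             uint16_t lport, uint16_t rport,
                             const uint8_t *raddr, uint8_t raddr_len)
{
    for (struct hpcb *p = t->bucket[port_hash(lport, rport)]; p; p = p->next)
        if (p->lport == lport && p->rport == rport &&
            p->raddr_len == raddr_len &&
            memcmp(p->raddr, raddr, raddr_len) == 0)
            return p;
    return NULL;
}
```

The point of hashing on ports alone is that the hash cost is independent of address length; the address length only affects the final compare on the few entries that share a bucket.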
Noel
information theory, she say:
if you have variable length address, you must vary it, otherwise it
conveys nothing
TCP she say: me big end to end protocol, need to know end from end
multihomed host, he say, need to look at variation in address, and
check him don't vary from 1 packet to next in same connection
i.e. variable length address necessarily costs more to end system than
fixed address.
what is gain to network that is worth loss to end system?
also:
personally, i think there are as many opportunities to assign
variable length wrong as there are advantages in the flexibility ...
this is based on talking to sites with 100k host networks in the UK, i
often find they have a real problem with IPv4 address assignment.
and note, if someone gets it wrong, they use memory in _all_ routers
jon
Noel:
Here's why one argues about a few instructions:
It requires 50-70 instruction cycles and 4 to 8 cache line loads
to forward a current IP datagram using a processor. (Depends on your
architecture). Current processor and memory performance trends say
that these two metrics are roughly balanced (i.e, 4 to 8 cache
line loads can be done in 50 to 70 instruction cycles).
So, if you add, say 10 instructions to header processing, you've slowed
the system down by 15% to 20%. One more cache line load (say due to bigger
headers) has at least as severe an effect.
Craig
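
The arithmetic behind the 15% to 20% figure above can be spelled out in a few lines of C. This is just the ratio of added instructions to the baseline path length; the function name `slowdown_percent` is made up for the example, and it ignores the cache-line side of the argument.

```c
/* Percentage slowdown from adding `extra` instructions to a forwarding
 * path that currently takes `base` instructions. */
static double slowdown_percent(double base, double extra)
{
    return 100.0 * extra / base;
}
```

Ten extra instructions on a 50-instruction path is 20%; on a 70-instruction path it is a little over 14%, which rounds to the lower end of the quoted range.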
Not to get too far off on this tangent, but routers also run
"applications" that process and generate routing updates, handle
telnet sessions and file transfers, and respond to SNMP requests
(among other things, depending on the router). The application load
isn't the same as a desktop host, but it's definitely not zero.
Erik
-jj
A host must also be prepared to do both specific and wild card matching
on addresses. A given received UDP datagram might be the job of a
process listening to datagrams only from a single remote host or it
might be given to a process that wants all datagrams to a given port.
Then there is multicast, which can involve delivering a single datagram
to more than one process.
That said, I must note that the incoming packet rates on the
video-on-demand servers I've heard about are not the same as their
outgoing packet rates.
Vernon Schryver, v...@sgi.com
> Should I call this the "router guy" view?
>
> Feel free to call it whatever you like.
>
> You do not mention that for each packet you may do:
>
> 1. case dependent, at internet layer you multiply the address match with the
> number of local host addresses, or/and with routing table entries.
>
> 2. At transport level, which 'router guys' ignore systematically, for each
> packet received, with data or no data, two address matches are
> performed for each socket that is in the list of sockets for that
> particular transport, and is not the socket looked for.
>
> Sorry, I'm lost here.
This is worth understanding. I think it is demonstrable that hosts are more
greatly affected by the cost of address processing since they generally
need to process both the source and destination addresses. When a host
receives a packet it must:
- determine if the destination address is one of its local addresses. The
current IPv4 implementation on Unix machines normally does this by scanning
the list of interfaces looking for a match on the local address, which is
what the first note is referring to. This becomes a less-than-sensible
thing to do if the cost of comparing addresses becomes relatively more
expensive than it is now (it is a somewhat dubious practice even now if
the host has a lot of interfaces), so let's assume in any case a better
implementation might do a binary tree lookup, doing a few bit tests to
find the only local address which might match, and then does a single
comparison against this address. The cost of variable-length addresses
is probably an extra length check in the tree lookup loop (to avoid hosing
yourself if the destination in the packet is shorter than host addresses
you like) plus the difference in cost of doing a variable- rather than
fixed-length comparison of the addresses (a couple of extra instructions
plus object code bloat if you switch() to inline comparison code).
- find the protocol control block for the transport session. For TCP
and UDP this means demultiplexing based on source+destination
addresses+ports. For TCP (which is easier) current Unix IPv4
implementations often do a quick comparison against the last protocol
control block found, and if this doesn't match they scan the list
of all active pcb's, comparing source+destination addresses+ports against
each. This is 2. above, and again may be a silly thing to do if
addresses are longer and comparisons more expensive, so let's again
assume that they'll optimize the heck out of this by (a) using the
destination address match done above to locate those pcb's using that
local address (a layering violation, probably), and (b) doing a binary
tree lookup against the source address and the port pair (with some
complexity to deal with wildcarding?) to find the best possible match,
then comparing the ports and the source address to find if it matches
or not. The cost of variable length addresses is going to be the
additional cost of locating the source address in the packet, since it
won't be at a fixed offset, a length check in the tree lookup loop
to avoid testing bits which aren't in the address, and again the cost
of a variable- rather than fixed-length address comparison.
Once you have the protocol control block you can checksum any data in
the packet along with the transport header (many packets won't have data),
process the TCP header (often quick if you can shortcut using header
prediction), and move the data out to the application.
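
The "few bit tests, then a single comparison" lookup described above, with the extra length check that variable-length addresses require, might be sketched in C as follows. This is a simplified illustration and not any existing implementation: the node layout and the name `addr_lookup` are hypothetical, and the convention that bits beyond the end of a short key read as zero is an assumption (the tree would have to be built with the same convention).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tree node: internal nodes test one bit, leaves hold an address. */
struct node {
    int bit;                       /* bit index to test (MSB first); -1 marks a leaf */
    struct node *child[2];         /* internal nodes only */
    const uint8_t *addr;           /* leaf only: full address */
    size_t addr_len;               /* leaf only: address length in bytes */
};

static const struct node *addr_lookup(const struct node *n,
                                      const uint8_t *key, size_t key_len)
{
    while (n != NULL && n->bit >= 0) {
        size_t byte = (size_t)n->bit / 8;
        int b = 0;
        if (byte < key_len)        /* the extra length check: never read
                                      past the end of a short address */
            b = (key[byte] >> (7 - n->bit % 8)) & 1;
        n = n->child[b];
    }
    /* One full variable-length compare against the only candidate. */
    if (n == NULL || n->addr_len != key_len ||
        memcmp(n->addr, key, key_len) != 0)
        return NULL;
    return n;
}
```

The per-node cost over a fixed-length version is the one `byte < key_len` test; the final compare costs a length check plus a length-dependent `memcmp` instead of a fixed-size one.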
I think it is clear that variable length address processing is going to
cost hosts more than four instructions (it is also clear that bigger
addresses of any flavour are going to require hosts to process addresses
more cleverly than they might do now if they want to keep the performance
up). What I can't estimate is whether the number of instructions it costs
a good implementation to deal with variable length addresses is going to
be significant compared to the remaining cost of processing the packet.
TCP processing can be relatively fast, but there is the checksum on
data-bearing packets and the cost of moving the data to the application
to consider. I don't know how this might compare.
Dennis Ferguson
Noel,
I'm sure you feel that your response to me was on the point, but it wasn't.
My suggestion was to be aware of a category of risk and then to minimize.
Minimize does not mean eliminate. Please do not attempt to turn my
caution/suggestion into more than it was.
to repeat: each step into the unknown entails risk. The more such steps,
the more such risk. Worse, I tend to view the accumulation as having worse
than a summative relationship. Hence, take only the ones that you are
forced to take.
Dave
+1 408 246 8253 (fax: +1 408 249 6205)
>The criticisms made by variable-advocates are all true. Every attempt
>at predicting the right length has been wrong.
Do you think we should do it (trying to predict the right length) one
more time and then invest money in building products and transitioning
to IPng, so that eventually we'll prove ONE MORE TIME that "every
attempt at predicting the right length has been wrong" ?
Yakov.
Yes I do.
We have no choice.
I believe the claimed benefits for variable length addressing entail risks
that we do not adequately appreciate and do not need to incur. Further,
most such proposals really are fixed-length addressing in variable-length
guise. That is, they, too, have an upper bound.
There are many, many technical problems and features that most of us would
like to tackle for IPng. All are worthy (IMO) but that does not mean that
we can or should tackle them. We have a core set of requirements to
satisfy and just attending to them is ambitious enough. Adding more
increases the risk of the whole project.
Increasing the risk of this project, beyond what is absolutely necessary,
would be dumb (IMO).
From small inexpensive router vendors, and host vendors, the consensus
is clearly that fixed length is best, and variable is unworkable.
> ok, what about a ipng video on demand server or a transactions server
> supporting 100,000 customers
>
> it has to map addresses into TCP PCBs exactly as fast as the router
> forwarding packets to it
>
> it has to do a _full address match_ not just a prefix one.....unless
> you believe the EID stuff (actually, even if you do, such a server
> might well be multihomed on multiple provider nets, and
> therefore have to choose its network interfaces carefully, and check
> inbound and outbound match properly...
only when the connection is setup. once the connection has been established,
it should have the output interface cached.
--
Frank Kastenholz
FTP Software
2 High Street
North Andover, Mass. USA 01845
(508)685-4000
Is it logical to deduce from your statement that you believe that
IPng should be forwardable with the same cache line load and
instruction cycle count as IPv4?
Is it logical to deduce from your statement that you believe that
future silicon (4 years from now, never mind 20) will have the same
performance characteristics, in terms of caches, etc, as today?
If the answer to either of these questions is no then I would contend
that this whole argument is silly and nothing more than the product
of coders who worry about doing something that they have not done
before.
We should first figure out what the architecture for the new IP
should be, what problems need to be solved and which ones we intend
to solve. Once we do that, we can develop a protocol. Only when we
have a protocol should we worry about optimizing it.
To worry about optimizations and then make the architecture and
specifications fit the optimizations is quite simply backwards.
Let's get the architecture and the protocols done. There are plenty
of over-bright, over-eager grad students and undergrad students who
can figure out how to optimize the architecture and protocol ONCE WE
DEFINE THEM.
--
I submit that people who sell slow routers might think that
variable length addresses are bad. Not everyone just slaps a net-2
port onto a 68000 and calls it a router, though.
I think we've arrived at the following position:
- variable length addresses will cost, and the cost will vary
according to just how variable the address is (see notes below).
- the cost is relatively small, in terms of performance, if one
is careful. 10-50 percent seems like a reasonable wild guess
for sensible sorts of variable length addresses.
- the cost is greater for hosts, therefore an architecture which
hid the variable length parts from the hosts might be real slick
(Nimrod with routers handling the flow setup and so on and so forth,
hosts just see a flat network of things with EIDs)
- the development risks might be high. Some feel that we don't know
enough about how to route on variable length things to be sure we
can do it right.
- the benefits of variable length addresses are under scrutiny. The
main benefit that people can come up with is ease of administration.
It's looking a lot like a smallish win for a smallish cost, when bits
hit metal. I don't think anyone thinks we can get away with small, or even
medium sized, locators. With something like Nimrod, it doesn't really matter
if they're a huge fixed size, or truly variable, since you don't use them
all that much.
Notes on variability of 'address' length
----------------------------------------
It'd be sort of nice if people could try to be a little more clear
on what they mean by 'variable length address.'
- Nobody seems to think that Endpoint Identifiers (EIDs) should be
variable length at all. There's some disagreement about how big
they should be.
- 'address' could be one of lots of things, I think that most people
seem to mean 'locator', a thing that describes where something with
an EID is. This is hierarchical, so making it fixed length, or
mandating a maximum length, implies a maximum depth to the routing
hierarchy.
- 'address' could also mean 'forwarding information', such as a source
route. I don't see how it's possible to make this be fixed length,
though it probably does come as a variable number of things, each
of a fixed size.
- A variable length field could be:
- any number of bytes in length, with no expected length
- any number of words in length, with no expected length
- any number of longs in length, with no expected length
- any number of bytes in length, with an expected length
that we can optimise for
- any number of words in length, with ...
- any number of longs in length, with ...
or any of the above with a maximum allowable size. Other
variations are possible, but these seem to be the sane ones.
As vjs has pointed out, it'd be a Bad Thing for fields to be
anything other than 32-bit aligned and in 32-bit hunks (well,
64 or 128 would be ok, but not less than 32).
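
One of the variants listed above, "any number of longs in length, with an expected length that we can optimise for", could be compared along the lines of this C sketch. The expected length of four 32-bit words, the structure layout and the name `locator_equal` are assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define EXPECTED_WORDS 4           /* assumed common case: a 16-byte locator */

/* Hypothetical locator: a count of 32-bit words plus aligned storage. */
struct locator {
    uint8_t         nwords;        /* length in 32-bit words */
    const uint32_t *w;             /* 32-bit aligned words */
};

static int locator_equal(const struct locator *a, const struct locator *b)
{
    if (a->nwords != b->nwords)
        return 0;

    if (a->nwords == EXPECTED_WORDS)         /* fast path, fully unrolled */
        return ((a->w[0] ^ b->w[0]) | (a->w[1] ^ b->w[1]) |
                (a->w[2] ^ b->w[2]) | (a->w[3] ^ b->w[3])) == 0;

    for (size_t i = 0; i < a->nwords; i++)   /* general path */
        if (a->w[i] != b->w[i])
            return 0;
    return 1;
}
```

For addresses of the expected length, the cost over a fixed-length compare is the two length tests; everything else pays for a loop. Keeping the fields in aligned 32-bit hunks is what makes the word-at-a-time compare possible at all.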
>personally, i think there are as many opportunities to assign
>variable length wrong as there are advantages in the flexibility ...
>this is based on talking to sites with 100k host networks in the UK, i
>often find they have a real problem with IPv4 address assignment.
>and note, if someone gets it wrong, they use memory in _all_ routers
I doubt if anyone disagrees with you. I would be very surprised
if any variable length proponent would advise any entity to deploy
the variable length addresses in a single site variably. Rather the sites
would select a preferred fixed length and use that. Thus, we might
use a 16 byte address because of our hierarchy needs but somebody
else may use an 8 byte address because of their more limited hierarchy needs.
Neither they nor we are liable to mix sizes because that would imply
that we would be smart enough to keep them straight. However, should
our companies merge together (via an acquisition or some other unforeseen
event) then we would have to keep a more complicated algorithm in mind: our
old company uses 16 bytes and their old company uses 8 bytes. In any
case, not too many of us would think that we were smart enough to
use multiple sizes in the same site.
My point: variable length addresses do not mean that sites necessarily
deploy variable sized addresses. It means that different sites may select
an address size which they think best meets their needs. And there must
be a fixed upper and lower bound to what "variable" means.
Therefore, there is one more component to add to this discussion:
One size addresses means that everyone must use the same size addresses:
one size must fit all.
Variable length addresses give sites the flexibility to use appropriate
sized addresses to best meet their needs. [Of course, "appropriate" is
in the eye of the beholder, and I would not want to defend what other
people may think "appropriate" to be.] It also leaves open the possibility
of escaping from a "wrong length choice" -- but that escape may likely carry
with it the requirement to readdress. However, I would prefer to readdress
rather than have to redeploy new protocols to compensate for our lack of
omniscience as we are for IPv4 ==> IPng.
Bottom line: I am always in favor of flexibility unless it carries with it
a prohibitive performance penalty. Thus, I am very grateful for those of
you who are discussing the performance impacts of these two approaches.
Is a consensus beginning to form on that critical issue? How great is the
performance hit on hosts for variable length addresses?
Sincerely yours,
--Eric Fleischman
BTW: Dave's mentioning of risk also resonated with me. However, somehow
I think that our experience with DECnet/OSI and OSI has meant that we
are not totally unexposed to variable length addressing and thus the risks
may not be as great as he suggested. Any comments?
---- Included message:
or Some
Well-Designed New Variable Length header that suddenly gets REAL SLOW
but keeps working when we start expanding from 8 byte addresses to 12?
You are making a number of assumptions about the particular risks involved.
I believe we don't know enough for such assurances. For example, since
we have little design or use experience, any claims that the var-length
scheme will be well designed are (IMO) inappropriate. Ditto for the sorts
of deficient behavior we will experience.
As somebody (Noel?) pointed out a long time ago, at current growth
rates the "current" installed base will be only 5% or so of the total
running in 5 years. If you think it's painful to contemplate moving
Your heart is a small percentage of your body mass, but somehow, the
medical folks think that it's worth paying very close attention to
its well-being.
Yes. Converting to variable length scares me silly, there's a lot we don't
understand. On the other hand, unlike probably a good portion of this list
thank you for this. End of discussion.
D/
Hmm. Couldn't you equally say that we don't have enough design or use
experience to say that any claims that var-length schemes will be *poorly*
designed are not appropriate?
Clever question, Noel. My own answer is 'no'. Goodness seems to require
effort and skill. Badness seems to come for free. (I suppose one could
assert rules of entropy and chaos to explain this, but I wouldn't dream
of proposing such an explanation myself...)
Umm. How much did we understand about the difficulties of moving retransmission
into the hosts when TCP was done?
By 1983, when TCP was put into production use, we had about 10 years
of experience with Ethernet (and TCP) exponential backoff concepts.
Dave
You are making a number of assumptions about the particular risks involved.
I believe we don't know enough for such assurances. For example, since
we have little design or use experience, any claims that the var-length
scheme will be well designed are (IMO) inappropriate.
Hmm. Couldn't you equally say that we don't have enough design or use
experience to say that any claims that var-length schemes will be *poorly*
designed are not appropriate?
> Yes. Converting to variable length scares me silly, there's a lot we don't
> understand.
thank you for this. End of discussion.
Umm. How much did we understand about the difficulties of moving retransmission
into the hosts when TCP was done?
Noel
I'm not sure I'm fully able to grok the Crowcroftian here, but here goes... :-)
TCP she say: me big end to end protocol, need to know end from end
Sure, which is why I like EID's (for those who aren't crazy enough to depend
on the end-end-identification being included in the end-end-checksum of the
pseudo-header of the end-end data :-). I'm perfectly happy to make *them* a
reasonable (e.g. 64 bits) fixed length.
multihomed host, he say, need to look at variation in address, and
check him don't vary from 1 packet to next in same connection
Why? If someone moves, they should send you an ICMP message saying "this
EID is now at locator <foo>, please update yourself". Looking at *every*
packet on the offchance that someone has moved seems rather silly.
i.e. variable length address necessarily costs more to end system than
fixed address.
Well, I'm not in favor of variable length addresses. I'm in favor of variable
length locators, a big difference. Yes, variable length things take more time
to process in software, but intelligent coding can reduce the impact.
what is gain to network that is worth loss to end system?
Flexibility and adaptability over the entire life-cycle of the system.
Noel
It requires 50-70 instruction cycles and 4 to 8 cache line loads
to forward a current IP datagram using a processor. ... So, if you add,
say 10 instructions to header processing, you've slowed the system down by
15% to 20%.
There is a substantially different balance between I/O-memory bandwidth and
CPU-memory bandwidth in the routing application, from normal applications.
I.e. if your router is handling a bulk data transfer, with lots of large data
packets, the number of CPU memory references is a lot smaller, in ratio to the
I/O memory references, than it is for most "normal" applications. Then, you
have the issue of double transfers of all packet data over the bus, if you
have a large single shared memory, which leads you to put large buffers, and
some crunchy computing power, out on the interfaces. (This gives you a design
that scales better in terms of numbers of interfaces anyway, this being the
current trend, for reasons that escape me, since it's bigger single points of
failure).
So, for high performance routers, you wind up with a specialized hardware base
(in terms of the memory, and busses, etc) anyway. I think that you will find
that due to this specialized hardware base (and people like Tony can speak to
this more than me, since I don't build these things for a living any more,
like I used to :-), looking at this particular application from an analysis
based on conventional workstation CPU's, etc, may not be too fruitful.
Perhaps Tony has some additional comments on this particular line?
Noel
---- Included message:
On Thu, 16 Jun 1994 11:54:15 EDT, Dave Crocker said:
If I remember correctly, the Jan 1 1983 cutover was a "everybody is
supposed to be doing TCP rather than NCP. Surely you aren't suggesting
that "everybody will be running IPng rather than IPV4 by MM/DD/YY" has a
No, indeed I'm not. So, either I can argue that the level of commitment
we are about to make is comparable to that moment in time, or else I
can backup the adoption date reference. I will, of course pursue both...
The decision in 1982 (or thereabouts) affected a few hundred machines
in 1983, with pretty much none of them being mission critical. The
decision today will very, very quickly affect many, many more machines.
So, I'm inclined to keep the 1983 reference as a measure of impact.
However, if we want to require 'structural' comparability to the reference,
then I'd say somewhere around 1980 a "community decision" was made to
adopt TCP and its retransmission scheme. (I wasn't part of this
activity and might have the date off, slightly, so unless someone wants to
assert 1976 or 1983 please don't refine my estimate. Since the first
implementation of TCP was in 1976, I doubt we'll have to push the estimate
back that far.)
In any event, this means that we had a minimum of five years of experience
with exponential backoff, and probably more like 7 or 8. That's quite
a lot. Also, the progression of work that led to the scheme did
constitute a foundation to a useful degree. I'm not sure we can claim the
same for a variable length scheme.
Yes, we have much improvement over recent years. But I think there is
a difference between initial adoption of a reasonably well understood
construct, followed by periodic improvement, versus a kind of
hand-over-eyes toss of the dart hoping for an improvement, without any
meaningful experience to understand the impact. (When pinning the tail
on the donkey, be careful it doesn't bite you.)
Dave
> My point is different, I think. The discussion in Chicago confirmed
>my own experience which is that the "network managers" for connected
>networks (i.e. not an isolated LAN) _must_ know about subnet masks and
>variable length subnetting to do their job reasonably well. I don't
>think that will change with IPng. Certainly the large interconnected
>IPX networks that I know about have NetWare Network Admins who grok and
>need to grok these things for IPX. I just want to make it _easier_ for
>such a network admin to use his/her address space more effectively --
>by making nibble boundaries easier to deal with.
I have three general key points for IPng:
1. Stateless autoconfiguration (thank you Ross for the term) of end nodes
for network connectivity.
2. Easier configuration of routers at customer sites.
3. Service location which is easy and scales (which isn't part of IPng per
se, but motivates #1).
For #2, IPX, and AppleTalk "classic", routers are reasonably simple for
local administrators (at least for network configuration parameters) to
configure: an administrator has a bag of unused network numbers and picks
one of those numbers at random from the bag and assigns it to the wire (the
router's interface to the wire).
IPv4 is a nightmare to administer. Subnet masks are a key reason why.
Hopefully, IPng will shield all but the "backbone" administrators from any
such notions. Hopefully, IPng will give a "bag" of network numbers to
local administrators to choose from. (The trick, of course, is doing this
and scaling from a routing point of view at the same time!)
Greg
ps - Which is not to say we shouldn't have ipng_ntoa()...
>I'd like to point out that the Internet is turning into the Information
>market (and it is doing this quite successfully). In such an
>environment the information is all over the net, and if the predominant
>use of the net is for the information retrieval, then I don't see how
>"only 1% traffic" is going to be nonlocal.
Foo. If the measured traffic today is "only 1% nonlocal" (not that Jon
made quite so strong of an assertion about the validity of his numbers),
that is just a fact.
What you are doing is waving your hands. I can wave mine, too: one major
break in the network today is WWW/Mosaic traffic (hitting the NCSA home
page and other popular sites). To fix that, we are going to have to move
data closer to the users (caching/replication). This will increase
"locality of reference". I suspect (waving of hands) that for most usages,
similar things will happen.
And, in general, you are ignoring that the predominant use of a LAN is to
retrieve information from that LAN; the predominant use of a campus net is
to retrieve information from that campus network; etc.
Greg
> Is it logical to deduce from your statement that you believe that
> IPng should be forwardable with the same cache line load and
> instruction cycle count as IPv4?
If it were feasible yes. I believe no IPng candidate can achieve this
goal but I think it is the right kind of goal. Put another way, yes I
believe a goal of IPng should be to minimize the number of instructions
and cache load/stores required to handle the header.
> Is it logical to deduce from your statement that you believe that
> future silicon (4 years from now, never mind 20) will have the same
> performance characteristics, in terms of caches, etc, as today?
Extrapolating from today's trends, the answer is that the performance
tradeoffs will generally be worse than today -- i.e., non-cache accesses
will be more expensive.
> We should first figure out what the architecture for the new IP
> should be, what problems need to be solved and which ones we intend
> to solve. Once we do that, we can develop a protocol. Only when we
> have a protocol should we worry about optimizing it.
I believe that this discussion all started on the SIPP list with the question
being what the final proposal for SIPP should be, and what were the costs vs.
benefits of variable vs. fixed sized addresses for SIPP. In that discussion,
performance is very much part of the equation.
Craig
This is a false dichotomy. It ignores many other logically possible
choices, such as a fixed length header that works fine until other
aspects of the protocol become obsolete.
> As somebody (Noel?) pointed out a long time ago, at current growth
> rates the "current" installed base will be only 5% or so of the total
> running in 5 years. If you think it's painful to contemplate moving
> to a non-extensible format *now*, imagine the pain in 15 or 20 years.....
Do you not think that the installed base of 15 years hence will be
only 5% of the total running 20 years from now? Why not?
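
The arithmetic behind this exchange can be sketched as follows. The sketch
assumes a constant growth rate, which is a simplification nobody in the thread
asserts, and all names in it are hypothetical. If today's installed base is 5%
of the total five years on, the implied growth is 20x per five years, and the
same 5% ratio then holds for any five-year window.

```python
# Assumed input, taken from the quoted claim: today's base is 5% of the
# total running five years from now.
fraction_after_5_years = 0.05

# Implied growth factors under the (assumed) constant-rate model.
growth_per_5_years = 1 / fraction_after_5_years    # 20x
annual_growth = growth_per_5_years ** (1 / 5)      # per-year factor


def installed_base(year: float, base_now: float = 1.0) -> float:
    """Hypothetical installed base `year` years from now."""
    return base_now * annual_growth ** year


# Ratio of the earlier base to the later one, for two 5-year windows.
ratio_now = installed_base(0) / installed_base(5)
ratio_later = installed_base(15) / installed_base(20)

print(round(annual_growth, 2))
print(round(ratio_now, 3), round(ratio_later, 3))
```

Under that model the two ratios come out the same, which is the point of the
question: the argument from a small installed base applies at any date.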
_________________________________________________________
Matt Crawford cra...@fnal.gov Fermilab
If the host is a server (NFS, database, video server etc) that has 10 wires
you still only want to spend a small fraction of CPU and memory bandwidth
while running all wires at wire speed.
My gut feel is that servers have about the same number of wires as
routers, i.e. the host software running on the servers has to be faster
than the router switching hardware+software in order for the servers to be
useful as information depositories.
Erik
Noel:
I vigorously disagree. If you put the general purpose CPU out on
the interfaces (and some folks have done this in the past, and will in
the future), the same rules apply. The speed at which the CPU can handle
the packets is bounded by the instruction/memory access time. And right
now the decision making process is the bottleneck -- otherwise explain to
me why it is that we have gigabit busses (implying, with average packet
size of 1 Kb, a packet rate of 1 MPS), yet packet rates are only in
the few hundreds of thousands of packets per second -- and why we have
slow and fast paths in routers...
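
The packet-rate figure above can be checked with a short sketch. It assumes
"gigabit" means 10**9 bits/sec and "1 Kb" means 1000 bits; both readings, and
the function name, are assumptions made for illustration.

```python
def packets_per_second(bus_bits_per_sec: float, packet_bits: float) -> float:
    """Upper bound on packet rate if the bus were the only limit."""
    return bus_bits_per_sec / packet_bits


GIGABIT = 1_000_000_000   # assumed: 10**9 bits/sec
PACKET_BITS = 1_000       # assumed: "1 Kb" read as 1000 bits

bus_limit = packets_per_second(GIGABIT, PACKET_BITS)
print(bus_limit)  # 1000000.0
```

That bound of a million packets per second is what is being compared with
observed rates of a few hundred thousand.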
Or, if you prefer, look at the AT&T route approach -- all their CPUs
do is decide where to send packets based on the headers (data is transferred
over a special bus).
Craig
> I believe that this discussion all started on the SIPP list with the question
> being what the final proposal for SIPP should be, and what were the costs vs.
> benefits of variable vs. fixed sized addresses for SIPP. In that discussion,
> performance is very much part of the equation.
You are correct. This started out as a discussion about what to do with
SIPP (which is why I copied the SIPP list). I think the outcome of the
SIPP discussion is relatively clear (as clear as any email discussion
ever is) and is that for SIPP 16byte fixed length addresses are a
reasonable middle ground. They resolve the issues that some people have
with the size of 8 byte SIPP addresses and make it easier to do
auto-configuration. Not everyone agrees with this (one voice who prefers
to keep the addresses 8 bytes, and a few who prefer variable length
addresses), but I think most of the people working on SIPP seem to
support 16byte fixed length addresses.
I think the discussion on SIPP should move back to the SIPP list. If
folks want to continue the general discussion about fixed vs. variable on
the BI, they are free to do so.
Bob
for SIPP 16byte fixed length addresses are a reasonable middle ground.
Just to clarify, is this going to be treated as a monolithic entity, or
more along the lines of what Bill Simpson suggested (with 8+8, and only the
lower 8 bytes being used in the transport pseudo-header)?
If folks want to continue the general discussion about fixed vs. variable
on the BI, they are free to do so.
This debate seems to make remarkably little progress in terms of coming to any
conclusion. People may be more educated about the issues, but it's hard to
point to any agreement beyond that.
Noel
First, it's not silly. Second, I could take the coders' view and say we are
sick of architecture folks trying to build protocols in the IETF
for IPng. Now we have real coders (not used-to-be coders), folks
developing code from all proposals. I take your statement as an attack
on those folks whose companies are paying them to see if any of these
ideas will ever work and perform. I don't think you would want the
coders to go away from this forum; then all you would have is rough
consensus and no running code.
>Let's get the architecture and the protocols done. There are plenty
>of over-bright, over-eager grad students and undergrad students who
>can figure out how to optimize the architecture and protocol ONCE WE
>DEFINE THEM.
Well, I think all the coders should quit coming to the IETF if this is
the respect they get, and let the IETF go into pure research mode. And
then no one in the market would take it seriously as anything other than
a think tank.
My point is that I think you went overboard here about the coders.
/jim
> Tony's reply missed some optimizations.
Some of those optimizations (e.g. check least significant bits first)
are classic, and apply more to fixed address sizes than variable.
Where do you find the least significant bits in the presence of varying
length fields?
The only optimization I suggested along these lines was to check the ports
first. They should be easy to find, and if you have an "internetwork header
length" field in IPng, it will take *exactly* the same number of instructions
to find them as with IPv4, since IPv4 already has variable length internetwork
headers, and I doubt there's much code that optimizes for the "no options"
case (Van's header prediction might, I don't remember). In any case, the cost
of getting the pointer to the TCP header is swamped by the costs of checking
more than a few control blocks.
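
As a minimal sketch of the point about finding the ports: with a header length
field, the transport header offset is one mask and one multiply away, whether
or not options are present. The example uses the IPv4 IHL field; the function
name and the hand-built packets are hypothetical.

```python
import struct


def tcp_ports(packet: bytes) -> tuple[int, int]:
    """Return (source_port, dest_port) from an IPv4 packet carrying TCP."""
    ihl_words = packet[0] & 0x0F    # IPv4 header length, in 32-bit words
    offset = ihl_words * 4          # byte offset of the transport header
    return struct.unpack_from("!HH", packet, offset)


# Hypothetical packets: only the first byte (version + IHL) is filled in.
no_options = bytes([0x45]) + bytes(19)      # IHL = 5, 20-byte header
with_options = bytes([0x46]) + bytes(23)    # IHL = 6, 24-byte header
ports = struct.pack("!HH", 1234, 80)

print(tcp_ports(no_options + ports))
print(tcp_ports(with_options + ports))
```

The same two operations locate the ports in both cases; only the value read
from the length field differs.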
Both of you have missed a lot of optimizations.
Good, can you point them out?
You're both spinning out academic ideas at the keyboard, not looking hard
to make a particular real implementation faster.
I understand the point you are trying to make, that we are looking at
hypothetical implementations, not real ones, and it's a good and valid one.
However, you could have said that without the "academic" label, which was
gratuitous, since this is one of the common ways used to put down things as
"too impractical". Can we all please continue this without ad hominem comments?
It's a great sport that I enjoy, but not relevant.
What else can we do? Try and look at all possible implementations and average
over them? Looking at a particular instance (even if hypothetical) has given
us, I think, a clearer idea of what the minimal costs really are...
You can obviously make processing a particular address size nearly as
fast as processing the same but fixed size. ... What happens to the
performance of your hosts and routers should that size change?
The kind of thing Tony outlined, with the jump table, does not suffer any
massive loss of performance when you change the size; the cost does increase,
but in line with the increase in size of the address. The flexibility and
adaptability seems well worth it to me, especially when taken with your
comment that the processing can be made almost as efficient as for the same,
but fixed, size. We'll only pay the cost of larger items if and when we
start using them, not now.
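
One way to picture a jump table keyed on address length is sketched below.
This is an illustration only, not Tony's actual design, which is not
reproduced in this part of the thread; every name in it is hypothetical.

```python
def lookup_8(addr: bytes) -> str:
    # Stand-in for a lookup routine specialized for 8-byte addresses.
    return f"8-byte route for {addr.hex()}"


def lookup_16(addr: bytes) -> str:
    # Stand-in for a lookup routine specialized for 16-byte addresses.
    return f"16-byte route for {addr.hex()}"


def lookup_generic(addr: bytes) -> str:
    # Slower fallback that works for any length.
    return f"generic route for {len(addr)}-byte address"


# The "jump table": address length selects the handler.
JUMP_TABLE = {8: lookup_8, 16: lookup_16}


def forward(addr: bytes) -> str:
    handler = JUMP_TABLE.get(len(addr), lookup_generic)
    return handler(addr)


print(forward(bytes(8)))
print(forward(bytes(12)))
```

Adding a new common size means adding one table entry; sizes without a
specialized entry still work through the fallback.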
Suddenly, on the day the Internet gets bigger, without any change on their
part, they get a lot slower, and in the distant future when they are
already at the limits of their performance.
I don't think so. Processors are still getting faster all the time, so by the
time addresses get larger, there'd be more processing power around to handle
them. Of course, we're probably also sending more packets by then, etc, etc,
but it's not quite as bleak as you indicate here.
if you can run at speed with the rest of the header and the user data
at arbitrary offsets, then you will pay in performance.
If by "arbitrary", you mean "unaligned", then this is a bit unfair. Nobody is
talking about doing *either* of these. It's easy to i) put all the variable
length fields at the end of any header, and ii) either pad each one so the
next thing is aligned, or pad that whole header when done, so the *next*
header is aligned, etc.
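
The padding rule in (ii) can be sketched as below, assuming 4-byte alignment
by default (the 32-bit multiple IPv4 uses for its header). The function names
are hypothetical.

```python
def padded_length(length: int, align: int = 4) -> int:
    """Round `length` up to the next multiple of `align`."""
    return (length + align - 1) // align * align


def pad_field(field: bytes, align: int = 4) -> bytes:
    """Append zero bytes so the next item starts on an aligned boundary."""
    return field + bytes(padded_length(len(field), align) - len(field))


print(padded_length(8))                  # already aligned
print(padded_length(9))                  # rounded up to the next multiple
print(len(pad_field(b"x" * 9, align=8)))
```

A 9-byte field thus costs 12 bytes on the wire with 4-byte alignment, and what
follows it stays aligned.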
Variable length *headers* do not have to impose a significant cost to the
processing of the following headers. Also, trying to make it look like you
don't have variable length headers, by use of repetitive encapsulation, is a
bit of a shell-game; it's still going to take processing power to handle those
encapsulated headers.
Depending on what the variable length *data items* are in such a header, there
may or may not be a significant cost in handling those data items, but that
depends on the details of the use of those fields. If you have a fixed length
EID, a variable length address or locator is obviously a lot less expensive
overall, since many of the expensive operations at the transport level (such
as control block location, etc) would be performed on the EID. Etc, etc.
that cost will be lost in the noise in low performance implementations,
but it will loom large in high performance implementations, those which
spend no more than Van Jacobson's handful of cycles/packet.
Just out of interest, how does VJ prediction find the control block quickly?
Does it use much the same code as the vanilla BSD, or does he have some tricks
there too? E.g., keeping the control blocks linked in MRU order? (Details all
omitted to keep this short! :-)
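
For readers unfamiliar with the MRU idea floated in the question, here is a
minimal sketch. It is not a description of Van Jacobson's code or of BSD; the
class, its methods, and the keys are all hypothetical.

```python
class ControlBlockList:
    """Linear list of control blocks, most recently used first."""

    def __init__(self):
        self._blocks = []   # list of (key, block) pairs

    def add(self, key, block):
        self._blocks.insert(0, (key, block))

    def find(self, key):
        """Return (block, comparisons); move a hit to the front."""
        for i, (k, block) in enumerate(self._blocks):
            if k == key:
                if i:
                    self._blocks.insert(0, self._blocks.pop(i))
                return block, i + 1
        return None, len(self._blocks)


blocks = ControlBlockList()
for name in ("a", "b", "c"):
    blocks.add(name, f"pcb-{name}")

print(blocks.find("a"))   # first lookup walks the whole list
print(blocks.find("a"))   # repeat lookup hits at the front
```

A stream of packets for one connection pays the full search once and a single
comparison afterwards.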
By the way, which sizes of address are proposed? Any length from 8 to
32 bytes including odd values?
Well, for locators (i.e. not in every packet), my preference would consist of
a sequence of variable number of variable length elements. These elements
would usually only be a byte or two, but the last one would typically be
longer; e.g. 48 bits.
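
A locator of this shape could be carried in many ways. The sketch below
assumes one possible encoding, a count byte followed by length-prefixed
elements; that wire format is invented here for illustration and is not
something proposed in the thread.

```python
def encode_locator(elements: list[bytes]) -> bytes:
    """Encode as: count byte, then (length byte, element bytes) per element."""
    out = bytearray([len(elements)])
    for element in elements:
        out.append(len(element))
        out += element
    return bytes(out)


def decode_locator(data: bytes) -> list[bytes]:
    """Inverse of encode_locator."""
    count, pos, elements = data[0], 1, []
    for _ in range(count):
        n = data[pos]
        elements.append(data[pos + 1:pos + 1 + n])
        pos += 1 + n
    return elements


# Hypothetical locator: two short elements, then a 48-bit final element.
locator = [b"\x01", b"\x02\x03", bytes.fromhex("0800200a0b0c")]
wire = encode_locator(locator)
print(len(wire), decode_locator(wire) == locator)
```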
Note that anything that does not put the user data at predictable
boundaries will have catastrophic performance effects on the performance
...Going from 8 to 9 byte addresses without adding a couple bytes of
padding
Yes, clearly any variable length fields should be padded to keep things
aligned. However, note that IPv4 does this already (the internetworking header
is padded to a multiple of 32 bits).
How recently did you write that code? How fast did it go, adjusting
for the speed of the hardware at the time?
It was some years ago, but I was getting a good fraction of a megabit/sec from
a 68000 (or 68010, I don't remember which, and I don't recall the clock speed)
running from a data source to a data sink on another machine over an Ethernet.
I had to go through an IPC to send the packets, which was something of a
bottleneck as I recall, but the details are faded.
Please do not argue on such grounds. None of us can win such arguments
and none of us look good indulging in them.
I'd be more than happy to refrain, especially if it's agreed that you don't
have to be writing or maintaining code *this week* to have anything of value
to say.
Noel