
latency from off-campus


Rob McNicholas

Jan 22, 2002, 3:18:09 PM
I'm surprised there has been no discussion of this yet. Has anyone
else noticed the extreme latency when coming from the "commodity
internet" to campus? Rumor has it that packet "strangling" is being done
to throttle the connection down to an acceptable limit.

Anyone from CNS care to make a semi-official announcement? :-)

-rob

--
Rob McNicholas Dept. of EECS, U.C. Berkeley
ro...@eecs.berkeley.edu 321 Soda Hall, 510/642-8633

Aaron Brown

Jan 22, 2002, 5:44:13 PM
Rob McNicholas <ro...@apiary.eecs.berkeley.edu> writes:

> I'm surprised there has been no discussion of this yet. Has anyone
> else noticed the extreme latency when coming from the "commodity
> internet" to campus? Rumor has it that packet "strangling" is being done
> to throttle the connection down to an acceptable limit.

I've definitely noticed this, and it's extremely frustrating. As a
graduate student doing CS research work, this network change basically
made it impossible for me to do my research work off-campus; from my
@home connection, I'm getting 800+ms latencies inserted within the
Berkeley network, and virtually no bandwidth. It's become painful just
to read an email message over IMAP, and forget about interactive work
or transferring files to/from my CS home directory. The bandwidth cap
also affects access to the CS department web server, making it hard
for outsiders to access our research papers, talks, and data.

I think this needs to be addressed publicly by CNS, and ideally a better
solution needs to be found. As it stands, this throttling is a serious
impediment to people doing real research work from home; most faculty and
students work at least partially from home and they will be affected by
this.

I also wonder why the throttling is being done across-the-board. At
other schools (Harvard is a good example that I'm familiar with),
selective traffic shaping is used very effectively to rate-limit only
certain traffic (napster/gnutella-style stuff in particular). In my
experience, this system works very well: work-critical ssh, cifs, and
web traffic is virtually unaffected, while bandwidth-sucking file
sharing traffic is limited.
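The selective approach described here is essentially a rate limiter applied only to the file-sharing ports while everything else passes untouched. A minimal Python sketch of the idea, assuming the well-known gnutella (6346/6347) and KaZaA (1214) ports of the era and a purely illustrative 1 Mb/s cap:

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: traffic in the shaped class is admitted
    only while tokens remain; tokens refill at `rate` bytes per second."""
    def __init__(self, rate, burst):
        self.rate = rate          # refill rate, bytes/sec
        self.capacity = burst     # maximum burst size, bytes
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self, nbytes):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False              # packet would be queued or dropped

# Shape only the file-sharing ports; everything else passes untouched.
SHAPED_PORTS = {6346, 6347, 1214}           # gnutella, KaZaA well-known ports
shaper = TokenBucket(rate=125_000, burst=64_000)  # ~1 Mb/s, illustrative

def admit(dst_port, nbytes):
    if dst_port in SHAPED_PORTS:
        return shaper.allow(nbytes)
    return True                   # ssh, http, etc. are unaffected
```

The key property is the one claimed in the post: work traffic never touches the bucket at all, so only the named ports ever see the cap.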

--Aaron

Cliff Frost

Jan 22, 2002, 6:46:24 PM
In ucb.net.discussion Rob McNicholas <ro...@apiary.eecs.berkeley.edu> wrote:
> I'm surprised there has been no discussion of this yet. Has anyone
> else noticed the extreme latency when coming from the "commodity
> internet" to campus?

Yes, we noticed it last week.

> Rumor has it that packet "strangling" is being done
> to throttle the connection down to an acceptable limit.

Rather traffic shaping is being done to keep us from going into
serious financial deficit.

> Anyone from CNS care to make a semi-official announcement? :-)

Sure, and since I'm the Director it's even more officious, er, official
than it might otherwise be.

I'm working on an announcement to send more broadly, but here is some
wording that I think is correct if not as articulate as I'd like:

Several people have started noticing network performance problems
between campus computers and internet sites. I've been asked if
this is the result of some new policy.

This is a result of finances and overall campus traffic. The
size of our pipe to the commodity Internet is dictated by
how much bandwidth we can afford to purchase.

For the last couple of years (once we separated out the massive
amounts of traffic from the Res Halls) the size of the pipe we can
afford was large enough that no one noticed any performance
problems. The campus's overall Commodity Internet traffic usage
has gone up to the point where we are now suffering.

The amount we can afford today is 70 Mb/s. The ResHall folks
are purchasing a separate 40 Mb/s. We believe we have the ability
to provide another 15 Mb/s if someone can come up with the money
(approx $4,500/month). We also believe we can provide even more if
there is a significant influx of capital as well as the monthly cost.
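For scale, taking the figures above at face value, and assuming purely for illustration that pricing scales linearly (real transit pricing typically doesn't), the implied unit cost works out as:

```python
# Illustrative arithmetic only, using the numbers quoted above.
extra_bandwidth = 15      # Mb/s on offer
extra_cost = 4500         # dollars per month
per_mbps = extra_cost / extra_bandwidth
print(per_mbps)           # 300.0 dollars per Mb/s per month

# At the same (assumed linear) rate, the existing 70 Mb/s pipe
# would correspond to roughly:
print(per_mbps * 70)      # 21000.0 dollars per month
```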

Note that this is mostly a problem for off-campus sites that are
commercial. Traffic to other Universities in the US, Mexico, Canada,
and many other parts of the world remains unimpeded.

Also note that we hit this cap very quickly, even before classes
started.

So, the only policy going on here is the decision not to go into
deficit. I hope this helps.

Thoughts?
Thanks,

Cliff Frost
Director, Communication & Network Services

Cliff Frost

Jan 22, 2002, 6:59:40 PM
In ucb.net.discussion Aaron Brown <abr...@cs.berkeley.edu> wrote:
> Rob McNicholas <ro...@apiary.eecs.berkeley.edu> writes:

>> I'm surprised there has been no discussion of this yet. Has anyone
>> else noticed the extreme latency when coming from the "commodity
>> internet" to campus? Rumor has it that packet "strangling" is being done
>> to throttle the connection down to an acceptable limit.

> I've definitely noticed this, and it's extremely frustrating. As a
> graduate student doing CS research work, this network change basically
> made it impossible for me to do my research work off-campus; from my
> @home connection, I'm getting 800+ms latencies inserted within the

I've never seen that high. 300ms is more usual. I run into it doing
ssh from home over SBC's DSL service.

But I don't dispute the point that it is creating pain.

...

> I think this needs to be addressed publicly by CNS, and ideally a better
> solution needs to be found. As it stands, this throttling is a serious
> impediment to people doing real research work from home; most faculty and
> students work at least partially from home and they will be affected by
> this.

Yes.

> I also wonder why the throttling is being done across-the-board. At

Well, we're considering something like what you describe. I have some
doubts that it can be completely effective however, given how easy it
is to get around the well-known port number shaping. Folks here are
pretty creative. Still it may be worth a try.

Thanks,
Cliff

Jon Kuroda

Jan 22, 2002, 7:02:33 PM
[see Cliff Frost's post for more details on the 70Mbit bit]
My own suspicions, again, are that various peer-to-peer and other
file-sharing applications are a large part of why we've hit this
limit now. But they're just suspicions; I am sure CNS has a far
better idea of how the 70 Mbits/sec are being consumed.

It is frustrating to work when one can't even login via ssh to campus
hosts. But, I wouldn't go so far as to say there is a better solution
just like that. I am not intimately familiar with either Harvard's
or UC Berkeley's networks, but I'd place decent odds on their being
different in topology, size, and implementation. Add on top of that
possible differences in provisioning for staff, and each campus's
bureaucracies, and I'd hate to try to compare apples to oranges.

Anyway, I do remember, back in the day when ICMP flooding was "popular",
getting a filter placed on a campus border router to block ICMP
traffic to a host that was under attack. That filter, from what I
remember, incurred a 10% CPU load on the router. I am not sure if
or how much of the campus network infrastructure has been upgraded
since then to something more capable, but just saying, even simple
filters can have drastic effects on availability/performance.

Now, having said that, how feasible is it to place more specific
traffic shaping policies to give priority to such things as smtp,
ssh, ftp, http, and other "important" protocols?

In article <uzo363...@cs.berkeley.edu>,


Aaron Brown <abr...@cs.berkeley.edu> wrote:
>I've definitely noticed this, and it's extremely frustrating. As a
>graduate student doing CS research work, this network change basically
>made it impossible for me to do my research work off-campus; from my
>@home connection, I'm getting 800+ms latencies inserted within the
>Berkeley network, and virtually no bandwidth. It's become painful just
>to read an email message over IMAP, and forget about interactive work
>or transferring files to/from my CS home directory. The bandwidth cap
>also affects access to the CS department web server, making it hard
>for outsiders to access our research papers, talks, and data.
>
>I think this needs to be addressed publicly by CNS, and ideally a better
>solution needs to be found. As it stands, this throttling is a serious
>impediment to people doing real research work from home; most faculty and
>students work at least partially from home and they will be affected by
>this.
>
>I also wonder why the throttling is being done across-the-board. At
>other schools (Harvard is a good example that I'm familiar with),
>selective traffic shaping is used very effectively to rate-limit only
>certain traffic (napster/gnutella-style stuff in particular). In my
>experience, this system works very well: work-critical ssh, cifs, and
>web traffic is virtually unaffected, while bandwidth-sucking file
>sharing traffic is limited.

In article <a2ktkg$1uqk$1...@agate.berkeley.edu>,


Cliff Frost <cl...@ack.Berkeley.EDU> wrote:
> The amount we can afford today is 70 Mb/s. The ResHall folks
> are purchasing a separate 40 Mb/s. We believe we have the ability
> to provide another 15 Mb/s if someone can come up with the money
> (approx $4,500/month). We also believe we can provide even more if
> there is a significant influx of capital as well as the monthly cost.

How much would an extra 15 Mbit/sec actually help? And if it would help
a lot, would, say, individual departments be willing to pitch in on the
$4,500/month? Could departments pay to get more bandwidth for their networks? =)
Now I'd buy that for a dollar!(tm)

Wordy as ever,
Jon

Cliff Frost

Jan 22, 2002, 10:18:23 PM
In ucb.net.discussion Jon Kuroda <jku...@csua.berkeley.edu> wrote:
> [see Cliff Frost's post for more details on the 70Mbit bit]
> My own suspicions, again, are that various peer-to-peer and other
> file-sharing applications are a large part of why we've hit this
> limit now. But they're just suspicions, I am sure CNS has a far
> better idea of how the 70MBits/sec are being consumed.

No need to be suspicious, it's fact. Kazaa's well known port has
the highest traffic of all, followed by http. Pretty close to these
is gnutella's port. In all, kazaa and gnutella account for more than
half the bits in aggregate.

> It is frustrating to work when one can't even login via ssh to campus
> hosts.

I agree.

> But, I wouldn't go so far as to say there is a better solution
> just like that.

Thank you. I wish it were simple, but I really don't think it is.

Here are a couple of issues:

* When Napster usage took off like crazy a couple of years ago, IU,
Yale, and USC very publically took the position that they would
allow Napster but give Napster traffic lower priority via traffic
shaping. Those three Universities were immediately added to the
lawsuits that the RIAA and Metallica and Dr Dre had brought against
Napster. All three ended up blocking Napster traffic entirely (as
I recall.)

The lesson is that if you single out a specific type of traffic and
treat it differently than the rest (even if you treat it "worse" in
some way), you are making an explicit judgement about that kind of
traffic. The judgement you are making is that you specifically
*approve* of that kind of traffic. Or at least that's the legal
argument you're at risk of encountering. That's why the Universities
in question caved in so quickly. Maybe with kazaa and gnutella there
isn't much risk, but I'm not eager to plunge the University into the
midst of a nasty lawsuit.

* I don't think that the campus wants CNS to get into the business of
judging that some kinds of packets are more valid than others. I'm happy
to have us implement the judgements of campus if we can, but until then
anything we do will be temporary.

> I am not intimately familiar with either Harvard's
> or UC Berkeley networks, but I'd place decent odds on their being
> different in topology, size, and implementation. Add on top of that
> possible differences in provisioning for staff, and each campus'
> bureacracies and I'd hate to try to compare apples to oranges.

They are quite different, but if Harvard thinks it's safe to do this
then I'm interested. I'll be contacting my colleagues there to see
what they are actually doing.

> Anyway, I do remember,back in the day when ICMP flooding was "popular",
> getting a filter placed on a campus border router to block ICMP
> traffic to a host that was under attack. That filter, from what I
> remember, incurred a 10% CPU load on the router. I am not sure if
> or how much of the campus network infrastructure has been upgraded
> since then to something more capable, but just saying, even simple
> filters can have drastic effects on availability/performance.

> Now, having said that, how feasible is it to place more specific
> traffic shaping policies to give priority to such things as smtp,
> ssh, ftp, http, and other "important" protocols?

We're using Packeteer traffic shaping devices instead of routers for
this kind of thing. I think something to help is likely quite feasible,
although risky--there are a whole lot of unpleasant unintended consequences
you can get into with complicated shaping policies.

Thanks,
Cliff

Tom Holub

Jan 23, 2002, 12:35:13 AM
In article <a2la1v$25ou$1...@agate.berkeley.edu>,
Cliff Frost <cl...@ack.Berkeley.EDU> wrote:
)
)* I don't think that the campus wants CNS to get into the business of
) judging that some kinds of packets are more valid than others. I'm happy
) to have us implement the judgements of campus if we can, but until then
) anything we do will be temporary.

So the question becomes, how do you decide "the judgements of campus"?
Because this is clearly a problem which someone needs to make a decision
on, and quickly.

The two groups which come to mind are NAG and the ITAC. The EBITF
is another possibility but I'd be concerned at the lack of technical
participation from campus departments (as I've noted before). I agree
that traffic content policy decisions shouldn't be made by CNS alone;
while there is a network-operation component to this issue, it's
really a question of tradeoffs.

For the acronym-challenged:

NAG--the Network Advisory Group, formed to give CNS a way to get advice
on policy and node bank issues from the broader campus community.

ITAC (formerly ITATF): Information Technology Architecture Committee,
a broad campus group discussing a wide range of issues.

EBITF: E-Berkeley Implementation Task Force. Probably not really
in their ballpark, but administrative muckymucks tend to listen to
them.

--
Tom Holub (tom_...@LS.Berkeley.EDU, 510-642-9069)
College of Letters & Science
249 Campbell Hall

Cliff Frost

Jan 23, 2002, 9:20:26 AM
In ucb.net.discussion Tom Holub <t...@ls.berkeley.edu> wrote:
> In article <a2la1v$25ou$1...@agate.berkeley.edu>,
> Cliff Frost <cl...@ack.Berkeley.EDU> wrote:
> )
> )* I don't think that the campus wants CNS to get into the business of
> ) judging that some kinds of packets are more valid than others. I'm happy
> ) to have us implement the judgements of campus if we can, but until then
> ) anything we do will be temporary.

> So the question becomes, how do you decide "the judgements of campus"?
> Because this is clearly a problem which someone needs to make a decision
> on, and quickly.

The NAG.

Here's why:

The judgements of campus are made by the Chancellor and the Chancellor's
Cabinet. (For communications-type issues, advice on these issues has been
delegated to the E-Berkeley Steering Committee, which further delegated
it to the NAG.) Note the word "advice" above: the Chancellor and Cabinet
can always override (not that they're likely to do that often, but if
something gets a lot of visibility they may want to take action).

There's a real problem here, however: there are no faculty on the NAG.
Efforts to recruit faculty have been uniformly unsuccessful. In fact,
it took me about 3 years to get the NAG to exist at all. In the absence
of an actual crisis it can be almost impossible to get anyone to pay
attention to serious issues on campus...

Thanks,
Cliff

Jeff Anderson-Lee

Jan 23, 2002, 11:46:17 AM
Cliff Frost <cl...@ack.Berkeley.EDU> writes:

> The lesson is that if you single out a specific type of traffic and
> treat it differently than the rest (even if you treat it "worse" in
> some way), you are making an explicit judgement about that kind of
> traffic. The judgement you are making is that you specifically
> *approve* of that kind of traffic. Or at least that's the legal
> argument you're at risk of encountering. That's why the Universities
> in question caved in so quickly. Maybe with kazaa and gnutella there
> isn't much risk, but I'm not eager to plunge the University into the
> midst of a nasty lawsuit.

Perhaps a slightly less slippery slope is to judge what packets are
given precedence rather than exclusion. For example: ssh, imap, and
kerberized telnet could conceivably be considered "higher priority"
without singling out any particular other traffic. That would
significantly help the telecommuters (of which I am one) and students
who work from home, which seems to be the primary concern here.
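Precedence-rather-than-exclusion amounts to strict-priority queueing on the well-known ports: bulk traffic is never dropped, just deferred. A rough Python sketch of that behavior (the port list is simply the standard assignments for the protocols named in the thread; this is not a description of any actual Packeteer configuration):

```python
from collections import deque

# Standard ports for the protocols people have named: ssh, smtp, http, imap.
PRIORITY_PORTS = {22, 25, 80, 143}

class PriorityLink:
    """Two-queue strict-priority scheduler for an outbound link."""
    def __init__(self):
        self.high = deque()
        self.low = deque()

    def enqueue(self, dst_port, packet):
        (self.high if dst_port in PRIORITY_PORTS else self.low).append(packet)

    def dequeue(self):
        # Priority traffic always drains first; bulk traffic is only
        # deferred, never excluded, so no protocol is singled out.
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None   # link idle
```

Under load, interactive sessions see the link as nearly empty, while file-sharing flows absorb whatever capacity remains.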

Jeff Anderson-Lee
System Manager, Digital Library Project
ERL/UCB

Marshall Perrin

Jan 23, 2002, 8:42:28 PM
Jeff Anderson-Lee <jo...@dlp.CS.Berkeley.EDU> wrote:

> Cliff Frost <cl...@ack.Berkeley.EDU> writes:
> Perhaps a slightly less slippery slope is to judge what packets are
> given precedence rather than exclusion. For example: ssh, imap, and
> kerberized telnet could conceivably be considered "higher priority"
> without singling out any particular other traffic. That would
> significantly help the telecommuters (of which I am one) and students
> who work from home, which seems to be the primary concern here.

I'll second that. If the ports for http, mail, and ssh had priority (which I
think could easily be justified by saying those are the ones most commonly used
for academic purposes) then my guess is no one could accuse CNS of explicitly
sanctioning Napster or any of the like. How technically feasible this sort of
thing is, I don't know. Presumably it depends upon the details of the Packeteer
shaping device, which I'm not familiar with.

I'll add my voice to that of the throng (such as it is) clamoring for improved
ssh performance from home to campus. I've definitely noticed a drop in
responsiveness, though not as bad as what Aaron describes - more like 300 ms
pings these days when I used to have about 90, coming in over DSL from
Sonic.net.

- Marshall

Cliff Frost

Jan 23, 2002, 9:47:21 PM
In ucb.net.discussion Marshall Perrin <mperri...@arkham.berkeley.edu> wrote:
> Jeff Anderson-Lee <jo...@dlp.CS.Berkeley.EDU> wrote:
>> Cliff Frost <cl...@ack.Berkeley.EDU> writes:
>> Perhaps a slightly less slippery slope is to judge what packets are
>> given precedence rather than exclusion. For example: ssh, imap, and
>> kerberized telnet could conceivably be considered "higher priority"
>> without singling out any particular other traffic. That would
>> significantly help the telecommuters (of which I am one) and students
>> who work from home, which seems to be the primary concern here.

I do tend to prefer this type of thing in theory. In practice it may
be pretty hard to do well. Keep the ideas coming, it's helpful.

For everyone's info, we will be supplying some relief to the current
congestion while working on various angles for long-term relief. Probably
around Friday you'll see some results, if not sooner.

There's a really interesting sociological thing going on here--my guess is
that individuals using University resources to transfer songs or videos
probably view it as sort of like stealing a pencil or pen from the office.
Not terribly significant when one person does it once. The problem is that
if 10,000 people start pilfering small stuff every day, it can actually add
up to real money pretty quickly...

Thanks,
Cliff

Marshall Perrin

Jan 23, 2002, 10:52:50 PM
Cliff Frost <cl...@ack.Berkeley.EDU> wrote:
> There's a really interesting sociological thing going on here--my guess is
> that individuals using University resources to transfer songs or videos
> probably view it as sort of like stealing a pencil or pen from the office.
> Not terribly significant when one person does it once. The problem is that
> if 10,000 people start pilfering small stuff every day, it can actually add
> up to real money pretty quickly...

Probably a better analogy is people making personal phone calls from the
office... It's just part of the infrastructure, to be used without being
thought about. The difference, of course, being that the phone system has
substantially more overcapacity than the network. That difference probably
reflects more on the additional century's worth of maturity that the phone
system has, rather than any fundamental difference between the two
technologies.

- Marshall

Jeff Anderson-Lee

Jan 24, 2002, 11:47:37 AM
Cliff Frost <cl...@ack.Berkeley.EDU> writes:

> Note that this is mostly a problem for off-campus sites that are
> commercial. Traffic to other Universities in the US, Mexico, Canada,
> and many other parts of the world remain unimpeded.

From this note and another source I took it that traffic through
networks that peered with Calren2 should be unaffected. However
I find this morning that my SBC/Pacbell connection "appears" to
go through calren2, but is VERY affected:

Tracing route to dlp.CS.Berkeley.EDU [128.32.46.163]
over a maximum of 30 hops:

1 <10 ms <10 ms <10 ms 192.168.2.1
2 20 ms 20 ms 20 ms adsl-64-167-239-254.dsl.scrm01.pacbell.net [64.1
67.239.254]
3 10 ms 10 ms 10 ms 64.171.152.69
4 20 ms 20 ms 20 ms bb1-g5-0.scrm01.pbi.net [64.171.152.231]
5 20 ms 10 ms 10 ms sl-gw25-stk-2-0.sprintlink.net [160.81.16.89]
6 20 ms 20 ms 20 ms sl-bb20-stk-8-1.sprintlink.net [144.232.4.217]
7 20 ms 20 ms 20 ms sl-bb23-sj-5-1.sprintlink.net [144.232.9.165]
8 20 ms 20 ms 21 ms sl-bb20-sj-12-0.sprintlink.net [144.232.3.193]
9 20 ms 20 ms 30 ms svl-brdr-02.inet.qwest.net [205.171.1.133]
10 20 ms 20 ms 21 ms svl-core-03.inet.qwest.net [205.171.14.162]
11 20 ms 20 ms 30 ms svl-edge-09.inet.qwest.net [205.171.14.98]
12 20 ms 20 ms 21 ms 65.113.32.210
13 20 ms 31 ms 20 ms ucb-gw--qsv-juniper.calren2.net [128.32.0.69]
14 20 ms 30 ms 30 ms vlan196.inr-201-eva.Berkeley.EDU [128.32.0.74]
15 1121 ms 1122 ms 1121 ms fast8-0-0.inr-210-cory.Berkeley.EDU [128.32.255.122]
16 1092 ms 1031 ms 1052 ms GE.cory-gw.EECS.Berkeley.EDU [169.229.1.46]
17 1051 ms 952 ms 941 ms 169.229.59.237
18 931 ms 911 ms 832 ms 169.229.59.241
19 832 ms 811 ms 831 ms dlp.CS.Berkeley.EDU [128.32.46.163]
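The hop where the delay is injected can be picked out mechanically by looking for a large jump in average RTT between successive hops. A small Python sketch, with a few hops transcribed from the trace above (the threshold factor is arbitrary):

```python
import re

# A few hops transcribed from the tracert output above.
TRACE = """\
13 20 ms 31 ms 20 ms ucb-gw--qsv-juniper.calren2.net [128.32.0.69]
14 20 ms 30 ms 30 ms vlan196.inr-201-eva.Berkeley.EDU [128.32.0.74]
15 1121 ms 1122 ms 1121 ms fast8-0-0.inr-210-cory.Berkeley.EDU [128.32.255.122]
"""

def latency_jump(trace, factor=5.0):
    """Return (hop, hostname) of the first hop whose average RTT exceeds
    the previous hop's average by more than `factor`, else None."""
    prev = None
    for line in trace.splitlines():
        m = re.match(r"\s*(\d+)\s+(\d+) ms\s+(\d+) ms\s+(\d+) ms\s+(\S+)", line)
        if not m:
            continue   # skip lines with "<10 ms" or timeouts
        hop, host = int(m[1]), m[5]
        avg = (int(m[2]) + int(m[3]) + int(m[4])) / 3
        if prev is not None and avg > prev * factor:
            return hop, host
        prev = avg
    return None

print(latency_jump(TRACE))   # (15, 'fast8-0-0.inr-210-cory.Berkeley.EDU')
```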

Is the fact that a calren2 name appears at hop 13 a red herring, or
is the reverse route merely set incorrectly, such that the return
packets are being subject to rate limiting even though the outbound
(from my perspective) packets are passing through Calren2?

Oops, cancel that -- did someone fix something?

12 20 ms 20 ms 30 ms 65.113.32.210
13 20 ms 30 ms 20 ms ucb-gw--qsv-juniper.calren2.net [128.32.0.69]
14 20 ms 30 ms 20 ms vlan196.inr-201-eva.Berkeley.EDU [128.32.0.74]
15 20 ms 30 ms 20 ms fast8-0-0.inr-210-cory.Berkeley.EDU [128.32.255.122]
16 20 ms 30 ms 30 ms GE.cory-gw.EECS.Berkeley.EDU [169.229.1.46]
17 20 ms 30 ms 20 ms 169.229.59.237
18 20 ms 30 ms 30 ms 169.229.59.241
19 20 ms 30 ms 80 ms dlp.CS.Berkeley.EDU [128.32.46.163]

Jeff Anderson-Lee
Systems Administrator, Digital Library Project
ERL/UCB

Cliff Frost

Jan 24, 2002, 12:16:54 PM
In ucb.net.discussion Jeff Anderson-Lee <jo...@dlp.cs.berkeley.edu> wrote:
> Cliff Frost <cl...@ack.Berkeley.EDU> writes:

>> Note that this is mostly a problem for off-campus sites that are
>> commercial. Traffic to other Universities in the US, Mexico, Canada,
>> and many other parts of the world remain unimpeded.

> From this note and another source I took it that traffic through
> networks that peered with Calren2 should be unaffected. However
> I find this morning that my SBC/Pacbell connection "appears" to
> go through calren2, but is VERY affected:

SBC/PacBell DSL connections typically go through a peering point. The
Internet service is provided to PacBell by Conxion and CalREN2 peers with
them. There are times when Conxion's service is, ahem, less than sterling
but these are independent of the problems we're talking about here. I
use it and am generally getting good performance overall.

> Oops, cancel that -- did someone fix something?

Yes. I think you were victim of a nasty transient problem we had this
morning on campus.

By the way, we discovered a problem (bug) that was contributing a lot
to the woes of the last couple of weeks. We fixed that last night (I
think around 7pm). Then this morning the nastiness I mentioned above
hit us and RTTs spiked waaaaay up.

As of about 30 mins ago we think things are basically fixed in a short
term sense.

I would very much appreciate hearing from the community whether or not
things are better today and over the next few days.

Thanks,
Cliff

ps I will be happy to explain the gory details of the problems I allude
to above but not right now. (I'm kind of busy.) Remind me in a few
days if I haven't gotten to it and anyone's interested.

Ryan Means

Jan 24, 2002, 12:37:53 PM
Cliff,

The latency problems seem to have disappeared entirely as far as I can
tell. Thanks! What did you guys do, and how short-term of a solution is
this?

Ryan


Cliff Frost wrote:
> By the way, we discovered a problem (bug) that was contributing a lot
> to the woes of the last couple of weeks. We fixed that last night (I
> think around 7pm). Then this morning the nastiness I mentioned above
> hit us and RTTs spiked waaaaay up.
>
> As of about 30 mins ago we think things are basically fixed in a short
> term sense.
>
> I would very much appreciate hearing from the community whether or not
> things are better today and over the next few days.
>

--
Ryan L. Means
Chief Technical Officer
School of Law (Boalt Hall)
University of California, Berkeley

John Kubiatowicz

Jan 24, 2002, 3:56:33 PM
Cliff Frost wrote:
>
> In ucb.net.discussion Marshall Perrin <mperri...@arkham.berkeley.edu> wrote:
> > Jeff Anderson-Lee <jo...@dlp.CS.Berkeley.EDU> wrote:
> >> Cliff Frost <cl...@ack.Berkeley.EDU> writes:
> >> Perhaps a slightly less slippery slope is to judge what packets are
> >> given precedence rather than exclusion. For example: ssh, imap, and
> >> kerberized telnet could conceivably be considered "higher priority"
> >> without singling out any particular other traffic. That would
> >> significantly help the telecommuters (of which I am one) and students
> >> who work from home, which seems to be the primary concern here.
>
> I do tend to prefer this type of thing in theory. In practice it may
> be pretty hard to do well. Keep the ideas coming, it's helpful.

Now that things seem to be better, the urgency here seems less high, but
I would like to add my 2 cents.

1) Can we check into the peering of both Pacbell/DSL *and* AT&T cable
with campus? These are both important access points for those of us
who work at home. Could we approach these organizations with the angle
that they would be doing their customers a service by peering more
directly with us? Perhaps there would be equipment costs to light up
dark fiber, but these might be "one-time" costs. I also wonder if a lot
of the Kazaa (etc.) traffic would be heading to these networks because
they tend to be used by individuals with home computers....

2) Can we add the IPSEC protocol (mostly IP protocol 50, but I suppose 51 as
well) to the list of important prioritized types of traffic -- if
prioritization is ever adopted. I use IPSEC to get to campus. For that
matter, the port 500 UDP traffic for key exchange would be good as well.

3) Not that I'm volunteering for a lot of time, but there is at least
one faculty (me) who is interested in the network. Technically, I am
on the EECS committee for computing as well, so there might be a bit of
tie-in.

--KUBI--
--
Professor John Kubiatowicz
Computer Science Division
673 Soda Hall, Berkeley


Paul Vojta

Jan 24, 2002, 5:39:27 PM
In article <3C507501...@cs.berkeley.edu>,
John Kubiatowicz <kubi...@cs.berkeley.edu> wrote:

>3) Not that I'm volunteering for a lot of time, but there is at least
>one faculty (me) who is interested in the network. Technically, I am
>on the EECS committee for computing as well, so there might be a bit of
>tie-in.

Me too.

One thing you could try is posting a call for volunteers to this list
(I'm reading it from ucb.sysadmin, but ucb.net.discussion would be good, too),
along with a description of the charge of the committee, what sort of
expertise is required, time commitment needed, etc.

--Paul Vojta, vojta@math

Cliff Frost

Jan 24, 2002, 7:02:14 PM
In ucb.net.discussion John Kubiatowicz <kubi...@cs.berkeley.edu> wrote:
...

> 1) Can we check into the peering of both Pacbell/DSL *and* AT&T cable
> with campus? These are both important access points for those of us

Yes, we'll work on that.

> 2) Can we add IPSEC protocol (mostly ip protocol 50, but I suppose 51 as
> well) to the list of important prioritized types of traffic -- if
> prioritization is ever adopted. I use IPSEC to get to campus. For that
> matter, the port 500 UDP traffic for key exchange would good as well.

If we go this route we'd probably include IPsec. Of course, it's only
a matter of time before the music video kiddies start using ipsec to
transfer things around. I'm told there are already versions of these
peer-to-peer filesharing programs that port-hop (specifically to get
around traffic shaping on the kazaa/gnutella/etc ports.)

> 3) Not that I'm volunteering for a lot of time, but there is at least
> one faculty (me) who is interested in the network. Technically, I am
> on the EECS committee for computing as well, so there might be a bit of
> tie-in.

Hooray!!! And a mathematician volunteered also. Great!

I don't control membership on the NAG. I'll recommend you and Prof Vojta
to the people who do.

For background, I was asked (by Paul Gray) to send a request to the
chair of the Academic Senate. I've not heard back from him, but I
haven't done a follow-up to make sure he actually received my first
memo. So I'll send him a followup and recommend you two. I don't
think the work load is enormous.

Thanks,
Cliff

Sharad Agarwal

Jan 25, 2002, 2:18:47 PM

> 1) Can we check into the peering of both Pacbell/DSL *and* AT&T cable
> with campus? These are both important access points for those of us
> who work at home. Could we approach these organizations with the angle
> that they would be doing their customers a service by peering more
> directly with us? Perhaps there would be equipment prices to light-up
> dark fiber, but these might be "one-time" costs. I also wonder if a lot
> of the Kazaa (etc.) traffic would be heading to these networks because
> they tend to be used by individuals with home computers....

That is a good point. Perhaps we can do a 'reverse' analysis too. I'll
make the wild assumption here that most KaZaA traffic comes from/to the
home computers in the dorms, as opposed to campus research machines.
If this is indeed the case (perhaps the powers that be can measure this)
why not have ResComp get their own, separate connectivity to the outside
that is rate limited? The ResComp domain can still peer with the rest
of the campus for only campus traffic.

Sharad.

Tom Holub

Jan 25, 2002, 3:04:11 PM
In article <a2sb2n$u5q$1...@agate.berkeley.edu>,
Sharad Agarwal <saga...@CS.Berkeley.EDU> wrote:
)
)That is a good point. Perhaps we can do a 'reverse' analysis too. I'll
)make the wild assumption here that most KaZaA traffic comes from/to the
)home computers in the dorms, as opposed to campus research machines.
)If this is indeed the case (perhaps the powers that be can measure this)
)why not have ResComp get their own, separate connectivity to the outside
)that is rate limited? The ResComp domain can still peer with the rest
)of the campus for only campus traffic.

The dorms already have their own separate pipe (40 megabits/sec to the
commodity net) which is paid for by dorm fees. The traffic issues we've
hit in the past few weeks have been due to on-campus traffic.

Mike Howard

Jan 26, 2002, 5:51:09 PM
to
In ucb.net.discussion Cliff Frost <cl...@ack.berkeley.edu> wrote:

> I've never seen that high. 300ms is more usual. I run into it doing
> ssh from home over SBC's DSL service.

Well, something's obviously broken right this moment. I have no
usable direct connectivity from my AT&T cable now.

Fortunately, I can access a colocated machine at Exodus, which
offers a slightly less bad route. So I was able to gather the
following.

Sat Jan 26 14:30:12 2002

It later peaked at 10 seconds. Ping times have dropped, but I'm
still seeing 25%-50% loss.

Inbound to campus:

Hostname %Loss Rcv Snt Last Best Avg Worst
1. xxx.xxx.xxx.xxx 100% 0 62 0 0 0 0
2. 12.244.97.97 0% 62 62 39 9 20 60
3. 12.244.67.65 0% 62 62 31 9 23 92
4. 12.244.72.202 0% 62 62 29 27 41 92
5. gbr5-p30.sffca.ip.att.net 0% 62 62 72 27 44 86
6. tbr1-p013501.sffca.ip.att.net 0% 62 62 109 28 47 109
7. ggr1-p340.sffca.ip.att.net 0% 62 62 67 26 40 122
8. pos6-3.core1.SanFrancisco1.Level3.n 0% 62 62 31 27 40 86
9. so-4-0-0.mp2.SanFrancisco1.Level3.n 0% 62 62 42 28 39 90
10. so-2-0-0.mp2.SanJose1.Level3.net 0% 62 62 29 28 38 92
11. gige9-0.hsipaccess1.SanJose1.Level3 0% 62 62 29 28 42 91
12. unknown.Level3.net 0% 62 62 30 28 44 89
13. ucb-gw--qsv-juniper.calren2.net 0% 62 62 85 51 166 596
14. vlan196.inr-202-doecev.Berkeley.EDU 26% 46 62 6223 132 2812 7233
15. fast9-0-0.inr-210-cory.Berkeley.EDU 31% 43 62 6207 117 3149 7281
16. GE.cory-gw.EECS.Berkeley.EDU 19% 50 61 6196 105 2605 7215
17. 169.229.59.237 25% 46 61 6462 64 2420 7288
18. 169.229.59.249 19% 49 61 6768 79 2577 7273
19. soda.CSUA.Berkeley.EDU 24% 46 61 6706 83 2709 7272


Outbound from the same machine:

Hostname %Loss Rcv Snt Last Best Avg Worst
1. gig6-1v247.snr1.CS.Berkeley.EDU 0% 64 64 0 0 5 139
2. 169.229.59.250 0% 64 64 1 1 7 144
3. 169.229.59.238 0% 64 64 68 1 7 139
4. gigE5-0-0.inr-210-cory.Berkeley.EDU 0% 64 64 18 1 10 139
5. vlan230.inr-202-doecev.Berkeley.EDU 8% 59 64 2 2 10 138
6. fast4-1-0.inr-new-666-doecev.Berkel 10% 58 64 56 6 72 206
7. qsv-juniper--ucb-gw.calren2.net 18% 52 64 4933 49 2723 7190
8. svl-edge-09.inet.qwest.net 21% 50 63 4899 66 2367 7197
9. svl-core-03.inet.qwest.net 15% 54 63 4837 44 2672 7187
10. svl-core-02.inet.qwest.net 15% 54 63 4808 52 2488 7193
11. svl-brdr-01.inet.qwest.net 13% 54 63 6688 47 2563 7202
12. 205.171.4.234 18% 52 63 5991 21 2664 7235
13. tbr1-p013302.sffca.ip.att.net 20% 51 63 6002 35 2855 7354
14. gbr5-p100.sffca.ip.att.net 21% 46 58 5876 23 2547 7388
15. gar3-p360.sffca.ip.att.net 19% 47 58 5746 64 2387 7337
16. 12.244.72.201 23% 45 58 5621 49 2312 7325
17. 12.244.67.66 16% 49 58 5613 68 2389 7281
18. 12.244.97.98 26% 43 58 5537 40 2288 7195
19. xxx-xxx-xxx-xxx.client.attbi.com 25% 43 58 6428 64 2263 7243

ken lindahl

Jan 27, 2002, 11:51:15 AM
to
In ucb.net.discussion Mike Howard <mi...@soda.csua.berkeley.edu> wrote:
> Well, something's obviously broken right this moment. I have no
> usable direct connectivity from my AT&T cable now.

an inbound DOS attack (or two attacks) targeted at two campus hosts.
the inbound packet rate was elevated by well over 10000 pps for a short
period around 14:30 yesterday. see atm1_1_0 at:

http://cricket.berkeley.edu:8885/inr-new-666-interfaces.html

the source ip addresses were all over the map, including bogus addresses.
this has several consequences:

(1) this kind of attack (lots of tiny packets with varying source or
destination addresses) is hard on routers, since they have to make
lots of forwarding decisions in a short time and the wildly varying
addresses render caching schemes useless, or even detrimental.

(2) most border routers are configured to filter packets with bogus
addresses; since this involves examining at least the ip header of
every packet, an attack like this can be debilitating for a border
router.

(3) the flow records from the router, which we can use to analyze the
attack, after the fact, are huge (>40MB for any 15 minute period)
and difficult to work with. this makes it hard to say a lot about
the attack.
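
as an aside (an illustration only, not our actual filter config): the
bogus-source filtering in (2) can be sketched with a few of the standard
"martian" ranges. real border filtering is done in router ACLs, not host
code, but the per-packet check is the same idea:

```python
import ipaddress

# Source ranges that should never appear on packets arriving from the
# outside world (RFC 1918 private space, loopback, link-local, multicast...)
MARTIANS = [ipaddress.ip_network(n) for n in (
    "0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16",
    "172.16.0.0/12", "192.168.0.0/16", "224.0.0.0/4", "240.0.0.0/4",
)]

def is_bogus_source(src):
    """True if a packet with this source address should be dropped at the border."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in MARTIANS)
```

every arriving packet's header has to be examined this way, which is why a
flood of tiny spoofed-source packets is so hard on a filtering border router.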

in this particular incident, our border router inr-new-666 took a pretty
hard hit. the large latencies shown in Mike's traceroutes are on either
side of inr-new-666 (depending on which direction the traceroute goes).

in this case the rate-limiting device did not play an active role in
the poor performance Mike reported.

ken

ken lindahl

Feb 16, 2002, 2:01:15 AM
to
On Fri, 15 Feb 2002 02:50:08 +0000 (UTC), ja...@ucdata.Berkeley.EDU (Jason Meggs) wrote:
>I've been experiencing periodic latencies of significant magnitude,
>just using an ssh connection.
>...
>These types of periodic delays have not been entirely uncommon, despite
>the noted improvement recently.

i've been monitoring latency through the packetshaper and have noted these
sporadic, _aperiodic_, delays at times when the campus is presenting more
than 70Mbps of outbound data. (the shaper is configured to limit inbound
and outbound traffic to 70Mbps in each direction.) at about 06:15 this
morning (fri 15 feb 2002), i tweaked the config in a manner that i hope
improves handling of tcp traffic. (campus traffic is predominantly tcp.)
note that traceroutes and pings are udp and icmp, not tcp, and might not
benefit from the new config.

the statistics for today look better, even though they are based on ping.
i don't have an objective measure for whether that apparent improvement
is the result of improved handling of tcp data, or the result of less data
presented by the campus. i also don't know a good way of automatically
measuring latencies in tcp traffic.
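
(for illustration only, not something CNS runs: one crude way to sample tcp
rather than icmp latency is to time the three-way handshake of an ordinary
connect. the host and port below are placeholders.)

```python
import socket
import time

def tcp_connect_latency(host, port, timeout=5.0):
    """Time a TCP three-way handshake, in seconds; a rough proxy for one
    round trip of TCP (as opposed to UDP/ICMP) latency."""
    start = time.monotonic()
    # create_connection returns only after SYN, SYN-ACK, ACK complete
    with socket.create_connection((host, port), timeout=timeout):
        return time.monotonic() - start
```

note this measures only connection setup; it says nothing about the queueing
delay an established flow sees mid-stream.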

it's become apparent that we originally became familiar with the behavior
of the packetshaper at times when the campus was not pushing the configured
rate-limit, and we now need to become familiar with its sometimes
different behaviors at times when the campus is pushing hard against the
configured rate-limit.

ken

p.s. apologies for being slow to respond. a significant fraction of CNS,
including me, is relocating from our current offices to a new location.
i've been challenged the last few days to get any real work done while
boxing up my office for the movers.

ken lindahl

Feb 16, 2002, 4:31:16 PM
to
a number of folks filed trouble tickets last night and this morning
concerning large latencies getting to campus from their home DSL
connections, especially Pac*Bell DSL. this appears to have been caused
by asymmetric routing (traffic _to_ campus uses a different path than
traffic _from_ campus), so that the packetshaper was seeing only one
side of the tcp session.

we believe we have fixed this problem around 12:45 today, by setting
an undocumented system variable in the packetshaper designed to deal
with the problem of routing asymmetry.

ken lindahl
CNS

John Kubiatowicz

Feb 18, 2002, 2:32:09 AM
to
ken lindahl wrote:
>
> a number of folks filed trouble tickets last night and this morning
> concerning large latencies getting to campus from their home DSL
> connections, especially Pac*Bell DSL. this appears to have been caused
> by asymmetric routing (traffic _to_ campus uses a different path than
> traffic _from_ campus), so that the packetshaper was seeing only one
> side of the tcp session.
>
> we believe we have fixed this problem around 12:45 today, by setting
> an undocumented system variable in the packetshaper designed to deal
> with the problem of routing asymmetry.
>
> ken lindahl
> CNS

Question: is it possible that this doesn't fix IPSEC traffic? I have
been seeing weird problems with my IPSEC gateway recently. Note that
IPSEC is neither TCP nor UDP -- it is IP protocol 50.

What is weird is that traffic gets really slow (and is dropped) for
brief periods of time, then resumes. During that time, normal TCP
traffic seems fine, and ICMP traffic (i.e. pings) seem fine as well. I
can cause problems just by increasing the traffic (i.e. reading my email
kills other traffic from my gateway).

Notice that it would appear to be the outgoing traffic (campus back out)
that is lost, since I can monitor my IPSEC tunnel from the Berkeley
gateway side.

I am receiving service from AT&T, if that matters. IPSEC traffic is
between 169.229.50.167 (at berkeley) and 12.233.33.34 (AT&T).

--KUBI--
Professor John Kubiatowicz
673 Soda Hall
Berkeley, CA 94720


Michael Sinatra

Feb 18, 2002, 8:13:04 PM
to John Kubiatowicz

On Sun, 17 Feb 2002, John Kubiatowicz wrote:

> Question: is it possible that this doesn't fix IPSEC traffic? I have
> been seeing weird problems with my IPSEC gateway recently. Note that
> IPSEC is neither TCP nor UDP -- it is IP protocol 50.
>
> What is weird is that traffic gets really slow (and is dropped) for
> brief periods of time, then resumes. During that time, normal TCP
> traffic seems fine, and ICMP traffic (i.e. pings) seem fine as well. I
> can cause problems just by increasing the traffic (i.e. reading my email
> kills other traffic from my gateway).
>
> Notice that it would appear to be the outgoing traffic (campus back out)
> that is lost, since I can monitor my IPSEC tunnel from the Berkeley
> gateway side.
>
> I am receiving service from AT&T, if that matters. IPSEC traffic is
> between 169.229.50.167 (at berkeley) and 12.233.33.34 (AT&T).

Interesting. The problem we fixed on Saturday dealt with asymmetric TCP
traffic. Since you're with ATT, the traffic shouldn't be asymmetric
through the campus borders, and IPSEC isn't TCP. I think I'll need to set
up some IPSEC tunnels between home and campus and see what happens. I am
with pacbell DSL, so it wouldn't be an exact replication of your
circumstance.

When did you start noticing the problem?

Unfortunately, I and many other CNS technical staff are in the process of
moving our offices so I haven't had much time to spend on technical
stuff...

Michael Sinatra
IST-CNS


Christopher Hylands

Feb 19, 2002, 1:29:30 AM
to

BTW - The problems that SETI@home have been having with the 70Mbps
limit was slashdotted, see
http://slashdot.org/articles/02/02/19/0031242.shtml?tid=99

The latency issues of late have been playing havoc with my ability to
get real work done while offsite and using ssh.

Using the SHIPS modem bank instead of PacBell DSL is one workaround,
but it seems like I never get better than about 36.6 kbps. I guess I
could install a modem of my own in Cory, but I would probably not get
much better than 36.6, probably more like 28.8.

I feel that it is arguable that the campus is under a DOS attack by
P2P clients. Certainly instruction and research is being hurt by
hitting the 70Mbps limit.

Is there anything that can be done?
Throwing more bandwidth at the problem only postpones the problem.

-Christopher

Christopher Hylands c...@eecs.berkeley.edu University of California
Ptolemy/Gigascale Silicon Research Center US Mail: 558 Cory Hall #1770
ph: (510)643-9841 fax:(510)642-2739 Berkeley, CA 94720-1770
home: (510)526-4010 (Office: 400A Cory)

John Kubiatowicz

Feb 19, 2002, 11:54:20 PM
to netw...@eecs.berkeley.edu, Anthony Joseph, Hua-pei Chen
Michael Sinatra wrote:

> Interesting. The problem we fixed on Saturday dealt with asymmetric TCP
> traffic. Since you're with ATT, the traffic shouldn't be asymmetric
> through the campus borders, and IPSEC isn't TCP. I think I'll need to set
> up some IPSEC tunnels between home and campus and see what happens. I am
> with pacbell DSL, so it wouldn't be an exact replication of your
> circumstance.
>
> When did you start noticing the problem?
>
> Unfortunately, I and many other CNS technical staff are in the process of
> moving our offices so I haven't had much time to spend on technical
> stuff...
>
> Michael Sinatra
> IST-CNS

Well, my traffic is not symmetric into and out of campus. I show two
traceroutes below (end of message).

The behavior that I am seeing with my IPSEC traffic is very similar to
what Anthony Joseph said he saw with the asymmetric problem. If I ping,
I get really low latencies (order 10/20ms). However, if I try to do
anything of significance (downloading a .pdf from a berkeley web site
through IPSEC), the ping times become several seconds(!!!!) and the
bandwidth that I get through IPSEC never gets above 20KB/sec. I might
as well use dialup to campus.

Is there anything that we can do? I may have to turn off IPSEC and give
up on any Windows services (since AT&T no longer passes NETBIOS traffic).

--KUBI--

Tracing route to dhcp-50-167.Millennium.Berkeley.EDU [169.229.50.167]
over a maximum of 30 hops:

  1   <10 ms   <10 ms    10 ms  kubi-home-vpngw-1.Millennium.Berkeley.EDU [169.229.50.145]
  2    10 ms    10 ms    20 ms  10.95.207.1
  3    10 ms    10 ms    10 ms  12.244.98.225
  4    10 ms    10 ms    10 ms  12.244.67.69
  5    10 ms    20 ms    10 ms  12.244.67.86
  6    10 ms    10 ms    20 ms  12.244.72.198
  7    10 ms    10 ms    20 ms  gbr1-p70.sffca.ip.att.net [12.123.13.58]
  8    40 ms    20 ms    20 ms  tbr2-p012701.sffca.ip.att.net [12.122.11.85]
  9    10 ms    10 ms    20 ms  ggr1-p3100.sffca.ip.att.net [12.122.11.230]
 10    10 ms    10 ms    20 ms  pos1-0.core1.SanFrancisco.Level3.net [166.90.50.161]
 11    10 ms    10 ms    20 ms  so-4-0-0.mp2.SanFrancisco1.Level3.net [209.247.10.233]
 12    10 ms    10 ms    20 ms  so-2-0-0.mp2.SanJose1.Level3.net [64.159.0.218]
 13    10 ms    20 ms    10 ms  gige9-1.hsipaccess1.SanJose1.Level3.net [64.159.2.103]
 14    10 ms    20 ms    20 ms  unknown.Level3.net [209.247.159.110]
 15    10 ms    20 ms    21 ms  ucb-gw--qsv-juniper.calren2.net [128.32.0.69]
 16    10 ms    20 ms    10 ms  vlan196.inr-201-eva.Berkeley.EDU [128.32.0.74]
 17    10 ms    20 ms    20 ms  tier2-p2p-A.tier2-xlr.Millennium.Berkeley.EDU [169.229.51.225]
 18    10 ms    20 ms    20 ms  bb-A.coreIII-xlr.Millennium.Berkeley.EDU [169.229.51.133]
 19    10 ms    31 ms    10 ms  bb-B.soda-bb-xlr.Millennium.Berkeley.EDU [169.229.51.162]
 20    10 ms    20 ms    20 ms  dhcp-50-167.Millennium.Berkeley.EDU [169.229.50.167]

Trace complete.

From 169.229.50.167 (the second address for kubi.cs.berkeley.edu used
for IPSEC):

traceroute to 12.233.33.1 (12.233.33.1), 30 hops max, 38 byte packets
 1  169.229.50.161 (169.229.50.161)  1.558 ms  1.511 ms  1.467 ms
 2  * * *
 3  fast4-1-0.inr-new-666-doecev.Berkeley.EDU (128.32.0.73)  0.992 ms  0.917 ms  1.068 ms
 4  qsv-juniper--ucb-gw.calren2.net (128.32.0.70)  2.731 ms  3.847 ms  2.965 ms
 5  POS1-0.hsipaccess1.SanJose1.Level3.net (209.247.159.109)  3.520 ms  3.934 ms  3.891 ms
 6  ae0-54.mp2.SanJose1.Level3.net (64.159.2.97)  3.979 ms  3.520 ms  3.479 ms
 7  so-2-0-0.mp2.SanFrancisco1.Level3.net (64.159.0.217)  5.832 ms  5.201 ms  4.957 ms
 8  pos9-0.core1.SanFrancisco1.Level3.net (209.247.10.234)  5.741 ms  4.565 ms  4.667 ms
 9  unknown.Level3.net (166.90.50.158)  4.701 ms  4.841 ms  4.974 ms
10  tbr2-p013802.sffca.ip.att.net (12.122.11.229)  6.975 ms  6.363 ms  7.245 ms
11  gbr1-p40.sffca.ip.att.net (12.122.11.82)  5.757 ms  4.920 ms  5.153 ms
12  gar1-p360.sffca.ip.att.net (12.123.13.57)  5.011 ms  5.172 ms  5.440 ms
13  12.244.72.197 (12.244.72.197)  6.215 ms  5.987 ms  6.065 ms
14  12.244.67.85 (12.244.67.85)  15.206 ms  14.700 ms  15.210 ms
15  12.244.67.70 (12.244.67.70)  15.556 ms  15.112 ms  15.004 ms
16  12.244.98.227 (12.244.98.227)  7.981 ms  *  7.749 ms


John Kubiatowicz

Feb 20, 2002, 12:11:26 AM
to netw...@eecs.berkeley.edu, Anthony Joseph
John Kubiatowicz wrote:

> Well, my traffic is not symmetric into and out of campus. I show two
> traceroutes below (end of message).

Or perhaps it is symmetric. Is the traffic-shaper on both paths of the
previous traceroutes? I don't know which interfaces are related to each
other as physical routers, so I don't really know.

--KUBI--


ken lindahl

Feb 20, 2002, 2:02:44 AM
to
On Tue, 19 Feb 2002 20:54:20 -0800, John Kubiatowicz <kubi...@cs.berkeley.edu> wrote:
>Well, my traffic is not symmetric into and out of campus. I show two
>traceroutes below (end of message).

it is symmetric with respect to the packetshaper; that is, both directions
are passing through the packetshaper. i realize this is not obvious, but:

> 16    10 ms    20 ms    10 ms  vlan196.inr-201-eva.Berkeley.EDU [128.32.0.74]

and


>  3  fast4-1-0.inr-new-666-doecev.Berkeley.EDU (128.32.0.73)  0.992 ms  0.917 ms

are the two ends of the same physical link, and that is the link with the
packetshaper.

i've made a small change to the packetshaper configuration; can you tell
any difference?

ken

Cliff Frost

Feb 20, 2002, 9:28:04 AM
to
In ucb.net.discussion John Kubiatowicz <kubi...@cs.berkeley.edu> wrote:
...

> The behavior that I am seeing with my IPSEC traffic is very similar to


> what Anthony Joseph said he saw with the asymmetric problem. If I ping,
> I get really low latencies (order 10/20ms). However, if I try to do
> anything of significance (downloading a .pdf from a berkeley web site
> through IPSEC), the ping times become several seconds(!!!!) and the
> bandwidth that I get through IPSEC never gets above 20KB/sec. I might
> as well use dialup to campus.

This sure looks like it has something to do with the packets themselves,
maybe something like the size of the packets. But I just tried a few
pings of your home machine using 1K and 1.5K packets and they seemed
just fine.

It doesn't seem to me it can be something targeted at IPSEC, since the
pings and traceroute are also inside IPSEC.

Could it be data dependent, ie content of packet dependent? It's been
years since I saw any networking gear that was susceptible to data pattern
dependent errors but it could be happening.

Testing for it easily might mean dredging up old versions of ping and
traceroute that support the "-x" option.

> Is there anything that we can do?

Keep on trouble-shooting (all of us). Your setup ought to work fine. Is
it time of day dependent, or constant?

Thanks,
Cliff

John Kubiatowicz

Feb 20, 2002, 11:38:56 AM
to Cliff Frost
Cliff Frost wrote:

> This sure looks like it has something to do with the packets themselves,
> maybe something like the size of the packets. But I just tried a few
> pings of your home machine using 1K and 1.5K packets and they seemed
> just fine.
>
> It doesn't seem to me it can be something targeted at IPSEC, since the
> pings and traceroute are also inside IPSEC.

So, the traceroute that I showed you is outside of IPSEC. The pings
that I talked about are inside IPSEC, but I don't know if I gave you my
home machine name for you to test (and probably don't want to in this
venue -- I could offline).

Note that it *IS* associated with IPSEC. When things were bad the other
day, I had two pings running, one through the link and one to the
interface on the gateway machine at Berkeley. When trying to use the
link, the IPSEC ping times went through the roof (3000+ ms), while the
IP times (ICMP) stayed low. This seems pretty protocol specific.

Note that in discussing the problem with Anthony Joseph, we suddenly
began to wonder what it meant to try to packet shape IPSEC, since there
are no outside-visible ACKS to play with in the packet shaper. Do we
know what it tries to do with these types of packets (in particular,
protocol 50, encrypted payload packets)?



> Could it be data dependent, ie content of packet dependent? It's been
> years since I saw any networking gear that was susceptible to data pattern
> dependent errors but it could be happening.

So, this setup worked fine until a week (or at most 2 weeks) ago. It is
not data dependent; it may be packet-size dependent.

Actually, something seems better at the moment. I am getting 110KB/sec
through the link. This is still not the full bandwidth that I used to
get, but it is certainly better than 10/20 kB/sec. Also, simultaneous
ping times through the link do not seem to become 1000+ms.

>
> Keep on trouble-shooting (all of us). Your setup ought to work fine. Is
> it time of day dependent, or constant?

So, I guess it is not constant, since it is better at the moment.

--KUBI--


ken lindahl

Feb 20, 2002, 12:45:12 PM
to
On Wed, 20 Feb 2002 08:38:56 -0800, John Kubiatowicz <kubi...@cs.berkeley.edu> wrote:

>Actually, something seems better at the moment. I am getting 110KB/sec
>through the link. This is still not the full bandwidth that I used to
>get, but it is certainly better than 10/20 kB/sec. Also, simultaneous
>ping times through the link do not seem to become 1000+ms.

you apparently missed my post last night:

On Wed, 20 Feb 2002 07:02:44 GMT, lin...@uclink.berkeley.edu (ken lindahl) wrote:
>i've made a small change to the packetshaper configuration; can you tell
>any difference?
>
>ken

i'll conclude that the change did make a difference.

fwiw, 110KB/sec is pretty good relative to what other folks using non-IPSEC
applications report. i think you're all bumping up against the overall outbound
rate-limit at this point.

as i said earlier, we are learning how this device behaves when traffic is
constantly pushing the rate-limit. i am planning a pretty much complete
overhaul of the packetshaper config based on what we've learned in the past
2 weeks. hopefully, we'll squeeze a few more Mb/s out of it.
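
(aside, for the curious, and not a description of the packetshaper's
internals: rate-limiters of this general kind are often built on a token
bucket, which enforces an average rate while allowing short bursts. a
minimal sketch, with made-up numbers:)

```python
class TokenBucket:
    """Allow bursts up to `capacity` bytes while enforcing an average of
    `rate` bytes/second."""
    def __init__(self, rate, capacity):
        self.rate = rate            # refill rate, bytes per second
        self.capacity = capacity    # maximum burst size, bytes
        self.tokens = capacity
        self.last = 0.0

    def allow(self, size, now):
        # Refill tokens for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False    # over the limit: queue or drop the packet
```

when the offered load sits constantly at or above the configured rate, the
bucket is almost always empty, which is exactly the regime where a shaper's
behavior differs from its lightly loaded behavior.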

ken

Nicholas Weaver

Feb 20, 2002, 1:01:37 PM
to
In article <3c73debd...@news.berkeley.edu>,
ken lindahl <lin...@uclink.berkeley.edu> wrote:

>fwiw, 110KB/sec is pretty good relative to what other folks using non-IPSEC
>applications report. i think you're all bumping up against the overall outbound
>rate-limit at this point.

Definitely, I see about 10 KB/s downloading large files from campus ->
home (AT&T Cable modem).


Also, asymmetric routes between two destinations are common these
days: It is to each backbone provider's benefit to shift the data to
the destination's backbone provider as soon as possible, for economic
reasons.

E.g., given this topology, with A's and B's providers having two peering
points, one "close" to A, one "close" to B.

A <-> (A's provider, long distance)
 ^                 ^
 |                 |
 v                 v
(B's provider, long distance) <-> B

B's provider is going to shove packets destined for A onto A's provider
ASAP, and likewise the other way, so the flow would be like

A <-> (A's provider, long distance)
 |                 ^
 |                 |
 v                 |
(B's provider, long distance) <-> B

Voila, asymmetric routing due to economic pressures.
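
The "hand off as soon as possible" policy is usually called hot-potato
routing. A toy sketch of why the two directions of one flow end up on
different long-haul paths (the names are illustrative; real BGP picks the
nearest exit via IGP costs):

```python
def nearest_exit(src_end):
    """Hot potato: the sending provider exits at the peering point closest
    to where the traffic entered its network, so the *other* provider
    carries the long-haul leg."""
    return "peering_near_" + src_end

# Forward and reverse directions of the same flow:
a_to_b = ["A", "A's provider", nearest_exit("A"), "B's provider (long haul)", "B"]
b_to_a = ["B", "B's provider", nearest_exit("B"), "A's provider (long haul)", "A"]
```

The two directions cross the provider boundary at different peering points,
i.e. the route is asymmetric, even though each provider is acting rationally.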
--
Nicholas C. Weaver nwe...@cs.berkeley.edu

John Kubiatowicz

Feb 20, 2002, 1:39:52 PM
to lin...@uclink.berkeley.edu
ken lindahl wrote:
>
> On Wed, 20 Feb 2002 08:38:56 -0800, John Kubiatowicz <kubi...@cs.berkeley.edu> wrote:
>
> >Actually, something seems better at the moment. I am getting 110KB/sec
> >through the link. This is still not the full bandwidth that I used to
> >get, but it is certainly better than 10/20 kB/sec. Also, simultaneous
> >ping times through the link do not seem to become 1000+ms.
>
> you aparently missed my post last night:
>
> On Wed, 20 Feb 2002 07:02:44 GMT, lin...@uclink.berkeley.edu (ken lindahl) wrote:
> >i've made a small change to the packetshaper configuration; can you tell
> >any difference?
> >
> >ken
>
> i'll conclude that the change did make a difference.

Note that there is something that still doesn't work. I used to be able
to route through campus to get to acm.org (to use the digital library
with Berkeley permissions). This involves a tunnel from home to campus,
with requests going out from there.

This doesn't work presently -- attempts to access the digital library
site (portal.acm.org) lock up or are very slow. When I play with the
MTU (by lowering it), I can sometimes make more progress than before.
This is weird. I haven't had time to debug this further -- I am
hampered a bit by the fact that that site rejects pings and traceroute
traffic, and the fact that my IPSEC configuration rejects random traffic
from outside Berkeley destined for home. Only specifically configured
addresses/subnets get through.

So, I still think that there is some size-related problem/weirdness
going on. In my copious spare time (ack!), I will try upgrading the
version of freeSWAN that I am running to see if this helps.
Unfortunately, that involves a kernel reconfiguration/rebuild, so that
may not happen quickly.


ken lindahl

Feb 21, 2002, 1:58:00 AM
to
On Wed, 20 Feb 2002 18:01:37 +0000 (UTC), nwe...@CSUA.Berkeley.EDU (Nicholas Weaver) wrote:

>A <-> (A's provider, long distance)
> |                 ^
> |                 |
> v                 |
>(B's provider, long distance) <-> B
>
>Voila, asymmetric routing due to economic pressures.

yes but that doesn't apply to the situation at hand. in our situation,
letting B represent UCB, the drawing looks like:

A <-> (A's provider, long distance)
 |                 ^
 |                 |
 v                 |
(B's provider: CalREN2 ISP ) <-> B(border router)
                                       ^
                                       |
                                       v
                                  packetshaper
                                       ^
                                       |
                                       v
                                 campus network

the asymmetry that you describe is invisible to that packetshaper.

the routing asymmetry that can affect the packetshaper is more complicated
and i don't think i'm capable of representing it in an ascii figure.
but in addition to the above, A's provider might be at one of the IX's
(PAIX, LAAP, SDNAP) and might peer with calren2 there. the IX peering
data is carried to campus via a different path inside calren2 and
a different connection to campus. kind of like this:

A <-> (A's provider, long distance)
      /|                 ^
     / |                 |
    /  v                 |
   |  (B's provider: CalREN2 ISP ) <-> B(border router)
   v                                        ^
(CalREN2 non-ISP) <-> B(border)             |
                       (router)             v
                          ^            packetshaper
                          |                 ^
                          |                 |
                          v                 v
                         campus network

so there are two paths into campus, only one of which goes through the
packetshaper. the calren2 non-isp path is the one that carries IX traffic
as well as internet2 traffic; the cost is not usage-sensitive, so we do
not want to shape the traffic on that path. and to complicate matters,
note the two paths from A's provider toward B: that choice is entirely
up to A's provider, and there is no requirement that the same choice be
made for all parts of campus. thus, A's provider can route to 128.32/16
addresses via the calren2 isp service but to 169.229/16 via the non-ISP
path. and they can change that decision whenever they wish, without
notification to us. ugh.
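
(to make that last point concrete: the per-prefix choice amounts to a
longest-prefix-match table on A's provider's side. a sketch using the two
campus /16s mentioned above; the path labels and the default route are
illustrative, not A's provider's real table.)

```python
import ipaddress

# A's provider's hypothetical forwarding choices toward campus.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"):      "default transit",
    ipaddress.ip_network("128.32.0.0/16"):  "calren2 ISP path (through the packetshaper)",
    ipaddress.ip_network("169.229.0.0/16"): "calren2 non-ISP path (bypasses the packetshaper)",
}

def path_to(dest):
    """Longest-prefix match: the most specific matching route wins."""
    addr = ipaddress.ip_address(dest)
    best = max((net for net in ROUTES if addr in net), key=lambda n: n.prefixlen)
    return ROUTES[best]
```

so two campus hosts on different /16s can legitimately see different inbound
paths, and the table can change whenever A's provider feels like it.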

however, fwiw, ATT has so far declined to peer with calren2 at the
IXs, so this kind of asymmetry is not a concern (nor a benefit) to
ATT home users. and finally, i looked at the MAC addresses in kubi's
packets and confirmed beyond any doubt that both directions go through
the packetshaper.

ken

John Kubiatowicz

Feb 23, 2002, 3:16:16 AM
to
ken lindahl wrote:
>
> i'll conclude that the change did make a difference.
>
> fwiw, 110KB/sec is pretty good relative to what other folks using non-IPSEC
> applications report. i think you're all bumping up against the overall outbound
> rate-limit at this point.
>
> as i said earlier, we are learning how this device behaves when traffic is
> constantly pushing the rate-limit. i am planning a pretty much complete
> overhaul of the packetshaper config based on what we've learned in the past
> 2 weeks. hopefully, we'll squeeze a few more Mb/s out of it.
>
> ken

Ok. I have installed a new version of freeSWAN at Berkeley. This has
(they claim) taken care of the fragmentation problems that the previous
version had.

It fixes that one remaining weirdness! I can now route through campus
again through my IPSEC gateway.


Cliff Frost

Feb 23, 2002, 2:10:44 PM
to
In ucb.net.discussion John Kubiatowicz <kubi...@cs.berkeley.edu> wrote:

...

> Ok. I have installed a new version of freeSWAN at Berkeley. This has
> (they claim) taken care of the fragmentaton problems that the previous
> version had.

> It fixes that one remaining weirdness! I can now route through campus
> again through my IPSEC gateway.

Wonderful.

Just so I make sure I understand what you did. The "fragmentation" problems
you mentioned, do they refer to fragmentation in the sense of IP packet
fragmentation?

It would make sense to me that our packetmangler, er packetshaper might get
confused by fragmented packets, especially if they were not fragmented
properly--that would be the sort of pathological case testers might not
think of trying out.

Thanks,
Cliff

John Kubiatowicz

Feb 23, 2002, 7:12:00 PM
to Cliff Frost
Cliff Frost wrote:

> Just so I make sure I understand what you did. The "fragmentation" problems
> you mentioned, do they refer to fragmentation in the sense of IP packet
> fragmentation?
>
> It would make sense to me that our packetmangler, er packetshaper might get
> confused by fragmented packets, especially if they were not fragmented
> properly--that would be the sort of pathological case testers might not
> think of trying out.
>
> Thanks,
> Cliff

Well, I had previously installed a version of FreeSWAN (the IPSEC software
for Linux) that had problems negotiating MTUs. IPSEC packets are (1)
marked with the "Don't fragment" bit and (2) tend to grow a bit over the
native IP packets (since there is header stuff added). I can't really
say what was going on, but something about that web site
(portal.acm.org) was getting confused somehow. Perhaps it was producing
packets that were too big for FreeSWAN to handle and it couldn't
fragment them, or perhaps FreeSWAN was producing packets that were too
big for the ACM. Or.....?

Some history: I had gone from FreeSWAN version 1.91 => 1.92 somewhat
concurrently with the AT&T AtHome snafu. I have been slowly bringing my
infrastructure back up to functioning since then. To make 1.92 work
(even to allow me to get from home to campus!), I was forced to peg the
maximum MTU used by freeSWAN to 1400. This was a "known" bug with 1.92.
However, this still didn't seem to allow me to route through campus
(using my campus addresses for outside sites). If I lowered the MTU
further, I could sometimes get packets through to ACM.ORG, but only
nondeterministically.

With version 1.95 (that I installed last night), I no longer have to set
the MTU, and everything seems to work again. [However, with the MTU
pegged, things still don't work, so there is definitely some weirdness
with packet sizes or fragmentation].
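
The arithmetic is consistent with ESP tunnel-mode overhead. A rough
back-of-the-envelope sketch (assuming an 8-byte-block cipher such as
3DES-CBC with a 12-byte truncated HMAC; that cipher suite is a guess at
this configuration, not a confirmed FreeSWAN figure):

```python
def esp_tunnel_size(inner_len, block=8, iv=8, icv=12):
    """Size on the wire of an inner IP packet after ESP tunnel encapsulation.

    Overhead = outer IP header (20) + ESP header (SPI + sequence, 8) + IV
    + padding + pad-length/next-header bytes + integrity check value.
    """
    # ESP trailer: padding plus pad-length and next-header bytes, padded so
    # the encrypted portion is a multiple of the cipher block size.
    padded = ((inner_len + 2 + block - 1) // block) * block
    return 20 + 8 + iv + padded + icv
```

Under those assumptions a full-size 1500-byte inner packet grows past a
1500-byte path MTU (and with DF set it can't be fragmented), while capping
the inner MTU around 1400 keeps the encapsulated packet under 1500, which
matches the workaround of pegging FreeSWAN's MTU.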
