
Dropped packets via ISR


amat...@gmail.com

Aug 23, 2006, 9:46:00 AM
SETUP
Cisco 2811 ISR on 4mbit metro ethernet.
~200 users

Problem
Web pages are coming up either
a) perfectly
b) half mangled with some images and screwed up tables or
c) not at all

There is no pattern as to when or why a page might not come up. Most of
this happens at lunchtime while people sit at their desks and browse.
More usage = more of this problem. Tried cranking up the bandwidth to
10mbit and still had the same problem.

Example:
This is what a webpage might look like:
http://129.21.125.13/~adam/gg/screenshot_corresponding_w_capture.JPG

This is what the capture looks like:
http://129.21.125.13/~adam/gg/msn_ss_part1.JPG
http://129.21.125.13/~adam/gg/msn_ss_part2.JPG

I realize that retransmissions are normal, and this is what a normal
loss/retransmit should look like (taken from my home and office
connection):
http://129.21.125.13/~adam/gg/good_packet_loss.JPG

Notice the 'Continuation packets' in the good packet loss image. I
don't get those on the problem network.

I've troubleshot this down to the ISR and provider uplink. The
provider's NOC said that everything on the line looked good,
and we don't have the large number of dropped packets on the ISR that
one might expect. Any ideas? I've been working on this for weeks.

Thanks,
Adam

Bo...@hotmail.co.uk

Aug 23, 2006, 1:21:39 PM

amat...@gmail.com wrote:
> SETUP
> Cisco 2811 ISR on 4mbit metro ethernet.
> ~200 users
>
> Problem
> Web pages are coming up either
> a) perfectly
> b) half mangled with some images and screwed up tables or
> c) not at all
>
> There is no pattern as to when or why a page might not come up. Most of
> this happens at lunchtime while people sit at their desks and browse.
> More usage = more of

> I realize that retransmissions are normal, and this is what a normal


> loss/retransmit should look like(taken from my home and office

> I've troubleshot this down to the ISR and provider uplink. The


> provider from their NOC said that everythign on the line looked good
> and we don't have a ton of dropped packets on the ISR which one might
> expect. Any ideas? I've been working on this for weeks.

What do you get from more basic tools:-

ping, traceroute.

The packet loss rate shown in the traces is not
normal, i.e. about 1 in 10 or 20.

What I would do is:-


1 - check the interface stats on all network kit in the path
that you can reach. Post the output here if you wish.
sh int

2 - Use traceroute with a lot of repetitions for a long
time to verify each hop in the path to the internet.
Pingplotter makes this a breeze.

If you use pingplotter set the repetition rate to 1sec and
leave it running over a failure.

Post results.

3 - Check that you are not having an MTU issue. That
would be worth ruling out.

If that does not lead anywhere then we can have a look
at the Ethereal dumps. 'cos that takes more brainpower.
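
For reference, the three checks above could look roughly like this from the router's exec prompt. This is only a sketch: FastEthernet0/0 is assumed to be the WAN-facing interface, 129.21.125.13 is simply the host from the screenshot links earlier in the thread, and the size and repeat values are illustrative.

```
! 1 - interface counters: look at input/output errors, CRC, collisions
show interfaces FastEthernet0/0

! 2 - hop-by-hop path check to an outside host
traceroute 129.21.125.13

! 3 - MTU check: full-size packets with the don't-fragment bit set
ping 129.21.125.13 size 1500 df-bit repeat 100
```

If the full-size ping with df-bit fails while a smaller size succeeds, something in the path probably has a smaller MTU than 1500.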

amat...@gmail.com

Aug 23, 2006, 10:16:04 PM
Bod,
Thanks.

1&2. We have ICMP disabled through the ISR but ping and tracert from
the actual ISR router have been good. Confirmed this with Cisco
support. I might be on-site tomorrow so I might take out that
'feature' in the ruleset and test some things...

3. I had another person mention MTU size. I tried changing the value
but got this error:

% Interface FastEthernet0/1 does not support user settable mtu.

I have a pretty good feeling it might be the MTU size as this client is
on a metro ethernet...the vlan info might be going over the
pre-configured 1500 mtu size and thus creating the resets from the web
servers. Any comments/ideas?

Bo...@hotmail.co.uk

Aug 24, 2006, 6:53:15 AM

Doesn't really look like MTU to me, without delving into the
ethereal dumps. Could be I guess.

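If ICMP is blocked entirely then Path MTU Discovery (PMTUD) will not
work, because it depends on ICMP "fragmentation needed" messages
getting back to the sender.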
http://www.cisco.com/en/US/tech/tk827/tk369/technologies_white_paper09186a00800d6979.shtml

Blindly turning off ICMP echo seems to me to be a /very bad idea/.
I ALWAYS leave it on at least for selected hosts so that
the troubleshooting tools that are part of the
internet can be used.

ping www.cisco.com
ping www.google.com

Do they turn it off? Cisco do appear to rate limit it.
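
If the worry is security, one option is to permit only the ICMP message types that troubleshooting and PMTUD depend on, rather than all ICMP. A rough sketch, assuming the inbound filter on the outside interface is a numbered extended ACL (the number 110 below is a placeholder, and the any/any matching is deliberately loose for illustration):

```
! Hypothetical additions to the inbound ACL on the outside interface.
! New entries are appended at the end of a numbered ACL, so they only
! help if no earlier line already denies the traffic.
!
! "unreachable" covers "fragmentation needed", which PMTUD relies on
access-list 110 permit icmp any any unreachable
! needed for traceroute replies
access-list 110 permit icmp any any time-exceeded
! needed for replies to outbound pings
access-list 110 permit icmp any any echo-reply
```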


You probably have a duplex mismatch.
As discussed, check sh int.

Look for input errors and output errors.
Unexplained errors are almost certainly
duplex mismatches.
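
A quick way to test the duplex theory is to hard-set both ends to the same speed and duplex, clear the counters, and watch whether the errors keep climbing. A sketch, assuming the WAN port is FastEthernet0/0 and the provider side is fixed at 10 Mb/s full duplex (both are assumptions to confirm with the provider first):

```
configure terminal
 interface FastEthernet0/0
  speed 10
  duplex full
 end
clear counters FastEthernet0/0
! after some traffic has passed:
show interfaces FastEthernet0/0 | include duplex|CRC|collision
```

Late collisions or CRC errors that keep incrementing on a port set to full duplex usually point to the other end running half duplex or autonegotiating.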

Merv

Aug 24, 2006, 8:34:51 AM

Are you using CBAC ( inspect commands )?

Post show version and config

amat...@gmail.com

Aug 24, 2006, 9:48:07 AM
Bod,
FYI - I am inheriting this infrastructure/problem and configurations

1. BUT! I did allow through ICMP just now through the 2811 and did a
ping to an external host, no problem...

Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=33ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=33ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55
Reply from 129.21.125.13: bytes=32 time=32ms TTL=55

2. I also used ping plotter...

Target Name: www.google.com
IP: 66.102.7.147
Date/Time: 8/24/2006 9:22:54 AM

1 1 ms 1 ms [10.1.1.4]
2 3 ms 2 ms host.static.twtelecom.net [XX.XX.XX.XXX]
3 3 ms 2 ms dist-01-ge-3-0-0-510.roch.twtelecom.net [66.192.240.144]
4 15 ms 15 ms core-01-so-5-1-0-0.chcg.twtelecom.net [66.192.244.62]
5 15 ms 16 ms peer-02-so-0-0-0-0.chcg.twtelecom.net [66.192.244.20]
6 15 ms 60 ms [66.192.252.90]
7 15 ms 15 ms [216.239.46.5]
8 84 ms 70 ms [66.249.95.215]
9 70 ms 70 ms [72.14.233.129]
10 -32764 ms 72 ms [216.239.49.54]
11 -32764 ms 71 ms [216.239.49.66]
12 69 ms 69 ms [66.102.7.147]

Had no problems using ping plotter against various hosts...

3. Tried changing from full duplex to half duplex to auto and that did
not resolve anything. Running at half duplex gave me more
'continuation' packets and less reset packets but we still had the
problem. If anything, half duplex slowed things down (as you would
expect). Here is an email I just got from our provider...

"1) Interface MTU at 1500 is fine. There is no VLAN tagging occurring
between your interface and mine, so MTU issues here are moot.

2) 10MB, Full-duplex

As a general FYI, 95% of our reported throughput/latency issues are
fixed when configs related to #2 are corrected.
Let us know if we can be of further assistance."

4. There are not too many errors on the interfaces (<1000) which is
also confusing. If there was some glaring problem I would expect to see
a huge number there, Cisco support agreed.

5. show ver
Cisco IOS Software, 2800 Software (C2800NM-ADVSECURITYK9-M), Version 12.4(5), RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2005 by Cisco Systems, Inc.
Compiled Tue 01-Nov-05 00:52 by alnguyen

ROM: System Bootstrap, Version 12.3(8r)T7, RELEASE SOFTWARE (fc1)
ROM: Cisco IOS Software, 2800 Software (C2800NM-ADVSECURITYK9-M), Version 12.4(3a), RELEASE SOFTWARE (fc2)

ISR-ROC-001 uptime is 31 weeks, 2 days, 11 hours, 41 minutes
System returned to ROM by Reload Command at 20:51:25 EST Mon Jan 16 2006
System image file is "flash:c2800nm-advsecurityk9-mz.124-5.bin"

Cisco 2811 (revision 53.51) with 249856K/12288K bytes of memory.
Processor board ID FTX0949C2X7
6 FastEthernet interfaces
1 Virtual Private Network (VPN) Module
DRAM configuration is 64 bits wide with parity enabled.
239K bytes of non-volatile configuration memory.
62720K bytes of ATA CompactFlash (Read/Write)

Configuration register is 0x2102

6. inspect rules
They are using a few inspect rules which are the following:

ISR-ROC-001#show run | include inspect
ip inspect name HOST-FW ftp
ip inspect name HOST-FW tcp timeout 3600
ip inspect name HOST-FW udp timeout 15
ip inspect HOST-FW out

7. Interface
ISR-ROC-001#show int fastEthernet 0/0
FastEthernet0/0 is up, line protocol is up
Hardware is MV96340 Ethernet, address is 0015.fa2f.74e8 (bia 0015.fa2f.74e8)
Description: Outside Interface WAN$FW_OUTSIDE$$ETH-WAN$
Internet address is 66.162.190.6/30
MTU 1500 bytes, BW 10000 Kbit, DLY 1000 usec,
reliability 255/255, txload 9/255, rxload 19/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 10Mb/s, 100BaseTX/FX
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters 1d00h
Input queue: 0/75/39/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 760000 bits/sec, 154 packets/sec
5 minute output rate 381000 bits/sec, 148 packets/sec
9315729 packets input, 1632655839 bytes
Received 74 broadcasts, 7 runts, 0 giants, 3 throttles
35 input errors, 28 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog
0 input packets with dribble condition detected
9049297 packets output, 2645884968 bytes, 0 underruns
2435 output errors, 6488 collisions, 4 interface resets
0 babbles, 2435 late collision, 0 deferred
0 lost carrier, 0 no carrier
0 output buffer failures, 0 output buffers swapped out

8. I can't blindly post the entire conf with access rules, etc but here
is most of it...
---------------
version 12.4
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname ISR-ROC-001
!
boot-start-marker
boot system flash c2800nm-advsecurityk9-mz.124-5.bin
boot system flash
boot-end-marker
!
logging buffered 51200 debugging
logging console critical
!
no aaa new-model
!
resource policy
!
clock timezone EST -5
clock summer-time EDT recurring
!
!
no ip cef
ip inspect name HOST-FW ftp
ip inspect name HOST-FW tcp timeout 3600
ip inspect name HOST-FW udp timeout 15
!
!
ip domain name clientname.com

!
interface Loopback1
ip address XX.XX.XX.XXX 255.255.255.255
ip nat inside
ip virtual-reassembly max-fragments 16 max-reassemblies 64
!
interface Loopback2
ip address 1.1.1.2 255.255.255.252
ip nat inside
ip virtual-reassembly
!
interface FastEthernet0/0
description Outside Interface WAN$FW_OUTSIDE$$ETH-WAN$
ip address XX.XX.XX.XXX 255.255.255.252
ip access-group 110 in
ip inspect HOST-FW out
ip nat outside
ip virtual-reassembly max-fragments 16 max-reassemblies 64
duplex auto
speed 10
!
interface FastEthernet0/1
description Inside Interface LAN$FW_INSIDE$$ETH-LAN$
ip address XX.XX.XX.XX 255.255.0.0
ip nat inside
ip virtual-reassembly max-fragments 16 max-reassemblies 64
duplex auto
speed auto
!
interface FastEthernet0/0/0
duplex full
speed 100
!
interface FastEthernet0/0/1
speed 100
!
interface FastEthernet0/0/2
duplex full
speed 100
!
interface FastEthernet0/0/3
duplex full
speed 100
!
interface Vlan1
description DMZ Interface - Protected
ip address XX.XX.XX.XXX 255.255.0.0
ip access-group 105 in
ip nat inside
ip virtual-reassembly max-fragments 16 max-reassemblies 64
!
!
route-map nonat permit 10
match ip address 180
set ip next-hop 1.1.1.1
!
!
!
control-plane
!
!
banner motd

!
line con 0
login local
line aux 0
line vty 0 4
privilege level 15
login local
transport input telnet ssh
line vty 5 15
privilege level 15
login local
transport input telnet ssh
!
scheduler allocate 20000 1000
!
end
----------------------

Let me know what you think.

Merv

Aug 24, 2006, 2:44:51 PM

1. You are using 12.4(5), which is a deferred image (read: junked),
so you should move off this image.


2. Is there any particular reason for using IOS 12.4 ?
If not I would downgrade to latest IOS 12.3 to see if your problem
persists


3. Why is CEF disabled? (no ip cef)


4. Disable console logging ( no logging console )
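
Points 3 and 4 amount to two global configuration lines. A minimal sketch; it is not known whether CEF was turned off deliberately on this router, so this is best treated as something to try in a quiet period:

```
configure terminal
 ip cef
 no logging console
 end
! confirm CEF is running
show ip cef summary
```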

amat...@gmail.com

Aug 25, 2006, 8:28:50 AM
Merv,
Thanks. Cisco also just recommended that I enable ip cef. I have done
that and a few other things per their recommendations:

memory-size iomem 25

Made some rule modifications and took off virtual reassembly. Waiting
to see how it goes today, it is Friday so things are light but we'll
see.

Thanks, and I'll post back when I get some feedback.

amat...@gmail.com

Aug 29, 2006, 11:10:49 AM
Made some headway. I'm posting this for the benefit of anyone searching
in the future, and in case anyone has further insight. After more
digging through the router, it turns out that the 'ip inspect'
configuration seems to be at fault: the maximum number of half-open
connections is reached and then traffic is dropped. See
below:

--
ISR-ROC-001#show ip inspect statistics
Packet inspection statistics [process switch:fast switch]
tcp packets: [12898:471751]
udp packets: [66884:247203]
ftp packets: [149:0]
Interfaces configured for inspection 1
Session creations since subsystem startup or last reset 18515
Current session counts (estab/half-open/terminating) [212:2:0]
Maxever session counts (estab/half-open/terminating) [331:40:17]
Last session created 00:00:00
Last statistic reset never
Last session creation rate 436
Last half-open session total 2
Half-open session count or session creation rate exceeded
--
The key here is obviously the last line. Cisco is recommending that I
raise the max-incomplete value from the default (40) to 150. Any ideas or
insight on this?
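
Before changing anything, it may help to see which thresholds the router is actually using and what the sessions look like. A minimal sketch using the standard CBAC show commands:

```
! thresholds currently in effect (one-minute and max-incomplete)
show ip inspect config
! running counters, including the "exceeded" indicator
show ip inspect statistics
! current inspected sessions and their states
show ip inspect sessions
```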

Thanks,
Adam

Igor Mamuzic

Aug 30, 2006, 4:49:50 AM
Yeah, I have seen such problems and dealt with them by raising the
max-incomplete values from the defaults to:
one-minute (sampling period) thresholds are [10000:27000] connections
max-incomplete sessions thresholds are [10000:27000]

But before doing this you should check how many active NAT translations
you have while you are experiencing the problems with web sites. I had a
lot of active translations (about 3000), because I don't have many
outbound things (p2p, etc.) banned, and maybe some worms are operating in
the network and trying to access the Net, which raises the number of
active NAT translations.
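
The NAT table can be checked from the exec prompt while the problem is happening. A minimal sketch using the standard show commands:

```
! summary, including the total number of active translations
show ip nat statistics
! full table; can be long on a busy router
show ip nat translations
```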

B.R.
Igor


"amat...@layer8group.com" <amat...@gmail.com> wrote in message
news:1156864249.4...@74g2000cwt.googlegroups.com...

amat...@gmail.com

Aug 30, 2006, 1:07:08 PM
Fixed the problem. We were getting the "Half-open session count or
session creation rate exceeded" message in the ip inspect statistics.
I raised the one-minute max and min (high/low) values above 500 and 400
respectively.
Now we are all set.
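
For anyone landing here from a search, the change would look something like the following. The thread only says the values were raised above 500/400, so the numbers below are illustrative assumptions, not the ones actually used:

```
configure terminal
 ! session creation rate thresholds (new sessions per minute)
 ip inspect one-minute high 1000
 ip inspect one-minute low 800
 ! optionally, the half-open session thresholds discussed earlier
 ip inspect max-incomplete high 1000
 ip inspect max-incomplete low 800
 end
! confirm the new thresholds
show ip inspect config
```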

Thanks,
Adam
