
Memory problem on STi w/PRI ?


Gabriele Delconti

Dec 21, 1997, 3:00:00 AM

Our WAN is based on one NetBlazer STi w/PRI at the central site and 22
LSi units connected to the central site through ISDN lines.

A couple of hours after a reboot, the STi stops processing logins from
the other sites. All incoming calls become locked at the login phase, as
shown by the "line list" command.
Telnet sessions are also rejected, while outgoing calls are processed
normally.
After another couple of hours, usually in the morning when the routed
traffic increases, the STi reboots by itself.

These logs were taken from the console:

### Results below were obtained immediately after rebooting (system
working OK)

kpmg-mi:Top>Configure>Debug> mem
Memory usage status:
text start 8000 (901120), data start e4000, bss start f34bc, end 10350c
heap base 103510 size 7187184 avail 5643288 (78%) low mark 5228352 (72%)
allocs 2291 frees 738 (diff 1553) mfails 0 garbage 0 splits 177
mbuf: allocs 1096 frees 1084 (diff 12) dups 0 fails 0
small mbuf size: 140 large mbuf size: 1514 (1514)
large queue: dequeued 246 queued 346 (diff 100)
small queue: dequeued 897 queued 1097 (diff 200)
free queue: dequeued 0 queued 0 (diff 0)
1077 strings allocated, 78 too big (for zone), 67 freed, length changed 0
kpmg-mi:Top>Configure>Debug> mem max
5118660


### Results below were obtained while the login problem was present

kpmg-mi:Top>Configure>Debug> mem
Memory usage status:
text start 8000 (901120), data start e4000, bss start f34bc, end 10350c
heap base 103510 size 7187184 avail 5485752 (76%) low mark 5224536 (72%)
allocs 229448 frees 227461 (diff 1987) mfails 0 garbage 0 splits 175
mbuf: allocs 484441 frees 484341 (diff 100) dups 8 fails 0
small mbuf size: 140 large mbuf size: 1514 (1514)
large queue: dequeued 165096 queued 165198 (diff 102)
small queue: dequeued 320253 queued 320455 (diff 202)
free queue: dequeued 0 queued 0 (diff 0)
7092 strings allocated, 2116 too big (for zone), 5331 freed, length changed 0
kpmg-mi:Top>Configure>Debug> mem max
5164668
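
For anyone who wants to compare the two snapshots numerically, the counters can be extracted with a short script. What follows is only a sketch in Python under my own assumptions: the function names `parse_mem` and `delta`, the dictionary keys, and the regular expressions are hypothetical and are based solely on the `mem` output format shown above. It is not a NetBlazer tool.

```python
import re

# Excerpts of the two `mem` outputs posted above (only the lines that are parsed).
AFTER_BOOT = """\
heap base 103510 size 7187184 avail 5643288 (78%) low mark 5228352 (72%)
allocs 2291 frees 738 (diff 1553) mfails 0 garbage 0 splits 177
mbuf: allocs 1096 frees 1084 (diff 12) dups 0 fails 0
"""

LOGIN_STUCK = """\
heap base 103510 size 7187184 avail 5485752 (76%) low mark 5224536 (72%)
allocs 229448 frees 227461 (diff 1987) mfails 0 garbage 0 splits 175
mbuf: allocs 484441 frees 484341 (diff 100) dups 8 fails 0
"""


def parse_mem(text):
    """Pull the heap, block and mbuf counters out of one `mem` listing.

    The field layout is assumed from the logs in this post.
    """
    heap = re.search(r"size (\d+) avail (\d+) \(\d+%\) low mark (\d+)", text)
    # Anchored at line start so it does not match the "mbuf: allocs ..." line.
    blocks = re.search(r"^allocs \d+ frees \d+ \(diff (\d+)\) mfails (\d+)",
                       text, re.M)
    mbuf = re.search(
        r"^mbuf: allocs \d+ frees \d+ \(diff (\d+)\) dups \d+ fails (\d+)",
        text, re.M)
    return {
        "heap_size": int(heap.group(1)),
        "avail": int(heap.group(2)),
        "low_mark": int(heap.group(3)),
        "live_blocks": int(blocks.group(1)),  # allocs minus frees
        "mfails": int(blocks.group(2)),
        "live_mbufs": int(mbuf.group(1)),
        "mbuf_fails": int(mbuf.group(2)),
    }


def delta(before, after):
    """Per-counter difference between two parsed snapshots."""
    return {key: after[key] - before[key] for key in before}


if __name__ == "__main__":
    change = delta(parse_mem(AFTER_BOOT), parse_mem(LOGIN_STUCK))
    for key, value in change.items():
        print(f"{key:12s} {value:+d}")
```

With the two snapshots above, this reports that avail is 157536 bytes lower and that 434 more blocks are live, while mfails and the mbuf fails counter both stay at 0.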


The STi runs version 3.1, patch level 13. The same behaviour occurs
under every patch level from 4 to 13.


Any help, including advice on what to investigate next, will be much
appreciated.

Gabriele Delconti
KPMG S.p.A.

Gabriele Delconti

Dec 22, 1997, 3:00:00 AM

Scooter wrote:
>
> Is this a recent development? How long has it been doing this? This seems
> like it's an older machine judging from the version of code being run. The
> current levels of code are 3.3 and 3.32 .

It worked normally for about 10 months. Then the problem started to
appear sporadically. Now, with increased routed traffic and therefore
more calls, the problem occurs every 6-8 hours.
No new hardware has been added since the first deployment, so version
3.3x has not been needed or requested.
The current configuration is: STi with 8 MB, PRI card with 15 channels
in use, nothing else connected except the serial console.


> Try mem leak. Any leaks?

kpmg-mi:Top>Configure>Debug> mem leak


Memory usage status:
text start 8000 (901120), data start e4000, bss start f34bc, end 10350c
heap base 103510 size 7187184 avail 5481432 (76%) low mark 5228352 (72%)
allocs 324349 frees 322443 (diff 1906) mfails 0 garbage 0 splits 177
mbuf: allocs 1109838 frees 1109738 (diff 100) dups 226 fails 0
small mbuf size: 140 large mbuf size: 1514 (1514)
large queue: dequeued 333289 queued 333390 (diff 101)
small queue: dequeued 779182 queued 779382 (diff 200)
free queue: dequeued 0 queued 0 (diff 0)
18354 strings allocated, 6000 too big (for zone), 16376 freed, length changed 0
Heap dump from 103510 to 7de000 (allocation size=36 bytes)
Seq block size reqsize time file line magic pid name
0031: 5e387c 108 ( 36) 11:52:51 user.c 182 ok 6793 [lost]
.....
#### here 67 lines similar to the above/below
.....
1629: 7834e4 108 ( 36) 12:06:35 user.c 182 ok 6903 [lost]
Total: 69 found for 7K or 0 % of heap
Above entries were allocated by process(es) that have exited
and may not be actual leaks.
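
As a quick sanity check of the summary line, the total can be recomputed from the figures in the dump. This is a small Python sketch. It assumes that all 69 reported blocks have the same 108-byte size as the two entries shown (the elided entries are only described as similar), and the variable names are my own.

```python
HEAP_SIZE = 7187184   # "size" on the heap line of the mem output
BLOCK_SIZE = 108      # "size" column of the two [lost] entries shown
LOST_BLOCKS = 69      # "Total: 69 found"

# Assumption: every [lost] block is BLOCK_SIZE bytes long.
lost_bytes = LOST_BLOCKS * BLOCK_SIZE
lost_percent = 100.0 * lost_bytes / HEAP_SIZE

print(f"{lost_bytes} bytes in [lost] blocks ({lost_bytes / 1024:.1f}K)")
print(f"{lost_percent:.2f}% of the heap")
```

Under that assumption the [lost] blocks add up to 7452 bytes, about 0.10% of the heap, which agrees with the "7K or 0 % of heap" reported above.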


> What does "who" show compared to "line list"?

kpmg-mi:Top>Configure>Debug> lin l
Line Modem/Dvc CharGrp Use Command DeviceStatus
line00 <<None>> <<None>> login <<None>> <RTS,DTR>
line01 <<None>> <<None>> login <<None>> <RTS,DTR>
line02 <<None>> <<None>> root <<None>> <CTS,DSR,RTS,DTR>
pri101 isdn-pri pridialout login <<None>> In-connected
pri102 isdn-pri pridialout login <<None>> In-connected
pri103 isdn-pri pridialout login <<None>> In-connected
pri104 isdn-pri pridialout none <<None>> Activated
pri105 isdn-pri pridialout login <<None>> In-connected
pri106 isdn-pri pridialout login <<None>> In-connected
pri107 isdn-pri pridialout none <<None>> Activated
pri108 isdn-pri pridialout login <<None>> In-connected
pri109 isdn-pri pridialout login <<None>> In-connected
pri110 isdn-pri pridialout none <<None>> Activated
pri111 isdn-pri pridialout login <<None>> In-connected
pri112 isdn-pri pridialout login <<None>> In-connected
pri113 isdn-pri pridialout login <<None>> In-connected
pri114 isdn-pri pridialout login <<None>> In-connected
pri115 isdn-pri pridialout login <<None>> In-connected
pri117 isdn-pri request none <<None>> Activated
pri118 isdn-pri request none <<None>> Activated
pri119 isdn-pri request none <<None>> Activated
pri120 isdn-pri request none <<None>> Activated
pri121 isdn-pri request none <<None>> Activated
pri122 isdn-pri request none <<None>> Activated
pri123 isdn-pri request none <<None>> Activated
pri124 isdn-pri request none <<None>> Activated
pri125 isdn-pri request none <<None>> Activated
pri126 isdn-pri request none <<None>> Activated
pri127 isdn-pri request none <<None>> Activated
pri128 isdn-pri request none <<None>> Activated
pri129 isdn-pri request none <<None>> Activated
pri130 isdn-pri request none <<None>> Activated
pri131 isdn-pri request none <<None>> Activated
kpmg-mi:Top>Configure>Debug> who
User Time From
root Dec 21 16:05 line02
kpmg-mi:Top>Configure>Debug>
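
The comparison between "line list" and "who" can also be done mechanically. The following Python sketch lists the lines whose Use column is `login` and whose status is `In-connected` but which have no session in `who`. The function names and the whitespace-based parsing are my assumptions from the output layout above, and the sample data is an abbreviated copy of the listing.

```python
# Abbreviated copies of the two listings above.
LINE_LIST = """\
Line Modem/Dvc CharGrp Use Command DeviceStatus
line02 <<None>> <<None>> root <<None>> <CTS,DSR,RTS,DTR>
pri101 isdn-pri pridialout login <<None>> In-connected
pri102 isdn-pri pridialout login <<None>> In-connected
pri104 isdn-pri pridialout none <<None>> Activated
pri117 isdn-pri request none <<None>> Activated
"""

WHO = """\
User Time From
root Dec 21 16:05 line02
"""


def parse_line_list(text):
    """Return (line, use, status) for each data row of `line list`."""
    rows = []
    for raw in text.splitlines():
        fields = raw.split()
        # Skip the header, prompts and anything without exactly six columns.
        if len(fields) != 6 or fields[0] == "Line":
            continue
        name, _device, _chargrp, use, _command, status = fields
        rows.append((name, use, status))
    return rows


def logged_in_lines(text):
    """Return the set of line names that appear in the From column of `who`."""
    lines = set()
    for raw in text.splitlines():
        fields = raw.split()
        if len(fields) >= 2 and fields[0] != "User":
            lines.add(fields[-1])
    return lines


def stuck_at_login(line_list_text, who_text):
    """Lines that are connected and waiting at login, with no `who` session."""
    active = logged_in_lines(who_text)
    return [name for name, use, status in parse_line_list(line_list_text)
            if use == "login" and status == "In-connected"
            and name not in active]


if __name__ == "__main__":
    print(stuck_at_login(LINE_LIST, WHO))
```

On the abbreviated sample this prints pri101 and pri102. In the full listing above, 12 PRI lines are in the login / In-connected state while `who` shows only root on line02.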


> Also, check "ps" for orphaned processes.

kpmg-mi:Top>Configure>Debug> ps
Pgroup pid user stksize maxstk heap event flags time name
main 1 system 8K 1.9K 976K 7c4e18 IW 37.4 main
main 2 system 8K 1.8K 108K 7b7b78 IW 2:29 en0
main 3 system 8K 1.0K 8K fc5b0 IW 8.8 killer
main 4 system 8K 0.8K 53K 0.4 timer
main 5 system 8K 0.1K 8K ecd0c W 0.0 tracer
main 6 system 4K 1.0K 4K ff9e8 W 21.5 syslog
main 7 system 8K 0.2K 8K a715c W 0.4 comport_proc
main 8 system 4K 0.7K 4K 101b30 IW 2.3 syslog_slow
main 11 system 16K 1.3K 16K 6f3820 IW 1.1 pri1
main 15 system 8K 1.8K 8K 7801d8 IW 0.3 AppleTalk Timer
main 17 system 4K 0.6K 4K 7c3df0 IW 0.0 Telnet listener
main 52 system 2K 0.2K 4K 64650 IW 0.0 Modemstatus
main 53 system 2K 0.3K 4K 7444ea IW 0.3 Event Manager
main 55 system 2K 0.7K 4K 6cfc07 IW 0.0 line00
main 56 system 2K 0.7K 4K 6cfc0e IW 0.0 line01
main 57 system 2K 0.7K 4K 6cfc15 IW 0.0 line02
line00 58 system 8K 1.5K 10K 7478d8 W 0.0 start line00
line01 59 system 8K 1.5K 10K 747674 W 0.0 start line01
line02 60 root 8K 1.9K 10K 0.2 Cons line02
main 6849 system 2K 0.7K 4K 5e2d30 IW 0.0 pri102
main 6850 system 8K 0.3K 10K 74b0ac IW w 0.0 start pri102
main 7082 system 2K 0.7K 4K 64bc1b IW 0.0 pri108
main 7083 system 8K 0.3K 10K 757568 IW w 0.0 start pri108
main 7096 system 2K 0.7K 4K 64bc6f IW 0.0 pri109
main 7097 system 8K 0.3K 10K 7573b8 IW w 0.0 start pri109
main 7098 system 2K 0.7K 4K 64bc84 IW 0.0 pri113
main 7099 system 8K 0.3K 10K 75b12c IW w 0.0 start pri113
main 7100 system 2K 0.7K 4K 64bc99 IW 0.0 pri106
main 7101 system 8K 0.3K 10K 7616fc IW w 0.0 start pri106
main 7104 system 2K 0.7K 4K 64bcb5 IW 0.0 pri114
main 7105 system 8K 0.3K 10K 5ee80c IW w 0.0 start pri114
main 7109 system 2K 0.7K 4K 64bcca IW 0.0 pri103
main 7110 system 8K 0.3K 10K 767d14 IW w 0.0 start pri103
main 7122 system 2K 0.7K 4K 64bc06 IW 0.0 pri111
main 7123 system 8K 0.3K 10K 785234 IW w 0.0 start pri111
main 7124 system 2K 0.7K 4K 64bced IW 0.0 pri115
main 7125 system 8K 0.3K 10K 761f00 IW w 0.0 start pri115
main 7129 system 2K 0.7K 4K 64bc30 IW 0.0 pri104
main 7130 system 8K 0.3K 10K 5e3748 IW w 0.0 start pri104
main 7132 system 2K 0.7K 4K 64bc5a IW 0.0 pri112
main 7133 system 8K 0.3K 10K 5e44a4 IW w 0.0 start pri112
kpmg-mi:Top>Configure>Debug>

> Are you using cloned interfaces (i.e. default-ppp) or statically
> configured interfaces? Are only LSes dialing in, or are other devices
> also dialing in?

No cloned interfaces; statically defined interfaces only. Only LSes
dial in, over ISDN: no modems, no LAN cards, no IPX. The traffic is
mainly IP with a little AppleTalk, which was disabled for testing
without any improvement.


Thank you for your reply.

Gabriele




Gabriele Delconti

Dec 23, 1997, 3:00:00 AM

Scooter wrote:
>
<<< some text deleted >>>
>
> Lots of in-connected/use login lines. Why so many callback lines?
>

pri117 to pri131 are not used (i.e. not configured by the local
telecom).
Request mode was set on them for testing only (same problem).


> >pri115
> >main 7129 system 2K 0.7K 4K 64bc30 IW 0.0 pri104
> >main 7130 system 8K 0.3K 10K 5e3748 IW w 0.0 start
> >pri104
> >main 7132 system 2K 0.7K 4K 64bc5a IW 0.0 pri112
> >main 7133 system 8K 0.3K 10K 5e44a4 IW w 0.0 start
> >pri112
> >kpmg-mi:Top>Configure>Debug>
>

> I'd try a "deb kill 7132" to see if that frees up the pri112 line and also
> frees 7133. If it does, do it to the others that are in a similar state.
> I'd be willing to bet that this is fixed in later versions of code.
>

Killing the processes associated with a given line does free that line
(the same as resetting it). However, other lines then get stuck in the
login state as further calls arrive from the LSes.

>
> What version(s) are the LSes running?
>

3.1 with patch mlewis27, a custom patch made by Mark to solve local
hardware-related problems and to provide caller ID.

Look at these lines from the console log:

Tue Dec 23 13:09:22 1997 - pri110 - Incoming call.. Calling-party=458015852
Tue Dec 23 13:09:22 1997 - start login on pri110
Tue Dec 23 13:09:22 1997 - getty failed on pri110

Up to this point the STi was working OK. This incoming call is the first
one that failed (i.e. locked at the login phase).
From here on, all incoming calls and Telnet sessions are rejected.

Note that the 'getty' message is generated during the first failed
incoming call ONLY. The following calls fail without recording the
'getty' error.

Neither the line number nor the calling LS is significant: other LSes on
other lines cause the same problem, a couple of hours after a reboot.
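
Since only the first failed call leaves a 'getty failed' entry, that entry marks the moment the problem starts. A Python sketch for finding it in a saved console log follows. The function name `first_getty_failure` and the regular expressions are my assumptions, based only on the three log lines quoted above.

```python
import re
from datetime import datetime

# The console log lines quoted above, including the wrapped Calling-party line.
CONSOLE_LOG = """\
Tue Dec 23 13:09:22 1997 - pri110 - Incoming call..
Calling-party=458015852
Tue Dec 23 13:09:22 1997 - start login on pri110
Tue Dec 23 13:09:22 1997 - getty failed on pri110
"""

# "<weekday> <month> <day> <hh:mm:ss> <year> - <message>" (assumed layout).
LOG_LINE = re.compile(
    r"^(\w{3} \w{3} +\d{1,2} \d\d:\d\d:\d\d \d{4}) - (.*)$")
GETTY_FAILED = re.compile(r"getty failed on (\S+)")


def first_getty_failure(log_text):
    """Return (timestamp, line name) of the first 'getty failed' entry, or None."""
    for raw in log_text.splitlines():
        entry = LOG_LINE.match(raw.strip())
        if not entry:
            continue  # wrapped continuation lines carry no timestamp
        failed = GETTY_FAILED.match(entry.group(2))
        if failed:
            when = datetime.strptime(entry.group(1), "%a %b %d %H:%M:%S %Y")
            return when, failed.group(1)
    return None


if __name__ == "__main__":
    print(first_getty_failure(CONSOLE_LOG))
```

Comparing the returned timestamp with the time of the last reboot would give the uptime at which the logins start to fail.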

Thank you again for your help

Gabriele
