I originally posted a query about this over on alt.linux, and someone
there suggested I post a query here. I am using wvdial to provide
"dial up" broadband access to the Cricket wireless network using a
Cal-Comp A600 GSM wireless modem on a headless router. It's working
fairly well, but there is one serious issue I
need to resolve. After 12 hours online, the ISP shuts down the link.
When this happens, the modem sometimes locks up, at least as far as
wvdial is concerned. An attempt to dial again fails. Occasionally,
the modem truly hangs, so that even a soft reboot won't recover the
session. Since wvdial produces no logs, no externally readable status
structure, and no exit codes, it's difficult to tell what action the
router needs to take when the ppp link is down.
Does anyone know of a slightly more sophisticated or
better supported alternative to wvdial? I am looking at linuxconf, but
there doesn't seem to be a Debian package for it, and the RPM package
won't load under alien. The source code fails with a fair number of
errors. I would really rather not start digging into problems with the
RPM package or in the source code if I don't have to. The fact the
newest version of linuxconf is from 2005 also suggests it may not be
actively maintained, either.
Try looking at www.theory.physics.ubc.ca/ppp-linux.html for techniques
for debugging your ppp connection. wvdial is simply a font end for the
thing that does the actual work, namely pppd. pppd can and does produce
error messages to the daemon and local2 facilities. You need to tell
your system to record those and tell pppd to actually output the debug
messages.
One thing you could try is to kill pppd by running
    killall pppd
as root before you try again.
Alternatively, have your system itself close the connection after 11hr
45 min, and then restart it. You must be paying pretty hefty wireless
bills to be logged on for over 12hr a day.
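If pppd ends up being run directly, one way to enforce a cutoff like the 11hr 45 min above is pppd's own "maxconnect" option, which takes a time in seconds. A minimal sketch, assuming a peers file named "provider" (the Debian default name, an assumption here) and a hypothetical helper called hms_to_seconds:

```shell
# Convert hours and minutes to seconds for pppd's "maxconnect" option.
# hms_to_seconds is a hypothetical helper name, not part of pppd.
hms_to_seconds() {
    hours=$1
    minutes=$2
    echo $(( hours * 3600 + minutes * 60 ))
}

# Example (not run here): end the session ahead of the ISP's 12-hour limit.
# "provider" is an assumed peers file under /etc/ppp/peers.
#   pppd call provider maxconnect "$(hms_to_seconds 11 45)"
```

When the limit is reached, pppd exits on its own, so whatever started it can simply dial again.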
The problem isn't at the ppp layer. The problem lies at the hardware
layer, epcifically the modem locking up.
> wvdial is simply a font end for the
> thing that does the actual work, namely pppd. pppd can and does
> produce error messages to the daemon and local2 facilities.
Yes, but there are no errors at that level, at least not until the modem
locks up.
> You need
> to tell your system to record those and tell pppd to actually output
> the debug messages.
>
> One thing you could try is to kill
> killall pppd
> as root
> before you try again.
No, I'm not using pppd, and ppp is dead whenever the connection is
closed.
> Alternatively, have your system itself close the connection after 11hr
> 45 min, and then restart it.
That's what I am doing, or trying to do, although I use 11hr 17 min, in
fact.
> -- you must be paying pretty hefty
> wireless bills to to be logged on for over 12hr a day.
No, Cricket is broadband wireless. Or it is SUPPOSED to be. If it were
truly broadband, they would not shut down the link after 12 hours.
Nonetheless, there are no fees for connection time. Broadband routers
are always logged on 24/7.
That should have been, "...specifically..."
>
>> One thing you could try is to kill
>> killall pppd
>> as root
>> before you try again.
>
> No, I'm not using pppd, and ppp is dead whenever the connection is
> closed.
That's not what I meant to say. What I meant to say is that it locks
up before pppd is ever called. It locks up while the dialer is trying
to establish layer 1 connectivity. I'll double-check, though.
And you know this how? And if you are so sure, why are you asking here,
since hardware fixes do not get handled by Linux user groups?
>
>> wvdial is simply a font end for the
>> thing that does the actual work, namely pppd. pppd can and does
>> produce error messages to the daemon and local2 facilities.
>
> Yes, but there are no errors at that level, at least not until the modem
> locks up.
Have you switched on logging for pppd? Have you set up the daemon and
local2 facilities to log to a file? Put
    daemon.*;local2.*    /var/log/daemonlog
in /etc/syslog.conf and then do
    killall -1 syslogd
Have you looked at that file?
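A quick way to confirm that a given facility really is being logged is to look for an uncommented selector in the syslog configuration. The following is only a sketch: has_facility_rule is a hypothetical helper, it assumes the classic syslog.conf "facility.level" syntax, and it does not understand "*.*" or comma-separated facility lists.

```shell
# has_facility_rule FILE FACILITY
# Succeeds if FILE contains an uncommented selector "FACILITY.level".
# Hypothetical helper; only the plain "facility.level" form is recognised.
has_facility_rule() {
    # Drop comment lines, then look for the facility at the start of a
    # selector (beginning of line, after ';', or after whitespace).
    grep -v '^[[:space:]]*#' "$1" |
        grep -Eq "(^|[;[:space:]])$2\.[^;[:space:]]+"
}

# Example (not run here):
#   has_facility_rule /etc/syslog.conf local2 || echo "local2 is not logged"
```

pppd itself also needs its "debug" option set before the more detailed messages show up in that file.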
>
>> You need
>> to tell your system to record those and tell pppd to actually output
>> the debug messages.
>>
>> One thing you could try is to kill
>> killall pppd
>> as root
>> before you try again.
>
> No, I'm not using pppd, and ppp is dead whenever the connection is
> closed.
??? pppd is THE daemon for using ppp, and wvdial uses pppd. Check with
    ps auxww|grep pppd
>
>> Alternatively, have your system itself close the connection after 11hr
>> 45 min, and then restart it.
>
> That's what I am doing, or trying to do, although I use 11hr 17 min, in
> fact.
>
>> -- you must be paying pretty hefty
>> wireless bills to to be logged on for over 12hr a day.
>
> No, Cricket is broadband wireless. Or it is SUPPOSED to be. If it were
What do you mean by wireless? GSM? In which case I am really surprised
that they are not charging you through the teeth.
Try using a chat sequence instead of wvdial. Sorry I cannot suggest one.
I do not understand though. You said that you were connected for 12
hours when things went down. Or are you saying that after the ISP has
disconnected you, then when you try to re-establish the connection, things
go south?
>
It does not appear to be a problem with the hardware, but rather the
hardware layer. The modem has been replaced, and the behavior after
replacement is the same as the behavior before replacement. Instead of
returning coherent responses in wvdial when the modem is instructed to
dial, it returns gibberish. Other commands return the "OK" response.
>>> wvdial is simply a font end for the
>>> thing that does the actual work, namely pppd. pppd can and does
Unless I am mistaken, pppd only does its work after layer 1
connectivity is established, is that not correct? I will double-check.
She is bringing the router over this afternoon.
>>> produce error messages to the daemon and local2 facilities.
>>
>> Yes, but there are no errors at that level, at least not until the
>> modem locks up.
>
> Have you switched on logging for ppd? Have you set up the daemon and
> local2 facilities to log to a file?
No, but since the lock-up occurs before pppd is called, there doesn't
seem to be much point. Again, I will double-check to make sure this is
the case.
> daemon.*;local2.* /var/log/daemonlog
> in /etc/syslog.conf
> and then do
> killall -1 syslogd
>
> Have you looked at that file?
Syslog? Of course. No errors. The daemon log also has no errors in
it. I'll try adding the specific targets you mention, if they are not
already there.
>>> You need
>>> to tell your system to record those and tell pppd to actually output
>>> the debug messages.
>>>
>>> One thing you could try is to kill
>>> killall pppd
>>> as root
>>> before you try again.
>>
>> No, I'm not using pppd, and ppp is dead whenever the connection is
>> closed.
>
> ??? pppd is THE daemon for using ppp, and wvdial uses pppd
>
> ps auxww|grep pppd
Returns nothing, as I recall. I will check again when the system gets
here, if I can duplicate the problem. Troubleshooting network issues
remotely is a real challenge.
>>> Alternatively, have your system itself close the connection after
>>> 11hr 45 min, and then restart it.
>>
>> That's what I am doing, or trying to do, although I use 11hr
>> 17 min, in
>> fact.
>>
>>> -- you must be paying pretty hefty
>>> wireless bills to to be logged on for over 12hr a day.
>>
>> No, Cricket is broadband wireless. Or it is SUPPOSED to be. If it
>> were
>
> What do you mean by wireless? GSM? In which case I am really surprized
> that they are not charging you through the teeth.
Actually, I believe it is CDMA, but there are a number of broadband
wireless companies out there, both GSM and CDMA. In this case:
http://www.mycricket.com/broadband/plans/broadband40
Yeah, I thought of that. Since the router lives an hour from here by
car, my access to it is limited. Writing and debugging a chat script
would not normally be an issue, but when I can only gain local access
for an hour or so every week or two, it makes debugging difficult,
especially when the issue is normally only seen after 12 hours online.
> Sorry I cannot suggest
> one. I do not understand though. You said that you were connected for
> 12 hours when things went down. Or are you saying that after the ISP
> has diconnected you, then when you try to restablish the connection,
> things go south?
Either the ISP disconnects or the modem does. I suppose the modem
could have some sort of internal timer that shuts it down after 12
hours, but I expect it is the ISP's router which does this. And yes,
the problem is only experienced after the link is dropped. I can
sometimes reproduce the issue by `kill -9 <pid>` on the wvdial session.
Often not, however, in which case establishing a new session works just
fine.
>>>> After 12 hours online, the ISP shuts down the link. When this
>>>> happens, the modem sometimes locks up, at least as far as wvdial
>>>> is concerned.
That _usually_ indicates that the dialing program isn't cleanly
resetting the modem. One (of many) problems with WvDial is that it's
trying to use relatively generic modem initialization strings which may
not be suitable for all modems, never mind a GSM modem.
>>>> An attempt to dial again fails. Occasionally, the modem truly
>>>> hangs, so that even a soft reboot won't recover the session.
And what is the exact indication that dialing failed? The soft-reboot
is also suggesting the modem initialization string is wrong.
>>>> Since wvdial produces no logs, no externally readable status
>>>> structure, and no exit codes, it's difficult to tell what
>>>> action the router needs to take when the ppp link is down.
Well, I originally indicated it's a piece of crap. Somewhere in your
configuration file, you probably have
    --no-syslog
        Don't output debug information to the syslog daemon (only
        useful together with --chat).
You _MAY_ also have disabled logging in /etc/syslog.conf, but I
don't know what facility:level the authors are using. I'm also told
that you can cause wvdial to log stuff if you call it from the
command line (rather than clicking on some icon) with the syntax
    wvdial 2>&1 | tee /tmp/some.file.name
which should put data into that file. Afterwards, read that file
using 'less'.
>>>> I am looking at linuxconf, but there doesn't seem to be a Debian
>>>> package for it, and the RPM package won't load under alien
'linuxconf' is just about as bad as wvdial. If you're using Debian,
the normal tool is pppconfig, which sets up the provider that 'pon' and
'poff' use.
>>> Try looking at www.theory.physics.ubc.ca/ppp-linux.html for
>>> techniques for debugging your ppp connection.
Concur - also look carefully through any documentation for your modem,
as the init strings commonly used for ordinary modems are not the same.
>> The problem isn't at the ppn layer. The problem lies at the hardware
>> layer, epcifically the modem locking up.
The web page deals with the whole idea of connecting.
>>> One thing you could try is to kill
>>> killall pppd
>>> as root
>>> before you try again.
I'm not sure the existing init-strings would recover from a remote
disconnect, and 'killall' isn't resetting the modem either.
>> No, I'm not using pppd, and ppp is dead whenever the connection is
>> closed.
'/bin/ps auwx' might indicate differently.
Old guy
Sounds like the pppd/chat layer to me. I.e., the response you are getting
back is a garbled version of the negotiation.
Anyway, have you tried to set up debug logging and looking at the files
yet? If not, I have no idea why you came here for help, since you seem
to be convinced that your theory, without testing, is right.
>
>>>> wvdial is simply a font end for the
>>>> thing that does the actual work, namely pppd. pppd can and does
>
> Unless I ma mistaken, pppd only does its work after layer 1
> connectivity is established, is that not correct? I will double-check.
> She is bringing the router over this afternoon.
No. pppd does the whole negotiation. wvdial/chat set up the modem and
dial the phone, and then hand off to pppd to do all negotiation.
>
>>>> produce error messages to the daemon and local2 facilities.
>>>
>>> Yes, but there are no errors at that level, at least not until the
>>> modem locks up.
>>
>> Have you switched on logging for ppd? Have you set up the daemon and
>> local2 facilities to log to a file?
>
> No, but since the lock-up occurs before pppd is called, there doesn't
> seem to be much point. Again, I will double-check to make sure this is
> the case.
Sheesh.
>
>> daemon.*;local2.* /var/log/daemonlog
>> in /etc/syslog.conf
>> and then do
>> killall -1 syslogd
>>
>> Have you looked at that file?
>
> Syslog? Of course. 'No errors. The daemon log also has bo error in
> it. I'll try adding the specific targets you mention, if they are not
> already there.
No, the file where daemon and local2 log facilities get written to. In
the above it is /var/log/daemonlog.
> On Sun, 10 Jan 2010, in the Usenet newsgroup comp.os.linux.networking,
> in article <uKKdneMWS88X4tfW...@giganews.com>, lrhorer
> wrote:
>
>>>>> After 12 hours online, the ISP shuts down the link. When this
>>>>> happens, the modem sometimes locks up, at least as far as wvdial
>>>>> is concerned.
>
> That _usually_ indicates that the dialing program isn't cleanly
> resetting the modem.
Yeah, that's pretty obvious. I've thought about trying AT+WRST,
although I'm very limited in what I can try remotely, and the modem is
no longer here.
> One (of many) problem with WvDial is that it's
> trying to use relatively generic modem initiation strings which may
> not be suitable for all modems, never mind a GSM modem.
Well, one can apply whatever initialization strings one likes in the
config file, or from the command line. You are correct, however, that
at the moment it is only issuing the ATZ command.
>>>>> An attempt to dial again fails. Occasionally, the modem truly
>>>>> hangs, so that even a soft reboot won't recover the session.
>
> And what is the exact indication that dialing failed? The soft-reboot
> is also suggesting the modem initiation string is wrong.
Well, when the modem is well and truly hung, it disappears altogether.
The computer no longer recognizes that it is attached to the USB
port - lsusb shows nothing at all - and of course udev de-registers
the /dev/ttyACM0 device, so no string of any sort is going to help at
that point.
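For a control script, the "device has vanished" case is the easiest one to detect before dialing at all. A minimal sketch, assuming the modem normally appears as /dev/ttyACM0 as described above; modem_present is a hypothetical helper name:

```shell
# modem_present [DEVICE]
# Succeeds only if DEVICE (default /dev/ttyACM0) exists as a character
# device. Hypothetical helper; it says nothing about whether the modem
# will actually answer AT commands.
modem_present() {
    [ -c "${1:-/dev/ttyACM0}" ]
}

# Example (not run here):
#   modem_present || echo "modem gone from USB, needs a power cycle"
```

If this check fails there is no point in sending any reset string, so the script can go straight to whatever hardware recovery is available.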
>>>>> Since wvdial produces no logs, no externally readable status
>>>>> structure, and no exit codes, it's difficult to tell what
>>>>> action the router needs to take when the ppp link is down.
>
> Well, I originally indicated it's a piece of crap. Somewhere in your
> configuration file, you probably have
>
> --no-syslog
> Don't output debug information to the syslog daemon (only
> useful together with --chat).
>
> You _MAY_ also have disabled logging in /etc/syslog.conf, but as I
> don't know what facility:level the authors are using. I'm also told
> that you can cause wvdial to log stuff if you call it from the
> command line (rather than clicking on some icon) with the syntax
Well of *COURSE* it's called from the command line. I believe I
mentioned more than once this is a headless router. The commands are
all called from a startup script in /etc/init.d.
> wvdial 2>&1 | tee /tmp/some.file.name
I might try that. The bigger issue, however, is that logging to a log
file - especially an output as verbose as that produced by wvdial - and
then parsing it with a pseudo-linguistic recognition script is not the
best way to try to programmatically determine the status of a running
application.
> which should put data into that file. Afterwards, read that file
> using 'less'.
You guys don't seem to be getting this. First of all, I'm not a
novice. I've only been using Linux for about 4 years, so I'm no guru,
but I've been an administrator of a number of HP-UX systems for over 20
years.
Secondly, this is a router, so everything it does must be done
automatically. Nothing interactive can be employed to facilitate
systems operations, unless of course I write an expect script or some
such to interact with the system. Nothing like `less` would fill that
bill. When it is working properly, the output from wvdial looks like
this:
Mon Jan 11 21:29:55 CST 2010 : Dialing out
--> WvDial: Internet dialer version 1.60
--> Cannot get information for serial port.
--> Initializing modem.
--> Sending: ATZ
ATZ
OK
--> Sending: ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
OK
--> Modem initialized.
--> Sending: ATDT#777
--> Waiting for carrier.
ATDT#777
NO CARRIER
--> No Carrier! Trying again.
--> Sending: ATDT#777
--> Waiting for carrier.
ATDT#777
NO CARRIER
--> No Carrier! Trying again.
--> Sending: ATDT#777
--> Waiting for carrier.
ATDT#777
NO CARRIER
--> No Carrier! Trying again.
--> Sending: ATDT#777
--> Waiting for carrier.
ATDT#777
NO CARRIER
--> No Carrier! Trying again.
--> Sending: ATDT#777
--> Waiting for carrier.
ATDT#777
CONNECT
--> Carrier detected. Waiting for prompt.
~[7f]}#@!}!}!} }9}"}&} } } } }#}%B#}%}%}&X[05]@[16]}'}"}(}"}9y~
--> PPP negotiation detected.
--> Starting pppd at Mon Jan 11 21:30:54 2010
--> Pid of pppd: 14384
--> Using interface ppp0
--> pppd: �'�[08]�%�[08]
--> pppd: �'�[08]�%�[08]
--> pppd: �'�[08]�%�[08]
--> pppd: �'�[08]�%�[08]
--> pppd: �'�[08]�%�[08]
--> local IP address 10.100.38.42
--> pppd: �'�[08]�%�[08]
--> remote IP address 172.29.122.162
--> pppd: �'�[08]�%�[08]
--> primary DNS address 172.28.221.53
--> pppd: �'�[08]�%�[08]
--> secondary DNS address 172.28.221.54
--> pppd: �'�[08]�%�[08]
>>>>> I am looking at linuxconf, but there doesn't seem to be a Debian
>>>>> package for it, and the RPM package won't load under alien
>
> 'linuxconf' is just about as bad as wvdial. If you're using Debian,
> the normal tool is pppconf which creates 'pon' and 'poff'.
Since the failure doesn't usually seem to be in pppd, I'm not sure
these will be helpful. I'll take a look, though.
>>>> Try looking at www.theory.physics.ubc.ca/ppp-linux.html for
>>>> techniques for debugging your ppp connection.
>
> Concur - also look carefully through any documentation for your modem,
You're kidding me, right? The modem documentation is a single sheet of
paper about the size of a credit card. I've found nothing online
specific to the modem, either.
> as the init strings commonly used for ordinary modems are not the
> same.
I know that. I do have a reference that purports to be the command
reference for CalComp CDMA modems. It does not mention the A600
specifically, but I have no reason to believe it doesn't cover the
A600. OTOH, there isn't much in it that would be of great help, other
than perhaps the AT+WRST command.
>>> The problem isn't at the ppn layer. The problem lies at the
>>> hardware layer, epcifically the modem locking up.
>
> The web page deals with the whole idea of connecting.
>
>>>> One thing you could try is to kill
>>>> killall pppd
>>>> as root
>>>> before you try again.
>
> I'm not sure the existing init-strings would recover from a remote
> disconnect, and 'killall' isn't resetting the modem either.
>
>>> No, I'm not using pppd, and ppp is dead whenever the connection is
>>> closed.
>
> '/bin/ps auwx' might indicate differently.
No, it doesn't. At a bare minimum, pppd is silent, and the ppp0
network port (as reported by ifconfig) is gone. Sometimes
the /dev/ttyACM0 device is gone and lsusb doesn't even report the
presence of the device on the usb port.
I double-checked when she brought the router over, and when it dies,
sometimes it is before pppd is called (and pppd never gets called), and
other times the failure seems to be when pppd is called. Either way,
however, pppd won't be active since pppd quits after an authentication
error.
Here is an example of one failure where pppd does get called:
ATDT#777
CONNECT
--> Carrier detected. Waiting for prompt.
~[7f]}#@!}!}!} }9}"}&} } } } }#}%B#}%}%}&a_cp}'}"}(}"[03]3~
--> PPP negotiation detected.
--> Starting pppd at Mon Jan 11 21:25:39 2010
--> Pid of pppd: 10153
--> Using interface ppp0
--> pppd: ��][08]��][08]
--> pppd: ��][08]��][08]
--> pppd: ��][08]��][08]
--> pppd: ��][08]��][08]
--> pppd: ��][08]��][08]
--> Disconnecting at Mon Jan 11 21:25:46 2010
--> The PPP daemon has died: Authentication error.
--> We failed to authenticate ourselves to the peer.
--> Maybe bad account or password? (exit code = 19)
--> man pppd explains pppd error codes in more detail.
--> I guess that's it for now, exiting
--> The PPP daemon has died. (exit code = 19)
Exit code 19 is just "We failed to authenticate ourselves to the peer."
Other times it has gotten nothing but garbage after the carrier is
detected, so it never calls pppd in the first place. Of course, when
the failure involves a total deregistration of the device,
then /dev/ttyACM0 doesn't even exist, so the dialer never even sees the
modem. Note in the attempt above the modem does seem to be reporting a
good connection, but it never gets the authentication.
The point that is being missed, however, is that I am not asking for
help with troubleshooting the modem. Rather, I am looking for a
communications solution that will allow me to have better granularity
in interfacing the router's control scripts with the dialing utility.
So far, the front runner seems to be writing a chat script. If there
is a better solution, though, I'm all ears.
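If pppd is started directly rather than through wvdial, its exit status is available to the calling script, and the exit code 19 shown above is one of a documented set in the pppd man page. The sketch below maps some of those codes to recovery actions. The codes are from the man page; the action names (redial, reset_modem, device_check, investigate) and the peers file name "cricket" are hypothetical, and the mapping itself is only one plausible choice.

```shell
# recovery_action EXIT_CODE
# Print a recovery action name for a pppd exit status.
# Action names are hypothetical hooks for the router's own scripts.
recovery_action() {
    case "$1" in
        0|5)         echo none ;;          # clean exit, or killed by a signal
        6|7)         echo device_check ;;  # serial port could not be locked/opened
        8|18)        echo reset_modem ;;   # connect or init script failed
        10|15|16|19) echo redial ;;        # negotiation, echo, hangup, auth failure
        12|13)       echo redial ;;        # idle timeout or maxconnect limit reached
        *)           echo investigate ;;
    esac
}

# Example (not run here); nodetach keeps pppd in the foreground so that
# its exit status reaches the script:
#   pppd call cricket nodetach
#   action=$(recovery_action $?)
```

The point is only that the decision can be driven by an exit status instead of by parsing verbose output.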
Yeah, I thought so. Logging is enabled (to daemon.log, actually, not
daemonlog), and log rotation is enabled for daemon.log. Pppd reports
no errors in daemon.log.
> and then do
> killall -1 syslogd
No need, since (1) daemon.log was already enabled, and (2) the router
has been rebooted innumerable times.
> Have you looked at that file?
Yes, of course, as well as syslog, kern.log, messages.log, dmesg, and of
course the several log files generated by the scripts I wrote to handle
the networking.
>>> You need
>>> to tell your system to record those and tell pppd to actually output
>>> the debug messages.
>>>
>>> One thing you could try is to kill
>>> killall pppd
>>> as root
>>> before you try again.
>>
>> No, I'm not using pppd, and ppp is dead whenever the connection is
>> closed.
>
> ??? pppd is THE daemon for using ppp, and wvdial uses pppd
>
> ps auxww|grep pppd
Produces nothing. This was of course the first thing I checked. I'm
not a noob.
I don't have a theory, and I'm not looking for any help developing one.
All I am asking is if someone knows of an application which is better
suited to automation than wvdial. I know how to troubleshoot, and if
any of these failures are the result of a systemic problem within my
sphere of control, then I will resolve them as such. Indeed, the
hardware deregistration failure is almost certainly going to require
building some hardware, and I have already designed the hardware to
handle the situation along with the software to control it, but what I
want is a more sophisticated set of hooks to enable a faster and more
reliable recovery system, calling the recovery code best suited to the
specific failure mode at hand. Forcing the hardware to shut power off
to the modem and bring it back up whenever connectivity is lost is not
the best solution, but hanging up the modem only works if the failure
is fairly benign, and even soft rebooting the router doesn't clear one
of the common failure modes.
>>>>> wvdial is simply a font end for the
>>>>> thing that does the actual work, namely pppd. pppd can and does
>>
>> Unless I ma mistaken, pppd only does its work after layer 1
>> connectivity is established, is that not correct? I will
>> double-check. She is bringing the router over this afternoon.
>
> No. pppd does the whole negotiation. wvdial/chat set up the modem and
> dial the phone, and then hand off to pppd to do all negotiation.
Pppd is called before the carrier is established? Are you certain? In
any case, the point is, in at least two of the more common failure
modes, pppd is never even called by wvdial. In one of them, the modem
interface device, /dev/ttyACM0, does not even exist, so wvdial can't even
dial the modem, let alone hand off to pppd.
>>>>> produce error messages to the daemon and local2 facilities.
>>>>
>>>> Yes, but there are no errors at that level, at least not until the
>>>> modem locks up.
>>>
>>> Have you switched on logging for ppd? Have you set up the daemon and
>>> local2 facilities to log to a file?
As I mentioned, I checked, and they are there.
>>
>> No, but since the lock-up occurs before pppd is called, there
>> doesn't
>> seem to be much point. Again, I will double-check to make sure this
>> is the case.
>
> Sheesh.
So now you are giving me an attitude because rather than assume I was
right, I agreed to go back and check to make sure I hadn't missed
something?
>>> daemon.*;local2.* /var/log/daemonlog
>>> in /etc/syslog.conf
>>> and then do
>>> killall -1 syslogd
>>>
>>> Have you looked at that file?
>>
>> Syslog? Of course. 'No errors. The daemon log also has bo
>> error in
>> it. I'll try adding the specific targets you mention, if they are
>> not already there.
>
> No, the file where daemon and local2 logfacilities get written to. In
> the above it is /var/log/daemonlog.
Yes, of course I checked all the files to which anything was being
written. Look, I mean no offense, but I never asked for any help
troubleshooting this. If you will read my original post, you will see I
am only asking if anyone knows of alternative applications which will
allow me greater flexibility in interfacing my own code with the
communications applications, not help with troubleshooting. If chat is
the best answer, then I'll write a chat script, but I see no reason to
re-invent the wheel if better solutions already exist.
Although I will indeed further investigate the root causes of the
failures, right now that effort is incidental to the development of
routines which will properly and efficiently handle failures when they
do inevitably occur. I would rather not take the "shotgun" approach of
simply killing power to the modem and bringing it back up whenever any
sort of failure occurs.
How about posting the contents of that log file for a session? Also make
sure that in /etc/syslog.conf you have
    daemon.*;local2.*
as the facility/level that is logged, i.e. everything is logged, not
just some special loglevels.
>
>> and then do
>> killall -1 syslogd
>
> 'No need, since (1) daemon.log was already enabled, and (2) The router
> has been rebooted innumerable times.
>
>> Have you looked at that file?
>
> Yes, of course, as well as syslog, kern.log, mesages.log, dmesg, and of
Of course? When this is the first time you noticed that you have a
daemon.log?
>Moe Trin wrote:
>> And what is the exact indication that dialing failed? The soft-reboot
>> is also suggesting the modem initiation string is wrong.
> Well, when the modem is well and truly hung, it disappears
>altogether. The computer no longer recognizes that it is attached to
>the USB port - lsusb shows nothing at all - and of course udev
>de-registers the /dev/ttyACM0 device, so no string of any sort is
>going to help at that point.
The fact that the device is disappearing would seem to be the problem.
In Message-ID: <M7KdnS7IsJt869bW...@giganews.com>, you
say:
] I can sometimes reproduce the issue by `kill -9 <pid>` on the wvdial
] session. Often not, however, in which case establishing a new session
] works just fine.
I'd want to find out why it's disappearing from lsusb - never mind
/dev/ttyACM0. I think you know that stuff doesn't happen randomly,
and something wvdial is doing (perhaps because of those init-strings)
is randomly hosing the device file.
>Well of *COURSE* it's called from the command line. I believe I
>mentioned more than once this is a headless router.
I first saw the word headless in yesterday's alt.linux post, which I
read _after_ reading c.o.l.n.
>The commands are all called from a startup script in /etc/init.d.
In my response yesterday in alt.linux, I showed a pair of scripts to
operate the connection directly rather than using a less than helpful
helper program like 'wvdial'. Even simpler,
    [galileo ~]$ cat /etc/ppp/options | column
    lock crtscts nodetach defaultroute persist
    /dev/modem modem 115200 noipdefault
    [galileo ~]$ ls -l /etc/ppp/*ap-secrets /etc/resolv.conf
    -rw------- 1 root root 99 Jun 16 2008 /etc/ppp/chap-secrets
    -rw------- 1 root root 94 Jun 16 2008 /etc/ppp/pap-secrets
    -rw-r--r-- 1 root root 49 Jun 16 2008 /etc/resolv.conf
    [galileo ~]$ cat /usr/local/bin/dialin.example
    exec /usr/sbin/pppd user ibup...@example.com connect "/usr/sbin/chat
    ABORT BUSY \"\" AT\&F1 OK ATDT2662902 CONNECT \"\d\c\""
    [galileo ~]$
NOTE: That's one long (127 characters) line.
NOTE: CONNECT \\\d\\\c" also works
You could put /usr/local/bin/dialin.example with the appropriate
changes of username, init-string, and telephone number, in place of
the call to wvdial. If your IP address is going to be changing, you
probably want to stick a '1' into /proc/sys/net/ipv4/ip_dynaddr
There is also the Debian 'pppconfig' program which creates the needed
connection setup/teardown scripts, and the author of that app does
read these newsgroups.
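For the modem in this thread, the chat part of the example above could be kept in its own file. This is a sketch only: it assumes the #777 dial string and the ATZ reset seen in the wvdial output, write_chat_script is a hypothetical helper, and the path /etc/chatscripts/cricket is an assumption.

```shell
# write_chat_script FILE
# Write a minimal chat script for the modem to FILE.
# Hypothetical helper; the dial string comes from the wvdial output.
write_chat_script() {
    cat > "$1" <<'EOF'
ABORT BUSY
ABORT 'NO CARRIER'
ABORT ERROR
TIMEOUT 45
'' ATZ
OK ATDT#777
CONNECT \d\c
EOF
}

# Example (not run here):
#   write_chat_script /etc/chatscripts/cricket
#   pppd /dev/ttyACM0 115200 nodetach \
#       connect "/usr/sbin/chat -v -f /etc/chatscripts/cricket"
```

According to the chat man page, chat exits with 3 on a timeout and with 4, 5, 6 and so on when the first, second, third ABORT string is seen, so the order of the ABORT lines decides which code means BUSY, NO CARRIER or ERROR.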
>> wvdial 2>&1 | tee /tmp/some.file.name
>I might try that. The bigger issue, however, is that logging to a log
>file - especially an output as verbose as that produced by wvdial - and
>then parsing it with a pseudo-linguistic recognition script is not the
>best way to try to programmatically determine the status of a running
>application.
There shouldn't be that much logging. With full debugging, the script
above would produce around 40 lines of ASCII in total - bringing up the
link and tearing it down.
>Secondly, this is a router, so everything it does must be done
>automatically. Nothing interactive can be employed to facilitate
>systems operations, unless of course I write an expect script or
>some such to interact with the system. Nothing like `less` would
>fill that bill.
You have no shell access to read log files?
>When it is working properly, the output from wvdial looks like
>this:
I'd hardly consider it working properly, but anyway:
>--> WvDial: Internet dialer version 1.60
>--> Cannot get information for serial port.
Trying to read the UART, and failing. That's likely due to this being
a USB port.
>--> Initializing modem.
>--> Sending: ATZ
>ATZ
>OK
So it sends a "reset to an unknown saved configuration", and receives
an OK - it can talk to the modem.
>--> Sending: ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
>ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
>OK
Sent some useless stuff not applicable to a USB device, and got another
OK response.
>--> Sending: ATDT#777
>--> Waiting for carrier.
>ATDT#777
>NO CARRIER
>--> No Carrier! Trying again.
Supposedly dialed a number, but no modem answered within the timeout
period - that's normally 45 seconds.
[repeats snipped]
>--> Sending: ATDT#777
>--> Waiting for carrier.
>ATDT#777
>CONNECT
>--> Carrier detected. Waiting for prompt.
This time, a modem was detected on the other end of the line. As I
mentioned, the idiots from wvdial are still living in the past and
think there will be a LOGIN: prompt.
>~[7f]}#@!}!}!} }9}"}&} } } } }#}%B#}%}%}&X[05]@[16]}'}"}(}"}9y~
Instead, the peer sends a ppp frame, like every ISP has been doing
for the past 15 plus years.
>--> PPP negotiation detected.
>--> Starting pppd at Mon Jan 11 21:30:54 2010
>--> Pid of pppd: 14384
and wvdial belatedly starts pppd.
>--> pppd: <EF><BF><BD>'<EF><BF><BD>[08]<EF><BF><BD>%<EF><BF><BD>[08]
This should be 8 bit ppp frames - hard to say what it actually is.
[remainder of ``log'' snipped]
and it's getting RFC1918 addresses for local, remote/peer, and two
DNS servers. My ISPs don't play Musical IP Addresses with the DNS,
so I've got those written directly into /etc/resolv.conf. There is a
'usepeerdns' option to pppd to cause it to ask the peer for these
addresses, but pppd puts them into /etc/ppp/resolv.conf rather than
screwing with a system configuration file - you could soft-link the
two if needed. It's in the pppd man page.
>> If you're using Debian, the normal tool is pppconfig which creates
>> 'pon' and 'poff'.
> Since the failure doesn't usually seem to be in pppd, I'm not sure
>these will be helpful. I'll take a look, though.
The failure _seems_ to be caused by the modem disappearing. Why that
happens is anyone's guess, but my guess would be due to those plain
old generic telephone modem init-strings used by wvdial. Remember
that helper tools like wvdial, kppp, GnomePPP, linuxconf, and similar
are built for people who can't or won't read documentation, and are
configured in a mode that the author hopes is right. Some of them do
a good job of emulating windoze in that they intentionally avoid
displaying anything technical - helps when things are working as
designed, but makes troubleshooting difficult when things go wrong.
>> Concur - also look carefully through any documentation for your modem,
> You're kidding me, right? The modem documentation is a single sheet
>of paper about the size of a credit card. I've found nothing online,
>either specific to the modem.
Must be hell - did this modem come with a windoze disk? There will
be a windoze file that contains the modem init strings. I don't do
windoze, so I don't know which file it is, but grepping for "AT"
should turn it up. Probably one of the smaller files (the bigger
ones are often bitmap icons for the desktop). Good luck
Old guy
>Here is an example of one failure where pppd does get called:
>ATDT#777
>CONNECT
>--> Carrier detected. Waiting for prompt.
Modem's out of the picture now - you are connected
>~[7f]}#@!}!}!} }9}"}&} } } } }#}%B#}%}%}&a_cp}'}"}(}"[03]3~
>--> PPP negotiation detected.
>--> Starting pppd at Mon Jan 11 21:25:39 2010
>--> Pid of pppd: 10153
>--> Using interface ppp0
OK
>--> Disconnecting at Mon Jan 11 21:25:46 2010
>--> The PPP daemon has died: Authentication error.
>--> We failed to authenticate ourselves to the peer.
>--> Maybe bad account or password? (exit code = 19)
>--> man pppd explains pppd error codes in more detail.
>--> I guess that's it for now, exiting
>--> The PPP daemon has died. (exit code = 19)
> Exit code 19 is just "We failed to authenticate ourselves to the peer."
EITHER:
1. wvdial couldn't figure out which username to use (the "Username"
   in /etc/wvdial.conf), or
2. the username doesn't match what _this_ peer expects (GSM is
   wonderful this way - username, username@domain, something else?), or
3. it tried to use the ``wrong'' authentication mode for _this_ peer
   (CHAP-MD5 versus PAP versus EAP).
Wvdial dinks with the secrets files for some bizarre reason. I never
understood why the idiots thought it necessary. Also, I don't believe
wvdial understands EAP, though that may or may not matter.
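For what it's worth, pppd itself does return usable exit codes once wvdial is out of the way. A minimal sketch of how a recovery script might act on them, assuming pppd is started directly by the script. The code numbers are the ones listed under EXIT STATUS in the pppd man page; the action names are made up for illustration and are not part of any tool.

```python
# Hypothetical mapping from pppd exit status to a recovery action.
# The exit codes come from the pppd man page; the action names are
# illustrative only.
PPPD_EXIT = {
    0: ("link closed normally", "redial"),
    7: ("serial port could not be opened", "reset_modem"),
    8: ("connect script failed", "redial"),
    10: ("PPP negotiation failed", "redial"),
    15: ("peer not responding to echo requests", "redial"),
    16: ("modem hung up", "redial"),
    19: ("we failed to authenticate ourselves to the peer", "check_credentials"),
}


def recovery_action(exit_code):
    """Return (reason, action) for a pppd exit status.

    Codes not listed above fall back to a full modem reset.
    """
    default = ("unlisted exit code %d" % exit_code, "reset_modem")
    return PPPD_EXIT.get(exit_code, default)
```

With the failure shown above, recovery_action(19) gives the "check_credentials" action rather than a blind redial.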
>The point that is being missed, however, is that I am not asking for
>help with troubleshooting the modem.
That's probably where you should be looking
Old guy
You should be; it sounds like the modem gets confused and drops off
the USB bus. Have you tried removing and re-inserting the cdc-acm
module? That *might* be enough to re-initialize the modem to a usable
state. There are also ways to control the power to a USB port via
sysfs I believe. This would allow you to fully reset the modem.
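A rough sketch of the module reload, assuming the driver is built as a module and the script runs as root. The function name and the injectable 'run' argument are only there for illustration and dry testing.

```python
import subprocess


def reload_cdc_acm(run=subprocess.call):
    """Unload and reload the cdc-acm driver.

    Returns True only if both modprobe calls exit with status 0.
    'run' can be replaced with a stub to try the logic without root.
    """
    steps = [["modprobe", "-r", "cdc-acm"], ["modprobe", "cdc-acm"]]
    for cmd in steps:
        if run(cmd) != 0:
            return False
    return True
```

If the unload fails (for instance because something still holds /dev/ttyACM0 open), the function stops there instead of trying to load the module again.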
As for "interfacing the router's control scripts with the dialing
utility": chat.
Jerry
> On Mon, 11 Jan 2010, in the Usenet newsgroup comp.os.linux.networking,
> in article <M7KdnSnIsJsXmdHW...@giganews.com>, lrhorer
> wrote:
>
>>Moe Trin wrote:
>
>>> And what is the exact indication that dialing failed? The
>>> soft-reboot is also suggesting the modem initiation string is wrong.
>
>> Well, when the modem is well and truly hung, it disappears
>>altogether. The computer no longer recognizes that it is attached to
>>the USB port - lsusb shows nothing at all - and of course udev
>>de-registers the /dev/ttyACM0 device, so no string of any sort is
>>going to help at that point.
>
> The fact that the device is disappearing would seem to be the problem.
> In Message-ID: <M7KdnS7IsJt869bW...@giganews.com>, you
> say:
>
> ] I can sometimes reproduce the issue by `kill -9 <pid>` on the wvdial
> ] session. Often not, however, in which case establishing a new session
> ] works just fine.
>
> I'd want to find out why it's disappearing from lsusb - never mind
> /dev/ttyACM0.
Well, ultimately I do, too, of course. In the meantime, however, I
need to account for and handle the possibility, even if it rarely (or
hopefully never) occurs in daily usage.
> I think you know that stuff doesn't happen randomly,
No, of course not. Most of this is happening because I am forcing the
issue setting up failure scenarios. Nonetheless, more than one of
these failure modes has occurred during simple testing, rather than
during stress testing.
> and something wvdial is doing (perhaps because of those init-strings)
> is randomly hosing the device file.
It's certainly possible, but I don't think that is the case, at least
not for most of the failure modes. I could be wrong, of course.
Further testing will tell, but before I can really do any failure
testing, I need to come up with an ironclad means of automatic recovery
under the worst case scenario. The router is 50 miles away from me, in
the possession of someone who can barely use a mouse and finds browsing
the web a daunting proposition.
>>Well of *COURSE* it's called from the command line. I believe I
>>mentioned more than once this is a headless router.
>
> I first saw the word headless in yesterday's alt.linux post, which I
> read _after_ reading c.o.l.n.
>
>>The commands are all called from a startup script in /etc/init.d.
>
> In my response yesterday in alt.linux, I showed a pair of scripts to
> operate the connection directly rather than using a less than helpful
> helper program like 'wvdial'. Even simpler,
>
> [galileo ~]$ cat /etc/ppp/options | column
> lock crtscts nodetach defaultroute persist
> /dev/modem modem 115200 noipdefault
> [galileo ~]$ ls -l /etc/ppp/*ap-secrets /etc/resolv.conf
> -rw------- 1 root root 99 Jun 16 2008 /etc/ppp/chap-secrets
> -rw------- 1 root root 94 Jun 16 2008 /etc/ppp/pap-secrets
> -rw-r--r-- 1 root root 49 Jun 16 2008 /etc/resolv.conf
> [galileo ~]$ cat /usr/local/bin/dialin.example
> exec /usr/sbin/pppd user ibup...@example.com connect
> "/usr/sbin/chat ABORT BUSY \"\" AT\&F1 OK ATDT2662902 CONNECT
> \"\d\c\""
> [galileo ~]$
Thanks. I've looked briefly into chat as a solution. I may use it, or
perhaps minicom + expect.
> If your IP address is going to be changing, you
> probably want to stick a '1' into /proc/sys/net/ipv4/ip_dynaddr
It does, with every single call. What's more, they are using NAT,
because they provide the user with non-routable addresses in the 10.100
range, as you can see above. They also block TCP port 25, which many
ISPs do to help cut down spam, but then they don't bother to provide an
e-mail server themselves, which is incumbent on any ISP who blocks port
25. Setting up mail for her was a real pain. I finally resorted to
setting up a gmail account with an SSL SMTP server and hosting her POP
mail over a VPN to an IMAP daemon on one of my servers here at my
house. Oh, by the way, they also block any VPN using PPTP or IPSec. I
set up openvpn over SSL. The minimum latency on any connection is
180ms, and the best my sister can do is about 400ms. If one uses any
significant amount of bandwidth, they throttle the link by artificially
increasing the latency over a period of time. When working remotely
using VNC to help my sister with basic Windows tutoring, I've
encountered latencies consistently as high as 14 seconds! Glacial
doesn't begin to describe window updates over a 14 second link, even
with only an 8 bit colormap.
It's definitely a low rent operation, but OTOH they are inexpensive
($40 flat rate a month), and they do have service coverage at my
sister's house. Verizon and Sprint do not.
> There is also the Debian 'pppconfig' program which creates the needed
> connection setup/teardown scripts, and the author of that app does
> read these newsgroups.
Yeah, I looked at that. I haven't settled fully on any option, yet.
>>I might try that. The bigger issue, however, is that logging to a log
>>file - especially an output as verbose as that produced by wvdial -
>>and then parsing it with a pseudo-linguistic recognition script is not
>>the best way to try to programmatically determine the status of a
>>running application.
>
> There shouldn't be that much logging. With full debugging, the script
> above would produce around 40 lines of ASCII in total - bringing up
> the link and tearing it down.
Um, no. I have to determine if the device is registered on the USB
bus, and restart it if not. Then I have to determine if it is
properly "flipped" into modem mode, and force it to switch modes if not.
Then of course I have to make sure it is responding on the pseudo tty
port (/dev/ttyACM0), and once again restart it if not. Then of course
I have to make sure it gets a carrier, and finally turn it over to
pppd. If that fails, I have to start over again.
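The first three checks in that sequence can be sketched as a small classifier. This is only a sketch under assumptions: the two USB IDs are the ones reported for the A600 elsewhere in this thread, the state names and the NEXT_STEP table are invented for illustration, and the carrier and pppd stages are left out.

```python
import os
import subprocess

STORAGE_ID = "1f28:0021"  # A600 in storage mode (ID as reported in this thread)
MODEM_ID = "1f28:0020"    # A600 after the switch to modem mode


def modem_state(lsusb_output, tty_exists):
    """Classify the modem from lsusb text and whether the tty node exists."""
    text = lsusb_output.lower()
    if MODEM_ID in text:
        return "ready" if tty_exists else "no_tty"
    if STORAGE_ID in text:
        return "storage_mode"
    return "missing"


# Hypothetical recovery table: what the router script would try next.
NEXT_STEP = {
    "missing": "power-cycle or re-plug the device",
    "storage_mode": "run the mode-switch utility",
    "no_tty": "reload the serial driver",
    "ready": "dial",
}


def probe(tty="/dev/ttyACM0"):
    """Run lsusb and look for the tty node; returns one of the state names."""
    out = subprocess.check_output(["lsusb"]).decode("ascii", "replace")
    return modem_state(out, os.path.exists(tty))
```

Keeping the classification separate from the probing means the decision logic can be exercised with canned lsusb output before it ever runs on the router.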
>>Secondly, this is a router, so everything it does must be done
>>automatically. Nothing interactive can be employed to facilitate
>>systems operations, unless of course I write an expect script or
>>some such to interact with the system. Nothing like `less` would
>>fill that bill.
>
> You have no shell access to read log files?
What do you mean by "me"? If by "me", you mean me, personally, then I
don't have any sort of access at all until the router can successfully
dial out, establish networking, and finally establish the VPN tunnel to
my server. If any of that fails, I'm perfectly blind. If by "me" you
mean the router itself, then of course it can parse logs under the
control of my scripts or compiled programs, but an automated / embedded
system needs to limit the number of responses it is expected to
encounter. Exit codes work great, as do signals, semaphores, and
simple status files such as those in /proc and /sys. If by "me" you
mean, "can I read the logs to see what went wrong for troubleshooting
purposes", then of course I can - after networking is up and running
again, but once more I am not really at the troubleshooting stage of
the project, yet - at least not in-depth troubleshooting. To be sure,
I want to eliminate as many of the more common causes of failure as I
can, but it is more important at this point to build the system to be
fault tolerant, allowing it to recover from whatever problem - expected
or otherwise - that might occur than it is to stamp out as many
gremlins as I can find.
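As an example of the "simple status file" approach, here is a minimal sketch that asks sysfs whether the ppp interface exists. The interface name ppp0 is taken from the wvdial output in this thread; the sysfs_root parameter is only there so the check can be tried against a scratch directory. Note that it only says the kernel has the interface, not that traffic is actually flowing.

```python
import os


def link_is_up(iface="ppp0", sysfs_root="/sys/class/net"):
    """True if the kernel currently has a network interface with this name."""
    return os.path.isdir(os.path.join(sysfs_root, iface))
```

A watchdog loop could poll this and fall back to the modem recovery steps when it turns False.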
>>When it is working properly, the output from wvdial looks like
>>this:
>
> I'd hardly consider it working properly, but anyway:
>
>>--> WvDial: Internet dialer version 1.60
>>--> Cannot get information for serial port.
>
> Trying to read the UART, and failing. That's likely due to this being
> a USB port.
>
>>--> Initializing modem.
>>--> Sending: ATZ
>>ATZ
>>OK
>
> So it sends a "reset to an unknown saved configuration", and receives
> an OK - it can talk to the modem.
>
>>--> Sending: ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
>>ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
>>OK
>
> Sent some useless stuff not applicable to a USB device, and got
> another OK response
I haven't delved into the command codes, yet, but the standard AT
commands are evidently implemented by the modem, albeit differently
than a regular PSTN modem. The string was given to me by someone else
who got his A600 working. How much is superfluous, I don't yet know.
>>--> Sending: ATDT#777
>>--> Waiting for carrier.
>>ATDT#777
>>NO CARRIER
>>--> No Carrier! Trying again.
>
> Supposedly dialed a number, but no modem answered within the timeout
> period - that's normally 45 seconds.
No, it's set to be much faster than that. I think it's set to 1/2
second, but I don't have access to the router right now to check. The
example I posted took less than 10 seconds to connect. Unless it fails
altogether, it has never taken more than 90 seconds to do everything,
including clearing the firewall, shutting down the Ethernet port,
switching the modem, dialing, establishing ppp, turning the Ethernet
port back on, establishing the VPN tunnel, clearing the firewall again,
verifying connectivity, establishing DNS, establishing NAT, putting the
(correct) firewall back in place, and setting up the routes. It's not
unusual for it to attempt dialing 8 or 10 times, though.
> [repeats snipped]
>
>>--> Sending: ATDT#777
>>--> Waiting for carrier.
>>ATDT#777
>>CONNECT
>>--> Carrier detected. Waiting for prompt.
>
> This time, a modem was detected on the other end of the line. As I
> mentioned, the idiots from wvdial are still living in the past and
> think there will be a LOGIN: prompt.
>
>>~[7f]}#@!}!}!} }9}"}&} } } } }#}%B#}%}%}&X[05]@[16]}'}"}(}"}9y~
Or else that is just a generic status message which assumes nothing
about what might be found at the other end. Not everyone dials an ISP
with their modem, you know. Of course in her case, it is, but a
properly written application of this sort should be able to handle
connections of any sort.
> Instead, the peer sends a ppp frame, like every ISP has been doing
> for the past 15 plus years.
>
>>--> PPP negotiation detected.
And obviously it responds appropriately. I haven't looked into the
guts of wvdial, and it might indeed be buggy and klunky, but the fact
they employ a response you don't like and support older and less
commonly used protocols isn't a valid criticism, when it clearly does
support a more universal one. It also would not surprise me if there
are not ISPs out there who still don't implement ppp.
>>--> Starting pppd at Mon Jan 11 21:30:54 2010
>>--> Pid of pppd: 14384
>
> and wvdial belatedly starts pppd.
I don't see how you determine that it is "belatedly". The console
output doesn't indicate how long after it receives the ppp negotiation
string it spawns the pppd process. It certainly should not spawn pppd
until it is certain the other end actually is pppd, unless it is
specifically told to do so by its configuration file or a command line
option.
>>--> pppd: <EF><BF><BD>'<EF><BF><BD>[08]<EF><BF><BD>%<EF><BF><BD>[08]
>
> This should be 8 bit ppp frames - hard to say what it actually is.
I agree with you, here. It certainly should be better able to
translate the protocols into something more intelligible. That's one
reason I asked for alternatives.
> [remainder of ``log'' snipped]
>
> and it's getting RFC1918 addresses for local, remote/peer, and two
> DNS servers. My ISPs don't play Musical IP Addresses with the DNS,
> so I've got those written directly into /etc/resolv.conf.
Most of your better ISPs try to avoid shuffling their clients' IP
addresses. This is not one of your better ISPs. They are inexpensive,
though, and my sister is retired and living on a very fixed income.
> There is a
> 'usepeerdns' option to pppd to cause it to ask the peer for these
> addresses, but pppd puts them into /etc/ppp/resolv.conf rather than
> screwing with a system configuration file - you could soft-link the
> two if needed. It's in the pppd man page.
It already does. I suppose it could be wvdial that's updating
resolv.conf, but I'm pretty sure it's pppd.
>>> If you're using Debian, the normal tool is pppconfig which creates
>>> 'pon' and 'poff'.
>
>> Since the failure doesn't usually seem to be in pppd, I'm not sure
>>these will be helpful. I'll take a look, though.
>
> The failure _seems_ to be caused by the modem disappearing. Why that
> happens is anyone's guess, but my guess would be due to those plain
> old generic telephone modem init-strings used by wvdial.
Well, it's possible, of course. I wouldn't expect it to be the case,
however, or else I would expect the modem to never connect at all.
I'll certainly run it down once I have recovery working solidly.
>>> Concur - also look carefully through any documentation for your
>>> modem,
>
>> You're kidding me, right? The modem documentation is a single sheet
>>of paper about the size of a credit card. I've found nothing online,
>>either specific to the modem.
>
> Must be hell - did this modem come with a windoze disk?
No, it's worse than that. The box it comes in would not fit an
ordinary mouse. The modem starts up in storage mode with an executable
which runs automatically and checks to see if the system is already
configured with their canned application. If not, it loads itself into
Windows and installs the device drivers and the canned application,
along with a volatile initialization application, run only once to
initialize the modem and then never again. Windows then issues a
switch command which "flips" the modem from storage mode into modem
mode, and the modem disappears as device 1f28:0021 on its initial USB
address and reappears as 1f28:0020 on its original USB address +1. The
initialization routine initializes the modem for access and terminates,
resetting the modem. Thereafter, it will respond to AT commands. Of
course, in Linux, none of this happens automagically. Instead, I use a
little utility produced by a 3rd party developer to handle just such
modems under Linux which switches the device into modem mode. It works
very well on a wide variety of such devices. Thereafter, it acts like
any other CDMA modem, responding on /dev/ttyACMx.
> There will
> be a windoze file that contains the modem init strings.
Uh-uh. Binaries only. All canned applications with no user readable
files, including no documentation. Indeed, under Windows it isn't even
possible to read the files at all, since the modem is immediately
flipped out of storage mode the moment it is recognized by Windows.
> I don't do windoze,
I try not to, but unfortunately my job requires it.
> so I don't know which file it is, but grepping for "AT"
> should turn it up. Probably one of the smaller files (the bigger
> ones are often bitmap icons for the desktop). Good luck
It might be embedded as text in one of the binaries. My sister is
bringing the router over tomorrow. I'll try taking a look then. It
can't hurt.
Use chat.
>
>> If your IP address is going to be changing, you
>> probably want to stick a '1' into /proc/sys/net/ipv4/ip_dynaddr
>
> It does, with every single call. What's more, they are using NAT,
> because they provide the user with non-routable addresses in the 10.100
> range, as you can see above. They also block TCP port 25, which many
> ISPs do to help cut down spam, but then they don't bother to provide an
> e-mail server themselves, which is incumbent on any ISP who blocks port
> 25. Setting up mail for her was a real pain. I finally resorted to
> setting up a gmail account with an SSL SMTP server and hosting her POP
> mail over a VPN to an IMAP daemon on one of my servers here at my
> house. Oh, by the way, they also block any VPN using PPTP or IPSec. I
> set up openvpn over SSL. The minimum latency on any connection is
> 180ms, and the best my sister can do is about 400ms. If one uses any
> significant amount of bandwidth, they throttle the link by artificially
> increasing the latency over a period of time. When working remotely
> using VNC to help my sister with basic Windows tutoring, I've
> encountered latencies consistently as high as 14 seconds! Glacial
> doesn't begin to describe window updates over a 14 second link, even
> with only an 8 bit colormap.
Lunatic. I assume it is they who supplied you with that USB modem as
well. I would say it is defective and you should be requesting a new
one. For you to waste as much time as you have on a device which cost,
let's say, $50 is silly. Their modem should not be dying and
disconnecting. Ask for another one, or buy another one.
Sorry, that cannot be true. It takes much longer than that to dial and
to connect. Why do you not set the timeout longer?
> example I posted took less than 10 seconds to connect. Unless it fails
> altogether, it has never taken more than 90 seconds to do everything,
> including clearing the firewall, shutting down the Ethernet port,
> switching the modem, dialing, establishing ppp, turning the Ethernet
> port back on, establishing the VPN tunnel, clearing the firewall again,
> verifying connectivity, establishing DNS, establishing NAT, putting the
> (correct) firewall back in place, and setting up the routes. It's not
> unusual for it to attempt dialing 8 or 10 times, though.
If that is because of a highly inappropriate timeout, then you should
fix that.
>
>> [repeats snipped]
>>
>>>--> Sending: ATDT#777
>>>--> Waiting for carrier.
>>>ATDT#777
>>>CONNECT
>>>--> Carrier detected. Waiting for prompt.
>>
>> This time, a modem was detected on the other end of the line. As I
>> mentioned, the idiots from wvdial are still living in the past and
>> think there will be a LOGIN: prompt.
>>
>>>~[7f]}#@!}!}!} }9}"}&} } } } }#}%B#}%}%}&X[05]@[16]}'}"}(}"}9y~
>
> Or else that is just a generic status message which assumes nothing
> about what might be found at the other end. Not everyone dials an ISP
> with their modem, you know. Of course in her case, it is, but a
> properly written application of this sort should be able to handle
> connections of any sort.
No, it is not. It is the remote side sending a ppp negotiation frame.
wvdial is well known for inappropriately waiting for a login prompt.
That is just wrong. Again, I strongly suggest that web page for setting
up an appropriate response.
On the other hand, I have no idea what sort of ppp you have. The return
messages it is giving you are just bizarre. They should be text
messages.
Anyway, what is the operating system on that "router", and where did
you get that version of pppd?
>
>> Instead, the peer sends a ppp frame, like every ISP has been doing
>> for the past 15 plus years.
>>
>>>--> PPP negotiation detected.
>
> And obviously it responds appropriately. I haven't looked into the
No, it obviously does NOT respond properly. That by chance once in 8
times it connects is a fluke rather than proper operation.
> guts of wvdial, and it might indeed be buggy and klunky, but the fact
> they employ a response you don't like and support older and less
> commonly used protocols isn't a valid criticism, when it clearly does
> support a more universal one. It also would not surprise me if there
Oh yes it is. I have dealt with complaints about wvdial for 10 years.
> are not ISPs out there who still don't implement ppp.
No, there are not. ppp is how it works.
>
>>>--> Starting pppd at Mon Jan 11 21:30:54 2010
>>>--> Pid of pppd: 14384
>>
>> and wvdial belatedly starts pppd.
>
> I don't see how you determine that it is "belatedly". The console
> output doesn't indicate how long after it receives the ppp negotiation
> string it spawns the pppd process. It certainly should not spawn pppd
> until it is certain the other end actually is pppd, unless it is
> specifically told to do so by its configuration file or a command line
> option.
>
>>>--> pppd: <EF><BF><BD>'<EF><BF><BD>[08]<EF><BF><BD>%<EF><BF><BD>[08]
>>
>> This should be 8 bit ppp frames - hard to say what it actually is.
>
> I agree with you, here. It certainly should be better able to
> translate the protocols into something more intelligible. That's one
> reason I asked for alternatives.
I have no idea what kind of pppd you have on your system.
>
>
>> [remainder of ``log'' snipped]
>>
>> and it's getting RFC1918 addresses for local, remote/peer, and two
>> DNS servers. My ISPs don't play Musical IP Addresses with the DNS,
>> so I've got those written directly into /etc/resolv.conf.
>
> Most of your better ISPs try to avoid shuffling their clients' IP
> addresses. This is not one of your better ISPs. They are inexpensive,
> though, and my sister is retired and living on a very fixed income.
No, ISPs hand out IP addresses at random if it is a dialup operation.
Almost none give static IPs.
>
>> There is a
>> 'usepeerdns' option to pppd to cause it to ask the peer for these
>> addresses, but pppd puts them into /etc/ppp/resolv.conf rather than
>> screwing with a system configuration file - you could soft-link the
>> two if needed. It's in the pppd man page.
>
> It already does. I suppose it could be wvdial that's updating
> resolv.conf, but I'm pretty sure it's pppd.
They probably are. But I have no idea since you have a highly
non-standard pppd.
>
>
>>>> If you're using Debian, the normal tool is pppconfig which creates
>>>> 'pon' and 'poff'.
>>
>>> Since the failure doesn't usually seem to be in pppd, I'm not sure
>>>these will be helpful. I'll take a look, though.
>>
>> The failure _seems_ to be caused by the modem disappearing. Why that
>> happens is anyone's guess, but my guess would be due to those plain
>> old generic telephone modem init-strings used by wvdial.
>
> Well, it's possible, of course. I wouldn't expect it to be the case,
> however, or else I would expect the modem to never connect at all.
> I'll certainly run it down once I have recovery working solidly.
Nope, they could just put it into an unstable state.
>Moe Trin wrote:
>> I'd want to find out why it's disappearing from lsusb - never mind
>> /dev/ttyACM0.
>Well, ultimately I do, too, of course. In the meantime, however, I
>need to account for and handle the possibility, even if it rarely (or
>hopefully never) occurs in daily usage.
Another poster suggested reloading the cdc-acm module. That probably
requires root, but have you looked into that?
>It's certainly possible, but I don't think that is the case, at least
>not for most of the failure modes. I could be wrong, of course.
>Further testing will tell, but before I can really do any failure
>testing, I need to come up with an ironclad means of automatic recovery
>under the worst case scenario.
Automatic - that may be the problem, as you don't know what the actual
failure is. That the device disappears from lsusb has to be attacked.
If the device doesn't exist, there's not much an application like
minicom, pppd/chat, or wvdial can do.
>Thanks. I've looked briefly into chat as a solution. I may use it, or
>perhaps minicom + expect.
'chat' was designed for this. There _used_ to be a "PPP-over-Minicom"
mini-howto, but I haven't seen it mentioned in over ten years. Few
have bothered to use it, as it's unduly complex to set up and use. The
pppd wants a "working" serial link (tty option to pppd) and this is
normally set up using the 'connect' option. You can use what-ever
you like - I've seen it done with 'echo', 'read' and redirection if
you want to be crude about it.
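Since the complaint was a lack of exit codes, it may help that chat documents its own: 0 for success, 1 for bad parameters, 2 for an execution error, 3 for a timeout, and 4 upward for the first, second, etc. ABORT string that matched. A minimal sketch of building the chat arguments and decoding the result follows. The ABORT list, the ATZ init string, the 30 second timeout, and the function names are assumptions chosen for illustration, not a tested configuration for the A600.

```python
import shlex

ABORTS = ["BUSY", "NO CARRIER", "ERROR"]  # assumed list; order matters for exit codes


def chat_argv(number, aborts=ABORTS, init="ATZ", timeout=30):
    """Build a chat command line: ABORT strings first, then expect/send pairs."""
    argv = ["/usr/sbin/chat", "-v"]
    for s in aborts:
        argv += ["ABORT", s]
    argv += ["TIMEOUT", str(timeout), "", init, "OK", "ATDT" + number, "CONNECT", ""]
    return argv


def connect_option(argv):
    """Quote the argument list into one string for pppd's 'connect' option."""
    return " ".join(shlex.quote(a) for a in argv)


def explain_chat_exit(code, aborts=ABORTS):
    """Translate a chat exit status into a short description."""
    fixed = {
        0: "connected",
        1: "bad chat parameters",
        2: "I/O error or signal during chat",
        3: "timed out waiting for an expected string",
    }
    if code in fixed:
        return fixed[code]
    idx = code - 4  # exit codes 4, 5, 6... map onto the ABORT strings in order
    if 0 <= idx < len(aborts):
        return "modem said " + aborts[idx]
    return "unknown chat exit code %d" % code
```

With the ABORT list above, an exit status of 5 means the modem answered NO CARRIER, which a router script can act on without parsing any log text.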
>> If your IP address is going to be changing, you probably want to
>> stick a '1' into /proc/sys/net/ipv4/ip_dynaddr
>It does, with every single call.
That's a kernel routing problem - which is why it's a kernel parameter
that is altered above.
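For a startup script, setting that parameter amounts to a one-line write. A minimal sketch, assuming it runs as root; the function name is made up, and the path argument exists only so the write can be tried against a scratch file.

```python
def enable_dynaddr(path="/proc/sys/net/ipv4/ip_dynaddr"):
    """Write '1' to the ip_dynaddr kernel parameter (needs root on the real path)."""
    with open(path, "w") as f:
        f.write("1\n")
```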
>What's more, they are using NAT, because they provide the user with
>non-routable addresses in the 10.100 range, as you can see above.
Not relevant to ppp. RFC1918 addresses aren't that unusual.
>> Sent some useless stuff not applicable to a USB device, and got
>> another OK response
>I haven't delved into the command codes, yet, but the standard AT
>commands are evidently implemented by the modem, albeit differently
>than a regular PSTN modem. The string was given to me by someone else
>who got his A600 working. How much is superfluous, I don't yet know.
Blindly sending codes and hoping they're OK isn't always the best
solution. By the way, there is no such thing as 'standard AT
commands'. Those are Hayes commands, and even Hayes didn't implement
or use the commands consistently. What WvDial is using is a
combination of Hayes standard commands, Hayes extended commands, and
a Rockwell command. Standards??? What are those?
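To see what is actually in that init string, it can be split into its individual commands. A rough sketch follows; the regular expression only covers the command shapes that appear in the wvdial string quoted in this thread, and the MEANING table gives the conventional Hayes-style readings, which this particular modem may or may not honour.

```python
import re

# Conventional meanings; a given modem may implement them differently.
MEANING = {
    "Q0": "send result codes",
    "V1": "verbose (word) result codes",
    "E1": "echo commands back",
    "S0=0": "do not auto-answer",
    "&C1": "DCD follows the carrier",
    "&D2": "hang up when DTR drops",
    "+FCLASS=0": "data (not fax) mode",
}

# +NAME=n, &Xn, Sn=n, or a single letter with an optional number.
TOKEN = re.compile(r"\+[A-Z]+=\d+|&[A-Z]\d*|S\d+=\d+|[A-Z]\d*")


def split_init(line):
    """Split an AT init string into individual commands (not for dial strings)."""
    body = line.strip().upper()
    if body.startswith("AT"):
        body = body[2:]
    return TOKEN.findall(body.replace(" ", ""))
```

Sending the commands one at a time and watching for OK or ERROR after each would show which of them the A600 actually accepts.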
>> Supposedly dialed a number, but no modem answered within the timeout
>> period - that's normally 45 seconds.
>No, it's set to be much faster than that. I think it's set to 1/2
>second, but I don't have access to the router right now to check.
[..]
>It's not unusual for it to attempt dialing 8 or 10 times, though.
I'd look into that when you have time. It's hard to believe that
failing 8-10 times before happening to work is designed behavior.
>>>--> Carrier detected. Waiting for prompt.
>> the idiots from wvdial are still living in the past and
>> think there will be a LOGIN: prompt.
>Or else that is just a generic status message which assumes nothing
>about what might be found at the other end.
You really want to read the documentation for wvdial. It's designed
to log in to a getty - and over the development they discovered ISPs
are using the equivalent of autoppp mode because this is what windoze
expects.
>Not everyone dials an ISP with their modem, you know. Of course in
>her case, it is, but a properly written application of this sort
>should be able to handle connections of any sort.
What-ever
>It also would not surprise me if there are not ISPs out there who
>still don't implement ppp.
The other alternatives (besides ppp-over-something) are SLIP, CSLIP
and UUCP. Any idea when you last saw one of those protocols used? I don't
think many distributions come with the 'sliplogin' package any more
(there is a 1996 tarball at sunsite) and I don't think UUCP is being
maintained (1994 tarball at sunsite). There is at least one, and
maybe more ISPs that provide a shell account (panix being the one I'm
aware of), which is basically what you used to get with a BBS setup.
For that, you use a terminal application like minicom (which still
seems to be maintained). But that doesn't offer IP connectivity.
>I don't see how you determine that it is "belatedly". The console
>output doesn't indicate how long after it receives the ppp negotiation
>string it spawns the pppd process. It certainly should not spawn pppd
>until it is certain the other end actually is pppd,
What other purpose do you think there could be? WvDial really isn't
able to handle UUCP, SLIP, or a text shell session. It also doesn't
handle ppp-over-whatever.
>> There is a 'usepeerdns' option to pppd to cause it to ask the peer
>> for these addresses, but pppd puts them into /etc/ppp/resolv.conf
>> rather than screwing with a system configuration file - you could
>> soft-link the two if needed. It's in the pppd man page.
>It already does. I suppose it could be wvdial that's updating
>resolv.conf, but I'm pretty sure it's pppd.
Read the man page - pppd is written by people with security in mind.
>The modem starts up in storage mode with an executable which runs
>automatically and checks to see if the system is already configured
>with their canned application. If not, it loads itself into
>Windows and installs the device drivers and the canned application,
>along with a volatile initialization application, run only once to
>initialize the modem and then never again. Windows then issues a
>switch command which "flips" the modem from storage mode into modem
>mode, and the modem disappears as device 1f28:0021 on its initial USB
>address and reappears as 1f28:0020 on its original USB address +1. The
>initialization routine initializes the modem for access and terminates,
>resetting the modem. Thereafter, it will respond to AT commands. Of
>course, in Linux, none of this happens automagically. Instead, I use
>a little utility produced by a 3rd party developer to handle just such
>modems under Linux which switches the device into modem mode. It works
>very well on a wide variety of such devices.
I can see 1f28 identified as CalComp on the Linux USB list
(http://www.linux-usb.org/usb.ids), but no devices are listed. To me,
it sounds as if your problem is getting this thing to act as a modem.
Once the kernel can see it, wvdial or anything else ought to be able
to make the connection. The authentication failure you showed can
be isolated by looking at the LCP negotiations, but you'd have to put
pppd into debug mode (debug option) and set the system logging daemon to
save this output to one of the log files. Wvdial won't do this. The
logs would look something like this (canned response):
----------------
PPP is like a two handed game of "Mother, may I" (popular with children
50-100 years ago) where each player asks the other if they can do some
thing. There are three possible answers allowed: "yes", "no" and "no,
but" and the game has to be played step by step. Watch the line wraps
below - the log lines may get long, but all begin with a date/time.
Jul 3 09:55:24 gtech pppd[924]: sent [LCP ConfReq id=0x1 <asyncmap
0x0> <magic 0x8bab12d4> <pcomp> <accomp>]
Jul 3 09:55:27 gtech pppd[924]: sent [LCP ConfReq id=0x1 <asyncmap
0x0> <magic 0x8bab12d4> <pcomp> <accomp>]
Jul 3 09:55:27 gtech pppd[924]: rcvd [LCP ConfReq id=0x1 < 00 04 00
00> <mru 1524> <asyncmap 0xa0000> <auth pap> <pcomp> <accomp> < mrru
1524> <<endpoint [MAC:00:c0:7b:90:17:04]>]
Here, this box said hello (sent LCP ConfReq) twice, before hearing from
the peer (rcvd = received from the peer). This box asks for four common
options (asyncmap = characters to escape, magic is a random 32 bit
number used to detect looped back connections, pcomp and accomp are two
header compression protocols.) The peer offers an empty Vendor
specific option (00 04 00 00), wants to have this system authenticate
with PAP (Password Authentication Protocol which shows up as
'<auth pap>'), wants to set a maximum packet size (MRU) it will accept,
wants us to 'escape' the XON/XOFF characters (asyncmap 0xa0000) and is
offering Multilink (mrru and endpoint variables).
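To make the asyncmap remark concrete, here is a minimal sketch (not
anything pppd itself runs; the helper name is made up for this
example). It assumes only what is said above: the asyncmap is a 32 bit
mask, and bit N set means control character N has to be escaped.

```python
def asyncmap_chars(mask):
    """Return the control-character codes (0-31) whose bit is set in the mask."""
    return [code for code in range(32) if mask & (1 << code)]


# The peer's value from the log above.
for code in asyncmap_chars(0xA0000):
    print(hex(code))
```

For 0xa0000 the set bits are 17 and 19, i.e. 0x11 (XON) and 0x13
(XOFF), which matches the description above. An asyncmap of 0x0
escapes nothing.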
Jul 3 09:55:27 gtech pppd[924]: sent [LCP ConfRej id=0x1 < 00 04 00
00> < mrru 1524> <<endpoint [MAC:00:c0:7b:90:17:04]>]
So, this box rejects (ConfRej = "no") the empty vendor option and the
Multilink. One rule of the game is that when something is rejected, it
can't be asked for again.
Jul 3 09:55:27 gtech pppd[924]: rcvd [LCP ConfAck id=0x1 <asyncmap
0x0> <magic 0x8bab12d4> <pcomp> <accomp>]
Jul 3 09:55:27 gtech pppd[924]: rcvd [LCP ConfReq id=0x2 <mru 1524>
<asyncmap 0xa0000> <auth pap> <pcomp> <accomp>]
Jul 3 09:55:27 gtech pppd[924]: sent [LCP ConfAck id=0x2 <mru 1524>
<asyncmap 0xa0000> <auth pap> <pcomp> <accomp>]
The peer comes back and acknowledges (ConfAck = "yes") the original
'hello' approving our requested options, and says hello again, but this
time without the unwanted stuff. This box acknowledges the peer.
Jul 3 09:55:27 gtech pppd[924]: sent [PAP AuthReq id=0x1 user=<hidden>
password=<hidden>]
Jul 3 09:55:28 gtech pppd[924]: rcvd [PAP AuthAck id=0x1 ""]
Because the peer asked for 'PAP' authentication (and we agreed to that),
this box sends in the username and password (here edited out), and the
peer comes back and approves the login. The (here) empty quotes in the
reply _may_ contain a greeting or some other message, but this is
optional.
---------------
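If the debug output does get saved to a log file, a few lines of
script can pull out the interesting steps. The following is only a
sketch: it assumes the log lines look like the sample above
("pppd[PID]: sent/rcvd [PROTOCOL Code id=...") and the function names
are invented for the example.

```python
import re

# Start of one pppd debug record, e.g. "pppd[924]: sent [LCP ConfReq id=0x1 ..."
EVENT = re.compile(r"pppd\[\d+\]: (sent|rcvd) \[(\w+) (\w+) id=(0x[0-9a-fA-F]+)")


def ppp_events(log_text):
    """Return (direction, protocol, code, id) tuples in the order logged."""
    return [m.groups() for m in EVENT.finditer(log_text)]


def rejected_by_us(events):
    """True if this box sent at least one LCP ConfRej."""
    return any(e[:3] == ("sent", "LCP", "ConfRej") for e in events)


def pap_accepted(events):
    """True if the peer answered our PAP request with an AuthAck."""
    return any(e[:3] == ("rcvd", "PAP", "AuthAck") for e in events)


# Three records copied from the sample log, line wraps included.
SAMPLE = '''\
Jul 3 09:55:27 gtech pppd[924]: sent [LCP ConfRej id=0x1 < 00 04 00
00> < mrru 1524> <<endpoint [MAC:00:c0:7b:90:17:04]>]
Jul 3 09:55:27 gtech pppd[924]: sent [PAP AuthReq id=0x1 user=<hidden>
password=<hidden>]
Jul 3 09:55:28 gtech pppd[924]: rcvd [PAP AuthAck id=0x1 ""]
'''

for event in ppp_events(SAMPLE):
    print(event)
```

Because the pattern only needs the start of each record, the wrapped
continuation lines do no harm. An authentication failure would show
up as a missing AuthAck in the same list.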
Once the LCP (and authentication) negotiations are completed, ppp then
goes to an 'IPCP' negotiation to set up IP addresses and IP header
compression, and perhaps a 'CCP' negotiation to set up packet
compression - not really relevant here. The 'ppp' protocol is quite
versatile, and able to work with other networking besides IP (XNS, SNA,
Appletalk, IPv6, IPX, Banyan Vines... and more) depending on what the
link is needed for.
You'd have to watch that 'auth' exchange. There would also be clues
(not shown here) if you are talking to the same peer every time, or
are talking to more than one (not unusual) perhaps with different ideas
of authentication modes. Wvdial doesn't appear to handle such
diversity, never mind handle it easily.
Old guy
Old guy, you sure know your modems.
I've read this discussion all the way through, partly because I'm learning
a bunch of stuff I never knew about modems, ppp, wvdial, & friends; and
partly because I'm remembering how much I used to struggle with and hate
modems, and how glad I am that I don't have to do that any more. But it's
good that there are some Old Guys out there who still know how to work 'em.
Andrew.
--
To reply by email, change "deadspam.com" to "alumni.utexas.net"
>lrhorer <lrh...@satx.rr.com> wrote:
Geez Bill - I _WISH_ you'd learn how to trim stuff you're not
responding to off your replies. I have to use the 'tab' key to
find where you are replying to stuff.
>> No, it's set to be much faster than that. I think it's set to 1/2
>> second, but I don't have access to the router right now to check.
>Sorry, that cannot be true. It takes much longer than that to dial
>and to connect. Why do you not set up the timeout for longer.
Did you notice this is GSM and not plain ole telephones?
>Anyway, what is the operating system on that "router", and where
>did you get that version of pppd.
He's said it was Debian, and I don't think WvDial is used for other
than Linux and anu ppp.
>> I suppose it could be wvdial that's updating resolv.conf, but
>> I'm pretty sure it's pppd.
>They probably are. But I have no idea since you have a highly
>non-standard pppd.
No idea where you got that idea.
Old guy
It is also very hard to believe that a phone could be answered in 1/2 a
second. Ie, the timeout has to be longer.
>
>>>>--> Carrier detected. Waiting for prompt.
>
>>> the idiots from wvdial are still living in the past and
>>> think there will be a LOGIN: prompt.
>
>>Or else that is just a generic status message which assumes nothing
>>about what might be found at the other end.
>
> You really want to read the documentation for wvdial. It's designed
> to log in to a getty - and over the development they discovered ISPs
> are using the equivalent of autoppp mode because this is what windoze
> expects.
Back in the days when pppd on dialup was popular, and when I wrote that
web page, one of the big problems was the difficulties people had with
wvdial. If it worked, fine; if it did not, it was a mess, and very
difficult to debug. And wvdial made all sorts of unwarranted
assumptions. It really got in the way. You can tell it not to try a login
(I do not remember exactly how anymore), but chat really is far
simpler.
>
>>Not everyone dials an ISP with their modem, you know. Of course in
>>her case, it is, but a properly written application of this sort
>>should be able to handle connections of any sort.
>
> What-ever
>
>>It also would not surprise me if there are not ISPs out there who
>>still don't implement ppp.
If they do not, then they are not acting as an ISP onto the net. ppp is the
only way people have. wvdial assumes that first you log onto your
service provider, and then after that you run pppd. That was their
authentication scheme. Eventually the service providers discovered that
there had existed for 10-15 years something called pap or chap, and
eventually they all went over to that. But wvdial was started when,
because many service providers read very very slowly, they had not yet
gotten to the chapter discussing pap and chap.
>
> The other alternatives (besides ppp-over-something) are SLIP, CSLIP
> and UUCP. Any idea when you saw one of those protocols used? I don't
> think many distributions come with the 'sliplogin' package any more
> (there is a 1996 tarball at sunsite) and I don't think UUCP is being
> maintained (1994 tarball at sunsite). There is at least one, and
> maybe more ISPs that provide a shell account (panix being the one I'm
> aware of), which is basically what you used to get with a BBS setup.
> For that, you use a terminal application like minicom (which seems to
> be being maintained). But that doesn't offer IP connectivity.
>
>>I don't see how you determine that it is "belatedly". The console
>>output doesn't indicate how long after it receives the ppp negotiation
>>string it spawns the pppd process. It certainly should not spawn pppd
Yes, it does. It waits for a login, and gets a ppp negotiation string
back. It should NOT wait for a login. No one uses that anymore (and
almost no one did 10 years ago).
The problem is that his logs do NOT show the negotiation session in text
form. That is NOT pppd. pppd shows the negotiations all in text form,
exactly as you quote below. I have severe doubts that he is actually
using pppd.
Ah yes, the conflict between readability and maintaining past history. I
never go back into discussion to read the history, so I very much like
to have the discussion available with the answer. I detest cases where
someone answers but I have no idea what the question is. On the other
hand, I also hate having to search for the answers in an interminable
history, but I tend to see that as the lesser of two evils. You do not.
>
>>> No, it's set to be much faster than that. I think it's set to 1/2
>>> second, but I don't have access to the router right now to check.
>
>>Sorry, that cannot be true. It takes much longer than that to dial
>>and to connect. Why do you not set up the timeout for longer.
>
> Did you notice this is GSM and not plain ole telephones?
I did, but I still do not believe 1/2 sec. response.
>
>>Anyway, what is the operating system on that "router", and where
>>did you get that version of pppd.
>
> He's said it was Debian, and I don't think WvDial is used for other
> than Linux and anu ppp.
>
>>> I suppose it could be wvdial that's updating resolv.conf, but
>>> I'm pretty sure it's pppd.
>
>>They probably are. But I have no idea since you have a highly
>>non-standard pppd.
>
> No idea where you got that idea.
Because of the form of the pppd logs which he posted. They look nothing
like the anu PPPD logs.
>
> Old guy
>Moe Trin <ibup...@painkiller.example.tld.invalid> wrote:
>Ah yes, the conflict between readability and maintaining past history. I
>never go back into discussion to read the history, so I very much like
>to have the discussion available with the answer. I detest cases where
>someone answers but I have no idea what the question is. On the other
>hand, I also hate having to search for the answers in an interminable
>history, but I tend to see that as the lesser of two evils. You do not.
You're using slrn now, not nn. Press the question mark key now, and
you get key-stroke helps (press q to quit the help). The 'tab' key
mentioned takes you to the next part of the article that isn't a reply.
If you press the 'Esc' key then the 'p' key, it pulls up the previous
article referred to. The sequence 'Esc' '1' 'Esc' 'p' will cause slrn
to ask the news server for all articles in the thread (and the news server
_may_ provide them - you're posting from highwinds-media - I don't know
if they do, giganews doesn't, but the server I normally use does).
>> Did you notice this is GSM and not plain ole telephones?
>I did, but I still do not believe 1/2 sec. response.
I don't disbelieve it - that's a completely digital link. I wish
Clifford would join in, as he has some experience with GSM links.
>>> But I have no idea since you have a highly non-standard pppd.
>> No idea where you got that idea.
>Because of the form of the pppd logs which he posted. They look
>nothing like the anu PPPD logs.
True - those are the wvdial logs. Pretty useless, no?
Old guy
>Moe Trin <ibup...@painkiller.example.tld.invalid> wrote:
>> I'd look into that when you have time. It's hard to believe that
>> failing 8-10 times before happening to work is designed behavior.
>It is also very hard to believe that a phone could be answered in 1/2
>a second. Ie, the timeout has to be longer.
In 'chat', that's the 'timeout' variable. WvDial doesn't seem to list
anything that sets this, so it could be the phone itself deciding that
no one is home.
>Back in the days when pppd on dialup was popular, and when I wrote
>that web page, one of the big problems was the difficulties people
>had with wvdial. If it worked fine, if it did not, it was a mess,
>and very difficult to debug. And wvdial made all sorts of unwarranted
>assumptions. It really got in the way.
It wasn't just wvdial - kppp and some of the whizzo distribution
specific tools caused quite similar problems.
>You can tell it not to try a login (I do not remember exactly how
>anymore) but chat really really is far simpler.
Stupid Mode
When wvdial is in Stupid Mode, it does not attempt to interpret
any prompts from the terminal server. It starts pppd
immediately after the modem connects. Apparently there are
ISP's that actually give you a login prompt, but work only if
you start PPP, rather than logging in. Go figure. Stupid Mode
is (naturally) disabled by default.
I always used to ask which of the authors the mode was named after.
>wvdial assumes that first you log onto your service provider, and
>then after that you run pppd. That was their authentication scheme.
>Eventually the service providers discovered that there had existed
>for 10-15 years something called pap or chap, and eventually they
>all went over to that.
Not quite. PPP goes back to 1989 (RFC1134). 'PAP' and 'CHAP' were
described in RFC1334 (1992), but only PAP defined. CHAP-MD5 (RFC1994)
dates from 1996, but that was after microsoft invented the telephone
or the internet or something in win95. That was using PAP by default
until microsoft invented MSCHAP-80 (RFC2433 in 1998) which you may
recall was incompatible with CHAP-MD5. In 2000, they invented
MSCHAP-81 which was incompatible with either CHAP-MD5 or MSCHAP-80.
Both MSCHAP versions are supported by ANU pppd, but I think microsoft
has silently dropped both versions.
>But wvdial was started when, because many service providers read very
>very slowly, they had not yet gotten to the chapter discussing pap
>and chap.
WvDial originated in late 1997, and win95 had made RFC1334 mandatory
for most ISPs. There were a few holdouts - earthlink and netcom come
to mind, and they supplied binaries to their windoze customers for
login.
>The problem is that his logs do NOT show the negotiation session in
>text form. That is NOT pppd. pppd shows the negotiations all in text
>form, exactly as you quote below. I have severe doubts that he is
>actually using pppd.
Other response - he's showing wvdial log data.
Old guy
>Old guy, you sure know your modems.
Thank you - but no, the real modem expert was Rob Clark who had the
'Winmodems are not Modems' web site referenced in the Modem-HOWTO.
>I've read this discussion all the way through, partly because I'm
>learning a bunch of stuff I never knew about modems, ppp, wvdial, &
>friends; and partly because I'm remembering how much I used to
>struggle with and hate modems, and how glad I am that I don't have
>to do that any more.
Recall Robert Hart's original 'PPP-HOWTO'? That document drove me
nuts because the second or third system I set up in ~1995 was trying
to dial in to an ISP who had advanced to the RFC1334 authentication
method, but whose hell-desk staff couldn't spell PPP (or DNS) no matter
what you did and thus were unable to provide the hint I needed. It
took most of a week to get things running. Thing that bothered me the
most was that the week was my vacation and I was unable to access
the Internet and "was on my own". Once you realize how the game is
meant to be played (end chat when the modem reports CONNECT and don't
even send a carriage return), it becomes extremely easy to get a
ppp connection running. Most ``helper'' tools are written by those
who haven't bothered to investigate how ppp is used, and as a result
are less than optimum. Most also hide any ppp debug data, which
makes it all the harder to figure out what went wrong.
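The "stop as soon as the modem says CONNECT" rule is easy to show in
a few lines. This is a sketch of the idea only, not how chat is
implemented: read_chunk is a hypothetical callable that returns
whatever text has arrived from the serial port (or an empty string),
the abort strings are just typical modem result codes, and the 45
second default mirrors the timeout mentioned elsewhere in this thread.

```python
import time


def wait_for(read_chunk, expected, abort=("NO CARRIER", "BUSY", "ERROR"),
             timeout=45.0):
    """Read until `expected` is seen (True), or an abort string or timeout (False)."""
    deadline = time.monotonic() + timeout
    seen = ""
    while time.monotonic() < deadline:
        chunk = read_chunk()
        if not chunk:
            time.sleep(0.01)   # nothing yet; avoid spinning
            continue
        seen += chunk
        if expected in seen:
            return True        # hand the line to pppd now, send nothing more
        if any(word in seen for word in abort):
            return False
    return False


# Fake modem for illustration: the reply arrives in two pieces.
replies = iter(["\r\nCONN", "ECT 115200\r\n"])
print(wait_for(lambda: next(replies, ""), "CONNECT", timeout=1.0))
```

The point of returning the moment CONNECT appears is that the peer is
expected to start ppp on its own; anything typed at it after that
only gets in the way.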
>But it's good that there are some Old Guys out there who still know
>how to work 'em.
I don't see that much for dialin questions any more. In 2003, we used
to average about four articles a day in the Usenet newsgroup
comp.protocols.ppp, and one of the authors of the current package was
a regular poster. Last year, it was around one article per week.
The group comp.dcom.modems went from ten a day to ten a month, and
most of those are spam now.
Old guy
> On Thu, 14 Jan 2010, in the Usenet newsgroup comp.os.linux.networking,
> in article <tKadnZgERulUSdPW...@giganews.com>, lrhorer
> wrote:
>
>>Moe Trin wrote:
>
>>> I'd want to find out why it's disappearing from lsusb - never mind
>>> /dev/ttyACM0.
>
>>Well, ultimately I do, too, of course. In the mean time, however, I
>>need to account for and handle the possibility, even if it rarely (or
>>hopefully never) occurs in daily useage.
>
> Another poster suggested reloading the cdc-acm module. That probably
> requires root, but have you looked into that?
Not yet. I just got the router back this evening, and I haven't had any
time to look into the specifics of any of the failures. I'm still
running down some coding issues on one of the control modules. One of
the header files (not mine) has some issues, and so the code won't
compile. I probably won't get back to looking into the failure modes
until this weekend, if then. I did turn up yet another failure mode,
though. Oddly, the device got registered (non switched), but wouldn't
respond to ordinary system calls, so udev couldn't even produce the
device targets and the switch code could not flip it. There's no
question I am going to have to shut down and restore power to the
device when some of these failure modes are encountered. It just
shouldn't be the default action.
>>It's certainly possible, but I don't think that is the case, at least
>>not for most of the failure modes. I could be wrong, of course.
>>Further testing will tell. but before I can really do any failure
>>testing, I need to come up with an ironclad means of automatic
>>recovery under the worst case scenario.
> Automatic - that may be the problem, as you don't know what the actual
> failure is. That the device disappears from lsusb has to be attacked.
> If the device doesn't exist, there's not much an application like
> minicom, pppd/chat, or wvdial can do.
My point exactly. I have to address such eventualities, however.
That's one of the common issues with unattended devices which must
recover from unexpected issues autonomously. I deal with such systems
all the time, and I access troublesome devices all the time remotely,
but admittedly this is the first time I have had to do development work
for such a device remotely.
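For what it is worth, the decision I have in mind could be sketched
roughly as below. This is an outline under assumptions, not the
router's actual code: the two USB IDs are the ones quoted in this
thread, the function names and action labels are invented, and the
lsusb text is a made-up sample in the usual "Bus ... Device ...: ID
vvvv:pppp" layout.

```python
import re

STORAGE_ID = "1f28:0021"   # before the mode switch (from this thread)
MODEM_ID = "1f28:0020"     # after the mode switch (from this thread)


def modem_state(lsusb_output):
    """Classify the modem as 'modem', 'storage' or 'absent' from lsusb text."""
    found = re.findall(r"\bID ([0-9a-fA-F]{4}:[0-9a-fA-F]{4})", lsusb_output)
    ids = {usb_id.lower() for usb_id in found}
    if MODEM_ID in ids:
        return "modem"
    if STORAGE_ID in ids:
        return "storage"
    return "absent"


def next_action(state, tty_present):
    """Pick the mildest recovery step first; power-cycling is the last resort."""
    if state == "modem":
        return "dial" if tty_present else "reload cdc-acm"
    if state == "storage":
        return "run mode switch"
    return "power-cycle"


# Hypothetical lsusb output for illustration.
sample = (
    "Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub\n"
    "Bus 001 Device 005: ID 1f28:0021\n"
)
state = modem_state(sample)
print(state, "->", next_action(state, tty_present=False))
```

It does not cover the case where the device is registered but will
not answer system calls; that one would still have to fall through to
a power cycle after the switch attempt fails.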
>>Thanks. I've looked briefly into chat as a solution. I may use it,
>>or perhaps minicom + expect.
> 'chat' was designed for this. There _used_ to be a "PPP-over-Minicom"
Well, OK. I only briefly skimmed the man page for chat, but it
certainly holds promise. If I can make some headway with the device
control (or get totally stumped) this weekend, I'll have to look into
creating a chat script to be called by the startup script.
> mini-howto, but I haven't seen it mentioned in over ten years. Few
> have bothered to use it, as it's unduly complex to setup/use. The
> pppd wants a "working" serial link (tty option to pppd) and this is
> normally setup using the 'connect' option. You can use what-ever
> you like - I've seen it done with 'echo', 'read' and redirection if
> you want to be crude about it.
>
>>> If your IP address is going to be changing, you probably want to
>>> stick a '1' into /proc/sys/net/ipv4/ip_dynaddr
>
>>It does, with every single call.
>
> That's a kernel routing problem - which is why it's a kernel parameter
> that is altered above.
I lost you, there. How is the fact the ISP does not try to maintain
its client's IP address in its DHCP server a kernel routing problem?
>>What's more, they are using NAT, because they provide the user with
>>non-routable addresses in the 10.100 range, as you can see above.
>
> Not relevant to ppp. RFC1918 addresses aren't that unusual.
I never said it was relevant to ppp, and I am very well aware that
RFC1918 addresses are commonplace. If they weren't, we would have run
out of ipv4 address space long ago. All I said is that this ISP
employs NAT to supply addresses to its clients. It's a rather cheezy,
or at least obnoxiously restrictive thing to do. It certainly prevents
the subscriber from running any sort of internet-facing server.
>>> Sent some useless stuff not applicable to a USB device, and got
>>> another OK response
>
>>I haven't delved into the command codes, yet, but the standard AT
>>commands are evidently implemented by the modem, albeit differently
>>than a regular PSTN modem. The string was given to me by someone else
>>who got his A600 working. How much is superfluous, I don't yet know.
>
> Blindly sending codes and hoping they're OK isn't always the best
> solution.
No, of course not.
> By the way, there is no such thing as 'standard AT
> commands'. Those are Hayes commands, and even Hayes didn't implement
> or use the commands consistently.
Yes, so I was being sloppy.
> What WvDial is using are a
> combination of Hayes standard command, Hayes extended commands, and
> a Rockwell command. Standards??? What are those?
In datacom, they are indeed sometimes a bit of a mythical creature.
>>> Supposedly dialed a number, but no modem answered within the timeout
>>> period - that's normally 45 seconds.
>
>>No, it's set to be much faster than that. I think it's set to 1/2
>>second, but I don't have access to the router right now to check.
> [..]
>>It's not unusual for it to attempt dialing 8 or 10 times, though.
>
> I'd look into that when you have time. It's hard to believe that
> failing 8-10 times before happening to work is designed behavior.
Intentionally designed? Possibly not. This company doesn't really
know what it is doing, though, and that it might be by unintentional
design would in no way surprise me. I'm not that familiar with CDMA,
though, so I can't really say one way or the other.
>>>>--> Carrier detected. Waiting for prompt.
>
>>> the idiots from wvdial are still living in the past and
>>> think there will be a LOGIN: prompt.
>
>>Or else that is just a generic status message which assumes nothing
>>about what might be found at the other end.
>
> You really want to read the documentation for wvdial. It's designed
> to log in to a getty - and over the development they discovered ISPs
> are using the equivalent of autoppp mode because this is what windoze
> expects.
If I'm going to abandon it outright, I don't really see the point in
investigating further. I'll just take your word for it.
>>Not everyone dials an ISP with their modem, you know. Of course in
>>her case, it is, but a properly written application of this sort
>>should be able to handle connections of any sort.
>
> What-ever
Did I mention that I am a network engineer for a major
telecommunications company, and that we are one of the largest ISPs in
the country? We have literally hundreds of fault tolerant modems
around the country that auto-dial hundreds of other fault-tolerant
modems around the country if the networking fails to one of the cities.
They do not establish ppp sessions.
>>It also would not surprise me if there are not ISPs out there who
>>still don't implement ppp.
>
> The other alternatives (besides ppp-over-something) are SLIP, CSLIP
> and UUCP. Any idea when you saw one of those protocols used?
The last time I used SLIP was under OS/2, on a US Robotics 14.4K modem,
although I have to deal with some UUCP crap on several of my HP-UX
servers at work. What do you want to bet there aren't ISPs in
Guatemala or Irkutsk who are still using SLIP over 9600 bps dial-up
modems, though?
> I don't
> think many distributions come with the 'sliplogin' package any more
> (there is a 1996 tarball at sunsite) and I don't think UUCP is being
> maintained (1994 tarball at sunsite).
We're still using some equipment and software created in 1990, and
we're a multi-billion dollar company. Imagine what some of the
companies in South America are using, not to mention private
individuals?
> There is at least one, and
> maybe more ISPs that provide a shell account (panix being the one I'm
> aware of), which is basically what you used to get with a BBS setup.
> For that, you use a terminal application like minicom (which seems to
> be being maintained). But that doesn't offer IP connectivity.
>
>>I don't see how you determine that it is "belatedly". The console
>>output doesn't indicate how long after it receives the ppp negotiation
>>string it spawns the pppd process. It certainly should not spawn pppd
>>until it is certain the other end actually is pppd,
>
> What other purpose do you think there could be? WvDial really isn't
> able to handle UUCP, SLIP, or a text shell session. It also doesn't
> handle ppp-over-whatever.
OK. Again, I'll take your word for it.
>>> There is a 'usepeerdns' option to pppd to cause it to ask the peer
>>> for these addresses, but pppd puts them into /etc/ppp/resolv.conf
>>> rather than screwing with a system configuration file - you could
>>> soft-link the two if needed. It's in the pppd man page.
>
>>It already does. I suppose it could be wvdial that's updating
>>resolv.conf, but I'm pretty sure it's pppd.
>
> Read the man page - pppd is written by people with security in mind.
You lost me, again. Completely. Wvdial is calling pppd with the
usepeerdns option, which by every reading I have done in the pppd man
page tells me it is pppd which is updating /etc/resolv.conf.
Admittedly, I am not as familiar with ppp as I should be, since I
rarely ever use it for anything, but the man page for pppd
specifically says, "usepeerdns - Ask the peer for up to 2 DNS server
addresses. The addresses supplied by the peer (if any) are passed to
the /etc/ppp/ip-up script in the environment variables DNS1 and DNS2,
and the environment variable USEPEERDNS will be set to 1. In
addition, pppd will create an /etc/ppp/resolv.conf file containing one
or two nameserver lines with the address(es) supplied by the peer."
So how is it that my understanding of the security features of pppd is
flawed, or that I am mistaken in believing pppd is
updating /etc/resolv.conf?
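One way to reconcile the two readings: going by the man page text
quoted above, pppd itself writes /etc/ppp/resolv.conf and exports
DNS1, DNS2 and USEPEERDNS to the ip-up script, so anything that lands
in /etc/resolv.conf would have to be put there by a later step. A
sketch of what such a step might compute follows; the function name
and the addresses are hypothetical, and this is not taken from any
shipped script.

```python
def resolv_conf_lines(env):
    """Build nameserver lines from the variables pppd passes to ip-up."""
    if env.get("USEPEERDNS") != "1":
        return ""                      # the peer was never asked for DNS
    servers = [env[name] for name in ("DNS1", "DNS2") if env.get(name)]
    return "".join(f"nameserver {addr}\n" for addr in servers)


# Hypothetical environment, as an ip-up hook might see it.
example_env = {"USEPEERDNS": "1", "DNS1": "10.100.0.1", "DNS2": "10.100.0.2"}
print(resolv_conf_lines(example_env), end="")
```

Whether the copy into /etc/resolv.conf is done by an ip-up hook or by
wvdial on this particular router is exactly the open question; the
sketch only shows that pppd does not need to touch the system file
for it to happen.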
>>The modem starts up in storage mode with an executable which runs
>>automatically and checks to see if the system is already configured
>>with their canned application. If not, it loads itself into
>>Windows and installs the device drivers and the canned application,
>>along with a volatile initialization application, run only once to
>>initialize the modem and then never again. Windows then issues a
>>switch command which "flips" the modem from storage mode into modem
>>mode, and the modem disappears as device 1f28:0021 on its initial USB
>>address and reappears as 1f28:0020 on its original USB address +1. The
>>initialization routine initializes the modem for access and terminates,
>>resetting the modem. Thereafter, it will respond to AT commands. Of
>>course, in Linux, none of this happens automagically. Instead, I use
>>a little utility produced by a 3rd party developer to handle just such
>>modems under Linux which switches the device into modem mode. It
>>works very well on a wide variety of such devices.
>
> I can see 1f28 identified as CalComp on the Linux USB list
> (http://www.linux-usb.org/usb.ids), but no devices are listed. To me,
> it sounds as if your problem is getting this thing to act as a modem.
More like keeping it acting like a modem. Getting it to do so in the
first place is easy.
> Once the kernel can see it, wvdial or anything else ought to be able
> to make the connection. The authentication failure you showed can
> be isolated by looking at the LCP negotiations, but you'd have to get
> pppd to debug mode (debug option) and set the system logging daemon to
> save this output to one of the log files. Wvdial won't do this. The
> logs would look something like this (canned response):
But that can only happen if ppp can talk to the modem. The biggest
issues I am having at the moment are when pppd *CAN'T* talk to the
modem (and nothing else can, either). It's also my primary concern for
getting the system to recover autonomously. Once /dev/ttyACM0 is active
and working, the rest is a piece of cake. You guys keep trying to move
me towards troubleshooting the network utilities, but I've got much
bigger fish to fry at the moment.
> ----------------
> PPP is like a two handed game of "Mother, may I" (popular with
Did I mention I am a professional telecommunications engineer? I've
been doing this for over 30 years. I really, really don't need
hand-holding. I'm relatively new to Linux, so I don't know a lot of
what's available for it, but I don't need any primers. Again, no
offense intended, but I'm looking for advice on what applications are
available, not how to troubleshoot issues.
The modem has been replaced under warranty. The replacement behaves
precisely the same way. Retail cost is $119. This and one other very
similar modem are the only ones they support. Note the modem doesn't
typically die and then disconnect. It's the other way around.
>>> Supposedly dialed a number, but no modem answered within the timeout
>>> period - that's normally 45 seconds.
>>
>> No, it's set to be much faster than that. I think it's set to 1/2
>> second, but I don't have access to the router right now to check.
>> The
>
> Sorry, that cannot be true. It takes much longer than that to dial and
> to connect.
I'm not sure why you think that. This isn't a PSTN modem. PRI and BRI
connections are ALWAYS made in less than 1/2 second, for example. I'm
not intimately familiar with CDMA protocols, but I am fairly intimately
familiar with the equipment which handles it, and there is no reason
why a connect cannot be failed or accepted in less than a second.
> Why do you not set up the timeout for longer.
Well, first of all because there is no timeout parameter for wvdial, so
it isn't possible. Secondly, because it clearly is not necessary. The
modem never fails to get a connection, and in far less time than your
45 second suggestion, at that.
>> example I posted took less than 10 seconds to connect. Unless it
>> fails altogether, it has never taken more than 90 seconds to do
>> everything, including clearing the firewall, shutting down the
>> Ethernet port, switching the modem, dialing, establishing ppp,
>> turning the Ethernet port back on, establishing the VPN tunnel,
>> clearing the firewall again, verifying connectivity, establishing
>> DNS, establishing NAT, putting the (correct) firewall back in place,
>> and setting up the routes. It's not unusual for it to attempt dialing
>> 8 or 10 times, though.
>
> If that is because of a highly inappropriate timeout, then you should
> fix that.
I'll look into it when I implement chat (or whatever), but it's not a
high priority.
>>> This time, a modem was detected on the other end of the line. As I
>>> mentioned, the idiots from wvdial are still living in the past and
>>> think there will be a LOGIN: prompt.
>>>
>>>>~[7f]}#@!}!}!} }9}"}&} } } } }#}%B#}%}%}&X[05]@[16]}'}"}(}"}9y~
>>
>> Or else that is just a generic status message which assumes nothing
>> about what might be found at the other end. Not everyone dials an ISP
>> with their modem, you know. Of course in her case, it is, but a
>> properly written application of this sort should be able to handle
>> connections of any sort.
>
> No, it is not. It is the remote side sending a ppp negotiation frame.
To what, exactly, were you responding? What "is not"? Moe and I were
speaking about the line which says, "--> Carrier detected. Waiting for
prompt." That is most certainly not a ppp negotiation frame.
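The line of apparent garbage quoted a little further up is a separate
matter, and it can be unpicked. The sketch below assumes two things
that are not confirmed anywhere in this thread: that wvdial prints
non-printable bytes as [hh], and that the rest is ordinary RFC 1662
byte stuffing (0x7e is the frame flag, 0x7d means "XOR the next byte
with 0x20"). The function names are invented for the example.

```python
import re

FLAG, ESC = 0x7E, 0x7D

# The line as it appears in the quoted wvdial output.
LINE = r'''~[7f]}#@!}!}!} }9}"}&} } } } }#}%B#}%}%}&X[05]@[16]}'}"}(}"}9y~'''


def display_to_bytes(text):
    """Turn the printed form back into bytes; [7f] is read as byte 0x7f."""
    out = bytearray()
    for hexpair, char in re.findall(r"\[([0-9a-fA-F]{2})\]|(.)", text):
        out.append(int(hexpair, 16) if hexpair else ord(char))
    return bytes(out)


def unescape(stuffed):
    """Drop flag bytes and undo the 0x7d escaping."""
    out = bytearray()
    it = iter(stuffed)
    for byte in it:
        if byte == FLAG:
            continue
        if byte == ESC:
            byte = next(it) ^ 0x20
        out.append(byte)
    return bytes(out)


def lcp_header(frame):
    """Return (code, identifier, length) found after address/control/protocol."""
    return frame[4], frame[5], int.from_bytes(frame[6:8], "big")


frame = unescape(display_to_bytes(LINE))
print(frame.hex(" "))
print(lcp_header(frame))
```

Read this way the frame is 31 bytes, with a header of code 1
(ConfReq), id 1, length 25, and the options inside add up to exactly
that length. The leading bytes come out as 7f 03 40 21 where ff 03 c0
21 would be expected for LCP, which would be consistent with the
display having dropped the top bit of each byte, so the magic number
and checksum cannot be trusted from this printout.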
> wvdial is well known for inappropriately waiting for a login prompt.
> That is just wrong. Again, I strongly suggest that web page for
> setting up an appropriate response.
> On the other hand, I have no idea what sort of ppp you have. The
> return messages it is giving you are just bizarre. They should be text
> messages.
>
> Anyway, what is the operating system on that "router", and where did
> you get that version of pppd.
It's not a "router". It's a router, specifically a Linux router. It's
running Debian "Lenny" under kernel 2.6.26-2-686. The version of pppd
is 2.4.4 Rel 10.1, and of course it is from the Debian Stable
repository.
>>> Instead, the peer sends a ppp frame, like every ISP has been doing
>>> for the past 15 plus years.
>>>
>>>>--> PPP negotiation detected.
>>
>> And obviously it responds appropriately. I haven't looked into the
>
> No it obviously does NOT respond properly. That by chance once in 8
> times it connects is a fluke rather than proper operation.
The fact it connects in less than 10 seconds every time, consistently
takes very nearly the same amount of time to connect, takes no more
time to connect under Linux than under Windows, and consistently
undergoes 6 - 8 retries rather than random numbers of attempts strongly
suggests you are incorrect.
>> guts of wvdial, and it might indeed be buggy and klunky, but the fact
>> they employ a response you don't like and support older and less
>> commonly used protocols isn't a valid criticism, when it clearly does
>> support a more universal one. It also would not surprise me if there
>
> Oh yes it is. I have dealt with complaints about wvdial for 10 years.
>
>> are not ISPs out there who still don't implement ppp.
>
> No, there are not. ppp is how it works.
And you have personally connected to every one of the 10,000+ dial-up
ISPs out there to verify this?
>>>>--> pppd: <EF><BF><BD>'<EF><BF><BD>[08]<EF><BF><BD>%<EF><BF><BD>[08]
>>>
>>> This should be 8 bit ppp frames - hard to say what it actually is.
>>
>> I agree with you, here. It certainly should be better able to
>> translate the protocols into something more intelligible. That's one
>> reason I asked for alternatives.
>
> I have no idea what kind of pppd you have on your system.
You can ask Marco d'Itri <m...@linux.it>, he's the maintainer.
>>> and it's getting RFC1918 addresses for local, remote/peer, and two
>>> DNS servers. My ISPs don't play Musical IP Addresses with the DNS,
>>> so I've got those written directly into /etc/resolv.conf.
>>
>> Most of your better ISPs try to avoid shuffling their clients' IP
>> addresses. This is not one of your better ISPs. They are
>> inexpensive, though, and my sister is retired and living on a very
>> fixed income.
>
> No, ISPs hand out IP addresses at random if it is a dialup operation.
> Almost none give static IPs.
This is supposed to be a broadband operation. Of course, it emulates a
dial-up operation, which is I suppose their excuse.
> On Fri, 15 Jan 2010, in the Usenet newsgroup comp.os.linux.networking,
> in article <slrnhkvc0c...@wormhole.physics.ubc.ca>, unruh
>>> Did you notice this is GSM and not plain ole telephones?
>
>>I did, but I still do not believe 1/2 sec. response.
>
> I don't disbelieve it - that's a completely digital link. I wish
> Clifford would join in, as he has some experience with GSM links.
Exactly. There's no interface to the PSTN, and either the spectrum is
available or it isn't. There's no actual dialing involved, either,
although as I mentioned earlier, ISDN systems (which may tie into the
very same class 5 switch as your POTS line) always connect in less than
1/2 second.
>>>> But I have no idea since you have a highly non-standard pppd.
>
>>> No idea where you got that idea.
>
>>Because of the form of the pppd logs which he posted. They look
>>nothing like the ANU pppd logs.
>
> True - those are the wvdial logs. Pretty useless, no?
>
> Old guy
It's not really a log, per se. It's stdout redirected to a file.
This is an example of the pppd logs:
Jan 9 09:48:59 Cricket pppd[13333]: Connect time 68.2 minutes.
Jan 9 09:48:59 Cricket pppd[13333]: Sent 125726 bytes, received 214681 bytes.
Jan 9 09:49:00 Cricket pppd[13333]: CHAP authentication succeeded
Jan 9 09:49:00 Cricket pppd[13333]: CHAP authentication succeeded
Jan 9 09:49:00 Cricket pppd[13333]: local IP address 10.100.18.173
Jan 9 09:49:00 Cricket pppd[13333]: remote IP address 172.29.122.162
Jan 9 09:49:00 Cricket pppd[13333]: primary DNS address 172.28.221.53
Jan 9 09:49:00 Cricket pppd[13333]: secondary DNS address 172.28.221.54
Jan 9 11:26:48 Cricket pppd[2544]: pppd 2.4.4 started by root, uid 0
Jan 9 11:26:48 Cricket pppd[2544]: Using interface ppp0
Jan 9 11:26:48 Cricket pppd[2544]: Connect: ppp0 <--> /dev/ttyACM0
Jan 9 11:26:51 Cricket pppd[2544]: CHAP authentication succeeded
Jan 9 11:26:51 Cricket pppd[2544]: CHAP authentication succeeded
Jan 9 11:26:52 Cricket pppd[2544]: local IP address 10.100.23.215
Jan 9 11:26:52 Cricket pppd[2544]: remote IP address 172.29.122.162
Jan 9 11:26:52 Cricket pppd[2544]: primary DNS address 172.28.221.53
Jan 9 11:26:52 Cricket pppd[2544]: secondary DNS address 172.28.221.54
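
Since the router's control scripts need something machine-readable, lines
like the ones above can be reduced to a per-session summary. What follows
is only a sketch: it assumes the syslog format is exactly as posted (single
spaces, "pppd[PID]:" prefix), and the function name and dictionary keys are
made up for illustration, not part of pppd.

```python
import re

# Matches the part of a syslog line that pppd itself wrote, e.g.
# "Jan 9 11:26:52 Cricket pppd[2544]: local IP address 10.100.23.215"
LINE_RE = re.compile(r"pppd\[(\d+)\]: (.*)$")

# Message prefixes copied from the log excerpt in this post.
ADDRESS_KEYS = (
    "local IP address",
    "remote IP address",
    "primary DNS address",
    "secondary DNS address",
)


def summarize_pppd_log(lines):
    """Group facts from pppd syslog lines by pppd process ID."""
    sessions = {}
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a pppd line
        pid = int(match.group(1))
        msg = match.group(2).strip()
        info = sessions.setdefault(pid, {})
        for key in ADDRESS_KEYS:
            if msg.startswith(key):
                info[key] = msg[len(key):].strip()
        if msg.startswith("Connect time"):
            # "Connect time 68.2 minutes." -> 68.2
            info["connect_minutes"] = float(msg.split()[2])
    return sessions
```

A control script could call this on the tail of the syslog file and compare
"connect_minutes" against the 12 hour cutoff, or notice that a new PID never
reached the "local IP address" line.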
> lrhorer <lrh...@satx.rr.com> wrote:
>>
>> The point that is being missed, however, is that I am not asking for
>> help with troubleshooting the modem. Rather, I am looking for a
>> communications solution that will allow me to have better granularity
>> in interfacing the router's control scripts with the dialing utility.
>> So far, the front runner seems to be writing a chat script. If there
>> is a better solution, though, I'm all ears.
>
> You should be; it sounds like the modem gets confused and drops off
> the USB bus.
Yes, but that doesn't mean I require assistance troubleshooting it. I
appreciate people wanting to be helpful, but I'm not new at dealing
with electronics down to the component level. I wasn't new at it when
the IBM PC was first introduced.
> Have you tried removing and re-inserting the cdc-acm
> module? That *might* be enough to re-initialize the modem to a usable
> state.
I'll give it a shot when I get the chance.
> There are also ways to control the power to a USB port via
> sysfs I believe. This would allow you to fully reset the modem.
Are you sure about that? I've looked, and I can't find any
documentation to that effect. I'm working on writing a userspace
binary to allow me to control the port, but if there is already a
solution out there, I would be thrilled to use it. Of course, many USB
ports do not have the capability in the first place, but if necessary I
can use an external hub that does support power management.
> As for "interfacing the router's control scripts with the dialing
> utility": chat.
That seems to be the consensus.
pppd updates /etc/ppp/resolv.conf. It is up to the user's programs to
transfer that information into /etc/resolv.conf. The writers of pppd did
not feel that it was their business to be altering system files.
> Admittedly, I am not as familiar with ppp as I should be, since I
> rarely ever use it for anything, but the man page for pppd
> specifically says, "usepeerdns - Ask the peer for up to 2 DNS server
> addresses. The addresses supplied by the peer (if any) are passed to
> the /etc/ppp/ip-up script in the environment variables DNS1 and DNS2,
> and the environment variable USEPEERDNS will be set to 1. In
> addition, pppd will create an /etc/ppp/resolv.conf file containing one
^^^^^^^^^^^^^^^^^^^^
Please read the underlined carefully.
> or two nameserver lines with the address(es) supplied by the peer."
>
> So how is it my understanding of the security features of pppd are
> flawed, or that I am mistaken in believing pppd is
> updating /etc/resolv.conf?
You are mistaken. Please read that paragraph you quoted again.
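
To make the distinction concrete: per the man page text quoted above, pppd
hands DNS1, DNS2 and USEPEERDNS to /etc/ppp/ip-up and writes
/etc/ppp/resolv.conf; something else has to put those servers into
/etc/resolv.conf. A minimal sketch of that "something else" follows. The
function name is hypothetical, and actually writing the result to
/etc/resolv.conf is deliberately left to the caller.

```python
import os


def resolv_conf_from_env(env=None):
    """Build resolv.conf text from the variables pppd passes to ip-up.

    Returns an empty string when USEPEERDNS is not set to 1, so the
    caller can leave the existing /etc/resolv.conf alone.
    """
    env = os.environ if env is None else env
    if env.get("USEPEERDNS") != "1":
        return ""
    servers = [env[name] for name in ("DNS1", "DNS2") if env.get(name)]
    return "".join(f"nameserver {server}\n" for server in servers)
```

Called from an ip-up hook with no arguments, it reads the real environment;
passing a dict makes it easy to try out by hand.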
>
>>
>> I can see 1f28 identified as CalComp on the Linux USB list
>> (http://www.linux-usb.org/usb.ids), but no devices are listed. To me,
>> it sounds as if your problem is getting this thing to act as a modem.
>
> 'More like keeping it acting like a modem. Getting it to do so in the
> first place is easy.
Can you really not buy your sister a decent cdma modem?
...
>> ----------------
>> PPP is like a two handed game of "Mother, may I" (popular with
>
> Did I mention I am a professional telecommunications engineer? I've
> been doing this for over 30 years. I really, really don't need
> hand-holding. I'm relatively new to Linux, so I don't know a lot of
> what's available for it, but I don't need any primers. Again, no
> offense intended, but I'm looking for advice on what applications are
> available, not how to troubleshoot issues.
See above.
I have no idea about the cdma modem market, but "they support" and
"works" are not necessarily the same.
It is hard to say if it is disconnecting first or disconnecting because
it died. Logs would help. Decent logs, not the crap that wvdial puts
out. Set up logging for pppd.
>
>>>> Supposedly dialed a number, but no modem answered within the timeout
>>>> period - that's normally 45 seconds.
>>>
>>> No, it's set to be much faster than that. I think it's set to 1/2
>>> second, but I don't have access to the router right now to check.
>>> The
>>
>> Sorry, that cannot be true. It takes much longer than that to dial and
>> to connect.
>
> I'm not sure why you think that. This isn't a PSTN modem. PRI and BRI
> connections are ALWAYS made in less than 1/2 second, for example. I'm
> not intimately familiar with CDMA protocols, but I am fairly intimately
> familiar with the equipment which handles it, and there is no reason
> why a connect cannot be failed or accepted in less than a second.
>
>> Why do you not set up the timeout for longer.
>
> Well, first of all because there is no timeout parameter for wvdial, so
> it isn't possible. Secondly, because it clearly is not necessary. The
> modem never fails to get a connection, and in far less time than your
> 45 second suggestion, at that.
WHAT? You have told us how the modem fails 8 times out of 10 and then
tell us it never fails to get a connection.
Note that using chat would allow you to change the timeout parameter.
For an engineer you are remarkably resistant to changing things that do
not work for something that may work. And for ignoring the advice of
people who have worked in the area you are having trouble with.
>>>>>~[7f]}#@!}!}!} }9}"}&} } } } }#}%B#}%}%}&X[05]@[16]}'}"}(}"}9y~
>>>
>>> Or else that is just a generic status message which assumes nothing
>>> about what might be found at the other end. Not everyone dials an ISP
>>> with their modem, you know. Of course in her case, it is, but a
>>> properly written application of this sort should be able to handle
>>> connections of any sort.
>>
>> No, it is not. It is the remote side sending a ppp negotiation frame.
>
> To what, exactly, were you responding? What "is not"? Moe and I were
> speaking about the line which says, "--> Carrier detected. Waiting for
> prompt." That is most certainly not a ppp negotiation frame.
"Or else that is just a generic status message which assumes
nothing...."
Yes, it looks a lot like a ppp negotiation frame.
>
>> wvdial is well known for inappropriately waiting for a login prompt.
>> That is just wrong. Again, I strongly suggest that web page for
>> setting up an appropriate response.
>> On the other hand, I have no idea what sort of ppp you have. The
>> return messages it is giving you are just bizarre. They should be text
>> messages.
>>
>> Anyway, what is the operating system on that "router", and where did
>> you get that version of pppd?
>
> It's not a "router". It's a router, specifically a Linux router. It's
> running Debian "Lenny" under kernel 2.6.26-2-686. The version of pppd
> is 2.4.4 Rel 10.1, and of course it is from the Debian Stable
> repository.
Thanks. Please enable logging for pppd.
>>
>> I have no idea what kind of pppd you have on your system.
>
> You can ask Marco d'Itri <m...@linux.it>, he's the maintainer.
Now that you have told me which version of pppd you have, and Moe
pointed out that what you were quoting was probably the wvdial log, I
understand your post better. Please switch on pppd logging so that we
all have a chance of understanding the messages pppd puts out and
receives.
One aspect of engineering is a) not operating on assumptions but
on measurements, and b) actually making those measurements. One of the
measuring tools you have here is the set of logs put out by pppd. Please
use them, and if you are asking for help, let us know what they are as well.
The indications are that your modem is defective, either as designed or
as set up by the instructions you send it. For all we know, one of the
instructions you send it in that ATZ is "tear down the modem and the
firmware on disconnect".
[unruh <un...@wormhole.physics.ubc.ca> wrote:]
>> Why do you not set up the timeout for longer.
>Well, first of all because there is no timeout parameter for wvdial,
>so it isn't possible. Secondly, because it clearly is not necessary.
>The modem never fails to get a connection, and in far less time than
>your 45 second suggestion, at that.
The 45 second value is the default for chat - and is aimed at the
POTS style analog modem. The timeout is meant to be "if nothing has
happened in this long, declare a fault". On the other hand, if the
modem returns one of the many (Hayes ATX3) error messages or codes
in less than the timeout period, the value of the timeout is moot.
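
The interplay between the timeout and an early error string can be shown
with a small model. This is not chat's code, only a sketch of the rule just
described: an abort string seen before the deadline ends the wait at once,
and the timeout only matters when nothing recognizable arrives. The function
name and the (seconds, text) event format are invented for the example.

```python
def wait_for(events, expect, abort_strings, timeout):
    """Model a single expect step.

    events is an iterable of (seconds_since_start, text) pairs in time
    order. Returns a (status, seconds) tuple where status is "ok",
    "abort" or "timeout".
    """
    for seconds, text in events:
        if seconds > timeout:
            break  # anything arriving after the deadline is too late
        if any(abort in text for abort in abort_strings):
            return ("abort", seconds)
        if expect in text:
            return ("ok", seconds)
    return ("timeout", timeout)
```

With a 45 second timeout, a NO CARRIER reported at 0.4 seconds gives
("abort", 0.4), so the size of the timeout never comes into play.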
>> That by chance once in 8 times it connects is a fluke rather than
>> proper operation.
>The fact it connects in less than 10 seconds every time, consistently
>takes very nearly the same amount of time to connect, takes no more
>time to connect under Linux than under Windows, and consistently
>undergoes 6 - 8 retries rather than random numbers of attempts
>strongly suggests you are incorrect.
I'll merely state that needing multiple retries is a rather unusual
design specification.
Old guy
>Moe Trin wrote:
>> Another poster suggested reloading the cdc-acm module. That probably
>> requires root, but have you looked into that?
>Not yet. I just got the router back this evening, and I haven't had
>any time to look into the specifics of any of the failures. I'm
>still running down some coding issues on one of the control modules.
>One of the header files (not mine) has some issues, and so the code
>won't compile. I probably won't get back to looking into the failure
>modes until this weekend, if then. I did turn up yet another failure
>mode, though. Oddly, the device got registered (non switched), but
>wouldn't respond to ordinary system calls, so udev couldn't even
>produce the device targets and the switch code could not flip it.
It's looking a lot more as if this is the problem, rather than pppd.
>There's no question I am going to have to shut down and restore power
>to the device when some of these failure modes are encountered. It
>just shouldn't be the default action.
Agree. As a guess, the software in the modem isn't being set/reset to
the "right" mode, and needs the power cycle to reset to sane values.
From a serial data-link point of view, this probably means those
init-strings aren't correct. For an analog modem, 'ATZ' generally
means to reset to a _user_ specified stored configuration (one that was
saved using the 'AT&W0' command). This differs from 'AT&Fn' which
resets the modem to ``factory'' settings. So if the user accidentally
set up some weird configuration and then ran the 'AT&W0' command, the
modem powers up to sane values, but gets reset to the strange condition
by that 'ATZ'. That's why it's generally not a good init-string.
>> Automatic - that may be the problem, as you don't know what the
>> actual failure is. That the device disappears from lsusb has to be
>> attacked. If the device doesn't exist, there's not much an
>> application like minicom, pppd/chat, or wvdial can do.
>'My point exactly. I have to address such eventualities, however.
>That's one of the common issues with unattended devices which must
>recover from unexpected issues autonomously.
The power on reset does that, but I don't consider it to be the best
solution. A parachute will probably save your buns when the airplane
comes un-glued, but it's more desirable to not have the aircraft do
that - certainly not on a regular basis.
>> 'chat' was designed for this.
>Well, OK. I only briefly skimmed the man page for chat, but it
>certainly holds promise. If I can make some headway with the device
>control (or get totally stumped) this weekend, I'll have to look into
>creating a chat script to be called by the startup script.
Actually, 'chat' is called by pppd using the 'connect' option. I've
shown two examples in this discussion.
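
For anyone generating the pieces from a control script, here is a rough
sketch of what gets built. It is an illustration under assumptions, not a
tested setup for this modem: the #777 number and /dev/ttyACM0 come from
output posted in this thread, the chat file path is hypothetical, the
/usr/sbin/chat location is the usual Debian one, and whether this
particular modem accepts a bare AT&F is unknown.

```python
import shlex


def build_chat_script(number, init="AT&F", timeout=45):
    """Return the text of a chat script file that dials `number`."""
    lines = [
        "ABORT 'NO CARRIER'",      # give up at once on these modem replies
        "ABORT BUSY",
        "ABORT ERROR",
        f"TIMEOUT {timeout}",      # seconds to wait when nothing arrives
        f"'' {init}",              # expect nothing, send the init string
        f"OK 'ATDT{number}'",      # quoted because the number may contain '#'
        "CONNECT ''",              # expect CONNECT, send nothing
    ]
    return "\n".join(lines) + "\n"


def build_pppd_argv(device, chat_file):
    """Return an argument list that runs pppd with chat as its connect script."""
    connect = "/usr/sbin/chat -v -f " + shlex.quote(chat_file)
    return ["pppd", device, "connect", connect, "usepeerdns", "debug"]
```

The control script would write the output of build_chat_script("#777") to a
file, then hand build_pppd_argv("/dev/ttyACM0", that_file) to
subprocess.run(). The "debug" option is what makes pppd log the negotiation
through syslog.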
>> That's a kernel routing problem - which is why it's a kernel
>> parameter that is altered above.
>I lost you, there. How is the fact the ISP does not try to maintain
>its client's IP address in its DHCP server a kernel routing problem?
DHCP is an Ethernet service, not PPP. The kernel routing problem
occurs when the system _NORMALLY_ has a default route using a ppp
interface - as either booting to a configuration where pppd is run by
default and the link may yo-yo, or in a 'demand' mode.
>> Not relevant to ppp. RFC1918 addresses aren't that unusual.
>I never said it was relevant to ppp, and I am very well aware that
>RFC1918 addresses are commonplace. If they weren't, we would have
>run out of ipv4 address space long ago. All I said is that this ISP
>employs NAT to supply addresses to its clients.
That's what I meant. I know of several ISPs that operate that way.
>It's a rather cheezy, or at least obnoxiously restrictive thing to
>do. It certainly prevents the subscriber from running any sort of
>internet-facing server.
I'm not going to defend the ISP, but that's just another fact of
business. Usually, it's a method of separating the cheap - 'client
only' customer from the expensive 'allowed to run servers' customer.
>Wvdial is calling pppd with the usepeerdns option, which by every
>reading I have done in the pppd man page tells me it is pppd which is
>updating /etc/resolv.conf.
^^^^^^^^^^^^^^^^
>The addresses supplied by the peer (if any) are passed to the
>/etc/ppp/ip-up script in the environment variables DNS1 and DNS2,
>and the environment variable USEPEERDNS will be set to 1. In
>addition, pppd will create an /etc/ppp/resolv.conf file
^^^^^^^^^^^^^^^^^^^^
Different file.
>> To me, it sounds as if your problem is getting this thing to act as
>> a modem.
>'More like keeping it acting like a modem. Getting it to do so in the
>first place is easy.
I'll give you that.
Old guy
ppp is a peer to peer service. This means that there need be no IP
addresses on the line--- you send stuff to only one place, your peer.
It is also not necessary that the address delivered on the ppp line is
the same as the address that either side uses.
However, in normal operation, the important addresses are the address
the ISP sends you to refer to him ( but again he does not care, because
there is only one ppp connection) and for him the important address is
yours.
Now the usual situation is that the ISP is asked both for his IP address
and your IP address. The ISP has a certain limited number of addresses,
and since on the telephone systems, many users will call up, there is a
huge disincentive to assigning the same address to the same caller (
that would mean you would have to know who the caller is). Thus the ppp
addresses on each connection tend to be different. Now, CDMA broadband
may be different, but again there is no incentive for them to go to the
trouble of giving you the same address, and it is hard to do ( how do
you identify the remote caller).
Neither was I. I've found over many years of troubleshooting problems
that *most* people simply can't debug. I've worked on projects where
out of perhaps 20 or 30 people only 2 or 3 of us could debug
anything.
>
>> Have you tried removing and re-inserting the cdc-acm
>> module? That *might* be enough to re-initialize the modem to a usable
>> state.
>
> I'll give it a shot when I get the chance.
>
>> There are also ways to control the power to a USB port via
>> sysfs I believe. This would allow you to fully reset the modem.
>
> Are you sure about that? I've looked, and I can't find any
> documentation to that effect. I'm working on writing a userspace
> binary to allow me to control the port, but if there is already a
> solution out there, I would be thrilled to use it. Of course, many USB
> ports do not have the capability in the first place, but if necessary I
> can use an external hub that does support power management.
>
No I'm not sure, hence the "I believe" part. It could also require an
ioctl or some other special operation. What happens if you remove
[eou]hci-hcd (the USB port driver)?
Also there's additional debugging info that can be turned on for USB.
I think it's a kernel config option though.
>> As for "interfacing the router's control scripts with the dialing
>> utility": chat.
>
> That seems to be the consensus.
Yep, there's a good reason for that: flexibility and control. I still
use dial-up when I visit my dad in Florida. I have a bash script which
builds the chat script (dependent on the modem & some options), then
invokes pppd with the connect option to actually do anything.
Also 2.6.26 is an old kernel (late 2008), current stable is 2.6.32.3.
Jerry
Yeah, including a lot of people who are paid to be able to debug and
troubleshoot. 'Paid a lot, sometimes.
>>> Have you tried removing and re-inserting the cdc-acm
>>> module? That *might* be enough to re-initialize the modem to a
>>> usable state.
>>
>> I'll give it a shot when I get the chance.
>>
>>> There are also ways to control the power to a USB port via
>>> sysfs I believe. This would allow you to fully reset the modem.
>>
>> Are you sure about that? I've looked, and I can't find any
>> documentation to that effect. I'm working on writing a userspace
>> binary to allow me to control the port, but if there is already a
>> solution out there, I would be thrilled to use it. Of course, many
>> USB ports do not have the capability in the first place, but if
>> necessary I can use an external hub that does support power
>> management.
>>
> No I'm not sure, hence the "I believe" part. It could also require an
> ioctl or some other special operation. What happens if you remove
> [eou]hci-hcd (the USB port driver)?
Nothing at all, I think. I believe the device has to be commanded to
down the port (and I believe it is port specific). I also have to
qualify the statement with "I believe", because that is just what the
reading I have done and the people to whom I have spoken so far have
suggested. One of the USB gurus to whom I have spoken (Alan Stern at
Harvard) has suggested the best way to approach the issue is to unbind
the driver and then issue the command to shut down power on the port.
That makes me think unbinding the driver isn't sufficient by itself.
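
The unbind half of that suggestion can be sketched as follows. It assumes
the kernel exposes the generic USB driver's bind and unbind files under
/sys/bus/usb/drivers/usb, and that the modem's bus ID (something like
"1-1", a made-up value here) has been looked up beforehand. It does not
switch port power, which, as noted, may still be needed on top of this.

```python
import os
import time

# Assumed location of the generic USB device driver's control files.
USB_DRIVER_DIR = "/sys/bus/usb/drivers/usb"


def rebind_usb_device(bus_id, driver_dir=USB_DRIVER_DIR, pause=1.0):
    """Unbind a USB device from its driver, wait, then bind it again.

    bus_id is the device's sysfs name, e.g. "1-1" (hypothetical).
    Needs root when pointed at the real sysfs directory.
    """
    for action in ("unbind", "bind"):
        with open(os.path.join(driver_dir, action), "w") as control:
            control.write(bus_id)
        if action == "unbind":
            time.sleep(pause)  # give the device a moment before rebinding
```

If the modem has dropped off the bus entirely there will be nothing to
unbind and the write will raise an error, which is itself a useful signal
that only a power cycle is left.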
> Also there's additional debugging info that can be turned on for USB.
> I think it's a kernel config option though.
>
>>> As for "interfacing the router's control scripts with the dialing
>>> utility": chat.
>>
>> That seems to be the consensus.
>
> Yep, there's a good reason for that: flexibility and control. I still
> use dial-up when I visit my dad in Florida. I have a bash script which
> builds the chat script (dependent on the modem & some options), then
> invokes pppd with the connect option to actually do anything.
>
> Also 2.6.26 is an old kernel (late 2008), current stable is 2.6.32.3.
Debian never packages the most recent versions of software, unless the
release affects security or addresses a significant bug. It is an
extremely conservative distribution, which is the main reason I use it.
It definitely never has all the new bells and whistles, and it isn't
for someone who wants to always have support for all the latest
gadgets, but it is rock solid and provides new meaning for the
term "stable". I use it on all my servers, and I am using it on this
router in large measure because it is stable in the extreme. Most
distros have maybe two versions, Stable and Unstable. Debian has
Stable, Unstable, Testing, and Experimental, in order of decreasing
reliability. What most distros call, "Stable", Debian calls "Testing".
If you ask me, late 2008 isn't all that "old". A lot of bugs can go
undetected after only a little more than 1 year in public release. If
I need anything from one of the newer kernels, I can always download
and compile it, but there's nothing from any of the newer kernels that
is missing in 2.6.26 and that I need for this project. Indeed, I
suspect I may never upgrade the kernel, or anything else, on this
machine once it is permanently online. Its job is just to sit on the
shelf, route packets, generate a firewall, provide DHCP support, and
provide a VPN tunnel into my file server at my house.
Well, yes and no. When the application layer hands off a payload to
layer 4, which then passes it to layer 3, layer 3 won't know where to
send the stream unless there is either a default layer 3 address in the
routing table or a specific gateway address in the routing table, and
the routing table can't have an address in it unless it has been put
in. The layer 3 addresses of course must also be bound to layer 2
addresses, at least one of which would be the ppp address in question.
Otherwise, the network layer would just discard the payload. Any
protocol which dealt directly with layer 2 could carry on a
conversation, of course, but TCP/IP would be dead as a doornail.
> It is also not necessary that the address delivered on the ppp line is
> the same as the address that either side uses.
Well, yeah, that's partly true. Once layer 3 has handed off to layer
2, the next-hop information is discarded, so as long as the IP layer
can properly resolve the correct layer 2 stack to which to hand the
payload, everything else would work, UNLESS the conversation is
supposed to be between the two hosts on the opposite sides of the ppp
link.
> However, in normal operation, the important addresses are the address
> the ISP sends you to refer to him ( but again he does not care,
> because there is only one ppp connection) and for him the important
> address is yours.
Again, that's true unless the two hosts are trying to talk to one
another. If host A has a different address for itself than the address
host B has, then when host B sends a packet to host A with a source
address that does not exist on host A, host A is either going to
forward it out the appropriate gateway if the address is not on a local
subnet, or just drop the packet if it is on a local subnet. Either way,
Host A never passes the payload to layer 3 and beyond. Pass-through
packets would work OK, because they have destinations other than the
local host anyway.
> Now the usual situation is that the ISP is asked both for his IP
> address and your IP address. The ISP has a certain limited number of
> addresses,
This is no less true for a dial-up ISP than a broadband ISP. It's true
the dial-up ISP can probably get by with somewhat fewer IP addresses,
since there is never going to be a time when every user is online. This
is not the case for a broadband provider. It's probably not a factor of
three, though.
> and since on the telephone systems, many users will call
> up, there is a huge disincentive to assigning the same address to the
> same caller ( that would mean you would have to know who the caller
> is).
Of course.
> Thus the ppp addresses on each connection tend to be different.
> Now, CDMA broadband may be different, but again there is no incentive
> for them to go to the trouble of giving you the same address, and it
> is hard to do ( how do you identify the remote caller).
By his login and password, if nothing else. Caller ID would also work.
The OS can obtain the caller ID before the link is even established.
Now I am not arguing with you that it is particularly worth the trouble
for a dial-up ISP to try to provide its users with fixed IPs, but it is
certainly possible. This is supposed to be a broadband provider,
though, and for a broadband provider, the etiquette is a bit different.
>Moe Trin <ibup...@painkiller.example.tld.invalid> wrote:
>> The kernel routing problem occurs when the system _NORMALLY_ has a
>> default route using a ppp interface - as either booting to a
>> configuration where pppd is run by default and the link may yo-yo,
>> or in a 'demand' mode.
>ppp is a peer to peer service. This means that there need be no IP
>addresses on the line--- you send stuff to only one place, your peer.
Well, that's true, but in a convoluted way. If the peer lacks an IP
address, your routing table must be set showing a default route but
with no gateway - i.e.
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         0.0.0.0         0.0.0.0         U     0      0     1234 ppp0
which will cause the kernel to assume that the whole world is directly
reachable on the ppp0 interface. The peer sees no difference in packet
headers, and forwards as per usual. You can't talk directly to the
peer, but that's often the case for procedural/policy reasons.
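
A control script can check for this situation by reading the table. The
sketch below assumes the eight-column layout shown above (the kind of
output "route -n" prints); the function name and dictionary keys are made
up for the example.

```python
def default_routes(route_output):
    """Return the default-route rows found in route-style table text."""
    routes = []
    for line in route_output.splitlines():
        fields = line.split()
        # A default route has destination and genmask both 0.0.0.0.
        if len(fields) == 8 and fields[0] == "0.0.0.0" and fields[2] == "0.0.0.0":
            routes.append({
                "gateway": fields[1],
                "flags": fields[3],
                "iface": fields[7],
                # The G flag marks a route that goes through a gateway.
                "has_gateway": "G" in fields[3],
            })
    return routes
```

For the table above this yields one entry for ppp0 with has_gateway set to
False, which is the "whole world is directly reachable" case.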
>However, in normal operation, the important addresses are the address
>the ISP sends you to refer to him ( but again he does not care, because
>there is only one ppp connection) and for him the important address is
>yours.
The important address is yours, but that's so the kernel networking
stack knows what address to assume for the ppp0 interface. The
'/proc/sys/net/ipv4/ip_dynaddr' problem is to avoid problems when
the interface and/or peer IP address changes - a common problem in
'demand' mode (where the pppd assumes the peer is at 10.112.112.112
plus the interface number; see Changes-2.3 in the ppp source, under
2.3.10). The other thing is that the kernel isn't really prepared
to have its ppp0 interface address of 0.0.0.0 (not legal per RFC1122
section 3.2.1.3(a) and RFC0791 section 3.2).
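
As a worked example of the "10.112.112.112 plus the interface number" rule
stated above (taken at face value from this post, not checked against the
pppd source), the placeholder peer address for a given ppp unit can be
computed like this. The function name is invented.

```python
import ipaddress

# Base placeholder address as given in the text above.
DEMAND_BASE = ipaddress.IPv4Address("10.112.112.112")


def demand_placeholder_peer(unit):
    """Return the placeholder peer address for pppN in demand mode."""
    return str(DEMAND_BASE + unit)
```

So ppp0 would see 10.112.112.112 and ppp1 would see 10.112.112.113 until
the real addresses are negotiated, which is the address change that
ip_dynaddr is meant to cope with.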
>there is no incentive for them to go to the trouble of giving you the
>same address, and it is hard to do ( how do you identify the remote
>caller).
It's done with some regularity - USUALLY by having your system ask
for a specific address via the colon option to pppd. Not foolproof,
but done.
Old guy
>lrhorer <lrh...@satx.rr.com> wrote:
[reason modem disconnects]
>It is hard to say if it is disconnecting first or disconnecting because
>it died. Logs would help. Decent logs, not the crap that wvdial puts
>out. set up logging for pppd.
I'm not sure how much it would say. It boils down to "do you see a
[Term-Req] packet from the peer" or not. A message of "modem hangup"
might also be all there is if the peer doesn't issue a Term-Req. As
this is occurring some significant time after dial-in, it could well
be a duration timer at the ISP, or possibly an inactivity timer within
the modem itself - hard to say. The problem is that the modem is left
in an uncommunicative state, such that the kernel can't see it any
more - that's not a pppd problem beyond a possible wrong init-string.
>>> Why do you not set up the timeout for longer.
>> Well, first of all because there is no timeout parameter for
>> wvdial, so it isn't possible. Secondly, because it clearly is not
>> necessary. The modem never fails to get a connection, and in far
>> less time than your 45 second suggestion, at that.
>WHAT? You have told us how the modem fails 8 times out of 10 and then
>tell us it never fails to get a connection.
>Note that using chat would allow you to change the timeout parameter.
Some 'apples and oranges' here. Second, in Message-ID:
<M7KdnSnIsJsXmdHW...@giganews.com> (Mon, 11 Jan 2010
23:11:38 -0600), he shows
] --> Modem initialized.
] --> Sending: ATDT#777
] --> Waiting for carrier.
] ATDT#777
] NO CARRIER
] --> No Carrier! Trying again.
] --> Sending: ATDT#777
] --> Waiting for carrier.
] ATDT#777
] NO CARRIER
] --> No Carrier! Trying again.
The 'chat' 45 second timeout (if nothing happens, do this) isn't
involved here - the modem is dialing, waiting an interval, and
deciding there is no carrier. That's a modem timeout - in analog
modems, this is often register S7 (but the register number isn't
standardized). Looking at manuals for several analog modems here, S7
is in seconds, and the default value varies from 30 to 60 seconds.
Now this could be something that was mis-set, and saved to NVRAM - to
be restored using the ATZ rather than AT&Fn init string. Most modem
setups don't screw up the 'S' registers, maybe that happened here.
However, lacking any documentation on his modem, he probably needs
to go to Joe's Rent-a-Canoe-Paddle shop.
>Please switch on pppd logging so that we all have a chance of
>understanding the messages pppd puts out and receives.
I see two possible ppp failures. The one is the disconnect from the
ISP after some unspecified time - duration timer at the ISP, or modem
activity timer as noted above. The other problem is the authentication
failure reported in Message-ID:
<4qCdnfB7Su93lNHW...@giganews.com> on Mon, 11 Jan 2010
23:34:33 -0600 where he showed
] --> Disconnecting at Mon Jan 11 21:25:46 2010
] --> The PPP daemon has died: Authentication error.
] --> We failed to authenticate ourselves to the peer.
] --> Maybe bad account or password? (exit code = 19)
That could be wvdial unable to determine the appropriate username,
remote name (as he later shows one instance of CHAP authentication)
or the peer requesting a different _form_ of authentication, such as
EAP (RFC2284, RFC2945 or RFC3748), or even PAP (if all that wvdial
configured is CHAP). Yes, debug log data would show this, but I'm
not sure how big a problem this is compared to the hardware problem.
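To see which form of authentication the peer is actually asking for, the LCP option exchange in a pppd debug log can be pulled out mechanically. The following is only a minimal sketch in Python. The function name and the regular expression are my own, and the sample lines are copied from a session log posted in this thread (with the wrapped lines rejoined); the sketch assumes pppd prints the option as `<auth ...>` inside the packet dump, as it does there.

```python
import re

# Matches the <auth ...> option inside an LCP Configure packet as printed
# by pppd when its "debug" option is on (format assumed from the logs above).
AUTH_OPTION = re.compile(
    r"(sent|rcvd) \[LCP (Conf\w+) id=0x[0-9a-f]+.*?<auth ([^>]+)>"
)


def auth_negotiation(log_lines):
    """Return (direction, packet type, auth protocol) for each LCP auth option."""
    steps = []
    for line in log_lines:
        match = AUTH_OPTION.search(line)
        if match:
            steps.append(match.groups())
    return steps


SAMPLE = [
    "pppd[3062]: rcvd [LCP ConfReq id=0x1 <asyncmap 0x0> <auth chap MD5> "
    "<magic 0xf27b8c09> <pcomp> <accomp>]",
    "pppd[3062]: sent [LCP ConfNak id=0x1 <auth pap>]",
    "pppd[3062]: rcvd [LCP ConfReq id=0x2 <asyncmap 0x0> <auth pap> "
    "<magic 0xf27b8c09> <pcomp> <accomp>]",
    "pppd[3062]: sent [LCP ConfAck id=0x2 <asyncmap 0x0> <auth pap> "
    "<magic 0xf27b8c09> <pcomp> <accomp>]",
]

if __name__ == "__main__":
    for direction, packet, protocol in auth_negotiation(SAMPLE):
        print(direction, packet, protocol)
```

In that sample the peer first asks for CHAP, the local pppd counter-proposes PAP with a ConfNak, and the peer then agrees. A session that fails authentication would presumably show a different sequence, which is the thing to look for.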
Old guy
No, you're missing the meaning of the term "support" in this context.
They won't allow any other type of modem to attach to their network.
They won't sell you service unless you have purchased their modem, they
won't authorize the account for the modem to connect unless its serial
number is in their database, and they won't initialize the modem to
work unless your account is in good standing. Indeed, the user can't
even decipher which carrier to select since the info is not sent to the
modem. It is embedded in its firmware.
> It is hard to say if it is disconnecting first or disconnecting
> because it died. Logs would help. Decent logs, not the crap that
> wvdial puts out. set up logging for pppd.
Why would both modems just happen to die after being connected
precisely 12 hours, 0 minutes, 0.000 seconds every single time?
>>>>> Supposedly dialed a number, but no modem answered within the
>>>>> timeout period - that's normally 45 seconds.
>>>>
>>>> No, it's set to be much faster than that. I think it's set to 1/2
>>>> second, but I don't have access to the router right now to check.
>>>> The
>>>
>>> Sorry, that cannot be true. It takes much longer than that to dial
>>> and to connect.
>>
>> I'm not sure why you think that. This isn't a PSTN modem.
>> PRI and BRI
>> connections are ALWAYS made in less than 1/2 second, for example.
>> I'm not intimately familiar with CDMA protocols, but I am fairly
>> intimately familiar with the equipment which handles it, and there is
>> no reason why a connect cannot be failed or accepted in less than a
>> second.
>>
>>> Why do you not set up the timeout for longer.
>>
>> Well, first of all because there is no timeout parameter for
>> wvdial, so
>> it isn't possible. Secondly, because it clearly is not necessary.
>> The modem never fails to get a connection, and in far less time than
>> your 45 second suggestion, at that.
>
> WHAT? You have told us how the modem fails 8 times out of 10 and then
> tell us it never fails to get a connection.
I never said it fails 8 times out of 10. I said it consistently
retries 8 to 10 times. After successfully starting the dialing
sequence, it never fails to obtain a carrier in approximately 20
seconds, and it never succeeds at dialing in less than 6 trials.
There's no way that is random.
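One way to put numbers on that retry pattern is to tally the attempts in captured wvdial output, session by session. This is a rough sketch in Python, not anything wvdial provides. The function name is made up, and it assumes the output looks like the excerpt quoted earlier in the thread (status lines prefixed with "--> ", the modem's reply on a line of its own), with the "] " quoting prefix stripped.

```python
def count_dial_attempts(wvdial_output: str) -> dict:
    """Tally dial attempts and NO CARRIER replies in captured wvdial output."""
    attempts = 0
    no_carrier = 0
    for raw in wvdial_output.splitlines():
        line = raw.strip()
        # wvdial prefixes its own status messages with "--> ".
        if line.startswith("--> Sending: ATD"):
            attempts += 1
        # The modem's own reply arrives on a line by itself.
        elif line == "NO CARRIER":
            no_carrier += 1
    return {"attempts": attempts, "no_carrier": no_carrier}


SAMPLE = """\
--> Modem initialized.
--> Sending: ATDT#777
--> Waiting for carrier.
ATDT#777
NO CARRIER
--> No Carrier! Trying again.
--> Sending: ATDT#777
--> Waiting for carrier.
ATDT#777
NO CARRIER
--> No Carrier! Trying again.
"""

if __name__ == "__main__":
    print(count_dial_attempts(SAMPLE))
```

Collected over a few dozen sessions, the distribution of attempt counts would show whether "never fewer than 6" really holds or is just an impression.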
> Note that using chat would allow you to change the timeout parameter.
> For an engineer you are remarkably resistant to changing things that
> do not work for something that may work. And for ignoring the advice
> of people who have worked in the area you are having trouble with.
No, I am insistent on tackling the more important issues before moving
on to less important ones. Working out ppp issues, if any, is trivial.
No matter how well or poorly ppp is working, however, the modem will
never dial out if the computer can't issue commands to it. Some of
these failure modes are encountered long before the router gets to the
point of starting to dial out. I need to resolve those sorts of issues
before I try to dig in to problems with ppp. I also have not yet had
any time at all to work on any of the issues, no matter how large or
how small.
>> It's not a "router". It's a router, specifically a Linux
>> router. It's
>> running Debian "Lenny" under kernel 2.6.26-2-686. The version of
>> pppd is 2.4.4 Rel 10.1, and of course it is from the Debian Stable
>> repository.
>
> Thanks. Please enable logging for pppd.
It's already enabled. Next time I get a failure in a dialing session,
I'll let you know what it says. It will probably be several days,
unless I get lucky, so to speak.
I agree, although despite what unruh seems to believe, I haven't
fixated on this as a root cause, and I certainly have not even remotely
ruled out ppp problems or something bizarre that wvdial is doing as
being a proximate cause.
>>There's no question I am going to have to shut down and restore power
>>to the device when some of these failure modes are encountered. It
>>just shouldn't be the default action.
>
> Agree. As a guess, the software in the modem isn't being set/reset to
> the "right" mode, and needs the power cycle to reset to sane values.
That, or one of 10,000 other things. I'll probably figure out what,
eventually. If I come up with a solution which eliminates the symptoms
without knowing what is really happening underneath I probably won't
bother, though.
> From a serial data-link point of view, this probably means those
> init-strings aren't correct. For an analog modem, 'ATZ' generally
> means to reset to a _user_ specified stored configuration (that were
> saved using the 'AT&W0' command). This differs from 'AT&Fn' which
> resets the modem to ``factory'' settings. So if the user accidentally
> set up some weird configuration and then ran the 'AT&W0' command, the
> modem powers up to sane values, but gets reset to the strange
> condition by that 'ATZ'. That's why it's generally not a good
> init-string.
Yes, but gracefully exiting the session and then dialing back in
doesn't produce a problem. If resetting the modem kills it in one
case, one might expect that resetting it in the other case would have
the same effect.
>>'My point exactly. I have to address such eventualities, however.
>>That's one of the common issues with unattended devices which must
>>recover from unexpected issues autonomously.
>
> The power on reset does that, but I don't consider it to be the best
> solution.
Me, either.
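An escalating policy along those lines, where the power cycle is the last resort rather than the default, might be sketched as below. Everything here is hypothetical: the names, the three inputs, and the threshold are assumptions for illustration in Python, and how each input is actually measured (does the tty node exist, does the modem answer a bare AT, how many dial attempts have failed in a row) is left to the startup script.

```python
from enum import Enum


class Action(Enum):
    REDIAL = "redial"
    RESTART_PPPD = "kill and restart pppd"
    POWER_CYCLE = "power-cycle the modem"


def choose_recovery(tty_present: bool, answers_at: bool,
                    dial_failures: int, max_dial_failures: int = 3) -> Action:
    """Pick the least drastic recovery step that still has a chance of working."""
    if not tty_present:
        # The device has dropped off the USB bus; nothing softer can reach it.
        return Action.POWER_CYCLE
    if not answers_at:
        # The port exists but is silent; a stuck pppd may still be holding it.
        return Action.RESTART_PPPD
    if dial_failures >= max_dial_failures:
        # The modem talks but will not dial, even after several tries.
        return Action.POWER_CYCLE
    return Action.REDIAL


if __name__ == "__main__":
    print(choose_recovery(tty_present=True, answers_at=True, dial_failures=0))
```

The ordering is the design choice: each branch is only reached when the cheaper checks before it have passed, so the modem loses power only when the port is gone or repeated dialing has already failed.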
>>Well, OK. I only briefly skimmed the man page for chat, but it
>>certainly holds promise. If I can make some headway with the device
>>control (or get totally stumped) this weekend, I'll have to look into
>>creating a chat script to be called by the startup script.
>
> Actually, 'chat' is called by pppd using the 'connect' option. I've
> shown two examples in this discussion.
Yeah, I see that. I'll dig into the details when I start to write the
script.
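For reference, here is roughly what the chat approach amounts to, written as a small Python sketch that renders a chat script and the matching pppd argument list. The file path, the function names, and the location of the chat binary are assumptions for illustration. ABORT and TIMEOUT are chat directives, and connect, debug, and persist are pppd options; the dial string #777 and the 45 second figure are the ones mentioned in this thread.

```python
def chat_script(number: str, timeout_s: int = 45) -> str:
    """Render a minimal chat script that dials number and waits for CONNECT."""
    lines = [
        'ABORT "NO CARRIER"',     # give up at once if the modem reports this
        'ABORT "ERROR"',
        f"TIMEOUT {timeout_s}",   # how long chat waits for each expected string
        '"" AT',                  # expect nothing, send AT
        f"OK ATDT{number}",       # expect OK, then dial
        'CONNECT ""',             # expect CONNECT, then hand the line to pppd
    ]
    return "\n".join(lines) + "\n"


def pppd_argv(tty: str, chat_file: str) -> list:
    """Build a pppd command line that runs chat through the connect option."""
    # /usr/sbin/chat is an assumed location; adjust for the local system.
    return ["pppd", tty, "connect", f"/usr/sbin/chat -v -f {chat_file}",
            "debug", "persist"]


if __name__ == "__main__":
    print(chat_script("#777"), end="")
    print(pppd_argv("/dev/ttyACM0", "/etc/ppp/chat-cricket"))
```

Unlike wvdial, this makes the timeout an explicit, editable number, and an ABORT string ends the attempt with a non-zero exit status that the calling script can act on.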
>>It's a rather cheezy, or at least obnoxiously restrictive thing to
>>do. It certainly prevents the subscriber from running any sort of
>>internet-facing server.
>
> I'm not going to defend the ISP, but that's just another fact of
> business. Usually, it's a method of separating the cheap - 'client
> only' customer from the expensive 'allowed to run servers' customer.
Well, dial-up service typically costs about $10 a month or so, so by
comparison this service, at $40 a month, is not cheap. Indeed, it is
about the same as most landline based broadband services, cost-wise.
It is less expensive than other wireless or satellite services, but not
THAT much less expensive.
>
>>Wvdial is calling pppd with the usepeerdns option, which by every
>>reading I have done in the pppd man page tells me it is pppd which is
>>updating /etc/resolv.conf.
> ^^^^^^^^^^^^^^^^
>>The addresses supplied by the peer (if any) are passed to the
>>/etc/ppp/ip-up script in the environment variables DNS1 and DNS2,
>>and the environment variable USEPEERDNS will be set to 1. In
>>addition, pppd will create an /etc/ppp/resolv.conf file
> ^^^^^^^^^^^^^^^^^^^^
> Different file.
OK, I see, and I see now what you were saying. 'My mistake. When I
read over that in the man page, I missed the /ppp/ and so I was
thinking pppd was writing the file.
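To make the distinction concrete: going by the man page text quoted above, pppd hands the peer's addresses to /etc/ppp/ip-up in DNS1 and DNS2 and sets USEPEERDNS to 1, so whatever rewrites /etc/resolv.conf has to be something run from there. A minimal Python sketch of that step follows. The function name is made up, and it only builds the text; deciding where to write it is deliberately left out.

```python
import os


def resolv_conf_from_env(env) -> str:
    """Build resolv.conf text from the DNS1/DNS2 variables pppd exports to ip-up."""
    if env.get("USEPEERDNS") != "1":
        return ""  # usepeerdns was not in effect; leave DNS alone
    servers = [env[key] for key in ("DNS1", "DNS2") if env.get(key)]
    return "".join(f"nameserver {addr}\n" for addr in servers)


if __name__ == "__main__":
    # When run from /etc/ppp/ip-up, os.environ carries the values set by pppd.
    print(resolv_conf_from_env(os.environ), end="")
```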
> No, I am insistent on tackling the more important issues
> before moving
> on to less important ones. Working out ppp issues, if any, is
> trivial. No matter how well or poorly ppp is working, however, the
> modem will
> never dial out if the computer can't issue commands to it. Some of
> these failure modes are encountered long before the router gets to the
> point of starting to dial out. I need to resolve those sorts of
> issues
> before I try to dig in to problems with ppp. I also have not yet had
> any time at all to work on any of the issues, no matter how large or
> how small.
>
>>> It's not a "router". It's a router, specifically a Linux
>>> router. It's
>>> running Debian "Lenny" under kernel 2.6.26-2-686. The version of
>>> pppd is 2.4.4 Rel 10.1, and of course it is from the Debian Stable
>>> repository.
>>
>> Thanks. Please enable logging for pppd.
>
> It's already enabled. Next time I get a failure in a dialing
> session,
> I'll let you know what it says. It will probably be several days,
> unless I get lucky, so to speak.
OK, I was doing some testing and I had one of the lockups when I was
sitting at the console. This was not one of the 12 hour lockups, as
you can see from the logs. Here are the log outputs:
Cricket:~# grep "Jan 16 00:" /var/log/messages
Jan 16 00:19:37 Cricket rsyslogd: [origin software="rsyslogd"
swVersion="3.18.6" x-pid="2245" x-info="http://www.rsyslog.com"]
restart
Jan 16 00:41:23 Cricket kernel: [ 1692.388023] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:23 Cricket kernel: [ 1692.948014] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:24 Cricket kernel: [ 1693.508017] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:24 Cricket kernel: [ 1694.028016] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:25 Cricket pppd[2682]: Modem hangup
Jan 16 00:41:25 Cricket pppd[2682]: Connect time 27.6 minutes.
Jan 16 00:41:25 Cricket pppd[2682]: Sent 144558 bytes, received 188005
bytes.
Jan 16 00:41:25 Cricket pppd[2682]: Connection terminated.
Jan 16 00:41:25 Cricket kernel: [ 1694.440107] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jan 16 00:41:25 Cricket kernel: [ 1694.440319] usb 1-1: USB disconnect,
address 3
Jan 16 00:41:25 Cricket pppd[2682]: Exit.
Jan 16 00:41:25 Cricket kernel: [ 1694.552196] usb 1-1: new full speed
USB device using uhci_hcd and address 4
Jan 16 00:41:25 Cricket kernel: [ 1695.116015] usb 1-1: new full speed
USB device using uhci_hcd and address 5
Jan 16 00:41:26 Cricket kernel: [ 1695.676005] usb 1-1: new full speed
USB device using uhci_hcd and address 6
Jan 16 00:41:26 Cricket kernel: [ 1696.196015] usb 1-1: new full speed
USB device using uhci_hcd and address 7
Jan 16 00:43:32 Cricket kernel: [ 1821.624006] usb 1-1: new full speed
USB device using uhci_hcd and address 8
Jan 16 00:43:32 Cricket kernel: [ 1822.188016] usb 1-1: new full speed
USB device using uhci_hcd and address 9
Jan 16 00:43:33 Cricket kernel: [ 1822.748041] usb 1-1: new full speed
USB device using uhci_hcd and address 10
Jan 16 00:43:33 Cricket kernel: [ 1823.268022] usb 1-1: new full speed
USB device using uhci_hcd and address 11
Jan 16 00:48:25 Cricket kernel: [ 2115.128018] usb 1-1: new full speed
USB device using uhci_hcd and address 12
Jan 16 00:48:26 Cricket kernel: [ 2115.292074] usb 1-1: configuration #1
chosen from 1 choice
Jan 16 00:48:26 Cricket kernel: [ 2115.304881] scsi1 : SCSI emulation
for USB Mass Storage devices
Jan 16 00:48:26 Cricket kernel: [ 2115.305242] usb 1-1: New USB device
found, idVendor=1f28, idProduct=0021
Jan 16 00:48:26 Cricket kernel: [ 2115.305248] usb 1-1: New USB device
strings: Mfr=1, Product=2, SerialNumber=3
Jan 16 00:48:26 Cricket kernel: [ 2115.305253] usb 1-1: Product: USB
Micro SD Storage
Jan 16 00:48:26 Cricket kernel: [ 2115.305256] usb 1-1: Manufacturer:
Cal-comp E&CC Limited
Jan 16 00:48:26 Cricket kernel: [ 2115.305259] usb 1-1: SerialNumber:
214939913900
Jan 16 00:48:31 Cricket kernel: [ 2120.308859] scsi 1:0:0:0:
Direct-Access Cricket T-Flash Disk 2.31 PQ: 0 ANSI: 2
Jan 16 00:48:31 Cricket kernel: [ 2120.308859] scsi 1:0:0:1: CD-ROM
Cal-Comp CD INSTALLER 2.31 PQ: 0 ANSI: 0
Jan 16 00:48:31 Cricket kernel: [ 2120.329118] sd 1:0:0:0: [sda]
Attached SCSI removable disk
Jan 16 00:48:31 Cricket kernel: [ 2120.556881] Driver 'sr' needs
updating - please use bus_type methods
<snip>
Cricket:~# grep "Jan 16 00:" /var/log/syslog
<snip>
Jan 16 00:41:23 Cricket kernel: [ 1692.388023] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:23 Cricket kernel: [ 1692.508017] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:23 Cricket kernel: [ 1692.732031] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:23 Cricket kernel: [ 1692.948014] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:23 Cricket kernel: [ 1692.965250] cdc_acm: acm_ctrl_irq -
usb_submit_urb failed with result -19
Jan 16 00:41:23 Cricket kernel: [ 1693.068019] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:24 Cricket kernel: [ 1693.292024] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:24 Cricket kernel: [ 1693.508017] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:24 Cricket kernel: [ 1693.916028] usb 1-1: device not
accepting address 3, error -71
Jan 16 00:41:24 Cricket kernel: [ 1694.028016] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:25 Cricket pppd[2682]: Modem hangup
Jan 16 00:41:25 Cricket pppd[2682]: Connect time 27.6 minutes.
Jan 16 00:41:25 Cricket pppd[2682]: Sent 144558 bytes, received 188005
bytes.
Jan 16 00:41:25 Cricket pppd[2682]: Script /etc/ppp/ip-down started (pid
4457)
Jan 16 00:41:25 Cricket pppd[2682]: Connection terminated.
Jan 16 00:41:25 Cricket kernel: [ 1694.440021] usb 1-1: device not
accepting address 3, error -71
Jan 16 00:41:25 Cricket kernel: [ 1694.440107] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jan 16 00:41:25 Cricket kernel: [ 1694.440319] usb 1-1: USB disconnect,
address 3
Jan 16 00:41:25 Cricket pppd[2682]: Waiting for 1 child processes...
Jan 16 00:41:25 Cricket pppd[2682]: script /etc/ppp/ip-down, pid 4457
Jan 16 00:41:25 Cricket pppd[2682]: Script /etc/ppp/ip-down finished
(pid 4457), status = 0x0
Jan 16 00:41:25 Cricket pppd[2682]: Exit.
Jan 16 00:41:25 Cricket kernel: [ 1694.552196] usb 1-1: new full speed
USB device using uhci_hcd and address 4
Jan 16 00:41:25 Cricket kernel: [ 1694.676007] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:25 Cricket kernel: [ 1694.900051] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:25 Cricket kernel: [ 1695.116015] usb 1-1: new full speed
USB device using uhci_hcd and address 5
Jan 16 00:41:25 Cricket kernel: [ 1695.236042] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:26 Cricket kernel: [ 1695.460007] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:26 Cricket kernel: [ 1695.676005] usb 1-1: new full speed
USB device using uhci_hcd and address 6
Jan 16 00:41:26 Cricket kernel: [ 1696.084006] usb 1-1: device not
accepting address 6, error -71
Jan 16 00:41:26 Cricket kernel: [ 1696.196015] usb 1-1: new full speed
USB device using uhci_hcd and address 7
Jan 16 00:41:27 Cricket kernel: [ 1696.604019] usb 1-1: device not
accepting address 7, error -71
Jan 16 00:41:27 Cricket kernel: [ 1696.604049] hub 1-0:1.0: unable to
enumerate USB device on port 1
<snip>
Now will you believe me when I tell you pppd isn't issuing any errors
and it is unlikely the networking layer is the culprit in most of these
failures? Can we move on? I'll investigate possible networking issues
later when I have these far more problematical and obvious problems
ironed out.
>> It's looking a lot more as if this is the problem, rather than pppd.
> I agree, although despite what unruh seems to believe, I haven't
>fixated on this as a root cause, and I certainly have not even remotely
>ruled out ppp problems or something bizarre that wvdial is doing as
>being a proximate cause.
Mentioned else-thread, I see just two possible ppp related issues:
the disconnects after some amount of time, and the authentication
failure. Everything else seems to be related to strange actions
by the GSM modem or the USB interface.
> Yes, but gracefully exiting the session and then dialing back in
>doesn't produce a problem. If resetting the modem kills it in one
>case, one might expect that resetting it in the other case would have
>the same effect.
Else-thread, you showed:
>Jan 16 00:41:23 Cricket kernel: [ 1692.388023] usb 1-1: reset full speed
>USB device using uhci_hcd and address 3
>Jan 16 00:41:23 Cricket kernel: [ 1692.948014] usb 1-1: reset full speed
>USB device using uhci_hcd and address 3
>Jan 16 00:41:24 Cricket kernel: [ 1693.508017] usb 1-1: reset full speed
>USB device using uhci_hcd and address 3
>Jan 16 00:41:24 Cricket kernel: [ 1694.028016] usb 1-1: reset full speed
>USB device using uhci_hcd and address 3
>Jan 16 00:41:25 Cricket pppd[2682]: Modem hangup
>Jan 16 00:41:25 Cricket pppd[2682]: Connect time 27.6 minutes.
>Jan 16 00:41:25 Cricket pppd[2682]: Sent 144558 bytes, received 188005
>bytes.
>Jan 16 00:41:25 Cricket pppd[2682]: Connection terminated.
Now, I don't know if the kernel messages (the word 'reset' scares me)
are the reason or not, but the "Modem hangup" means that pppd detected
the modem going "on-hook". USB modems are totally serial, and lack
the RS-232 status wires, which the USB software emulates as far as
the serial interface is concerned. With an RS-232 modem, the modem
going on hook _uncommanded_by_pppd_ nearly always means that the peer
disconnected. This _could_ also be an activity timeout function in
the modem itself (an S-register setting, not standardized by
manufacturer).
If a peer hangs up, RFC1661 says that the peer should initiate this
by sending a TermReq message, and the other system responds with a
TermAck. (These packets should show in a debug level log.) Then both
can shut down "cleanly". However, many ISPs fail to follow this, and
essentially yank the plug, much the same concept as a -SIGTERM versus a
-SIGKILL. All pppd can then say is that there was a modem hangup,
and try to clean up as best it can. How this would occur on a USB
modem as opposed to an RS-232 modem is something else, but given that
word 'reset' in the immediate preceding message, I'd be looking in
that direction.
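The clean versus yanked distinction can be checked against a log with a few string tests. This is a sketch in Python only. The function name and the category wording are mine, and it assumes pppd's debug output shows the packets as "rcvd [LCP TermReq ...]" and "sent [LCP TermAck ...]". Without debug-level logging those packets never appear, so an "abrupt" verdict is only meaningful when debug is known to be on.

```python
def classify_termination(log_lines) -> str:
    """Classify how a PPP session ended from pppd log lines."""
    peer_asked = any("rcvd [LCP TermReq" in line for line in log_lines)
    we_acked = any("sent [LCP TermAck" in line for line in log_lines)
    we_asked = any("sent [LCP TermReq" in line for line in log_lines)
    hangup = any("Modem hangup" in line for line in log_lines)

    if peer_asked and we_acked:
        return "clean: peer sent TermReq, we answered TermAck"
    if we_asked:
        return "local: pppd initiated the shutdown"
    if hangup:
        return "abrupt: carrier dropped with no TermReq from the peer"
    return "unknown"


# The pppd lines from the Jan 16 excerpt quoted above.
JAN16 = [
    "Jan 16 00:41:25 Cricket pppd[2682]: Modem hangup",
    "Jan 16 00:41:25 Cricket pppd[2682]: Connect time 27.6 minutes.",
    "Jan 16 00:41:25 Cricket pppd[2682]: Connection terminated.",
]

if __name__ == "__main__":
    print(classify_termination(JAN16))
```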
You may also want to inquire if the ISP has a connect time limitation.
If they're hanging up on you because you've been connected too long,
the USB <-> RS-232 emulation may cause problems that don't occur under
similar circumstances with a straight RS-232 interface.
Old guy
You actually do not have debugging switched on for your pppd (option
debug to pppd), and thus the reason for the pppd disconnect is still
obscure. It looks like pppd is detecting a hangup, disconnects, and
then your usb bus is misbehaving. But there is no indication here why
the hangup is occurring.
You also seem to have a usb disk drive attached to
the usb bus, and it may be that there is some weird interaction between
the modem and the disk drive. Is it a necessary drive? Can you remove it
for your tests?
I used to get messages like that when I plugged in my digital camera (seen
as a usb storage device). I eventually had to turn off udev and
create the device nodes manually. Then I could get to my photos.
> are the reason or not, but the "Modem hangup" means that pppd detected
> the modem going "on-hook". USB modems are totally serial, and lack
He said his modem was wireless. I thought we'd be looking at the failed
reconnection attempts in his logs, but they weren't there. I thought he
mentioned failed reconnection attempts somewhere (maybe in the original
post).
Anyway, I don't know anything about any of this, so I'll just sign off.
linux/Documentation/usb/power-management.txt
Though I don't see any files except wakeup in the power
sub-directories when I look. Could be my Aspire 1 ports don't support
power management, or I didn't have the necessary options set when I
built the kernel.
I was going to suggest LKML, but if you're already corresponding with
Alan Stern that's superfluous. Did he suggest usbmon? That shows the
USB bus transactions which might help.
The problem could actually be a "feature" of the device, perhaps to
save power or eliminate the stupid dance to "safely remove hardware"
in Windows. The Windows program could "know" how to reactivate the
device.
Except that getting support for a year old kernel via LKML is likely
to be difficult. I'm surprised Alan Stern didn't suggest upgrading.
Jerry
Yes, I do. The debug directive is in the options file.
> , and thus the reason for the pppd disconnect is still
> obscure. It looks like pppd is detecting a hangup, disconnects, and
> then your usb bus is misbehaving.
No, look at the log, again. The USB reset occurs a full two seconds
before the pppd disconnect. The reset is at 41:23. The pppd hang-up
isn't until 41:25.
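The ordering can be checked mechanically rather than by eye. Here is a minimal sketch in Python. The function names are mine, the two sample lines are taken from the Jan 16 excerpt posted earlier (wrapped lines rejoined), and the year is an assumption that has to be supplied because classic syslog timestamps carry none.

```python
from datetime import datetime


def first_time(lines, needle, year=2010):
    """Return the timestamp of the first syslog line containing needle, or None."""
    for line in lines:
        if needle in line:
            # The first 15 characters are the syslog timestamp, e.g.
            # "Jan 16 00:41:23"; the year is prepended so parsing is unambiguous.
            return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    return None


def reset_to_hangup_seconds(lines):
    """Seconds from the first USB reset to the first pppd 'Modem hangup'."""
    reset = first_time(lines, "reset full speed USB device")
    hangup = first_time(lines, "Modem hangup")
    if reset is None or hangup is None:
        return None
    return (hangup - reset).total_seconds()


SAMPLE = [
    "Jan 16 00:41:23 Cricket kernel: [ 1692.388023] usb 1-1: reset full speed "
    "USB device using uhci_hcd and address 3",
    "Jan 16 00:41:25 Cricket pppd[2682]: Modem hangup",
]

if __name__ == "__main__":
    print(reset_to_hangup_seconds(SAMPLE))
```

A positive result means the USB reset was logged first. Syslog timestamps only resolve to one second, so a difference of zero would not settle the order either way.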
> But there is no indication here why
> the hangup is occurring.
> You also seem to have a usb disk drive attached to
> the usb bus, and it may be that there is some weird interaction
> between the modem and the disk drive. Is it a necessary drive?
That's the modem. As I said before, it has an internal storage target
which is established when the modem is initialized. When the modem is
switched into modem mode, the drive target disappears and the modem
target becomes visible.
> Can you
> remove it for your tests?
No. It's built-in to the modem. It's also required for the
mode-switch.
>>
>> You actually do not have debugging switched on for your pppd (option
>> debug to pppd)
>
> Yes, I do. The debug directive is in the options file.
Good. But what you display is NOT the pppd debug logs.
Did you put the line
daemon.*;local2.* /var/log/daemonlog
(or wherever you want to put it) into /etc/syslog.conf, and restart
syslogd with
killall -1 syslogd
?
Have you looked into /var/log/daemonlog (or wherever you want to put
it) during that disconnect?
The log outputs you are giving are not from the pppd detailed logging.
>
>> , and thus the reason for the pppd disconnect is still
>> obscure. It looks like pppd is detecting a hangup, disconnects, and
>> then your usb bus is misbehaving.
>
> No, look at the log, again. The USB reset occurs a full two seconds
> before the pppd disconnect. The reset is at 41:23. The pppd hang-up
> isn't until 41:25.
It may be. As I say I would like to see the detailed pppd debug logs
before I say what is going on.
>
>> But there is no indication here why
>> the hangup is occurring.
>> You also seem to have a usb disk drive attached to
>> the usb bus, and it may be that there is some weird interaction
>> between the modem and the disk drive. Is it a necessary drive?
>
> That's the modem. As I said before, it has an internal storage target
> which is established when the modem is initialized. When the modem is
> switched into modem mode, the drive target disappears and the modem
> target becomes visible.
Ah, OK, so somehow the modem is resetting itself off the bus.
Again, I really really would advise getting a new, different modem.
>> Can you
>> remove it for your tests?
>
> No. It's built-in to the modem. It's also required for the
> mode-switch.
OK, so the fact that it is appearing here indicates that the modem has
reset itself.
Is it possible that for some reason the power to the modem is being
interrupted, causing the disconnect, e.g. a flaky USB port?
My system does not have that file. I found a copy on the web.
> Though I don't see any files except wakeup in the power
> sub-directories when I look. Could be my Aspire 1 ports don't support
> power management, or I didn't have the necessary options set when I
> built the kernel.
> I was going to suggest LKML, but if you're already corresponding with
> Alan Stern that's superfluous. Did he suggest usbmon?
No, he did not.
> That shows the
> USB bus transactions which might help.
> The problem could actually be a "feature" of the device, perhaps to
> save power or eliminate the stupid dance to "safely remove hardware"
> in Windows. The Windows program could "know" how to reactivate the
> device.
'Doesn't seem to be. Lockups are random.
No, he just pointed out why it could be problematical in general.
Then pppd isn't creating them, unless it is someplace other
than /var/log/. The debug option is in the options file, as I
mentioned.
> Did you put the line
> daemon.*;local2.* /var/log/daemonlog
I already responded to that. I did not have to. It was already there.
> (or wherever you want to put it) into /etc/syslog.conf, and restart
> syslogd with
> killall -1 syslogd
Again, as I already mentioned, this isn't necessary. The router has
been both cold and warm booted many, many times, and the logging
parameters were already there when the system was created.
Oh, and just BTW, this distro does not use syslogd. It's using
rsyslogd, not that it really matters.
> Have you looked into /var/log/daemonlog (or wherever you want to put
> it) during that disconnect?
/var/log/messages and /var/log/syslog are as shown above for the Jan 16
event. /var/log/daemon.log has been completely empty since Jan 15
20:44. There are no logs at all in daemon.log for ppp at any time
going all the way back to Dec 27, although there are logs from ntpd
concerning its listening and shutting down on ppp0. When the Jan 16
event took place, I was in the middle of writing the chat script for
pppd and setting up pppd as the primary shell. I didn't make any
changes to the options file, but after I started running pppd as the
primary shell (not that it should make a difference), it started doing
more verbose logging.
There was a lock-up just an hour ago, although this time without a USB
bus reset. I've enabled the persist option in pppd, at least for the
time being. It does allow the system to recover from an ordinary idle
time-out. During this event /dev/ttyACM0 stayed online, and the modem
was able to respond to AT commands. It would not dial, however, either
using pppd or minicom. The first response to a single AT command was
"ERROR" after I shut down pppd and ran minicom to do some manual
testing.
Here is the log from the event:
Jan 18 18:29:54 Cricket pppd[5836]: No response to 4 echo-requests
Jan 18 18:29:54 Cricket pppd[5836]: Serial link appears to be
disconnected.
Jan 18 18:29:54 Cricket pppd[5836]: Connect time 306.6 minutes.
Jan 18 18:29:54 Cricket pppd[5836]: Sent 445560 bytes, received 588608
bytes.
Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down started (pid
10015)
Jan 18 18:29:54 Cricket pppd[5836]: sent [LCP TermReq id=0x2 "Peer not
responding"]
Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down finished
(pid 10015), status = 0x0
Jan 18 18:29:57 Cricket pppd[5836]: sent [LCP TermReq id=0x3 "Peer not
responding"]
Jan 18 18:30:00 Cricket pppd[5836]: Connection terminated.
Jan 18 18:31:56 Cricket pppd[5836]: ioctl(TIOCSETD, N_TTY): Interrupted
system call (line 571)
Jan 18 18:31:57 Cricket pppd[5836]: tcsetattr: Interrupted system call
(line 1010)
Jan 18 18:31:57 Cricket pppd[5836]: Hangup (SIGHUP)
Jan 18 18:31:57 Cricket pppd[5836]: Modem hangup
Jan 18 18:31:58 Cricket pppd[5836]: Hangup (SIGHUP)
Jan 18 18:32:04 Cricket chat[10941]: Can't get terminal parameters:
Input/output error
Jan 18 18:32:04 Cricket pppd[5836]: Connect script failed
Jan 18 18:32:04 Cricket pppd[5836]: Hangup (SIGHUP)
Jan 18 18:32:54 Cricket ntpd[2516]: Deleting interface #13 ppp0,
10.100.37.231#123, interface stats: received=0, sent=0, dropped=0,
active_time
=18300 secs
Jan 18 18:51:10 Cricket dnsmasq[2435]: reading /etc/resolv.conf
Jan 18 18:51:10 Cricket dnsmasq[2435]: using nameserver 24.93.41.128#53
Jan 18 18:51:10 Cricket dnsmasq[2435]: using nameserver 24.93.41.127#53
Jan 18 18:51:10 Cricket dnsmasq[2435]: DHCPREQUEST(eth0) 192.168.1.77
00:1b:11:1a:2b:b0
Jan 18 18:51:10 Cricket dnsmasq[2435]: DHCPACK(eth0) 192.168.1.77
00:1b:11:1a:2b:b0 Brittany
That's it. Pppd did not shut down, did not release the lockfile, and
did not release /dev/ttyACM0. I shut down pppd with `kill -9`, and
re-started it. After restart, it produced only a single line in
syslog:
Jan 18 19:07:17 Cricket pppd[30605]: pppd 2.4.4 started by root, uid 0
Again, it produced a lockfile and grabbed /dev/ttyACM0, but it did not
run the chat script or anything else. I killed pppd again, and ran
minicom to see if the modem was responding on /dev/ttyACM0. It was. I
pulled the modem and reseated it, running pppd, again:
Jan 18 19:10:30 Cricket kernel: [52380.248072] usb 1-2: USB disconnect,
address 3
Jan 18 19:10:44 Cricket kernel: [52394.364017] usb 1-2: new full speed
USB device using uhci_hcd and address 4
Jan 18 19:10:44 Cricket kernel: [52394.527131] usb 1-2: configuration #1
chosen from 1 choice
Jan 18 19:10:44 Cricket kernel: [52394.534360] scsi1 : SCSI emulation
for USB Mass Storage devices
Jan 18 19:10:44 Cricket kernel: [52394.541500] usb 1-2: New USB device
found, idVendor=1f28, idProduct=0021
Jan 18 19:10:44 Cricket kernel: [52394.541513] usb 1-2: New USB device
strings: Mfr=1, Product=2, SerialNumber=3
Jan 18 19:10:44 Cricket kernel: [52394.541518] usb 1-2: Product: USB
Micro SD Storage
Jan 18 19:10:44 Cricket kernel: [52394.541522] usb 1-2: Manufacturer:
Cal-comp E&CC Limited
Jan 18 19:10:44 Cricket kernel: [52394.541526] usb 1-2: SerialNumber:
214939913900
Jan 18 19:10:44 Cricket kernel: [52394.542908] usb-storage: device found
at 4
Jan 18 19:10:44 Cricket kernel: [52394.542919] usb-storage: waiting for
device to settle before scanning
Jan 18 19:10:49 Cricket kernel: [52399.541264] usb-storage: device scan
complete
Jan 18 19:10:49 Cricket kernel: [52399.544176] scsi 1:0:0:0:
Direct-Access Cricket T-Flash Disk 2.31 PQ: 0 ANSI: 2
Jan 18 19:10:49 Cricket kernel: [52399.544176] scsi 1:0:0:1: CD-ROM
Cal-Comp CD INSTALLER 2.31 PQ: 0 ANSI: 0
Jan 18 19:10:49 Cricket kernel: [52399.592290] sd 1:0:0:0: [sda]
Attached SCSI removable disk
Jan 18 19:10:50 Cricket kernel: [52399.777931] Driver 'sr' needs
updating - please use bus_type methods
Jan 18 19:10:50 Cricket kernel: [52399.832120] sr0: scsi3-mmc drive:
0x/0x caddy
Jan 18 19:10:50 Cricket kernel: [52399.832120] sr 1:0:0:1: Attached scsi
CD-ROM sr0
Jan 18 19:10:50 Cricket kernel: [52399.918559] sd 1:0:0:0: Attached scsi
generic sg0 type 0
Jan 18 19:10:50 Cricket kernel: [52399.918611] sr 1:0:0:1: Attached scsi
generic sg1 type 5
Jan 18 19:10:50 Cricket kernel: [52400.039108] sr0: CDROM (ioctl) error,
command: Get configuration 46 00 00 00 00 00 00 00 20 00
Jan 18 19:10:50 Cricket kernel: [52400.039131] sr: Sense Key : No Sense
[current]
Jan 18 19:10:50 Cricket kernel: [52400.039138] sr: Add. Sense: No
additional sense information
Jan 18 19:10:50 Cricket kernel: [52400.220020] usb 1-2: reset full speed
USB device using uhci_hcd and address 4
Jan 18 19:10:50 Cricket kernel: [52400.488029] usb 1-2: reset full speed
USB device using uhci_hcd and address 4
Jan 18 19:10:51 Cricket kernel: [52400.760032] usb 1-2: reset full speed
USB device using uhci_hcd and address 4
Jan 18 19:11:03 Cricket pppd[31677]: unrecognized option '/dev/ttyACM0'
I shut down the router altogether and rebooted:
Cricket:/var/log# grep pppd syslog
<snip>
Jan 18 20:11:24 Cricket pppd[3062]: pppd 2.4.4 started by root, uid 0
Jan 18 20:11:25 Cricket pppd[3062]: Connect script failed
Jan 18 20:12:03 Cricket pppd[3062]: Serial connection established.
Jan 18 20:12:03 Cricket pppd[3062]: using channel 1
Jan 18 20:12:03 Cricket pppd[3062]: Using interface ppp0
Jan 18 20:12:03 Cricket pppd[3062]: Connect: ppp0 <--> /dev/ttyACM0
Jan 18 20:12:04 Cricket pppd[3062]: sent [LCP ConfReq id=0x1 <asyncmap
0x0> <magic 0xeaf2ceb9> <pcomp> <accomp>]
Jan 18 20:12:07 Cricket pppd[3062]: sent [LCP ConfReq id=0x1 <asyncmap
0x0> <magic 0xeaf2ceb9> <pcomp> <accomp>]
Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x1 <asyncmap
0x0> <auth chap MD5> <magic 0xf27b8c09> <pcomp> <accomp>]
Jan 18 20:12:09 Cricket pppd[3062]: sent [LCP ConfNak id=0x1 <auth pap>]
Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfAck id=0x1 <asyncmap
0x0> <magic 0xeaf2ceb9> <pcomp> <accomp>]
Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x2 <asyncmap
0x0> <auth pap> <magic 0xf27b8c09> <pcomp> <accomp>]
Jan 18 20:12:09 Cricket pppd[3062]: sent [LCP ConfAck id=0x2 <asyncmap
0x0> <auth pap> <magic 0xf27b8c09> <pcomp> <accomp>]
Jan 18 20:12:09 Cricket pppd[3062]: sent [LCP EchoReq id=0x0
magic=0xeaf2ceb9]
Jan 18 20:12:09 Cricket pppd[3062]: sent [PAP AuthReq id=0x1
user="Cricket" password=<hidden>]
Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP EchoRep id=0x0
magic=0xf27b8c09]
Jan 18 20:12:10 Cricket pppd[3062]: rcvd [PAP AuthAck id=0x1 ""]
Jan 18 20:12:10 Cricket pppd[3062]: PAP authentication succeeded
Jan 18 20:12:10 Cricket pppd[3062]: sent [CCP ConfReq id=0x1 <deflate
15> <deflate(old#) 15> <bsd v1 15>]
Jan 18 20:12:10 Cricket pppd[3062]: sent [IPCP ConfReq id=0x1 <compress
VJ 0f 01> <addr 0.0.0.0> <ms-dns1 0.0.0.0> <ms-dns3 0.0.0.0>]
Jan 18 20:12:10 Cricket pppd[3062]: rcvd [IPCP ConfReq id=0x1 <addr
172.29.122.162>]
Jan 18 20:12:10 Cricket pppd[3062]: sent [IPCP ConfAck id=0x1 <addr
172.29.122.162>]
Jan 18 20:12:10 Cricket pppd[3062]: rcvd [LCP ProtRej id=0x1 80 fd 01 01
00 0f 1a 04 78 00 18 04 78 00 15 03 2f]
Jan 18 20:12:10 Cricket pppd[3062]: Protocol-Reject for 'Compression
Control Protocol' (0x80fd) received
Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1 <compress
VJ 0f 01>]
Jan 18 20:12:11 Cricket pppd[3062]: sent [IPCP ConfReq id=0x2 <addr
0.0.0.0> <ms-dns1 0.0.0.0> <ms-dns3 0.0.0.0>]
Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfNak id=0x2 <addr
10.100.73.203> <ms-dns1 172.28.221.53> <ms-dns3 172.28.221.54>]
Jan 18 20:12:11 Cricket pppd[3062]: sent [IPCP ConfReq id=0x3 <addr
10.100.73.203> <ms-dns1 172.28.221.53> <ms-dns3 172.28.221.54>]
Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfAck id=0x3 <addr
10.100.73.203> <ms-dns1 172.28.221.53> <ms-dns3 172.28.221.54>]
Jan 18 20:12:11 Cricket pppd[3062]: local IP address 10.100.73.203
Jan 18 20:12:11 Cricket pppd[3062]: remote IP address 172.29.122.162
Jan 18 20:12:11 Cricket pppd[3062]: primary DNS address 172.28.221.53
Jan 18 20:12:11 Cricket pppd[3062]: secondary DNS address 172.28.221.54
Jan 18 20:12:11 Cricket pppd[3062]: Script /etc/ppp/ip-up started (pid
3912)
Jan 18 20:12:12 Cricket pppd[3062]: Script /etc/ppp/ip-up finished (pid
3912), status = 0x0
Here is the messages log from the time of the restart (showing the
script failure):
Jan 18 20:11:25 Cricket chat[3106]: abort on (NO CARRIER)
Jan 18 20:11:25 Cricket chat[3106]: abort on (NO DIALTONE)
Jan 18 20:11:25 Cricket chat[3106]: abort on (ERROR)
Jan 18 20:11:25 Cricket chat[3106]: abort on (NO ANSWER)
Jan 18 20:11:25 Cricket chat[3106]: abort on (BUSY)
Jan 18 20:11:25 Cricket chat[3106]: send (AT^M)
Jan 18 20:11:25 Cricket chat[3106]: expect (OK)
Jan 18 20:11:25 Cricket chat[3106]: AT^M^M
Jan 18 20:11:25 Cricket chat[3106]: OK
Jan 18 20:11:25 Cricket chat[3106]: -- got it
Jan 18 20:11:25 Cricket chat[3106]: send (ATDT#777^M)
Jan 18 20:11:25 Cricket chat[3106]: expect (CONNECT)
Jan 18 20:11:25 Cricket chat[3106]: ^M
Jan 18 20:11:25 Cricket chat[3106]: ATDT#777^M^M
Jan 18 20:11:25 Cricket chat[3106]: NO CARRIER
Jan 18 20:11:25 Cricket chat[3106]: -- failed
Jan 18 20:11:25 Cricket chat[3106]: Failed (NO CARRIER)
Jan 18 20:11:57 Cricket chat[3652]: abort on (NO CARRIER)
Jan 18 20:11:57 Cricket chat[3652]: abort on (NO DIALTONE)
Jan 18 20:11:57 Cricket chat[3652]: abort on (ERROR)
Jan 18 20:11:57 Cricket chat[3652]: abort on (NO ANSWER)
Jan 18 20:11:57 Cricket chat[3652]: abort on (BUSY)
Jan 18 20:11:57 Cricket chat[3652]: send (AT^M)
Jan 18 20:11:57 Cricket chat[3652]: expect (OK)
Jan 18 20:11:57 Cricket chat[3652]: AT^M^M
Jan 18 20:11:57 Cricket chat[3652]: OK
Jan 18 20:11:57 Cricket chat[3652]: -- got it
Jan 18 20:11:57 Cricket chat[3652]: send (ATDT#777^M)
Jan 18 20:11:57 Cricket chat[3652]: expect (CONNECT)
Jan 18 20:11:57 Cricket chat[3652]: ^M
Jan 18 20:12:03 Cricket chat[3652]: ATDT#777^M^M
Jan 18 20:12:03 Cricket chat[3652]: CONNECT
Jan 18 20:12:03 Cricket chat[3652]: -- got it
Jan 18 20:12:03 Cricket pppd[3062]: Serial connection established.
Jan 18 20:12:03 Cricket pppd[3062]: Using interface ppp0
Jan 18 20:12:03 Cricket pppd[3062]: Connect: ppp0 <--> /dev/ttyACM0
Jan 18 20:12:10 Cricket pppd[3062]: PAP authentication succeeded
> Ah, OK, so somehow the modem is resetting itself off the bus.
> Again, I really really would advise getting a new, different modem.
There is only one other model. I'll speak with Cricket about the
possibility of a swap-out, but I'm not hopeful. They have been far
less than accommodating.
>>> Can you
>>> remove it for your tests?
>>
>> No. It's built-in to the modem. It's also required for the
>> mode-switch.
>
> OK, so the fact that it is appearing here indicates that the modem
> has reset itself.
Yes. It is back to the power-on state, except that now it may not
become functional. In some cases, an attempt to switch the modem mode
fails. In others, the modem can be switched, but udev doesn't
recognize it. In yet others, udev recognizes the modem and creates
the /dev/ttyACM0 device, but the modem doesn't respond. Finally,
sometimes the modem responds and dials out, but the authentication
fails.
> Is it possible that for some reason the power to the modem is being
> interrupted causing the disconnect-- eg flakey usb port?
Of course it is possible. I installed a USB PCI card yesterday. Time
will tell if it eliminates some or all of the hardware failures. Using
the persistent option in pppd seems like it may have alleviated one
common failure mode: the failed authentication after disconnect. The
next 24 hours should hopefully prove that one way or the other. I also
found one bug in my scripts that was making remote management more
difficult. I've put the hardware reset binary project on hold until
the results are in from replacing the USB port. I suspect I will need
it eventually in any case, but for now I'm letting the system run to
see what failures are encountered with the new scripts and the new
hardware.
Then you have a non-standard version of pppd.
With the debug option, pppd writes detailed reports to the daemon
syslog facility.
>
>> (or wherever you want to put it) into /etc/syslog.conf, and restart
>> syslogd:
>> killall -1 syslogd
>
> Again, as I already mentioned, this isn't necessary. The router has
> been both cold and warm booted many, many times, and the logging
> parameters were already there when the system was created.
>
> Oh, and just BTW, this distro does not use syslogd. It's using
> rsyslogd, not that it really matters.
I do not know rsyslogd. Are you sure it uses /etc/syslog.conf as its
configuration file?
>
>> Have you looked into /var/log/daemonlog (or wherever you want to put
>> it) during that disconnect?
>
> /var/log/messages and /var/log/syslog are as shown above for the Jan 16
> event. /var/log/daemon.log has been completely empty since Jan 15
> 20:44. There are no logs at all in daemon.log for ppp at any time
> going all the way back to Dec 27, although there are logs from ntpd
> concerning its listening and shutting down on ppp0. When the Jan 16
> event took place, I was in the middle of writing the chat script for
> pppd and setting up pppd as the primary shell. I didn't make any
> changes to the options file, but after I started running pppd as the
> primary shell (not that it should make a difference), it started doing
> more verbose logging.
>
> There was a lock-up just an hour ago, although this time without a USB
> bus reset. I've enabled the persist option in pppd, at least for the
> time being. It does allow the system to recover from an ordinary idle
> time-out. This event did not involve a USB reset, /dev/ttyACM0 stayed
> online, and the modem was able to respond to AT commands. It would not
> dial, however, either using pppd or minicom. The first response to a
> single AT command was "ERROR" after I shut down pppd and ran minicom to
> do some manual testing.
>
> Here is the log from the event:
>
> Jan 18 18:29:54 Cricket pppd[5836]: No response to 4 echo-requests
> Jan 18 18:29:54 Cricket pppd[5836]: Serial link appears to be
> disconnected.
OK, pppd is using echoes to test if the link is up. It is failing. I.e.,
the connection has gone down before pppd goes down.
This looks more and more like a hardware flaw, either a bug or a bad
design. There is NOTHING that software can do to fix it. Only
> Jan 18 18:29:54 Cricket pppd[5836]: Connect time 306.6 minutes.
> Jan 18 18:29:54 Cricket pppd[5836]: Sent 445560 bytes, received 588608
> bytes.
> Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down started (pid
> 10015)
> Jan 18 18:29:54 Cricket pppd[5836]: sent [LCP TermReq id=0x2 "Peer not
> responding"]
> Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down finished
> (pid 10015), status = 0x0
pppd has shut itself down.
> Jan 18 19:07:17 Cricket pppd[30605]: pppd 2.4.4 started by root, uid 0
>
> Again, it produced a lockfile and grabbed /dev/ttyACM0, but it did not
> run the chat script or anything else. I killed pppd again, and ran
> minicom to see if the modem was responding on /dev/ttyACM0. It was. I
> pulled the modem and reseated it, running pppd, again:
????
>
OK, now we are seeing valid pppd debug output.
Where did you get this from?
Put in a noccp option.
> Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1 <compress
> VJ 0f 01>]
And novj
> Jan 18 20:12:11 Cricket pppd[3062]: sent [IPCP ConfReq id=0x2 <addr
> 0.0.0.0> <ms-dns1 0.0.0.0> <ms-dns3 0.0.0.0>]
> Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfNak id=0x2 <addr
> 10.100.73.203> <ms-dns1 172.28.221.53> <ms-dns3 172.28.221.54>]
> Jan 18 20:12:11 Cricket pppd[3062]: sent [IPCP ConfReq id=0x3 <addr
> 10.100.73.203> <ms-dns1 172.28.221.53> <ms-dns3 172.28.221.54>]
> Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfAck id=0x3 <addr
> 10.100.73.203> <ms-dns1 172.28.221.53> <ms-dns3 172.28.221.54>]
> Jan 18 20:12:11 Cricket pppd[3062]: local IP address 10.100.73.203
> Jan 18 20:12:11 Cricket pppd[3062]: remote IP address 172.29.122.162
> Jan 18 20:12:11 Cricket pppd[3062]: primary DNS address 172.28.221.53
> Jan 18 20:12:11 Cricket pppd[3062]: secondary DNS address 172.28.221.54
> Jan 18 20:12:11 Cricket pppd[3062]: Script /etc/ppp/ip-up started (pid
> 3912)
> Jan 18 20:12:12 Cricket pppd[3062]: Script /etc/ppp/ip-up finished (pid
> 3912), status = 0x0
And you are connected.
The question is whether or not there are versions which Cricket does not
supply.
>
>>>> Can you
>>>> remove it for your tests?
>>>
>>> No. It's built-in to the modem. It's also required for the
>>> mode-switch.
>>
>> OK, so the fact that it is appearing here indicates that the modem
>> has reset itself.
>
> Yes. It is back to the power-on state, except that now it may not
> become functional. In some cases, an attempt to switch the modem mode
> fails. In others, the modem can be switched, but udev doesn't
> recognize it. In yet others, udev recognizes the modem and creates
> the /dev/ttyACM0 device, but the modem doesn't respond. Finally,
> sometimes the modem responds and dials out, but the authentication
> fails.
Again hardware.
>There was a lock-up just an hour ago, although this time without a USB
>bus reset. I've enabled the persist option in pppd, at least for the
>time being. It does allow the system to recover from an ordinary idle
>time-out. This event did not involve a USB reset, /dev/ttyACM0 stayed
>online, and the modem was able to respond to AT commands. It would not
>dial, however, either using pppd or minicom.
Minor confusion - your log shows pppd unable to communicate with the
modem between (at least) 18:29:54 and 20:11:24.
>Jan 18 18:29:54 Cricket pppd[5836]: No response to 4 echo-requests
>Jan 18 18:29:54 Cricket pppd[5836]: Serial link appears to be
>disconnected.
>Jan 18 18:29:54 Cricket pppd[5836]: Connect time 306.6 minutes.
>Jan 18 18:29:54 Cricket pppd[5836]: Sent 445560 bytes, received 588608
>bytes.
>Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down started (pid
>10015)
>Jan 18 18:29:54 Cricket pppd[5836]: sent [LCP TermReq id=0x2 "Peer not
>responding"]
>Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down finished
>(pid 10015), status = 0x0
>Jan 18 18:29:57 Cricket pppd[5836]: sent [LCP TermReq id=0x3 "Peer not
>responding"]
>Jan 18 18:30:00 Cricket pppd[5836]: Connection terminated.
LCP Echo is used to detect link failure where the link goes down, but
the modem (which means the central office) doesn't recognize this and
remains off-hook. pppd is here shown sending two TermReq, but the
peer isn't responding. The link may have existed at the physical
level, but was unable to pass packets.
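As a rough illustration of the timing: pppd gives up after lcp-echo-failure unanswered echo-requests sent lcp-echo-interval seconds apart, so the detection delay is roughly their product. The sketch below assumes an options file with both settings on their own lines (the helper name is made up for illustration):

```shell
#!/bin/sh
# echo_detect_seconds: read pppd options text on stdin and print
# lcp-echo-interval * lcp-echo-failure, the approximate time pppd needs
# to notice a dead link. Exits non-zero if either option is missing.
echo_detect_seconds() {
    awk '$1 == "lcp-echo-interval" { i = $2 }
         $1 == "lcp-echo-failure"  { f = $2 }
         END { if (i == "" || f == "") exit 1; print i * f }'
}

# Hypothetical usage:
#   echo_detect_seconds < /etc/ppp/options
```

With an interval of 30 and a failure count of 4 that works out to about two minutes of silence before "No response to 4 echo-requests" is logged.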
>Jan 18 18:31:57 Cricket pppd[5836]: Hangup (SIGHUP)
>Jan 18 18:31:57 Cricket pppd[5836]: Modem hangup
Nearly two minutes - why? pppd hangs up the modem using the DTR "wire"
which is emulated in a USB modem. "Module problems"?
>Jan 18 18:31:58 Cricket pppd[5836]: Hangup (SIGHUP)
>Jan 18 18:32:04 Cricket chat[10941]: Can't get terminal parameters:
>Input/output error
On an attempted restart, the serial port wasn't a serial port.
>That's it. Pppd did not shut down, did not release the lockfile, and
>did not release /dev/ttyACM0. I shut down pppd with `kill -9`, and
>re-started it.
Try to avoid using -SIGKILL (-9). The results are messy. -SIGTERM
(-15) should be enough.
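A sketch of that advice as a helper: send SIGTERM first, give the process a grace period, and fall back to SIGKILL only if it is still there. The function name and the 10-second default are assumptions, not anything pppd prescribes:

```shell
#!/bin/sh
# stop_gently PID [GRACE_SECONDS]
# Returns 0 if the process exited after SIGTERM (or was already gone),
# 1 if SIGKILL had to be used.
stop_gently() {
    pid=$1
    grace=${2:-10}
    kill -TERM "$pid" 2>/dev/null || return 0   # already gone
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null       # last resort
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Hypothetical usage:
#   stop_gently "$(pidof pppd)" 15 || echo "pppd needed SIGKILL"
```

The return value at least records how often the escalation was needed, which is useful evidence in itself.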
>After restart, it produced only a single line in syslog:
>
>Jan 18 19:07:17 Cricket pppd[30605]: pppd 2.4.4 started by root, uid 0
>
> Again, it produced a lockfile and grabbed /dev/ttyACM0, but it
>did not run the chat script or anything else.
I'm assuming you used 'fuser' or 'lsof' to determine that pppd had the
device. Assuming your file system does update access time (some people
are mounting -noatime), did 'ls -lu /usr/sbin/chat' show current
access? What about the script that contains the dialing commands?
>Jan 18 20:11:24 Cricket pppd[3062]: pppd 2.4.4 started by root, uid 0
>Jan 18 20:11:25 Cricket pppd[3062]: Connect script failed
>Jan 18 20:12:03 Cricket pppd[3062]: Serial connection established.
Below
>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x1 <asyncmap
>0x0> <auth chap MD5> <magic 0xf27b8c09> <pcomp> <accomp>]
>Jan 18 20:12:09 Cricket pppd[3062]: sent [LCP ConfNak id=0x1 <auth pap>]
>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfAck id=0x1 <asyncmap
>0x0> <magic 0xeaf2ceb9> <pcomp> <accomp>]
>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x2 <asyncmap
>0x0> <auth pap> <magic 0xf27b8c09> <pcomp> <accomp>]
The peer requested you authenticate by CHAP-MD5 (RFC1994), and you
refused, suggesting PAP (RFC1334) instead. This time, the peer
accepted it. On 'Jan 9 09:49:00', you showed CHAP authentication
working - what changed? I'd suggest having identical pap-secrets
and chap-secrets files, and letting pppd use whichever it needed.
>Jan 18 20:12:10 Cricket pppd[3062]: sent [CCP ConfReq id=0x1 <deflate
>15> <deflate(old#) 15> <bsd v1 15>]
>Jan 18 20:12:10 Cricket pppd[3062]: sent [IPCP ConfReq id=0x1 <compress
>VJ 0f 01> <addr 0.0.0.0> <ms-dns1 0.0.0.0> <ms-dns3 0.0.0.0>]
>Jan 18 20:12:10 Cricket pppd[3062]: rcvd [LCP ProtRej id=0x1 80 fd 01 01
>00 0f 1a 04 78 00 18 04 78 00 15 03 2f]
>Jan 18 20:12:10 Cricket pppd[3062]: Protocol-Reject for 'Compression
>Control Protocol' (0x80fd) received
>Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1 <compress
>VJ 0f 01>]
This peer doesn't do CCP (RFC1962), or Van Jacobson header compression
(RFC1144). No big deal, but you _could_ add 'noccp' and 'novj' if it
did become a problem. It's _not_ a problem now.
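If you ever want to check a longer log for the same rejections, something like the following would do. It is only a sketch: it assumes each syslog entry is on one unwrapped line, as in the real file rather than in this post, and the function name is made up:

```shell
#!/bin/sh
# suggest_pppd_options: read a pppd debug log on stdin and print the
# options that would stop pppd from offering protocols the peer rejected.
suggest_pppd_options() {
    awk '/Protocol-Reject for .Compression Control Protocol/ { ccp = 1 }
         /rcvd \[IPCP ConfRej/ && /compress VJ/                { vj = 1 }
         END { if (ccp) print "noccp"; if (vj) print "novj" }'
}

# Hypothetical usage:
#   grep pppd /var/log/syslog | suggest_pppd_options
```

Both noccp and novj only save a round trip or two during negotiation; they will not change the lock-up behaviour.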
>Jan 18 20:11:25 Cricket chat[3106]: abort on (BUSY)
>Jan 18 20:11:25 Cricket chat[3106]: send (AT^M)
>Jan 18 20:11:25 Cricket chat[3106]: expect (OK)
>Jan 18 20:11:25 Cricket chat[3106]: AT^M^M
>Jan 18 20:11:25 Cricket chat[3106]: OK
An empty 'Hayes' command prefix - not very reliable at initializing
the modem.
>Jan 18 20:11:25 Cricket chat[3106]: send (ATDT#777^M)
>Jan 18 20:11:25 Cricket chat[3106]: expect (CONNECT)
>Jan 18 20:11:25 Cricket chat[3106]: ^M
>Jan 18 20:11:25 Cricket chat[3106]: ATDT#777^M^M
>Jan 18 20:11:25 Cricket chat[3106]: NO CARRIER
>Jan 18 20:11:25 Cricket chat[3106]: -- failed
You told the modem to dial - it responded with 'NO CARRIER' rather
than the expected 'CONNECT blah, blah, blah'. That's a modem issue.
>Jan 18 20:11:57 Cricket chat[3652]: send (ATDT#777^M)
>Jan 18 20:11:57 Cricket chat[3652]: expect (CONNECT)
>Jan 18 20:11:57 Cricket chat[3652]: ^M
>Jan 18 20:12:03 Cricket chat[3652]: ATDT#777^M^M
>Jan 18 20:12:03 Cricket chat[3652]: CONNECT
>Jan 18 20:12:03 Cricket chat[3652]: -- got it
You tried again, and the modem took six seconds, but this time
reported it found another modem to talk to.
>Finally, sometimes the modem responds and dials out, but the
>authentication fails.
Above - '/bin/cp /etc/ppp/pap-secrets /etc/ppp/chap-secrets'
Old guy
Found the necessary files:
~$ ls /sys/bus/usb/devices/usb?/power
/sys/bus/usb/devices/usb1/power:
active_duration autosuspend connected_duration level wakeup
/sys/bus/usb/devices/usb2/power:
active_duration autosuspend connected_duration level wakeup
/sys/bus/usb/devices/usb3/power:
active_duration autosuspend connected_duration level wakeup
/sys/bus/usb/devices/usb4/power:
active_duration autosuspend connected_duration level wakeup
/sys/bus/usb/devices/usb5/power:
active_duration autosuspend connected_duration level wakeup
And here's what the files look like for usb1:
~$ grep '.' /sys/bus/usb/devices/usb1/power/*
/sys/bus/usb/devices/usb1/power/active_duration:9144
/sys/bus/usb/devices/usb1/power/autosuspend:2
/sys/bus/usb/devices/usb1/power/connected_duration:1128124
/sys/bus/usb/devices/usb1/power/level:auto
/sys/bus/usb/devices/usb1/power/wakeup:enabled
The usb[1-5] devices must represent the USB ports themselves, since
AFAIK there are 5 USB ports on this AA1, 3 external and 2 internal.
So power control via sysfs at least appears feasible.
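A small sketch of reading, and optionally writing, the level attribute shown above. Which values the kernel accepts depends on the kernel version; I am assuming "on" and "auto" are safe, and the function name is hypothetical. Writing needs root:

```shell
#!/bin/sh
# usb_power_level DEVICE_DIR [NEW_LEVEL]
# Prints the current power/level of a USB device directory in sysfs.
# If NEW_LEVEL is given, it is written first and the result is printed.
usb_power_level() {
    f=$1/power/level
    [ -e "$f" ] || return 1
    if [ -n "${2:-}" ]; then
        printf '%s\n' "$2" > "$f" || return 1
    fi
    cat "$f"
}

# Hypothetical usage:
#   usb_power_level /sys/bus/usb/devices/usb1        # read current level
#   usb_power_level /sys/bus/usb/devices/usb1 on     # disable autosuspend
```

Whether any value written here actually cuts power to a hung modem is something only a test on the real hardware will show.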
Jerry
> On Mon, 18 Jan 2010, in the Usenet newsgroup comp.os.linux.networking,
> in article <JOydnYNd4P63u8jW...@giganews.com>, lrhorer
> wrote:
>
>>There was a lock-up just an hour ago, although this time without a USB
>>bus reset. I've enabled the persist option in pppd, at least for the
>>time being. It does allow the system to recover from an ordinary idle
>>time-out. This event did not involve a USB reset, /dev/ttyACM0 stayed
>>online, and the modem was able to respond to AT commands. It would
>>not dial, however, either using pppd or minicom.
>
> Minor confusion - your log shows pppd unable to communicate with the
> modem between (at least) 18:29:54 and 20:11:24.
Right. I was checking some things. When I eventually shut down pppd,
the first attempt to talk to the modem (AT^M) caused it to respond
with "ERROR". A subsequent AT^M produced an "OK".
>>Jan 18 18:29:54 Cricket pppd[5836]: No response to 4 echo-requests
>>Jan 18 18:29:54 Cricket pppd[5836]: Serial link appears to be
>>disconnected.
>>Jan 18 18:29:54 Cricket pppd[5836]: Connect time 306.6 minutes.
>>Jan 18 18:29:54 Cricket pppd[5836]: Sent 445560 bytes, received 588608
>>bytes.
>>Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down started
>>(pid 10015)
>>Jan 18 18:29:54 Cricket pppd[5836]: sent [LCP TermReq id=0x2 "Peer not
>>responding"]
>>Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down finished
>>(pid 10015), status = 0x0
>>Jan 18 18:29:57 Cricket pppd[5836]: sent [LCP TermReq id=0x3 "Peer not
>>responding"]
>>Jan 18 18:30:00 Cricket pppd[5836]: Connection terminated.
>
> LCP Echo is used to detect link failure where the link goes down, but
> the modem (which means the central office) doesn't recognize this and
> remains off-hook. pppd is here shown sending two TermReq, but the
> peer isn't responding. The link may have existed at the physical
> level, but was unable to pass packets.
Exactly. This is one of the failure modes. I removed wvdial from the
system entirely, and this behavior seems to have stopped. Fingers
crossed, again.
>>Jan 18 18:31:57 Cricket pppd[5836]: Hangup (SIGHUP)
>>Jan 18 18:31:57 Cricket pppd[5836]: Modem hangup
>
> Nearly two minutes - why? pppd hangs up the modem using the DTR "wire"
> which is emulated in a USB modem. "Module problems"?
I was skeptical the "modem" and "crtscts" options were of any value,
and I was wondering if having them specified in the options file for a
modem without signal lines was causing any of the issues. I suppose
it's possible they could have been at least partly responsible for some
of the odd behavior, although by no means all of it. There is a small
whiff of this being the case, here. I've commented them out.
>>Jan 18 18:31:58 Cricket pppd[5836]: Hangup (SIGHUP)
>>Jan 18 18:32:04 Cricket chat[10941]: Can't get terminal parameters:
>>Input/output error
>
> On an attempted restart, the serial port wasn't a serial port.
Except that minicom is still able to talk to it. I was able to
reproduce this behavior a number of times. Sometimes it gives this
response. Other times it is just silent.
>>That's it. Pppd did not shut down, did not release the lockfile, and
>>did not release /dev/ttyACM0. I shut down pppd with `kill -9`, and
>>re-started it.
>
> Try to avoid using -SIGKILL (-9). The results are messy. -SIGTERM
> (-15) should be enough.
Not when it locks up like this. kill -9 is the only thing which seems
to be able to shut down pppd when this happens. The problem is, it
won't restart when I do. If you have any other ideas for cleanly
shutting down and restarting the ppp session, I'm all ears.
>>After restart, it produced only a single line in syslog:
>>
>>Jan 18 19:07:17 Cricket pppd[30605]: pppd 2.4.4 started by root, uid 0
>>
>> Again, it produced a lockfile and grabbed /dev/ttyACM0, but it
>>did not run the chat script or anything else.
>
> I'm assuming you used 'fuser' or 'lsof' to determine that pppd had the
> device.
Yes. I also looked in /var/lock/, and even after removing the lockfile
manually (yes, I know the lockfile is old school), starting pppd again
or running minicom reports the lock still in place. I had another
failure just a few minutes ago. Lsof again reports /dev/ttyACM0 in use
by pppd. Sending a SIGTERM or a SIGHUP has no apparent effect. After
issuing SIGKILL, upon restart, pppd hangs. Minicom can still talk to
the modem, get it to dial, get a carrier, and receive the ppp
negotiation packet.
> Assuming your file system does update access time (some people
> are mounting -noatime)
No, I'm not mounting the partition with noatime, just with
rw,errors=remount-ro.
> , did 'ls -lu /usr/sbin/chat' show current
> access? What about the script that contains the dialing commands?
The chat script, or the call file? What about them? Do you mean were
they showing to be in-use? No, they aren't. Lsof shows quite a few
files in use by pppd, but not them. They aren't in use by anything.
>>Jan 18 20:11:24 Cricket pppd[3062]: pppd 2.4.4 started by root, uid 0
>>Jan 18 20:11:25 Cricket pppd[3062]: Connect script failed
>>Jan 18 20:12:03 Cricket pppd[3062]: Serial connection established.
>
> Below
>
>>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x1 <asyncmap
>>0x0> <auth chap MD5> <magic 0xf27b8c09> <pcomp> <accomp>]
>>Jan 18 20:12:09 Cricket pppd[3062]: sent [LCP ConfNak id=0x1 <auth pap>]
>>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfAck id=0x1 <asyncmap
>>0x0> <magic 0xeaf2ceb9> <pcomp> <accomp>]
>>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x2 <asyncmap
>>0x0> <auth pap> <magic 0xf27b8c09> <pcomp> <accomp>]
>
> The peer requested you authenticate by CHAP-MD5 (RFC1994), and you
> refused, suggesting PAP (RFC1334) instead. This time, the peer
> accepted it. On 'Jan 9 09:49:00', you showed CHAP authentication
> working - what changed?
I was wondering that, myself. Under wvdial, the systems were
authenticating using CHAP. When I moved over to chat, I copied the
pppd command line directly from the wvdial command line. I used the
same calling file and had not yet made any changes to the options file,
yet they started authenticating using PAP. I was puzzled why, but it
did seem to be working well enough not to need my immediate attention.
I made several other changes, during which time pppd continued to
authenticate using PAP. When I removed wvdial, it instantly reverted
to CHAP authentication. I know it sounds a bit fishy, but I swear to
you I made no other configuration changes when I removed wvdial, and it
was authenticating using PAP immediately prior to removal and CHAP
immediately after removing wvdial.
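To back that up with numbers rather than memory, the log can be tallied by authentication method. This is a sketch only: it assumes pppd logs a "CHAP authentication succeeded" line analogous to the PAP one shown in this thread, and the function name is made up:

```shell
#!/bin/sh
# auth_methods_seen: read pppd log lines on stdin and count how many
# sessions authenticated with PAP and how many with CHAP.
auth_methods_seen() {
    awk '/PAP authentication succeeded/  { pap++ }
         /CHAP authentication succeeded/ { chap++ }
         END { printf "PAP=%d CHAP=%d\n", pap, chap }'
}

# Hypothetical usage, before and after a configuration change:
#   grep pppd /var/log/syslog | auth_methods_seen
```

Comparing the counts from the rotated logs on either side of the wvdial removal would show whether the switch really coincided with it.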
> I'd suggest having identical pap-secrets
> and chap-secrets files, and letting pppd use which-ever it needed.
That's the way it is set up, and I never changed it.
>>Jan 18 20:12:10 Cricket pppd[3062]: sent [CCP ConfReq id=0x1 <deflate
>>15> <deflate(old#) 15> <bsd v1 15>]
>>Jan 18 20:12:10 Cricket pppd[3062]: sent [IPCP ConfReq id=0x1
>><compress VJ 0f 01> <addr 0.0.0.0> <ms-dns1 0.0.0.0> <ms-dns3
>>0.0.0.0>]
>
>>Jan 18 20:12:10 Cricket pppd[3062]: rcvd [LCP ProtRej id=0x1 80 fd 01
>>01 00 0f 1a 04 78 00 18 04 78 00 15 03 2f]
>>Jan 18 20:12:10 Cricket pppd[3062]: Protocol-Reject for 'Compression
>>Control Protocol' (0x80fd) received
>>Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1
>><compress VJ 0f 01>]
>
> This peer doesn't do CCP (RFC1962), or Van Jacobson header compression
> (RFC1144). No big deal, but you _could_ add 'noccp' and 'novj' if it
> did become a problem. It's _not_ a problem now.
Yeah, I spotted that, too. I figured I would leave it.
>>Jan 18 20:11:25 Cricket chat[3106]: abort on (BUSY)
>>Jan 18 20:11:25 Cricket chat[3106]: send (AT^M)
>>Jan 18 20:11:25 Cricket chat[3106]: expect (OK)
>>Jan 18 20:11:25 Cricket chat[3106]: AT^M^M
>>Jan 18 20:11:25 Cricket chat[3106]: OK
>
> An empty 'Hayes' command prefix - not very reliable at initializing
> the modem.
To the best of my ability to tell, neither ATZ nor AT&F do anything,
although the modem responds with, "OK". The manual I have lists
AT+WRST as the wireless reset command. When I try that, it returns an
error.
>>Jan 18 20:11:25 Cricket chat[3106]: send (ATDT#777^M)
>>Jan 18 20:11:25 Cricket chat[3106]: expect (CONNECT)
>>Jan 18 20:11:25 Cricket chat[3106]: ^M
>>Jan 18 20:11:25 Cricket chat[3106]: ATDT#777^M^M
>>Jan 18 20:11:25 Cricket chat[3106]: NO CARRIER
>>Jan 18 20:11:25 Cricket chat[3106]: -- failed
>
> You told the modem to dial - it responded with 'NO CARRIER' rather
> than the expected 'CONNECT blah, blah, blah'. That's a modem issue.
That depends on what you mean by "modem". There are a finite number of
carriers available to each antenna on a cellular base station. It's
not terribly unusual to be denied a channel. I'm sure you have
encountered a "Network busy - try again" response from your cell phone
from time to time even though you have plenty of bars. My guess would
be a full antenna face, although of course it could be a problem with
the modem itself.
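If the NO CARRIER really is just a busy cell, a bounded retry with a pause is all that is needed. A generic sketch, with a hypothetical function name and made-up numbers in the usage comment:

```shell
#!/bin/sh
# retry TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most TRIES times, sleeping
# DELAY_SECONDS between attempts. Returns 0 on success, 1 otherwise.
retry() {
    tries=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        [ "$n" -ge "$tries" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Hypothetical usage (the chat script path is an assumption):
#   retry 5 30 chat -v -f /etc/chatscripts/cricket
```

With the persist option pppd already redials on its own, so a wrapper like this mainly matters for manual tests or for scripts that run chat directly.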
>>Jan 18 20:11:57 Cricket chat[3652]: send (ATDT#777^M)
>>Jan 18 20:11:57 Cricket chat[3652]: expect (CONNECT)
>>Jan 18 20:11:57 Cricket chat[3652]: ^M
>>Jan 18 20:12:03 Cricket chat[3652]: ATDT#777^M^M
>>Jan 18 20:12:03 Cricket chat[3652]: CONNECT
>>Jan 18 20:12:03 Cricket chat[3652]: -- got it
> You tried again, and the modem took six seconds, but this time
> reported it found another modem to talk to.
>
>>Finally, sometimes the modem responds and dials out, but the
>>authentication fails.
>
> Above - '/bin/cp /etc/ppp/pap-secrets /etc/ppp/chap-secrets'
They are already identical in the uncommented sections.
Here is the pppd log from the latest lockup. Nothing I tried short of
a reboot would get pppd to invoke the chat script and move forward.
I'm open to suggestions. On the bright side, a soft reboot does allow
the system to come back up, and although it would be rather draconian
to reboot every time just to clear such an issue, it is an option. So
far the really nasty hardware lockups have not returned.
Below is the log from the latest lock-up, experienced when the
following command was issued:
`ps -ef | grep -q pppd` && kill -1 `pidof pppd`
I am issuing the command every couple of hours, and the third time I
issued it in this session, pppd hung.
Jan 19 19:35:29 Cricket pppd[2783]: Hangup (SIGHUP)
Jan 19 19:35:29 Cricket pppd[2783]: Connect time 152.1 minutes.
Jan 19 19:35:29 Cricket pppd[2783]: Sent 1551485 bytes, received 1648658
bytes.
Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down started (pid
10026)
Jan 19 19:35:29 Cricket pppd[2783]: sent [LCP TermReq id=0xe "User
request"]
Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down finished
(pid 10026), status = 0x0
Jan 19 19:35:32 Cricket pppd[2783]: sent [LCP TermReq id=0xf "User
request"]
Jan 19 19:35:35 Cricket pppd[2783]: Connection terminated.
Jan 19 19:39:00 Cricket pppd[2783]: ioctl(TIOCSETD, N_TTY): Interrupted
system call (line 571)
Jan 19 19:39:01 Cricket pppd[2783]: tcsetattr: Interrupted system call
(line 1010)
Jan 19 19:39:01 Cricket pppd[2783]: Hangup (SIGHUP)
Jan 19 19:39:01 Cricket pppd[2783]: Modem hangup
Jan 19 19:39:02 Cricket pppd[2783]: Hangup (SIGHUP)
When I kill and restart pppd, I get:
Jan 19 20:05:37 Cricket pppd[13614]: pppd 2.4.4 started by root, uid 0
and then nothing.
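On the periodic hangup itself: the `ps -ef | grep -q pppd` test will usually match the grep process too, so it does not really guard the kill. A sketch that reads the PID from pppd's pidfile instead, assuming pppd writes /var/run/ppp0.pid on this system (that path is an assumption; the function name is hypothetical):

```shell
#!/bin/sh
# hup_from_pidfile PIDFILE
# Sends SIGHUP to the process named on the first line of PIDFILE.
# Returns 1 if the file is unreadable, does not start with a number,
# or names a process that no longer exists.
hup_from_pidfile() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(head -n 1 "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;             # empty or not a number
    esac
    kill -0 "$pid" 2>/dev/null || return 1   # no such process
    kill -HUP "$pid"
}

# Hypothetical usage from cron:
#   hup_from_pidfile /var/run/ppp0.pid || logger -t ppp-hup "pppd not running"
```

This does not explain why the third SIGHUP wedged pppd, but it removes one source of noise from the experiment.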
> On 2010-01-19, lrhorer <lrh...@satx.rr.com> wrote:
>>
>>>>> You actually do not have debugging switched on for your pppd
>>>>> (option debug to pppd)
>>>>
>>>> Yes, I do. The debug directive is in the options file.
>>>
>>> Good. But what you display is NOT the pppd debug logs.
>>
>> Then pppd isn't creating them, unless it is someplace other
>> than /var/log/. The debug option is in the options file, as I
>> mentioned.
>>
>>> Did you put the line
>>> daemon.*;local2.* /var/log/daemonlog
>>
>> I already responded to that. I did not have to. It was
>> already there.
>
> Then you have a non-standard version of pppd.
> With the debug option, pppd writes detailed reports to the daemon
> syslog facility.
Cricket:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options
asyncmap 0
usehostname
noipdefault
usepeerdns
defaultroute
auth
local
lock
hide-password
debug
lcp-echo-interval 30
lcp-echo-failure 4
noipx
Cricket:/etc# egrep -v '#|^ *$' rsyslog.conf
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$FileOwner root
$FileGroup adm
$FileCreateMode 0640
$DirCreateMode 0755
$IncludeConfig /etc/rsyslog.d/*.conf
auth,authpriv.* /var/log/auth.log
*.*;auth,authpriv.none -/var/log/syslog
daemon.*; local2.* -/var/log/daemon.log
kern.* -/var/log/kern.log
lpr.* -/var/log/lpr.log
mail.* -/var/log/mail.log
user.* -/var/log/user.log
<snip>
What more can I say?
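One way to settle whether that selector line is actually routing daemon messages, independent of pppd, is to inject a marker with logger and look for it. A sketch; the helper name and the tag are made up:

```shell
#!/bin/sh
# marker_in_log FILE MARKER
# Returns 0 if the literal string MARKER appears anywhere in FILE.
marker_in_log() {
    grep -qF -- "$2" "$1"
}

# Hypothetical usage on the router:
#   m="route-test-$$"
#   logger -p daemon.debug -t pppd-test "$m"
#   sleep 1
#   marker_in_log /var/log/daemon.log "$m" && echo routed || echo "not routed"
```

If the marker lands in /var/log/syslog but not in daemon.log, the problem is in the rsyslog rules rather than in pppd.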
>>
>>> (or wherever you want to put it) into /etc/syslog.conf, and restart
>>> syslogd:
>>> killall -1 syslogd
>>
>> Again, as I already mentioned, this isn't necessary. The router has
>> been both cold and warm booted many, many times, and the logging
>> parameters were already there when the system was created.
>>
>> Oh, and just BTW, this distro does not use syslogd. It's using
>> rsyslogd, not that it really matters.
>
> I do not know rsyslogd. Are you sure it uses /etc/syslog.conf as its
> configuration file?
No, it does not:
Cricket:/etc# find / -name "*syslogd*"
/usr/sbin/rsyslogd
Cricket:/etc# find /etc -name "*syslog*"
/etc/init.d/rsyslog
/etc/rsyslog.d
/etc/rc2.d/S10rsyslog
/etc/rc1.d/K90rsyslog
/etc/rc6.d/K90rsyslog
/etc/default/rsyslog
/etc/rc3.d/S10rsyslog
/etc/logrotate.d/rsyslog
/etc/rsyslog.conf
/etc/rc0.d/K90rsyslog
/etc/rc5.d/S10rsyslog
/etc/rc4.d/S10rsyslog
>> There was a lock-up just an hour ago, although this time without a USB
>> bus reset. I've enabled the persist option in pppd, at least for the
>> time being. It does allow the system to recover from an ordinary idle
>> time-out. This event did not involve a USB reset, /dev/ttyACM0 stayed
>> online, and the modem was able to respond to AT commands. It would not
>> dial, however, either using pppd or minicom. The first response to a
>> single AT command was "ERROR" after I shut down pppd and ran minicom
>> to do some manual testing.
>>
>> Here is the log from the event:
>>
>> Jan 18 18:29:54 Cricket pppd[5836]: No response to 4 echo-requests
>> Jan 18 18:29:54 Cricket pppd[5836]: Serial link appears to be
>> disconnected.
> OK, pppd is using echoes to test if the link is up. It is failing. I.e.,
> the connection has gone down before pppd goes down.
>
> This looks more and more like a hardware flaw, either a bug or a bad
> design. There is NOTHING that software can do to fix it. Only
Nonsense. Since the modem can still dial out, still connect to the
carrier, and still get a ppp session request, clearly the hardware is
still working in this failure mode. Worst case, for this failure mode,
I can reboot the router. It's a Draconian approach, but it works, and
it is pure software. Someone suggested I unbind the ACM driver and
re-bind it. I may try that, although the fact the modem still works
under minicom suggests it isn't ACM.
>> Jan 18 18:29:54 Cricket pppd[5836]: Connect time 306.6 minutes.
>> Jan 18 18:29:54 Cricket pppd[5836]: Sent 445560 bytes, received
>> 588608 bytes.
>> Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down started
>> (pid 10015)
>> Jan 18 18:29:54 Cricket pppd[5836]: sent [LCP TermReq id=0x2 "Peer
>> not responding"]
>> Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down finished
>> (pid 10015), status = 0x0
>
> pppd has shut itself down.
>
>
>> Jan 18 19:07:17 Cricket pppd[30605]: pppd 2.4.4 started by root, uid
>> 0
>>
>> Again, it produced a lockfile and grabbed /dev/ttyACM0, but it did not
>> run the chat script or anything else. I killed pppd again, and ran
>> minicom to see if the modem was responding on /dev/ttyACM0. It was.
>> I pulled the modem and reseated it, running pppd, again:
>
> ????
You'll have to make this question a bit more specific before I can
answer it.
> OK, now we are seeing valid pppd debug output.
>
> Where did you get this from?
/var/log/syslog. As I explained in another message, pppd started
spitting out a greater breadth of messages into /var/log/syslog when I
quit calling it from wvdial.
>> Jan 18 20:12:10 Cricket pppd[3062]: rcvd [LCP ProtRej id=0x1 80 fd 01
>> 01 00 0f 1a 04 78 00 18 04 78 00 15 03 2f]
>> Jan 18 20:12:10 Cricket pppd[3062]: Protocol-Reject for 'Compression
>> Control Protocol' (0x80fd) received
>
> Put in a noccp option.
I don't suppose it will hurt. I seriously doubt it will help, but I
suppose one never knows.
>> Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1
>> <compress VJ 0f 01>]
>
> And novj
Ditto.
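
As a rough illustration of the two suggestions above, the rejects in the debug log can be matched to the pppd option that stops asking for the rejected feature. This is only a sketch: the function name and the matching strings are assumptions based on the two log lines quoted in this thread, not a general pppd log parser.

```python
def suggest_options(log_lines):
    """Return pppd options that would stop requesting features the peer rejected.

    Only the two cases seen in this thread are handled:
    - an LCP Protocol-Reject for CCP (protocol 0x80fd) -> 'noccp'
    - an IPCP ConfRej of Van Jacobson compression      -> 'novj'
    """
    found = []
    for line in log_lines:
        if "Protocol-Reject" in line and "0x80fd" in line:
            option = "noccp"
        elif "IPCP ConfRej" in line and "compress VJ" in line:
            option = "novj"
        else:
            continue
        if option not in found:
            found.append(option)
    return found


# Log lines taken from the excerpt above, with the mail line-wrapping undone.
sample = [
    "Jan 18 20:12:10 Cricket pppd[3062]: Protocol-Reject for "
    "'Compression Control Protocol' (0x80fd) received",
    "Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1 "
    "<compress VJ 0f 01>]",
]
print(suggest_options(sample))
```

Neither option is likely to cure a hang, as noted above; they only remove a pointless negotiation round trip at connect time.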
>> fails. In others, the modem can be switched, but udev doesn't
>> recognize it. In yet others, udev recognizes the modem and creates
>> the /dev/ttyACM0 device, but the modem doesn't respond. Finally,
>> sometimes the modem responds and dials out, but the authentication
>> fails.
>
> Again hardware.
I seem to remember hearing someone else pointing that out. There
haven't been any hardware faults since the USB port was replaced. It's
too soon to tell for certain if this is meaningful, but a new USB card
was $24, and I can return it if it doesn't fix the problem.
>Cricket:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options
>asyncmap 0
Not needed - that's the default
>usehostname
Are you sure you need this?
>noipdefault
>usepeerdns
>defaultroute
>auth
That usually prevents dialin, as most ISPs will _not_ authenticate to
you.
>local
I realize your USB device lacks modem control lines, but most
emulations I've seen work without this.
>noipx
I haven't seen an ISP offering Novell IPX connections in over 15 years.
Old guy
>Moe Trin wrote:
>> Minor confusion - your log shows pppd unable to communicate with the
>> modem between (at least) 18:29:54 and 20:11:24.
>Right. I was checking some things. When I shut down pppd eventually,
>the first attempt to talk to the modem (AT^M) caused it to respond
>with, "ERROR". A subsequent AT^M produced an, "OK".
As noted below, AT is a command prefix - assuming that is the modem
that is responding with the 'ERROR' message, I'd expect that it means
that the modem is in 'command' mode, but the command registers
contained some other "stuff" and the 'AT' and carriage return were
interpreted as the end of some unrecognized command.
>> Nearly two minutes - why? pppd hangs up the modem using the DTR
>> "wire" which is emulated in a USB modem. "Module problems"?
>I was skeptical the "modem" and "crtscts" options were of any value,
>and I was wondering if having them specified in the options file for a
>modem without signal lines was causing any of the issues. I suppose
>it's possible they could have been at least partly responsible for some
>of the odd behavior, although by no means all of it. There is a small
>whiff of this being the case, here. I've commented them out.
For dialout, the 'modem' option specifies using the control lines to
handle the on/off hook commands. 'crtscts' is for flow control. What
are you using in place of these functions? ATH1/ATH0? XON/XOFF?
If the latter, you want 'asyncmap 0xa0000' and 'xonxoff' options.
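
For anyone wondering where 0xa0000 comes from: the asyncmap is a 32-bit mask in which bit n asks the peer to escape control character n. XON is 0x11 and XOFF is 0x13, so the mask sets bits 17 and 19. A small sketch (the helper name is made up for illustration):

```python
XON = 0x11   # DC1, Ctrl-Q
XOFF = 0x13  # DC3, Ctrl-S


def asyncmap_for(control_chars):
    """Build an asyncmap value: bit n set means 'escape control character n'."""
    mask = 0
    for char in control_chars:
        if not 0 <= char < 32:
            raise ValueError("asyncmap only covers characters 0x00-0x1f")
        mask |= 1 << char
    return mask


print(hex(asyncmap_for([XON, XOFF])))  # 0xa0000
```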
>>>Jan 18 18:31:58 Cricket pppd[5836]: Hangup (SIGHUP)
>>>Jan 18 18:32:04 Cricket chat[10941]: Can't get terminal parameters:
>>>Input/output error
>> On an attempted restart, the serial port wasn't a serial port.
>'Except that minicom is still able to talk to it. I was able to
>reproduce this behavior a number of times. Sometimes it gives this
>response. Other times it is just silent.
That's great - but the device handed to pppd wasn't a modem at that
time.
>> Try to avoid using -SIGKILL (-9). The results are messy. -SIGTERM
>> (-15) should be enough.
>`Not when it locks up like this. Kill -9 is the only thing which seems
>to be able to shut down pppd when this happens.
You are aware that kill -9 doesn't allow a clean shutdown, so
>The problem is, it won't restart when I do.
this shouldn't be a surprise. Looking at your other response
(Message-ID: <M6OdnQl2lLMi5svW...@giganews.com> dated
19 Jan 2010 21:42:55), I don't see a port speed setting. Is that
done on the command line? (ANU pppd for Linux only accepts specific
port speeds in the sourcefile ./pppd/sys-linux.c based on settings
in /usr/include/termbits.h files, and ignores speed commands not
matching those settings.) If you have a speed mismatch, things
obviously won't work. By the way - in about 17 years, I've rarely
seen pppd wedged to the point that -SIGTERM didn't do the job (if
kill itself was functional - not blocked by I/O actions).
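
The "SIGTERM first, SIGKILL only as a last resort" advice can be sketched as below. This is a minimal illustration, not a tested watchdog: the function names, the grace period and the polling interval are all assumptions, and the signal-sending and liveness checks are passed in as parameters so the logic can be exercised without a real pppd.

```python
import os
import signal
import time


def pid_alive(pid):
    """Probe for a process with signal 0, which checks existence without delivering anything."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_escalating(pid, grace=10.0, poll=0.5,
                    send=os.kill, alive=pid_alive, sleep=time.sleep):
    """Send SIGTERM, wait up to `grace` seconds, then fall back to SIGKILL.

    Returns the name of the last signal that was sent.
    """
    send(pid, signal.SIGTERM)
    waited = 0.0
    while waited < grace:
        if not alive(pid):
            return "SIGTERM"
        sleep(poll)
        waited += poll
    if not alive(pid):
        return "SIGTERM"
    send(pid, signal.SIGKILL)
    return "SIGKILL"
```

Note that a process stuck in uninterruptible I/O inside the kernel will not act on any signal, SIGKILL included, until that I/O returns, so even the fallback is not guaranteed to clear a wedged serial device.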
>If you have any other ideas for cleanly shutting down and restarting
>the ppp session, I'm all ears.
>Sending a SIGTERM or a SIGHUP has no apparent effect. After issuing
>SIGKILL, upon restart, pppd hangs. Minicom can still talk to the
>modem, get it to dial, get a carrier, and receive the ppp
>negotiation packet.
Minicom does its own initialization - including port speed. Have you
compared the minicom setup with what you are using in pppd? Patrick
Klos (Klos Technologies, Inc pat...@klos.com) used to have a serial
port analyzer - do you have anything that works in similar fashion
to see what is being sent to the USB port?
>> did 'ls -lu /usr/sbin/chat' show current access? What about the
>> script that contains the dialing commands?
>The chat script, or the call file? What about them? Do you mean were
>they showing to be in-use? No, they aren't.
Trying to isolate if pppd can even talk to the serial device.
>> The peer requested you authenticate by CHAP-MD5 (RFC1994), and you
>> refused, suggesting PAP (RFC1334) instead. This time, the peer
>> accepted it. On 'Jan 9 09:49:00', you showed CHAP authentication
>> working - what changed?
> I was wondering that, myself. Under wvdial, the systems were
>authenticating using CHAP. When I moved over to chat, I copied the
>pppd command line directly from the wvdial command line.
You've got such confusion about what is being called from the command
line, what options, and what results into the logs. pppd will refuse
to authenticate if it can't identify a specific secret to use. The
'usehostname' you show in the other message might be a conflict. But
you also show the 'auth' option, yet I don't see it demanding the peer
to authenticate to you. WTF? You might add the 'dump' option to see
what all is being declared from where.
>I copied the pppd command line directly from the wvdial command line.
>I used the same calling file
'calling file' meaning?
>and had not yet made any changes to the options file, yet they
>started authenticating using PAP. I was puzzled why, but it did seem
>to be working well enough not to need my immediate attention. I made
>several other changes, during which time pppd continued to
>authenticate using PAP.
Most needed options can be called directly from the command line, but
I usually put only the ISP specific stuff there, and put the rest in
/etc/ppp/options. If this is the only setup, you could put everything
in the options files. Thus, I showed only
[galileo ~]$ cat /etc/ppp/options | column
lock crtscts nodetach defaultroute
/dev/modem modem 115200 noipdefault
[galileo ~]$ cat /usr/local/bin/dialin.example
exec /usr/sbin/pppd user ibup...@example.com connect "/usr/sbin/chat
ABORT BUSY \"\" AT\&F1 OK ATDT2662902 CONNECT \"\d\c\""
[galileo ~]$
The common to all ISP stuff is in /etc/ppp/options, and only the stuff
unique to this ISP and this telephone number is in the command line.
In addition to what I show here, you need 'usepeerdns', 'debug', the
'lcp-echo-*' stuff and perhaps 'persist'. What else have you got in
there, and why?
>When I removed wvdial, it instantly reverted to CHAP authentication. I
>know it sounds a bit fishy, but I swear to you I made no other
>configuration changes when I removed wvdial, and it was authenticating
>using PAP immediately prior to removal and CHAP immediately after
>removing wvdial.
>> I'd suggest having identical pap-secrets and chap-secrets files, and
>> letting pppd use which-ever it needed.
>That's the way it is set up, and I never changed it.
Obviously wvdial was adding something that caused pppd to assume that
there was no valid CHAP authentication token. Best I can think of is
something relating to username, or (remote) hostname. For what it's
worth now, the 'dump' or even 'dryrun' option might have helped
isolate that problem - moot now.
>> An empty 'Hayes' command prefix - not very reliable at initializing
>> the modem.
>To the best of my ability to tell, neither ATZ nor AT&F do anything,
>although the modem responds with, "OK". The manual I have lists
>AT+WRST as the wireless reset command. When I try that, it returns an
>error.
(That's why I like my USR modems - they have a built in command help
so I can see what all of the commands are.) There is a
non-standardized Hayes style command 'ATIn' (read 'atiN') where 'n' is
a digit that causes the modem to 'dump' its current settings to stdout.
Some other modems used AT&V for the same function. You seem to be
able to get minicom to do some things - what modem setup is it using?
>> You told the modem to dial - it responded with 'NO CARRIER' rather
>> than the expected 'CONNECT blah, blah, blah'. That's a modem issue.
>That depends on what you mean by "modem". There are a finite number
>of carriers available to each antenna on a cellular base station.
>It's not terribly unusual to be denied a channel. I'm sure you have
>encountered a "Network busy - try again" response from your cell phone
>from time to time even though you have plenty of bars.
Actually, no - but then I don't make that many calls on the cell. I'd
expect a modem to report 'BUSY'. 'NO CARRIER' generally means that the
call went through (or seemed to), and the call was answered - it just
wasn't able to negotiate a connection. The analog equivalent would be
either no "whistling" or incompatible "whistling". (poetic license #513)
>Here is the pppd log from the latest lockup. Nothing I tried short of
>a reboot would get pppd to invoke the chat script and move forward.
>I'm open to suggestions. On the bright side, a soft reboot does allow
>the system to come back up, and although it would be rather draconian
>to reboot every time just to clear such an issue, it is an option. So
>far the really nasty hardware lockups have not returned.
As mentioned several times before - the reboot is resetting things to
sane values.
>Below is the log from the latest lock-up, experienced when the
>following command was issued:
>`ps -ef | grep -q pppd` && kill -1 `pidof pppd`
[compton ~]$ whatis killall
killall (1) - kill processes by name
[compton ~]$
That's the BSD style, not SysV (sometimes known as 'killall5').
Re the -SIGHUP versus -SIGTERM or even -SIGINT, -SIGHUP is usually
only used if you have the 'persist' option.
>I am issuing the command every couple of hours, and the third time I
>issued it in this session, pppd hung.
>Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down started (pid
>10026)
>Jan 19 19:35:29 Cricket pppd[2783]: sent [LCP TermReq id=0xe "User
>request"]
>Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down finished
>(pid 10026), status = 0x0
>Jan 19 19:35:32 Cricket pppd[2783]: sent [LCP TermReq id=0xf "User
>request"]
>Jan 19 19:35:35 Cricket pppd[2783]: Connection terminated.
Comment: Peer isn't TermAck'ing the 'TermReq' - it should. Things are
running slowly. I'm used to a snappier response. This is the result of
a '/usr/bin/killall pppd':
Jan 17 11:34:42 kepler pppd[24285]: Terminating on signal 15.
Jan 17 11:34:42 kepler pppd[24285]: sent [LCP TermReq id=0x2 "User request"]
Jan 17 11:34:43 kepler pppd[24285]: rcvd [LCP TermAck id=0x2]
Jan 17 11:34:43 kepler pppd[24285]: Connection terminated.
Jan 17 11:34:44 kepler pppd[24285]: Exit.
>Jan 19 19:39:01 Cricket pppd[2783]: Hangup (SIGHUP)
>Jan 19 19:39:01 Cricket pppd[2783]: Modem hangup
>Jan 19 19:39:02 Cricket pppd[2783]: Hangup (SIGHUP)
Three and a half minutes to shut things down? That's a problem.
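
The gap can be read straight off the syslog timestamps. A small sketch, assuming the traditional "Mon DD HH:MM:SS" prefix shown in these logs (which carries no year, so 2010 is supplied from the message dates in this thread) and an English/C locale for the month name:

```python
from datetime import datetime


def syslog_time(line, year=2010):
    """Parse the leading 'Jan 19 19:35:29' stamp of a traditional syslog line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def elapsed_seconds(first_line, last_line):
    """Whole seconds between the timestamps of two syslog lines."""
    delta = syslog_time(last_line) - syslog_time(first_line)
    return int(delta.total_seconds())


first = 'Jan 19 19:35:29 Cricket pppd[2783]: sent [LCP TermReq id=0xe "User request"]'
last = "Jan 19 19:39:01 Cricket pppd[2783]: Hangup (SIGHUP)"
print(elapsed_seconds(first, last))  # 212 seconds, i.e. 3 min 32 s
```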
>When I kill and restart pppd, I get:
>
>Jan 19 20:05:37 Cricket pppd[13614]: pppd 2.4.4 started by root, uid 0
>
>and then nothing.
25 minutes later. Did /usr/sbin/chat even start?
Old guy
Yes, I know.
> that is responding with the 'ERROR' message, I'd expect that it means
> that the modem is in 'command' mode, but the command registers
> contained some other "stuff" and the 'AT' and carriage return were
> interpreted as the end of some unrecognized command.
Obviously. The question - more or less - is, "Why is pppd sending
garbage to the modem?" The bigger question is, "Why is it hanging?"
>>I was skeptical the "modem" and "crtscts" options were of any value,
>>and I was wondering if having them specified in the options file for a
>>modem without signal lines was causing any of the issues. I suppose
>>it's possible they could have been at least partly responsible for
>>some
>>of the odd behavior, although by no means all of it. There is a small
>>whiff of this being the case, here. I've commented them out.
>
> For dialout, the 'modem' option specifies using the control lines to
> handle the on/off hook commands. 'crtscts' is for flow control. What
> are you using in place of these functions? ATH1/ATH0? XON/XOFF?
> If the latter, you want 'asyncmap 0xa0000' and 'xonxoff' options.
None at all. It's not surprising the modem doesn't require any form of
flow control. After all, it doesn't have a UART. I doubt it would
respond to flow control of any sort, especially software flow control.
>>>>Jan 18 18:31:58 Cricket pppd[5836]: Hangup (SIGHUP)
>>>>Jan 18 18:32:04 Cricket chat[10941]: Can't get terminal parameters:
>>>>Input/output error
>
>>> On an attempted restart, the serial port wasn't a serial port.
>
>>'Except that minicom is still able to talk to it. I was able to
>>reproduce this behavior a number of times. Sometimes it gives this
>>response. Other times it is just silent.
>
> That's great - but the device handed to pppd wasn't a modem at that
> time.
Yeah, it was. Pppd just did not properly recognize it. The device
wasn't "handed" to pppd. Pppd was already controlling it with no
problems on the ttyACM0 device. It is only when pppd receives a hangup
request that it goes south. This, whether the carrier is dropped by
the ISP or a SIGHUP is issued.
>
>>> Try to avoid using -SIGKILL (-9). The results are messy. -SIGTERM
>>> (-15) should be enough.
>
>>`Not when it locks up like this. Kill -9 is the only thing which
>>seems to be able to shut down pppd when this happens.
>
> You are aware that kill -9 doesn't allow a clean shutdown, so
Yes, but what other option do I have, other than rebooting the system?
>>The problem is, it won't restart when I do.
>
> this shouldn't be a surprise. Looking at your other response
> (Message-ID: <M6OdnQl2lLMi5svW...@giganews.com> dated
> 19 Jan 2010 21:42:55), I don't see a port speed setting. Is that
Setting the "speed" does nothing. One can set it to 110bps, and the
modem still gets upward of 750 Kbps. Remember, there is no UART, so
setting the bit rate is superfluous. I strongly suspect that setting
bit 7 of the Line Control Register (used to tell the UART one wishes to
set the clock divisor in an 8250 / 16450 / 16550 UART) does nothing at
all.
> done on the command line? (ANU pppd for Linux only accepts specific
> port speeds in the sourcefile ./pppd/sys-linux.c based on settings
> in /usr/include/termbits.h files, and ignores speed commands not
> matching those settings.) If you have a speed mismatch, things
> obviously won't work.
The point is, "they" do work. Pppd is able to speak to the modem until
it is requested by either a SIGHUP or the internal timer to disconnect.
At that point, it disconnects, but locks up tight.
> By the way - in about 17 years, I've rarely
> seen pppd wedged to the point that -SIGTERM didn't do the job (if
> kill itself was functional - not blocked by I/O actions).
Well, now you have.
>>If you have any other ideas for cleanly shutting down and restarting
>>the ppp session, I'm all ears.
>
>>Sending a SIGTERM or a SIGHUP has no apparent effect. After issuing
>>SIGKILL, upon restart, pppd hangs. Minicom can still talk to the
>>modem, get it to dial, get a carrier, and receive the ppp
>>negotiation packet.
>
> Minicom does its own initialization - including port speed. Have you
> compared the minicom setup with what you are using in pppd?
You're missing the point. The fact minicom can still talk to the
modem, but pppd can't talk to anything is significant.
> Patrick
> Klos (Klos Technologies, Inc pat...@klos.com) used to have a serial
> port analyzer - do you have anything that works in similar fashion
> to see what is being sent to the USB port?
No, I don't.
>>> did 'ls -lu /usr/sbin/chat' show current access? What about the
>>> script that contains the dialing commands?
>
>>The chat script, or the call file? What about them? Do you mean were
>>they showing to be in-use? No, they aren't.
>
> Trying to isolate if pppd can even talk to the serial device.
>> I was wondering that, myself. Under wvdial, the systems were
>>authenticating using CHAP. When I moved over to chat, I copied the
>>pppd command line directly from the wvdial command line.
>
> You've got such confusion about what is being called from the command
> line, what options, and what results into the logs. pppd will refuse
> to authenticate if it can't identify a specific secret to use. The
You're not listening. This isn't a configuration issue. Admittedly
wvdial was causing something odd - I have no idea what or how - but as
soon as it was removed, the system began behaving as expected, EXCEPT
that now pppd hangs on exit.
> 'usehostname' you show in the other message might be a conflict. But
> you also show the 'auth' option, yet I don't see it demanding the peer
> to authenticate to you. WTF?
The command options in the calling file override any configuration in
the /etc/ppp/options file. From the directions in the /etc/ppp/options
file:
# Require the peer to authenticate itself before allowing network
# packets to be sent or received.
# Please do not disable this setting. It is expected to be standard in
# future releases of pppd. Use the call option (see manpage) to disable
# authentication for specific peers.
The reason for this request is obvious.
> You might add the 'dump' option to see
> what all is being declared from where.
<Sigh> I know what's being called from where, but if it will assuage
your curiosity:
debug # (from /etc/ppp/options)
maxconnect 7200 # (from /etc/ppp/peers/cricket)
persist # (from /etc/ppp/peers/cricket)
dump # (from command line)
noauth # (from /etc/ppp/peers/cricket)
user (830)388-...@mycricket.com # (from /etc/ppp/peers/cricket)
usehostname # (from /etc/ppp/options)
/dev/ttyACM0 # (from /etc/ppp/peers/cricket)
lock # (from /etc/ppp/options)
connect /usr/sbin/chat -v -f /etc/ppp/chatfile # (from /etc/ppp/peers/cricket)
local # (from /etc/ppp/options)
asyncmap 0 # (from /etc/ppp/options)
lcp-echo-failure 4 # (from /etc/ppp/options)
lcp-echo-interval 30 # (from /etc/ppp/options)
hide-password # (from /etc/ppp/options)
novj # (from /etc/ppp/peers/cricket)
noipdefault # (from /etc/ppp/peers/cricket)
defaultroute # (from /etc/ppp/peers/cricket)
usepeerdns # (from /etc/ppp/peers/cricket)
noccp # (from /etc/ppp/peers/cricket)
noipx # (from /etc/ppp/options)
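
A listing like the one above is easier to audit when grouped by where each option came from. The following is a sketch only: it assumes one option per line in the "option [value] # (from source)" layout shown here, and the function name is made up.

```python
def parse_dump(lines):
    """Map each option name to (value, origin) from 'dump'-style lines.

    Lines without a '# (from ...)' comment are skipped.
    """
    options = {}
    for line in lines:
        setting, sep, origin = line.partition("#")
        setting = setting.strip()
        if not sep or not setting:
            continue
        origin = origin.strip()
        if origin.startswith("(from ") and origin.endswith(")"):
            origin = origin[len("(from "):-1]
        name, _, value = setting.partition(" ")
        options[name] = (value.strip(), origin)
    return options


sample = [
    "debug # (from /etc/ppp/options)",
    "maxconnect 7200 # (from /etc/ppp/peers/cricket)",
    "dump # (from command line)",
    "noauth # (from /etc/ppp/peers/cricket)",
]
for name, (value, origin) in parse_dump(sample).items():
    print(f"{origin:28} {name} {value}".rstrip())
```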
>>I copied the pppd command line directly from the wvdial command line.
>>I used the same calling file
>
> 'calling file' meaning?
Meaning the file used by the call option. It was trivial, and I have
now modified it with the above options.
>>and had not yet made any changes to the options file, yet they
>>started authenticating using PAP. I was puzzled why, but it did seem
>>to be working well enough not to need my immediate attention. I made
>>several other changes, during which time pppd continued to
>>authenticate using PAP.
>
> Most needed options can be called directly from the command line, but
> I usually put only the ISP specific stuff there, and put the rest in
> /etc/ppp/options. If this is the only setup, you could put everything
> in the options files. Thus, I showed only
I chose to keep the global and default options in the options file,
which evidently is what the authors intended and put the vendor
specific and override options in the calling file, which also would
seem to be the developer's intent. Of course, the rc files are
designed to contain user-specific configurations, but there are no
users on this system, and no rc files. The command line is simply
`pppd call cricket`. (Of course in the case above it was `pppd dump
call cricket`.)
> [galileo ~]$ cat /etc/ppp/options | column
> lock crtscts nodetach defaultroute
> /dev/modem modem 115200 noipdefault
> [galileo ~]$ cat /usr/local/bin/dialin.example
> exec /usr/sbin/pppd user ibup...@example.com connect
> "/usr/sbin/chat ABORT BUSY \"\" AT\&F1 OK ATDT2662902 CONNECT
> \"\d\c\""
> [galileo ~]$
>
> The common to all ISP stuff is in /etc/ppp/options, and only the stuff
> unique to this ISP and this telephone number is in the command line.
> In addition to what I show here, you need 'usepeerdns', 'debug', the
> 'lcp-echo-*' stuff and perhaps 'persist'. What else have you got in
> there, and why?
Everything is default, except I commented out the crtscts, modem, and
proxyarp lines and added the debug, usehostname, noipdefault,
usepeerdns, and defaultroute lines.
Look, you're wasting both your time and mine. This isn't a
configuration issue. The modem has successfully dialed out and
disconnected literally hundreds of times, now. If it were a
configuration issue, it likely would not be working, and it most
certainly would not crash only when disconnecting, and most certainly not
intermittently.
>>To the best of my ability to tell, neither ATZ nor AT&F do anything,
>>although the modem responds with, "OK". The manual I have lists
>>AT+WRST as the wireless reset command. When I try that, it returns an
>>error.
>
> (That's why I like my USR modems - they have a built in command help
> so I can see what all of the commands are.) There is a
> non-standardized Hayes style command 'ATIn' (read 'atiN') where 'n' is
> a digit that causes the modem to 'dump' its current settings to
> stdout.
> Some other modems used AT&V for the same function. You seem to be
> able to get minicom to do some things - what modem setup is it using?
AT. Nothing but AT and ATDT seem to have any effect on the modem.
Everything else I tried happily produces an OK or else an ERROR, but
none of them do anything, AFAICT. 'Not really surprising.
> Actually, no - but then I don't make that many calls on the cell. I'd
> expect a modem to report 'BUSY'. 'NO CARRIER' generally means that
> the call went through (or seemed to), and the call was answered - it
> just
> wasn't able to negotiate a connection. The analog equivalent would be
> either no "whistling" or incompatible "whistling". (poetic license
> #513)
It's more like the equivalent of a fast busy on your landline.
> Re the -SIGHUP versus -SIGTERM or even -SIGINT, -SIGHUP is usually
> only used if you have the 'persist' option.
Which I do. Originally I implemented the function in the monitoring
script, but it's cleaner to allow pppd to handle it. I was also hoping
the persist option would alleviate the hang on disconnect. It doesn't.
The issue seems to be somewhat related to the amount of time pppd is
active. If I set maxconnect to a low value like 180, it happily calls,
disconnects, and redials successfully literally dozens of times without
hanging. If I set it to 3600, I can sometimes get 10 or 15 redials
before it hangs. If I set it to 10800, I can get three or four
successful redials. If I set it to 40631, or if I disable maxconnect
and let the ISP hang up after 12.00 hours, I can get maybe 1 or 2 or at
most 3 successful redials, if any at all.
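
One way to eyeball those figures is to convert each maxconnect setting (which pppd takes in seconds) and redial count into total connected hours before the hang. This is only back-of-the-envelope arithmetic on the rough ranges quoted above, under the assumption that every counted redial ran for the full maxconnect period; the "dozens" case for 180 is left out because no number was given.

```python
# maxconnect (seconds) -> (fewest, most) successful redials reported above
observations = {
    3600: (10, 15),
    10800: (3, 4),
    40631: (1, 3),
}


def hours_connected(maxconnect_seconds, redials):
    """Total connected time in hours if each session ran for the full maxconnect."""
    return maxconnect_seconds * redials / 3600


for maxconnect, (low, high) in observations.items():
    print(f"maxconnect {maxconnect:>5}: "
          f"{hours_connected(maxconnect, low):.1f} to "
          f"{hours_connected(maxconnect, high):.1f} hours before a hang")
```

The ranges are too loose to prove anything, but lining them up this way makes it easier to judge whether the hang tracks cumulative connect time or the number of disconnects.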
>>I am issuing the command every couple of hours, and the third time I
>>issued it in this session, pppd hung.
>
>>Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down started
>>(pid 10026)
>>Jan 19 19:35:29 Cricket pppd[2783]: sent [LCP TermReq id=0xe "User
>>request"]
>>Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down finished
>>(pid 10026), status = 0x0
>>Jan 19 19:35:32 Cricket pppd[2783]: sent [LCP TermReq id=0xf "User
>>request"]
>>Jan 19 19:35:35 Cricket pppd[2783]: Connection terminated.
>
> Comment: Peer isn't TermAck'ing the 'TermReq' - it should. Things are
> running slowly. I'm used to a snappier response. This is the result of
> a '/usr/bin/killall pppd':
It's no faster here when using the maxconnect option nor the SIGTERM.
> Jan 17 11:34:42 kepler pppd[24285]: Terminating on signal 15.
> Jan 17 11:34:42 kepler pppd[24285]: sent [LCP TermReq id=0x2 "User request"]
> Jan 17 11:34:43 kepler pppd[24285]: rcvd [LCP TermAck id=0x2]
> Jan 17 11:34:43 kepler pppd[24285]: Connection terminated.
> Jan 17 11:34:44 kepler pppd[24285]: Exit.
>
>>Jan 19 19:39:01 Cricket pppd[2783]: Hangup (SIGHUP)
>>Jan 19 19:39:01 Cricket pppd[2783]: Modem hangup
>>Jan 19 19:39:02 Cricket pppd[2783]: Hangup (SIGHUP)
>
> Three and a half minutes to shut things down? That's a problem.
>
>>When I kill and restart pppd, I get:
>>
>>Jan 19 20:05:37 Cricket pppd[13614]: pppd 2.4.4 started by root, uid 0
>>
>>and then nothing.
>
> 25 minutes later. Did /usr/sbin/chat even start?
That's because it took me 25 minutes to notice the system was hung,
finish what I was doing, go to the other room, and run some tests
before issuing the SIGKILL. No, chat does not start. Pppd does not
respond to removing the modem, sending a SIGTERM, nothing.
> On Tue, 19 Jan 2010, in the Usenet newsgroup comp.os.linux.networking,
> in article <M6OdnQl2lLMi5svW...@giganews.com>, lrhorer
> wrote:
>
>>Cricket:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options
>>asyncmap 0
>
> Not needed - that's the default
I didn't put it in there, but it doesn't seem to cause problems, so I
left it.
>>usehostname
>
> Are you sure you need this?
I'm fairly certain I don't, but again, it was the default, and doesn't
seem to cause any issues.
>>noipdefault
>>usepeerdns
>>defaultroute
>>auth
>
> That usually prevents dialin, as most ISPs will _not_ authenticate to
> you.
Not when it is contradicted in the calling file. See above.
>>local
>
> I realize your USB device lacks modem control lines, but most
> emulations I've seen work without this.
It seems to. I removed the "modem" line and uncommented this one for
testing. It made no difference.
>>noipx
>
> I haven't seen an ISP offering Novell IPX connections in over 15
> years.
It was the default, so I left it. I agree it's unlikely an ISP would
ever support IPX. TCP/IP is definitely the playing field for WAN, and
indeed for more than 90% of LAN. Heck, I feel like an orphan child
because a large amount of our gear runs 7 layer OSI.
>> As noted below, AT is a command prefix - assuming that is the modem
> Yes, I know.
>> that is responding with the 'ERROR' message, I'd expect that it means
>> that the modem is in 'command' mode, but the command registers
>> contained some other "stuff" and the 'AT' and carriage return were
>> interpreted as the end of some unrecognized command.
>Obviously. The question - more or less - is, "Why is pppd sending
>garbage to the modem?
ppp is an eight bit clean protocol - how the modem got into command
mode and assumed that this may have been a command is another question.
It could also be that the modem didn't see all of the initial AT
and was barfing on that.
>The bigger question is, "Why is it hanging?"
Because the USB isn't emulating a serial port that pppd expects by
default. The acm module is supposed to make things act like a
serial port for applications like pppd that were designed with RS-232
in mind.
>> For dialout, the 'modem' option specifies using the control lines to
>> handle the on/off hook commands. 'crtscts' is for flow control. What
>> are you using in place of these functions? ATH1/ATH0? XON/XOFF?
>> If the latter, you want 'asyncmap 0xa0000' and 'xonxoff' options.
>None at all. It's not surprising the modem doesn't require any form of
>flow control. After all, it doesn't have a UART. I doubt it would
>respond to flow control of any sort, especially software flow control.
So how do you expect it to go on/off hook? Perhaps you may want to
experiment with the 'disconnect' option to pppd.
>> That's great - but the device handed to pppd wasn't a modem at that
>> time.
>Yeah, it was. Pppd just did not properly recognize it.
'was' It isn't now.
>> You've got such confusion about what is being called from the command
>> line, what options, and what results into the logs. pppd will refuse
>> to authenticate if it can't identify a specific secret to use. The
> You're not listening.
No, you're not providing all the details.
>> 'usehostname' you show in the other message might be a conflict. But
>> you also show the 'auth' option, yet I don't see it demanding the peer
>> to authenticate to you. WTF?
>The command options in the calling file override any configuration in
>the /etc/ppp/options file. From the directions in the /etc/ppp/options
>file:
That may be, but how are we to guess what you have in the calling
script?
># Require the peer to authenticate itself before allowing network
># packets to be sent or received.
># Please do not disable this setting. It is expected to be standard in
># future releases of pppd. Use the call option (see manpage) to disable
># authentication for specific peers.
Someone hasn't bothered to read the Changelog file - the "future
releases" note was part of the change to 2.3.0, and it was added to
ppp-2.3.6 in 1999 - but that applies only when there is a pre-existing
default route.
>The reason for this request is obvious.
You'll have to ask the person who created that - it is not part of ANU
pppd. If you read the Makefile for Linux, you'll find
$(INSTALL) -c -m 644 etc.ppp/options $@
and the total contents of the 'etc.ppp/options' source is five
characters - the word 'lock' and a newline
-rw-r--r-- paulus/paulus 5 1999-02-26 20:09 ppp-2.4.4/etc.ppp/options
but if you read the 'README' file in the ppp source directory (which
included the ChangeLog data for the 2.4.x versions) you'll find
* Doing `make install' no longer puts example configuration files in
/etc/ppp. Use `make install-etcppp' if you want that.
as part of the changes creating 2.4.3, so ppp doesn't even _install_
an options (or *secrets) file.
The person creating your 'default' /etc/ppp/options seems to have their
own ideas of how things work. The 'asyncmap' originally defaulted to
0x0 from ppp-2.2.0 to ppp-2.3.4, was 0xffffffff in ppp-2.3.5 and 2.3.6
ONLY, and reverted to 0x0 in ppp-2.3.7 - that was in April 1999.
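
For anyone wondering where values like 0x0, 0xffffffff and 0xa0000 come
from: the asyncmap is a 32-bit mask in which bit n stands for the control
character with value n, and a set bit means that character gets escaped.
The sketch below is only an illustration of that arithmetic; the helper
names asyncmap_for and escaped_chars are made up for this example and are
not part of pppd.

```python
# ASCII control characters used for software (XON/XOFF) flow control.
XON = 0x11   # DC1, Ctrl-Q
XOFF = 0x13  # DC3, Ctrl-S


def asyncmap_for(control_chars):
    """Return the 32-bit asyncmap that escapes the given control characters.

    Bit n of the map corresponds to the character with value n (0..31).
    Hypothetical helper, for illustration only.
    """
    mask = 0
    for ch in control_chars:
        if not 0 <= ch < 32:
            raise ValueError(f"asyncmap only covers characters 0..31, got {ch}")
        mask |= 1 << ch
    return mask


def escaped_chars(asyncmap):
    """List the control character values that a given asyncmap escapes."""
    return [n for n in range(32) if asyncmap & (1 << n)]


print(hex(asyncmap_for([XON, XOFF])))   # 0xa0000, the value paired with 'xonxoff'
print(escaped_chars(0x0))               # [] (nothing escaped)
print(len(escaped_chars(0xffffffff)))   # 32 (every control character escaped)
```

So 0xa0000 is just bits 17 and 19, i.e. XON and XOFF, which is why it
only matters when software flow control is in use.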
>> Comment: Peer isn't TermAck'ing the 'TermReq' - it should. Things are
>> running slowly. I'm used to a snappier response. This is the result of
>> a '/usr/bin/killall pppd':
>It's no faster here when using the maxconnect option nor the SIGTERM.
It's indicative of the fact that pppd isn't seeing things as expected,
and it's waiting. Maybe you can use the 'disconnect' option to get
the modem to "hang up" in a timely manner - maybe even reset it
afterwards. That might be something like
/usr/sbin/chat "" \d+++\d\c "" ATH0 OK ATZ OK
though I've never needed this function. It's a Hayes escape sequence
followed by the "hang up the phone" and "reset". You may also have
quoting problems - not sure.
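
On the quoting worry: one way to check it, sketched here in Python rather
than by trial and error on the router, is to build the chat arguments as
a list and let shlex produce the shell-quoted form. This assumes the
disconnect string ends up being split by a POSIX shell; the variable
names are invented for the example.

```python
import shlex

# The chat arguments from the suggestion above, one list element per argument.
# The empty strings are chat's "expect nothing" placeholders.
chat_argv = ["/usr/sbin/chat", "", r"\d+++\d\c", "", "ATH0", "OK", "ATZ", "OK"]

# Quote each argument so a POSIX shell splits the line back into the same list.
disconnect_cmd = " ".join(shlex.quote(arg) for arg in chat_argv)
print(disconnect_cmd)

# Round-trip check: a shell-style split should recover the original arguments,
# including the backslashes that chat itself needs to see.
assert shlex.split(disconnect_cmd) == chat_argv
```

If the round trip holds, the backslash sequences reach chat intact
instead of being eaten by the shell first.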
Old guy
That doesn't prevent it from spewing out garbage when a kernel module
goes haywire and starts barfing all over the protocol stacks.
> mode and assumed that this may have been a command is another
> question. It could also be that the the modem didn't see all of the

> initial AT and was barfing on that.
'Pretty unlikely. As to why it is in command mode, whenever it hangs
up, it is supposed to be in command mode. There was never any problem
with the modem going on-hook.
>>The bigger question is, "Why is it hanging?"
>
> Because the USB isn't emulating a serial port that pppd expects by
> default.
Uh-uh. The ACM module was working, and so was the modem, throughout.
> The acm module is supposed to make things act like a
> serial port for applications like pppd that were designed with RS-232
> in mind.
Correct. Which in turn means the modem DOESN'T act like an RS-232
device. Had ACM or the modem been snockered, then it was unlikely
minicom could have talked to them, and extremely likely pppd could have
talked to them after being yanked and starting over. It couldn't.
Indeed, a modem hang or a burp in the ACM module wouldn't prevent pppd
from shutting down on SIGTERM. This prima facie evidence was confirmed
by debugging.
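
For a headless box, the SIGTERM test above can be automated. What follows
is a minimal sketch, not what was actually run here: it sends SIGTERM,
waits a grace period, and only then escalates to SIGKILL, reporting which
of the two was needed. The function name stop_process and the sleeping
stand-in child are assumptions for illustration; a real watchdog would be
supervising pppd.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=10.0):
    """Ask a child process to exit with SIGTERM, escalating to SIGKILL.

    Returns True if the process exited within the grace period after
    SIGTERM, False if it had to be killed. Hypothetical helper.
    """
    proc.terminate()                      # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()                       # sends SIGKILL on POSIX
        proc.wait()
        return False


if __name__ == "__main__":
    # Stand-in child process; a real watchdog would be supervising pppd.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("exited on SIGTERM:", stop_process(child, grace_seconds=5.0))
```

A False result is the interesting one: a healthy process should go away
on SIGTERM. Note that a process stuck inside the kernel may not respond
even to SIGKILL, in which case the final wait would block.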
>>> For dialout, the 'modem' option specifies using the control lines to
>>> handle the on/off hook commands. 'crtscts' is for flow control.
>>> What
>>> are you using in place of these functions? ATH1/ATH0? XON/XOFF?
No, I'm using nothing at all. It isn't necessary. What gets me is I
don't understand why you were so wrapped up on the configuration
settings. If the configuration settings were bad, then the modem would
have never connected at all, or else never would have hung up at all.
The fact the modem dialed out and got a connect 100% of the time and
dutifully hung up 100% of the time should tell you it isn't a
configuration issue. The only thing that ever failed was pppd, and
then it was only after being connected for a relatively long period of
time. Ditto the fact the modem did not experience any of these
problems under Windows.
>>> If the latter, you want 'asyncmap 0xa0000' and 'xonxoff' options.
>
>>None at all. It's not surprising the modem doesn't require any form
>>of
>>flow control. After all, it doesn't have a UART. I doubt it would
>>respond to flow control of any sort, especially software flow control.
>
> So how do you expect it to go on/off hook?
Off-hook is a no brainer. ATDT. On-hook can be managed by a
virtualization of DTR, or by +++ATH. I would expect the former.
> Perhaps you may want to
> experiment with the 'disconnect' option to pppd.
Nope. The problem was a kernel bug. Upgrading the kernel fixed it.
>>> That's great - but the device handed to pppd wasn't a modem at that
>>> time.
>
>>Yeah, it was. Pppd just did not properly recognize it.
>
> 'was' It isn't now.
'Highly unlikely explanation from the outset. It definitely was not
the case.
>>> You've got such confusion about what is being called from the
>>> command
>>> line, what options, and what results into the logs. pppd will
>>> refuse to authenticate if it can't identify a specific secret to
>>> use. The
>
>> You're not listening.
>
> No, you're not providing all the details.
What point would there have been?
1. I didn't need help troubleshooting. I said that, several times.
2. I can and did read the man page. None of the configuration options
are unclear.
3. By the time you would have read the configuration, responded, and I
read your response, the configuration would have changed at least
several times. I wasn't sitting on my hands, waiting for a response
from this forum. I was trying out possibilities, or at least I was
until I had determined the configuration was sufficiently correct not
to be producing issues.
4. I had already determined it was not a configuration issue.
>>> 'usehostname' you show in the other message might be a conflict.
>>> But you also show the 'auth' option, yet I don't see it demanding
>>> the peer
>>> to authenticate to you. WTF?
>
>>The command options in the calling file override any configuration in
>>the /etc/ppp/options file. From the directions in the
>>/etc/ppp/options file:
>
> That may be, but how are we to guess what you have in the calling
> script?
You didn't need to. As I said, until I had determined it was not the
source of the problem, it was changing regularly. Once I had
determined it was functional and not the cause of the problems, the
question was moot.
>># Require the peer to authenticate itself before allowing network
>># packets to be sent or received.
>># Please do not disable this setting. It is expected to be standard in
>># future releases of pppd. Use the call option (see manpage) to
>># disable authentication for specific peers.
>
> Someone hasn't bothered to read the Changelog file - the "future
> releases" note was part of the change to 2.3.0, and it was added to
> ppp-2.3.6 in 1999 - but that applies only when there is a pre-existing
> default route.
Default route? We were talking about the peer authentication setting,
not the default route.
>>The reason for this request is obvious.
>
> You'll have to ask the person who created that - it is not part of ANU
> pppd. If you read the Makefile for Linux, you'll find
'Beside the point, really. If peer authentication is to be made an
internal default in the future, then placing the "auth" directive in
the options file future-proofs the current configuration. Otherwise,
an upgrade (and upgrades for Debian are either automatic or
semi-automatic) runs the risk of breaking the working configuration.
> $(INSTALL) -c -m 644 etc.ppp/options $@
>
> and the total contents of the 'etc.ppp/options' source is five
> characters - the word 'lock' and a newline
>
> -rw-r--r-- paulus/paulus 5 1999-02-26 20:09
> ppp-2.4.4/etc.ppp/options
>
> but if you read the 'README' file in the ppp source directory (which
> included the ChangeLog data for the 2.4.x versions) you'll find
I didn't download the source. As with almost all users of Debian and
its derivatives, I'm using apt. One rarely downloads the source with
apt, unless one wishes to custom compile a package or there is no .deb
for the utility. Of course, when I needed to compile in custom
settings for pppd for debugging purposes, then I did download the
source.
> * Doing `make install' no longer puts example configuration files in
> /etc/ppp. Use `make install-etcppp' if you want that.
Except when compiling my own code or customizing other people's code,
or in the rather rare case where I am using a non-Debian package, I
don't use make. I let the maintainers worry about it. That's why they
are there. Issuing `apt-get install xxxx` is all one needs for 99% of
applications. No sectional downloading, no worrying about
dependencies or conflicts, no fiddling with compile-time configuration.
>>> Comment: Peer isn't TermAck'ing the 'TermReq' - it should. Things
>>> are running slowly. I'm used to a snappier response. This is the
>>> result of a '/usr/bin/killall pppd':
>
>>It's no faster here when using the maxconnect option nor the SIGTERM.
>
> It's indicative of the fact that pppd isn't seeing things as expected,
> and it's waiting.
No, more likely it was indicative of the kernel bug. Assuming is a bad
idea, and that one's a whopper.
> Maybe you can use the 'disconnect' option to get
> the modem to "hang up" in a timely manner - maybe even reset it
> afterwards. That might be something like
>
> /usr/sbin/chat "" \d+++\d\c "" ATH0 OK ATZ OK
The modem is (and was) connecting and hanging up reliably and virtually
instantly. Perhaps more to the point, since pppd was often hanging
before it even got to the point of calling its external scripts, the
disconnect option wouldn't help. This is doubly the case when the
session is terminated by the modem, rather than pppd.
> virtualization of DTR, or by +++ATH. I would expect the former.
>
>> Perhaps you may want to
>> experiment with the 'disconnect' option to pppd.
>
> Nope. The problem was a kernel bug. Upgrading the kernel
> fixed it.
So what kernel fixed the problem?
--
buck
Well, I didn't attempt to incrementally install kernels until one of
them fixed the problem. I simply upgraded to "Squeeze", which ships
with kernel 2.6.32. On the Intel platform it is 2.6.32-trunk-686.