Reproducible panic - Going nowhere without my init!

Andy Farkas

Oct 3, 2016, 9:15:41 PM
to freebsd...@freebsd.org
Is it just me or....

Step 1: boot
Step 2: login as root
Step 3: type "w<enter>" *
Step 4: type "shutdown now; logout<enter>"
Step 5: press <enter> at the 'Enter full pathname of shell or RETURN for
/bin/sh:' prompt
Step 6: type "reboot<enter>"
Step 7: get a Panic: "Going nowhere without my init!"

* The panic will not happen if you skip step 3.

The panic will not happen if you type "sync; sync; sync" after step 5.

The panic will not happen if you wait some (unknown) amount of time
after step 5.

# uname -a
FreeBSD deepthink 11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #6 r306656: Tue
Oct 4 09:03:05 AEST 2016 root@deepthink:/usr/obj/usr/src/sys/GENERIC amd64

-andyf

ps. apologies, forced to send from a gmail account.
reply-to: an...@andyit.com.au
_______________________________________________
freebsd...@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"

Konstantin Belousov

Oct 4, 2016, 7:25:15 AM
to Andy Farkas, freebsd...@freebsd.org
On Tue, Oct 04, 2016 at 11:14:38AM +1000, Andy Farkas wrote:
> Is it just me or....
>
> Step 1: boot
> Step 2: login as root
> Step 3: type "w<enter>" *
> Step 4: type "shutdown now; logout<enter>"
> Step 5: press <enter> at the 'Enter full pathname of shell or RETURN for
> /bin/sh:' prompt
> Step 6: type "reboot<enter>"
> Step 7: get a Panic: "Going nowhere without my init!"
This means that the init process (pid 1) exited for some reason. Please
show the exact console log of the events.

Konstantin Belousov

Oct 5, 2016, 4:44:04 AM
to Andy Farkas, Andy Farkas, freebsd...@freebsd.org
On Wed, Oct 05, 2016 at 05:32:18AM +1000, Andy Farkas wrote:
> On 04/10/2016 23:11, Andy Farkas wrote:
> > On 04/10/2016 21:24, Konstantin Belousov wrote:
> >> On Tue, Oct 04, 2016 at 11:14:38AM +1000, Andy Farkas wrote:
> >>> Is it just me or....
> >>>
> >>> Step 1: boot
> >>> Step 2: login as root
> >>> Step 3: type "w<enter>" *
> >>> Step 4: type "shutdown now; logout<enter>"
> >>> Step 5: press <enter> at the 'Enter full pathname of shell or RETURN
> >>> for
> >>> /bin/sh:' prompt
> >>> Step 6: type "reboot<enter>"
> >>> Step 7: get a Panic: "Going nowhere without my init!"
> >> This means that init process (pid 1) exited for some reason. Show
> >> exact console log of the events.
>
> I can also offer a (badly taken) photo of the console screen:
>
> http://imgur.com/1xixODY
>
> -andyf
>
> ps. Thank you for taking an interest. I only really wanted to know
> if anyone else could reproduce the panic because it has happened
> on several of my (home network) boxes since 10.0....

Apply the following patch. I am interested in whether anything additional
appears on the console. A screenshot is good enough.

diff --git a/sbin/init/init.c b/sbin/init/init.c
index bda86b5..1e88964 100644
--- a/sbin/init/init.c
+++ b/sbin/init/init.c
@@ -884,8 +884,13 @@ single_user(void)
 	if (Reboot) {
 		/* Instead of going single user, let's reboot the machine */
 		sync();
-		reboot(howto);
-		_exit(0);
+		if (reboot(howto) == -1) {
+			emergency("reboot(%#x) failed, %s", howto,
+			    strerror(errno));
+			_exit(1); /* panic and reboot */
+		}
+		warning("reboot(%#x) returned", howto);
+		_exit(0); /* panic as well */
 	}
 
 	shell = get_shell();

Andy Farkas

Oct 5, 2016, 5:33:04 AM
to Konstantin Belousov, freebsd...@freebsd.org
On 05/10/2016 18:43, Konstantin Belousov wrote:

> Apply the following patch. I am interested in whether anything additional
> appears on the console. A screenshot is good enough.

Patch applied. Panic (easily!) reproduced. No additional output.

Screenshot: http://imgur.com/KOOBysH

I guess init is dying before it gets there.

-andyf

Konstantin Belousov

Oct 5, 2016, 9:37:06 AM
to Andy Farkas, freebsd...@freebsd.org
On Wed, Oct 05, 2016 at 07:32:33PM +1000, Andy Farkas wrote:
> On 05/10/2016 18:43, Konstantin Belousov wrote:
>
> > Apply the following patch. I am interested in whether anything additional
> > appears on the console. A screenshot is good enough.
>
> Patch applied. Panic (easily!) reproduced. No additional output.
>
> Screenshot: http://imgur.com/KOOBysH
>
> I guess init is dying before it gets there.

No, init does not die in your case, since the exit code is zero and the
termination signal is absent. That must be because init explicitly
called _exit(0).
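
For anyone following along: the "signal 0, exit 0" pair in the panic message is the usual wait-status split between a terminating signal and an exit code. Below is a minimal userland sketch of that split. It assumes nothing about the actual code in init or the kernel, and the helper name child_exit_code is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that calls _exit(code), wait for it,
 * and decode how it ended. Stores the terminating signal in *signo
 * (0 if the child was not killed by a signal) and returns the exit code,
 * or -1 if fork/wait failed or the child did not exit normally.
 */
int
child_exit_code(int code, int *signo)
{
	int status;
	pid_t pid;

	assert(signo != NULL);
	*signo = 0;

	pid = fork();
	if (pid == -1)
		return (-1);
	if (pid == 0)
		_exit(code);	/* child: explicit exit, no signal involved */

	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	if (WIFSIGNALED(status)) {
		*signo = WTERMSIG(status);	/* killed: report the signal */
		return (-1);
	}
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

A child that calls _exit(97) comes back as signal 0 with exit code 97. A nonzero signal would only show up if something had killed the process.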

Please try this variation; I want to see if the exit code changes.

diff --git a/sbin/init/init.c b/sbin/init/init.c
index bda86b5..1e88964 100644
--- a/sbin/init/init.c
+++ b/sbin/init/init.c
@@ -884,8 +884,13 @@ single_user(void)
 	if (Reboot) {
 		/* Instead of going single user, let's reboot the machine */
 		sync();
-		reboot(howto);
-		_exit(0);
+		if (reboot(howto) == -1) {
+			emergency("reboot(%#x) failed, %s", howto,
+			    strerror(errno));
+			_exit(1); /* panic and reboot */
+		}
+		warning("reboot(%#x) returned", howto);
+		_exit(97); /* panic as well */
 	}
 
 	shell = get_shell();

Andy Farkas

Oct 5, 2016, 4:59:17 PM
to Konstantin Belousov, freebsd...@freebsd.org
On 05/10/2016 23:36, Konstantin Belousov wrote:

> Please try this variation, I want to see if the error code changed.

Afraid not. Still signal 0, exit 0.

Screenshot: http://imgur.com/AU6weU0

-andyf

Andy Farkas

Oct 6, 2016, 4:32:34 AM
to Konstantin Belousov, freebsd...@freebsd.org
Reverted your patch then changed line 1011 of init.c to _exit(97):

--- init.c-orig 2016-10-05 18:52:24.022910000 +1000
+++ init.c 2016-10-06 17:02:33.714624000 +1000
@@ -1008,7 +1008,7 @@
 			 */
 			warning("single user shell terminated.");
 			sleep(STALL_TIMEOUT);
-			_exit(0);
+			_exit(97);
 		} else {
 			warning("single user shell terminated, restarting");
 			return (state_func_t) single_user;

...and got a panic that showed "exit 97": http://imgur.com/xonPwxR

I think that kern_reboot() is not being called somehow.
kern_reboot() is the only place rebooting = 1; is executed.

"init died (signal 0, exit 97)
panic: Going nowhere without my init!"

can only happen if rebooting = 0 in kern_exit.c exit1().

Another tell that kern_reboot() has not been called is "cpuid = 3"
because the first thing kern_reboot() does is bind to CPU 0.

Why is kern_reboot() being skipped? I have no idea.
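
The reasoning above can be modelled in a few lines. This is a userland sketch of the check as described in this message, not the kernel source. The enum and function names are invented for illustration; only the variable name rebooting and the two message strings come from the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Invented for this sketch: what happens when pid 1 exits. */
enum init_exit_outcome { OUTCOME_QUIET, OUTCOME_PANIC };

/*
 * Model of the check described above: if init exits while the
 * "rebooting" flag is still 0 (that is, kern_reboot() never ran),
 * the kernel reports init's signal and exit code and then panics.
 */
enum init_exit_outcome
model_init_exit(int rebooting, int signo, int exitcode)
{
	assert(rebooting == 0 || rebooting == 1);

	if (rebooting == 0) {
		printf("init died (signal %d, exit %d)\n", signo, exitcode);
		printf("panic: Going nowhere without my init!\n");
		return (OUTCOME_PANIC);
	}
	/* A reboot is already in progress, so init exiting is expected. */
	return (OUTCOME_QUIET);
}
```

With rebooting == 0 and an exit code of 97, the model prints the same two lines that appeared on the console. With rebooting == 1 it stays quiet, which is why seeing the message at all suggests kern_reboot() had not run.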

Anything more I can do to help? Do you want a core dump?

Konstantin Belousov

Oct 6, 2016, 1:44:02 PM
to Andy Farkas, freebsd...@freebsd.org
On Thu, Oct 06, 2016 at 06:31:59PM +1000, Andy Farkas wrote:
> Reverted your patch then changed line 1011 of init.c to _exit(97):
>
> --- init.c-orig 2016-10-05 18:52:24.022910000 +1000
> +++ init.c 2016-10-06 17:02:33.714624000 +1000
> @@ -1008,7 +1008,7 @@
>  			 */
>  			warning("single user shell terminated.");
>  			sleep(STALL_TIMEOUT);
> -			_exit(0);
> +			_exit(97);
>  		} else {
>  			warning("single user shell terminated, restarting");
>  			return (state_func_t) single_user;
>
> ...and got a panic that showed "exit 97": http://imgur.com/xonPwxR
>
> I think that kern_reboot() is not being called somehow.
> kern_reboot() is the only place rebooting = 1; is executed.
>
> "init died (signal 0, exit 97)
> panic: Going nowhere without my init!"
>
> can only happen if rebooting = 0 in kern_exit.c exit1().
>
> Another tell that kern_reboot() has not been called is "cpuid = 3"
> because the first thing kern_reboot() does is bind to CPU 0.
>
> Why is kern_reboot() being skipped? I have no idea.
>
> Anything more I can do to help? Do you want a core dump?
>

Please try the following patch.

diff --git a/sbin/init/init.c b/sbin/init/init.c
index bda86b5..25ac2bd 100644
--- a/sbin/init/init.c
+++ b/sbin/init/init.c
@@ -870,6 +870,7 @@ single_user(void)
 	sigset_t mask;
 	const char *shell;
 	char *argv[2];
+	struct timeval tv, tn;
 #ifdef SECURE
 	struct ttyent *typ;
 	struct passwd *pp;
@@ -884,8 +885,13 @@ single_user(void)
 	if (Reboot) {
 		/* Instead of going single user, let's reboot the machine */
 		sync();
-		reboot(howto);
-		_exit(0);
+		if (reboot(howto) == -1) {
+			emergency("reboot(%#x) failed, %s", howto,
+			    strerror(errno));
+			_exit(1); /* panic and reboot */
+		}
+		warning("reboot(%#x) returned", howto);
+		_exit(0); /* panic as well */
 	}
 
 	shell = get_shell();
@@ -1002,7 +1008,14 @@ single_user(void)
 			 * reboot(8) killed shell?
 			 */
 			warning("single user shell terminated.");
-			sleep(STALL_TIMEOUT);
+			gettimeofday(&tv, NULL);
+			tn = tv;
+			tv.tv_sec += STALL_TIMEOUT;
+			while (tv.tv_sec > tn.tv_sec || (tv.tv_sec ==
+			    tn.tv_sec && tv.tv_usec > tn.tv_usec)) {
+				sleep(1);
+				gettimeofday(&tn, NULL);
+			}
 			_exit(0);
 		} else {
 			warning("single user shell terminated, restarting");

Andy Farkas

Oct 6, 2016, 6:32:55 PM
to Konstantin Belousov, freebsd...@freebsd.org
With your latest patch applied, I ran through my procedure more
than a dozen times and no panics!

Any explanation why sleep(STALL_TIMEOUT) as opposed to a
bunch of sleep(1)'s tickles the panic?

Also, it is definitely not sleeping for 30 seconds. I guess some
event interrupts the sleep loop?

Thanks heaps for your time and effort,

-andyf

Graham Menhennitt

Oct 6, 2016, 6:56:31 PM
to freebsd...@freebsd.org
Let me preface this by saying that I know nothing about this particular
bit of code, but...

As a general rule, I would question the use of gettimeofday() while
panicking. At that stage, everything could have already gone down the
plug hole.

That said, it already calls sleep(), so maybe that uses the same
gettimeofday() call internally. In which case, please ignore this comment.

Graham

Konstantin Belousov

Oct 7, 2016, 8:19:20 AM
to Andy Farkas, freebsd...@freebsd.org
On Fri, Oct 07, 2016 at 08:32:24AM +1000, Andy Farkas wrote:
> With your latest patch applied, I ran through my procedure more
> than a dozen times and no panics!
>
> Any explanation why sleep(STALL_TIMEOUT) as opposed to a
> bunch of sleep(1)'s tickles the panic?
What happened was that sleep() got interrupted by a signal.

Normally reboot(8) stops init with SIGTSTP, then kills the other processes,
then calls reboot(2). reboot(8) does not and cannot get acknowledgements
that the signalled processes have received their signals, so the sleep
may indeed be interrupted if another signal is delivered to init before
SIGTSTP.

The patch does not add 'just a bunch of sleeps'. The code in the patch
ensures that _exit() is called no earlier than STALL_TIMEOUT after the
shell exit is detected, by reissuing sleep(). I changed the argument to
1 second to avoid the situation where we e.g. sleep for 15 secs, get
interrupted, and then sleep for the whole 30 secs. The overtime with
sleep(1) is limited to 1 second.
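
The difference between a single sleep() and the deadline loop can be sketched as a standalone fragment. This is an illustration and not the patch itself: the function names are invented, SIGALRM from alarm() stands in for whatever signal reached init, and it assumes sleep() is built on nanosleep() (as on FreeBSD and glibc) so that alarm() and sleep() do not interfere with each other.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/* Empty handler: its only purpose is to make a blocked sleep() return early. */
static void
on_alarm(int signo)
{
	(void)signo;
}

/*
 * Call sleep(total) once, with SIGALRM arranged to arrive after
 * alarm_after seconds. Returns sleep()'s result: the number of seconds
 * left unslept, which is nonzero when the signal cut the sleep short.
 */
unsigned int
interrupted_sleep_leftover(unsigned int total, unsigned int alarm_after)
{
	assert(alarm_after > 0 && alarm_after < total);

	signal(SIGALRM, on_alarm);
	alarm(alarm_after);
	return (sleep(total));
}

/*
 * Deadline loop in the spirit of the patch: keep reissuing sleep(1)
 * until at least secs seconds have passed, no matter how many times a
 * signal interrupts an individual sleep. Returns the number of whole
 * seconds that elapsed.
 */
long
stall_at_least(unsigned int secs)
{
	struct timeval start, deadline, now;

	gettimeofday(&start, NULL);
	deadline = start;
	deadline.tv_sec += secs;
	now = start;
	while (now.tv_sec < deadline.tv_sec ||
	    (now.tv_sec == deadline.tv_sec &&
	    now.tv_usec < deadline.tv_usec)) {
		sleep(1);	/* may return early; the loop rechecks the clock */
		gettimeofday(&now, NULL);
	}
	return ((long)(now.tv_sec - start.tv_sec));
}
```

interrupted_sleep_leftover(3, 1) returns a nonzero remainder because the alarm cuts the single sleep() short, while stall_at_least(2) does not return until at least 2 seconds have elapsed.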

>
> Also, it is definitely not sleeping for 30 seconds. I guess some
> event interrupts the sleep loop?