Bug#778598: atop sometimes fails with a floating point exception or a trap exception

Klaus Ethgen

Feb 17, 2015, 4:00:03 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Package: atop
Version: 1.27.3-1
Severity: normal

I do not know when it started, but for some time now I have been getting
more and more floating point exceptions when starting atop on the console,
and "trap divide error" messages when it is restarted via cron.

The trap error from the log is:
traps: atop[28297] trap divide error ip:407b9a sp:7fff857869b8 error:0 in atop[400000+29000]

... and on the console just a "floating point exception".
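
For reference, the kernel line already says where in the binary the fault occurred: "ip" is the faulting instruction pointer, and the bracketed pair after the program name is the start address and size of the mapping that contains it. A minimal sketch of extracting the offset into that mapping follows; trap_offset is a hypothetical helper written for illustration and is not part of atop or of any existing tool.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (illustration only): parse a kernel "traps:" line of
 * the form "... ip:<hex> sp:<hex> error:<n> in <name>[<base>+<size>]" and
 * store ip - base in *offset.
 * Returns 0 on success, -1 if the line does not have the expected shape or
 * the ip lies outside the reported mapping. */
int trap_offset(const char *line, unsigned long *offset)
{
    unsigned long ip, base, size;
    const char *p_ip = strstr(line, " ip:");
    const char *p_in = strstr(line, " in ");
    const char *p_map;

    if (p_ip == NULL || p_in == NULL)
        return -1;
    if (sscanf(p_ip, " ip:%lx", &ip) != 1)
        return -1;

    /* The mapping is printed as [start+size], both in hex. */
    p_map = strchr(p_in, '[');
    if (p_map == NULL || sscanf(p_map, "[%lx+%lx]", &base, &size) != 2)
        return -1;

    if (ip < base || ip - base >= size)
        return -1;

    *offset = ip - base;
    return 0;
}
```

For the line above this gives an offset of 0x7b9a into the atop mapping. The address can then be looked up in a build of the same binary that still has its debug symbols, for example with gdb.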

The error does not happen every time, and the atop TUI often comes up on
the second or a subsequent start. But it happens often enough that I
believe it points to a more serious bug.

- -- System Information:
Debian Release: 8.0
APT prefers unstable
APT policy: (800, 'unstable'), (110, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.18.6 (SMP w/8 CPU cores)
Locale: LANG=de_DE, LC_CTYPE=de_DE (charmap=ISO-8859-1) (ignored: LC_ALL set to de_DE)
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages atop depends on:
ii libc6 2.19-15
ii libncurses5 5.9+20140913-1+b1
ii libtinfo5 5.9+20140913-1+b1
ii lsb-base 4.1+Debian13+nmu1
ii zlib1g 1:1.2.8.dfsg-2+b1

Versions of packages atop recommends:
ii cron 3.0pl1-127

atop suggests no packages.

- -- Configuration Files:
/etc/cron.d/atop changed:
0 0 * * * root /usr/sbin/invoke-rc.d atop _cron

/etc/default/atop changed:
INTERVAL=600
LOGPATH="/var/log/atop"
OUTFILE="$LOGPATH/daily.log"
START_DAEMON="no"

/etc/init.d/atop changed:
PATH=/sbin:/usr/sbin:/bin:/usr/bin
DESC="atop system monitor"
NAME=atop
DAEMON=/usr/bin/atop
WRAPPER=/usr/share/atop/atop.wrapper
INTERVAL=600 # interval 10 minutes
LOGPATH="/var/log/atop"
OUTFILE=$LOGPATH/daily.log
PIDFILE=/var/run/$NAME.pid
SCRIPTNAME=/etc/init.d/$NAME
[ -x $DAEMON ] || exit 0
START_DAEMON="yes"
[ -r /etc/default/$NAME ] && . /etc/default/$NAME
. /lib/init/vars.sh
. /lib/lsb/init-functions
CURDAY=$(date +%Y%m%d)
DAEMON_ARGS="-a -w $LOGPATH/atop_$CURDAY $INTERVAL"
do_start()
{
    # Return
    # 0 if daemon has been started
    # 1 if daemon was already running
    # 2 if daemon could not be started
    start-stop-daemon --start --background --quiet \
        --pidfile $PIDFILE \
        --test --startas $WRAPPER > /dev/null \
        || return 1
    start-stop-daemon --start --background --quiet \
        --pidfile $PIDFILE --make-pidfile \
        --startas $WRAPPER -- $DAEMON $OUTFILE \
        $DAEMON_ARGS \
        || return 2
}
do_stop()
{
    # Return
    # 0 if daemon has been stopped
    # 1 if daemon was already stopped
    # 2 if daemon could not be stopped
    # other if a failure occurred
    start-stop-daemon --stop --quiet --retry=USR2/30/KILL/5 --pidfile $PIDFILE --name $NAME
    RETVAL="$?"
    [ "$RETVAL" = 2 ] && return 2
    rm -f $PIDFILE
    return "$RETVAL"
}
case "$1" in
    start)
        if [ "x$START_DAEMON" = "xyes" ]
        then
            [ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC " "$NAME"
            do_start
            case "$?" in
                0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
                2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
            esac
        fi
        ;;
    stop)
        [ "$VERBOSE" != no ] && log_daemon_msg "Stopping $DESC" "$NAME"
        do_stop
        case "$?" in
            0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
            2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
        esac
        ;;
    status)
        status_of_proc "$DAEMON" "$NAME" && exit 0 || exit $?
        ;;
    restart|force-reload|_cron)
        [ "$1" = "_cron" ] && VERBOSE="no"
        [ "$VERBOSE" != no ] && log_daemon_msg "Restarting $DESC" "$NAME"
        do_stop
        case "$?" in
            0|1)
                do_start
                case "$?" in
                    0) [ "$VERBOSE" != no ] && log_end_msg 0
                       [ "$1" = "_cron" ] && sleep 3 && find $LOGPATH -name 'atop_*' -mtime +28 -exec rm {} \;
                       ;;
                    1) [ "$VERBOSE" != no ] && log_end_msg 1 ;; # Old process is still running
                    *) [ "$VERBOSE" != no ] && log_end_msg 1 ;; # Failed to start
                esac
                ;;
            *)
                # Failed to stop
                [ "$VERBOSE" != no ] && log_end_msg 1
                ;;
        esac
        ;;
    *)
        echo "Usage: $SCRIPTNAME {start|stop|status|restart|force-reload}" >&2
        exit 1
        ;;
esac
exit 0


- -- no debconf information

- --
Klaus Ethgen http://www.ethgen.ch/
pub 4096R/4E20AF1C 2011-05-16 Klaus Ethgen <Kl...@Ethgen.de>
Fingerprint: 85D4 CA42 952C 949B 1753 62B3 79D0 B06F 4E20 AF1C
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQGcBAEBCgAGBQJU4wDNAAoJEKZ8CrGAGfasPJ0L+wUNJSw8XlJdy0b3ZZe6f7+v
3Sv3Uy+zELAB2C9GdAYvz5tWdKybBgM+/YUFj5k63Jgpmpj0hhObLBGxV2Ifcele
RhHiAEpOO8raIkB33skATNruRl21NdyDj1l+YBjnpmHqWy5RVMiVbeyvPXyHt59/
udNfADzrkatRPfuLG3UiaHSI4ghc5+jwRCfsaBUYMCj0m069+JQ9TjALGmPouI9H
jwLjGM5WNUAUJ5t9fmvXH1F5GdYI6hhSX2BhUfTfzjlCWaQrNQrzPUIn0k6RZL0W
RS/N3TRyd8zdxPjUNfrvbvn9Y7NkO/48cfyLnj9YIRdlrtmE3VXt/aZse0GhWgKN
DRlujNCzalwde0jFUGn8dN2nHAFgilrYljvIvNxHz5GE/kPI7XAd/p3w6BI5GhVj
iBdlJdxVfJj0r3ux4h5HKF5S5dr7hUZFO2fvHKbns1miESWsejVQn+NiJcD6FYVe
GX9Swq5ud+/N+5+erHwsx4g5hW1vQ0jDGdqZJJYZzQ==
=Ertj
-----END PGP SIGNATURE-----


--
To UNSUBSCRIBE, email to debian-bugs-...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org

Jakob Haufe

Aug 23, 2015, 7:50:02 AM
Hi,

I stumbled across this as well and rebuilt atop with nostrip, which gave me
this backtrace:

----SNIP----
/tmp/atop-dbg/atop-1.27.3$ gdb /usr/bin/atop /var/lib/coredumps/atop.1440280803/core
Reading symbols from /usr/bin/atop...done.
[New LWP 27858]
Core was generated by `/usr/bin/atop -a -w /var/log/atop/atop_20150823 600'.
Program terminated with signal SIGFPE, Arithmetic exception.
#0 0x0000000000407662 in acctprocnt () at acctproc.c:571

warning: Source file is more recent than executable.
571 return (statacc.st_size - acctsize) / acctrecsz;
gdb $ bt full
#0 0x0000000000407662 in acctprocnt () at acctproc.c:571
statacc = {
st_dev = 18,
st_ino = 996486,
st_nlink = 1,
st_mode = 33152,
st_uid = 0,
st_gid = 0,
__pad0 = 0,
st_rdev = 0,
st_size = 0,
st_blksize = 4096,
st_blocks = 0,
st_atim = {
tv_sec = 1440280803,
tv_nsec = 90171947
},
st_mtim = {
tv_sec = 1440280803,
tv_nsec = 89171920
},
st_ctim = {
tv_sec = 1440280803,
tv_nsec = 89171920
},
__glibc_reserved = {0, 0, 0}
}
#1 0x0000000000403242 in engine () at atop.c:829
lastcmd = <optimized out>
nactproc = 4285065470
totslpi = -1613224016
devsstat = 0x7fb47489d010
curpact = 0x7fb474867010
curpexit = <optimized out>
devtstat = <optimized out>
ntask = 280
sigact = {
__sigaction_handler = {
sa_handler = 0x403890 <getalarm>,
sa_sigaction = 0x403890 <getalarm>
},
sa_mask = {
__val = {0 <repeats 16 times>}
},
sa_flags = 0,
sa_restorer = 0x0
}
devpstat = <optimized out>
nexit = <optimized out>
ndeviat = <optimized out>
totproc = 1
totrun = 0
totslpu = 32767
totzombie = 1954458716
presstat = 0x7fb474981010
curplen = 298
i = <optimized out>
noverflow = <optimized out>
hlpsstat = 0x7fb474981010
j = <optimized out>
timelimit = 0
cursstat = 0x7fb47490f010
#2 main (argc=<optimized out>, argv=<optimized out>) at atop.c:659
i = <optimized out>
c = <optimized out>
p = <optimized out>
rlim = {
rlim_cur = 18446744073709551615,
rlim_max = 18446744073709551615
}
gdb $ print acctrecsz
$1 = 0
----SNIP----

acctrecsz is a global variable in acctproc.c. With it set to 0, the integer
division at line 571 raises the SIGFPE. Unfortunately, I haven't had the
time to investigate this yet.
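
Independent of why acctrecsz ends up as 0, the division itself could be guarded. The following is only a sketch of that idea, not a patch against atop: acct_new_records is a hypothetical stand-in for the expression at acctproc.c:571, and the parameter types are assumptions, since the real declarations of acctsize and acctrecsz are not shown in the backtrace.

```c
#include <sys/types.h>

/* Hypothetical stand-in for "(statacc.st_size - acctsize) / acctrecsz".
 * cursize  - current size of the accounting file (st_size)
 * prevsize - size seen at the previous sample (acctsize)
 * recsz    - size of one accounting record (acctrecsz)
 * Returns the number of complete records appended since the last sample,
 * or 0 if that cannot be computed safely. */
unsigned long acct_new_records(off_t cursize, off_t prevsize,
                               unsigned long recsz)
{
    if (recsz == 0)             /* record size unknown: never divide by it */
        return 0;
    if (cursize <= prevsize)    /* nothing new, or the file was truncated */
        return 0;
    return (unsigned long)(cursize - prevsize) / recsz;
}
```

With the values from the core dump (st_size = 0, acctrecsz = 0) this returns 0 instead of trapping. Whether silently reporting zero records is the right behaviour, or whether a zero record size should be treated as an error earlier on, is a question for upstream.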

Cheers,
sur5r

--
ceterum censeo microsoftem esse delendam.

Marc Haber

Aug 11, 2016, 6:10:03 AM
tags #778598 moreinfo
thanks

Can you please re-check with 2.2.3 from experimental?

Greetings
Marc

On Sun, Aug 23, 2015 at 01:36:52PM +0200, Jakob Haufe wrote:
> From: Jakob Haufe <su...@sur5r.net>
> Subject: Bug#778598: atop sometimes fails with a floating point exception
> or a trap exception
> To: Klaus Ethgen <Kl...@Ethgen.de>, 778...@bugs.debian.org
> Reply-To: Jakob Haufe <su...@sur5r.net>, 778...@bugs.debian.org
> Date: Sun, 23 Aug 2015 13:36:52 +0200
> X-Mailer: Claws Mail 3.11.1 (GTK+ 2.24.28; x86_64-pc-linux-gnu)
--
-----------------------------------------------------------------------------
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany | lose things." Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature | How to make an American Quilt | Fax: *49 6224 1600421