Using automatic installation via FAI I see a hanging "update-menus
--trigger" call causing the build to fail:
root 22433 0.0 0.2 2696 1312 pts/0 S+ 14:57 0:00 /bin/sh /var/lib/dpkg/info/menu.postinst triggered /usr/share/menu
root 22434 97.6 0.2 3144 1236 pts/0 R+ 14:57 2:24 update-menus --trigger
strace-ing the PID outside the chroot gives me tons of EBADF which
might be the reason for the 100% cpu load and the hanging process:
[...]
read(4293087900, 0xf7f18000, 8192) = -1 EBADF (Bad file descriptor)
read(4293087900, 0xf7f18000, 8192) = -1 EBADF (Bad file descriptor)
read(4293087900, 0xf7f18000, 8192) = -1 EBADF (Bad file descriptor)
read(4293087900, 0xf7f18000, 8192) = -1 EBADF (Bad file descriptor)
[...]
Killing the update-menus process and manually invoking update-menus
inside the chroot displays:
# update-menus -v
update-menus[12725]: Dpkg is not locking dpkg status area, good.
update-menus[12725]: Reading installed packages list...
acpi-support
acpi-support-base
acpid
adduser
[...]
-> it displays the whole package list and then it just hangs (same
problem as running via FAI).
Funnily it works when invoked through strace:
# strace -o /tmp/log -ffff update-menus -v
update-menus[11649]: Dpkg is not locking dpkg status area, good.
update-menus[11649]: Reading installed packages list...
update-menus[11649]: Reading translation rules in /etc/menu-methods/translate_menus.
update-menus[11649]: Reading menu-entry files in /etc/menu/.
update-menus[11649]: 0 menu entries found (0 total).
update-menus[11649]: Reading menu-entry files in /usr/lib/menu/.
update-menus[11649]: 0 menu entries found (0 total).
update-menus[11649]: Reading menu-entry files in /usr/share/menu/.
update-menus[11649]: 38 menu entries found (38 total).
update-menus[11649]: Reading menu-entry files in /usr/share/menu/default/.
update-menus[11649]: 0 menu entries found (38 total).
update-menus[11649]: Running menu-methods in /etc/menu-methods/.
update-menus[11649]: Running method: /etc/menu-methods/fluxbox
#
Do you have any idea what's going on here?
Please let me know if you need any further information, I've a
chroot available where I can easily reproduce/work on the problem.
System information: building a Debian/sid i386 system on a amd64 box.
ii libc6 2.9-6 GNU C Library: Shared libraries
ii menu 2.1.41 generates programs menu for all menu-aware applications
regards,
-mika-
--
To UNSUBSCRIBE, email to debian-bugs-...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org
This is interesting because 'update-menus -v' should not display the
package list. So there seems to be a problem with the call to dpkg-query:
string inst_states="installed\\|triggers-awaited\\|triggers-pending";
string pkgs = "dpkg-query --show --showformat=\"\\${status} \\${provides} \\${package}\\n\" | sed -n -e \"/" + inst_states
+ " /{s/^.*\\("+inst_states+"\\) *//; s/[, ][, ]*/\\n/g; p}\"";
pkgs = "exec /bin/bash -o pipefail -c '" + pkgs + "'";
FILE *status = popen(pkgs.c_str(), "r");
> Funnily it works when invoked through strace:
>
> # strace -o /tmp/log -ffff update-menus -v
> update-menus[11649]: Dpkg is not locking dpkg status area, good.
> update-menus[11649]: Reading installed packages list...
> update-menus[11649]: Reading translation rules in /etc/menu-methods/translate_menus.
> update-menus[11649]: Reading menu-entry files in /etc/menu/.
> update-menus[11649]: 0 menu entries found (0 total).
> update-menus[11649]: Reading menu-entry files in /usr/lib/menu/.
> update-menus[11649]: 0 menu entries found (0 total).
> update-menus[11649]: Reading menu-entry files in /usr/share/menu/.
> update-menus[11649]: 38 menu entries found (38 total).
> update-menus[11649]: Reading menu-entry files in /usr/share/menu/default/.
> update-menus[11649]: 0 menu entries found (38 total).
> update-menus[11649]: Running menu-methods in /etc/menu-methods/.
> update-menus[11649]: Running method: /etc/menu-methods/fluxbox
> #
>
> Do you have any idea what's going on here?
I wonder if your system somehow set signal handling to non-default
value and that interfer with update-menus.
Maybe this bug is related to bug 511698 (in experimental).
> Please let me know if you need any further information, I've a
> chroot available where I can easily reproduce/work on the problem.
1) does 'update-menus --stdout' work ?
2) does 'update-menus --stdout | install-menu -v /etc/menu-methods/fluxbox'
3) what file do you have in /etc/menu-methods ?
Cheers,
--
Bill. <ball...@debian.org>
Imagine a large red swirl here.
Yeah, that's what I'm used to. :)
Basically a kvm system with Debian's defaults. Used to work just
fine until a few days ago.
> Maybe this bug is related to bug 511698 (in experimental).
Ah ok.
> > Please let me know if you need any further information, I've a
> > chroot available where I can easily reproduce/work on the problem.
> 1) does 'update-menus --stdout' work ?
No, hanging as well.
> 2) does 'update-menus --stdout | install-menu -v /etc/menu-methods/fluxbox'
No, hanging too. Output:
# update-menus --stdout | install-menu -v /etc/menu-methods/fluxbox
install-menu: [supported]: name=X11
install-menu: [supported]: name=WM
install-menu: [supported]: name=TEXT
install-menu: [supported]: name=FLUXBOX
Unknown error, message=section
install-menu: /etc/menu-methods/fluxbox: aborting
-> then it hangs. No idea what's the "Unknown error,
message=section" though - might this be a configuration problem?
> 3) what file do you have in /etc/menu-methods ?
# ls -la /etc/menu-methods
total 32
drwxr-xr-x 2 root root 4096 2009-03-23 16:08 ./
drwxr-xr-x 88 root root 4096 2009-03-23 16:22 ../
-rwxr-xr-x 1 root root 1176 2009-03-18 21:06 fluxbox*
-rw-r--r-- 1 root root 254 2008-10-24 23:13 lang.h
-rw-r--r-- 1 root root 31 2005-09-11 21:42 menu.config
-rw-r--r-- 1 root root 2910 2007-10-12 09:22 menu.h
-rw-r--r-- 1 root root 391 2005-09-11 23:14 README
-rw-r--r-- 1 root root 2232 2008-02-29 19:15 translate_menus
# dpkg -S /etc/menu-methods/*
fluxbox: /etc/menu-methods/fluxbox
menu: /etc/menu-methods/lang.h
menu: /etc/menu-methods/menu.config
menu: /etc/menu-methods/menu.h
menu: /etc/menu-methods/README
menu: /etc/menu-methods/translate_menus
Removing the fluxbox package doesn't seem to change anything.
Would it help if I'd provide you the affected chroot as tar.gz?
thx && regards,
-mika-
Can you check the list of package you changed recently (if any)
via /var/log/dpkg.log and maybe try to downgrade them ?
What kernel do you run ?
> > Maybe this bug is related to bug 511698 (in experimental).
>
> Ah ok.
... so it is conceivable that some update break update-menus.
> > > Please let me know if you need any further information, I've a
> > > chroot available where I can easily reproduce/work on the problem.
>
> > 1) does 'update-menus --stdout' work ?
>
> No, hanging as well.
So that narrow the issue to update-menus proper.
> Would it help if I'd provide you the affected chroot as tar.gz?
Sure, but please check you can reproduce the problem using the
tar.gz.
Cheers,
Bill.
> > Basically a kvm system with Debian's defaults. Used to work just
> > fine until a few days ago.
> Can you check the list of package you changed recently (if any)
> via /var/log/dpkg.log and maybe try to downgrade them ?
No changes for 20 days and used to work until ~4 days ago.
> What kernel do you run ?
2.6.26-bpo.1-amd64
> > > Maybe this bug is related to bug 511698 (in experimental).
> > Ah ok.
> ... so it is conceivable that some update break update-menus.
Jepp.
> > > > Please let me know if you need any further information, I've a
> > > > chroot available where I can easily reproduce/work on the problem.
> > > 1) does 'update-menus --stdout' work ?
> > No, hanging as well.
> So that narrow the issue to update-menus proper.
Ok.
> > Would it help if I'd provide you the affected chroot as tar.gz?
> Sure, but please check you can reproduce the problem using the
> tar.gz.
Sure, will do. I'll send you a mail off the BTS.
regards,
-mika-
[...]
Alright, it's "fgets(tmp, MAX_LINE, status)" in read_pkginfo() of
menu-2.1.41/update-menus/update-menus.cc that fails with errno 9
(AKA "Bad file descriptor") when using libc6 2.9-6 from unstable. It
works just fine with libc6 2.7-18 from lenny though.
The system in question (it's a kvm guest):
# cat /etc/debian_version
4.0
# dpkg --print-architecture
amd64
# uname -a
Linux amd64.grml.org 2.6.26-bpo.1-amd64 #1 SMP Wed Dec 17 12:18:36 UTC 2008 x86_64 GNU/Linux
The problem does NOT exist for example on:
# cat /etc/debian_version
5.0
# dpkg --print-architecture
i386
# uname -a
Linux openvz 2.6.26-1-openvz-686 #1 SMP Sat Nov 8 21:19:32 UTC 2008 i686 GNU/Linux
regards,
-mika-
this is strange: update-menus checks twice status before using it.
Thanks a lot for all your effort, but I still cannot reproduce this issue:
I tried on an up-to-date sid system and update-menus works fine.
If you strace update-menus on a working system, you see someting like:
write(2, "update-menus[30226]: Dpkg is not "..., 65) = 65
pipe([4, 5]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff0b3046780) = 30227
I am very interested to know what happen on broken systems.
Maybe you can run strace outside the chroot to catch that ?
Cheers,
--
Bill. <ball...@debian.org>
Imagine a large red swirl here.
--
> > Alright, it's "fgets(tmp, MAX_LINE, status)" in read_pkginfo() of
> > menu-2.1.41/update-menus/update-menus.cc that fails with errno 9
> > (AKA "Bad file descriptor") when using libc6 2.9-6 from unstable. It
> > works just fine with libc6 2.7-18 from lenny though.
> this is strange: update-menus checks twice status before using it.
> Thanks a lot for all your effort, but I still cannot reproduce this issue:
> I tried on an up-to-date sid system and update-menus works fine.
Are you using an amd64 system?
Because it works fine on my i386 system with current libc6 and menu
versions as well.
> If you strace update-menus on a working system, you see someting like:
> write(2, "update-menus[30226]: Dpkg is not "..., 65) = 65
> pipe([4, 5]) = 0
> clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff0b3046780) = 30227
> I am very interested to know what happen on broken systems.
> Maybe you can run strace outside the chroot to catch that ?
The sad thing is that if update-menus is running under strace
(meaning: either directly invoked via strace or when "strace -p
$PID" attaches early enough) it's working as supposed to. If I'm
catching the process from the chroot to late I just see tons of the
mentioned EBADF errors and update-menus keeps hanging.
What I get inside strace (note: then update-menus is working fine):
write(2, "update-menus[31341]: Dpkg is not"..., 65) = 65
SYS_331(0xff800758, 0x80000, 0xf7dbaff4, 0x90e2928, 0x1) = -1 ENOSYS (Function not implemented)
pipe([4, 5]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xf7c7c718) = 31342
I guess the SYS_331 is just a strace specific thing.
regards,
-mika-
Yes.
> Because it works fine on my i386 system with current libc6 and menu
> versions as well.
Good to know!
> > If you strace update-menus on a working system, you see someting like:
>
> > write(2, "update-menus[30226]: Dpkg is not "..., 65) = 65
> > pipe([4, 5]) = 0
> > clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff0b3046780) = 30227
>
> > I am very interested to know what happen on broken systems.
> > Maybe you can run strace outside the chroot to catch that ?
>
> The sad thing is that if update-menus is running under strace
> (meaning: either directly invoked via strace or when "strace -p
> $PID" attaches early enough) it's working as supposed to. If I'm
> catching the process from the chroot to late I just see tons of the
> mentioned EBADF errors and update-menus keeps hanging.
This suggest some kind of race-condition between update-menus and the kernel.
Could you try
gdb /usr/bin/update-menus
to see if it also works ?
If it fails, then you can try to do
single-step in read_pkginfo() to see what happen.
The fact that update-menus -v output the list of package seems to imply
that it does pipe([x, 1]) or pipe([x, 2]). Well, at least,
we can tell which one it is: try
update-menus -v >menu.out 2>menu.err
and see whether the package list is in menu.out or menu.err.
> What I get inside strace (note: then update-menus is working fine):
>
> write(2, "update-menus[31341]: Dpkg is not"..., 65) = 65
> SYS_331(0xff800758, 0x80000, 0xf7dbaff4, 0x90e2928, 0x1) = -1 ENOSYS (Function not implemented)
> pipe([4, 5]) = 0
> clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xf7c7c718) = 31342
>
> I guess the SYS_331 is just a strace specific thing.
Actually 'SYS_331' means the syscall number 331. However
/usr/include/asm/unistd_64.h stops at 294, so maybe this is
a false syscall added by strace so it can catch it.
Cheers,
Bill.
> > The sad thing is that if update-menus is running under strace
> > (meaning: either directly invoked via strace or when "strace -p
> > $PID" attaches early enough) it's working as supposed to. If I'm
> > catching the process from the chroot to late I just see tons of the
> > mentioned EBADF errors and update-menus keeps hanging.
> This suggest some kind of race-condition between update-menus and the kernel.
ACK
> Could you try
> gdb /usr/bin/update-menus
> to see if it also works ?
> If it fails, then you can try to do
> single-step in read_pkginfo() to see what happen.
It fails under gdb as well:
[...]
(gdb) break read_pkginfo()
Breakpoint 1 at 0x804fbfc: file update-menus.cc, line 439.
(gdb) run
Starting program: /root/update-menus
Breakpoint 1, read_pkginfo () at update-menus.cc:439
439 string inst_states="installed\\|triggers-awaited\\|triggers-pending";
(gdb) s
441 + " /{s/^.*\\("+inst_states+"\\) *//; s/[, ][, ]*/\\n/g; p}\"";
(gdb) s
std::operator+<char, std::char_traits<char>, std::allocator<char> > (
__lhs=0x806028c "dpkg-query --show --showformat=\"\\${status} \\${provides} \\${package}\\n\" | sed -n -e \"/", __rhs=@0xffa1b9cc)
at /usr/include/c++/4.2/bits/basic_string.tcc:675
675 const basic_string<_CharT, _Traits, _Alloc>& __rhs)
(gdb) s
258 { return strlen(__s); }
(gdb) s
682 __str.reserve(__len + __rhs.size());
(gdb) s
262 : _Alloc(__a), _M_p(__dat) { }
(gdb) s
258 { return strlen(__s); }
(gdb) s
682 __str.reserve(__len + __rhs.size());
(gdb) s
683 __str.append(__lhs, __len);
(gdb) s
684 __str.append(__rhs);
(gdb) s
685 return __str;
(gdb) s
read_pkginfo () at /usr/include/c++/4.2/bits/basic_string.h:2111
2111 basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
(gdb) s
827 return this->append(__s, traits_type::length(__s));
(gdb) s
441 + " /{s/^.*\\("+inst_states+"\\) *//; s/[, ][, ]*/\\n/g; p}\"";
(gdb) s
std::operator+<char, std::char_traits<char>, std::allocator<char> > (__lhs=@0xffa1b9c0, __rhs=@0xffa1b9cc) at /usr/include/c++/4.2/bits/basic_string.h:2074
2074 basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
(gdb) s
2072 const basic_string<_CharT, _Traits, _Alloc>& __rhs)
(gdb) s
2074 basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
(gdb) s
2075 __str.append(__rhs);
(gdb) s
2076 return __str;
(gdb) s
read_pkginfo () at /usr/include/c++/4.2/bits/basic_string.h:2111
2111 basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
(gdb) s
827 return this->append(__s, traits_type::length(__s));
(gdb) s
441 + " /{s/^.*\\("+inst_states+"\\) *//; s/[, ][, ]*/\\n/g; p}\"";
(gdb) s
442 pkgs = "exec /bin/bash -o pipefail -c '" + pkgs + "'";
(gdb) s
std::operator+<char, std::char_traits<char>, std::allocator<char> > (__lhs=0x80602e4 "exec /bin/bash -o pipefail -c '", __rhs=@0xffa1b9c8)
at /usr/include/c++/4.2/bits/basic_string.tcc:675
675 const basic_string<_CharT, _Traits, _Alloc>& __rhs)
(gdb) s
258 { return strlen(__s); }
(gdb) s
682 __str.reserve(__len + __rhs.size());
(gdb) s
262 : _Alloc(__a), _M_p(__dat) { }
(gdb) s
258 { return strlen(__s); }
(gdb) s
682 __str.reserve(__len + __rhs.size());
(gdb) s
683 __str.append(__lhs, __len);
(gdb) s
684 __str.append(__rhs);
(gdb) s
685 return __str;
(gdb) s
read_pkginfo () at /usr/include/c++/4.2/bits/basic_string.h:2111
2111 basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
(gdb) s
827 return this->append(__s, traits_type::length(__s));
(gdb) s
491 { return this->assign(__str); }
(gdb) s
442 pkgs = "exec /bin/bash -o pipefail -c '" + pkgs + "'";
(gdb) s
443 FILE *status = popen(pkgs.c_str(), "r");
(gdb) s
445 if (!status)
(gdb) sadduser
libapt-pkg-libc6.7-6-4.6
apt
libapt-inst-libc6.7-6-1.1
apt-utils
aptitude
atftp
attr
[ snip rest of the package list ]
vim-common
vim-runtime
libz1
zlib1g
zsh
443 FILE *status = popen(pkgs.c_str(), "r");
(gdb)
445 if (!status)
(gdb)
449 configinfo::report_verbose);
(gdb)
configinfo::report (this=0x8066c00, message=@0xffa1b9ac, v=configinfo::report_verbose) at update-menus.cc:310
310 void configinfo::report(const std::string &message, verbosity_type v)
(gdb)
312 if(v <= verbosity) {
(gdb)
326 }
(gdb)
read_pkginfo () at update-menus.cc:454
454 if (fgets(tmp, MAX_LINE, status) != NULL) {
(gdb)
451 while (!feof(status))
(gdb)
454 if (fgets(tmp, MAX_LINE, status) != NULL) {
(gdb)
451 while (!feof(status))
(gdb)
454 if (fgets(tmp, MAX_LINE, status) != NULL) {
(gdb)
451 while (!feof(status))
(gdb)
454 if (fgets(tmp, MAX_LINE, status) != NULL) {
(gdb)
451 while (!feof(status))
(gdb)
454 if (fgets(tmp, MAX_LINE, status) != NULL) {
(gdb)
451 while (!feof(status))
(gdb)
454 if (fgets(tmp, MAX_LINE, status) != NULL) {
(gdb)
451 while (!feof(status))
(gdb)
[...]
> The fact that update-menus -v output the list of package seems to imply
> that it does pipe([x, 1]) or pipe([x, 2]). Well, at least,
> we can tell which one it is: try
> update-menus -v >menu.out 2>menu.err
> and see whether the package list is in menu.out or menu.err.
The package list is in menu.out.
> > What I get inside strace (note: then update-menus is working fine):
> > write(2, "update-menus[31341]: Dpkg is not"..., 65) = 65
> > SYS_331(0xff800758, 0x80000, 0xf7dbaff4, 0x90e2928, 0x1) = -1 ENOSYS (Function not implemented)
> > pipe([4, 5]) = 0
> > clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xf7c7c718) = 31342
> > I guess the SYS_331 is just a strace specific thing.
> Actually 'SYS_331' means the syscall number 331. However
> /usr/include/asm/unistd_64.h stops at 294, so maybe this is
> a false syscall added by strace so it can catch it.
A right, thanks.
regards,
-mika-