
Bug#1034392: tomcat9: jstack/jcmd broken for non-root users with tomcat9+jdk11 or greater


Per Lundberg

Apr 14, 2023, 2:50:04 AM
Package: tomcat9
Version: 9.0.43-2~deb11u6
Severity: normal
X-Debbugs-Cc: sebastia...@hibox.tv

Hi,

We noticed while rolling out JDK 17 support for our in-house application
that the following command is "broken" (moral-martin is an LXD container
in my examples below, PID 4108 is the tomcat9 java process):

root@moral-martin:~# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye

root@moral-martin:~# sudo -u tomcat jstack 4108
4108: Unable to open socket file /proc/4108/root/tmp/.java_pid4108: target process 4108 doesn't respond within 10500ms or HotSpot VM not loaded

...when all of the following conditions are met:

* tomcat9 is running from systemd, _and_
* the JDK is of version 11 or greater, _and_
* the systemd unit (/lib/systemd/system/tomcat9.service) sets
  AmbientCapabilities=CAP_NET_BIND_SERVICE (which is done by the Debian
  package)

We have spent a significant amount of time debugging this and I'll try
to do my best to summarize our findings here:

The problem is that the way jstack and similar tools work has changed
from JDK 8 to JDK 11. In JDK 8, it simply uses /tmp to try to communicate
with the target process:
https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/master/jdk/src/solaris/classes/sun/tools/attach/LinuxVirtualMachine.java#L40-L45
and https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/master/jdk/src/solaris/classes/sun/tools/attach/LinuxVirtualMachine.java#L293

In newer JDK versions (JDK 17 as an example), the code has been made
"smarter" to support mount namespaces:
https://github.com/openjdk/jdk17u/blob/master/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java#L299-L302
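To make the difference concrete, here is a minimal sketch of the two lookup strategies. It is a simplification for illustration, not the actual JDK code: the class and method names are made up, and only the resulting paths follow what the linked sources and the error message above show.

```java
import java.nio.file.Path;

public class AttachSocketPaths {
    // JDK 8 style: the socket file is looked up directly in the caller's /tmp.
    static Path jdk8Style(int pid) {
        return Path.of("/tmp", ".java_pid" + pid);
    }

    // JDK 11+ style: the lookup goes through the target's root file system as
    // exposed by procfs, using the PID the target has inside its own namespace.
    static Path jdk17Style(int pid, int nsPid) {
        return Path.of("/proc", String.valueOf(pid), "root", "tmp", ".java_pid" + nsPid);
    }

    public static void main(String[] args) {
        System.out.println(jdk8Style(4108));
        System.out.println(jdk17Style(4108, 4108));
    }
}
```

The second path is the one that shows up in the jstack error message quoted earlier.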

_However_... bear with me, this is where it gets interesting: this
presumes that the calling process can access /proc/<pid>/root/tmp. When
AmbientCapabilities=CAP_NET_BIND_SERVICE is set in the systemd unit,
this is not the case:

root@moral-martin:~# sudo -u tomcat ls -l /proc/4108/root
ls: cannot read symbolic link '/proc/4108/root': Permission denied
lrwxrwxrwx 1 tomcat tomcat 0 Apr 13 12:55 /proc/4108/root
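The same check can be done from Java, which may be handy for tooling that wants to detect the situation before calling jstack. This is only a sketch: `ProcRootProbe` and `canResolve` are hypothetical names, and it merely tests whether the current user may read the `/proc/<pid>/root` symlink.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ProcRootProbe {
    // Returns true if the current user can read the given symbolic link.
    // A permission error (as in the ls output above), a missing file, or a
    // path that is not a symlink all yield false.
    static boolean canResolve(Path link) {
        try {
            Files.readSymbolicLink(link);
            return true;
        } catch (IOException | UnsupportedOperationException | SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass a PID as the first argument; defaults to the current process.
        String pid = args.length > 0 ? args[0] : "self";
        Path root = Path.of("/proc", pid, "root");
        System.out.println(root + " resolvable: " + canResolve(root));
    }
}
```

Run as the tomcat user with the Tomcat PID as argument, this would be expected to report `false` in the failing setup described here.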

We have tested this and concluded that:

1. This happens whenever _any_ capability is set in the systemd unit; it's
not limited to CAP_NET_BIND_SERVICE. (Note: I haven't tested adding
all possible capabilities yet. I believed I had, but while writing this
bug report I realized that my attempt at setting all of them didn't
actually list all of them in `getpcaps pid`. I will test this a bit
more and see if it makes any difference.)

2. When you remove AmbientCapabilities or set it to AmbientCapabilities=
(empty string), jstack works correctly again.

I honestly don't know if tomcat9 is the correct package to report this
to; it can also be seen as a bug in the JDK. (We will work with the JDK
maintainers to get this reported to them as well.) Feel free to reassign
the bug report to another package.

With JDK 8, this works correctly. Some of our tooling/monitoring is
dependent on being able to connect to Tomcat (running on JDK 8 or 17) at
runtime. That's why this is important for us.

Workaround

What we have seen is that on JDK 17, running `jstack` as root works; it
connects to the target process correctly. However, this does _not_
work on JDK 8 and doesn't seem to work properly on JDK 11 either (I
think this has been fixed upstream for more recent JDK versions,
which is why it behaves differently on JDK 17).

Our application supports both JDK 8 and 17 for now, and running `jstack`
as root *does not* work on JDK 8. Hence, having to run it as root with
our JDK 17-based installations (only) makes things unnecessarily
complex.

Conclusions

It puzzles me why setting the ambient capabilities for a process breaks
this. It's uncertain whether this is a "feature" by the kernel or
elsewhere. We have tried to find more details about this by studying the
systemd and dbus code to a certain extent, but have so far been unable to
find anything. If anyone reading this knows the prctl and cap_set_proc
semantics by heart, your help would be greatly appreciated.

Best regards,
Per

-- System Information:
Debian Release: bookworm/sid
APT prefers testing-security
APT policy: (500, 'testing-security'), (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-6-amd64 (SMP w/20 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages tomcat9 depends on:
ii lsb-base 11.6
ii systemd [systemd-tmpfiles] 252.6-1
ii sysvinit-utils [lsb-base] 3.06-2
pn tomcat9-common <none>
ii ucf 3.0043+nmu1

Versions of packages tomcat9 recommends:
ii libtcnative-1 1.2.35-1

Versions of packages tomcat9 suggests:
pn tomcat9-admin <none>
pn tomcat9-docs <none>
pn tomcat9-examples <none>
pn tomcat9-user <none>

Thorsten Glaser

Apr 19, 2023, 3:40:04 AM
On Tue, 18 Apr 2023, Per Lundberg wrote:

> A short update on this. This is a known regression in more recent versions of
> Java: https://bugs.openjdk.org/browse/JDK-8226919
>
> One of my colleagues (thanks, Sebastian!) managed to work around this by
> patching the JDK 17 sources to make it use plain /tmp in this case (when ns_pid
> == pid), and also added some better error handling in case this fails.
>
> We are currently working on getting this submitted upstream to OpenJDK, but I

That’s a good path.

> wanted to share it with you as well. One option would be to include this in
> Debian's set of local JDK patches

Shouldn’t this be added to 11 as well? Apparently, both are affected.

> but I don't know how conservative the project is re. fixes like this? I'll
> leave this up to the debian-java maintainers to decide.

The OpenJDK (except for 8 which the ELTS people and I mostly work on)
is not maintained by the debian-java people but by Doko.

The usual way to hope for inclusion is to clone the bugreport, assign
one to src:openjdk-11 and the other to src:openjdk-17, mail the patch
with a description, add the tag patch and pray.

bye,
//mirabilos
--
Infrastrukturexperte • tarent solutions GmbH
Am Dickobskreuz 10, D-53121 Bonn • http://www.tarent.de/
Telephon +49 228 54881-393 • Fax: +49 228 54881-235
HRB AG Bonn 5168 • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg


Vladimir Petko

Apr 19, 2023, 5:10:04 PM
Hi,

Oh, thank you for providing a patch for a quite annoying bug!!!!

Would it be possible to add a header to the patch, so that it is
possible to see where it came from and why, e.g.
-----------------------------------cut--------------------------------------------------------------------------
Description: attach in linux hangs due to permission denied accessing /proc/pid/root
 The attach API uses /proc/pid/root in order to support containers.
 Dereferencing this symlink is governed by ptrace access mode
 PTRACE_MODE_READ_FSCREDS, which may not succeed when running as the
 user running the JRE.
 This breaks running jcmd and jmap as the same user the JVM is running as.
 Use tmpdir when pid matches ns_pid.
Author: Sebastian Lovdahl <sebastia...@hibox.tv>
Bug: https://bugs.openjdk.org/browse/JDK-8226919
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1034601
Last-Update: 2023-04-18
-----------------------------------cut--------------------------------------------------------------------------
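
For context on the ptrace access mode named in the header: as far as I understand ptrace(2), one part of that check compares capability sets. Access is denied unless the caller's capabilities (the effective set, for FSCREDS checks) are a superset of the target's permitted capabilities, or the caller holds CAP_SYS_PTRACE. That would explain why an unprivileged `sudo -u tomcat jstack` fails against a JVM that carries CAP_NET_BIND_SERVICE, while root succeeds. The sketch below models only that one rule; the class and method names are hypothetical, and the real kernel check has further conditions (UID match, dumpable flag, LSMs) that are not modeled here.

```java
public class PtraceCapCheck {
    // Capability bit numbers as defined in linux/capability.h.
    static final int CAP_NET_BIND_SERVICE = 10;
    static final int CAP_SYS_PTRACE = 19;

    // Extracts a hex capability mask such as "CapPrm" or "CapEff" from the
    // text of /proc/<pid>/status.
    static long parseCapMask(String statusText, String field) {
        for (String line : statusText.split("\n")) {
            if (line.startsWith(field + ":")) {
                return Long.parseUnsignedLong(line.substring(field.length() + 1).trim(), 16);
            }
        }
        throw new IllegalArgumentException("field not found: " + field);
    }

    // Models only the capability part of the ptrace access check.
    static boolean capsAllowAccess(long callerEffective, long targetPermitted) {
        boolean superset = (targetPermitted & ~callerEffective) == 0;
        boolean hasSysPtrace = (callerEffective & (1L << CAP_SYS_PTRACE)) != 0;
        return superset || hasSysPtrace;
    }

    public static void main(String[] args) {
        long tomcatJvm = 1L << CAP_NET_BIND_SERVICE; // target started with the ambient capability
        long plainUser = 0L;                         // jstack run via sudo -u tomcat
        System.out.println(capsAllowAccess(plainUser, tomcatJvm));
        System.out.println(capsAllowAccess(plainUser, 0L));
    }
}
```

Under this model the first call is denied and the second (target without any capabilities) is allowed, which matches the observation that clearing AmbientCapabilities makes jstack work again.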

Best Regards,
Vladimir.

On Wed, Apr 19, 2023 at 9:57 PM Per Lundberg <per.lu...@hibox.tv> wrote:
>
> On 2023-04-19 10:22, Thorsten Glaser wrote:
> > On Tue, 18 Apr 2023, Per Lundberg wrote:
> >
> >> wanted to share it with you as well. One option would be to include this in
> >> Debian's set of local JDK patches
> >
> > Shouldn’t this be added to 11 as well? Apparently, both are affected.
>
> Good point. Yes, it should.
>
> > The OpenJDK (except for 8 which the ELTS people and I mostly work on)
> > is not maintained by the debian-java people but by Doko.
>
> Hmm... who/what are Doko?
>
> > The usual way to hope for inclusion is to clone the bugreport, assign
> > one to src:openjdk-11 and the other to src:openjdk-17, mail the patch
> > with a description, add the tag patch and pray.
>
> Thanks for the detailed description! I have done exactly that now. Here
> are the new bugs (added to the Cc line as well):
>
> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1034600
> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1034601
>
> To those reading this who might not have the context: the patch attached
> to the previous message in this thread fixes an issue with jstack/jcmd
> and similar tools not being able to connect to processes with Linux
> capabilities added to them, when the processes are running as non-root.
> This is a regression in the JDK:
> https://bugs.openjdk.org/browse/JDK-8226919
>
> The patch has been successfully tested on JDK 17 and works fine,
> according to our testing. No guarantees are given as to whether it works
> on JDK 11, but as long as it applies cleanly, it "should" be fine.
>
> Best regards,
> Per
>
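
For readers without the attachment, the idea described in this thread (use plain /tmp when ns_pid == pid) can be sketched roughly as follows. This is not the actual patch: the class and method names are invented for illustration, the error handling mentioned above is left out, and the `NSpid` parsing reflects my understanding of the `/proc/<pid>/status` format, where the last value on that line is the PID in the innermost namespace.

```java
import java.nio.file.Path;

public class PatchedSocketPath {
    // Returns the innermost namespace PID from the text of /proc/<pid>/status,
    // or fallbackPid if there is no NSpid line (for example on old kernels).
    static int innermostNsPid(String statusText, int fallbackPid) {
        for (String line : statusText.split("\n")) {
            String[] parts = line.split(":", 2);
            if (parts.length == 2 && parts[0].trim().equals("NSpid")) {
                String[] pids = parts[1].trim().split("\\s+");
                return Integer.parseInt(pids[pids.length - 1]);
            }
        }
        return fallbackPid;
    }

    // If the target sees itself under the same PID as the caller does, assume
    // a shared PID namespace and use /tmp directly, avoiding /proc/<pid>/root.
    static Path socketPath(int pid, int nsPid) {
        if (pid == nsPid) {
            return Path.of("/tmp", ".java_pid" + pid);
        }
        return Path.of("/proc", String.valueOf(pid), "root", "tmp", ".java_pid" + nsPid);
    }

    public static void main(String[] args) {
        int pid = 4108;
        int nsPid = innermostNsPid("Name:\tjava\nNSpid:\t4108\n", pid);
        System.out.println(socketPath(pid, nsPid));
    }
}
```

The trade-off is that the /proc-based path is still used whenever the PIDs differ, so attaching into containers keeps working as before.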