By Bubi
This is the fifth article in the series Java in the Container. Please stay tuned. 😊
A common problem is that arthas is needed for troubleshooting, but the customer's image ships only a JRE, so arthas cannot attach. The customer then has to replace the JRE with a JDK, redeploy, and only then start troubleshooting.
Much of the useful on-site state of the failing process is lost along the way, which makes troubleshooting less efficient. We therefore explored how to use arthas on a JRE.
Let's write a Java example and a Dockerfile:
// ./src/main/java/Main.java
public class Main {
    public static void main(String[] args) throws Exception {
        while (true) {
            System.out.println("hello!");
            Thread.sleep(30 * 1000);
        }
    }
}
# ./Dockerfile
FROM openjdk:8-jdk-alpine as builder
COPY ./ /app
WORKDIR /app/src/main/java/
# Compile the Java file
RUN javac Main.java
# Runtime container uses the JRE
FROM openjdk:8-jre-alpine
RUN apk add bash curl busybox-extras
WORKDIR /app/src/main/java/
# Copy arthas to the container
COPY --from=hengyunabc/arthas:latest /opt/arthas /opt/arthas
COPY --from=builder /app/src/main/java/ /app/src/main/java/
CMD ["java", "Main"]
Build and start the application normally and try to use arthas to attach the process. Here we use as.sh to understand how it works:
$ # Build an image
$ docker build . -t example-attach
$ # Start the container
$ docker run --name example-attach --rm example-attach
$ # Enter the container in another terminal and run as.sh
$ docker exec -it example-attach sh
/app/src/main/java $ /opt/arthas/as.sh
Arthas script version: 3.6.7
tools.jar was not found, so arthas could not be launched!
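The message says that the launcher could not find tools.jar. A minimal sketch of such a lookup follows, assuming the JDK 8 layout in which tools.jar sits in lib/ under the JDK home. The class and method names are hypothetical, and this is not the logic of as.sh itself:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper: looks for tools.jar where a JDK 8 layout would place it.
final class ToolsJarLocator {

    // On JDK 8, java.home usually points at <jdk>/jre, so tools.jar would be at
    // <java.home>/../lib/tools.jar; some setups point java.home at <jdk> itself.
    static Optional<Path> find(Path javaHome) {
        Path[] candidates = {
            javaHome.resolve("lib").resolve("tools.jar"),
            javaHome.resolve("..").resolve("lib").resolve("tools.jar").normalize()
        };
        for (Path candidate : candidates) {
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Path javaHome = Paths.get(System.getProperty("java.home"));
        System.out.println(find(javaHome)
                .map(p -> "tools.jar found at " + p)
                .orElse("tools.jar was not found under " + javaHome));
    }
}
```

On a JRE-only image, neither candidate exists, so the lookup comes back empty.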
Let's run it with JDK and see how arthas is attached:
# Replace the container with the JDK image and run it
# Start Attach Listener first
$ pid=1 ;\
touch /proc/${pid}/cwd/.attach_pid${pid} && \
kill -SIGQUIT ${pid} && \
sleep 2 && \
ls /proc/${pid}/root/tmp/.java_pid${pid}
# -x enables tracing, which prints each command as it is executed; 1 is the PID of the Java process
$ bash -x /opt/arthas/as.sh 1
...
+ /usr/lib/jvm/java-1.8-openjdk/bin/java -Xbootclasspath/a:/usr/lib/jvm/java-1.8-openjdk/lib/tools.jar -Djava.awt.headless=true -jar /opt/arthas/arthas-core.jar -pid 1 -core /opt/arthas/arthas-core.jar -agent /opt/arthas/arthas-agent.jar
...
+ telnet 127.0.0.1 3658
...
As the trace shows, the main logic is java -jar arthas-core.jar -pid 1 -core arthas-core.jar -agent arthas-agent.jar, followed by a connection to port 3658. The -Xbootclasspath/a:tools.jar option is what makes the Attach API classes available on the JDK, but since the JRE has no tools.jar, we leave it out for now.
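Before launching arthas-core, the session above first wakes up the target JVM's Attach Listener. A minimal Java sketch of that handshake follows, mirroring the shell commands: create the .attach_pid marker, send SIGQUIT, then wait for the .java_pid socket. The class and method names are hypothetical, the paths assume a Linux /proc filesystem, and the kill binary is assumed to be on the PATH:

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper mirroring the "Start Attach Listener" shell commands.
final class AttachListenerTrigger {

    // Marker file created in the target's working directory.
    static Path triggerFile(long pid) {
        return Paths.get("/proc", Long.toString(pid), "cwd", ".attach_pid" + pid);
    }

    // Socket file expected in the target's /tmp once the listener is up.
    static Path socketFile(long pid) {
        return Paths.get("/proc", Long.toString(pid), "root", "tmp", ".java_pid" + pid);
    }

    // Returns true if the socket file shows up within the timeout.
    static boolean start(long pid, long timeoutMillis)
            throws IOException, InterruptedException {
        try {
            Files.createFile(triggerFile(pid));
        } catch (FileAlreadyExistsException ignored) {
            // A marker left over from an earlier attempt is fine.
        }
        int exit = new ProcessBuilder("kill", "-QUIT", Long.toString(pid))
                .inheritIO().start().waitFor();
        if (exit != 0) {
            return false;
        }
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (Files.exists(socketFile(pid))) {
                return true;
            }
            Thread.sleep(100);
        }
        return Files.exists(socketFile(pid));
    }
}
```

Calling AttachListenerTrigger.start(1, 2000) corresponds to the pid=1 commands with their two-second wait.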
What happens if we run the same logic directly on the JRE? Execute the commands above in the JRE image:
# Replace the container with the JRE image and run it
# Start Attach Listener first
$ pid=1 ;\
touch /proc/${pid}/cwd/.attach_pid${pid} && \
kill -SIGQUIT ${pid} && \
sleep 2 && \
ls /proc/${pid}/root/tmp/.java_pid${pid}
$ cd /opt/arthas/
$ java -jar arthas-core.jar -pid 1 -core arthas-core.jar -agent arthas-agent.jar
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: com/sun/tools/attach/AgentLoadException
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: com.sun.tools.attach.AgentLoadException
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
Judging from the code, this error is to be expected. arthas-core calls the Attach API and then loads the agent:
[Figure: arthas-core source code, key lines marked]
People familiar with the class loading mechanism may have already figured out that Arthas.class depends on classes under com.sun.tools.*, so the error above is raised while the class is being linked. This is why the stack trace of the error contains no arthas packages.
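To make the dependency concrete, here is a minimal sketch of the standard Attach API pattern: attach, load the agent, detach. It is not the Arthas source, and the class and method names are hypothetical. The com.sun.tools.attach types appear in the imports and in the throws clause, so on a JDK 8 JRE without tools.jar a class like this fails with a NoClassDefFoundError as soon as the JVM needs to resolve those types. Compiling it requires tools.jar on the class path for JDK 8, or the jdk.attach module on JDK 9 and later:

```java
import com.sun.tools.attach.AgentInitializationException;
import com.sun.tools.attach.AgentLoadException;
import com.sun.tools.attach.AttachNotSupportedException;
import com.sun.tools.attach.VirtualMachine;
import java.io.IOException;

// Hypothetical sketch of an Attach API client; not taken from Arthas.
final class AttachSketch {

    // Attaches to the JVM with the given PID and loads a Java agent into it.
    static void loadAgent(String pid, String agentJar, String agentArgs)
            throws AttachNotSupportedException, IOException,
                   AgentLoadException, AgentInitializationException {
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            vm.loadAgent(agentJar, agentArgs);
        } finally {
            vm.detach();
        }
    }
}
```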
Given the arthas code above, the question is how to get rid of the dependency on tools.jar.
First, calling the classes and methods of com.sun.tools.attach.* directly, as shown in the figure, does not work; the error above already shows why. Reflection does not help either: since tools.jar is absent, these classes cannot be loaded at all.
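The reflection point can be checked with a small probe. This is a minimal sketch with hypothetical class and method names. It only asks the class loader whether a class can be found, which is the step that fails on a JDK 8 JRE regardless of how the call is made afterwards:

```java
// Hypothetical probe: reports whether a class can be found by the system class loader.
final class AttachApiProbe {

    static boolean isLoadable(String className) {
        try {
            // initialize=false: we only care whether the class can be located.
            Class.forName(className, false, ClassLoader.getSystemClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "com.sun.tools.attach.VirtualMachine";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```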
Second, can we manually put tools.jar into the container to remove the dependency on the JDK? This works in theory, and the relevant issue describes the specific steps and precautions.
However, tools.jar differs across JDK distributions and versions. For example, if arthas cannot attach in eclipse-temurin:11-jre-alpine, copying the tools.jar from JDK 8 will not fix it. Is there another way to attach the agent?
Third, ByteBuddy implements agent attachment. However, it does so by trying a list of strategies one by one, and almost all of them rely on tools.jar. If you are interested, take a look at the implementation of its attachment strategies.
It seems that we could implement our own AttachmentProvider and then modify arthas to attach the agent through ByteBuddy.
That's what I thought at first, and I even wrote half the code. On the way home that night, however, I remembered that the previous article showed how to attach an agent with a custom script or with jattach.
Fourth, load the agent through jattach.
Refer to the jattach documentation and perform the following operations:
# Install jattach
$ apk add jattach
# Load arthas-agent.jar into the target JVM
$ jattach 1 load instrument false /opt/arthas/arthas-agent.jar
Connected to remote JVM
JVM response code = 0
return code: 0
# Confirm the listening port with netstat
$ netstat -alnp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:3658 0.0.0.0:* LISTEN 1/java
...
# Connect to the corresponding port
$ java -jar /opt/arthas/arthas-client.jar 127.0.0.1 3658
After the steps above, arthas can be used freely.
If you only need the short answer:
$ pid=1 ;\
jattach ${pid} load instrument false /opt/arthas/arthas-agent.jar && \
java -jar /opt/arthas/arthas-client.jar 127.0.0.1 3658
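For callers that prefer to drive this from Java, the one-liner can be wrapped as follows. This is a minimal sketch: the class and method names are hypothetical, jattach is assumed to be on the PATH, and 127.0.0.1:3658 is the address seen in the netstat output above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the jattach command line used in this article.
final class JattachLauncher {

    // Builds: jattach <pid> load instrument false <agentJar>
    static List<String> buildCommand(long pid, String agentJar) {
        return Arrays.asList("jattach", Long.toString(pid),
                "load", "instrument", "false", agentJar);
    }

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Runs jattach, then polls until the agent's port accepts connections.
    static boolean attach(long pid, String agentJar, int attempts)
            throws IOException, InterruptedException {
        int exit = new ProcessBuilder(buildCommand(pid, agentJar))
                .inheritIO().start().waitFor();
        if (exit != 0) {
            return false;
        }
        for (int i = 0; i < attempts; i++) {
            if (isListening("127.0.0.1", 3658, 500)) {
                return true;
            }
            Thread.sleep(500);
        }
        return false;
    }
}
```

Once attach(1, "/opt/arthas/arthas-agent.jar", 10) returns true, the arthas client can connect to the port as before.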
This time we did more with the attach mechanism. Developers no longer need to swap the JRE for a JDK or change images. They can preserve the on-site state as much as possible and troubleshoot more smoothly and efficiently.
Of course, Java applications may encounter various strange situations in the container environment. If you want to learn more, please stay tuned for future articles.