List of Capabilities Related to Cloud-Native Scenarios From JDK 9 to 19 (Part 1)

Part 1 of this series provides an interpretation from the perspective of O&M and runtime.

By Guyi

Before JDK 9, Java released a new version roughly every three years. Since the launch of JDK 9 in September 2017, however, Java has been on a release spree, with two major JDK versions shipped each year. From JDK 9 to JDK 19, 11 versions have been released, two of which are LTS versions (JDK 11 and JDK 17). Beyond the faster release cadence, the JDK has introduced and enhanced a series of capabilities centered on cloud-native scenarios, such as awareness of container resources, low-pause garbage collectors (Z Garbage Collector (ZGC) and Shenandoah), and native O&M capabilities. This article summarizes the cloud-native capabilities that the EDAS team has collected while serving customers. We hope it helps you see Java in a new light.

Cloud-Native Scenario Definition

One internal driver of cloud-native development is the desire to make the most of the technology dividend the cloud brings to business workloads. The biggest dividend is the efficient delivery and utilization of resources through elasticity and related technologies, which ultimately reduces resource costs. Making the most of resource elasticity is therefore a goal many technology products pursue.

Another internal driver is avoiding lock-in to a single cloud vendor's technology, which is achieved by promoting standards in each field. Since the birth of cloud-native, and especially since the great success of Kubernetes, establishing specifications and standards for open-source products in each area has been a constant goal.

With these goals in view and standards continually improving, many products are reinventing themselves to take advantage of the standard capabilities in new scenarios. This applies to our products, and it applies to Java.

Capabilities Targeting Different Scenarios in Java

We interpret the roughly ten Java releases from three angles: O&M, programming model and runtime, and memory. The O&M part focuses on how to use existing container technology to obtain O&M metrics, along with the native capabilities that support this scenario. The programming model and runtime part covers changes to strings, together with new I/O models and capabilities. The biggest change of all concerns memory. In addition to better cgroup support in container scenarios, Java provides two low-pause garbage collectors, ZGC and Shenandoah GC. Besides keeping stop-the-world (STW) pauses short, they can return part of the memory to the operating system, which helps applications make the most of hardware resources in cloud-native scenarios.

The interpretation is split into two articles. This article covers everything except memory, which is analyzed separately in the next article.

1. More Cloud-Native O&M Scenarios

1.1 OperatingSystemMXBean

One of the capabilities of containers is process-level isolation. Before JDK 14, a Java program in a container that queried system resources through the OperatingSystemMXBean methods in JMX received the resource data of the host it ran on. Starting with JDK 14, when the JVM runs in a container or another virtualized operating environment, the OperatingSystemMXBean methods return container-specific information (such as free memory, swap, CPU, and load). This is particularly helpful for capabilities built on JMX, such as monitoring and system throttling. The APIs changed in JDK 14 are listed below:

// Returns the amount of free memory in bytes
long getFreeMemorySize();

// Returns the total amount of memory in bytes.
long getTotalMemorySize();

// Returns the amount of free swap space in bytes.
long getFreeSwapSpaceSize();

// Returns the total amount of swap space in bytes
long getTotalSwapSpaceSize();

// Returns the "recent cpu usage" for the operating environment. 
double getCpuLoad();
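
As a minimal sketch of how these methods can be called, the following program reads them through the com.sun.management.OperatingSystemMXBean interface. It assumes a HotSpot-based JDK 14 or later; the class name ContainerResourceProbe and the output format are only illustrative.

```java
import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;

// Hypothetical helper class, not part of the JDK.
final class ContainerResourceProbe {

    // Looks up the platform MXBean that exposes the JDK 14 methods listed above.
    static OperatingSystemMXBean osBean() {
        return ManagementFactory.getPlatformMXBean(OperatingSystemMXBean.class);
    }

    // Formats one snapshot of memory, swap, and CPU load as seen by the JVM.
    static String snapshot() {
        OperatingSystemMXBean os = osBean();
        long mib = 1024L * 1024L;
        return String.format(
                "memory total=%d MiB free=%d MiB, swap total=%d MiB free=%d MiB, cpuLoad=%.2f",
                os.getTotalMemorySize() / mib,
                os.getFreeMemorySize() / mib,
                os.getTotalSwapSpaceSize() / mib,
                os.getFreeSwapSpaceSize() / mib,
                os.getCpuLoad()); // negative when the value is not available
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Running it once on the host and once inside a container with memory and CPU limits should make the difference between the two views visible.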

1.2 Single File

There are two steps in the Java language program execution process:

  1. Use the compiler to compile the source code into a static bytecode file. For example, running javac App.java generates an App.class file.
  2. Run the application with the java launcher, adding the relevant classpath and naming the main class to start. For example, java -cp . App runs the compiled bytecode.

Many other statically compiled languages (such as C++ and Go) compile directly to an executable file. For dynamic scripting languages, Linux provides the shebang (#!) line, which works with the file's executable permission to simplify execution.

Java's execution method is slightly complicated, which can be unfriendly to people used to doing O&M with scripts. As a result, Java has rarely been used for O&M tasks. In cloud-native scenarios, influenced by the Codebase and Admin Processes principles, many one-off tasks are run as a Job/CronJob plus a single file. JEP 330, released in JDK 11, defines this capability and lets Java run a program directly from source code. In other words, running java App.java produces the same result as running the following two commands, except that the class is compiled in memory and no App.class file is written:

$ javac App.java
$ java -cp . App

Linux shebang files are also supported. Once the file header names the execution engine and the file is given executable permission, the script can be run directly. For example:

$ cat helloJava
#!/path/to/java --source version

// Java Source Code

$ chmod +x helloJava
$ ./helloJava
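
The // Java Source Code placeholder in the script can be any class with a main method. The following is a minimal sketch of such a single-file program; the class name HelloJava and the greeting text are hypothetical.

```java
// Hypothetical single-file program; the launcher runs the first class declared in the file.
final class HelloJava {

    // Builds the greeting from the first command-line argument, if any.
    static String greeting(String[] args) {
        String name = args.length > 0 ? args[0] : "world";
        return "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        System.out.println(greeting(args));
    }
}
```

Saved as HelloJava.java, it can be run with java HelloJava.java EDAS on JDK 11 or later. For shebang use, the #! line goes first in the file, and, as far as we know, the file name should not end in .java.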

1.3 JDK_JAVA_OPTIONS

In a container environment, once the image is built, program behavior can only be changed through configuration, which is in line with the cloud-native Config principle. However, many JVM settings must be passed as startup parameters (such as memory settings and -D system properties). This design is unfriendly to containers unless, when writing the Dockerfile, we make the JVM startup command pass in environment variables that change the JVM's behavior. Fortunately, the JVM provides the environment variable JAVA_TOOL_OPTIONS, whose value is read to supply default startup parameters. However, this variable has the following problems:

  1. It takes effect for the java command and for other JDK tools (such as jar, jstack, and jmap). Processes in a container read the environment variables passed in from outside by default, so once set, the value is shared by all processes in the container. When we enter the container to troubleshoot a Java program, the tools we run are polluted by JAVA_TOOL_OPTIONS and may not produce the expected results.
  2. Environment variables have a length limit. Whether in a Linux shell or in YAML orchestrated by Kubernetes, the length of an environment variable is not unlimited, and JVM startup parameters are usually very long. We therefore often run into unpredictable behavior caused by JAVA_TOOL_OPTIONS values that are too long.

JDK 9 provides a new environment variable, JDK_JAVA_OPTIONS, which is read only by the java launcher and does not pollute other commands. It also supports reading options from a file through export JDK_JAVA_OPTIONS='@file'. Both problems above are thus avoided.
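
To check which options actually reached the JVM, a small diagnostic like the sketch below can be run inside the container. It relies on the standard RuntimeMXBean.getInputArguments() method; the class name EffectiveJvmOptions is hypothetical, and we assume that options picked up from JDK_JAVA_OPTIONS appear in that list in the same way as options given on the command line.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical diagnostic class, not part of the JDK.
final class EffectiveJvmOptions {

    // Returns the JVM arguments (such as -Xmx or -D settings) this process was started with.
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        // Prints null when the variable is not set.
        System.out.println("JDK_JAVA_OPTIONS=" + System.getenv("JDK_JAVA_OPTIONS"));
        System.out.println("JVM arguments seen by this process:");
        jvmArguments().forEach(argument -> System.out.println("  " + argument));
    }
}
```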

1.4 ExitOnOutOfMemoryError

OutOfMemoryError is the last thing a Java programmer wants to see because it means there might be a memory leak in the system. Moreover, analyzing and locating such a problem usually takes complicated steps and a great deal of time and energy. To ensure business continuity, the priority when a failure occurs is to recover quickly and stop the losses. If the system reports OutOfMemoryError, we often choose to restart quickly.

Kubernetes defines the liveness probe, which lets programmers decide whether a quick restart is required based on the health of the business. A common OutOfMemoryError is often accompanied by a large number of full GCs, which can cause a CPU/load spike and make requests take too long. Based on this, we can pick a suitable business API to probe the health and liveness of the application. However, this scheme has the following problems:

  1. The selected API may lead to misjudgment. An API can time out for many reasons, and memory is only one of them.
  2. OutOfMemoryError is not always about the heap memory used by the business. It is also reported on metaspace overflow, stack space overflow, failure to create system threads, and so on.
  3. A probe usually has to fail several times in a row before a final failure is reported, so there is a delay between the problem occurring and the restart.

JDK 9 offers a better solution to this problem through the following JVM startup parameters:

  • -XX:+ExitOnOutOfMemoryError: When an OutOfMemoryError is encountered, the JVM exits immediately.
  • -XX:+CrashOnOutOfMemoryError: In addition to the semantics of ExitOnOutOfMemoryError, the JVM generates a crash log file, preserving basic information about the scene before exiting.
  • -XX:OnOutOfMemoryError: A command or script is specified after this parameter and runs when the error occurs, so some state can be cleaned up before exiting.

The three parameters above are particularly valuable for the Fail Fast concept advocated by cloud-native, especially for stateless microservice applications (such as those hosted on EDAS). Before the JVM exits, an OnOutOfMemoryError script can take the instance offline gracefully, while the JVM crash file is written to disk (such as a NAS). This protects the business from being disrupted by memory problems and preserves the scene for later analysis.
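
Because these switches are easy to lose when a startup command changes, an application could check at startup whether they are on. The following is a minimal sketch that assumes a HotSpot JVM, where HotSpotDiagnosticMXBean.getVMOption is available; the class name FailFastFlagCheck is hypothetical.

```java
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

// Hypothetical startup check, not part of the JDK.
final class FailFastFlagCheck {

    // Reads a boolean -XX flag from the running HotSpot JVM.
    static boolean isEnabled(String flagName) {
        HotSpotDiagnosticMXBean hotspot =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return Boolean.parseBoolean(hotspot.getVMOption(flagName).getValue());
    }

    // Summarizes the two fail-fast switches described above.
    static String describe() {
        return "ExitOnOutOfMemoryError=" + isEnabled("ExitOnOutOfMemoryError")
                + ", CrashOnOutOfMemoryError=" + isEnabled("CrashOnOutOfMemoryError");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Started with java -XX:+ExitOnOutOfMemoryError FailFastFlagCheck.java, the first value should be reported as true.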

1.5 CDS

Another thing cloud-native applications care about is quick startup. Driven by Serverless, cloud vendors are working hard on application cold-start metrics, and Java applications have long suffered from slow initialization. According to the EDAS 2022 annual report, 70% of the applications managed in EDAS take more than 30 seconds to start. On closer analysis, the startup time of a Java application consists of the initialization time of the application and the initialization time of the JVM, and during JVM initialization, searching for and loading class files takes most of the time. CDS was created to speed up class loading. CDS is short for Class-Data Sharing, a technology for sharing class data among applications. Because class files rarely change, the class metadata generated in one process can be dumped and then shared and reused by newly started instances, which saves the overhead of initializing each new instance from scratch.

CDS was introduced in JDK 5, but the first version only supported sharing classes loaded by the bootstrap class loader.

AppCDS, introduced in JDK 10, extends sharing to application-level classes. JDK 13 added dynamic archiving: with -XX:ArchiveClassesAtExit=foo.jsa, the shared archive is dumped when the program exits, and with -XX:SharedArchiveFile=foo.jsa, it is loaded at startup. JDK 19 simplifies O&M further with -XX:+AutoCreateSharedArchive: the JVM checks the shared archive itself and creates it when needed, so operators no longer have to verify the archive before each run, which makes the technology easier to use.
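
To see whether a shared archive pays off for a given application, one rough approach is to print the JVM uptime and the number of loaded classes at the end of initialization, once with and once without the archive. The sketch below uses only the standard java.lang.management API; the class name StartupReport and the report format are hypothetical, and the numbers are only a coarse indicator.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

// Hypothetical measurement helper, not part of the JDK.
final class StartupReport {

    // Reports how long the JVM has been up and how many classes are currently loaded.
    static String report() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        ClassLoadingMXBean classLoading = ManagementFactory.getClassLoadingMXBean();
        return String.format("uptime=%d ms, loadedClasses=%d",
                runtime.getUptime(), classLoading.getLoadedClassCount());
    }

    public static void main(String[] args) {
        // In a real application this would be called once initialization has finished.
        System.out.println(report());
    }
}
```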

2. More Friendly Run Time Capabilities

2.1 Compact Strings

In Java, every character of a string used to be stored as a 2-byte (16-bit) char. Analysis of many online Java applications shows that strings account for most of the heap consumption inside the JVM. However, most strings contain only Latin-1 characters, each of which fits in 1 byte. In theory, therefore, the vast majority of strings need only half the space.

Starting with JDK 9, the string implementations in the JDK (java.lang.String, AbstractStringBuilder, StringBuilder, and StringBuffer) use such a mechanism by default. It automatically encodes a string as one-byte ISO-8859-1/Latin-1 or two-byte UTF-16 according to its content, significantly reducing heap usage. Smaller heap usage also means fewer GC cycles, which improves the performance of the whole system.

The JDK has explored string compression since version 1.6. At that time, a JVM parameter, UseCompressedStrings (not open source), was provided as a switch. When switched on, it changed the storage structure of a string (byte[] or char[]) to achieve compression. Because this approach only modified the implementation of the String class and did not systematically cover other uses of strings, unpredictable problems came up during the experiment, and the switch was removed in JDK 7.
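
The encoding choice described above can be illustrated with a small estimate. This is not the JDK's internal code, only a sketch of the rule as stated: one byte per character when every character fits in Latin-1, otherwise two bytes per character. The class and method names are hypothetical.

```java
// Hypothetical illustration of the Latin-1 versus UTF-16 choice, not JDK internals.
final class CompactStringEstimate {

    // True when every char has a value of 0xFF or below, that is, fits in Latin-1.
    static boolean fitsLatin1(String text) {
        return text.chars().allMatch(c -> c <= 0xFF);
    }

    // Estimated size of the character payload in bytes, excluding object headers.
    static int payloadBytes(String text) {
        return fitsLatin1(text) ? text.length() : 2 * text.length();
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"hello", "h\u00e9llo", "\u4f60\u597d"}) {
            System.out.println(sample.length() + " chars -> " + payloadBytes(sample) + " bytes");
        }
    }
}
```

Note that a single character outside Latin-1 switches the whole string to two bytes per character under this rule.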

2.2 Active Processor Count

Active processor count refers to the number of CPU cores the JVM process can use. The corresponding JDK API is Runtime.getRuntime().availableProcessors(). It is commonly used to size system threads and I/O threads (such as the default number of GC threads in the JVM, the number of JIT compilation threads, I/O threads in some frameworks, and ForkJoinPool). We habitually set the number of threads to the number the JVM reports. Initially, however, the JVM obtained this number by reading the CPU data under /proc/cpuinfo. In container scenarios, the cgroup isolation mechanism lets us give a container far fewer cores than the machine it runs on, but without special handling the JVM reads the CPU information of the host. For example, if we run a JVM program in a 2-core container on a 4-core machine, it gets 4 instead of the expected 2.

Resource awareness in containers is not only about the CPU. The best-known version in this respect is JDK 8u191. In addition to CPU, this version adds detection of the container memory limit and improves attaching to a JVM process in a container from the host (jstack, jcmd, and so on). The CPU-related improvements are listed below:

  1. A startup parameter, -XX:ActiveProcessorCount, is added to explicitly specify the number of processors.
  2. Automatic detection is performed based on the cgroup file system. Three settings are detected:

1) CPU Set (CPU is allocated by binding cores)

2) cpu.shares

3) cfs_quota + cfs_period

In Kubernetes scenarios, the default priority is 1) > 2) > 3).

You may wonder why this causes problems in Kubernetes scenarios. For example, suppose we use the following configuration to set the resource usage of a Pod:

    resources:
      limits:
        cpu: "4"
      requests:
        cpu: "2"

The configuration above indicates that this Pod can use up to 4 cores and requests 2 cores from the system. In Kubernetes, the CPU limit is ultimately expressed through CFS (quota + period), while the CPU request is ultimately expressed through cpu.shares (the details of the cgroup mapping are omitted here). In this scenario, when the JVM derives the count from cpu.shares, Runtime.getRuntime().availableProcessors() returns 2, not the expected 4.

How can we avoid this problem? The first and simplest way is to pass the CPU count explicitly through -XX:ActiveProcessorCount, which requires changing the startup command in our O&M operations. In JDK 19, the JVM no longer uses cpu.shares in the calculation by default and adds a startup parameter, -XX:+UseContainerCpuShares, to restore the previous behavior.
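
To verify what the JVM actually detects in a given Pod, it can help to print the value and the pool sizes derived from it. The following is a minimal sketch; the class name CpuAwareSizing is hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;

// Hypothetical diagnostic class, not part of the JDK.
final class CpuAwareSizing {

    // The processor count the JVM reports; -XX:ActiveProcessorCount=N overrides it.
    static int visibleCpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        int cpus = visibleCpus();
        System.out.println("availableProcessors=" + cpus);
        // The common pool derives its default parallelism from the same value.
        System.out.println("commonPool parallelism=" + ForkJoinPool.commonPool().getParallelism());

        // A worker pool sized to the detected count, as many frameworks do by default.
        ExecutorService workers = Executors.newFixedThreadPool(cpus);
        workers.shutdown();
    }
}
```

Running it with and without -XX:ActiveProcessorCount=4 in the same Pod shows whether the explicit setting takes effect.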

2.3 JEP 380: Unix Domain Sockets

A Unix domain socket (UDS) is used for inter-process communication (IPC) on the same machine under Unix systems. In many ways it works like TCP/IP (socket reads and writes, accepting and establishing connections). However, there are many differences. For example, it has no actual IP address or port, and data does not pass through the full TCP/IP stack for parsing and forwarding. Compared with transmitting over 127.0.0.1, UDS has two advantages:

  1. Security: UDS is designed only for inter-process communication within a machine and cannot accept remote access, so it is not exposed to interference from processes on other machines. Its access control can also rely directly on Unix file permissions, which greatly enhances security at the system level.
  2. Performance: Although loopback access through 127.0.0.1 is heavily optimized in the protocol stack, it is still TCP socket communication. That is to say, loopback still requires a three-way handshake and packing and unpacking in the protocol stack, and it is restricted by system buffers. UDS connection establishment is much simpler, and data transmission does not require multiple copies at the kernel level. The logic of transmitting data is streamlined to:

1) Find the peer's socket

2) Put the data directly into the peer's receive buffer

This makes UDS at least twice as efficient as loopback in scenarios where small amounts of data are sent.

Java had no support for UDS until JDK 16, which adds it through JEP 380. Why did Java wait so long? The reason, again, is the impact of cloud-native scenarios. In Kubernetes, orchestrating multiple containers together in one Pod (the sidecar pattern) is becoming increasingly popular. When data is transmitted between containers in the same Pod, UDS significantly improves efficiency because the transmission takes place through the file system within the same namespace.
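
As a minimal sketch of the JDK 16 API, the following program sends one short message over a Unix domain socket and reads back the echo, all within a single process. The class name UdsEchoSketch, the socket file name echo.sock, and the 1 KiB message limit are assumptions made for illustration; a sidecar setup would instead bind the socket on a path both containers can reach.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; requires JDK 16 or later.
final class UdsEchoSketch {

    // Accepts one connection and echoes everything it reads back to the peer.
    private static void echoOnce(ServerSocketChannel server) {
        try (SocketChannel peer = server.accept()) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            while (peer.read(buffer) != -1) {
                buffer.flip();
                while (buffer.hasRemaining()) {
                    peer.write(buffer);
                }
                buffer.clear();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sends a short message (under 1 KiB) over a Unix domain socket and returns the echo.
    static String roundTrip(String message) {
        try {
            Path dir = Files.createTempDirectory("uds");
            Path socketFile = dir.resolve("echo.sock");
            UnixDomainSocketAddress address = UnixDomainSocketAddress.of(socketFile);
            try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
                server.bind(address); // creates the socket file on disk
                Thread echoThread = new Thread(() -> echoOnce(server));
                echoThread.start();

                ByteBuffer reply = ByteBuffer.allocate(1024);
                try (SocketChannel client = SocketChannel.open(StandardProtocolFamily.UNIX)) {
                    client.connect(address);
                    ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                    while (out.hasRemaining()) {
                        client.write(out);
                    }
                    client.shutdownOutput(); // tells the echo side that the message is complete
                    while (reply.hasRemaining() && client.read(reply) != -1) {
                        // keep reading until the peer closes or the buffer is full
                    }
                }
                echoThread.join();
                reply.flip();
                return StandardCharsets.UTF_8.decode(reply).toString();
            } finally {
                // The socket file is not removed automatically when the channel closes.
                Files.deleteIfExists(socketFile);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping"));
    }
}
```

Apart from the address type and the protocol family, the code reads like ordinary SocketChannel code, which keeps the cost of switching an existing loopback connection to UDS fairly low.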

Summary

This article provides an interpretation from the perspective of O&M and runtime. We will talk about memory in the next article.
