Alibaba's open-source, self-developed dynamic non-intrusive AOP solution: JVM-Sandbox

Foreword

As software deployments grow and system functions become more fine-grained, the coupling between systems and the complexity of call links keep increasing. To keep systems of the current scale stable, we need to build and improve supporting tool platforms such as monitoring, fault location and analysis, traffic recording and playback, strong and weak dependency detection, and fault drills. Given the number of servers and the need for business stability, these tool platforms must be non-intrusive to target applications, take effect in real time, and be dynamically pluggable.

All of these rely, to a greater or lesser extent, on one underlying technology: dynamic bytecode enhancement. If each tool implements its own bytecode enhancement logic, the barrier to initial implementation and the cost of later maintenance are both high, and interactions between different tools introduce unpredictable risks. Hiding the high barrier of bytecode enhancement, reducing R&D and O&M costs, and supporting the rapid implementation and dynamic management of multiple upper-layer tool platforms became a goal for Alibaba Group. Since last year, we have worked on this in practice and developed a real-time, non-intrusive bytecode enhancement framework.

So JVM-Sandbox was born!

Target scenarios and audience of JVM-Sandbox
  • BTrace is powerful, but you have been itching to build a more convenient problem-locating tool that suits your own needs, one that supports both online link monitoring and troubleshooting and standalone problem locating.
  • Sometimes a problem is reported out of the blue and you need the input parameters to locate it, but there are no logs, and the problem may even sit in someone else's code. You wish you had a tool that adds logs dynamically on demand, ideally with filtering by business ID.
  • Many tools can simulate exceptions between systems, but what about exceptions inside a system? You could add a switch or implement it with AOP in the system under development, but you would rather build a more elegant exception simulation tool that covers both inter-system and intra-system exceptions.
  • You want line-level call trace data for scenario identification, coverage statistics, and so on. Coverage tools do not support this natively, and the trace data they collect is inaccurate, so you want to build a tool that obtains line-level trace data.
  • You want to build tools for recording and playback, fault simulation, dynamic logging, and line trace collection. Even once they are built, they all share the same underlying implementation principle. When they run together, how do you eliminate interference between them, ensure they can be loaded dynamically, ensure that loading or unloading one tool does not affect the others, and ensure that when a tool misbehaves its impact can be removed quickly and the code restored?

If you have any of these needs, you are a potential user of JVM-Sandbox. JVM-Sandbox dynamically enhances the classes you specify so that you can obtain the parameters and line information you need, and it provides a dynamically pluggable container to manage modules built on JVM-Sandbox.

What can JVM-Sandbox do?

In the worldview of JVM-Sandbox (hereinafter, the sandbox), any call to a Java method can be decomposed into three links: BEFORE, RETURN, and THROWS. Event detection and flow control mechanisms are built on these three links. In addition, there is a LINE event, which can be used to record the lines of code that are executed.

// BEFORE
try {
    /*
     * do something...
     */
    // RETURN
    return;
} catch (Throwable cause) {
    // THROWS
}

Based on the BEFORE, RETURN, and THROWS events and the LINE event, many AOP-like operations can be implemented.

  1. Observe and change the input parameters of a method call.
  2. Observe and change the return value of a method call and the exceptions it throws.
  3. Record which lines are executed, and in what order, during a request.
  4. Change the execution flow of a method:
     • Return a custom result object directly before the method body executes, so that the original method code is not executed.
     • Construct a new result object before the method body returns, or even throw an exception instead.
     • Throw a new exception after the method body throws, or even turn the exception into a normal return.
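The operations listed above can be sketched in plain Java. The following is a minimal model of the three links, not the sandbox's own API: the names `EventFlowSketch`, `Listener`, and `ReturnImmediately` are hypothetical, and for brevity the sketch only handles unchecked exceptions.

```java
import java.util.function.Function;

/** Hypothetical plain-Java model of the BEFORE / RETURN / THROWS links (not the sandbox API). */
class EventFlowSketch {

    /** Signal a listener raises in BEFORE to end the call early with its own result. */
    static final class ReturnImmediately extends RuntimeException {
        final Object value;
        ReturnImmediately(Object value) {
            super(null, null, false, false); // control-flow signal, no stack trace needed
            this.value = value;
        }
    }

    /** Callbacks for the three links; the defaults leave the call untouched. */
    interface Listener {
        default void before(Object[] args) {}
        default Object onReturn(Object result) { return result; }
        default Object onThrows(RuntimeException cause) { throw cause; }
    }

    /** Runs body with the listener wrapped around it (unchecked exceptions only). */
    static Object invoke(Function<Object[], Object> body, Listener listener, Object... args) {
        try {
            listener.before(args);               // BEFORE: may edit args or end the call
        } catch (ReturnImmediately r) {
            return r.value;                      // the original body is never executed
        }
        Object result;
        try {
            result = body.apply(args);
        } catch (RuntimeException cause) {
            return listener.onThrows(cause);     // THROWS: rethrow, replace, or recover
        }
        return listener.onReturn(result);        // RETURN: may replace the result
    }
}
```

A listener that overrides only `before` can rewrite the arguments or short-circuit the call, one that overrides `onReturn` can replace the result, and one that overrides `onThrows` can turn an exception into a normal return.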

What are the possible application scenarios of JVM-Sandbox?

JVM-Sandbox can help you do much more; it depends on how far your imagination goes.

Applications of JVM-Sandbox in Alibaba Group

Online fault drills

In 2017, the fault drill platform rebuilt its fault injection system on JVM-Sandbox in only one week. The rebuilt system significantly improved mount efficiency and mount success rate, greatly shortened fault drill time, and improved drill efficiency by dozens of times. The JVM-Sandbox-based fault drill platform is highly general and supports any JVM-based system, which greatly expands the scope of fault drills. Fault drills have since been rolled out at the group level.

Compared with the 2016 fault drill data, in 2017 the platform's BU coverage increased by 1.6 times, application coverage by 5 times, and scenario coverage by 37 times.

Dependency detection

In 2017, the automated testing platform for strong and weak dependencies was born. It provides capabilities such as dependency detection, strength analysis, dependency scanning, and fault injection. Its underlying capabilities were developed within one week on top of JVM-Sandbox. Using the module container, previously developed modules and newly added modules are mounted together to deliver the platform's functions.

Taobao uses the platform to sort out the strong and weak dependencies of its systems. More than 260 applications have been connected with one click, and the sorting is automatic and intelligent, with zero labor cost.

Server-side recording and isolated playback

Based on JVM-Sandbox, we developed the SS module, which works like a tape recorder and player. While recording, requests to middleware are recorded in order, and this "tape" is stored on the server. For isolated playback, we find the "tape" and read the responses directly from it instead of actually calling the middleware. This lets both read and write interfaces be replayed repeatedly, which is how isolated playback on the server is achieved.
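The "tape" idea can be illustrated with a small plain-Java sketch. This is only a model of the mechanism described above, not the SS module itself: `TapeSketch`, its `Mode` enum, and the string-based request and response types are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;

/** Hypothetical recorder/player for middleware calls (illustration only). */
class TapeSketch {
    enum Mode { RECORD, REPLAY }

    /** One recorded middleware call: the request and the response it produced. */
    static final class Entry {
        final String request;
        final String response;
        Entry(String request, String response) {
            this.request = request;
            this.response = response;
        }
    }

    private final List<Entry> tape = new ArrayList<>();
    private Deque<Entry> cursor = new ArrayDeque<>();
    private Mode mode = Mode.RECORD;

    /** Rewinds the tape and switches to playback. */
    void startReplay() {
        mode = Mode.REPLAY;
        cursor = new ArrayDeque<>(tape);
    }

    /** In RECORD mode calls the real middleware; in REPLAY mode answers from the tape. */
    String call(String request, Function<String, String> middleware) {
        if (mode == Mode.RECORD) {
            String response = middleware.apply(request);
            tape.add(new Entry(request, response));   // keep calls in their original order
            return response;
        }
        Entry next = cursor.poll();
        if (next == null || !next.request.equals(request)) {
            throw new IllegalStateException("tape does not match request: " + request);
        }
        return next.response;                         // middleware is never contacted
    }
}
```

Because playback never reaches the middleware, a recorded write request can be replayed any number of times without side effects.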

Isolated playback of online recordings greatly reduces business regression time and frees business test engineers from tedious data preparation and interface automation scripting. It also greatly expands coverage, bringing the regression scope closer to real users and making the scenarios richer.

Precise regression

Server-side recording and isolated playback effectively improved coverage and reduced the manual effort spent on automation scripts, but it also brought new problems. The number of scenarios recorded online is very large: a single system can record tens of thousands, hundreds of thousands, or even millions of requests, and many of them are duplicates. Identifying duplicate scenarios so that playback is effective and accurate became a new problem to solve.

In 2017, based on JVM-Sandbox, LineEvent was used to identify and mark line traces, which effectively improved the accuracy and efficiency of playback.
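One way such line traces could be used to drop duplicate recordings is sketched below. The source only states that traces are identified and marked; treating two recordings as duplicates when their executed-line sequences are identical is an assumption of this example, and `TraceDedupSketch` and `Recording` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical de-duplication of recordings by their line trace (illustration only). */
class TraceDedupSketch {

    /** A recorded request plus the "Class:line" entries it executed, in order. */
    static final class Recording {
        final String id;
        final List<String> lines;
        Recording(String id, List<String> lines) {
            this.id = id;
            this.lines = lines;
        }
    }

    /** Keeps the first recording seen for each distinct line trace, preserving input order. */
    static List<Recording> distinctByTrace(List<Recording> recordings) {
        Map<List<String>, Recording> byTrace = new LinkedHashMap<>();
        for (Recording r : recordings) {
            // List equality is element-wise, so identical traces map to the same key
            byTrace.putIfAbsent(new ArrayList<>(r.lines), r);
        }
        return new ArrayList<>(byTrace.values());
    }
}
```

Replaying only the recordings returned by `distinctByTrace` covers every distinct code path that was recorded while skipping requests that exercised exactly the same lines.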

JVM-Sandbox has been deployed across Alibaba Group. Different modules are loaded to implement different functions, and each function is loaded according to the needs of the BU and the application.

After these examples, you are probably curious about JVM-Sandbox: its core functions, what else it can do, and whether it is available to people outside Alibaba. The following sections cover these points.

Technical background of JVM-Sandbox

Fault drills, strong and weak dependency detection, recording and playback, and precise regression all come down to the same question: how to wrap control around Java methods and obtain runtime call links. In other words, they need an AOP framework. There are currently two common kinds of AOP solution: proxy and tracking. The proxy approach offers a unified API, which reduces repeated effort, but it does not take effect in real time: the system must be recompiled and restarted. The tracking approach takes effect dynamically and is highly flexible, but it has no unified API.
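To make the proxy approach concrete, here is a minimal sketch using the JDK's `java.lang.reflect.Proxy`. It is one common form of proxy-based AOP and is not taken from the article; `ProxyAopSketch`, `Greeter`, and `withLogging` are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical proxy-based AOP example built on the JDK dynamic proxy. */
class ProxyAopSketch {

    /** Hypothetical business interface. */
    interface Greeter {
        String greet(String name);
    }

    /** Collects what the advice observed. */
    static final List<String> LOG = new ArrayList<>();

    /** Wraps target in a proxy that logs before and after every interface call. */
    @SuppressWarnings("unchecked")
    static <T> T withLogging(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            LOG.add("before " + method.getName());
            Object result = method.invoke(target, args);
            LOG.add("after " + method.getName());
            return result;
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

The advice only runs for callers that hold the proxy rather than the original object. Wiring that in means changing the application's code or configuration and redeploying, which is the "does not take effect in real time" limitation described above.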

To solve the four problems above quickly, the AOP solution we need must have two features:

  • Dynamically pluggable: a tracking approach with a unified API
  • Non-intrusive: solves the JVM class isolation problem

Based on these requirements, we developed JVM-Sandbox.

Core functions of JVM-Sandbox

Built on the JVMTI specification and written in Java, JVM-Sandbox provides a plug-and-play module container for observing and changing the results of running code. It provides the following core functions:

  • Uses tracking technology behind a unified API to implement an AOP solution that does not require a restart.
  • Uses containers to isolate JVM classes, which solves the intrusiveness problem.
  • Provides a container management mechanism to manage the various containers.
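For Java code, JVMTI-based instrumentation is usually reached through the `java.lang.instrument` API, which the JDK implements on top of JVMTI. The sketch below shows the general shape of an attach-time agent under that assumption. The article does not say which entry points the sandbox uses, and `AgentSketch` and `WatchingTransformer` are hypothetical names. The transformer here only observes classes; a real tool would rewrite the bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical agent skeleton; a real agent class is public and named in the jar manifest. */
class AgentSketch {

    /** Transformer that only observes: it notes the target class and changes nothing. */
    static final class WatchingTransformer implements ClassFileTransformer {
        final String targetInternalName;                 // e.g. "com/example/OrderService"
        final List<String> seen = new CopyOnWriteArrayList<>();

        WatchingTransformer(String targetInternalName) {
            this.targetInternalName = targetInternalName;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (targetInternalName.equals(className)) {
                seen.add(className);                     // a real tool would rewrite classfileBuffer here
            }
            return null;                                 // null tells the JVM "no change"
        }
    }

    /** Entry point the JVM calls when the agent is attached to a running process. */
    public static void agentmain(String agentArgs, Instrumentation inst) throws UnmodifiableClassException {
        if (agentArgs == null) {
            return;                                      // this sketch expects a class name as argument
        }
        WatchingTransformer transformer = new WatchingTransformer(agentArgs.replace('.', '/'));
        inst.addTransformer(transformer, true);          // true: transformer may retransform classes
        for (Class<?> loaded : inst.getAllLoadedClasses()) {
            if (loaded.getName().equals(agentArgs) && inst.isModifiableClass(loaded)) {
                inst.retransformClasses(loaded);         // re-run transformers on the live class
            }
        }
    }
}
```

Because the transformer is applied to classes that are already loaded, the enhancement takes effect without restarting the target JVM, and removing the transformer and retransforming again restores the original bytecode.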

Core event model of JVM-Sandbox

Normal flow and intervention flow of the BEFORE, RETURN, and THROWS events

Isolation and communication


  • Through a custom SandboxClassLoader, the sandbox breaks the parent delegation convention and isolates its classes from the observed application. There is therefore no need to worry that loading the sandbox will pollute the application or conflict with it.
  • Each sandbox module is loaded by its own ModuleClassLoader, which keeps modules independent of one another, so that modules, the sandbox, and the application do not interfere with each other.
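"Breaking parent delegation" generally means a class loader that looks in its own class path first and only then asks its parent. The sketch below shows that pattern with a `URLClassLoader` subclass. It is a generic illustration, not the source of SandboxClassLoader or ModuleClassLoader, and `ChildFirstClassLoader` is a hypothetical name.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical child-first class loader: own URLs first, parent second. */
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            // Core JDK classes must always come from the parent chain.
            if (c == null && !name.startsWith("java.")) {
                try {
                    c = findClass(name);               // look in our own URLs first
                } catch (ClassNotFoundException notOurs) {
                    // not on our class path: fall through to the parent
                }
            }
            if (c == null) {
                return super.loadClass(name, resolve); // standard parent-first lookup
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With one such loader per module, two modules can ship different versions of the same library without seeing each other's classes or the application's.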


  • By injecting the Spy class into the Bootstrap ClassLoader, communication between the observed application and JVM-Sandbox is established.
  • JVM-Sandbox distributes events to each module, which completes the communication between JVM-Sandbox and the modules.

Module dynamic management

Code weaving

Calls to the Spy class are woven into the bytecode of the target class. The Spy methods then call JVM-Sandbox methods through reflection.

Use BeforeEvent as an example to display the code.
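As a rough source-level illustration of the idea (the real weaving happens on bytecode, and this is not the framework's generated code), a method with a BEFORE hook might behave like the sketch below. All class and method names here are hypothetical stand-ins, including `SpySketch`, which only mimics the role of the Spy class.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the bridge class that woven code calls into. */
class SpySketch {
    private static volatile Object handler;   // sandbox-side object, loaded by another class loader
    private static volatile Method onBefore;  // looked up once, invoked reflectively

    /** Registers the sandbox-side receiver of BEFORE events. */
    static void init(Object sandboxHandler) {
        try {
            onBefore = sandboxHandler.getClass().getMethod("onBefore", String.class, Object[].class);
            handler = sandboxHandler;
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("handler lacks onBefore(String, Object[])", e);
        }
    }

    /** Called from woven code at method entry. */
    static void spyOnBefore(String methodName, Object[] args) {
        Method m = onBefore;
        if (m == null) {
            return;                            // sandbox not attached: do nothing
        }
        try {
            m.invoke(handler, methodName, args);
        } catch (Exception ignored) {
            // swallow: a faulty module should not break the observed application
        }
    }
}

/** Hypothetical sandbox-side receiver of BEFORE events. */
class EventHandlerSketch {
    final List<String> events = new ArrayList<>();

    public void onBefore(String methodName, Object[] args) {
        events.add(methodName + "/" + args.length);
    }
}

/** What a business method might look like after weaving. */
class OrderServiceSketch {
    String create(String item) {
        SpySketch.spyOnBefore("create", new Object[] {item}); // woven-in call
        return "order:" + item;                               // original method body
    }
}
```

The business class depends only on the bridge class, and the bridge reaches the sandbox through reflection, so the application never needs the sandbox's own classes on its class path.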

Overall architecture

The sandbox consists of three core functional components.

  • The code weaving component rewrites bytecode with the preset code and makes the change take effect.
  • A second component is responsible for distributing events and executing method flow control.
  • A third component is responsible for controlling and managing each module of the sandbox.

At the bottom layer, the sandbox provides an HTTP server (Jetty). sandbox.sh communicates with the sandbox over HTTP, and the server provides each module with APIs based on the HttpServlet and WebSocket specifications. Each module can reuse the sandbox's server for control and interaction.

Open source and co-construction

1. JVM-Sandbox is open source, and we hope more people will help improve its functions. GitHub address:

2. We hope others will work with us to improve the functions of JVM-Sandbox.

3. We hope more people will think of new application scenarios and open-source them for everyone to use.

To sum up, JVM-Sandbox is an AOP solution written in Java. It gives developers a platform for quickly implementing bytecode enhancement tools. Its module management lets modules be reused and combined as much as possible, which reduces repeated investment.

Now that JVM-Sandbox is open source, we expect more people to join in extending and optimizing it so that it adapts to more open-source middleware and JVMs.

We hope more developers will apply their ingenuity to build more and better upper-layer modules for themselves and others to use. We also hope existing modules will be put to good use in assembling new tool platforms and application scenarios.

We look forward to building and applying JVM-Sandbox together.

The original text was published on 2018-01-19.

Author: Xu Dongchen

This article is from "Taobao technology", a partner of the Yunqi community. For more information, follow the "Taobao technology" WeChat official account.
