
A Detailed Explanation of JVM Garbage Collector G1&ZGC

This article shares some execution processes of garbage collectors, providing a useful reference for those interested in the topic.

By Jushi

1

Introduction to the Garbage Collector

G1 (Garbage First) Garbage Collector

The G1 garbage collector was introduced in JDK 7 and matured in JDK 8. In JDK 9 it replaced the Parallel collector as the default garbage collector in server mode. Compared with CMS, G1 not only keeps memory tidy (it collects by copying, so it leaves little fragmentation) but also offers predictable pauses: it tries to keep each garbage collection within a target of N milliseconds.

G1 can collect both the young generation and the old generation, in what are called its Young GC mode and Mixed GC mode. Young GC collects only young generation regions, while Mixed GC collects the young generation together with selected old generation regions. This ability comes from G1's unique memory layout. Memory is no longer strictly split into a young area and an old area; instead, the heap is managed in units called Regions. G1 tracks each Region, maintains a priority list of Regions, and selects suitable Regions to collect at the right time. This Region-based layout is the basis for several notable design ideas that balance pause time against throughput. Next, we will provide a detailed explanation of G1's garbage collection process and its notable design features.

Region Memory Division

G1 divides the heap into a series of equal-sized memory regions, called Regions (hereafter referred to as regions). The region size is a power of two between 1 MB and 32 MB. Following the idea of generational garbage collection, regions are logically divided into Eden, Survivor, and Old. Each region can be an Eden region, a Survivor region, or an Old region, but at any given time it has only one of these roles. The number of regions in each role is not fixed, which means that the memory of each generation is not fixed either. The regions that make up a generation are only logically grouped and need not be physically contiguous, which is very different from earlier collectors, where the young and old generations each occupied one physically contiguous range.

In addition to the Eden, Survivor, and Old regions, G1 has a special kind of region, the Humongous region, which is used to store large objects. When the size of an object exceeds half of the region size, the object is placed in a Humongous region. There are two reasons for this. If a short-lived large object were collected with the copying algorithm, the copying cost would be very high. If it were placed directly in an Old region, an object that should only exist for a short time would occupy old generation memory, which hurts garbage collection performance. If the size of an object exceeds the size of a single region, G1 has to find a run of contiguous Humongous regions to store it. Sometimes a Full GC has to be triggered because no such contiguous run can be found.
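The sizing rules above can be sketched in a few lines of Java. This is only an illustration of the arithmetic described in the text, not HotSpot code; the class and method names (`HumongousCheck`, `isHumongous`, `regionsNeeded`) are hypothetical.

```java
final class HumongousCheck {
    static final long MB = 1024L * 1024L;

    // Region sizes are powers of two between 1 MB and 32 MB, as described above.
    static boolean isValidRegionSize(long bytes) {
        return bytes >= MB && bytes <= 32 * MB && Long.bitCount(bytes) == 1;
    }

    // An object that exceeds half of the region size goes to a Humongous region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Number of contiguous regions needed to hold a humongous object (ceiling division).
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 4 * MB;
        System.out.println("3 MB object humongous? " + isHumongous(3 * MB, region));
        System.out.println("regions for 9 MB object: " + regionsNeeded(9 * MB, region));
    }
}
```

With a 4 MB region, a 9 MB object needs three contiguous regions, which is why a fragmented heap can fail to place it.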

2

Region Internal Structure

Card Table

Card Table is the internal structure of a Region. Each region is divided into several memory blocks, called cards. These card sets are called card tables.

For example, in the figure below, the memory area in region1 is divided into 9 cards, and the set of these 9 cards is the card table.

3

Remembered Set

In addition to the card table, each region contains a Remembered Set, or RSet for short. The RSet is essentially a hash table. Each key is the starting address of another region that references the current region. The corresponding value is the set of card indexes in the current region that are referenced by the region identified by the key.

Why does the RSet exist? It solves the problem of cross-generation references. Imagine that a young generation object is referenced by an old generation object. To find this young generation object through the reference chain, we would have to traverse the old generation object starting from GC Roots. Traversing this way effectively means traversing all objects. We only want to collect the young generation, yet we end up traversing every object, which is very inefficient.

With the RSet, Young GC follows the reference chain from GC Roots, and whenever an old generation object appears on the chain it simply stops following that chain. When GC Root tracing finishes, we know which young generation objects are alive, apart from those referenced across generations. The collector then traverses the RSet of each young generation region. If the RSet contains a key for an old generation region, the objects in the cards given by the corresponding value are marked as alive, which marks the young generation objects referenced across generations.

Of course, there will be a problem. If some old generation objects are objects that should be collected, but they still reference the young generation across generations, the young generation objects that should have been collected will avoid this round of young generation collecting. These objects can only be collected during the subsequent old generation garbage collection, known as Mixed GC. This is one of the reasons why the GC accuracy of G1 is relatively low.

4

Take this figure as an example: both region1 and region2 reference objects in region3. Therefore, the RSet of region3 contains two keys, which are respectively the starting addresses of region1 and region2.

When the RSet of region3 is scanned, it is found that the region with the key 0x6a is an Old region. If the objects corresponding to the 3rd and 5th cards are not marked as reachable at this time, they will be marked again according to RSet.

Similarly, if the region corresponding to the key 0x9b is a Young region, then the objects of card 0 and card 2 will not be marked.
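The region3 example can be expressed as a small Java sketch. It models the RSet as the hash table described above (key: start address of the referencing region, value: referenced card indexes). The class name and methods are hypothetical, and the real HotSpot data structure is considerably more elaborate.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RememberedSetSketch {
    // key: start address of a region that references this region
    // value: indexes of the cards in this region that are referenced
    private final Map<Long, Set<Integer>> entries = new HashMap<>();

    void record(long fromRegionStart, int cardIndex) {
        entries.computeIfAbsent(fromRegionStart, k -> new TreeSet<>()).add(cardIndex);
    }

    // During Young GC, only cards referenced from Old regions need to be marked via the RSet.
    Set<Integer> cardsToMark(Set<Long> oldRegionStarts) {
        Set<Integer> result = new TreeSet<>();
        for (Map.Entry<Long, Set<Integer>> e : entries.entrySet()) {
            if (oldRegionStarts.contains(e.getKey())) {
                result.addAll(e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RememberedSetSketch region3 = new RememberedSetSketch();
        region3.record(0x6aL, 3);
        region3.record(0x6aL, 5);
        region3.record(0x9bL, 0);
        region3.record(0x9bL, 2);
        // Only 0x6a is an Old region, so only its cards are marked.
        System.out.println(region3.cardsToMark(Collections.singleton(0x6aL)));
    }
}
```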

Young GC Process

After understanding the internal structure of the region, let's look at the specific process of G1's Young GC.

1.  Stop the world (STW). The entire Young GC process is performed within an STW pause, which is why Young GC can reclaim all Eden regions. The only ways to control the overhead of Young GC are to reduce the number of Young regions, which means reducing the size of the young generation memory, and to increase concurrency by having multiple threads perform GC simultaneously to minimize STW time.

2.  Scan GC Roots. Here, the scanned GC Roots are those in the general sense, meaning they directly point to objects in the young generation. If a GC Root directly points to an object in the old generation, the scan will stop at this step and not proceed further. Young generation objects referenced by old generation objects will be identified in the following step by using the RSet, where keys point to the card tables of the old generation. This avoids scanning the entire large heap of the old generation, thereby improving efficiency. This is also why the RSet can avoid scanning the entire old generation.

3.  Drain the Dirty Card Queue and update the RSet. The RSet records which objects are referenced across generations by the old generation, so whenever a young generation object becomes referenced by an old generation object, this must be recorded in the RSet. However, the RSet is not updated immediately when the reference changes. Whenever the old generation references a young generation object, the card address of this reference is first put into the Dirty Card Queue (which is thread-private; when a thread-private Dirty Card Queue is full, it is transferred to the single global Dirty Card Queue). The reason is that updating the RSet directly on every reference update would cause multithreading contention, because assignments are very frequent, and this would hurt performance. Updating the RSet is therefore left to the Refinement threads. The fill level of the global Dirty Card Queue is divided into four zones.

5

White: Nothing happens.

Green: Once the number of queued dirty card buffers exceeds the threshold set by -XX:G1ConcRefinementGreenZone=N, Refinement threads start to be activated. They take dirty cards from the global and thread-private queues and update the corresponding RSet.

Yellow: Dirty cards are being generated too quickly. Once the queue exceeds the threshold set by -XX:G1ConcRefinementYellowZone=N, all Refinement threads are activated.

Red: Dirty cards are still being generated too quickly, so application threads are also made to help drain the queue, which slows the application threads down and reduces the rate at which dirty cards are produced.

4.  Scan the RSet for all references from Old regions into Young regions. This step determines which objects in the Young regions are alive.

5.  Copy objects to the Survivor region or promote them to the Old region.

6.  Handle reference queues, including soft references, weak references, and phantom references.

The above is the full process of Young GC.
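The four zones of the Dirty Card Queue in step 3 can be pictured as a simple threshold check. The sketch below assumes each zone is entered when the number of queued buffers reaches a threshold; the class name and the threshold values used in `main` are hypothetical and are not HotSpot defaults.

```java
final class RefinementZones {
    enum Zone { WHITE, GREEN, YELLOW, RED }

    // Classify the queue's fill level against three ascending thresholds (assumed model).
    static Zone classify(int queuedBuffers, int green, int yellow, int red) {
        if (queuedBuffers >= red) return Zone.RED;       // application threads help drain
        if (queuedBuffers >= yellow) return Zone.YELLOW; // all refinement threads active
        if (queuedBuffers >= green) return Zone.GREEN;   // refinement threads start working
        return Zone.WHITE;                               // nothing happens
    }

    public static void main(String[] args) {
        int green = 4, yellow = 8, red = 12; // illustrative values only
        for (int n : new int[] {0, 5, 9, 20}) {
            System.out.println(n + " buffers -> " + classify(n, green, yellow, red));
        }
    }
}
```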

Three-Color Marking Algorithm

After understanding the process of Young GC, we will learn about G1's garbage collection process for the old generation, known as Mixed GC. Before formally introducing it, let's first explain the concrete implementation of the reachability analysis algorithm: the three-color marking algorithm. We will also cover the defect of this algorithm and how G1 addresses it.

Reachability analysis requires marking whether each object is reachable, and we use different colors to record that state. An object is reachable if it can be reached by traversal starting from GC Roots. This traversal is called checking.

White: The object is not checked.

Gray: The object is checked, but its member fields (the other objects it references) are not yet checked. A gray object is known to be reachable.

Black: The object is checked, and its member fields are also checked.

Then the entire checking process involves starting from GC Roots, continuously traversing objects, and marking reachable objects as black. When the marking is complete, the objects that are still white are the ones that have not been traversed, meaning they are unreachable objects.
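As a rough illustration of this checking process, here is a minimal, single-threaded Java sketch of three-color marking over an object graph given as a map. It ignores concurrency entirely, and the class and object names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TriColorMarking {
    enum Color { WHITE, GRAY, BLACK }

    // graph: object name -> names of the objects its fields reference
    static Map<String, Color> mark(Map<String, List<String>> graph, List<String> roots) {
        Map<String, Color> color = new HashMap<>();
        for (String obj : graph.keySet()) {
            color.put(obj, Color.WHITE);          // nothing is checked yet
        }
        Deque<String> gray = new ArrayDeque<>();
        for (String root : roots) {
            color.put(root, Color.GRAY);          // checked, fields still pending
            gray.add(root);
        }
        while (!gray.isEmpty()) {
            String obj = gray.poll();
            for (String child : graph.getOrDefault(obj, Collections.emptyList())) {
                if (color.getOrDefault(child, Color.WHITE) == Color.WHITE) {
                    color.put(child, Color.GRAY);
                    gray.add(child);
                }
            }
            color.put(obj, Color.BLACK);          // object and all its fields are checked
        }
        return color;                             // objects still WHITE are unreachable
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("root", List.of("objectA", "objectB"));
        graph.put("objectA", List.of());
        graph.put("objectB", List.of("objectD"));
        graph.put("objectC", List.of());
        graph.put("objectD", List.of());
        System.out.println(mark(graph, List.of("root")));
    }
}
```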

Consider an example.

In the first round of checking, all GC Roots are found. GC Roots are marked as gray, and some GC Roots are marked as black because they have no member fields.

6

In the second round of checking, objects referenced by GC Roots are checked and marked as gray.

7

In the third round of checking, loop the previous steps and check the child fields of objects marked as gray. Since it is assumed that there are three rounds of checking, this will be the final checking. At the end of this checking, the object that is still white is the object that can be collected. That is, objectC in the figure.

8

The above description outlines the process of one round of the three-color marking algorithm. However, this is an ideal scenario. During the marking process, the marking threads run alternately with the user threads, so changes in references may occur during marking. Imagine that between the second and third rounds of checking, a change in references occurs, and objectD is no longer referenced by objectB but is now referenced by objectA. At this point, objectA's members have already been checked, but objectB's member fields have not been checked yet. At this point, objectD will never be checked again. This leads to a missed marking.

Another case of missed marking occurs when a new object is created and is held by an object that has already been marked as black, for example, newObjectF in the figure. Because the black object is already considered checked, the newly created object will not be checked, which can also lead to missed marking. The solutions to these two kinds of missed marking are explained in detail below.

9

Missed marking of the existing object

In the example, the missed marking of objectD occurs under the following conditions:

  1. The gray object no longer points to the white object, that is, objectB.d = null.
  2. The black object points to the white object, that is, objectA.d = objectD.

To solve the missed marking, we need to break either of these two conditions. This leads us to two solutions: Snapshot At The Beginning and Incremental Update.

1.  Snapshot At The Beginning (SATB)

When a reference from a gray object to a white object is deleted, the deleted reference is recorded, and the object it pointed to is treated as alive by default. In other words, the reference relationships used for the whole checking process are those that existed at the moment checking started.

2.  Incremental Update

When a reference to a white object is added to a black object, record the black object whose reference changed, turn it back into a gray object, and mark it again. This is the solution adopted by CMS (which is also built on the three-color marking algorithm).

Both solutions require that the changed reference be recorded. We therefore need a way to record changes in references: the write barrier. Its effect is similar to AOP, but it is implemented at a much lower level than the AOP we use in applications. You can regard it as a piece of code inside the JVM that is triggered whenever a reference changes and records the change.
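To make the two strategies concrete, the following Java sketch simulates a write barrier around a reference store. It is a toy model under simplifying assumptions (one reference field per object, a boolean standing in for the black color); `WriteBarrierSketch`, `satbStore`, and `incrementalUpdateStore` are hypothetical names, not JVM APIs.

```java
import java.util.ArrayList;
import java.util.List;

final class WriteBarrierSketch {
    static final class Obj {
        final String name;
        Obj field;      // a single reference field, for simplicity
        boolean black;  // true once the marker has finished scanning this object
        Obj(String name) { this.name = name; }
    }

    final List<Obj> satbQueue = new ArrayList<>();    // overwritten referents (SATB)
    final List<Obj> rescanQueue = new ArrayList<>();  // holders to scan again (incremental update)

    // SATB: before the store, record the reference that is about to be deleted.
    void satbStore(Obj holder, Obj newValue) {
        if (holder.field != null) {
            satbQueue.add(holder.field);  // treated as alive for this marking cycle
        }
        holder.field = newValue;
    }

    // Incremental update: after the store, a black holder that gained a reference turns gray again.
    void incrementalUpdateStore(Obj holder, Obj newValue) {
        holder.field = newValue;
        if (holder.black && newValue != null) {
            holder.black = false;
            rescanQueue.add(holder);
        }
    }

    public static void main(String[] args) {
        WriteBarrierSketch wb = new WriteBarrierSketch();
        Obj a = new Obj("objectA"), b = new Obj("objectB"), d = new Obj("objectD");
        b.field = d;
        a.black = true;
        wb.satbStore(b, null);            // objectB.d = null
        wb.incrementalUpdateStore(a, d);  // objectA.d = objectD
        System.out.println("SATB recorded: " + wb.satbQueue.get(0).name);
        System.out.println("Rescan: " + wb.rescanQueue.get(0).name);
    }
}
```

Either record alone is enough to keep objectD from being missed, which is why a collector needs only one of the two strategies.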

Missed marking of the newly generated object

The solution to the missed marking of the newly generated object is simpler. In the incremental update mode, this problem is inherently solved. In the SATB mode, we actually determine a scope of checking at the very beginning. Therefore, we can put the newly generated objects outside this checking scope, assuming by default that the newly generated objects are alive. Of course, this process has to be explained in combination with the card table to be more specific. We will talk more about this in the process of Mixed GC.

SATB

Snapshot At The Beginning: In each region, G1 keeps two top-at-mark-start (TAMS) pointers, prevTAMS and nextTAMS, which correspond to two positions on the card table that delimit a range. The GC treats objects allocated above the nextTAMS position as alive. This is an implicit marking (it involves the details of the G1 Mixed GC phases, which are quite complex and will be discussed in detail next). This way of resolving missed marking has a cost: white objects that should be collected may survive until the next GC cycle, producing floating garbage. Because SATB is relatively imprecise, floating garbage is common.

10

Mixed GC

In the previous sections, we learned about the marking algorithm used by G1, the three-color marking algorithm, and the solutions to its issues. Next, we will explain the detailed process of Mixed GC and how G1's card table is used to resolve those issues.

Mixed GC can be divided into two major steps: global concurrent marking and evacuation. The global concurrent marking process involves the SATB marking process, which we will explain in detail.

Global Concurrent Marking

The global concurrent marking of the G1 garbage collector is divided into multiple phases.

1. Initial Marking

This phase involves STW, marking objects directly reachable from the GC Root. This step is accompanied by Young GC. The reason for performing Young GC is to handle cross-generation references. The old generation objects may be referenced by the young generation, but the old generation cannot use RSet to resolve cross-generation references. Additionally, Young GC also involves STW. By sharing the STW time in the first step of Young GC, the overall STW time can be minimized.

This step also initializes some parameters, assigning the bottom pointer to the prevTAMS pointer, the top pointer to the nextTAMS pointer, and clearing the nextBitMap pointer. These three variables will be needed for the subsequent concurrent marking.

These variables might be confusing, so let me explain: top, prevTAMS, and nextTAMS are all pointers to the card table. Their purpose is to identify which objects can be collected and which are alive, according to the SATB mechanism. The nextBitMap, on the other hand, is an array that records which objects in the card table are alive. Of course, since we haven't started the marking process yet, the records in nextBitMap are currently empty.

11

2. Root Region Scan

After the STW phase, this phase will scan the Survivor region (the Survivor regions are considered the root regions) and mark all old generation objects referenced by objects in the Survivor region. This is also why Young GC is needed in the previous step: to handle cross-generation references, it is necessary to know which old generation objects are referenced by objects in the Survivor regions. Since this process requires scanning the Survivor regions, Young GC cannot occur during this time. If the young generation is exhausted during the scanning process, it must wait until the scan is complete before starting Young GC. This step takes a short time.

3. Concurrent Marking

This phase analyzes the reachability of objects in the heap, starting from GC Roots, to find the surviving objects in each region, and it takes a long time. That is the rough outline, but the actual process is very complicated because we need to consider how each pointer changes under the SATB mechanism.

Assuming no reference changes after the root region scan, the status of a region remains the same as it was after the initial marking step. If objects continue to be allocated at this point, they are allocated after nextTAMS, and the top pointer advances accordingly.

12

Since this step runs concurrently with the mutator (user threads), the root node scanning is actually performed on a snapshot. The snapshot covers the range from prevTAMS to nextTAMS (note that while the snapshot range is fixed, the objects between prevTAMS and nextTAMS can change during the scanning process).

When new objects are allocated in a region, they are placed after nextTAMS, and the top pointer advances. Any object between nextTAMS and top is considered implicitly alive.
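The pointer movements described so far can be modeled with a small Java sketch. It treats addresses as plain offsets within one region; the class `RegionTams` and its methods are hypothetical and only mirror the rules stated in the text (prevTAMS = bottom and nextTAMS = top at initial marking, new allocations advance top).

```java
final class RegionTams {
    final long bottom;
    long prevTams;
    long nextTams;
    long top;

    RegionTams(long bottom, long top) {
        this.bottom = bottom;
        this.top = top;
        this.prevTams = bottom;
        this.nextTams = bottom;
    }

    // Initial marking: prevTAMS = bottom, nextTAMS = current top.
    void startMarking() {
        prevTams = bottom;
        nextTams = top;
    }

    // Bump-pointer allocation: the new object starts at the old top, and top advances.
    long allocate(long size) {
        long address = top;
        top += size;
        return address;
    }

    // Objects inside [prevTAMS, nextTAMS) belong to the snapshot and must be marked explicitly.
    boolean inSnapshot(long address) {
        return address >= prevTams && address < nextTams;
    }

    // Objects inside [nextTAMS, top) were allocated during marking and are implicitly alive.
    boolean isImplicitlyLive(long address) {
        return address >= nextTams && address < top;
    }

    public static void main(String[] args) {
        RegionTams region = new RegionTams(0, 100);
        region.startMarking();
        long fresh = region.allocate(16);
        System.out.println("new object implicitly live? " + region.isImplicitlyLive(fresh));
        System.out.println("old object in snapshot? " + region.inSnapshot(50));
    }
}
```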

During this period, references between objects in the range from prevTAMS to nextTAMS may change. For example, a white object might be held by a black object. This is a flaw of the three-color marking algorithm, and it necessitates updating the status of the white object. In this case, objects with changed references will be put into the satb_mark_queue. The satb_mark_queue is a queue that records all white objects with changed references.

The satb_mark_queue here refers to the global queue. In addition to the global queue, each thread also has its own satb_mark_queue. The global queue is built by merging the threads' satb_mark_queues: when a thread's satb_mark_queue is full, it is transferred to the global satb_mark_queue. During the concurrent marking phase, the size of the global satb_mark_queue is checked regularly. If it exceeds a certain threshold, the concurrent marker thread takes all objects out of both the global satb_mark_queue and the threads' satb_mark_queues and marks them. The child fields of these objects are then pushed onto the marking stack, waiting to be marked next. This process is similar to the handling of the global dirty card queue.
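A minimal Java sketch of this two-level queue follows. It assumes a fixed per-thread capacity and uses strings in place of object references; `SatbQueues`, `Local`, and `drainAll` are hypothetical names, and real thread handling is omitted.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class SatbQueues {
    private final int localCapacity;
    private final List<String> global = new ArrayList<>();

    SatbQueues(int localCapacity) {
        this.localCapacity = localCapacity;
    }

    // One instance per application thread.
    final class Local {
        private final List<String> buffer = new ArrayList<>();

        void enqueue(String obj) {
            buffer.add(obj);
            if (buffer.size() >= localCapacity) {
                flush();  // a full thread-local queue is handed to the global queue
            }
        }

        void flush() {
            synchronized (global) {
                global.addAll(buffer);
            }
            buffer.clear();
        }
    }

    Local newLocal() {
        return new Local();
    }

    int globalSize() {
        synchronized (global) {
            return global.size();
        }
    }

    // Flush every thread-local queue, then hand all recorded objects to the marker.
    List<String> drainAll(Collection<Local> locals) {
        for (Local local : locals) {
            local.flush();
        }
        synchronized (global) {
            List<String> drained = new ArrayList<>(global);
            global.clear();
            return drained;
        }
    }

    public static void main(String[] args) {
        SatbQueues queues = new SatbQueues(2);
        Local local = queues.newLocal();
        local.enqueue("objectD");
        System.out.println("global size after 1 enqueue: " + queues.globalSize());
        local.enqueue("objectE");
        System.out.println("global size after 2 enqueues: " + queues.globalSize());
    }
}
```

Entries that are still sitting in a thread-local buffer when concurrent marking ends are exactly what the later Remark phase has to pick up.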

13

As the concurrent marking phase ends, the nextBitMap records which objects are alive, and therefore which can be collected. However, not every thread's satb_mark_queue may have been transferred to the global satb_mark_queue yet, because this merging also happens concurrently. That is why the next step is needed.

4. Remark

This phase marks the objects whose references changed during the concurrent marking phase: the entries still held in each thread's satb_mark_queue are moved into the global satb_mark_queue and processed. An STW pause is necessary in this phase to ensure correct marking.

5. Cleanup

Evaluate and sort the value and cost of collecting each region. Devise a collection plan based on the user's expected GC pause time. Select certain Old regions and all Young regions to form the Collection Set (CSet). Add regions with no objects to the set of regions available for object allocation. Note that this step is not about clearing; it identifies which regions are worth collecting without copying any objects. Once the evaluation is complete, a global concurrent marking cycle is essentially finished. At this point, the nextTAMS pointer is assigned to prevTAMS, and nextBitMap is assigned to prevBitMap. It may seem odd to record the current marking results into prevBitMap. Can these marking results be reused the next time this region is checked?

We know that G1 can adjust the capacities of the Eden (E) and Old (O) regions based on memory usage changes. If certain regions grow rapidly in capacity, it indicates that memory access in these regions is more frequent, and they are likely to reach their capacity limits sooner. Therefore, during the next copy and transfer, objects in these regions will be prioritized for relocation to larger regions.

After the marking phase, the remaining task is evacuation: copying the surviving objects to empty regions and then reclaiming the regions they came from. This step copies and clears with multiple threads, and the whole of it runs inside an STW pause. This is also one of the advantages of G1: as long as there is a free region, garbage collection can be completed, so there is no need to reserve a large amount of memory as with CMS.
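The Collection Set selection from the Cleanup phase can be illustrated with a greedy sketch in Java. The cost model here (a fixed estimated cost in milliseconds per region, ranked by reclaimable bytes per millisecond) is an assumption made for illustration and is not G1's actual prediction model; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CollectionSetSketch {
    static final class Region {
        final String name;
        final long reclaimableBytes;  // garbage found by marking
        final double costMs;          // estimated evacuation cost (assumed to be known)

        Region(String name, long reclaimableBytes, double costMs) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.costMs = costMs;
        }
    }

    // Pick Old regions with the best garbage-per-millisecond ratio until the pause budget is used up.
    static List<Region> chooseOldRegions(List<Region> candidates, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble(
                (Region r) -> r.reclaimableBytes / r.costMs).reversed());
        List<Region> chosen = new ArrayList<>();
        double used = 0;
        for (Region r : sorted) {
            if (used + r.costMs <= pauseBudgetMs) {
                chosen.add(r);
                used += r.costMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> old = new ArrayList<>();
        old.add(new Region("A", 100, 10));
        old.add(new Region("B", 90, 3));
        old.add(new Region("C", 10, 5));
        for (Region r : chooseOldRegions(old, 8)) {
            System.out.println("collect " + r.name);
        }
    }
}
```

In this toy run, region A holds the most garbage but is too expensive for an 8 ms budget, so cheaper regions are chosen instead.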

G1 Reviews

From G1's design perspective, it uses a significant number of additional structures to store reference relationships, which helps reduce the time spent on marking during garbage collection. However, these additional structures also lead to memory waste, which, in extreme cases, can consume up to an additional 20% of memory. The region-based memory division scheme makes memory allocation more complex, but it also has its advantages. That is, less fragmentation is generated after memory reclamation, thereby reducing the frequency of Full GCs.

Based on experience, on most large memory servers (6GB and above), G1 outperforms CMS in both throughput and STW time.

ZGC

After understanding G1, let's take a look at the renowned ZGC, currently touted as a "fully concurrent, low-latency garbage collector capable of keeping pause time within 10 ms." ZGC stands for Z Garbage Collector, where the Z is not an abbreviation of anything. ZGC started as an experimental feature in JDK 11 and became production-ready in JDK 15. Before explaining the ZGC garbage collection process, let's first introduce the technical design of ZGC, which will help us understand how it works.

Page Memory Layout

ZGC also divides the heap memory into a series of memory partitions called pages (the official documentation uses the term pages, so we use it here as well to avoid confusion with G1's regions).

This management approach is very similar to G1; however, ZGC's pages have no generational role. More precisely, ZGC does not support generations as of JDK 17 (generational support arrived later, in JDK 21). The reason is that designing ZGC to support generational collection from the start would have been too complex. ZGC was therefore initially designed as a non-generational collector, with generational support left for later iterations. There are three types of ZGC pages.

Small page

Capacity 2 MB. Stores objects of up to 256 KB.

Medium page

Capacity 32 MB. Stores objects larger than 256 KB and up to 4 MB.

Large page

The capacity is not fixed but must be an integer multiple of 2 MB. Stores objects larger than 4 MB, and each large page holds only one object.
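The page selection rule can be sketched as follows in Java. How an object of exactly 256 KB or exactly 4 MB is classified is an assumption in this sketch (it is placed in the smaller page type); the class and method names are hypothetical.

```java
final class ZPageSketch {
    enum PageType { SMALL, MEDIUM, LARGE }

    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    // Assumption: the 256 KB and 4 MB limits are inclusive for the smaller page type.
    static PageType pageTypeFor(long objectBytes) {
        if (objectBytes <= 256 * KB) return PageType.SMALL;
        if (objectBytes <= 4 * MB) return PageType.MEDIUM;
        return PageType.LARGE;
    }

    // A large page holds one object and is sized to the next multiple of 2 MB.
    static long largePageSize(long objectBytes) {
        long granule = 2 * MB;
        return ((objectBytes + granule - 1) / granule) * granule;
    }

    public static void main(String[] args) {
        System.out.println(pageTypeFor(100 * KB));
        System.out.println(pageTypeFor(1 * MB));
        System.out.println(pageTypeFor(5 * MB) + ", page size " + largePageSize(5 * MB) / MB + " MB");
    }
}
```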

14

Memory Reclamation Algorithm

ZGC's collection algorithm also follows the steps of first finding garbage and then collecting it, with the most complex part being the process of identifying the garbage. To understand ZGC's concurrent collection process, it is essential to first grasp three concepts.

Colored Pointer

Colored Pointer is a technique that lets pointers carry additional information. On a 64-bit operating system, a memory address is 64 bits wide. However, because actual physical memory is limited, not all 64 bits are needed. This is easier to understand if you are familiar with Linux virtual memory management, so here is a brief explanation. The "memory address" that a program sees is not the actual physical address on the memory chips. Instead, it is a virtual address created by the operating system. This technique is known as virtual memory management. Virtual memory is used on almost all Linux servers, except for a few embedded devices whose memory is too small to need it. With virtual memory, two virtual addresses can map to the same physical address. The details of virtual memory are not our focus here; a general understanding is enough.

15

For the JVM, an object's address uses only the first 42 bits, while bits 43 to 46 are used to store additional information, indicating which GC phase the object is in under ZGC. The objective reason for using only 46 bits is that the Linux system supports up to 46 bits of physical address space, which equates to 64TB of memory. Using more memory requires additional settings in Linux. But this memory setting is enough on mainstream servers.

In the layout of a reference address, the 43rd bit of an object reference is the marked0 flag, the 44th bit is the marked1 flag, the 45th bit is the remapped flag, and the 46th bit is the finalizable flag. Coloring a pointer means setting the corresponding bit to 1. Only one of the marked0, marked1, and remapped bits can be set at any one time. These flags indicate which phase of the GC the object is in. In the detailed ZGC process below, we will see how colored pointers help the GC. Switching a reference address from one flag to another is also referred to as self-healing of pointers.
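The bit layout described above can be written down with ordinary `long` arithmetic. This Java sketch only illustrates the layout as the text describes it (42 address bits followed by four flag bits); the constant and method names are hypothetical and are not taken from the ZGC source.

```java
final class ColoredPointerSketch {
    static final int ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;  // low 42 bits: object address

    static final long MARKED0 = 1L << 42;      // 43rd bit
    static final long MARKED1 = 1L << 43;      // 44th bit
    static final long REMAPPED = 1L << 44;     // 45th bit
    static final long FINALIZABLE = 1L << 45;  // 46th bit

    // Strip the color bits and return the plain address.
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    // Replace whatever color the pointer had with the given one.
    static long recolor(long pointer, long color) {
        return address(pointer) | color;
    }

    static boolean hasColor(long pointer, long color) {
        return (pointer & color) != 0;
    }

    public static void main(String[] args) {
        long p = recolor(0x1234L, REMAPPED);
        long q = recolor(p, MARKED1);
        System.out.println(Long.toHexString(p) + " -> " + Long.toHexString(q));
        System.out.println("same address: " + (address(p) == address(q)));
    }
}
```

Because the color lives in the pointer rather than in the object header, changing an object's state only requires rewriting the references to it.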

16

Read Barrier

The read barrier is a technology where the JVM inserts a small piece of code into the application code, triggered only when a thread reads an object reference from the heap. It functions similarly to a write barrier but is triggered when an object is read.

The read barrier is primarily used to rewrite the namespace of each address, and it is also where self-healing of pointers happens. Self-healing works as follows: when a thread accesses an object that is currently being relocated, the read barrier intercepts the access, forwards it to the newly copied object via the forwarding table, and updates the reference so that it points directly at the new object. This capability significantly reduces ZGC's STW time, because other collectors need to fix all pointers to an object before access can resume. This may be unclear at first, but it will make sense after we go through the detailed ZGC process.
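A toy version of this self-healing step is shown below in Java. It models a heap slot as a small object holding an address and the forwarding table as a map from old to new addresses; `LoadBarrierSketch`, `Slot`, and `load` are hypothetical names, and the real barrier is generated machine code rather than a method call.

```java
import java.util.HashMap;
import java.util.Map;

final class LoadBarrierSketch {
    // A heap location (for example an object field) that holds a reference.
    static final class Slot {
        long ref;
        Slot(long ref) { this.ref = ref; }
    }

    // old address -> new address, filled in when objects are relocated
    final Map<Long, Long> forwardingTable = new HashMap<>();

    // Runs whenever a reference is read from the heap.
    long load(Slot slot) {
        Long relocated = forwardingTable.get(slot.ref);
        if (relocated != null) {
            slot.ref = relocated;  // self-healing: later reads of this slot take the fast path
        }
        return slot.ref;
    }

    public static void main(String[] args) {
        LoadBarrierSketch barrier = new LoadBarrierSketch();
        barrier.forwardingTable.put(0x100L, 0x900L);
        Slot field = new Slot(0x100L);
        System.out.println(Long.toHexString(barrier.load(field)));
        System.out.println(Long.toHexString(field.ref));
    }
}
```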

NUMA

NUMA is not an innovation specific to ZGC in garbage collection, but it is a significant feature of ZGC. It is enough to have a basic understanding. To understand NUMA, we should first understand UMA.

UMA (Uniform Memory Access Architecture)

Uniform Memory Access (UMA) is the typical architecture of standard computers, where multiple CPUs access a single block of memory. This can lead to contention when multi-core CPUs access the same memory block. To manage this, the operating system restricts access to a specific memory block on the bus, which can become a bottleneck as the number of CPUs increases.

NUMA (Non-Uniform Memory Access Architecture)

Non-Uniform Memory Access (NUMA) is a system architecture where each CPU has its own corresponding block of memory, typically located closer to that CPU. The CPU preferentially accesses this local memory, which reduces contention and increases efficiency. NUMA allows multiple nodes, each consisting of CPUs and their local memory, to be combined into a single system. This architecture is popular in large-scale systems and provides a high-performance solution. The heap space can also span the memory of multiple nodes. ZGC's adaptation to NUMA means that ZGC can automatically detect and optimize for NUMA architectures.

ZGC Process

Next, let's study the ZGC process in detail, introducing the concept of the good_mask. The good_mask records the flags of the JVM garbage collection phases, switching continuously as the GC progresses. It allows the JVM to identify whether an object is generated during the current GC round based on the relationship between the good_mask and the object's mark bits. The good_mask can be considered a variable that records the current view of the JVM's memory space. The ZGC process is as follows.

1. Init Mark
This step is similar to G1 in that it records all objects reachable from GC Roots. Additionally, it switches the value of good_mask. The good_mask is initially set to remapped. After this switch, it changes to marked1 (many people mistakenly believe it switches to 0 the first time, but it does not). It is worth noting that, since the pointers of objects have not yet participated in a GC, the mark bits of these objects are in the Remapped state. After this step, all objects reachable starting from GC Roots are marked as marked1.

17

2. Concurrent Mark
After the initial marking, the current view is marked1. The GC threads start from GC Roots, and whenever an object is visited by a GC thread, the view of its address is switched from Remapped to marked1. At the end of the marking phase, an object whose address view is marked1 is active, and an object whose view is still Remapped is inactive. At the same time, the size in bytes of the surviving objects within each page is recorded to facilitate subsequent page relocation. The colored pointer of any object newly allocated during this phase is set to marked1. Another important task in this phase is to fix the pointers that were marked during the last GC. This is also why colored pointers need two marking namespaces: with only one, it would be impossible to tell whether a pointer was marked in this marking phase or in the previous one.

18

3. Remark
During this phase, tasks that were not completed during the concurrent mark phase are processed (with a small STW pause controlled to within 1ms). If not all tasks are completed, another concurrent mark phase will be initiated. This primarily addresses the issue of missed markings in the three-color marking algorithm, specifically the problem of white objects being held by black objects. Objects whose references were changed during the concurrent mark phase will be recorded (triggering a read barrier will record them). In this phase, the objects whose references have changed are marked. There is no need to draw a diagram here; understanding the concept is enough.

4. Concurrent Prepare for Relocate
This step primarily prepares for the subsequent relocation process. It handles soft references, weak references, and phantom references, and resets the page's Forwarding table. It also collects information about the pages to be reclaimed into the Relocation Set.

The Forwarding table records the mapping of old and new references after object relocation. The Relocation Set stores the collection of surviving pages that need to be reclaimed. In this phase, ZGC scans all pages and stores the page information to be relocated to the Relocation Set. This is very different from G1, which only selects some regions to be collected. While recording page information, it also initializes the Forwarding table of page and records which objects need to be relocated in each page. This step takes a long time because it involves a full scan of all the pages. However, since it runs concurrently with user threads, there is no STW pause. Compared to G1, it also saves the cost of maintaining RSet and SATB.

5. Relocate Start
This phase aims to identify all objects directly reachable from GC Roots and to switch the good_mask to remapped. It involves an STW pause. Objects directly referenced by GC Roots may themselves need to be relocated. If so, the object is copied to a new page and the pointer in GC Roots is corrected to point to the new location, a process known as "self-healing of pointers". The main job of this phase, however, is switching the good_mask.

6. Concurrent Relocate

This phase requires traversing all pages and copying the surviving objects to other pages based on the page's forwarding table. The forwarding table records the mapping between the old and new references of the objects. Once the objects in a page have been relocated, the page is reclaimed. It is important to note that the forwarding table is not reclaimed at this phase to avoid losing the mapping between old and new object references. If a user thread accesses a relocated object during this phase, the reference held by the thread will be corrected based on the forwarding table. This is another aspect of "self-healing of pointers".
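
The role of the forwarding table in "self-healing" can be sketched as follows. This is a simplified model under stated assumptions: a plain map lookup stands in for the colored-pointer check a real barrier would perform, and `SelfHealingSketch`, `recordRelocation`, and `load` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a barrier that corrects a stale reference via the forwarding table.
final class SelfHealingSketch {
    // Forwarding table: old address -> new address of each relocated object.
    private final Map<Long, Long> forwarding = new HashMap<>();
    int slowPathHits = 0;

    void recordRelocation(long oldAddress, long newAddress) {
        forwarding.put(oldAddress, newAddress);
    }

    // Read the reference stored in slots[index]. If it points at a relocated
    // object, write the new address back so later reads skip the lookup cost.
    long load(long[] slots, int index) {
        long ref = slots[index];
        Long moved = forwarding.get(ref);
        if (moved == null) {
            return ref;           // not relocated: nothing to fix
        }
        slowPathHits++;
        slots[index] = moved;     // self-heal the stored reference
        return moved;
    }

    public static void main(String[] args) {
        SelfHealingSketch gc = new SelfHealingSketch();
        gc.recordRelocation(0x1000L, 0x8000L);
        long[] fields = {0x1000L};                                 // a field holding the old address
        System.out.println(Long.toHexString(gc.load(fields, 0))); // 8000
        gc.load(fields, 0);                                        // already healed
        System.out.println(gc.slowPathHits);                       // 1
    }
}
```

The second read returns the new address without taking the slow path, which is the point of writing the corrected reference back.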

7. Concurrent Remap

This phase corrects all references to relocated objects. Strictly speaking, concurrent remap is not work that has to be done within the current GC cycle. After step 6, the forwarding tables hold the mapping between the old and new addresses of every relocated object. With this mapping, a stale reference can be corrected whenever it is accessed later, which is the "self-healing of pointers". Traversing all objects again from GC Roots would also correct every reference, but at the cost of one more full traversal of all objects. ZGC therefore defers this phase to the concurrent mark phase of the next GC cycle, so that the two phases share a single traversal. Until the next cycle begins, any thread that accesses a reference to a relocated object triggers the read barrier, which uses the forwarding table to find the object's new address and thus completes the "self-healing of pointers" on its own.

ZGC Reviews

I might not have the most expertise on the topic, but I still want to share my thoughts on the strengths and weaknesses of ZGC based on my understanding. The advantages of ZGC are quite clear. The almost entirely concurrent collection process results in an unparalleled low pause time, which is the core design philosophy of ZGC. Low pause time, combined with Java's inherent support for high concurrency, will undoubtedly allow ZGC to showcase its power in the server field in the future. However, to achieve this design goal, ZGC has indeed made some sacrifices, such as throughput. I know that in many places, ZGC is described as surpassing G1. However, that is not entirely the case, at least not for the non-generational ZGC before JDK21. For specific tests and comparisons, you can refer to the following Oracle article.

https://cr.openjdk.org/~pliden/slides/ZGC-OracleDevLive-2020.pdf

Of course, these drawbacks will gradually improve as ZGC matures and with the introduction of generational features in JDK21. All in all, ZGC is still a very well-designed garbage collector.

Conclusion

The knowledge about JVM garbage collectors is extensive and quite demanding to write about. I also plan to share the knowledge related to GC logs in the future.


Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.
