
Exploring the Principles of Runtime Type T in A<T> Generics in Java

The article explores Java generics, how reflection was extended to support them, and the effect of type erasure.

By Enbo Shi (Xihang)


Overview

If you frequently write utility classes, you will very likely run into a practical problem: obtaining the runtime type T in the generic expression A<T>. There is a well-known trick for this, but an explanation of why the trick works is rarely given. In this article, we start with Java generics and then turn to the Java Language Specification (JLS) and the Java Virtual Machine Specification (JVMS). After reading how the Java compiler processes generics and how the JRE reflection API is implemented, we verify the technique against a virtual machine implementation (HotSpot in OpenJDK 8).

Preface

When writing utility classes or other compact generic code, we often need runtime type information to decide what to do next. Typical scenarios include the plugin architectures common in business systems and type lookup during deserialization. As a result, many beginners write code like the following to retrieve the runtime type through reflection and generics.

@SuppressWarnings("unchecked")
public static <T> Class<T> typeOf(T obj) {
    // getClass() returns only the erased runtime class of obj
    return (Class<T>) obj.getClass();
}

This code works well for non-generic types. However, when obj is an instance of a generic type, it returns only the raw class and the type arguments are lost. Ever since Java 1.5 introduced generics, handling them through reflection has been troublesome.
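As a minimal sketch of that limitation, the snippet below calls the same helper on two lists with different element types. The class name TypeOfDemo and the variable names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class TypeOfDemo {
    // Same helper as in the article; the cast is unchecked.
    @SuppressWarnings("unchecked")
    static <T> Class<T> typeOf(T obj) {
        return (Class<T>) obj.getClass();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        List<Integer> counts = new ArrayList<>();

        // Only the raw class is reported; the element type is not part of it.
        System.out.println(typeOf(names));                        // class java.util.ArrayList
        // Both lists share one Class object even though their type arguments differ.
        System.out.println(typeOf(names).equals(typeOf(counts))); // true
    }
}
```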

Generics

Generics were introduced in Java 1.5 to address type checking issues [1] and provide robust type constraints for writing generic code, especially libraries, without concerns about heap pollution caused by casting in previous versions [13].

To understand why generics were introduced and how they were designed, you can refer to the paper co-authored by Gilad Bracha and presented at the OOPSLA conference in 1998 [2]. JSR 14 subsequently added generics to the Java programming language [14], and they shipped with the JDK in Java 1.5.

The formal definition of generics can be found in the Java Language Specification (JLS) [3], [4], [5], and [6].

For an informal definition, consider the following simple code, which illustrates three terms that are often confused: type parameter, type variable, and type argument.

/**
 * Define a generic class, where
 *
 * Type Parameter is T extends Number
 * Type Variable is T
 * Type argument is Integer in Foo<Integer>
 */
class Foo<T extends Number> {}
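The same terms can be observed through reflection. The following is a small sketch, assuming the Foo class from above; the class GenericTermsDemo and its helper methods are hypothetical names introduced only for this example.

```java
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

class Foo<T extends Number> {}

class GenericTermsDemo {
    // The type variable declared by Foo's type parameter list.
    static TypeVariable<?> typeVariable() {
        return Foo.class.getTypeParameters()[0];
    }

    // The bound written in the type parameter "T extends Number".
    static Type bound() {
        return typeVariable().getBounds()[0];
    }

    public static void main(String[] args) {
        System.out.println(typeVariable().getName()); // T
        System.out.println(bound());                  // class java.lang.Number

        // Integer is the type argument here; it exists only in the source and
        // in the static type of the variable, not in the object itself.
        Foo<Integer> foo = new Foo<>();
        System.out.println(foo.getClass().getSimpleName()); // Foo
    }
}
```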

Reflection

Because generics were introduced in Java 1.5, reflection was extended at the same time to accommodate them [7]. The extension centers on the Type interface in java.lang.reflect: Class implements it, and four sub-interfaces (ParameterizedType, GenericArrayType, TypeVariable, and WildcardType) model the generic types defined by the JLS.

(Figure: UML diagram of the Type interface and its subtypes.)

One of the key concepts we frequently encounter is ParameterizedType [10].

ParameterizedType may be unfamiliar to some, but developers who regularly use the core reflection API will recognize it. ParameterizedType is one of the sub-interfaces of Type. Put simply, it represents a generic type with its type arguments supplied. For example, Foo<Integer> and Foo<Double> are both parameterized types of the generic class Foo<T>.

Additionally, as part of the implementation of generics, a set of methods and classes with "Generic" in their names were added to the reflection API [8]. These serve as the foundation for retrieving the runtime types of generics.
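As a rough illustration of these "Generic" methods, the sketch below reads a ParameterizedType from a field declaration with Field#getGenericType. The field name index and the class GenericFieldDemo are hypothetical and exist only for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;
import java.util.Map;

class GenericFieldDemo {
    // Hypothetical field used only for illustration.
    static Map<String, List<Integer>> index;

    static ParameterizedType fieldType() {
        try {
            Field field = GenericFieldDemo.class.getDeclaredField("index");
            // getType() would return the erased Map.class;
            // getGenericType() keeps the declared type arguments.
            return (ParameterizedType) field.getGenericType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ParameterizedType type = fieldType();
        System.out.println(type.getRawType()); // interface java.util.Map
        for (Type argument : type.getActualTypeArguments()) {
            // java.lang.String, then java.util.List<java.lang.Integer>
            System.out.println(argument.getTypeName());
        }
    }
}
```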

Type Erasure

Although Java gained generics, they were implemented with type erasure in order to maintain backward compatibility (compatibility at the JVM level, with no change to the bytecode or the JVM design) and to keep compilation cheap (in contrast to C++ templates, where each set of template arguments generates a new type) [9]. With type erasure, Java does not need to modify the virtual machine implementation or create new classes for ParameterizedTypes.

Defects of type erasure

Every design has its trade-offs. Java generics benefit from type erasure, but erasure also leads to two major issues.

  1. Java generics are checked at compile time, and the type arguments of an object are erased at runtime. As a result, the JVM does not know which type arguments an instance was created with.
  2. Consequently, we cannot use the reflection API to obtain, from an instance alone, the type that a type variable stands for at runtime.
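Both points can be observed with a few lines of code. The sketch below is only an illustration: ErasureDemo and its method names are made up, and the raw-type assignment is deliberately unsafe to show that the JVM performs no element-type check on insertion.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Two differently parameterized lists share a single runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // Through a raw reference, an Integer can be stored in a List<String>.
    // The failure only appears later, at the cast the compiler inserts on read.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean failsOnlyOnRead() {
        List<String> strings = new ArrayList<>();
        List raw = strings;
        raw.add(42); // no runtime check here
        try {
            String first = strings.get(0); // compiler-inserted cast to String
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // true
        System.out.println(failsOnlyOnRead());  // true
    }
}
```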

This behavior makes utility classes harder to write. For instance, you cannot create an instance from the type variable T alone. If T is a non-generic class, we can work around this by passing the type information in explicitly.

public static final <T> void foo(List<T> list, Class<T> tClass)

However, when T is a ParameterizedType, the Class<T> argument in the above interface can carry only the raw type of that ParameterizedType. For example, if T is List<String>, the class will be List.class. In certain scenarios, such as deserialization, this limitation causes difficulties.
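A minimal sketch of the class-token approach and its limit follows. The create helper and the class ClassTokenDemo are hypothetical; the helper assumes the target class has an accessible no-argument constructor.

```java
import java.util.ArrayList;
import java.util.List;

class ClassTokenDemo {
    // Creates an instance from an explicit class token instead of "new T()".
    static <T> T create(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + type, e);
        }
    }

    public static void main(String[] args) {
        // Works for a non-generic class.
        StringBuilder builder = create(StringBuilder.class);
        System.out.println(builder.length()); // 0

        // "ArrayList<String>.class" is not valid Java, so only the raw class
        // can be passed; the element type is not conveyed to create().
        List<?> list = create(ArrayList.class);
        System.out.println(list.isEmpty()); // true
    }
}
```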

Tips for Getting Runtime Types of Generics

Introduce TypeReference

So, is there really no way to get the runtime types of generics in Java? There is, but we need to make some changes. For example, many serialization frameworks (such as Jackson and Fastjson) provide TypeReference or a similar mechanism. With it, we can obtain the runtime type of T with almost no change to the function signature.

The approach is to first define a class.

class Wrapper<T> {
}

It is very simple, essentially just a wrapper class. Then, declare a simple method.

public static <T> Type getGenericRuntimeType(Wrapper<T> wrapper)

Finally, a small trick does the rest: create an instance of an anonymous subclass and use the reflection API to retrieve the generic information of its superclass. If the superclass is a ParameterizedType, we can read its actual type arguments, which gives us the runtime type of T.

public static <T> Type getGenericRuntimeType(Wrapper<T> wrapper) {
    Type type = wrapper.getClass().getGenericSuperclass();
    if (type == null) {
        return null;
    }

    if (type instanceof ParameterizedType) {
        Type[] types = ((ParameterizedType)type).getActualTypeArguments();
        return types[0];
    }
    return null;
}

For example, compare the following two statements. The only difference is that the second one creates an instance of an anonymous subclass of Wrapper.

Type type1 = getGenericRuntimeType(new Wrapper<List<String>>());
Type type2 = getGenericRuntimeType(new Wrapper<List<String>>() {});

Printing type1 and type2 after running them gives the following output.

null
java.util.List<java.lang.String>
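The same idea can be packaged so that callers cannot forget the anonymous subclass. The following is a sketch in the spirit of the TypeReference classes mentioned earlier, not the actual implementation of Jackson or Fastjson; the name TypeRef is hypothetical. Declaring the class abstract forces every use site to subclass it.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical type token; abstract so that it can only be used via a subclass.
abstract class TypeRef<T> {
    private final Type type;

    protected TypeRef() {
        Type superType = getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            // Happens when the subclass extends the raw type TypeRef.
            throw new IllegalArgumentException("TypeRef must be created with a type argument");
        }
        this.type = ((ParameterizedType) superType).getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }
}

class TypeRefDemo {
    static Type listOfString() {
        return new TypeRef<List<String>>() {}.getType();
    }

    public static void main(String[] args) {
        System.out.println(listOfString().getTypeName()); // java.util.List<java.lang.String>
    }
}
```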

So why does a single instance of an anonymous class make such a significant difference? Is it possible to obtain generics within the framework of type erasure? What is the underlying principle?

In fact, the trick relies on a mechanism described in JSR 14 [14]: the generic type information is saved in the Signature attribute of the classfile.

Classfiles need to carry generic type information in a backwards compatible way. This is accomplished by introducing a new "Signature" attribute for classes, methods, and fields.

First, the Java compiler writes the generic type information into the Signature attribute of the classfiles. Then, the JRE's reflection interface parses the string within the Signature. Finally, the hidden runtime type information is identified. In the following section, we will start with the definition of the Java Virtual Machine Specification (JVMS), study the process of compiling Java code and generating classfiles, and explore the JRE's reflection code.
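One consequence of the information being written by the compiler is that reflection can return only what the source code states at the point where the subclass is declared. The sketch below illustrates this under the assumption of the Wrapper class defined earlier; CompileTimeCaptureDemo and its method names are hypothetical.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

class Wrapper<T> {}

class CompileTimeCaptureDemo {
    static Type argumentOf(Wrapper<?> wrapper) {
        Type superType = wrapper.getClass().getGenericSuperclass();
        return ((ParameterizedType) superType).getActualTypeArguments()[0];
    }

    // The source names a concrete type argument, so that is what gets recorded.
    static Type captureConcrete() {
        return argumentOf(new Wrapper<List<String>>() {});
    }

    // Here the source only says "T", so the recorded argument is the type
    // variable T itself, whatever the caller substitutes for it.
    static <T> Type captureInsideGenericMethod() {
        return argumentOf(new Wrapper<T>() {});
    }

    public static void main(String[] args) {
        System.out.println(captureConcrete().getTypeName());
        System.out.println(CompileTimeCaptureDemo.<String>captureInsideGenericMethod().getTypeName());
    }
}
```

The first call reports java.util.List<java.lang.String>, while the second reports only T, even though the caller supplied String.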

Principle Analysis

Class file standard of the JVM

A JVM classfile is the binary format produced by compiling Java source files. It is comparable to ELF on Linux or PE/COFF on Windows and can be thought of as the executable file of the JVM. The JVM reads and executes bytecode from the classfile to run the program. The format of a classfile is as follows:

ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}
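To make the layout concrete, the following sketch reads the first three fields of the structure above from a real classfile. It assumes that the JDK's own java/lang/Object.class can be located through the system class loader; ClassFileHeaderDemo and readHeader are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeaderDemo {
    // Returns {magic, minor_version, major_version} of the given classfile resource.
    static int[] readHeader(String resource) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("resource not found: " + resource);
            }
            // Classfile fields are big-endian, which is what DataInputStream reads.
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();           // u4 magic
            int minor = data.readUnsignedShort(); // u2 minor_version
            int major = data.readUnsignedShort(); // u2 major_version
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader("java/lang/Object.class");
        // The magic number is always 0xCAFEBABE; the versions depend on the JDK.
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```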

The attributes array is where the generic type information is stored, as mentioned in JSR14. JVMS points out [11] that:

A Java compiler must emit a signature for any class, interface, constructor, method, or field whose declaration uses type variables or parameterized types

In other words, the Java compiler must record the generic type information of such a class in the Signature attribute and store it in the compiled classfile.

Conduct a Simple Experiment

Let's define a simple subclass of Wrapper and verify this conclusion with javap after compilation.

public class ExtendedWrapper extends Wrapper<List<String>> {
}

After running javap in verbose mode on the compiled class, you can see on line 42 of the output that the Signature attribute of the class already contains the corresponding type information (Lcom/aliyun/cwz/model/Wrapper<Ljava/util/List<Ljava/lang/String;>;>;). This confirms, for this case, that the compiler follows the JVMS.

Classfile /Users/alibaba/myprojects/GenericsAndReflection/target/test-classes/com/aliyun/cwz/impl/ExtendedWrapper.class
  Last modified 2023-4-17; size 413 bytes
  MD5 checksum 96ca23aed30b94c2a445bbd76189e250
  Compiled from "ExtendedWrapper.java"
public class com.aliyun.cwz.impl.ExtendedWrapper extends com.aliyun.cwz.model.Wrapper<java.util.List<java.lang.String>>
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #3.#15         // com/aliyun/cwz/model/Wrapper."<init>":()V
   #2 = Class              #16            // com/aliyun/cwz/impl/ExtendedWrapper
   #3 = Class              #17            // com/aliyun/cwz/model/Wrapper
   #4 = Utf8               <init>
   #5 = Utf8               ()V
   #6 = Utf8               Code
   #7 = Utf8               LineNumberTable
   #8 = Utf8               LocalVariableTable
   #9 = Utf8               this
  #10 = Utf8               Lcom/aliyun/cwz/impl/ExtendedWrapper;
  #11 = Utf8               Signature
  #12 = Utf8               Lcom/aliyun/cwz/model/Wrapper<Ljava/util/List<Ljava/lang/String;>;>;
  #13 = Utf8               SourceFile
  #14 = Utf8               ExtendedWrapper.java
  #15 = NameAndType        #4:#5          // "<init>":()V
  #16 = Utf8               com/aliyun/cwz/impl/ExtendedWrapper
  #17 = Utf8               com/aliyun/cwz/model/Wrapper
{
  public com.aliyun.cwz.impl.ExtendedWrapper();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method com/aliyun/cwz/model/Wrapper."<init>":()V
         4: return
      LineNumberTable:
        line 7: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lcom/aliyun/cwz/impl/ExtendedWrapper;
}
Signature: #12                          // Lcom/aliyun/cwz/model/Wrapper<Ljava/util/List<Ljava/lang/String;>;>;
SourceFile: "ExtendedWrapper.java"
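The same information can be read back through reflection from a named subclass, which shows that the anonymous class in the earlier trick is only a convenient way to declare a subclass. This is a minimal sketch with Wrapper and ExtendedWrapper redeclared in one file; SignatureReadBackDemo is a hypothetical name.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

class Wrapper<T> {}

// Named counterpart of the anonymous subclass used earlier.
class ExtendedWrapper extends Wrapper<List<String>> {}

class SignatureReadBackDemo {
    static ParameterizedType genericSuperclass() {
        // Backed by the Signature attribute that javac wrote for ExtendedWrapper.
        return (ParameterizedType) ExtendedWrapper.class.getGenericSuperclass();
    }

    public static void main(String[] args) {
        ParameterizedType superType = genericSuperclass();
        Type argument = superType.getActualTypeArguments()[0];
        System.out.println(superType.getRawType() == Wrapper.class); // true
        System.out.println(argument.getTypeName()); // java.util.List<java.lang.String>
    }
}
```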

So, how does the Java compiler work?

Explore Java compiler

According to "The Hitchhiker's Guide to javac" [15], the JavaCompiler serves as the driver for javac. Therefore, by studying the implementation of the JavaCompiler, we can understand the process of file compilation with javac. (Through the compilation process observed with strace javac, it can be observed that the calls include the JavaCompiler. This leads to a similar conclusion.)

Let's use OpenJDK 8 and see how the Java compiler compiles ExtendedWrapper.

During compilation, the JavaCompiler class invokes its compile method. This is the core compilation method, and it ultimately turns each Java class into a classfile.

com.sun.tools.javac.main.JavaCompiler#compile(com.sun.tools.javac.util.List<javax.tools.JavaFileObject>, com.sun.tools.javac.util.List<java.lang.String>, java.lang.Iterable<? extends javax.annotation.processing.Processor>)

Within this method, the parser first turns each Java source file into a JCTree.JCCompilationUnit, the basic unit of the abstract syntax tree. The class com.sun.tools.javac.comp.Enter then walks the tree using the visitor pattern and, together with the method below, records the type information declared in the source on the ClassSymbol created for ExtendedWrapper. The ClassSymbol is subsequently stored in the symbol table.

com.sun.tools.javac.jvm.ClassReader#enterClass


In the next step, code generation, the method

com.sun.tools.javac.main.JavaCompiler#generate

eventually reaches

com.sun.tools.javac.jvm.ClassWriter#writeClassFile

which obtains the type information of the superclass from the type of the ClassSymbol in the symbol table, using the method below, and writes it into the Signature attribute of the classfile.

com.sun.tools.javac.code.Types#supertype


So far, we have seen how JavaCompiler writes generic type information into classfiles. For a deeper understanding, I recommend stepping through the source code of this part (in the com.sun.tools.javac package) yourself.

JRE (Java Runtime Environment) source code

Now, let's analyze how the reflection API in the JRE converts the Signature string into a Type object. The main reflection method used in the earlier code is getGenericSuperclass in Class, so let's begin there. This method was introduced in Java 1.5 and returns the superclass together with its generic type information. Here is its code:

public Type getGenericSuperclass() {
    ClassRepository info = getGenericInfo();
    if (info == null) {
        return getSuperclass();
    }

    // Historical irregularity:
    // Generic signature marks interfaces with superclass = Object
    // but this API returns null for interfaces
    if (isInterface()) {
        return null;
    }

    return info.getSuperclass();
}

The core is the ClassRepository variable created on line 2, which represents the generic type information of the class. Its documentation describes it as follows:

This class represents the generic type information for a class. The code is not dependent on a particular reflective implementation. It is designed to be used unchanged by at least core reflection and JDI.
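The return paths of getGenericSuperclass shown above can be observed from ordinary code. The sketch below picks three JDK classes as examples: Thread, whose declaration uses no generics, the interface List, and ArrayList, which extends a parameterized superclass. GenericSuperclassBranchesDemo is a hypothetical name.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

class GenericSuperclassBranchesDemo {
    static Type of(Class<?> type) {
        return type.getGenericSuperclass();
    }

    public static void main(String[] args) {
        // Non-generic declaration: the plain superclass is returned as a Class.
        System.out.println(of(Thread.class)); // class java.lang.Object

        // Interfaces: the API returns null.
        System.out.println(of(List.class)); // null

        // Generic superclass: a ParameterizedType built from the Signature.
        Type superType = of(ArrayList.class);
        System.out.println(superType instanceof ParameterizedType); // true
        System.out.println(superType.getTypeName()); // java.util.AbstractList<E>
    }
}
```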

Looking further into getGenericInfo, we can see how info is created.

private ClassRepository getGenericInfo() {
    ClassRepository genericInfo = this.genericInfo;
    if (genericInfo == null) {
        String signature = getGenericSignature0();
        if (signature == null) {
            genericInfo = ClassRepository.NONE;
        } else {
            genericInfo = ClassRepository.make(signature, getFactory());
        }
        this.genericInfo = genericInfo;
    }
    return (genericInfo != ClassRepository.NONE) ? genericInfo : null;
}

As we can see, the signature string returned by getGenericSignature0 is parsed into a ClassRepository by ClassRepository.make. So, where does this string come from? getGenericSignature0 is a native method, so it is provided by the JVM implementation.

// Generic signature handling
private native String getGenericSignature0();

Since it is implemented by the JVM, we can read the JVM source code to verify that it matches the JVMS described earlier.

OpenJDK source code verification

After studying the JVMS and the Class#getGenericSignature0 function in the JRE, I believe it is necessary to explore the specific implementation within the JVM. Based on the Java compiler source code mentioned earlier, we have decided to refer to the implementation in OpenJDK. Specifically, we will refer to the widely used JDK8 [12].

By searching for the function name getGenericSignature0, we can find a configuration array of JNI methods in the file ./jdk/src/share/native/java/lang/Class.c.

static JNINativeMethod methods[] = {
    {"getName0",         "()" STR,          (void *)&JVM_GetClassName},
    {"getSuperclass",    "()" CLS,          NULL},
    {"getInterfaces0",   "()[" CLS,         (void *)&JVM_GetClassInterfaces},
    {"isInterface",      "()Z",             (void *)&JVM_IsInterface},
    {"getSigners",       "()[" OBJ,         (void *)&JVM_GetClassSigners},
    {"setSigners",       "([" OBJ ")V",     (void *)&JVM_SetClassSigners},
    {"isArray",          "()Z",             (void *)&JVM_IsArrayClass},
    {"isPrimitive",      "()Z",             (void *)&JVM_IsPrimitiveClass},
    {"getComponentType", "()" CLS,          (void *)&JVM_GetComponentType},
    {"getModifiers",     "()I",             (void *)&JVM_GetClassModifiers},
    {"getDeclaredFields0","(Z)[" FLD,       (void *)&JVM_GetClassDeclaredFields},
    {"getDeclaredMethods0","(Z)[" MHD,      (void *)&JVM_GetClassDeclaredMethods},
    {"getDeclaredConstructors0","(Z)[" CTR, (void *)&JVM_GetClassDeclaredConstructors},
    {"getProtectionDomain0", "()" PD,       (void *)&JVM_GetProtectionDomain},
    {"getDeclaredClasses0",  "()[" CLS,      (void *)&JVM_GetDeclaredClasses},
    {"getDeclaringClass0",   "()" CLS,      (void *)&JVM_GetDeclaringClass},
    {"getGenericSignature0", "()" STR,      (void *)&JVM_GetClassSignature},
    {"getRawAnnotations",      "()" BA,        (void *)&JVM_GetClassAnnotations},
    {"getConstantPool",     "()" CPL,       (void *)&JVM_GetClassConstantPool},
    {"desiredAssertionStatus0","("CLS")Z",(void *)&JVM_DesiredAssertionStatus},
    {"getEnclosingMethod0", "()[" OBJ,      (void *)&JVM_GetEnclosingMethodInfo},
    {"getRawTypeAnnotations", "()" BA,      (void *)&JVM_GetClassTypeAnnotations},
};

getGenericSignature0 corresponds to one JNINativeMethod entry:

{"getGenericSignature0", "()" STR, (void *)&JVM_GetClassSignature}

JNINativeMethod is defined as follows:

typedef struct {
    char *name;
    char *signature;
    void *fnPtr;
} JNINativeMethod;

You can see that the JVM implementation behind getGenericSignature0 is JVM_GetClassSignature, registered through a function pointer. The function is implemented in ./hotspot/src/share/vm/prims/jvm.cpp and is wrapped in the JVM_ENTRY macro.

JVM_ENTRY(jstring, JVM_GetClassSignature(JNIEnv *env, jclass cls))
  assert (cls != NULL, "illegal class");
  JVMWrapper("JVM_GetClassSignature");
  JvmtiVMObjectAllocEventCollector oam;
  ResourceMark rm(THREAD);
  // Return null for arrays and primatives
  if (!java_lang_Class::is_primitive(JNIHandles::resolve(cls))) {
    Klass* k = java_lang_Class::as_Klass(JNIHandles::resolve(cls));
    if (k->oop_is_instance()) {
      Symbol* sym = InstanceKlass::cast(k)->generic_signature();
      if (sym == NULL) return NULL;
      Handle str = java_lang_String::create_from_symbol(sym, CHECK_NULL);
      return (jstring) JNIHandles::make_local(env, str());
    }
  }
  return NULL;
JVM_END

As you can see, the value returned by getGenericSignature0 ultimately comes from InstanceKlass::cast(k)->generic_signature(). This method uses _generic_signature_index to look up the corresponding Symbol in the constant pool (_constants) of the class. This is consistent with both the javac source code and the JVMS.

// for adding methods, ConstMethod::UNSET_IDNUM means no more ids available
  inline u2 next_method_idnum();
  void set_initial_method_idnum(u2 value)             { _idnum_allocated_count = value; }

  // generics support
  Symbol* generic_signature() const                   {
    return (_generic_signature_index == 0) ?
      (Symbol*)NULL : _constants->symbol_at(_generic_signature_index);
  }
  u2 generic_signature_index() const                  {
    return _generic_signature_index;
  }
  void set_generic_signature_index(u2 sig_index)      {
    _generic_signature_index = sig_index;
  }

This corresponds to the Signature attribute [11] of the classfile format defined in the JVMS.

Conclusion


We began by exploring Java generics, the extension of reflection for generics, and type erasure. We then looked at a technique for accessing the runtime type of a generic by creating an instance of an anonymous subclass.

Furthermore, by referring to the JVMS, following the javac compilation process, and analyzing the JRE and HotSpot source code, we studied how generic type information reaches the runtime and gained an understanding of the underlying principle. With that, we have a satisfactory explanation of why the technique works.

References

1: https://docs.oracle.com/javase/tutorial/java/generics/why.html
2: https://homepages.inf.ed.ac.uk/wadler/gj/Documents/gj-oopsla.pdf
3: https://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.1.2
4: https://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.4.4
5: https://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.8.4
6: https://docs.oracle.com/javase/specs/jls/se8/html/jls-9.html#jls-9.1.2
7: https://docs.oracle.com/javase/1.5.0/docs/guide/reflection/enhancements.html
8: https://docs.oracle.com/javase/8/docs/api/java/lang/reflect/class-use/Type.html
9: https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.6
10: https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.5
11: https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.7.9.1
12: https://github.com/openjdk/jdk8u
13: https://docs.oracle.com/javase/tutorial/java/generics/nonReifiableVarargsType.html#heap_pollution
14: https://jcp.org/aboutJava/communityprocess/review/jsr014/index.html
15: https://openjdk.org/groups/compiler/doc/hhgtjavac/index.html

Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.
