Performance Issues Related to LocalDateTime and Instant during Serialization Operations

This article discusses performance issues that an Alibaba engineer encountered during serialization operations, which were related to using the LocalDateTime and Instant time classes.

By Lv Renqi.

The Performance Issue

When running performance stress tests on the new version of Apache Dubbo, we discovered an issue related to an attribute of the Transfer Object (TO) class. Changing the type of that attribute from Date to LocalDateTime caused throughput to drop from 50 thousand to 20 thousand and response time to increase from 9 ms to 90 ms.

Among these changes, we were most concerned with the response time. Response time is in many ways the cornerstone of good performance numbers, because other performance metrics are only meaningful once a certain response time level is secured. In stress testing, queries per second (QPS) and transactions per second (TPS) numbers are only acceptable when the target response time is met. Purely theoretical numbers are meaningless. Every bit of response time counts in cloud computing. Even an increase of 0.1 ms in the response time of underlying services means a 10% jump in overall cost.

Latency is the Achilles' heel of any system with remote users. For every 100 km, the latency of a data packet increases by 1 ms. The latency between Hangzhou and Shanghai is about 5 ms, and the latency between Shanghai and Shenzhen is higher still because the distance is significantly larger. The direct result of latency is increased response time, which worsens the overall user experience and inflates cost.

If a request modifies the same row of records in different units, the cost is still very high even if we can maintain consistency and integrity. If a request needs to access remote databases or remote High-Speed Service Framework (HSF) services more than 10 times, and one service invokes another, latency adds up quickly and snowballs. HSF is a distributed RPC service framework widely used within Alibaba.

The Importance of Universality in Java

Dealing with time is ubiquitous in computer science. Without a strict concept of time, 99.99% of applications lose meaning and practicality. This is especially true of the time-oriented custom processing found in most cloud monitoring systems today.

Before Java Development Kit 8 (JDK 8), java.util.Date was used to describe date and time, and java.util.Calendar was used for time-related computing. JDK 8 introduced more convenient time classes, including Instant, LocalDateTime, OffsetDateTime, and ZonedDateTime. In general, time processing is more convenient because of these classes.

Instant stores a timestamp as a point on the Coordinated Universal Time (UTC) timeline and provides a machine-facing, or internal, view of time. It is suitable for database storage, business logic, data exchange, and serialization scenarios. LocalDateTime, OffsetDateTime, and ZonedDateTime provide a human-friendly view of time for input from and output to users. LocalDateTime holds only the date and time fields, OffsetDateTime adds a UTC offset, and ZonedDateTime adds a full time zone with its daylight saving rules. When the same moment is shown to different users, the displayed values differ. For example, the shipping time of an order is shown to the buyer and the seller in their respective local times. These three classes can be considered external-facing tools rather than part of the application's internal workings.
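To make the distinction concrete, here is a minimal sketch of one stored Instant being rendered for users in two different time zones. The class name, method name, zones, and timestamp are illustrative assumptions and do not come from the original tests.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class TimeViews {
    // Display pattern chosen for illustration; "HH" is the 24-hour clock.
    private static final DateTimeFormatter DISPLAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    // Renders a machine-facing Instant as the wall-clock time seen in the given zone.
    public static String displayFor(Instant instant, String zoneId) {
        ZonedDateTime local = instant.atZone(ZoneId.of(zoneId));
        return local.format(DISPLAY);
    }

    public static void main(String[] args) {
        // Hypothetical shipping time, stored once as an Instant.
        Instant shippedAt = Instant.parse("2020-01-15T08:30:00Z");
        System.out.println(displayFor(shippedAt, "Asia/Shanghai"));       // 2020-01-15 16:30
        System.out.println(displayFor(shippedAt, "America/Los_Angeles")); // 2020-01-15 00:30
    }
}
```

The stored value never changes. Only the view derived from it depends on the viewer's zone.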

In short, Instant is better for backend services and databases, while LocalDateTime and its cohorts are better for frontend services and display. The two are interchangeable in theory, but in practice they serve different purposes. The international business team has rich experience with and views on this matter.

Date and Instant are commonly used in integrating Alibaba's in-house High-Speed Service Framework (HSF) and Dubbo.

Reproducing the Performance Issue

To figure out exactly what is behind the performance issue described above, we can try to reproduce it. Before doing so, we can look at the performance advantages of Instant through a simple demonstration. Consider a common scenario in which the current time is formatted as a string, first with Date and then with Instant.

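    // Formats the current time with the pre-JDK 8 API.
    // Note: "hh" is the 12-hour clock field; "HH" would produce a 24-hour timestamp.
    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    public String date_format() {
        Date date = new Date();
        return new SimpleDateFormat("yyyyMMddhhmmss").format(date);
    }

    // Formats the current time with the JDK 8 java.time API, using the same pattern.
    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    public String instant_format() {
        return Instant.now().atZone(ZoneId.systemDefault()).format(DateTimeFormatter.ofPattern(
                "yyyyMMddhhmmss"));
    }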

We then ran the stress test locally for 30 seconds with four concurrent threads. The results were as follows:

Benchmark                            Mode  Cnt        Score   Error  Units
DateBenchmark.date_format           thrpt       4101298.589          ops/s
DateBenchmark.instant_format        thrpt       6816922.578          ops/s

From these results, we can conclude that Instant has an advantage in formatting performance: its throughput in this test is roughly 66% higher than that of Date. It also has a performance advantage in other operations. For example, we found that it performs well for date and time addition and subtraction.
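For readers unfamiliar with the arithmetic operations mentioned here, the following sketch shows addition and subtraction on Instant. It illustrates the API only and is not the benchmark we ran; the class name, method names, and values are assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class InstantArithmetic {
    // Addition: returns a new Instant, because Instant is immutable.
    public static Instant plusHours(Instant start, long hours) {
        return start.plus(hours, ChronoUnit.HOURS);
    }

    // Subtraction: the elapsed time between two instants, in milliseconds.
    public static long elapsedMillis(Instant start, Instant end) {
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2020-01-15T08:30:00Z");
        Instant end = plusHours(start, 36);
        System.out.println(end);                       // 2020-01-16T20:30:00Z
        System.out.println(elapsedMillis(start, end)); // 129600000
    }
}
```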

Pitfalls of Instant during Serialization Operations

Next, to reproduce the problem described above, we ran stress tests to observe the performance changes during serialization and deserialization, using Java's built-in serialization and Hessian (the version optimized for Taobao), respectively.

Hessian is the default serialization scheme in HSF 2.2 and Dubbo.

    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    public Date date_Hessian() throws Exception {
        Date date = new Date();
        byte[] bytes = dateSerializer.serialize(date);
        return dateSerializer.deserialize(bytes);
    }

    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    public Instant instant_Hessian() throws Exception {
        Instant instant = Instant.now();
        byte[] bytes = instantSerializer.serialize(instant);
        return instantSerializer.deserialize(bytes);
    }

    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    public LocalDateTime localDate_Hessian() throws Exception {
        LocalDateTime date = LocalDateTime.now();
        byte[] bytes = localDateTimeSerializer.serialize(date);
        return localDateTimeSerializer.deserialize(bytes);
    }

The results are shown below. With the Hessian protocol, throughput dropped precipitously when Instant or LocalDateTime was used: it is roughly 100 times lower than with Date (about 2.08 million ops/s versus about 18 to 22 thousand ops/s). Through further investigation, we found that the serialized byte stream of a Date is 6 bytes, while that of a LocalDateTime is 256 bytes, so the network bandwidth cost of transmission is magnified as well. Java's built-in serialization shows a slight decline, but no substantive difference.

Benchmark                         Mode  Cnt        Score   Error  Units
DateBenchmark.date_Hessian       thrpt       2084363.861          ops/s
DateBenchmark.localDate_Hessian  thrpt         17827.662          ops/s
DateBenchmark.instant_Hessian    thrpt         22492.539          ops/s
DateBenchmark.instant_Java       thrpt       1484884.452          ops/s
DateBenchmark.date_Java          thrpt       1500580.192          ops/s
DateBenchmark.localDate_Java     thrpt       1389041.578          ops/s
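The byte-stream sizes quoted above can be checked with a small harness like the one sketched below. As an assumption for the sake of a self-contained example, it uses Java's built-in serialization rather than Hessian, so the sizes it prints will differ from the 6 and 256 bytes we measured. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.time.LocalDateTime;
import java.util.Date;

public class SerializedSize {
    // Serializes a value with Java's built-in mechanism and returns the raw bytes.
    public static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads back a value written by serialize().
    public static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Sizes depend on the serialization scheme; these are not the Hessian figures.
        System.out.println("Date:          " + serialize(new Date()).length + " bytes");
        System.out.println("LocalDateTime: " + serialize(LocalDateTime.now()).length + " bytes");
    }
}
```

Swapping the two helper methods for the serializer used by HSF or Dubbo would give the Hessian figures instead.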

Issue Analysis

Our analysis is as follows: Date is one of the eight primitive types of Hessian object serialization, so it is written in a compact, built-in encoding.

Instant, by contrast, has to go through Class.forName during both serialization and deserialization, which caused the precipitous drop in throughput and the increase in response time. Date is therefore the more favorable choice.

Final Thoughts

We found that we can implement com.alibaba.com.caucho.hessian.io.Serializer for classes such as Instant as an extension and register it with SerializerFactory, upgrading and optimizing Hessian so that the issue discussed in this article is eliminated. However, doing so introduces backward and forward compatibility issues between versions, which is a serious problem. The rather complex dependencies within Alibaba make this approach impossible. Given this, the only recommendation we can give is to use Date as the preferred type for time attributes in TO classes.
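One way to follow this recommendation while still using the JDK 8 time classes in business code is to keep Date as the serialized field and convert at the edges. The sketch below assumes a hypothetical OrderTO class with a shippedAt attribute; neither name comes from our code base.

```java
import java.io.Serializable;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Date;

// Hypothetical transfer object: the field that goes over the wire stays a Date.
public class OrderTO implements Serializable {
    private static final long serialVersionUID = 1L;

    private Date shippedAt;

    public Date getShippedAt() {
        // Defensive copy, because Date is mutable.
        return shippedAt == null ? null : new Date(shippedAt.getTime());
    }

    public void setShippedAt(Date shippedAt) {
        this.shippedAt = shippedAt == null ? null : new Date(shippedAt.getTime());
    }

    // Convenience accessors for callers that work with java.time.
    // Note: Date has millisecond precision, so finer precision in the Instant is dropped.
    public void setShippedAtInstant(Instant instant) {
        this.shippedAt = instant == null ? null : Date.from(instant);
    }

    public Instant getShippedAtInstant() {
        return shippedAt == null ? null : shippedAt.toInstant();
    }

    // Human-facing view in the caller's zone, for display only.
    public LocalDateTime getShippedAtLocal(ZoneId zone) {
        return shippedAt == null ? null : LocalDateTime.ofInstant(shippedAt.toInstant(), zone);
    }

    public static void main(String[] args) {
        OrderTO order = new OrderTO();
        order.setShippedAtInstant(Instant.parse("2020-01-15T08:30:00Z"));
        System.out.println(order.getShippedAtLocal(ZoneId.of("Asia/Shanghai"))); // 2020-01-15T16:30
    }
}
```

A serialization framework that works on fields would still see only a Date, while callers can read and write the value through either API.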

Technically, HSF's RPC protocol is a session-layer protocol, and version recognition is done at that layer. However, the presentation layer for service data is implemented by a self-descriptive serialization framework such as Hessian, which lacks version recognition. Upgrading is therefore very difficult.
