What Makes 20 Lines of Code 10x Faster
Summary: The obvious benefit of performance optimization is that it saves machine resources. If an application running on 2,000 servers improves its overall performance by 10%, that is in theory equivalent to saving 200 machines. Beyond saving resources, a well-performing application is less likely to hit the machine's performance bottleneck during sudden traffic surges, and it needs fewer additional machines when capacity has to be expanded for the same traffic. Capacity expansion and emergency operations can therefore be completed more quickly, which makes a well-performing application more stable than a poorly performing one.
1. Background
1.1 Business Background
The Alipay card package holds the user's membership cards and coupons. Both the coupon cell and the coupon details are rendered for the end user by combining a static template configuration with dynamic variable data.
[Figure 1] shows how card and coupon data is presented to C-end users, and [Figure 2] shows the C-end data assembly process.
The display form of the card and coupon data on the C side
Taking [Figure 2] as an example, the template contains two variables, availableAmount and voucherName, and each has a corresponding value in the dynamic variable data. Replacing the two variables in the template with their dynamic values produces the assembled text "100 yuan red envelope name". After the red envelope is used once and 30 yuan is spent, the value of availableAmount in the dynamic data becomes 70. When the user opens the red envelope details page again, the reassembled text reads "70 yuan red envelope name".
1.2 Problem Discovery
During a recent project, I went through the card and coupon assembly and rendering logic and studied the template variable replacement logic in [Figure 3] closely. This is an old piece of code that has existed since the card package product was born, almost ten years ago. Its job is to replace the variables in the template with dynamic data. At first glance the logic looks fine: every variable enclosed between two $ characters (delimiters included) is replaced with dynamic data. Because this logic is core and called at very high frequency, it is worth checking whether there is room for performance optimization.
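Since [Figure 3] is not reproduced here, the following is a minimal sketch of what such replacement logic might look like, assuming variables are written as $name$ and a missing value is rendered as an empty string. The class name, method name, and missing-value behavior are all hypothetical; this is not the production code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reconstruction of the original logic described above.
final class OriginalTemplateRenderer {

    static String render(String template, Map<String, String> data) {
        String result = template;
        int start = result.indexOf('$');                 // first indexOf per iteration
        while (start >= 0) {
            int end = result.indexOf('$', start + 1);    // second indexOf per iteration
            if (end < 0) {
                break; // unmatched '$': leave the rest untouched
            }
            // The token includes both '$' delimiters, e.g. "$voucherName$".
            String token = result.substring(start, end + 1);
            String name = token.substring(1, token.length() - 1);
            String value = data.getOrDefault(name, "");  // assumption: missing -> ""
            // String.replace returns a new String, so the variable is reassigned.
            result = result.replace(token, value);
            // Continue searching after the value that was just inserted.
            start = result.indexOf('$', start + value.length());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("availableAmount", "100");
        data.put("voucherName", "red envelope name");
        // prints "100 yuan red envelope name"
        System.out.println(render("$availableAmount$ yuan $voucherName$", data));
    }
}
```

Every loop iteration here pays for two indexOf calls, a substring, and a full String.replace pass over the text.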
With the replacement logic understood, my first impression was that this code has room for performance improvement, for two main reasons:
1. Each while-loop iteration performs two indexOf operations
2. Each while-loop iteration performs a substring operation
This raises two questions:
1. Can the indexOf and substring operations be reduced?
2. Do the template variables really need to be searched for on every call?
2. Performance Optimization
With these two questions in mind, I optimized and tested the code step by step.
The optimization went through 5 versions in total and finally achieved a performance improvement of more than 10 times. The implementation and performance comparison of each version are described below.
2.1 Performance Optimization V1
This version removes the indexOf and substring operations and replaces them with a different approach.
The previous logic loops over the template content string from beginning to end and replaces each variable found between $ characters, calling indexOf and substring continuously along the way. The new implementation first scans the template content string with two pointers to extract all variables, and only then loops over the extracted variable set to replace each variable in the template content in turn.
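The V1 idea can be sketched roughly as follows. This is an illustration under assumptions, not the article's actual code: the names are hypothetical, missing values render as an empty string, and the variable name is copied out of a char array rather than through substring.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names are hypothetical.
final class TemplateRendererV1 {

    // One scan with two pointers: 'left' remembers the opening '$',
    // 'right' advances until it meets the closing '$'.
    static Set<String> extractVariables(String template) {
        char[] chars = template.toCharArray();
        Set<String> variables = new LinkedHashSet<>();
        int left = -1; // index of the currently open '$', or -1 if none
        for (int right = 0; right < chars.length; right++) {
            if (chars[right] != '$') {
                continue;
            }
            if (left < 0) {
                left = right;
            } else {
                variables.add(new String(chars, left + 1, right - left - 1));
                left = -1;
            }
        }
        return variables;
    }

    // Replacement still relies on String.replace, once per distinct variable.
    static String render(String template, Map<String, String> data) {
        String result = template;
        for (String name : extractVariables(template)) {
            result = result.replace("$" + name + "$", data.getOrDefault(name, ""));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("availableAmount", "100");
        data.put("voucherName", "red envelope name");
        System.out.println(render("$availableAmount$ yuan $voucherName$", data));
    }
}
```

Because String.replace substitutes every occurrence of its target, a Set of distinct variable names is enough at this stage.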
2.2 Performance Optimization V2
Static template configurations generally do not change, which means the variables belonging to a given template are fixed. The template id can therefore be cached one-to-one with its variable set, so that variables no longer have to be extracted before every replacement.
Before adopting a cache, consider how it should be implemented. There are two points to note:
1. Use a local cache instead of TBase, to reduce the pressure on TBase in high-traffic scenarios
2. Bound the number of entries held in the local cache, so that cache efficiency is maximized within a limited memory footprint
The cache logic can be implemented with the cache class of the Google Guava library. See [Figure 5] for the sample code.
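The article's sample relies on Guava, and [Figure 5] is not reproduced here. To stay dependency-free, the sketch below swaps in a JDK-only stand-in: a size-bounded LRU map built on LinkedHashMap. It only illustrates the two points above (local, bounded). The class name, the entryCount helper, and the loader parameter are hypothetical and do not mirror Guava's API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// JDK-only stand-in for a bounded local cache; not the Guava-based original.
final class TemplateVariableCache {

    private final Map<String, Set<String>> cache;

    TemplateVariableCache(final int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = Collections.synchronizedMap(
            new LinkedHashMap<String, Set<String>>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Set<String>> eldest) {
                    return size() > maxEntries; // evict the least recently used entry
                }
            });
    }

    // Returns the cached variable set for templateId, computing it on a miss.
    Set<String> get(String templateId, Function<String, Set<String>> loader) {
        return cache.computeIfAbsent(templateId, loader);
    }

    int entryCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        TemplateVariableCache cache = new TemplateVariableCache(1000);
        Set<String> vars = cache.get("template-1",
            id -> Collections.singleton("voucherName")); // loader runs only on a miss
        System.out.println(vars);
    }
}
```

A production cache would typically also want expiry and hit-rate statistics, which is one reason a library cache is attractive.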
2.3 Performance Comparison (1)
After completing the two steps above, I ran a performance test. The comparison is shown in [Figure 7].
The comparison shows that V1 performs better than the original version, and that V2 with the cache performs better than V1 without it. However, as traffic increases, the benefit of the optimization gradually weakens. This indicates that the parts optimized in V1 and V2 do not account for a large share of the total template variable replacement time, and that the replacement logic contains other, more expensive points.
Looking at the variable substitution logic again, I suddenly realized that a "big problem" had been overlooked: the String.replace method, which has two expensive points:
1. Each replace call compiles a pattern (in JDK 8, String.replace(CharSequence, CharSequence) calls Pattern.compile internally)
2. Each replace call creates and returns a new String object
In addition, the result has to be reassigned to the variable after every replace.
2.4 Performance Optimization V3
Building on V2, this version removes the replace method and implements the assembly with StringBuilder.
One point needs attention in the StringBuilder implementation. In V2, variable extraction returns a Set. A Set does not guarantee that its elements follow the order of the variables in the template, and it discards duplicates, so if the same variable appears several times in the template only its first occurrence will be replaced. The extraction result therefore has to be changed to a List, which is ordered and allows duplicates, to keep the logic correct.
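The difference between a deduplicated collection and an ordered List can be illustrated with the sketch below. It assumes a sequential assembly in which each extracted variable is consumed once, left to right; the class and method names are hypothetical, and the cache lookup from V2 is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the article's actual V3 code.
final class TemplateRendererV3 {

    // Returns variables in template order, duplicates included.
    static List<String> extractVariables(String template) {
        List<String> variables = new ArrayList<>();
        int left = -1;
        for (int right = 0; right < template.length(); right++) {
            if (template.charAt(right) != '$') {
                continue;
            }
            if (left < 0) {
                left = right;
            } else {
                variables.add(template.substring(left + 1, right));
                left = -1;
            }
        }
        return variables;
    }

    // Appends literal text and values into one StringBuilder, left to right.
    static String render(String template, List<String> variables, Map<String, String> data) {
        StringBuilder sb = new StringBuilder(template.length());
        int cursor = 0;
        for (String name : variables) {
            String token = "$" + name + "$";
            int at = template.indexOf(token, cursor);
            if (at < 0) {
                continue; // variable list does not match the template: skip
            }
            sb.append(template, cursor, at);         // literal text before the token
            sb.append(data.getOrDefault(name, ""));  // assumption: missing -> ""
            cursor = at + token.length();
        }
        sb.append(template, cursor, template.length()); // trailing literal text
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("a", "1");
        data.put("b", "2");
        String template = "$a$-$b$-$a$";
        // Ordered list with duplicates: prints "1-2-1"
        System.out.println(render(template, extractVariables(template), data));
        // Deduplicated variables: prints "1-2-$a$", the second $a$ is never replaced
        System.out.println(render(template, Arrays.asList("a", "b"), data));
    }
}
```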
2.5 Performance Optimization V4
After the V3 optimization, performance improved significantly, which shows that String.replace is the most time-consuming point in the whole template variable replacement logic. V4 therefore goes back to the original method and changes only one thing: String.replace is replaced with StringBuilder.
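A single-pass variant in the spirit of V4 might look like the sketch below: it keeps the indexOf-based search of the original method but appends into a StringBuilder instead of calling String.replace. The names and the empty-string default for missing values are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the article's actual V4 code.
final class TemplateRendererV4 {

    static String render(String template, Map<String, String> data) {
        StringBuilder sb = new StringBuilder(template.length());
        int cursor = 0; // start of the text that has not been copied yet
        while (true) {
            int start = template.indexOf('$', cursor);
            int end = start < 0 ? -1 : template.indexOf('$', start + 1);
            if (end < 0) {
                break; // no complete $...$ pair left
            }
            sb.append(template, cursor, start); // literal text before the variable
            String name = template.substring(start + 1, end);
            sb.append(data.getOrDefault(name, "")); // assumption: missing -> ""
            cursor = end + 1;
        }
        sb.append(template, cursor, template.length()); // remaining literal text
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("availableAmount", "100");
        data.put("voucherName", "red envelope name");
        // prints "100 yuan red envelope name"
        System.out.println(render("$availableAmount$ yuan $voucherName$", data));
    }
}
```

The template string is never modified here, so no intermediate String is created per variable apart from the extracted name.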
2.6 Performance Comparison (2)
[Figure 11] clearly shows that after switching to StringBuilder, performance improves by more than 10 times.
V4 actually takes less time than V3 with its cache, which indicates that V3's two-step process (first extracting variables, then assembling with StringBuilder) is relatively more expensive. However, the V4 code is less readable than the V3 code. Combining V3 with V4 and removing the cache dependency yields a V5 version that offers both good readability and good performance.
2.7 Performance Optimization V5
V5 first extracts the variables as in V3 but without the cache dependency, and then assembles the result with StringBuilder instead of String.replace, which keeps the code readable.
3. Summary
Across the 5 versions above, performance improved by more than 10 times.
Ordered from fastest to slowest, the versions rank V4 > V3 > V5 > V2 > V1 > unoptimized original. V3, V4, and V5 perform significantly better than V1 and V2, which shows that the most time-consuming point of this template replacement logic is String.replace. V3 > V5 and V2 > V1 indicate that introducing the cache still helps performance to some degree. In terms of code readability, V4 is not as good as V3 and V5.
The whole optimization can be summarized in two main points:
1. The String.replace method involves pattern compilation and the creation of new strings, which makes it more resource-intensive
2. Replacing String.replace with StringBuilder not only shortens the call time but also reduces memory usage. Compared with String.replace, StringBuilder.append avoids creating and destroying a large number of intermediate String objects, which lowers GC pressure and in turn CPU load.
The obvious benefit of performance optimization is that it saves machine resources. If an application running on 2,000 servers improves its overall performance by 10%, that is in theory equivalent to saving 200 machines. Beyond saving resources, a well-performing application is less likely to hit the machine's performance bottleneck during sudden traffic surges, and it needs fewer additional machines when capacity has to be expanded for the same traffic. Capacity expansion and emergency operations can therefore be completed more quickly, which makes a well-performing application more stable than a poorly performing one.
Finally, back to the topic of this article: what makes 20 lines of code 10 times faster?
My answer is: StringBuilder, yyds (the greatest of all time)!