
Performance optimization of mobile Taobao

Posted time: Jan 12, 2016 17:24

by Rayman, from Alibaba Technology Associate (ATA)

To meet the diverse shopping needs of different users, mobile Taobao has evolved from a simple shopping tool into a content platform as its business has expanded over the past two years. Such rapid growth has come with side effects: many links and pages are slow to open because they carry too many functions, and users have to wait a long time. Performance optimization is therefore necessary.

Based on the shopping procedures of mobile Taobao users, we have divided the main purchase process into seven steps: launch, homepage loading, search, shopping cart, placing orders, payment and viewing orders. Every step and module is monitored so that the optimization is based on quantified data. The following sections introduce three of these procedures (launch, homepage loading and shopping cart) and the optimization of two basic functions (network tuning and photo download).

I. Launch optimization
Based on offline analytical tools, online gray data and code review, we found three reasons for a slow launch: (1) many   are activated during launch; (2) there are unnecessary time-consuming blocking operations in the main thread during launch; (3) some lock operations in the main thread result in a long waiting time. So we’ve come up with the following optimization scheme:
1) Net-less launch + cache
No network interaction takes place between the user tapping the app icon and the first homepage being displayed; during that time all data comes from the cache or from pre-set data. Network IO is time-consuming, so we save time by deferring the data pulls that used to happen during launch and homepage display.
2) Task classification and asynchronization
We’ve classified all launch tasks and adjusted the time of executing them based on their classes.
Level 1 refers to tasks that block launch, including initialization of the basic SDK and creation of the homepage.
Level 2 includes tasks that can be executed after the homepage has been successfully loaded, such as automatic login, micro Taobao status, number of messages, configuration information and operational data.
Only Level-1 tasks are executed during launch (there should be no lock operation and the main thread is not blocked) while Level-2 tasks can be executed after launch is completed.
3) Lazy load
Previously the launch process involved the initialization of many business operations, but now we apply lazy load to them and only initialize them at the time of use. Lazy load can work with cache or pre-set data to achieve better results.
Mobile Taobao has five tabs. Before optimization, the pages for all five were created during launch, although only the homepage is visible. Creating them costs time and involves business logic such as sending requests, which compromises launch performance. We’ve therefore applied lazy load to them: only the visible homepage is created during launch, and the other pages aren’t created until the user taps the corresponding tab. This process saves about 0.5s.
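The lazy creation of tab pages described above can be sketched roughly as follows. This is a minimal illustration in Python, not mobile Taobao's actual code; the `TabHost` class and the tab names used with it are hypothetical.

```python
class TabHost:
    """Hypothetical tab container that builds a page only on first use."""

    def __init__(self, factories, first_tab):
        self._factories = factories      # tab name -> function that builds the page
        self._pages = {}                 # pages created so far
        self.select(first_tab)           # only the visible tab is built at launch

    def select(self, tab):
        if tab not in self._pages:       # first visit: create the page now
            self._pages[tab] = self._factories[tab]()
        return self._pages[tab]

    def created_tabs(self):
        """Names of the tabs whose pages exist, in alphabetical order."""
        return sorted(self._pages)
```

With five factories registered, constructing the host builds only the first tab; every other page is built the first time `select` is called for it and is reused afterwards.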
With the three main optimization strategies mentioned above, the mobile Taobao launch process flow is shown in Figure 1 (applicable for Android). The figure shows that in the main thread, we only keep the necessary initialization in the launch period and apply lazy load or asynchronization to all unnecessary operations, and we’ve achieved the goal of optimization.
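As a rough illustration of the Level-1 / Level-2 split, the sketch below runs launch-blocking tasks in order on the calling thread and hands the rest to worker threads once the homepage is shown. It is written in Python for brevity and is not the app's real launcher; the `Launcher` class and the task names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


class Launcher:
    """Hypothetical launcher: Level-1 tasks block launch, Level-2 tasks run afterwards."""

    def __init__(self):
        self.level1 = []   # must finish before the homepage is shown
        self.level2 = []   # may run after the homepage is shown
        self.log = []      # records (task name, phase) for illustration

    def register(self, name, func, level):
        (self.level1 if level == 1 else self.level2).append((name, func))

    def launch(self):
        # Level 1: run in order on the calling ("main") thread.
        for name, func in self.level1:
            func()
            self.log.append((name, "during launch"))
        self.log.append(("homepage shown", "launch complete"))
        # Level 2: hand off to worker threads once launch is complete.
        # The sketch waits for them only so the log is easy to inspect;
        # a real app would not block the main thread here.
        with ThreadPoolExecutor(max_workers=2) as pool:
            futures = [(name, pool.submit(func)) for name, func in self.level2]
            for name, future in futures:
                future.result()
                self.log.append((name, "after launch"))
        return self.log
```

Registering, say, an SDK initialization as Level 1 and an automatic login as Level 2 yields a log in which the login always appears after the "homepage shown" entry, regardless of registration order.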

II. Homepage optimization
The homepage is the page with the largest exposure, so speed, stability and timely updates are its three main goals.
Content displayed on the homepage is divided into four types:
  1. The entry, icon, title and position of secondary pages are relatively fixed.
  2. There are usually six rolling photos on top, as entries to different modules or activities.
  3. Recommended goods and stores based on user ID.
  4. Entry to the message box on top with the number of unread messages.
The four types of content are dealt with in different ways:
For types 1 and 2, a lot of entry text and icons are cached locally. Content cached previously is displayed first, so even if there is no Internet connection, the overall homepage framework, cached photos and text can be drawn.
Locally cached content carries an expiry time stamp. If, after the cached content has been drawn, the system determines that the cache has expired, fresh content is fetched asynchronously and the page refreshes once the download is complete.
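The "draw the cache first, refresh if expired" flow can be sketched as below. This is a simplified Python illustration under stated assumptions: the `HomepageCache` class, the `show_homepage` function and the `draw` / `fetch_async` callbacks are hypothetical names, and the expiry rule is a plain time-to-live.

```python
import time


class HomepageCache:
    """Hypothetical cache that stores each entry with the time it was saved."""

    def __init__(self, ttl_seconds, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}               # key -> (content, time stored)

    def put(self, key, content):
        self._entries[key] = (content, self._clock())

    def get(self, key):
        """Return (content, expired); content is None when nothing is cached."""
        if key not in self._entries:
            return None, True
        content, stored_at = self._entries[key]
        return content, self._clock() - stored_at > self._ttl


def show_homepage(cache, key, draw, fetch_async):
    content, expired = cache.get(key)
    if content is not None:
        draw(content)                    # draw cached content first, even offline
    if expired:
        def on_fresh(fresh):             # assumed to be called when the download ends
            cache.put(key, fresh)
            draw(fresh)                  # refresh the page with the new content
        fetch_async(on_fresh)
```

The point of the design is that drawing never waits for the network: the fetch is started only after the cached content is already on screen.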

For Type 3, the page structure and hierarchy are optimized: recommended goods sit at the bottom of the page and are drawn only when the user proactively scrolls the page up, so as to avoid pulling too much content at one time.
For Type 4, lazy load is applied, meaning delayed processing. After other content on the homepage is drawn, the interface is called to pull the number of unread messages.
Task asynchronization is also applied to the homepage. Because mainstream cellphones all have multi-core processors, some tasks can be completed on background threads. For instance, the homepage carries a huge amount of data, and loading, parsing and assembling it is very time-consuming. By moving these operations off the main thread, we keep the main thread free for normal dispatching and only render and update the UI there once all data is ready, which keeps the main thread smooth.
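A minimal sketch of this pattern, in Python and under stated assumptions: a queue stands in for the UI thread's message queue, and the payload shape (`items` with `title` and `price`) and all function names are hypothetical.

```python
import json
import queue
from concurrent.futures import ThreadPoolExecutor

main_thread_tasks = queue.Queue()        # stand-in for the UI thread's message queue


def parse_and_assemble(raw):
    """Time-consuming work: parse the payload and build display strings."""
    items = json.loads(raw)["items"]
    return [f"{item['title']} ({item['price']})" for item in items]


def load_homepage(raw, render, pool):
    def work():
        view_models = parse_and_assemble(raw)                 # runs on a worker thread
        main_thread_tasks.put(lambda: render(view_models))    # UI update is posted back
    return pool.submit(work)


def drain_main_thread():
    """Run everything posted to the main thread (the UI loop would do this)."""
    while not main_thread_tasks.empty():
        main_thread_tasks.get()()
```

Nothing is rendered until `drain_main_thread` runs, which mirrors the rule that the UI is only touched on the main thread after the data is ready.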

III. Shopping cart optimization
The shopping cart has become users’ “second favorite” option and many continuously update their cart via different devices (PC, cellphone). Caching the data locally and presenting it to users promptly is therefore necessary to speed up opening the shopping cart.
But goods in the shopping cart can be grouped differently according to activities, stores and the number of single goods. Adding or deleting an item on one device (Terminal A) can change the total price in complex ways, and the client cannot obtain the calculation rules in real time, so the server side calculates the price and delivers it.
Therefore, the price shown in the shopping cart isn't just a simple equation of “unit price x number of goods = total price”. Whenever the client updates, changes in the number of goods as well as in the total price are pulled from the server.

In the past, refreshing the page triggered a full update; now it’s optimized to a partial update, which not only reduces data traffic but also speeds up the pull and the refresh.
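A partial update can be pictured as merging a small server delta into the locally cached cart. The Python sketch below is only an illustration; the data shape and the `apply_partial_update` function are assumptions, not the real protocol.

```python
def apply_partial_update(cart, delta):
    """Merge a server delta into the locally cached cart.

    `cart` and `delta` use a hypothetical shape:
    {"items": {item_id: quantity}, "total": price computed by the server}.
    A quantity of 0 in the delta means the item was removed.
    """
    items = dict(cart["items"])          # copy so the cached cart is not mutated
    for item_id, quantity in delta.get("items", {}).items():
        if quantity == 0:
            items.pop(item_id, None)
        else:
            items[item_id] = quantity
    # The total always comes from the server; the client never recomputes it.
    return {"items": items, "total": delta.get("total", cart["total"])}
```

Only the changed items and the new total travel over the network, and the client takes the total as given instead of multiplying unit prices by quantities.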

The sections above describe optimization methods for specific business flows. Next we’ll introduce the optimization approaches for basic services.

IV. Network optimization
We all know that backbone network transmission is very time-consuming, in that it involves DNS lookup, TCP/TLS handshake and data transmission. The key to network optimization is saving time in each of those three aspects.
1) IP direct connection:
This is done with HTTP DNS. After launch is complete, the app sends an HTTP request to obtain the domain name-IP mapping table used by mobile Taobao and caches it locally. Every time a connection is initiated, the domain name is replaced by an IP on the network layer for direct connection. This not only saves DNS time, but also protects mobile Taobao from service outages caused by attacks on DNS servers from the public network. Such a situation happened once in 2013, but mobile Taobao wasn’t affected.

2) Establish a persistent connection:
This is realized through SPDY. Fewer TCP/TLS handshakes lower the connection cost, which considerably speeds up photo downloading through CDN.
3) Domain name convergence:
Converge the domain name to the company’s main CDN domain name.
Concentrate the requests under a few domain names to raise the reuse rate of a persistent connection.
4) TCP optimization:
A wireless network is characterized by a high packet loss rate and a long round-trip time (RT), so targeted TCP optimization can be carried out in the following way:
Enlarge the initial congestion control window, turn off idle slow-start and dynamic MSS, turn off TCP DF and remove the TCP time stamp.
5) Message shortening:
Gradually shift from the JSON format to the PB (Protocol Buffers) format.
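The IP direct connection in 1) above can be sketched as a local lookup that swaps the domain name for a cached IP. This Python sketch works at the URL level for simplicity, whereas the article performs the replacement on the network layer; the `HttpDnsCache` class is hypothetical, and the example domain and IP below are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


class HttpDnsCache:
    """Hypothetical local cache of the domain name -> IP mapping table."""

    def __init__(self):
        self._table = {}

    def update(self, mapping):
        # In the article this table is fetched over HTTP after launch.
        self._table.update(mapping)

    def rewrite(self, url):
        """Return (url with the host replaced by its IP, headers to send)."""
        parts = urlsplit(url)
        ip = self._table.get(parts.hostname)
        if ip is None:
            return url, {}               # unknown domain: fall back to normal DNS
        port = "" if parts.port is None else f":{parts.port}"
        # Keep the original domain in the Host header so the server can route it.
        headers = {"Host": f"{parts.hostname}{port}"}
        return urlunsplit(parts._replace(netloc=f"{ip}{port}")), headers
```

For example, with `{"img.example.com": "203.0.113.7"}` cached, `rewrite("http://img.example.com/a.jpg")` returns the URL `http://203.0.113.7/a.jpg` plus a `Host: img.example.com` header. With TLS, certificate verification would still have to use the original domain name, which is outside this sketch.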

V. Photo optimization
Photos are the most-used element in e-commerce apps. Downloading and rendering photos in a quick and efficient way is a concern of all e-commerce apps. In addition to the persistent connection and domain name convergence mentioned above, mobile Taobao also provides photo classification.
Derivative files in different combinations are created for the same photo in four dimensions: resolution, quality, sharpening and format.
A series of matching rules have been developed for different screens, different handling capacity and different network environments, so that users’ visual experience can be guaranteed with an appropriate photo size and quality.
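The article does not list the matching rules, so the Python sketch below only illustrates the idea of choosing one derivative along the four dimensions. Every threshold, network label and format name in it is an assumption, as is the `pick_derivative` function itself.

```python
def pick_derivative(screen_width_px, network, supports_webp):
    """Pick one derivative of a photo; every rule and value here is illustrative."""
    # Resolution: smallest standard width that still covers the screen.
    widths = [320, 480, 720, 1080]
    width = next((w for w in widths if w >= screen_width_px), widths[-1])
    # Quality: lower on slow networks to cut download time.
    quality = {"2g": 50, "3g": 75}.get(network, 90)
    if network == "2g":
        width = min(width, 480)          # also cap resolution on the slowest network
    # Illustrative rule: sharpen whenever the quality has been reduced.
    sharpen = quality < 90
    fmt = "webp" if supports_webp else "jpg"
    return {"width": width, "quality": quality, "sharpen": sharpen, "format": fmt}
```

The client would then request the derivative that matches the returned combination instead of the original photo.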

A highlight is sharpening: a photo is sharpened until its color and clarity are satisfactory to users, even when the original image is of poor quality.

VI. Good tools are essential
System tools helped us a lot during the launch optimization of mobile Taobao, including Android’s TraceView and, on iOS, Time Profiler and System Trace in Instruments. As tools for data collection and analysis, they are powerful for finding hotspots in applications. How to use them is beyond the scope of this article, but they give a full picture of how threads are used, and for every call on a thread they show details such as the stack trace and the time consumed. By analyzing those calls, we can find the more time-consuming ones during launch and, once we identify the specific bottlenecks, carry out targeted optimizations.

For instance, during the launch of mobile Taobao on Android, there used to be an encrypted storage module that called SecretKeyFactory.getInstance() to generate an encryption key. With TraceView, we discovered that this call took more than 300ms because a lock operation exists in its stack. Having found this bottleneck, we studied many methods of encrypted storage for mobile Taobao on Android and eventually found a lightweight module for the optimization.
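Profilers such as TraceView do this measurement for you. Purely as an illustration of how per-step timing data could feed a monitoring system, the Python sketch below records the duration of each launch step and lists those above a threshold; the `LaunchTimer` class and the step names are hypothetical.

```python
import time
from contextlib import contextmanager


class LaunchTimer:
    """Hypothetical helper that records how long each launch step takes."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.durations_ms = {}

    @contextmanager
    def step(self, name):
        start = self._clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the step raises.
            self.durations_ms[name] = (self._clock() - start) * 1000

    def slow_steps(self, threshold_ms):
        """Names of steps slower than the threshold, slowest first."""
        slow = [(ms, name) for name, ms in self.durations_ms.items() if ms > threshold_ms]
        return [name for ms, name in sorted(slow, reverse=True)]
```

Wrapping a step in `with timer.step("key_generation"):` and then calling `timer.slow_steps(300)` would flag a step like the 300ms key generation described above.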

Summary - seven principles
1. Make good use of performance analysis tools and establish a monitoring system
2. Carry out sound network construction and optimization
3. Offline operation, local cache
4. Lazy load
5. Classify tasks and make them parallel as appropriate
6. Remove unnecessary operations in the main thread
7. Simplify and integrate complicated views
[Cloudy edited the post at Jan 12, 2016 18:25 PM]