Best practices for full-stack application performance optimization and migration on the Yitian platform

01 Overview of the Yitian ECS application ecosystem

Yitian ECS currently supports a rich open source ecosystem. For developer tools, it supports compilers such as GCC and LLVM, and languages such as Java (OpenJDK, GraalVM) and Python. For operating systems and basic libraries, it supports glibc, DPDK, jemalloc, OpenSSL, and other libraries, as well as Ubuntu and other operating systems. For containers and virtualization, it supports secure containers, Kubernetes (K8s), Docker, KVM, and more.

The figure above shows Yitian ECS support from the perspective of the application ecosystem. For operating systems, Yitian ECS supports the LTS releases of Anolis OS from the OpenAnolis community and its customized derivative, Alibaba Cloud Linux. For virtualization and containers, it supports Dragonfly Hypervisor, ACK, and Kangaroo, Alibaba Cloud's sandboxed container. For toolchains and languages, it supports Alibaba Cloud Compiler (LLVM-based), Alibaba Dragonwell, Noslate Anode, and APython. For middleware and workloads, it supports MySQL, Flink, TensorFlow, Spark, and others.

Two questions matter most when adopting Yitian ECS. First, how can an application migrate smoothly to Yitian ECS? Are the OS, compilers, basic libraries, and open source components well adapted? Are there tools to support the migration? After migration, is there mature cloud-native management software? How are software versions managed across multiple architectures?

Second, how much benefit can an application gain on Yitian ECS? How fast can it run?

Yitian ECS achieves whole-stack performance optimization by optimizing the basic components of the software stack, second- and third-party libraries, and general-purpose workloads. With the relevant tools and best practices, it also supports tuning for customer-specific application scenarios.

02 Full-stack application performance optimization practice

As shown in the figure above, full-stack application optimization spans Yitian ECS, the operating system, the compiler/runtime, and applications. There are two main ways to achieve it: an outside-in approach and a layered approach.

In the outside-in approach, Alibaba Cloud establishes a baseline from real workloads and then iterates on optimizations against that baseline.

In the layered approach, Alibaba Cloud's operating system, compiler, middleware, and PaaS teams each optimize their own layer around the Yitian 710 chip. For example, specific operating system and compiler versions were optimized for the characteristics of the Yitian architecture.

At present, the basic software versions recommended by Alibaba Cloud include Alibaba Cloud Linux (3), Alibaba Cloud Compiler (13), and Alibaba Dragonwell (11). Application tool support includes KeenTune, EMT4J, Jifa, Perf (x).

Alibaba Cloud Linux offers four advantages on Yitian ECS.

First, Alibaba Cloud Linux was the first to provide comprehensive support for Yitian. It works out of the box and comes with ticket-based support and ten years of maintenance.

Second, Alibaba Cloud Linux has been proven in Alibaba's Double 11 scenario and effectively supports Alibaba Cloud databases, container services, and other cloud products.

Third, with Alibaba Cloud Linux as the base, we carried out full-stack application optimization across hardware, kernel, compiler, and runtime to greatly improve performance.

Fourth, Alibaba Cloud Linux is progressively being opened to the OpenAnolis community and to upstream communities.

Next is Alibaba Cloud Compiler, a C/C++ compiler suite. It has been comprehensively optimized for the Yitian 710 chip: it offers better support for SVE instructions, includes microarchitecture-level optimizations for the Yitian 710, and supports the latest C++20 features on the chip, such as coroutines and modules.

In addition, Alibaba Cloud Compiler targets Alibaba Cloud products and serves customers on the cloud. A single compiler suite supports both the x64 and AArch64 architectures. It builds and compiles 15% to 40% faster than GCC, and its easier-to-use compilation optimizations deliver a 5% to 15% performance improvement over GCC.

Alibaba Dragonwell, whose Chinese name is "Longjing", was open-sourced in 2019. As shown in the figure above, Alibaba Dragonwell has been optimized iteratively across versions on the Yitian chip: on SPECjbb2015, the throughput of version 11.0.11.6 is 58% higher than that of 11.0.8.3. It also scales better under multi-core conditions.

Next is KeenTune, a one-click tuning tool. KeenTune combines business scenarios and cloud VM specifications to produce the best-performing tuning profiles, and it can tune kernel parameters, application configuration, and more in a one-click, full-stack way.

As shown in the figure above, the basic software of Yitian ECS has been optimized in four directions: Workload Profiling Driven optimization, architecture difference optimization, compilation optimization, and concurrency optimization.

◾ Workload-profiling-driven optimization: based on workload characteristics, we applied targeted optimizations such as code huge pages, XPS, kernel scheduling, and ext4 fast commit.

◾ Architecture difference optimization: this mainly covers TLBI, the new instruction set, the code cache, registers, and so on. On the instruction set side, concurrent multithreading is typical of modern general-purpose workloads, and LSE instructions can effectively improve their performance on many cores.

◾ Compilation optimization: traditional compiler optimization techniques such as FDO/PGO and LTO are used.

◾ Concurrency optimization: Alibaba Cloud has done extensive weak-memory-model optimization in the Java virtual machine, as well as multithreading optimization for CAS and locks.
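
Whether a given instance actually exposes the new instructions mentioned in the list above can be checked from user space. The sketch below is a minimal illustration, not part of any Alibaba Cloud tool: it parses the `Features` line that Linux prints in `/proc/cpuinfo` on arm64, where the `atomics` flag indicates LSE and `sve` indicates SVE. The function names and the `SAMPLE` text are hypothetical and exist only for this example.

```python
from __future__ import annotations

from pathlib import Path


def parse_cpu_features(cpuinfo_text: str) -> set[str]:
    """Collect the flags from every 'Features' line of an arm64 /proc/cpuinfo dump."""
    features: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            features.update(value.split())
    return features


def report(features: set[str]) -> dict[str, bool]:
    # "atomics" is the Linux flag name for LSE; "sve" marks the Scalable Vector Extension.
    return {"lse": "atomics" in features, "sve": "sve" in features}


# Hypothetical, shortened cpuinfo excerpt used when /proc/cpuinfo is unavailable.
SAMPLE = (
    "processor\t: 0\n"
    "BogoMIPS\t: 100.00\n"
    "Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp sve\n"
)

if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else SAMPLE
    print(report(parse_cpu_features(text)))
```

On an x86 machine the file has no `Features` line, so both values come back `False`; that is expected and simply means the check only makes sense on arm64 nodes.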

As shown in the figure above, Yitian ECS achieved performance improvements of 20% or more in database, big data, and web scenarios: 33% in the C++ RDS MySQL scenario, 30% in the Java Flink scenario, 43% in the Web-tooling/Node.js scenario, and 20% in the PHP WordPress scenario.

03 Cross-architecture migration

Next is full software lifecycle support for cross-architecture migration. Alibaba Cloud's cross-architecture migration solution currently covers the whole process: source code, build, test, production, and go-live.

At the source code stage, users can use tools to check architecture compatibility and code health. At the build stage, we help users compile across architectures and integrate optimized basic libraries, open source libraries, and frameworks. At the test stage, Alibaba Cloud draws on a large number of practical cases to help users check build sanity and software version dependencies. At the production stage, users get support for checking production parameters and software versions, along with online troubleshooting tools.
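
One simple build sanity check is to confirm that a compiled artifact really targets the intended architecture. The sketch below is an illustrative assumption about what such a check could look like, not a description of Alibaba Cloud's tooling: it reads the `e_machine` field of an ELF header, using the standard ELF constants for x86-64 (62) and AArch64 (183). The helper names `elf_machine` and `check_artifact` are hypothetical.

```python
import struct

# e_machine values defined by the ELF specification
EM_X86_64 = 62
EM_AARCH64 = 183
MACHINE_NAMES = {EM_X86_64: "x86_64", EM_AARCH64: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture recorded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, right after e_type
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return MACHINE_NAMES.get(machine, f"unknown({machine})")


def check_artifact(path: str, expected: str) -> bool:
    """Check that the binary at `path` was built for the expected architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20)) == expected
```

For example, `check_artifact("./myapp", "aarch64")` (with `./myapp` standing in for a real build output) could run in a CI step before an image is pushed for the Arm node pool.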

The figure above shows how ACK manages multiple CPU architectures. A single ACK cluster manages both an x86 node pool and an Arm node pool, and Alibaba Cloud's container image registry, ACR, fully supports multi-architecture images.

In the ACK cloud-native environment, images matching each node's CPU architecture are managed and pulled automatically. In addition, user services can migrate smoothly between the x86 and Arm architectures, with configurable traffic between them.
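
The idea behind architecture-aware image pulls can be sketched as follows. This is a simplified illustration of how a multi-architecture image index is resolved in general, assuming the OCI image index layout (`manifests[].platform.architecture`); it is not ACK's or ACR's actual implementation, and the digests in `SAMPLE_INDEX` are placeholders.

```python
import platform

# Map names returned by platform.machine() to OCI/Docker architecture names
MACHINE_TO_OCI = {
    "x86_64": "amd64",
    "AMD64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def oci_arch(machine: str = "") -> str:
    """Translate a machine name (default: the current host) to an OCI architecture."""
    machine = machine or platform.machine()
    if machine not in MACHINE_TO_OCI:
        raise ValueError(f"unsupported machine type: {machine}")
    return MACHINE_TO_OCI[machine]


def select_manifest(index: dict, arch: str, os_name: str = "linux") -> str:
    """Return the digest of the entry in an image index that matches arch and OS."""
    for entry in index.get("manifests", []):
        plat = entry.get("platform", {})
        if plat.get("architecture") == arch and plat.get("os") == os_name:
            return entry["digest"]
    raise LookupError(f"no image for {os_name}/{arch}")


# Hypothetical image index with placeholder digests
SAMPLE_INDEX = {
    "manifests": [
        {"digest": "sha256:amd64-placeholder",
         "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:arm64-placeholder",
         "platform": {"architecture": "arm64", "os": "linux"}},
    ]
}
```

Calling `select_manifest(SAMPLE_INDEX, oci_arch("aarch64"))` picks the Arm entry, which is the same decision a container runtime makes on each node when one image tag covers several architectures.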

To summarize: Yitian ECS supports a rich set of software and tools and provides extensive documentation. Users can choose ECS instances, operating systems, toolchains, and containers according to their needs, and tools and solutions are available to support application migration and troubleshooting.

At present, g8y, the eighth-generation Arm-based ECS instance, is open for testing at the following address:

https://www.aliyun.com/daily-act/ecs/ecs_yitian

04 Major releases

Next, we introduce Cross Platform Migration Scanner, a new source code scanning tool; the out-of-the-box intelligent optimization solution for Yitian ECS; and the developer guide for the developer community.

Cross Platform Migration Scanner supports multiple languages and checks cross-architecture compatibility. It scans code at the source code stage to find compatibility problems.
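
To make the idea of a source-stage compatibility scan concrete, here is a toy sketch. It is not the Cross Platform Migration Scanner and makes no claim about how that tool works; the rule set is a hypothetical, deliberately small list of patterns that commonly indicate x86-only C/C++ code (SIMD intrinsic headers and calls, inline assembly, x86 preprocessor guards).

```python
from __future__ import annotations

import re

# Hypothetical rule set: (pattern, message) pairs for x86-specific constructs
RULES = [
    (re.compile(r"#\s*include\s*<(?:x86intrin|immintrin|emmintrin|xmmintrin)\.h>"),
     "x86 SIMD intrinsics header"),
    (re.compile(r"\b_mm(?:256|512)?_\w+\s*\("), "x86 SIMD intrinsic call"),
    (re.compile(r"\b(?:__asm__|asm)\b"), "inline assembly"),
    (re.compile(r"\b__(?:x86_64|i386)__\b"), "x86-only preprocessor guard"),
]


def scan_source(text: str) -> list[tuple[int, str]]:
    """Return (line number, message) for every line that matches a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, message in RULES:
            if pattern.search(line):
                findings.append((lineno, message))
    return findings


# Small hypothetical C++ snippet to scan
SAMPLE = (
    "#include <immintrin.h>\n"
    "int sum(int a, int b) { return a + b; }\n"
    "__m128 v = _mm_set1_ps(1.0f);\n"
)

if __name__ == "__main__":
    for lineno, message in scan_source(SAMPLE):
        print(f"line {lineno}: {message}")
```

A line-based regex pass like this produces false positives (for example, matches inside comments) and misses anything hidden behind macros, which is why a real scanner would need language-aware parsing.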

The figure above shows the architecture of the ECS Booster intelligent optimization solution. ECS Booster automatically applies optimization parameters for different scenarios, such as web, database, cache, video, and AI, and works out of the box.

The commercial version of Yitian ECS will officially launch on November 15, and the Yitian ECS developer guide will be released as open source on GitHub (https://github.com/aliyun/yitian-ecs-getting-started).

The getting-started guide publishes containerized optimization results as images and provides documentation to help you migrate between architectures. In addition, the Yitian ECS developer guide provides a rich set of analysis tools to help users solve different performance problems.
