Cloud computing, do you use it in a correct way? - Alibaba Cloud Developer Forums: Cloud Discussion Forums

[Share]Cloud computing, do you use it in a correct way?

Posted time: Apr 5, 2017 13:26
Explosive growth of cloud computing
First, let's look at some figures: Alibaba Group's financial report showed that Alibaba Cloud had over 2.3 million users as of March 31, 2016, and that its revenue for the first quarter of 2016 was 1.066 billion yuan, up 175% year on year.
These figures reflect the current IT market: adoption of cloud computing keeps rising, and more users are accepting and using it, which is a good thing. However, having worked in cloud computing for many years and provided cloud services to many customers, I have come to feel that this market still has room for improvement.
First, let's look at who uses cloud computing:
Individual webmasters and developers
Micro, small and medium enterprises (non-internet)
Internet enterprises
Medium and large traditional enterprises
Government departments at all levels
Education departments and scientific research institutions
These users' understanding of cloud computing falls roughly into three categories:
The “No” school: They turn pale at the mention of cloud computing, and consider it unreliable and unsafe;
The blind school: They do not fully understand cloud computing, but believe it can solve every problem;
The powerhouse school: They embrace cloud computing fully, and are able to combine it systematically with their business.
How can we make good use of cloud computing
The major cloud vendors are opening up their best IT technologies one after another in the form of cloud services. Users who are preparing to adopt cloud technologies, or who already use them, should prepare at three levels in order to embrace cloud computing and improve the service capability of their IT business:
1. Psychological expectation level
Users should hold reasonable expectations of the major cloud computing platforms. Compared with the traditional IDC, cloud computing has indeed substantially improved technical and service capabilities, letting users obtain quality IT capability easily and quickly and build a pay-as-you-go IT architecture with elastic scaling.
But cloud vendors are neither gods nor all-powerful, and they cannot guarantee 100% availability for every cloud service. Take ECS, the most-used cloud server product on Alibaba Cloud, as an example: its official availability figure is 99.95%, which means ECS may be interrupted for up to about 0.1825 days a year (0.05% of 365 days), or 4.38 hours. From what I have found, the availability promised by cloud server providers such as Alibaba Cloud, Tencent Cloud, Amazon AWS and Microsoft Azure is 99.95% in every case.
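The downtime arithmetic above is easy to redo for other availability levels. Here is a minimal sketch; the function name is my own, and it assumes a 365-day year, as the figures above do.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: a non-leap year, matching the 0.1825-day figure


def yearly_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service can be down and still meet its availability figure."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * DAYS_PER_YEAR * HOURS_PER_DAY


if __name__ == "__main__":
    for level in (99.9, 99.95, 99.99):
        print(f"{level}% available -> {yearly_downtime_hours(level):.2f} hours of downtime per year")
```

For 99.95% this gives the same 4.38 hours quoted above. Each extra "nine" cuts the allowed downtime by a factor of ten, which is why the last fraction of a percent is so expensive to promise.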
The major cloud platforms now operate at scale, with data centers spread across major cities and even around the world. Each data center typically holds tens of thousands of servers, and the network connections between data centers are extremely complex. In such an enormous infrastructure, a small fault can easily trigger a butterfly effect. Natural and man-made disasters are unavoidable: floods, fires, power failures, earthquakes and even engineers' operating mistakes can all happen. Does that mean cloud computing is unreliable too, or that these problems cannot be solved? Not at all. Cloud computing still gives users certain guarantees and technical means at this level, which I will explain in the following sections.
Users frequently ask how much concurrency a cloud platform can support for their applications. This is actually a very unprofessional question, like walking into a doctor's office and asking what is wrong with you before you have even sat down.
System performance is like the human body: it is an overall result of many organs working together. Besides network and server performance, the factors that affect performance include data structures, database design, SQL writing, cache design, the use of middleware such as message queues, and further levels such as the structure of key code and the framework design. So to understand how a system will perform after it is migrated to the cloud, you first need a reasonable framework design, backed by professional performance tests and evaluations. Performance optimization is also a systematic project that is rolled out step by step as the business develops, and it requires cooperation from many sides rather than being a simple individual task.
So do not blindly count on the cloud platform alone for system performance. The cloud platform can help to a certain extent, but the core optimization work still depends on improving your own technical capacity.
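To make the point about performance tests concrete: concurrency is something you measure for your own application, not something the platform can tell you. Below is a rough sketch of such a measurement using only the Python standard library. `handle_request` is a hypothetical stand-in that just sleeps; in a real test you would call your own service there, and you would probably use a dedicated load-testing tool instead.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(_: int) -> None:
    """Hypothetical stand-in for one request to your application."""
    time.sleep(0.01)


def timed_call(request_id: int) -> float:
    """Run one request and return how long it took, in seconds."""
    start = time.perf_counter()
    handle_request(request_id)
    return time.perf_counter() - start


def measure(concurrency: int, total_requests: int) -> dict:
    """Send total_requests using `concurrency` worker threads and summarize the run."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, range(total_requests)))
    elapsed = time.perf_counter() - started
    # Nearest-rank 95th percentile, clamped to a valid index.
    p95_index = min(len(latencies) - 1, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "concurrency": concurrency,
        "throughput_rps": total_requests / elapsed,
        "p95_latency_s": latencies[p95_index],
    }


if __name__ == "__main__":
    for workers in (1, 5, 20):
        print(measure(workers, 50))
```

Running the same measurement at several concurrency levels shows where throughput stops growing or latency starts climbing, and that turning point depends on your code and data design at least as much as on the cloud servers.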
2. Application architecture
I have talked with many cloud computing users and provided services for them, and I have found that many of them do not make full use of the advantages of cloud computing. In summary, the following problems are common:
The cloud server is regarded as the entire cloud computing
The cloud server is a very important cloud computing service, but it is just one of more than 100 cloud products. Unless the application is really very simple, the cloud server should not do all the work, such as hosting the database, message queue and cache components; doing so is unreasonable in terms of performance, scalability and management.
The concepts of pay-as-you-go and on-demand access are not accepted
Many users have not changed their conventional thinking. Whatever the application, they like to buy highly configured cloud resources, which wastes both resources and money. On-demand access and elastic scaling are both core features of cloud computing, and you should fully understand and make use of them.
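A rough comparison shows why buying for the peak is wasteful. The sketch below compares a fixed, peak-sized configuration with capacity that follows demand hour by hour. Every number in it (the price, the per-instance capacity and the daily load curve) is made up for illustration and is not a real Alibaba Cloud price.

```python
import math

# All figures are hypothetical, chosen only to illustrate the comparison.
PRICE_PER_INSTANCE_HOUR = 0.5   # made-up currency units
REQUESTS_PER_INSTANCE = 1000    # made-up hourly capacity of one instance

# Hypothetical requests per hour over one day: quiet night, working day, evening peak.
hourly_load = [200] * 8 + [1500] * 10 + [6000] * 4 + [800] * 2


def instances_needed(load: int) -> int:
    """Smallest number of instances that can serve the given hourly load (at least one)."""
    return max(1, math.ceil(load / REQUESTS_PER_INSTANCE))


def fixed_cost(loads: list) -> float:
    """Cost of running enough instances for the peak hour all day long."""
    peak_instances = max(instances_needed(load) for load in loads)
    return peak_instances * len(loads) * PRICE_PER_INSTANCE_HOUR


def elastic_cost(loads: list) -> float:
    """Cost when the instance count follows the load hour by hour."""
    return sum(instances_needed(load) for load in loads) * PRICE_PER_INSTANCE_HOUR


if __name__ == "__main__":
    print("fixed (sized for peak):", fixed_cost(hourly_load))
    print("elastic (pay as you go):", elastic_cost(hourly_load))
```

With this made-up load curve the peak-sized setup pays for 144 instance-hours a day while the elastic one pays for 54. The real ratio depends entirely on how spiky your traffic is, so it is worth plugging in your own numbers.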
The application architecture is not cloud-based, and a distributed, highly available cloud architecture has not been fully implemented
This is the core problem. After migrating to a cloud platform, the two things users care about most are data reliability and application availability (uninterrupted service). Taking Alibaba Cloud as an example, I have the following suggestions:
Make reasonable use of cloud platform components such as Server Load Balancer, the database service RDS, the distributed database service DRDS, the object storage service OSS, the message queue service, the cache service, and the log service. Compared with self-built systems, they are more efficient, perform better, scale further, are easier to maintain and are more available, provided that you adapt your application system somewhat. The changes mainly involve the code that calls the components' APIs, not the logic code of the core business, so the scope of change is controllable.
How to guarantee data reliability: First, if you use Alibaba Cloud's object storage service OSS, it is triple-redundant by design. One copy of the data is placed in a different cabinet in the same AZ (availability zone), and another copy is placed in another AZ under the same region. Its data reliability is therefore as high as 99.99999999%, which basically means the data will not be lost. Of course, if you are still anxious, you can also store backups in different regions. If you use cloud data disks, you can set snapshot rules as needed, and Alibaba Cloud will automatically take snapshots of the cloud data disk for you. These snapshots are all stored in OSS, and the data reliability of cloud storage is also as high as 99.999%.
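To get a feel for what the two reliability figures above mean, here is a small back-of-the-envelope sketch. It assumes, and this is my assumption rather than anything stated in the vendor's figures, that each percentage is the yearly probability that a single stored object survives and that losses are independent. Exact fractions are used because ordinary floating point cannot represent ten nines precisely.

```python
from fractions import Fraction


def expected_annual_loss(object_count: int, reliability_percent: str) -> Fraction:
    """Expected number of objects lost per year, under the assumptions stated above.

    reliability_percent is passed as a string such as "99.99999999" so that the
    value is parsed exactly instead of being rounded as a float.
    """
    loss_probability = 1 - Fraction(reliability_percent) / 100
    return object_count * loss_probability


if __name__ == "__main__":
    objects = 10**10  # hypothetical number of stored objects
    for figure in ("99.99999999", "99.999"):
        loss = expected_annual_loss(objects, figure)
        print(f"{figure}% -> about {float(loss):g} objects lost per year out of {objects}")
```

Under these assumptions, ten nines works out to roughly one object lost per year in ten billion, while five nines works out to a hundred thousand. The gap is large enough that backups in another region are still worth considering for data you cannot afford to lose.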
How to guarantee application availability: Recently, a problem in one availability zone of Alibaba Cloud's China North Region 2 suspended service there, and some users could not access their systems for nearly an hour. But other users avoided the incident entirely. How did they do it? It is actually very simple: they had a truly distributed, multi-zone deployment on the cloud. Let us put aside the schemes that are very hard to achieve, such as two data centers in one city, three data centers in two cities, and active-active or triple-active setups. I strongly suggest that you build at least the following highly available architecture: Server Load Balancer (supporting multi-zone deployment) + multiple ECS cloud servers (distributed across different zones in the same region) + database service RDS + object storage service OSS + cache service OCS (optional). This should be the cheapest high-availability architecture and the easiest to implement. Of course, your application needs some changes to become distributed. The core tasks are to make the application stateless (for example, by managing sessions centrally) and to store static files separately in OSS (such as images, videos, audio and attachments uploaded by users, and even files like CSS and JS). Such an architecture can at least keep the application service running when one availability zone fails.
Concepts explained:
Region: The set of infrastructure services located in a given geographic area.
AZ or zone: A physical area (server room) within a region that has its own independent power supply and network.
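The multi-zone idea described above can be shown with a toy model. The sketch below is not how Server Load Balancer is implemented; it is a simplified round-robin router of my own, with hypothetical server and zone names, and a plain dictionary standing in for a shared cache service. It shows the two things the application needs: servers in more than one zone, and session state kept outside the servers.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str   # hypothetical server name
    zone: str   # hypothetical zone name
    healthy: bool = True


@dataclass
class LoadBalancer:
    backends: list
    _position: int = 0

    def route(self) -> Backend:
        """Round-robin over the backends, skipping any that are unhealthy."""
        for _ in range(len(self.backends)):
            backend = self.backends[self._position % len(self.backends)]
            self._position += 1
            if backend.healthy:
                return backend
        raise RuntimeError("no healthy backend in any zone")

    def fail_zone(self, zone: str) -> None:
        """Simulate the outage of a whole zone."""
        for backend in self.backends:
            if backend.zone == zone:
                backend.healthy = False


# Stand-in for a shared cache service: sessions live here, not on any one server,
# so whichever backend receives the request can continue the user's session.
session_store: dict = {}


def handle(balancer: LoadBalancer, session_id: str) -> str:
    backend = balancer.route()
    session = session_store.setdefault(session_id, {"hits": 0})
    session["hits"] += 1
    return f"{backend.name} served {session_id} (hit {session['hits']})"


if __name__ == "__main__":
    balancer = LoadBalancer([Backend("ecs-a1", "zone-a"), Backend("ecs-b1", "zone-b")])
    print(handle(balancer, "user-42"))
    print(handle(balancer, "user-42"))
    balancer.fail_zone("zone-a")
    print(handle(balancer, "user-42"))
```

After the simulated failure of one zone, requests keep flowing to the server in the other zone, and the hit counter carries on because the session never lived on the failed server. If sessions were kept in each server's memory, the user would have been logged out at that moment.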
3. Team ability
The arrival of cloud computing has, to a certain extent, freed up many technical staff who used to focus only on the underlying machine rooms, hardware and network O&M. The core of an enterprise is its business system, not a pile of machines in a machine room. The greatest value of the equipment is that its computing, storage and network capabilities support the development of the enterprise's core business.
With the cloud, an enterprise can spend more resources and energy on business development, which places new requirements on its team's abilities.
O&M personnel
O&M personnel should put more energy into the upper-layer applications to keep core services such as applications and databases running efficiently and stably. The cloud era places new requirements on O&M staff. On the one hand, they should study cloud computing techniques more deeply; on the other hand, they should learn more about the enterprise's business applications, as well as middleware, databases and system frameworks. They should work harder on continuous integration, release strategies, service degradation and rate limiting for business systems, and help the enterprise automate operation and maintenance as much as possible.
Programmers
In the cloud era, programmers should learn more about the core products of the major cloud platforms, especially the open APIs of these products. Future programs will integrate more and more cloud products, and APIs will be the only way for a program to interact with the cloud platform.
Architects
An architect is a systematic planner and builder who must select appropriate and mature technical schemes during overall planning. As cloud platforms develop, the architect needs to pay attention to the advantages and disadvantages of each cloud component, the scenarios each one suits, and how the components can be combined.
In general, the cloud computing industry has grown to a large scale. However, vendors should still strengthen market education, publicity and promotion so that more users can really understand cloud computing, use it better, and keep their application systems running securely on the cloud. As a cloud computing practitioner, I hope, together with my company, to do our bit for this industry.