The Chronicles of Cloud Building in Hangzhou: Part 1

This article series details Alibaba Cloud's nine-year history and development, including its internal struggles and its relationship with the Hangzhou government.

By Jazzyear


In the bleak autumn of 2013, the Alibaba Cloud Developer Conference (a predecessor of The Computing Conference) was moved to Yunqi Town in a suburb of Hangzhou. Despite repeated invitations from the host, few journalists were willing to cover this unpopular event. The conference was held outdoors, with a small blue platform as the stage and a few hundred gray folding chairs for the attendees. The venue was surrounded by wasteland and farmland, and one could hardly find a decent restaurant nearby. Alibaba Cloud's senior executives had to pick up guests from the airport and buy water and food for them.


The Butterfly Effect of Cloud Computing in China

This conference was themed "the butterfly effect of cloud computing". At that time, cloud computing was just a buzzword. The conference theme was more of an unrealistic fantasy of a bunch of idealists than a practical discussion.

At the China IT Summit held three years before the conference, many industry leaders were not enthusiastic about cloud computing. Li Yanhong (co-founder of Baidu) said bluntly, "Cloud computing is nothing novel but old wine in new bottles." Ma Huateng (founder of Tencent) was more noncommittal. He regarded cloud computing as an imaginative vision that could not be used like water or electricity for hundreds or even thousands of years. "It may be possible in the days of the Avatar. But just not now."

Only Jack Ma stood firm in the faith of cloud computing. He asserted that cloud computing is what customers need. "If we don't do it, we will not survive."

Years of development have seen the irreversible arrival of the age of cloud computing. The debate between the BAT (Baidu, Alibaba, Tencent) leaders has left a thought-provoking scene in China's history of cloud computing.

The butterfly effect of cloud computing is coming true. The Zhonghe-Shangtang overpass used to suffer the worst congestion in Hangzhou. However, with the support of cloud computing, the average pass time per person has been reduced by 4.6 minutes. When the super typhoon Chan-hom approached Zhejiang province, 5 million people were able to check the typhoon's latest moving path through clients supported by cloud computing. Zhongce Rubber Group Co., Ltd., which had an annual output of 50 million tires, has seen an increase of 5% in its yield rate and a reduction of millions of dollars in its costs, owing to ET Industrial Brain developed based on cloud computing. Cloud computing has enabled data connection and mining. Like a tornado, it has set off from Hangzhou and swept across the Internet, where it was shaped. It has touched down in people's daily lives and become a powerful driving force for the transformation of traditional industries.

Jack Ma, with no technology background, successfully predicted the rise of cloud computing, while the IT geeks Li Yanhong and Ma Huateng missed the opportunity to get a head start. According to Gartner's report on worldwide public cloud market share in 2017, Alibaba Cloud ranked third, behind Amazon Web Services (AWS) and Microsoft Azure, and surpassed Google Cloud Platform (GCP) for the second consecutive year. According to IDC, in terms of the revenue for the first half of 2017, Alibaba Cloud secured a market share of 47.6% in the cloud infrastructure as a service (IaaS) market in China. In China, over half of the websites are built on Alibaba Cloud. Alibaba Cloud's paying users amount to over one million, which is far beyond the reach of other competitors.

The Computing Conference, previously the Alibaba Cloud Developer Conference, is no longer a highbrow event exclusive to cloud computing professionals. On September 19, 2018, nearly 60,000 people arrived at Yunqi Town, resulting in a shortage of tickets and an astronomical rise in the prices of hotel rooms.

How Did Alibaba Cloud Predict the Success of Cloud Computing?

"Just because Jack Ma doesn't know about technology." Yu Feng (nickname: Chu Ba), Alibaba Cloud's elastic computing leader, told Jazzyear playfully that, "Geek bosses are often too rational, while those who major in liberal arts are better at unbound imagination."

The funny thing is that Jack Ma was motivated merely by a "far-fetched" idea of Wang Jian to devote all of Alibaba Group's resources to support Alibaba Cloud. He knew nothing about cloud computing or the difficulties facing this task. Wang Jian, a Doctor of Computer Psychology, is the one who built Alibaba Cloud from scratch, but no one has seen him write any code. As some old employees recall, Wang liked to wear plaid shirts, teared up easily, and was really good at telling stories. The city of Hangzhou, which nurtures Alibaba Cloud, does not boast genes for technology, either. It has always been a city singing the romance of the West Lake, Broken Bridge, and poets.

Looking back, Chu Ba thought Alibaba Cloud had also been dashing forward through the night. Starting off despite doubts and traveling undaunted in the dark, Alibaba Cloud forged ahead with the blessing of knowing nothing. "An innocent child knows no fear; only an adult does."

An Ambitious Start

After the Spring Festival of 2009, in an office with no heating or air conditioning, located at the Huizhong Building, Shangdi Street, Beijing, a young female engineer typed the first line of code for Alibaba Cloud's Apsara system:

##Created at 2009-02-19 by Apsara. 

Over the next nine years, this line of code has been extended gradually to a massive system dubbed Apsara, the cornerstone of Alibaba Cloud. Apsara is a data center operating system, which interconnects millions of servers located in 49 regions worldwide and enables them to work collaboratively as a supercomputer.

Before the age of cloud computing, both traditional enterprises and IT enterprises were trapped in their own data silos. They spent a considerable amount of money to buy servers and build data centers. However, as their businesses expanded, the addition of servers could not catch up with the rapid increase of users, which often led to IT system crashes. Some people started to think about the possibility of building a shared server that supported Pay-As-You-Go plans.

In 1997, Prof. Ramnath K. Chellappa at the University of Southern California first broached the concept of "cloud computing". Essentially, cloud computing is similar to today's sharing economy. It allows everyone to lease a pool of servers and obtain as much computing power as needed from it. In this way, anyone with any budget can access the shared pool without needing to build a private one. Theoretically, this solution can greatly reduce IT costs for enterprises, improve system stability, and break down the data silos that impede information flow. However, the biggest difficulty was that no one knew how to "bridge" different servers to support the free flow and synchronous update of data and to implement a scalable Pay-As-You-Go pricing strategy.

In 2004, Google made this sci-fi idea practical. After Google published three major papers on big data (Bigtable, GFS, and MapReduce), the open source community's zeal for cloud computing and distributed systems ignited. With the joint efforts of programmers around the globe, the Hadoop system emerged. Its core innovation is its distributed architecture, which splits a large task across multiple servers to handle a growing workload instead of doing all the work on a single server.
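The idea of splitting one large task across many machines can be sketched in a few lines. The example below is only an illustration of the map-and-reduce pattern, not Hadoop's or Google's actual implementation: the function names (`map_chunk`, `split_into_chunks`, `word_count`) are hypothetical, and threads on one machine stand in for separate servers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_chunk(lines):
    """Map step: count the words in one chunk, as a single worker would."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def split_into_chunks(lines, n_chunks):
    """Split the input into roughly equal chunks, at most one per worker."""
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def word_count(lines, n_workers=3):
    """Count words by farming chunks out to workers, then merging results."""
    chunks = split_into_chunks(lines, n_workers)
    # Threads stand in for servers here (an assumption for illustration).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    # Reduce step: merge the partial counts from every worker.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


if __name__ == "__main__":
    sample = ["cloud computing", "cloud storage", "distributed computing", "cloud"]
    print(word_count(sample))
```

Because each chunk is counted independently, adding workers changes only how the input is divided, not the final result. That independence is what lets a system of this kind grow by adding servers rather than by buying a bigger one.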

Amazon thrived in the first wave of cloud computing. To support peak-time e-commerce service on holidays such as Christmas, Amazon spent heavily on a large amount of expensive IT resources. However, these resources sat idle most of the time. To relieve the cost pressure, Amazon turned itself into a cloud computing platform and put the spare resources up for sale. This unexpectedly became an even more lucrative business than Amazon's e-commerce business. In the first quarter of 2018, AWS generated operating income of USD 1.4 billion, which accounted for 73% of Amazon's total operating income.

Alibaba shares a similar experience with Amazon. The year 2007 witnessed the rapid expansion of Alibaba's businesses with increasing investment in IT. Alibaba had even become a "benchmarking customer" in China for many IT vendors outside China. By the end of 2007, Alibaba had paid tens of millions of dollars to IBM, Oracle, EMC, Dell, and other vendors. It is no exaggeration to say that every penny Alibaba earned went to the pockets of these IT vendors. More importantly, the services of these major IT vendors would soon fail to satisfy the requirements of Alibaba's fast growing businesses.

At this critical juncture, Jack Ma met Wang Jian. Wang was then the assistant managing director at Microsoft Research Asia. He had boldly predicted, "Cloud computing will be a crucial infrastructure in the future, just like electricity was in the Industrial Age." Although Jack Ma did not understand cloud computing, he was acutely aware of its bright prospects. He then appointed Wang as Alibaba's chief architect to start building a "massive cloud" with enormous computing power.

Alibaba Cloud emerged in the high aspirations and great ambitions of the two dreamers. In fact, there were no such systems or experts in this field in China at that time. There were only a bunch of highly motivated engineers who were willing to devote themselves to this vague vision. A similar case on the other side of the globe was none other than a preliminary product developed by Amazon, which also encountered skepticism. Regardless, Wang set up a highly ambitious plan to build not only the Apsara system but also a series of cloud computing–powered applications, including YunOS (a mobile phone operating system), email, search engine, map, etc.


The Development of the Apsara System

Each internal module of the Apsara system was marked by a mythological sense: The distributed file system at the most underlying layer was named Pangu; the job scheduling and resource management module was named Fuxi; the network connection module was named Kuafu... An engineer who came to Alibaba Cloud for an interview was awestruck when he saw the couplet pasted on the sides of the office door, "Our dream aims high over the clouds; our code lies solid on the ground."

Before long, Alibaba Cloud had gathered nearly all the Group's technical elites. Wang was permitted to poach any talent or teams he wanted from within the Group, which displeased the other departments. Complaints flew to the Chief Human Resources Officer Peng Lei. An engineer even recalled the antagonism between Taobao's and Alibaba Cloud's engineers. Taobao engineers were all practical grassroots who kept generating profit for the company, while Alibaba Cloud engineers were all adventurous elites who kept spending money for the company. However, Alibaba Cloud always enjoyed privileges. For example, it was the only department that was able to recruit from Peking University and Tsinghua University.

Besides a proper team, they needed customers. At that time, Simon Hu Xiaoming was running an internal corporate venture, AliFinance (predecessor of Ant Financial). His initiative thus became the testing ground for the Alibaba Cloud venture. Hu recalled that at the board meeting where he suggested AliFinance build its own IT system, Dr. Wang said, "Cloud computing is the best way to do it." Zeng Ming was an "accomplice" to this idea. Jack Ma also gave a push, expressing his approval.

AliFinance was therefore designated to be Alibaba Cloud's supporter. Despite his reluctance, Hu promised that AliFinance would support Alibaba Cloud at all costs, even if it had to sacrifice itself.

Internal Struggles and Conflicts within Alibaba Group

AliFinance indeed nearly sacrificed itself. Alibaba Cloud in its early phase showed excessive ambition and expansion, showing Alibaba's firm determination to transform into a technological enterprise in all aspects. However, Alibaba Cloud was often accused of reinventing the wheel within the Group. It insisted on building a new system from scratch, regardless of the existing open source system, and the new system was far from stable. This made Hu Xiaoming suffer a lot.

Hu Xiaoming still remembers the time when Alibaba Cloud was haunted by various bugs. In the worst scenario, it went wrong dozens of times one night, counting RMB 1 as RMB 2 and a loan of RMB 10,000 as RMB 1 million. If such transactions were approved, AliFinance would have no choice but to swallow the consequences. The engineers who were responsible for the data warehouse even gave up eating meat (a practice in Buddhism) and prayed every day.

Wang Guotao was an engineer who worked on the data platform. He had numerous arguments with Hu, for he wanted to abandon Alibaba Cloud and switch to the open source system. Hu, however, insisted on the opposite every time, leaving this tough guy from Northeast China with no choice but to fight on through his tears.

Voices of disapproval of Alibaba Cloud also emerged within the Group. After all, it tied up the Group's technical elites and gobbled up an annual investment of tens of millions of dollars, yet had not yielded a good product after two years of development and ended up with the lowest performance rating within the Group each year. Once there was even a rumor going around before a meeting that, "Alibaba Cloud is about to be dissolved." Many senior executives with their technical owners were ready to poach talent at the meeting.

The meeting split into two camps, one for and one against dissolving Alibaba Cloud. Everyone was yelling and arguing almost to the point of fighting. Jack Ma managed to withstand the pressure and said, "We shall bet on Dr. Wang Jian."

"In such a situation, 999 out of 1,000 CEOs would choose to give up," Hu Xiaoming thought.

On the weekend after that heated debate, Hu went for a walk with Jack Ma around the West Lake. Hu asked again whether it was really necessary to keep Alibaba Cloud. Jack Ma paused for 10 minutes and replied, "Nobody knows about cloud computing. But we have chosen it as our future to invest in. It will be a major asset of our country, so we must hold on to it."

Hu thought Jack Ma made this decision out of pure idealism, not knowing how many engineers were toiling in silence.

Hu had to put it frankly to the Alibaba Cloud team, "If we see no improvement in the situation, we will build another technical architecture."

Xu Changliang (nickname: Chang Liang), the senior director of Alibaba Cloud's big data business division, was responsible for supporting AliFinance at that time. He recalled that, at the end of 2011, Hu brought his team to the Zhongkuidao meeting room at Westlake International Square Block D and pleaded for solutions to Alibaba Cloud's bugs. Their PPT slide read, "Begging on our knees."

Chen Pengyu (nickname: Bu Lao), an engineer at AliFinance, told Xu that he had received 700 emergency calls from the platform, and he could barely sleep at night. He had even set a recording of his own baby's cry as his ringtone, so that he would get up as soon as he heard the sound. Hearing their complaints, Xu felt as if he had been slapped in the face. "What's the point of bragging about our system?"

After the meeting of pleading, Hu bowed deeply to all the attending Alibaba Cloud engineers.

Xu also felt aggrieved, since most system failures and network outages were beyond their control. However, he had to admit that, from the customers' perspective, these were certainly Alibaba Cloud's fault. To enhance the stability of the Apsara system, Alibaba Cloud's engineers did not go home for the whole Spring Festival. Everyone stayed focused on the lines of code on the screen.

According to the internal survey conducted at Alibaba's annual general meeting in 2012, when Alibaba Cloud reached rock bottom, everyone agreed "Alibaba Cloud is the most idealistic subsidiary within the Group." This acknowledgment largely came out of sympathy for this dare-to-die corps—in the eyes of their colleagues in other departments, the Alibaba Cloud team was fighting a losing battle.

Reaching the 5K Target with Apsara

To minimize losses, Alibaba invested in two teams with different tasks. Aerial Ladder 1 went all out to cope with existing problems with the open source Hadoop system, while Aerial Ladder 2 focused on developing the proprietary Apsara ODPS system (predecessor of MaxCompute). Despite the many difficulties in developing the Apsara system, most people understood that Aerial Ladder 1 was merely an accompanist for Aerial Ladder 2—the true star in the eyes of Wang Jian and Jack Ma.

As the two teams grew bigger and bigger, a choice had to be made between them. The key to the decision lay in whether the Apsara system could solve the "5K" problem, that is, virtually connecting up to 5,000 computers within a single cluster.

Sitting in the meeting room and eating bread, Chu Ba told Jazzyear in recollection, "The fiercest storm comes at the turn of the seasons." The two teams were engaged in a fierce quarrel with no one making concessions.

This fight in code also extended to a fight in words on Alibaba's internal network. In 2012, engineers posted comparisons between the Apsara system and the Hadoop system, drawing over 17,000 views. Some emphasized that Hadoop was currently in good condition, while others quickly argued that long-term condition was more important. Some said the open source community could provide abundant resources, while others said their own problems were beyond the reach of other people.

To better assist the development of the Apsara system, Alibaba once again assigned their best crew to the development team. This is when Chu Ba was transferred from Taobao core system to Alibaba Cloud. He was among the very first engineers in the Chinese open source community. He was former Netease employee No. 25, and was able to work out nearly all the code for the Thunder client 4.0. Before he came to Alibaba, a headhunter had been sending him chocolates and cookies in an attempt to poach him to Shanda Co., Ltd.

It was clear that nothing could be done if they failed to develop the Apsara system this time.

In April 2013, everyone entered battle mode. Two rounds of reporting per day left them unable to sleep at night, and they went through bitter arguments over specific plans during the day. Lin Chenxi, Alibaba Cloud's first CTO, said that four years in Alibaba Cloud was like ten years because they had to work at twice the normal pace. During the most intense period, a new member who had quit Tencent for Alibaba was shocked to see how everyone worked on his first day at the office and exclaimed, "You guys really are working on cloud computing!"

Four months later, the battle came to an end.

Some were happy while some were sad. The "5K" problem was solved, which signaled the success of Aerial Ladder 2. The Hadoop cluster had to go offline. Luo Li (nickname: Gui Li), a senior member of Aerial Ladder 1, posted on Weibo, "Count down to the offline of Aerial Ladder. My wife knows I'm feeling bad so she bought me a present today, which says, 'The aerial ladder goes off duty; an era reaches its end. But the aerial ladder at home is still on duty. I'm always here for you!' I have never shown weakness towards the Aerial Ladder project. Today I couldn't help crying out loud."

After the "5K" battle, Alibaba Cloud attracted more talent. Many Chinese technical experts outside China had been keenly aware of the fact that cloud computing was leading a revolution in China.

Dr. Min Wanli (nickname: Shan Jing) returned to China and joined Alibaba as a chief scientist in Alibaba Cloud's machine intelligence sector after the "5K" battle. He was admitted to the first youth class of the University of Science and Technology of China (USTC) at the age of 14. During his 16 years in the United States, he had worked on artificial intelligence at IBM and Google. A few words from Jack Ma brought him home: "There is a Chinese enterprise that owns more customer data than eBay, Amazon, and PayPal put together."

The next year, Feng Chao (nickname: Jing Ye) also returned to join Alibaba Cloud as a senior technical expert. He was among the first generation of architects and product managers at Microsoft. He had led a stable life in the United States, living in a big house in a community with a good environment. Many engineers lived there. One did not need to worry about adverse consequences if a car got scratched because the first thing people did was not to start a fight but to ask in a friendly way, "Are you from Microsoft?"

Jing Ye had received four interview invitations altogether from Alibaba for 2B business, foreign trade and e-commerce, Taobao and Tmall, and Alipay. He turned down all of them without hesitation. At that time, Alibaba struck him as an e-commerce business, "not high-end at all".

Yet when the headhunter "harassed" him for the fifth time with a job from Alibaba Cloud, he took it straightaway. The headhunter was shocked and texted him back, "Have you lost your mind? Putting aside hot businesses such as Taobao, Tmall, and Alipay, you choose a sinking ship?"

For a time, people still saw no future for Alibaba Cloud even though the "5K" problem had been overcome. No one knew where this ship would go.

Jing Ye made such an "arbitrary" decision not because he was optimistic about Alibaba Cloud. On the contrary, he was not. He had experienced Alibaba Cloud's products because of work requirements. He could feel the distance between Alibaba Cloud and AWS or Azure, "as if one suddenly shifts from an iPhone X to a feature phone".

"The significance of cloud computing to a country is far more than that of Windows or Office. Cloud computing can be compared to a nuclear bomb in the 1950s." For Jing Ye, going back home to join cloud computing was a golden opportunity. He would be excited to be even a "component", if not a chief engineer, in that highly anticipated revolutionary wave of technology. He accepted the job offer even though the salary was less than half of what Microsoft paid and, despite his parents' objections, moved to Hangzhou with his wife, his children, and all the family's belongings.

Alibaba Cloud then ushered in a brand new era.

Not far from the meeting place of the Computing Conference lies an Apsara 5K monument. One side of it is inscribed with the names of 227 engineers who participated in the "5K" battle and the names of their families. The other side is inscribed with some words of Wang Jian, "A group of ordinary people with extraordinary dreams underwent 1,757 days and nights to encode a future for cloud computing with their aspiration and ambition. Adhere to what you believe in, and believe in what you adhere to."

