
The Best Practice of Cloud-Native Full-Stack Data Warehouses in So-Young

This article will explain the practice of So-Young's data platform, data warehouse, and upper-layer data application based on Alibaba Cloud in four aspects.


"Building a big data platform is difficult and challenging. After comprehensive consideration of cost, security, asset management, and component scalability, we decided to migrate to Alibaba Cloud. After migration, our overall resource cost is reduced by 30%, the performance is improved by 2 to 3 times, and the operation experience of merchants, users, and activities is improved as well. We expect to have more communication and cooperation with Alibaba Cloud in the future," said Gao Hongchao, Director of the Data Research and Development Department of the So-Young Data Middle Platform.

Gao Hongchao, Director of the Data Research and Development Department of the So-Young Data Middle Platform

This article will explain the practice of So-Young's data platform, data warehouse, and upper-layer data application based on Alibaba Cloud in the following four aspects:

  1. An Introduction to So-Young
  2. Self-Built Big Data Architecture of So-Young
  3. Big Data Platform of So-Young Based on Alibaba Cloud
  4. Achievements of So-Young's Big Data Platform

First, I want to introduce So-Young and its business model. Then, I would like to discuss the technical architecture of our self-built data platform and the problems and challenges in building it. Factors that motivated us to migrate to the cloud will also be described. Finally, I will introduce our data architecture based on Alibaba Cloud and the benefits and impact of the migration in terms of technology and business.

1. An Introduction to So-Young

So-Young is the largest and most popular vertical online platform in China for searching, selecting, and booking cosmetic medicine services. Our business covers more than 350 cities in China, and our overseas business covers Japan, South Korea, Singapore, and Thailand. We have attracted nearly 6,000 certified cosmetic medicine and medical consumption institutions, giving our users a wide range of cosmetic medicine choices. So-Young has published more than 4.7 million blogs to provide real and effective decision-making assistance to our users. So-Young held its initial public offering (IPO) in the United States in May 2019, becoming the first listed Internet cosmetic medicine platform in the world.

Our business follows a "cosmetic medicine community + e-commerce" model. We have always put users first and focused on earning their trust. In the community, we promote healthy aesthetic values and share cosmetic medicine knowledge so that everyone can pursue a beautiful appearance with a healthy body. We also maintain a strict content screening mechanism to provide users with real, useful information and more reliable choices. Users can compare the prices of different plans through our price comparison system. We work together with other companies to provide quality products. The So-Young platform has integrated high-quality cosmetic medicine resources from all over China and established a complete system of audit, customer service, and merchant contract fulfillment. As such, we have built a more professional, safe, and transparent communication channel between our users and doctors.

2. Self-Built Big Data Architecture of So-Young

2.1 Inside the Self-Built Big Data Architecture of So-Young

So-Young's self-built data architecture adopts open-source technical components. For data access, Flume collects tracking point logs, Kettle handles offline access to DB data, and Maxwell captures real-time Binlog logs. Data computing and storage mostly rely on components of the Hadoop ecosystem.

In terms of offline computing and storage, HDFS stores the intermediate data of data warehouse computation; HBase stores user tag data; Kafka temporarily retains the last seven days of real-time data reported by tracking points; Elasticsearch stores the request log data of the application backend. The data computing engines are Hive and Spark running on YARN for resource management, the real-time query engine is Impala, and multi-dimensional data is built with Kylin.

In real-time computing, Flink and Spark Streaming are used, with Kafka and HBase for data storage. ETL task scheduling for the data warehouse previously used Jenkins and now uses Azkaban. For server performance monitoring, we use Prometheus and Grafana. LDAP, Kerberos, and Sentry are used for permission management and authentication.


The output service of the data platform uses open-source Zeppelin and Hue as the platform for interactive analysis and development, and it exposes APIs to various businesses. Nearly 4,000 offline tasks and about 100 real-time tasks were processed per day. A cluster of nearly 100 nodes supported data storage and computing.

2.2 Problems and Challenges Faced by the Self-Built Big Data Architecture of So-Young

On the whole, the So-Young platform is relatively comprehensive and has been operational for more than a year. During this time, we encountered many business and technical challenges.

At the business level, the pressure often comes from other departments within our company. Colleagues may doubt the quality and reliability of the data provided by our platform. Sometimes they cannot find the data they need and have to ask personnel from the production and research department, which makes working with the data cumbersome.

The business also changes frequently and in many ways. The upstream R&D architecture that feeds the data warehouse changes often, which affects data statistics. Errors occur from time to time, sometimes even at midnight. If errors are not handled promptly, they often affect the statistical analysis of the next day. All the challenges above are at the business level.

From a technical aspect, our scheduling system was based on open-source Azkaban, and scheduled tasks could call computing resources in any space, so computing resources were neither isolated nor controlled. The development environment and the online environment were not isolated, and wrong operations often led to accidents. Furthermore, there was no tool for standardized review of online task code. Code was reviewed manually, which increased costs. At the same time, cluster resource usage was not tracked per business by queue and service, which made it difficult to scale in or out based on usage and cost.

2.3 Reasons for the Cloud Migration of the So-Young Big Data Platform

So-Young migrated to Alibaba Cloud mainly based on the four factors below:

  1. Cost Reduction and Efficiency Improvement: We need to reduce the cost of data computing, storage, and personnel. We also need a unified data development platform that integrates data access, development, review, release, monitoring, and alerting.
  2. Data Security: Data security, including approval of data permissions, compliance audit of data usage, and disaster recovery (such as backup and restoration), needs to be considered. In addition, we need to establish a secure mart for data masking, automatic discovery of sensitive data, user data query, and audit and monitoring of downloads.
  3. Effective Management of Data Assets: We also need to consider the data quality monitoring mechanism, metadata, lineage, and controllable data lifecycle. We need to make sure that each data mart can be isolated and the cost can be split.
  4. Scaling of Data Components: For example, data visualization components should be compatible with data mining and real-time computing components.
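
The data masking and automatic discovery of sensitive data mentioned in the second factor can be pictured with a small sketch. This is only an illustration under stated assumptions, not how So-Young or DataWorks implements it: the regular expressions (an 11-digit mobile number and a simple email pattern), the function names, and the sample rows are all hypothetical.

```python
import re

# Hypothetical patterns: an 11-digit mobile number and a simple email address.
PHONE_RE = re.compile(r"\b(1\d{2})\d{4}(\d{4})\b")
EMAIL_RE = re.compile(
    r"\b([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@([A-Za-z0-9.-]+\.[A-Za-z]{2,})\b"
)


def mask_text(value):
    """Mask the middle digits of phone numbers and the local part of emails."""
    value = PHONE_RE.sub(r"\1****\2", value)
    value = EMAIL_RE.sub(r"\1***@\2", value)
    return value


def find_sensitive_columns(rows):
    """Flag every column in which at least one value matches a sensitive pattern."""
    flagged = set()
    for row in rows:
        for column, value in row.items():
            if isinstance(value, str) and (
                PHONE_RE.search(value) or EMAIL_RE.search(value)
            ):
                flagged.add(column)
    return flagged


def mask_rows(rows):
    """Return copies of the rows with the flagged columns masked."""
    sensitive = find_sensitive_columns(rows)
    return [
        {
            col: mask_text(val) if col in sensitive and isinstance(val, str) else val
            for col, val in row.items()
        }
        for row in rows
    ]


# Hypothetical sample data.
sample_rows = [
    {"user_id": 1, "contact": "13812345678", "note": "ok"},
    {"user_id": 2, "contact": "alice@example.com", "note": "ok"},
]
masked_rows = mask_rows(sample_rows)
print(masked_rows)
```

A real secure mart would also need the approval, audit, and download monitoring described above; the sketch covers only the discovery and masking step.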


3. So-Young's Big Data Platform Based on Alibaba Cloud

After considering the four aspects above, we migrated our self-built data platform to Alibaba Cloud. After the migration, the data architecture of the data warehouse and data application is more organized. The following part describes the overall architecture of the data platform based on Alibaba Cloud: data computing and storage, data warehouse and data service, data asset management, all-in-one data development, and data application.

First, in terms of computing and storage, we use MaxCompute for offline operations and the combination of EMR Kafka + Flink + Hologres for real-time operations.

Then, in terms of the data warehouse and data service, we adopt the hierarchical architecture of a standard data warehouse, which contains the basic data center, theme data center, and multi-dimensional data center.

The basic data center is the ODS layer of the data warehouse. It uses the data integration capability of DataWorks to ingest two types of data. The first is tracking point log data (request logs of applications, PCs, H5 pages, mini programs, and backends), and the second is DB data from business systems, including the transaction system, user membership system, merchant organization system, community content system, and financial system. ODS stays consistent with the online business systems, and different data access strategies are adopted depending on the data volume.

The theme data center is the DWD layer of the data warehouse. It adopts a theme-oriented model design and constructs the DWD model by abstracting business units and processes in combination with analysis dimensions. The data warehouse of So-Young is divided into ten themes based on our businesses: traffic, content, merchants, users, products, transactions, operations, after-sales, finance, and media. The data is cleansed and normalized at this layer.

The multi-dimensional data center is the DWS layer of the data warehouse. Its models are constructed around analysis objects and statistical indicators using dimensional modeling. Several core data systems for users, content, merchants, traffic, and operations are abstracted at this layer. To keep metrics consistent, all upper-layer data applications obtain data from this layer. Most of them do so through customized, unified data service interfaces, a capability provided by the data service API of DataWorks.
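
To make the ODS, DWD, and DWS layering more concrete, here is a minimal in-memory sketch in Python. It is not So-Young's actual pipeline, which runs on MaxCompute and DataWorks; the order records, field names, and the quality rule are hypothetical and only show how each layer builds on the one below it.

```python
from collections import defaultdict

# ODS: raw records kept consistent with the source system (hypothetical order rows).
ods_orders = [
    {"order_id": "1001", "user_id": "u1", "merchant_id": "m1", "amount": "199.00", "status": "PAID "},
    {"order_id": "1002", "user_id": "u2", "merchant_id": "m1", "amount": "350.50", "status": "paid"},
    {"order_id": "1003", "user_id": "u1", "merchant_id": "m2", "amount": None, "status": "cancelled"},
]


def build_dwd(ods_rows):
    """DWD: cleanse and normalize detail records of the transaction theme."""
    dwd = []
    for row in ods_rows:
        if row["amount"] is None:  # drop rows that fail a basic quality rule
            continue
        dwd.append({
            "order_id": int(row["order_id"]),
            "user_id": row["user_id"],
            "merchant_id": row["merchant_id"],
            "amount": float(row["amount"]),
            "status": row["status"].strip().lower(),
        })
    return dwd


def build_dws(dwd_rows):
    """DWS: aggregate paid orders along the merchant dimension."""
    summary = defaultdict(lambda: {"paid_orders": 0, "paid_amount": 0.0})
    for row in dwd_rows:
        if row["status"] == "paid":
            summary[row["merchant_id"]]["paid_orders"] += 1
            summary[row["merchant_id"]]["paid_amount"] += row["amount"]
    return dict(summary)


dwd_orders = build_dwd(ods_orders)
dws_merchant = build_dws(dwd_orders)
print(dws_merchant)
```

Because upper-layer applications read only the DWS result, a metric such as paid order count per merchant is computed in one place, which is the consistency benefit the layering is meant to provide.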

Next, Alibaba Cloud DataWorks is used to manage data assets. DataWorks provides metadata management and data lineage management capabilities for data overview and data maps. Data quality management helps us guarantee data accuracy, integrity, and consistency. Data usage analysis and the security center provide data compliance checks and data security guarantees. Resource monitoring management provides stability guarantees that help So-Young detect abnormal tasks and scale resources in and out. The all-in-one data development of DataWorks has improved our work efficiency substantially. Data integration cleans structured data. Modeling and development, launching and approval of scheduling tasks, online monitoring and O&M, and quickly locating and handling abnormal data nodes have all become efficient and convenient.


Finally, in terms of data application, So-Young's data middle platform supports many core businesses based on the functions above and the overall architecture of the data warehouse. So-Young has built the user operation management system and the merchant operation system for its businesses. It has also built a visual OLAP system for data, a market launching system, and the A/B testing experiment platform, and it provides effective data support for search recommendations, risk control, and anti-cheating.

4. Achievements of So-Young's Big Data Platform

With a clear architecture, So-Young has achieved good results in both data processing technology and business after its data platform migrated to Alibaba Cloud.

4.1 Data

With costs and the number of tasks unchanged, tasks now run 2 to 3 times faster on average. All tasks are completed by 6:00 a.m., compared with 10:00 a.m. in the past.


4.2 Business (Merchant Operation)

The first aspect of our business is the merchant operation system. As So-Young developed rapidly, the data of institutions and doctors was stored in various business centers without unified management standards. This resulted in data duplication, loss, and inconsistency, disorganized permissions, data security risks, the lack of an aggregated view of platform data, and a series of other problems. Our support for front-line BD personnel was relatively weak, which seriously affected the efficiency of our business operation. We built a general merchant summary model on the new data warehouse architecture on Alibaba Cloud. It provides standard APIs, and three layers of data permission management have been added. The first layer is the access permission of the merchant data mart, the second layer is the call permission of APIs, and the third layer is the management permission of functions and applications.
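
The three permission layers can be sketched as a chain of checks that must all pass. The following Python example is only an illustration under assumptions: the `Grants` class, the `check_access` function, and the mart, API, and function names are hypothetical and are not taken from So-Young's system.

```python
from dataclasses import dataclass, field


@dataclass
class Grants:
    """Hypothetical per-user grants, one set per permission layer."""
    marts: set = field(default_factory=set)      # layer 1: data mart access
    apis: set = field(default_factory=set)       # layer 2: API call permission
    functions: set = field(default_factory=set)  # layer 3: function/application permission


def check_access(grants, mart, api, function):
    """Check the three layers in order and report the first layer that denies."""
    if mart not in grants.marts:
        return (False, "data mart access denied")
    if api not in grants.apis:
        return (False, "API call denied")
    if function not in grants.functions:
        return (False, "function permission denied")
    return (True, "ok")


# Hypothetical users: a BD user with all three grants, an analyst missing the API grant.
bd_user = Grants(marts={"merchant"}, apis={"merchant_summary"}, functions={"view_dashboard"})
analyst = Grants(marts={"merchant"}, functions={"view_dashboard"})

print(check_access(bd_user, "merchant", "merchant_summary", "view_dashboard"))
print(check_access(analyst, "merchant", "merchant_summary", "view_dashboard"))
```

Checking the layers from the data mart outward means a request is rejected at the coarsest level that applies, so a missing mart grant is reported before any API or function grant is consulted.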


In terms of data application and data decision-making, the basic information and operation information of doctors and institutions integrated by the data warehouse is served to consumer-side users through applications, PCs, and mobile terminals. Data dashboards are provided to business-side institutions to guide their operation. Furthermore, an aggregated query platform is provided for the company's internal management and front-line BD personnel, supporting management decision-making and the innovation of BD personnel with data. The data analysis platform for business-side merchants analyzes and monitors traffic conversion (CTR) through the customer traffic funnel. Through business opportunity analysis, merchants can reach users and convert traffic. Through transaction conversion analysis, pain points and problems in the transaction process are detected and solved quickly. Through customer group analysis, the distribution of customer groups is clarified to guide refined operation.
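
As a rough illustration of the traffic funnel analysis described above, the sketch below computes step-by-step conversion rates from stage counts. The stage names and numbers are made up for the example, and the `funnel_rates` function is a hypothetical helper rather than part of any So-Young or Alibaba Cloud product.

```python
def funnel_rates(stages):
    """Given ordered (stage_name, user_count) pairs, return the conversion rate of each step."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        rate = count / prev_count if prev_count else 0.0
        rates.append((f"{prev_name} -> {name}", round(rate, 4)))
    return rates


# Hypothetical daily counts for one merchant.
stages = [
    ("impression", 10000),
    ("click", 1200),
    ("consultation", 300),
    ("order", 60),
]

step_rates = funnel_rates(stages)
overall_rate = stages[-1][1] / stages[0][1]

for step, rate in step_rates:
    print(f"{step}: {rate:.2%}")
print(f"overall: {overall_rate:.2%}")
```

In this example, the first step (clicks divided by impressions) corresponds to CTR, and the step with the lowest rate points to where the transaction process loses the most users.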


4.3 Business (User Operation)

The user operation system achieves a closed-loop process of crowd selection, planning, and data viewing, integrates marketing tools and access channels, and supports multiple strategy types.

The overall architecture is divided into five layers: the business data layer, data storage and computing layer, profile tag layer, tag application layer, and decision application layer. The first two layers provide basic and behavioral data at user granularity through data integration and aggregated computing in the data warehouse. The profile tag layer uses algorithms and models to generate tags in three classes: user demographic attributes, behavior preferences, and user value contribution. The application layer filters users based on tags and reaches them through red envelope policies, push notifications, short messages, and direct messages. Then, the delivery effect and strategy effect are tracked through various data dashboards.


When selecting a crowd, we can combine various tags. After selection, we can see the crowd size and check whether it meets the delivery expectations. If not, we can adjust the selection flexibly. Based on the A/B testing experiment platform and the visualized preview of copywriting edits, we can design different strategies or deliver differentiated content. After delivery, the effect in core scenarios can be viewed through the funnel model, which now supports T+1 effect analysis.
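
Crowd selection by tag combination can be pictured as set operations over a tag index. The sketch below is a simplified assumption of how such a selection might work; the tag names, user IDs, and the `select_crowd` function are hypothetical.

```python
# Hypothetical tag index: tag name -> set of user IDs carrying that tag.
tag_index = {
    "city:beijing": {"u1", "u2", "u3", "u5"},
    "interest:skincare": {"u2", "u3", "u4"},
    "value:high": {"u3", "u5"},
    "recently_purchased": {"u5"},
}


def select_crowd(index, include_all, exclude_any=()):
    """Return users that carry every tag in include_all and none of the tags in exclude_any."""
    if not include_all:
        return set()
    # set.intersection returns a new set, so the index itself is never modified.
    crowd = set.intersection(*(index.get(tag, set()) for tag in include_all))
    for tag in exclude_any:
        crowd -= index.get(tag, set())
    return crowd


crowd = select_crowd(tag_index, ["city:beijing", "interest:skincare"])
print(len(crowd), sorted(crowd))

# If the crowd is too small for the delivery target, relax the tag combination.
wider = select_crowd(tag_index, ["city:beijing"], exclude_any=["recently_purchased"])
print(len(wider), sorted(wider))
```

Checking `len(crowd)` before delivery mirrors the step of confirming that the crowd size meets expectations and adjusting the tag combination if it does not.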


4.4 Business (Activity Dashboard)


The activity dashboards used during So-Young's Double 6 and Double 9 promotions are built on Alibaba Cloud DataV. The goal is to monitor promotion targets in real time through data performance, provide risk prompts during the promotion, and improve support for operational decisions. The dashboards mainly monitor SKU release in the early stage of an activity, the transaction conversion funnel of traffic during the activity, and early warnings on the cost of red packet subsidies. This ensures the efficient implementation of various operational strategies at the data level during the promotion.
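
The article does not describe how the early warning for red packet subsidies is computed. As one possible illustration, the sketch below compares actual spend with a straight-line budget pace. The `subsidy_alert` function, the `warn_ratio` threshold, and all numbers are assumptions made for this example.

```python
def subsidy_alert(spent, budget, elapsed_fraction, warn_ratio=1.2):
    """Compare actual subsidy spend with a straight-line budget pace.

    elapsed_fraction is the share of the promotion window that has passed (0 to 1).
    Returns "over_budget", "warning", or "ok".
    """
    if spent >= budget:
        return "over_budget"
    expected = budget * elapsed_fraction  # spend expected so far at an even pace
    if expected > 0 and spent / expected >= warn_ratio:
        return "warning"
    return "ok"


# Hypothetical checks during a promotion with a budget of 100,000.
print(subsidy_alert(spent=30000, budget=100000, elapsed_fraction=0.2))
print(subsidy_alert(spent=20000, budget=100000, elapsed_fraction=0.5))
print(subsidy_alert(spent=100000, budget=100000, elapsed_fraction=0.9))
```

A real dashboard would likely use a pacing curve that reflects traffic peaks instead of an even pace, but the comparison of actual against expected spend stays the same.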

So-Young's data middle platform is applied in many other scenarios. These include the A/B testing experiment platform widely used by the product development team, the tracking point management system (which provides standardized and constrained management for the data quality of tracking points), and the self-service data analysis system.

4.5 Advantages and Benefits of So-Young's Big Data Platform Based on Alibaba Cloud

Now, we will explain the advantages and benefits of the Alibaba Cloud solution for So-Young's big data platform, which are mainly reflected in the cost, product maturity, user experience, and security assurance.

The overall resource cost is reduced by 30%, while the performance is improved by 2 to 3 times. In addition, the personnel needed for R&D and O&M of big data platform components can be reduced from three people to the equivalent of 0.5.

Most of the Alibaba Cloud products we use, including MaxCompute and DataWorks, are mature commercial products. At the same time, Alibaba Cloud provides stable cluster versions and dedicated service support, making its products more stable and reliable.

The user experience has been improved substantially. DataWorks has replaced open-source products, such as Hue, Zeppelin, and Azkaban, to realize all-in-one development, fully managed scheduling, monitoring and alerting, and data integration. Because the platform is cloud-native, it also scales elastically without users noticing.

Our entire data processing is based on Alibaba Cloud products and the architecture of the public cloud, which provides security protection. In the event of a problem, Alibaba Cloud will also provide professional security consultation and protection with security products to solve security problems quickly and effectively.

