MaxCompute Big Data Security Solution

As laws and regulations mature, data security, information security, and network security have been elevated to matters of national security, so data security will matter more and more to users and companies. As a leading big data cloud data warehouse solution, Alibaba Cloud MaxCompute has built many features into its security system. This article briefly introduces some of MaxCompute's data security capabilities.

Introduction to the security system

The security system is not a single system but a series of systems that together meet the data security requirements of the big data platform:

1. Pre-event preparation: data marking, whitelists and blacklists, permission assignment, and preparation of encryption and desensitization algorithms.
2. In-process handling: data encryption and decryption, whitelist filtering, data scanning, and real-time data alarms.
3. Post-event audit: audits of data usage logs, monitoring through offline data reports, and so on.

Security Architecture

A data security system requires not only cooperation among various systems but also process management across departments, so that data is used only under reasonable authorization:

• The data compliance department marks data, configures data rules, sets permissions, and manages whitelists;
• The big data platform automatically encrypts or desensitizes data according to the rules set by compliance personnel, and then provides it to data users;
• Data security personnel monitor each use of sensitive data in real time and carry out regular data audits afterwards.

This article mainly introduces data storage encryption and data desensitization on the Alibaba big data platform.

Currently, MaxCompute works with the KMS platform to encrypt data at rest when it is uploaded to the cloud, supporting the AES256, AESCTR, and RC4 algorithms. Data is decrypted automatically when it is used, so the protection is transparent to customers.

At the same time, MaxCompute works with DataWorks and Data Protection Umbrella to desensitize sensitive data. In Data Protection Umbrella, users can configure data marking, define risk rules, configure desensitization rules, and design whitelists. MaxCompute then automatically desensitizes data that has been tagged and displays it according to the desensitization rules.
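The division of work described above (compliance staff tag columns and maintain a whitelist, the platform masks tagged data on read, and security staff audit usage afterwards) can be sketched in a few lines of Python. This is only an illustration of the flow, not how MaxCompute or Data Protection Umbrella is implemented: the `MaskingPolicy` class, the `mask_value` helper, and the column and user names are all hypothetical.

```python
from dataclasses import dataclass, field


def mask_value(value, visible=4):
    """Hypothetical masking rule: hide everything except the last `visible` characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[len(value) - visible:]


@dataclass
class MaskingPolicy:
    """Hypothetical stand-in for what compliance personnel configure."""
    sensitive_columns: set           # columns tagged as sensitive
    whitelist: set                   # users allowed to see plaintext
    audit_log: list = field(default_factory=list)

    def read_row(self, user, row):
        """Return the row as `user` would see it and record the access."""
        whitelisted = user in self.whitelist
        visible_row = {}
        for column, value in row.items():
            if column in self.sensitive_columns and not whitelisted:
                visible_row[column] = mask_value(str(value))
            else:
                visible_row[column] = value
        # Post-event audit: keep a record of who touched sensitive columns.
        touched = sorted(c for c in row if c in self.sensitive_columns)
        self.audit_log.append(
            {"user": user, "sensitive_columns": touched, "masked": not whitelisted}
        )
        return visible_row


policy = MaskingPolicy(sensitive_columns={"id_number"}, whitelist={"auditor"})
row = {"name": "Zhang", "id_number": "110101199003071234"}
print(policy.read_row("analyst", row))   # id_number is masked
print(policy.read_row("auditor", row))   # whitelisted user sees plaintext
print(policy.audit_log)
```

The point of the sketch is that masking is decided at read time from the tag and the whitelist, so the stored data and the query itself do not change.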

Applicable scenarios

Scenario 1: Customer Personal Information Protection

With the introduction of relevant laws, many game companies need to collect sensitive information such as personal ID numbers. A leak of customers' personal information is a serious data security incident, so protecting personal information such as ID numbers becomes very important. This information can be used only by the customers themselves or with their authorization. However, enterprises still need to process and match this information in the course of operations, so all processing must be done on encrypted or desensitized data.

Scenario 2: Enterprise Internal Information Protection

Most companies hold a lot of sensitive data, such as financial data and personal salaries, but normal operation requires this data to be processed and calculated on the big data platform and finally output as reports. During the intermediate processing, data developers, testers, product managers, and others must not have access to the plaintext data, so the data needs to be desensitized.

Intended audience

This article is intended for data managers, data governance personnel, data R&D personnel, and data security compliance personnel who already use MaxCompute.

Data encryption

MaxCompute supports encrypted storage of data through Key Management Service (KMS). This protects data at rest and helps meet enterprise regulatory and security compliance requirements.

Prerequisites

• Alibaba Cloud service account;
• The KMS key management service has been enabled.

Steps

1. Go to the Key Management Service activation page, accept the Key Management Service service agreement, and click Activate Now to activate the KMS service.
2. Log in to the DataWorks console, and in the left navigation bar, click Workspace List.
3. Select a region at the top of the Workspace List page, then click Create Workspace. In the Create Workspace panel, fill in the basic configuration and click Next. For details, see Creating a Project Space.
4. In the Select Compute Engine Service area of the Create Workspace panel, select MaxCompute.
5. In the ODPS service account authorization dialog box, click Authorize.
6. On the newly opened cloud resource access authorization page, click Agree to Authorize.
7. Return to the ODPS service account authorization dialog box and close it. In the Select Compute Engine Service area of the Create Workspace panel, reselect MaxCompute, and click Next.
8. In the Create Workspace panel, configure the engine details and select Encrypt to enable data encryption. The example here creates a workspace in simple mode.
9. Click Create Workspace to complete the creation.

After data encryption is enabled, MaxCompute automatically performs encryption and decryption whenever project data is written or read.

Data desensitization

Data Protection Umbrella is a data security management product that provides data discovery, data desensitization, data watermarking, access control, risk identification, data auditing, and data traceability. The following describes how to activate and use Data Protection Umbrella.

Prerequisites

• Alibaba Cloud service account;
• A DataWorks workspace has been created.

Steps

1. Log in to the DataWorks console, go to Settings, and enable page query content desensitization.
2. Click the icon in the upper left and select All Products > Data Governance > Data Protection Umbrella.
3. Data classification and grading settings: the system provides 1000+ built-in data classifications. If you have no special requirements, the defaults are sufficient in most cases. You can also define your own classifications and grades.
4. Data recognition types: the system has 1000+ built-in recognition types. If you have no special requirements, you can use the built-in recognition to generate a data recognition model automatically. It also supports model training on sample fields that you provide, which helps identify the content features of the target fields and generates the corresponding rule models.
5. Data masking rules: users can define masking rules for a specified data field type. Currently, there are three masking methods: pseudonym, hash, and masking.
6. Data returned by queries is then masked automatically.

Whitelist

For data masking, Data Protection Umbrella currently supports adding users to a whitelist. A user on the whitelist is exempt from the masking rules and can view plaintext data.

Data discovery

For data desensitization, Data Protection Umbrella currently supports automatic system scans of the data and displays risk statistics.

Data risk identification

For data desensitization, Data Protection Umbrella currently supports user-defined risk behaviors, with unified query and display of risks.

Data audit

For data desensitization, Data Protection Umbrella currently lets users query the handling status of data risks and audit data security processing.
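To make the three masking methods named in the steps above (pseudonym, hash, and masking) more concrete, here is a minimal Python sketch of what each kind of rule typically does. The article does not describe how Data Protection Umbrella implements them, so the function names, the salt and key parameters, and the choice of SHA-256 and HMAC are assumptions made for illustration only.

```python
import hashlib
import hmac
import string


def mask_middle(value, keep_start=3, keep_end=4):
    """Masking: cover the middle of the value with '*', keeping both ends."""
    if len(value) <= keep_start + keep_end:
        return "*" * len(value)
    hidden = len(value) - keep_start - keep_end
    return value[:keep_start] + "*" * hidden + value[len(value) - keep_end:]


def hash_value(value, salt):
    """Hash: replace the value with a salted SHA-256 digest (one-way)."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def pseudonymize(value, key):
    """Pseudonym: replace the value with a fake value of the same shape.

    Digits stay digits and ASCII letters stay letters, so the result still
    looks like an ID or phone number. The mapping is deterministic for a
    given key, so equal inputs produce equal pseudonyms.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]
        if ch in string.digits:
            out.append(string.digits[b % 10])
        elif ch in string.ascii_lowercase:
            out.append(string.ascii_lowercase[b % 26])
        elif ch in string.ascii_uppercase:
            out.append(string.ascii_uppercase[b % 26])
        else:
            out.append(ch)  # keep separators and other characters unchanged
    return "".join(out)


phone = "13812345678"
print(mask_middle(phone))                          # 138****5678
print(hash_value(phone, salt=b"demo-salt"))        # 64 hex characters
print(pseudonymize(phone, key=b"demo-key"))        # 11 digits, same shape
```

As a rough guide, masking keeps part of the value readable for display, hashing gives a fixed-length token that can still be compared for equality, and a pseudonym keeps the original format so downstream processing that expects that format continues to work.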


This article gives only a brief overview of the security of the big data platform, and many details are omitted for reasons of space. Interested readers can refer to the official documents:

• Data encryption configuration
• Data masking configuration

At present, Alibaba's big data platform offers many capabilities for data storage encryption and desensitized data display, but some gaps remain: data cannot be encrypted at the column, row, or cell level; manual scanning based on user-defined rules is not supported; omni-channel data access and desensitization is not available in all regions; and data monitoring and data auditing for secure data use are not yet complete. Future security work will add more capabilities so that users can use data on the Alibaba Cloud big data platform with peace of mind.

Common questions

• After MaxCompute data storage is encrypted, can the data be accessed through a Hologres foreign table?
A: Yes, but you need to choose the encryption method that uses your own key when enabling storage encryption.

• After MaxCompute data storage is encrypted, do users need to decrypt it manually?
A: No, the system decrypts it automatically when the data is accessed.

• Data classification and desensitization rules have been configured in Data Protection Umbrella. Why does desensitization still not take effect?
A: First, enable page query content desensitization in the DataWorks settings.
