Data Security Solutions and Technologies on the Cloud

Learn about Alibaba Cloud's reliable solutions and technologies for securing your business data.

Kerberos-Based Big Data Security Solution

To ensure the security of Hadoop clusters, user authentication and authorization must be implemented, in addition to firewalls, to contain attacks originating from the inside.

The Internet is never a safe place. Most of the time, we rely too heavily on firewalls to contain security problems. Unfortunately, this reliance assumes that attacks always come from the outside, while the truly destructive attacks often come from the inside.

In recent years, websites such as The Hacker News have reported on widespread attacks and ransom demands caused by data security problems. Versions earlier than Hadoop 1.0.0 provided no security support and assumed that every role in a cluster was trusted. As a result, user access was not authenticated and malicious users could easily access clusters by masquerading as legitimate users.

To ensure the security of Hadoop clusters, user authentication and authorization must be implemented. The following common solutions have been developed to address this:

  1. Authentication: MIT Kerberos, Azure AD, Kerby
  2. Authorization: Apache Sentry (Cloudera), Apache Ranger (Hortonworks)

Hadoop Cluster Support for Kerberos

After Hadoop 1.0.0 was released in 2012, Hadoop started to support Kerberos to ensure that the nodes in a cluster are trustworthy.

Before the cluster is deployed, Kerberos stores the authentication key on a trusted node. When the cluster runs, the nodes in the cluster are authenticated by the key, and only the successfully authenticated nodes can be used to provide services. Impersonated nodes cannot communicate with any nodes in the cluster because they do not carry the key information in advance. This prevents the malicious utilization of or tampering with the Hadoop cluster, ensuring its trustworthiness and security.
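The idea of admitting only nodes that hold a pre-distributed key can be illustrated with a toy challenge-response check. This is not the Kerberos protocol itself, which relies on a Key Distribution Center and tickets. It is a minimal sketch that assumes each trusted node received a secret key before deployment; the names `NODE_KEYS`, `respond`, and `verify_node` are hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical key table: keys are handed to trusted nodes before deployment.
NODE_KEYS = {"datanode-1": secrets.token_bytes(32)}


def respond(key: bytes, challenge: bytes) -> bytes:
    """Node side: prove possession of the key without sending the key itself."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify_node(node_id: str, challenge: bytes, response: bytes) -> bool:
    """Cluster side: accept the node only if its response matches the stored key."""
    key = NODE_KEYS.get(node_id)
    if key is None:
        return False  # unknown node: no key was provisioned for it
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


challenge = secrets.token_bytes(16)  # fresh random value for every attempt
genuine = respond(NODE_KEYS["datanode-1"], challenge)
impostor = respond(secrets.token_bytes(32), challenge)  # attacker guesses a key

print(verify_node("datanode-1", challenge, genuine))   # True
print(verify_node("datanode-1", challenge, impostor))  # False
```

An impersonated node fails the check because it cannot compute the correct response without the key it was never given.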

Introduction to Kerberos

Kerberos is a network authentication protocol that was developed at MIT as part of Project Athena to protect network services. It is named after the three-headed dog from Greek mythology. It provides strong authentication for client-to-server access by using key encryption technology. Kerberos can prevent eavesdropping and replay attacks and protect data integrity. It is a system that uses symmetric key algorithms to manage keys. Some Kerberos-based products also use public key encryption for authentication.

So far, the latest version of Kerberos is V5. Versions V1 to V3 were used only within MIT. In its early development, Kerberos was classified as a munition under U.S. export controls because it used DES encryption, and its export was banned. A team at the Royal Institute of Technology (KTH) in Sweden then released its own implementation of Kerberos V4, namely KTH-KRB. Later, this team released a V5 implementation (Heimdal), which is one of the most common implementations of Kerberos V5.

The Kerberos V5 implementation referred to in this document is MIT Kerberos, which is updated roughly every six months. At the time of writing, the latest version of MIT Kerberos is 1.16.2, which was released on November 1, 2018.

Related Blogs

Out-of-the-Box MaxCompute Data Security Solution

Alibaba Cloud MaxCompute and DataWorks can be used together for customizing project-based security configurations based on tenants' requirements.

MaxCompute is a multi-tenant big data processing platform that supports project-based security configurations to meet tenants' requirements for data security. Project owners can customize their external account support and authentication models to protect their project data.

Typically, MaxCompute and DataWorks are used together for data protection configuration. In this scenario, the data security solution is as follows:

Prevent Data from Being Downloaded Locally

Prevent Data Leakage or Local Downloads

Method 1:

The data protection mechanism is also known as project protection. You can enable this feature in the MaxCompute console to disable exporting data from the server end.

set ProjectProtection=true;
-- Enables project protection: data can be imported but not exported.
-- The default value of ProjectProtection is false.

Method 2:

You can use DataWorks to analyze data and download the analysis results displayed in the IDE. This behavior is controlled by the Select Result Can Be Downloaded switch under Project Management > Project Configuration. Turn the switch off to prevent results from being downloaded.

MaxCompute and DataWorks Security Management Guide: Examples

This article provides reference examples for security administrators working with MaxCompute and DataWorks.

The article MaxCompute and DataWorks Security Management Guide: Basics describes the security models of MaxCompute and DataWorks, the correlation between the two products, and various security actions. This article builds on that foundation with reference examples for security administrators.

Project Creation Example

Now that we know the security models of MaxCompute and DataWorks and how the permissions of the two products relate, this article uses two basic business requirements to describe how to create and manage a project.

Scenario 1: Collaborative Business Development for ETL Tasks

In a collaborative development scenario, responsibilities and tasks are clearly assigned to members, and a regular development, debugging, and publishing procedure is required. Production data must be strictly controlled.

  1. A DataWorks project allows multiple members to develop collaboratively.
  2. The basic member roles in DataWorks (project administrator, developer, maintainer, deployer, and guest) are generally enough to assign duties explicitly among members.
  3. The Development and Production projects created in DataWorks can be used to perform regular development, debugging, and publishing and to implement strict production data control.

Cyber Security Tips for ECS Instances

Cyber security is a major concern in the information era. Here you can find useful information on ECS and some security hardening requirements.

An Elastic Compute Service (ECS) instance, an ApsaraDB for RDS MySQL database, and a Server Load Balancer, combined with an Elastic IP address and a security group, make up the most common deployment for applications hosted on the web. Although such a system functions well, this type of deployment is deficient in terms of cyber security. This is especially true for servers used in production scenarios.

This article will focus on ECS and its security hardening requirements, which are easy to follow.

Reduce External Exposure of Alibaba Cloud Resources

As part of the design of your offering, you should have controls in place to ensure that each ECS instance can access only the data and resources it is authorized to access. What controls can you put in place? What assurance can you offer that one ECS instance's data can't be accessed by another? This all boils down to the principle of least privilege: the less you expose your offering to the external world, the more secure your design will be.

In Alibaba Cloud this can be achieved through security group and network segregation of your offering.

Security groups play an important role in segregating traffic. A security group:

  1. Operates at the instance level (first layer of defense)
  2. Supports allow rules only
  3. Is stateful: Return traffic is automatically allowed, regardless of any rules
  4. Evaluates all rules before deciding whether to allow traffic
  5. Applies to an ECS instance only if someone specifies the security group when launching the instance, or associates the security group with the instance later on
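The behavior listed above (allow rules only, stateful return traffic) can be sketched as a small model. This is an illustrative simplification under assumptions, not the actual security group implementation; the `AllowRule` and `SecurityGroup` classes and the addresses used are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AllowRule:
    """One inbound allow rule: a source CIDR block and a destination port."""
    source_cidr: str
    port: int


class SecurityGroup:
    def __init__(self, rules):
        self.rules = list(rules)
        self.tracked = set()  # connections already admitted (the "state")

    def allow_inbound(self, src_ip: str, port: int) -> bool:
        """Traffic is dropped unless at least one allow rule matches it."""
        # any() stops at the first match; with allow-only rules this gives
        # the same result as evaluating every rule.
        allowed = any(
            rule.port == port and ip_address(src_ip) in ip_network(rule.source_cidr)
            for rule in self.rules
        )
        if allowed:
            self.tracked.add((src_ip, port))
        return allowed

    def allow_return(self, dst_ip: str, port: int) -> bool:
        """Return traffic passes if it belongs to an admitted connection."""
        return (dst_ip, port) in self.tracked


sg = SecurityGroup([AllowRule("10.0.0.0/16", 443)])
print(sg.allow_inbound("10.0.1.5", 443))     # True: matches the allow rule
print(sg.allow_inbound("203.0.113.9", 443))  # False: no rule allows it
print(sg.allow_return("10.0.1.5", 443))      # True: reply to an admitted connection
```

Because nothing is allowed by default, every additional rule widens exposure, which is why keeping the rule list short supports the least privilege principle.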

Network segregation: After you create a VPC, the next logical construct is the vSwitch. vSwitches in Alibaba Cloud are sub-networks within a VPC and are analogous to subnets. You can add one or more vSwitches in each availability zone (AZ); however, each vSwitch must reside exclusively within one AZ and cannot span AZs. For example, a production environment can be segregated across two AZs on different vSwitches, with a staging environment on a completely separate vSwitch.
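A layout like this can be sanity-checked before it is created. The sketch below uses Python's standard `ipaddress` module to confirm that each vSwitch CIDR block lies inside the VPC and that no two blocks overlap. The CIDR ranges, zone names, and the `validate_layout` helper are assumptions chosen for illustration.

```python
from ipaddress import ip_network

vpc = ip_network("192.168.0.0/16")

# Hypothetical layout: vSwitch name -> (zone, CIDR block).
# Each vSwitch maps to exactly one zone, so it cannot span zones.
vswitches = {
    "prod-a": ("zone-a", ip_network("192.168.1.0/24")),
    "prod-b": ("zone-b", ip_network("192.168.2.0/24")),
    "staging": ("zone-a", ip_network("192.168.10.0/24")),
}


def validate_layout(vpc, vswitches):
    """Return a list of problems; an empty list means the layout is consistent."""
    problems = []
    items = list(vswitches.items())
    for name, (_zone, cidr) in items:
        if not cidr.subnet_of(vpc):
            problems.append(f"{name}: {cidr} is outside VPC {vpc}")
    for i, (name_a, (_zone_a, cidr_a)) in enumerate(items):
        for name_b, (_zone_b, cidr_b) in items[i + 1:]:
            if cidr_a.overlaps(cidr_b):
                problems.append(f"{name_a} overlaps {name_b}")
    return problems


print(validate_layout(vpc, vswitches))  # []
```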

Related Courses

Secure Your Data on Alibaba Cloud

With this certification course, you will understand where data should be secured in Alibaba Cloud, such as: storage technology, backup and recovery solutions, how to transmit data securely, which encryption algorithm to choose, and so on. You will also master the core skills of data security protection on Alibaba Cloud platform, including: how to implement automatic remote backup of data, how to implement encrypted storage in cloud environment, how to generate SSL certificate, etc.

Secure Your Data in Alibaba Cloud (French version)

With this certification course, you will understand where data should be secured in Alibaba Cloud, such as: storage technology, backup and recovery solutions, how to transmit data securely, which encryption algorithm to choose, and so on. You will also master the core skills of data security protection on Alibaba Cloud platform, including: how to implement automatic remote backup of data, how to implement encrypted storage in cloud environment, how to generate SSL certificate, etc.

The Backup and Recovery of Common Cloud Databases

The security of cloud databases is critical because it directly affects the security and stable operation of cloud-based applications. Only by understanding the principles and methods of backing up and recovering commonly used cloud databases can cloud database administrators protect them effectively. Through this course, you will learn the backup and recovery principles, types, and methods for databases on the cloud, as well as the backup and recovery methods related to Alibaba Cloud RDS.

Related Documentation

Enter Data Security Guard

Enter the start page

When you first enter Data Security Guard, the Guide page appears. It introduces the core features and usage process of Data Security Guard to help you gain a basic understanding of the product.

Click Try now to go to the Data Security Guard authorization page. If the tenant administrator has already granted authorization, you are taken directly to the Data Security Guard home page.

Enter the authorization page

Only the tenant Administrator can authorize the provision of Data Security Guard.

ECS data security best practices

This topic describes the recommended best practices for improving the data security of ECS instances from an O&M perspective.

Best practices

  1. Back up data regularly
  2. Design security domains
  3. Set security group rules
  4. Set logon passwords
  5. Improve server port security
  6. Protect against application vulnerabilities
  7. Collect security information

Back up data regularly

Backing up data regularly reduces the risk of data loss due to system failures, operation errors, and security problems. The snapshot function of ECS instances can be used to back up data regularly. To use this function, you must first customize a backup policy by performing one of the following operations:

  1. Create a snapshot.
  2. Define automatic snapshot policies and apply automatic snapshot policies to disks.

Next, you need to create automatic snapshots regularly, such as on a daily basis, and then store snapshots for a period of at least seven days. This will significantly increase disaster tolerance, minimizing potential data losses.
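The retention rule described above can be sketched in a few lines. This is a minimal illustration that assumes snapshots are tracked as a mapping from snapshot ID to creation time. It does not call any Alibaba Cloud API, and the `expired_snapshots` helper and the snapshot IDs are hypothetical.

```python
from datetime import datetime, timedelta


def expired_snapshots(snapshots, now, retention_days=7):
    """Return IDs of snapshots older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [snap_id for snap_id, created in snapshots.items() if created < cutoff]


now = datetime(2024, 1, 10)
# Hypothetical daily snapshots covering the ten days up to `now`.
snapshots = {f"snap-{d}": now - timedelta(days=d) for d in range(10)}

print(expired_snapshots(snapshots, now))  # ['snap-8', 'snap-9']
```

Only snapshots strictly older than seven days are reported, so a full week of daily snapshots is always kept.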

Related Market Products

Fortinet FortiManager (BYOL) Centralized Security Management

Offers centralized configuration, policy-based provisioning, update management and end-to-end network monitoring for your Fortinet installation - You can further simplify management of your network security by grouping devices into geographic or functional administrative domains (ADOMs) - Easily manage VPN policy and configuration while leveraging FortiManager virtual appliances as a local distribution point for software and policy updates

Fortinet FortiAnalyzer (BYOL) Security Logging and Reporting

Pre-defined and customized charts help identify attack patterns, monitor and maintain acceptable use policies, and demonstrate policy compliance - Scalable architecture allows the device to run in collector or analyzer modes for optimized log processing - Advanced features such as event correlation, forensic analysis, and vulnerability assessment provide essential tools for in-depth protection of complex networks

Related Products

Key Management Service

Alibaba Cloud Key Management Service (KMS) provides secure and compliant key management and cryptography services to help you encrypt and protect sensitive data assets. KMS is integrated with a wide range of Alibaba Cloud services to allow you to encrypt data across the cloud and to control its distributed environment. KMS provides key usage logs via ActionTrail, supports custom key rotation, and provides HSMs that have passed FIPS 140-2 Level 3 or other relevant validation, to help you meet your regulatory and compliance needs.

Sensitive Data Discovery and Protection

Sensitive Data Discovery and Protection (SDDP) automatically discovers sensitive data in a large amount of user-authorized data, and detects, records, and analyzes sensitive data consumption activities. SDDP detects security compliance violations and predicts risks to help you prevent data leakage and meet the General Data Protection Regulation requirements.

Alibaba Clouder
