Functions and features - - Alibaba Cloud Documentation Center

This topic describes the features of AnalyticDB for MySQL.

Compute engines

AnalyticDB for MySQL supports three types of compute engines: Spark, XIHE massively parallel processing (MPP), and XIHE bulk synchronous parallel (BSP).

Spark engine

Data Lakehouse Edition (V3.0) is fully compatible with Apache Spark, a multi-language programmable compute engine, and supports seamless migration of applications. This engine is ideal for complex extract-transform-load (ETL) processing and machine learning scenarios.

XIHE MPP engine

The XIHE MPP engine uses an MPP architecture and computes data for all tasks of a query. Data pipelines are used to implement stream computing, meeting low-latency ad hoc analysis requirements.

XIHE BSP engine

The XIHE BSP engine divides tasks in a directed acyclic graph (DAG) and uses a batch computing architecture to compute data for each task. This engine supports disk storage and is ideal for complex analysis scenarios that involve large data volumes, high throughput, and limited resources. You are charged for the resources you use on a pay-as-you-go basis. Exclusive resources are allocated to each query, preventing resource contention among queries. Furthermore, the XIHE BSP engine provides powerful failover capabilities and can retry failed tasks.

The XIHE BSP engine does not allow you to use user-defined functions (UDFs) or write Hudi tables.

Comparison between XIHE MPP and XIHE BSP

When you submit a query to the frontend node and specify an interactive resource group to execute the query, the query is executed in XIHE MPP mode on the permanent compute nodes. If you specify a job resource group to execute the query, the query is executed in XIHE BSP mode on the temporary compute nodes.

Item	XIHE MPP	XIHE BSP
Resource group	Interactive resource group	Job resource group
Response time (RT)	Within milliseconds	Within seconds or minutes
Scenario	Latency-sensitive applications with high queries per second (QPS)	High-throughput applications that have relatively low latency requirements
Isolation	Among resource groups	Among queries
SQL syntax	No difference

Storage engines

XUANWU storage engine (C-Store)

The XUANWU storage engine provides cost-effective enterprise-grade data storage with high reliability, availability, and performance, laying a foundation for high-throughput real-time writes and high-performance real-time queries.

Hudi storage engine

Data Lakehouse Edition (V3.0) uses cost-effective Object Storage Service (OSS) and open source Apache Hudi storage solutions to implement incremental data processing in near real time.

Connection methods

The following table describes the connection methods that are supported by AnalyticDB for MySQL.

Connection method	References
MySQL command-line tool	Use the MySQL command-line tool to connect to AnalyticDB for MySQL
Business system	Java, Druid connection pool, Python, PHP, C# for macOS, and Go
Client	DBeaver, DbVisualizer, Navicat, and SQL WorkBench/J
BI tool	FineBI, Quick BI, Yonghong BI, DataV, Tableau, QlikView, FineReport, Power BI, and Smartb

Security

IP address whitelists

By default, AnalyticDB for MySQL clusters do not allow access from external connections. To access an AnalyticDB for MySQL cluster, you must add the IP address or CIDR block of your client to a whitelist of the cluster. For more information, see Configure a whitelist.

SQL audit

SQL audit records the information about DML and DDL operations, helping you perform fault analysis, behavior analysis, and security audit and improving database security. For more information, see Configure SQL audit.

Disk encryption

AnalyticDB for MySQL provides the disk encryption feature. This feature encrypts the data on each disk in your cluster based on block storage. This way, your data cannot be stolen even if it is leaked. For more information, see Disk encryption.

Access control

Resource Access Management (RAM) controls the access permissions of RAM users on cloud resources. After an AnalyticDB for MySQL cluster is created by a RAM user, only the RAM user and the Alibaba Cloud account to which they belong can view and manage the cluster. If multiple users in your organization need to use the AnalyticDB for MySQL cluster, you can create multiple RAM users and grant them the permissions to view or manage the AnalyticDB for MySQL cluster. For more information, see Manage RAM users and permissions.

Database permission control

After you create a database account, you can perform database operations. For example, you can create or delete databases, tables, and views. You can also insert, change, and query data. AnalyticDB for MySQL provides two types of database accounts: privileged accounts and standard accounts. A privileged account can grant standard accounts different levels of permissions, such as cluster-level, database-level, table-level, and column-level permissions. For more information, see Account types and Database permission model.

Monitoring and alerting

AnalyticDB for MySQL provides performance metrics to help you obtain a clear understanding of the health status and performance of cluster nodes. For more information, see View monitoring information. In addition, AnalyticDB for MySQL can send alerts to contacts if the values of CPU utilization, disk usage, IOPS, query duration, and number of database connections exceed specified thresholds. For more information, see Configure an alert rule.

Backup and restoration

AnalyticDB for MySQL supports periodic full backup and log backup, which can effectively prevent data loss.

Full data backup

AnalyticDB for MySQL compresses full data snapshots of a cluster and stores them on offline storage media. If you want to perform a full data restoration, AnalyticDB for MySQL clones a cluster, downloads the backup sets, and then restores data to the new cluster. For more information, see Manage backups.

Log backup

AnalyticDB for MySQL uploads redo logs in parallel from multiple nodes to OSS to store logs in real time. You can restore the data of a cluster to a specified point in time by using the full data backups and their subsequent redo logs up until the desired point in time. This ensures the data security within this period of time. For more information, see Manage backups.

Data restoration from backups

AnalyticDB for MySQL supports full data restoration and restoration to a point in time. AnalyticDB for MySQL clones a cluster and restores data of the downloaded backup sets to the new cluster. For more information, see Clone a cluster.

Intelligent diagnostics and optimization

Schema optimization

Schema design and optimization can significantly reduce database costs and improve query performance. AnalyticDB for MySQL regularly analyzes your algorithms based on the performance metrics of SQL queries and relevant information such as tables and indexes. It automatically offers optimization suggestions and one-click deployment, which lets you easily apply the suggestions. This greatly simplifies the optimization process. For more information, see Schema optimization.

SQL pattern

The SQL pattern feature is designed for real-time SQL statements. AnalyticDB for MySQL can group similar SQL statements into an SQL pattern and perform intelligent diagnostics and analysis on the pattern, improving the diagnostics efficiency. The diagnostic results of SQL patterns can serve as an effective basis for database optimization. For more information, see Schema optimization.

SQL diagnostics

AnalyticDB for MySQL provides diagnostic results and optimization suggestions based on statistics of queries, stages, and operators. For more information, see Introduction to SQL diagnostics.

Data import and export

You can import data from other databases, OSS, Tablestore, Apsara File Storage for HDFS, MaxCompute, ApsaraMQ for Kafka, or Log Service to AnalyticDB for MySQL, and export data from AnalyticDB for MySQL to other databases, OSS, Apsara File Storage for HDFS, or MaxCompute. For more information, see Data import overview and Data export overview.

Data ingestion

Data source management

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) supports ApsaraMQ for Kafka, Log Service, and Hive data sources. Each data synchronization or migration job requires a data source. Data sources can be reused among different data synchronization or migration jobs. This simplifies the process of creating duplicate jobs. You can create, query, modify, and delete data sources.

Data synchronization

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) allows you to create data synchronization jobs to synchronize data from ApsaraMQ for Kafka and Log Service to Data Lakehouse Edition in real time. This helps meet requirements such as near-real-time data ingestion, full data archiving, and elastic analysis. For more information, see Use data synchronization to synchronize Kafka data to Data Lakehouse Edition and Use data synchronization to synchronize Log Service data to Data Lakehouse Edition.

Data migration

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) allows you to create data migration jobs to migrate Hive data to OSS. For more information, see Use data migration to migrate Hive data to Data Lakehouse Edition.

Metadata discovery

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) can automatically discover OSS buckets and data objects that are stored in the same region, and create and update data lake metadata. For more information, see Use metadata discovery to import data to Data Lakehouse Edition.

Data management

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) allows you to manage databases and tables in a visualized manner. You can view information about databases, tables, and views in the AnalyticDB for MySQL console. For more information, see Manage data.

Job development

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) provides two engines to implement different job development methods: the open source Spark engine and the in-house XIHE engine that is developed by the AnalyticDB for MySQL team. For more information, see Job development.

Job scheduling

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) provides job scheduling for batch SQL and Spark applications to help you perform complex ETL processing. For more information, see Perform job scheduling in Data Lakehouse Edition.

SQL

XIHE SQL

When you execute XIHE SQL statements, AnalyticDB for MySQL uses the in-house XIHE engine that is highly compatible with MySQL. The following table describes the syntax, functions, and data types that are supported by XIHE SQL.

Category	References
DDL	Databases: CREATE DATABASE, CREATE EXTERNAL DATABASE, DROP DATABASE, and SHOW DATABASES Tables: CREATE TABLE, CREATE EXTERNAL TABLE, CREATE TABLE AS SELECT (CTAS), ALTER TABLE, ALTER TABLE (external tables), TRUNCATE TABLE, SHOW TABLES, and SHOW CREATE TABLE Views: CREATE VIEW and DROP VIEW Materialized views: Create a materialized view, Refresh a materialized view, Manage materialized views, and Query data from a materialized view Resource groups: CREATE RESOURCE GROUP, ALTER RESOURCE GROUP, and DROP RESOURCE GROUP
DML	INSERT INTO, INSERT OVERWRITE SELECT, INSERT ON DUPLICATE KEY UPDATE, INSERT SELECT FROM, REPLACE INTO, REPLACE SELECT FROM, UPDATE, and DELETE
DQL	SELECT
DCL	CREATE USER, RENAME USER, DROP USER, GRANT, and REVOKE
Function	Control flow functions, Arithmetic operators, Numeric functions, Date and time functions, String functions, Regular functions, Bit functions and operators, Operation functions, Spatial functions, Accessor functions, Spatial constructor functions, JSON functions, Window functions, Aggregate functions, Variable-length binary functions, CAST functions, Roaring bitmap functions, and Funnel and retention functions
Data type	Basic data types, Array, Map, and JSON

Spark SQL

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) supports the Spark engine that is fully compatible with Apache Spark. You can use Spark SQL to perform data lake operations. For more information, see SQL Reference.