You can connect ApsaraDB for SelectDB to an Alibaba Cloud MaxCompute data source to perform federated analysis.
Overview
MaxCompute is a fast, fully managed data warehouse solution for data at the terabyte (TB) or petabyte (PB) scale. It provides comprehensive data import solutions and various distributed computing models to help you quickly process massive amounts of data, reduce costs, and ensure data security. SelectDB can connect to an Alibaba Cloud MaxCompute data source for federated analysis.
Prerequisites
The open storage (Storage API) feature must be enabled for MaxCompute. For more information about the procedure and supported regions, see Tenant properties.
Procedure
Connect to a SelectDB instance. For more information, see Connect to an instance.
Create a catalog for MaxCompute. The following SQL statement is an example. Replace the parameters with your actual values.
CREATE CATALOG mc PROPERTIES (
    "type" = "max_compute",
    "mc.region" = "cn-beijing",
    "mc.default.project" = "yourProject",
    "mc.access_key" = "yourAccessKeyID",
    "mc.secret_key" = "yourAccessKeySecret",
    "mc.endpoint" = "https://service.cn-beijing-vpc.maxcompute.aliyun-inc.com/api"
);
Parameter
Description
type
The value is fixed to "max_compute".
mc.region
The region where the MaxCompute project is located.
mc.default.project
The name of the MaxCompute project.
mc.access_key
The AccessKey ID. For more information, see Create an AccessKey.
mc.secret_key
The AccessKey secret.
mc.public_access
When mc.endpoint is configured as an Internet endpoint, you must set "mc.public_access" = "true".
mc.endpoint
The Endpoint of the region where the MaxCompute project is located.
(Recommended) VPC Endpoint: Make sure that the ApsaraDB for SelectDB instance and the MaxCompute project are in the same region.
Internet Endpoint: Public network access poses security risks and has limited bandwidth. It is not recommended for production environments. To use public network access, see Use an Internet NAT gateway for public network access.
Note: ApsaraDB for SelectDB versions 4.0 and later require this parameter.
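For reference, a catalog that uses an Internet endpoint might look like the following sketch. The catalog name mc_public and the endpoint URL are assumptions for illustration only; confirm the actual Internet endpoint for your region in the MaxCompute documentation before you use it.

```sql
-- Sketch only: the catalog name and endpoint URL are illustrative assumptions.
CREATE CATALOG mc_public PROPERTIES (
    "type" = "max_compute",
    "mc.region" = "cn-beijing",
    "mc.default.project" = "yourProject",
    "mc.access_key" = "yourAccessKeyID",
    "mc.secret_key" = "yourAccessKeySecret",
    -- Assumed Internet endpoint for cn-beijing; verify it for your region.
    "mc.endpoint" = "https://service.cn-beijing.maxcompute.aliyun.com/api",
    -- Required when mc.endpoint is an Internet endpoint.
    "mc.public_access" = "true"
);
```

Because Internet access is not recommended for production environments, a catalog like this is probably best limited to testing.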
Note: The pay-as-you-go quota for MaxCompute has limits on concurrent queries and resource usage. To increase your resource quota, see Computing resources - Quota management.
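After the catalog is created, you can browse and query it with standard catalog statements. The following is a minimal sketch, not a definitive procedure. It assumes that the MaxCompute project appears as a database inside the mc catalog. The names yourTable, demo_db.orders, order_id, user_id, and user_name are hypothetical placeholders.

```sql
-- Switch to the MaxCompute catalog and list what it exposes.
SWITCH mc;
SHOW DATABASES;

-- Assumption: the MaxCompute project is exposed as a database.
USE yourProject;
SHOW TABLES;

-- Preview a MaxCompute table (yourTable is a placeholder).
SELECT * FROM yourTable LIMIT 10;

-- Federated query: join a SelectDB internal table with a MaxCompute table.
-- demo_db.orders and all column names are hypothetical.
SELECT o.order_id, u.user_name
FROM internal.demo_db.orders AS o
JOIN mc.yourProject.yourTable AS u ON o.user_id = u.user_id
LIMIT 10;
```

Fully qualified names in the form catalog.database.table let a single statement reference both the internal catalog and the MaxCompute catalog without switching between them.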
Column type mapping
The column type mapping for MaxCompute is the same as that of a Hive catalog. For more information, see Hive data source.
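To check how the columns of a specific table were mapped, you can inspect the table from SelectDB. A minimal sketch, with yourProject and yourTable as placeholders:

```sql
-- Show the column names and the SelectDB types they were mapped to.
DESC mc.yourProject.yourTable;
```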