This solution is for use cases where you need to access services such as RDS, HBase clusters, and Hadoop clusters in a virtual private cloud (VPC). You can use MaxCompute SQL, user-defined functions (UDFs), Spark, PyODPS/Mars, foreign tables, or a data lakehouse architecture.
This solution creates two security groups for return traffic. The names are MaxCompute-vpc-xxx and MaxCompute-backup-vpc-xxx. The `xxx` placeholder represents the VPC ID that you provide. Do not modify the rules of these two security groups. Do not use these security groups to manage security rules for other components. The platform is not responsible for any issues that result from these modifications.
Procedure
Step 1: Prepare your account and project
Before you establish a network connection between MaxCompute and the target service, ensure that the following conditions are met.
Create a MaxCompute project. If you use the data lakehouse solution, we recommend that you set the data type edition for your MaxCompute project to the Hive-compatible data type edition.
To access a target service in a VPC, ensure that the VPC owner account, the Alibaba Cloud account that is used to access the MaxCompute project, and the administrator account of the target service environment or cluster all belong to the same Alibaba Cloud account.
Step 2: Establish a direct connection
1. Grant permissions
Grant the operating user the permission to create network connection objects.
The authorized user must be a Project Owner or a user with the tenant-level Super_Administrator or Admin role. For more information, see Role planning.
For the authorization procedure, see List of Object Permissions in a Tenant.
Authorize MaxCompute: This allows MaxCompute to create ENIs in your VPC to enable network connectivity from MaxCompute to your VPC. Click Authorize while you are logged on to your Alibaba Cloud account.
2. Add security group rules
In the VPC-connected instance, you must create a separate security group. This group is used to control access from MaxCompute to resources in the VPC.
You must create a new basic security group. Do not use other types of security groups or security groups that are already in use. MaxCompute creates ENIs in your VPC to access your services and automatically places them in this security group.
Set the outbound rules for this security group to control the destination addresses that MaxCompute jobs, which run on ENIs, can access. If you have no special requirements, you can keep the default outbound rules.
Traffic that enters the ENI is return traffic. Therefore, you must allow all inbound traffic.
Log on to the Virtual Private Cloud (VPC) console.
In the navigation pane on the left, choose VPC. In the upper-left corner, select a region.
On the VPC page, click the Instance ID/Name of the target VPC.
On the VPC details page, click the Resource Management tab.
On the Resource Management tab, in the VPC Resources section, hover over the value for Security Group and click Add.
Set Security Group Type to Basic Security Group.
A Basic Security Group allows outbound traffic by default. An Advanced Security Group denies outbound traffic by default, which prevents access to services in the VPC.
Select the same VPC Network used by the connectivity service.
For more information, see Create a security group.
Configure the security group rules to allow access from MaxCompute.
In the Operation column for the target security group, click Manage Rules.
On the Inbound tab in the Access Rules area, click Edit in the Actions column for the target rule. Configure the settings to allow all Inbound traffic.
Set Action to Allow.
Set Priority to 1.
For Protocol, select All Traffic.
Set Source to the CIDR block of the VPC or vSwitch that contains the Alibaba Cloud service that you want to access.
Destination (Current Instance) defaults to ALL(-1/-1).
For more information, see Security Group Application Guide and Examples.
In an HBase scenario, if HBase cannot grant network access to a security group, you can add the Elastic Network Interface (ENI) IP address created by MaxCompute to the whitelist. Because ENI IP addresses can change, we recommend that you add the CIDR block of the vSwitch for the VPC-connected instance to the whitelist. Log on to the ECS console. In the navigation pane on the left, click Elastic Compute Service to obtain the ENI IP address.
During the network connection creation process, MaxCompute automatically creates two ENIs based on bandwidth requirements. These ENIs are free of charge and are placed in this security group.
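The recommendation to whitelist the vSwitch CIDR block instead of individual ENI IP addresses can be illustrated with Python's standard `ipaddress` module. The following is a minimal sketch: the CIDR block and the ENI addresses are made-up example values, and `covered_by_cidr` is a hypothetical helper name, not part of any MaxCompute tool.

```python
import ipaddress


def covered_by_cidr(cidr: str, addresses: list[str]) -> dict[str, bool]:
    """Return, for each address, whether it falls inside the CIDR block."""
    network = ipaddress.ip_network(cidr, strict=True)
    return {addr: ipaddress.ip_address(addr) in network for addr in addresses}


# Hypothetical values for illustration only. Replace them with the CIDR block
# of your vSwitch and the ENI IP addresses shown in the console.
VSWITCH_CIDR = "192.168.10.0/24"
ENI_IPS = ["192.168.10.17", "192.168.10.203"]

if __name__ == "__main__":
    for addr, inside in covered_by_cidr(VSWITCH_CIDR, ENI_IPS).items():
        print(f"{addr} in {VSWITCH_CIDR}: {inside}")
```

Because every address in the vSwitch falls inside the block, a whitelist entry for the CIDR block keeps working even if an ENI is assigned a different IP address later.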
3. Create a network connection between MaxCompute and the target VPC
In the MaxCompute console, an Alibaba Cloud account or a RAM user with the Super_Administrator or Admin role at the MaxCompute tenant level can create a connection to a VPC network. For more information, see MaxCompute tenant-level roles. To create the connection, perform the following steps:
Log on to the MaxCompute console and select a region in the upper-left corner.
In the navigation pane on the left, open the Network Connection page.
On the Network Connection page, click Add Network Connection.
In the Add Network Connection dialog box, configure the parameters as prompted and click OK. When you add a network connection for the first time, you must first grant authorization to allow the MaxCompute platform proxy to request network interface cards. Otherwise, the connection cannot be created.
The following table describes the parameters.
| Parameter | Required | Description |
| --- | --- | --- |
| Connection Name | Yes | Enter a custom name for the connection. |
| Type | Yes | The default value is Passthrough. |
| Region | Yes | The system automatically populates this parameter based on the region you selected in the upper-left corner. |
| VPC Selected | Yes | A virtual private cloud (VPC) is an isolated virtual network. It provides a secure and configurable private network space similar to a traditional data center. To create a new VPC, see Create or delete a VPC. |
| Switch | Yes | A vSwitch defines a subnet. Service interconnection is enabled between different vSwitches in the same VPC. Deploy resources across vSwitches in different zones to protect your application from failures in a single zone. If no vSwitch is available, see Create or delete a vSwitch. |
| Security group | Yes | A security group acts as a virtual firewall for your resources. Manage security groups and their rules to implement fine-grained network isolation and access control. To create a security group, see Create a security group. |
4. Configure the security group for the target service
After the ENI-based leased line connection is established, you must add security rules to the security group of the service that you want to access. These rules grant the security group that represents MaxCompute access to specific service ports, such as 9200 and 31000.
For example, to access ApsaraDB RDS, you must add a rule to the RDS instance's security group to allow access from the security group that you created in Step 2. If the service that you want to access does not support adding security groups and only supports adding IP addresses, you must add the entire CIDR block of the vSwitch in which the target service resides.
Configure the security group for the Hadoop cluster.
Configure the security group for the Hadoop cluster to allow access from MaxCompute. The security group must be configured as follows:
Add inbound rules to the security group of the Hadoop cluster.
Set the authorization object to the security group that contains the ENI. This is the security group that you created in Step 2.
Allow access to the HiveMetaStore port: 9083.
Allow access to the HDFS NameNode port: 8020.
Allow access to the HDFS DataNode port: 50010.
For example, when you connect to a Hadoop cluster created on Alibaba Cloud E-MapReduce, the required security group rules are shown in the following figure. For more information, see Create a security group.
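After the inbound rules are in place, it can help to confirm that the three ports listed above accept TCP connections. The following is a minimal sketch that uses only Python's standard `socket` module. The helper names are hypothetical, and the check only shows that the services are listening and reachable from the machine where the script runs. It does not prove that the MaxCompute ENIs are allowed through the security group.

```python
import socket

# Ports named in this section. Adjust them if your cluster uses other ports.
HADOOP_PORTS = {
    "HiveMetaStore": 9083,
    "HDFS NameNode": 8020,
    "HDFS DataNode": 50010,
}


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host: str, ports: dict[str, int], timeout: float = 3.0) -> dict[str, bool]:
    """Check each named port on the host and return a name-to-result mapping."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}
```

For example, calling `check_ports("<cluster-node-ip>", HADOOP_PORTS)` from a host in the same VPC returns a dictionary that maps each service name to `True` or `False`. The address is a placeholder that you must replace with a node of your own cluster.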

Configure the security group for the HBase cluster.
Add the security group created for MaxCompute or the ENI IP address to the security group or IP address whitelist of the HBase cluster.
For example, when you connect to an Alibaba Cloud HBase cluster:
Log on to the HBase Management Console. In the upper-left corner, select a region.
In the navigation pane on the left, select Clusters.
On the Clusters page, click the name of the target cluster.
In the navigation pane on the left, select Access Control.
On the Whitelist Setting and Security Group tabs, you can Add Whitelist or Add Security Group. If you cannot add a security group, add the IP address of the ENI created by MaxCompute on the Whitelist Setting tab. The ENI IP address might change if the MaxCompute configuration is modified. To avoid connectivity issues, we recommend that you add the CIDR block of the vSwitch to the whitelist instead.
For more information about adding security groups or IP address whitelists, see Set whitelists and security groups.
Configure the RDS security group.
Add the security group created for MaxCompute or the ENI IP address to the RDS security group or IP address whitelist.
For example, when you connect to ApsaraDB RDS:
Log on to the ApsaraDB RDS for MySQL console.
In the navigation pane on the left, click Instances.
In the navigation pane on the left, click Whitelist and Security Group.
You can add IP address whitelists or security groups on the Whitelist Settings and Security Group tabs. Because the ENI IP address may change when the MaxCompute configuration is modified, we recommend that you add the CIDR block of the vSwitch to the whitelist.
For more information about setting up security groups or IP address whitelists, see Set a security group or Set an IP address whitelist.
Step 3: Use the network connection to access addresses in the VPC
To use SQL or Spark to access a VPC network, add the following configurations after you complete the preceding steps to enable leased line network connectivity.
For other types of jobs, adjust the configurations as needed.
Access a VPC using SQL
You can use UDFs to access VPC networks. For more information, see Access VPC network resources using a UDF. The following code provides an example:
-- Set the network connection name. This is the name of the connection configured based on the leased line connection solution.
-- This setting is valid only for the current session.
SET odps.session.networklink=testLink;
You can access a VPC network from a foreign table. The following code provides an example:
-- Set the parameter in the CREATE TABLE statement.
TBLPROPERTIES(
  'networklink'='<networklink_name>'
)
You can configure a network link for a data lakehouse. For more information, see Data Lakehouse 2.0 User Guide.
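When these settings are generated from a script, for example before a job is submitted with PyODPS, the session setting and the `TBLPROPERTIES` clause can be assembled from the connection name. The following is a minimal sketch. The helper names are hypothetical, and the character restriction on the connection name is an assumption of this sketch to keep arbitrary text out of the statement, not a documented naming rule.

```python
import re

# Assumption of this sketch: accept only letters, digits, and underscores.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def _checked(networklink: str) -> str:
    """Return the name unchanged, or raise ValueError if it looks unsafe."""
    if not _NAME_PATTERN.fullmatch(networklink):
        raise ValueError(f"unexpected network connection name: {networklink!r}")
    return networklink


def session_setting(networklink: str) -> str:
    """Build the session-level SET statement for a UDF query."""
    return f"SET odps.session.networklink={_checked(networklink)};"


def session_hints(networklink: str) -> dict[str, str]:
    """Build the same setting as a key-value mapping."""
    return {"odps.session.networklink": _checked(networklink)}


def tblproperties_clause(networklink: str) -> str:
    """Build the TBLPROPERTIES clause for a foreign table definition."""
    return f"TBLPROPERTIES('networklink'='{_checked(networklink)}')"


if __name__ == "__main__":
    print(session_setting("testLink"))
    print(tblproperties_clause("testLink"))
```

The mapping returned by `session_hints` has the shape that PyODPS accepts in the `hints` argument of `execute_sql`. Whether that is the right way to pass the setting for your job type is worth confirming against the PyODPS documentation.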
Access a VPC using Spark
To run a Spark job that connects to services in the target VPC, you must add the following configurations. For more information, see Spark accessing VPC-connected instances.
spark.hadoop.odps.cupid.eni.enable = true
spark.hadoop.odps.cupid.eni.info = regionid:vpc id
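The two Spark properties above can also be produced programmatically, which avoids typos in the `regionid:vpc id` value. The following is a minimal sketch in plain Python. The function names are hypothetical, and the region ID and VPC ID in the example are placeholders, not real resources.

```python
def eni_spark_conf(region_id: str, vpc_id: str) -> dict[str, str]:
    """Return the two ENI-related Spark properties named in this section."""
    if not region_id or not vpc_id:
        raise ValueError("both region_id and vpc_id are required")
    return {
        "spark.hadoop.odps.cupid.eni.enable": "true",
        # The value joins the region ID and the VPC ID with a colon.
        "spark.hadoop.odps.cupid.eni.info": f"{region_id}:{vpc_id}",
    }


def render_conf(conf: dict[str, str]) -> str:
    """Render the properties as 'key = value' lines for a configuration file."""
    return "\n".join(f"{key} = {value}" for key, value in conf.items())


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    print(render_conf(eni_spark_conf("cn-hangzhou", "vpc-example123")))
```

The rendered lines can be pasted into the Spark configuration file of the job, or the dictionary can be passed key by key to whichever client submits the job.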
(Optional) Step 4: Add a whitelist
If access control is enabled on your server, you must add the security group that you created for the leased line connection to the server's whitelist.