To develop and manage EMR Serverless StarRocks tasks in DataWorks, you must first associate your EMR Serverless StarRocks instance as a computing resource. Once associated, you can use the resource for data development.
Prerequisites
An EMR Serverless StarRocks instance is created.
The instance must be created in the same region as your DataWorks workspace. If the regions differ, you cannot associate the instance as a computing resource for the workspace.
A DataWorks workspace is created. The RAM user is a member of the workspace and has the workspace administrator role.
Important: This feature is available only for workspaces that use the new version of DataStudio.
For workspaces that do not use the new version of DataStudio, you can create a StarRocks data source in DataWorks.
A resource group is associated with the workspace, and network connectivity is established.
If you use a serverless resource group, ensure that the computing resource instance can connect to the serverless resource group.
If you use an exclusive resource group of an earlier version, ensure that the computing resource instance can connect to the exclusive resource group for scheduling.
Add the internal CIDR block of the resource group to the internal IP address allowlist of the EMR Serverless StarRocks instance to ensure network connectivity.
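Before you edit the allowlist, it can help to check whether the resource group's CIDR block is already covered by an existing entry. The following is a minimal sketch that uses only the Python standard library. The function name `is_covered` and the sample CIDR values are hypothetical and do not come from DataWorks or EMR; copy the real values from the resource group and instance pages in the console.

```python
import ipaddress
from typing import Iterable


def is_covered(resource_group_cidr: str, allowlist: Iterable[str]) -> bool:
    """Return True if the CIDR block lies entirely inside one allowlist entry.

    Hypothetical helper for illustration. It does not detect coverage that is
    only achieved by combining several smaller allowlist entries.
    """
    target = ipaddress.ip_network(resource_group_cidr, strict=False)
    for entry in allowlist:
        # A bare IP address such as "10.0.0.5" is treated as a /32 network.
        net = ipaddress.ip_network(entry, strict=False)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        if target.version == net.version and target.subnet_of(net):
            return True
    return False


# Example values only; replace them with the values shown in your consoles.
print(is_covered("192.168.10.0/24", ["192.168.0.0/16"]))
```

If the function returns False, the CIDR block of the resource group still needs to be added to the allowlist of the instance.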
Limitations
Region limitations: This feature is available only in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Indonesia (Jakarta), Germany (Frankfurt), US (Silicon Valley), and US (Virginia).
Permission limitations:
User role
Required permissions
Alibaba Cloud account
No additional permissions are required.
RAM user or RAM role
Only workspace members with the O&M or Workspace Administrator role, or those with the AliyunDataWorksFullAccess permission, can create computing resources. For more information, see Grant the workspace administrator role to a user.
Go to the computing resource list page
- Log on to the DataWorks console. In the navigation pane on the left, switch to the target region and click . Select your workspace from the drop-down list and click Go to Management Center.
- In the navigation pane on the left, click Computing Resources to open the computing resource list page.
Associate a Serverless StarRocks computing resource
On the computing resource list page, associate a Serverless StarRocks computing resource.
Select a computing resource type to associate.
Click Associate Computing Resources to go to the Associate Computing Resources page.
On the Associate Computing Resources page, select the computing resource type as Serverless StarRocks to go to the Associate EMR Serverless StarRocks Computing Resource configuration page.
Configure the Serverless StarRocks computing resource.
On the Associate EMR Serverless StarRocks Computing Resource configuration page, configure the parameters described in the following table.
Parameter
Description
Configuration Mode
Valid values: ApsaraDB for RDS and User-created Data Store with Public IP Addresses. Configure the corresponding parameters based on the configuration mode that you select.
ApsaraDB for RDS
Instance
Select the EMR Serverless StarRocks instance to associate. You can also click Create in the drop-down list to create an EMR Serverless StarRocks instance.
Note: If you chose to isolate the production and development environments when you created the workspace, you must select separate StarRocks instances for the production and development environments.
Database Name
Select a database in the EMR Serverless StarRocks instance. If no database is available, create one in the EMR Serverless StarRocks instance.
Authentication Method
Enterprise Edition supports multiple options. Non-Enterprise Edition supports only the Account and Password method.
When the authentication method is set to Task Owner, the DataWorks tenant administrator of Enterprise Edition must add identity credentials in Security Center to map RAM accounts to engine accounts. If users need to request and approve StarRocks permissions in DataWorks, the tenant administrator must also configure the permission management settings in Security Center.
Username
The account and password that you specified when you created the EMR Serverless StarRocks instance. The default account is admin. If you forget the password, you can click Reset Password on the instance details page.
Password
User-created Data Store with Public IP Addresses
Host Address/IP Address
The IP address of the StarRocks FE.
Port
The query port of the StarRocks FE. The default value is 9030. Set this parameter based on the configuration on the StarRocks side.
Load URL
The address of the StarRocks FE for Stream Load. Format: FE_IP:FE_HTTP. Separate multiple FE addresses with commas.
Note: FE_IP supports only internal IP addresses, not public IP addresses. The FE_HTTP port is typically 8030 or 18030. Set this parameter based on the configuration on the StarRocks side.
Database Name
Select a database in the EMR Serverless StarRocks instance. If no database is available, create one in the SQL query editor of the EMR Serverless StarRocks instance.
Username
The account and password that you specified when you created the EMR Serverless StarRocks instance. The default account is admin. If you forget the password, you can click Reset Password on the instance details page.
Password
Advanced Parameters
Optional. You can click Add Property to add property parameters. For more information, see the official MySQL documentation.
Computing Resource Instance Name
Used to identify the computing resource. When a task runs, the computing resource is selected by this instance name.
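The Load URL format described in the table (comma-separated FE_IP:FE_HTTP pairs, internal IP addresses only) can be checked before you paste the value into the console. The following is a minimal sketch under several assumptions: the function name `parse_load_url` is hypothetical, the addresses are IPv4, and Python's `is_private` property is used as an approximation of "internal IP address". It may not match every network layout.

```python
import ipaddress
from typing import List, Tuple


def parse_load_url(load_url: str) -> List[Tuple[str, int]]:
    """Split a Load URL into (ip, port) pairs and reject obviously invalid entries.

    Hypothetical helper for illustration; it is not part of DataWorks or StarRocks.
    """
    endpoints = []
    for item in load_url.split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, port_text = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected FE_IP:FE_HTTP, got {item!r}")
        ip = ipaddress.ip_address(host)  # raises ValueError if host is not an IP
        # Assumption: "internal" is approximated by the private address ranges.
        if not ip.is_private:
            raise ValueError(f"{host} is not an internal IP address")
        port = int(port_text)  # raises ValueError if the port is not numeric
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        endpoints.append((host, port))
    if not endpoints:
        raise ValueError("the Load URL contains no FE address")
    return endpoints


# Example value only; the port is typically 8030 or 18030 per the table above.
print(parse_load_url("10.0.0.1:8030, 10.0.0.2:8030"))
```

The sketch validates only the shape of the value. It does not confirm that the FE nodes actually listen on the given HTTP port.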
Test network connectivity.
In the Connection Configuration section, select the resource group that DataWorks uses to run StarRocks node tasks, and click Test Connectivity to ensure that the resource group can access your EMR Serverless StarRocks instance. For more information, see Network connectivity solutions.
Click Confirm to complete the Serverless StarRocks computing resource configuration.
Note: When you associate a Serverless StarRocks computing resource, the system automatically creates a Serverless StarRocks data source with the same name in the Data Sources section of the current workspace.
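If Test Connectivity fails, a plain TCP probe from a host in the same VPC can help separate network problems from credential problems. The following is a minimal sketch with a hypothetical helper named `can_connect`. It checks only that the FE query port (9030 by default) accepts TCP connections from the machine that runs the script. It does not run inside the DataWorks resource group, so the result of Test Connectivity in the console remains the authoritative check.

```python
import socket


def can_connect(host: str, port: int = 9030, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Hypothetical helper for illustration. It performs no StarRocks login and
    therefore says nothing about the username, password, or database.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder address; replace it with the internal address of your FE node.
    print(can_connect("10.0.0.1", 9030))
```

A False result usually points to the allowlist, the VPC routing, or the port value rather than to the account settings.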
Next steps
After you configure the Serverless StarRocks computing resource, you can perform data development in the corresponding nodes in Data Studio.
Use the computing resource in the node to develop batch synchronization tasks. For more information, see Configure a batch sync node for StarRocks.
Use the computing resource in the node to develop EMR Serverless StarRocks tasks. For more information, see Create a Serverless StarRocks SQL node.