Storage-compute integrated instances are suitable for scenarios that require high query performance, such as Online Analytical Processing (OLAP) multidimensional analysis, high-concurrency queries, and real-time data analytics. This instance type stores data on cloud disks or local disks to ensure high data read and write efficiency. This topic describes how to create and use an EMR Serverless StarRocks storage-compute integrated instance with an Alibaba Cloud account.
Prerequisites
Register an Alibaba Cloud account and complete identity verification.
If you are a Resource Access Management (RAM) user, make sure that the AliyunEMRStarRocksFullAccess system policy is attached to the RAM user.
Note: The AliyunEMRStarRocksFullAccess system policy is required to create and manage StarRocks instances.
Precautions
You are responsible for managing and configuring the runtime environment for your code.
Procedure
Step 1: Create a storage-compute integrated StarRocks instance
-
Go to the E-MapReduce Serverless StarRocks instance list page.
-
Log on to the E-MapReduce console.
-
In the navigation pane on the left, choose .
-
In the top menu bar, select the required region.
-
-
On the Instances page, click Create Instance.
-
On the E-MapReduce Serverless StarRocks page, configure the instance parameters.
Configuration
Example
Description
Product Type
Pay-as-you-go
Select Pay-as-you-go. For more information about billing, see pay-as-you-go.
Region
China (Beijing)
The physical location of the instance.
Important: You cannot change the region after the instance is created. Select the region with caution.
Network and zone
vpc_Hangzhou/vpc-bp1f4epmkvncimpgs****
Zone I
vsw_i/vsw-bp1e2f5fhaplp0g6p****
Select a virtual private cloud (VPC), a zone, and the corresponding vSwitch.
-
VPC: An isolated network environment that you define in Alibaba Cloud.
Select an existing VPC, or click Create a VPC to go to the VPC console and create one. For more information, see Create and manage a VPC.
Note
-
When you create a VPC, you must select an IPv4 CIDR block from one of the following three private network ranges defined in RFC 1918:
-
10.0.0.0/8 (10.0.0.0 to 10.255.255.255)
-
172.16.0.0/12 (172.16.0.0 to 172.31.255.255)
-
192.168.0.0/16 (192.168.0.0 to 192.168.255.255)
-
-
If your Serverless StarRocks instance needs to access the Internet (for example, to import data or query foreign tables), make sure its VPC has Internet access. You can deploy an Internet NAT gateway in the VPC and enable the SNAT feature. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet.
-
-
Zone: The zone where the instance is located.
-
vSwitch: A vSwitch is a basic network module of a VPC that connects different cloud resources.
Select an existing vSwitch, or click Create vSwitch to go to the VPC console and create one. For more information, see Create and manage a vSwitch.
Instance Type
Storage-compute integrated
Suitable for scenarios that require high query performance, such as OLAP multidimensional analysis, high-concurrency queries, and real-time data analytics. This instance type stores data on cloud disks or local disks to ensure high data read and write efficiency.
Instance Edition
Standard Edition
Supports Basic Edition and Standard Edition. For more information, see Instance edition details.
Note: Starter Edition is available only in the China (Beijing), China (Shanghai), China (Shenzhen), and China (Hangzhou) regions.
Kernel Version
3.3
The community version number of StarRocks.
FE Specifications
-
Specification type: Standard Specifications
Compute CU: 8 CU
Data disk: PL1 ESSD,
High availability: Enabled by default.
Number of nodes: 3
Load balancing: Built-in PrivateZone
-
Specification type: The specification type of FE nodes varies with the Instance Edition of StarRocks.
-
Basic Edition: Supports Standard Specifications.
-
Standard Edition: Supports Standard Specifications and Memory-optimized Specifications.
-
-
Compute CUs: Select the number of compute units (CUs).
Select the appropriate CU specification as needed. For more information about CU fees, see Billable items.
-
Data Disk: Only PL1 ESSD is supported. The data disk size ranges from 100 GB to 65000 GB, in increments of 100 GB.
For more information about cloud disks, see Enterprise SSDs.
-
HA: Enabled by default. Standard Edition supports high availability. When high availability is enabled, the number of StarRocks FE nodes increases from 1 to 3 to reduce the risk of failures.
Important: We strongly recommend that you enable high availability for production environments.
-
Number of Nodes: The number of FE nodes. The value can be an odd number from 1 to 11.
-
Load balancing: Supports the following methods.
-
Built-in PrivateZone: Balances traffic through PrivateZone domain name resolution at no extra cost. Suitable for lightweight or cost-sensitive scenarios.
Recommended for non-production environments or services with low load balancing requirements.
-
Load balancing SLB: Provides high-performance load balancing through the SLB service. Recommended for production environments with high performance and reliability requirements.
The feature to remove the FE leader from query traffic is available only after SLB is activated.
You must activate the SLB service, which incurs extra fees. For more information, see CLB billing overview.
-
BE Specifications
-
Specification type: Standard Specifications
Compute CU: 8 CU
Data disk: PL1 ESSD, 100 GB, 1
Number of nodes: 3
-
Specification type: The specification type of BE nodes varies with the Instance Edition of StarRocks.
-
Basic Edition: Supports Standard Specifications.
-
Standard Edition: Supports the following specifications.
Standard Specifications: The default and recommended option. 1 CU = 1 CPU core + 4 GiB of memory. This specification uses ESSD for StarRocks storage.
Memory-optimized Specifications: 1 RCU = 1 CPU core + 8 GiB of memory. This specification is suitable for memory-intensive scenarios, such as running a large number of complex queries or handling high-concurrency requests. It uses ESSD for StarRocks storage.
Network-enhanced Specifications: 1 NCU = 1 CPU core + 4 GiB of memory, offering more than double the network bandwidth of the standard specifications. This specification is suitable for scenarios that involve scanning large amounts of data from external tables. It uses ESSD for StarRocks storage.
High-performance Storage: You need to select detailed specifications for this type. It uses a local SSD for StarRocks storage and is ideal for scenarios with stringent I/O performance requirements.
High-specification Storage: You need to select detailed specifications for this type. It uses a local HDD for StarRocks storage. This type is ideal for storing very large data volumes at a lower cost, but it has lower I/O performance.
-
-
Compute CUs: Select the number of compute units (CUs).
Select the appropriate CU specification as needed. For more information about CU fees, see Billable items.
-
Data Disk: Supports PL0 ESSD, PL1 ESSD (recommended), PL2 ESSD, and PL3 ESSD. For more information, see Enterprise SSDs.
The data disk size ranges from 100 GB to 65000 GB. The number of data disks is 1 by default. The value can range from 1 to 8, in increments of 1.
Note: You can enter the required storage capacity, and the system automatically recommends a configuration. If the capacity you select exceeds the recommended threshold, a prompt is displayed to help you adjust for optimal performance.
-
Number of Nodes: The number of BE nodes. The value can range from 3 to 50.
Instance Name
Enter a custom instance name.
The instance name must be 1 to 64 characters in length and can contain Chinese characters, letters, digits, hyphens (-), and underscores (_).
Administrator
admin
The administrator username used to manage StarRocks. The default value is admin and cannot be changed.
Password and Confirm Password
Enter a custom password.
The password for the built-in administrator, admin. You must record the password. The password is required when you manage and use the instance. If you forget the password, you can reset it. For more information, see How to reset an instance password?
For more information about instance parameters, see Create an instance.
-
Read and select the terms of service, click Create Instance, and complete the payment as prompted.
After the payment is complete, return to the instance management page to view the newly created instance. The instance is created when its Status changes to Running.
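The limits listed in the parameter table can be checked against a planned configuration before you fill in the console form. The following is a minimal Python sketch, not part of the product: the function name check_plan and its parameters are hypothetical, and the Chinese-character check is approximated by the U+4E00 to U+9FFF range, which is an assumption. The numeric limits are taken from the table above and may differ by edition or region.

```python
import ipaddress
import re

# RFC 1918 private ranges listed in the VPC note.
RFC1918_BLOCKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Letters, digits, hyphens, underscores, and Chinese characters
# (approximated by U+4E00-U+9FFF; this range is an assumption).
NAME_PATTERN = re.compile(r"[A-Za-z0-9_\-\u4e00-\u9fff]{1,64}")


def check_plan(vpc_cidr, fe_nodes, fe_disk_gb, be_nodes, be_disk_gb,
               be_disk_count, name):
    """Return a list of problems; an empty list means the plan fits the documented limits."""
    problems = []

    # The VPC CIDR block must sit inside one of the RFC 1918 ranges.
    try:
        network = ipaddress.ip_network(vpc_cidr)
        in_private_range = network.version == 4 and any(
            network.subnet_of(block) for block in RFC1918_BLOCKS
        )
    except ValueError:
        in_private_range = False
    if not in_private_range:
        problems.append("VPC CIDR block is not inside an RFC 1918 private range")

    # FE nodes: an odd number from 1 to 11.
    if not (1 <= fe_nodes <= 11 and fe_nodes % 2 == 1):
        problems.append("FE node count must be an odd number from 1 to 11")

    # FE data disk: 100 GB to 65000 GB in steps of 100.
    if not (100 <= fe_disk_gb <= 65000 and fe_disk_gb % 100 == 0):
        problems.append("FE data disk must be 100 to 65000 GB in steps of 100")

    # BE nodes: 3 to 50.
    if not 3 <= be_nodes <= 50:
        problems.append("BE node count must be from 3 to 50")

    # BE disk: 100 GB to 65000 GB, 1 to 8 disks.
    if not 100 <= be_disk_gb <= 65000:
        problems.append("BE disk size must be from 100 to 65000 GB")
    if not 1 <= be_disk_count <= 8:
        problems.append("BE disk count must be from 1 to 8")

    # Instance name: 1 to 64 allowed characters.
    if not NAME_PATTERN.fullmatch(name):
        problems.append("Instance name contains unsupported characters or has an invalid length")

    return problems


if __name__ == "__main__":
    # Example plan that mirrors the example values in the parameter table.
    for problem in check_plan("192.168.0.0/16", 3, 100, 3, 100, 1, "StarRocks_Serverless"):
        print(problem)
```

The console remains the authority on what is accepted. A check like this only catches obvious mistakes early, for example an even FE node count or a VPC CIDR block outside the private ranges.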
Step 2: Connect to the StarRocks instance
-
On the Instances page, find the target instance and click Connect in the Actions column.
You can also connect to a StarRocks instance in other ways.
-
Connect to the StarRocks instance.
-
On the New Connection tab, configure the following parameters.
After completing the configuration, click Test Network Connectivity to validate the connection, and then click OK to create it.
Parameter
Example
Description
Region
China (Hangzhou)
Select the physical location of the created StarRocks instance.
Instance
StarRocks_Serverless
Select the name of the created StarRocks instance.
Name
Connection_Serverless
Enter a custom connection name.
The name must be 1 to 64 characters in length and can contain Chinese characters, letters, digits, hyphens (-), and underscores (_).
Username
Enter a value based on your actual needs.
The default initial username is admin. You can use this username to connect or create other users as needed. For more information about how to create users, see Manage Users and Data Authorization.
Password
Enter a value based on your requirements.
The password that corresponds to the username created in the StarRocks instance.
-
Click Test Connectivity.
-
After the connection test is successful, click OK.
You are redirected to the SQL Editor page, where you can run SQL queries. For more information, see Connect to a StarRocks instance by using EMR StarRocks Manager.
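Because StarRocks speaks the MySQL protocol, one of the other ways to connect is a standard MySQL client. The following Python sketch only assembles the command line for the mysql client. It rests on assumptions: the helper name mysql_cli_command is hypothetical, the host fe.example.internal is a placeholder, and port 9030 is the default FE query port of open source StarRocks. Check the instance details page for the actual endpoint and port of your instance.

```python
import shlex


def mysql_cli_command(host, user="admin", port=9030):
    """Build the argument list for the mysql client.

    The bare -p flag makes the client prompt for the password, so the
    password never appears in the command line or the shell history.
    """
    return ["mysql", "-h", host, "-P", str(port), "-u", user, "-p"]


if __name__ == "__main__":
    # Placeholder endpoint (assumption): replace it with the FE endpoint of your instance.
    argv = mysql_cli_command("fe.example.internal")
    print(shlex.join(argv))
```

The client must run from a machine that can reach the instance over the network, for example an ECS instance in the same VPC.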
-
Step 3: Run SQL queries
-
On the Queries page of the SQL Editor, click the icon to create a new file.
-
In the New dialog box, click Confirm.
-
In the new file, enter the following commands. Select all commands and click Run.
/**Create a database**/
CREATE DATABASE IF NOT EXISTS load_test;

/**Use the database**/
USE load_test;

/**Create a table**/
CREATE TABLE insert_wiki_edit
(
    event_time DATETIME,
    channel VARCHAR(32) DEFAULT '',
    user VARCHAR(128) DEFAULT '',
    is_anonymous TINYINT DEFAULT '0',
    is_minor TINYINT DEFAULT '0',
    is_new TINYINT DEFAULT '0',
    is_robot TINYINT DEFAULT '0',
    is_unpatrolled TINYINT DEFAULT '0',
    delta INT SUM DEFAULT '0',
    added INT SUM DEFAULT '0',
    deleted INT SUM DEFAULT '0'
)
AGGREGATE KEY(event_time, channel, user, is_anonymous, is_minor, is_new, is_robot, is_unpatrolled)
PARTITION BY RANGE(event_time)
(
    PARTITION p06 VALUES LESS THAN ('2015-09-12 06:00:00'),
    PARTITION p12 VALUES LESS THAN ('2015-09-12 12:00:00'),
    PARTITION p18 VALUES LESS THAN ('2015-09-12 18:00:00'),
    PARTITION p24 VALUES LESS THAN ('2015-09-13 00:00:00')
)
DISTRIBUTED BY HASH(user) BUCKETS 10
PROPERTIES("replication_num" = "1");

/**Insert data**/
INSERT INTO insert_wiki_edit VALUES
    ("2015-09-12 00:00:00","#en.wikipedia","GELongstreet",0,0,0,0,0,36,36,0),
    ("2015-09-12 00:00:00","#ca.wikipedia","PereBot",0,1,0,1,0,17,17,0);

/**Query data**/
SELECT * FROM insert_wiki_edit;
The query returns the two rows that were inserted into the insert_wiki_edit table, which confirms that the data was written successfully.
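The insert_wiki_edit table is declared with AGGREGATE KEY, so rows that share the same values in all key columns are merged, and the columns marked SUM are added together. The following Python sketch imitates that merge in memory to show what to expect if you load more rows. It is an illustration only, not how StarRocks implements aggregation. The helper names make_row and aggregate are hypothetical, and the third row is invented for the example.

```python
KEY_COLUMNS = ("event_time", "channel", "user", "is_anonymous",
               "is_minor", "is_new", "is_robot", "is_unpatrolled")
SUM_COLUMNS = ("delta", "added", "deleted")


def make_row(channel, user, is_minor, is_robot, delta, added, deleted):
    """Build one row as a dict; the remaining key columns use the sample values."""
    return {
        "event_time": "2015-09-12 00:00:00", "channel": channel, "user": user,
        "is_anonymous": 0, "is_minor": is_minor, "is_new": 0,
        "is_robot": is_robot, "is_unpatrolled": 0,
        "delta": delta, "added": added, "deleted": deleted,
    }


def aggregate(rows):
    """Merge rows with identical key columns and sum their SUM columns."""
    merged = {}
    for row in rows:
        key = tuple(row[column] for column in KEY_COLUMNS)
        totals = merged.setdefault(key, dict.fromkeys(SUM_COLUMNS, 0))
        for column in SUM_COLUMNS:
            totals[column] += row[column]
    return [dict(zip(KEY_COLUMNS, key), **totals) for key, totals in merged.items()]


rows = [
    make_row("#en.wikipedia", "GELongstreet", 0, 0, 36, 36, 0),  # from the INSERT statement
    make_row("#ca.wikipedia", "PereBot", 1, 1, 17, 17, 0),       # from the INSERT statement
    make_row("#en.wikipedia", "GELongstreet", 0, 0, 4, 4, 0),    # invented row with the same key as the first
]
result = aggregate(rows)

if __name__ == "__main__":
    for merged_row in result:
        print(merged_row["user"], merged_row["delta"], merged_row["added"], merged_row["deleted"])
```

In this sketch the two GELongstreet rows collapse into one row with delta 40 and added 40, while the PereBot row is unchanged. The two rows in the original INSERT statement have different keys, which is why the query in this step returns both of them.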
Step 4: Run a performance test
For more information, see Test Instructions.
(Optional) Step 5: Release the instance
This operation deletes the instance and all its resources. This action cannot be undone. Proceed with caution.
When you no longer need an instance, release it to avoid incurring further charges.
-
On the Instances page, find the target instance and click Release in the Actions column.
-
In the dialog box that appears, click OK.
Related documentation
-
For more information about the SQL Editor, see SQL Editor.
-
To view SQL query information for the current instance, analyze SQL execution plans, and diagnose and troubleshoot SQL issues, see Diagnostics and Analysis.
-
To view and analyze all operations that occur in the database, see Audit Log.
Contact us
For questions, join our support DingTalk group by searching for the ID 24010016636.