This topic describes how to create and configure a StarRocks cluster.

Prerequisites

A virtual private cloud (VPC) and a vSwitch are created in the region where you want to create a StarRocks cluster. For more information, see Create and manage a VPC and Create and manage a vSwitch.

Procedure

  1. Go to the cluster creation page.
    1. Log on to the EMR on ECS console.
    2. Optional: In the top navigation bar, select the region where you want to create a cluster and select a resource group based on your business requirements.
      • The region of a cluster cannot be changed after the cluster is created.
      • All resource groups within your account are displayed by default.
    3. In the upper-left corner of the page, click Create Cluster.
  2. Configure the cluster.
    To create a cluster, you must configure software parameters, hardware parameters, and basic parameters as guided by the wizard.
    Important: After a cluster is created, you cannot modify its parameters except for the cluster name. Make sure that all parameters are correctly configured when you create a cluster.
    1. Configure software parameters.
      Parameter | Example | Description
      Region | China (Hangzhou) | The region of a cluster cannot be changed after the cluster is created.
      Business Scenario | Data Analytics | Select Data Analytics.
      Product Version | EMR-3.42.0 | The major version of EMR. The latest version is selected by default.
      High Service Availability | Off | By default, this switch is turned off. If you turn on this switch, three master nodes are created in the cluster to ensure the availability of the ResourceManager and NameNode processes. You can also modify the number of master nodes.
      Optional Services | StarRocks | The other services that you can select based on your business requirements. By default, the relevant processes for the services you specify are started.
      Advanced Settings | Off | Custom Software Configuration: customizes software settings. You can use a JSON file to customize the parameters of basic components required for a cluster, such as Hadoop, Spark, and Hive. This switch is turned off by default.
    2. Configure hardware parameters.
      Parameter | Example | Description
      Billing Method | Pay-as-you-go | Subscription is selected by default. EMR supports the following billing methods:
      • Pay-as-you-go: a billing method that allows you to pay for a cluster after you use the cluster. The system charges you for a cluster based on the hours the cluster is actually used. Bills are generated on an hourly basis at the top of every hour. We recommend that you use pay-as-you-go clusters for short-term test jobs or dynamically scheduled jobs.
      • Subscription: a billing method that allows you to use a cluster only after you pay for the cluster.
        Note: We recommend that you create a pay-as-you-go cluster for a test run. If the cluster passes the test, you can create a subscription cluster for production.

      Zone | Zone I | The zone where you want to create a cluster. Zones are different geographical areas located in the same region. They are interconnected by an internal network. In most cases, you can use the zone selected by default.
      VPC | starrocks_test/vpc-bp1f4epmkvncimpgs**** | By default, an existing virtual private cloud (VPC) is selected.

      You can also create a VPC in the VPC console. For more information, see Create and manage a VPC.

      vSwitch | vsw_test/vsw-bp1e2f5fhaplp0g6p**** | Select a vSwitch in the specified zone of the VPC. If no vSwitch is available in the zone, you must go to the VPC console and create a vSwitch in the zone. For more information, see Create and manage a vSwitch.
      Default Security Group | sg-bp1ddw7sm2risw****/sg-bp1ddw7sm2risw**** | The security group of the cluster. An existing security group is selected by default. For more information about security groups, see Overview.

      You can also click create a new security group to create a security group in the Elastic Compute Service (ECS) console. For more information, see Create a security group.

      Important: Do not use an advanced security group that is created in the ECS console.
      Node Group | Default values | You can select instance types based on your business requirements. For more information, see Instance families.
      • System Disk: You can select a standard SSD, enhanced SSD, or ultra disk based on your business requirements.
      • Disk Size (system disk): You can adjust the size of the system disk based on your business requirements. Valid values: 60 to 500. Unit: GB. We recommend a size of at least 120 GB.
      • Data Disk: You can select a standard SSD, enhanced SSD, or ultra disk based on your business requirements.
      • Disk Size (data disk): You can adjust the size of the data disk based on your business requirements. Valid values: 40 to 32768. Unit: GB. We recommend a size of at least 80 GB.
      • Assign Public Network IP: You can specify whether an elastic IP address (EIP) is associated with the cluster. This switch is turned off by default.
        Note: If you do not turn on this switch but want to access the cluster over the Internet after the cluster is created, you can apply for a public IP address on ECS. For information about how to apply for an EIP address, see Elastic IP addresses.
      • Instances: By default, one master node and one core node are created.
    3. Configure basic parameters.
      Configure parameters in the Basic Configuration step.
      Parameter | Example | Description
      Cluster Name | Emr-StarRocks | The name of the cluster. The name must be 1 to 64 characters in length and can contain letters, digits, hyphens (-), and underscores (_).
      Identity Credentials | Custom Password | Key Pair: an SSH key pair that is used to log on to a Linux instance. This value is selected by default.

      For information about how to use a key pair, see SSH key pair overview.

      Password: the password that is used to log on to the master node (Linux instance).

      The password must be 8 to 30 characters in length and must contain uppercase letters, lowercase letters, digits, and special characters.

      The following special characters are supported: ! @ # $ % ^ & *

      Advanced Settings | - | Configure the parameters based on your business requirements.
      • ECS Application Role: You can assign an application role to the cluster. EMR then applies for a temporary AccessKey pair whenever applications that run on the compute nodes of the cluster access other Alibaba Cloud services, such as Object Storage Service (OSS), so you do not need to manually specify an AccessKey pair. You can grant the application role access permissions on specific Alibaba Cloud services based on your business requirements.
      • Bootstrap Actions: optional. For more information, see Manage bootstrap actions.
      • Resource Group: optional. For more information, see Use resource groups.
  3. In the Confirm step, read the terms of service and select the check box to confirm that you have read them.
  4. Click Confirm.
    Refresh the page to view the creation progress. When the value of Status changes to Running, the cluster is created.
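
Because most parameters cannot be modified after the cluster is created, it can help to check planned values against the limits listed in the parameter tables before you open the wizard. The following Python sketch is not part of EMR or any Alibaba Cloud SDK, and all function names are hypothetical. It encodes the cluster name, password, and disk size rules from this topic. It assumes that "letters" means ASCII letters and that a password may contain only letters, digits, and the listed special characters. The console may apply different or additional checks.

```python
import re
import string
from typing import List

# Limits copied from the parameter tables in this topic.
# Assumption: "letters" means ASCII letters only.
CLUSTER_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")
PASSWORD_SPECIALS = "!@#$%^&*"
DISK_LIMITS_GB = {
    # disk kind: (minimum, maximum, recommended minimum)
    "system": (60, 500, 120),
    "data": (40, 32768, 80),
}


def is_valid_cluster_name(name: str) -> bool:
    """Return True if the name is 1 to 64 letters, digits, hyphens, or underscores."""
    return CLUSTER_NAME_RE.fullmatch(name) is not None


def password_problems(password: str) -> List[str]:
    """Return the list of rules that the password violates (empty if none)."""
    problems = []
    if not 8 <= len(password) <= 30:
        problems.append("length must be 8 to 30 characters")
    required = [
        (string.ascii_uppercase, "missing an uppercase letter"),
        (string.ascii_lowercase, "missing a lowercase letter"),
        (string.digits, "missing a digit"),
        (PASSWORD_SPECIALS, "missing a supported special character"),
    ]
    for charset, message in required:
        if not any(c in charset for c in password):
            problems.append(message)
    # Assumption: any character outside the four classes is rejected.
    allowed = string.ascii_letters + string.digits + PASSWORD_SPECIALS
    if any(c not in allowed for c in password):
        problems.append("contains an unsupported character")
    return problems


def disk_size_status(kind: str, size_gb: int) -> str:
    """Classify a disk size as 'invalid', 'below recommended', or 'ok'."""
    minimum, maximum, recommended = DISK_LIMITS_GB[kind]
    if not minimum <= size_gb <= maximum:
        return "invalid"
    if size_gb < recommended:
        return "below recommended"
    return "ok"


if __name__ == "__main__":
    print(is_valid_cluster_name("Emr-StarRocks"))
    print(password_problems("Abcdef1!"))
    print(disk_size_status("system", 80))
```

Returning a list of violated rules instead of a single Boolean value makes it easier to fix every problem with a password in one pass.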

FAQ

Q: Are the frontend (FE) and backend (BE) processes of StarRocks deployed on master nodes or core nodes?

A: The FE process of StarRocks is deployed on master nodes. By default, a cluster has one master node. If you turn on High Service Availability when you create the cluster, three master nodes are created by default. Each master node runs one FE process. The High Service Availability feature provides fault tolerance and load balancing capabilities.

By default, one BE process of StarRocks is deployed on each core node. You can adjust the number of core nodes on which BE processes are deployed.
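
The placement rules in this answer can be summarized in a small Python model. This is an illustrative sketch only: `ClusterLayout` and `default_layout` are hypothetical names, not part of any EMR or StarRocks API. The sketch assumes the defaults stated in this topic, which are one master node (three with High Service Availability turned on) and one core node.

```python
from dataclasses import dataclass


@dataclass
class ClusterLayout:
    """Hypothetical model of where StarRocks processes run in an EMR cluster."""

    master_nodes: int
    core_nodes: int

    @property
    def fe_processes(self) -> int:
        # One FE process runs on each master node.
        return self.master_nodes

    @property
    def be_processes(self) -> int:
        # By default, one BE process runs on each core node.
        return self.core_nodes


def default_layout(high_availability: bool = False, core_nodes: int = 1) -> ClusterLayout:
    """Build the default layout described in this topic."""
    # High Service Availability creates three master nodes instead of one.
    master_nodes = 3 if high_availability else 1
    return ClusterLayout(master_nodes=master_nodes, core_nodes=core_nodes)


if __name__ == "__main__":
    layout = default_layout(high_availability=True, core_nodes=4)
    print(layout.fe_processes, layout.be_processes)
```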