You can add a Spark data source for a Lindorm instance to quickly import data to the instance in batches. This topic describes how to add a Spark data source.

Prerequisites

  • A Lindorm Tunnel Service (LTS) instance is purchased.
  • A Lindorm instance with Lindorm Distributed Processing System (LDPS) activated is created. For more information, see Create an instance.

Procedure

Use the Lindorm console to add a Spark data source

  1. Log on to the Lindorm console.
  2. On the Instances page, click the ID of an instance whose engine type is LTS.
  3. In the left-side navigation pane, click Data Sources.
  4. Click the Compute Engine Data Source tab, and then click Add Data Source.
  5. In the Add Data Source dialog box, configure the parameters described in the following table.
    Parameter       Description
    Instance type   Select Lindorm.
    Region          Select the region in which your Lindorm instance is deployed.
    Instance ID     Select the ID of your Lindorm instance.
    Note
    • Make sure that LDPS is activated for the Lindorm instance. For more information, see Activate LDPS and modify the configurations.
    • Make sure that the Lindorm instance and the LTS instance are in the same virtual private cloud (VPC). For more information about how to associate instances across different VPCs, see Connect VPCs.
  6. Click OK. If the state of the Spark data source is Associated, the data source is added.

Use LTS to add a Spark data source

  1. Log on to the web UI of the LTS instance. For more information, see Activate and log on to LTS.
  2. In the left-side navigation pane of the LTS console, choose Data Source Manage > Add Data Source.
  3. On the Add data source page, configure the parameters described in the following table.
    Parameter         Description
    Name              Enter lts_bulkload_spark.
    Data Source Type  Select Spark.
    Parameters        Configure parameters for the Spark data source.
    {
        "virtualClusterName":"token",
        "hdfsUri":"hdfs://nn1:8020,nn2:8020",
        "sparkEndpoint":"http://192.168.XX.XX:10099"
    }
    • virtualClusterName: The token of the JAR address of LDPS. To obtain the token, go to the instance details page of the Lindorm instance, click Database Connections in the left-side navigation pane, and then click the Compute Engine tab.
      [Figure: Obtain token value]
    • hdfsUri: The HDFS endpoint of the Lindorm instance in the following format: hdfs://nn1:8020,nn2:8020.
      Note To obtain the values of nn1 and nn2 in the endpoint, submit a ticket.
    • sparkEndpoint: The JAR VPC address of LDPS. To obtain the address, go to the instance details page of the Lindorm instance, click Database Connections in the left-side navigation pane, and then click the Compute Engine tab.
      [Figure: Obtain VPC endpoint]
  4. Click Add.
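
The Parameters value described in step 3 can also be assembled and checked before you paste it into the Add data source page. The following Python snippet is a minimal sketch, not part of LTS or Lindorm: the function name build_spark_source_params is hypothetical, the format checks are assumptions based only on the sample values in this topic, and the token, NameNode addresses, and endpoint are placeholders that you must replace with the values of your own instance.

```python
import json
import re


def build_spark_source_params(token, name_nodes, spark_endpoint):
    """Build the Parameters JSON for a Spark data source (hypothetical helper)."""
    if not token:
        raise ValueError("virtualClusterName must not be empty")
    if not name_nodes:
        raise ValueError("at least one NameNode address is required")
    for node in name_nodes:
        # Each entry is expected to look like host:port, for example nn1:8020.
        if not re.fullmatch(r"[^\s,:/]+:\d+", node):
            raise ValueError(f"invalid NameNode address: {node!r}")
    # Assumption: the endpoint has the form http://host:port, as in the sample.
    if not re.fullmatch(r"http://[^\s/:]+:\d+", spark_endpoint):
        raise ValueError(f"invalid sparkEndpoint: {spark_endpoint!r}")
    params = {
        "virtualClusterName": token,
        # Join the NameNode addresses into hdfs://nn1:8020,nn2:8020.
        "hdfsUri": "hdfs://" + ",".join(name_nodes),
        "sparkEndpoint": spark_endpoint,
    }
    return json.dumps(params, indent=4)


if __name__ == "__main__":
    # Placeholder values only; replace them with the values of your instance.
    print(build_spark_source_params(
        "token", ["nn1:8020", "nn2:8020"], "http://192.168.0.1:10099"))
```

The printed JSON has the same three keys as the sample in step 3. Building it from separate values keeps the hdfsUri format consistent and catches a missing port or an empty token before the data source is added.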