To bulk-load data into a Lindorm instance using Apache Spark, add a Spark data source that connects your Lindorm instance to a Lindorm Tunnel Service (LTS) instance. You can do this from either the Lindorm console or the LTS web UI.
Prerequisites
Before you begin, make sure you have:
A Lindorm Tunnel Service (LTS) instance
A Lindorm instance with Lindorm Distributed Processing System (LDPS) activated. See Create an instance.
The Lindorm instance and the LTS instance deployed in the same virtual private cloud (VPC). If they are in different VPCs, see Connect VPCs.
(LTS web UI method only) The HDFS namenode hostnames for your Lindorm instance. To get these values, submit a ticket.
Add a Spark data source from the Lindorm console
Log on to the Lindorm console.
On the Instances page, click the ID of the instance whose engine type is LTS.
In the left-side navigation pane, click Data Sources.
Click the Compute Engine Data Source tab, then click Add Data Source.
In the Add Data Source dialog box, configure the following parameters.
| Parameter | Description |
| --- | --- |
| Instance type | Select Lindorm. |
| Region | Select the region where your Lindorm instance is deployed. |
| Instance ID | Select the ID of your Lindorm instance. LDPS must be activated for the selected instance. See Activate LDPS and modify the configurations. |

Click OK.
When the Spark data source status shows Associated, the data source is added successfully.
Add a Spark data source from the LTS web UI
Log on to the LTS web UI. See Activate and log on to LTS.
In the left-side navigation pane, choose Data Source Manage > Add Data Source.
On the Add data source page, configure the following parameters.
| Parameter | Description |
| --- | --- |
| Name | Enter lts_bulkload_spark. |
| Data Source Type | Select Spark. |
| Parameters | Enter the JSON configuration below. Replace each placeholder with your actual value. |

{
  "virtualClusterName": "<ldps-access-token>",
  "hdfsUri": "hdfs://<namenode1-hostname>:8020,<namenode2-hostname>:8020",
  "sparkEndpoint": "http://192.168.XX.XX:10099"
}

| Parameter | Description | How to get the value |
| --- | --- | --- |
| virtualClusterName | The access token for the LDPS JAR address. | On the instance details page, go to Database Connections in the left-side navigation pane, then click the Compute Engine tab. |
| hdfsUri | The HDFS endpoint of the Lindorm instance. Format: hdfs://<namenode1-hostname>:8020,<namenode2-hostname>:8020 | Submit a ticket to get the namenode hostnames (<namenode1-hostname> and <namenode2-hostname>). |
| sparkEndpoint | The JAR VPC address of LDPS. Format: http://<ip-address>:10099 | On the instance details page, go to Database Connections in the left-side navigation pane, then click the Compute Engine tab. |

Click Add.
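If you prefer to generate the Parameters JSON rather than type it by hand, the following is a minimal sketch, assuming Python 3. The function name build_spark_source_config and all example values (token, hostnames, IP address) are hypothetical and are not part of LTS or Lindorm. The sketch only assembles the three fields in the formats described above. It does not contact any service or verify that the values are correct for your instance.

```python
import json


def build_spark_source_config(access_token, namenodes, spark_ip,
                              hdfs_port=8020, spark_port=10099):
    """Return JSON text for the Parameters field (hypothetical helper).

    access_token: LDPS access token shown on the Compute Engine tab.
    namenodes:    the two HDFS namenode hostnames obtained through a ticket.
    spark_ip:     IP address from the LDPS JAR VPC address.
    """
    if not access_token:
        raise ValueError("access_token must not be empty")
    if len(namenodes) != 2 or not all(namenodes):
        raise ValueError("expected exactly two non-empty namenode hostnames")
    if not spark_ip:
        raise ValueError("spark_ip must not be empty")

    # Format: hdfs://<namenode1-hostname>:8020,<namenode2-hostname>:8020
    hdfs_uri = "hdfs://" + ",".join(f"{host}:{hdfs_port}" for host in namenodes)

    # Format: http://<ip-address>:10099
    spark_endpoint = f"http://{spark_ip}:{spark_port}"

    config = {
        "virtualClusterName": access_token,
        "hdfsUri": hdfs_uri,
        "sparkEndpoint": spark_endpoint,
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute your own.
    print(build_spark_source_config(
        "example-access-token",
        ["namenode1.example.internal", "namenode2.example.internal"],
        "192.0.2.10",
    ))
```

Paste the printed JSON into the Parameters field. Building the value programmatically mainly helps avoid quoting mistakes and a missing port on the second namenode.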