
Environment preparations

Last Updated: Mar 14, 2018

Access Table Store tables with Spark or Spark SQL

You can use Spark and Spark SQL to access data in Table Store directly by using the dependency packages released by Table Store and E-MapReduce (EMR).

Install Spark/Spark SQL

  1. Download the Spark installation package that complies with the following requirements:

    • Release version: 1.6.2

    • Package type: Pre-built for Hadoop 2.6

    • Download type: Direct Download

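  2. Unpack the installation package into /home/admin/spark-1.6.2 as follows. The --strip-components=1 option of GNU tar drops the top-level spark-1.6.2-bin-hadoop2.6 folder of the archive, so that bin/ sits directly in the Spark directory used in the later steps.

    $ cd /home/admin/spark-1.6.2
    $ tar -zxvf spark-1.6.2-bin-hadoop2.6.tgz --strip-components=1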

Install JDK-7+

  1. Download and install JDK 7 or later.

  2. Check the installation status as follows.

    $ java -version
    java version "1.8.0_77"
    Java(TM) SE Runtime Environment (build 1.8.0_77-b03)
    Java HotSpot(TM) 64-Bit Server VM (build 25.77-b03, mixed mode)
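The version check above can also be scripted. The following is a minimal sketch, not part of the official procedure: the function names java_major and installed_java_version are hypothetical helpers, and the sketch assumes the version string is the quoted value printed by java -version, as in the sample output above.

```shell
# Print the quoted version string reported by `java -version`
# (the JVM writes this information to stderr, hence 2>&1).
installed_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# Reduce a version string to its major version number.
# Legacy strings start with "1." (1.8.0_77 means Java 8);
# newer ones start with the major version itself (11.0.2 means Java 11).
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;
  esac
  # Keep only the part before the first ".", "_" or "-".
  echo "${v%%[._-]*}"
}
```

For example, java_major "$(installed_java_version)" prints 8 for the output shown above, which meets the JDK 7 or later requirement.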

Download Java SDK for Table Store

  1. Download the Java SDK dependency package (version 4.1.0 or later).

    Note: The SDK dependency package is updated with each Java SDK release. Download the dependency package that matches the latest Java SDK.

  2. Copy the SDK to the Spark directory as follows.

    $ mv tablestore-4.1.0-jar-with-dependencies.jar /home/admin/spark-1.6.2/

Download EMR dependency package

  1. Download the Alibaba Cloud EMR dependency package.

    Note: For more information about EMR, see the E-MapReduce documentation.

  2. Move the “emr-sdk_2.10-1.3.0-20161025.065936-1.jar” file to the Spark directory and rename it as follows.

    $ mv emr-sdk_2.10-1.3.0-20161025.065936-1.jar /home/admin/spark-1.6.2/emr-sdk_2.10-1.3.0-SNAPSHOT.jar
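The downloaded file name carries a build timestamp (20161025.065936-1), which follows the usual Maven convention for timestamped snapshot artifacts. If the file you download has a different timestamp, the target name can be derived instead of typed by hand. This is a sketch only: snapshot_name is a hypothetical helper, and it assumes the file name ends in -yyyyMMdd.HHmmss-N.jar.

```shell
# Map a timestamped snapshot file name to its -SNAPSHOT form, e.g.
# emr-sdk_2.10-1.3.0-20161025.065936-1.jar -> emr-sdk_2.10-1.3.0-SNAPSHOT.jar
# Names that do not match the pattern are printed unchanged.
snapshot_name() {
  printf '%s\n' "$1" | sed -E 's/-[0-9]{8}\.[0-9]{6}-[0-9]+\.jar$/-SNAPSHOT.jar/'
}
```

A possible use is mv "$f" "/home/admin/spark-1.6.2/$(snapshot_name "$f")", where f holds the name of the downloaded file.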

Run Spark SQL

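  Start Spark SQL from the Spark directory and pass the two JAR files from the previous steps in --jars. If you downloaded other versions, use those file names instead.

    $ cd /home/admin/spark-1.6.2/
    $ bin/spark-sql --master local --jars tablestore-4.1.0-jar-with-dependencies.jar,emr-sdk_2.10-1.3.0-SNAPSHOT.jar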
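The --jars option takes a comma-separated list of JAR files, so a wrong or missing file name is an easy mistake to make. As a sketch, the list can be built by a small helper that first checks that each file exists. The jars_arg function is hypothetical and not part of Spark or the EMR package; the file names in the usage comment are the ones placed in the Spark directory in the earlier steps.

```shell
# Join the JAR paths given as arguments into a comma-separated list.
# Print an error and return 1 if any of the files does not exist.
jars_arg() {
  list=""
  for jar in "$@"; do
    if [ ! -f "$jar" ]; then
      echo "missing JAR: $jar" >&2
      return 1
    fi
    # Add a comma only when the list already has an entry.
    list="${list:+$list,}$jar"
  done
  printf '%s\n' "$list"
}

# Usage, run from /home/admin/spark-1.6.2/:
#   jars="$(jars_arg tablestore-4.1.0-jar-with-dependencies.jar \
#                    emr-sdk_2.10-1.3.0-SNAPSHOT.jar)" \
#     && bin/spark-sql --master local --jars "$jars"
```

Chaining with && means Spark SQL is started only when both files were found.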