Background information

This document explains how to establish a seamless connection between Table Store and MaxCompute under an Alibaba Cloud account.

As a big data computing service, MaxCompute provides a fast and fully-hosted PB-level data warehousing solution, which allows you to analyze and process mass data economically and efficiently. You can use a simple DDL statement to create an external table on MaxCompute and associate MaxCompute tables with external data sources to provide various data access and output capabilities. MaxCompute tables can contain only structured data while external tables can contain either structured or non-structured data.

Both Table Store and MaxCompute have their own type systems, and the following table shows their mappings.
Table Store MaxCompute
STRING STRING
INTEGER BIGINT
DOUBLE DOUBLE
BOOLEAN BOOLEAN
BINARY BINARY

Preparations

Preparations before accessing Table Store using MaxCompute:

  1. Activate the MaxCompute service.
  2. Create a MaxCompute project.
  3. Create an AccessKey.
  4. In the RAM console, authorize MaxCompute to access Table Store.
    • Method 1: Log on with an Alibaba Cloud account and click here for quick authorization.

    • Method 2: Perform manual authorization using the following steps.

      1. Log on to the RAM console.
      2. On the Roles page, create a user role AliyunODPSDefaultRole.

      3. On the Role Details page, set the policy content.

        The policy content is set as follows.
        {
        "Statement": [
        {
        "Action": "sts:AssumeRole",
        "Effect": "Allow",
        "Principal": {
         "Service": [
           "odps.aliyuncs.com"
         ]
        }
        }
        ],
        "Version": "1"
        }
      4. On the Policies page, create an authorization policy AliyunODPSRolePolicy.

        The policy content is set as follows.
        {
        "Version": "1",
        "Statement": [
         {
           "Action": [
             "ots:ListTable",
             "ots:DescribeTable",
             "ots:GetRow",
             "ots:PutRow",
             "ots:UpdateRow",
             "ots:DeleteRow",
             "ots:GetRange",
             "ots:BatchGetRow",
             "ots:BatchWriteRow",
             "ots:ComputeSplitPointsBySize"
           ],
           "Resource": "*",
           "Effect": "Allow"
         }
        ]
        }
        You can also customize other permissions.
      5. On the Roles page, grant the AliyunODPSRolePolicy permission to the role AliyunODPSDefaultRole.

  5. Create an instance and create a table in the Table Store console.

    In this example, the Table Store instance and data table are detailed as follows:

    • Instance name: cap1
    • Data table name: vehicle_track
    • Primary key information: vid (integer), gt (integer)
    • Endpoint: https://cap1.cn-hangzhou.ots-internal.aliyuncs.com
      Note We recommend that you use the private network address of Table Store when accessing Table Store using MaxCompute.
    • Set the network type of the instance to Any Network.

Step 1. Install and configure the client

  1. Download and extract the MaxCompute client.
    Note Make sure JRE 1.7 or a later version is installed on your computer.
  2. Edit the conf/odps_config.ini file to configure the client.
    access_id=*******************
    access_key=*********************
     # The AccessID and AccessKey of your Alibaba Cloud account can be viewed on the "Management Console" > "AccessKeys" page on the Alibaba Cloud official website.
    project_name=my_project
     # Specify the project space you want to access.
    end_point=https://service.odps.aliyun.com/api
     # The access link of the MaxCompute service.
    tunnel_endpoint=https://dt.odps.aliyun.com
     # The access link of the MaxCompute Tunnel service.
    log_view_host=http://logview.odps.aliyun.com
     # After you run a job, the client returns the LogView address of the job. Access the address to see the details of the executed job.
    https_check=true
     # Determine whether to enable HTTPS access.
    Note In odps_config.ini file, # is used as annotation. In the MaxCompute client, -- is used as annotation.
  3. Run bin/odpscmd.bat, and enter show tables;.

    The display of tables in the current MaxCompute project indicates that the preceding configuration is correct.



Step 2. Create an external table

Create a MaxCompute data table (ots_vehicle_track) and associate it with a table (vehicle_track) in Table Store.

In this example, the associated table is detailed as follows:

  • Instance name: cap1
  • Table name: vehicle_track
  • Primary Key information: vid (int); gt (int)
  • Endpoint: https://cap1.cn-hangzhou.ots-internal.aliyuncs.com
CREATE EXTERNAL TABLE IF NOT EXISTS ots_vehicle_track
(
vid bigint,
gt bigint,
longitude double,
latitude double,
distance double ,
speed double,
oil_consumption double
)
STORED BY 'com.aliyun.odps.TableStoreStorageHandler' -- (1)
WITH SERDEPROPERTIES ( -- (2)
'tablestore.columns.mapping'=':vid, :gt, longitude, latitude, distance, speed, oil_consumption', -- (3)
'tablestore.table.name'='vehicle_track' -- (4)
)
LOCATION 'tablestore://cap1.cn-hangzhou.ots-internal.aliyuncs.com'; -- (5)

Parameters are described as follows.

No. Parameter Description
(1) com.aliyun.odps.TableStoreStorageHandler StorageHandler built in MaxCompute for processing Table Store data. It defines the interactions between MaxCompute and Table Store, and related logics are implemented by MaxCompute.
(2) SERDEPROPERITES The interface that provides parameter options. When TableStoreStorageHandler is used, two options, tablestore.columns.mapping and tablestore.table.name, must be specified.
(3) tablestore.columns.mapping Required option. Table Store columns to be accessed by MaxCompute include primary key and attribute columns. In Table Store, columns with :, such as :vid and :gt in this example, are primary key columns, and others are attribute columns.

When specifying a mapping, you must provide all primary key columns of the specified Table Store table, and the attribute columns to be accessed by MaxCompute.

(4) tablestore.table.name The name of Table Store tables to be accessed. If the specified Table Store table name is incorrect (it does not exist), an error is reported. MaxCompute does not create any Table Store table proactively.
(5) LOCATION The instance information of Table Store to be accessed, including the instance name and endpoint.

Step 3. Access Table Store data through external tables

When an external table is created, Table Store data is introduced into the MaxCompute ecosystem and can be accessed using MaxCompute SQL commands.

// Provides statistics on the average speed and oil consumption of vehicles with a VID less than 4 before the time stamp 1469171387)
select vid,count(*),avg(speed),avg(oil_consumption) from ots_vehicle_track where vid <4 and gt<1469171387  group by vid;

A result similar to the following is returned.