edit-icon download-icon

Allow MaxCompute to access Table Store using one account

Last Updated: Jul 12, 2018

Background information

This document explains how to establish a seamless connection between Table Store and MaxCompute under an Alibaba Cloud account.

As a big data computing service, MaxCompute provides a fast and fully-hosted PB-level data warehousing solution, which allows you to analyze and process mass data economically and efficiently. You can use a simple DDL statement to create an external table on MaxCompute and associate MaxCompute tables with external data sources to provide various data access and output capabilities. MaxCompute tables can contain only structured data while external tables can contain either structured or non-structured data.

Both Table Store and MaxCompute have their own type systems, and the following table shows their mappings.

Table Store MaxCompute
STRING STRING
INTEGER BIGINT
DOUBLE DOUBLE
BOOLEAN BOOLEAN
BINARY BINARY

Preparations

Preparations before accessing Table Store using MaxCompute:

  1. Activate the MaxCompute service.

  2. Create a MaxCompute project.

  3. Create an AccessKey.

  4. In the RAM console, authorize MaxCompute to access Table Store.

    1. Log on to the RAM console.

    2. On the Roles page, create a user role AliyunODPSDefaultRole.

      Create a user role

    3. On the Role Details page, set the policy content.

      Set role policy content

      The policy content is set as follows.

      1. {
      2. "Statement": [
      3. {
      4. "Action": "sts:AssumeRole",
      5. "Effect": "Allow",
      6. "Principal": {
      7. "Service": [
      8. "odps.aliyuncs.com"
      9. ]
      10. }
      11. }
      12. ],
      13. "Version": "1"
      14. }
    4. On the Policies page, create an authorization policy AliyunODPSRolePolicy.

      Create authorization policy

      The policy content is set as follows.

      1. {
      2. "Version": "1",
      3. "Statement": [
      4. {
      5. "Action": [
      6. "ots:ListTable",
      7. "ots:DescribeTable",
      8. "ots:GetRow",
      9. "ots:PutRow",
      10. "ots:UpdateRow",
      11. "ots:DeleteRow",
      12. "ots:GetRange",
      13. "ots:BatchGetRow",
      14. "ots:BatchWriteRow",
      15. "ots:ComputeSplitPointsBySize"
      16. ],
      17. "Resource": "*",
      18. "Effect": "Allow"
      19. }
      20. ]
      21. }
      22. --You can also customize other permissions.
    5. On the Roles page, grant the AliyunODPSRolePolicy permission to the role AliyunODPSDefaultRole.

      Authorize

  5. Create an instance and create a table in the Table Store console.

    In this example, the Table Store instance and data table are detailed as follows:

    • Instance name: cap1
    • Data table name: vehicle_track
    • Primary key information: vid (integer), gt (integer)
    • Endpoint: https://cap1.cn-hangzhou.ots-internal.aliyuncs.com

      Note: We recommend that you use the private network address of Table Store when accessing Table Store using MaxCompute.

    • Set the network type of the instance to Any Network.

      Network type

Step 1. Install and configure the client

  1. Download and extract the MaxCompute client.

    Note: Make sure JRE 1.7 or a later version is installed on your computer.

  2. Edit the conf/odps_config.ini file to configure the client.

    1. access_id=*******************
    2. access_key=*********************
    3. # The AccessID and AccessKey of your Alibaba Cloud account can be viewed on the "Management Console" > "AccessKeys" page on the Alibaba Cloud official website.
    4. project_name=my_project
    5. # Specify the project space you want to access.
    6. end_point=https://service.odps.aliyun.com/api
    7. # The access link of the MaxCompute service.
    8. tunnel_endpoint=https://dt.odps.aliyun.com
    9. # The access link of the MaxCompute Tunnel service.
    10. log_view_host=http://logview.odps.aliyun.com
    11. # After you run a job, the client returns the LogView address of the job. Access the address to see the details of the executed job.
    12. https_check=true
    13. # Determine whether to enable HTTPS access.

    Note: In odps_config.ini file, # is used as annotation. In the MaxCompute client, -- is used as annotation.

  3. Run bin/odpscmd.bat, and enter show tables;.

    The display of tables in the current MaxCompute project indicates that the preceding configuration is correct.

    Run odpscmd

Step 2. Create an external table

Create a MaxCompute data table (ots_vehicle_track) and associate it with a table (vehicle_track) in Table Store.

In this example, the associated table is detailed as follows:

  • Instance name: cap1
  • Table name: vehicle_track
  • Primary Key information: vid (int); gt (int)
  • Endpoint: https://cap1.cn-hangzhou.ots-internal.aliyuncs.com
  1. CREATE EXTERNAL TABLE IF NOT EXISTS ots_vehicle_track
  2. (
  3. vid bigint,
  4. gt bigint,
  5. longitude double,
  6. latitude double,
  7. distance double ,
  8. speed double,
  9. oil_consumption double
  10. )
  11. STORED BY 'com.aliyun.odps.TableStoreStorageHandler' -- (1)
  12. WITH SERDEPROPERTIES ( -- (2)
  13. 'tablestore.columns.mapping'=':vid, :gt, longitude, latitude, distance, speed, oil_consumption', -- (3)
  14. 'tablestore.table.name'='vehicle_track' -- (4)
  15. )
  16. LOCATION 'tablestore://cap1.cn-hangzhou.ots-internal.aliyuncs.com'; -- (5)

Parameters are described as follows.

No. Parameter Description
(1) com.aliyun.odps.TableStoreStorageHandler StorageHandler built in MaxCompute for processing Table Store data. It defines the interactions between MaxCompute and Table Store, and related logics are implemented by MaxCompute.
(2) SERDEPROPERITES The interface that provides parameter options. When TableStoreStorageHandler is used, two options, tablestore.columns.mapping and tablestore.table.name, must be specified.
(3) tablestore.columns.mapping Required option. Table Store columns to be accessed by MaxCompute include primary key and attribute columns.
In Table Store, columns with :, such as :vid and :gt in this example, are primary key columns, and others are attribute columns.
When specifying a mapping, you must provide all primary key columns of the specified Table Store table, and the attribute columns to be accessed by MaxCompute.
(4) tablestore.table.name The name of Table Store tables to be accessed. If the specified Table Store table name is incorrect (it does not exist), an error is reported. MaxCompute does not create any Table Store table proactively.
(5) LOCATION The instance information of Table Store to be accessed, including the instance name and endpoint.

Step 3. Access Table Store data through external tables

When an external table is created, Table Store data is introduced into the MaxCompute ecosystem and can be accessed using MaxCompute SQL commands.

  1. // Provides statistics on the average speed and oil consumption of vehicles with a VID less than 4 before the time stamp 1469171387)
  2. select vid,count(*),avg(speed),avg(oil_consumption) from ots_vehicle_track where vid <4 and gt<1469171387 group by vid;

A result similar to the following is returned.

Run MaxCompute SQL

Thank you! We've received your feedback.