All Products
Search
Document Center

Data Lake Analytics - Deprecated:Query data from multiple ApsaraDB RDS for MySQL instances

Last Updated:Mar 18, 2021

This topic describes how to query data from two ApsaraDB RDS for MySQL instances at the same time.

Background information

Data Lake Analytics (DLA) works as a hub for cloud data processing. It allows you to use a standard Java Database Connectivity (JDBC) driver to query and analyze data of a single ApsaraDB RDS, Tablestore, Object Storage Service (OSS), or ApsaraDB for MongoDB instance. You may create multiple ApsaraDB RDS, Tablestore, or ApsaraDB for MongoDB instances to store data to meet your business expansion or fine-grained data division requirements. In this scenario, you can still use DLA to query data from multiple instances of the same type at the same time.

Prerequisites

Before you use DLA to query data from two ApsaraDB RDS for MySQL instances, make sure that you have prepared test data by performing the following steps:

Notice
  • DLA and the two ApsaraDB RDS for MySQL instances must be deployed in the same region. Otherwise, the operations described in this topic cannot be performed.

  • DLA connects to the databases in the two ApsaraDB RDS for MySQL instances over the virtual private cloud (VPC) to which the instances belong. Therefore, we recommend that you select VPC as the network type when you create the two instances. You can also switch the network type from classic network to VPC for the two instances.

  1. Create two ApsaraDB RDS for MySQL instances.

  2. Configure whitelists for the two instances.

  3. Connect DLA to the instances.

  4. Write test data to tables in the databases of the instances.

    • Execute the following statements to create the orders_db database and the order_item table for ApsaraDB RDS for MySQL instance 1, and write test data to the table:

            create table orders_db.order_item (
                    id bigint not null primary key auto_increment,
                    prod_id bigint comment 'Product ID',
                    prod_cnt int comment 'Product quantity'
            );
            insert into orders_db.order_item values 
                    (1, 1, 2),
                    (2, 2, 3),
                    (3, 3, 4),
                    (4, 2, 5),
                    (5, 1, 6);
    
    • Execute the following statements to create the prod_db database and the prod table for ApsaraDB RDS for MySQL instance 2, and write test data to the table:

            create table prod_db.prod (
                    id bigint not null primary key auto_increment,
                    prod_name varchar(31) comment 'Product name'
            );
            insert into prod_db.prod values 
                    (1, 'Keyboard'),
                    (2, 'Mouse'),
                    (3, 'Monitor');
    

Procedure

Usage notes

Before you connect DLA to the ApsaraDB RDS for MySQL databases, you must add the CIDR block 100.104.0.0/16 to the whitelist of each ApsaraDB RDS for MySQL instance.

The ApsaraDB RDS for MySQL instances are deployed in your VPC. By default, DLA is not allowed to access resources in the VPC. To enable DLA to access the ApsaraDB RDS for MySQL databases, you must use the reverse access technique of VPCs. To achieve this purpose, you can add the CIDR block 100.104.0.0/16 to the whitelists of the ApsaraDB RDS for MySQL instances.

Note

Permission declaration: After you add the CIDR block 100.104.0.0/16 to the whitelists of the ApsaraDB RDS for MySQL instances, you are allowed to use the reverse access technique to read data from and write data to the ApsaraDB RDS for MySQL databases.

Step 1: Connect DLA to the ApsaraDB RDS for MySQL databases

Log on to the DLA console. In the left-side navigation pane, choose Serverless SQL > SQL access point. On the SQL access point page, find your VPC in the VPC Network section and click Log on in DMS in the Actions column to create schemas. The two schemas are mapped to those in the ApsaraDB RDS for MySQL databases.

CREATE SCHEMA dla_orders_db WITH DBPROPERTIES (
  CATALOG = 'mysql', 
  LOCATION = 'jdbc:mysql://rm-******.mysql.rds.aliyuncs.com:3306/orders_db',
  USER = 'mysql_db_user_name',
  PASSWORD = 'mysql_db_password',
  VPC_ID = 'mysql_vpc_id',
  INSTANCE_ID = 'mysql_instance_id'
);
CREATE SCHEMA dla_prod_db WITH DBPROPERTIES (
  CATALOG = 'mysql', 
  LOCATION = 'jdbc:mysql://rm-******.mysql.rds.aliyuncs.com:3306/prod_db',
  USER = 'mysql_db_user_name',
  PASSWORD = 'mysql_db_password',
  VPC_ID = 'mysql_vpc_id',
  INSTANCE_ID = 'mysql_instance_id'
);

Parameters

  • LOCATION: the JDBC URL that is used to connect to an ApsaraDB RDS for MySQL database. The URL is in the format of jdbc:mysql://Internal endpoint of an ApsaraDB RDS for MySQL database/Name of an ApsaraDB RDS for MySQL database.

  • USER: the username that is used to log on to an ApsaraDB RDS for MySQL database.

  • PASSWORD: the password of the username.

  • VPC_ID: the ID of the VPC to which an ApsaraDB RDS for MySQL instance belongs.

  • INSTANCE_ID: the ID of an ApsaraDB RDS for MySQL instance.

Step 2: Create external tables of the ApsaraDB RDS for MySQL databases in DLA

Create an external table named order_item in DLA. This external table is mapped to the order_item table in ApsaraDB RDS for MySQL instance 1.

create external table order_item (
    id bigint,
    prod_id bigint,
    prod_cnt int,
);

Create an external table named prod in DLA. This external table is mapped to the prod table in ApsaraDB RDS for MySQL instance 2.

create external table prod (
    id bigint,
    prod_name varchar(31)
);

Step 3: Use DLA to query data from the ApsaraDB RDS for MySQL databases

After you connect DLA to the two ApsaraDB RDS for MySQL databases and create external tables, you can use a MySQL client or MySQL command-line tool to connect to DLA, and execute SQL statements to query data from the ApsaraDB RDS for MySQL databases.

You can also directly use DLA to query data from the databases.

The following example shows how to query data from the order_item and prod tables in the databases after you use a MySQL command-line tool to connect to DLA:

mysql> select * from dla_orders_db.order_item;
+------+---------+----------+
| id   | prod_id | prod_cnt |
+------+---------+----------+
|    1 |       1 |        2 |
|    2 |       2 |        3 |
|    3 |       3 |        4 |
|    4 |       2 |        5 |
|    5 |       1 |        6 |
+------+---------+----------+
5 rows in set (0.22 sec)

mysql> select * from dla_prod_db.prod;
+------+-----------+
| id   | prod_name |
+------+-----------+
|    1 | Keyboard      |
|    2 | Mouse      |
|    3 | Monitor    |
+------+-----------+
3 rows in set (0.18 sec)

select p.prod_name, sum(o.prod_cnt) cnt
   from dla_prod_db.prod p inner join dla_orders_db.order_item o
    on p.id = o.prod_id
    group by p.prod_name
    order by cnt desc;
+------+-----------+
| prod_name | cnt  |
+-----------+------+
| Keyboard      |    8 |
| Mouse      |    8 |
| Monitor    |    4 |
+-----------+------+
3 rows in set (0.55 sec)

What to do next

After you query data from the databases in the two ApsaraDB RDS for MySQL instances, you can use DLA to store the obtained data to OSS. OSS is a secure, cost-effective, and reliable cloud storage service that allows you to store a large volume of data. If you require the data in the future, you can use DLA to read the data without the need to transfer data files in OSS.