All Products
Search
Document Center

MaxCompute:Best practice to migrate data from Oracle to MaxCompute

Last Updated:Jan 22, 2025

This topic describes how to use the data integration feature of DataWorks to migrate data from Oracle to MaxCompute.

Prerequisites

  • The DataWorks environment is prepared.

    1. MaxCompute and DataWorks are activated. For more information, see Activate MaxCompute and DataWorks.

    2. A workspace is created. For more information, see Create a workspace. In this example, a workspace in basic mode is used.

    3. A workflow is created in your workspace in the DataWorks console. For more information, see Create a workflow.

  • The Oracle database is prepared.

    In this example, the Oracle database is installed on an Elastic Compute Service (ECS) instance. To enable network communication, you must configure a public IP address for the ECS instance. In addition, you must configure a security group rule for the ECS instance to ensure that the common port 1521 of the Oracle database is accessible. The following figure shows the configuration of the ECS instance. For more information about how to configure a security group rule for an ECS instance, see Modify a security group rule.

    In this example, the type of the ECS instance is ecs.c5.xlarge. The ECS instance resides in a virtual private cloud (VPC) in the China (Hangzhou) region.

Background information

In this example, DataWorks Oracle Reader is used to read test data from the Oracle database. For more information, see Oracle Reader.

Prepare test data in the Oracle database

  1. In the Oracle database, create a table named DTSTEST.GOOD_SALE. The table contains the following columns: create_time, category, brand, buyer_id, trans_num, trans_amount, and click_cnt.

  2. Execute the following statements to insert test data into the DTSTEST.GOOD_SALE table:

    insert into good_sale values('28-December-19','Kitchenware','Brand A','hanmeimei','6','80.6','4');
    insert into good_sale values('21-December-19','Fresh food','Brand B','lilei','7','440.6','5');
    insert into good_sale values('29-December-19','Clothing','Brand C','lily','12','351.9','9');
    commit;
  3. Execute the following statement to view the data in the table:

    select * from good_sale;

Use DataWorks to migrate data from the Oracle database to MaxCompute

  1. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and O&M > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  2. On the DataStudio page, create a table to store the data that is migrated from the Oracle database.

    1. In the Scheduled Workflow pane of the DataStudio page, find the workflow that you created, right-click the workflow name, and then choose Create Table > MaxCompute > Table.

    2. In the Create Table dialog box, configure the basic information about the table.

    3. In the top toolbar of the configuration tab of the table, click DDL.

    4. In the DDL dialog box, enter the following CREATE TABLE statement and click Generate Table Schema.

      CREATE TABLE good_sale 
      (
          create_time     string,
          category        string,
          brand           string,
          buyer_id        string,
          trans_num       bigint,
          trans_amount    double,
          click_cnt       bigint
      ) ;

      Make sure that the data types of the MaxCompute table that you create match the data types of the Oracle table. For more information about the data types supported by Oracle Reader, see Data type mappings.

    5. Click Submit to Production Environment.

  3. Create an Oracle connection. For more information, see Add an Oracle data source.

  4. Create a batch synchronization task.

    1. Go to the data analytics page. Right-click the specified workflow and choose new > data integration > offline synchronization.

    2. In the Create Node dialog box, enter a node name in the Name field and click Confirm.

    3. After you create the batch synchronization task, select the Oracle data source as the source, the MaxCompute data source as the destination, and the resource group in the Configure Network Connections and Resource Group step. Then, click Test Connectivity of All Data Sources to test the network connectivity between the resource group and data sources. If the connectivity test is successful, click Next. On the Configure Source and Destination page, set the Table parameter to the Oracle table that you create. On the Field Mapping page, click Map with the same name. Retain the default values of other parameters.

    4. Click **icon to run the code.

    5. You can operation Log view the results.

Verify the migration result

  1. Right-click the workflow and choose new > MaxCompute > ODPS SQL.

  2. In the Create Node dialog box, enter a node name in the Name field and click Confirm.

  3. On the configuration tab of the ODPS SQL node, enter the following statement:

    -- Check whether the data is synchronized to MaxCompute. 
    select * from good_sale;
  4. Click **icon to run the code.

  5. You can operation Log view the results.