All Products
Search
Document Center

AnalyticDB:DMS job scheduling

Last Updated:Mar 30, 2026

Use Data Management (DMS) job scheduling to automate an ETL pipeline that moves data from ApsaraDB RDS for PostgreSQL through Object Storage Service (OSS) to AnalyticDB for PostgreSQL in Serverless mode. The pipeline runs on the schedule you configure, loads data in T+1 mode, and uses OSS as intermediate storage to reduce costs and allow independent reprocessing of historical data.

How it works

The pipeline runs in three stages:

  1. Extract — A DMS task runs a SQL query against ApsaraDB RDS for PostgreSQL and writes the results to an OSS bucket as CSV files, partitioned by date.

  2. Load — A second DMS task reads those CSV files from OSS via a foreign table and inserts them into AnalyticDB for PostgreSQL.

  3. Schedule — DMS enforces the execution order (Extract then Load) and runs both tasks on the schedule you configure.

Constraints

  • Data in ApsaraDB RDS for PostgreSQL must support incremental archiving based on a time condition (for example, by day).

  • The ApsaraDB RDS for PostgreSQL instance, AnalyticDB for PostgreSQL instance, and OSS bucket must all be in the same region.

  • AnalyticDB for PostgreSQL in Serverless mode does not support primary keys.

Prerequisites

Before you begin, make sure you have:

AnalyticDB for PostgreSQL

ApsaraDB RDS for PostgreSQL

OSS

  • An OSS bucket. See Create buckets.

  • The bucket name and endpoint. To get them, log on to the OSS console, click Buckets in the left-side navigation pane, click the bucket name, then click Overview. The endpoint is listed in the Port section. Use the Access from ECS over the VPC (internal network) endpoint for best performance.

AccessKey

Step 1: Set up the source database (ApsaraDB RDS for PostgreSQL)

Connect to the ApsaraDB RDS for PostgreSQL instance. This tutorial uses DMS to connect. See Connect to an ApsaraDB RDS for PostgreSQL instance.

  1. Create a test table and insert sample data:

    CREATE TABLE t_src (a int, b int, c date);
    INSERT INTO t_src SELECT generate_series(1, 1000), 1, now();
  2. Install the oss_fdw extension:

    CREATE EXTENSION IF NOT EXISTS oss_fdw;
  3. Create a foreign server that points to your OSS bucket:

    CREATE SERVER ossserver FOREIGN DATA WRAPPER oss_fdw OPTIONS
        (host '<bucket_endpoint>', id '<access_key_id>', key '<access_key_secret>', bucket '<bucket_name>');

    Replace the following placeholders:

    Placeholder Description
    <bucket_endpoint> Endpoint of the OSS bucket
    <access_key_id> Your AccessKey ID
    <access_key_secret> Your AccessKey secret
    <bucket_name> Name of the OSS bucket

Step 2: Set up the destination database (AnalyticDB for PostgreSQL)

Connect to the AnalyticDB for PostgreSQL instance. This tutorial uses DMS to connect. See Client connection.

  1. Create a target table with the same schema as t_src:

    CREATE TABLE t_target (a int, b int, c date);
    Note AnalyticDB for PostgreSQL in Serverless mode does not support primary keys. Do not add a PRIMARY KEY constraint.
  2. Install the oss_fdw extension:

    CREATE EXTENSION IF NOT EXISTS oss_fdw;
  3. Create an OSS server and a user mapping:

    CREATE SERVER oss_serv
        FOREIGN DATA WRAPPER oss_fdw
        OPTIONS (
            endpoint '<bucket_endpoint>',
            bucket '<bucket_name>'
        );
    
    CREATE USER MAPPING FOR PUBLIC
        SERVER oss_serv
        OPTIONS (
            id '<access_key_id>',
            key '<access_key_secret>'
        );

    Replace the following placeholders:

    Placeholder Description
    <bucket_endpoint> Endpoint of the OSS bucket
    <bucket_name> Name of the OSS bucket
    <access_key_id> Your AccessKey ID
    <access_key_secret> Your AccessKey secret

Step 3: Configure the ETL job in DMS

Create a task flow with two tasks: one to extract data from RDS to OSS, and one to load data from OSS into AnalyticDB for PostgreSQL. DMS runs the tasks in sequence on the schedule you set.

  1. Log on to the DMS console.

  2. In the top navigation bar, click Data Development. In the left-side navigation pane, choose Data Development > Task Orchestration.

  3. In the Task Flow section, click Create Task Flow.

  4. In the Create Task Flow dialog box, enter a name in the Task Flow Name field and click OK. This tutorial uses Import data from RDS to OSS.

Create the Extract task

  1. On the task flow canvas, choose Data Processing Nodes > Single Instance SQL in the left-side navigation pane, then drag Single Instance SQL onto the canvas.

  2. (Optional) Select the task and click DMS作业调度-重命名 to rename it. This tutorial uses Extract data from RDS.

  3. Select the task and click DMS-编辑当前节点配置 to open the configuration panel.

  4. From the drop-down list above the code editor, select the ApsaraDB RDS for PostgreSQL database to bind.

    DMS作业调度-选择需要绑定的数据库

  5. In the code editor, enter the following SQL. The ${mydate} placeholder is replaced with the scheduled run date (in yyyyMMdd format, for example 20240101) at runtime.

    DROP FOREIGN TABLE IF EXISTS oss_${mydate};
    
    CREATE FOREIGN TABLE IF NOT EXISTS oss_${mydate}
        (a int,
         b int,
         c date)
         SERVER ossserver
         OPTIONS (dir 'rds/t3/${mydate}/', DELIMITER '|',
             format 'csv', encoding 'utf8');
    
    INSERT INTO oss_${mydate} SELECT * FROM t_src WHERE c >= '${mydate}';
  6. In the right-side navigation pane, click the Variable Setting tab. Select Node Variable, set Variable Name to mydate, and set Time Format to yyyyMMdd.

    DMS作业调度-RDS变量设置

Create the Load task

  1. Drag another Single Instance SQL node onto the canvas.

  2. (Optional) Rename it. This tutorial uses Load data to AnalyticDB for PostgreSQL.

  3. Select the task and click DMS-编辑当前节点配置 to open the configuration panel.

  4. From the drop-down list above the code editor, select the AnalyticDB for PostgreSQL database to bind.

  5. In the code editor, enter the following SQL:

     CREATE FOREIGN TABLE IF NOT EXISTS oss_${mydate}(
          a int,
          b int,
          c date
     ) SERVER oss_serv
         OPTIONS (
             dir 'rds/t3/${mydate}/',
             format 'csv',
             delimiter '|',
             encoding 'utf8');
    
     INSERT INTO t_target SELECT * FROM oss_${mydate};
  6. Repeat step 10 to configure the variable setting for this task: set Variable Name to mydate and Time Format to yyyyMMdd.

    DMS作业调度-RDS变量设置

Configure task dependencies and scheduling

  1. On the canvas, hover over the Extract data from RDS task. Click and hold the small circle on the right edge, then drag a line to the Load data to AnalyticDB for PostgreSQL task. This dependency ensures the Extract task always completes before the Load task starts.

    DMS作业调度-任务地图

  2. In the lower part of the page, click the Task Flow Information tab. In the Scheduling Settings section, turn on Enable Scheduling, then select a scheduling cycle for both the Extract and Load tasks.

Test and publish

  1. Click Try Run to validate the task flow.

  2. If the task flow runs successfully, click Publish.

What's next