All Products
Search
Document Center

AnalyticDB:Synchronize SLS data using the data synchronization feature (Recommended)

Last Updated:Sep 20, 2025

The Simple Log Service (SLS) sync link synchronizes data in real time from an SLS Logstore to an AnalyticDB for MySQL cluster, starting from a specific time offset. This feature supports near real-time data output, full historical data archiving, and elastic analytics. This topic describes how to add an SLS data source, create an SLS sync link, start the sync task, analyze the synchronized data, and manage the data source.

Prerequisites

Notes

  • Currently, a table in an AnalyticDB for MySQL cluster can be synchronized with only one Logstore from Simple Log Service.

  • After data is ingested into the data lake, you must run a commit operation to make the written data visible. To prevent small commit intervals from affecting job stability and read/write performance, the default commit interval for the AnalyticDB for MySQL data synchronization feature is 5 minutes. Therefore, after you create and start a data synchronization task for the first time, you must wait at least 5 minutes before you can view the first batch of written data.

Billing

Using the AnalyticDB for MySQL data migration feature to migrate data to OSS incurs the following fees.

Procedure

Configure RAM authorization

If you synchronize SLS data to AnalyticDB for MySQL across different Alibaba Cloud accounts, you must create a RAM role on the source account, grant specific permissions to the RAM role, and modify the trust policy of the RAM role. If you are synchronizing SLS data only within the same account, you can skip this step and proceed to create a data source. For more information, see Create a data source.

  1. Create a RAM role. For more information, see Create a RAM role for an Alibaba Cloud account.

    Note

    When you configure the Select Trusted Cloud Account parameter, select Other Cloud Account and enter the ID of the Alibaba Cloud account that owns the AnalyticDB for MySQL cluster. You can log on to the Account Center to view the Account ID on the Overview page.

  2. Grant the AliyunAnalyticDBAccessingLogRolePolicy permission to the RAM role. For more information, see Grant precise permissions to a RAM role.

  3. Modify the trust policy of the RAM role. For more information, see Modify the trust policy of a RAM role.

    {
      "Statement": [
        {
          "Action": "sts:AssumeRole",
          "Effect": "Allow",
          "Principal": {
            "RAM": [
                "acs:ram::<Alibaba Cloud account ID>:root"
            ],
            "Service": [
                "<Alibaba Cloud account ID>@ads.aliyuncs.com"
            ]
          }
        }
      ],
      "Version": "1"
    }
    Note

    The Alibaba Cloud account ID is the one you entered in Step 1. Do not include the angle brackets (<>).

Create a data source

Note

If you want to synchronize data using an existing data source management task, you can skip this step and directly create a sync link. For more information, see Create a sync link.

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Data Ingestion > Data Sources.

  3. In the upper-right corner of the page, click Create Data Source.

  4. On the Create Data Source page, configure the parameters. The following table describes the parameters.

    Parameter

    Description

    Data Source Type

    Select SLS.

    Data Source Name

    The system generates a name based on the data source type and the current time. You can change it as needed.

    Data Source Description

    A description for the data source, such as the Data Lakehouse scenario or business limitations.

    Deployment Mode

    Currently, only Alibaba Cloud Instance is supported.

    Region of Simple Log Service Project

    The region where the SLS project is located.

    Across Alibaba Cloud Accounts

    The SLS data source supports synchronizing SLS data to AnalyticDB for MySQL across different Alibaba Cloud accounts.

    • No: Synchronize SLS data within the current account to AnalyticDB for MySQL.

    • Yes: Synchronize SLS data from another account to AnalyticDB for MySQL. When you select cross-account synchronization, you must specify Cross-Alibaba Cloud Account and Cross-Alibaba Cloud Account Role Name.

      Note
      • Cross-Alibaba Cloud Account: The Alibaba Cloud account ID of the source.

      • Cross-Alibaba Cloud Account Role Name: The name of the RAM role created on the source. This is the RAM role created in Step 1.

    Simple Log Service Project

    The source SLS project.

    Important

    The SLS Project list displays all projects under the Alibaba Cloud account and its RAM users. If you select a project of the Alibaba Cloud account, make sure the RAM user has permissions for that project. Otherwise, data cannot be synchronized to AnalyticDB for MySQL.

    Simple Log Service Logstore

    The source SLS Logstore.

  5. After you configure the parameters, click Create.

Create a sync link

  1. In the left-side navigation pane, click Simple Log Service/Kafka Data Synchronization.

  2. In the upper-right corner of the page, click Create Synchronization Job.

  3. On the Create Synchronization Job page, configure the parameters in the Source and Destination Settings, Destination Database and Table Settings, and Synchronization Settings sections.

    • The following table describes the parameters for Data Source and Destination Configuration.

      Parameter

      Description

      Job Name

      The name of the data link. The system generates a name based on the data source type and the current time. You can change it as needed.

      Data Source

      Select an existing SLS data source or create a new one.

      Destination Type

      The following options are supported:

      • Data Lake - User OSS.

      • Data Lake - ADB Lake Storage (Recommended).

        Important

        When you select Data Lake-ADB Lake Storage, you must enable the lake storage feature.

      ADB Lake Storage

      The name of the lake storage where the AnalyticDB for MySQL lake data resides.

      Select the destination lake storage from the drop-down list. If no lake storage has been created, click Automatic Creation in the drop-down list to automatically create one.

      Important

      This parameter is required when Destination Type is set to Data Lake - ADB Lake Storage.

      OSS Path

      The storage path in OSS for the AnalyticDB for MySQL Data Lakehouse data.

      Important
      • All buckets that reside in the same region as the AnalyticDB for MySQL cluster are displayed. Configure this parameter based on your business requirements. After you configure this parameter, it cannot be modified.

      • We recommend that you select an empty directory that does not have nested relationships with the directories of other data synchronization jobs. This prevents historical data from being overwritten. For example, assume that the involved OSS storage paths of two data synchronization jobs are oss://adb_demo/test/sls1/ and oss://adb_demo/test/. In this case, data overwriting occurs during data synchronization because these two paths have nested relationships with each other.

      Storage Format

      The data storage format. The following options are supported:

      • PAIMON.

        Important

        This format is supported only when Destination Type is set to Data Lake - User OSS.

      • ICEBERG.

    • The following table describes the parameters for Destination Database and Table Settings.

      Parameter

      Description

      Database Name

      The name of the database in AnalyticDB for MySQL to which data will be synchronized. If a database with the same name does not exist, a new one is created. If a database with the same name already exists, data is synchronized to the existing database. For more information about naming conventions, see Limits.

      Important

      In the Source and Destination Settings section, if Storage Format is set to PAIMON, an existing database must meet the following conditions. Otherwise, the data synchronization task will fail.

      • It must be an external database. The statement to create the database must be CREATE EXTERNAL DATABASE <DatabaseName>.

      • The `DBPROPERTIES` parameter of the database creation statement must include the catalog property, and the catalog value must be paimon.

      • The DBPROPERTIES parameter in the database creation statement must include the adb.paimon.warehouse property. For example: adb.paimon.warehouse=oss://testBucketName/aps/data.

      • The DBPROPERTIES parameter in the database creation statement must include the LOCATION property. You must also add .db to the end of the database name. Otherwise, XIHE queries will fail. For example: LOCATION=oss://testBucketName/aps/data/kafka_paimon_external_db.db/.

        The bucket folder in the OSS path specified for LOCATION must exist. Otherwise, the database creation will fail.

      Table Name

      The name of the table in AnalyticDB for MySQL to which data will be synchronized. If a table with the same name does not exist in the database, a new one is created. If a table with the same name already exists, the data synchronization will fail. For more information about naming conventions, see Limits.

      Schema Field Mapping

      By default, fields are retrieved from the delivery task configuration of Simple Log Service. If no delivery task is configured for the LogStore, fields are retrieved from the most recent log data by default.

      • Supported data types: BOOLEAN, INT, BIGINT, FLOAT, DOUBLE, and STRING.

      • Synchronization of SLS reserved fields is supported. For more information, see Reserved fields.

      Important
      • You cannot modify the destination field names.

      • If the task has been started (including running and completed states), you cannot modify existing column information, but you can add new columns. If the task has been created but not started, you can modify it as needed.

      Partition Key Settings

      The partition field settings for the destination table. To ensure that data is ingested and queries are executed as expected, we recommend that you configure partitions based on the log time or business logic. If you do not configure partitions, no partitions exist in the destination table.

      Valid values for the Format Processing Method parameter:

      • Formatted Time: Select a datetime field from the Source Partition Field drop-down list, set the Format Processing Method parameter to Formatted Time, and then configure the Source Field Format and Destination Partition Format parameters. AnalyticDB for MySQL identifies the value of the partition field based on the specified source field format and converts the value into the specified destination partition format for partitioning. For example, if the source field is gmt_created whose value is 1711358834, the Source Field Format parameter is set to Timestamp Accurate to Seconds, and the Destination Partition Format parameter is set to yyyyMMdd, the value is partitioned based on 20240325.

      • Specified Partition Field: Set the Format Processing Method parameter to Specified Partition Field and configure other required parameters.

    • The following table describes the parameters for Synchronization Configuration.

      Parameter

      Description

      Starting Consumer Offset for Incremental Synchronization

      When the sync task starts, it begins consuming SLS data from the selected time point. Valid values:

      • Earliest offset (begin_cursor): Automatically consumes data from the earliest time point in the SLS data.

      • Latest offset (end_cursor): Automatically retrieves data from the latest time point in the SLS data.

      • Custom offset: You can select any time point. The system will start consuming from the first piece of data in SLS that is at or after this time point.

      Job Resource Group

      Specify the Job resource group for the task to run in.

      Number Of ACUs For Incremental Sync

      The number of AnalyticDB compute units (ACUs) that are required for the job resource group to run the data synchronization job. The value ranges from 2 to the maximum number of computing resources that are available in the job resource group. To improve the stability and performance of data ingestion, we recommend that you specify a larger number of ACUs.

      Note

      When you create a data synchronization job in a job resource group, elastic resources in the resource group are used, and the system deducts the used resources from the resource group. For example, assume that a job resource group has 48 ACUs of reserved computing resources and a synchronization job that consumes 8 ACUs is created. When you create another synchronization job in the resource group, you can select up to 40 ACUs.

      Advanced Settings

      Advanced configuration lets you customize the sync task. To make custom configurations, contact technical support.

  4. After you configure the preceding parameters, click Submit.

Start the data synchronization job

  1. On the Simple Log Service/Kafka Data Synchronization page, find the data synchronization job that you created and click Start in the Actions column.

  2. In the upper-right corner of the page, click Search. If the state of the job changes to Starting, the data synchronization job is started.

Analyze data

After the data synchronization job is complete, you can use Spark JAR Development to analyze the data that is synchronized to the AnalyticDB for MySQL cluster. For more information about Spark development, see Spark editor and Overview.

  1. In the left-side navigation pane, choose Job Development > Spark JAR Development.

  2. Enter SQL statements in the default template and click Run Now.

    -- Here is just an example of SparkSQL. Modify the content and run your spark program.
    
    conf spark.driver.resourceSpec=medium;
    conf spark.executor.instances=2;
    conf spark.executor.resourceSpec=medium;
    conf spark.app.name=Spark SQL Test;
    conf spark.adb.connectors=oss;
    
    -- Here are your sql statements
    show tables from lakehouse20220413156_adbTest;
  3. (Optional) On the Applications tab, find an application and click Log in the Actions column to view the Spark SQL running log of the application.

Manage the data source

On the Data Sources page, you can perform the operations described in the following table in the Actions column.

Operation

Description

Create Job

Goes to the Create Synchronization Job or Create Migration Job page to create a job that uses the data source.

View

Views the detailed configurations of the data source.

Edit

Modifies the data source parameters, such as the data source name and description.

Delete

Deletes the data source.

Note

If the data source is being used in a data synchronization or data migration job, you cannot delete the data source. In this case, you must first go to the Simple Log Service/Kafka Data Synchronization page, find the job, and then click Delete in the Actions column to delete the job.