All Products
Search
Document Center

AnalyticDB for MySQL:BUILD

Last Updated:Jan 24, 2024

BUILD jobs are used to rebuild partitions that have newly written data, which can improve read performance. During the rebuild process, the system creates indexes, clears redundant data, and asynchronously executes DDL statements.

Precautions

During a BUILD job, you cannot execute INSERT OVERWRITE SELECT statements until the BUILD job is completed.

Description

During a BUILD job, the system merges the data that is written in real time and the historical partitions related to the data, creates indexes, and asynchronously executes DDL statements. After INSERT, UPDATE, or DELETE statements are executed on the data in a partition, a BUILD job can be performed to rebuild the partition. BUILD jobs do not rebuild the partitions whose data has not been modified.

After a BUILD job is started, it cannot be stopped.

BUILD jobs are performed at the table level. After a BUILD job starts on a table, the job is split into tasks at the shard level. All three replicas of a shard each perform a task. The BUILD job is completed when all tasks are completed. Up to three BUILD jobs can be performed in parallel, each on a different table.

Automatic trigger of BUILD jobs

A BUILD job can be automatically triggered when one of the following conditions is met:

  • The period of time that has elapsed since the most recent BUILD job reaches the minimum time interval between BUILD jobs and at least 50,000 rows of data have been added to a shard of a table. The minimum time interval is 0.5 hours for Data Warehouse Edition (V3.0) clusters in reserved mode and 1.5 hours for Data Warehouse Edition (V3.0) clusters in elastic mode and Data Lakehouse Edition (V3.0) clusters.

  • Twenty-four hours have elapsed since the most recent BUILD job, and at least one row of data has been modified.

Manually trigger a BUILD job

You can use the following methods to manually trigger different BUILD jobs:

  • Execute a BUILD statement on the partitions that have data changes

    BUILD TABLE <table_name>;

    This method triggers a BUILD job only on the partitions that have data changes.

  • Execute a BUILD statement on specified partitions

    BUILD TABLE test force=true partitions='partition1,partition2';
    Important

    You can execute a BUILD statement on specified partitions only for AnalyticDB for MySQL clusters of V3.1.6.0 or later. For information about how to query the minor version of an AnalyticDB for MySQL cluster, see How do I query the version of an AnalyticDB for MySQL cluster? To update the minor version of a cluster, contact technical support.

    If a table contains large volumes of data, the BUILD TABLE <table_name> force = true; statement may require a long period of time to execute. AnalyticDB for MySQL allows you to specify the partitions on which you want to perform a BUILD job. This helps reduce the usage of resources and improves the performance of BUILD jobs.

  • Execute a BUILD statement on an entire table

    BUILD TABLE <table_name> force = true;

    This method triggers a BUILD job on and re-creates an index for all partitions of a table. By default, this feature is disabled.

    Important

    This method re-creates an index for all existing data of a table, which consumes a significant amount of time. We recommend that you execute a BUILD statement on specified partitions. Before you execute a BUILD statement on an entire table, you must evaluate the effects and risks. To enable this feature, submit a ticket.

Configure automatic scheduling for BUILD jobs

  • Configuration method

    SET ADB_CONFIG RC_CSTORE_BUILD_SCHEDULE_PERIOD=`<start>,<end>`;

    BUILD jobs are scheduled on the hour within the time ranges specified by the start and end parameters. Set both values to integers within the range of 0 to 24. You can configure multiple time ranges and separate them with semicolons (;).

    Important

    The specified time ranges are used for scheduling jobs, not for executing jobs. Therefore, after jobs are scheduled, the system may not complete executing the jobs when the time range ends.

  • Example

    Configure BUILD jobs to be scheduled within the following time ranges: 00:00:00 to 06:00:00 and 18:00:00 to 24:00:00.

    SET ADB_CONFIG RC_CSTORE_BUILD_SCHEDULE_PERIOD=`0,6;18,24`;

Configure scheduling priorities for BUILD jobs

The order in which BUILD jobs are scheduled is based on the amount of data that is added to a shard of a table. The larger the amount of data added to the shard, the earlier the job is scheduled and executed. You can also use a hint or execute the SET ADB_CONFIG statement to configure the scheduling priorities of BUILD jobs for tables.

Important
  • You can configure the scheduling priorities of BUILD jobs only for AnalyticDB for MySQL clusters of V3.1.5.0 or later. For information about how to query the minor version of an AnalyticDB for MySQL cluster, see How do I query the version of an AnalyticDB for MySQL cluster? To update the minor version of a cluster, contact technical support.

  • A hint supports only one table. The configured priority takes effect only for the current BUILD job.

  • The SET ADB_CONFIG statement supports one or more tables. The configured priorities do not change until you configure new scheduling priorities of BUILD jobs for the tables.

  • If you use a hint and the SET ADB_CONFIG statement to configure different scheduling priorities of BUILD jobs for a table, the hint-based scheduling priority takes precedence in the current job.

The task_priority parameter specifies the scheduling priority of BUILD jobs. The value must be an integer. The default value is 0. A greater value specifies a higher scheduling priority. After you configure priorities for BUILD jobs, the frontend nodes schedule BUILD jobs based on the priorities. If you set the task_priority parameter to a value that is less than 0 for a table, automatic scheduling of BUILD jobs is disabled for the table and the frontend nodes do not schedule BUILD jobs for the table.

Configuration methods:

  • Use a hint

    /*build_task_priority = <task_priority> */ BUILD TABLE <db_name>.<table_name>;

    For example, set the scheduling priority of BUILD jobs to 30 for the test table in the adb_demo database.

    /*build_task_priority = 30 */ Build TABLE adb_demo.test;
  • Use SET ADB_CONFIG

    • Configure scheduling priorities of BUILD jobs for multiple tables in different databases.

      SET ADB_CONFIG RC_BUILD_TASK_PRIORITY_LIST = `<db1_name>.<table1_name>.<task_priority>;<db2_name>.<table2_name>.<task_priority>`;
      Note

      Separate multiple tables with semicolon (;).

      For example, set the scheduling priority of BUILD jobs to 30 for the test1 table in the adb_demo1 database, and 10 for the test2 table in the adb_demo2 database.

      SET ADB_CONFIG RC_BUILD_TASK_PRIORITY_LIST = `adb_demo1.test1.30;adb_demo2.test2.10`;
    • Configure the same scheduling priority of BUILD jobs for all tables in a database.

      SET ADB_CONFIG RC_BUILD_TASK_PRIORITY_LIST = `<db1_name>.*.<task_priority>`;

      For example, set the scheduling priority of BUILD jobs to 30 for all tables in the adb_demo1 database.

      SET ADB_CONFIG RC_BUILD_TASK_PRIORITY_LIST = `adb_demo1.*.30`;
    • Configure different scheduling priorities of BUILD jobs between a table and other tables in the same database.

      SET ADB_CONFIG RC_BUILD_TASK_PRIORITY_LIST = `<db1_name>.*.<task_priority>;<db1_name>.<table_name>.<task_priority>`;

      For example, set the scheduling priority of BUILD jobs to 30 for the test1 table in the adb_demo1 database, and 10 for all other tables in the database.

      SET ADB_CONFIG RC_BUILD_TASK_PRIORITY_LIST = `adb_demo1.*.10;adb_demo1.test1.30`;

You can execute the SHOW ADB_CONFIG statement to query the current scheduling priority of BUILD jobs.

Query the state of a BUILD job

After a BUILD job is automatically or manually triggered, you can execute the following SQL statement to query the state of the BUILD job within the last three days:

SELECT table_name, schema_name, status FROM INFORMATION_SCHEMA.KEPLER_META_BUILD_TASK ORDER BY create_time DESC LIMIT 10;

The following table describes the states of BUILD jobs that correspond to different values of status.

Value of status

Description

INIT

The job is being initialized.

RUNNING

The job is running.

FINISH

The job is completed.