AnalyticDB for MySQL supports submitting XIHE bulk synchronous parallel (BSP) SQL jobs through the SQL development editor or Java Database Connectivity (JDBC). This topic covers when to use XIHE BSP mode, how to submit and configure jobs, and how to monitor job status.
Prerequisites
Before you begin, ensure that you have:
A job resource group created for your AnalyticDB for MySQL Enterprise Edition, Basic Edition, or Data Lakehouse Edition cluster. For instructions, see Create a resource group.
A database account created for the cluster.
Use cases
XIHE BSP jobs run on the XIHE BSP engine, which is optimized for high throughput, automatic retries, and cost efficiency. Use XIHE BSP mode when:
Latency is not the priority: your workload prioritizes reliability and resource efficiency over query response time.
You are running ETL pipelines: data cleansing and transformation from a data source to the Application Data Service (ADS) layer typically process large volumes of data and are time-consuming. XIHE BSP mode provides the throughput and reliability these jobs require.
Large queries fail or are too costly in MPP mode: some queries involving large data volumes cause memory errors or require expensive scale-out in XIHE Massively Parallel Processing (MPP) mode. Running them in XIHE BSP mode uses a dedicated job resource group where data is written to disks — a more cost-effective approach.
You need to isolate burst low-priority queries: low-priority queries with unpredictable spikes can starve other workloads of resources. Routing them to a job resource group in XIHE BSP mode isolates their resource usage and prevents interference.
For ADS layer queries that require second- or millisecond-level response times, use XIHE MPP mode instead. For more information, see Compute engines.
A typical ETL flow is shown in the following figure.
Limitations
Writing to Apache Hudi tables is not supported in XIHE BSP mode.
Reading from or writing to Delta tables is not supported in XIHE BSP mode.
Submit a XIHE BSP job
Submit XIHE BSP jobs through the SQL development editor or JDBC.
For time-consuming, compute-intensive queries that return small result sets — such asINSERT INTO SELECT,INSERT OVERWRITE SELECT, andCREATE TABLE AS SELECT— asynchronous submission is recommended so your client does not block while waiting for results.
Use the SQL development editor
Use JDBC or a MySQL client (synchronous)
Use JDBC or a MySQL client (asynchronous)
Configure a XIHE BSP job
Configure resources, timeout, and priority at three scopes: a single job, all jobs in a resource group, or all jobs in a cluster. A narrower scope takes precedence.
Single job
Add configuration items directly to the job hint.
/*+ resource_group=<resource_group_name>,<config_name>*/ <SQL statement>;Example: run a job in the bsptest resource group with a maximum of 20 AnalyticDB Compute Units (ACUs).
/*+ resource_group=bsptest,elastic_job_max_acu=20*/SELECT count(*) from test_db.ods_hudi;Resource group
Apply a configuration to all jobs in a resource group.
SET adb_config <resource_group_name>.<config_name>Example: set a 20-ACU cap for all jobs in bsptest.
SET adb_config bsptest.elastic_job_max_acu=20;To verify that the setting took effect:
SHOW ADB_CONFIG KEY=<resource_group_name>.<config_name>;Cluster
Apply a configuration to all jobs in the cluster.
SET adb_config <config_name>=<value>;Example: set a 20-ACU cap for all jobs in the cluster.
SET adb_config elastic_job_max_acu=20;To verify that the setting took effect:
SHOW ADB_CONFIG KEY=<config_name>;Configuration items
| Category | Configuration item | Description | Default | When to adjust |
|---|---|---|---|---|
| Resource | elastic_job_max_acu | Maximum ACUs a single XIHE BSP job can use, including the AppMaster node and compute nodes. Cannot exceed the resource group maximum. Note The AppMaster node handles query parsing, scheduling, and execution. | 9 | Increase for jobs that are memory-intensive or require faster execution. |
| Timeout | batch_query_timeout | Job timeout in milliseconds. Jobs that exceed this duration are automatically canceled. | 7200000 | Increase for long-running ETL jobs. Decrease to fail fast and free resources sooner. |
| Priority | query_priority | Job priority within the resource group. Valid values: HIGH, NORMAL, LOW, LOWEST. For details, see Priority queues for job resource groups. | NORMAL | Set to LOW or LOWEST for burst background jobs to prevent them from competing with higher-priority workloads. |
FAQ
How do I check the status of a BSP job?
If you submitted the job from the job editor, view its status on the Execution Records tab at the bottom of the Job Editor > SQL Development page.
For jobs submitted through JDBC or other methods, query the information_schema.kepler_meta_elastic_job_list table:
SELECT status FROM information_schema.kepler_meta_elastic_job_list WHERE process_id='<job_id>';This table stores records for the last 1,000 BSP tasks submitted within the past 30 days. You can also aggregate across statuses — for example, to count jobs by state:
SELECT status, count(*) FROM information_schema.kepler_meta_elastic_job_list GROUP BY status;When should I use asynchronous instead of synchronous submission?
The two methods are functionally identical — the only difference is whether the client waits for the query to finish.
Use asynchronous submission for queries that are time-consuming and compute-intensive but return small result sets, such as INSERT INTO SELECT, INSERT OVERWRITE SELECT, and CREATE TABLE AS SELECT. This frees the client immediately and avoids connection timeout issues for long-running jobs.
Keep in mind that asynchronous results are subject to the following limits: result sets are capped at 10,000 rows, and up to 1,000 result sets are stored for up to 30 days.