All Products
Search
Document Center

AnalyticDB for MySQL:Writes and queries

Last Updated:Jun 20, 2023

This topic provides answers to some frequently asked questions about writes and queries in AnalyticDB for MySQL.

Note

If no edition is specified in a question, the question is applicable only to AnalyticDB for MySQL Data Warehouse Edition (V3.0) clusters.

FAQ overview

Can I use JDBC to query the data of Hudi tables in an AnalyticDB for MySQL Data Lakehouse Edition (V3.0) cluster?

Yes, you can use JDBC to query the data of Hudi tables. After you create a Hudi table in a Data Lakehouse Edition (V3.0) cluster, you can use JDBC to query the data of the Hudi table.

Can I use a Data Lakehouse Edition (V3.0) cluster to read the data of Hudi tables that are stored in OSS?

Yes, you can use external tables to read the data of Hudi tables that are stored in OSS. For more information, see Use external tables to import data to Data Lakehouse Edition.

Can a Data Lakehouse Edition (V3.0) cluster automatically switch between XIHE MPP and XIHE BSP jobs?

When you submit a job, you must specify an interactive resource group or a job resource group to determine whether the job is a XIHE MPP or XIHE BSP job.

How do I choose between XIHE MPP and XIHE BSP to run a job in a Data Lakehouse Edition (V3.0) cluster?

By default, XIHE BSP asynchronously submits jobs. The only difference between synchronous submission and asynchronous submission is whether the client needs to wait for the query execution to complete.

The following limits are imposed on asynchronous submissions:

  • A result set can contain up to 10,000 rows of data.

  • The system can retain up to 1,000 result sets , including the download URL of CSV files, for up to 30 days.

We recommend that you asynchronously submit BSP jobs for queries that consume large amounts of computing power, and require a long period of time to complete, but return a small number of result sets. Examples: INSERT INTO SELECT, INSERT OVERWRITE SELECT, and CREATE TABLE AS SELECT.

How do I check the status of a XIHE BSP job in a Data Lakehouse Edition (V3.0) cluster?

  • If you submit a XIHE BSP job in a Data Lakehouse Edition (V3.0) cluster by using a job editor, you can choose Job Editor > SQL Development and click the Execution Records tab to view the status of the job.

  • If you submit a XIHE BSP job without using a job editor, you can execute the following statement to query the status of the job from the internal memory table:

    SELECT status FROM information_schema.kepler_meta_elastic_job_list WHERE process_id='<job_id>';

    You can execute the following statement to query status statistics about the XIHE BSP job:

    SELECT status,count(*) FROM information_schema.kepler_meta_elastic_job_list GROUP BY status;

How do I isolate resources to reduce impacts between SQL jobs?

AnalyticDB for MySQL Data Warehouse Edition (V3.0) clusters in elastic mode for Cluster Edition and Data Lakehouse Edition (V3.0) clusters provide the resource group feature. For more information, see Resource group overview (Data Warehouse Edition) and Resource group overview (Data Lakehouse Edition). You can create different types of resource groups and submit SQL jobs to the corresponding resource groups to isolate resources.

What do I do if a large number of conditions are specified for the IN operator?

AnalyticDB for MySQL imposes limits on the number of conditions that are specified for the IN operator. The default value is 2000. You can also modify the value based on your business requirements.

Note

To ensure performance, do not specify more than 5,000 conditions for the IN operator.

For example, execute the following statement to set the number of conditions for the IN operator to 3000:

SET ADB_CONFIG max_in_items_count=3000;

What do I do if the "Query exceeded maximum time limit of 1800000.00ms" error is reported?

AnalyticDB for MySQL allows you to configure timeout periods for SQL queries. If the execution of a query does not end after 1,800,000 milliseconds, the preceding error is reported. You can execute the following statements to configure a timeout period for a single query or all queries in a cluster:

  • A single query:

    /*+ QUERY_TIMEOUT=xxx */SELECT count(*) FROM t;
  • All queries in a cluster:

    SET ADB_CONFIG QUERY_TIMEOUT=xxx;

    For more information, see Common configuration parameters.

What do I do if query results contain truncated time?

Check whether the time that is contained in query results is displayed as expected on the MySQL client. If the time is displayed as expected, check whether the query results are processed by other clients.

What do I do if an error occurs on the built-in AES_ENCRYPT() function?

An error may occur when you execute the following statement:

 SELECT CONVERT(AES_DECRYPT(AES_ENCRYPT('ABC123','key_string'),'key_string'),char(10));

The x parameter in the AES_ENCRYPT(NCRYPT(varbinary x, varchar y) statement must be changed to the VARBINARY type. Sample statement:

SELECT CONVERT(AES_DECRYPT(AES_ENCRYPT(CAST('ABC123' AS VARBINARY), 'key_string'), 'key_string'),char(10)); 

Why do query results suddenly change?

If no data is updated, a sudden change in query results may occur due to the following reasons:

  • No ORDER BY clause is used together with the LIMIT clause. AnalyticDB for MySQL uses a distributed architecture. If a query uses multiple threads, the query ends when a number of rows that match the value that is specified by the LIMIT clause are returned. Therefore, if no ORDER BY clause is used together with the LIMIT clause, query results are returned in a random manner.

  • In a SELECT statement that uses a GROUP BY clause and aggregate functions, if the GROUP BY clause and aggregate functions do not contain a specific field, a random value is returned for the field.

If this issue persists, contact technical support.

Why does the number of scanned rows for all tables in an execution plan differ from the total number of rows that are scanned by the query?

In most cases, this issue is caused by dimension tables. AnalyticDB for MySQL stores dimension tables on each backend node and repeatedly counts the number of rows when the dimension tables are scanned.

Why is data duplicated when the INSERT OVERWRITE statement is used on an AnalyticDB for MySQL table that does not have a primary key?

Automatic deduplication is not supported for AnalyticDB for MySQL tables that do not have primary keys.

Why does the "Column 'XXX' not in GROUP BY clause" error occur when the SELECT * FROM TABLE GROUP BY KEY statement is executed?

Query results cannot display all fields when you use the GROUP BY clause. You must specify column names in your statement. Sample statement:

SELECT nation.name FROM nation GROUP BY nation.nationkey

Is a limit imposed on the number of values that are specified for the IN operator when query results are returned in the JSON format?

For AnalyticDB for MySQL V3.1.4 or earlier, the number of values that are specified for the IN operator cannot exceed 16. For AnalyticDB for MySQL versions later than V3.1.4, no limit is imposed on the number of values that are specified for the IN operator. For information about how to query the minor version of a cluster, see the "How do I query the minor version of an AnalyticDB for MySQL cluster?" section of the Purchase, configuration change, and upgrade topic.

Can I use .csv.gz files from OSS as external table data sources in AnalyticDB for MySQL?

Yes, you can use .csv.gz files as external table data sources in AnalyticDB for MySQL. You must set the compress_type parameter to gzip for external tables. For information about the syntax of OSS external tables, see the "Create an OSS external table for a non-partitioned object" section of the Use external tables to import data to Data Warehouse Edition topic.

Is the INSERT ON DUPLICATE KEY UPDATE statement supported in AnalyticDB for MySQL?

AnalyticDB for MySQL allows you to assign only values to variables in the INSERT ON DUPLICATE KEY UPDATE statement. You cannot assign expressions in this statement.

Can I use JOIN in UPDATE statements in AnalyticDB for MySQL?

You can use JOIN in UPDATE statements only for AnalyticDB for MySQL clusters that run V3.1.6.4 or later. For more information, see UPDATE.

Can I set SQL variables in AnalyticDB for MySQL?

No, you cannot set SQL variables in AnalyticDB for MySQL.

Can I execute the INSERT ON DUPLICATE KEY UPDATE statement to batch insert data by using the Logstash plug-in?

Yes, you can execute the INSERT ON DUPLICATE KEY UPDATE statement to batch insert data. When you execute the INSERT ON DUPLICATE KEY UPDATE statement to batch insert data, you must add the ON DUPLICATE KEY UPDATE clause after the last VALUES().

For example, execute the following statement to insert three rows of data into the student_course table:

INSERT INTO student_course(`id`, `user_id`, `nc_id`, `nc_user_id`, `nc_commodity_id`, `course_no`, `course_name`, `business_id`)
VALUES(277943, 11056941, '1001EE1000000043G2T5', '1001EE1000000043G2TO', '1001A5100000003YABO2', 'kckm303', 'Industrial Accounting Practice V9.0--77', 'kuaiji'),
(277944, 11056943, '1001EE1000000043G2T5', '1001EE1000000043G2TO', '1001A5100000003YABO2', 'kckm303', 'Industrial Accounting Practice V9.0--88', 'kuaiji'),
(277945, 11056944, '1001EE1000000043G2T5', '1001EE1000000043G2TO', '1001A5100000003YABO2', 'kckm303', 'Industrial Accounting Practice V9.0--99', 'kuaiji')
ON DUPLICATE KEY UPDATE
course_name = 'Industrial Accounting Practice V9.0--77',
business_id = 'kuaiji';

What are the conditions for an AnalyticDB for MySQL cluster to load a built-in dataset?

To load a built-in dataset, an AnalyticDB for MySQL cluster must have at least 24 AnalyticDB compute units (ACUs) of reserved storage resources and at least 16 ACUs of computing reserved resources in the user_default resource group.

How do I check whether a built-in dataset is loaded to an AnalyticDB for MySQL cluster?

In the left-side navigation pane of an AnalyticDB for MySQL cluster, choose Job Development > SQL Development. On the page that appears, you can view the loading progress. If the tu..png icon is displayed before the dimmed Load Built-in Dataset button and the Databases and Tables tab displays the ADB_SampleData_TPCH database and the tables of the database, the built-in dataset is loaded.

What do I do if the system returns an error message that indicates loading failure or prolonged loading?

Execute the DROP TABLE table_name; statement to delete all tables from the ADB_SampleData_TPCH database. Then, execute the DROP DATABASE ADB_SampleData_TPCH; statement to delete the ADB_SampleData_TPCH database. After the ADB_SampleData_TPCH database is deleted, reload the built-in dataset.

How do I use a standard database account to use the built-in dataset feature?

The built-in dataset feature complies with the permission management rules of AnalyticDB for MySQL. Even if your cluster is loaded with a built-in dataset, standard database accounts do not have permissions on the ADB_SampleData_TPCH database. You must use a privileged account to grant the permissions to standard accounts by using the following statement:

GRANT select ON ADB_SampleData_TPCH.* TO <user_name>;

How do I test an AnalyticDB for MySQL cluster after I load a built-in dataset?

AnalyticDB for MySQL provides corresponding query scripts that you can use to test a cluster. After a built-in dataset is loaded, you can choose Job Development > SQL Development, click the Scripts tab, and then execute SQL statements. For more information about the SQL statements, see TPC-H dataset.

Important

To ensure the integrity of the built-in dataset, we recommend that you execute only SELECT statements on the ADB_SampleData_TPCH database. If you execute other DML or DDL statements, data in the built-in dataset may be changed, and the dataset may fail to be loaded. In this case, you must delete the ADB_SampleData_TPCH database and reload the dataset.