【MaxCompute FAQ】 SQL

MaxCompute SQL syntax, error reporting, usage, and other related issues

1. Does MaxCompute support direct conversion of a non-partitioned table to a partitioned table?
Direct conversion is not currently supported. You have to create a new partitioned table and insert the data into its partitions.
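As a minimal sketch of the rebuild-and-insert approach, assuming a non-partitioned source table sale_detail_src that already holds a sale_date column (all table and column names here are hypothetical):

```sql
-- Hypothetical names: sale_detail_src (existing non-partitioned table),
-- sale_detail_part (new partitioned table), sale_date (partition column).
CREATE TABLE IF NOT EXISTS sale_detail_part (
    shop_name   STRING,
    total_price DOUBLE
)
PARTITIONED BY (sale_date STRING);

-- Dynamic partitioning: the last SELECT column supplies the partition value.
INSERT OVERWRITE TABLE sale_detail_part PARTITION (sale_date)
SELECT shop_name, total_price, sale_date
FROM sale_detail_src;
```

Once the new table is verified, the old table can be dropped and the new one renamed if the original name must be kept.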

2. How do I write SQL that adds a column at a specified position in a MaxCompute table?
Specifying a position is not currently supported; a new column is always appended as the last column. See the documentation.
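For illustration, a column can be appended as follows; the table name sale_detail and the column customer_level are hypothetical:

```sql
-- Hypothetical table and column names.
-- The new column is appended after the existing columns.
ALTER TABLE sale_detail ADD COLUMNS (customer_level STRING COMMENT 'added later');
```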

3. How do I synchronize the data in a development table to the production table?
Use an INSERT statement and qualify the target table with the production project name. See the reference documentation.
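A minimal sketch, assuming a development project named my_proj_dev, a production project named my_proj, and a table user_orders with the same schema in both (all three names are hypothetical), and assuming the executing account may read the source and write the target:

```sql
-- Hypothetical project and table names.
-- project_name.table_name addresses a table in another project.
INSERT OVERWRITE TABLE my_proj.user_orders
SELECT * FROM my_proj_dev.user_orders;
```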

4. How do I check whether a table has data skew (CLUSTERED BY is not set)?
The root cause of data skew is that a small number of workers process far more data than the others. Those few workers therefore run much longer than the average, which stretches the running time of the whole task and delays it. For data skew optimization, refer to the documentation.

5. Will conflicts occur when concurrently writing to different partitions of a MaxCompute table?
Writes to different partitions do not conflict. Writing to the same partition at the same time reports an error.

6. In MaxCompute, how do I use a regular expression to check whether a string is Chinese? select '我' rlike '^[一-龥]$'; returns false. What should I do?
Write the character range with Unicode code points instead: select '我' rlike '[\x{4e00}-\x{9fa5}]+';


7. MaxCompute SQL reports an error: FAILED: ODPS-0121145:Data overflow - Div result is inf, two params are 19.000000 and 0.000000
This is a data overflow error: the result exceeds the value range of the data type. Here the division is 19/0, which has no valid result. Make sure the data types are correct and that the divisor is not 0.

8. In MaxCompute SQL, when using JOIN, does partition pruning take effect if the partition condition is placed in the ON clause, or only if it is placed in the WHERE clause?
When a SQL statement uses JOIN:
•If the partition pruning condition is placed in the WHERE clause, partition pruning takes effect.
•If the partition pruning condition is placed in the ON clause, partition pruning takes effect for the secondary table but not for the primary table. Refer to the official documentation.

9. MaxCompute does not support cross join, so how do I implement a co-occurrence matrix?
MaxCompute does not currently support cross join. With MAPJOIN, however, you can compute a Cartesian product by writing ON 1 = 1 in place of a real join condition. Refer to this document.
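A hedged sketch of the MAPJOIN form, assuming two hypothetical tables items_small and items_all, where items_small is small enough to be loaded into memory as the MAPJOIN side:

```sql
-- Hypothetical tables and columns; a is the small table named in the hint.
SELECT /*+ MAPJOIN(a) */
       a.item_id AS item_a,
       b.item_id AS item_b
FROM items_small a
JOIN items_all b
ON 1 = 1;
```

Every row of items_small is paired with every row of items_all, so the output grows as the product of the two row counts.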


10. SQL error: Failed to run ddltask - Modify DDL meta encounter exception : ODPS-0121096:MetaStore transaction conflict - Reached maximum retry times because of OTSStorageTxnLockKeyFail(Inner exception: Transaction timeout because cannot acquire exclusive lock.)
MaxCompute currently has no lock mechanism on the table being operated on. This error is caused by contention for metadata. Check whether the same table or table partition is being read and written several times at once. Because there is no lock mechanism, avoid running multiple operations on the same table at the same time.

11. How do I create a view in MaxCompute?
Create it directly with a DDL statement (CREATE VIEW). Refer to the documentation.
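For illustration only, with a hypothetical source table sale_detail and a hypothetical view name:

```sql
-- Hypothetical table, view, and column names.
CREATE VIEW IF NOT EXISTS sale_detail_view AS
SELECT shop_name, total_price
FROM sale_detail
WHERE total_price > 0;
```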

12. In MaxCompute, how do I disable "query acceleration mode"?
Add the following flag before the SQL statement and execute them together: set odps.mcqa.disable=true;


13. In MaxCompute, CAST(substr(sdk_channel,1,1) AS BIGINT) = 1 or CAST(t.is_client AS BIGINT) = 1 reports the error: Illegal type cast - in function cast, value 'n' cannot be casted from String to Bigint
substr extracts a substring; its two numeric parameters are the start position and the number of characters to extract. The result is a string, and the outer cast converts it to BIGINT. If the extracted substring is not numeric, for example a character taken from 'abc', it cannot be converted to BIGINT. If it is numeric, for example a digit taken from '123', the conversion works. For example:
select cast(substr("1234",1,1) as BIGINT );
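One possible way to avoid the error on non-numeric values is to test the extracted character before casting. This is only a sketch: the table name app_events is hypothetical, and sdk_channel is the column from the question.

```sql
-- Hypothetical table name; sdk_channel comes from the question above.
-- Rows whose first character is not a digit yield NULL instead of failing the cast.
SELECT sdk_channel,
       CASE
           WHEN substr(sdk_channel, 1, 1) RLIKE '^[0-9]$'
               THEN CAST(substr(sdk_channel, 1, 1) AS BIGINT)
           ELSE NULL
       END AS channel_flag
FROM app_events;
```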

14. How to filter out fields with empty strings in SQL?

15. What can I do if a STRING field in a MaxCompute table holds values longer than 8 MB?
There are two methods: 1. Split the field. 2. Use an exclusive resource group, then submit a ticket to have it bound, which resolves the 8 MB limit.

16. Can a MaxCompute SQL task be syntax-checked?
Yes, you can run explain on the statement. Refer to the official documentation.

17. In MaxCompute, what is the syntax for querying data by partition?
select * from table_name where dt='1'; Refer to the documentation.

18. Can MaxCompute specify columns for insert overwrite?
insert overwrite does not support specifying the columns to insert. You can use insert into instead, or design a zipper table. Refer to this article.


19. When SQL is executed in MaxCompute, the error ODPS-0130071:[0,0] Semantic analysis exception - physical plan generation failed: java.lang.RuntimeException: com.aliyun.odps.lot.cbo.FailFastException: instance count exceeds limit 99999 is reported.
In MaxCompute, the number of instances in a single job cannot exceed 100,000, and the instance count of a job depends closely on the amount of input data and the structure of the SQL. Add set odps.sql.mapper.split.size=4096; before the actual SQL statement. It sets the maximum input size of a single map in MB; the default is 256 MB. By adjusting this variable you control the input of the map stage.

20. In MaxCompute, if the same partition of the same table is operated on at the same time, will the data be doubled?
No. MaxCompute does not allow the same metadata to be updated by more than one operation at the same time.

21. How do I remove duplicate IDs in MaxCompute SQL?
Use GROUP BY or DISTINCT to deduplicate.

22. MaxCompute SQL reports: Invalid arguments - format string has second part, but doesn't have minute part : yyyy-mm-dd hh:mm:ss
In MaxCompute format strings, mm means month and mi means minute, so the time part must be written as hh:mi:ss. For example, to_date('20181030 15-13-12.345','yyyymmdd hh-mi-ss.ff3') gives 2018-10-30 15:13:12.


23. What should I do when the denominator is 0 while calculating a ratio in MaxCompute SQL? Is there a corresponding function?
Handle it with CASE WHEN or the built-in DECODE function. Refer to the official documentation.
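A minimal sketch of the CASE WHEN approach, assuming a hypothetical table ad_stats with numeric columns clicks and impressions:

```sql
-- Hypothetical table and column names.
-- A zero denominator yields NULL instead of a division error.
SELECT clicks,
       impressions,
       CASE
           WHEN impressions = 0 THEN NULL
           ELSE clicks / impressions
       END AS click_ratio
FROM ad_stats;
```

Returning NULL is a design choice for this sketch; a fixed value such as 0 may suit some reports better.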

24. Does MaxCompute have a built-in function, similar to those in databases, that merges three records into one record separated by commas?
Yes, use the built-in WM_CONCAT function. Refer to the official documentation.
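For illustration, assuming a hypothetical table user_tags with columns user_id and tag:

```sql
-- Hypothetical table and column names.
-- WM_CONCAT takes the separator first, then the string column to join.
SELECT user_id,
       WM_CONCAT(',', tag) AS tags
FROM user_tags
GROUP BY user_id;
```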


25. Does MaxCompute have a function to convert a date into a day of the week?
You can use the WEEKDAY built-in function. You can refer to the official documentation .


26. Can a time field in MaxCompute omit the hours, minutes, and seconds?
Yes, use the DATE data type, which becomes available once MaxCompute 2.0 is enabled. For details, refer to the documentation.

27. Do MaxCompute tables have indexes?
There are no indexes, but Hash Clustering provides an effect similar to a clustered index in a database. For details, refer to the documentation.


28. How do I write SQL that deletes all the data in a table?
You can delete data by dropping it (DROP).
For a non-partitioned table, clear the data with the TRUNCATE TABLE table_name; statement.
For a partitioned table, delete the data of an entire partition with ALTER TABLE table_name DROP IF EXISTS PARTITION (partition_name='specific partition value');.

29. Is there a quick way to check which tables in a project are partitioned tables?
select table_name from information_schema.columns where is_partition_key = true group by table_name;


30. When querying a partitioned table with the condition add_months('2020-06-01',-1) in the WHERE clause, an error is reported: is full scan with all partitions, please specify partition predicates. How do I solve it?
Use the explain command to check whether partition pruning takes effect in the SQL. For details, refer to the documentation.

31. Can I insert data into MaxCompute when accessing it through JDBC?
Yes, INSERT can be executed. Refer to the official documentation for details.


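32. select a.*,b.ce from a left join b on a.cente_id = b.id WHERE a.pt='20200518' and b.pt='20200518' is meant to select partition 20200518 of tables a and b. Why does it not work as a left join?
The WHERE condition b.pt='20200518' discards the rows of a that have no match in b, because b.pt is NULL for those rows, so the query behaves like an inner join. Filter each table in a subquery first and then join:
select a1.*,b1.ce from (select * from a where pt='20200518') a1 left join
(select * from b where pt='20200518') b1 on a1.cente_id = b1.id;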

33. In MaxCompute, will running a GROUP BY query over 10 billion rows affect performance? Is there a limit on the amount of data GROUP BY can handle?
No impact and no limit. For the GROUP BY usage rules, refer to the MaxCompute SELECT syntax documentation, which covers the format of SELECT and the considerations for nested queries, sorting, grouping, and similar operations.


34. Does MaxCompute SQL support the with as statement?
Yes, it is supported. Refer to the documentation for details.


35. Using the MaxCompute 2.0 data types requires a set statement. How is it run inside DataWorks?
To use the 2.0 data type system, set set odps.sql.type.system.odps2=true; or setproject odps.sql.type.system.odps2=true;. The statement can be run in the DDL mode of a newly created table in DataWorks or in an ODPS SQL node.

36. How do I check whether a field is empty in MaxCompute SQL?
Use IS NULL. MaxCompute SQL provides relational, arithmetic, bitwise, and logical operators. Refer to the official documentation.

37. Can an accidentally dropped table be recovered?
The backup and restore function provided by MaxCompute can help you restore table data. For more information, see Backup and Restore.

38. Where can I see all executed SQL?
View the 14-day history through TASKS_HISTORY in the Information_Schema metadata. The Information_Schema metadata service is now fully available, and you can use it to query the metadata of key objects in a project. Besides metadata, it also provides historical usage data, including job runs and data uploads and downloads. Currently, Information_Schema data is retained for only 15 days by default; if you need it for longer, you have to export it yourself. For details, refer to the official documentation.

39. How do I write multi-line comments in MaxCompute SQL?
Use Ctrl + / to comment out multiple lines. The official documentation lists the code shortcuts and DataStudio shortcuts in detail.

40. In MaxCompute, how do I query all rows whose column A contains "123456" or "678910"?
select * from tablename where A rlike '.*(123456|678910).*';

41. A data query reports: Semantic analysis exception - INT type is not enabled in current mode. Why can't the INT type be used?
To use the INT type, you need to turn on set odps.sql.type.system.odps2=true;. BIGINT is supported by default, so if you have no special requirement, just use BIGINT.

42. Can MaxCompute SQL statements be invoked through a shell node in DataStudio?
No. The shell node supports standard shell syntax, not interactive syntax. If there are many tasks, use ODPS SQL nodes to run them. For more about DataStudio, refer to the official documentation.

43. Does MaxCompute support changing the data type of table fields?
Not supported; only columns can be added. Deleting columns, modifying columns, and modifying partition columns are not allowed. If a change is unavoidable, drop and rebuild the table. You can create the table as an external table so that the data can be loaded back after the table is dropped and rebuilt. Refer to the official documentation for data types.

44. Apart from a UDF, is there any other way to combine two unrelated tables into one table?
For a vertical merge, use UNION ALL. For a horizontal merge, use ROW_NUMBER to add a new ID column to both tables, join on that ID, and then select the fields of both tables.
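The horizontal merge could be sketched as follows, assuming two hypothetical tables t_left(col_a) and t_right(col_b). The join type is an assumption: FULL OUTER JOIN keeps the extra rows when the tables differ in length.

```sql
-- Hypothetical tables and columns.
-- ROW_NUMBER assigns each row a sequential ID that serves as the join key.
SELECT l.col_a, r.col_b
FROM (
    SELECT col_a, ROW_NUMBER() OVER (ORDER BY col_a) AS rn
    FROM t_left
) l
FULL OUTER JOIN (
    SELECT col_b, ROW_NUMBER() OVER (ORDER BY col_b) AS rn
    FROM t_right
) r
ON l.rn = r.rn;
```

Because the tables are unrelated, the pairing of rows depends entirely on the ORDER BY chosen in each window.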

45. Can partitions be added or changed in MaxCompute? Can delete and update operations be performed on table data?
In MaxCompute, you cannot directly add or change partition information on the source table. Once a partition is created, it cannot be changed. It is recommended to create a new partitioned table and use dynamic partition SQL to import the data from the source table into it. MaxCompute also does not support direct UPDATE or DELETE operations on the data of ordinary tables. Refer to the documentation.

46. In MaxCompute SQL, NOT IN is followed by a subquery that returns tens of thousands of rows. When the subquery returns more than 1,000 values of col1, the system reports the error records returned from subquery exceeded limit of 1000. How can I work around the subquery limit of 1,000?
Use a join instead of the subquery. For an IN subquery such as SELECT * FROM table_a a WHERE a.col1 IN (SELECT col1 FROM table_b b WHERE xxx); the equivalent join is SELECT a.* FROM table_a a JOIN (SELECT DISTINCT col1 FROM table_b b WHERE xxx) c ON (a.col1 = c.col1); For NOT IN, use a left outer join in the same way and keep only the rows that find no match in the subquery.
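For the NOT IN case in the question, the left outer join form might look like the sketch below. It reuses the placeholder names table_a, table_b, and col1 from the answer; the subquery's own filter (xxx in the answer) is left for you to supply.

```sql
-- Placeholder names from the answer above; add your own subquery filter where noted.
SELECT a.*
FROM table_a a
LEFT OUTER JOIN (
    SELECT DISTINCT col1
    FROM table_b
    WHERE col1 IS NOT NULL   -- add the original subquery condition here
) c
ON a.col1 = c.col1
WHERE c.col1 IS NULL;        -- keep only rows with no match in table_b
```

NULL handling differs from NOT IN: this sketch keeps rows of table_a whose col1 is NULL, whereas NOT IN drops them.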

47. select 'size' rlike '[一-龥]+'; How do I write a regular expression that matches Chinese characters?
As in question 6, write the range with Unicode code points: rlike '[\x{4e00}-\x{9fa5}]+'

48. select * from sale_detail order by region; reports the error:
Semantic analysis exception - ORDER BY must be used with a LIMIT clause
ORDER BY must be used together with LIMIT; using ORDER BY without LIMIT returns an error. Refer to the documentation.
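For example, the failing statement from the question can be completed with a LIMIT clause; the value 100 is an arbitrary illustration:

```sql
-- sale_detail and region come from the question; the limit value is arbitrary.
SELECT * FROM sale_detail ORDER BY region LIMIT 100;
```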


49. In MaxCompute, is it better for a table to have more partitions?
In MaxCompute, a table can have at most 60,000 partitions, and there is no upper limit on the capacity of each partition. However, too many partitions make statistics and analysis very inconvenient. MaxCompute also limits the maximum number of instances in a single job, and the instance count depends closely on the amount of input data and the number of partitions. Choose a partitioning strategy that fits your business needs.


50. If a table has no partition field to begin with, can partitions be added or changed later?
You cannot add or change a partition key directly on the source table, and once a partition key is created it cannot be changed. You can, however, create a new partitioned table and use dynamic partition SQL to import the data from the source table into it. Refer to the documentation.


51. In MaxCompute, how do I convert one row into multiple rows?
Use Lateral View together with UDTFs such as split and explode. It splits one row of data into multiple rows, and you can then aggregate the split data. Refer to the documentation.
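A minimal sketch, assuming a hypothetical table page_tags with columns page_id and tags, where tags holds a comma-separated string such as 'a,b,c'. Depending on project settings, the 2.0 data type flag mentioned elsewhere in this FAQ may need to be enabled first.

```sql
-- Hypothetical table and column names.
-- SPLIT turns the string into an array; EXPLODE emits one row per element.
SELECT page_id, tag
FROM page_tags
LATERAL VIEW EXPLODE(SPLIT(tags, ',')) t AS tag;
```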


52. In MaxCompute, if an error occurs while an INSERT statement is running, will the original data be damaged?
The original data will not be damaged. MaxCompute guarantees atomicity: an INSERT either succeeds completely or fails and rolls back.

53. I run a SQL statement to query a table with 10,000 rows, but the query stays in the Job Queueing... state and never executes. What is the reason?
Check the status of running tasks. Another task may be occupying the resources; abort it or wait for it to finish. You can run the SHOW INSTANCES or SHOW P command in the MaxCompute client to view instance information.

54. How do I solve the error ODPS-0130161:[1,1] Parse exception - invalid token 'cost' when using MaxCompute SQLTask to execute the cost sql command?
Use the SQLCostTask interface in the Java SDK to query the cost of a single SQL statement. Refer to the documentation for how to use the interface.


55. How to delete the production environment table in MaxCompute?
You can use drop table project_name.table_name; in the MaxCompute client (odpscmd) or DataStudio to delete the table in the production environment.

56. In MaxCompute, can you add or delete columns?
Columns can be added, but not deleted. If you need to delete columns, you can create a new table and then rename the table.

57. During the execution of MaxCompute SQL, how do I deal with the error Table xx has n columns but query has m columns?
When MaxCompute SQL uses INSERT INTO/OVERWRITE to insert data, the fields returned by SELECT must match the fields of the target table in order, field type, and total number of fields. Currently, MaxCompute does not support inserting into specified fields of a table. If the other fields should be NULL or another default value, set them explicitly in the SELECT, such as SELECT 'a', NULL FROM XX.

58. How can the values of the same field be concatenated?
MaxCompute can use the WM_CONCAT function to concatenate the values of the same field. For the function description, refer to the documentation.

59. How do I modify the Hash Clustering attribute of a table in MaxCompute?
To add the Hash Clustering attribute to a table:
ALTER TABLE table_name [CLUSTERED BY (col_name [, col_name, ...]) [SORTED BY (col_name [ASC | DESC] [, col_name [ASC | DESC] ...])] INTO number_of_buckets BUCKETS]
To remove the Hash Clustering attribute from a table:
ALTER TABLE table_name NOT CLUSTERED;
For a partitioned table:
ALTER TABLE table_name [partitions (ds='xxx')] NOT CLUSTERED;


60. In MaxCompute, how do I check whether a specified table or partition exists?
Use the function TABLE_EXISTS to check whether a specified table exists, and the function PARTITION_EXISTS to check whether a specified partition exists. For the function descriptions, refer to the documentation.


61. Using SELECT table_name FROM INFORMATION_SCHEMA.TABLES where table_schema = ? to get the tables in a specified project fails with:
ODPS-0130013: Authorization exception - Authorization Failed [4002], You don't exist in project information_schema. How do I get authorized?
For authorization, refer to the MaxCompute metadata service section of the official documentation.

62. When executing an INSERT INTO or INSERT OVERWRITE operation, a field mismatch is reported. How do I solve it?
When inserting data with INSERT INTO or INSERT OVERWRITE, the fields returned by SELECT must match the fields of the target table in order, field type, and total number of fields. MaxCompute does not support inserting into specified fields of a table. When the other fields should be NULL or another default value, set them explicitly in the SELECT, for example, select 'a', null from XX;.
