DataWorks DataAnalysis provides the SQL query feature. This feature allows you to query and analyze the data of data sources on which you have query permissions. This topic describes how to use the SQL query feature.
Background information
The SQL query feature of DataWorks DataAnalysis is not available for PostgreSQL, SQL Server, Oracle, or PolarDB for Xscale (PolarDB-X) from October 13, 2022. However, this feature is available for big data engines, such as MaxCompute, E-MapReduce (EMR), Cloudera's Distribution including Apache Hadoop (CDH), Hologres, and AnalyticDB.
NoteIf a network connectivity issue occurs when you run an SQL query task, contact the owner of the data source to check the connection between the resource group and the data source. You can also resolve the issue by referring to Appendix: IP address whitelist for DataAnalysis. If the issue persists, contact technical support.
The SQL query feature does not allow you to return to the earlier version from November 14, 2022.
FAQ about the upgraded SQL query feature
Q: How do I add a data source on the SQL Query page?
A: For information about how to add a data source, refer to the following animation. The upgraded SQL query feature allows you to add a data source for each SQL file.
Q: Why is the number of available data sources reduced?
A: The upgraded SQL query feature supports only the data sources on which the tenant administrator or tenant security administrator can grant permissions to your account in Security Center. If you do not have query permissions on specific data sources, you can contact the administrator to grant your account the query permissions on the data sources in Security Center.
Default authorization logic of data sources:
After you associate a MaxCompute or Hologres compute engine with a workspace in standard mode on the Workspace page, the system assigns the Data Analyst role to your account. Then, you can use the related MaxCompute or Hologres data source in the development environment on the SQL Query page.
To use data sources in a workspace in standard mode that is not associated with a MaxCompute or Hologres compute engine or data sources in a workspace in basic mode, you must be manually authorized by the tenant administrator or tenant security administrator in Security Center.
Q: Why do I have a large number of SQL files in the My Files directory?
A: The upgraded SQL query feature allows you to manage all SQL files that you created for different workspaces and data sources in a centralized manner. The SQL files that you created for different workspaces and data sources by using the original SQL query feature are displayed in the My Files directory.
Q: Why is the All Tables directory that is provided by the original SQL query feature unavailable?
A: The upgraded SQL query feature allows you to add tables in the data sources on which you have query permissions as a directory. If a directory contains a large number of tables, you can search for a table by using specific conditions. For frequently used tables, you can add the tables to My Favorites and view them in My Favorites.
Q: The System Management page no longer provides the parameters that can be configured to view and download SQL query results or the parameter that can be used to specify the maximum number of SQL query result records that can be downloaded. How do I configure the settings that are related to SQL query results?
A: To configure the settings for downloading, replicating, and viewing SQL query result records, perform the following operations: Go to Security Center. In the top navigation bar, click the Security policy tab. In the left-side navigation pane, click Data query and analysis control. On the Data query and analysis control tab, click Query result control. Then, find the desired policy and click Edit in the Operation column.
Prerequisites
A compute engine is associated with your workspace. For more information, see Create and manage workspaces.
A data source is added to your workspace. Data sources are classified into data sources that are automatically generated when you associate compute engines with a workspace and data sources that are added to DataWorks on the Data Source page. For more information, see Add and manage data sources.
You are granted query permissions on specific data sources in the DataAnalysis service.
Your account is added as a member of the workspace, and the Data Analyst, Model Developer, Development, O&M, Workspace Manager, or Project Owner role is assigned to the member. For more information, see Add workspace members and assign roles to them.
NoteYou can use the SQL query feature to query only the data of the data sources in a workspace on which you have query permissions. Before you can query data by using the feature, you must contact Workspace Manager to add your account to the workspace as a member and assign one of the preceding roles to your account.
Precautions
Data source permissions
For a workspace in standard mode, you can be authorized to use only the data sources in the development environment. For a workspace in basic mode, you can be authorized by the administrator to use any data sources.
NoteFor information about the data sources on which the administrator can grant permissions to your account, see supported data source types.
For information about how to request permissions on data sources, see Security Center.
For information about workspace modes, see Differences between workspaces in basic mode and workspaces in standard mode.
Data source query
When you use the SQL query feature to query data of the MaxCompute data source that is automatically generated when you associate a MaxCompute compute engine with your workspace, your logon account is automatically used to access data of the MaxCompute data source. If you want to access data of tables in the production environment, you must specify the name of the project to which the tables belong in SQL commands. If you do not have permissions to query the data of the tables in the production environment, go to Security Center to request the permissions.
Whitelist-based access control for data sources
If whitelist-based access control is enabled for a MaxCompute project, errors may occur when you use features provided by DataAnalysis, such as fee calculation, data download, dimension table usage, and data upload, to perform operations on tables in the MaxCompute project. To ensure that DataAnalysis can access the MaxCompute project, you must add the IP address information about DataAnalysis in a specific region to the IP address whitelist of the MaxCompute project in advance. For more information, see Appendix: IP address whitelist for DataAnalysis.
Billing
Features
Feature | Description |
You can add the recommended directories or add tables in the data sources on which you have query permissions as a directory. After you add tables as a directory, you can view the SQL files or data tables in the directory. You can also view the schema of a table and perform simple operations to generate SQL statements for the table. | |
You can create an SQL query file, write SQL query statements, and then commit the SQL query statements to a specific data source for execution. | |
After you write SQL query statements, you can execute the SQL query statements to query the data of the data source. | |
After the SQL query statements are executed, you can view the run logs, running results, and SQL statements that correspond to the query results in the Result section. You can also view the query results in line charts, column charts, bar charts, pie charts, and cross tables. | |
You can manage SQL query files in a centralized manner. |
Go to the SQL Query page
You can use one of the following methods to go to the SQL Query page:
On the homepage of DataAnalysis, click the SQL Query card in the Shortcuts section to go to the SQL Query page.
In the left-side navigation pane of the DataAnalysis page, click SQL Query to go to the SQL Query page.
Add a directory
By default, the SQL Query page displays only the My Files and Other People's Files directories. You can click the icon to the right of the search box to add directories based on your business requirements. After you add a directory, you can view SQL files or data tables in the directory. You can also view the schema of a table and perform simple operations to generate SQL statements for the table.
You can add different types of directories on the following tabs:
Recommended Directory: This tab displays the recommended directories that you can add, such as My Favorites, My MaxCompute tables, and Public Tables.
DataMap - Metadata: On this tab, you can add data tables in the data sources on which you have query permissions as a directory.
MaxCompute: The DataMap - Metadata tab displays all MaxCompute projects on the Compute Engine Information tab of the Workspace page. For more information, see Create and manage workspaces.
Other compute engines: The DataMap - Metadata tab displays all data sources on the Data Source page. For more information, see Add and manage data sources.
DataMap - Data Album: On this tab, you can add the data albums that you manage, create, and follow as directories. For more information, see Table management from the business perspective: Data albums.
Others People's Files: stores the SQL query files that are saved by other members. The files can be viewed by all members of the current workspace.
You can add up to eight directories.
Create an SQL query file
You can create SQL query files based on the workspace mode and the compute engine type.
Create an SQL query file.
You can create an SQL query file in the following scenarios:
Scenario 1: An existing SQL query file is opened
Scenario 2: You go to the SQL Query page for the first time or no SQL query file is displayed on the current page
Scenario 3: General mode
Click the
icon to the right of the name of the existing file to create a new SQL query file.
Click Create SQL Query in the right-side area to create an SQL query file.
In the left-side navigation pane of the SQL Query page, move the pointer over the
icon to the right of My Files and select Create File.
Select a data source for the SQL query file.
Select the desired workspace, the compute engine type, and the data source whose data you want to query. You can query the data of data sources of compute engine types such as MaxCompute, EMR, CDH, Hologres, and AnalyticDB.
NoteYou can select only the data sources on which you have query permissions. If no data source of a specific compute engine type is available, contact Workspace Manager to add your account to the workspace as a member and assign the Data Analyst, Model Developer, Development, O&M, Workspace Manager, or Project Owner role to your account.
Execute SQL query statements
In the SQL editor, write the code that you want to use to query the data of the data source. Then, click the Run icon in the toolbar. You can view the query results in the Result section.
The SQL query feature allows you to perform the following operations:
Configure parameters: The SQL query feature supports SQL statements that contain parameters. The system can read variables in the ${Variable name} format. You need to only assign values to the variables based on your business requirements before you execute the statements in the SQL editor.
Execute SQL statements: You can click the Run icon to the left of an SQL statement to execute the statement. You can also select multiple SQL statements and click the
icon to execute all the selected statements.
Estimate costs: When you execute a single SQL statement, the Estimated Costs dialog box appears and displays the estimated costs. You can also select Estimate Costs from the More drop-down list to view the estimated costs before you execute all SQL statements that you write.
NoteYou can estimate the costs of executing SQL statements only if the compute engine type that you select is MaxCompute.
Others: You can click the Format icon to standardize the SQL statements that you write. In the Result and Log sections, you can view the execution result and the run log.
Perform operations on the query results
After the SQL statements are executed, you can perform the operations that are described in the following table on the query results.
Operation | Screenshot | Description |
View the query results | ![]() | You can click Export to export the query results by using one of the following methods:
|
Visualize the query results | ![]() | You can view the query results in different types of charts, such as line charts, column charts, bar charts, and pie charts. You can also save the query results as data cards to build your own visual knowledge base. You can select a data card to quickly create data reports and create personalized products with ease based on your business requirements. |
View the SQL statements that correspond to the query result | ![]() | You can view the SQL statements that correspond to the query result. You can also copy and paste the SQL statements to the code editor. |
Manage SQL query files
On the SQL Query page, you can perform the following operations on SQL query files.
Item | Description |
Search for a table | In the area that is marked with 1, you can enter the table name in the search box and click the |
View the running history of SQL statements | In the area that is marked with 1, you can click the |
Add a directory | In the area that is marked with 1, you can click the |
My Files | In the area that is marked with 2, the My Files directory stores the SQL query files that you saved. You can share an SQL query file to a specific workspace.
|
Other People's Files | In the area that is marked with 2, the Other People's Files directory stores the SQL query files that are saved by other members. The files can be viewed by all members of the current workspace. |
Public Tables | In the area that is marked with 2, you can click the |
My MaxCompute tables | In the area that is marked with 2, you can click the |
My Favorites | In the area that is marked with 2, you can click the |
Custom settings
You can click the icon in the lower-left corner of the SQL Query page to go to the Settings tab and customize the settings:
Theme: In this section, you can specify the theme for the SQL query.
Settings for Code Editor: In this section, you can specify the code style and code hints, and specify whether to allow the system to enter the prompted suggestions on code writing when you press Enter.
Settings for SQL-related Operations: In this section, you can specify the default mode in which each SQL statement is executed.