All Products
Search
Document Center

DataWorks:SQL query

Last Updated:Aug 28, 2023

DataWorks DataAnalysis provides the SQL query feature. This feature allows you to query and analyze the data of data sources on which you have query permissions. This topic describes how to use the SQL query feature.

Background information

  • The SQL query feature of DataWorks DataAnalysis is not available for PostgreSQL, SQL Server, Oracle, or PolarDB for Xscale (PolarDB-X) from October 13, 2022. However, this feature is available for big data engines, such as MaxCompute, E-MapReduce (EMR), Cloudera's Distribution including Apache Hadoop (CDH), Hologres, and AnalyticDB.

    Note

    If a network connectivity issue occurs when you run an SQL query task, contact the owner of the data source to check the connection between the resource group and the data source. You can also resolve the issue by referring to Appendix: IP address whitelist for DataAnalysis. If the issue persists, contact technical support.

  • The SQL query feature does not allow you to return to the earlier version from November 14, 2022.

FAQ about the upgraded SQL query feature

  • Q: How do I add a data source on the SQL Query page?

    A: For information about how to add a data source, refer to the following animation. The upgraded SQL query feature allows you to add a data source for each SQL file.

  • Q: Why is the number of available data sources reduced?

    A: The upgraded SQL query feature supports only the data sources on which the tenant administrator or tenant security administrator can grant permissions to your account in Security Center. If you do not have query permissions on specific data sources, you can contact the administrator to grant your account the query permissions on the data sources in Security Center.

    Default authorization logic of data sources:

    • After you associate a MaxCompute or Hologres compute engine with a workspace in standard mode on the Workspace page, the system assigns the Data Analyst role to your account. Then, you can use the related MaxCompute or Hologres data source in the development environment on the SQL Query page.

    • To use data sources in a workspace in standard mode that is not associated with a MaxCompute or Hologres compute engine or data sources in a workspace in basic mode, you must be manually authorized by the tenant administrator or tenant security administrator in Security Center.

  • Q: Why do I have a large number of SQL files in the My Files directory?

    A: The upgraded SQL query feature allows you to manage all SQL files that you created for different workspaces and data sources in a centralized manner. The SQL files that you created for different workspaces and data sources by using the original SQL query feature are displayed in the My Files directory.

  • Q: Why is the All Tables directory that is provided by the original SQL query feature unavailable?

    A: The upgraded SQL query feature allows you to add tables in the data sources on which you have query permissions as a directory. If a directory contains a large number of tables, you can search for a table by using specific conditions. For frequently used tables, you can add the tables to My Favorites and view them in My Favorites.

  • Q: The System Management page no longer provides the parameters that can be configured to view and download SQL query results or the parameter that can be used to specify the maximum number of SQL query result records that can be downloaded. How do I configure the settings that are related to SQL query results?

    A: To configure the settings for downloading, replicating, and viewing SQL query result records, perform the following operations: Go to Security Center. In the top navigation bar, click the Security policy tab. In the left-side navigation pane, click Data query and analysis control. On the Data query and analysis control tab, click Query result control. Then, find the desired policy and click Edit in the Operation column.

Prerequisites

  • A compute engine is associated with your workspace. For more information, see Create and manage workspaces.

  • A data source is added to your workspace. Data sources are classified into data sources that are automatically generated when you associate compute engines with a workspace and data sources that are added to DataWorks on the Data Source page. For more information, see Add and manage data sources.

  • You are granted query permissions on specific data sources in the DataAnalysis service.

  • Your account is added as a member of the workspace, and the Data Analyst, Model Developer, Development, O&M, Workspace Manager, or Project Owner role is assigned to the member. For more information, see Add workspace members and assign roles to them.

    Note

    You can use the SQL query feature to query only the data of the data sources in a workspace on which you have query permissions. Before you can query data by using the feature, you must contact Workspace Manager to add your account to the workspace as a member and assign one of the preceding roles to your account.

Precautions

  • Data source permissions

    For a workspace in standard mode, you can be authorized to use only the data sources in the development environment. For a workspace in basic mode, you can be authorized by the administrator to use any data sources.

    Note
  • Data source query

    When you use the SQL query feature to query data of the MaxCompute data source that is automatically generated when you associate a MaxCompute compute engine with your workspace, your logon account is automatically used to access data of the MaxCompute data source. If you want to access data of tables in the production environment, you must specify the name of the project to which the tables belong in SQL commands. If you do not have permissions to query the data of the tables in the production environment, go to Security Center to request the permissions.

  • Whitelist-based access control for data sources

    If whitelist-based access control is enabled for a MaxCompute project, errors may occur when you use features provided by DataAnalysis, such as fee calculation, data download, dimension table usage, and data upload, to perform operations on tables in the MaxCompute project. To ensure that DataAnalysis can access the MaxCompute project, you must add the IP address information about DataAnalysis in a specific region to the IP address whitelist of the MaxCompute project in advance. For more information, see Appendix: IP address whitelist for DataAnalysis.

Billing

Features

Feature

Description

Add a directory

You can add the recommended directories or add tables in the data sources on which you have query permissions as a directory. After you add tables as a directory, you can view the SQL files or data tables in the directory. You can also view the schema of a table and perform simple operations to generate SQL statements for the table.

Create an SQL query file

You can create an SQL query file, write SQL query statements, and then commit the SQL query statements to a specific data source for execution.

Execute SQL query statements

After you write SQL query statements, you can execute the SQL query statements to query the data of the data source.

Perform operations on the query results

After the SQL query statements are executed, you can view the run logs, running results, and SQL statements that correspond to the query results in the Result section. You can also view the query results in line charts, column charts, bar charts, pie charts, and cross tables.

Manage SQL query files

You can manage SQL query files in a centralized manner.

Go to the SQL Query page

You can use one of the following methods to go to the SQL Query page:

  • On the homepage of DataAnalysis, click the SQL Query card in the Shortcuts section to go to the SQL Query page.

  • In the left-side navigation pane of the DataAnalysis page, click SQL Query to go to the SQL Query page.

Add a directory

By default, the SQL Query page displays only the My Files and Other People's Files directories. You can click the 数据集 icon to the right of the search box to add directories based on your business requirements. After you add a directory, you can view SQL files or data tables in the directory. You can also view the schema of a table and perform simple operations to generate SQL statements for the table.

You can add different types of directories on the following tabs:

  • Recommended Directory: This tab displays the recommended directories that you can add, such as My Favorites, My MaxCompute tables, and Public Tables.

  • DataMap - Metadata: On this tab, you can add data tables in the data sources on which you have query permissions as a directory.

    • MaxCompute: The DataMap - Metadata tab displays all MaxCompute projects on the Compute Engine Information tab of the Workspace page. For more information, see Create and manage workspaces.

    • Other compute engines: The DataMap - Metadata tab displays all data sources on the Data Source page. For more information, see Add and manage data sources.

  • DataMap - Data Album: On this tab, you can add the data albums that you manage, create, and follow as directories. For more information, see Table management from the business perspective: Data albums.

Note
  • Others People's Files: stores the SQL query files that are saved by other members. The files can be viewed by all members of the current workspace.

  • You can add up to eight directories.

Create an SQL query file

You can create SQL query files based on the workspace mode and the compute engine type.

  1. Create an SQL query file.

    You can create an SQL query file in the following scenarios:

    Scenario 1: An existing SQL query file is opened

    Scenario 2: You go to the SQL Query page for the first time or no SQL query file is displayed on the current page

    Scenario 3: General mode

    新增文件Click the 图标 icon to the right of the name of the existing file to create a new SQL query file.

    sql查询Click Create SQL Query in the right-side area to create an SQL query file.

    左侧导航In the left-side navigation pane of the SQL Query page, move the pointer over the 添加 icon to the right of My Files and select Create File.

  2. Select a data source for the SQL query file.

    Select the desired workspace, the compute engine type, and the data source whose data you want to query. You can query the data of data sources of compute engine types such as MaxCompute, EMR, CDH, Hologres, and AnalyticDB.选择数据源

    Note

    You can select only the data sources on which you have query permissions. If no data source of a specific compute engine type is available, contact Workspace Manager to add your account to the workspace as a member and assign the Data Analyst, Model Developer, Development, O&M, Workspace Manager, or Project Owner role to your account.

Execute SQL query statements

In the SQL editor, write the code that you want to use to query the data of the data source. Then, click the Run icon in the toolbar. You can view the query results in the Result section.

SQL查询The SQL query feature allows you to perform the following operations:

  • Configure parameters: The SQL query feature supports SQL statements that contain parameters. The system can read variables in the ${Variable name} format. You need to only assign values to the variables based on your business requirements before you execute the statements in the SQL editor.

  • Execute SQL statements: You can click the Run icon to the left of an SQL statement to execute the statement. You can also select multiple SQL statements and click the 运行 icon to execute all the selected statements.

  • Estimate costs: When you execute a single SQL statement, the Estimated Costs dialog box appears and displays the estimated costs. You can also select Estimate Costs from the More drop-down list to view the estimated costs before you execute all SQL statements that you write.

    Note

    You can estimate the costs of executing SQL statements only if the compute engine type that you select is MaxCompute.

  • Others: You can click the Format icon to standardize the SQL statements that you write. In the Result and Log sections, you can view the execution result and the run log.

Perform operations on the query results

After the SQL statements are executed, you can perform the operations that are described in the following table on the query results.

Operation

Screenshot

Description

View the query results

image.png

You can click Export to export the query results by using one of the following methods:

  • Local File: The query results are downloaded to your on-premises machine. You can perform the download operation without approval.

    Note
    • Only DataWorks advanced editions allow you to export query results to your on-premises machine.

    • The maximum number of SQL query result records that can be downloaded to your on-premises machine varies based on the DataWorks advanced edition. DataWorks Standard Edition: 200,000. DataWorks Professional Edition: 2,000,000. DataWorks Enterprise Edition: 5,000,000. The file size cannot exceed 1 GB. The tenant administrator, tenant security administrator, and RAM users who are assigned the Workspace Manager role can go to the Data query and analysis control tab of Security Center to specify the upper limit for the number of SQL query result records that can be viewed, the number of SQL query result records that can be downloaded, and whether to allow users to download data.

    • Only the MaxCompute and EMR compute engines allow you to export query results to your on-premises machine. If the project data protection feature is enabled for a MaxCompute project, the query results will fail to be downloaded. For more information about the project data protection feature, see Project data protection.

  • MaxCompute table: The query results are saved as an online MaxCompute table. You do not need to download the query results to your on-premises machine and then upload the results. You can specify the lifecycle of a table based on your business requirements.

  • Spreadsheet: You can synchronize the query results to a workbook and perform more operations on the query results. For more information, see Analyze data.

  • Spreadsheet and share: You can synchronize the query results to a workbook and share the results to specific users. For more information, see Share a workbook.

Visualize the query results

image.png

You can view the query results in different types of charts, such as line charts, column charts, bar charts, and pie charts. You can also save the query results as data cards to build your own visual knowledge base. You can select a data card to quickly create data reports and create personalized products with ease based on your business requirements.

View the SQL statements that correspond to the query result

sql

You can view the SQL statements that correspond to the query result. You can also copy and paste the SQL statements to the code editor.

Manage SQL query files

On the SQL Query page, you can perform the following operations on SQL query files.SQL查询

Item

Description

Search for a table

In the area that is marked with 1, you can enter the table name in the search box and click the 搜索 icon to search for a table.

View the running history of SQL statements

In the area that is marked with 1, you can click the 历史 icon to view the running history of SQL statements.

Add a directory

In the area that is marked with 1, you can click the 数据集 icon to add a directory. You can add the recommended directories or add tables in the data sources on which you have query permissions as a directory. After you add tables as a directory, you can view the SQL files or data tables in the directory. You can also view the schema of a table and perform simple operations to generate SQL statements for the table.

My Files

In the area that is marked with 2, the My Files directory stores the SQL query files that you saved. You can share an SQL query file to a specific workspace.

  • 分享: indicates that the file is being shared. All members of the workspace to which the file is shared can view the file in the Other People's Files directory. You can right-click a file and select Share to share the file.

  • 计划任务: indicates that the file is running. You can click the file name to go to the configuration tab of the file to view the historical execution details of the file.

Other People's Files

In the area that is marked with 2, the Other People's Files directory stores the SQL query files that are saved by other members. The files can be viewed by all members of the current workspace.

Public Tables

In the area that is marked with 2, you can click the 数据集 icon to add the Public Tables directory. The Public Tables directory stores the tables that are provided by DataAnalysis. You can use the tables to experience the SQL query feature without the need to apply for required permissions.

My MaxCompute tables

In the area that is marked with 2, you can click the 数据集 icon to add the My MaxCompute tables directory. The My MaxCompute tables directory stores the MaxCompute tables within your account, including tables in the development and production environments.

My Favorites

In the area that is marked with 2, you can click the 数据集 icon to add the My Favorites directory. The My Favorites directory stores the tables that are added to favorites. To add a table to your favorites, right-click the table and select Add to Favorites.

Custom settings

You can click the 设置 icon in the lower-left corner of the SQL Query page to go to the Settings tab and customize the settings:

  • Theme: In this section, you can specify the theme for the SQL query.

  • Settings for Code Editor: In this section, you can specify the code style and code hints, and specify whether to allow the system to enter the prompted suggestions on code writing when you press Enter.

  • Settings for SQL-related Operations: In this section, you can specify the default mode in which each SQL statement is executed.