DataAnalysis provides the SQL query feature that allows you to query and analyze the data of the added data sources. This topic describes how to use the SQL query feature.

Prerequisites

  • A compute engine is associated with your workspace. For more information, see Configure a workspace
    DataWorks allows users of a workspace to use the SQL query feature to query only the data in the workspace.
    Note An Alibaba Cloud account or a RAM user can be added as a user to different workspaces. To query the data in a specific workspace, contact a relevant workspace administrator to add your account to the workspace.
  • A data source is added to the workspace. For more information, see Manage connections.

Go to the SQL Query page

You can use one of the following methods to go to the SQL Query page.

  1. Method 1: Go to the SQL Query page from the Homepage page of DataWorks.
    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where the required workspace resides, find the workspace, and then click Homepage in the Actions column.
    4. On the page that appears, click the Data Analysis Scene tab. On the Data Analysis Scene tab, click Start Now in the DataAnalysis section.
    5. In the top navigation bar, click SQL Query.
  2. Method 2: Go to the SQL Query page from the Overview page of the DataWorks console.
    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, click Overview.
    3. Click the tab for the region where the required workspace resides and click DataAnalysis.
    4. In the top navigation bar, click SQL Query.
  3. Method 3: Go to the SQL Query page from the DataStudio page.
    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where the required workspace resides, find the workspace, and then click Data Analytics in the Actions column.
    4. On the DataStudio page, click the More icon icon in the upper-left corner and choose All Products > Data Development > DataAnalysis.
    5. In the top navigation bar, click SQL Query.

Procedure

  1. Select the workspace or data source where the data to be queries resides.
    Select the compute engine type involved from the first drop-down list, and select the required workspace or data source from the second drop-down list.
    Note The following types of compute engines are supported: MaxCompute, MySQL, Hologres, CDH hive, and EMR Impala. If a compute engine of one of the preceding types is associated with your workspace, you can use the SQL query feature to query and analyze data of the data sources that are added to the workspace. For information about how to view the added data sources, see Manage connections.
    Select a data source or a workspace
  2. Create an SQL query task. You can use one of the following methods to create a task.
    1. If you go to the SQL Query page for the first time, click New SQL Query in the right-side area.
      New SQL Query
    2. In the left-side pane of the SQL Query page, right-click My Files and select New file. In the Save dialog box, set the Name, Save, and Is it public parameters. By default, the value of the Name parameter is in the format of Username_Date and time, and the Save parameter is set to My Files. Valid values of the Is it public parameter:
      • Only visible to oneself: The SQL query task is visible only to the current account.
      • Current workspace visible: The SQL query task is visible to all users of the current workspace, but other users cannot modify the query task.
      My Files
    3. If the tab for an SQL query task is opened, click the Plus icon icon next to the task name to create another SQL query task.
  3. Write SQL query statements.
    1. You can use one of the following methods to write SQL query statements:
      • In the left-side pane of the SQL Query page, right-click My Files and select New file. In the tab that appears, write SQL query statements in the editor.
      • In the left-side pane of the SQL Query page, find the created SQL query task under My Files. You can also view the tasks that are shared by other users under Other People Files. Double-click the name of an SQL query task, statements saved for the task appear.
      Note
      • The My Files folder contains the SQL query tasks that are visible only to the current account.
      • The Other Peoples Files folder contains the SQL query tasks that are saved by other users and visible to all users of the current workspace.
      • The Public Tables folder contains the tables that all DataWorks users have the permissions to access. If you do not have the permissions to access tables, you can query data in tables in this folder to experience the SQL query feature.
    2. Click the Run icon icon in the toolbar. You can view the query results in the Results section. You can also sort and filter the retrieved data by field type, and copy specific data records.
      SQL query results
      No. Element Description
      1 SQL editor You can write SQL statements in the SQL editor.
      2 Run You can click Run icon to execute all statements in the SQL editor. The query results appear in the lower part. You can also select an SQL statement to be executed. In this case, the system executes only this statement.
      3 Stop You can click the Stop icon to stop executing SQL statements.
      4 Save You can click the Save icon to save the SQL statements that you write. After you click Save, the Save dialog box appears. You can set the Name parameter to customize the query task name. The default value of the Name parameter is in the format of Username_Date and time. You can set the Save parameter to My Files or a subfolder under My Files You can also specify whether to make the SQL query task public. Valid values of the Is it public parameter:
      • Only visible to oneself: The SQL query task is visible only to the current account.
      • Current workspace visible: The SQL query task is visible to all users of the current workspace, but other users cannot modify the query task.
      5 Format You can click the Format icon to standardize the SQL statements that you write.
      6 Log You can view the execution status of SQL statements. For example, you can view the execution duration and error messages.
      7 Results You can view the data of the queried table after the SQL statements are executed.
  4. Analyze the retrieved data.
    • After the data is retrieved, you can click Analyze in Workbook in the lower-right corner to synchronize the data to a workbook. This way, you can perform data pivoting and detection. For more information, see Analyze data.
    • After the data is retrieved, you can also click Download Data in the lower-right corner to download the data to your on-premises machine for subsequent data analysis. In the Download Result dialog box, set the following parameters:
      • Scope: the amount of data that you want to download. Valid values: Data Displayed in Spreadsheet: A maximum of 10,000 data records are supported. Full Data: A maximum of 200,000 data records are supported.
      • Format: the format in which the data is downloaded. The CSV, EXCEL, and TXT formats are supported.

Manage SQL query tasks

On the SQL Query page, you can search for the created SQL query tasks and perform operations on the tasks. For example, you can rename, delete, copy a task. Manage SQL query tasks
Section Description
1 In this section, you can enter a task name, folder name, or table name to search for the task.
2 In this section, you can perform the following operations:
  • Create a subfolder: To create a subfolder, right-click My Files and select New folder. In the New folder dialog box, enter the subfolder name and select My Files or an existing subfolder under My Files.
  • Create a task: To create a task, right-click My Files and select New file. In the Save dialog box, set the Name, Save, and Is it public parameters. By default, the value of the Name parameter is in the format of Username_Date and time. You can set the Save parameter to My Files or a subfolder under My Files. You can also specify whether to make the task public. Valid values of the Is it public parameter:
    • Only visible to oneself: The SQL query task is visible only to the current account.
    • Current workspace visible: The SQL query task is visible to all users of the current workspace, but other users cannot modify the query task.
You can also perform the following operations on the created tasks:
  • Rename: To rename an SQL query task, right-click the task and select Rename. In the Rename dialog box, enter the updated name.
  • Open: To edit the SQL statements in an SQL query task, right-click the task and select Open. In the editor that appears, edit the SQL statements.
  • Delete: To delete an SQL query task, right-click the task and select Delete.
  • Copy to clipboard: To copy the SQL statements in an SQL query task, right-click the task and select Copy to clipboard.