You can use ad hoc queries to customize and run query statements and then download the results. For example, after you develop a job, you can use an ad hoc query to verify that the job runs as expected. This topic describes how to create an ad hoc query and download the query results.
Background information
Ad hoc query jobs use the compute engine that is configured for your current project. If you use a Hadoop compute engine, you can enable Impala jobs to create both Hive SQL and Impala SQL ad hoc query jobs. Because Impala performs in-memory computing, Impala SQL ad hoc queries have a faster response time than Hive SQL ad hoc queries. Dataphin also enables you to quickly switch between Impala SQL and Hive SQL job types without rewriting queries for historical jobs. For more information, see Appendix: Switch job types.
Prerequisites
To create an Impala SQL ad hoc query, you must enable Impala jobs in the Hadoop compute engine. For more information, see Create a Hadoop compute engine.
To download the results of an ad hoc query, ensure that you have the permission to download data from the project. Also, ensure that both full and sample data downloads are enabled for the project. For more information, see Manage permissions and compute engines for a project and Configure data download.
Limits
Query acceleration is not supported for ad hoc queries if you have not purchased the query acceleration feature.
Create an ad hoc query
On the Dataphin homepage, in the top menu bar, choose Develop > Data Development.
In the top menu bar, select a Project. If you are using Dev-Prod mode, also select an Environment.
In the navigation pane on the left, choose Ad-hoc Query. In the ad hoc query list, click the create icon and select Compute Engine Ad-hoc Query or Database SQL Ad-hoc Query.
In the Create Ad-hoc Query dialog box, configure the following parameters.
Note: If your compute engine is Hadoop and you have enabled Impala jobs, you can create both Hive SQL and Impala SQL ad hoc queries.
Parameter
Description
Name
Enter a name for the ad hoc query.
The name can be up to 256 characters long and cannot contain vertical bars (|), forward slashes (/), backslashes (\), colons (:), question marks (?), angle brackets (<>), asterisks (*), or double quotation marks (").
Description
Enter a brief description of the ad hoc query.
Select Directory
Select the directory where the job is stored. The default directory is Temporary Code.
If no directory is created, create a new folder as follows:
Above the list of computing jobs on the left side of the page, click the create folder icon to open the Create Folder dialog box.
In the Create Folder dialog box, enter a Name for the folder and select a location in the Select Directory field as needed.
Click OK.
Data source type
Select the data source type for the SQL job. For more information about the supported data source types, see the Offline Development - Database SQL column in Data sources supported by Dataphin.
Note: This parameter is available only when you create a Database SQL Ad-hoc Query.
Datasource
Select the data source for the ad hoc query. If no data source is available, click + New Data Source to create one.
Note: This parameter is available only when you create a Database SQL Ad-hoc Query.
Catalog
If the data source type is Presto or Trino, you must also configure the Catalog after you configure the data source.
Database/Schema
If the data source type is MySQL, PostgreSQL, AnalyticDB for PostgreSQL, Oracle, Presto, GaussDB (DWS), Microsoft SQL Server, ClickHouse, Hologres, Doris, openGauss, StarRocks, DM, OceanBase (Oracle Tenant Mode), SelectDB, Trino, or PolarDB-X 2.0, you must also configure the Schema after you configure the data source.
Click OK to create the ad hoc query file.
(Optional) Configure and enable query acceleration, and then select an acceleration method.
The supported acceleration methods vary depending on the compute engine and scenario.
When the current tenant uses a MaxCompute compute engine:
If you have not purchased query acceleration but have enabled query acceleration on the development platform, the acceleration method is MCQA.
If you have purchased query acceleration, enabled query acceleration on the development platform, and the compute engine for the current project is not attached to an acceleration source, the acceleration method is MCQA.
If you have purchased query acceleration and the compute engine for the project is attached to an acceleration source, you can select either Acceleration Source or MCQA as the acceleration method.
When the current tenant uses a Hadoop compute engine: The acceleration method is Acceleration Source for Hive SQL, Impala SQL, and Spark SQL ad hoc query jobs if you have purchased query acceleration and the cluster for the project's compute engine is attached to an acceleration source.
MCQA: This method uses MaxCompute Query Acceleration (MCQA). Each tenant has limits on the number of MCQA jobs and on concurrency, which may cause acceleration to fail or execution errors to occur. For more information, see Query Acceleration (MCQA). You can disable MCQA acceleration in Management Center > System Settings > Development Platform > Query Acceleration.
Write the query code in the SQL dialect that matches the project's compute engine (for a compute engine ad hoc query) or the SQL type of the selected data source (for a database SQL ad hoc query).
After you write the code, click Run at the top of the editor.
After the query statement is successfully executed, view the results in the Console.
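For example, a minimal Hive SQL ad hoc query might look like the following sketch. The table and column names (`sales_order`, `order_date`, `amount`) are hypothetical placeholders, not objects that exist in your project:

```sql
-- Count orders and total revenue per day for the last 7 days.
-- Table and column names are placeholders; replace them with your own.
SELECT  order_date
        ,COUNT(*)    AS order_cnt
        ,SUM(amount) AS total_amount
FROM    sales_order
WHERE   order_date >= DATE_SUB(CURRENT_DATE(), 7)
GROUP BY order_date
ORDER BY order_date;
```

After you click Run, the per-day counts and totals appear in the Console, where you can inspect them before downloading.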
Download the result data
After the query statement is successfully executed, click the download icon in the upper-right corner of the console to download the results.
In the Data Download dialog box, select a Download Data Range and a Download Data Format.
Download Data Range: You can download the full data or the sample data.
Full data download: Downloading the full dataset may take a long time. You can monitor the download progress in the Message Center or by viewing the run logs.
Important: A full data download creates a temporary table based on the query statement. For Spark SQL ad hoc queries on a Hadoop compute engine, the temporary table for a full data download is created using the default storage format specified in Management Center > System Settings > Development Platform > Table Management Settings. For example, if the default storage format in the table management settings is hudi, the `using hudi` clause is added when the temporary table is created. If the default storage format is Engine Default, the storage format is not specified when the temporary table is created.
Sample data download: By default, all query statements return only a subset of the data. You can configure the number of records to return in Management Center > Standard Settings > Data Download. For more information, see Configure data download.
Download Data Format: If you select Sample Data as the download range, you can select CSV or Excel as the download format. If you select Full Data, you can only select CSV.
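To illustrate the temporary-table note above, the statement that is generated for a full data download of a Spark SQL ad hoc query might resemble the following sketch. The table name and source query are hypothetical, and the actual generated statement may differ:

```sql
-- Hypothetical temporary table for a full data download when the
-- default storage format in the table management settings is hudi:
CREATE TABLE tmp_download_result USING hudi AS
SELECT * FROM sales_order;  -- your ad hoc query goes here

-- With Engine Default, no storage format clause is emitted:
CREATE TABLE tmp_download_result AS
SELECT * FROM sales_order;
```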
Click OK to start the download.
If you select Full Data as the download range, click OK to start data preparation. After the data is prepared, click OK again to start the full data download.
If download approval is enabled, you must submit a data download request for approval. After the request is approved, you can download the data by viewing the approval task, checking the current run results, or accessing the run logs. For more information, see Configure data download approval.
If the watermark feature is enabled for data download approvals, a watermark is automatically added to the downloaded Excel files. The downloaded data file is retained for 30 days from the query date. For example, if the data is queried on April 12, 2023, and the download request is made on April 13, 2023, the file expires on May 12, 2023. For more information, see Configure data download approval.
Appendix: Switch job types
This section applies if the project uses a Hadoop compute engine as its offline engine and Impala jobs are enabled. You can switch between Impala SQL and Hive SQL job types.
In the ad hoc query list, find the Impala SQL or Hive SQL job that you want to modify, click the more icon next to the job name, and then choose Modify Type.
In the Modify Type dialog box, select the new type. The following figure shows an example of switching from an Impala SQL job to a Hive SQL job.

Click OK to switch the job type.
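Switching is smoothest when the query uses only syntax that Hive SQL and Impala SQL share. For example, a standard aggregation query such as the following (the table name `user_visit` is a hypothetical placeholder) runs unchanged under either job type, whereas Hive-specific constructs such as LATERAL VIEW with explode() are not supported by Impala and would need rewriting:

```sql
-- Aggregation syntax common to Hive SQL and Impala SQL.
-- The table name user_visit is a placeholder.
SELECT  city
        ,COUNT(DISTINCT user_id) AS uv
FROM    user_visit
GROUP BY city
HAVING  COUNT(DISTINCT user_id) > 100;
```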