Dataphin: Create HIVE_SQL task

Last Updated: Jan 21, 2025

This topic outlines the steps to create an offline computing task using HIVE_SQL on Dataphin.

Background information

HIVE_SQL tasks are ideal for processing existing data to produce results tailored to specific business requirements.

Procedure

  1. On the Dataphin home page, navigate to the top menu bar and select Development > Data Development.

  2. In the top menu bar of the Development page, select Project. In Dev-Prod mode, you must also select an environment.

  3. In the left-side navigation pane, choose Data Processing > Script Task. In the Script Task list, click the image icon and select HIVE_SQL.

  4. In the New HIVE_SQL Task dialog box, configure the following parameters:

    Parameter

    Description

    Task Name

    Enter a name for the offline computing task. The name can be up to 256 characters long and cannot contain vertical bars (|), forward slashes (/), backslashes (\), colons (:), question marks (?), angle brackets (<>), asterisks (*), or quotation marks (").

    Schedule Type

    Choose the task's schedule type. Options for Schedule Type include:

    • Recurring Task: Automatically included in the system's periodic scheduling.

    • One-Time Task: Requires manual initiation.

    Select Directory

    Choose the directory in which to store the task. If no directory exists, create a folder as follows:

    1. Click the image icon above the task list on the left to open the Create Folder dialog box.

    2. In the Create Folder dialog, enter the folder Name and choose the Directory location as needed.

    3. Click Confirm.

    Use Template

    Toggle the Use Template switch to decide whether to apply a code template. If enabled, also select the Template and its Version.

    Referencing a code template speeds up development. The template's task code is read-only, so you only need to configure the template parameters to complete the code. For more information, see Create an offline computing template.

    Description

    Provide a brief description of the task, up to 1,000 characters.
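
The Task Name rules in this table can be checked before you open the dialog box, for example when task names are generated by a script. The following Python sketch is illustrative only: `validate_task_name` is a hypothetical helper, not part of Dataphin, and it encodes nothing beyond the length limit and the unsupported characters listed above.

```python
from typing import List

# Hypothetical helper, not a Dataphin API. It mirrors only the Task Name
# rules stated in the parameter table: at most 256 characters, and none of
# the characters | / \ : ? < > * "
FORBIDDEN_CHARS = set('|/\\:?<>*"')
MAX_NAME_LENGTH = 256


def validate_task_name(name: str) -> List[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(
            f"name has {len(name)} characters; the limit is {MAX_NAME_LENGTH}"
        )
    # Collect each unsupported character once, in a stable order.
    bad = sorted({ch for ch in name if ch in FORBIDDEN_CHARS})
    if bad:
        problems.append("unsupported characters: " + " ".join(bad))
    return problems


if __name__ == "__main__":
    for candidate in ["daily_order_summary", "orders/2025"]:
        print(candidate, "->", validate_task_name(candidate) or "OK")
```

Returning a list of violations rather than a single boolean lets a calling script report every problem with a name at once.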

  5. Click Confirm.

  6. In the code editing area of the current HIVE_SQL task tab, write the code for the HIVE_SQL offline computing task. When the code is complete, click Precompile to check its syntax.

  7. After precompiling the code, click Run to execute the code.

  8. Click Attribute in the sidebar to set the task Attributes, which include Basic Information, Runtime Parameter, Schedule Attribute (for Recurring Tasks), Schedule Dependency (for Recurring Tasks), Runtime Configuration, and Resource Configuration.

    • Basic Information

      Define the scheduled task's name, owner, description, and other basic details. For guidance on configuration, see Configure basic task information.

    • Runtime Parameter

      If the task code uses parameter variables, assign their values here. During node scheduling, each variable is automatically replaced with its assigned value. For guidance on configuration, see Parameter configuration and use of node parameters.

    • Schedule Attribute (Recurring Task)

      When the schedule type of the offline computing task is Recurring Task, you must configure the task's schedule attributes in addition to its Basic Information. For guidance on configuration, see Configure schedule attributes.

    • Schedule Dependency (Recurring Task)

      When the schedule type of the offline computing task is Recurring Task, you must configure the task's schedule dependency in addition to its Basic Information. For guidance on configuration, see Configure schedule dependency.

    • Runtime Configuration

      You can set a task-level runtime timeout and rerun policy for an offline computing task based on business needs. If you do not configure them, the task uses the timeout and rerun settings defined at the tenant level. For guidance on configuration, see Compute task runtime configuration.

    • Resource Configuration

      Select the scheduling resource group for the current computing task. When the task is scheduled, it consumes the resource quota of that group. For guidance on configuration, see Compute task resource configuration.

  9. Save and submit the task under the current HIVE_SQL task tab.

    1. Click the image icon to save the code.

    2. Click the image icon to submit the code.

  10. On the Submitting Log page, confirm the Submission Content and the Pre-check results, and enter remarks. For more information, see Offline computing task submission instructions.

  11. After verification, click Confirm And Submit.
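
Steps 6 and 8 can be illustrated together: the task code contains parameter variables, and the values set under Runtime Parameter replace them at scheduling time. The sketch below is a rough model of that substitution, not Dataphin's implementation. The `${bizdate}` placeholder syntax, the parameter name, and the table and column names (`ods_order`, `dws_order_daily`, `shop_id`, `amount`, `ds`) are all assumptions made for illustration. Python's standard `string.Template` stands in for the scheduler.

```python
from string import Template

# Illustrative Hive SQL only. The tables, columns, and the ${bizdate}
# placeholder are hypothetical; see the Dataphin parameter documentation
# for the syntax the platform actually supports.
HIVE_SQL = Template("""
INSERT OVERWRITE TABLE dws_order_daily PARTITION (ds = '${bizdate}')
SELECT shop_id,
       COUNT(*)    AS order_cnt,
       SUM(amount) AS total_amount
FROM   ods_order
WHERE  ds = '${bizdate}'
GROUP  BY shop_id
""")


def render(template: Template, params: dict) -> str:
    """Replace every ${name} placeholder with its value.

    Template.substitute raises KeyError when a placeholder has no value,
    which models a task whose runtime parameter was left unset.
    """
    return template.substitute(params).strip()


sql = render(HIVE_SQL, {"bizdate": "20250120"})

if __name__ == "__main__":
    print(sql)
```

Using `substitute` rather than `safe_substitute` is deliberate here: a missing value fails loudly instead of leaving a raw placeholder in the statement that would be run.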

What to do next

  • In Dev-Prod mode, once the task is successfully submitted, proceed to the release list to publish the task to the production environment. For more information, see Manage release tasks.

  • In Basic mode, a successfully submitted HIVE_SQL task can be scheduled in the production environment directly. Go to the Operation Center to view the published task. For more information, see View and manage script tasks and View and manage one-time tasks.

Appendix: Switch task type

If Impala tasks are enabled in your Hadoop computing source, you can convert a HIVE_SQL task to an IMPALA_SQL task for improved performance in query analysis. Follow these steps:

  1. Navigate to the top menu bar on the Dataphin home page and select Development > Data Development.

  2. In the top menu bar of the Development page, select Project. In Dev-Prod mode, you must also select an environment.

  3. In the left-side navigation pane, choose Data Processing > Script Task. Then, from the Script Task list, select the desired HIVE_SQL task.

  4. Click the image icon next to the HIVE_SQL task and choose Modify Type.

  5. In the Modify Type dialog box, select IMPALA_SQL and click Confirm to change the task type.