This topic describes how to create an EMR Spark SQL node. An EMR Spark SQL node lets you use the distributed SQL query engine to process structured data, which improves task execution efficiency.

Prerequisites

  • DataWorks Professional Edition or higher is activated.
  • An E-MapReduce cluster is bound to the workspace where you want to create an EMR Spark SQL node. The E-MapReduce service is available in a workspace only after you bind an E-MapReduce cluster to the workspace on the Workspace Management page. For more information, see Configure a workspace.

Procedure

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides, find the workspace, and then click Data Analytics in the Actions column.
  2. On the Data Development tab, move the pointer over the Create icon and choose EMR > EMR Spark SQL.
    Alternatively, you can click a workflow in the Business process section, right-click EMR, and then choose New > EMR Spark SQL.
  3. In the New node dialog box, set the Node name and Destination folder parameters.
    Note The node name must be 1 to 128 characters in length and can contain letters, digits, underscores (_), and periods (.). It is not case-sensitive.
  4. Click Submit.
  5. On the node configuration tab, enter the code.
    Note If the current workspace is bound to multiple E-MapReduce compute engine instances, you must select an E-MapReduce compute engine instance. If the current workspace is bound to only one E-MapReduce compute engine instance, you do not need to do so.
  6. Save and commit the node.
    Notice You must set the Rerun attribute and Dependent upstream node parameters on the Scheduling configuration tab before you can commit the node.
    1. Click the Save icon in the toolbar to save the node.
    2. Click the Submit icon in the toolbar to commit the node.
    3. In the Submit New Version dialog box, enter your comments in the Change description field.
    4. Click OK.
    In a workspace in standard mode, you must click Publish in the upper-right corner after you commit the EMR Spark SQL node. For more information, see Deploy a node.
  7. Test the node. For more information, see Auto triggered nodes.
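
As an illustration of the node naming rule in step 3, the following Python sketch checks a candidate name against the documented constraints: 1 to 128 characters, limited to letters, digits, underscores (_), and periods (.), and not case-sensitive. The function names are hypothetical and are not part of DataWorks. The sketch also assumes that "letters" means ASCII letters, which this topic does not state.

```python
import re

# Assumption: "letters" means ASCII letters A-Z and a-z. The topic does not
# say whether characters from other scripts are accepted.
NODE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{1,128}")


def is_valid_node_name(name: str) -> bool:
    """Return True if the name meets the documented length and character rules."""
    return NODE_NAME_PATTERN.fullmatch(name) is not None


def same_node_name(first: str, second: str) -> bool:
    """Node names are not case-sensitive, so compare them without regard to case."""
    return first.casefold() == second.casefold()
```

A check like this can catch an invalid name before you enter it in the New node dialog box. The dialog box itself remains the authority on which names are accepted.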