This topic describes how to create an E-MapReduce (EMR) Spark SQL node. EMR Spark
SQL nodes allow you to use the distributed SQL query engine to process structured
data, which helps tasks run more efficiently.
Prerequisites
- An EMR cluster is created. The inbound rules of the security group to which the cluster
belongs include a rule with the following settings:
  - Action: Allow
  - Protocol type: Custom TCP
  - Port range: 8898/8898
  - Authorization object: 100.104.0.0/16
- An EMR compute engine instance is bound to the required workspace. The EMR option is displayed only after you bind an EMR compute engine instance to the workspace
on the Workspace Management page. For more information, see Configure a workspace.
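The security group prerequisite can be checked programmatically before you bind the cluster. The following is a minimal sketch, not an official tool: the `rule` dictionary and its keys (`action`, `protocol`, `port_range`, `source_cidr`) are hypothetical, and the function assumes that a broader rule (a wider port range or a larger source CIDR block) also satisfies the requirement.

```python
import ipaddress

# Values taken from the prerequisite above.
REQUIRED_PORT = 8898
REQUIRED_SOURCE = ipaddress.ip_network("100.104.0.0/16")


def rule_satisfies_prerequisite(rule: dict) -> bool:
    """Return True if one inbound rule allows TCP 8898 from 100.104.0.0/16.

    `rule` is a hypothetical dict with the keys "action", "protocol",
    "port_range" (formatted "start/end") and "source_cidr" (IPv4).
    """
    if rule["action"].lower() != "allow" or rule["protocol"].lower() != "tcp":
        return False
    start, end = (int(part) for part in rule["port_range"].split("/"))
    if not start <= REQUIRED_PORT <= end:
        return False
    # Assumption: a broader source range also covers the required block.
    source = ipaddress.ip_network(rule["source_cidr"])
    return REQUIRED_SOURCE.subnet_of(source)
```

For example, a rule with the port range 8898/8898 and the source 100.104.0.0/16 passes, whereas a rule that opens only port 22 does not.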
Procedure
- Go to the DataStudio page.
  - Log on to the DataWorks console.
  - In the left-side navigation pane, click Workspaces.
  - In the top navigation bar, select the region where your workspace resides, find the
    workspace, and then click Data Analytics in the Actions column.
- On the page that appears, move the pointer over the create icon and choose the option
for creating an EMR Spark SQL node. Alternatively, you can click the related workflow in
the left-side navigation pane, right-click EMR, and then choose the same option.
- In the Create Node dialog box, set the Node Name and Location parameters.
Note: The node name must be 1 to 128 characters in length and can contain letters, digits,
underscores (_), and periods (.).
- Click Commit.
- On the node configuration tab, enter the code.
Note: If multiple EMR compute engine instances are bound to the current workspace, you must
select one of them. If only one EMR compute engine instance is bound to the current
workspace, no selection is required.
- Save and commit the node.
Notice: You must set the Rerun and Parent Nodes parameters before you can commit the node.
  - Click the save icon in the toolbar to save the node.
  - Click the commit icon in the toolbar.
  - In the Commit Node dialog box, enter your comments in the Change description field.
  - Click OK.
In a workspace in standard mode, you must click Deploy in the upper-right corner after you
commit the node. For more information, see Deploy nodes.
- Test the node. For more information, see View auto triggered nodes.
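The naming rule from the Create Node step can be checked before you open the dialog box, for example when node names are generated by a script. The following is a minimal sketch: the function name is hypothetical, and it assumes that "letters" means ASCII letters, which the rule above does not state explicitly.

```python
import re

# 1 to 128 characters; letters, digits, underscores (_), and periods (.).
# Assumption: "letters" is interpreted as ASCII letters only.
NODE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{1,128}")


def is_valid_node_name(name: str) -> bool:
    """Return True if `name` matches the node naming rule described above."""
    return NODE_NAME_PATTERN.fullmatch(name) is not None
```

A name such as `emr_spark_sql.daily` passes this check, whereas an empty string, a name that contains a hyphen or a space, or a name longer than 128 characters does not.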