This topic describes how to create an EMR Spark node. EMR Spark nodes allow you to
perform complex in-memory analysis, helping you build large, low-latency data analysis
applications.
Prerequisites
An E-MapReduce cluster is bound to the workspace where you want to create an EMR Spark
node. The E-MapReduce service is available in a workspace only after you bind
an E-MapReduce cluster to the workspace on the Workspace Management page. For more
information, see Configure a workspace.
Procedure
- Go to the DataStudio page.
  - Log on to the DataWorks console.
  - In the left-side navigation pane, click Workspaces.
  - In the top navigation bar, select the region where your workspace resides, find the
    workspace, and then click Data Analytics in the Actions column.
- On the Data Development tab, move the pointer over the icon and choose the option for
  creating an EMR Spark node. Alternatively, you can click a workflow in the Business
  process section, right-click EMR, and then choose the option for creating an EMR Spark node.
- In the Create Node dialog box, set the Node Name and Location parameters.
  Note: The node name must be 1 to 128 characters in length and can contain letters, digits,
  underscores (_), and periods (.).
- Click Commit.
- On the node configuration tab, enter the code.
  Note: If the current workspace is bound to multiple E-MapReduce compute engine instances,
  you must select the instance to use. If the workspace is bound to only one E-MapReduce
  compute engine instance, no selection is required.
- Save and commit the node.
  Notice: You must set the Rerun and Parent Nodes parameters before you can commit the node.
  - Click the save icon in the toolbar to save the node.
  - Click the commit icon in the toolbar.
  - In the Commit Node dialog box, enter your comments in the Change description field.
  - Click OK.
  In a workspace in standard mode, you must click Deploy in the upper-right corner after
  you commit the node. For more information, see Deploy nodes.
- Test the node. For more information, see View auto triggered nodes.
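
If you generate node names in scripts before entering them in the Create Node dialog box, it can help to check them against the naming rule from the procedure first. The following is a minimal sketch, not part of DataWorks: the function name `is_valid_node_name` is hypothetical, and it assumes that "letters" means ASCII letters only. The console may accept a wider character set.

```python
import re

# Assumed reading of the naming rule: 1 to 128 characters drawn from
# ASCII letters, digits, underscores (_), and periods (.).
NODE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{1,128}")


def is_valid_node_name(name: str) -> bool:
    """Return True if the whole string satisfies the assumed naming rule."""
    return NODE_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical candidate names used only for illustration.
    for candidate in ["emr_spark_daily.v2", "", "bad name", "x" * 129]:
        print(f"{len(candidate):>3} chars, valid={is_valid_node_name(candidate)}")
```

The sketch uses `fullmatch` so that a name is rejected when any single character, including a trailing space or newline, falls outside the allowed set.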