This topic describes how to create an E-MapReduce function.

Prerequisites

  • An E-MapReduce cluster is bound to the workspace where you want to create an E-MapReduce function. The E-MapReduce service is available in a workspace only after you bind an E-MapReduce cluster to the workspace on the Workspace Management page. For more information, see Configure a workspace.
  • The resources that the function requires, such as JAR resources, are uploaded.

Procedure

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides, find the workspace, and then click Data Analytics in the Actions column.
  2. Create a workflow. For more information, see Create a workflow.
  3. Write the Java code for the function in an offline Java environment, package the code into a JAR file, and then upload the file to DataWorks as a JAR resource. For more information, see Create an EMR JAR resource.
  4. Create a function.
    1. Click the workflow in the Business process section, right-click EMR, and then choose New > Function.
    2. In the New function dialog box, set Function name, EMR Engine instance, and Destination folder.
    3. Click Submit.
    4. In the Register function section of the configuration tab that appears, set the parameters.
      Register function section
      Parameter | Description
      Function type | The type of the function. Valid values: Mathematical operation function, Aggregate function, String Processing function, Date function, Window function, and Other functions.
      EMR Engine instance | The E-MapReduce cluster that is bound to the current workspace. By default, you cannot change the cluster.
      EMR engine type | The type of the E-MapReduce cluster that is bound to the current workspace. By default, you cannot change the cluster type.
      EMR database | The E-MapReduce database to use. Select a database from the drop-down list. To create a database, click New Library. In the New Library dialog box, set the parameters and click OK.
      Function name | The name of the function. You can reference the function in SQL statements by using this name. The name must be globally unique and cannot be changed after the function is created.
      Responsible Person | The owner of the function. This parameter is automatically set.
      Class name | Required. The name of the class that implements the function.
      Resource List | Required. The resource that the function uses. Select a resource created in the current workspace from the drop-down list. To create a resource, click New resource. In the New resource dialog box, set the parameters and click Confirm.
      Description | The description of the function.
      Command format | The syntax of the function. Example: test.
      Parameter description | The description of the input and output parameters that the function supports.
      Return value | Optional. The value that the function returns. Example: 1.
      Example | Optional. An example of how to use the function.
  5. Click the Save icon in the toolbar.
  6. Commit the function.
    1. Click the Submit icon in the toolbar.
    2. In the Commit Node dialog box, enter your comments in the Change description field.
    3. Click OK.
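
As an illustration of the Java code mentioned in step 3, the following is a minimal sketch of a class that could implement a function. It assumes that the E-MapReduce engine is Hive, where a classic user-defined function is a class that extends org.apache.hadoop.hive.ql.exec.UDF and exposes a method named evaluate. The class name ToLowerSketch and its behavior are hypothetical and are not taken from this topic. The extends clause is left out so that the sketch compiles without the Hive libraries.

```java
import java.util.Locale;

// Hypothetical example class; the name is illustrative only.
// Assumption: the EMR engine is Hive. In a real project this class would be
// declared public and would extend org.apache.hadoop.hive.ql.exec.UDF
// (from the hive-exec library). Both are omitted here so that the sketch
// compiles on its own.
class ToLowerSketch {

    // Classic Hive UDFs expose a method named "evaluate".
    // Returns the lowercased input, or null when the input is null.
    public String evaluate(String input) {
        if (input == null) {
            return null;
        }
        return input.toLowerCase(Locale.ROOT);
    }
}
```

With a class like this packaged in a JAR resource, the Class name parameter in step 4 would presumably be set to the name of the class, including its package if it has one, and Resource List would be set to the JAR resource that contains the class.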