Custom algorithm upload function

Last Updated: Apr 03, 2020

The custom algorithm upload function provided by Machine Learning Platform for AI allows you to use SQL, Spark 2.0, and PySpark 2.0 to develop algorithms, encapsulate the algorithms in a component, and upload the component to Studio. The custom algorithm function is also integrated with Artificial Intelligence Market. You can release algorithms to Artificial Intelligence Market and share the algorithms with more users.

Pricing

Running SQL, Spark 2.0, or PySpark 2.0.3 programs to develop algorithms and upload them to Studio is charged at a rate of CNY 1 per billable hour.

Procedure

1. Find the custom algorithm upload function in the Machine Learning Platform for AI console

Log on to the Machine Learning Platform for AI console. In the left-side navigation pane, choose Model Develop and Train > Algorithm release.

2. Develop algorithm code

Develop the algorithm package by following the documentation for local debugging. This topic uses the official PySpark case as an example.

3. Click Create a custom algorithm

  • Algorithm name: the name of the algorithm component.
  • Algorithm unique identification: the unique identifier on the backend for the algorithm. You can use the algorithm ID to query information such as logs.
  • Algorithm framework: the framework for the algorithm. Example: SQL, Spark, or PySpark.
  • Algorithm package: If you set Algorithm framework to SQL, you must upload an SQL script. If you set Algorithm framework to Spark, you must upload a JAR package. If you set Algorithm framework to PySpark, you must upload a ZIP package.
  • Types of algorithms: the folder that contains the algorithm package to release to Studio.
  • Entrance parameters: the entry point of the algorithm. If you set Algorithm framework to PySpark, you must specify the .py entrance file and the entrance function, separated with a period (.). If you set Algorithm framework to Spark, you must specify the entrance class name for the JAR package. Example: com.aliyun.odps.spark.examples.simhash.SimHashSpark.
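
The pairing between Algorithm framework and package type in the list above can be checked locally before you upload. The following is a minimal sketch, not part of the platform: the function name `check_package` and the framework keys are hypothetical, and the `.sql` extension for an SQL script is an assumption. Only the JAR and ZIP package types are stated above.

```python
import os

# Expected package extension per framework. ".jar" and ".zip" follow the
# list above; ".sql" is an assumed extension for an SQL script.
EXPECTED_EXTENSION = {
    "SQL": ".sql",
    "Spark": ".jar",
    "PySpark": ".zip",
}


def check_package(framework, package_path):
    """Return True if package_path has the extension expected for framework."""
    if framework not in EXPECTED_EXTENSION:
        raise ValueError("unknown framework: %s" % framework)
    # Compare case-insensitively so "algo.JAR" is accepted as a JAR package.
    extension = os.path.splitext(package_path)[1].lower()
    return extension == EXPECTED_EXTENSION[framework]
```

For example, `check_package("PySpark", "pyspark.zip")` passes, while `check_package("Spark", "pyspark.zip")` does not.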

This topic uses the official PySpark-based algorithm package. Upload the pyspark.zip file and specify the entrance file and function as follows:

    read_example.mainFunc
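
An entrance value such as `read_example.mainFunc` can be sanity-checked against the ZIP package before upload. The sketch below is illustrative only: `split_entrance` and `package_has_entrance` are hypothetical helpers, and it assumes that the entrance file sits at the top level of the archive, which this topic does not state.

```python
import zipfile


def split_entrance(entrance):
    """Split '<file>.<function>' into (file name without .py, function name)."""
    module_name, sep, func_name = entrance.rpartition(".")
    if not sep or not module_name or not func_name:
        raise ValueError("expected '<file>.<function>', got %r" % entrance)
    return module_name, func_name


def package_has_entrance(zip_path, entrance):
    """Check that the ZIP package contains the entrance .py file.

    Assumption: the entrance file is stored at the top level of the archive.
    """
    module_name, _ = split_entrance(entrance)
    with zipfile.ZipFile(zip_path) as archive:
        return (module_name + ".py") in archive.namelist()
```

This check only confirms that the file exists in the package. It does not verify that the function is defined in that file.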

4. Edit the algorithm package version

After you submit the algorithm package, you can view the algorithm package instance in the console.

You must then configure the version (UI display mode) for the package.

Click Go to configuration in the Operation column corresponding to the version. You can drag and drop controls to the Parameter Configuration section.

5. Edit UI for the component

You can control the input and output nodes. In this example, the code reads one table and writes two fields to another table. These tables correspond to inputTable1 and outputTable1. To use three input and output nodes, also define inputTable2 and inputTable3 in the code. Input and output nodes are automatically mapped.

    # Define the input and output nodes
    INPUT_TABLE = arg_dict["inputTable1"]
    OUTPUT_TABLE = arg_dict["outputTable1"]
    # Column parameters that map to controls on the component UI
    ID_COL = arg_dict["idCol"]
    CONTENT_COL = arg_dict["contentCol"]
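
To extend the snippet above to several input or output nodes, the numbered keys can be read in a loop. This is a sketch under two assumptions: `arg_dict` is a plain dictionary that maps parameter names to string values, and unused nodes are simply absent from it. `collect_tables` is a hypothetical helper, not a platform API, and the table names are made up.

```python
def collect_tables(arg_dict, prefix, max_nodes=3):
    """Return table names for prefix1..prefixN that are present in arg_dict."""
    tables = []
    for index in range(1, max_nodes + 1):
        key = "%s%d" % (prefix, index)
        value = arg_dict.get(key)
        if value:  # skip nodes that are absent or empty
            tables.append(value)
    return tables


# Example values are invented for illustration only.
arg_dict = {
    "inputTable1": "docs_a",
    "inputTable2": "docs_b",
    "outputTable1": "result",
}
input_tables = collect_tables(arg_dict, "inputTable")
output_tables = collect_tables(arg_dict, "outputTable")
```

With these values, `input_tables` holds the two input table names and `output_tables` holds the single output table name.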

This example uses one input node and one output node. You do not need to modify the code.

Edit configuration information. In the preceding code, all parameters except inputTable and outputTable must map to basic controls. In this example, idCol and contentCol map to two fields in the input table. You can select only one value for each field. In the left-side control list, find Single Field Control. Drag and drop this control twice to the Parameter Configuration section.

Click the first control. On the right-side Basic Settings tab, configure the following parameters:

  • Name: required. This parameter maps to the parameter in the algorithm code. Set this parameter to idCol. After configurations are complete, the idCol information in the algorithm code corresponds to the component input.
  • Label: the display name of the control.
  • converter: optional.
  • Input and Output Binding Characters: the characters that bind the input node to the output node. In this example, select Enter #1.
  • Data Type: By default, all types are supported.

Configure both controls in this way so that one maps to idCol and the other maps to contentCol.
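
Because the Name of each control must match a key that the algorithm code reads, a name mismatch is easy to introduce. A minimal sketch of a guard that the algorithm code could run first is shown below. It assumes that `arg_dict` is a plain dictionary of the values passed to the component; `missing_names` is a hypothetical helper, and the table and column values are placeholders.

```python
# Names that the example algorithm code reads from arg_dict.
REQUIRED_NAMES = ["inputTable1", "outputTable1", "idCol", "contentCol"]


def missing_names(arg_dict, required=REQUIRED_NAMES):
    """Return the required names that are absent from arg_dict, in order."""
    return [name for name in required if name not in arg_dict]


# A control named "idcol" instead of "idCol" is reported as missing,
# because the lookup is case-sensitive.
example = {
    "inputTable1": "t_in",
    "outputTable1": "t_out",
    "idcol": "id",
    "contentCol": "content",
}
problems = missing_names(example)
```

Failing early with the list of missing names is usually easier to diagnose than a `KeyError` raised in the middle of the job.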

6. Release the component

Edit the UI version. Click Save.

Go back to the console. Refresh the page. In the Edit version dialog box, click User version in the Operation column corresponding to the version.

You can then release the component by using either of the following methods:

  • Release to Studio: You must select the region and project. The released component can be used only in the specified project for Alibaba Cloud accounts and RAM users.
  • Release to Artificial Intelligence Market: All Machine Learning Platform for AI users can download and use the component.

7. Call the component

Go to the Studio project where the component is released. In the left-side custom algorithm folder, find and use the released component.