This topic describes how to use the Binning component to discretize continuous features.

Prerequisites

A project is created. For more information, see Create a project.

Background information

Feature discretization converts continuous data into a set of discrete intervals. To implement feature discretization, Machine Learning Platform for AI (PAI) provides the Binning component. This component supports the following binning modes: equal frequency binning, equal width binning, and automatic binning.
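The difference between equal width and equal frequency binning can be sketched in a few lines of plain Python. This is a minimal illustration, not the algorithm that the Binning component runs. The function names and sample values are hypothetical, and the exact rule that PAI uses to place cut points is not documented in this topic.

```python
def equal_width_edges(values, n_bins):
    """Return n_bins + 1 edges that split [min, max] into equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins)] + [hi]


def equal_frequency_edges(values, n_bins):
    """Return n_bins + 1 edges so each interval holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: interior cut points sit at the i/n_bins positions of the sorted data.
    cuts = [ordered[(i * n) // n_bins] for i in range(1, n_bins)]
    return [ordered[0]] + cuts + [ordered[-1]]


# Illustrative values only (not taken from any real table).
sample = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 9.0]
width_edges = equal_width_edges(sample, 4)
freq_edges = equal_frequency_edges(sample, 4)
print(width_edges)
print(freq_edges)
```

With a skewed sample such as this one, equal width binning leaves some intervals almost empty, whereas equal frequency binning moves the edges so that each interval receives a similar number of values.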

In this topic, the Read MaxCompute Table component is used to read data from the pai_online_project.iris_data table. Then, the Binning component is used to assign the data to bins. Finally, the Data Conversion Module component is used to convert the original continuous values into discrete values based on those bins.
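The final conversion step can be pictured as replacing each continuous value with the index of the bin that contains it, while other columns pass through unchanged. The following sketch assumes 0-based indexes and left-closed intervals. The function name, the bin edges, and the rows are hypothetical, and the numbering that the Data Conversion Module component actually produces may differ.

```python
from bisect import bisect_right


def to_bin_index(value, edges):
    """Map a value to the 0-based index of the interval [edges[i], edges[i+1]) that contains it."""
    n_bins = len(edges) - 1
    # bisect_right counts the edges <= value; subtract 1 to get the interval index.
    idx = bisect_right(edges, value) - 1
    # Clamp so values at or beyond the outer edges fall into the first or last bin.
    return min(max(idx, 0), n_bins - 1)


# Hypothetical bin edges and rows, for illustration only.
edges_f1 = [4.0, 5.0, 6.0, 7.0, 8.0]
rows = [
    {"f1": 4.9, "type": "a"},
    {"f1": 6.3, "type": "b"},
    {"f1": 8.0, "type": "c"},
]

# Replace f1 with its bin index; the type column is copied as-is.
converted = [
    {"f1": to_bin_index(row["f1"], edges_f1), "type": row["type"]}
    for row in rows
]
print(converted)
```

Here the type column plays the role of a column that is excluded from conversion, so its values in the output match the input.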

Procedure

  1. Go to the Machine Learning Studio console.
    1. Log on to the PAI console.
    2. In the left-side navigation pane, choose Model Training > Studio-Modeling Visualization.
    3. On the PAI Visualization Modeling page, find the project in which you want to create an experiment and click Machine Learning in the Operation column.
  2. Create an experiment.
    1. In the left-side navigation pane, click Home.
    2. Choose New > New Experiment.
    3. In the New Experiment dialog box, configure the parameters.
      Parameter | Description
      Name | Enter "Use the Binning component for the discretization of continuous features".
      Project | The name of the project to which the experiment belongs. You cannot change the value of this parameter.
      Description | Enter "Use the Binning component provided by PAI for the discretization of continuous features".
      Save To | Click My Experiments.
    4. Click OK.
  3. Configure the experiment.
    1. In the left-side navigation pane, click Components.
    2. In the navigation tree, click Data Source/Target. Then, drag and drop the Read MaxCompute Table component onto the canvas.
    3. In the navigation tree, click Financials. Then, drag and drop the Binning and Data Conversion Module components onto the canvas.
    4. Connect the preceding components.
  4. Configure component parameters.
    1. Click the Read MaxCompute Table component on the canvas. In the right-side pane, configure the following parameters.
      Tab | Parameter | Description
      Table Selection | Table Name | Enter pai_online_project.iris_data.
      Table Selection | Partition | The pai_online_project.iris_data table is not a partitioned table. Therefore, the Partition check box is dimmed.
      Fields Information | Source Table Columns | You do not need to manually specify this parameter. After you specify Table Name, the system synchronizes the column information of that table to the Source Table Columns field.
    2. Click the Binning component on the canvas. In the right-side pane, configure the parameters listed in the following table and leave other parameters at their default values.
      Tab | Parameter | Description
      Fields Setting | Feature Columns | Select the f1, f2, f3, and f4 columns.
      Parameters Setting | Bins | Set this parameter to 10. This value indicates that continuous features are converted into 10 discrete intervals.
      Parameters Setting | Binning Mode | Valid values: Equal Frequency, Equal Width, and Automatic Binning. In this topic, the value Equal Frequency is used.
    3. Click the Data Conversion Module component on the canvas. In the right-side pane, configure the parameters listed in the following table and leave other parameters at their default values.
      Tab | Parameter | Description
      Fields Setting | Columns without Data Conversion | Select the type column. Data in the output of this column is the same as that in the input.
      Fields Setting | Data Conversion Mode | Select Index.
  5. In the top toolbar of the canvas, click Run.
  6. View the experiment results.
    1. After the experiment is executed, right-click the Data Conversion Module component on the canvas and select View Data. Then, you can view the discretization results.
    2. Right-click the Binning component on the canvas and select Binning.
    3. Click Details in the Action column that corresponds to the feature you want to view. The f1 feature is used in this example.
    4. Click the Charts tab to view the binning results.