Elasticsearch: Use Transforms to process and analyze data

Last Updated: Aug 23, 2023

Transforms is an Elasticsearch feature for processing and analyzing data. You can use it to pre-process, aggregate, and reshape data. A transform writes its results to a separate destination index and leaves the source data unchanged, which makes it suitable for data analysis and visualization. This topic describes how to use Transforms to transform basketball shooting data and view the transformation result.

Preparations

  1. Download sample data. In this example, the NBA shot logs dataset from Kaggle is used. The dataset contains data such as the shooting time, the shooters, the shooting spots, the closest defenders, and the distances to the closest defenders. You can click shot_logs.csv to download the sample data.

  2. Create an Alibaba Cloud Elasticsearch cluster. For more information, see Create an Alibaba Cloud Elasticsearch cluster. In this example, an Alibaba Cloud Elasticsearch V7.10.0 cluster is created.

    Note

    Alibaba Cloud Elasticsearch V8.5 clusters are not supported.

  3. Log on to the Kibana console of the Elasticsearch cluster. For more information, see Log on to the Kibana console.

  4. Import the NBA shot logs dataset and create an index.

    1. In the upper-left corner, click the menu icon. In the left-side navigation pane, choose Kibana > Machine Learning.

    2. Click the Data Visualizer tab.

    3. In the Import data card of the Data Visualizer tab, click Upload file.

    4. Click the file upload icon.

    5. Select the shot_logs.csv file from your local machine.

    6. In the lower-left corner of the Data Visualizer tab, click Import.

    7. On the Simple tab, enter nba_short_logs in the Index name field and select Create index pattern.

    8. Click Import.

      If the dataset is successfully imported, a message indicating that the import is complete is displayed.
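
Before you upload the file, a quick local check can confirm that it contains the two fields used later in this topic, GAME_ID and DRIBBLES. The following is a minimal sketch, not part of the official procedure. The function name missing_columns and the inline sample rows are hypothetical. In practice, you would pass the downloaded shot_logs.csv file instead of the sample.

```python
import csv
import io

# The two fields that the transforms in this topic read.
REQUIRED_COLUMNS = ("GAME_ID", "DRIBBLES")

def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in required if name not in header]

# Hypothetical two-row sample standing in for shot_logs.csv.
sample = io.StringIO("GAME_ID,DRIBBLES\n21400899,2\n21400899,0\n")
print(missing_columns(sample))  # an empty list means both fields are present
```

To check the real file, open it with open("shot_logs.csv", newline="") and pass the file object to missing_columns.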

Procedure

You can use one of the following methods to transform the data and view the transformation result.

Method 1: Create a transformation task in the Kibana console to transform the data and view the transformation result

  1. In the upper-left corner, click the menu icon. In the left-side navigation pane, choose Management > Stack Management.

  2. In the left-side navigation pane of the Stack Management page, choose Data > Transforms.

  3. On the Transforms page, click Create your first transform.

  4. In the dialog box that appears, select the nba_short_logs index.

  5. In the Configuration section of the Create transform page, select histogram(GAME_ID) for Group by and DRIBBLES.sum, DRIBBLES.avg, and DRIBBLES.max for Aggregations.

    Note

    • Group by GAME_ID: Groups the shot records by game ID.

    • DRIBBLES.sum: Calculates the total number of dribbles across all shot records in each game.

    • DRIBBLES.avg: Calculates the average number of dribbles per shot record in each game.

    • DRIBBLES.max: Calculates the largest number of dribbles in a single shot record in each game.

  6. In the lower-right corner of the Configuration section, click Next.

  7. In the Transform details section, configure the Transform ID and Destination index parameters. Then, click Next.

  8. In the Create section, click Create and start.

    Note

    When the progress bar reaches 100%, the transformation task has finished processing the data.

  9. Click Discover to view the data in the destination index.

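
To make the grouping and aggregations selected in step 5 concrete, the following minimal sketch reproduces the same calculation locally on a handful of records. The records and the function name pivot_dribbles are hypothetical and are not taken from the dataset. The sketch groups by the exact GAME_ID value and only illustrates what the transform computes, not how Elasticsearch executes it.

```python
from collections import defaultdict

# Hypothetical shot records; only the two fields used by the transform are shown.
shots = [
    {"GAME_ID": 21400899, "DRIBBLES": 2},
    {"GAME_ID": 21400899, "DRIBBLES": 0},
    {"GAME_ID": 21400899, "DRIBBLES": 4},
    {"GAME_ID": 21400890, "DRIBBLES": 1},
    {"GAME_ID": 21400890, "DRIBBLES": 5},
]

def pivot_dribbles(records):
    """Group records by GAME_ID and compute the sum, avg, and max of DRIBBLES."""
    groups = defaultdict(list)
    for record in records:
        groups[record["GAME_ID"]].append(record["DRIBBLES"])
    return {
        game_id: {
            "DRIBBLES.sum": sum(values),
            "DRIBBLES.avg": sum(values) / len(values),
            "DRIBBLES.max": max(values),
        }
        for game_id, values in groups.items()
    }

result = pivot_dribbles(shots)
for game_id, row in result.items():
    print(game_id, row)
```

Each game ID yields one output row, which corresponds to one document in the destination index.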

Method 2: Call APIs to create a transformation task to transform the data and view the transformation result

  1. In the upper-left corner, click the menu icon. In the left-side navigation pane, choose Management > Dev Tools.

    1. On the Console tab, run the following command to create a transformation task:

      PUT _transform/test2_nba_shot_logs
      {
        "source": {
          "index": "nba_short_logs"
        },
        "dest": {
          "index": "test2_nba_short_logs"
        },
        "pivot": {
          "group_by": {
            "game_id": { "terms": { "field": "GAME_ID" }}
          },
          "aggregations": {
            "dribbles_sum": { "sum": { "field": "DRIBBLES" }},
            "dribbles_avg": { "avg": { "field": "DRIBBLES" }},
            "dribbles_max": { "max": { "field": "DRIBBLES" }}
          }
        }
      }

    2. Run the following command to call the _preview API to view the transformation result:

      POST _transform/_preview
      {
        "source": {
          "index": "nba_short_logs"
        },
        "dest": {
          "index": "test2_nba_short_logs"
        },
        "pivot": {
          "group_by": {
            "game_id": { "terms": { "field": "GAME_ID" }}
          },
          "aggregations": {
            "dribbles_sum": { "sum": { "field": "DRIBBLES" }},
            "dribbles_avg": { "avg": { "field": "DRIBBLES" }},
            "dribbles_max": { "max": { "field": "DRIBBLES" }}
          }
        }
      }
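
The create request and the preview request take the same source, dest, and pivot settings, so a mismatch in an index name between the two is easy to introduce by hand. The following minimal sketch builds the body once and renders both Console requests from it. The helper names build_pivot_body and console_requests are hypothetical. The index names follow the import step in Preparations, and the sketch uses the max aggregation for dribbles_max, which corresponds to DRIBBLES.max in Method 1.

```python
import json

def build_pivot_body(source_index, dest_index):
    """Return the request body shared by the create and preview calls."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "pivot": {
            "group_by": {"game_id": {"terms": {"field": "GAME_ID"}}},
            "aggregations": {
                "dribbles_sum": {"sum": {"field": "DRIBBLES"}},
                "dribbles_avg": {"avg": {"field": "DRIBBLES"}},
                "dribbles_max": {"max": {"field": "DRIBBLES"}},
            },
        },
    }

def console_requests(transform_id, body):
    """Render the create and preview requests as text for the Kibana Console."""
    payload = json.dumps(body, indent=2)
    return [
        f"PUT _transform/{transform_id}\n{payload}",
        f"POST _transform/_preview\n{payload}",
    ]

body = build_pivot_body("nba_short_logs", "test2_nba_short_logs")
for request in console_requests("test2_nba_shot_logs", body):
    print(request)
    print()
```

You can paste the printed requests into the Console tab of Dev Tools.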

Method 3: Call APIs to create a transformation task to transform the data and view the transformation result on the Discover page

  1. Create and start a transformation task.

    1. In the upper-left corner, click the menu icon. In the left-side navigation pane, choose Management > Dev Tools.

    2. On the Console tab, run the following command to create a transformation task:

      PUT _transform/test2_nba_shot_logs
      {
        "source": {
          "index": "nba_short_logs"
        },
        "dest": {
          "index": "test2_nba_short_logs"
        },
        "pivot": {
          "group_by": {
            "game_id": { "terms": { "field": "GAME_ID" }}
          },
          "aggregations": {
            "dribbles_sum": { "sum": { "field": "DRIBBLES" }},
            "dribbles_avg": { "avg": { "field": "DRIBBLES" }},
            "dribbles_max": { "max": { "field": "DRIBBLES" }}
          }
        }
      }

    3. Run the following command to start the transformation task:

      POST _transform/test2_nba_shot_logs/_start

      Note

      A transformation task does not start automatically after it is created. You must start it manually.

  2. Create an index pattern.

    Note

    You must create an index pattern before you can view data on the Discover page.

    1. In the upper-left corner, click the menu icon. In the left-side navigation pane, choose Management > Stack Management.

    2. In the left-side navigation pane of the Stack Management page, choose Kibana > Index Patterns.

    3. In the upper-right corner of the Index patterns page, click Create index pattern.

    4. On the Create index pattern page, enter the name of the destination index that is obtained after the transformation in the Index pattern name field, and click Next step. In this example, test2_nba_short_logs is entered.

    5. Click Create index pattern.

  3. View the transformation result on the Discover page.

    1. In the upper-left corner, click the menu icon. In the left-side navigation pane, choose Kibana > Discover.

    2. On the Discover page, select the name of the destination index and view the data in the index.
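
If the destination index looks empty in Discover, the transformation task started in step 1 may still be running. You can run GET _transform/test2_nba_shot_logs/_stats in the Console and inspect the response. The following minimal sketch shows one way to interpret that response. It assumes that the response contains the state and checkpointing.last.checkpoint fields and that a finished batch transform reports the stopped state, so confirm both on your cluster version. The function name batch_transform_finished and the trimmed sample response are hypothetical.

```python
import json

def batch_transform_finished(stats_response, transform_id):
    """Return True if the named transform is stopped and has completed a checkpoint."""
    for transform in stats_response.get("transforms", []):
        if transform.get("id") != transform_id:
            continue
        last = transform.get("checkpointing", {}).get("last", {})
        # Assumption: a finished batch transform is "stopped" with checkpoint >= 1.
        return transform.get("state") == "stopped" and last.get("checkpoint", 0) >= 1
    return False  # the transform ID is not in the response

# Hypothetical, trimmed response from: GET _transform/test2_nba_shot_logs/_stats
sample = json.loads("""
{
  "count": 1,
  "transforms": [
    {
      "id": "test2_nba_shot_logs",
      "state": "stopped",
      "checkpointing": { "last": { "checkpoint": 1 } }
    }
  ]
}
""")
print(batch_transform_finished(sample, "test2_nba_shot_logs"))
```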