In some business scenarios, you may need to rename fields in indexes. For example, an Alibaba Cloud Elasticsearch cluster stores one or more fields whose names contain special characters, such as at signs (@), and you want to use DataWorks to migrate data from the cluster to another Alibaba Cloud Elasticsearch cluster. However, DataWorks cannot migrate the data of fields whose names contain special characters. In this case, you must rename the fields before you migrate the data. This topic describes how to use Logstash to rename a field.

Background information

You can use one of the following methods to rename a field:
  • Use a filter plug-in provided by Logstash.

    In this topic, this method is used to rename the field @ctxt_user_info in a source index of an Elasticsearch cluster to ctxt_user_info in a destination index of the same cluster.

  • Rename the field when you use the reindex API to migrate data, as shown in the sketch after this list.
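
    If you choose the reindex method, a script in the reindex request can rename the field. The following request is a minimal sketch that assumes the source index product_info and the destination index product_info2 that are used in this topic:

    POST _reindex
    {
      "source": {
        "index": "product_info"
      },
      "dest": {
        "index": "product_info2"
      },
      "script": {
        "source": "ctx._source.ctxt_user_info = ctx._source.remove('@ctxt_user_info')"
      }
    }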

Prerequisites

  • An Elasticsearch cluster is created.

    For more information, see Create an Alibaba Cloud Elasticsearch cluster. In this topic, an Elasticsearch V7.10 cluster is used.

  • A Logstash cluster is created in the VPC where the Elasticsearch cluster resides.

    For more information, see Create a cluster.

  • Test data is prepared.
    In this topic, the following test data is used. The data is the response to a search request against the source index product_info, in which the name of the field @ctxt_user_info contains an at sign (@).
    {
      "took" : 5,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 6,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "product_info",
            "_type" : "_doc",
            "_id" : "rpN7fn0BKQKHRO31rK6C",
            "_score" : 1.0,
            "_source" : {
              "@ctxt_user_info" : "test1"
            }
          },
          {
            "_index" : "product_info",
            "_type" : "_doc",
            "_id" : "r5N7fn0BKQKHRO31rK6C",
            "_score" : 1.0,
            "_source" : {
              "@ctxt_user_info" : "test2"
            }
          }
        ]
      }
    }
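
    If you need to reproduce this test data, you can index sample documents into the source index, for example with the following bulk request. This is a minimal sketch; the document IDs in the preceding response were automatically generated by Elasticsearch:

    POST product_info/_bulk
    {"index": {}}
    {"@ctxt_user_info": "test1"}
    {"index": {}}
    {"@ctxt_user_info": "test2"}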

Procedure

  1. Step 1: (Optional) Create a destination index in the Elasticsearch cluster
  2. Step 2: Create and configure a Logstash pipeline
  3. Step 3: Verify the result

Step 1: (Optional) Create a destination index in the Elasticsearch cluster

If you enable the Auto Indexing feature for the Elasticsearch cluster, you can skip this step.
Note We recommend that you disable the Auto Indexing feature because the indexes that are automatically created may not meet your business requirements.
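To check whether automatic index creation is currently allowed on the cluster, you can query the cluster settings in the Kibana console. This is a sketch; on Alibaba Cloud Elasticsearch, the Auto Indexing feature corresponds to the action.auto_create_index setting:

    GET _cluster/settings?include_defaults=true&filter_path=**.auto_create_index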
  1. Log on to the Kibana console of the Elasticsearch cluster.
    For more information, see Log on to the Kibana console.
    Note In this topic, an Elasticsearch V7.10 cluster is used. Operations on clusters of other versions may differ from those described in this topic. The actual operations in the console prevail.
  2. Go to the homepage of the Kibana console and click Dev tools in the upper-right corner.
  3. On the Console tab, run the following command to create a destination index named product_info2 in the Elasticsearch cluster:
    PUT /product_info2
    {
        "settings": {
            "number_of_shards": 5,
            "number_of_replicas": 1
        },
        "mappings": {
            "properties": {
                "ctxt_user_info": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }
            }
        }
    }
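
    After the index is created, you can run the following command to confirm that the ctxt_user_info field is mapped as expected:

    GET product_info2/_mapping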

Step 2: Create and configure a Logstash pipeline

  1. Log on to the Elasticsearch console.
  2. Navigate to the desired cluster.
    1. In the top navigation bar, select the region where the cluster resides.
    2. In the left-side navigation pane, click Logstash Clusters. On the Logstash Clusters page, find the cluster and click its ID.
  3. In the left-side navigation pane, click Pipelines.
  4. In the Pipelines section, click Create Pipeline.
  5. In the Create Task wizard, enter a pipeline ID and configure the pipeline.
    In this example, the following configurations are used for the pipeline:
    input {
        elasticsearch {
            # The endpoint and credentials of the source Elasticsearch cluster.
            hosts => ["http://es-cn-tl32gid**********.elasticsearch.aliyuncs.com:9200"]
            user => "elastic"
            password => "your_password"
            index => "product_info"
            # Store each document's metadata (_index, _type, and _id) in @metadata.
            docinfo => true
        }
    }
    filter {
        mutate {
            # Rename the field whose name contains the at sign (@).
            rename => { "@ctxt_user_info" => "ctxt_user_info" }
        }
    }
    output {
        elasticsearch {
            hosts => ["http://es-cn-tl32gid**********.elasticsearch.aliyuncs.com:9200"]
            user => "elastic"
            password => "your_password"
            index => "product_info2"
            # Reuse the document type and ID of each source document.
            document_type => "%{[@metadata][_type]}"
            document_id => "%{[@metadata][_id]}"
        }
    }

    In the preceding configurations, the filter.mutate.rename parameter renames the field @ctxt_user_info to ctxt_user_info in each document that is read from the source index before the document is written to the destination index.

    For more information about pipeline configurations, see Use configuration files to manage pipelines and Logstash configuration files.
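
    If multiple field names in the source index contain special characters, the rename option of the mutate filter accepts multiple pairs in a single hash. The following filter block is a sketch; the field @ctxt_order_id is a hypothetical example:

    filter {
        mutate {
            rename => {
                "@ctxt_user_info" => "ctxt_user_info"
                "@ctxt_order_id"  => "ctxt_order_id"
            }
        }
    }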

  6. Click Save or Save and Deploy.
    • Save: After you click this button, the system stores the pipeline settings and triggers a cluster change, but the settings do not take effect. To apply the settings, go to the Pipelines page, find the created pipeline in the Pipelines section, and click Deploy in the Actions column. The system then restarts the Logstash cluster to make the settings take effect.
    • Save and Deploy: After you click this button, the system restarts the Logstash cluster to make the settings take effect.

Step 3: Verify the result

  1. Log on to the Kibana console of the Elasticsearch cluster.
    For more information, see Log on to the Kibana console.
  2. Go to the homepage of the Kibana console and click Dev tools in the upper-right corner.
  3. On the Console tab, run the following command to query the destination index in the Elasticsearch cluster:
    GET product_info2/_search

    The following result is returned:

    {
      "took" : 4,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 6,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "product_info2",
            "_type" : "_doc",
            "_id" : "r5N7fn0BKQKHRO31rK6C",
            "_score" : 1.0,
            "_source" : {
              "@timestamp" : "2021-12-03T04:14:26.872Z",
              "@version" : "1",
              "ctxt_user_info" : "test1"
            }
          },
          {
            "_index" : "product_info2",
            "_type" : "_doc",
            "_id" : "rpN7fn0BKQKHRO31rK6C",
            "_score" : 1.0,
            "_source" : {
              "@timestamp" : "2021-12-03T04:14:26.871Z",
              "@version" : "1",
              "ctxt_user_info" : "test2"
            }
          }
        ]
      }
    }

    The result shows that the field @ctxt_user_info in the source index is renamed ctxt_user_info in the destination index, and that the document IDs are preserved. The @timestamp and @version fields are added by Logstash during the migration.
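
    You can also run a query against the renamed field to confirm that it is searchable:

    GET product_info2/_search
    {
      "query": {
        "match": {
          "ctxt_user_info": "test1"
        }
      }
    }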