This topic describes how to use the solr-to-es tool to migrate documents from a Solr cluster to an Alibaba Cloud Elasticsearch cluster. The tool is provided by a third-party community.

Preparations

  1. Create an Alibaba Cloud Elasticsearch V6.X cluster. This topic uses an Elasticsearch V6.3.2 cluster as an example. For more information, see Create an Elasticsearch cluster.
    Notice The solr-to-es tool used in this topic supports only Elasticsearch V6.X clusters. If you want to use an Elasticsearch cluster of another version, first perform a compatibility test.
  2. Enable the Auto Indexing feature for the cluster. For more information, see Enable auto indexing.
  3. Create an Alibaba Cloud Elastic Compute Service (ECS) instance. For more information, see Step 1: Create an ECS instance. In this topic, the ECS instance runs CentOS 7.3.
    Notice The ECS instance must reside in the same region, zone, and Virtual Private Cloud (VPC) as the Elasticsearch cluster.
  4. Install Solr on the ECS instance. This topic uses Solr 5.0.0 as an example. For more information, see the official Solr documentation.
  5. Install Python on the ECS instance. The version must be 3.0 or later. This topic uses Python 3.6.2 as an example.
  6. Install pysolr on the ECS instance. The version must be 3.3.3 or later but earlier than 4.0.
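
The version requirements in steps 5 and 6 can be checked before you install solr-to-es. The following is a minimal sketch, not part of solr-to-es: the function names are hypothetical, and it assumes you pass in the pysolr version string that `pip show pysolr` reports, in plain numeric form such as 3.3.3.

```python
import sys


def parse_version(text):
    """Turn a dotted numeric version string such as '3.3.3' into a tuple of ints.

    Assumption: the string contains only digits and dots (no 'rc1' style suffix).
    """
    return tuple(int(part) for part in text.split(".")[:3])


def pysolr_version_ok(version):
    """Return True if version is 3.3.3 or later but earlier than 4.0."""
    return (3, 3, 3) <= parse_version(version) < (4, 0)


def python_version_ok(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.0 or later."""
    return tuple(version_info[:2]) >= (3, 0)


if __name__ == "__main__":
    print("Python version OK:", python_version_ok())
    # Replace the string with the version reported by `pip show pysolr`.
    print("pysolr version OK:", pysolr_version_ok("3.3.3"))
```

Comparing tuples of integers rather than raw strings avoids mistakes such as treating 3.10 as earlier than 3.9.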

Install solr-to-es

  1. Connect to the ECS instance and download solr-to-es.
  2. Navigate to the directory where setup.py is stored and run the python setup.py install command to install solr-to-es.
  3. After solr-to-es is installed, run the following command to migrate documents:
    python __main__.py <solr_url>:8983/solr/<my_core>/select http://<username>:<password>@<elasticsearch_url>:9200 <elasticsearch_index> <doc_type>
    Table 1. Parameters
    Parameter Description
    <solr_url> The endpoint of your Solr cluster. Example: http://116.62.**.**.
    <my_core> The name of the Solr Core that contains the documents you want to migrate.
    <username> The username that is used to access your Elasticsearch cluster. The default username is elastic.
    <password> The password that is used to access your Elasticsearch cluster. The password is specified when you create the cluster.
    <elasticsearch_url> The internal or public endpoint of your Elasticsearch cluster. You can obtain the endpoint from the Basic Information page of your cluster. For more information, see View basic information of a cluster.
    <elasticsearch_index> The name of the index to which documents will be migrated.
    <doc_type> The type of the documents in the index. An Elasticsearch V6.X index can contain only one type.
    Notice If you are using solr-to-es of a version that is different from the one described in this topic, you can try the following command to migrate documents. For more information, see solr-to-es.
    solr-to-es [-h] [--solr-query SOLR_QUERY] [--solr-fields COMMA_SEP_FIELDS]
                     [--rows-per-page ROWS_PER_PAGE] [--es-timeout ES_TIMEOUT]
                     solr_url elasticsearch_url elasticsearch_index doc_type

    If you use the preceding command in the environment described in this topic, the -bash: solr-to-es.py: command not found error is returned.
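
The Elasticsearch URL in the migration command embeds the username and password. If the password contains characters such as @, :, or /, they generally need to be percent-encoded for the URL to be parsed correctly. The sketch below assembles the command arguments from the parameters in Table 1. The function names and the example hostnames are hypothetical, and it assumes that the Elasticsearch client used by solr-to-es decodes percent-encoded credentials, which this topic does not confirm.

```python
from urllib.parse import quote


def build_es_url(username, password, host, port=9200):
    """Return http://<username>:<password>@<host>:<port> with encoded credentials."""
    user = quote(username, safe="")    # safe="" also encodes '/' and ':'
    secret = quote(password, safe="")
    return "http://{}:{}@{}:{}".format(user, secret, host, port)


def build_command(solr_url, core, es_url, index, doc_type):
    """Return the migration command as an argument list.

    The list has the same order as the command shown above:
    Solr select URL, Elasticsearch URL, index name, document type.
    """
    select_url = "{}:8983/solr/{}/select".format(solr_url, core)
    return ["python", "__main__.py", select_url, es_url, index, doc_type]


if __name__ == "__main__":
    # Hypothetical hostnames and password, for illustration only.
    es_url = build_es_url("elastic", "p@ss:word/1", "es.example.internal")
    args = build_command(
        "http://solr.example.internal", "my_core",
        es_url, "elasticsearch_index", "doc_type",
    )
    for arg in args:
        print(arg)
```

Building the command as a list rather than one string means it could be passed to `subprocess.run` without shell quoting. The sketch only prints the arguments and does not start a migration.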

Procedure

The following example queries all documents in the my_core Solr Core and writes them to an index on your Elasticsearch cluster. The index is named elasticsearch_index, and its type is doc_type.
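
The Solr URL used in this example carries its query in URL-encoded form. As an illustration of where a value such as *%3A* comes from, the sketch below builds a select URL from a plain Solr query. The function name and the example hostname are hypothetical, and the helper is not part of solr-to-es.

```python
from urllib.parse import urlencode


def build_select_url(solr_base, core, query="*:*", wt="json", indent=True):
    """Return a Solr select URL whose query string is URL-encoded."""
    params = [
        ("q", query),
        ("wt", wt),
        ("indent", "true" if indent else "false"),
    ]
    # safe="*" keeps asterisks literal, so "*:*" is encoded as "*%3A*".
    query_string = urlencode(params, safe="*")
    return "{}/solr/{}/select?{}".format(solr_base.rstrip("/"), core, query_string)


if __name__ == "__main__":
    # Hypothetical Solr endpoint, for illustration only.
    print(build_select_url("http://solr.example.internal:8983", "my_core"))
```

To migrate only a subset of documents, a narrower query such as `title:test` could be passed as `query`. The colon is encoded in the same way.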

  1. In the Solr environment, navigate to the solr-to-es-master/solr_to_es directory.
  2. Run the following command:
    python __main__.py 'http://116.62.**.**:8983/solr/my_core/select?q=*%3A*&wt=json&indent=true' 'http://elastic:<password>@es-cn-so4lwf40ubsrf****.public.elasticsearch.aliyuncs.com:9200' elasticsearch_index doc_type
    Parameter Description
    q Required. The query, written in the standard Solr query syntax. Operators are supported. The value *%3A* is the URL-encoded form of *:*, which matches all documents.
    wt The format of the returned data. Valid values include json, xml, python, ruby, and csv.
    indent Specifies whether to indent the returned data to make it easier to read. Default value: false.

    For information about other parameters, see Table 1.

  3. Log on to the Kibana console of your Elasticsearch cluster.
    For more information, see Log on to the Kibana console.
  4. In the left-side navigation pane, click Dev Tools. On the Console tab of the page that appears, run the following command to check whether the elasticsearch_index index is created on the Elasticsearch cluster:
    GET _cat/indices?v
  5. Run the following command to query details about the migrated documents:
    GET /elasticsearch_index/doc_type/_search
    If the command is executed successfully, a result similar to the following is returned:
    {
      "took" : 12,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "elasticsearch_index",
            "_type" : "doc_type",
            "_id" : "Tz8WNW4BwRjcQciJ****",
            "_score" : 1.0,
            "_source" : {
              "id" : "2",
              "title" : [
                "test"
              ],
              "_version_" : 1648195017403006976
            }
          },
          {
            "_index" : "elasticsearch_index",
            "_type" : "doc_type",
            "_id" : "Tj8WNW4BwRjcQciJ****",
            "_score" : 1.0,
            "_source" : {
              "id" : "1",
              "title" : [
                "change.me"
              ],
              "_version_" : 1648195007391203328
            }
          }
        ]
      }
    }
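
If you save the response of the search request to a file or a variable, the migration result can also be checked programmatically. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the sample body is an abbreviated copy of the result shown above.

```python
import json

# Abbreviated copy of the response shown above.
SAMPLE_RESPONSE = """
{
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {"_index": "elasticsearch_index", "_type": "doc_type",
       "_id": "Tz8WNW4BwRjcQciJ****", "_score": 1.0,
       "_source": {"id": "2", "title": ["test"]}},
      {"_index": "elasticsearch_index", "_type": "doc_type",
       "_id": "Tj8WNW4BwRjcQciJ****", "_score": 1.0,
       "_source": {"id": "1", "title": ["change.me"]}}
    ]
  }
}
"""


def summarize_hits(response_text):
    """Return (total, [(elasticsearch_id, solr_id), ...]) for a _search response."""
    hits = json.loads(response_text)["hits"]
    total = hits["total"]
    # V6.X returns a plain integer; V7 and later return an object with "value".
    if isinstance(total, dict):
        total = total["value"]
    pairs = [(hit["_id"], hit["_source"].get("id")) for hit in hits["hits"]]
    return total, pairs


if __name__ == "__main__":
    total, pairs = summarize_hits(SAMPLE_RESPONSE)
    print("Documents in index:", total)
    for es_id, solr_id in pairs:
        print("Elasticsearch _id {} holds Solr id {}".format(es_id, solr_id))
```

The total can be compared with the number of documents in the Solr Core. A search request returns only the first 10 hits by default, so for larger indexes the total is the value to compare, not the length of the hits list.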