This topic describes how to use the solr-to-es tool to migrate documents from a Solr
cluster to an Alibaba Cloud Elasticsearch cluster. The tool is provided by a third-party
community.
Preparations
- Create an Alibaba Cloud Elasticsearch V6.X cluster. This topic uses an Elasticsearch
V6.3.2 cluster as an example. For more information, see Create an Elasticsearch cluster.
Notice The solr-to-es tool used in this topic supports only Elasticsearch V6.X clusters.
If you want to use an Elasticsearch cluster of another version, first perform a compatibility
test.
- Enable the Auto Indexing feature for the cluster. For more information, see Enable auto indexing.
- Create an Alibaba Cloud Elastic Compute Service (ECS) instance. For more information,
see Step 1: Create an ECS instance. In this topic, the ECS instance runs CentOS 7.3.
Notice The ECS instance must reside in the same region, zone, and Virtual Private Cloud (VPC)
as the Elasticsearch cluster.
- Install Solr on the ECS instance. This topic uses Solr 5.0.0 as an example. For more
information, see Official Solr documentation.
- Install Python on the ECS instance. The version must be 3.0 or later. This topic uses
Python 3.6.2 as an example.
- Install pysolr on the ECS instance. The version must be 3.3.3 or later but earlier
than 4.0.
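The version requirements listed above can be checked before you install the tool. The following is a minimal sketch, not part of solr-to-es: the helper names `parse_version`, `pysolr_version_ok`, and `python_version_ok` are hypothetical, and the parser assumes a purely numeric dotted version string such as the one reported by `pip show pysolr`.

```python
import sys


def parse_version(text):
    """Turn a dotted version such as '3.3.3' into a tuple of ints.

    Assumption: the string is purely numeric (no 'rc1' or similar suffix).
    """
    return tuple(int(part) for part in text.strip().split("."))


def pysolr_version_ok(version_text):
    """Return True when 3.3.3 <= version < 4.0, as required in this topic."""
    return (3, 3, 3) <= parse_version(version_text) < (4, 0)


def python_version_ok(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.0 or later."""
    return tuple(version_info[:2]) >= (3, 0)
```

For example, `pysolr_version_ok("3.9.0")` is true, and `pysolr_version_ok("4.0")` is false because the upper bound is exclusive.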
Install solr-to-es
- Connect to the ECS instance and download solr-to-es.
- Navigate to the directory where setup.py is stored and run the python setup.py install command to install solr-to-es.
- After solr-to-es is installed, run the following command to migrate documents:
python __main__.py <solr_url>:8983/solr/<my_core>/select http://<username>:<password>@<elasticsearch_url>:9200 <elasticsearch_index> <doc_type>
Table 1. Parameters
Parameter | Description
<solr_url> | The endpoint of your Solr cluster. Example: http://116.62.**.**.
<my_core> | The name of the Solr Core that contains the documents you want to migrate.
<username> | The username that is used to access your Elasticsearch cluster. The default username is elastic.
<password> | The password that is used to access your Elasticsearch cluster. The password is specified when you create the cluster.
<elasticsearch_url> | The internal or public endpoint of your Elasticsearch cluster. You can obtain the endpoint from the Basic Information page of your cluster. For more information, see View basic information of a cluster.
<elasticsearch_index> | The name of the index to which documents will be migrated.
<doc_type> | The type of the index.
Notice If you are using a version of solr-to-es that differs from the one described in this topic, you can try the following command to migrate documents. For more information, see solr-to-es.
solr-to-es [-h] [--solr-query SOLR_QUERY] [--solr-fields COMMA_SEP_FIELDS]
[--rows-per-page ROWS_PER_PAGE] [--es-timeout ES_TIMEOUT]
solr_url elasticsearch_url elasticsearch_index doc_type
If you use the preceding command in the environment described in this topic, the -bash: solr-to-es.py: command not found error is returned.
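The positional arguments described in Table 1 can be assembled programmatically, which helps when the password contains characters such as @ or : that would otherwise break the URL. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the ports 8983 and 9200 are taken from the command in this topic, and whether your version of the tool decodes percent-encoded credentials is an assumption that you should verify before relying on it.

```python
from urllib.parse import quote


def build_solr_select_url(solr_url, core, port=8983):
    """Build <solr_url>:8983/solr/<my_core>/select from its parts."""
    return "{}:{}/solr/{}/select".format(solr_url.rstrip("/"), port, core)


def build_es_url(username, password, es_host, port=9200):
    """Build http://<username>:<password>@<elasticsearch_url>:9200.

    Credentials are percent-encoded so that reserved characters do not
    change how the URL is parsed (assumption: the client decodes them).
    """
    return "http://{}:{}@{}:{}".format(
        quote(username, safe=""), quote(password, safe=""), es_host, port
    )


def build_command(solr_url, core, username, password, es_host, index, doc_type):
    """Return the argument list for python __main__.py as used in this topic."""
    return [
        "python",
        "__main__.py",
        build_solr_select_url(solr_url, core),
        build_es_url(username, password, es_host),
        index,
        doc_type,
    ]
```

The returned list can be passed to `subprocess.run` from the directory that contains `__main__.py`. Passing a list instead of a single string avoids shell quoting problems.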
Procedure
Query all documents in the my_core Solr Core and write these documents to the index on your Elasticsearch cluster. The name of the index is elasticsearch_index, and the type of the index is doc_type.
- In the Solr environment, navigate to the solr-to-es-master/solr_to_es directory.
- Run the following command:
python __main__.py 'http://116.62.**.**:8983/solr/my_core/select?q=*%3A*&wt=json&indent=true' 'http://elastic:<password>@es-cn-so4lwf40ubsrf****.public.elasticsearch.aliyuncs.com:9200' elasticsearch_index doc_type
Parameter | Description
q | Required. This parameter defines a query that uses the standard query syntax in Solr. Operators are supported. The value *%3A* indicates that all documents will be queried.
wt | The type of the data to return. Valid values: JSON, XML, PY, RB, and CSV.
indent | Specifies whether to use indentations to ensure that the returned data is easier to read. Default value: false.
For information about other parameters, see Table 1.
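The value *%3A* is the URL-encoded form of the Solr query *:*, because the colon is percent-encoded as %3A. The following is a minimal sketch of how the query string in the preceding command can be produced with the Python standard library. The function name `solr_query_string` is hypothetical, and the defaults mirror the example command, not the defaults of Solr.

```python
from urllib.parse import urlencode


def solr_query_string(q="*:*", wt="json", indent=True):
    """Encode the q, wt, and indent parameters for a Solr select URL.

    safe="*" keeps the asterisk literal so that the match-all query
    appears as *%3A*, the same form that is used in this topic.
    """
    params = [
        ("q", q),
        ("wt", wt),
        ("indent", "true" if indent else "false"),
    ]
    return urlencode(params, safe="*")


print(solr_query_string())  # prints q=*%3A*&wt=json&indent=true
```

To migrate only a subset of documents, pass a narrower query. For example, `solr_query_string(q="title:test")` encodes a query on the title field.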
- Log on to the Kibana console of your Elasticsearch cluster.
- In the left-side navigation pane, click Dev Tools. On the Console tab of the page that appears, run the following command to check whether the elasticsearch_index index is created on the Elasticsearch cluster:
- Run the following command to query details about the migrated documents:
GET /elasticsearch_index/doc_type/_search
If the command is executed successfully, the following result is returned:
{
"took" : 12,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "elasticsearch_index",
"_type" : "doc_type",
"_id" : "Tz8WNW4BwRjcQciJ****",
"_score" : 1.0,
"_source" : {
"id" : "2",
"title" : [
"test"
],
"_version_" : 1648195017403006976
}
},
{
"_index" : "elasticsearch_index",
"_type" : "doc_type",
"_id" : "Tj8WNW4BwRjcQciJ****",
"_score" : 1.0,
"_source" : {
"id" : "1",
"title" : [
"change.me"
],
"_version_" : 1648195007391203328
}
}
]
}
}
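The response above can also be checked programmatically, for example to compare the number of migrated documents with the number of documents in the Solr Core. The following is a minimal sketch that works on an already parsed response. The function name `summarize_search_response` is hypothetical, and the code assumes that each document keeps its Solr id field in _source, as in the sample result.

```python
def summarize_search_response(response):
    """Return (total_hits, failed_shards, solr_ids) from a _search response.

    Assumption: the response has the Elasticsearch V6.X shape shown in
    this topic, where hits.total is an integer. The dict branch covers
    later versions that report the total as {"value": ..., "relation": ...}.
    """
    total = response["hits"]["total"]
    if isinstance(total, dict):
        total = total["value"]
    failed_shards = response["_shards"]["failed"]
    # Collect the original Solr ids of the documents on this result page.
    solr_ids = sorted(hit["_source"]["id"] for hit in response["hits"]["hits"])
    return total, failed_shards, solr_ids


# A trimmed copy of the sample response from this topic.
sample = {
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {
        "total": 2,
        "hits": [
            {"_source": {"id": "2", "title": ["test"]}},
            {"_source": {"id": "1", "title": ["change.me"]}},
        ],
    },
}
print(summarize_search_response(sample))  # prints (2, 0, ['1', '2'])
```

A total that matches the document count in Solr, together with zero failed shards, suggests that the migration is complete. Note that the hits array contains only the current result page, so the id list may be shorter than the total.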