By Rifandy Zulvan, Solution Architect Intern
Elasticsearch is a versatile tool that is widely used for system monitoring and logging, data analytics, and full-text search. It is the core of the Elastic Stack, Elastic's suite of tools that comprises Elasticsearch, Logstash, Kibana, and Beats.
Alibaba Cloud offers fully managed Elasticsearch clusters based on the open-source version of Elasticsearch. Alibaba Cloud Elasticsearch also supports the other Elastic Stack components: Logstash, Kibana, and Beats. With these services, you can deploy managed Elasticsearch clusters and focus on integrating Elasticsearch into your business applications rather than on managing the clusters.
This blog post will guide you through the process of deploying fully managed Elasticsearch clusters on Alibaba Cloud.
Prerequisites:
Before starting the hands-on deployment, you need to prepare a few basic resources in the cloud. Here are the prerequisites:
1) VPC
2) NAT Gateway (Optional)
If you do not need to access the Elasticsearch Clusters over the internet, you don’t need a NAT Gateway.
To start deploying the Elasticsearch cluster, go to the Elasticsearch page in the Alibaba Cloud console: Alibaba Elasticsearch.
Click Create Cluster.
On the buy page, specify your billing method, region, zone, VPC, and resource group according to your needs. Alibaba Cloud provides a purchase guide for Elasticsearch, which you can read in this documentation: Elasticsearch (alibaba.com)
Next, choose the specifications of each node you need. You can choose the cluster specifications according to your needs, and Alibaba Cloud also provides brief guidelines on cluster specifications, storage capacity, and best practices for configuring Elasticsearch. You can read them in the documentation: Evaluate specifications and storage capacity (alibaba.com)
Five node types are offered:
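As a rough illustration of the sizing arithmetic that such a guide walks through, the Python sketch below estimates disk space from the volume of source data. The function name and every percentage in it are assumptions chosen for illustration, not figures from this post, so take the actual factors from the linked documentation.

```python
def estimate_min_storage_gb(
    source_data_gb,
    replicas=1,
    indexing_overhead=0.10,   # assumed: extra space taken by index structures
    os_reserved=0.05,         # assumed: disk share reserved by the operating system
    es_overhead=0.20,         # assumed: share used by Elasticsearch internals (merges, logs)
    safety_threshold=0.15,    # assumed: headroom kept free on every disk
):
    """Estimate the total disk space (in GB) a cluster needs for the given source data."""
    # Every primary copy is duplicated once per replica, then grows by the indexing overhead.
    expanded = source_data_gb * (1 + replicas) * (1 + indexing_overhead)
    # Only part of each disk is actually usable for index data.
    usable_fraction = (1 - os_reserved) * (1 - es_overhead) * (1 - safety_threshold)
    return expanded / usable_fraction


# Example: 100 GB of source data with one replica per shard.
print(round(estimate_min_storage_gb(100), 1), "GB")
```

With these assumed factors, the estimate comes out at roughly 3.4 times the source data when each shard has one replica. Dividing the result by the number of data nodes gives a first guess at the disk size per node.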
Data Node
The data node is the main node of an Elasticsearch cluster. It runs Elasticsearch and stores the indexes and data of the cluster. You need a minimum of 2 data nodes to run an Elasticsearch cluster.
Kibana Node
The Kibana node is used to access the Kibana console for data visualization and to manage the Elasticsearch instance running on the data nodes.
Warm Node
Warm nodes are data nodes designed to hold a large number of read-only indices that are unlikely to be queried frequently. You can add this node type if you want to implement a ‘Hot-Warm Architecture’ on Elasticsearch. You can read more about it through this link: Elasticsearch Hot Warm Architecture | Elastic Blog.
Dedicated Master Node
A dedicated master node performs cluster-level operations such as creating or deleting indexes, tracking nodes, and allocating shards. If you are using a single-zone cluster, dedicated master nodes are not enabled by default, but you can enable them. To improve the stability of your services, we recommend that you purchase dedicated master nodes.
Client Node
Client nodes are used to forward all query and write requests received by an Elasticsearch cluster to data nodes and merge the query results of data nodes.
For more information, you can read these pages: Parameters on the buy page (alibabacloud.com) and FAQ about Alibaba Cloud Elasticsearch clusters.
The configuration in this blog contains 3 data nodes, 1 Kibana node, and 2 client nodes. After that, specify the password used to access Elasticsearch and to log on to Kibana.
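For the Warm Node described above, the hot-warm approach in the linked Elastic blog post comes down to tagging nodes with an attribute and pinning each index to a tag through an index setting. Here is a minimal sketch of that settings body, assuming the nodes carry a `box_type` attribute with the values `hot` and `warm`. The function name is hypothetical, and you should check how your own cluster labels its nodes.

```python
import json


def allocation_settings(tier):
    """Return an index settings body that requires nodes tagged with the given tier."""
    if tier not in ("hot", "warm"):
        raise ValueError(f"unknown tier: {tier!r}")
    # Assumed attribute name: box_type. Shards of the index are only
    # allocated to nodes whose box_type attribute equals the tier.
    return {"index.routing.allocation.require.box_type": tier}


# Body for a request such as: PUT <index-name>/_settings
print(json.dumps(allocation_settings("warm"), indent=2))
```

Applying the `warm` body to an older, read-only index would ask Elasticsearch to move its shards onto the warm nodes, leaving the regular data nodes free for indexes that are still being written.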
After that, you can click Buy Now, review the configurations, and click Activate Now.
Go back to the Elasticsearch page and wait for the cluster to initialize. This takes several minutes.
When the status of your cluster is Active, your Elasticsearch cluster is ready to go!
You can see the Basic Information of your Elasticsearch cluster by clicking the ID of your cluster. The Basic Information page gives you information about your cluster, including the node visualization and the status of each node in your cluster.
The next step is to access Kibana. Click the 3 dots next to your cluster and choose Access Kibana.
On the page that opens, click Access over internet on the Kibana card.
A note appears: by default, the nodes of an Elasticsearch cluster cannot be accessed over the internet, to ensure data security. You can allow access by adding your personal IP address or your organization's IP address to the whitelist configuration, which controls who is authorized to reach the cluster.
The network settings show the configuration list. To access Kibana, turn on the HTTPS option and update the Public Network Whitelist with the authorized IP addresses that will access the Elasticsearch cluster.
After you save the configuration, the cluster restarts to apply it. Applying the changes takes a few minutes.
You can access Kibana through the Public Network Endpoint provided on the console.
This is the Kibana login page. Log in with the password that you created earlier.
This is the Kibana home page, from which you can access several features. In this blog, we focus only on adding data to an index and trying the data visualization and search features.
To add sample data to Elasticsearch, scroll down on the Kibana home page and find the Add data button.
That brings you to the Add data page, which offers sample data sets and data templates provided by Kibana. For this blog, we try the Sample web logs data.
Once the data has been added successfully, the card is updated. Now you can view the data with one of the visualization options. We view the Dashboard visualization.
Here is the dashboard for the sample web logs data in Kibana. Kibana dashboards are highly customizable to your needs: Kibana can display data in a wide variety of formats, including line and pie charts, heat maps, data tables, gauges, and coordinate maps. You can also filter the data on the dashboard.
Now we upload our own data to the Elasticsearch cluster. Elasticsearch indexes the uploaded data, which makes data retrieval faster. To upload data to Elasticsearch via Kibana, go to the Kibana home page and scroll down to the Upload a file button under the Ingest your data section. The upload supports file types such as CSV, NDJSON, and log files.
The data uploaded in this blog is the Flipkart Product Public Dataset, which you can get from Kaggle: Flipkart Product Dataset | Kaggle.
Download the data file and upload it via Kibana. After the file is uploaded, check the dataset summary first to make sure you are uploading the right dataset. Once you have confirmed the dataset is right, click Import.
After that, specify the index for the dataset. You can choose between a simple and an advanced index configuration. For now, I use the simple method. Specify the index name and make sure the Create index pattern checkbox is selected.
Then click Import and wait for the data upload and indexing process to finish. When the import completes, a confirmation page is shown.
To search in Elasticsearch, go to the Kibana home page and open Left Menu > Management > Dev Tools.
On the Dev Tools page, you can send requests directly to Elasticsearch. This time, I run a simple search against the index that was just created.
The query retrieves products that must match the keywords in the title and in at least one of the description and category fields, with results sorted by product_rating. This is the query used:
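Before saving the whitelist, it can help to check that the address you will connect from is actually covered by the entries you plan to add. Here is a minimal sketch using Python's standard `ipaddress` module, assuming each entry is either a single IP address or a CIDR block. The function name and the sample addresses (taken from the documentation-only ranges) are hypothetical.

```python
import ipaddress


def is_whitelisted(client_ip, whitelist):
    """Return True if client_ip is covered by any whitelist entry (IP or CIDR block)."""
    addr = ipaddress.ip_address(client_ip)
    for entry in whitelist:
        # strict=False accepts entries such as 203.0.113.7/24 with host bits set.
        network = ipaddress.ip_network(entry, strict=False)
        if addr.version == network.version and addr in network:
            return True
    return False


# Hypothetical entries: one office network and one personal address.
planned_whitelist = ["203.0.113.0/24", "192.0.2.10"]
print(is_whitelisted("203.0.113.25", planned_whitelist))
print(is_whitelisted("198.51.100.7", planned_whitelist))
```

Keeping the entries as narrow as possible, a single address or a small block, limits who can reach the Kibana endpoint once public access is switched on.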
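The upload above goes through the Kibana interface. For files that are too large for it, or for repeated loads, the same rows could be sent to Elasticsearch through its `_bulk` API, which expects newline-delimited JSON: one action line followed by one document line per row. The sketch below only converts CSV text into that format. The function name, the index name `product`, and the sample rows are assumptions for illustration.

```python
import csv
import io
import json


def csv_to_bulk_ndjson(csv_text, index_name):
    """Convert CSV text into a request body for the Elasticsearch _bulk API."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Action line: index the document that follows into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Document line: CSV values are kept as strings here.
        lines.append(json.dumps(row))
    # The bulk body must end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


# Made-up rows that only mimic the shape of a product dataset.
sample_csv = "title,product_rating\nHand tool kit,4.5\nSteel hammer,3.9\n"
print(csv_to_bulk_ndjson(sample_csv, "product"), end="")
```

The resulting text would be sent as the body of a `POST _bulk` request with the `Content-Type: application/x-ndjson` header. Because the CSV values stay strings in this sketch, numeric fields such as ratings would need converting, or an explicit index mapping, before sorting on them behaves numerically.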
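Dev Tools sends requests from inside Kibana. An application would send the same kind of request over HTTP with the account credentials. Here is a minimal sketch that only builds such a request with Python's standard library and does not send it. The endpoint, the username `elastic`, and the password are placeholders, and it assumes the cluster's own endpoint is reachable from where the script runs.

```python
import base64
import json
import urllib.request


def build_search_request(endpoint, index, body, username, password):
    """Build (but do not send) an HTTP request for <endpoint>/<index>/_search."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{endpoint.rstrip('/')}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",   # HTTP basic authentication
            "Content-Type": "application/json",
        },
        method="POST",  # _search accepts POST as well as GET
    )


# Placeholder endpoint and credentials; replace them with your cluster's values.
request = build_search_request(
    "http://es-example.invalid:9200", "product",
    {"query": {"match_all": {}}}, "elastic", "YOUR_PASSWORD",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. The call is left out here because it needs a reachable cluster and a client address that is on the whitelist.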
GET product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": {
              "query": "USER_KEYWORDS",
              "boost": 3
            }
          }
        },
        {
          "multi_match": {
            "query": "USER_KEYWORDS",
            "fields": [
              "description",
              "category_1",
              "category_2"
            ]
          }
        }
      ]
    }
  },
  "sort": [
    {
      "product_rating": {
        "order": "desc"
      }
    }
  ]
}
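In an application, the USER_KEYWORDS placeholder would be filled in from user input. Here is a minimal sketch of a helper that builds the same request body as a Python dictionary. The function name and the `title_boost` parameter are hypothetical, and the field names are the ones used in the query above.

```python
import json


def build_product_query(keywords, title_boost=3):
    """Build the search body used above for the given user keywords."""
    return {
        "query": {
            "bool": {
                "must": [
                    # The keywords must match the product title.
                    {"match": {"title": {"query": keywords, "boost": title_boost}}},
                    # They must also match at least one of these fields.
                    {
                        "multi_match": {
                            "query": keywords,
                            "fields": ["description", "category_1", "category_2"],
                        }
                    },
                ]
            }
        },
        # Highest-rated products first.
        "sort": [{"product_rating": {"order": "desc"}}],
    }


print(json.dumps(build_product_query("Hand tool kit"), indent=2))
```

Because the request sorts on product_rating alone, hits are ordered by rating rather than by relevance score, so the title boost does not influence the order. Adding "_score" to the sort list would be one way to bring relevance back into the ranking.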
Here is the query run with the keywords “Hand tool kit”. The result lists products that match the keywords, retrieved from around 10K documents in just 41 ms.
Going further, you can use the Elasticsearch cluster in several kinds of applications, such as a cloud monitoring and logging system, a data analytics or visualization dashboard, or an application that requires full-text search or search-engine features.
An Elasticsearch cluster in the Alibaba Cloud environment can be very useful for meeting business needs, because Alibaba Cloud Elasticsearch integrates easily with other Alibaba Cloud services. One example use case is to synchronize data from ApsaraDB and PolarDB to Elasticsearch using Alibaba Cloud DTS or a Logstash cluster. You can also integrate the cluster with big data resources such as MaxCompute and Hadoop, or build your logging service by integrating the Elasticsearch cluster with Alibaba Cloud Log Service.
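The timing and the matched products both come from the JSON response that Elasticsearch returns: `took` is the search time in milliseconds and `hits` holds the matches. Here is a minimal sketch that pulls out those parts. The function name is hypothetical and the sample response is made up to show the shape, not copied from the cluster in this blog.

```python
def summarize_search_response(response, top_n=3):
    """Extract the search time, total hit count, and first few titles from a search response."""
    hits = response.get("hits", {})
    total = hits.get("total", 0)
    # Newer versions report the total as {"value": ..., "relation": ...}; older ones as a number.
    if isinstance(total, dict):
        total = total.get("value", 0)
    titles = [hit.get("_source", {}).get("title") for hit in hits.get("hits", [])[:top_n]]
    return {"took_ms": response.get("took"), "total": total, "top_titles": titles}


# Made-up response that only mimics the structure of a real one.
sample_response = {
    "took": 41,
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_source": {"title": "Example hand tool kit A", "product_rating": 4.5}},
            {"_source": {"title": "Example hand tool kit B", "product_rating": 4.1}},
        ],
    },
}
print(summarize_search_response(sample_response))
```

Note that `took` covers only the time spent inside Elasticsearch. Network latency between the client and the cluster comes on top of it.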
1) Alibaba Cloud. Elasticsearch Documentation. Retrieved from https://www.alibabacloud.com/help/en/elasticsearch
2) Alibaba Cloud. Evaluate specifications and storage capacity. Elasticsearch Documentation. Retrieved from https://www.alibabacloud.com/help/en/elasticsearch/latest/evaluate-specifications-and-storage-capacity
3) Elastic. Hot-Warm Architecture in Elasticsearch 5.x. Elastic.co. Retrieved from https://www.elastic.co/cn/blog/hot-warm-architecture-in-elasticsearch-5-x
4) Alibaba Cloud. Parameters on the buy page. Elasticsearch Documentation. Retrieved from https://www.alibabacloud.com/help/en/elasticsearch/latest/parameters-on-the-buy-page
5) Alibaba Cloud. Overview of Best Practices. Elasticsearch Documentation. Retrieved from https://www.alibabacloud.com/help/en/elasticsearch/latest/overview-of-best-practices