Introduction to Elasticsearch

Elasticsearch is a very powerful search engine that is widely used across IT companies. It was created by Elastic, and its code is on GitHub at elastic/elasticsearch. Elasticsearch is currently a free and open project. Elastic also maintains the Logstash and Kibana open source projects. Together, these three projects form the ELK stack, a strong ecosystem. Simply put, Logstash is responsible for data collection and processing (enriching data, converting data, and so on), and Kibana is responsible for data display, analysis, management, and monitoring. Elasticsearch sits at the core and helps us search and analyze data quickly.

In fact, the complete Elastic Stack is as follows:

Beats
Elasticsearch
Kibana
Logstash

Beats are lightweight agents that can be installed on client servers. They do not need to be deployed to our Elastic Cloud, and they can collect all the events we need. If Beats are included in the architecture, the Elastic Stack consists of four components: Beats, Logstash, Elasticsearch, and Kibana.
You can find more information about customers on Elastic's official website.
In today's article, let me briefly introduce what Elasticsearch is.

Elastic Product Ecosystem

Elastic has built many mature solutions around Elasticsearch. For more details, please refer to our official website.

Elasticsearch

Simply put, Elasticsearch is a distributed search engine with a REST interface, designed for the cloud. It can be downloaded from the Elasticsearch page on the official Elastic website. Elasticsearch is an open-source search engine based on Apache Lucene(TM). Whether in the open-source or proprietary space, Lucene can be considered the most advanced, performant, and full-featured search engine library to date. In 1999, Doug Cutting created an open-source project called Lucene:
1) It is a search engine library written entirely in Java
2) Since 2005, it has been a top-level Apache open source project
3) It provides powerful full-text search
However, Lucene is just a library. It does not itself provide high availability or distributed deployment. To use its power, you need to work in Java and integrate it into your application. Lucene is also very complex, and you need a deep understanding of information retrieval to understand how it works.
In 2004, Shay Banon, now CEO of Elastic, developed an open-source project called Compass:
1) It was built on Lucene
2) Its purpose was to make it easier to integrate Lucene search into Java applications
3) Over time, scalability became more important
In 2010, Shay completely rewrote Compass to serve two purposes:
1) Distributed deployment is part of the design from the very beginning
2) It is easy to integrate with and use from other programming languages
Shay eventually called the project Elasticsearch and released it on GitHub in October of that year. If you are interested in the history of Elasticsearch, please read my colleague's article "The Past and Present of Elasticsearch".
Elasticsearch is also written in Java and uses Lucene for indexing and search functionality, but its purpose is to make full-text search simple and hide the complexities of Lucene through a simple and coherent RESTful API.
However, Elasticsearch is more than just Lucene and more than a full-text search engine. It also provides:
1) A distributed, real-time document store in which every field is indexed and searchable
2) A distributed search engine with real-time analytics
3) The ability to scale to hundreds of servers and process petabytes of structured or unstructured data
All of these features are packaged into a single server that your application can talk to through a simple RESTful API, clients in various languages, or even the command line. Getting started with Elasticsearch is easy: it ships with sensible defaults and hides complex search engine theory from beginners. It works out of the box and can be used in a production environment with minimal learning. Elasticsearch is licensed under the Elastic License 2.0 (ELv2) and SSPL and is free to download, use, and modify. As your knowledge grows, you can customize Elasticsearch's advanced features for different problem domains; it is all configurable, and the configuration is very flexible.
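
As a minimal sketch of what talking to that RESTful API can look like, the Python snippet below builds (without sending) an HTTP search request using only the standard library. The address `localhost:9200`, the index name `articles`, and the field name `title` are assumptions made for illustration; `_search` and the `match` query belong to Elasticsearch's standard query DSL.

```python
import json
import urllib.request

# Assumption: a local node listening on the default HTTP port 9200.
BASE_URL = "http://localhost:9200"


def build_search_request(index, field, text):
    """Build (but do not send) a full-text search request on one field."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "articles" and "title" are hypothetical names used only for illustration.
search_request = build_search_request("articles", "title", "distributed search")
print(search_request.get_method(), search_request.full_url)
print(search_request.data.decode("utf-8"))

# To run the query against a live node, uncomment the lines below:
# with urllib.request.urlopen(search_request) as response:
#     hits = json.load(response)["hits"]["hits"]
```

Because the interface is plain HTTP and JSON, the same request could be issued from any language or from a command-line HTTP tool.
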
The hallmark of Elasticsearch is an extremely fast search experience. Queries that might take hours on some other big data engines can often return within seconds in Elasticsearch. An Elasticsearch cluster is a distributed deployment that is easy to scale, which lets it handle petabytes of data. Just as important, search results can be sorted by score, so the most relevant results come first (relevance). We can also tune relevance to fit our own business scenarios.
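
To make "sorted by score" concrete, here is a toy sketch of relevance ranking using the textbook BM25 formula, the family of scoring functions Elasticsearch uses by default. It is not Lucene's actual implementation: the whitespace tokenizer, the sample documents, and the helper names are assumptions for illustration, and `K1 = 1.2` and `B = 0.75` are the commonly cited default parameters.

```python
import math
from collections import Counter

K1 = 1.2   # controls how quickly repeated terms stop adding to the score
B = 0.75   # controls how strongly long documents are penalized


def tokenize(text):
    """Very naive analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def bm25_rank(query, docs):
    """Return (score, document) pairs sorted from most to least relevant."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    results = []
    for doc, tokens in zip(docs, tokenized):
        term_freq = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            tf = term_freq[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = K1 * (1 - B + B * len(tokens) / avg_len)
            score += idf * tf * (K1 + 1) / (tf + length_norm)
        results.append((score, doc))
    return sorted(results, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample documents.
docs = [
    "Elasticsearch is a distributed search engine",
    "Kibana displays and analyzes data",
    "Logstash collects and processes data",
]
ranked = bm25_rank("distributed search", docs)
for score, doc in ranked:
    print(f"{score:.3f}  {doc}")
```

Only the first document contains the query terms, so it ranks first and the other two receive a score of zero. Tuning relevance in a real deployment means adjusting queries, boosts, or similarity settings rather than rewriting the formula.
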

Distributed and highly available search engine

1. Each index is fully sharded with a configurable number of shards
2. Each shard can have one or more replicas
3. Read and search operations can be executed on any replica shard
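
The three points above can be sketched in a few lines of Python. This is a conceptual toy, not Elasticsearch's code: it assumes a document is routed by hashing its ID modulo the number of primary shards, uses CRC32 as a stand-in hash, and picks a copy for reads with simple round-robin, whereas the real copy selection is more sophisticated. All names and numbers are hypothetical.

```python
import zlib

NUM_PRIMARY_SHARDS = 3   # hypothetical index setting
NUM_REPLICAS = 1         # replica copies per primary shard


def shard_for(doc_id):
    """Map a document ID to a primary shard number."""
    # Stand-in hash: CRC32 is used only because it is stable across runs.
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_PRIMARY_SHARDS


def copies_of(shard):
    """Every copy of a shard (primary plus replicas) that can serve reads."""
    replicas = [f"shard{shard}-replica{i}" for i in range(1, NUM_REPLICAS + 1)]
    return [f"shard{shard}-primary"] + replicas


def pick_copy_for_read(shard, request_number):
    """Spread reads across copies with plain round-robin (a simplification)."""
    copies = copies_of(shard)
    return copies[request_number % len(copies)]


shard = shard_for("user-42")
print("document user-42 lives on shard", shard)
for request_number in range(4):
    print("read", request_number, "served by", pick_copy_for_read(shard, request_number))
```

The same ID always maps to the same shard, while successive reads alternate between the primary and its replica. That is why adding replicas increases both availability and read throughput.
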

Multi-tenancy

1. Supports multiple indexes
2. Index-level configuration (number of shards, index storage, ...)
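
As a rough illustration of index-level configuration, the sketch below builds (without sending) two index-creation requests with different shard and replica counts. The address `localhost:9200` and the index names `logs` and `products` are assumptions; `number_of_shards` and `number_of_replicas` are the standard index settings.

```python
import json
import urllib.request

# Assumption: a local node listening on the default HTTP port 9200.
BASE_URL = "http://localhost:9200"


def build_create_index_request(index, shards, replicas):
    """Build (but do not send) a request that creates an index with its own settings."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Two hypothetical indexes on the same cluster, each configured independently.
logs_request = build_create_index_request("logs", shards=6, replicas=1)
products_request = build_create_index_request("products", shards=1, replicas=2)

for request in (logs_request, products_request):
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Each index carries its own settings, so a write-heavy index and a small, read-heavy index can share one cluster without sharing a configuration.
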

Various APIs

1. HTTP RESTful API
2. Native Java API
3. All APIs automatically reroute operations to the appropriate node

Document oriented

1. No need to define the schema (document structure) in advance
2. A schema can be defined to customize the indexing process

Reliable, asynchronous writes for long-term persistence

(Near) real-time search

Built on Lucene

1. Each shard is a fully functional Lucene index
2. All of Lucene's capabilities can be easily exposed through simple configuration or plugins

Consistency of every operation

1. Single-document-level operations are atomic, consistent, isolated, and durable

Getting Started Guide

First, don't panic. It takes about 5 minutes to get the gist of Elasticsearch.

Prerequisites

Older releases require a recent version of Java to be installed on your computer. The latest releases do not need a separate Java installation, because Java is already bundled in the installation package. See the setup link for more information.
