
Architecture

Last Updated: Sep 09, 2021

Implementation

Architecture of OpenSearch

Figure 1: Architecture of OpenSearch
  • White arrows indicate the streaming process of real-time data.

  • Red arrows indicate the streaming process of full data.

  • Black arrows indicate the search process.

You can use OpenSearch in the OpenSearch console or by calling API operations. Typically, you log on to the OpenSearch console, create an application, and configure the application. The configurations include the field schema, search attributes, text processing plug-ins, and custom rules for relevance sorting. After the application is created and configured, you can push data to OpenSearch by using SDKs or calling API operations. If you use Alibaba Cloud storage services to store data, you can authorize OpenSearch to access the storage services in the OpenSearch console so that the data is automatically synchronized to OpenSearch. The data that you push is streamed in real time to the iStream Service module of the Import subsystem, where it is parsed and processed, and then stored in the Structured Data Storage system. After that, the iStream Service module of the Dump subsystem processes the data and sends the processed data to Swift, a real-time message queue system. HA3, the search system, subscribes to the data in Swift, builds indexes on the data in memory, and provides search services. The streaming process of real-time data takes about 10 seconds.
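
For reference, the following minimal Python sketch shows what a data push over HTTP might look like. The endpoint, application name, table name, and field names are hypothetical placeholders, and request signing is omitted; see the OpenSearch API reference and SDK documentation for the actual request format.

import json
import requests  # assumes the requests library is installed

# Hypothetical endpoint and application name; the real path and the
# request signing process are defined in the OpenSearch API reference.
ENDPOINT = "https://opensearch.example.com"
APP_NAME = "my_app"
TABLE = "main"

# Each document carries a command ("add", "update", or "delete")
# and the fields defined in the application's field schema.
docs = [
    {"cmd": "add", "fields": {"id": "1", "title": "OpenSearch architecture"}},
    {"cmd": "add", "fields": {"id": "2", "title": "How HA3 works"}},
]

resp = requests.post(
    f"{ENDPOINT}/apps/{APP_NAME}/{TABLE}/actions/bulk",
    headers={"Content-Type": "application/json"},
    data=json.dumps(docs),
    timeout=10,
)
print(resp.status_code, resp.text)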

If you modify the index schema of the application, the data of specific fields must be reindexed. To ensure search efficiency, the system also reindexes all data on a regular basis. Red arrows in the preceding figure indicate this reindexing process, which is not a real-time process. The process may take from a few minutes to more than 10 minutes, depending on the data size. Reindexing full data may take several hours.

After the data is processed and indexed in OpenSearch, you can search the data in the application by calling API operations. Search requests are first sent to Aggregator, a query aggregation service. If you have configured query rewriting logic, Aggregator sends the queries to QP, a query rewriting service. QP rewrites the queries based on the configured rules, such as rules for spelling correction, synonym rewriting, or semantic rewriting, and returns the rewritten queries to Aggregator. Aggregator then sends the queries to HA3. HA3 sorts the documents that meet the query conditions based on the relevance sorting rules that you configure and returns the results to you through Aggregator.
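
The following simplified Python sketch illustrates the order of these steps. The functions are illustrative stand-ins for Aggregator, QP, and HA3, not actual OpenSearch interfaces, and the rewriting rules and scores are made-up examples.

# Illustrative stand-ins for the services described above; these are
# not real OpenSearch APIs, only a sketch of the request flow.

SYNONYMS = {"notebook": "laptop"}   # example synonym-rewriting rule
SPELLING = {"lapotp": "laptop"}     # example spelling-correction rule

def qp_rewrite(query):
    """QP: rewrite the query based on the configured rules."""
    words = [SPELLING.get(w, w) for w in query.split()]
    words = [SYNONYMS.get(w, w) for w in words]
    return " ".join(words)

def ha3_search(query, docs):
    """HA3: match documents and sort them by a relevance score."""
    hits = [d for d in docs if query in d["title"]]
    return sorted(hits, key=lambda d: d["score"], reverse=True)

def aggregator(query, docs, use_qp=True):
    """Aggregator: optionally call QP, then forward the query to HA3."""
    if use_qp:
        query = qp_rewrite(query)
    return ha3_search(query, docs)

docs = [{"title": "laptop stand", "score": 0.7},
        {"title": "laptop sleeve", "score": 0.9}]
print(aggregator("lapotp", docs))   # the misspelled query is corrected first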

Data pushes and searches from different developers may affect each other and waste resources. To resolve this issue, Quota Server, a quota management service, limits the amount of data that can be pushed and the frequency of search requests based on the quotas of each developer. The quotas are allocated in terms of the total number of documents and the computing resource consumption per second (calculated based on LCUs). If the quotas are used up, extra data pushes fail and excess search requests are randomly discarded.
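
The following sketch illustrates this throttling behavior with made-up quota values. It is only a model of the behavior described above, not the actual Quota Server logic or LCU accounting.

import random

# Illustrative quota check modeled on the behavior described above;
# the real Quota Server logic and LCU accounting are internal to OpenSearch.
DOC_QUOTA = 1_000_000        # assumed maximum number of documents
LCU_QUOTA_PER_SECOND = 50    # assumed computing quota per second

def can_push(current_doc_count, batch_size):
    """Extra pushes fail once the document quota is exhausted."""
    return current_doc_count + batch_size <= DOC_QUOTA

def accept_search(lcu_used_this_second, request_cost):
    """Requests beyond the per-second quota are randomly discarded."""
    if lcu_used_this_second + request_cost <= LCU_QUOTA_PER_SECOND:
        return True
    return random.random() < 0.5   # arbitrary drop probability for illustration

print(can_push(999_990, 20))      # False: the push would exceed the quota
print(accept_search(49.0, 0.5))   # True: within the per-second quota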

How HA3 works

HA3 is a distributed real-time search system that is developed by Alibaba. HA3 supports automatic disaster recovery, dynamic scaling, and incremental data synchronization in seconds. The following figure shows the modules of the HA3 system.

Figure 2: Modules of the HA3 system

HA3 consists of the following modules:

  • Admin is the brain of the entire system. It is responsible for role assignment of nodes, scheduling, failover, status monitoring, and dynamic scaling.

  • Amonitor monitors the performance of the system. It collects and displays the performance metrics of all nodes in the entire system.

  • QRS is the query parsing and rewriting service. It serves as the search interface for external systems.

  • Proxy is the search proxy module. It receives search requests from QRS and forwards the requests to all Searcher nodes.

  • Searcher nodes perform the actual query matching, aggregate the matching results, and return the results to QRS.

  • DeployExpress is a distributed system that distributes linked data in real time. It distributes the indexed data in offline clusters to each Searcher node. Distributing multiple copies of a data entry to Searcher nodes takes almost the same time as distributing a single copy, and DeployExpress supports disaster recovery to prevent a single point of failure from affecting data replication.

In a single-server performance test based on 12 million data entries, HA3 delivered four times the QPS of Elasticsearch with the same hardware configuration, and its query latency was about a quarter of that of Elasticsearch.
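
The fan-out pattern used by Proxy and the Searcher nodes can be illustrated with the following sketch. The class and method names are hypothetical and the matching and scoring are simplified; this is not actual HA3 code.

from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch of the fan-out pattern described above; the class
# and method names are hypothetical, not actual HA3 interfaces.
class Searcher:
    def __init__(self, partition_docs):
        self.partition_docs = partition_docs   # documents held by this partition

    def search(self, query):
        return [d for d in self.partition_docs if query in d["title"]]

class Proxy:
    def __init__(self, searchers):
        self.searchers = searchers

    def search(self, query):
        # Forward the request to all Searcher nodes in parallel,
        # then aggregate the partial results before returning them to QRS.
        with ThreadPoolExecutor() as pool:
            partials = pool.map(lambda s: s.search(query), self.searchers)
            merged = [hit for partial in partials for hit in partial]
        return sorted(merged, key=lambda d: d["score"], reverse=True)

searchers = [Searcher([{"title": "laptop stand", "score": 0.7}]),
             Searcher([{"title": "laptop sleeve", "score": 0.9}])]
print(Proxy(searchers).search("laptop"))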

Figure 3: Deployment of heterogeneous clusters in HA3

The preceding figure shows how multiple heterogeneous clusters are deployed in HA3. In this example, two heterogeneous clusters, Cluster1 and Cluster2, are deployed. The hardware configurations, index schemas, and service capabilities of the two clusters can be different. You can use this deployment method to implement features such as hierarchical queries on hot and cold data and queries on heterogeneous data.

OpenSearch uses heterogeneous logical clusters to optimize resource allocation, improve service capabilities, and reduce server costs. Applications with different characteristics are allocated to different logical clusters. For example, applications with high QPS and a small amount of data are allocated to a cluster with solid-state disks (SSDs). This cluster has fewer partitions but more replicas and can handle heavy search traffic. Applications with low QPS and a large amount of data are allocated to a cluster with regular disks. This cluster has fewer replicas but more partitions and can store a large amount of user data. When the amount of data in a logical cluster increases, the system can increase the data capacity by adding partitions to the cluster. When the search traffic increases, the system can improve service capabilities by adding replicas to the cluster.
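
The following back-of-the-envelope sketch illustrates this scaling model. The per-partition capacity and per-replica throughput figures are assumed example values, not actual OpenSearch limits.

# Back-of-the-envelope sketch of the scaling model described above.
# The per-partition and per-replica figures are assumed example values.
DOCS_PER_PARTITION = 20_000_000   # assumed storage capacity of one partition
QPS_PER_REPLICA = 500             # assumed search throughput of one replica

def cluster_capacity(partitions, replicas):
    """Adding partitions raises data capacity; adding replicas raises QPS."""
    return partitions * DOCS_PER_PARTITION, replicas * QPS_PER_REPLICA

print(cluster_capacity(partitions=2, replicas=4))   # small data, high traffic
print(cluster_capacity(partitions=8, replicas=1))   # large data, low traffic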