Building a Unified Entity Search Engine by Using UModel for Observability Scenarios

This article introduces a unified entity search engine built on UModel and USearch that enables efficient cross-domain querying and full-text search.

1. Background Information

In observability systems, UModel defines a unified data model (schema). While a UModel query focuses on exploring knowledge graph metadata, an entity query is designed to query and retrieve specific entity instance data. Powered by the USearch engine, entity queries provide powerful capabilities such as full-text search, exact lookup, and conditional filtering, and support cross-domain and cross-entity-type joint queries.

Unlike UModel queries, which deal with schema definitions, entity queries focus on runtime entity data, enabling users to quickly locate, retrieve, and analyze specific entity instances, such as service instances, pod instances, and host instances.

1.1 Issues Resolved by Entity Queries

In actual observability scenarios, we often need the following features:

Quick entity locating: allows you to quickly find relevant entities based on keywords or property values.
Cross-domain retrieval: allows you to perform unified searches across multiple domains, such as application performance management (APM), Kubernetes, and cloud resources.
Precise query: allows you to query detailed information based on known entity IDs in batches.
Conditional filtering: allows you to perform complex conditional filtering based on entity properties.
Statistical analysis: allows you to aggregate, analyze, and compute entity data.

By providing a unified interface through the USearch engine, entity queries address the pain points of traditional multi-system querying and deliver efficient, flexible entity retrieval.

1.2 Differences Among Three Query Types

Three different types of queries exist in EntityStore.

Query type	Destination data	Query content	Typical scenario
UModel query	Knowledge graph schema	EntitySet definitions and relationship type definitions	"What entity types exist in the system?"
Entity query	Specific entity instances	Service instances, pod instances, and host instances	"How is user-service currently doing?"
Topo query	Relationship instances	Call relationships and deployment relationships	"Which services are called by user-service?"

Entity query focuses specifically on entity instance data and is the most frequently used query method in daily O&M and troubleshooting.

2. Introduction to Entity Query

2.1 Data Model

Three-layer storage architecture

USearch adopts a layered storage structure to ensure logical isolation and efficient querying of data:

Workspace layer: the top-level isolation unit. Workspaces are fully isolated from each other.
Domain layer: logical grouping at the business level, such as APM, Kubernetes, and Container Compute Service (ACS).
EntityType layer: the specific entity types that contain the actual entity data, such as apm.service and k8s.pod.

Workspace: my-observability
├── Domain: apm
│   ├── EntityType: apm.service
│   ├── EntityType: apm.host  
│   └── EntityType: apm.instance
├── Domain: k8s
│   ├── EntityType: k8s.pod
│   ├── EntityType: k8s.node
│   └── EntityType: k8s.service
└── Domain: acs
    ├── EntityType: acs.ecs.instance
    └── EntityType: acs.rds.instance

Characteristics of data storage

● Uniqueness guarantee: __entity_id__ is unique within the same EntityType.

● Column-oriented storage: supports tabular data with multiple rows and columns, as well as SPL-based statistical analysis.

● Index optimization: full-text indexing optimized for retrieval performance. Multi-keyword retrieval and ranking scoring are supported.

● Time series support: allows you to query and filter data based on a time range and trace the status of entities and relationships at any point in time.
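
The layering and the uniqueness rule can be pictured with a small in-memory model. The following Python sketch is only an illustration and does not describe how USearch stores data. The `EntityStoreSketch` class, its methods, and the upsert behavior are hypothetical and assumed for the example.

```python
from collections import defaultdict


class EntityStoreSketch:
    """Toy model of the workspace -> domain -> EntityType layering (hypothetical)."""

    def __init__(self):
        # workspace -> domain -> entity type -> {__entity_id__: properties}
        self._data = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

    def upsert(self, workspace, domain, entity_type, entity_id, properties):
        # Within one EntityType the ID is the key, so writing the same ID twice
        # overwrites the entity instead of creating a duplicate (assumed behavior).
        self._data[workspace][domain][entity_type][entity_id] = dict(properties)

    def count(self, workspace, domain, entity_type):
        return len(self._data[workspace][domain][entity_type])

    def get(self, workspace, domain, entity_type, entity_id):
        return self._data[workspace][domain][entity_type].get(entity_id)


store = EntityStoreSketch()
store.upsert("my-observability", "apm", "apm.service", "id-1", {"service_name": "cart"})
store.upsert("my-observability", "apm", "apm.service", "id-1", {"service_name": "cart-v2"})
# The same ID under another EntityType is a different entity.
store.upsert("my-observability", "k8s", "k8s.pod", "id-1", {"pod_name": "cart-0"})
```

In this model, a lookup in a different workspace finds nothing, which mirrors the isolation between workspaces.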

2.2 Core Features of USearch

Search capability

USearch provides powerful full-text search capabilities and supports the following features:

● Multi-type joint search: performs joint queries across multiple domains and entity types, with unified scoring and ranking.

● Multi-keyword search scoring: calculates relevance scores based on information such as term weights and field weights.

● Intelligent word segmentation: performs automatic word segmentation and relevance scoring to improve retrieval accuracy.

-- Search for all entities that contain "cart" in any domain
.entity with(domain='*', name='*', query='cart')

Scanning capabilities

In addition to search mode, USearch supports scanning mode, which reads raw data and allows further filtering and computation through SPL. This is suitable for scenarios that require complex data processing.

-- Count the applications in the China (Hong Kong) region within the APM domain
.entity with(domain='apm', name='apm.service')
| where region_id = 'cn-hongkong'
| stats count = count()

2.3 Query Syntax

Basic syntax structure

.entity with(
    domain='domain_pattern', -- Domain filter pattern
    name='type_pattern', -- Type filter pattern
    query='search_query', -- Query condition
    topk=10, -- Number of results to return
    ids=['id1','id2','id3'] -- Exact ID query
)

Parameter description

Parameter	Type	Description	Example
domain	string	Domain filter, which supports fnmatch patterns.	`'apm'`, `'a*'`, `'*'`
name	string	Type filter, which supports fnmatch patterns.	`'apm.service'`, `'*service*'`, `'k8s.*'`
query	string	Query condition, which supports multiple syntaxes.	`'error'`, `'name:web-app'`
topk	int	Number of records to return (default: 100).	`50`, `200`
ids	string[]	Exact ID query. The input parameters are an array of strings.	`['id1','id2']`
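
When queries are generated by a script, for example from alert payloads, the parameters above can be assembled programmatically. The following Python sketch shows one possible way to do this. `build_entity_query` is a hypothetical helper, not part of any USearch SDK, and it only renders the parameters listed in the table.

```python
def build_entity_query(domain=None, name=None, query=None, topk=None, ids=None):
    """Render an `.entity with(...)` statement from optional parameters (hypothetical helper)."""

    def quote(value):
        # How to escape a single quote inside a value is not described here,
        # so this sketch rejects such values instead of guessing.
        if "'" in value:
            raise ValueError(f"unsupported single quote in value: {value!r}")
        return f"'{value}'"

    parts = []
    if domain is not None:
        parts.append(f"domain={quote(domain)}")
    if name is not None:
        parts.append(f"name={quote(name)}")
    if query is not None:
        parts.append(f"query={quote(query)}")
    if topk is not None:
        parts.append(f"topk={int(topk)}")
    if ids:
        parts.append("ids=[" + ",".join(quote(i) for i in ids) + "]")
    return ".entity with(" + ", ".join(parts) + ")"


statement = build_entity_query(domain="apm", name="apm.service", ids=["id1", "id2"])
```

Parameters that are not set are left out, so the same helper covers ID lookups, full-text searches, and plain scans.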

fnmatch syntax: You can use wildcards to match characters. For example, an asterisk (*) matches any sequence of characters, including none, and a question mark (?) matches exactly one character. For more information, see the fnmatch documentation.

Domain and type filter patterns

-- Examples of matching patterns
.entity with(domain='ac*')            -- Domains starting with "ac"
.entity with(domain='a*c')             -- Domains starting with "a" and ending with "c"
.entity with(name='*instance')       -- Types ending with instance
.entity with(name='k8s.*')         -- All types in the Kubernetes domain
.entity with(domain='*', name='*')        -- All domains and types
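
To check what a pattern would select before you run a query, you can try it locally. The following Python sketch uses the standard `fnmatch` module and assumes that USearch patterns behave like Python's case-sensitive fnmatch matching. The domain and type names are taken from the storage example earlier in this article.

```python
from fnmatch import fnmatchcase

entity_types = ["apm.service", "apm.instance", "k8s.pod", "k8s.service", "acs.ecs.instance"]
domains = ["apm", "k8s", "acs"]


def select(names, pattern):
    """Return the names that match an fnmatch-style pattern (case-sensitive)."""
    return [n for n in names if fnmatchcase(n, pattern)]


ends_with_instance = select(entity_types, "*instance")  # types ending with "instance"
k8s_types = select(entity_types, "k8s.*")               # types starting with "k8s."
a_to_s = select(domains, "a*s")                         # starts with "a", ends with "s"
single_char = select(domains, "k?s")                    # "?" stands for exactly one character
```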

3. Description of Query Patterns

3.1 Exact ID Query

When entity IDs are known, you can use the ids parameter for precise queries:

-- Query entities with specific IDs
.entity with(
    domain='apm',
    name='apm.service',
    ids=['4567bd905a719d197df','973ad511dad2a3f70a']
)

Applicable scenarios:

● Query detailed information based on entity IDs from alerts

● Verify the existence and status of specific entities

● Batch query information about entities with known entity IDs

3.2 Full-text Retrieval Mode

Basic full-text search

-- Simple keyword search
.entity with(query='web application')
-- Multi-term OR relationship (default behavior)
.entity with(query='kubernetes docker container')

Search features:

● Multiple words are connected by an OR relationship, which means that the condition is satisfied if any one of the words appears.

● All fields, including system fields and custom fields, are searched.

● Automatic word segmentation and relevance scoring are performed.
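
The OR semantics can be illustrated with a toy matcher. The following Python sketch is a simplified model, not the USearch implementation: the tokenizer just splits on non-alphanumeric characters, whereas the real word segmentation is not specified here, and the sample entities are made up.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (simplified segmentation)."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def matches_any_term(entity, query):
    """True if at least one query term appears in any field of the entity."""
    terms = set(tokenize(query))
    for value in entity.values():
        if terms & set(tokenize(str(value))):
            return True
    return False


entities = [
    {"service_name": "web-frontend", "description": "Kubernetes ingress gateway"},
    {"service_name": "billing", "description": "Runs in a docker container"},
    {"service_name": "batch-job", "description": "Nightly report"},
]
hits = [e["service_name"] for e in entities if matches_any_term(e, "kubernetes docker container")]
```

The first entity matches on a single term and the second on two terms. Both are returned, and the relevance score decides their order.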

Phrase search

Words joined by hyphens (-) are treated as a phrase and must be matched exactly within the same field:

-- Exact phrase match
.entity with(query='opentelemetry.io/name-fraud-detection')
-- Regular search (matches any individual word)
.entity with(query='opentelemetry.io/name cart')

Field-specific search

Search within specific fields:

-- Search in the description field
.entity with(query='description:"error handling service"')
-- Search in custom properties
.entity with(query='cluster_name:production')
-- Search in labels
.entity with(query='labels.team:backend')

Logical condition combinations

The AND, OR, and NOT logical operators are supported:

-- AND: Both conditions must be satisfied.
.entity with(query='service_name:web AND status:running')
-- OR: Any condition is met.
.entity with(query='environment:prod OR environment:staging')
-- NOT: The condition on the left side is met, and the condition on the right side is not met.
.entity with(query='type:service NOT status:stopped')
-- Complex combination
.entity with(query='(cluster:prod OR cluster:staging) AND NOT status:maintenance')

Special character handling:

● Values that contain special characters, such as vertical bars (|) and colons (:), must be enclosed in double quotation marks (").

● Example: query='description:"ratio is 1:2"'
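
If query strings are composed in code, the quoting rule is easy to forget. The following Python sketch shows one way to apply it consistently. `field_term` and `all_of` are hypothetical helpers, and the set of characters that triggers quoting (space, vertical bar, colon) is an assumption based on the examples in this section.

```python
NEEDS_QUOTING = set(" |:")  # assumed set, based on the examples in this section


def field_term(field, value):
    """Render `field:value`, quoting the value when it contains special characters."""
    if '"' in value:
        # Escaping an embedded double quote is not described here, so do not guess.
        raise ValueError("values containing a double quote are not handled by this sketch")
    needs_quotes = any(ch in NEEDS_QUOTING for ch in value)
    return f'{field}:"{value}"' if needs_quotes else f"{field}:{value}"


def all_of(*terms):
    """Combine terms so that every one of them must match."""
    return " AND ".join(terms)


ratio_query = field_term("description", "ratio is 1:2")
combined = all_of(field_term("status", "running"), field_term("cluster", "prod"))
```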

3.3 Multi-type Joint Search

Joint queries across multiple domains and entity types are supported, with unified scoring and ranking.

-- Search for all entities that contain "cart" in any domain
.entity with(domain='*', name='*', query='cart')
-- Search for entities whose types contain "service" and properties contain "production" in all domains
.entity with(domain='*', name='*service*', query='production')
-- Search for entities whose properties contain "error" or "rate" in specific domains
.entity with(domain='apm', name='apm.*', query='error rate')

3.4 Data Analysis with SPL

In both search mode and scan mode, you can append SPL statements for more advanced data processing.

-- Retrieve the number of applications per language within the APM domain in the China (Hong Kong) region, sorted in descending order by the number of applications.
.entity with(domain='apm', name='apm.service')
| where region_id = 'cn-hongkong'
| stats count = count() by language
| project language, count
| sort count desc
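
For readers who are new to SPL, the pipeline above can be read as filter, group, count, and sort. The following Python sketch reproduces those steps on a few made-up rows. It only illustrates what the pipeline computes and says nothing about how SPL executes it.

```python
from collections import Counter

# Made-up sample rows that stand in for apm.service entities.
services = [
    {"service": "cart", "region_id": "cn-hongkong", "language": "java"},
    {"service": "pay", "region_id": "cn-hongkong", "language": "java"},
    {"service": "search", "region_id": "cn-hongkong", "language": "go"},
    {"service": "mail", "region_id": "cn-hangzhou", "language": "python"},
]

# | where region_id = 'cn-hongkong'
filtered = [s for s in services if s["region_id"] == "cn-hongkong"]
# | stats count = count() by language
counts = Counter(s["language"] for s in filtered)
# | project language, count  followed by  | sort count desc
rows = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
```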

4. Scoring and Sorting Mechanism

4.1 Relevance Scoring

USearch uses a multi-factor scoring algorithm:

Term frequency weight: how often the keyword appears in the document.
Field weight: the importance weight of each field. For example, the name field has a higher weight than the description field.
Document length: Matches in shorter documents typically receive higher scores.
Inverse document frequency: Rare terms are assigned higher weights.
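
How the four factors interact can be seen in a toy scorer. The following Python sketch is not the USearch formula, which is not published in this article. The field weights, the smoothed IDF, and the length normalization are all assumptions chosen to make each factor visible.

```python
import math

FIELD_WEIGHTS = {"name": 2.0, "description": 1.0}  # assumed weights


def tokens(text):
    return text.lower().split()


def idf(term, docs):
    """Smoothed inverse document frequency: rarer terms get larger values."""
    containing = sum(
        1 for d in docs if any(term in tokens(d.get(f, "")) for f in FIELD_WEIGHTS)
    )
    return math.log((len(docs) + 1) / (containing + 1)) + 1.0


def score(doc, query, docs):
    """Sum field weight * normalized term frequency * IDF over terms and fields."""
    total = 0.0
    for term in tokens(query):
        for field, weight in FIELD_WEIGHTS.items():
            field_tokens = tokens(doc.get(field, ""))
            if not field_tokens:
                continue
            tf = field_tokens.count(term)
            # Dividing by the field length favors matches in shorter fields.
            total += weight * (tf / len(field_tokens)) * idf(term, docs)
    return total


docs = [
    {"name": "cart", "description": "shopping service"},
    {"name": "checkout", "description": "calls the cart service before payment"},
    {"name": "inventory", "description": "stock service"},
]
ranked = sorted(docs, key=lambda d: score(d, "cart", docs), reverse=True)
```

In this sample, the entity named "cart" ranks first because the match is in the higher-weighted, shorter name field, while the match inside a longer description scores lower.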

4.2 Sorting Rules

By default, results are sorted in descending order of relevance score. When scores are equal, sorting falls back to timestamp:

-- Default relevance-based sorting
.entity with(query='web service error', topk=20)
-- Custom sorting using SPL
.entity with(query='kubernetes pod')
| sort __last_observed_time__ desc
| limit 50
-- Multi-field sorting
.entity with(domain='apm', name='apm.service')
| sort cluster asc, service_name asc

5. Application Scenarios of Entity Queries

5.1 Scenario 1: Quick Entity Locating and Retrieval

Problem description: If an alert is generated online or you need to search for a specific entity, you must quickly locate the relevant entity instance.

Solution: Select an appropriate query method based on the scenario:

-- Method 1: Perform a precise query by entity ID from the alert.
.entity with(
    domain='apm',
    name='apm.service',
    ids=['4567bd905a719d197df','973ad511dad2a3f70a']
)
-- Method 2: Perform a full-text search based on the keyword.
.entity with(query='user-service error', topk=10)
-- Method 3: Perform a field-specific exact match.
.entity with(query='service_name:user-service')
-- Method 4: Find services owned by a specific team by label.
.entity with(
    domain='apm',
    name='apm.service',
    query='labels.team:backend AND labels.language:java AND status:running'
)

Outcome: You can quickly retrieve complete information about the problematic entities, including their status, properties, and labels. Multiple query methods are supported to meet diverse needs across different scenarios.

5.2 Scenario 2: Cross-domain Joint Search

Problem description: You need to search for entities that contain a specific keyword across multiple domains (APM, Kubernetes, and cloud resources) without switching between systems.

Solution: Use multi-type joint search to perform queries across domains:

-- Search for entities that contain "error" across all domains
.entity with(domain='*', name='*', query='error', topk=50)
-- Search for multiple entity types under domains with a specific prefix
.entity with(domain='apm*', name='*', query='error', topk=50)

Outcome: A unified interface is used to retrieve cross-domain entities, which breaks down data silos and improves query efficiency.

5.3 Scenario 3: Conditional Filtering and Data Analysis

Problem description: You need to identify the entities that meet specific conditions and perform statistical analysis to identify patterns or gain data insights.

Solution: Integrate SPL for conditional filtering and aggregate analysis:

-- Find APM services in Java and collect statistics by cluster.
.entity with(domain='apm', name='apm.service')
| where language='java'
| stats count=count() by cluster
-- Query services that run in the production or staging environment.
.entity with(query='(environment:prod OR environment:staging) AND status:running')
| stats count=count() by environment, cluster
-- Retrieve the number of ARMS production applications in the APM domain across different regions, sorted in descending order by the number of applications.
.entity with(domain='apm', query='environment:prod')
| where telemetry_client='ARMS'
| stats service_count = count() by service, region_id
| project region_id, service, service_count
| sort service_count desc

Outcome: You can quickly identify problematic entities, perform data aggregation and analysis, and uncover data patterns.

6. Performance Optimization Recommendations

6.1 Use Exact Match

Field-specific queries are more efficient than full-text searches:

-- ❌ Full-text search (slow)
.entity with(query='production')
-- ✅ Field-specific query (fast)
.entity with(query='environment:production')

6.2 Avoid Prefix Wildcards

A wildcard at the end of a pattern (suffix wildcard) performs better than a wildcard at the start of a pattern (prefix wildcard):

-- ❌ Prefix wildcard (slow)
.entity with(name='*service')
-- ✅ Suffix wildcard (fast)
.entity with(name='service*')
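
A general reason for this difference is that a sorted index can answer a pattern with a fixed prefix by scanning one contiguous range, whereas a pattern that starts with a wildcard forces a check of every entry. The following Python sketch illustrates the idea with a sorted list. Whether USearch uses such an index internally is an assumption, and the function names are hypothetical.

```python
from bisect import bisect_left

sorted_names = sorted(["apm.instance", "apm.service", "k8s.pod", "k8s.service", "service.mesh"])


def match_trailing_wildcard(names, prefix):
    """Resolve `prefix*` with a binary search plus a short contiguous scan."""
    start = bisect_left(names, prefix)
    out = []
    for name in names[start:]:
        if not name.startswith(prefix):
            break  # sorted order guarantees that no later name can match
        out.append(name)
    return out


def match_leading_wildcard(names, suffix):
    """Resolve `*suffix`: sorted order does not help, so every name is checked."""
    return [n for n in names if n.endswith(suffix)]


k8s_names = match_trailing_wildcard(sorted_names, "k8s.")
service_suffix = match_leading_wildcard(sorted_names, "service")
```

Note that the two patterns select different entities, so you can swap one for the other only when your naming scheme allows it.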

6.3 Use Logical Operators Wisely

Simple AND conditions outperform complex OR conditions:

-- ✅ Simple AND condition
.entity with(query='status:running AND cluster:prod')
-- ⚠️ Complex OR conditions (poor performance)
.entity with(query='name:a OR name:b OR name:c OR name:d')

6.4 Set Appropriate topk

Set the topk value based on the actual requirements to avoid returning unnecessary data:

-- Retrieve only the top 10 results
.entity with(query='error', topk=10)
-- Increase the value only when more results are needed
.entity with(query='error', topk=100)

7. Summary

As the core interface in EntityStore for querying entity instances, entity query provides powerful retrieval and analytical capabilities for observability scenarios. It supports the following capabilities:

Quick entity locating: You can use keywords, IDs, or conditions to find entities in an efficient manner.
Cross-domain retrieval: You can query entity data across multiple domains by using a unified interface.
Exact query: Exact query methods such as field-specific filtering and logical combinations are supported.
Data analysis: You can combine entity queries with SPL to perform complex data filtering and statistical analysis.

These capabilities make entity query an indispensable tool for daily O&M, troubleshooting, and data analysis, and provide a solid foundation for the effective use of observability data.