The Python SDK (holo-search-sdk) for Hologres V4.0 supports high-performance vector search and full-text search. It integrates the HGraph algorithm and provides hybrid search capabilities.

Hologres V4.0 enhances vector search with the HGraph vector search algorithm, delivering high performance, high accuracy, and low latency. For more information, see HGraph Index User Guide (recommended). This topic describes how to use the Hologres Python Search SDK (holo-search-sdk) for full-text search and vector search.

Prerequisites

Create an AccessKey. For more information, see Create an AccessKey.
Install Python 3.8 or later.
Grant the required permissions to your account. For more information, see Grant permissions to a RAM user.

Install the SDK

Install the Python SDK using pip. For more information, see holo-search-sdk.

Important

This topic uses holo-search-sdk version 0.3.0. If you encounter errors, verify your SDK version.

Install for the first time
```
pip install holo-search-sdk
```

Check the installed version

# Check current version
pip show holo-search-sdk

# Upgrade if version is earlier than 0.3.0
pip install --upgrade holo-search-sdk
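
The same check can be scripted. The following is a minimal sketch that uses only the standard library (importlib.metadata, available from Python 3.8). The helper names parse_version and sdk_is_current are illustrative, and the comparison assumes plain dotted numeric versions such as 0.3.0.

```python
from importlib import metadata

MIN_VERSION = (0, 3, 0)  # the SDK version this topic was written against


def parse_version(text):
    """Turn '0.3.0' into (0, 3, 0). Assumes purely numeric dotted parts."""
    return tuple(int(part) for part in text.split("."))


def sdk_is_current(package="holo-search-sdk", minimum=MIN_VERSION):
    """Return True or False for an installed package, or None if it is missing."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return parse_version(installed) >= minimum
```

If sdk_is_current() returns False, run the upgrade command shown above. A result of None means the package is not installed in the current environment.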

Usage steps

Connect to Hologres

import holo_search_sdk as holo

# Connect to the database
client = holo.connect(
    host="<HOLO_HOST>",
    port=<HOLO_PORT>,
    database="<HOLO_DBNAME>",
    access_key_id="<ACCESS_KEY_ID>",
    access_key_secret="<ACCESS_KEY_SECRET>",
    schema="public" # Update as needed
)

# Establish the connection
client.connect()

Variable descriptions:

Variable	Description
HOLO_HOST	The network endpoint of your Hologres instance. Go to the Hologres Management Console, then choose HologresInstances > Instance ID/Name > Instance Details > Network Information to get the endpoint.
HOLO_PORT	The port number of your Hologres instance. Go to the Hologres Management Console, then choose HologresInstances > Instance ID/Name > Instance Details > Network Information to get the port number.
HOLO_DBNAME	The name of your Hologres database.
ACCESS_KEY_ID	The AccessKey ID of your Alibaba Cloud account. Go to AccessKey Management to get your AccessKey ID.
ACCESS_KEY_SECRET	The AccessKey secret of your Alibaba Cloud account.
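
To keep the AccessKey pair out of source code, the connection values can be read from environment variables. The following is a sketch, not part of the SDK: load_holo_settings and the environment variable names (including HOLO_SCHEMA) are this example's own convention. Only the keyword names match the holo.connect() call shown earlier.

```python
import os


def load_holo_settings(env=None):
    """Build keyword arguments for holo.connect() from environment variables.

    The variable names used here are an assumption of this sketch.
    """
    env = os.environ if env is None else env
    required = ["HOLO_HOST", "HOLO_PORT", "HOLO_DBNAME",
                "ACCESS_KEY_ID", "ACCESS_KEY_SECRET"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return {
        "host": env["HOLO_HOST"],
        "port": int(env["HOLO_PORT"]),  # the port is passed as a number
        "database": env["HOLO_DBNAME"],
        "access_key_id": env["ACCESS_KEY_ID"],
        "access_key_secret": env["ACCESS_KEY_SECRET"],
        "schema": env.get("HOLO_SCHEMA", "public"),
    }
```

With the variables exported, the connection call becomes client = holo.connect(**load_holo_settings()).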

Create a table

Create the table using DDL. The following is an example:

create_table_sql = """
    CREATE TABLE IF NOT EXISTS <TABLE_NAME> (
        id BIGINT PRIMARY KEY,
        content TEXT,
        vector_column FLOAT4[] CHECK (array_ndims(vector_column) = 1 AND array_length(vector_column, 1) = 3),
        publish_date TIMESTAMP
    );
"""
_ = client.execute(create_table_sql, fetch_result=False)

Note

Replace <TABLE_NAME> with your actual table name.
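
The CHECK constraint in the DDL fixes the vector dimension (3 in the example). If the dimension or table name varies between environments, the statement can be generated. The following is a sketch: build_create_table_sql is a hypothetical helper, and the identifier check is a deliberately strict assumption (letters, digits, and underscores only).

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_create_table_sql(table_name, dim):
    """Return the CREATE TABLE statement above for a given name and dimension."""
    if not _IDENTIFIER.match(table_name):
        raise ValueError(f"unsafe table name: {table_name!r}")
    if not isinstance(dim, int) or dim <= 0:
        raise ValueError("dim must be a positive integer")
    return f"""
    CREATE TABLE IF NOT EXISTS {table_name} (
        id BIGINT PRIMARY KEY,
        content TEXT,
        vector_column FLOAT4[] CHECK (array_ndims(vector_column) = 1 AND array_length(vector_column, 1) = {dim}),
        publish_date TIMESTAMP
    );
    """
```

The returned string can be passed to client.execute(..., fetch_result=False) in the same way as create_table_sql.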

Open a table

You must create the vector table in your Hologres instance before you open it.

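# Column layout of the table created above, shown for reference only.
# This dictionary is not passed to open_table().
columns = {
    "id": ("BIGINT", "PRIMARY KEY"),
    "content": "TEXT",
    "vector_column": "FLOAT4[]",
    "publish_date": "TIMESTAMP"
}
table = client.open_table("<TABLE_NAME>")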

Import text or vector data

data = [
    [1, "Hello world", [0.1, 0.2, 0.3], "2023-01-01"],
    [2, "Python SDK", [0.4, 0.5, 0.6], "2024-01-01"],
    [3, "Vector search", [0.7, 0.8, 0.9], "2025-01-01"]
]
table.insert_multi(data, ["id", "content", "vector_column", "publish_date"])
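
Because the table's CHECK constraint rejects vectors of the wrong length, it can help to validate rows on the client before calling insert_multi. The following is a sketch: validate_rows is a hypothetical helper, and its defaults assume the four-column layout used in this topic.

```python
def validate_rows(rows, dim, n_columns=4, vector_index=2):
    """Raise ValueError on the first row that would not fit the table."""
    for position, row in enumerate(rows):
        if len(row) != n_columns:
            raise ValueError(
                f"row {position}: expected {n_columns} values, got {len(row)}"
            )
        vector = row[vector_index]
        if len(vector) != dim:
            raise ValueError(
                f"row {position}: vector has {len(vector)} dimensions, expected {dim}"
            )
    return rows
```

For example, table.insert_multi(validate_rows(data, dim=3), ["id", "content", "vector_column", "publish_date"]) only reaches the database if every row has four values and a three-dimensional vector.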

Update text or vector data

# Upsert one record
table.upsert_one(
    index_column="id",
    values=[1, "Updated content", [0.3, 0.2, 0.1], "2026-01-01"],
    column_names=["id", "content", "vector_column", "publish_date"],
    update=True  # Update on conflict
)

# Upsert multiple records
table.upsert_multi(
    index_column="id",
    values=[
        [1, "Updated content 1", [0.2, 0.3, 0.4], "2026-02-01"],
        [2, "Updated content 2", [0.6, 0.5, 0.7], "2024-02-01"]
    ],
    column_names=["id", "content", "vector_column", "publish_date"],
    update=True,
    update_columns=["content", "vector_column"]  # Specify columns to update
)
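
The effect of update and update_columns can be pictured with a small in-memory model. This is a sketch, not the SDK's implementation. It assumes the usual INSERT ... ON CONFLICT semantics: on a key conflict, update=True overwrites either the listed update_columns or all non-key columns, and update=False leaves the existing row unchanged. model_upsert is a hypothetical name.

```python
def model_upsert(table, row, column_names, index_column="id",
                 update=True, update_columns=None):
    """Apply one upsert to `table`, a dict that maps key -> row dict."""
    record = dict(zip(column_names, row))
    key = record[index_column]
    if key not in table:
        table[key] = record  # no conflict: plain insert
    elif update:
        # conflict: overwrite only the chosen columns (assumed behavior)
        targets = update_columns or [c for c in column_names if c != index_column]
        for column in targets:
            table[key][column] = record[column]
    # conflict with update=False: the existing row is kept (assumed behavior)
    return table


cols = ["id", "content", "vector_column", "publish_date"]
rows = {}
model_upsert(rows, [1, "Hello world", [0.1, 0.2, 0.3], "2023-01-01"], cols)
model_upsert(rows, [1, "Updated content", [0.3, 0.2, 0.1], "2026-01-01"], cols)
model_upsert(rows, [1, "Updated content 1", [0.2, 0.3, 0.4], "2026-02-01"], cols,
             update_columns=["content", "vector_column"])
print(rows[1])
```

Under these assumptions, the last call changes content and vector_column for id 1 but leaves publish_date at the value written by the previous upsert, because publish_date is not listed in update_columns.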

Set indexes

Set a vector index

table.set_vector_index(
    column="vector_column",
    distance_method="Cosine",
    base_quantization_type="rabitq",
    use_reorder=True,
    max_degree=64,
    ef_construction=400
)

Set a full-text index

# Create a full-text index
table.create_text_index(
    index_name="ft_idx_content",
    column="content",
    tokenizer="jieba"
)

# Modify a full-text index
table.set_text_index(
    index_name="ft_idx_content",
    tokenizer="ik"
)

# Delete a full-text index
table.drop_text_index(index_name="ft_idx_content")

Query data

Vector search

# Vector search
query_vector = [0.2, 0.3, 0.4]

# Limit results
results = table.search_vector(
    vector=query_vector, 
    column="vector_column",
    distance_method="Cosine"
).limit(10).fetchall()

# Set minimum distance
results = table.search_vector(
    vector=query_vector, 
    column="vector_column",
    distance_method="Cosine"
).min_distance(0.5).fetchall()

# Search with output alias
results = table.search_vector(
    vector=query_vector,
    column="vector_column",
    output_name="similarity_score",
    distance_method="Cosine"
).fetchall()
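
To sanity-check results on a small table, the cosine similarity can be computed locally. The following sketch uses the three vectors from the import example (before any upsert) and the query vector above. It computes the standard cosine similarity, where larger means more similar. Whether the SDK's Cosine score follows exactly this convention is not stated in this topic, so treat the values as a reference for ranking only.

```python
import math


def cosine_similarity(a, b):
    """Standard cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query_vector = [0.2, 0.3, 0.4]
sample = {1: [0.1, 0.2, 0.3], 2: [0.4, 0.5, 0.6], 3: [0.7, 0.8, 0.9]}

scores = {key: cosine_similarity(query_vector, vec) for key, vec in sample.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # [2, 1, 3]
```

The three scores are close (about 0.995, 0.993, and 0.987) because the sample vectors point in similar directions.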

Full-text search

# Basic full-text search
results = table.search_text(
    column="content",
    expression="machine learning",
    return_all_columns=True
).fetchall()

# Full-text search with BM25 relevance score
results = table.search_text(
    column="content",
    expression="deep learning",
    return_score=True,
    return_score_name="relevance_score"
).select(["id", "vector_column", "content"]).fetchall()

# Use different search modes

# Keyword mode (default)
results = table.search_text(
    column="content",
    expression="python programming",
    mode="match",
    operator="AND"  # All keywords must be present
).fetchall()

# Phrase mode
results = table.search_text(
    column="content",
    expression="machine learning",
    mode="phrase"  # Exact phrase match
).fetchall()

# Natural language mode
results = table.search_text(
    column="content",
    expression="+python -java",  # Must contain python, must not contain java
    mode="natural_language"
).fetchall()

# Term search
results = table.search_text(
    column="content",
    expression="python",
    mode="term"  # No tokenization or processing. Exact match only.
).fetchall()
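
The natural language example uses + for terms that must appear and - for terms that must not. When these terms come from user input, the expression can be assembled programmatically. The following is a sketch: build_expression is a hypothetical helper that covers only the + and - prefixes shown above and accepts single-word terms only.

```python
def build_expression(must=(), must_not=()):
    """Join terms into a '+term -term' expression string."""
    for term in list(must) + list(must_not):
        if not term or any(ch.isspace() for ch in term):
            raise ValueError(f"single-word terms only: {term!r}")
    parts = [f"+{term}" for term in must] + [f"-{term}" for term in must_not]
    return " ".join(parts)


expression = build_expression(must=["python"], must_not=["java"])
print(expression)  # +python -java
```

The result can be passed as the expression argument together with mode="natural_language".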

Hybrid search

# Full-text + scalar search
results = (
    table.search_text(
        column="content",
        expression="artificial intelligence",
        return_score=True,
        return_score_name="score"
    )
    .where("publish_date > '2023-01-01'")
    .order_by("score", "desc")
    .limit(10)
    .fetchall()
)

# Vector + scalar search
results = (
    table.search_vector(
        vector=query_vector, 
        column="vector_column",
        output_name="similarity_score",
        distance_method="Cosine"
    )
    .where("publish_date > '2023-01-01'")
    .order_by("similarity_score", "desc")
    .limit(10)
    .fetchall()
)
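
The where() clauses above take a filter written as a string. When the cutoff date comes from application code, formatting it from a date object avoids hand-written literals. The following is a sketch: published_after is a hypothetical helper, and the strict column-name check is this example's own precaution.

```python
import re
from datetime import date

_COLUMN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def published_after(day, column="publish_date"):
    """Return a filter such as: publish_date > '2023-01-01'."""
    if not _COLUMN.match(column):
        raise ValueError(f"unsafe column name: {column!r}")
    if not isinstance(day, date):
        raise TypeError("day must be a datetime.date")
    return f"{column} > '{day.isoformat()}'"


print(published_after(date(2023, 1, 1)))  # publish_date > '2023-01-01'
```

The returned string can be used directly, for example .where(published_after(date(2023, 1, 1))).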

Primary key point query

# Query one record by primary key
result = table.get_by_key(
    key_column="id",
    key_value=1,
    return_columns=["id", "content", "vector_column"]  # Optional. Returns all columns if omitted.
).fetchone()

# Query multiple records by primary key list
results = table.get_multi_by_keys(
    key_column="id", 
    key_values=[1, 2, 3],
    return_columns=["id", "content"]  # Optional. Returns all columns if omitted.
).fetchall()

Close the connection

# Close the connection
client.disconnect()

FAQ

Q: You receive this error when importing holo_search_sdk:

import holo_search_sdk as holo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/holo_search_sdk/__init__.py", line 9, in <module>
    from .client import Client, connect
  File "/usr/local/lib/python3.8/site-packages/holo_search_sdk/client.py", line 9, in <module>
    from psycopg.abc import Query
  File "/usr/local/lib/python3.8/site-packages/psycopg/__init__.py", line 9, in <module>
    from . import pq  # noqa: F401 import early to stabilize side effects
  File "/usr/local/lib/python3.8/site-packages/psycopg/pq/__init__.py", line 116, in <module>
    import_from_libpq()
  File "/usr/local/lib/python3.8/site-packages/psycopg/pq/__init__.py", line 108, in import_from_libpq
    raise ImportError(
ImportError: no pq wrapper available.
Attempts made:
- couldn't import psycopg 'c' implementation: No module named 'psycopg_c'
- couldn't import psycopg 'binary' implementation: No module named 'psycopg_binary'
- couldn't import psycopg 'python' implementation: libpq library not found

A: Install psycopg-binary in your Python environment. Run the following command:

pip install psycopg-binary
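
To confirm the fix without importing the SDK, you can check whether the implementation modules named in the error message are visible to the interpreter. The following is a sketch that uses only the standard library. psycopg_backends is a hypothetical helper, and the module names psycopg_c and psycopg_binary are taken from the traceback above.

```python
from importlib import util


def psycopg_backends():
    """Report which psycopg implementation modules can be found."""
    names = {"c": "psycopg_c", "binary": "psycopg_binary"}
    return {label: util.find_spec(module) is not None
            for label, module in names.items()}


print(psycopg_backends())
```

After pip install psycopg-binary succeeds in the same environment, the binary entry should be True.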