Code Context Hologres user guide

Code Context Hologres is a plugin based on the Model Context Protocol (MCP) that gives AI coding assistants codebase-level semantic search. It uses embedding models from DashScope, part of Alibaba Cloud's Model Studio (Bailian), to convert code into vectors and stores them in a Hologres real-time data warehouse. Queries run as a hybrid search that combines BM25 sparse vectors with dense vectors, and the results are re-ranked with Reciprocal Rank Fusion (RRF) to locate code more accurately.

With Code Context Hologres, you can:

Use natural language to search for relevant code snippets across your entire codebase.
Enable your AI assistant to automatically retrieve relevant code context without manual file browsing.
Handle codebases with millions of lines of code and significantly reduce costs by adding only relevant chunks to the AI's context.
Use incremental indexing (based on a Merkle tree) to automatically detect file changes and efficiently update the index.

For more information about MCP, see Model Context Protocol.

Overview

Code Context Hologres provides the following core capabilities:

Capability	Description
Semantic search	Query code using natural language, such as "Find the function that handles user authentication."
Hybrid search	Combines BM25 sparse vectors and dense vectors, then re-ranks results with Reciprocal Rank Fusion to improve accuracy.
Incremental indexing	Detects file changes based on Merkle trees and only re-indexes the modified parts.
AST-aware chunking	Chunks code based on its abstract syntax tree (AST) to preserve its semantic structure.
Scalable storage	Supports codebases with millions of lines of code, powered by the Hologres vector database.
Background indexing	Indexing runs asynchronously in the background, so it does not block code search operations.
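
To illustrate the idea behind Merkle-tree-based incremental indexing, here is a minimal sketch. It assumes a simple binary tree over per-file SHA-256 hashes; the guide does not describe the plugin's actual tree layout, and all function names are hypothetical. If the root hash is unchanged, nothing needs re-indexing; otherwise only files whose leaf hash differs are candidates.

```python
import hashlib
from typing import Dict, Set


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def leaf_hashes(files: Dict[str, bytes]) -> Dict[str, str]:
    """Hash every file's content; these are the leaves of the tree."""
    return {path: sha256(content) for path, content in files.items()}


def merkle_root(leaves: Dict[str, str]) -> str:
    """Fold the sorted leaves pairwise until a single root hash remains."""
    level = [sha256(f"{path}:{digest}".encode()) for path, digest in sorted(leaves.items())]
    if not level:
        return sha256(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [sha256((level[i] + level[i + 1]).encode()) for i in range(0, len(level), 2)]
    return level[0]


def changed_paths(old: Dict[str, str], new: Dict[str, str]) -> Set[str]:
    """Return added, removed, or modified paths; empty if the roots match."""
    if merkle_root(old) == merkle_root(new):
        return set()
    return {p for p in old.keys() | new.keys() if old.get(p) != new.get(p)}


before = leaf_hashes({"a.py": b"print(1)", "b.py": b"print(2)"})
after = leaf_hashes({"a.py": b"print(1)", "b.py": b"print(3)", "c.py": b""})
print(sorted(changed_paths(before, after)))  # ['b.py', 'c.py']
```

Comparing the root first makes the common "nothing changed" case cheap to detect; a production tree would typically also compare intermediate nodes to skip unchanged directories.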

Supported AI coding assistants: Qwen Code, Claude Code, and other MCP-compatible clients such as Cursor, VS Code, and Windsurf.

Supported programming languages: TypeScript, JavaScript, Python, Java, C++, C#, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala, and Markdown.

Supported embedding model providers: DashScope (part of Alibaba Cloud's Model Studio, also known as Bailian), OpenAI, VoyageAI, Google Gemini, and Ollama (local deployment). All examples in this guide use DashScope models.

Prerequisites

Environment requirements

Item	Requirement
Node.js	20.x or 22.x
Package manager	npm (recommended) or pnpm
Operating system	macOS, Linux, or Windows

Note

Code Context Hologres is only compatible with Node.js 20.x and 22.x. If you are using another version, such as 21.x, 23.x, or 24.x, you must switch to a supported version.
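
As a quick way to apply this rule, the following sketch checks the major version in a string such as the one printed by node --version. The helper name is_supported_node is hypothetical and is not part of the plugin.

```python
import re

SUPPORTED_MAJORS = {20, 22}  # from the requirements table above


def is_supported_node(version: str) -> bool:
    """Return True if a version string such as 'v20.11.1' is 20.x or 22.x."""
    match = re.match(r"^v?(\d+)\.\d+\.\d+", version.strip())
    if match is None:
        return False
    return int(match.group(1)) in SUPPORTED_MAJORS


print(is_supported_node("v20.11.1"))  # True
print(is_supported_node("v21.7.3"))   # False
```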

DashScope and Bailian configuration

You need a DashScope API key to use the embedding models. To obtain one:

Sign up for an Alibaba Cloud account.
Activate the Model Studio (Bailian) service.
Create an API key in the Model Studio (Bailian) console. The key starts with sk-.

Available DashScope embedding models:

Model name	Vector dimension	Description	Batch size limit
text-embedding-v4	1024	Latest recommended model	10
text-embedding-v3	1024	Supports multiple dimensions (1024, 768, or 512)	25
text-embedding-v2	1536	Legacy model	25
text-embedding-v1	1536	Legacy model	25
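
The batch size limit is the maximum number of texts that can be sent in one embedding request. The sketch below shows how code chunks could be grouped so that no request exceeds the limit for the chosen model. The limits are copied from the table above; the batches helper and the capping behavior are illustrative assumptions, not the plugin's actual implementation.

```python
from typing import Iterator, List

# Batch size limits copied from the table above.
BATCH_LIMITS = {
    "text-embedding-v4": 10,
    "text-embedding-v3": 25,
    "text-embedding-v2": 25,
    "text-embedding-v1": 25,
}


def batches(chunks: List[str], model: str, requested_size: int) -> Iterator[List[str]]:
    """Yield groups of chunks no larger than the model's batch size limit."""
    size = max(1, min(requested_size, BATCH_LIMITS[model]))
    for start in range(0, len(chunks), size):
        yield chunks[start:start + size]


chunks = [f"chunk-{i}" for i in range(23)]
print([len(b) for b in batches(chunks, "text-embedding-v4", 25)])  # [10, 10, 3]
```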

Hologres instance configuration

You need a Hologres instance to store vector data. Note the following connection details:

Item	Description	Example
Endpoint	The connection endpoint for your Hologres instance.	`hgpostcn-cn-xxx.hologres.aliyuncs.com`
Port	The connection port.	`80` (default)
Database name	The name of the database you created.	`mydb`
AccessKey ID	Your Alibaba Cloud AccessKey ID.	Obtain it from AccessKey Management.
AccessKey Secret	Your Alibaba Cloud AccessKey Secret.	Obtain it from AccessKey Management.

Note

Your Hologres instance must be V4.0 or later.

AI coding assistant installation

Follow the installation instructions for the AI coding assistant you are using:

Qwen Code: Refer to the official Qwen Code documentation to complete the installation.
Claude Code: Refer to the Claude Code documentation to complete the installation.

Installation

Code Context Hologres does not require manual installation. The first time npx code-context-mcp-hologres@latest runs, npx automatically downloads the latest version of the package from npm and executes it.

You can verify that the tool is available by running the following command:

npx code-context-mcp-hologres@latest --help

This command displays all supported environment variables and configuration options.

Credential configuration

Claude Code configuration

CLI method

Run the following command in your terminal to register Code Context Hologres as an MCP server for Claude Code:

claude mcp add code-context-hologres \
    -e EMBEDDING_PROVIDER=DashScope \
    -e DASHSCOPE_API_KEY=sk-your-dashscope-api-key \
    -e EMBEDDING_MODEL=text-embedding-v4 \
    -e EMBEDDING_BATCH_SIZE=10 \
    -e HOLOGRES_HOST=your-hologres-instance.hologres.aliyuncs.com \
    -e HOLOGRES_PORT=80 \
    -e HOLOGRES_DATABASE=your-database-name \
    -e HOLOGRES_USER=your-access-id \
    -e HOLOGRES_PASSWORD=your-access-secret \
  -- npx -y code-context-mcp-hologres@latest

Replace the following placeholders with your actual values:

Placeholder	Replace with
`sk-your-dashscope-api-key`	Your DashScope API key from Model Studio (Bailian).
`your-hologres-instance.hologres.aliyuncs.com`	Your Hologres instance endpoint.
`your-database-name`	Your Hologres database name.
`your-access-id`	Your AccessKey ID.
`your-access-secret`	Your AccessKey Secret.

JSON file configuration

Alternatively, you can add the MCP server by editing the Claude Code configuration file.

Create or edit the ~/.claude.json file and add the following content:

{
  "mcpServers": {
    "code-context-hologres": {
      "command": "npx",
      "args": ["-y", "code-context-mcp-hologres@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "DashScope",
        "DASHSCOPE_API_KEY": "sk-your-dashscope-api-key",
        "EMBEDDING_MODEL": "text-embedding-v4",
        "EMBEDDING_BATCH_SIZE": "10",
        "HOLOGRES_HOST": "your-hologres-instance.hologres.aliyuncs.com",
        "HOLOGRES_PORT": "80",
        "HOLOGRES_DATABASE": "your-database-name",
        "HOLOGRES_USER": "your-access-id",
        "HOLOGRES_PASSWORD": "your-access-secret"
      }
    }
  }
}
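
If ~/.claude.json already contains other settings, the new entry should be merged into the file rather than replacing its content. The following sketch shows one way to do that; the add_mcp_server helper is hypothetical, and the demo writes to a temporary file instead of your real configuration.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, server: dict) -> dict:
    """Insert or update one entry under "mcpServers", keeping all other keys."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})[name] = server
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


server = {
    "command": "npx",
    "args": ["-y", "code-context-mcp-hologres@latest"],
    "env": {"EMBEDDING_PROVIDER": "DashScope"},  # add the remaining variables here
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude.json"
    path.write_text('{"theme": "dark"}', encoding="utf-8")  # pre-existing setting
    merged = add_mcp_server(path, "code-context-hologres", server)
    print(sorted(merged))  # ['mcpServers', 'theme']
```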

Verification

Run the following command to confirm that the MCP server is registered:

claude mcp list

The output should list the code-context-hologres server.

Qwen Code configuration

CLI method

Run the following command in your terminal:

qwen mcp add \
    -t stdio \
    -e EMBEDDING_PROVIDER=DashScope \
    -e DASHSCOPE_API_KEY=sk-your-dashscope-api-key \
    -e EMBEDDING_MODEL=text-embedding-v4 \
    -e EMBEDDING_BATCH_SIZE=10 \
    -e HOLOGRES_HOST=your-hologres-instance.hologres.aliyuncs.com \
    -e HOLOGRES_PORT=80 \
    -e HOLOGRES_DATABASE=your-database-name \
    -e HOLOGRES_USER=your-access-id \
    -e HOLOGRES_PASSWORD=your-access-secret \
    code-context-hologres \
    npx -y code-context-mcp-hologres@latest

JSON file configuration

Create or edit the ~/.qwen/settings.json file and add the following content:

{
  "mcpServers": {
    "code-context-hologres": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "code-context-mcp-hologres@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "DashScope",
        "DASHSCOPE_API_KEY": "sk-your-dashscope-api-key",
        "EMBEDDING_MODEL": "text-embedding-v4",
        "EMBEDDING_BATCH_SIZE": "10",
        "HOLOGRES_HOST": "your-hologres-instance.hologres.aliyuncs.com",
        "HOLOGRES_PORT": "80",
        "HOLOGRES_DATABASE": "your-database-name",
        "HOLOGRES_USER": "your-access-id",
        "HOLOGRES_PASSWORD": "your-access-secret"
      }
    }
  }
}

Verification

Check the list of MCP servers in Qwen Code to confirm that code-context-hologres is listed.

Global configuration file

If you use Code Context Hologres with multiple MCP clients, you can manage your credentials in a single global configuration file instead of setting environment variables for each client.

Create a ~/.context/.env file with the following content:

# DashScope embedding model configuration
EMBEDDING_PROVIDER=DashScope
DASHSCOPE_API_KEY=sk-your-dashscope-api-key
EMBEDDING_MODEL=text-embedding-v4
EMBEDDING_BATCH_SIZE=10

# Hologres database configuration
HOLOGRES_HOST=your-hologres-instance.hologres.aliyuncs.com
HOLOGRES_PORT=80
HOLOGRES_DATABASE=your-database-name
HOLOGRES_USER=your-access-id
HOLOGRES_PASSWORD=your-access-secret

After creating the global configuration file, you can simplify the MCP registration command:

Claude Code:

claude mcp add code-context-hologres -- npx -y code-context-mcp-hologres@latest

Qwen Code:

qwen mcp add \
    -t stdio \
    code-context-hologres \
    npx -y code-context-mcp-hologres@latest

Note

Configuration values are resolved in the following order of priority, from highest to lowest:

Process environment variables (set with the -e flag).
Variables in the ~/.context/.env file.
Default values.
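
The priority order can be sketched as a lookup that tries each source in turn. This is an illustration only: the parse_env_file and resolve helpers are hypothetical, the .env parsing is deliberately simplified, and the defaults shown are the ones listed in the environment variable tables of this guide.

```python
from typing import Dict, Mapping, Optional

# Defaults taken from the environment variable tables in this guide.
DEFAULTS = {"HOLOGRES_PORT": "80", "EMBEDDING_BATCH_SIZE": "10", "HYBRID_MODE": "true"}


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve(name: str, process_env: Mapping[str, str], file_env: Mapping[str, str]) -> Optional[str]:
    """Process environment wins, then the .env file, then the default."""
    for source in (process_env, file_env, DEFAULTS):
        if name in source:
            return source[name]
    return None


file_env = parse_env_file("# Hologres\nHOLOGRES_PORT=443\nHOLOGRES_DATABASE=mydb\n")
print(resolve("HOLOGRES_PORT", {"HOLOGRES_PORT": "8080"}, file_env))  # 8080
print(resolve("HOLOGRES_PORT", {}, file_env))                         # 443
print(resolve("EMBEDDING_BATCH_SIZE", {}, file_env))                  # 10
```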

Environment variables

DashScope embedding model configuration

Variable	Required	Default	Description
`EMBEDDING_PROVIDER`	No	`OpenAI`	The embedding model provider. Set this to `DashScope`.
`DASHSCOPE_API_KEY`	Yes (if using DashScope)	None	Your DashScope API key from Model Studio (Bailian).
`EMBEDDING_MODEL`	No	`text-embedding-v4`	The name of the embedding model.
`EMBEDDING_BATCH_SIZE`	No	`10`	The embedding batch size. For `text-embedding-v4`, we recommend setting this to `10`.
`DASHSCOPE_BASE_URL`	No	`https://dashscope.aliyuncs.com/compatible-mode/v1`	The DashScope API endpoint. You usually do not need to change this.

Hologres database configuration

Variable	Required	Default	Description
`HOLOGRES_HOST`	Yes	None	The Hologres instance endpoint.
`HOLOGRES_PORT`	No	`80`	The connection port.
`HOLOGRES_DATABASE`	Yes	None	The database name.
`HOLOGRES_USER`	Yes	None	Your AccessKey ID.
`HOLOGRES_PASSWORD`	Yes	None	Your AccessKey Secret.

Advanced configuration

Variable	Required	Default	Description
`HYBRID_MODE`	No	`true`	Enables hybrid search (BM25 + dense vector).
`CUSTOM_EXTENSIONS`	No	None	Additional file extensions, separated by commas. For example, `.vue,.svelte`.
`CUSTOM_IGNORE_PATTERNS`	No	None	Additional ignore patterns, separated by commas. For example, `*.test.ts,__mocks__`.

Usage

Basic workflow

Open your AI coding assistant (Qwen Code or Claude Code) in your project's directory.
Ask the AI to index your codebase.
Check the indexing status and wait for it to complete.
Search your code using natural language.

Codebase indexing

Indexing a codebase involves splitting project files into chunks, generating vector embeddings, and storing them in Hologres. The process runs asynchronously, returning a response immediately after it starts.

Simply ask the AI in natural language:

> Index this codebase

The AI automatically calls the index_codebase tool, which supports the following parameters:

Parameter	Required	Default	Description
`path`	Yes	None	The absolute path to the codebase. The AI automatically uses the current working directory.
`force`	No	`false`	Whether to force re-indexing.
`splitter`	No	`ast`	The code splitter: `ast` (AST-aware) or `langchain` (character-based).
`customExtensions`	No	None	A list of additional file extensions, such as `[".vue", ".svelte"]`.
`ignorePatterns`	No	None	A list of additional ignore patterns.

Example: Include additional file types during indexing.

> Index this codebase, and include .vue and .svelte files

Example: Force re-indexing.

> Force re-indexing of the current codebase

Indexing status

You can check the indexing progress at any time while it runs in the background. The AI calls the get_indexing_status tool.

> Check indexing status

Possible statuses:

Status	Description
`indexed`	Indexing is complete and the codebase is ready for search. Shows the number of indexed files and chunks.
`indexing`	Indexing is in progress. Shows the percentage complete.
`indexfailed`	Indexing failed. Shows an error message. You can re-run the indexing command.
`not_found`	The codebase has not been indexed yet. You must run the index command first.
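
The statuses map naturally onto a small polling loop. In the sketch below, get_status is a hypothetical callable that stands in for a call to the get_indexing_status tool; in practice the AI assistant makes this call for you, so the code only illustrates how each status should be handled.

```python
from typing import Callable

# What to do next for each status listed in the table above.
NEXT_ACTION = {
    "indexed": "search the codebase",
    "indexing": "wait and check again",
    "indexfailed": "read the error message, then re-run indexing",
    "not_found": "run the index command first",
}


def wait_until_settled(
    get_status: Callable[[], str],
    sleep: Callable[[float], None],
    interval: float = 5.0,
    max_polls: int = 100,
) -> str:
    """Poll until the status is no longer 'indexing' or max_polls is reached."""
    for _ in range(max_polls):
        status = get_status()
        if status != "indexing":
            return status
        sleep(interval)
    return "indexing"


# Demo with canned responses instead of a real MCP call.
responses = iter(["indexing", "indexing", "indexed"])
final = wait_until_settled(lambda: next(responses), sleep=lambda seconds: None)
print(final, "->", NEXT_ACTION[final])  # indexed -> search the codebase
```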

Code search

Search the indexed codebase using a natural language query. The AI calls the search_code tool.

> Find the function that handles user authentication

The search_code tool supports the following parameters:

Parameter	Required	Default	Description
`path`	Yes	None	The absolute path to the codebase. The AI gets this automatically.
`query`	Yes	None	The natural language query.
`limit`	No	`10`	The number of results to return (maximum 50).
`extensionFilter`	No	None	Filter by file extension, such as `[".py"]`.
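
For reference, the sketch below assembles an argument object with the parameter names from the table above. The build_search_args helper is hypothetical, and clamping limit on the client side is an assumption; the guide only states that the maximum is 50, not how the tool treats larger values.

```python
from typing import List, Optional

DEFAULT_LIMIT = 10
MAX_LIMIT = 50  # maximum stated in the parameter table


def build_search_args(
    path: str,
    query: str,
    limit: int = DEFAULT_LIMIT,
    extension_filter: Optional[List[str]] = None,
) -> dict:
    """Build search_code arguments, keeping limit within 1..MAX_LIMIT."""
    if not query.strip():
        raise ValueError("query must not be empty")
    args = {"path": path, "query": query, "limit": max(1, min(limit, MAX_LIMIT))}
    if extension_filter:
        args["extensionFilter"] = extension_filter
    return args


print(build_search_args("/path/to/your-project", "error handling logic",
                        limit=80, extension_filter=[".py"]))
```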

More search examples:

> Find all code related to database connections

> Search for error handling logic, but only in .py files

> Find code snippets related to the payment process

> Search for the implementation code of Hologres vector retrieval

Note

You can search while indexing is in progress, but the results may be incomplete. Results are more accurate after indexing is complete.

Clear the index

If you need to clear the index data for a codebase (for example, after switching embedding models), you can ask:

> Clear the index for the current codebase

The AI calls the clear_index tool to delete the index data from Hologres.

Example (Claude Code)

$ cd /path/to/your-project
$ claude

> Index this codebase
code-context-hologres: Started background indexing for codebase '/path/to/your-project'
using AST splitter. Indexing is running in the background.

> Check indexing status
code-context-hologres: Codebase '/path/to/your-project' is currently being indexed.
Progress: 45.2% (Processing files and generating embeddings...)

> Check indexing status
code-context-hologres: Codebase '/path/to/your-project' is fully indexed and ready
for search. Statistics: 128 files, 1,536 chunks.

> Find the function that handles user login authentication
code-context-hologres: Found 5 results for query "user login authentication":
1. Code snippet (TypeScript) [your-project]
   Location: src/auth/login.ts:15-42
   ...

Example (Qwen Code)

$ cd /path/to/your-project
$ qwen

> Index this codebase
code-context-hologres: Started background indexing for codebase '/path/to/your-project'
using AST splitter. Indexing is running in the background.

> Check indexing status
code-context-hologres: Codebase '/path/to/your-project' is fully indexed and ready
for search. Statistics: 128 files, 1,536 chunks.

> Find the function that handles user login authentication
code-context-hologres: Found 5 results for query "user login authentication":
1. Code snippet (TypeScript) [your-project]
   Location: src/auth/login.ts:15-42
   ...

Advanced configuration

Switching models

You can switch DashScope embedding models by modifying the EMBEDDING_MODEL environment variable.

For example, to use the text-embedding-v3 model:

# Modify in the MCP registration command
-e EMBEDDING_MODEL=text-embedding-v3

Or modify it in ~/.context/.env:

EMBEDDING_MODEL=text-embedding-v3

Important

Different models may have different vector dimensions (for example, text-embedding-v4 has 1024 dimensions, while text-embedding-v2 has 1536). After switching models, you must clear the existing index and re-index the codebase.

Custom file extensions

By default, Code Context Hologres supports common programming language files. If you need to index additional file types (such as .vue, .svelte, or .astro), you have two options:

Method 1: Configure globally with the CUSTOM_EXTENSIONS environment variable.

CUSTOM_EXTENSIONS=.vue,.svelte,.astro

Method 2: Specify during indexing using natural language.

> Index this codebase and include .vue and .svelte files

Custom ignore patterns

Code Context Hologres automatically reads the .gitignore and .contextignore files in your project, as well as the global ~/.context/.contextignore file. You can also add extra ignore rules in the following ways:

Method 1: Use the CUSTOM_IGNORE_PATTERNS environment variable.

CUSTOM_IGNORE_PATTERNS=*.test.ts,__mocks__,*.generated.ts

Method 2: Specify during indexing using natural language.

> Index this codebase, ignoring *.test.ts files and __mocks__ directories
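
To show how such patterns could be applied, the sketch below splits a CUSTOM_IGNORE_PATTERNS value and tests each path component against it. It assumes shell-style glob matching through Python's fnmatch module; the plugin's actual matcher may follow .gitignore semantics and behave differently. Both helper names are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List


def parse_patterns(value: str) -> List[str]:
    """Split a comma-separated setting, trimming whitespace and dropping empties."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_ignored(relative_path: str, patterns: List[str]) -> bool:
    """Return True if any path component matches any pattern."""
    parts = PurePosixPath(relative_path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)


patterns = parse_patterns("*.test.ts,__mocks__,*.generated.ts")
for path in ["src/auth/login.ts", "src/auth/login.test.ts", "src/__mocks__/db.ts"]:
    print(path, is_ignored(path, patterns))
```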

Code splitters

Code Context Hologres supports two code splitters:

Splitter	Description	Use case
`ast` (default)	Semantic chunking based on the AST, which preserves code structure.	Recommended. Produces higher-quality chunks.
`langchain`	Chunking based on character count.	A fallback option if the AST splitter is not available.

You typically do not need to specify this manually. The system uses the ast splitter by default and automatically falls back to the langchain splitter when the AST splitter is not available.
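
The difference between the two strategies can be illustrated with a small sketch. It uses Python's standard ast module on Python source purely as an example; the guide does not say which parser the plugin uses, and both function names are hypothetical. The AST-based version returns whole functions and classes, while the character-based version cuts at fixed offsets regardless of structure.

```python
import ast
from typing import List, Tuple


def ast_chunks(source: str) -> List[Tuple[str, int, int]]:
    """Return (name, start_line, end_line) for each top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append((node.name, node.lineno, node.end_lineno))
    return chunks


def char_chunks(source: str, size: int) -> List[str]:
    """Split text into fixed-size pieces, ignoring code structure."""
    return [source[i:i + size] for i in range(0, len(source), size)]


SAMPLE = '''def login(user):
    return check(user)

class Session:
    def close(self):
        pass
'''

print(ast_chunks(SAMPLE))  # [('login', 1, 2), ('Session', 4, 6)]
print(len(char_chunks(SAMPLE, 40)))  # pieces may cut through a definition
```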

Hybrid search configuration

Code Context Hologres enables hybrid search by default (BM25 sparse vector + dense vector) for higher search accuracy. If you need to disable it and use only dense vectors, set the following:

HYBRID_MODE=false

Note

For the best search results, we recommend keeping hybrid search enabled (the default setting).
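
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the standard formula: each result receives a score of 1 / (k + rank) from every ranked list it appears in, and the scores are summed. The constant k = 60 is the value commonly used in the literature; the value used by the plugin is not stated in this guide, and the file names are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List


def rrf(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists into one, best result first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by identifier for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_results = ["login.ts", "session.ts", "token.ts"]   # sparse (keyword) ranking
dense_results = ["auth.ts", "login.ts", "token.ts"]     # dense (semantic) ranking
print(rrf([bm25_results, dense_results]))
# ['login.ts', 'token.ts', 'auth.ts', 'session.ts']
```

Results that rank well in both lists rise to the top, which is why login.ts and token.ts outrank items that appear in only one list.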

FAQ

Q1: What should I do if I encounter an error indicating that DASHSCOPE_API_KEY is not configured during indexing?

Check the following:

Ensure that -e DASHSCOPE_API_KEY=sk-xxx is correctly set in the MCP registration command.
If you are using a global configuration file, check that the ~/.context/.env file exists and contains DASHSCOPE_API_KEY.
The API key format should start with sk-.

Q2: How do I troubleshoot a failed indexing job?

Use "Check indexing status" to view the error message.
Common causes include a failed Hologres connection (check HOLOGRES_HOST, HOLOGRES_USER, and HOLOGRES_PASSWORD), an invalid DashScope API key, or an insufficient account balance.
If indexing fails, you can re-run the indexing command without clearing the data manually.

Q3: Does it support multiple projects or codebases?

Yes. Code Context Hologres automatically detects the current working directory, so you do not need to specify paths manually. You can index and search different projects in their respective directories, and the system automatically manages a separate index for each codebase. Additionally, a background sync process runs every five minutes to detect file changes and perform incremental updates.

Q4: How do I force re-indexing?

Simply tell the AI to "force re-indexing of the current codebase." Alternatively, you can specify the force=true parameter when indexing.

Q5: What should I do if I encounter the DashScope API rate limit?

The text-embedding-v4 model has a batch size limit of 10, so keep EMBEDDING_BATCH_SIZE at 10 or lower when you use it. If you still encounter rate limiting when indexing a large codebase, try reducing the value of EMBEDDING_BATCH_SIZE further.

Q6: How do I reconnect to the MCP server in Claude Code?

If the MCP server disconnects or its configuration is updated, use the following command:

/mcp reconnect code-context-hologres

Q7: How do I switch to a different embedding model provider?

Modify the EMBEDDING_PROVIDER environment variable. For example, to switch to Ollama for local models:

EMBEDDING_PROVIDER=Ollama
OLLAMA_HOST=http://127.0.0.1:11434
OLLAMA_MODEL=nomic-embed-text

After switching, you must clear the existing index and re-index your codebase.

Q8: What is the maximum number of chunks that can be indexed?

A single codebase can have a maximum of 450,000 indexed chunks. If this limit is reached, indexing stops automatically and a notification is shown.

Important notes

Node.js version restrictions: Only Node.js 20.x and 22.x are supported. Versions such as 21.x, 23.x, and 24.x are not compatible.
API key security: Never hard-code your API keys in your codebase. We recommend using the global configuration file at ~/.context/.env or passing credentials through your MCP client's environment variable mechanism.
Data security: Vector data is stored in your Hologres instance, which is under your Alibaba Cloud account. Code Context Hologres does not store or transmit your code content.
Model switching: After changing the embedding model, you must clear the index and re-index your codebase, as the vector dimensions may be different.
Version updates: After updating the MCP server version, reconnect to it. In Claude Code, use the /mcp reconnect code-context-hologres command; in Qwen Code, restart the server.
Initial indexing time: The first time you index a large codebase, it may take a significant amount of time, depending on the size of the codebase and the DashScope API response speed. Indexing runs in the background, and you can start searching during this process, although results may be incomplete.
