Large Language Models (LLMs) are widely used in scenarios such as intelligent customer service and virtual assistants. However, their context length limitations create a "short-term memory" problem. By combining the Mem0 framework with its high-performance vector search capabilities, Hologres provides a long-term memory solution for LLMs. This topic describes how to build a long-term memory system for LLMs using Mem0 and Hologres.
Solution overview
The Hologres Mem0 solution leverages the capabilities of Hologres as a real-time data warehouse, including high-concurrency writes and updates, and millisecond-level vector search. This solution builds a vector engine for storing and managing personalized user memories.
In this solution, information such as conversation history, user preferences, and key facts is automatically extracted. This information is then vectorized and persistently stored in a Hologres table. When a user sends a new request, the system retrieves the most relevant memory fragments from Hologres in real time based on the query's semantics. These fragments are then dynamically added to the LLM's input context. This process improves the model's ability to use historical information without feeding it the entire conversation history. It creates a "long-term memory" effect across sessions, helping you build smarter and more reliable enterprise AI applications.
The advantages of using Hologres as the long-term memory solution for Mem0 are as follows:
High-performance vector search: The built-in vector engine in Hologres supports millisecond-level searches on hundreds of millions of vectors. This meets the demands of high-concurrency online inference scenarios.
Real-time writes and updates: Hologres supports real-time writes of tens of thousands of memory records per second. This ensures that the latest user behavior takes effect immediately and avoids memory lag.
Low cost and high availability: Hologres serves as a unified data platform where a single copy of data can be used for vector search, full-text search, OLAP analysis, and online services. This reduces O&M complexity and the total cost of ownership (TCO).
Out-of-the-box and open source: This solution is developed based on the open source Mem0 framework. The code is public, which allows for quick setup and flexible customization.
Solution architecture
The Hologres Mem0 solution includes the following core components:
Memory Extractor
This component is integrated into the application service. It monitors the interactions between the user and the LLM. It identifies and extracts information with long-term value, such as user preferences, factual statements, and task goals. It then generates structured memory entries.
Embedding Engine
This component uses a text embedding model to transform memory entries into high-dimensional vectors. This process ensures that semantically similar memories are close to each other in the vector space.
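The "closeness" of semantically similar memories is typically measured with cosine similarity between their vectors. The following minimal sketch illustrates the idea with hand-made three-dimensional toy vectors standing in for the 1536-dimensional embeddings used later in this topic:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|); close to 1 means
    # the vectors (and, ideally, the texts they encode) are similar.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: the two "food" memories point in similar directions,
# while the "sports" memory points elsewhere.
likes_pizza = [0.9, 0.1, 0.0]
likes_pasta = [0.8, 0.2, 0.1]
likes_football = [0.1, 0.9, 0.2]

print(cosine_similarity(likes_pizza, likes_pasta))     # high similarity
print(cosine_similarity(likes_pizza, likes_football))  # low similarity
```

A real embedding model produces this geometry from text automatically, so "I like pizza" and "User enjoys Italian food" end up near each other even though they share few words.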
Hologres vector storage layer
Hologres serves as the core memory database and provides:
A table schema that supports hybrid storage of vector, text, and scalar fields.
A high-performance HGraph vector index.
Millisecond-level vector similarity search.
Multi-tenant data isolation and access control.
Hybrid search across vector, full-text, and scalar fields.
Memory Retriever
Before each model inference, this component vectorizes the user's current query. It then performs a Top-K similarity search in Hologres to retrieve the most relevant memory fragments.
Context Integrator
This component sorts the retrieved memories by relevance. It then adds them to the prompt in natural language to form an enhanced context for the LLM.
The entire process runs in a closed loop without manual intervention and can be seamlessly integrated into existing LLM application architectures.
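The five components above form a simple retrieve-and-augment loop. The following toy sketch shows the data flow end to end; the `embed` function and in-memory list are illustrative stand-ins for the real embedding model and the Hologres vector table, and none of the function names are Mem0 APIs:

```python
import math

def embed(text):
    # Toy stand-in for a real embedding model: a normalized
    # bag-of-characters vector. Real deployments call an embedding service.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

memory_store = []  # stand-in for the Hologres "memories" table

def add_memory(text, user_id):
    # Memory Extractor + Embedding Engine + storage layer, collapsed:
    # store the extracted memory text together with its vector.
    memory_store.append({"user_id": user_id, "text": text, "vector": embed(text)})

def retrieve(query, user_id, top_k=2):
    # Memory Retriever: vectorize the query, then Top-K similarity search.
    qv = embed(query)
    scored = [
        (sum(a * b for a, b in zip(qv, m["vector"])), m["text"])
        for m in memory_store
        if m["user_id"] == user_id
    ]
    return [text for _, text in sorted(scored, reverse=True)[:top_k]]

def build_prompt(query, user_id):
    # Context Integrator: prepend the relevant memories to the user's query.
    context = "\n".join(f"- {m}" for m in retrieve(query, user_id))
    return f"Known facts about the user:\n{context}\n\nUser question: {query}"

add_memory("User likes sci-fi movies", user_id="alice")
add_memory("User prefers vegetarian food", user_id="alice")
print(build_prompt("Recommend a movie", user_id="alice"))
```

In the production solution, `memory_store` is the Hologres table, the Top-K scan is replaced by an HGraph index search, and the filtering by `user_id` becomes the multi-tenant isolation described above.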
Procedure
Mem0 supports various LLMs, such as OpenAI, Gemini, and DeepSeek. This tutorial uses the Mem0 framework, Alibaba Cloud Model Studio, and Hologres to build a long-term memory system. The steps are as follows.
Obtain the sample code: mem0_hologres.
Connect to your Hologres instance and create the mem0 database.
```sql
CREATE DATABASE mem0;
```

Go to the directory of the sample code. Create and activate a virtual environment, then install the required dependencies.

```shell
cd /home/mem0_hologres
python3.11 -m venv myenv
source myenv/bin/activate
pip install -e .
pip install "psycopg[pool]"
pip install psycopg2-binary
```

Alibaba Cloud Model Studio provides developers with OpenAI-compatible APIs and end-to-end model services. This topic provides the following two examples for testing. For more information about how to obtain a Model Studio API key, see Get an API key.
Example 1: Basic memory operations
```python
from mem0 import Memory

config = {
    "llm": {
        "provider": "openai",
        "config": {
            "openai_base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "<YOUR_DASHSCOPE_API_KEY>",
            "model": "qwen-plus",
        },
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "openai_base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "<YOUR_DASHSCOPE_API_KEY>",
            "model": "text-embedding-v4",
            "embedding_dims": 1536,
        },
    },
    "vector_store": {
        "provider": "hologres",
        "config": {
            "host": "<hologres-host>",
            "port": <hologres-port>,
            "dbname": "mem0",
            "user": "<hologres-username>",
            "password": "<hologres-password>",
            "collection_name": "memories",
        },
    },
}

m = Memory.from_config(config_dict=config)
debug = True

# Using only user_id
result = m.add("I like pizza", user_id="alice")
if debug:
    print(result)

result = m.add("I hate pizza", user_id="alice")
if debug:
    print(result)

result = m.add("I like football", user_id="alice", metadata={"category": "hobbies"})
if debug:
    print(result)

# Using both user_id and agent_id
result = m.add("I like pizza", user_id="alice", agent_id="food-assistant")
if debug:
    print(result)

# Get all memories for a user
result = m.get_all(user_id="alice")
if debug:
    print(result)

# Get all memories for a specific agent belonging to a user
result = m.get_all(user_id="alice", agent_id="food-assistant")
if debug:
    print(result)

# Search memories for a user
result = m.search("tell me my name.", user_id="alice")
if debug:
    print(result)

# Search memories for a specific agent belonging to a user
result = m.search("tell me my name.", user_id="alice", agent_id="food-assistant")
if debug:
    print(result)

# Search memories for a user
result = m.search("what food do you like.", user_id="alice")
if debug:
    print(result)

result = m.search("what sport do you like.", user_id="alice")
if debug:
    print(result)

# Delete all memories for a user
result = m.delete_all(user_id="alice")
if debug:
    print(result)

# Delete all memories for a specific agent belonging to a user
result = m.delete_all(user_id="alice", agent_id="food-assistant")
if debug:
    print(result)
```

Output:
```
{'results': [{'id': 'c62cba4d-2261-4399-9ef2-9f2af8537615', 'memory': 'Likes pizza', 'event': 'ADD'}]}
{'results': [{'id': 'c62cba4d-2261-4399-9ef2-9f2af8537615', 'memory': 'Likes pizza', 'event': 'DELETE'}]}
{'results': [{'id': 'd5099c77-0458-4012-a2be-98b2960a6159', 'memory': 'Likes football', 'event': 'ADD'}]}
{'results': [{'id': 'f4b3a456-b9d4-435b-ae0f-08d0b32630ac', 'memory': 'Likes pizza', 'event': 'ADD'}]}
...
Resetting index memories...
{'message': 'Memories deleted successfully!'}
Resetting index memories...
{'message': 'Memories deleted successfully!'}
```

Example 2: Conversation memory and semantic search
```python
import json
from mem0 import Memory

config = {
    "llm": {
        "provider": "openai",
        "config": {
            "openai_base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "<YOUR_DASHSCOPE_API_KEY>",
            "model": "qwen-plus",
            "temperature": 0.2,
            "max_tokens": 2000,
            "top_p": 1.0,
        },
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "openai_base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "<YOUR_DASHSCOPE_API_KEY>",
            "model": "text-embedding-v4",
            "embedding_dims": 1536,
        },
    },
    "vector_store": {
        "provider": "hologres",
        "config": {
            "host": "<hologres-host>",
            "port": <hologres-port>,
            "dbname": "mem0",
            "user": "<hologres-username>",
            "password": "<hologres-password>",
            "collection_name": "memories",
        },
    },
}

m = Memory.from_config(config_dict=config)

# Add conversation memories
messages = [
    {"role": "user", "content": "I like sci-fi movies, especially Interstellar."},
    {"role": "assistant", "content": "Interstellar is a classic sci-fi movie! I'll remember that you like this type of film."},
    {"role": "user", "content": "I also like other movies by Christopher Nolan."},
    {"role": "assistant", "content": "Nolan's work is indeed outstanding! Besides 'Interstellar', he also directed classics like 'Inception', 'Tenet', and 'Memento'. Do you have any special thoughts on these movies?"},
]

# Add memories and attach metadata
m.add(
    messages,
    user_id="alice",
    metadata={"category": "movies", "tags": ["sci-fi", "preferences"], "importance": "high"},
)

# Search for memories
memories = m.search(query="Recommend a movie I might like", user_id="alice", filters={"category": "movies"})
print(json.dumps(memories, indent=2, ensure_ascii=False))

# Get all memories
all_memories = m.get_all(user_id="alice")
print(json.dumps(all_memories, indent=2, ensure_ascii=False))

# Delete all memories for a user
result = m.delete_all(user_id="alice")
print(result)
```

Output:
```
{
  "results": [
    {
      "id": "3b84616f-f705-4b92-be2d-4374ce3644f9",
      "memory": "Especially likes Interstellar",
      "score": 0.634332,
      "user_id": "alice"
    },
    {
      "id": "210f505f-4d6f-4b0c-ae59-6b55bf2b7cc1",
      "memory": "Likes sci-fi movies",
      "score": 0.604246,
      "user_id": "alice"
    },
    {
      "id": "59265409-1c75-4b49-b2b4-efa8a9af4869",
      "memory": "Likes other movies by Christopher Nolan",
      "score": 0.490508,
      "user_id": "alice"
    }
  ]
}
```