Implement Persistent AI Agent Memory Using PolarDB PolarSearch

AI agents frequently lose context between sessions—forgetting user preferences, past instructions, and conversation history. The PolarSearch Memory Container solves this by providing long-term and short-term memory management built into PolarDB for MySQL. Agents automatically structure and persist conversation data, then retrieve relevant memories using semantic search to deliver continuous, personalized service.

The PolarSearch Memory Container feature is in canary release. To request access, submit a ticket.

How it works

The Memory Container converts unstructured conversation data into searchable structured memory using a three-tier storage architecture and an automated processing pipeline.

Three-tier storage architecture

Memory type	Index type	Best for	Trade-offs
Short-term memory (Working Memory)	Inverted index	Fast retrieval of recent turns; keyword-based search within the current session	Not suited for semantic queries or long-term retention across sessions
Long-term memory	Vector index + inverted index	Deep knowledge retrieval across sessions; personalization using semantic search (KNN)	Higher write latency due to LLM fact extraction and embedding generation
Memory history	Inverted index	Compliance and traceability; debugging memory changes over time	Read-only; not used in retrieval

Memory processing flow

When an agent receives input, the Memory Container runs three phases automatically:

Key benefits

Capability	Traditional approach	PolarSearch solution
Memory utilization	Less than 30% of historical data is retained	Increases effective information utilization to over 85% through LLM fact extraction and semantic indexing
Search efficiency	Keyword matching, accuracy of 60% or less	Hybrid vector and inverted indexing with reranking achieves semantic search accuracy of over 95%
Enterprise compliance	Multi-tenant isolation and permission control are complex to implement	Native multi-tenant isolation, role-based access control (RBAC), and operation audit trail
System scalability	Single-point storage capped at the terabyte level	Cloud-native distributed storage scales to the petabyte level
Development cost	Custom development with long implementation time	Out-of-the-box APIs and SDKs for rapid integration

Prerequisites

Before you begin, make sure you have:

A PolarSearch search node added to your cluster. See Add a PolarSearch search node.
A search node administrator account. See Create a search node administrator account.

Get started

This guide walks you through the complete setup: configuring your environment, enabling the plugin, registering models, creating a memory container, and storing and verifying your first memory.

Workflow: Configure environment → Enable plugin → Register models → Create memory container → Store and verify memory

Step 1: Configure access credentials

Set the following environment variables before running any commands. This avoids repeating credentials in each request.

Variable	Description	Example
`POLARSEARCH_HOST_PORT`	Connection address and port of the PolarSearch node	`pc-xxx.polardbsearch.rds.aliyuncs.com:3001`
`USER_PASSWORD`	Administrator username and password for the PolarSearch node, in `username:password` format	`polarsearch_user:your_password`
`YOUR_API_KEY`	API key for Alibaba Cloud Model Studio	`sk-xxxxxxxxxxxxxxxxxxxxxxxx`

Run the following commands in your terminal. Replace the example values with your actual values.

export POLARSEARCH_HOST_PORT="pc-xxx.polardbsearch.rds.aliyuncs.com:3001"
export USER_PASSWORD="polarsearch_user:your_password"
export YOUR_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxxx"
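
The same variables can also be read from Python. The following is a minimal sketch that assumes the three variable names from the table above; `load_config` and `PolarSearchConfig` are hypothetical helpers written for this guide, not part of any PolarSearch SDK.

```python
import os
from typing import Mapping, NamedTuple


class PolarSearchConfig(NamedTuple):
    host_port: str
    username: str
    password: str
    api_key: str


def load_config(env: Mapping[str, str] = os.environ) -> PolarSearchConfig:
    """Read the Step 1 variables and split USER_PASSWORD into its two parts."""
    required = ("POLARSEARCH_HOST_PORT", "USER_PASSWORD", "YOUR_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    # Split on the first colon only, so the password itself may contain ':'.
    username, separator, password = env["USER_PASSWORD"].partition(":")
    if not separator:
        raise ValueError("USER_PASSWORD must look like 'username:password'")
    return PolarSearchConfig(
        env["POLARSEARCH_HOST_PORT"], username, password, env["YOUR_API_KEY"]
    )
```

Failing early on a missing variable tends to give a clearer error than an authentication failure from a later request.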

Step 2: Enable the Memory Container plugin

Run the following command to enable the Memory Container feature.

Command line

curl -XPUT "http://${POLARSEARCH_HOST_PORT}/_cluster/settings" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "persistent": {
    "plugins.ml_commons.agentic_memory_enabled": true
  }
}'

Dashboard

PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.agentic_memory_enabled": true
  }
}
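
For scripted setups, the same request can be prepared in Python. This is a sketch using only the standard library; `build_enable_request` is a hypothetical helper name, and the request is built but not sent so that it can be inspected first.

```python
import base64
import json
import urllib.request


def build_enable_request(host_port: str, user_password: str) -> urllib.request.Request:
    """Prepare (but do not send) the PUT _cluster/settings request from Step 2."""
    body = {"persistent": {"plugins.ml_commons.agentic_memory_enabled": True}}
    # HTTP Basic auth is the base64 encoding of "username:password",
    # which is what curl --user produces.
    token = base64.b64encode(user_password.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"http://{host_port}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# To send it: urllib.request.urlopen(build_enable_request(host_port, user_password))
```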

Step 3: Register models

The Memory Container requires two models: an embedding model for semantic search and a large language model (LLM) for fact extraction. Register both with PolarSearch before creating a container.

Register an embedding model

The embedding model converts text into vectors for semantic search. This example uses the text-embedding-v4 model from Alibaba Cloud Model Studio.

1. Add the model endpoint to the trusted list.

Command line

curl -XPUT "http://${POLARSEARCH_HOST_PORT}/_cluster/settings" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "persistent": {
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^https://dashscope.aliyuncs.com/compatible-mode/v1/.*$"
    ]
  }
}'

Dashboard

PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^https://dashscope.aliyuncs.com/compatible-mode/v1/.*$"
    ]
  }
}
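
Before creating connectors, it can help to check locally that the connector URLs match the trusted pattern. The sketch below uses Python's `re` module as a stand-in for the server-side check, which is an assumption; `is_trusted` is a hypothetical helper.

```python
import re

# Pattern copied from the cluster setting above.
TRUSTED_ENDPOINT_PATTERNS = [
    r"^https://dashscope.aliyuncs.com/compatible-mode/v1/.*$",
]


def is_trusted(url: str, patterns=TRUSTED_ENDPOINT_PATTERNS) -> bool:
    """Return True if the URL fully matches at least one trusted pattern."""
    return any(re.fullmatch(pattern, url) for pattern in patterns)


embeddings_url = "https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings"
chat_url = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"
```

Both URLs used later in this guide match. Note that the dots in the pattern are unescaped, so each one matches any character; writing them as `\.` would make the pattern stricter.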

2. Create a model connector.

The pre_process_function and post_process_function parameters adapt request and response formats for different model services. This example uses the built-in openai.embedding format converter, which matches the format of Alibaba Cloud Model Studio in compatible mode.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/connectors/_create" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "qwen embedding connector",
  "description": "The connector to qwen embedding model",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-v4",
    "endpoint": "dashscope.aliyuncs.com/compatible-mode"
  },
  "credential": {
    "api_key": "'"${YOUR_API_KEY}"'"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}",
        "content-type": "application/json"
      },
      "url": "https://${parameters.endpoint}/v1/embeddings",
      "request_body": "{ \"model\": \"${parameters.model}\", \"input\": ${parameters.input} }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    }
  ]
}'

Dashboard

Important

Replace <YOUR_API_KEY> with your actual API key for Alibaba Cloud Model Studio.

POST _plugins/_ml/connectors/_create
{
  "name": "qwen embedding connector",
  "description": "The connector to qwen embedding model",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-v4",
    "endpoint": "dashscope.aliyuncs.com/compatible-mode"
  },
  "credential": {
    "api_key": "<YOUR_API_KEY>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}",
        "content-type": "application/json"
      },
      "url": "https://${parameters.endpoint}/v1/embeddings",
      "request_body": "{ \"model\": \"${parameters.model}\", \"input\": ${parameters.input} }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    }
  ]
}

The response returns a connector_id. Record it for the next step.

{"connector_id": "LBFt6ZsBk04xxx"}
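
The `${parameters.*}` placeholders in the connector's url and request_body are filled in when the model is called. The sketch below illustrates the idea with a simplified substitution; it is not the plugin's actual implementation, and `render` is a hypothetical helper.

```python
import json
import re

PLACEHOLDER = re.compile(r"\$\{parameters\.(\w+)\}")


def render(template: str, parameters: dict) -> str:
    """Replace each ${parameters.<name>} with the matching value."""
    def substitute(match: re.Match) -> str:
        value = parameters[match.group(1)]
        # Strings are inserted as-is; lists and numbers are inserted as JSON.
        return value if isinstance(value, str) else json.dumps(value)
    return PLACEHOLDER.sub(substitute, template)


params = {
    "model": "text-embedding-v4",
    "endpoint": "dashscope.aliyuncs.com/compatible-mode",
    "input": ["Bob likes swimming."],
}
url = render("https://${parameters.endpoint}/v1/embeddings", params)
body = render('{ "model": "${parameters.model}", "input": ${parameters.input} }', params)
```

This also shows why `${parameters.input}` is not wrapped in quotes in request_body: it expands to a JSON array, not a string.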

3. Register the connector as a model.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/_register" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "qwen embedding model",
  "function_name": "remote",
  "description": "Embedding model for memory",
  "connector_id": "LBFt6ZsBk04xxx"
}'

Dashboard

POST _plugins/_ml/models/_register
{
  "name": "qwen embedding model",
  "function_name": "remote",
  "description": "Embedding model for memory",
  "connector_id": "LBFt6ZsBk04xxx"
}

The response returns a model_id. Record it—you will need it when creating the memory container.

{"task_id": "LRFx6ZsBk04ixxx", "status": "CREATED", "model_id": "LhFx6ZsBk04xxx"}

Important

When creating a memory container, always use the model_id, not the connector_id. The connector_id points to the external model service. The model_id is the internal ID that PolarSearch assigns after you register the connector.

4. Test the embedding model.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/_predict/text_embedding/<model ID from previous step>" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'{
  "text_docs": ["Bob likes swimming. Context: He expressed his interest in swimming."],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}'

Dashboard

POST _plugins/_ml/_predict/text_embedding/<model ID from previous step>
{
  "text_docs": ["Bob likes swimming. Context: He expressed his interest in swimming."],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}

A successful response returns a 1024-dimensional vector in the sentence_embedding field with a status_code of 200.

{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [1024],
          "data": [0.019752666354179382, -0.03468115255236626, 0.05591931194067001, ...]
        }
      ],
      "status_code": 200
    }
  ]
}
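
A script can run the same check on the response. The following sketch assumes the response shape shown above; `extract_embedding` is a hypothetical helper.

```python
from typing import List


def extract_embedding(predict_response: dict, expected_dim: int = 1024) -> List[float]:
    """Pull the sentence_embedding vector out of a _predict response."""
    result = predict_response["inference_results"][0]
    if result.get("status_code") != 200:
        raise RuntimeError(f"Model call failed with status {result.get('status_code')}")
    for output in result["output"]:
        if output["name"] == "sentence_embedding":
            vector = output["data"]
            # The dimension must match embedding_dimension in the memory container.
            if len(vector) != expected_dim:
                raise ValueError(f"Expected {expected_dim} dimensions, got {len(vector)}")
            return vector
    raise KeyError("sentence_embedding not found in response")
```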

Register a text generation model (LLM)

The LLM extracts structured facts from conversations before storing them as long-term memory. This example uses the qwen-plus model.

1. Create a model connector.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/connectors/_create" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "QWen LLM Connector",
  "description": "The connector to qwen LLM",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "model": "qwen-plus",
    "endpoint": "dashscope.aliyuncs.com/compatible-mode"
  },
  "credential": {
    "api_key": "'"${YOUR_API_KEY}"'"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}",
        "content-type": "application/json"
      },
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "request_body": "{ \"model\":\"${parameters.model}\", \"system\": \"${parameters.system_prompt}\", \"messages\": [ { \"role\": \"system\", \"content\": \"${parameters.system_prompt}\" }, { \"role\": \"user\", \"content\": \"${parameters.user_prompt}\" } ], \"response_format\": { \"type\": \"json_object\" } }"
    }
  ]
}'

Dashboard

Important

Replace <YOUR_API_KEY> with your actual API key for Alibaba Cloud Model Studio.

POST /_plugins/_ml/connectors/_create
{
  "name": "QWen LLM Connector",
  "description": "The connector to qwen LLM",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "model": "qwen-plus",
    "endpoint": "dashscope.aliyuncs.com/compatible-mode"
  },
  "credential": {
    "api_key": "<YOUR_API_KEY>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}",
        "content-type": "application/json"
      },
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "request_body": "{ \"model\":\"${parameters.model}\", \"system\": \"${parameters.system_prompt}\", \"messages\": [ { \"role\": \"system\", \"content\": \"${parameters.system_prompt}\" }, { \"role\": \"user\", \"content\": \"${parameters.user_prompt}\" } ], \"response_format\": { \"type\": \"json_object\" } }"
    }
  ]
}

The response returns a connector_id. Record it.

{"connector_id": "PRGy6ZsBk04xxx"}

2. Register the connector as a model.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/_register" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "qwen llm model",
  "function_name": "remote",
  "description": "LLM model for memory",
  "connector_id": "PRGy6ZsBk04xxx"
}'

Dashboard

POST _plugins/_ml/models/_register
{
  "name": "qwen llm model",
  "function_name": "remote",
  "description": "LLM model for memory",
  "connector_id": "PRGy6ZsBk04xxx"
}

The response returns a model_id. Record it.

{"task_id": "PhGy6ZsBk04xxx", "status": "CREATED", "model_id": "PxGy6ZsBk04xxx"}

3. Deploy the model.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<model ID from previous step>/_deploy" \
--user "${USER_PASSWORD}"

Dashboard

POST _plugins/_ml/models/<model ID from previous step>/_deploy

A status of COMPLETED confirms successful deployment.

{"task_id": "NxGI6ZsBk04xxx", "task_type": "DEPLOY_MODEL", "status": "COMPLETED"}

4. Test the LLM.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<model ID from previous step>/_predict" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'{
  "parameters": {
    "system_prompt": "<ROLE>You are a USER PREFERENCE EXTRACTOR, not a chat assistant. Your only job is to output JSON facts. Do not answer questions, make suggestions, ask follow-ups, or perform actions.</ROLE>\n\n<SCOPE>\n• Extract preferences only from USER messages. Assistant messages are context only.\n• Explicit: user states a preference (\"I prefer/like/dislike ...\"; \"always/never/usually ...\"; \"set X to Y\"; \"run X when Y\").\n• Implicit: infer only with strong signals: repeated choices (>=2) or clear habitual language. Do not infer from a single one-off.\n</SCOPE>\n\n<EXTRACT>\n• Specific, actionable, likely long-term preferences (likes/dislikes/choices/settings). Ignore non-preferences.\n</EXTRACT>\n\n<STYLE & RULES>\n• One sentence per preference; merge related details; no duplicates; preserve user wording and numbers; avoid relative time; keep each fact < 350 chars.\n• Format: \"Preference sentence. Context: <why/how>. Categories: cat1,cat2\"\n</STYLE & RULES>\n\n<OUTPUT>\nReturn ONLY one minified JSON object exactly as {\"facts\":[\"Preference sentence. Context: <why/how>. Categories: cat1,cat2\"]}. If none, return {\"facts\":[]}. The first character MUST be '{' and the last MUST be '}'. No preambles, explanations, code fences, XML, or other text.\n</OUTPUT>",
    "user_prompt": "I am Alice, I like travel."
  }
}'

Dashboard

POST _plugins/_ml/models/<model ID from previous step>/_predict
{
  "parameters": {
    "system_prompt": "<ROLE>You are a USER PREFERENCE EXTRACTOR, not a chat assistant. Your only job is to output JSON facts. Do not answer questions, make suggestions, ask follow-ups, or perform actions.</ROLE>\n\n<SCOPE>\n• Extract preferences only from USER messages. Assistant messages are context only.\n• Explicit: user states a preference (\"I prefer/like/dislike ...\"; \"always/never/usually ...\"; \"set X to Y\"; \"run X when Y\").\n• Implicit: infer only with strong signals: repeated choices (>=2) or clear habitual language. Do not infer from a single one-off.\n</SCOPE>\n\n<EXTRACT>\n• Specific, actionable, likely long-term preferences (likes/dislikes/choices/settings). Ignore non-preferences.\n</EXTRACT>\n\n<STYLE & RULES>\n• One sentence per preference; merge related details; no duplicates; preserve user wording and numbers; avoid relative time; keep each fact < 350 chars.\n• Format: \"Preference sentence. Context: <why/how>. Categories: cat1,cat2\"\n</STYLE & RULES>\n\n<OUTPUT>\nReturn ONLY one minified JSON object exactly as {\"facts\":[\"Preference sentence. Context: <why/how>. Categories: cat1,cat2\"]}. If none, return {\"facts\":[]}. The first character MUST be '{' and the last MUST be '}'. No preambles, explanations, code fences, XML, or other text.\n</OUTPUT>",
    "user_prompt": "I am Alice, I like travel."
  }
}

The LLM extracts the preference and returns it as a structured JSON fact.

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "choices": [
              {
                "message": {
                  "role": "assistant",
                  "content": "{\"facts\":[\"I like travel. Context: Stated preference. Categories: interest\"]}"
                },
                "finish_reason": "stop",
                "index": 0,
                "logprobs": null
              }
            ],
            "object": "chat.completion",
            "usage": {
              "prompt_tokens": 325,
              "completion_tokens": 17,
              "total_tokens": 342,
              "prompt_tokens_details": {"cached_tokens": 0}
            },
            "created": 1769152651,
            "system_fingerprint": null,
            "model": "qwen-plus",
            "id": "chatcmpl-50c6bfc9-xxx-xxx-xxx-1a39cfe080f5"
          }
        }
      ],
      "status_code": 200
    }
  ]
}
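
The content field is itself a JSON string, so a client has to parse it a second time to reach the facts. The sketch below assumes the response shape shown above and the fact format defined in the system prompt; `extract_facts` and `split_fact` are hypothetical helpers.

```python
import json
from typing import Dict, List


def extract_facts(predict_response: dict) -> List[str]:
    """Return the list of fact strings from an LLM _predict response."""
    data = predict_response["inference_results"][0]["output"][0]["dataAsMap"]
    content = data["choices"][0]["message"]["content"]
    # content is a JSON document of the form {"facts": ["...", ...]}
    return json.loads(content)["facts"]


def split_fact(fact: str) -> Dict[str, object]:
    """Split 'Sentence. Context: why. Categories: a,b' into its three parts."""
    sentence, _, rest = fact.partition(" Context: ")
    context, _, categories = rest.partition(" Categories: ")
    return {
        "preference": sentence.strip(),
        "context": context.strip(),
        "categories": [c.strip() for c in categories.split(",") if c.strip()],
    }
```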

Step 4: Create a memory container

Create a memory container instance and configure its models, storage policies, and automation strategies.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/memory_containers/_create" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'{
  "name": "my agentic memory test",
  "description": "Store conversations with semantic search and summarization",
  "configuration": {
    "embedding_model_type": "TEXT_EMBEDDING",
    "embedding_model_id": "<embedding model ID from previous step>",
    "embedding_dimension": 1024,
    "llm_id": "<LLM model ID from previous step>",
    "index_prefix": "mem_test",
    "index_settings": {
      "short_term_memory_index": {
        "index": {"number_of_shards": "2", "number_of_replicas": "2"}
      },
      "long_term_memory_index": {
        "index": {"number_of_shards": "2", "number_of_replicas": "2"}
      },
      "long_term_memory_history_index": {
        "index": {"number_of_shards": "2", "number_of_replicas": "2"}
      }
    },
    "strategies": [
      {
        "type": "SEMANTIC",
        "namespace": ["user_id"],
        "configuration": {"llm_result_path": "$.choices[0].message.content"}
      },
      {
        "type": "USER_PREFERENCE",
        "namespace": ["user_id"],
        "configuration": {"llm_result_path": "$.choices[0].message.content"}
      },
      {
        "type": "SUMMARY",
        "namespace": ["agent_id"],
        "configuration": {"llm_result_path": "$.choices[0].message.content"}
      }
    ],
    "parameters": {
      "llm_result_path": "$.choices[0].message.content"
    }
  }
}'

Dashboard

POST _plugins/_ml/memory_containers/_create
{
  "name": "my agentic memory test",
  "description": "Store conversations with semantic search and summarization",
  "configuration": {
    "embedding_model_type": "TEXT_EMBEDDING",
    "embedding_model_id": "<embedding model ID from previous step>",
    "embedding_dimension": 1024,
    "llm_id": "<LLM model ID from previous step>",
    "index_prefix": "mem_test",
    "index_settings": {
      "short_term_memory_index": {
        "index": {"number_of_shards": "2", "number_of_replicas": "2"}
      },
      "long_term_memory_index": {
        "index": {"number_of_shards": "2", "number_of_replicas": "2"}
      },
      "long_term_memory_history_index": {
        "index": {"number_of_shards": "2", "number_of_replicas": "2"}
      }
    },
    "strategies": [
      {
        "type": "SEMANTIC",
        "namespace": ["user_id"],
        "configuration": {"llm_result_path": "$.choices[0].message.content"}
      },
      {
        "type": "USER_PREFERENCE",
        "namespace": ["user_id"],
        "configuration": {"llm_result_path": "$.choices[0].message.content"}
      },
      {
        "type": "SUMMARY",
        "namespace": ["agent_id"],
        "configuration": {"llm_result_path": "$.choices[0].message.content"}
      }
    ],
    "parameters": {
      "llm_result_path": "$.choices[0].message.content"
    }
  }
}

Key parameters

Parameter	Description
`index_prefix`	A common prefix for the internal indexes created by this container (short-term, long-term, and history indexes).
`index_settings`	Configures shard and replica counts for each memory index to ensure high availability.
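`strategies`	Defines the automation rules for memory extraction. This example configures three strategies: `SEMANTIC` and `USER_PREFERENCE`, both scoped to the `user_id` namespace, and `SUMMARY`, scoped to the `agent_id` namespace. The `SEMANTIC` strategy extracts semantic facts from conversations.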
`llm_result_path`	Uses JSONPath syntax to extract the core content from the LLM's JSON response. The path `$.choices[0].message.content` matches the response structure of Alibaba Cloud Model Studio in compatible mode.

Warning

Set llm_result_path based on your LLM's actual response structure. If the path does not match, the Memory Container cannot extract facts and long-term memory will not be populated. For Alibaba Cloud Model Studio in compatible mode, the response structure is {"choices":[{"message":{"content":"..."}}]}, so the correct path is $.choices[0].message.content.
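
One way to check a path before creating the container is to resolve it against a saved sample response. The sketch below handles only dotted keys and numeric indexes, which covers the path used here; it is not a full JSONPath implementation, and `resolve_path` is a hypothetical helper.

```python
import re
from typing import Any

TOKEN = re.compile(r"\.(\w+)|\[(\d+)\]")


def resolve_path(document: Any, path: str) -> Any:
    """Resolve a simple path such as $.choices[0].message.content."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    position = 1
    while position < len(path):
        match = TOKEN.match(path, position)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {position}")
        key, index = match.groups()
        # A missing key or index raises KeyError or IndexError,
        # which signals that the path does not fit the response.
        current = current[key] if key is not None else current[int(index)]
        position = match.end()
    return current
```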

The response returns a memory_container_id. Record it.

{"memory_container_id": "QRHF6ZsBk04xxx", "status": "created"}

After creation, the system creates the following indexes and pipelines:

.plugins-ml-am-mem_test-memory-long-term
.plugins-ml-am-mem_test-memory-working
.plugins-ml-am-mem_test-memory-history
.plugins-ml-am-mem_test-memory-long-term-embedding
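
The verification steps that follow query these indexes by name. The sketch below derives the three index names from index_prefix; the naming pattern is inferred from the list above, and `memory_index_names` is a hypothetical helper. The fourth entry, ending in -long-term-embedding, is left out because it is not queried in this guide.

```python
def memory_index_names(index_prefix: str) -> dict:
    """Build the working, long-term, and history index names for a prefix."""
    base = f".plugins-ml-am-{index_prefix}-memory"
    return {
        "working": f"{base}-working",
        "long_term": f"{base}-long-term",
        "history": f"{base}-history",
    }
```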

Step 5: Store and verify memory

Add a conversation to the memory container and confirm it is stored in both short-term and long-term memory.

1. Store memory.

Send a conversation from a user named Bob. The "infer": true parameter triggers the SEMANTIC strategy—the Memory Container calls the LLM to extract facts before storing them.

Warning

If you set "infer": false, the Memory Container stores the raw payload without fact extraction or conflict detection. A later request with "infer": true for the same content creates a new memory entry instead of updating the existing one, which can cause duplicate memories. Keep infer consistent across requests for the same user.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/memory_containers/<memory container ID from previous step>/memories" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'{
  "messages": [
    {
      "role": "user",
      "content": [{"text": "I am Bob, I really like swimming.", "type": "text"}]
    },
    {
      "role": "assistant",
      "content": [{"text": "Cool, nice. Hope you enjoy your life.", "type": "text"}]
    }
  ],
  "namespace": {"user_id": "bob"},
  "tags": {"topic": "personal info"},
  "infer": true,
  "memory_type": "conversation"
}'

Dashboard

POST _plugins/_ml/memory_containers/<memory_container_id>/memories
{
  "messages": [
    {
      "role": "user",
      "content": [{"text": "I am Bob, I really like swimming.", "type": "text"}]
    },
    {
      "role": "assistant",
      "content": [{"text": "Cool, nice. Hope you enjoy your life.", "type": "text"}]
    }
  ],
  "namespace": {"user_id": "bob"},
  "tags": {"topic": "personal info"},
  "infer": true,
  "memory_type": "conversation"
}
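
When an application sends many conversations, building the request body in code avoids copy-and-paste errors. This is a sketch that mirrors the body above; `build_memory_payload` is a hypothetical helper, and the field names are taken from this example.

```python
def build_memory_payload(user_id: str, user_text: str, assistant_text: str,
                         topic: str, infer: bool = True) -> dict:
    """Build the request body for the memories API from one conversation turn."""
    def message(role: str, text: str) -> dict:
        return {"role": role, "content": [{"text": text, "type": "text"}]}

    return {
        "messages": [message("user", user_text), message("assistant", assistant_text)],
        "namespace": {"user_id": user_id},
        "tags": {"topic": topic},
        # Keep infer consistent across requests for the same user.
        "infer": infer,
        "memory_type": "conversation",
    }
```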

2. Verify short-term memory.

Query the short-term (working) memory index to view the raw conversation.

Command line

curl -XGET "http://${POLARSEARCH_HOST_PORT}/.plugins-ml-am-<index_prefix from previous step>-memory-working/_search?pretty" \
--user "${USER_PASSWORD}"

Dashboard

GET .plugins-ml-am-<index_prefix from previous step>-memory-working/_search?pretty

The response shows the original conversation stored under the user_id namespace with the value bob.

{
  "_source": {
    "memory_container_id": "QRHF6ZsBk04xxx",
    "payload_type": "conversational",
    "messages": [
      {"role": "user", "content": [{"text": "I am Bob, I really like swimming.", "type": "text"}]},
      {"role": "assistant", "content": [{"text": "Cool, nice. Hope you enjoy your life.", "type": "text"}]}
    ],
    "namespace": {"user_id": "bob"}
  }
}

3. Verify long-term memory.

Query the long-term memory index to confirm the LLM extracted the fact "Bob likes swimming." and the embedding model generated the memory_embedding vector.

Command line

curl -XGET "http://${POLARSEARCH_HOST_PORT}/.plugins-ml-am-<index_prefix from previous step>-memory-long-term/_search?pretty" \
--user "${USER_PASSWORD}"

Dashboard

GET .plugins-ml-am-<index_prefix from previous step>-memory-long-term/_search?pretty

{
  "_source": {
    "created_time": 1769155210918,
    "memory": "Bob likes swimming.",
    "memory_container_id": "QRHF6ZsBk04xxx",
    "tags": {"topic": "personal info"},
    "last_updated_time": 1769155210918,
    "memory_embedding": [0.0195, -0.0387, ...],
    "namespace": {"user_id": "bob"}
  }
}

4. View memory history.

The history index records all operations with timestamps. In this case, it shows two ADD records; the abbreviated output below shows one of them.

Command line

curl -XGET "http://${POLARSEARCH_HOST_PORT}/.plugins-ml-am-<index_prefix from previous step>-memory-history/_search?pretty" \
--user "${USER_PASSWORD}"

Dashboard

GET .plugins-ml-am-<index_prefix from previous step>-memory-history/_search?pretty

{
  "_source": {
    "created_time": 1769155211164,
    "memory_id": "TBHe6ZsBk04xxx",
    "namespace_size": 1,
    "namespace": {"user_id": "bob"},
    "action": "ADD",
    "memory_container_id": "QRHF6ZsBk04xxx",
    "after": {"memory": "Bob likes swimming."},
    "tags": {"topic": "personal info"}
  }
}

You have stored a memory for your AI agent that supports long-term retrieval.

Step 6: (Optional) Update memory

The Memory Container automatically detects conflicting facts and updates existing long-term memory instead of creating duplicates.

1. Store an initial memory.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/memory_containers/<memory container ID from previous step>/memories" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'{
  "messages": [
    {
      "role": "user",
      "content": [{"text": "My name is NameA. I am from AreaA. I currently live in AreaB.", "type": "text"}]
    },
    {
      "role": "assistant",
      "content": [{"text": "Hello, NameA! Nice to meet you.", "type": "text"}]
    }
  ],
  "namespace": {"user_id": "NameA"},
  "tags": {"topic": "personal info"},
  "infer": true,
  "memory_type": "conversation"
}'

Dashboard

POST _plugins/_ml/memory_containers/<memory_container_id>/memories
{
  "messages": [
    {
      "role": "user",
      "content": [{"text": "My name is NameA. I am from AreaA. I currently live in AreaB.", "type": "text"}]
    },
    {
      "role": "assistant",
      "content": [{"text": "Hello, NameA! Nice to meet you.", "type": "text"}]
    }
  ],
  "namespace": {"user_id": "NameA"},
  "tags": {"topic": "personal info"},
  "infer": true,
  "memory_type": "conversation"
}

2. Query the long-term memory index.

Command line

curl -XGET "http://${POLARSEARCH_HOST_PORT}/.plugins-ml-am-<index_prefix from previous step>-memory-long-term/_search" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'{
  "_source": {"excludes": ["memory_embedding"]},
  "query": {"match_all": {}}
}'

Dashboard

GET .plugins-ml-am-<index_prefix from previous step>-memory-long-term/_search
{
  "_source": {"excludes": ["memory_embedding"]},
  "query": {"match_all": {}}
}

The LLM extracted two separate facts:

{
  "hits": [
    {
      "_source": {
        "created_time": 1769156096335,
        "memory": "NameA is from AreaA.",
        "last_updated_time": 1769156096335,
        "namespace": {"user_id": "NameA"},
        "memory_container_id": "QRHF6ZsBk04xxx",
        "tags": {"topic": "personal info"}
      }
    },
    {
      "_source": {
        "created_time": 1769156096335,
        "memory": "NameA currently resides in AreaB.",
        "last_updated_time": 1769156096335,
        "namespace": {"user_id": "NameA"},
        "memory_container_id": "QRHF6ZsBk04xxx",
        "tags": {"topic": "personal info"}
      }
    }
  ]
}

3. Store a conflicting memory.

NameA's current city has changed. Send the updated information.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/memory_containers/<memory container ID from previous step>/memories" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'{
  "messages": [
    {
      "role": "user",
      "content": [{"text": "My name is NameA. I currently live in AreaC.", "type": "text"}]
    },
    {
      "role": "assistant",
      "content": [{"text": "Hello, NameA! Nice to meet you.", "type": "text"}]
    }
  ],
  "namespace": {"user_id": "NameA"},
  "tags": {"topic": "personal info"},
  "infer": true,
  "memory_type": "conversation"
}'

Dashboard

POST _plugins/_ml/memory_containers/<memory_container_id>/memories
{
  "messages": [
    {
      "role": "user",
      "content": [{"text": "My name is NameA. I currently live in AreaC.", "type": "text"}]
    },
    {
      "role": "assistant",
      "content": [{"text": "Hello, NameA! Nice to meet you.", "type": "text"}]
    }
  ],
  "namespace": {"user_id": "NameA"},
  "tags": {"topic": "personal info"},
  "infer": true,
  "memory_type": "conversation"
}

4. Query the long-term memory index again.

The Memory Container detected the conflict and updated the existing fact instead of adding a new one. The last_updated_time is later than created_time, which confirms an update rather than a new entry.

{
  "hits": [
    {
      "_source": {
        "created_time": 1769156096335,
        "memory": "NameA resides in AreaC.",
        "last_updated_time": 1769156493970,
        "namespace": {"user_id": "NameA"},
        "memory_container_id": "QRHF6ZsBk04xxx",
        "tags": {"topic": "personal info"}
      }
    }
  ]
}
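
The timestamp comparison can be automated when scanning many records. This sketch assumes both fields are epoch milliseconds, as in the output above; `update_lag_ms` and `was_updated` are hypothetical helpers.

```python
def update_lag_ms(source: dict) -> int:
    """Milliseconds between creation and the last update of a memory record."""
    return source["last_updated_time"] - source["created_time"]


def was_updated(source: dict) -> bool:
    """True if the record was modified after it was first created."""
    return update_lag_ms(source) > 0


record = {
    "created_time": 1769156096335,
    "memory": "NameA resides in AreaC.",
    "last_updated_time": 1769156493970,
}
```

For the record above, the two timestamps differ by 397635 ms, so it is classified as an update.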

Use case: Build a travel memory agent

This section shows how to use the Memory Container API to build a Python agent that remembers user travel preferences and provides personalized recommendations.

Preparations

Install the required Python libraries. Add the following to your requirements.txt file and run pip install -r requirements.txt.

requests
openai

Core functions

All interactions with the Memory Container go through four functions:

Function	Purpose
`opensearch_request()`	Wraps HTTP requests to the PolarSearch API; handles authentication and errors
`add_memory()`	Calls the Memory Container's `memories` API to store new conversation data
`search_memories()`	Retrieves memories relevant to a user query from long-term and short-term memory
`generate_response_with_memories()`	Sends retrieved memories as context to the LLM to generate personalized replies

Complete example code

Save the following as trip_agent.py. Set environment variables as shown in the comments before running.

<details> <summary>Click to expand the complete code</summary>

#!/usr/bin/env python3
"""
Trip Memory Agent - A simplified memory agent that tracks user information
and manages memories using OpenSearch as the backend.
"""
import os
import json
from typing import Optional, List, Dict, Any

import openai  # OpenAI-compatible client used for LLM calls
import requests
import urllib3
from requests.auth import HTTPBasicAuth

# Suppress InsecureRequestWarning because the requests below use verify=False.
# This is acceptable for local development only; verify certificates in production.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# --- Configuration from Environment Variables ---
OPENSEARCH_URL = os.getenv('OPENSEARCH_URL')
OPENSEARCH_USERNAME = os.getenv('OPENSEARCH_USERNAME', 'admin')
OPENSEARCH_PASSWORD = os.getenv('OPENSEARCH_PASSWORD')
CONTAINER_ID = os.getenv('MEM_CONTAINER_ID')
QWEN_API_KEY = os.getenv('QWEN_API_KEY')
QWEN_BASE_URL = os.getenv('QWEN_BASE_URL', 'https://dashscope.aliyuncs.com/compatible-mode/v1')
QWEN_MODEL_NAME = os.getenv('QWEN_MODEL_NAME', 'qwen-max')

# Global variable to store current user ID
current_user_id = None

# Global OpenAI client instance
llm_client = None


# --- OpenSearch Helper ---
def opensearch_request(method: str, endpoint: str, json_data: Optional[Dict] = None) -> requests.Response:
    """Make an HTTP request to the OpenSearch API."""
    if not all([OPENSEARCH_URL, OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD, CONTAINER_ID]):
        raise ValueError("OpenSearch environment variables are not fully configured.")

    url = f"{OPENSEARCH_URL}{endpoint}"
    auth = HTTPBasicAuth(OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD)
    try:
        # verify=False skips TLS certificate verification; enable verification outside local testing
        response = requests.request(method.upper(), url, json=json_data, auth=auth, verify=False, timeout=20)
        response.raise_for_status()
        return response
    except requests.exceptions.RequestException as e:
        print(f"Error communicating with OpenSearch: {e}")
        raise


# --- User Management ---
def get_user_id():
    """Get user ID from global variable or prompt the user."""
    global current_user_id
    if not current_user_id:
        print("=" * 50)
        print("Welcome to the Trip Memory Agent!")
        print("=" * 50)
        print("Hello! I don't know you yet. Please tell me your name:")
        user_input = input("> ")
        current_user_id = user_input.strip()
        print(f"Nice to meet you, {current_user_id}!")
    return current_user_id


# --- LLM-Powered Helper Functions ---
def is_trip_plan_request(query: str) -> bool:
    """Determine if a user's query is a request for a trip plan using an LLM."""
    if not query.strip():
        return False
    try:
        prompt = f"""Please determine if the following user input is a request to plan a trip.
Answer only "YES" or "NO". Do not explain.
User input: {query}"""
        response = llm_client.chat.completions.create(
            model=QWEN_MODEL_NAME,
            messages=[{"role": "user", "content": prompt}],
            max_tokens=10,
            temperature=0.0
        )
        result = response.choices[0].message.content.strip().upper()
        return result == "YES"
    except Exception as e:
        print(f"Error detecting trip plan request with LLM: {e}. Falling back to keywords.")
        keywords = ['plan a trip', 'trip plan', 'plan my travel', 'travel plan']
        return any(keyword in query.lower() for keyword in keywords)


def extract_keywords(query: str) -> List[str]:
    """Extract keywords from a query using an LLM for better search results."""
    try:
        prompt = f"""Extract the most important keywords from the following query for search purposes.
Focus on locations, activities, and specific preferences.
Return only a JSON array of strings (keywords).
Query: {query}"""
        response = llm_client.chat.completions.create(
            model=QWEN_MODEL_NAME,
            messages=[{"role": "user", "content": prompt}],
            max_tokens=100,
            temperature=0.0
        )
        extracted_content = response.choices[0].message.content.strip()
        return json.loads(extracted_content)
    except Exception as e:  # includes json.JSONDecodeError when the reply is not valid JSON
        print(f"Error extracting keywords with LLM: {e}. Falling back to splitting query.")
        return query.split()


def generate_response_with_memories(query: str, search_results: List[Dict]) -> str:
    """Generate a response using an LLM with memory context."""
    if not search_results:
        return "I couldn't find any relevant memories for your query."

    context = "Here is relevant information I found in your memories:\n"
    for result in search_results:
        context += f"- {result['content']}\n"

    prompt = f"""You are a helpful trip planning assistant. Based on the user's past memories and their current request, generate a helpful and personalized response.

{context}
User's current request: {query}

Your response:"""
    try:
        response = llm_client.chat.completions.create(
            model=QWEN_MODEL_NAME,
            messages=[{"role": "user", "content": prompt}],
            max_tokens=1024,
            temperature=0.7
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        print(f"Error generating response with LLM: {e}")
        return "I found some memories, but I encountered an error while trying to generate a response."


# --- Core Memory Management Functions (Our "Tools") ---

def add_memory(message: str) -> Dict[str, Any]:
    """Add a new piece of information (a memory) for the current user."""
    global current_user_id
    payload = {
        "messages": [{"role": "user", "content": [{"text": message, "type": "text"}]}],
        "infer": True,
        "memory_type": "conversation",
        "namespace": {"user_id": current_user_id}
    }
    response = opensearch_request("POST", f"/_plugins/_ml/memory_containers/{CONTAINER_ID}/memories", payload)
    return response.json()


def search_memories(query: str) -> Dict[str, Any]:
    """Search the user's memories based on a query."""
    global current_user_id
    keywords = extract_keywords(query)

    search_query = {
        "query": {
            "bool": {
                "must": [{"term": {"namespace.user_id": current_user_id}}],
                "should": [
                    {"multi_match": {"query": " ".join(keywords), "fields": ["memory", "messages.content.text"]}}],
                "minimum_should_match": 1 if keywords else 0
            }
        },
        "sort": [{"created_time": {"order": "desc"}}],
        "size": 10
    }

    all_hits = []
    for index_suffix in ["long-term", "working"]:
        try:
            response = opensearch_request("POST",
                                          f"/_plugins/_ml/memory_containers/{CONTAINER_ID}/memories/{index_suffix}/_search",
                                          search_query)
            result = response.json()
            if "hits" in result and "hits" in result["hits"]:
                all_hits.extend(result["hits"]["hits"])
        except Exception as e:
            print(f"Error searching {index_suffix} memory: {e}")

    all_hits.sort(key=lambda x: x.get('_source', {}).get('created_time', ''), reverse=True)

    formatted_results = []
    for hit in all_hits:
        source = hit.get('_source', {})
        content = source.get('memory', '')
        if not content and 'messages' in source:
            content = " ".join(
                item['text'] for msg in source['messages'] if 'content' in msg
                for item in msg['content'] if isinstance(item, dict) and 'text' in item
            )

        index_type = "long-term" if 'long-term' in hit.get('_index', '') else "working"
        formatted_results.append({
            "memory_id": hit.get('_id'),
            "index_type": index_type,
            "timestamp": source.get('created_time'),
            "content": content.strip()
        })
    return {"status": "success", "results": formatted_results, "total_found": len(formatted_results)}


def find_and_update_memory(query: str, new_text: str) -> Dict[str, Any]:
    """Find a memory based on a query and update its content."""
    search_result = search_memories(query)
    if not search_result["results"]:
        return {"status": "error", "message": "No matching memory found to update."}

    memory_to_update = search_result["results"][0]
    memory_id = memory_to_update["memory_id"]
    index_type = memory_to_update["index_type"]

    payload = {"memory": new_text}
    response = opensearch_request("PUT",
                                  f"/_plugins/_ml/memory_containers/{CONTAINER_ID}/memories/{index_type}/{memory_id}",
                                  payload)
    return {"status": "success", "updated_memory_id": memory_id, "details": response.json()}


def find_and_delete_memory(query: str) -> Dict[str, Any]:
    """Find a memory based on a query and delete it."""
    search_result = search_memories(query)
    if not search_result["results"]:
        return {"status": "error", "message": "No matching memory found to delete."}

    memory_to_delete = search_result["results"][0]
    memory_id = memory_to_delete["memory_id"]
    index_type = memory_to_delete["index_type"]

    response = opensearch_request("DELETE",
                                  f"/_plugins/_ml/memory_containers/{CONTAINER_ID}/memories/{index_type}/{memory_id}")
    return {"status": "success", "deleted_memory_id": memory_id, "details": response.json()}


# --- Agent Logic ---

def main():
    """Main function to run the trip memory agent."""
    global llm_client

    if not all([QWEN_API_KEY, QWEN_BASE_URL, QWEN_MODEL_NAME]):
        print("Error: Qwen LLM environment variables are not configured.")
        return

    llm_client = openai.OpenAI(api_key=QWEN_API_KEY, base_url=QWEN_BASE_URL)
    get_user_id()

    tools = [
        {
            "type": "function",
            "function": {
                "name": "add_memory",
                "description": "Adds a new piece of information or a memory about the user. Use this to remember user preferences, facts, or past events.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "message": {
                            "type": "string",
                            "description": "The information or content of the memory to be saved. e.g., 'I love to visit historical museums.'"
                        }
                    },
                    "required": ["message"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "search_memories",
                "description": "Searches for and retrieves existing memories about the user. Use this when the user asks what you know about them, asks for past information, or wants a trip plan based on their preferences.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The search query to find relevant memories. e.g., 'my preferences', 'what do you know about me', 'trip to Paris'"
                        }
                    },
                    "required": ["query"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "find_and_update_memory",
                "description": "Finds a specific memory using a search query and updates its content with new text.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "A query to identify the memory to update. e.g., 'my favorite food'"},
                        "new_text": {"type": "string", "description": "The new content for the memory. e.g., 'My favorite food is now ramen.'"}
                    },
                    "required": ["query", "new_text"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "find_and_delete_memory",
                "description": "Finds a specific memory using a search query and deletes it.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "A query to identify the memory to delete. e.g., 'my old address'"}
                    },
                    "required": ["query"]
                }
            }
        }
    ]

    available_tools = {
        "add_memory": add_memory,
        "search_memories": search_memories,
        "find_and_update_memory": find_and_update_memory,
        "find_and_delete_memory": find_and_delete_memory,
    }

    system_prompt = f"""You are a trip assistant for a user named {current_user_id}. Your job is to help them by managing their memories and planning trips.
You have access to the following tools: `add_memory`, `search_memories`, `find_and_update_memory`, and `find_and_delete_memory`.
- When a user provides new information about themselves, use `add_memory`.
- When a user asks what you know about them, asks for their preferences, or wants a trip plan, use `search_memories` to get context first.
- When a user wants to change a piece of information, use `find_and_update_memory`.
- When a user wants to forget something, use `find_and_delete_memory`.
- Always respond to the user in a friendly and conversational manner. After a tool is used, explain what you did in simple terms."""

    messages = [{"role": "system", "content": system_prompt}]

    print("\nTrip Memory Agent is ready!")
    print(f"Hello {current_user_id}! I can remember things for you, search your memories, and help plan trips.")

    while True:
        try:
            user_input = input(f"\n{current_user_id}> ")
            if user_input.lower() in ['quit', 'exit', 'bye']:
                print("Goodbye!")
                break

            messages.append({"role": "user", "content": user_input})

            if is_trip_plan_request(user_input):
                print("This looks like a trip plan request. Searching memories for context...")
                search_result = search_memories(user_input)
                if search_result.get("total_found", 0) > 0:
                    print("Found relevant memories! Generating a personalized plan...")
                    response = generate_response_with_memories(user_input, search_result["results"])
                    print(f"\nAgent: {response}")
                else:
                    print("I couldn't find any relevant memories to help with that plan. Let's start from scratch!")
                messages.pop()
                continue

            print("Thinking...")
            response = llm_client.chat.completions.create(
                model=QWEN_MODEL_NAME,
                messages=messages,
                tools=tools,
                tool_choice="auto"
            )
            response_message = response.choices[0].message

            if response_message.tool_calls:
                messages.append(response_message)

                for tool_call in response_message.tool_calls:
                    function_name = tool_call.function.name
                    function_to_call = available_tools.get(function_name)

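                    if not function_to_call:
                        print(f"Error: LLM tried to call an unknown function '{function_name}'")
                        # Every tool call needs a matching tool message before the next LLM call
                        messages.append({
                            "tool_call_id": tool_call.id,
                            "role": "tool",
                            "name": function_name,
                            "content": json.dumps({"status": "error", "message": f"Unknown function: {function_name}"})
                        })
                        continue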

                    try:
                        function_args = json.loads(tool_call.function.arguments)
                        print(f"Calling tool: `{function_name}` with args: {function_args}")
                        function_response = function_to_call(**function_args)
                        messages.append({
                            "tool_call_id": tool_call.id,
                            "role": "tool",
                            "name": function_name,
                            "content": json.dumps(function_response)
                        })
                    except Exception as e:
                        print(f"Error executing tool '{function_name}': {e}")
                        messages.append({
                            "tool_call_id": tool_call.id,
                            "role": "tool",
                            "name": function_name,
                            # json.dumps escapes quotes and newlines in the error text
                            "content": json.dumps({"status": "error", "message": str(e)})
                        })

                print("Summarizing tool results...")
                second_response = llm_client.chat.completions.create(
                    model=QWEN_MODEL_NAME,
                    messages=messages
                )
                final_response = second_response.choices[0].message.content
                print(f"\nAgent: {final_response}")
                messages.append({"role": "assistant", "content": final_response})

            else:
                final_response = response_message.content
                print(f"\nAgent: {final_response}")
                messages.append({"role": "assistant", "content": final_response})

        except Exception as e:
            print(f"\nAn unexpected error occurred: {e}")
            messages = [{"role": "system", "content": system_prompt}]


if __name__ == "__main__":
    # Set environment variables before running:
    # export OPENSEARCH_URL="https://..."
    # export OPENSEARCH_USERNAME="admin"
    # export OPENSEARCH_PASSWORD="your_password"
    # export MEM_CONTAINER_ID="your_container_id"
    # export QWEN_API_KEY="your_api_key"
    main()

</details>
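
The `extract_keywords()` function in the script passes the LLM reply straight to `json.loads`. Some models wrap JSON in a Markdown code fence, in which case parsing fails and the function falls back to splitting the query. The following is a minimal sketch of a more tolerant parser. The helper name `parse_keyword_json` is hypothetical and not part of the original script.

```python
import json
import re
from typing import List

# Three backticks, built programmatically so the marker is easy to read.
FENCE = "`" * 3
# Matches a reply that is entirely wrapped in a Markdown code fence,
# with an optional language tag such as "json" after the opening fence.
FENCED_REPLY = re.compile(
    r"^" + FENCE + r"[a-zA-Z]*\s*(.*?)\s*" + FENCE + r"$", re.DOTALL
)


def parse_keyword_json(raw: str, fallback: str) -> List[str]:
    """Parse an LLM reply that should contain a JSON array of strings.

    Hypothetical helper for illustration. Returns fallback.split() when
    the reply is not a JSON array of strings.
    """
    text = raw.strip()
    fenced = FENCED_REPLY.match(text)
    if fenced:
        # Keep only the content between the opening and closing fence.
        text = fenced.group(1)
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return fallback.split()
    # Accept only a list of strings; any other JSON value uses the fallback.
    if isinstance(parsed, list) and all(isinstance(k, str) for k in parsed):
        return parsed
    return fallback.split()
```

Inside `extract_keywords()`, the return statement could then become `return parse_keyword_json(extracted_content, query)`, which keeps the existing fallback behavior for unusable replies.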

Run and interact

Set the required environment variables:

```
export OPENSEARCH_URL="https://pc-xxx.polardbsearch.rds.aliyuncs.com:3001"
export OPENSEARCH_USERNAME="polarsearch_user"
export OPENSEARCH_PASSWORD="your_password"
export MEM_CONTAINER_ID="your_memory_container_id"
export QWEN_API_KEY="your_api_key"
```
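
Before launching the agent, it can help to confirm that every variable without a default value in the script is set. The sketch below is an optional pre-flight check. The function name `missing_env_vars` and the `REQUIRED_VARS` list are illustrative assumptions, derived from the `os.getenv` calls in the script that have no default.

```python
import os
from typing import List, Mapping

# Variables that the script reads without a default value.
# OPENSEARCH_USERNAME, QWEN_BASE_URL, and QWEN_MODEL_NAME have defaults.
REQUIRED_VARS = [
    "OPENSEARCH_URL",
    "OPENSEARCH_PASSWORD",
    "MEM_CONTAINER_ID",
    "QWEN_API_KEY",
]


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing the environment as a parameter keeps the check easy to test with a plain dictionary.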

Run the script:
```
python trip_agent.py
```
Example interaction:

<details> <summary>Click to expand an example interaction</summary>

==================================================
Welcome to the Trip Memory Agent!
==================================================
Hello! I don't know you yet. Please tell me your name:
> ABC
Nice to meet you, ABC!

Trip Memory Agent is ready!
Hello ABC! I can remember things for you, search your memories, and help plan trips.

ABC> i like hiking and travelling
Thinking...
Calling tool: `add_memory` with args: {'message': 'i like hiking and travelling'}
Summarizing tool results...

Agent: Great to know that you enjoy hiking and traveling, ABC! I've added this information to your profile. Whenever you're looking for trip ideas, I'll make sure to suggest some amazing hiking spots and travel destinations for you.

ABC> I like listen to music
Thinking...
Calling tool: `add_memory` with args: {'message': 'I like listen to music'}
Summarizing tool results...

Agent: Awesome, ABC! I've noted that you like listening to music. Whether it's for your travels or just everyday life, I can help you find some great playlists and music recommendations.

ABC> what do you know about me
Thinking...
Calling tool: `search_memories` with args: {'query': 'what do you know about me'}
Summarizing tool results...

Agent: I know a few things about you, ABC! Here's what I have:

1. You like hiking and traveling.
2. You enjoy listening to music.

If there's anything else you'd like to add, just let me know!

ABC> I live in Beijing
Thinking...
Calling tool: `add_memory` with args: {'message': 'I live in Beijing'}
Summarizing tool results...

Agent: Thanks for letting me know, ABC! I've added that you live in Beijing to your profile.

ABC> help me plan a weekend trip
This looks like a trip plan request. Searching memories for context...
Found relevant memories! Generating a personalized plan...

Agent: Given your love for hiking, traveling, and music, here's a weekend plan from Beijing:

**Yunmeng Mountain, Huairou District**

Day 1:
- Drive to Yunmeng Mountain (about 2 hours from central Beijing)
- Hike the scenic trails and enjoy a picnic lunch with your favorite music
- Stay overnight at a local guesthouse

Day 2:
- Short morning hike or walk around the area
- Brunch at a local restaurant (try Huairou roast duck)
- Return to Beijing in the afternoon

ABC> I live in Shanghai now
Thinking...
Calling tool: `find_and_update_memory` with args: {'query': 'I live in Beijing', 'new_text': 'I live in Shanghai now'}
Summarizing tool results...

Agent: Got it, ABC! I've updated your location from Beijing to Shanghai.

ABC> help me plan a weekend trip
This looks like a trip plan request. Searching memories for context...
Found relevant memories! Generating a personalized plan...

Agent: Since you now live in Shanghai and enjoy hiking and music, here's a weekend plan:

**Moganshan, Zhejiang Province** (about 3 hours from Shanghai)

Day 1:
- Drive to Moganshan in the morning
- Afternoon hike on scenic trails (Jianchi or Daken Scenic Area)
- Stay at a boutique hotel or guesthouse

Day 2:
- Morning walk near your accommodation
- Visit a local tea plantation for a tea-picking experience
- Return to Shanghai in the afternoon

Pack appropriate hiking gear and book accommodation in advance.

</details>

API reference

This guide uses the following APIs. For the full API reference, see the OpenSearch documentation:

Create Memory Container API: Creates and configures a memory container.
Agentic memory APIs: Create, read, update, and delete operations on memories.

Billing

Using the PolarSearch Memory Container incurs the following costs:

Compute node fees: PolarSearch nodes incur compute node fees as part of PolarDB.
Model service fees: Calling external models from Alibaba Cloud Model Studio for fact extraction and vectorization incurs API call fees.

Review these costs before using the feature in a production environment.

FAQ

What is the difference between model_id and connector_id, and which one should I use when I create a memory container?

The connector_id is the ID of the connector you created to point to an external model service such as Alibaba Cloud Model Studio. The model_id is the internal ID that PolarSearch assigns when you register a model that uses that connector. When creating a memory container, use the model_id for both embedding_model_id and llm_id.

How do I configure the llm_result_path parameter?

Use JSONPath syntax to extract the required text from the LLM's JSON response. The path must match your LLM's actual response structure. For Alibaba Cloud Model Studio in compatible mode, the response structure is {"choices":[{"message":{"content":"..."}}]}, so the correct path is $.choices[0].message.content.
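
To check a candidate path against a saved LLM response before configuring the container, a small local resolver can be enough. The sketch below handles only the restricted form used above (`$` followed by `.key` and `[index]` steps) and is not a full JSONPath implementation. The function name `resolve_simple_path` and the sample content string are illustrative assumptions.

```python
import json
import re
from typing import Any

# One path step: either .name or [integer].
STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def resolve_simple_path(document: Any, path: str) -> Any:
    """Resolve a restricted JSONPath made of .key and [index] steps."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    for key, index in STEP.findall(path[1:]):
        # A key step indexes a dict; an index step indexes a list.
        current = current[key] if key else current[int(index)]
    return current


# Sample response in the compatible-mode structure; the content is made up.
sample = json.loads('{"choices":[{"message":{"content":"User likes hiking."}}]}')
print(resolve_simple_path(sample, "$.choices[0].message.content"))
# prints: User likes hiking.
```

If the path does not match the response structure, the lookup raises `KeyError` or `IndexError`, which signals that llm_result_path needs to be adjusted for that model.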