After a document is uploaded, the system automatically splits it into chunks. Each chunk stores vector data and the original text, and serves as the smallest unit for retrieval. Use the following APIs to view and adjust chunk content.
View chunks
Call list_chunks to retrieve a paginated list of chunks for a specified document.
Request parameters
| Parameter | Type | Description |
| knowledgeBaseName | string | The name of the knowledge base. Required. |
| | string | The name of the subspace. Required if subspaces are enabled. |
| docId | string | The document ID. Required if |
| | string | The OSS path of the document. Required if |
| maxResults | int | The maximum number of results to return. The default is 10, and the maximum is 1,000. |
| | string | The next token for pagination. |
Code example
# List up to five chunks of one document.
resp = client.list_chunks({
    "knowledgeBaseName": "product_docs_kb",
    "docId": "fc6ed97f-...",
    "maxResults": 5
})
# Print each chunk's ID, status, and the first 80 characters of its content.
for chunk in resp["data"]["chunkDetails"]:
    print(f"[Chunk {chunk['chunkId']}] ({chunk['status']}) {chunk['content'][:80]}...")
Response fields
| Parameter | Type | Description |
| | string | The subspace to which the chunk belongs. |
| chunkId | int | The ID of the chunk. |
| content | string | The content of the chunk. |
| | string | The title of the chunk. |
| | string | The type of the chunk, such as |
| status | string | The status can be |
| | string | The ID of the document to which the chunk belongs. |
| | string | The OSS path of the document to which the chunk belongs. |
| | int | The creation timestamp. |
| | int | The update timestamp. |
| | string | The next token for pagination. An empty value indicates the last page of results. |
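The pagination behavior described above (pass the next token back until it comes back empty) can be sketched as a small generator. This is only an illustration under stated assumptions: the field name nextToken and its location next to chunkDetails in the response are guesses, because the parameter names are not shown in the tables, and iter_chunks is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Callable, Dict, Iterator


def iter_chunks(
    list_chunks: Callable[[Dict[str, Any]], Dict[str, Any]],
    request: Dict[str, Any],
    token_field: str = "nextToken",  # ASSUMED field name; check the API reference
) -> Iterator[Dict[str, Any]]:
    """Yield every chunk by following the pagination token until it is empty."""
    params = dict(request)  # copy so the caller's dict is not mutated
    while True:
        resp = list_chunks(params)
        data = resp.get("data", {})
        yield from data.get("chunkDetails", [])
        # ASSUMED: the token is returned alongside chunkDetails under "data".
        token = data.get(token_field)
        if not token:
            return  # an empty token marks the last page
        params[token_field] = token
```

With a real client, the call might look like iter_chunks(client.list_chunks, {"knowledgeBaseName": "product_docs_kb", "docId": "fc6ed97f-...", "maxResults": 100}). Passing the method in as a callable keeps the helper independent of how the client is constructed.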
Update chunks
Call update_chunks to update the title, content, or status of multiple chunks in a batch.
Request parameters
| Parameter | Type | Description |
| knowledgeBaseName | string | The name of the knowledge base. Required. |
| | string | The name of the subspace. Required if subspaces are enabled. |
| chunks | list<object> | A list of chunks to update. Required. You can update a maximum of 10 chunks per request. Note: To request an increase to this limit, submit a ticket or join the Tablestore technical support DingTalk group (ID: 36165029092). |
| docId | string | The document ID. Required if |
| | string | The OSS path of the document. Required if |
| chunkId | int | The ID of the chunk. Required. |
| title | string | The new title for the chunk. |
| content | string | The new content for the chunk. |
| status | string | The new status for the chunk: |
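Because a single request accepts at most 10 chunks by default, a larger set of updates has to be split across several calls. The following is a minimal sketch of that splitting; build_update_requests is a hypothetical helper, not an SDK function, and it only builds request bodies in the shape used by the examples in this section.

```python
from typing import Any, Dict, Iterator, List

MAX_CHUNKS_PER_REQUEST = 10  # default limit stated in the parameter table


def build_update_requests(
    knowledge_base_name: str,
    chunks: List[Dict[str, Any]],
    batch_size: int = MAX_CHUNKS_PER_REQUEST,
) -> Iterator[Dict[str, Any]]:
    """Split chunk updates into request bodies of at most batch_size chunks each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(chunks), batch_size):
        yield {
            "knowledgeBaseName": knowledge_base_name,
            "chunks": chunks[start:start + batch_size],
        }
```

Each yielded dictionary can then be passed to update_chunks in turn. If the limit has been raised for your account, pass a larger batch_size.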
Code examples
Update the title and content of a chunk:
# Replace the title and content of chunk 1 in the given document.
resp = client.update_chunks({
    "knowledgeBaseName": "product_docs_kb",
    "chunks": [
        {
            "docId": "fc6ed97f-...",
            "chunkId": 1,
            "title": "Updated title",
            "content": "Updated content"
        }
    ]
})
Block an inaccurate chunk by setting its status to inactive:
# Exclude chunk 0 from retrieval without deleting the document.
resp = client.update_chunks({
    "knowledgeBaseName": "product_docs_kb",
    "chunks": [
        {
            "docId": "fc6ed97f-...",
            "chunkId": 0,
            "status": "inactive"
        }
    ]
})
Response fields
| Parameter | Type | Description |
| | string | The ID of the document. |
| | string | The OSS path of the document. |
| | int | The ID of the chunk. |
| | string | The status of the update operation. Valid values are |
| | string | The reason for the failure. This parameter is returned only when the status is |
Usage notes
After you set the status of a chunk to inactive, it no longer appears in retrieval results. You can use this feature to temporarily block inaccurate content without deleting the entire document.
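When several chunks of one document need to be blocked at once, the request body can be generated from a list of chunk IDs. This is a small sketch under the assumption that all chunks belong to the same document and are identified by docId; build_deactivation_request is a hypothetical helper name, and the resulting dictionary follows the shape of the update_chunks examples in this section.

```python
from typing import Any, Dict, List


def build_deactivation_request(
    knowledge_base_name: str,
    doc_id: str,
    chunk_ids: List[int],
) -> Dict[str, Any]:
    """Build an update_chunks body that sets every listed chunk to inactive."""
    return {
        "knowledgeBaseName": knowledge_base_name,
        "chunks": [
            # Only the status is sent, so title and content are not part of the update.
            {"docId": doc_id, "chunkId": chunk_id, "status": "inactive"}
            for chunk_id in chunk_ids
        ],
    }
```

The result can be passed directly to update_chunks. Keep the list within the per-request chunk limit (10 by default).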