This topic describes the API used to perform text segmentation and vectorization.
Request syntax
POST /v3/openapi/apps/{app_group_identity}/actions/knowledge-split
Note: app_group_identity specifies the name of the OpenSearch instance.
Request parameters
SplitDoc
Parameter | Type | Description | Remarks |
title | String | The data title. | This parameter is optional. |
content | String | The data content to be processed. | This parameter is required. |
use_embedding | Boolean | Specifies whether to perform vectorization. Valid values: true and false. | If you do not specify this parameter, the default value false is used. |
model | String | The vectorization model to be used. | |
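The parameters above can be assembled into a request as in the following minimal Python sketch. The function name build_split_request is hypothetical and not part of the API. The sketch only builds the method, path, and JSON body. It does not send the request, because the endpoint host and the authentication scheme are not covered in this topic. Omitting the optional title and model parameters when they are not set is a choice made for this sketch.

```python
import json
from urllib.parse import quote


def build_split_request(app_group_identity, content, title=None,
                        use_embedding=False, model=None):
    """Return (method, path, body) for a knowledge-split call.

    Hypothetical helper for illustration only; it is not part of any SDK.
    """
    # content is the only required SplitDoc parameter.
    if not content:
        raise ValueError("content is required")

    # Fill the {app_group_identity} placeholder in the request path.
    path = "/v3/openapi/apps/{}/actions/knowledge-split".format(
        quote(app_group_identity, safe=""))

    # use_embedding defaults to false, matching the documented default.
    doc = {"content": content, "use_embedding": bool(use_embedding)}

    # Optional parameters are included only when the caller sets them.
    if title is not None:
        doc["title"] = title
    if model is not None:
        doc["model"] = model

    return "POST", path, json.dumps(doc, ensure_ascii=False)
```

The returned body is standard JSON, so it can be passed to any HTTP client together with the host and credentials of your instance.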
Sample request
{
"title":"Test title",
"content":"Test text",
"use_embedding":true
}
Response parameters
Parameter | Type | Description |
chunks | List<ChunkContext> | The chunks after the segmentation. |
ChunkContext
Parameter | Type | Description |
chunk_id | String | The chunk ID. |
chunk | String | The chunk. |
embedding | String | The vector generated by vectorization. |
type | String | The type of the text. Valid values: text and image. |
img_url | String | The image URL. This parameter is returned if the value of type is image. |
Sample response
{
"request_id":"111111111",
"status":"OK",
"errors":[],
"result":[
{
"chunk_id":"1",
"chunk":"Chunk 1",
"embedding":"-0.010441,-0.002826,-0.022911,0.000847,0.025610,0.019213,-0.019912,0.008210,0.011974,-0.010120,-0.003866,-0.008091,-0.006889,-0.034774,...-0.012572,0.009668,0.010963,-0.005273,-0.005072,-0.002190,-0.001554,-0.000058",
"type":"text"
},
{
"chunk_id":"2",
"chunk":"Chunk 2",
"embedding":"-0.010441,-0.002826,-0.022911,0.000847,0.025610,0.019213,-0.019912,0.008210,0.011974,-0.010120,-0.003866,-0.008091,-0.006889,-0.034774,...-0.012572,0.009668,0.010963,-0.005273,-0.005072,-0.002190,-0.001554,-0.000058",
"type":"image",
"img_url":"http://127.0.0.1"
},
{
"chunk_id":"3",
"chunk":"Chunk 3",
"type":"text"
}
]
}
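A response of this shape could be processed as in the following minimal Python sketch. The function name parse_chunks is hypothetical. The sketch makes several assumptions: the embedding string is a complete comma-separated list of numbers (the "..." in the sample is an elision and would not parse), the embedding field may be absent as in the third chunk of the sample, and a status other than OK indicates a failure. The response parameter table names the list chunks while the sample response uses result, so the sketch checks result first and then chunks. That fallback is an assumption, not documented behavior.

```python
import json


def parse_chunks(response_text):
    """Parse a knowledge-split response into a list of chunk dicts.

    Hypothetical helper for illustration only.
    """
    data = json.loads(response_text)

    # Assumption: any status other than "OK" means the request failed.
    if data.get("status") != "OK":
        raise RuntimeError("request failed: {}".format(data.get("errors")))

    # The sample response uses "result"; the parameter table says "chunks".
    items = data.get("result")
    if items is None:
        items = data.get("chunks", [])

    chunks = []
    for item in items:
        raw = item.get("embedding")
        # Convert the comma-separated string into floats when present.
        vector = [float(x) for x in raw.split(",")] if raw else None
        chunks.append({
            "chunk_id": item.get("chunk_id"),
            "chunk": item.get("chunk"),
            "type": item.get("type"),
            # img_url is only expected when type is "image".
            "img_url": item.get("img_url"),
            "embedding": vector,
        })
    return chunks
```

Converting the embedding to a list of floats up front keeps later steps, such as writing the vectors to an index, independent of the string format.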