Perform text segmentation and vectorization - OpenSearch

This topic describes the API used to perform text segmentation and vectorization.

Request syntax

POST /v3/openapi/apps/{app_group_identity}/actions/knowledge-split

Note: app_group_identity specifies the name of the OpenSearch instance.
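As a minimal sketch of how the path above might be assembled, the following Python helper substitutes the instance name into the request path. The helper name `knowledge_split_path` and the instance name `my_instance` are hypothetical, and percent-encoding the name is a defensive assumption rather than a documented requirement. The host and any authentication headers are not covered here.

```python
from urllib.parse import quote


def knowledge_split_path(app_group_identity: str) -> str:
    """Return the request path for the knowledge-split action (hypothetical helper)."""
    if not app_group_identity:
        raise ValueError("app_group_identity must not be empty")
    # Assumption: percent-encode the instance name so it stays a single path segment.
    encoded = quote(app_group_identity, safe="")
    return f"/v3/openapi/apps/{encoded}/actions/knowledge-split"


# "my_instance" is a placeholder instance name used only for illustration.
print(knowledge_split_path("my_instance"))
```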

Request parameters

SplitDoc

| Parameter | Type | Description | Remarks |
| --- | --- | --- | --- |
| title | String | The data title. | This parameter is optional. |
| content | String | The data content to be processed. | This parameter is required. |
| use_embedding | Boolean | Specifies whether to perform vectorization. Valid values: true and false. | If you do not specify this parameter, the default value false is used. |
| model | String | The vectorization model to be used. | |

Sample request

{
  "title":"Test title",
  "content":"Test text",
  "use_embedding":true
}
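The request body can also be built programmatically. The sketch below is one possible approach, not an official client: the function name `build_split_doc` is hypothetical, and it only encodes the rules stated in the request parameters (content is required, title is optional, use_embedding defaults to false). It serializes the body and does not send the request.

```python
import json
from typing import Optional


def build_split_doc(content: str,
                    title: Optional[str] = None,
                    use_embedding: bool = False,
                    model: Optional[str] = None) -> str:
    """Serialize a SplitDoc request body as JSON (hypothetical helper)."""
    if not content:
        # content is the only parameter documented as required.
        raise ValueError("content is required")
    body = {"content": content, "use_embedding": use_embedding}
    # Optional parameters are included only when the caller supplies them.
    if title is not None:
        body["title"] = title
    if model is not None:
        body["model"] = model
    return json.dumps(body, ensure_ascii=False)


print(build_split_doc("Test text", title="Test title", use_embedding=True))
```

Using `json.dumps` instead of hand-written JSON avoids syntax slips such as trailing commas.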

Response parameters

| Parameter | Type | Description |
| --- | --- | --- |
| chunks | List<ChunkContext> | The chunks produced by segmentation. |

ChunkContext

| Parameter | Type | Description |
| --- | --- | --- |
| chunk_id | String | The chunk ID. |
| chunk | String | The chunk content. |
| embedding | String | The vector generated by vectorization. |
| type | String | The chunk type. Valid values: text and image. |
| img_url | String | The image URL. This parameter is returned only if the value of type is image. |

Sample response

{
  "request_id":"111111111",
  "status":"OK",
  "errors":[],
  "result":[
    {
      "chunk_id":"1",
      "chunk":"Chunk 1",
      "embedding":"-0.010441,-0.002826,-0.022911,0.000847,0.025610,0.019213,-0.019912,0.008210,0.011974,-0.010120,-0.003866,-0.008091,-0.006889,-0.034774,...-0.012572,0.009668,0.010963,-0.005273,-0.005072,-0.002190,-0.001554,-0.000058",
      "type":"text"
    },
    {
      "chunk_id":"2",
      "chunk":"Chunk 2",
      "embedding":"-0.010441,-0.002826,-0.022911,0.000847,0.025610,0.019213,-0.019912,0.008210,0.011974,-0.010120,-0.003866,-0.008091,-0.006889,-0.034774,...-0.012572,0.009668,0.010963,-0.005273,-0.005072,-0.002190,-0.001554,-0.000058",
      "type":"image",
      "img_url":"http://127.0.0.1"
    },
    {
      "chunk_id":"3",
      "chunk":"Chunk 3",
      "type":"text"
    }
  ]
}
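A response of this shape could be consumed as in the following sketch. It is an illustration under stated assumptions: the function name `parse_chunks` is hypothetical, the chunk list is read from the `result` field as it appears in the sample response, any status other than "OK" is assumed to indicate failure, and the embedding is assumed to be a comma-separated string of numbers. The two-value vectors in the demo data are shortened for illustration and are not real model output.

```python
import json


def parse_chunks(response_text: str) -> list:
    """Parse a knowledge-split response into a list of chunk dicts (hypothetical helper)."""
    payload = json.loads(response_text)
    # Assumption: a status other than "OK" means the request failed.
    if payload.get("status") != "OK":
        raise RuntimeError(f"request failed: {payload.get('errors')}")
    chunks = []
    for item in payload.get("result", []):
        raw = item.get("embedding")
        # The embedding is absent when vectorization was not performed for a chunk.
        vector = [float(x) for x in raw.split(",")] if raw else None
        chunks.append({
            "chunk_id": item["chunk_id"],
            "chunk": item["chunk"],
            "type": item.get("type"),
            "img_url": item.get("img_url"),  # present only for image chunks
            "embedding": vector,
        })
    return chunks


# Illustrative data only; the vectors are shortened placeholders.
demo = json.dumps({
    "request_id": "111111111",
    "status": "OK",
    "errors": [],
    "result": [
        {"chunk_id": "1", "chunk": "Chunk 1", "embedding": "-0.010441,-0.002826", "type": "text"},
        {"chunk_id": "2", "chunk": "Chunk 2", "embedding": "0.025610,0.019213",
         "type": "image", "img_url": "http://127.0.0.1"},
        {"chunk_id": "3", "chunk": "Chunk 3", "type": "text"},
    ],
})
for chunk in parse_chunks(demo):
    print(chunk["chunk_id"], chunk["type"], chunk["embedding"])
```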