Description
A prediction query uses the built-in vectorization models of Vector Search Edition to convert text, images, or videos into vector data. You can then use the original text, image, or video to search for data.
Note: If you already have vector data and want to import it directly into a Vector Search Edition instance for retrieval, see Vector query.
URL
/vector-service/inference-query
The sample URL omits information such as request headers and the encoding method.
The complete endpoint must also include the host address of your application.
For information about the definitions, usage, and examples of all request parameters, see the Request body parameters section.
Protocol
HTTP
Request method
POST
Supported format
JSON
Signature mechanism
Calculate the signature for the `authorization` header as follows:
Parameter | Type | Description |
accessUserName | string | The username. Find it on the Instance Details page under API Endpoint. |
accessPassWord | string | The password. Modify it on the Instance Details page under API Endpoint. |
import com.aliyun.darabonba.encode.Encoder;
import com.aliyun.darabonbastring.Client;

public class GenerateAuthorization {
    public static void main(String[] args) throws Exception {
        // Replace these values with the username and password of your instance.
        String accessUserName = "username";
        String accessPassWord = "password";
        // Join the credentials as "username:password".
        String realmStr = accessUserName + ":" + accessPassWord;
        // Base64-encode the UTF-8 bytes of the joined string.
        String authorization = Encoder.base64EncodeToString(Client.toBytes(realmStr, "UTF-8"));
        System.out.println(authorization);
    }
}
The generated authorization value has the following format:
cm9vdDp******mdhbA==
Add the `Basic` prefix when you specify the `authorization` parameter in an HTTP request.
Example (add to the header):
authorization: Basic cm9vdDp******mdhbA==
Request body parameters
Parameter Name | Description | Default | Type | Required |
tableName | The name of the table to query. | None | string | Yes |
indexName | The name of the index to query. | The first configured index | string | No |
content | The data for prediction. | None | string | Yes |
contentType | The data type for video prediction. Valid values: text, image, video_uri, and video_base64 | None | string | No |
modal | The vectorization model. The examples in this topic use text, image, and video. | None | string | Yes |
videoFrameTopK | The number of frames to retrieve. | 100 | int | No |
namespace | The namespace of the vector to query. | "" | string | No |
topK | The number of results to return. | 100 | int | No |
includeVector | Specifies whether to return vector information in the document. | false | bool | No |
outputFields | A list of fields to return. | [] | list[string] | No |
order | The sorting order. `ASC`: ascending. `DESC`: descending. | ASC | string | No |
searchParams | Query parameters, passed as a JSON-formatted string, such as "{\"qc.searcher.scan_ratio\":0.01}". | "" | string | No |
filter | The filter expression. | "" | string | No |
scoreThreshold | The score threshold for filtering. If you use Euclidean distance, only results with a score less than `scoreThreshold` are returned. If you use inner product, only results with a score greater than `scoreThreshold` are returned. | No filtering by default | float | No |
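The URL, request method, `authorization` header, and body parameters above can be combined into a single request. The following is a minimal sketch that uses only the JDK (`java.util.Base64` and `java.net.http`) instead of the Darabonba helpers shown earlier. The host `http://your-instance-host`, the class name `InferenceQueryRequest`, its method names, and the `Content-Type` header are assumptions made for illustration. The sketch builds the request but does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class. Only the path, the POST method, and the
// authorization header name come from this topic.
class InferenceQueryRequest {

    // Base64-encode "username:password" and add the Basic prefix.
    static String basicAuthorization(String accessUserName, String accessPassWord) {
        String realm = accessUserName + ":" + accessPassWord;
        String encoded = Base64.getEncoder()
                .encodeToString(realm.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    // Build (but do not send) a POST request to the prediction query path.
    static HttpRequest build(String host, String authorization, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(host + "/vector-service/inference-query"))
                .header("authorization", authorization)
                .header("Content-Type", "application/json") // assumption: JSON body
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // searchParams is a string that itself contains JSON, so its quotes are escaped.
        String body = "{\"tableName\":\"gist\",\"indexName\":\"test\",\"content\":\"hello\","
                + "\"modal\":\"text\",\"topK\":3,"
                + "\"searchParams\":\"{\\\"qc.searcher.scan_ratio\\\":0.01}\"}";
        HttpRequest request = build("http://your-instance-host",
                basicAuthorization("username", "password"), body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().firstValue("authorization").orElse(""));
    }
}
```

To send the request, pass it to a `java.net.http.HttpClient` and replace the placeholder host with the host address of your application.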
Response parameters
Field Name | Description | Type |
result | A list of results. | list[Item] |
totalCount | The number of items in the result list. | int |
totalTime | The time taken by the engine to process the request, in milliseconds. | float |
errorCode | The error code. This field is returned only when an error occurs. | int |
errorMsg | The error message. This field is returned only when an error occurs. | string |
Item definition
Field Name | Description | Type |
score | The distance score. | float |
fields | The field names and their corresponding values. | map<string, FieldType> |
vector | The vector value. | list[float] |
id | The primary key value. The type is the same as the defined field type. | FieldType |
namespace | The namespace of the vector. This field is returned if a namespace is set. | string |
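To make the direction of the `scoreThreshold` comparison concrete, the sketch below applies the same rule on the client side to a list of `score` values. The class `ScoreThresholdDemo`, the `Metric` enum, and the method names are hypothetical. The service applies this filter itself when `scoreThreshold` is set, so the code only illustrates which side of the threshold is kept for each distance type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical client-side illustration of the scoreThreshold comparison.
class ScoreThresholdDemo {

    enum Metric { EUCLIDEAN, INNER_PRODUCT }

    // Euclidean distance keeps scores below the threshold;
    // inner product keeps scores above it.
    static boolean passes(Metric metric, double score, double threshold) {
        return metric == Metric.EUCLIDEAN ? score < threshold : score > threshold;
    }

    // Return the scores that would survive the threshold for the given metric.
    static List<Double> filter(Metric metric, List<Double> scores, double threshold) {
        List<Double> kept = new ArrayList<>();
        for (double score : scores) {
            if (passes(metric, score, threshold)) {
                kept.add(score);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Double> scores = Arrays.asList(1.05, 1.03, 0.98);
        System.out.println(filter(Metric.EUCLIDEAN, scores, 1.0));
        System.out.println(filter(Metric.INNER_PRODUCT, scores, 1.0));
    }
}
```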
Examples
Text embedding retrieval
Request body:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "hello",
  "modal": "text",
  "topK": 3,
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}",
  "includeVector": true
}
Response:
{
  "result": [
    { "id": 1, "score": 1.0508723258972169, "vector": [0.1, 0.2, 0.3] },
    { "id": 2, "score": 1.0329746007919312, "vector": [0.2, 0.2, 0.3] },
    { "id": 3, "score": 0.980593204498291, "vector": [0.3, 0.2, 0.3] }
  ],
  "totalCount": 3,
  "totalTime": 2.943
}
Image embedding
Text-to-image search:
Request body:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "Bicycle",
  "modal": "text",
  "topK": 3,
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}",
  "includeVector": true
}
Response:
{
  "result": [
    { "id": 1, "score": 1.0508723258972169, "vector": [0.1, 0.2, 0.3] },
    { "id": 2, "score": 1.0329746007919312, "vector": [0.2, 0.2, 0.3] },
    { "id": 3, "score": 0.980593204498291, "vector": [0.3, 0.2, 0.3] }
  ],
  "totalCount": 3,
  "totalTime": 2.943
}
Search by image:
Request body:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "Base64-encoded image",
  "modal": "image",
  "topK": 3,
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}",
  "includeVector": true
}
Response:
{
  "totalCount": 5,
  "result": [
    { "id": 5, "score": 1.103209137916565 },
    { "id": 3, "score": 1.1278988122940064 },
    { "id": 2, "score": 1.1326735019683838 }
  ],
  "totalTime": 242.615
}
Subject identification
Request body:
If `range` is not specified:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQ",
  "modal": "image",
  "searchParams": "{\"crop\": true}",
  "topK": 3,
  "includeVector": true
}
Note:
`"crop": true` specifies a search by subject. If you do not specify the `range` parameter, the system calls the subject identification model.
If `range` is specified:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQ",
  "modal": "image",
  "searchParams": "{\"crop\": true, \"range\": \"100,100,60,70\"}",
  "topK": 3,
  "includeVector": true
}
Note:
`"crop": true, "range": "100,100,60,70"` specifies a search by subject. The `range` parameter specifies the subject's area in the image. The four numbers represent the x and y coordinates of the top-left point, the width, and the height of the subject area.
Response:
{
  "result": [
    { "id": 1, "score": 1.0508723258972169, "vector": [0.1, 0.2, 0.3] }
  ],
  "__meta__": {
    "__range__": "100,100,60,70;"
  },
  "totalCount": 1,
  "totalTime": 2.943
}
Note:
The `__range__` field is returned only for subject identification queries where the `modal` parameter is set to `image`.
`__range__` specifies the subject's area in the image. The four numbers represent the x and y coordinates of the top-left point, the width, and the height of the subject area.
If the model detects multiple subjects, the `__range__` field contains a list of subject areas sorted by model score in descending order. By default, the query returns the result that corresponds to the first subject in the list.
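As an illustration of the `x,y,width,height` layout described above, the following sketch parses a `__range__` value into rectangles. It assumes that multiple subject areas are separated by semicolons, as the trailing `;` in the sample response suggests. That separator, the class name `SubjectRange`, and its methods are assumptions and not part of the documented API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type for one subject area: top-left x and y, width, height.
class SubjectRange {
    final int x;
    final int y;
    final int width;
    final int height;

    SubjectRange(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    // Parse a single "x,y,width,height" entry.
    static SubjectRange parseOne(String text) {
        String[] parts = text.trim().split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected x,y,width,height but got: " + text);
        }
        return new SubjectRange(
                Integer.parseInt(parts[0].trim()),
                Integer.parseInt(parts[1].trim()),
                Integer.parseInt(parts[2].trim()),
                Integer.parseInt(parts[3].trim()));
    }

    // Parse a __range__ value; assumes entries are separated by ';'.
    static List<SubjectRange> parseAll(String rangeField) {
        List<SubjectRange> areas = new ArrayList<>();
        for (String piece : rangeField.split(";")) {
            if (!piece.trim().isEmpty()) {
                areas.add(parseOne(piece));
            }
        }
        return areas;
    }

    // Format the area the way the `range` search parameter expects it.
    String toRequestValue() {
        return x + "," + y + "," + width + "," + height;
    }

    public static void main(String[] args) {
        List<SubjectRange> areas = parseAll("100,100,60,70;");
        System.out.println(areas.size() + " area(s), first: " + areas.get(0).toRequestValue());
    }
}
```

A value produced by `toRequestValue()` could be passed back as the `range` entry in `searchParams` to search on a specific subject.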
Text-to-video retrieval
Request body:
{
  "tableName": "video",
  "content": "hello",
  "modal": "video",
  "topK": 3,
  "videoFrameTopK": 100,
  "contentType": "text",
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}"
}
Response:
{
  "result": [
    {
      "videoId": 1,
      "videoUri": "oss://...",
      "fields": { "tag": "demo" },
      "clips": [
        {
          "queryStartTime": 5,          // Timestamp of the query video frame (in seconds)
          "startTime": 5,               // Timestamp of the matched video frame (in seconds)
          "duration": 5,                // Matching duration (in seconds)
          "queryStartFrameIndex": 150,  // Start index of the query video frame
          "queryEndFrameIndex": 300,    // End index of the query video frame
          "startFrameIndex": 150,       // Start index of the matched video frame
          "endFrameIndex": 300,         // End index of the matched video frame
          "sim": 0.8                    // Overall similarity
        }
      ]
    }
  ],
  "totalCount": 1,
  "totalTime": 2.943
}
Video-to-video retrieval
Supported video formats: MP4, AVI, MKV, MOV, FLV, and WebM.
Request body:
{
  "tableName": "video",
  "content": "oss://...",
  "modal": "video",
  "topK": 3,
  "videoFrameTopK": 100,
  "contentType": "video_uri",
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}"
}
You can specify the OSS path of the input file. For example: `oss://bucket-name/xxx/xxx.mp4`
{
  "tableName": "video",
  "content": "data:video/mp4;base64,AAAAIGZ0eXBtcDQyAAABAGlxxxxxxx",
  "modal": "video",
  "topK": 3,
  "videoFrameTopK": 100,
  "contentType": "video_encode",
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}"
}
The format is `data:video/{format};base64,{base64_video}`, where:
`video/{format}`: The format of the video. For example, if the video is in MP4 format, specify `video/mp4`.
`base64_video`: The Base64-encoded data of the video.
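The `data:video/{format};base64,{base64_video}` value can be assembled with the JDK's Base64 encoder. This is a minimal sketch; the class `DataUri` and the method `of` are hypothetical names, and the four sample bytes stand in for real video data that you would normally read from a file. The same helper would produce a `data:image/{format};base64,...` value when given an image media type.

```java
import java.util.Base64;

// Hypothetical helper that formats binary data as a Base64 data URI.
class DataUri {

    // mediaType is, for example, "video/mp4" or "image/jpeg".
    static String of(String mediaType, byte[] data) {
        return "data:" + mediaType + ";base64,"
                + Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // Placeholder bytes; in practice, read the whole video file,
        // for example with java.nio.file.Files.readAllBytes.
        byte[] sample = {0, 0, 0, 32};
        System.out.println(of("video/mp4", sample));
    }
}
```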
Response:
{
  "result": [
    {
      "videoId": 1,
      "videoUri": "oss://...",
      "fields": { "tag": "demo" },
      "clips": [
        {
          "queryStartTime": 5,          // Timestamp of the query video frame (in seconds)
          "startTime": 5,               // Timestamp of the matched video frame (in seconds)
          "duration": 5,                // Matching duration (in seconds)
          "queryStartFrameIndex": 150,  // Start index of the query video frame
          "queryEndFrameIndex": 300,    // End index of the query video frame
          "startFrameIndex": 150,       // Start index of the matched video frame
          "endFrameIndex": 300,         // End index of the matched video frame
          "sim": 0.8                    // Overall similarity
        }
      ]
    }
  ],
  "totalCount": 1,
  "totalTime": 2.943
}
Image-to-video retrieval
Supported image formats: PNG, JPEG, and JPG.
Request body:
{
  "tableName": "video",
  "content": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEAxxxxxx",
  "modal": "video",
  "topK": 3,
  "videoFrameTopK": 100,
  "contentType": "image_encode",
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}"
}
Pass the Base64-encoded image data in the `content` parameter in the format `data:image/{format};base64,{base64_image}`, where:
`image/{format}`: The format of the image. For example, if the image is in JPG format, specify `image/jpeg`.
`base64_image`: The Base64-encoded data of the image.
Response:
{
  "result": [
    {
      "videoId": 1,
      "videoUri": "oss://...",
      "fields": { "tag": "demo" },
      "clips": [
        {
          "queryStartTime": 5,          // Timestamp of the query video frame (in seconds)
          "startTime": 5,               // Timestamp of the matched video frame (in seconds)
          "duration": 5,                // Matching duration (in seconds)
          "queryStartFrameIndex": 150,  // Start index of the query video frame
          "queryEndFrameIndex": 300,    // End index of the query video frame
          "startFrameIndex": 150,       // Start index of the matched video frame
          "endFrameIndex": 300,         // End index of the matched video frame
          "sim": 0.8                    // Overall similarity
        }
      ]
    }
  ],
  "totalCount": 3,
  "totalTime": 2.943
}