AI Search Open Platform lets you call image content extraction services through APIs, so you can integrate them into your business processing pipeline. The parsed text can be used in image retrieval and conversational search scenarios.
Service list
Service name | Service ID | Service description | QPS limit for API calls (Alibaba Cloud account and RAM users) |
Image content understanding service 001 | ops-image-analyze-vlm-001 | Provides image content parsing. It parses and understands image content by using multimodal large models and performs OCR. The parsed text can be used in image retrieval and Q&A scenarios. | 10. Note: To apply for a higher QPS, submit a ticket. |
Image text recognition service 001 | ops-image-analyze-ocr-001 | Provides OCR for images. It recognizes and extracts the text in an image. The extracted text can be used in image retrieval and Q&A scenarios. |
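A client that submits many images may want to pace its own calls so that it stays under the QPS limit listed above. The following is a minimal sketch of a client-side throttle, not a feature of the platform. The class name MinIntervalThrottle and its parameters are hypothetical, and the default of 10 simply mirrors the limit shown in the table.

```python
import time


class MinIntervalThrottle:
    """Spaces calls at least 1/qps seconds apart (illustrative helper)."""

    def __init__(self, qps=10, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / qps
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
        return now
```

Call wait() immediately before each API request. The clock and sleep arguments exist only so that the pacing logic can be exercised without real delays.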
Prerequisites
The authentication information is obtained. When you call an AI Search Open Platform service through an API, the caller's identity must be authenticated.
The service access address is obtained. You can call a service over the Internet or through a virtual private cloud (VPC). For more information, see Get service registration address.
Create an asynchronous extraction task
Request method
POST
URL
{host}/v3/openapi/workspaces/{workspace_name}/image-analyze/{service_id}/async
host: The address for invoking the service. The API service can be invoked over the Internet or a VPC. For more information, see the referenced document.
workspace_name: The name of the workspace, for example, default.
service_id: The built-in service ID, for example, ops-image-analyze-vlm-001.
Request parameters
Header parameters
API-KEY authentication
Parameter | Type | Required | Description | Example |
Content-Type | String | Yes | Request type: application/json | application/json |
Authorization | String | Yes | API-Key | Bearer OS-d1**2a |
Body parameters
Parameter | Type | Required | Description | Example |
service_id | String | Yes | Built-in service ID: ops-image-analyze-vlm-001 or ops-image-analyze-ocr-001. | ops-image-analyze-vlm-001 |
document.url | String | No | The URL where the file is stored. The http and https protocols are supported. Specify either document.url or document.content, but not both. | http://path/to/***.jpg |
document.content | String | No | The file content, Base64-encoded. Specify either document.url or document.content, but not both. | "aGVsbG8gd29ybGQ=" |
document.file_name | String | No | File name. If empty, it is inferred from the URL. If the URL is empty, it must be explicitly specified. | test.jpg |
document.file_type | String | No | File type. If empty, it is inferred from the file_name suffix. If it cannot be inferred, it must be explicitly specified, such as jpg, jpeg, png, bmp, tiff. | jpg |
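The rules in the table above (one of document.url or document.content, Base64 encoding for content, and an explicit file name when no URL is given) can be sketched as a small helper. This is an illustrative sketch only. The function name build_document and its arguments are hypothetical and are not part of any SDK.

```python
import base64


def build_document(url=None, content_bytes=None, file_name=None, file_type=None):
    """Return the `document` object for the request body.

    Exactly one of `url` or `content_bytes` must be given, matching the
    rule stated in the parameter table.
    """
    if (url is None) == (content_bytes is None):
        raise ValueError("provide exactly one of url or content_bytes")

    document = {}
    if url is not None:
        document["url"] = url
    else:
        if file_name is None:
            # There is no URL to infer the name from, so it must be explicit.
            raise ValueError("file_name is required when sending content")
        # document.content carries the Base64-encoded file bytes.
        document["content"] = base64.b64encode(content_bytes).decode("ascii")
    if file_name is not None:
        document["file_name"] = file_name
    if file_type is not None:
        document["file_type"] = file_type
    return document
```

For a local image, read the file in binary mode and pass the bytes as content_bytes together with a file_name such as test.jpg.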
Response parameters
Parameter | Type | Description | Example |
result.task_id | String | Image parsing asynchronous task ID. | 6177bf71-f87f-4d86-ab0c-e2b64dfe**** |
cURL request example
curl -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <Your API key>" \
"http://***-hangzhou.opensearch.aliyuncs.com/v3/openapi/workspaces/default/image-analyze/ops-image-analyze-vlm-001/async" \
--data '{
  "document": {
    "url": "https://img01.yzcdn.cn/****/2017/05/11/FoTMgBa0SvUaAeFruY7i7O_EUMhf.jpg%21middle.jpg",
    "file_type": "jpg"
  }
}'
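For callers who prefer Python over cURL, the same request can be assembled with the standard library. This is a hedged sketch that mirrors the cURL example, so the body carries only the document object. The function names build_async_request and extract_task_id are hypothetical, and the host and API key are supplied by the caller.

```python
import json
import urllib.request


def build_async_request(host, workspace_name, service_id, api_key, document):
    """Build (but do not send) the POST request that creates an async task."""
    url = (
        f"{host}/v3/openapi/workspaces/{workspace_name}"
        f"/image-analyze/{service_id}/async"
    )
    body = json.dumps({"document": document}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def extract_task_id(response_text):
    """Pull result.task_id out of a successful JSON response body."""
    return json.loads(response_text)["result"]["task_id"]
```

To send the request, pass the returned object to urllib.request.urlopen and decode the response body before calling extract_task_id.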
Response example
Normal response example
{
"request_id":"CD4E26F0-23FF-449C-83DC-20CC8FF1****",
"latency":8.0,
"http_code":200,
"result":{
"task_id":"cd4e26f0-23ff-449c-83dc-20cc8ff1****"
}
}
Abnormal response example
If the request fails, the response indicates the cause in the code and message fields.
{
"request_id":"0CCAC03B-D83F-432F-B6BA-C3049576****",
"latency":0.0,
"code":"InvalidParameter",
"http_code":400,
"message":"document.content or document.url required, and both cannot be present at the same time"
}
Get the asynchronous extraction task status
Request method
GET
URL
{host}/v3/openapi/workspaces/{workspace_name}/image-analyze/{service_id}/async/task-status?task_id={task_id}
host: The address for invoking the service. The API service can be invoked over the Internet or a VPC. For more information, see the referenced document.
workspace_name: The name of the workspace, for example, default.
service_id: The built-in service ID, for example, ops-image-analyze-vlm-001.
task_id: The task ID returned in the image parsing response, for example, cd4e26f0-23ff-449c-83dc-20cc8ff1****.
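The status URL can be assembled from the four values described above. The sketch below is illustrative. The function name task_status_url is hypothetical, and percent-encoding the path segments and the query value is a defensive choice rather than a documented requirement.

```python
from urllib.parse import quote, urlencode


def task_status_url(host, workspace_name, service_id, task_id):
    """Return the GET URL for querying one asynchronous task."""
    path = "/v3/openapi/workspaces/{}/image-analyze/{}/async/task-status".format(
        quote(workspace_name, safe=""),
        quote(service_id, safe=""),
    )
    query = urlencode({"task_id": task_id})
    return host.rstrip("/") + path + "?" + query
```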
Request parameters
Header parameters
API-KEY authentication
Parameter | Type | Required | Description | Example |
Content-Type | String | Yes | Request type: application/json | application/json |
Authorization | String | Yes | API-Key | Bearer OS-d1**2a |
Response parameters
Parameter | Type | Description | Example |
request_id | String | The unique identifier assigned by the system for an API call. | 3C09570D-12DB-46B4-BF0F-A100D79B**** |
latency | Float/Int | Request latency in ms. | 3.0 |
result.task_id | String | Asynchronous task ID, not present in synchronous calls. | a7e4c0f6-874c-47e3-b05b-02278a96e**** |
result.status | String | Task status, for example, SUCCESS. | SUCCESS |
result.data | Object | Image parsing result. | {"content":"The image shows XXXX", "content_type":"plain"} |
result.data.content | String | Image content. | "XXX" |
result.data.content_type | String | Output text type: plain. | plain |
usage.token_count | Int | Number of output tokens. Applies to the ops-image-analyze-vlm-001 service. | 1234 |
usage.pv_count | Int | Number of calls (fixed at 1). Applies to the ops-image-analyze-ocr-001 service. | 1 |
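Because the task runs asynchronously, a client typically queries the status repeatedly until result.status reaches a final value. The following is a minimal polling sketch under stated assumptions: only SUCCESS and FAIL are treated as final, any other status is assumed to mean the task is still running (the intermediate status names are not listed in this section), and result.error is read on failure only if the response happens to include it. The function name poll_task and the fetch_status callback are hypothetical.

```python
import time


def poll_task(fetch_status, task_id, interval_s=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until the task reports SUCCESS or FAIL.

    `fetch_status(task_id)` is a caller-supplied function that performs the
    GET request and returns the decoded JSON response as a dict.
    """
    for attempt in range(max_attempts):
        response = fetch_status(task_id)
        result = response.get("result", {})
        status = result.get("status")
        if status == "SUCCESS":
            return result["data"]
        if status == "FAIL":
            raise RuntimeError(f"task {task_id} failed: {result.get('error')}")
        # Assumption: any other status means the task is still in progress.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

The interval and attempt count are arbitrary defaults. Choose values that keep the polling rate well under the QPS limit of the service.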
cURL request example
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <Your API key>" \
"http://***-hangzhou.opensearch.aliyuncs.com/v3/openapi/workspaces/default/image-analyze/ops-image-analyze-vlm-001/async/task-status?task_id=d9781786-20b8-4fb4-bbb5-38f82e69****"
Response example
Normal response example
{
"request_id":"3C09570D-12DB-46B4-BF0F-A100D79B****",
"latency":3.0,
"http_code":200,
"result":{
"status":"SUCCESS",
"data":{
"content":"The image shows a WMF brand blender surrounded by various fruits and vegetables. Next to the blender is a cup filled with red juice, with a straw inserted. Scattered on the table are a few slices of lemon, some strawberries, and some kiwis. In one corner of the table, there is a cut pineapple and an orange. Additionally, some carrots are cut into small pieces and placed in the blender, ready for juicing. The whole scene looks very healthy and delicious.",
"content_type":"plain"
},
"task_id":"d9781786-20b8-4fb4-bbb5-38f82e69****"
},
"usage":{
"token_count":95
}
}
Abnormal response example
If the request fails, the response indicates the cause in the code and message fields.
{
"request_id":"153FC253-468D-4C46-873E-2AEB918C****",
"latency":2.0,
"code":"BadRequest.TaskNotExist",
"http_code":404,
"message":"task[d9781786-20b8-4fb4-bbb5-38f82e690b****] not exist"
}
Create a synchronous extraction task
Request method
POST
URL
{host}/v3/openapi/workspaces/{workspace_name}/image-analyze/{service_id}/sync
Parameter description
host: The address for invoking the service. The API service can be invoked through both public network and VPC. For more information, see the referenced document.
workspace_name: The name of the workspace, for example, default.
service_id: The built-in service ID, for example, ops-image-analyze-vlm-001.
Request parameters
Header parameters
API-KEY authentication
Parameter | Type | Required | Description | Example |
Content-Type | String | Yes | Request type: application/json | application/json |
Authorization | String | Yes | API-Key | Bearer OS-d1**2a |
Body parameters
Parameter | Type | Required | Description | Example |
service_id | String | Yes | Built-in service ID: ops-image-analyze-vlm-001 or ops-image-analyze-ocr-001. | ops-image-analyze-vlm-001 |
document.url | String | No | The URL where the file is stored. The http and https protocols are supported. Specify either document.url or document.content, but not both. | http://path/to/***.jpg |
document.content | String | No | The file content, Base64-encoded. Specify either document.url or document.content, but not both. | "aGVsbG8gd29ybGQ=" |
document.file_name | String | No | File name. If empty, it is inferred from the URL. If the url is empty, it must be explicitly specified. | test.jpg |
document.file_type | String | No | File type. If empty, it is inferred from the file_name suffix. If it cannot be inferred, it must be explicitly specified, such as jpg, jpeg, png, bmp, tiff. | jpg |
Response parameters
Parameter | Type | Description | Example |
result.status | String | Task status, for example, SUCCESS or FAIL. | SUCCESS |
result.error | String | Error message when status=FAIL, normally empty. | Document decryption failed |
result.data | Object | Image parsing result. | {"content":"The image shows XXXX", "content_type":"plain"} |
result.data.content | String | Image content. | "XXX" |
result.data.content_type | String | Output text type: plain. | plain |
request_id | String | The unique identifier assigned by the system for an API call. | B4AB89C8-B135-xxxx-A6F8-2BAB801A2CE4 |
latency | Float/Int | Request latency in ms. | 10 |
usage | Object | Billing information for this call. | "usage": { "token_count": 1234 } |
usage.token_count | Int | Number of tokens output, applicable to ops-image-analyze-vlm-001 service. | 1234 |
usage.pv_count | Int | Number of calls (fixed at 1), applicable to ops-image-analyze-ocr-001 service. | 1 |
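The response fields above can be mapped onto a small typed structure on the client side. This is an illustrative sketch. ExtractionResult and parse_sync_response are hypothetical names, and the sketch assumes that only one of usage.token_count and usage.pv_count is present, depending on which service was called.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractionResult:
    content: str
    content_type: str
    token_count: Optional[int]  # reported for ops-image-analyze-vlm-001
    pv_count: Optional[int]     # reported for ops-image-analyze-ocr-001


def parse_sync_response(response_text):
    """Turn a synchronous response body into an ExtractionResult."""
    payload = json.loads(response_text)
    result = payload["result"]
    if result.get("status") != "SUCCESS":
        # result.error carries the reason when status is FAIL.
        raise RuntimeError(result.get("error") or "extraction did not succeed")
    usage = payload.get("usage", {})
    return ExtractionResult(
        content=result["data"]["content"],
        content_type=result["data"]["content_type"],
        token_count=usage.get("token_count"),
        pv_count=usage.get("pv_count"),
    )
```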
cURL request example
curl -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <Your API key>" \
"http://***-hangzhou.opensearch.aliyuncs.com/v3/openapi/workspaces/default/image-analyze/ops-image-analyze-vlm-001/sync" \
-d '{
  "document": {
    "url": "https://img01.yzcdn.cn/****/2017/05/11/FoTMgBa0SvUaAeFruY7i7O_EUMhf.jpg%21middle.jpg",
    "file_type": "jpg"
  }
}'
Response example
Normal response example
{
"request_id":"BB5CD4C3-C8B6-40E7-A037-4ADAE88A****",
"latency":12525.0,
"http_code":200,
"result":{
"status":"SUCCESS",
"data":{
"content":"The image shows a WMF brand blender surrounded by various fruits and vegetables. Next to the blender is a cup filled with red juice, with a straw inserted. Scattered on the table are a few slices of lemon, some strawberries, and some kiwis. In one corner of the table, there is a cut pineapple and an orange. Additionally, some carrots are cut into small pieces and placed in the blender, ready for juicing. The whole scene looks very healthy and delicious.",
"content_type":"plain"
}
},
"usage":{
"token_count":95
}
}
Abnormal response example
If the request fails, the response indicates the cause in the code and message fields.
{
"request_id": "6F33AFB6-A35C-4DA7-AFD2-9EA16CCF****",
"latency": 2.0,
"code": "InvalidParameter",
"http_code": 400,
"message": "JSON parse error: Cannot deserialize value of type `ImageStorage` from String \"xxx\""
}
Status code description
HTTP status code | Error code | Description |
200 | - | The request succeeded. This also covers cases where the task itself failed, so check result.status for the actual task status. |
404 | BadRequest.TaskNotExist | Task does not exist. |
400 | InvalidParameter | Invalid request. |
500 | InternalServerError | Internal error. |
For more status code descriptions, see the referenced document.
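Since HTTP 200 also covers failed tasks, a client needs to check both the error fields and result.status. The sketch below shows one way to do that. ApiError and check_response are hypothetical helpers, and the "TaskFailed" label is invented for this example and is not a service error code.

```python
class ApiError(Exception):
    """Error reported by the service (hypothetical helper, not an SDK class)."""

    def __init__(self, http_code, code, message, request_id=None):
        super().__init__(f"[{http_code}] {code}: {message}")
        self.http_code = http_code
        self.code = code
        self.message = message
        self.request_id = request_id


def check_response(payload):
    """Return payload["result"] or raise ApiError.

    `payload` is the decoded JSON body of a response from this API.
    """
    http_code = payload.get("http_code", 200)
    request_id = payload.get("request_id")
    if http_code != 200 or "code" in payload:
        raise ApiError(http_code, payload.get("code"), payload.get("message"), request_id)
    result = payload.get("result", {})
    # HTTP 200 also covers failed tasks, so the task status is checked too.
    if result.get("status") == "FAIL":
        # "TaskFailed" is a label invented for this sketch.
        raise ApiError(200, "TaskFailed", result.get("error"), request_id)
    return result
```

Keeping request_id on the exception makes it easier to quote the failing call when you submit a ticket.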