The document snapshot feature renders a specific page of a document, such as a Word, Excel, PowerPoint, or PDF file, as an image in the cloud, without downloading the file first. It supports scenarios such as web page embedding and data backup.
Scenarios
Data backup and recovery: You can periodically create snapshots of documents in your Object Storage Service (OSS) buckets for data backup.
Key information extraction: You can use document snapshots to capture a specific page to quickly extract key information.
How to use this feature
Prerequisites
Before you use this feature, create a bucket in Object Storage Service (OSS), upload the document to be processed to the bucket, and attach an Intelligent Media Management (IMM) project to the bucket. The IMM project must be in the same region as the bucket.
Document snapshots
Use an OSS SDK to generate a signed URL that carries the document snapshot processing instruction. Accessing the signed URL returns the snapshot.
Java
package com.aliyun.oss.demo;

import com.aliyun.oss.*;
import com.aliyun.oss.common.auth.*;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.GeneratePresignedUrlRequest;
import java.net.URL;
import java.util.Date;

public class Demo {
    public static void main(String[] args) throws Throwable {
        // The China (Hangzhou) region is used as an example. Set the Endpoint to the actual region.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        // Obtain access credentials from environment variables. Before running this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        // Specify the bucket name, for example, examplebucket.
        String bucketName = "examplebucket";
        // Specify the full path of the object. If the document is not in the root directory of the bucket, you must include the full path, for example, exampledir/demo.docx.
        String objectName = "demo.docx";
        // Specify the region where the bucket is located. The China (Hangzhou) region is used as an example. Set the region to cn-hangzhou.
        String region = "cn-hangzhou";
        // Create an OSSClient instance.
        // When the OSSClient instance is no longer needed, call the shutdown method to release resources.
        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();
        try {
            // Build the document snapshot processing instruction to get a snapshot of the second page of the document.
            String style = "doc/snapshot,target_jpg,source_docx,page_2";
            // Set the expiration time of the signed URL to 3,600 seconds.
            Date expiration = new Date(new Date().getTime() + 3600 * 1000L);
            GeneratePresignedUrlRequest req = new GeneratePresignedUrlRequest(bucketName, objectName, HttpMethod.GET);
            req.setExpiration(expiration);
            req.setProcess(style);
            URL signedUrl = ossClient.generatePresignedUrl(req);
            System.out.println(signedUrl);
        } catch (OSSException oe) {
            System.out.println("Caught an OSSException, which means your request made it to OSS, "
                    + "but was rejected with an error response for some reason.");
            System.out.println("Error Message:" + oe.getErrorMessage());
            System.out.println("Error Code:" + oe.getErrorCode());
            System.out.println("Request ID:" + oe.getRequestId());
            System.out.println("Host ID:" + oe.getHostId());
        } catch (ClientException ce) {
            System.out.println("Caught a ClientException, which means the client encountered "
                    + "a serious internal problem while trying to communicate with OSS, "
                    + "such as not being able to access the network.");
            System.out.println("Error Message:" + ce.getMessage());
        } finally {
            if (ossClient != null) {
                ossClient.shutdown();
            }
        }
    }
}
Python
# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

# Obtain access credentials from environment variables. Before running this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
# Specify the bucket name.
bucket_name = 'examplebucket'
# Specify the Endpoint for the region where the bucket is located. The China (Hangzhou) region is used as an example.
endpoint = 'https://oss-cn-hangzhou.aliyuncs.com'
# Specify the general-purpose Alibaba Cloud region ID.
region = 'cn-hangzhou'
bucket = oss2.Bucket(auth, endpoint, bucket_name, region=region)
# Specify the source document name. If the document is not in the root directory of the bucket, you must include the full path, for example, exampledir/demo.docx.
key = 'demo.docx'
# Specify the expiration time in seconds.
expire_time = 3600
# Build the document snapshot processing instruction to get a snapshot of the second page of the document.
process = 'doc/snapshot,target_jpg,source_docx,page_2'
# Generate a signed URL that carries the document processing instruction.
url = bucket.sign_url('GET', key, expire_time, params={'x-oss-process': process}, slash_safe=True)
# Print the signed URL.
print(url)
Go
package main

import (
	"fmt"
	"os"

	"github.com/aliyun/aliyun-oss-go-sdk/oss"
)

func HandleError(err error) {
	fmt.Println("Error:", err)
	os.Exit(-1)
}

func main() {
	// Obtain access credentials from environment variables. Before running this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
	provider, err := oss.NewEnvironmentVariableCredentialsProvider()
	if err != nil {
		HandleError(err)
	}
	// Create an OSSClient instance.
	// Set yourEndpoint to the Endpoint of the bucket. The China (Hangzhou) region is used as an example. Set the Endpoint to https://oss-cn-hangzhou.aliyuncs.com. Set other regions as needed.
	// Set yourRegion to the region where the bucket is located. The China (Hangzhou) region is used as an example. Set the region to cn-hangzhou. Set other regions as needed.
	clientOptions := []oss.ClientOption{oss.SetCredentialsProvider(&provider)}
	clientOptions = append(clientOptions, oss.Region("yourRegion"))
	// Set the signature version.
	clientOptions = append(clientOptions, oss.AuthVersion(oss.AuthV4))
	client, err := oss.New("yourEndpoint", "", "", clientOptions...)
	if err != nil {
		HandleError(err)
	}
	// Specify the name of the bucket where the document is stored, for example, examplebucket.
	bucketName := "examplebucket"
	bucket, err := client.Bucket(bucketName)
	if err != nil {
		HandleError(err)
	}
	// Specify the document name. If the document is not in the root directory of the bucket, you must include the full path, for example, exampledir/demo.docx.
	ossObjectName := "demo.docx"
	// Generate a signed URL and set the expiration time to 3,600 seconds. (The maximum validity period is 32,400 seconds.)
	signedURL, err := bucket.SignURL(ossObjectName, oss.HTTPGet, 3600, oss.Process("doc/snapshot,target_jpg,source_docx,page_2"))
	if err != nil {
		HandleError(err)
	} else {
		fmt.Println(signedURL)
	}
}
Node.js
const OSS = require("ali-oss");

// Define a function to generate a signed URL.
async function generateSignatureUrl(fileName) {
  // Create the OSS client.
  const client = new OSS({
    // Obtain access credentials from environment variables. Before running this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
    accessKeyId: process.env.OSS_ACCESS_KEY_ID,
    accessKeySecret: process.env.OSS_ACCESS_KEY_SECRET,
    bucket: 'examplebucket',
    // Set region to the region where the bucket is located. The China (Hangzhou) region is used as an example. Set the region to oss-cn-hangzhou.
    region: 'oss-cn-hangzhou',
    // Set secure to true to use HTTPS and prevent the browser from blocking the generated download link.
    secure: true,
    authorizationV4: true
  });
  // Get the signed URL, valid for 3,600 seconds.
  return await client.signatureUrlV4('GET', 3600, {
    headers: {}, // Set the request headers based on the actual request that you send.
    queries: {
      // Build the document snapshot processing instruction to get a snapshot of the first page of the document.
      "x-oss-process": "doc/snapshot,target_jpg,source_docx,page_1"
    }
  }, fileName);
}

// Call the function and pass the file name.
generateSignatureUrl('yourFileName').then(url => {
  console.log('Generated Signature URL:', url);
}).catch(err => {
  console.error('Error generating signature URL:', err);
});
PHP
<?php
if (is_file(__DIR__ . '/../autoload.php')) {
    require_once __DIR__ . '/../autoload.php';
}
if (is_file(__DIR__ . '/../vendor/autoload.php')) {
    require_once __DIR__ . '/../vendor/autoload.php';
}

use OSS\Credentials\EnvironmentVariableCredentialsProvider;
use OSS\OssClient;

// Obtain access credentials from environment variables. Before running this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
$provider = new EnvironmentVariableCredentialsProvider();
// Set yourEndpoint to the Endpoint for the region where the bucket is located. The China (Hangzhou) region is used as an example. Set the Endpoint to https://oss-cn-hangzhou.aliyuncs.com.
$endpoint = "yourEndpoint";
// Set yourRegion to the region where the bucket is located. The China (Hangzhou) region is used as an example. Set the region to cn-hangzhou. Set other regions as needed.
$region = "yourRegion";
// Specify the bucket name, for example, examplebucket.
$bucket = "examplebucket";
// Specify the full path of the object, for example, exampledir/demo.docx. The full path of the object cannot contain the bucket name.
$object = "exampledir/demo.docx";
$config = array(
    "provider" => $provider,
    "endpoint" => $endpoint,
    "signatureVersion" => OssClient::OSS_SIGNATURE_VERSION_V4,
    "region" => $region
);
$ossClient = new OssClient($config);
// Generate a signed URL that carries the document processing instruction. The URL is valid for 3,600 seconds and can be accessed directly in a browser.
$timeout = 3600;
$options = array(
    // Build the document snapshot processing instruction to get a snapshot of the first page of the document.
    OssClient::OSS_PROCESS => "doc/snapshot,target_jpg,source_docx,page_1"
);
$signedUrl = $ossClient->signUrl($bucket, $object, $timeout, "GET", $options);
print("Signed URL: \n" . $signedUrl);
The following is an example of a generated signed URL:
https://examplebucket.oss-cn-hangzhou.aliyuncs.com/demo.docx?x-oss-process=doc%2Fsnapshot%2Ctarget_jpg%2Csource_docx%2Cpage_1&x-oss-date=20250225T023122Z&x-oss-expires=3600&x-oss-signature-version=OSS4-HMAC-SHA256&x-oss-credential=LTAI********************%2F20250225%2Fcn-hangzhou%2Foss%2Faliyun_v4_request&x-oss-signature=c6620caa4dc160e5a70ee96b5bae08464edf7a41bb6d47432eda65474f68f26a
Copy the generated URL and paste it into the address bar of your browser to view the specified document snapshot.
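To confirm which processing instruction a signed URL carries before you share it, you can decode its query string with the Python standard library. The following is a minimal sketch. The helper name `snapshot_instruction` is hypothetical, and the URL is a shortened stand-in for a signed URL, with a placeholder signature.

```python
from urllib.parse import urlsplit, parse_qs


def snapshot_instruction(signed_url: str) -> str:
    """Return the decoded x-oss-process value carried by a signed URL."""
    query = parse_qs(urlsplit(signed_url).query)
    values = query.get("x-oss-process")
    if not values:
        raise ValueError("URL does not carry an x-oss-process parameter")
    return values[0]


# Shortened stand-in for a signed URL; the signature is a placeholder.
example_url = (
    "https://examplebucket.oss-cn-hangzhou.aliyuncs.com/demo.docx"
    "?x-oss-process=doc%2Fsnapshot%2Ctarget_jpg%2Csource_docx%2Cpage_1"
    "&x-oss-expires=3600"
    "&x-oss-signature=PLACEHOLDER"
)

print(snapshot_instruction(example_url))
```

The percent-encoded `%2F` and `%2C` sequences decode back to the slash and commas of the original instruction, so the printed value can be compared directly with the string passed to the SDK.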
Parameters
Action: doc/snapshot
The following table describes the parameters.
Parameter | Type | Required | Description |
target | string | No | The target format of the image. Valid values:
|
source | string | No | The file format of the source document. By default, the file extension of the object name is used. Valid values:
Note If you do not specify this parameter and the object does not have a file extension, an error is returned. |
page | int | No | The page number of the document. The default value is 1, which indicates the first page. The maximum value is 2000. |
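The parameters are joined to the action with commas, each in the form name_value. The following is a minimal sketch of a helper that assembles the instruction string and enforces the page range from the table above. The function name `build_snapshot_process` and the constant `MAX_PAGE` are hypothetical and not part of any SDK.

```python
from typing import Optional

MAX_PAGE = 2000  # Upper bound for the page parameter, per the table above.


def build_snapshot_process(
    target: Optional[str] = None,
    source: Optional[str] = None,
    page: Optional[int] = None,
) -> str:
    """Assemble a doc/snapshot instruction from the optional parameters."""
    parts = ["doc/snapshot"]
    if target is not None:
        parts.append(f"target_{target}")
    if source is not None:
        parts.append(f"source_{source}")
    if page is not None:
        if not 1 <= page <= MAX_PAGE:
            raise ValueError(f"page must be between 1 and {MAX_PAGE}")
        parts.append(f"page_{page}")
    return ",".join(parts)


# Default processing: all parameters omitted.
print(build_snapshot_process())
# JPG snapshot of the second page of a Word document.
print(build_snapshot_process(target="jpg", source="docx", page=2))
```

The returned string can be passed as the x-oss-process value in any of the SDK samples. The sketch does not validate the target or source values.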
Related API operations
The SDK samples above are built on REST API calls. If your program has high customization requirements, you can send REST API requests directly. In that case, you must write code to calculate the signature yourself. For more information about how to calculate the `Authorization` common request header, see Signature Version 4 (recommended).
Get a snapshot of the first page of example.docx
Processing method
Default processing
Example
// Get a snapshot of the first page of example.docx.
GET /example.docx?x-oss-process=doc/snapshot HTTP/1.1
Host: doc-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: SignatureValue
Get a JPG snapshot of the second page of the Word document example
Processing method
target: jpg
source: docx
page: 2
Example
// Get a JPG snapshot of the second page of the Word document example.
GET /example?x-oss-process=doc/snapshot,target_jpg,source_docx,page_2 HTTP/1.1
Host: doc-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: SignatureValue
Permissions
An Alibaba Cloud account has full permissions by default. By default, a Resource Access Management (RAM) user or RAM role under an Alibaba Cloud account does not have any permissions. The Alibaba Cloud account or an administrator must grant permissions using a RAM policy or a bucket policy.
API | Action | Definition |
GetObject | | Downloads an object. |
| | When downloading an object, if you specify the object version through versionId, this permission is required. |
| | When downloading an object, if the object metadata contains X-Oss-Server-Side-Encryption: KMS, this permission is required. |
API | Action | Definition |
None | | Allows the use of data processing capabilities of IMM in OSS. |
API | Action | Definition |
CreateOfficeConversionTask | | Permission to use IMM for document conversion or snapshots. |
Billing
Document snapshots incur fees for the following billable items. For pricing details, see OSS Pricing and Billable items:
API | Billable item | Description |
GetObject | GET requests | You are charged request fees based on the number of successful requests. |
| Outbound traffic over the Internet | If you call the GetObject operation by using a public endpoint, such as oss-cn-hangzhou.aliyuncs.com, or an acceleration endpoint, such as oss-accelerate.aliyuncs.com, you are charged fees for outbound traffic over the Internet based on the data size. |
| Retrieval of IA objects | If IA objects are retrieved, you are charged IA data retrieval fees based on the size of the retrieved IA data. |
| Retrieval of Archive objects in a bucket for which real-time access is enabled | If you retrieve Archive objects in a bucket for which real-time access is enabled, you are charged Archive data retrieval fees based on the size of retrieved Archive objects. |
| Transfer acceleration fees | If you enable transfer acceleration and use an acceleration endpoint to access your bucket, you are charged transfer acceleration fees based on the data size. |
API | Billable item | Description |
CreateOfficeConversionTask | DocumentConvert | You are charged request fees based on the number of successful requests. |
Notes
Document snapshots support only synchronous processing (the x-oss-process method).
FAQ
What is the maximum size of a source document for a document snapshot?
The maximum size of a source document for a document snapshot is 20 MB.
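To avoid sending requests that cannot succeed, you can compare the object size with this limit before you request a snapshot. The following is a minimal sketch. The function name `can_snapshot` is hypothetical, and the sketch assumes that 1 MB means 1024 × 1024 bytes, which the text above does not state. In practice, the size would come from the object metadata, for example the Content-Length returned for the object.

```python
# Assumption: 1 MB = 1024 * 1024 bytes; the documented limit is 20 MB.
MAX_SNAPSHOT_SOURCE_BYTES = 20 * 1024 * 1024


def can_snapshot(size_bytes: int) -> bool:
    """Return True if a source document of this size is within the 20 MB limit."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return size_bytes <= MAX_SNAPSHOT_SOURCE_BYTES


print(can_snapshot(5 * 1024 * 1024))   # 5 MB document
print(can_snapshot(25 * 1024 * 1024))  # 25 MB document
```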