Convert documents stored in OSS to target formats such as PNG, JPG, PDF, or TXT, and save the results to a specified path.
Scenarios
- Online preview: Convert PDF, Word, Excel, or PPT documents to images for direct preview on web or mobile devices without downloading.
- Cross-platform compatibility: Enable seamless document viewing across devices and operating systems.
Supported input file types
| File type | File extension |
| --- | --- |
| Word | doc, docx, wps, wpss, docm, dotm, dot, dotx, html |
| PPT | pptx, ppt, pot, potx, pps, ppsx, dps, dpt, pptm, potm, ppsm, dpss |
| Excel | xls, xlt, et, ett, xlsx, xltx, csv, xlsb, xlsm, xltm, ets |
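Before submitting a conversion task, it can help to check whether an object's extension is one of the supported input types. The following is a minimal sketch, not part of the OSS SDK: the `SUPPORTED_EXTENSIONS` mapping and the `file_type_for` helper are hypothetical names, and case-insensitive matching of extensions is an assumption.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extensions copied from the supported input file types table, grouped by file type.
SUPPORTED_EXTENSIONS = {
    "Word": {"doc", "docx", "wps", "wpss", "docm", "dotm", "dot", "dotx", "html"},
    "PPT": {"pptx", "ppt", "pot", "potx", "pps", "ppsx", "dps", "dpt",
            "pptm", "potm", "ppsm", "dpss"},
    "Excel": {"xls", "xlt", "et", "ett", "xlsx", "xltx", "csv", "xlsb",
              "xlsm", "xltm", "ets"},
}


def file_type_for(object_key: str) -> Optional[str]:
    """Return the file type for an object key, or None if the extension is not listed."""
    # Assumption: extensions are compared case-insensitively.
    ext = PurePosixPath(object_key).suffix.lstrip(".").lower()
    for file_type, extensions in SUPPORTED_EXTENSIONS.items():
        if ext in extensions:
            return file_type
    return None


print(file_type_for("reports/src.docx"))
print(file_type_for("photo.png"))
```

A `None` result only means the extension does not appear in the table; it does not say anything about the file's actual content.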
Get started
Prerequisites
- Create a bucket in OSS, upload the document to be converted to the bucket, and bind an Intelligent Media Management (IMM) Project to the bucket. The IMM Project must be in the same region as the bucket.
- You must have the relevant permissions required for IMM processing.
Convert document format
Use the OSS SDK for Java, Python, or Go to call the document conversion API and save results to a specified bucket.
Java
OSS SDK for Java V3.17.4 or later is required.
import com.aliyun.oss.ClientBuilderConfiguration;
import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.AsyncProcessObjectRequest;
import com.aliyun.oss.model.AsyncProcessObjectResult;
import com.aliyuncs.exceptions.ClientException;

import java.util.Base64;

public class Demo1 {
    public static void main(String[] args) throws ClientException {
        // Specify the endpoint of the region in which the bucket is located.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        // Specify the ID of the Alibaba Cloud region in which the bucket is located. Example: cn-hangzhou.
        String region = "cn-hangzhou";
        // Obtain a credential from the environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        // Specify the name of the bucket.
        String bucketName = "examplebucket";
        // Specify the name of the output object.
        String targetKey = "dest.png";
        // Specify the name of the source document.
        String sourceKey = "src.docx";

        // Create an OSSClient instance.
        // When the OSSClient instance is no longer in use, call the shutdown method to release resources.
        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();

        try {
            // Create a style variable of the string type to store document conversion parameters.
            String style = "doc/convert,target_png,source_docx";
            // Create an asynchronous processing instruction. The bucket name and the output object name are URL-safe Base64-encoded without padding.
            String bucketEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(bucketName.getBytes());
            String targetEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(targetKey.getBytes());
            String process = String.format("%s|sys/saveas,b_%s,o_%s", style, bucketEncoded, targetEncoded);
            // Create an AsyncProcessObjectRequest object.
            AsyncProcessObjectRequest request = new AsyncProcessObjectRequest(bucketName, sourceKey, process);
            // Execute the asynchronous processing task.
            AsyncProcessObjectResult response = ossClient.asyncProcessObject(request);
            System.out.println("EventId: " + response.getEventId());
            System.out.println("RequestId: " + response.getRequestId());
            System.out.println("TaskId: " + response.getTaskId());
        } finally {
            // Close your OSSClient.
            ossClient.shutdown();
        }
    }
}
Python
OSS SDK for Python 2.18.4 or later is required.
# -*- coding: utf-8 -*-
import base64

import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider


def main():
    # Obtain access credentials from the environment variables. Before you execute the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
    auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
    # Specify the endpoint for the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com.
    endpoint = 'https://oss-cn-hangzhou.aliyuncs.com'
    # Specify the ID of the Alibaba Cloud region in which the bucket is located. Example: cn-hangzhou.
    region = 'cn-hangzhou'
    # Specify the name of the bucket. Example: examplebucket.
    bucket_name = 'examplebucket'
    bucket = oss2.Bucket(auth, endpoint, bucket_name, region=region)
    # Specify the name of the source document.
    source_key = 'src.docx'
    # Specify the name of the output object.
    target_key = 'dest.png'
    # Create a style variable of the string type to store document conversion parameters.
    style = 'doc/convert,target_png,source_docx'
    # Create a processing instruction, in which the name of the bucket and the name of the output object are URL-safe Base64-encoded without padding.
    bucket_name_encoded = base64.urlsafe_b64encode(bucket_name.encode()).decode().rstrip('=')
    target_key_encoded = base64.urlsafe_b64encode(target_key.encode()).decode().rstrip('=')
    process = f"{style}|sys/saveas,b_{bucket_name_encoded},o_{target_key_encoded}"
    try:
        # Execute the asynchronous processing task.
        result = bucket.async_process_object(source_key, process)
        print(f"EventId: {result.event_id}")
        print(f"RequestId: {result.request_id}")
        print(f"TaskId: {result.task_id}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()
Go
OSS SDK for Go 3.0.2 or later is required.
package main

import (
	"encoding/base64"
	"fmt"
	"log"
	"os"

	"github.com/aliyun/aliyun-oss-go-sdk/oss"
)

func main() {
	// Obtain access credentials from the environment variables. Before you execute the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
	provider, err := oss.NewEnvironmentVariableCredentialsProvider()
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Create an OSSClient instance.
	// Specify the endpoint of the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com. Specify your actual endpoint.
	// Specify the ID of the Alibaba Cloud region in which the bucket is located. Example: cn-hangzhou.
	client, err := oss.New("https://oss-cn-hangzhou.aliyuncs.com", "", "", oss.SetCredentialsProvider(&provider), oss.AuthVersion(oss.AuthV4), oss.Region("cn-hangzhou"))
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Specify the name of the bucket. Example: examplebucket.
	bucketName := "examplebucket"
	bucket, err := client.Bucket(bucketName)
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Specify the name of the source document.
	sourceKey := "src.docx"
	// Specify the name of the output object.
	targetKey := "dest.png"
	// Create a style variable of the string type to store document conversion parameters.
	style := "doc/convert,target_png,source_docx"
	// Create a processing instruction, in which the name of the bucket and the name of the output object are URL-safe Base64-encoded.
	bucketNameEncoded := base64.URLEncoding.EncodeToString([]byte(bucketName))
	targetKeyEncoded := base64.URLEncoding.EncodeToString([]byte(targetKey))
	process := fmt.Sprintf("%s|sys/saveas,b_%v,o_%v", style, bucketNameEncoded, targetKeyEncoded)
	// Execute the asynchronous processing task.
	result, err := bucket.AsyncProcessObject(sourceKey, process)
	if err != nil {
		log.Fatalf("Failed to async process object: %s", err)
	}
	fmt.Printf("EventId: %s\n", result.EventId)
	fmt.Printf("RequestId: %s\n", result.RequestId)
	fmt.Printf("TaskId: %s\n", result.TaskId)
}
Parameter description
Action: doc/convert
Parameters:
| Parameter name | Type | Required | Description |
| --- | --- | --- | --- |
| target | string | Yes | The format of the output object. Valid values: |
| source | string | No | The source file format. Defaults to the object name extension. Valid values: |
| pages | string | No | The page numbers to convert. For example: |
Use sys/saveas to save converted documents to a specified bucket. For details, see Save As. To receive the conversion result, use the notify parameter. For details, see Notifications.
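The processing instruction combines the doc/convert parameters, the sys/saveas destination, and an optional notify topic into one string. The sketch below assembles that string following the pattern used in the samples on this page. `build_process` and `b64url` are hypothetical helper names, not SDK functions, and stripping the Base64 padding mirrors the Java and Python samples rather than a documented requirement.

```python
import base64
from typing import Optional


def b64url(value: str) -> str:
    """URL-safe Base64-encode a string and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(value.encode("utf-8")).decode("ascii").rstrip("=")


def build_process(target: str, bucket: str, object_key: str,
                  source: Optional[str] = None,
                  topic: Optional[str] = None) -> str:
    """Assemble a doc/convert instruction that saves its output with sys/saveas."""
    params = [f"target_{target}"]
    if source:
        # source is optional; when omitted, the object name extension is used.
        params.append(f"source_{source}")
    process = "doc/convert," + ",".join(params)
    process += f"|sys/saveas,b_{b64url(bucket)},o_{b64url(object_key)}"
    if topic:
        # Optional completion notification to an SMQ topic.
        process += f"/notify,topic_{b64url(topic)}"
    return process


print(build_process("png", "examplebucket", "dest.png", source="docx"))
```

The returned string can be passed as the process argument of the asynchronous processing call shown in the SDK samples.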
Event notifications
Document conversion is asynchronous. To receive the processing result without polling, configure event notifications with Simple Message Queue (SMQ, formerly MNS).
Configure event notifications
Related APIs
For advanced customization, call the RESTful API directly. In this case, you must calculate the request signature in your code. For details, see Signature Version 4 (Recommended).
Convert document format
- Source object
  - Document format: DOCX
  - Document name: example.docx
- Destination object
  - Object format: PNG
  - Storage path: oss://test-bucket/doc_images/{index}.png
    - b_dGVzdC1idWNrZXQ=: After the conversion is complete, the output is saved to the bucket named test-bucket (dGVzdC1idWNrZXQ= is the Base64-encoded value of test-bucket).
    - o_ZG9jX2ltYWdlcy97aW5kZXh9LnBuZw==: The {index} variable names each image after its page number in example.docx, and the images are saved to the doc_images directory (ZG9jX2ltYWdlcy97aW5kZXh9LnBuZw== is the Base64-encoded value of doc_images/{index}.png).
- Conversion completion notification: Send to the Simple Message Queue (SMQ, formerly MNS) topic named test-topic
Processing example
// Convert the example.docx file to PNG format image files.
POST /example.docx?x-oss-async-process HTTP/1.1
Host: doc-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: SignatureValue

x-oss-async-process=doc/convert,target_png,source_docx|sys/saveas,b_dGVzdC1idWNrZXQ=,o_ZG9jX2ltYWdlcy97aW5kZXh9LnBuZw==/notify,topic_dGVzdC10b3BpYw
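To check what the Base64-encoded fields in the request body refer to, the instruction can be split and decoded locally. This is an illustrative sketch only: `describe_process` and `b64url_decode` are hypothetical helpers, and the parsing assumes the exact layout of the example above (doc/convert, then sys/saveas, then an optional notify clause).

```python
import base64


def b64url_decode(value: str) -> str:
    """Decode a URL-safe Base64 value, restoring '=' padding if it was stripped."""
    padded = value + "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


def describe_process(process: str) -> dict:
    """Split a doc/convert instruction into readable fields."""
    convert, _, rest = process.partition("|")
    saveas, _, notify = rest.partition("/notify,")
    fields = {}
    # doc/convert parameters are plain key_value pairs.
    for part in convert.split(",")[1:]:
        key, _, value = part.partition("_")
        fields[key] = value
    # sys/saveas values are Base64-encoded: b_ is the bucket, o_ is the object.
    names = {"b": "bucket", "o": "object"}
    for part in saveas.split(",")[1:]:
        key, _, value = part.partition("_")
        fields[names.get(key, key)] = b64url_decode(value)
    # The notify topic is Base64-encoded as well.
    if notify:
        for part in notify.split(","):
            key, _, value = part.partition("_")
            fields[key] = b64url_decode(value)
    return fields


example = ("doc/convert,target_png,source_docx"
           "|sys/saveas,b_dGVzdC1idWNrZXQ=,o_ZG9jX2ltYWdlcy97aW5kZXh9LnBuZw=="
           "/notify,topic_dGVzdC10b3BpYw")
print(describe_process(example))
```

Note that the example encodes the bucket and object names with padding and the topic name without it; the decoder above accepts both forms.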
Notes
- Document conversion supports only asynchronous processing (x-oss-async-process).
- Anonymous access is not supported.
- The maximum file size supported for document format conversion is 200 MB, which cannot be adjusted.
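Because the size limit cannot be adjusted, a client-side check before submitting the task avoids a request that is bound to fail. A minimal sketch follows; `within_size_limit` is a hypothetical helper, and both the binary interpretation of 200 MB (1024 × 1024 bytes per MB) and the inclusive boundary are assumptions.

```python
# Limit stated in the notes above.
# Assumptions: 1 MB is counted as 1024 * 1024 bytes, and a file of exactly
# 200 MB is still accepted.
MAX_SOURCE_BYTES = 200 * 1024 * 1024


def within_size_limit(size_in_bytes: int) -> bool:
    """Return True if a source document of this size can be submitted."""
    if size_in_bytes < 0:
        raise ValueError("size_in_bytes must be non-negative")
    return size_in_bytes <= MAX_SOURCE_BYTES


print(within_size_limit(5 * 1024 * 1024))
print(within_size_limit(201 * 1024 * 1024))
```

With the OSS SDK for Python, the object size would typically come from the content_length attribute of the result of bucket.head_object(source_key).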
FAQ
Does OSS document conversion support converting a specific sheet in an Excel file?
No. OSS document conversion converts all sheets in an Excel file. To convert a specific sheet, call the IMM CreateOfficeConversionTask API with the SheetIndex parameter.