This topic describes how to handle Chinese characters and time when you use the Python SDK.
Chinese characters
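If your Python source file contains Chinese characters, an error may occur when the file is run, because Python 2 assumes ASCII source encoding unless an encoding is declared. To prevent this error, declare the character encoding on the first or second line of the file. For example: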
# -*- coding: utf-8 -*-
Data types
Python 2 has the following two string data types:

| Data type | Description |
| --- | --- |
| str | A byte string. This corresponds to the bytes type in Python 3.x. |
| unicode | A Unicode string. Its length is the number of characters. For example, the length of u'éà' is 2. |

Python 3 has the following two string data types:
| Data type | Description |
| --- | --- |
| str | A Unicode string. This corresponds to the unicode type in Python 2.x. |
| bytes | A byte stream. The length is the number of bytes. For example, the length of '中文' encoded as bytes depends on the encoding. If the encoding is UTF-8, the length is 6. |

Input and output type conventions
The input type conventions are as follows:
| Input | Type | Notes |
| --- | --- | --- |
| OSS file name | str | If the type is bytes, it must be UTF-8 encoded. |
| Local file name | str, unicode | If the type is bytes, it must be UTF-8 encoded. For example, the yourLocalFile parameter in bucket.get_object_to_file. |
| Input data stream | bytes | For example, the data parameter in bucket.put_object. |
The output type conventions are as follows:
| Output | Type | Notes |
| --- | --- | --- |
| Result of XML parsing | str | For example, the strings in the result of bucket.list_objects. |
| Downloaded content | bytes | |
The Python SDK uses UTF-8 encoding for the bytes type by default. Ensure that your Python source file is also UTF-8 encoded.
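The conventions above can be illustrated with a minimal Python 3 sketch. It makes no SDK calls; the variable names are illustrative and `downloaded` is only a stand-in for bytes that a download would return. Text is encoded to UTF-8 bytes before it is used as an input data stream, and downloaded bytes are decoded back to str:

```python
# -*- coding: utf-8 -*-
# Illustrative only: no OSS request is made here.

text = '中文'                    # str: the length counts characters
payload = text.encode('utf-8')   # bytes: the form expected for an input data stream

print(len(text))      # 2
print(len(payload))   # 6

# Downloaded content arrives as bytes; decode it to get text back.
downloaded = payload             # stand-in for bytes read from a download
restored = downloaded.decode('utf-8')
print(restored == text)  # True
```

Encoding and decoding with the same codec on both sides keeps the round trip lossless.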
Type conversion functions
The Python SDK provides three functions for type conversion:
| Function | Description |
| --- | --- |
| to_bytes | In Python 2.x, converts unicode to str. In Python 3.x, converts str to bytes. Values of any other type are returned unchanged. |
| to_unicode | In Python 2.x, converts str to unicode. In Python 3.x, converts bytes to str. Values of any other type are returned unchanged. |
| to_string | In Python 2.x, this function is equivalent to to_bytes. In Python 3.x, it is equivalent to to_unicode. |
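The Python 3 behavior described in the table can be sketched as follows. These are hypothetical stand-ins written for illustration, not the SDK's own code; the `_sketch` names are invented here, and UTF-8 is assumed as the encoding because it is the SDK default mentioned earlier.

```python
# Hypothetical stand-ins that mirror the Python 3 behavior described above.
# They are not the SDK's implementation.

def to_bytes_sketch(value):
    """Encode str to UTF-8 bytes; return any other type unchanged."""
    if isinstance(value, str):
        return value.encode('utf-8')
    return value


def to_unicode_sketch(value):
    """Decode UTF-8 bytes to str; return any other type unchanged."""
    if isinstance(value, bytes):
        return value.decode('utf-8')
    return value


# In Python 3, to_string behaves like to_unicode.
to_string_sketch = to_unicode_sketch

print(to_bytes_sketch('abc'))       # b'abc'
print(to_unicode_sketch(b'abc'))    # abc
print(to_bytes_sketch(123))         # 123
```

Because values of other types pass through untouched, such helpers can be applied to an argument without first checking its type.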
Time
The Python SDK converts time values of the datetime.datetime type that are returned from the server to UNIX timestamps. A UNIX timestamp is the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC) on January 1, 1970. For example, the last_modified value returned by the bucket.get_object method is an integer that represents a UNIX timestamp.
You can use the datetime.datetime.fromtimestamp() method to convert a UNIX timestamp to a datetime object. If no tz argument is passed, the result is expressed in local time.
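A minimal sketch of the conversion, using only the standard library. The value of `last_modified` is a made-up timestamp standing in for one returned by the SDK.

```python
from datetime import datetime, timezone

# Hypothetical value standing in for a last_modified timestamp.
last_modified = 1700000000

# Without tz, fromtimestamp() returns a naive datetime in local time.
local_dt = datetime.fromtimestamp(last_modified)

# With tz=timezone.utc, it returns a timezone-aware datetime in UTC.
utc_dt = datetime.fromtimestamp(last_modified, tz=timezone.utc)
print(utc_dt.isoformat())  # 2023-11-14T22:13:20+00:00
```

Passing an explicit time zone avoids results that differ depending on the machine's local settings.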