Before you use OSS, we recommend that you have a basic understanding of the following concepts.
A bucket is a container for objects stored in OSS. Every object is contained in a bucket. The data model structure of Alibaba Cloud OSS is flat instead of hierarchical.
- All objects (files) belong directly to their buckets. Therefore, OSS does not have the hierarchical structure of directories and subfolders that a file system has.
- A user can have multiple buckets.
- A bucket name must be globally unique within OSS and cannot be changed after the bucket is created.
- A bucket can contain an unlimited number of objects.
The naming conventions for buckets are as follows:
- A bucket name can contain only lowercase letters, numbers, and hyphens (-).
- A bucket name must start and end with a lowercase letter or a number.
- A bucket name must be 3 to 63 bytes in length.
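The naming rules above can be checked locally before you send a request. The following is a minimal sketch; the function name `is_valid_bucket_name` is a hypothetical helper for illustration and is not part of any OSS SDK. It checks only the three rules listed above and cannot tell whether a name is already taken.

```python
import re

# One lowercase letter or digit, then 1 to 61 lowercase letters, digits,
# or hyphens, then one lowercase letter or digit: 3 to 63 characters total.
# All allowed characters are ASCII, so characters and bytes are the same.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if name satisfies the three bucket naming rules."""
    return _BUCKET_NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_bucket_name("my-bucket-01")` returns `True`, while names such as `"My-Bucket"` or `"-logs"` are rejected.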
Objects, also known as files, are the fundamental entities stored in OSS. An object is composed of a key, metadata, and data. The key is the unique name of the object within a bucket. Metadata defines the attributes of an object, such as the last modified time and the object size. You can also specify custom metadata for an object.
The lifecycle of an object starts when it is uploaded and ends when it is deleted. During the lifecycle, the object content cannot be changed. If you want to modify an object, you must upload a new object with the same name as the existing one to replace it. Therefore, unlike a file system, OSS does not allow users to modify objects directly.
OSS provides the Append Upload function, which allows you to continually append data to the end of an object.
The naming conventions for objects are as follows:
- An object name must use UTF-8 encoding.
- An object name must be 1 to 1023 bytes in length.
- An object name cannot start with a backslash ( \ ) or a forward slash ( / ).
Note: Object names are case sensitive. Unless otherwise stated, objects and files mentioned in OSS documents are collectively called objects.
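The object naming rules are expressed in bytes rather than characters, which matters for non-ASCII names. The following sketch shows one way to check the rules locally. The function name `is_valid_object_name` is a hypothetical helper, not an SDK call.

```python
def is_valid_object_name(key: str) -> bool:
    """Return True if key satisfies the three object naming rules."""
    try:
        encoded = key.encode("utf-8")
    except UnicodeEncodeError:
        # Strings that cannot be encoded as UTF-8 (for example, lone
        # surrogates) cannot be used as object names.
        return False
    # The length limit applies to the encoded bytes, not to characters.
    if not 1 <= len(encoded) <= 1023:
        return False
    # A name cannot start with a forward slash or a backslash.
    return not key.startswith(("/", "\\"))
```

A name such as `test1/test.jpg` passes, because a slash is only forbidden as the first character. A name made of 512 two-byte characters fails, because it encodes to 1024 bytes.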
A region represents the physical location of an OSS data center. You can choose the region where OSS stores the buckets you create. You may choose a region to optimize latency, minimize costs, or address regulatory requirements. Generally, the closer a user is to a region, the faster the access speed. For more information, see OSS regions and endpoints.
Regions are configured at the bucket level instead of the object level. Therefore, all objects contained in a bucket are stored in the same region. The region is specified when a bucket is created and cannot be changed after the bucket is created.
An endpoint is the domain name used to access OSS. OSS provides external services through HTTP RESTful APIs. Different regions use different endpoints. Within the same region, access through the intranet and access through the Internet also use different endpoints. For example, for the Hangzhou region, the Internet endpoint is oss-cn-hangzhou.aliyuncs.com, and the intranet endpoint is oss-cn-hangzhou-internal.aliyuncs.com. For more information, see OSS regions and endpoints.
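The two Hangzhou endpoints above differ only by an `-internal` suffix. The following sketch builds an endpoint from a region ID under the assumption that other regions follow the same naming pattern as the Hangzhou example. The function name `oss_endpoint` and the region ID format (`cn-hangzhou`) are illustrative assumptions; always confirm the actual endpoint in OSS regions and endpoints.

```python
def oss_endpoint(region_id: str, internal: bool = False) -> str:
    """Build an endpoint from a region ID such as "cn-hangzhou".

    Assumption: the naming pattern is inferred from the Hangzhou example
    and may not hold for every region.
    """
    suffix = "-internal" if internal else ""
    return f"oss-{region_id}{suffix}.aliyuncs.com"


internet = oss_endpoint("cn-hangzhou")
intranet = oss_endpoint("cn-hangzhou", internal=True)
```

Here `internet` is `oss-cn-hangzhou.aliyuncs.com` and `intranet` is `oss-cn-hangzhou-internal.aliyuncs.com`, matching the example above.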
An AccessKey (AK) is composed of an AccessKeyId and an AccessKeySecret, which work as a pair to verify the identity of a request sender. OSS verifies the identity of a request sender by using the AccessKeyId/AccessKeySecret symmetric encryption method. The AccessKeyId identifies the user. The AccessKeySecret is used by the user to generate the request signature and by OSS to verify that signature. The AccessKeySecret must be kept confidential. In OSS, AccessKeys are generated by the following three methods:
- The bucket owner applies for AccessKeys.
- The bucket owner uses RAM to authorize a third party to apply for AccessKeys.
- The bucket owner uses STS to authorize a third party to apply for AccessKeys.
For more information about AccessKeys, see Access control.
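The idea of signing with a shared secret can be illustrated with a keyed hash (HMAC). The following is a conceptual sketch only: the string that is signed, the hash algorithm (SHA-256 here), and the function names `sign` and `verify` are assumptions chosen for illustration, and they do not reproduce the exact signature format that OSS requires.

```python
import base64
import hashlib
import hmac


def sign(access_key_secret: str, string_to_sign: str) -> str:
    """Client side: derive a signature from the secret and the request text."""
    digest = hmac.new(
        access_key_secret.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,  # illustrative choice of hash algorithm
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def verify(access_key_secret: str, string_to_sign: str, signature: str) -> bool:
    """Server side: recompute the signature with the same secret and compare."""
    expected = sign(access_key_secret, string_to_sign)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)
```

Because both sides hold the same AccessKeySecret, the secret itself never needs to travel with the request. Only the AccessKeyId and the signature are sent, and a request signed with a different secret fails verification.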
In OSS, object operations are atomic, which means that an operation either succeeds or fails, with no intermediate state. OSS never writes corrupted or partial data.
Object operations in OSS are strongly consistent. For example, once a user receives an upload (PUT) success response, the object can be read immediately, and the data has already been written in triplicate. Therefore, OSS provides strong consistency for read-after-write. The same is true for delete operations. Once a user deletes an object, the object no longer exists.
Data redundancy mechanism
- Object operations in OSS are strongly consistent. For example, once a user receives an upload or copy success response, the object can be read immediately, and the redundant data has already been written to multiple devices.
- To ensure complete data transmission, OSS checks whether an error occurs when packets are transmitted between the client and the server by calculating the checksum of the network traffic packets.
- The redundant storage mechanism of OSS can avoid data loss if two storage facilities are damaged at the same time.
- After data is stored in OSS, OSS checks whether redundant data is lost. If so, OSS recovers the lost redundant data to ensure data reliability and availability.
- OSS periodically checks the integrity of data through verification to discover data damage caused by factors such as hardware failure. If data is partially damaged or lost, OSS reconstructs and repairs the damaged data by using redundant data.
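The checksum-based checks described above can be illustrated on the client side. The following sketch uses CRC32 from the Python standard library purely as an example; which checksum algorithm OSS uses internally is not stated here, and the function names `crc32_of` and `is_intact` are hypothetical.

```python
import zlib


def crc32_of(data: bytes) -> int:
    """Compute an unsigned 32-bit CRC over the payload."""
    return zlib.crc32(data) & 0xFFFFFFFF


def is_intact(data: bytes, expected_crc: int) -> bool:
    """Return True if the payload still matches the checksum recorded earlier."""
    return crc32_of(data) == expected_crc


payload = b"example object content"
recorded = crc32_of(payload)  # computed before transmission or storage
```

If a byte of `payload` changes in transit or at rest, `is_intact` returns `False` for the altered copy, which signals that the data must be retransmitted or repaired from a redundant copy.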
Comparison between OSS and file systems
|Comparison item||OSS||File system|
|Data model||OSS is a distributed object storage service that uses a key-value pair format.||The file system is a hierarchical tree structure of directories that contain files.|
|Data retrieval||Objects are retrieved based on unique object names (keys). Although users can use names like test1/test.jpg, this does not indicate that the object test.jpg is saved in a directory named test1. For OSS, test1/test.jpg and a.jpg have no essential difference. Similar amounts of resources are consumed during access to objects of different names.||Files are retrieved based on their locations in directories.|
|Advantage||OSS supports massive concurrent accesses, which means large volumes of unstructured data (such as images, videos, and documents) can be stored and retrieved without excessive use of resources.||Folder operations such as renaming, moving, and deleting directories are quite easy, because data does not need to be copied and replaced.|
|Disadvantage||The stored objects cannot be modified directly. If you want to modify an object, you must upload a new object with the same name to replace the existing one.||System performance depends on the capacity of a single device. The more files and directories that are created in the file system, the more resources are consumed, and the longer user processes take.|
As a result, mapping OSS to a file system is not a recommended practice. When you use OSS, we recommend that you make full use of its advantages, in particular its ability to store and process massive volumes of unstructured data, such as images, videos, and documents.
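To see why a name like test1/test.jpg does not imply a directory, you can model a bucket as a flat collection of keys and derive a folder-like view by string matching alone. The following is a local sketch, not an SDK call; the function name `list_keys` and its `prefix` and `delimiter` parameters are assumptions made for this illustration.

```python
def list_keys(keys, prefix="", delimiter="/"):
    """Group flat keys into a directory-like view under prefix.

    Returns (objects, common_prefixes): keys that sit directly under the
    prefix, and the simulated "subfolders" found below it.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Everything up to the first delimiter acts like a folder name.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = ["a.jpg", "test1/test.jpg", "test1/sub/x.png"]
top_level = list_keys(keys)
under_test1 = list_keys(keys, prefix="test1/")
```

In this model, `top_level` contains the object `a.jpg` and the simulated folder `test1/`, even though no directory named test1 is stored anywhere. The folder exists only as a shared prefix of two keys.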
The mapping between OSS concepts and file system concepts is as follows:
|OSS||File system|
|GetService||Retrieving the list of home directories|
|GetBucket||Retrieving the list of files|
|PutObject||Writing a file|
|AppendObject||Appending data to an existing file|
|GetObject||Reading a file|
|DeleteObject||Deleting a file|
|NA||Modifying file content|
|CopyObject (same target and source)||Modifying file attributes|
|CopyObject||Copying a file|
|NA||Renaming a file|
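The two NA rows mean that these file system operations have no single OSS equivalent and must be composed from other operations. The following sketch uses a hypothetical in-memory class, `InMemoryBucket`, as a stand-in for a bucket so that it can run without credentials. The method names mirror the operations in the table but are not SDK calls.

```python
class InMemoryBucket:
    """A toy stand-in for a bucket: a flat mapping from key to bytes."""

    def __init__(self):
        self._objects = {}

    def put_object(self, key, data):
        self._objects[key] = bytes(data)  # replaces any existing object

    def get_object(self, key):
        return self._objects[key]

    def copy_object(self, source_key, target_key):
        self._objects[target_key] = self._objects[source_key]

    def delete_object(self, key):
        self._objects.pop(key, None)

    def exists(self, key):
        return key in self._objects


def rename(bucket, old_key, new_key):
    """Emulate a rename: copy to the new key, then delete the old key."""
    if old_key == new_key:
        return  # nothing to do, and deleting would lose the object
    bucket.copy_object(old_key, new_key)
    bucket.delete_object(old_key)


def modify(bucket, key, transform):
    """Emulate an in-place edit: read, change locally, upload a replacement."""
    bucket.put_object(key, transform(bucket.get_object(key)))
```

Both helpers touch the full object content, so renaming or editing a large object costs roughly as much as copying or uploading it again. This is one reason why treating OSS as a file system is not recommended.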