There are three main types of data storage: object storage, file storage, and block storage. What are they, and how do they differ?
File storage is one of the most common types of storage: most people are familiar with it from their day-to-day computer usage. Consider a simple case: you store photos from a recent trip on your personal laptop/desktop. First, you create a folder named ‘my trip’. Now you can add another folder under this folder with the name ‘my favorites’ and put your favorite photos in it. In this way, you are organizing your files into a hierarchical structure with folders and sub-folders and can access them using the folder/file path.
When a file is stored in this way, it has limited metadata attached to it such as creation date, modification date, and file size. This simple organizational schema can begin to cause problems as the amount of data grows. Performance can go down because of the increasing resource demands on the filesystem to keep track of files and folders, and these “structural” problems cannot be solved by simply increasing the storage space available to the filesystem.
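The folder example above can be sketched in a few lines of Python. This is only an illustration: the file name `beach.jpg`, the placeholder bytes, and the helper `build_trip_folders` are assumptions made for the example, and the folders are created in a temporary directory rather than on a real photo drive.

```python
import tempfile
from pathlib import Path


def build_trip_folders(root: Path) -> Path:
    """Create 'my trip/my favorites' under root and store one placeholder photo."""
    favorites = root / "my trip" / "my favorites"
    favorites.mkdir(parents=True)            # creates both folder levels
    photo = favorites / "beach.jpg"          # hypothetical file name
    photo.write_bytes(b"not a real image")   # 16 placeholder bytes
    return photo


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    photo = build_trip_folders(root)

    # The file is addressed by its position in the folder hierarchy.
    relative_parts = photo.relative_to(root).parts

    # The filesystem keeps only a small, fixed set of metadata per file.
    info = photo.stat()
    size_bytes = info.st_size
    modified_at = info.st_mtime

    print("path parts:", relative_parts)
    print("size:", size_bytes, "bytes; modified (epoch seconds):", modified_at)
```

Note that the metadata comes from the filesystem (`stat`), not from the file itself, which is the contrast with object storage drawn later in this article.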
Despite potential issues at scale, filesystems perform well for day-to-day use on personal computers and on workplace servers, including those in medium to large enterprises. File storage is commonly deployed on hard drives and Network Attached Storage (NAS) systems.
Object storage is a type of data storage in which each unit of data (called an “object”) is stored as a discrete unit. These objects can hold virtually any type of data: PDF, video, audio, text, website data, or any other file type.
As opposed to file storage, these objects are stored in a single flat address space rather than in a nested, hierarchical structure of folders. Moreover, all default and custom metadata is stored with the object itself (not as part of a separate filesystem table or index), and each object has a unique identifier, which makes objects easier to index and access.
Object storage is common in cloud-based storage scenarios and can be used to manage, process, and distribute content with very high scalability and reliability. The flat addressing scheme makes accessing an individual object fast and simple: object names serve as “keys” in a lookup table, so the storage system only needs the key (name) of the object you are looking for in order to return it.
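A minimal in-memory sketch of this idea is shown below. It is not the API of any real object storage service; the class `FlatObjectStore`, its `put`/`get` methods, and the example keys and metadata fields are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object bundles its data with its own metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy object store: one flat lookup table from key to object."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        # Custom metadata travels with the object, not in a separate index.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> StoredObject:
        # A single lookup by key; no folder tree is walked.
        return self._objects[key]

    def keys(self) -> list[str]:
        return sorted(self._objects)


store = FlatObjectStore()
# The slashes below are just characters in the key, not real folders.
store.put("my trip/my favorites/beach.jpg", b"<image bytes>",
          content_type="image/jpeg", camera="phone")
store.put("report.pdf", b"<pdf bytes>", content_type="application/pdf")

beach = store.get("my trip/my favorites/beach.jpg")
print(beach.metadata)
print(store.keys())
```

A key may look like a path, but the store treats it as an opaque name, which is why all objects can live side by side in one flat namespace.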
Object storage and file storage both treat a file or object as a single “unit” of data. Block storage, as the name suggests, treats data as a sequence of fixed-size “chunks” or “blocks”, so each file or object may be spread across multiple blocks. These blocks need not be stored contiguously. Whenever the data is requested, the underlying storage system reassembles the blocks and serves the request.
This works without a hierarchical structure because each block has its own unique address and exists independently of all the others. In some cases, block storage can retrieve data very quickly because there is not necessarily a single path to the data being read (think of a disk array, in which data for the same file can be read from multiple disks). Block storage also achieves high efficiency because blocks can be stored wherever it is most convenient (blocks representing the same file or object do not need to be adjacent to one another). However, block storage is usually expensive and has limited capability to handle metadata, which is an object- or file-level concept and therefore has to be handled at the application level. Block storage is commonly deployed in Storage Area Network (SAN) systems. In most applications, object or file storage is actually a layer on top of underlying block storage: you can think of block storage as the foundation on which file storage systems are built.
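The splitting and reassembly described above can be sketched as follows. This is a toy model, not how a real SAN or disk driver works: `BlockDevice`, `write_data`, `read_data`, and the 4-byte block size are assumptions for illustration, and the final short chunk is left unpadded to keep the example small.

```python
import random

BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


class BlockDevice:
    """Toy device: a fixed number of individually addressable blocks."""

    def __init__(self, num_blocks: int, block_size: int = BLOCK_SIZE) -> None:
        self.block_size = block_size
        self.blocks: list[bytes | None] = [None] * num_blocks

    def free_addresses(self) -> list[int]:
        return [addr for addr, blk in enumerate(self.blocks) if blk is None]


def write_data(device: BlockDevice, data: bytes, rng: random.Random) -> list[int]:
    """Split data into blocks, scatter them, and return the ordered address list."""
    size = device.block_size
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    free = device.free_addresses()
    if len(chunks) > len(free):
        raise ValueError("not enough free blocks")
    # Blocks go wherever there is room; they need not be adjacent.
    addresses = rng.sample(free, len(chunks))
    for addr, chunk in zip(addresses, chunks):
        device.blocks[addr] = chunk
    return addresses


def read_data(device: BlockDevice, addresses: list[int]) -> bytes:
    """Reassemble the original data by reading blocks in recorded order."""
    return b"".join(device.blocks[addr] for addr in addresses)


device = BlockDevice(num_blocks=16)
payload = b"holiday photo bytes"
block_map = write_data(device, payload, random.Random(0))
restored = read_data(device, block_map)

print("block addresses:", block_map)
print("restored:", restored)
```

The ordered address list (`block_map`) is exactly the kind of bookkeeping that a filesystem or object store layered on top of block storage has to maintain; the block device itself knows nothing about files.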
The table below compares the three types of storage. Block storage is 'highly structured' because data is arranged in fixed-size blocks for easy indexing and search. File storage is indexed and 'structured' in a hierarchical manner. Object storage is 'unstructured' because no format or structure is imposed on the stored data; there is simply a flat list of objects. In simple terms, “Data Consistency” refers to the read, write, and update guarantees made by the storage system, such as whether recently written objects are immediately available to be read back. “Access Level” is the level (object, file, or block) at which users have permission to access and manipulate data.
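| | Object storage | File storage | Block storage |
| --- | --- | --- | --- |
| Data Consistency | Eventual consistency | Strong consistency | Strong consistency |
| Structure | Unstructured | Hierarchically structured | Highly structured at block level |
| Access Level | Object level | File level | Block level |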