Evolution of image server architectures
Posted: Mar 21, 2017 9:28 AM
Today, almost all websites, web apps, and mobile apps require image display functions, and those functions play an important role within them. Proactive planning of image servers is therefore indispensable; quick image upload and download speeds, for example, are crucial. This doesn't mean your architecture must be fully robust from the very beginning, but it at least needs to be scalable and stable. Many architecture designs are available, and I'd like to share some of my thoughts on which one is ideal.
Undoubtedly, request I/O consumes the most resources on image servers. For a web app, the image server must be a separate component, or the app may crash under an overload of I/O requests to the image server. Separating image servers from app servers is therefore necessary, especially for larger websites and apps. Building separate image server clusters provides the following advantages:
1. Shares the I/O workload of web servers. Moving resource-consuming image serving onto its own machines improves the performance and stability of the web servers.
2. Enables specialized optimizations for image servers, such as dedicated caching schemes for the image service that decrease bandwidth costs and boost access speeds.
3. Improves the scalability of websites: adding more image servers grows the throughput of the image service.
From the original Web 1.0 era to Web 2.0 and today's Web 3.0, the architecture of image servers has continued to change with the scaling-up of image storage needs. What follows will describe the evolution of image server architectures during their three major phases.
The initial phase
Before we discuss the small image server architectures of this initial phase, let's look at Network File System technology, or NFS for short. Developed by Sun, NFS enables network-based file sharing among different computers and different operating systems. An NFS server functions as a file server and is used to share files among UNIX-type systems. With this type of server, you can mount a remote directory locally and access its files as if they were stored on the local disk.
If you do not want to sync all images to each image server, NFS provides you with the simplest means of file sharing as well. NFS is a distributed client/server file system and is essentially used to allow sharing among users on different machines. With NFS, users can connect to shared computers and access files on those computers as if they were local files. The detailed implementation logic of this architecture is as follows:
1. All front-end web servers mount, through NFS, the directories exported by the three image servers, and write uploaded images into them. The image server represented in [Figure 1] likewise mounts the "export" directories of the other two image servers to local paths so that Apache can provide external access to all images.
2. The user uploads images.
The user submits an upload request "post" to the web server. After processing the uploaded images, the web server copies those images to the corresponding locally "mounted" directories.
3. The user accesses an uploaded image.
When the user tries to access an uploaded image, the image server, represented in [Figure 1], will read the image from the corresponding "mounted" directories.
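The upload flow in step 2 can be sketched in a few lines. The mount point, domain, and helper name below are illustrative stand-ins, not part of the original setup:

```python
import os
import shutil

def save_upload(tmp_file, image_name, mount_point="/mnt/img1"):
    """Copy a processed upload into the NFS-mounted directory and
    return the URL under which Apache on the image server would
    expose the file (hypothetical domain)."""
    dest = os.path.join(mount_point, image_name)
    shutil.copyfile(tmp_file, dest)
    return "http://img1.example.com/" + image_name
```

Because the destination directory is an NFS mount, the copy above is all the web server has to do; the image server sees the file immediately.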
The above architecture has the following problems:
1. Performance: The architecture depends heavily on NFS; in particular, when the NFS server backing the image servers becomes faulty, the front-end web servers may also be affected. NFS is also vulnerable to deadlocks that can be resolved only by restarting the machine, and its performance degrades badly once the image volume scales beyond a certain level.
2. Availability under higher loads: Only one image server provides the image download service to external apps, so it is a single point of failure.
3. Scalability: The image servers depend heavily on one another, leaving little room for horizontal expansion.
4. Storage: Space utilization is uneven across image servers, because the architecture has no control over which web servers upload the most.
5. Security: NFS offers little protection against users who hold the web servers' passwords, as those users can freely modify NFS-hosted content.
Alternatively, you can use FTP or rsync instead of NFS to sync images between image servers. With FTP, for example, you can keep a copy of every image on each image server, which doubles as a backup. However, uploading images to web servers via FTP is time-consuming, and asynchronous syncs suffer from high latency (though this can be minor for small image files). Similarly, with rsync, each scan takes longer and longer to complete once the data files accumulate beyond a certain level, which also introduces latency.
The development phase
Websites impose stricter performance and stability requirements on image servers once they grow to a certain scale. The NFS-based architecture described above then falls short: it depends heavily on NFS and is vulnerable to a single point of failure, so the overall architecture needs an upgrade. The next image server architecture, which implements distributed image storage, emerges, as shown in the figure above.
The detailed implementation logic of this updated architecture is as follows:
1. After images are uploaded to web servers by users, the web servers process those images, and then the front-end web servers post the images to one of the image servers represented in [Figure 1], [Figure 2]…[Figure N]. After receiving posted images, the image server writes the images to local disks and returns a corresponding success status code. Front-end web servers decide further operations based on the returned status code. If a success status code is returned, the images will be resized to various thumbnails with watermarks, and then the ID of the image server and corresponding image paths will be written to the database.
2. Control image uploading
When image uploading needs to be adjusted, you can control which image server the web servers post images to simply by changing the ID of the target image server. You also need to install nginx on the target image server and provide a Python or PHP service to receive and store the images; alternatively, you can develop an nginx extension module if you do not want to run Python or PHP services.
3. Develop the user access process
When the user accesses a page, images on the page will be accessed from the corresponding image server based on the URLs of the requested images.
For example: http://imgN.xxx.com/image1.jpg
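Steps 1 through 3 can be sketched as follows. The `ImageServer` class, the modulo-based server choice, and the path layout are illustrative stand-ins for the real posting logic, not the actual implementation:

```python
class ImageServer:
    """Stand-in for an image server that stores posted images on its
    local disk (here, just a dict) and returns a status code."""
    def __init__(self, server_id):
        self.server_id = server_id
        self.files = {}

    def post_image(self, name, data):
        self.files[name] = data
        return 200  # success status code

def handle_upload(name, data, servers, db):
    # Pick the target image server; a simple hash over the name
    # stands in for the configurable target-server ID.
    sid = sum(name.encode()) % len(servers)
    status = servers[sid].post_image(name, data)
    if status == 200:
        # Record the image server ID and image path in the database,
        # as described in step 1.
        db[name] = (sid, "/data/images/" + name)
    return status
```

On a real deployment the web server would POST over HTTP to imgN.xxx.com, and the database row is what later resolves a page's image URL to the right server.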
The image server architecture in this phase adds load balancing and distributed image storage, which resolves the high-concurrency and large-storage problems to some extent. You can implement F5 hardware load balancing if you can afford it, or use open-source LVS software load balancing (with the caching function enabled as well). With this architecture, concurrent access performance improves dramatically, and extra servers can readily be deployed as required. Despite these advantages, the architecture has a minor defect: the same image may end up stored on multiple Squids, because a request may go to squid1 initially while squid2 or another Squid is hit later after the LVS session expires. A little redundancy of this sort is entirely acceptable once the concurrency problems are eliminated. In this architecture you can use Squid for second-level caching, and Varnish or Traffic Server are good choices as well. When selecting open-source caching software, consider the following:
1. Performance: Varnish has a technical advantage over Squid because it caches pages in virtual memory and lets the operating system manage paging, which avoids frequent exchanges of files between memory and disk; this gives it better memory utilization and performance, though it does not support caching files to local disks. Additionally, through the powerful Varnish management interface, you can quickly batch-purge cached files using regular expressions. Nginx uses the third-party module ncache for caching and provides performance comparable to Varnish, but it is normally used for reverse-proxy caching in this architecture (nginx is widely used for caching static files and can handle 20,000-plus files). In static architectures, nginx alone is sufficient for caching if the front end interfaces with a CDN or performs layer-4 load balancing.
2. Stability: As a proven caching product, Squid offers better stability than Varnish, which may crash at times according to user feedback. As reported by Yahoo, Traffic Server has produced no cases of data damage and provides outstanding reliability; it is expected to win over more users within China in the future.
The image server architecture described above eliminates the NFS dependency and the single point of failure while balancing space utilization across image servers and improving security. However, horizontal expansion of image servers for redundancy remains difficult. When storing files on ordinary hard disks, the actual processing capability of those disks is a concern: 7,200 RPM or 15,000 RPM? The difference matters. For the file system, run performance tests before choosing among XFS, ext3, ext4, and ReiserFS; according to official testing data, ReiserFS is preferred for smaller image files. Selecting an appropriate inode density is another concern, because Linux assigns each file an index node, or inode. Think of an inode as a pointer that always points to the storage location of its file. Each file system supports a limited number of inodes; if the file count grows too high, the file system eventually cannot create more files because all inodes are exhausted, even if every file is zero bytes. For this reason, you must balance space and speed when building a rational file directory index.
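A common way to build such a directory index is to shard files into a shallow hash-based tree, so that no single directory accumulates enough entries to exhaust inodes or slow lookups. The depth and hash choice below are illustrative:

```python
import hashlib
import os

def shard_path(image_name, depth=2):
    """Spread files across a hash-derived directory tree, e.g.
    'a3/9f/photo.jpg', so directories stay small and lookups fast."""
    digest = hashlib.md5(image_name.encode()).hexdigest()
    levels = [digest[i * 2:(i + 1) * 2] for i in range(depth)]
    return os.path.join(*levels, image_name)
```

Two hex characters per level gives 256 subdirectories per level, which balances directory size (speed) against the number of mostly-empty directories (space).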
The cloud storage phase
Way back in 2011, Li Yanhong mentioned the arrival of the internet image-reading era, during the Baidu Union Summit. Since then, image services have been playing a key role in internet apps and the ability to process images became a basic skill for enterprises and developers. Also, the importance of rapid image downloading and uploading has risen in terms of the following three major concerns: high traffic, high concurrency, and mass storage.
The Alibaba Cloud storage service (Open Storage Service, OSS for short) is a massive, secure, cost-effective and highly reliable cloud storage service provided by Alibaba Cloud. Users can upload and download data anytime and anywhere through simple RESTful interfaces. They also can manage their data using web pages. In addition, OSS provides Java, Python, and PHP SDKs to simplify user programming. With OSS, users can develop a diverse range of massive-data-based services such as multimedia sharing websites, online storage, and personal and corporate data backups. The following uses Alibaba Cloud OSS as the example to describe image cloud storage. The figure above shows the brief architecture of OSS.
For meaningful cloud storage, the key is providing cloud services, not merely storing images. Using cloud storage services provides the following benefits:
1. Users do not have to know the types, interfaces, and storage media of storage devices.
2. Users do not have to be concerned with data storage paths.
3. Users do not have to manage and maintain any storage devices.
4. Users do not have to consider data backups and disaster recovery solutions.
5. Users can easily access the cloud storage for various storage services.
Structure of architectural modules
1. KV Engine
In OSS, both object meta information and data files are stored in the KV Engine. As of version 6.15, the KV Engine is at version 0.8.6, and OSSFileClient is used for OSS as well.
2. Quota
This module records the mappings between buckets and users and tracks bucket resource utilization at one-minute granularity. Quota also provides an HTTP interface that the Boss system can query.
3. Security module
This module records the IDs and keys of users and provides a user certification function for accessing OSS.
1. Access Key ID & Access Key Secret (API Key)
When a user registers for OSS, the system assigns the user a pair consisting of an Access Key ID and an Access Key Secret (referred to as a key pair), used to identify the user and verify the signature of requests to OSS.
2. Bucket
OSS provides users with a virtual storage space where they can have one or more buckets.
OSS uses buckets as the namespaces of user files. Each bucket name is globally unique in the entire OSS and cannot be changed. Each object stored in OSS must be included in a bucket. One application, such as an image sharing website, corresponds to one or more buckets. A user can create up to 10 buckets, but there is no limit on the quantity and total size of objects in each bucket. Users do not need to consider the scalability of the data.
3. Object
Each user file is an object in OSS, and each file must be smaller than 5 TB. An object consists of "key", "data", and "user meta": "key" is the name of the object, "data" is its data, and "user meta" is its description. Objects are very simple to use; for example, in the Java SDK:
OSSClient ossClient = new OSSClient(accessKeyId, accessKeySecret);
PutObjectResult result = ossClient.putObject(bucketname, bucketKey, inStream, new ObjectMetadata());
Executing the code above uploads the image stream to the OSS server.
Accessing images is also very simple; the access URL is as follows: http://bucketname.oss.aliyuncs.com/bucketKey
Distributed file system
Distributed storage provides multiple benefits. It implements redundancy automatically, without manual backups, which saves a lot of effort: backups become painful when the file count is extremely high, and a single rsync scan can take hours to complete. It also lets you expand storage capacity dynamically and easily. TFS (http://code.taobao.org/p/tfs/src/) and FastDFS are used in some systems as well; TFS suits small-file storage scenarios such as Taobao's, while FastDFS may encounter performance and stability issues when concurrent writes exceed 300. OSS uses Pangu, the highly available and reliable distributed file system that Alibaba developed on the Apsara 5K platform.
The distributed file system Pangu is similar to Google's GFS. It uses a master-slave architecture in which the masters manage metadata and the slaves, called Chunk Servers, handle read and write requests. The masters use a Paxos-based multi-master design: when one master becomes faulty, another takes over shortly and recovers the failure within a minute. Files are stored as slices, and each slice keeps three copies on disks in different racks, with end-to-end data verification performed throughout.
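The three-copies-on-different-racks rule can be illustrated with a small deterministic placement sketch. This is not Pangu's actual algorithm, just a plausible stand-in showing how a chunk can be mapped repeatably to distinct racks:

```python
import hashlib

def choose_replica_racks(chunk_id, racks, copies=3):
    """Deterministically pick `copies` distinct racks for a chunk by
    ranking racks on a per-chunk hash (rendezvous-style placement).
    Every copy lands on a different rack, and the same chunk always
    maps to the same racks."""
    ranked = sorted(
        racks,
        key=lambda r: hashlib.md5((chunk_id + r).encode()).hexdigest())
    return ranked[:copies]
```

Because the ranking depends on both the chunk ID and the rack name, different chunks spread across different rack subsets while each individual chunk's placement stays stable.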
HAProxy load balancing
The HAProxy-based automatic hash architecture is a brand-new caching architecture. It uses nginx as the front end and as the proxy for the caching devices; the caching layer consists of cache groups, and nginx forwards each request to a caching device by hashing its URL.
This architecture is suitable for upgrading a Squid-only caching architecture, and nginx can be installed alongside Squid on the same devices. Nginx's own caching function can serve heavily accessed links, such as favicon.ico and the website logo, directly from nginx without another request to the proxy, ensuring high availability and performance of image servers. The load balancing function keeps all OSS requests balanced, while the backend HTTP servers support automatic failover to keep the OSS service uninterrupted.
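The URL-hash routing that nginx performs in front of the cache groups can be sketched as follows; the node names and the CRC32 hash are illustrative, not the exact function nginx uses:

```python
import zlib

CACHE_NODES = ["cache1.example.com", "cache2.example.com",
               "cache3.example.com"]  # hypothetical cache group

def pick_cache_node(url, nodes=CACHE_NODES):
    """Route a request to a cache device by hashing its URL, so the
    same image URL always lands on the same cache."""
    return nodes[zlib.crc32(url.encode()) % len(nodes)]
```

Hashing on the URL (rather than, say, the client IP) is what prevents the same image from being cached redundantly on several devices, which was the defect of the plain LVS-plus-Squid setup described earlier.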
The Alibaba Cloud CDN service has deployed more than 100 nodes around the country and provides users with outstanding, accelerated access performance. When your website experiences very high volumes of traffic, the CDN service lets you address the challenge easily without expanding network bandwidth. As with OSS, you need to sign up for the CDN service at aliyun.com before you can use it. After signing up, create your distribution channels in the management center on the website; each distribution channel consists of two parts, a distribution ID and a source site address.
By using Alibaba Cloud OSS and CDN, you can easily implement content acceleration for each bucket, as each bucket corresponds to a separate second-level domain. Delivering each file through the CDN also resolves storage and network issues for the service easily and economically, because images and videos normally consume most of the storage and network bandwidth of websites and apps.
Within this industry, personal-user-oriented cloud storage services such as Dropbox and Box.net from overseas are very popular, while Qiniu and Yupoo from within China have also earned good reputations.
Separated uploading and downloading
For image servers, the download workload is far heavier than the upload workload, and the two have quite different processing logic. Upload servers are responsible for renaming images and recording their entry information, while download servers handle various dynamic processing tasks, such as watermarking and resizing images. From the perspective of high availability, occasional image download failures are acceptable, but upload failures are not, because they mean physical data loss. Separating upload tasks from download tasks ensures that download pressure cannot affect image uploading. The load balancing policies for the two entries also differ: an upload goes through certain logic processing steps, including recording the relationship between users and images to the Quota Server, whereas a download that bypasses the front-end cache and reaches the backend operational logic must obtain the image path information from OSS. In the near future, Alibaba Cloud will provide a CDN-based nearby-upload function, which automatically selects the CDN node nearest to the user for optimal upload and download efficiency. Compared with traditional IDCs, access speed improves significantly.
Image anti-leech processing
If services have no anti-leeching protection, an unexpectedly high volume of accesses can overload bandwidth and servers. The common solution is to add a Referer-based ACL check to reverse-proxy software such as nginx or Squid. OSS also provides Referer-based anti-leeching, and it additionally offers an advanced URL-signature anti-leeching function, whose implementation logic is as follows:
First, verify that your bucket's permission is "private", meaning any request to the bucket is valid only when authenticated with a signature. Then dynamically generate a signed URL based on the operation type, the bucket to be accessed, the object to be accessed, and the timeout value. Authorized users can use this signed URL to perform the permitted operations until the URL expires.
The Python code for a signature is as follows:
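Below is a minimal sketch of such a signature, assuming the classic URL-signing scheme OSS uses (an HMAC-SHA1 over the HTTP verb, an absolute expiry timestamp, and the canonical resource); the function name and key values are placeholders:

```python
import base64
import hashlib
import hmac
import time
import urllib.parse

def make_signed_url(access_key_id, access_key_secret,
                    method, bucket, obj, timeout=60):
    """Build a signed OSS URL valid for `timeout` seconds."""
    # Expires is an absolute Unix timestamp after which the URL is invalid.
    expires = str(int(time.time()) + timeout)
    canonical_resource = "/%s/%s" % (bucket, obj)
    # Content-MD5 and Content-Type are left empty here for simplicity.
    string_to_sign = "\n".join([method, "", "", expires, canonical_resource])
    digest = hmac.new(access_key_secret.encode(),
                      string_to_sign.encode(), hashlib.sha1).digest()
    signature = urllib.parse.quote(base64.b64encode(digest).decode(), safe="")
    return ("http://%s.oss.aliyuncs.com/%s"
            "?OSSAccessKeyId=%s&Expires=%s&Signature=%s"
            % (bucket, obj, access_key_id, expires, signature))
```

Anyone holding the URL can perform only the signed operation, and only until the Expires timestamp passes, because the server recomputes the same HMAC and rejects mismatches.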
In the code, "method" can be PUT, GET, HEAD, or DELETE, and the last parameter, "timeout", is a timeout value in seconds. Through this Python method, you can obtain a signed URL.
Using this dynamic URL signature method, you can effectively protect the data in OSS against leeching.
Image editing API
For online image editing, internet technicians should be familiar with GraphicsMagick (http://www.graphicsmagick.org/). GraphicsMagick forked from ImageMagick 5.5.2 but offers better stability and performance; today it is lighter, easier to install, and more efficient. Its manuals are complete, and its commands are almost identical to ImageMagick's.
GraphicsMagick provides a rich set of APIs for image cropping, zooming, composition, watermarking, conversion, and padding. It also offers an extensive selection of SDKs, including Java (im4java), C, C++, Perl, PHP, Tcl, and Ruby, with more than 88 image formats supported, such as DPX, GIF, JPEG, JPEG-2000, PNG, PDF, PNM, and TIFF. GraphicsMagick is available on most platforms, including Linux, Mac, and Windows. However, independently developing these image processing services imposes demanding I/O requirements on servers, and these open-source image processing libraries are not entirely stable at this stage. For example, a Tomcat process once crashed while I was using GraphicsMagick, and I had to restart the Tomcat service manually.
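As an illustration of GraphicsMagick's command-line style, the helper below builds two typical command lines, thumbnail resizing with `gm convert` and watermark compositing with `gm composite`; the paths and the gravity choice are examples, not fixed conventions:

```python
def gm_commands(src, thumb, watermarked, logo, size="128x128"):
    """Build GraphicsMagick command lines for two common tasks:
    resizing an image to a thumbnail, and compositing a watermark
    image onto it (anchored bottom-right via -gravity southeast)."""
    resize = ["gm", "convert", "-resize", size, src, thumb]
    watermark = ["gm", "composite", "-gravity", "southeast",
                 logo, src, watermarked]
    return resize, watermark
```

Either list can be handed to `subprocess.run` on a machine with GraphicsMagick installed; building the argument list separately keeps filenames safe from shell quoting issues.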
Currently, Alibaba Cloud has released a number of image processing APIs, which provide solutions to most image processing tasks such as generating thumbnails, watermarks (including text watermarks), patterns, and channels. By using the processing scheme shown in the figure above, developers can easily develop their own products. In the future, more developers can hopefully produce a wider set of outstanding products based on OSS.