Provisioning IoT Clusters with ApsaraDB for PostgreSQL

This article focuses on how ApsaraDB for PostgreSQL meets the requirements of IoT applications while accommodating varying data structures and datasets.

By Afzaal Ahmad Zeeshan.

IoT and its Data Requirements

The journey from the internet of people to the internet of things (IoT) has been full of challenges. In the end, IoT is still about people: its goal is to understand people and their behavior rather than the things themselves, which are expendable. From a high-level perspective, IoT is mainly about connected hardware such as sensors, mobile devices, and other wireless technologies, and about how we extract information from them and transform it into meaningful insights for a better consumer experience. The real challenges, however, arise in IoT data analytics and management. Ultimately, it is the data that provides the meaningful information that raises business value.

Though every IoT infrastructure is different, the base architecture is roughly the same. The foundation consists of devices and objects connected to the internet; these devices have embedded sensors that collect information from the environment and forward it to the IoT gateways. This is the first phase of IoT infrastructure development and deployment, and it needs careful design to address several requirements of the IoT domain:

  1. High throughput for the servers.
  2. Queue implementations for ordered message processing.
  3. Data storage for analysis and marketing purposes.
  4. Signal persistence regardless of cluster size.
  5. Low latency to support quick actions.
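
To make the second requirement concrete, here is a minimal sketch of ordered message processing. It assumes each device stamps its messages with a sequence number starting at 0; the class name `OrderedBuffer` and the message fields are hypothetical and are not part of any Alibaba Cloud API. Messages that arrive early are held back until the gap before them is filled.

```python
from collections import defaultdict


class OrderedBuffer:
    """Release each device's messages in sequence order (illustrative only)."""

    def __init__(self):
        # Next sequence number we expect from each device (assumed to start at 0).
        self._next_seq = defaultdict(int)
        # Messages that arrived early, keyed by device and then by sequence number.
        self._pending = defaultdict(dict)

    def push(self, device_id, seq, payload):
        """Accept one message and return the payloads that can now be processed in order."""
        expected = self._next_seq[device_id]
        pending = self._pending[device_id]
        if seq < expected or seq in pending:
            return []  # already processed or already buffered: treat as a duplicate
        pending[seq] = payload
        ready = []
        # Drain every message that is now contiguous with what was already released.
        while expected in pending:
            ready.append(pending.pop(expected))
            expected += 1
        self._next_seq[device_id] = expected
        return ready


buf = OrderedBuffer()
first = buf.push("sensor-a", 1, "reading-1")   # arrives early, so it is held back
second = buf.push("sensor-a", 0, "reading-0")  # fills the gap and releases both
```

A production gateway would also need a timeout or a size limit for gaps that never get filled; the sketch leaves that out.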

In the second phase, the unprocessed data is converted into a digital stream and filtered before analysis. This normally happens on the IoT hubs and central systems, which run software (machine-learning-based or conventional) that processes each incoming message before storing it in the data warehouse.

In the next step, the partially processed and cleaned data is examined for visualization, and machine learning techniques are applied again to process it further and analyze the given parameters. Sometimes the data is also filtered to remove duplicates, or statistically reduced (for example, by sampling) so that it fits in the data center. Finally, the data is moved to data centers, where it is managed and analyzed to produce the final insights.
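
The duplicate removal and sampling mentioned above can be sketched in a few lines. This is only an illustration: the field names `device_id`, `timestamp`, and `value` are assumptions, and systematic sampling (keeping every n-th reading) stands in for whatever statistical reduction a real pipeline would use.

```python
def deduplicate(readings):
    """Keep the first reading for each (device_id, timestamp) pair."""
    seen = set()
    unique = []
    for reading in readings:
        key = (reading["device_id"], reading["timestamp"])
        if key not in seen:
            seen.add(key)
            unique.append(reading)
    return unique


def downsample(readings, every_nth):
    """Systematic sampling: keep every n-th reading, starting with the first."""
    if every_nth < 1:
        raise ValueError("every_nth must be at least 1")
    return readings[::every_nth]


raw = [
    {"device_id": "s1", "timestamp": 100, "value": 20.1},
    {"device_id": "s1", "timestamp": 100, "value": 20.1},  # retransmitted duplicate
    {"device_id": "s1", "timestamp": 101, "value": 20.3},
    {"device_id": "s1", "timestamp": 102, "value": 20.2},
    {"device_id": "s1", "timestamp": 103, "value": 20.4},
]
cleaned = deduplicate(raw)         # 4 readings remain
sampled = downsample(cleaned, 2)   # timestamps 100 and 102 remain
```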

Fundamentally, data is the backbone of IoT, as the layers of the overall IoT infrastructure described above make clear. The requirements of data management and provisioning are growing at an exponential rate. IoT platforms have specialized data requirements because they deal with vast real-time datasets produced by embedded sensors and actuators.

Cloud Platforms – Accelerating IoT Applications and Investments

Cloud-based data services play a vital role in providing a large and powerful platform to store, manage, and analyze large amounts of unstructured data for IoT applications. These services are designed to handle massive volumes of unprocessed, non-relational data and to perform data analytics, control, and monitoring tasks in an efficient and cost-effective way that would not be possible otherwise. Moreover, the complexity of the data keeps varying: the conclusions extracted from the unprocessed input that these IoT devices receive mainly take the form of trends, patterns, and statistical visuals that help businesses act proactively for their users. Think of this from the perspective of an automobile manufacturer that has to keep track of its systems installed in thousands (if not millions) of vehicles on the road.

The technology stack for IoT-specific applications is vast, from automating a coffee machine according to the user's preference all the way to developing smart devices for home automation and the automobile industry. These cloud-driven data services are able to meet this diverse and ever-growing range of data requirements.

This article therefore focuses on how ApsaraDB for PostgreSQL, designed and supported by Alibaba Cloud, meets the requirements of IoT applications while accommodating varying data structures and datasets. We will also discuss how IoT clusters are formed and designed to handle the data received from different connected devices.

Here is the default purchase page that you see when you buy the product:


As this screenshot demonstrates, the product purchase page asks only for basic information; most of the operational headache is taken care of by the infrastructure. Take "Edition" as an example: the "High-availability" option configures the underlying topology for the resources, including master/slave nodes and backups. The same applies to other settings, such as the network in which to deploy the database.

Provisioning the IoT clusters with Alibaba Cloud ApsaraDB

So far, we have discussed the significance of cloud service providers in meeting the database challenges of IoT applications. In this section, we highlight why Alibaba Cloud PostgreSQL is the most suitable and reliable database service for the highly dynamic requirements of IoT devices and applications, compared not only with other cloud-based database systems but also with the other relational databases offered by Alibaba Cloud ApsaraDB. We then look at why Alibaba Cloud is the most viable cloud service for building on these Postgres capabilities with the streamlined features of a cloud platform. If you want an overview of the basic functionality of PostgreSQL on Alibaba Cloud ApsaraDB before exploring its capabilities for IoT applications, refer to the documentation, ApsaraDB RDS for PostgreSQL, or read my previous post, which explores the deployment options available on Alibaba Cloud for PostgreSQL.

Handles Huge Volumes of Data

IoT applications make use of substantial data clusters. Industries built around IoT applications need to manage and analyze these massive datasets to extract relevant patterns and statistics. The Alibaba Cloud PostgreSQL database service can manage and analyze datasets of this size.


Alibaba Cloud supports up to 6 terabytes of database storage. You can expand this further with Enhanced SSD, which supports up to 32 terabytes of storage.
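
To get a feel for what these limits mean in practice, the back-of-the-envelope calculation below estimates how long a given storage limit lasts under a steady ingest rate. The workload figures (10,000 devices, one 200-byte reading per second each) are hypothetical, a terabyte is taken as 10^12 bytes, and index, WAL, and row overhead are ignored, so treat the result as a rough upper bound.

```python
BYTES_PER_TB = 10**12  # decimal terabyte; an assumption, since the limit's exact unit is not stated


def days_until_full(capacity_tb, devices, bytes_per_reading, readings_per_device_per_day):
    """Estimate how many days of raw readings fit into the given capacity."""
    daily_bytes = devices * bytes_per_reading * readings_per_device_per_day
    if daily_bytes <= 0:
        raise ValueError("the workload must produce a positive number of bytes per day")
    return capacity_tb * BYTES_PER_TB / daily_bytes


# Hypothetical workload: 10,000 devices, one 200-byte reading per second each.
readings_per_day = 24 * 60 * 60
standard_days = days_until_full(6, 10_000, 200, readings_per_day)
enhanced_days = days_until_full(32, 10_000, 200, readings_per_day)
print(f"6 TB lasts about {standard_days:.1f} days; 32 TB lasts about {enhanced_days:.1f} days")
```

Under these assumptions the workload writes about 172.8 GB per day, so 6 TB fills in roughly 35 days and 32 TB in roughly 185 days, which is one reason the sampling and archiving steps described earlier matter.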


This figure covers only the storage for the production environment, not the backups. For backups, you can use OSS storage buckets, which can hold many terabytes of backup data for longer periods of time.

Supports Varying Data Structures

Furthermore, IoT clusters do not necessarily receive the same kind of data from every device; connected devices can send relational as well as non-relational data. PostgreSQL therefore becomes the first choice for accommodating both relational and non-relational requirements. Alibaba Cloud Postgres instances can deal with varying data structures (structured, unstructured, and semi-structured) as well as a dozen different data formats and extensions. Even if your data is structured and follows the same transactional format across the globe, you can still partition the data and create multiple databases for different regions.
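
One common way to handle such mixed data is to keep the fields every device shares in fixed columns and store everything else as a JSON document next to them. The sketch below illustrates that split and the per-region grouping mentioned above. It is an assumption-laden illustration rather than a description of how ApsaraDB does it: the field names in `COMMON_FIELDS` and both helper functions are hypothetical.

```python
import json

# Hypothetical fields that every device is assumed to send.
COMMON_FIELDS = ("device_id", "region", "timestamp")


def to_row(message):
    """Split a message into fixed columns plus one JSON string for the remaining fields."""
    missing = [field for field in COMMON_FIELDS if field not in message]
    if missing:
        raise ValueError(f"message is missing required fields: {missing}")
    fixed = tuple(message[field] for field in COMMON_FIELDS)
    extras = {key: value for key, value in message.items() if key not in COMMON_FIELDS}
    # sort_keys makes the serialized document deterministic.
    return fixed + (json.dumps(extras, sort_keys=True),)


def group_by_region(messages):
    """Group rows by region, for example to write each group to a regional database."""
    groups = {}
    for message in messages:
        groups.setdefault(message["region"], []).append(to_row(message))
    return groups


messages = [
    {"device_id": "t1", "region": "eu", "timestamp": 1, "temp": 21.5},
    {"device_id": "g7", "region": "us", "timestamp": 2, "lat": 40.7, "lon": -74.0},
    {"device_id": "t2", "region": "eu", "timestamp": 3, "temp": 19.0, "humidity": 40},
]
rows_by_region = group_by_region(messages)
```

In PostgreSQL, the last element of each row would typically go into a `json` or `jsonb` column, so that thermometers and GPS trackers can share one table without a column for every possible attribute.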


Cost-Effective Plans and System Failover Recovery

IoT applications that require event streaming work with massive in-motion datasets. These real-time datasets are used in systems designed to deal with emergency situations, such as IoT applications built for financial transactions, flight tracking, and so on. These applications require stable systems and quick failover recovery mechanisms, which are top-tier features of PostgreSQL by Alibaba Cloud; this is what the "High-availability" setting provides. There is also an alerting system that you can use to view the current state of your deployment as well as your resource consumption.


Commercial IoT applications make use of extensive datasets because they require descriptive and predictive analysis. For such applications, data clusters are larger than 10-20 GB. For these cases, Alibaba Cloud PostgreSQL provides cost-effective plans that follow a pay-as-you-go model, so that cost tracks the needs of your applications.

Your backups are also automated; they can be taken each day for you.


If security and privacy are a concern, you can use the BYOK (bring your own key) features of Alibaba Cloud to add encryption for your data storage.


You will find this option at the end of the page, and you can configure the encryption yourself.

Geographical and Spatial Analytics with PostGIS

Spatial and GIS data analytics methods are used to analyze geographical patterns and identify the spatial relationships between different physical objects. Location-based IoT applications such as smart car parking applications, trackers, and delivery systems depend heavily on these methods.

PostgreSQL improves developer productivity when designing solutions for IoT applications. Two of the essential data types for IoT applications are JSON and GIS (Geographic Information System) data. Database administrators use these two formats heavily to meet different data requirements. PostgreSQL lets them query the different formats directly and perform operations on JSON and GIS data. As an anecdotal benchmark that I read on Stack Overflow a while ago, the GIS support in PostGIS is native, thanks to community-led development, and has a significant advantage over other relational databases.

Being able to run SQL queries directly avoids the overhead of importing the data and then querying it. The JSON format is used extensively in relational queries, whereas the GIS format is used for precise positioning, route monitoring, and raster data analysis (where raster data represents the world as predefined cells and grid-shaped tessellations).

PostGIS is a spatial database extension for PostgreSQL. It provides extensive support for geographic data types and objects, allowing queries based on locations and places, which are the most common type of query in IoT applications. Alibaba Cloud PostgreSQL extends the functionality provided by Postgres even further; the service offers 2D and 3D modeling of geographical data in real time. It supports functions to identify different patterns and shifts in the data. Moreover, the service supports precise positioning on the earth using the OpenGIS standards. Having these extensions provisioned for you out of the box is the cherry on top!

ApsaraDB PostgreSQL supports a wide range of spatial data types, such as geometry collections, lines and line strings, points and multipoints, and polygons and multipolygons. There are quick functions for querying these types, such as area, length, distance based on longitude and latitude, contains, overlaps, within, and touches. In addition, the service supports the GEOMETRY_COLUMNS and SPATIAL_REF_SYS metadata tables. We can expand on these topics in a separate post of their own.
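
To give a sense of what the distance and contains functions compute, here is a plain-Python sketch of two of them. It is not PostGIS code and makes simplifying assumptions: the distance uses the haversine formula on a spherical earth with a mean radius of 6371 km, and the containment test is a planar ray-casting check that treats longitude and latitude as x and y, which is only reasonable for small areas away from the poles and the antimeridian. In the database you would use PostGIS functions such as `ST_Distance` and `ST_Contains` instead.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices in order."""
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical parking zone and vehicle positions.
zone = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
parked_inside = point_in_polygon(5.0, 5.0, zone)
parked_outside = point_in_polygon(15.0, 5.0, zone)
one_degree_at_equator = haversine_km(0.0, 0.0, 0.0, 1.0)  # roughly 111 km
```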

The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.
