Community Blog Using CrateDB with PHP PDO for a Real-Time IoT Project on Alibaba Cloud

Using CrateDB with PHP PDO for a Real-Time IoT Project on Alibaba Cloud

In this article, we will look at using CrateDB with PHP PDO to create a real-time IoT project on Alibaba Cloud.

By Alex Mungai Muchiri, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

The Internet of Things is on the rise, ushering in a new possibility in device interconnectivity. Prior to the advent of the concept, using data to control vehicles, machines, home appliances and even entire cities seemed like a dream. In the present, however, IoT applications store and analyze data collected by millions of embedded sensors to enable autonomous operations of physical systems. Real-time response in IoT is critical for smooth systems operations and this requires a database to do the following:

  1. Process inputs from millions of data points in real-time
  2. Provide real-time responses from the data
  3. Accommodate varying data structures
  4. Scalable enough to handle complex operations such as time-based, geographical, or machine learning analyses.
  5. Deploy confidently to the cloud

The sheer scale of data volumes and transaction intensity, as well as the complexity of analysis, integration, and machine learning have placed IoT developers up against the wall. Thankfully, CrateDB was designed to handle IoT complexity with ease. In this tutorial, we are going to look at how to deploy CrateDB for your application on an Alibaba Cloud Elastic Compute Service (ECS) server running on Ubuntu 16.04.

Overview of CrateDB

Before we begin with the tutorial, here are some important facts about CrateDB and its ability to act as a backbone to support demanding IoT workloads:

Adaptable Architecture

CrateDB is based on a shared-nothing architecture. In the setup, a cluster of synchronized nodes coordinate to execute read and write operations in a distributed manner. Notably, since operations are distributing across the cluster, database capacity depends on the number of cluster nodes. The database also incorporates automated sharding to improve fault tolerance and workload balancing. The final attribute about its architecture is its native containerization using Docker or Kubernetes. The figure below demonstrates CrateDB architectural configuration:

(Source: crate.io)

Install CrateDB on Linux

There are CrateDB versions for Debian GNU/Linux, Ubuntu, and RedHat Linux. However, you don't need to have the specific package if you choose One-Step Setup. The assumption is that your Alibaba ECS runs on either of the three, in which case, run the following command to install CrateDB:

sh$ bash -c "$(curl -L install.crate.io)"

The command also takes care of Java 8 installation and other housecleaning tasks.

Import Test Data

In this step, we are going to import some test data to experiment with our CrateDB. You should be able to access the web administration user interface (UI) on your server by now. We shall be using it to import test data. Use the http://localhost:4200/ URL to access the interface on your server browser. Replace "localhost" with "CrateDB" if the first use case doesn't work.


We are now going to import some tweets using the Help tab on the admin UI, accessible on your left.


Follow the instructions after selecting Import Tweets for Testing. Try to settle for about 1000 tweets after which you should select the Tables option on the navigation menu. It should lead to the interface below:


The next phase is learning how to query CrateDB using the query console on the admin interface. Access the console interface from the Console tab on the navigation menu


We are going to use this screen to perform query operations of previously imported tweets. Use the Tables tab if you want to see the schema layout. In this example we are going to execute a simple filter query from the tweets collected in our database using the query below:

  FROM tweets
 WHERE account_user['followers_count'] > 100
 LIMIT 100;

The output is tweets from users with more than 100 followers. Once typed into the query console, the operation queries the attributes of the object account_user and selects those whose followers_count is more than 100.

The screenshot below shows the output of the query:


CrateDB supports a wide range of query capabilities such as full-text search, distributed aggregation, as well as more complex scalar functions and data analysis. There's a whole range of capabilities included in the CrateDB query reference. Below are three other ways to query CrateDB

The CrateDB Shell

This command-line comes with CrateDB. Get it started using the following command to connect CrateDB running on your host.

sh$ ./bin/crash

Executing queries on CrateDB is as simple as typing the queries and hitting enter as shown in the example below:

  SELECT account_user['followers_count']
    FROM tweets
ORDER BY account_user['followers_count'] DESC
   LIMIT 10;

CrateDB Shell has an auto-completion feature as you would notice when using this feature. The response would be something like this:

| account_user['followers_count'] |
|                         1416583 |
|                         1076527 |
|                         1025614 |
|                          886577 |
|                          854054 |
|                          818439 |
|                          795778 |
|                          761014 |
|                          740071 |
|                          673962 |
SELECT 10 rows in set (0.003 sec)


The CrateDB HTTP API can also run queries if you are familiar with the method.



We shall use a query to count the number of tweets in our database using HTTPie on localhost:4200. Execute the function below:

sh$ http localhost:4200/_sql stmt="SELECT COUNT(*) FROM tweets"

The JSON response should be as follows:

HTTP/1.1 200 OK
content-length: 71
content-type: application/json; charset=UTF-8

    "cols": [
    "duration": 11.847271,
    "rowcount": 1,
    "rows": [

This response indicates that we have 3879 counts of one row and one column.

Using third-Party Clients

While CrateDB only ships with admin UI and Crash, there are many other tools that can be used on CrateDB as listed on this blog: clients category.

CrateDB PDO driver

Alright, let's get started with the PDO driver. In this phase, we assume that you already have Alibaba Cloud ECS up and running on Ubuntu 16. Ensure that you have a Composer to get started on the project.

Set Up the PDO driver in the Composer

Include the PDO driver package in the composer.json file as shown below:

  "require": {

Install PDO

Install the package like so:

sh$ composer install

In your PHP application, use the line below to require the autoload.php file in the Composer.

require DIR . '/vendor/autoload.php';

Now we can connect to CrateDB after this task

Connect to CrateDB

Noteworthy, PHP PDO uses Data Source Name (DSN) strings as shown by this basic version


Just remember to replace <HOST_ADDR> with the host address of your Alibaba Cloud ECS where it is installed and <PORT> with a valid HTTP endpoint port number. Below are some sample DSN strings


If you have a cluster of nodes, just remember to use a number to specify the respective node as shown:


There's no limit to the number of nodes you can have on your cluster, but ensure all nodes are separated using the comma (,) character.

Your client will attempt to connect to each of your cluster's nodes until one of them connects, with each node being loved to the last position after an unsuccessful connection attempt. CrateDB is able to achieve some sort of round-robin load balancing using this behavior.

Now, if you want to connect to a specific schema, specify it as below:


The <SCHEMA> should be the name of a schema in your database, if you don't, the default 'doc' will be used. Based on this requirement, we can have the following DNS strings


Get a Connection

Now we are ready to get connected. The setup below will get your PDO connection up and running

use Crate\PDO\PDO as PDO;

$dsn = '<DATA_SOURCE_NAME>';
$user = 'crate';
$password = null;
$options = null;
$connection = new PDO($dsn, $user, $password, $options);

CrateDB 2.1.x supports authentication, but earlier models do not require that you supply a username and password. Furthermore, CrateDB superuser does not require a password, which means that you don't need the argument above. However, that should only apply if you have not configured a custom database user.

Advanced Settings

Use the commands below to configure the settings once the connection is established

$connection->setAttribute(, );

The <ATTRIBUTE> section requires a PDO attribute class constant, as indicated in this setAttribute guide.

The <VALUE> is a value for the corresponding class constant, something like this:


The attribute above represents timeout in seconds, whereby the HTTP connection drops after that time elapses. The default is 0.

Driver Constants

There are several attributes associated with the PDO driver such as,


It is the default PDO connection schema. Nonetheless, it is possible to query any schema in the database.

Under the SSL mode, you can specify the following:

PDO::CRATE_ATTR_SSL_MODE (int) (named attribute)

Disabling the SSL mode:


Disable requirement for verification in SSL mode:


Perform verification mode in SSL mode:


Link up to client key file


SSL client password


SSL certificate path


File with SSL certificate password


SSL Certificate Authority (CA) file


There is a whole range of operations to use for setting up SSL certificates in this reference.

PDO Fetch Modes

There are a number of supported fetch modes available for CrateDB PDO:


And that's how you set up CrateDB using PHP PDO on Alibaba cloud running on Ubuntu or any Linux variant. You can find further instructions here in case you are stuck (http://www.php.net/manual/en/intro.pdo.php).

Why Use CrateDB

There are a few other things you may want to know about why to use CrateDB.

CrateDB is based on SQL

Almost everyone knows how to work with SQL, and that makes CrateDB a powerful and easy to integrate within your IoT application. The database is also very compatible with most tools used in other SQL implementations such as PostgreSQL as well as REST interfaces.

CrateDB Uses NoSQL-style Storage and Indexing

The combination of SQL and NoSQL makes CrateDB a familiar yet highly scalable database. The developers achieved this by using distributed SQL and opensource NoSQL technologies. Specifically, the following technologies are used in CrateDB:

  1. Lucene: stores and indexes data
  2. Elasticsearch: logging transaction and masterless clustering
  3. Netty: networking between nodes asynchronously

Dynamic Schemas

Flexible schemas are a big advantage of working with CrateDB. Unlike rigid traditional schemas, CrateDB stores relationships in JSON documents, which can easily change the structure. For instance, if we were to use 200 sensors in our IoT application, CrateDB stores all that data in one table, which is 40 times faster to work with compared to a traditional SQL model.


An IoT system handles millions of data inputs from machines. CrateDB can write tens of thousands of inputs to each node every second and still handle data queries concurrently. Inserts are consistent and non-blocking, and amazing still, the issue of data durability and consistency has been addressed as well. The write-ahead logging handles durability while optimistic concurrency control ensures consistency of data. Furthermore, a refresh frequency setting at table-level enforces data consistency periodically.

Real-time querying

CrateDB uses in-memory columnar indexing to achieve real-time responses. On each node, there is a memory resident column alleviates the data limitations associated with real-time databases. It is thus easy to use cached data to determine if there are any records meeting the query criteria on a specific node. The distribution of querying ensures fast performance as well.


CrateDB is written in Java to take care of distributed processing needs in a typical IoT project. You can run it on data centers, or at the edge before channeling data to a central cloud platform.


In summary, CrateDB is one of the most potent IoT databases available right now. In this tutorial, we have looked into how to use it with PHP PDO and evaluated some of its characteristics. However, at the moment, CrateDB does not support direct device-to-database integration. Nonetheless, the database should handle nearly all your IoT project requirements with ease.

Do you have an Alibaba Cloud account yet? Sign up for an account and try over 40 products for free worth up to $1200. Get Started with Alibaba Cloud to learn more.

0 0 0
Share on


17 posts | 2 followers

You may also like



17 posts | 2 followers

Related Products