Tablestore: Overview

Last Updated: Jan 09, 2024

The Wide Column model is similar to the data model of Bigtable or HBase and is applicable to various scenarios such as metadata and big data. The Wide Column model stores data in data tables. A single data table supports petabyte-level data storage and tens of millions of queries per second (QPS). The data tables are schema-free and support wide columns, multi-version data, and time-to-live (TTL) management. The data tables also support auto-increment primary key columns, local transactions, atomic counters, filters, and conditional updates.

Introduction

The Wide Column model of Tablestore is similar to the data model of Bigtable or HBase. The Wide Column model stores data in data tables in a three-dimensional structure, which is defined by rows, columns, and time. Each row of a data table can have different columns. The attribute columns of a data table can be dynamically added or removed. When you create a data table, you do not need to define a strict schema for the attribute columns of the data table.
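The three-dimensional structure can be pictured with a small in-memory sketch. This is only an illustration of the row, column, and time dimensions, not the Tablestore API: the names table, put, and get_latest are hypothetical, and version numbers are assumed to be millisecond timestamps.

```python
from collections import defaultdict

# Illustrative in-memory model only; this is not the Tablestore API.
# Layout: table[primary_key][column_name][version_ms] = value
table = defaultdict(lambda: defaultdict(dict))


def put(primary_key, column, value, version_ms):
    """Store one value at the (row, column, time) coordinate."""
    table[primary_key][column][version_ms] = value


def get_latest(primary_key, column):
    """Return the value with the largest version number, or None if absent."""
    versions = table.get(primary_key, {}).get(column)
    if not versions:
        return None
    return versions[max(versions)]


# Two versions of the same column in one row (the time dimension).
put(("user1",), "name", "Alice", 1000)
put(("user1",), "name", "Alicia", 2000)
# A second row with a different attribute column (no fixed schema).
put(("user2",), "email", "bob@example.com", 1500)

latest_name = get_latest(("user1",), "name")
```

In this sketch, the two rows hold different attribute columns, and the "name" column of the first row keeps both versions while reads return the newest one.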

Components

The preceding figure shows the components of the Wide Column model. The following table describes the components.

Component

Description

Primary key

Primary keys uniquely identify each row in data tables. A primary key consists of one to four primary key columns.

Partition key

The first primary key column is called the partition key. Tablestore partitions data in a data table based on the partition key values. Rows that share the same partition key value are allocated to the same partition to ensure balanced distribution of data access requests.

Attribute column

All columns in a row except the primary key columns are called attribute columns. Each attribute column can contain values of different versions. Tablestore does not impose limits on the number of attribute columns that can be contained in each row.

Version

Each value in an attribute column has a unique version number. The version number is a timestamp based on which you can manage the TTL of attribute column values. For more information, see the Version number section of the "Data versions and TTL" topic.

Data type

Tablestore supports the following data types: STRING, BINARY, DOUBLE, INTEGER, and BOOLEAN. For more information, see the Data types section of the "Naming conventions and data types" topic.

TTL

You can specify a TTL value for each data table. For example, if you set the TTL value to one month for a data table, Tablestore deletes data that is older than one month from the data table. For more information, see the TTL section of the "Data versions and TTL" topic.

Max versions

You can set the maximum number of versions that are retained for the value in each attribute column of a data table. When the actual number of versions in an attribute column exceeds the max versions value, Tablestore asynchronously deletes earlier versions. For more information, see the Max versions section of the "Data versions and TTL" topic.
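The combined effect of max versions and TTL on one attribute column can be sketched as follows. This is a minimal illustration under stated assumptions, not Tablestore's implementation: prune_versions is a hypothetical helper, version numbers are assumed to be millisecond timestamps, and a TTL of -1 is used in this sketch to mean that data never expires.

```python
def prune_versions(versions, max_versions, ttl_seconds, now_ms):
    """Return the versions that survive the max-versions and TTL rules.

    versions: dict that maps a version timestamp (ms) to a value.
    ttl_seconds: -1 means "never expires" (convention assumed for this sketch).
    """
    # Keep only the newest `max_versions` version numbers.
    newest = sorted(versions, reverse=True)[:max_versions]
    survivors = {}
    for ts in newest:
        expired = ttl_seconds != -1 and now_ms - ts > ttl_seconds * 1000
        if not expired:
            survivors[ts] = versions[ts]
    return survivors


history = {1000: "a", 2000: "b", 3000: "c", 4000: "d"}

# Max versions of 3 drops version 1000; a 2-second TTL then drops version 2000.
kept = prune_versions(history, max_versions=3, ttl_seconds=2, now_ms=4500)

# Without a TTL, only the max-versions rule applies.
kept_no_ttl = prune_versions(history, max_versions=3, ttl_seconds=-1, now_ms=4500)
```

The sketch applies both rules synchronously for clarity, whereas the description above states that Tablestore deletes excess versions asynchronously.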

Core components

Data tables, rows, primary keys, and attributes are the core components of the Wide Column model of Tablestore. A data table consists of rows. Each row consists of a primary key and one or more attributes. The first primary key column is called the partition key.

The following table describes the primary key, attribute, and partition key.

Note

For more information about data types supported by primary key columns and attribute columns, see the Data types section of the "Naming conventions and data types" topic.

Component

Description

Primary key

Primary keys uniquely identify each row in data tables. A primary key consists of one to four primary key columns. When you create a data table, you must specify primary key columns, including the name, data type, and sequence of the primary key columns.

Tablestore indexes data in a data table based on the primary key values of the data table. By default, rows in a data table are sorted in ascending order based on the primary key values.

Partition key

The first primary key column is called the partition key. To ensure load balancing, Tablestore automatically distributes a row of data to the corresponding partition and machine based on the range to which the partition key value of the row belongs. Rows that share the same partition key value belong to the same partition. A partition may store rows that have different partition key values. Tablestore splits and merges partitions based on specified rules.

Note

Partition key values are the basic unit to partition data. Data that shares the same partition key value cannot be further split. To prevent partitions from being too large to split, we recommend that you keep the total size of all rows that share the same partition key value at or below 10 GB. For more information about how to select a partition key, see Table operations.

Attribute

A row can have multiple attribute columns. The number of attribute columns in each row is unlimited, and the attribute columns in each row can be different. The value of an attribute column in a row can be null. The values in the same attribute column of multiple rows can be of different data types.

An attribute column stores multiple versions of its value. You can specify the number of versions that can be retained for an attribute column value for query and use. You can also specify a TTL value for attribute column values. For more information, see Data versions and TTL.
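The primary key ordering and range-based partitioning described in the table above can be illustrated with a short sketch. The split points, the sample rows, and the partition_for helper are hypothetical and chosen only for illustration. Tablestore decides the actual ranges when it splits and merges partitions.

```python
import bisect

# Hypothetical split points: partition 0 holds keys below "g",
# partition 1 holds ["g", "n"), partition 2 holds ["n", "t"),
# and partition 3 holds keys from "t" upward.
split_points = ["g", "n", "t"]


def partition_for(partition_key):
    """Return the index of the range partition that owns this partition key value."""
    return bisect.bisect_right(split_points, partition_key)


# Each tuple is a primary key: (partition key, second primary key column).
primary_keys = [("user_b", 2), ("alice", 7), ("user_b", 1), ("mike", 3)]

# Rows are kept in ascending order of the full primary key.
sorted_keys = sorted(primary_keys)

# Rows that share a partition key value always map to the same partition.
placement = {pk: partition_for(pk[0]) for pk in sorted_keys}
```

Because routing depends only on the first primary key column, both "user_b" rows land in the same partition, which is why all rows with one partition key value cannot be split apart.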

Differences between the Wide Column model and the relational model

The following table describes the differences between the Wide Column model and the relational model.

Model

Feature

Wide Column model

Three-dimensional structure (row, column, and time), schema-free data, wide columns, max versions, and TTL management

Relational model

Two-dimensional structure (row and column) and fixed schema

Limits

For more information about the general limits on the Wide Column model, see General limits.

  • If you use secondary indexes or search indexes to accelerate data queries, take note of the limits on the indexes. For more information, see Secondary index limits and Search index limits.

  • If you use SQL to query and analyze data, take note of the limits on SQL queries. For more information, see SQL limits.

Procedure

The following table describes the steps.

Step

Operation

Description

1

Grant permissions on Tablestore resources to a Resource Access Management (RAM) user

After you create a RAM user, configure minimal permissions for the RAM user to access Tablestore resources. You can use system default policies or custom policies to grant the RAM user the permissions to access Tablestore resources.

If you want to use an Alibaba Cloud account or a RAM user that has the required permissions to access Tablestore resources, skip this step.

Important

By default, an Alibaba Cloud account has permissions on all cloud resources. To ensure the security of your resources, we recommend that you create RAM users for your Alibaba Cloud account and authorize them to access different resources.

2

Activate Tablestore

Before you use the features of Tablestore, you must activate Tablestore.

You need to activate Tablestore only once. You are not charged when you activate Tablestore. If Tablestore is activated, skip this step.

3

Create a Tablestore instance

Important
  • Before you create a Tablestore instance, determine the billing method and instance type based on your workload and your requirements for read/write performance and cost. For more information, see Billing overview and Instance.

  • The search index, Tunnel Service, SQL query, data delivery, data encryption, control policy, data backup, and zone-redundant storage (ZRS) features of the Wide Column model are supported only in specific regions. Select a region that supports the required features to create an instance. For more information, see Features and regions.

Create a Tablestore instance in the selected region based on the selected billing method and instance type.

If an existing Tablestore instance meets your business requirements, skip this step.

4

Create a data table

Note

Proper design of the primary key and partition key can effectively prevent data hotspot issues. We recommend that you design tables by referring to Table operations.

Create a data table to store business-related data. When you create a data table, you can configure the following features based on your business requirements:

  • If you want to query data by attributes, you can create secondary indexes to accelerate queries.

  • You can enable data at rest encryption (DARE) by configuring data encryption settings for the data table.

  • If the table data involves auto-increment primary key columns, such as item IDs on e-commerce websites, user IDs of large websites, post IDs on forums, and message IDs in chat tools, you can enable the auto-increment primary key column feature for the data table.

5

Perform basic operations on data

Note

Proper attribute column settings can improve the efficiency of business data usage. We recommend that you specify attribute columns by referring to Data operations.

You can write, update, read, and delete data in the data table.

  1. Write data to the data table. For more information, see Write data.

  2. Read data from the data table based on the primary key. For more information, see Read data.

To delete data, you can manually delete the data or enable automatic deletion by setting the TTL value of the data. For more information, see Delete data and Data versions and TTL.

6

Use indexes to accelerate queries

If data reading based on the primary key of a data table cannot meet your business requirements, you can use indexes to accelerate data queries. Tablestore provides secondary indexes and search indexes to meet data query requirements in different scenarios.

  • Secondary index: allows you to query data based on the attribute columns of a data table. Tablestore provides global secondary indexes and local secondary indexes to meet different requirements for read consistency.

    Secondary indexes are suitable for scenarios in which the columns to be queried are known in advance, the number of columns to be queried is small, and the values of all primary key columns or of a primary key prefix can be determined.

  • Search index: uses inverted indexes, Bkd-trees, and column stores for various query scenarios.

    Search indexes are suitable for query and analysis scenarios in which queries based on the primary key and secondary indexes cannot meet your business requirements. For example, you can use search indexes to perform queries based on non-primary key columns, Boolean queries, relational queries, full-text search, geo queries, prefix queries, fuzzy queries, nested structure queries, and null value queries.

7

Analyze data

Use the SQL query feature or search indexes to aggregate and analyze data in the data table.

  • SQL query: You can use the SELECT statement to implement features such as JOIN, full-text search, aggregation, arithmetic operations, relational operations, logical operations, grouping by field, nested query by search index, data type query by search index, and JSON functions. For more information, see Query data.

  • Search index aggregation: You can perform aggregation operations to obtain the minimum value, maximum value, sum, average value, count and distinct count of rows, and percentile statistics. You can also perform aggregation operations to group results by field value, range, geographical location, filter, histogram, or date histogram, query the rows in grouped query results, and perform nested queries.

Note

You can also use compute engines such as MaxCompute, Spark, Hive, HadoopMR, Function Compute, and Realtime Compute for Apache Flink to analyze data in Tablestore. For more information, see Overview.
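The idea behind the secondary index in step 6 can be sketched as a second sorted structure whose key starts with an attribute value, so that a query by attribute becomes a prefix lookup. This is a conceptual illustration only, not Tablestore's implementation or API: data_table, build_index, and lookup are hypothetical names, and the sample orders are invented for the example.

```python
import bisect

# Hypothetical data table: primary key (order_id,) -> attribute columns.
data_table = {
    ("o1",): {"customer": "alice", "total": 30},
    ("o2",): {"customer": "bob", "total": 12},
    ("o3",): {"customer": "alice", "total": 8},
}


def build_index(table, column):
    """Build sorted (attribute value, primary key) entries for one column."""
    entries = [
        (attrs[column], pk) for pk, attrs in table.items() if column in attrs
    ]
    return sorted(entries)


def lookup(index, value):
    """Return the primary keys whose indexed attribute equals `value`."""
    # (value,) sorts before every (value, primary_key) entry, so
    # bisect_left finds the first entry with this attribute value.
    start = bisect.bisect_left(index, (value,))
    keys = []
    for attr_value, pk in index[start:]:
        if attr_value != value:
            break
        keys.append(pk)
    return keys


customer_index = build_index(data_table, "customer")
alice_orders = lookup(customer_index, "alice")
```

The lookup scans only the contiguous run of index entries for one attribute value instead of every row in the data table, which is the reason an index on a known, small set of columns speeds up such queries.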

Billing

The billable items include read throughput, write throughput, storage usage, and outbound traffic over the Internet. For more information, see Billing overview.

References

  • You can use the Wide Column model in the Tablestore console and Tablestore CLI. For more information, see the Use the Wide Column model section of the "Use Tablestore" topic.

  • To implement data center-level disaster recovery for instance data, you can create an instance of the ZRS type. For more information, see ZRS.

  • To ensure data storage security and network access security, you can encrypt data tables or associate a virtual private cloud (VPC) with your Tablestore instance to allow access only over the VPC. For more information, see Data encryption and Network security management.

  • To prevent important data from being accidentally deleted, you can use the data backup feature to back up important data on a regular basis. For more information, see Back up data in Tablestore.

  • To consume historical and incremental data in a data table, you can use Tunnel Service. For more information, see Overview.

  • To configure alert notifications for monitoring metrics, you can use CloudMonitor. For more information, see Overview.

  • To visualize data such as displaying data in charts, you can use DataV or Grafana. For more information, see Data visualization tools.