LindormTable is a distributed data engine in which data is distributed based on primary keys. If the primary key of a table contains multiple columns, LindormTable uses the columns in a left-to-right order to query data. A poorly designed primary key concentrates reads and writes on a small set of partitions, creating hot spots that degrade performance. This topic covers the principles for designing effective primary keys and provides examples for common workloads.
Key concepts
GET vs. SCAN queries
Primary key design determines which query methods are available.
| Query method | How it works | Requirement |
|---|---|---|
| GET | Point lookup by primary key | All primary key columns must be specified with explicit values |
| SCAN | Range scan over the primary key | The range must be specified on the first primary key column; full table scans are rejected by default |
Example queries:
```sql
-- GET: retrieve a single order
SELECT * FROM table WHERE userid='abc' AND orderid=123;

-- SCAN: retrieve a range of orders
SELECT * FROM table WHERE userid='abc' AND orderid>123 AND orderid<456;

-- SCAN with reverse order: retrieve the most recent orders first
SELECT * FROM table WHERE userid='abc' AND orderid>123 AND orderid<456 ORDER BY orderid DESC;
```
For queries that cannot be expressed with GET or SCAN, use an index table or a secondary index. For more information, see SELECT.
Design principles
Uniqueness
Different row versions share the same primary key. By default, a GET query returns only the latest version. Design primary keys to be unique unless you intentionally use multi-versioning.
A primary key can be a single column or a composite of multiple columns:
- `[userid]` — one record per user
- `[userid][orderid]` — multiple records per user, one per order
Column count and value length
Keep primary keys compact to reduce storage costs and improve write performance.
Column count: Use 1–3 primary key columns.
Value length: Prefer fixed-length values such as long integers. For variable-length values, keep each column under 2 KB.
First-column distribution
Data stored in Lindorm is distributed based on primary keys. If the primary key of a table contains multiple columns, data is distributed based on the columns in a left-to-right order. When values in the first primary key column are skewed or monotonically increasing, writes pile up on a single partition.
The following table shows whether common first-column types produce uniform or skewed distribution:
| First primary key column | Distribution | Suitable? |
|---|---|---|
| User ID (high cardinality, random) | Uniform | Yes |
| Device ID (many devices, even access) | Uniform | Yes |
| Timestamp or auto-increment value | Sequential — all writes go to one partition | No |
| Order type or status code (few possible values) | Skewed — most writes go to a few partitions | No |
| Column with a shared prefix | Skewed — similar values land in the same partition | No |
If you must use a timestamp or auto-increment column as the first primary key column, apply a hash prefix to distribute writes across partitions. See Avoid hot spots.
Avoid hot spots
When the first primary key column is monotonically increasing or has low cardinality, use one of the following techniques to distribute writes evenly. Each technique has a different trade-off for reads.
Hash prefix
Prepend a hash of the original key to scatter writes across partitions. Create a derived column pk1 using:
```
pk1 = hash(pk).substring(0, 4) + pk
```
The first four characters of the hash act as a random prefix. Writes spread evenly; reads use pk1 directly.
Primary key: [pk1][...]
Trade-off: To query by the original pk, compute the hash prefix at query time to construct pk1.
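The prefix computation can be sketched in Java. The document does not specify which hash function to use, so this sketch assumes CRC32; `toStoredKey` is a hypothetical helper name, not part of Lindorm's API.

```java
import java.util.zip.CRC32;
import java.nio.charset.StandardCharsets;

public class HashPrefixKey {
    // Build the stored key pk1 = hash(pk).substring(0, 4) + pk.
    // CRC32 is an illustrative choice; any stable hash works.
    public static String toStoredKey(String pk) {
        CRC32 crc = new CRC32();
        crc.update(pk.getBytes(StandardCharsets.UTF_8));
        // Format the checksum as zero-padded hex and keep the first 4 characters.
        String hex = String.format("%08x", crc.getValue());
        return hex.substring(0, 4) + pk;
    }

    public static void main(String[] args) {
        // The prefix is deterministic, so a reader can recompute it
        // from the original pk at query time to reconstruct pk1.
        System.out.println(toStoredKey("user-0001"));
        System.out.println(toStoredKey("user-0002"));
    }
}
```

Because the hash is deterministic, a point read recomputes the same prefix from the original key; only the write distribution changes.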
MD5 prefix
A common variant of hash prefix using MD5:
Primary key: [md5(userid).substring(0,4)][userid][orderid]
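A minimal Java sketch of the MD5 variant, assuming the key parts are simply concatenated; `md5Prefix` and `storedKey` are illustrative names, not Lindorm APIs.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.nio.charset.StandardCharsets;

public class Md5PrefixKey {
    // Compute md5(userid).substring(0, 4) for use as the partition prefix.
    public static String md5Prefix(String userid) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(userid.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.substring(0, 4);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Assemble [md5(userid).substring(0,4)][userid][orderid];
    // plain concatenation here stands in for the real key encoding.
    public static String storedKey(String userid, long orderid) {
        return md5Prefix(userid) + userid + orderid;
    }
}
```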
Reverse index
Reverse the string representation of the key to break up sequential prefixes:
Primary key: [reverse(userid)][orderid]
Trade-off: Range scans on userid become less natural. Works best when exact-match lookups dominate.
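The reversal itself is a one-liner; this sketch shows why it helps: two sequential IDs share a long leading prefix, but their reversed forms diverge immediately.

```java
public class ReverseKey {
    // Reverse the string form of the key so that sequential IDs,
    // which share a leading prefix, no longer land in the same partition.
    public static String reverse(String userid) {
        return new StringBuilder(userid).reverse().toString();
    }

    public static void main(String[] args) {
        // "100200" and "100201" share the prefix "10020",
        // but their reversed forms start with different characters.
        System.out.println(reverse("100200")); // "002001"
        System.out.println(reverse("100201")); // "102001"
    }
}
```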
Modulo bucket
Divide writes into a fixed number of buckets using modulo arithmetic. Assign each row to a bucket based on a monotonically increasing value:
```
long bucket = timestamp % numBuckets
```
Primary key: [bucket][timestamp][hostname][log-event]
Trade-off: To query all data for a time range, issue one query per bucket and merge the results.
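A sketch of the bucket assignment and the fan-out read it implies. `scanRanges` is a hypothetical helper that only enumerates the per-bucket ranges a query layer would issue and merge; it is not a Lindorm API.

```java
public class BucketKey {
    // Assign a row to one of numBuckets partitions by its timestamp,
    // as in: long bucket = timestamp % numBuckets.
    public static long bucket(long timestamp, int numBuckets) {
        return timestamp % numBuckets;
    }

    // A time-range read must fan out: one range scan per bucket,
    // with results merged by the caller. This just lists the ranges.
    public static String[] scanRanges(long start, long end, int numBuckets) {
        String[] ranges = new String[numBuckets];
        for (int b = 0; b < numBuckets; b++) {
            ranges[b] = "[" + b + "][" + start + " .. " + end + "]";
        }
        return ranges;
    }
}
```

Keeping `numBuckets` fixed is important: changing it remaps every timestamp to a different bucket, so existing rows would no longer be found.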
Random salt
Append a random number to spread writes across partitions:
Primary key: [userid][orderid][random(100)]
Trade-off: Reads for a single [userid][orderid] require scanning across multiple salt values. Use this pattern only when write distribution is more critical than read simplicity.
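A sketch of the salted write path, assuming a salt range of [0, 100) as in the example; the `|` separator and the `write` helper name are illustrative, not the real key encoding.

```java
import java.util.concurrent.ThreadLocalRandom;

public class SaltedKey {
    // Append a random salt in [0, 100) to spread writes across partitions,
    // as in [userid][orderid][random(100)]. A read for one (userid, orderid)
    // must then scan all 100 possible salt values.
    public static String write(String userid, long orderid) {
        int salt = ThreadLocalRandom.current().nextInt(100);
        return userid + "|" + orderid + "|" + salt;
    }
}
```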
Simplify primary keys
Reducing key size speeds up writes and lowers storage costs. Two common approaches:
- Replace STRING with LONG or INT. Example: '2015122410' → Long(2015122410).
- Replace long names with short codes. Example: 'taobao' → 'tb'.
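Both simplifications can be sketched in Java; the short-code map below is illustrative, and in practice the mapping would live in application code or a lookup table.

```java
import java.util.Map;

public class CompactKey {
    // Replace a STRING date with a LONG: '2015122410' -> 2015122410L.
    public static long dateToLong(String date) {
        return Long.parseLong(date);
    }

    // Replace long names with short codes: 'taobao' -> 'tb'.
    // The code table here is a hypothetical example.
    private static final Map<String, String> CODES = Map.of("taobao", "tb");

    public static String shortCode(String name) {
        return CODES.getOrDefault(name, name);
    }
}
```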
Common designs
Log and time series data
| Query goal | Primary key design |
|---|---|
| Query all records for a metric over a time range | [hostname][log-event][timestamp] |
| Retrieve the most recent records for a metric | [hostname][log-event][timestamp DESC] |
| Query time data with high write volume (hot timestamp) | long bucket = timestamp % numBuckets; [bucket][timestamp][hostname][log-event] |
For "most recent records" queries, define timestamp DESC in the primary key so that new records sort to the top. A SCAN then returns the latest records without additional sorting.
For high-volume time series where a single timestamp column would create a hot partition, the modulo bucket pattern distributes writes. To read all records in a time range, query each bucket separately and merge the results.
Transaction data
Transaction workloads typically require multiple access patterns — by seller, by buyer, and by order ID. Design separate tables for each access pattern:
| Access pattern | Table | Primary key design |
|---|---|---|
| Seller's transaction records by time | Seller table | [seller_id][timestamp][order_number] |
| Buyer's transaction records by time | Buyer table | [buyer_id][timestamp][order_number] |
| Order lookup by order ID | Order table | [order_number] |
When a query must combine data from more than one access pattern, join the three tables.