TairCpc is a data structure built on the compressed probability counting (CPC) sketch. It delivers high-accuracy cardinality estimation with significantly less memory than HyperLogLog (HLL), making it well-suited for real-time deduplication in high-throughput data streams.
Background
Real-time decision systems process business events as they arrive, store results for fast access, and apply rules to act immediately. Two examples where low-latency cardinality counting is critical:
Credit card fraud prevention: Determine whether a card is being used in a suspicious context and stop suspicious transactions at the earliest opportunity.
Anti-ticket-scalping: Detect and stop bots using virtual devices and fake IP addresses before they acquire inventory.
TairCpc lets you deduplicate incoming data by dimension, store the sketch in a Tair instance, and run aggregation queries within nanoseconds — combining storage and compute in a single layer.
Overview
CPC counts distinct values in a data stream and supports merging multiple sketches to get a combined total. Compared to HLL, CPC achieves the same accuracy with roughly 40% less memory.
TairCpc extends the open-source CPC implementation and reduces the error rate to 0.008%, down from 0.67% for standard CPC and 1.95% for HLL.
Think of TairCpc as a memory-efficient alternative to a Set for counting unique items:
| Operation | Set equivalent | TairCpc equivalent |
|---|---|---|
| Add an item | SADD key item | CPC.UPDATE key item |
| Count unique items | SCARD key | CPC.ESTIMATE key |
Unlike a Set, TairCpc does not store the raw items — it stores a compact sketch that approximates the count. This trade-off enables sub-millisecond aggregation at massive scale.
Key features
Low memory footprint with incremental reads and writes, and minimal I/O
High-performance, ultra-high-accuracy deduplication
Reduced error rate of 0.008% (vs. 0.67% for open-source CPC and 1.95% for HLL)
Use cases
Bank security systems: How many distinct accounts or devices have performed a given action in the last minute?
Flash sales: How many unique users have attempted to purchase this item in the current time window?
Anti-ticket-scalping: How many distinct IP addresses have sent requests from this session in the past 10 seconds?
Prerequisites
Before you begin, ensure that your instance is one of the following Tair series types:
DRAM-based instance. If the instance is compatible with Redis 5.0, the minor version must be 1.7.20 or later.
Persistent memory-optimized instance with minor version 1.2.3.3 or later.
Keep your instance on the latest minor version for the most features and highest stability. For upgrade instructions, see Update the minor version of an instance. For cluster instances and read/write splitting instances, also update the proxy nodes to the latest minor version so all commands run as expected.
Usage notes
TairCpc data is stored on a Tair instance.
Supported commands
| Command | Syntax | Description |
|---|---|---|
| CPC.UPDATE | CPC.UPDATE key item [EX|EXAT|PX|PXAT time] | Adds an item to a key. Creates the key if it does not exist. No-op if the item already exists. |
| CPC.ESTIMATE | CPC.ESTIMATE key | Returns the deduplicated cardinality estimate for a key as a DOUBLE. |
| CPC.UPDATE2EST | CPC.UPDATE2EST key item [EX|EXAT|PX|PXAT time] | Adds an item and returns the updated DOUBLE estimate. Creates the key if it does not exist. |
| CPC.UPDATE2JUD | CPC.UPDATE2JUD key item [EX|EXAT|PX|PXAT time] | Adds an item and returns the updated estimate plus the difference. Use the difference to detect duplicates: 1 = new item, 0 = duplicate. |
| CPC.ARRAY.UPDATE | CPC.ARRAY.UPDATE key timestamp item [EX|EXAT|PX|PXAT time] [SIZE size] [WIN window_length] | Adds an item to the time window matching the given timestamp. |
| CPC.ARRAY.ESTIMATE | CPC.ARRAY.ESTIMATE key timestamp | Returns the cardinality estimate for the time window matching the given timestamp. |
| CPC.ARRAY.ESTIMATE.RANGE | CPC.ARRAY.ESTIMATE.RANGE key start_time end_time | Returns per-window cardinality estimates across a time range (closed interval). |
| CPC.ARRAY.ESTIMATE.RANGE.MERGE | CPC.ARRAY.ESTIMATE.RANGE.MERGE key timestamp range | Returns the merged, deduplicated estimate starting from a timestamp and spanning N time windows backward. |
| CPC.ARRAY.UPDATE2EST | CPC.ARRAY.UPDATE2EST key timestamp item [EX|EXAT|PX|PXAT time] [SIZE size] [WIN window_length] | Adds an item to a time window and returns the updated estimate for that window. |
| CPC.ARRAY.UPDATE2JUD | CPC.ARRAY.UPDATE2JUD key timestamp item [EX|EXAT|PX|PXAT time] [SIZE size] [WIN window_length] | Adds an item to a time window and returns the updated estimate and difference for that window. |
| DEL | DEL key [key ...] | Deletes one or more TairCpc keys. |
Syntax conventions
UPPERCASE: command keyword_italic_: variable[option]: optional parameterA|B: mutually exclusive options — specify one only...: the preceding parameter can be repeated
CPC.UPDATE
Adds an item to a TairCpc key. Creates the key if it does not exist. If the item already exists, the command is a no-op.
Syntax
CPC.UPDATE key item [EX|EXAT|PX|PXAT time]Time complexity O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc key to update. |
item | The item to add. |
EX time | Relative expiration time in seconds. The key does not expire if omitted. |
EXAT time | Absolute expiration time as a UNIX timestamp in seconds. The key does not expire if omitted. |
PX time | Relative expiration time in milliseconds. The key does not expire if omitted. |
PXAT time | Absolute expiration time as a UNIX timestamp in milliseconds. The key does not expire if omitted. |
Return value
OKon success.An error message otherwise.
Example
CPC.UPDATE foo f1 EX 3600OKCPC.ESTIMATE
Returns the cardinality estimate of a TairCpc key after deduplication. The return value is a DOUBLE — round to the nearest integer for a usable count.
Syntax
CPC.ESTIMATE keyTime complexity: O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc key to query. |
Return value
The DOUBLE-type cardinality estimate on success.
An error message otherwise.
Example
CPC.ESTIMATE foo"19.000027716212127"CPC.UPDATE2EST
Adds an item to a TairCpc key and returns the updated cardinality estimate in a single round trip. Creates the key if it does not exist.
Syntax
CPC.UPDATE2EST key item [EX|EXAT|PX|PXAT time]Time complexity: O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc key to update. |
item | The item to add. |
EX time | Relative expiration time in seconds. The key does not expire if omitted. |
EXAT time | Absolute expiration time as a UNIX timestamp in seconds. The key does not expire if omitted. |
PX time | Relative expiration time in milliseconds. The key does not expire if omitted. |
PXAT time | Absolute expiration time as a UNIX timestamp in milliseconds. The key does not expire if omitted. |
Return value
The DOUBLE-type cardinality estimate after the update on success.
An error message otherwise.
Example
CPC.UPDATE2EST foo f3"3.0000004768373003"CPC.UPDATE2JUD
Adds an item to a TairCpc key and returns both the updated cardinality estimate and the difference from the previous estimate. Use the difference to determine whether the item was new or already existed — without a separate read command.
If a difference of 1 is returned, the item was new (no duplicate).
If a difference of 0 is returned, the item already existed (duplicate detected).
Creates the key if it does not exist.
Syntax
CPC.UPDATE2JUD key item [EX|EXAT|PX|PXAT time]Time complexity: O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc key to update. |
item | The item to add. |
EX time | Relative expiration time in seconds. The key does not expire if omitted. |
EXAT time | Absolute expiration time as a UNIX timestamp in seconds. The key does not expire if omitted. |
PX time | Relative expiration time in milliseconds. The key does not expire if omitted. |
PXAT time | Absolute expiration time as a UNIX timestamp in milliseconds. The key does not expire if omitted. |
Return value
On success: two DOUBLE values — the new cardinality estimate, then the difference between the new and previous estimates.
An error message otherwise.
Example
CPC.UPDATE2JUD foo f201) "20.000027716212127" // New cardinality estimate: 20
2) "1.0000014901183398" // Difference: 20 - 19 = 1 (new item)CPC.ARRAY.UPDATE
Adds an item to a TairCpc array key within the time window that the given timestamp falls into. Creates the key if it does not exist.
A TairCpc array key maintains a sliding window of CPC sketches. Each call writes data into the time window matching the timestamp. The total observable time span is SIZE × WIN milliseconds. Data outside this span is overwritten as new data arrives.
SIZEandWINare only applied when the key is first created. Subsequent writes to the same key use the values set at creation time.
Example: To track unique events per minute over the last 10 minutes, set SIZE to 10 and WIN to 60000. When data from the 11th minute arrives, the data from the first minute is overwritten.
Syntax
CPC.ARRAY.UPDATE key timestamp item [EX|EXAT|PX|PXAT time] [SIZE size] [WIN window_length]Time complexity: O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc array key to update. |
timestamp | UNIX timestamp in milliseconds indicating which time window to write into. |
item | The item to add. |
EX time | Relative expiration time in seconds. The key does not expire if omitted. |
EXAT time | Absolute expiration time as a UNIX timestamp in seconds. The key does not expire if omitted. |
PX time | Relative expiration time in milliseconds. The key does not expire if omitted. |
PXAT time | Absolute expiration time as a UNIX timestamp in milliseconds. The key does not expire if omitted. |
SIZE size | Number of time windows. Default: 10. Valid range: 1–1000. Keep below 120 for typical workloads. |
WIN window_length | Length of each time window in milliseconds. Default: 60000 (1 minute). |
Return value
OKon success.An error message otherwise.
Example
CPC.ARRAY.UPDATE foo 1645584510000 f1 SIZE 120 WIN 10000OKCPC.ARRAY.ESTIMATE
Returns the cardinality estimate for the time window that the given timestamp falls into.
Syntax
CPC.ARRAY.ESTIMATE key timestampTime complexity: O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc array key to query. |
timestamp | UNIX timestamp in milliseconds identifying the time window to query. |
Return value
The cardinality estimate for the matching time window on success.
An error message otherwise.
Example
CPC.ARRAY.ESTIMATE foo 1645584532000"2"CPC.ARRAY.ESTIMATE.RANGE
Returns per-window cardinality estimates across all time windows within a specified time range (closed interval). Each value in the response corresponds to one time window.
Syntax
CPC.ARRAY.ESTIMATE.RANGE key start_time end_timeTime complexity: O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc array key to query. |
start_time | Start of the time range as a UNIX timestamp in milliseconds. |
end_time | End of the time range as a UNIX timestamp in milliseconds. |
Return value
A list of cardinality estimates, one per time window in the range, on success.
An error message otherwise.
Example
CPC.ARRAY.ESTIMATE.RANGE foo 1645584510000 16455845500001) "2"
2) "0"
3) "1"
4) "0"
5) "0"CPC.ARRAY.ESTIMATE.RANGE.MERGE
Returns a single merged and deduplicated cardinality estimate, aggregating N time windows starting from the given timestamp and going backward.
Use this command when you need a unified unique count across multiple time windows — for example, total unique users in the last 5 minutes — rather than per-window counts.
Syntax
CPC.ARRAY.ESTIMATE.RANGE.MERGE key timestamp rangeTime complexity: O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc array key to query. |
timestamp | Starting point of the query as a UNIX timestamp in milliseconds. |
range | Number of time windows to merge, counting backward from the timestamp. |
Return value
The merged, deduplicated cardinality estimate on success.
An error message otherwise.
Example
CPC.ARRAY.ESTIMATE.RANGE.MERGE foo 1645584510000 3"6"CPC.ARRAY.UPDATE2EST
Adds an item to the time window matching the given timestamp and returns the updated cardinality estimate for that window in a single round trip. Creates the key if it does not exist.
Uses the same key-creation parameters as CPC.ARRAY.UPDATE.
Syntax
CPC.ARRAY.UPDATE2EST key timestamp item [EX|EXAT|PX|PXAT time] [SIZE size] [WIN window_length]Time complexity: O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc array key to update. |
timestamp | UNIX timestamp in milliseconds identifying the target time window. |
item | The item to add. |
EX time | Relative expiration time in seconds. The key does not expire if omitted. |
EXAT time | Absolute expiration time as a UNIX timestamp in seconds. The key does not expire if omitted. |
PX time | Relative expiration time in milliseconds. The key does not expire if omitted. |
PXAT time | Absolute expiration time as a UNIX timestamp in milliseconds. The key does not expire if omitted. |
SIZE size | Number of time windows. Default: 10. Valid range: 1–1000. Keep below 120 for typical workloads. |
WIN window_length | Length of each time window in milliseconds. Default: 60000 (1 minute). |
Return value
The updated DOUBLE cardinality estimate for the time window on success.
An error message otherwise.
Example
CPC.ARRAY.UPDATE2EST foo 1645584530000 f3"3"CPC.ARRAY.UPDATE2JUD
Adds an item to the time window matching the given timestamp and returns both the updated cardinality estimate and the difference from the previous estimate for that window. Creates the key if it does not exist.
If a difference of 1 is returned, the item was new (no duplicate in this window).
If a difference of 0 is returned, the item already existed in this window.
Uses the same key-creation parameters as CPC.ARRAY.UPDATE.
Syntax
CPC.ARRAY.UPDATE2JUD key timestamp item [EX|EXAT|PX|PXAT time] [SIZE size] [WIN window_length]Time complexity: O(1)
Parameters
| Parameter | Description |
|---|---|
key | The TairCpc array key to update. |
timestamp | UNIX timestamp in milliseconds identifying the target time window. |
item | The item to add. |
EX time | Relative expiration time in seconds. The key does not expire if omitted. |
EXAT time | Absolute expiration time as a UNIX timestamp in seconds. The key does not expire if omitted. |
PX time | Relative expiration time in milliseconds. The key does not expire if omitted. |
PXAT time | Absolute expiration time as a UNIX timestamp in milliseconds. The key does not expire if omitted. |
SIZE size | Number of time windows. Default: 10. Valid range: 1–1000. Keep below 120 for typical workloads. |
WIN window_length | Length of each time window in milliseconds. Default: 60000 (1 minute). |
Return value
On success: two values — the updated cardinality estimate for the time window, then the difference between the new and previous estimates.
An error message otherwise.
Example
CPC.ARRAY.UPDATE2JUD foo 1645584530000 f71) "8" // New cardinality estimate for this window: 8
2) "1" // Difference: 8 - 7 = 1 (new item)