All Products
Search
Document Center

Tair (Redis® OSS-Compatible):Bloom

Last Updated:Mar 28, 2026

TairBloom is a space-efficient probabilistic data structure for membership testing at scale. A negative result guarantees the element is absent. A positive result means the element is probably present — a small fraction of positive results are false positives. This trade-off lets TairBloom perform membership checks across millions of records using a fraction of the memory that a hash set requires.

TairBloom implements Scalable Bloom Filters (SBFs), which stack multiple Bloom filter layers to grow dynamically while keeping the false positive rate stable.

Key concepts

Bloom filter

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, used to test whether an element is a member of a set. It is a bit array of _m_ bits (initially all 0) paired with _k_ hash functions. Adding an element sets _k_ bits to 1. Querying an element checks those _k_ bits — if all are 1, the element is probably present; if any is 0, the element is definitely absent. Because bits are shared across elements, false positives are possible. False negatives are not. Elements cannot be removed once added.

Scalable Bloom Filter (SBF)

As more elements are added to a fixed Bloom filter, the false positive rate rises. An SBF avoids this by stacking layers: when the current layer (for example, BF0) fills to the point where it can no longer maintain the target false positive rate, a new layer (BF1) is created automatically. New elements are always written to the latest layer. Queries traverse from the latest layer back to BF0.

TairBloom is built on SBFs. Each new layer has double the capacity of the previous one but occupies four times the memory.

imageimageimage

Use cases

TairBloom fits any system that needs to skip an expensive lookup when an item is definitely new. The probabilistic false positive is acceptable because the cost of occasionally re-processing a known item is lower than the cost of always querying the database.

TairBloom can be used for recommendation and crawler systems in industries such as live-streaming, music, and e-commerce.

Recommendation system

  • Problem: How do you avoid recommending the same article to the same user twice, without querying a database on every recommendation request?

  • Implementation: Store each recommended article ID per user in a TairBloom key. Before sending a recommendation, call BF.EXISTS. If the result is 0, the article has never been recommended — send it and record it with BF.ADD. If the result is 1, skip to the next candidate.

  • Benefit: A single in-memory check replaces a database round-trip for the vast majority of requests. Occasional false positives (skipping an article that was not actually recommended) are acceptable.

void recommendedSystem(userid) {
    while (true) {
        // Get a candidate article ID.
        docid = getDocByRandom()
        if (bf.exists(userid, docid)) {
            // Probably already recommended — skip.
            continue;
        } else {
            // Definitely not recommended — send it.
            sendRecommendMsg(docid);
            // Record the recommendation.
            bf.add(userid, docid);
            break;
        }
    }
}

Crawler system

  • Problem: How do you prevent re-crawling URLs that have already been processed?

  • Implementation: Store all crawled URLs in a TairBloom key. Before downloading a URL, call BF.EXISTS. If the result is 0, crawl it and add it to the filter. If the result is 1, skip it.

  • Benefit: Cuts duplicate work without maintaining a full URL database in memory.

bool crawlerSystem() {
    while (true) {
        url = getURLFromQueue()
        if (bf.exists(url_bloom, url)) {
            // Probably already crawled — skip.
            continue;
        } else {
            doDownload(url)
            bf.add(url_bloom, url);
        }
    }
}

For more examples, see Use Bloom filters to manage game event push notifications.

Prerequisites

Before you begin, ensure that you have:

The latest minor version provides more features and higher stability. Update your instance to the latest minor version. For more information, see Update the minor version of an instance. For cluster or read/write splitting instances, also update the proxy nodes to the latest minor version to ensure all commands run as expected.

Usage notes

Choose the right creation command

The most important decision before writing data is whether to use BF.ADD or BF.RESERVE. Getting the initial capacity right avoids unnecessary scale-ups.

BF.ADD / BF.MADDBF.RESERVE / BF.INSERT
Initial capacity100 (default)Specified by you
False positive rate0.01 (default)Specified by you
When to usePrototyping or small datasets (fewer than 100 elements)Any production use case where you know the approximate dataset size

If you undersize the initial capacity, TairBloom adds new layers as the filter fills up. This is not an error — it is how SBFs work. However, each additional layer increases query latency because queries must traverse more layers. If you oversize, you reserve more memory than needed.

For 10,000,000 elements at a false positive rate of 0.01:

  • BF.ADD (default capacity 100, auto-scaling): 176 MB

  • BF.RESERVE (capacity set to 10,000,000 upfront): 16 MB

Use the following table to estimate the memory required for a given capacity and target false positive rate:

CapacityFalse positive: 0.01False positive: 0.001False positive: 0.0001
100,0000.12 MB0.25 MB0.25 MB
1,000,0002 MB2 MB4 MB
10,000,00016 MB32 MB32 MB
100,000,000128 MB256 MB256 MB
1,000,000,0002 GB2 GB4 GB
Ultra-high capacity with ultra-high precision may fail to be created if the instance does not have enough available memory.

Monitor for scale-up

Run BF.INFO to check whether a key is approaching a scale-up. When the items count of the latest layer equals its capacity, a scale-up is imminent.

Important

Treat scale-up as a safety mechanism, not normal operating behavior. Each scale-up doubles capacity but quadruples the memory of the new layer. After an unexpected scale-up, rebuild the key promptly using DEL and BF.RESERVE to restore single-layer performance and reduce the risk of future scale-ups. Make sure the instance has enough free memory before writing to a scaled-up key — insufficient memory can trigger prolonged data eviction and block requests.

Manage key size

Elements can only be added to a key, never removed. The key size grows over time and cannot shrink. To prevent out of memory (OOM) errors:

  • Split data across multiple keys. Splitting prevents large keys from concentrating traffic on a single node. In cluster instances, distribute keys across nodes to avoid hot keys.

  • Rebuild keys periodically. Delete the key with DEL, then repopulate it from your backend database. Alternatively, maintain two keys and rotate between them — this keeps each key bounded in size at the cost of additional memory.

Supported commands

CommandSyntaxDescription
BF.RESERVEBF.RESERVE key error_rate capacityCreates an empty TairBloom key with the specified capacity and false positive rate
BF.ADDBF.ADD key itemAdds an element to a TairBloom key
BF.MADDBF.MADD key item [item ...]Adds multiple elements to a TairBloom key
BF.EXISTSBF.EXISTS key itemChecks whether an element exists in a TairBloom key
BF.MEXISTSBF.MEXISTS key item [item ...]Checks whether multiple elements exist in a TairBloom key
BF.INSERTBF.INSERT key [CAPACITY cap] [ERROR error] [NOCREATE] ITEMS item [item ...]Adds multiple elements to a TairBloom key; optionally creates the key with a specified capacity and false positive rate
BF.INFOBF.INFO keyReturns information about a TairBloom key, including layer count, items per layer, and false positive rate
DELDEL key [key ...]Deletes one or more TairBloom keys
Elements already added to a key cannot be deleted individually. DEL removes the entire key.

Command syntax conventions

  • UPPERCASE: command keyword

  • _italic_: variable

  • [options]: optional parameter; parameters not in brackets are required

  • A|B: mutually exclusive parameters; only one can be specified

  • ...: the preceding parameter can be repeated

BF.RESERVE

Time complexity
ItemDescription
SyntaxBF.RESERVE key error_rate capacity
O(1)
DescriptionCreates an empty TairBloom key with the specified initial capacity and false positive rate

Parameters

ParameterDescription
keyThe key name
error_rateThe target false positive rate. Must be between 0 and 1. Lower values improve accuracy but increase memory usage and CPU utilization.
capacityThe initial capacity. Specifies the maximum number of elements before a new layer is added. Set this to your expected dataset size to minimize the number of layers and maximize query performance.

Output

  • OK on success

  • An error message otherwise

Example

BF.RESERVE BFKEY 0.01 100
OK

BF.ADD

ItemDescription
SyntaxBF.ADD key item
Time complexityO(log N), where N is the number of Bloom filter layers
DescriptionAdds an element to a TairBloom key. If the key does not exist, it is created automatically with a default capacity of 100 and a false positive rate of 0.01.

Parameters

ParameterDescription
keyThe key name
itemThe element to add

Output

Return valueMeaning
1Element added (did not exist in the filter)
0Element not added (may already exist)
ErrorOperation failed

Example

BF.ADD BFKEY item1
(integer) 1

BF.MADD

ItemDescription
SyntaxBF.MADD key item [item ...]
Time complexityO(log N), where N is the number of Bloom filter layers
DescriptionAdds multiple elements to a TairBloom key. If the key does not exist, it is created automatically with a default capacity of 100 and a false positive rate of 0.01.

Parameters

ParameterDescription
keyThe key name
itemOne or more elements to add

Output

Returns one value per element, in order:

Return valueMeaning
1Element added (did not exist in the filter)
0Element not added (may already exist)
ErrorOperation failed

Example

BF.MADD BFKEY item1 item2 item3
(integer) 1
(integer) 1
(integer) 1

BF.EXISTS

ItemDescription
SyntaxBF.EXISTS key item
Time complexityO(log N), where N is the number of Bloom filter layers
DescriptionChecks whether an element exists in a TairBloom key

Parameters

ParameterDescription
keyThe key name
itemThe element to check

Output

Return valueMeaning
0Element is definitely absent
1Element is probably present (may be a false positive)
ErrorOperation failed

Example

BF.EXISTS BFKEY item1
(integer) 1

BF.MEXISTS

ItemDescription
SyntaxBF.MEXISTS key item [item ...]
Time complexityO(log N), where N is the number of Bloom filter layers
DescriptionChecks whether multiple elements exist in a TairBloom key

Parameters

ParameterDescription
keyThe key name
itemOne or more elements to check

Output

Returns one value per element, in order:

Return valueMeaning
0Element is definitely absent
1Element is probably present (may be a false positive)
ErrorOperation failed

Example

BF.MEXISTS BFKEY item1 item5
(integer) 1
(integer) 0

BF.INSERT

ItemDescription
SyntaxBF.INSERT key [CAPACITY cap] [ERROR error] [NOCREATE] ITEMS item [item ...]
Time complexityO(log N), where N is the number of Bloom filter layers
DescriptionAdds multiple elements to a TairBloom key. Optionally creates the key with a specified capacity and false positive rate, or prevents auto-creation.

Parameters

ParameterDescription
keyThe key name
CAPACITY capThe initial capacity. Ignored if the key already exists. If the number of elements exceeds this value, a new Bloom filter layer is added automatically.
ERROR errorThe target false positive rate. Must be between 0 and 1.
NOCREATEPrevents the key from being created if it does not exist. Cannot be used together with CAPACITY or ERROR.
ITEMS item [item ...]One or more elements to add

Output

Returns one value per element, in order:

Return valueMeaning
1Element added (did not exist in the filter)
0Element not added (may already exist)
ErrorOperation failed

Example

BF.INSERT bfkey1 CAPACITY 10000 ERROR 0.001 ITEMS item1 item2 item3
(integer) 1
(integer) 1
(integer) 1

BF.INFO

ItemDescription
SyntaxBF.INFO key
Time complexityO(log N), where N is the number of Bloom filter layers
DescriptionReturns information about a TairBloom key, including the number of layers, items per layer, and false positive rate

Parameters

ParameterDescription
keyThe key name

Output

  • Key information on success

  • An error message otherwise

Example

BF.INFO bk1
1) "total_items:6,num_blooms:2"
2) "bytes:4 bits:32 hashes:7 hashwidth:64 capacity:3 items:3 error_ratio:0.01"
3) "bytes:16 bits:128 hashes:9 hashwidth:64 capacity:10 items:3 error_ratio:0.0025"

Response fields

The first line is a summary. Each subsequent line describes one Bloom filter layer.

FieldDescription
total_itemsTotal number of elements across all layers
num_bloomsTotal number of Bloom filter layers
bytesMemory occupied by this layer, in bytes
bitsNumber of bits (bytes x 8)
hashesNumber of hash functions
hashwidthWidth of the hash functions
capacityCapacity of this layer
itemsNumber of elements in this layer
error_ratioFalse positive rate of this layer
Tip: When items equals capacity in the latest layer, a scale-up is imminent.

What's next