Best Practices | How to Develop Redis Module with C++

This article summarizes some issues encountered by Tair while developing Redis modules with C++.

By Qudong


In this article, I will summarize some issues encountered by Tair while developing Redis modules with C++ and refine them into best practices. The aim is to provide assistance to users and developers of Redis modules. Some of these best practices can also be applied to C programming language and other programming languages.


Since Redis 4.0, Redis has supported modules, plugins that extend its capabilities. Modules can implement new data structures, listen to and filter commands, and add new network services. The introduction of modules has significantly increased the flexibility of Redis and lowered the barrier to extending it.

Numerous modules have emerged in the Redis community, covering various areas and enriching the ecosystem. Most of them are written in C, but modules can also be developed in other languages such as C++ and Rust.

How the Redis Module Works

The Redis kernel is written in C, so dynamic link libraries are the natural choice for a plugin mechanism. Redis does use dynamic libraries, but there are two key points to note:

  1. The Redis kernel exports various APIs for modules to use, including memory allocation interfaces and interfaces for operating on the core Redis DB structures. These APIs are resolved and bound by Redis itself, not by the dynamic linker.
  2. The Redis kernel loads modules explicitly with dlopen instead of relying on implicit loading by the dynamic linker. A module must therefore implement a specific entry function, which Redis calls automatically after loading so that the module can initialize the APIs, register data structures, and perform other setup.


The following code shows the logic of module loading in the Redis kernel:

int moduleLoad(const char *path, void **module_argv, int module_argc, int is_loadex) {
    int (*onload)(void *, void **, int);
    void *handle;

    struct stat st;
    if (stat(path, &st) == 0) {
        /* This check is best effort */
        if (!(st.st_mode & (S_IXUSR  | S_IXGRP | S_IXOTH))) {
            serverLog(LL_WARNING, "Module %s failed to load: It does not have execute permissions.", path);
            return C_ERR;
        }
    }

    // Open the module so.
    handle = dlopen(path,RTLD_NOW|RTLD_LOCAL);
    if (handle == NULL) {
        serverLog(LL_WARNING, "Module %s failed to load: %s", path, dlerror());
        return C_ERR;
    }

    // Obtain the symbolic address of the onload function in the module.
    onload = (int (*)(void *, void **, int))(unsigned long) dlsym(handle,"RedisModule_OnLoad");
    if (onload == NULL) {
        serverLog(LL_WARNING,
            "Module %s does not export RedisModule_OnLoad() "
            "symbol. Module not loaded.",path);
        return C_ERR;
    }
    RedisModuleCtx ctx;
    moduleCreateContext(&ctx, NULL, REDISMODULE_CTX_TEMP_CLIENT); /* We pass NULL since we don't have a module yet. */
    // Call onload to initialize the module.
    if (onload((void*)&ctx,module_argv,module_argc) == REDISMODULE_ERR) {
        serverLog(LL_WARNING,
            "Module %s initialization failed. Module not loaded",path);
        if (ctx.module) {
            // ... cleanup of the partially initialized module is omitted ...
        }
        return C_ERR;
    }

    /* Redis module loaded! Register it. */

    //... irrelevant code is omitted ...

    return C_OK;
}

API binding

In the initialization function of the module, RedisModule_Init should be called explicitly to initialize the APIs exported by the Redis kernel. Example:

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (RedisModule_Init(ctx, "helloworld", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
      return REDISMODULE_ERR;

    // ... irrelevant code is omitted ...

    return REDISMODULE_OK;
}

RedisModule_Init is a function defined in redismodule.h, which exports and binds each API exposed by the Redis kernel.

static int RedisModule_Init(RedisModuleCtx *ctx, const char *name, int ver, int apiver) {
    void *getapifuncptr = ((void**)ctx)[0];
    RedisModule_GetApi = (int (*)(const char *, void *)) (unsigned long)getapifuncptr;

    // Bind the APIs exported by Redis.
    REDISMODULE_GET_API(Alloc);
    REDISMODULE_GET_API(Free);
    // ... irrelevant code is omitted ...

    RedisModule_SetModuleAttribs(ctx, name, ver, apiver);
    return REDISMODULE_OK;
}

Let's look at what the REDISMODULE_GET_API is doing first. It is a macro definition, which essentially calls the RedisModule_GetApi function:

#define REDISMODULE_GET_API(name) \
RedisModule_GetApi("RedisModule_" #name, ((void **)&RedisModule_ ## name))

RedisModule_GetApi looks like just another API exposed by Redis, but API binding is exactly what we are doing at this point. So how is the address of the RedisModule_GetApi function obtained before any binding has taken place? The answer is that when the Redis kernel calls the module's OnLoad function, it passes the address of the function through RedisModuleCtx. As the module loading code above shows, before calling the OnLoad function, Redis uses moduleCreateContext to initialize a RedisModuleCtx and passes it to the module.

In the moduleCreateContext, the address of the RM_GetApi function defined in Redis is assigned to the getapifuncptr member of RedisModuleCtx.

void moduleCreateContext(RedisModuleCtx *out_ctx, RedisModule *module, int ctx_flags) {
    memset(out_ctx, 0 ,sizeof(RedisModuleCtx));
    // Pass the GetApi address to the module.
    out_ctx->getapifuncptr = (void*)(unsigned long)&RM_GetApi;
    out_ctx->module = module;
    out_ctx->flags = ctx_flags;

    // ... irrelevant code is omitted ...
}

Therefore, the module can obtain the GetApi function through RedisModuleCtx. Why is such a "strange" expression, ((void**)ctx)[0], used instead of simply ctx->getapifuncptr? The reason is that RedisModuleCtx is a data structure defined in the Redis kernel, and its internal layout is invisible to the module (it is an opaque pointer). The code therefore relies on the fact that getapifuncptr is the first member of RedisModuleCtx and reads the first pointer-sized value directly.

void *getapifuncptr = ((void**)ctx)[0];
RedisModule_GetApi = (int (*)(const char *, void *)) (unsigned long)getapifuncptr;

The following structure shows the fact that getapifuncptr is the first member of RedisModuleCtx.

struct RedisModuleCtx {
    // getapifuncptr is the first member.
    void *getapifuncptr;            /* NOTE: Must be the first field. */
    struct RedisModule *module;     /* Module reference. */
    client *client;                 /* Client calling a command. */
    // ... irrelevant code is omitted ...
};

After figuring out how RM_GetApi is exported, let's take a look at what RM_GetApi is doing:

int RM_GetApi(const char *funcname, void **targetPtrPtr) {
    /* Lookup the requested module API and store the function pointer into the
     * target pointer. The function returns REDISMODULE_ERR if there is no such
     * named API, otherwise REDISMODULE_OK.
     * This function is not meant to be used by modules developer, it is only
     * used implicitly by including redismodule.h. */
    dictEntry *he = dictFind(server.moduleapi, funcname);
    if (!he) return REDISMODULE_ERR;
    *targetPtrPtr = dictGetVal(he);
    return REDISMODULE_OK;
}

The internal implementation of RM_GetApi is very simple: it looks up the function name to be bound in a global hash table (server.moduleapi) and, if the name is found, stores the corresponding function address in targetPtrPtr. So where does the content of this dict come from?

When the Redis kernel starts, it registers its exposed module APIs through the moduleRegisterCoreAPI function. The process is as follows:

/* Register all the APIs we export. Keep this function at the end of the
 * file so that's easy to seek it to add new entries. */
void moduleRegisterCoreAPI(void) {
    server.moduleapi = dictCreate(&moduleAPIDictType);
    server.sharedapi = dictCreate(&moduleAPIDictType);

    // Register functions to the global hash table.
    REGISTER_API(Alloc);
    REGISTER_API(Free);

    // ... irrelevant code is omitted ...
}

REGISTER_API is a macro implemented on top of the moduleRegisterApi function, which adds the exported function name and function pointer to server.moduleapi.

int moduleRegisterApi(const char *funcname, void *funcptr) {
    return dictAdd(server.moduleapi, (char*)funcname, funcptr);
}

#define REGISTER_API(name) \
    moduleRegisterApi("RedisModule_" #name, (void *)(unsigned long)RM_ ## name)
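The export-and-bind flow described above can be condensed into a small self-contained sketch. Everything in it is a simplified stand-in rather than Redis code: the names Server_Add, Server_GetApi, ServerCtx, and Module_OnLoad are hypothetical, and a std::map plays the role of server.moduleapi. It only illustrates the idea that the module sees an opaque context, reads the lookup function from the first pointer-sized slot, and then binds APIs by name.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// ---- "Server" side: hypothetical, simplified stand-ins for Redis internals ----
using GenericFn = void (*)();

// Plays the role of server.moduleapi: API name -> function address.
static std::map<std::string, GenericFn>& apiTable() {
    static std::map<std::string, GenericFn> table;
    return table;
}

// An API the server wants to expose (stands in for an RM_* function).
static int Server_Add(int a, int b) { return a + b; }

// Plays the role of RM_GetApi: look up a name and store the address in *target.
static int Server_GetApi(const char* name, void* target) {
    auto it = apiTable().find(name);
    if (it == apiTable().end()) return 1;  // no such API
    GenericFn fn = it->second;
    std::memcpy(target, &fn, sizeof fn);   // write into the module's pointer
    return 0;
}

// Plays the role of RedisModuleCtx: the lookup function must be the first field.
struct ServerCtx {
    void* getapifuncptr;
    int flags;
};

// ---- "Module" side: only ever sees an opaque context pointer ----
static int (*Module_GetApi)(const char*, void*) = nullptr;
static int (*Module_Add)(int, int) = nullptr;

static int Module_OnLoad(void* opaque_ctx) {
    assert(opaque_ctx != nullptr);
    // The struct layout is unknown here, so read the first pointer-sized slot.
    void* getapi = static_cast<void**>(opaque_ctx)[0];
    Module_GetApi = reinterpret_cast<int (*)(const char*, void*)>(getapi);
    return Module_GetApi("Server_Add", &Module_Add);
}

// Server start-up followed by module load, in the order described above.
static int serverLoadModule() {
    apiTable()["Server_Add"] = reinterpret_cast<GenericFn>(&Server_Add);
    ServerCtx ctx{reinterpret_cast<void*>(&Server_GetApi), 0};
    return Module_OnLoad(&ctx);
}
```

After serverLoadModule() returns 0, the module can call Module_Add(2, 3) without the dynamic linker ever having resolved the symbol Server_Add for it.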

This raises a question: why does Redis go to so much effort to implement its own API export and binding mechanism? In theory, the code in the module's dynamic library could call the visible symbols exposed by Redis directly, relying on the symbol resolution and relocation mechanisms of the dynamic linker. Although this is feasible, it invites symbol conflicts. For example, if another module exposes a function with the same name as a Redis API, which one is called depends on the order of global symbol resolution (global symbol interposition). Another reason is that this binding mechanism gives Redis better control over different API versions.

Best Practices

Disable C++ name mangling for the entry function

As the module loading mechanism above shows, the module must ensure that the name of its entry function is exactly the one Redis looks up. Therefore, when we write module code in C++, we must first disable C++ name mangling for the entry function. Otherwise, the error "Module does not export RedisModule_OnLoad()" will be reported.

The sample code is as follows:

#include "redismodule.h"

extern "C" __attribute__((visibility("default"))) int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {

    // Init code and command register
    return REDISMODULE_OK;
}

Take over memory statistics

Redis needs accurate statistics on the memory used by the data structure at runtime (the atomic variable used_memory is used to add and subtract internally), which requires that the module must use the same memory allocation interface as the Redis kernel does. Otherwise, the memory allocation in the module may not be counted.

REDISMODULE_API void * (*RedisModule_Alloc)(size_t bytes) REDISMODULE_ATTR;
REDISMODULE_API void * (*RedisModule_Realloc)(void *ptr, size_t bytes) REDISMODULE_ATTR;
REDISMODULE_API void (*RedisModule_Free)(void *ptr) REDISMODULE_ATTR;
REDISMODULE_API void * (*RedisModule_Calloc)(size_t nmemb, size_t size) REDISMODULE_ATTR;

For simple modules, calling these APIs explicitly is not a problem. However, for more complicated modules, especially those that rely on third-party libraries, it is difficult to replace every memory allocation in the library with the module interfaces. When the module is written in C++, it is even more important that the allocations made by new/delete, make_shared, and container allocators, which appear everywhere in C++ code, are all taken over by one unified memory allocation path.

new/operator new/placement new

First of all, I will explain the difference between them. new is a keyword, and like sizeof, its behavior cannot be modified. new is in charge of three things:

  1. Allocate space (using operator new).
  2. Initialize the object (using placement new or a type cast), that is, call the constructor of the object.
  3. Return the object pointer.

operator new is an operator function that, like + or -, can be user-defined. By providing our own version, we can change the way space is allocated.

Placement new is an overloaded form of operator new (that is, the parameter form is different). Example:

void * operator new(size_t, void *location) {
    return location;
}

It can be seen that to modify the default memory allocation used by new, we can use two methods.

placement new

It is nothing more than manually simulating the behavior of the keyword new. First, use the module API to allocate a piece of memory, and then call the constructor of the object on this memory.

Object *p=(Object*)RedisModule_Alloc(sizeof(Object));
new (p)Object();

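Note that destruction also requires special handling: the destructor must be called explicitly, and the memory must then be returned through the module API:

p->~Object();
RedisModule_Free(p);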

Because placement new does not have global behavior and needs to manually handle the allocation of each object, it still cannot completely solve the problem of memory allocation for complex C++ modules.
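As a hedged illustration of the placement new approach, the sketch below wraps the allocate/construct and destroy/free pairs in two small helpers. ModuleAlloc and ModuleFree are hypothetical stand-ins for RedisModule_Alloc and RedisModule_Free (they simply count live blocks on top of malloc so the example runs without a Redis server), and moduleNew, moduleDelete, and Session are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

// Hypothetical stand-ins for RedisModule_Alloc / RedisModule_Free that count
// live allocations, so the example can be checked without a Redis server.
static std::size_t g_live_blocks = 0;

static void* ModuleAlloc(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p) ++g_live_blocks;
    return p;
}

static void ModuleFree(void* p) {
    if (!p) return;
    --g_live_blocks;
    std::free(p);
}

// Allocate through the module allocator, then construct the object in place.
template <typename T, typename... Args>
T* moduleNew(Args&&... args) {
    void* raw = ModuleAlloc(sizeof(T));
    if (!raw) throw std::bad_alloc();
    try {
        return new (raw) T(std::forward<Args>(args)...);
    } catch (...) {
        ModuleFree(raw);  // do not leak the block if the constructor throws
        throw;
    }
}

// Call the destructor explicitly, then return the memory to the module allocator.
template <typename T>
void moduleDelete(T* obj) {
    if (!obj) return;
    obj->~T();
    ModuleFree(obj);
}

struct Session {
    std::string user;
    int hits;
    Session(std::string u, int h) : user(std::move(u)), hits(h) {}
};
```

A call such as moduleNew<Session>("alice", 3) routes the Session block itself through the module allocator. Any heap memory that the std::string member needs still comes from the default operator new, which is exactly the limitation described above.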

operator new

C++ has a built-in implementation of operator new which, by default, uses glibc malloc to allocate memory. C++ also lets us supply our own operator new, in which the internal malloc can be replaced with RedisModule_Alloc.

Strictly speaking, it is not accurate to say that operator new is overloaded (same function name in the same scope, different parameters) or overridden (a derived-class function whose name and parameters match those of the base-class function, and whose return type matches except for covariance). I think that "replaced" is more appropriate here, because the built-in operator new is implemented as a weak symbol. Take GCC as an example:

_GLIBCXX_WEAK_DEFINITION void *
operator new (std::size_t sz) _GLIBCXX_THROW (std::bad_alloc)
{
  void *p;

  /* malloc (0) is unpredictable; avoid it.  */
  if (sz == 0)
    sz = 1;

  while (__builtin_expect ((p = malloc (sz)) == 0, false))
    {
      new_handler handler = std::get_new_handler ();
      if (! handler)
        _GLIBCXX_THROW_OR_ABORT(bad_alloc());
      handler ();
    }

  return p;
}

In this way, when we implement a strong symbol version, it replaces the compiler's own implementation.

Take the basic operator new/operator delete as an example:

void *operator new(std::size_t size) {
    return RedisModule_Alloc(size);
}
void operator delete(void *ptr) noexcept {
    RedisModule_Free(ptr);
}

Because operator new has global effect, this solves, once and for all, the problem of taking over memory allocated with new/delete (make_shared also uses new internally).
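The global effect can be observed with a small sketch. It assumes hypothetical ModuleAlloc and ModuleFree functions that stand in for RedisModule_Alloc and RedisModule_Free and merely count live blocks on top of malloc. Only the basic (unaligned, throwing) forms are replaced here; a real module would likely also want to consider the aligned and nothrow variants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <vector>

// Hypothetical stand-ins for RedisModule_Alloc / RedisModule_Free.
static std::size_t g_live_blocks = 0;

static void* ModuleAlloc(std::size_t bytes) {
    void* p = std::malloc(bytes ? bytes : 1);  // avoid malloc(0)
    if (p) ++g_live_blocks;
    return p;
}

static void ModuleFree(void* p) {
    if (!p) return;
    --g_live_blocks;
    std::free(p);
}

// Replacement functions: every new-expression, make_shared call, and
// std::allocator-based container in this program now goes through ModuleAlloc.
void* operator new(std::size_t size) {
    void* p = ModuleAlloc(size);
    if (!p) throw std::bad_alloc();
    return p;
}

void operator delete(void* ptr) noexcept { ModuleFree(ptr); }

// Sized form (C++14); forwarded explicitly so both forms stay in sync.
void operator delete(void* ptr, std::size_t) noexcept { ModuleFree(ptr); }

// Allocates through make_shared and std::vector, then releases everything.
static int useStandardFacilities() {
    auto sp = std::make_shared<int>(42);
    std::vector<int> v(100, 7);
    return *sp + v[99];
}
```

No call site mentions ModuleAlloc, yet the counter moves whenever the standard library allocates, which is what "global effect" means in practice.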

Visibility of Operator New Across Multiple Modules

Because operator new has global visibility (it cannot be hidden inside a namespace), we need to be careful about the impact of this behavior when Redis loads more than one module written in C++.

Assume that there are two modules, module1 and module2, and that module1 replaces operator new. Since operator new is essentially a special function, when module1 is loaded by Redis (using dlopen), the dynamic linker adds the operator new implemented by module1 to the global symbol table. When module2 is loaded later and its symbols are relocated, its references to operator new are also bound to the implementation in module1.

If both module1 and module2 are developed by ourselves, this is generally not a problem. However, if module1 and module2 come from different developers and each provides its own operator new implementation, only the implementation loaded first takes effect (global symbol interposition), and the module loaded later may behave abnormally.

Static linking/Dynamic linking C++ standard library

Static Linking

Sometimes, our module is written and compiled against a recent C++ version. To avoid the situation where the target platform lacks a matching C++ runtime when the module is distributed, we usually link the C++ standard library statically into the module. On Linux, for example, this means statically linking libstdc++ and libgcc_s into the module. If Redis loads only one C++ module, this usually works. However, if two C++ modules are loaded at the same time and both statically link the C++ standard library, the modules misbehave. Specifically, the module loaded later cannot use C++ streams properly, and therefore cannot print information, use regular expressions, and so on. (We suspect that some global variables defined by the C++ standard library are initialized more than once.)

This problem has existed on GCC for many years: https://gcc.gnu.org/bugzilla//show_bug.cgi?id=68479

Dynamic Linking

Therefore, in this scenario (Redis loads more than one C++ module), it is recommended that all modules link the C++ standard library dynamically. If you are still worried about C++ version compatibility during distribution, you can package libstdc++.so and libgcc_s.so together with the module, and then set an rpath based on $ORIGIN so that the module links to your bundled versions.

Use the block mechanism to improve concurrent processing capability

Redis uses a single-threaded model (a single worker thread), which means that while one command is executing, Redis cannot process or respond to any other command. For time-consuming module commands, we would like the command to run in the background so that Redis can continue to read and process commands from other clients.

As shown in Figure 1, when cmd1 is executed, the main thread puts cmd1 into a queue and returns immediately, without waiting for cmd1 to finish. The main thread can then continue with the next command, cmd2. When cmd1 has finished executing, it registers an event with the main thread, so that the remaining processing of cmd1 can continue there, such as sending the result to the client, writing the AOF, and replicating to the replicas.

Figure 1 Typical asynchronous processing model

Although the blocking mechanism looks attractive and powerful, it needs to be handled carefully:

• Although the command is executed asynchronously, it still has to be written to the AOF and replicated to the secondary database synchronously. If the command is written to the AOF and replicated in advance, it cannot be rolled back if its execution later fails.

• Because the secondary database is not allowed to execute blocking commands, the primary database needs to rewrite blocking commands into non-blocking commands before replicating them to the secondary database.

• During asynchronous execution, it is not enough to look only at the key name when opening a key. The original key may have been deleted before the asynchronous thread runs, and another key with the same name may have been created. In that case, the current key is no longer the original key.

• Decide whether blocking commands support transactions and Lua scripts.

• If a thread pool is used, make sure that operations on the same key are executed in order within the thread pool (that is, the processing of the same key must not be reordered).
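One common way to satisfy the last point is to route every task for a given key to the same FIFO queue, so that a single worker always handles that key. The sketch below shows only this routing idea; KeyedTaskQueues is a hypothetical class, not part of Redis or Tair, and the queues are drained inline rather than by real worker threads so that the example stays deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Routes every task for a given key to the same FIFO queue. In a real module
// each queue would be drained by its own worker thread (with locking added).
class KeyedTaskQueues {
public:
    using Task = std::function<void()>;

    explicit KeyedTaskQueues(std::size_t workers) : queues_(workers) {
        assert(workers > 0);
    }

    // Same key -> same hash -> same queue, so per-key order is preserved.
    // Returns the index of the queue that received the task.
    std::size_t submit(const std::string& key, Task task) {
        std::size_t idx = std::hash<std::string>{}(key) % queues_.size();
        queues_[idx].push_back(std::move(task));
        return idx;
    }

    // Stand-in for one worker thread's loop: run its queue in FIFO order.
    void drain(std::size_t idx) {
        auto& q = queues_[idx];
        while (!q.empty()) {
            Task t = std::move(q.front());
            q.pop_front();
            t();
        }
    }

    std::size_t workerCount() const { return queues_.size(); }

private:
    std::vector<std::deque<Task>> queues_;
};
```

Tasks for different keys may still run in any relative order, which is usually acceptable; only tasks that share a key are serialized.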

Avoid Symbol Conflicts with Other Modules

Because Redis can load multiple modules at the same time, and these modules may come from different teams and individuals, there is a real chance that different modules define functions with the same name. To avoid undefined behavior caused by symbol conflicts, it is recommended that each module hide all symbols except the OnLoad and OnUnload functions by passing the appropriate flags to the compiler. With GCC, for example, the module can be compiled with -fvisibility=hidden, with the entry functions marked __attribute__((visibility("default"))) as in the entry function sample above.

Beware of the Fork trap

Handling Commands in the Inflight State

Suppose that the module uses an asynchronous execution model (see the block section above). When Redis performs an AOF rewrite or BGSAVE and some commands are still inflight at the moment Redis forks the child process, the newly generated base AOF or RDB may not contain the inflight data. This may not seem like a big problem, because an inflight command is also written to the incremental AOF when it finally completes. However, to stay compatible with the original behavior of Redis (no command is inflight at fork time, so the dataset is in a quiescent state), it is better for the module to ensure that all inflight commands have finished executing before forking.

In the module, we can use the RedisModuleEvent_ForkChild event exposed by Redis to have a callback of ours executed before forking.

RedisModule_SubscribeToServerEvent(ctx, RedisModuleEvent_ForkChild, waitAllInflightTaskFinish);

For example, wait for the queue to be empty in waitAllInflightTaskFinish (that is, all tasks are executed):

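static void waitAllInflightTaskFinish(RedisModuleCtx *ctx, RedisModuleEvent eid, uint64_t subevent, void *data) {
    while (!thread_pool->idle())
        ; // Spin until the thread pool has finished all queued tasks.
}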

Alternatively, the same effect can be achieved by directly using pthread_atfork, which glibc exposes.

int pthread_atfork(void (*prepare)(void), void (*parent)(void), void (*child)(void));

Avoid Deadlocks

It is important to be aware that a child process created by forking is almost identical to the parent process, but not exactly the same. The child process receives a separate copy of the parent process's user-level virtual address space, including the text, data, bss segment, heap, and user stack. It also receives the same copy of any open file descriptors as the parent process, which means that it can read and write any open file in the parent process. The main difference between the parent and child processes is that they have different process IDs (PIDs).

However, in Linux, when forking, only the current thread is replicated to the child process. The fork(2) - Linux Man Page provides the following related explanation:

The child process is created with a single thread--the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.

In other words, apart from the thread that calls fork, all other threads "evaporate" in the child process. Therefore, if some asynchronous threads hold locks on certain resources, a deadlock can occur in the child process because these threads disappear.

The solution is the same as for handling inflight commands: make sure that all locks are released before forking. (In fact, once all inflight commands have finished executing, the locks have generally been released.)
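For locks that the module owns, pthread_atfork offers one way to guarantee a consistent state across fork. The sketch below is a minimal POSIX example under stated assumptions: g_lock is a hypothetical module-level mutex, and the helper names are invented for illustration. The prepare handler takes the lock just before fork (so no other thread can be holding it at that moment), and both the parent and the child release it right after.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical module-level lock that background threads may hold.
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;

// Runs in the forking thread before fork(): wait for and take the lock.
static void prepareFork() { pthread_mutex_lock(&g_lock); }

// Runs in the parent after fork(): release the lock again.
static void parentAfterFork() { pthread_mutex_unlock(&g_lock); }

// Runs in the child after fork(): the child's only thread is a copy of the
// thread that took the lock, so it can release it.
static void childAfterFork() { pthread_mutex_unlock(&g_lock); }

static int installForkHandlers() {
    return pthread_atfork(prepareFork, parentAfterFork, childAfterFork);
}

// Forks and reports whether the child was able to acquire g_lock.
static bool childCanTakeLock() {
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {
        int rc = pthread_mutex_trylock(&g_lock);
        _exit(rc == 0 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Because the prepare handler blocks until the lock is free, it also acts as a barrier for whatever work is protected by that lock, which fits the "finish inflight work before forking" advice above.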

Ensure Idempotence of AOF Replicated to Secondary Database

The primary purpose of primary/secondary replication in Redis is to ensure consistency. Therefore, the only task of the secondary database is to unconditionally receive the replicated content from the primary database and maintain strict consistency. However, some special commands need to be handled carefully.

For example, TairString, a data structure provided by Tair, supports setting a version number for data. We can write the following command:

EXSET key value VER 10

Then, after the primary database executes this command, it is best to rewrite the command as follows when the command is replicated to the secondary database:

EXSET key value ABS 11

That is, the absolute version number is used to force the secondary database to be the same as the primary database. There are many similar cases, such as scenarios relating to time and floating-point calculation.
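The rewrite step can be sketched as a pure function over the argument vector. This is an illustration only: rewriteForReplica is a hypothetical helper, the argument layout (command, key, value, then options) and the uppercase option names are assumptions based on the example above, and a real module would pass the resulting arguments to RedisModule_Replicate rather than return strings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Builds the command to replicate: any VER/ABS option (and its argument) is
// dropped and replaced by "ABS <finalVersion>", the version the primary
// actually ended up with. Other options are copied unchanged.
static std::vector<std::string> rewriteForReplica(
        const std::vector<std::string>& argv, std::uint64_t finalVersion) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < argv.size(); ++i) {
        // Options start after <command> <key> <value>.
        bool isVersionOpt = i >= 3 && (argv[i] == "VER" || argv[i] == "ABS");
        if (isVersionOpt && i + 1 < argv.size()) {
            ++i;  // skip the option's argument as well
            continue;
        }
        out.push_back(argv[i]);
    }
    out.push_back("ABS");
    out.push_back(std::to_string(finalVersion));
    return out;
}
```

Applying the function to its own output leaves the command unchanged, so a command that has already been rewritten is not rewritten into something different a second time.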

Support graceful shutdown

The module may start asynchronous threads or manage asynchronous resources. These resources need to be handled (for example, stopped, destructed, or flushed to disk) when Redis shuts down. Otherwise, a core dump may occur when Redis exits.

In Redis, you can subscribe to the RedisModuleEvent_Shutdown event. When Redis shuts down, it calls back the ShutdownCallback that we passed:

RedisModule_SubscribeToServerEvent(ctx, RedisModuleEvent_Shutdown, ShutdownCallback);

In newer Redis versions, the module can also achieve a similar effect by exposing the unload function.

Avoid oversized AOF

• Implement AOF rewriting (compaction) for the module's data types. For example, all write operations on a hash can be rewritten as one or more hmset commands.

• Make sure that a single rewritten AOF command does not exceed 500 MB. If it would, split it into multiple commands, and decide whether these commands need to be executed as a transaction (that is, whether their execution must be isolated).

• For modules with complex structures that cannot simply be rewritten with existing commands, a separate internal command, such as xxxload/xxxdump, can be implemented to serialize and deserialize the module's data structure. This command is not exposed to clients.

• If the RedisModule_EmitAOF call passes array parameters (that is, parameters passed with the 'v' format specifier), the array length must be passed as a size_t. Otherwise, strange errors may occur.
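The splitting described in the first two points can be sketched as follows. The function emitHashAsCommands and the per-command field limit are assumptions made for this example; a real module would emit each command through RedisModule_EmitAOF in its aof_rewrite callback and would more likely bound the byte size than the field count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Field = std::pair<std::string, std::string>;  // field name, value
using Command = std::vector<std::string>;           // command name + arguments

// Splits one hash into several HMSET commands, each carrying at most
// maxFieldsPerCommand field/value pairs (an assumed tuning knob).
static std::vector<Command> emitHashAsCommands(
        const std::string& key, const std::vector<Field>& fields,
        std::size_t maxFieldsPerCommand) {
    assert(maxFieldsPerCommand > 0);
    std::vector<Command> commands;
    for (std::size_t i = 0; i < fields.size(); i += maxFieldsPerCommand) {
        Command cmd{"HMSET", key};
        std::size_t end = std::min(fields.size(), i + maxFieldsPerCommand);
        for (std::size_t j = i; j < end; ++j) {
            cmd.push_back(fields[j].first);
            cmd.push_back(fields[j].second);
        }
        commands.push_back(std::move(cmd));
    }
    return commands;
}
```

With five fields and a limit of two, the hash is emitted as three commands; replaying them in order rebuilds the same hash.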


Keep RDB encoding backward compatible

RDB data is serialized and deserialized in binary format, so it is relatively simple. However, if the serialization format of the data structure may change in the future, it is best to record an encoding version so that compatibility can be guaranteed during upgrades. The code is as follows:

void *xxx_RdbLoad(RedisModuleIO *rdb, int encver) {
  if (encver == version1 ) {
    /* version1 format */
  } else if (encver == version2 ){
    /* version2 format */
  }
  /* ... return the loaded value, or NULL if encver is not supported ... */
}
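A self-contained sketch of the same idea is shown below. It is not Redis code: Counter is a hypothetical data type, and a std::vector of integers stands in for the values that a real module would read with the RedisModule_Load* functions. Version 1 stored only the value; version 2 adds an absolute expiration time, and the loader fills in a default when it reads the older format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical module data type.
struct Counter {
    std::int64_t value = 0;
    std::int64_t expireAtMs = -1;  // -1 means "no expiration"
};

constexpr int kEncVer1 = 1;  // layout: value
constexpr int kEncVer2 = 2;  // layout: value, expireAtMs

// Always writes the newest layout (kEncVer2).
static std::vector<std::int64_t> saveCounter(const Counter& c) {
    return {c.value, c.expireAtMs};
}

// Reads any known layout; returns false for unknown versions or bad input.
static bool loadCounter(int encver, const std::vector<std::int64_t>& in,
                        Counter& out) {
    if (encver == kEncVer1) {
        if (in.size() != 1) return false;
        out.value = in[0];
        out.expireAtMs = -1;  // field did not exist in version 1
        return true;
    }
    if (encver == kEncVer2) {
        if (in.size() != 2) return false;
        out.value = in[0];
        out.expireAtMs = in[1];
        return true;
    }
    return false;  // written by a newer version that this code does not know
}
```

Rejecting unknown versions explicitly makes a downgrade fail loudly instead of silently misreading data.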

Suggestions for Command Implementations

Parameter verification: Verify the validity of parameters (such as the correct number and type of parameters) before executing the command, and avoid modifying the keyspace before it is certain that the command will succeed (for example, by calling RedisModule_ModuleTypeSetValue too early).

Error message: The error message returned should be simple and clear, stating the type of error.

Consistent response types: The return types of commands should be consistent in different cases, such as when key does not exist, key type error, successful execution, and some parameter errors. In general, all cases should return the same type except for error types, such as a simple string or an array (even if it is an empty array). This makes it easier for the client to parse the command return value.

Check the read/write type: The command must strictly distinguish between read and write types because it determines whether the command can be executed on a replica and whether the command needs to be synchronized and written to AOF.

Replication idempotence and AOF: For write commands, use RedisModule_ReplicateVerbatim or RedisModule_Replicate to perform primary/secondary replication and write to AOF (rewrite the original command if necessary). Multi/exec is automatically added before and after the AOF generated by RedisModule_Replicate (to ensure that the commands generated in the module are isolated). Therefore, it is recommended to prefer RedisModule_ReplicateVerbatim for replication and writing to AOF. However, if the command contains parameters such as version numbers, use RedisModule_Replicate to rewrite the version number as an absolute version number and the expiration time as an absolute expiration time. Additionally, if a command needs to be rewritten with RedisModule_Replicate, ensure that the rewritten command will not be rewritten again.

Reuse the argv parameter: The argv passed to the command has the type RedisModuleString **, and these RedisModuleString objects are automatically freed after the command returns. Therefore, the command should not keep references to them beyond its own execution. If that is necessary (for example, to avoid a memory copy), RedisModule_RetainString or RedisModule_HoldString can be used to increase the reference count of the RedisModuleString, but remember to free it manually later.

The way to open a key: When using RedisModule_OpenKey to open a key, strictly distinguish between the two types of opening: REDISMODULE_READ and REDISMODULE_WRITE. Failure to distinguish between them will affect the update of internal stat_keyspace_misses and stat_keyspace_hits information, as well as expired rewriting. At the same time, a key opened using the REDISMODULE_READ method cannot be deleted, otherwise an error will be reported.

The way to process different key types: Currently, only the set command of string can forcibly overwrite keys of other types. Other commands should return the error "WRONGTYPE Operation against a key holding the wrong kind of value" when the key exists but the type does not match.

Cluster support for multi-key commands: For multi-key commands, the values of firstkey, lastkey, and keystep must be set correctly. Only when these values are correct can Redis check in cluster mode whether the keys span multiple slots (the CROSSSLOT error).

Global index and structure: If the module maintains its own global index, check whether the index contains the dbid, key name, and similar information. Redis commands such as move, rename, and swapdb can implicitly change the name of a key or exchange two dbids. If the index is not updated at the same time, unexpected errors may occur.

Determine the action according to the role: The Redis instance that loads the module may be a primary database or a secondary database. The module can use RedisModule_GetContextFlags to determine the current role and take different actions accordingly (such as whether to expire keys actively).


Tair currently supports a large number of extended data structures (Tair (Enhanced Redis) for Redis 6.x uses the built-in method), which covers various application scenarios (please see the official document for details), including compact and pragmatic data structures such as TairString and TairHash (available in open source), as well as more complex and powerful computing data structures such as Tair Search and Vector. These data structures fully meet the needs of various business scenarios in the context of AIGC. Feel free to use them.
