This topic describes the ai-cache plug-in. This plug-in is used to cache large language model (LLM)-based results. The default configuration of the plug-in allows you to cache the results by using the protocol over which API requests are made. Both streaming and non-streaming response results can be cached.
Running attributes
Plug-in execution stage: authentication stage. Plug-in execution priority: 10.
Configuration description
Name | Data type | Required | Default value | Description |
cacheKeyFrom.requestBody | string | No | "messages.@reverse.0.content" | The string that is extracted from the request body based on the GJSON PATH syntax. |
cacheValueFrom.responseBody | string | No | "choices.0.message.content" | The string that is extracted from the response body based on the GJSON PATH syntax. |
cacheStreamValueFrom.responseBody | string | No | "choices.0.delta.content" | The string that is extracted from the streaming response body based on the GJSON PATH syntax. |
cacheKeyPrefix | string | No | "higress-ai-cache" | The prefix of the Redis cache key. |
cacheTTL | integer | No | 0 | The expiration time of the cache. Unit: seconds. The default value is 0, which indicates that the cache never expires. |
redis.serviceName | string | Yes | - | The Redis service name, which is a fully qualified domain name (FQDN) with a specific service type, such as my-redis.dns or redis.my-ns.svc.cluster.local. |
redis.servicePort | integer | No | 6379 | The Redis service port. |
redis.timeout | integer | No | 1000 | The timeout period of the Redis request. Unit: milliseconds. |
redis.username | string | No | - | The username that is used to log on to the Redis instance. |
redis.password | string | No | - | The password that is used to log on to the Redis instance. |
returnResponseTemplate | string | No |
| The template of the HTTP response. %s is used to mark the part that needs to be replaced by the cache value. |
returnStreamResponseTemplate | string | No |
| The template of the HTTP streaming response. %s is used to mark the part that needs to be replaced by the cache value. |
Configuration example
redis:
serviceName: my-redis.dns
timeout: 2000Advanced usage
The current default cache key is extracted based on the GJSON PATH expression. For example, the expression
messages.@reverse.0.contentspecifies to obtain the content of the first element after the order of the elements in the messages array is reversed.GJSON PATH supports the condition syntax. For example, if you want to use the content whose last role is user as the key, you can use the expression
messages.@reverse.#(role=="user").content.If you want to combine all the contents whose role is user into an array as the key, you can use the expression
messages.@reverse.#(role=="user")#.content.The pipe syntax is also supported. For example, if you want to use the content whose second role is user as the key, you can use the expression
messages.@reverse.#(role=="user")#.content|1.For more syntax details, see the official documentation. You can use GJSON Playground for syntax testing.