Reduce latency and save LLM costs by caching LLM prompts and responses.
Enable caching by setting `cache_enabled` to `true`. We currently cache the whole conversation, including the system message, the user message, and the response. In the example below, we cache the user message "Hi, how are you?" and its response.
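The sketch below illustrates this with a direct HTTP request. The endpoint URL, API key placeholder, and model name are assumptions for illustration; only the `cache_enabled` parameter comes from this page.

```python
import requests

# Minimal sketch: enable caching for a chat completion request.
# Endpoint URL, API key, and model are illustrative placeholders.
response = requests.post(
    "https://api.keywordsai.co/api/chat/completions",
    headers={"Authorization": "Bearer YOUR_KEYWORDSAI_API_KEY"},
    json={
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hi, how are you?"},
        ],
        # Cache the whole conversation: system message, user message,
        # and the generated response.
        "cache_enabled": True,
    },
)
print(response.json())
```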
You can scope the cache per customer with the `cache_by_customer` option, which can be set to `true` or `false`. If `cache_by_customer` is set to `true`, the cache is stored by the customer identifier.
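A sketch of a per-customer cached request follows; the `customer_identifier` field is an assumed name used only to show where the customer ID would be passed, and the endpoint, key, and model remain placeholders.

```python
import requests

# Sketch: cache_by_customer stores the cache per customer identifier.
# The customer_identifier field name is an assumption for illustration.
response = requests.post(
    "https://api.keywordsai.co/api/chat/completions",
    headers={"Authorization": "Bearer YOUR_KEYWORDSAI_API_KEY"},
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hi, how are you?"}],
        "cache_enabled": True,
        "cache_by_customer": True,              # store the cache per customer
        "customer_identifier": "customer_123",  # assumed field name
    },
)
print(response.json())
```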
When the cache is hit, the corresponding log shows `keywordsai/cache` as the model. You can also filter the logs by the `Cache hit` field.
To stop generating a new LLM log when the cache is hit, set the `omit_logs` parameter to `true`, or go to Caches in Settings.
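A sketch of the request-body version of this setting (same placeholder endpoint, key, and model as above):

```python
import requests

# Sketch: omit_logs suppresses the new LLM log that would otherwise
# be created on a cache hit. Endpoint, key, and model are placeholders.
response = requests.post(
    "https://api.keywordsai.co/api/chat/completions",
    headers={"Authorization": "Bearer YOUR_KEYWORDSAI_API_KEY"},
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hi, how are you?"}],
        "cache_enabled": True,
        "omit_logs": True,  # no new log when the cache is hit
    },
)
print(response.status_code)
```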