Keywords AI will catch any errors occurring in a request to model and fall back to the list of models you specified in the fallback_models field. This is useful to avoid downtime and ensure that your application is always available.

See all Keywords AI params here.

Set up fallbacks on Keywords AI platform

Go to Settings -> Fallback -> Click on Add fallback models -> Select the models you want to add as fallbacks.

Set up fallbacks in code

{
  // ...other parameters...
  "fallback_models": [
    "gemini/gemini-pro",
    "mistral/mistral-small"
  ]
}
There are no restrictions on the number of fallback models you can add. The order of the models in the list is the order in which they will be tried.

Automatic fallbacks

We use Saftey net to automatically fallback to a system assigned model if fallback models are not responding or specified.

Turn on the Safety net here.

Exceptions for triggering fallbacks

  • You turned off the safety net and did not specify your own fallback models
  • You uploaded some invalid customer_credentials

Fallback logic with retries

Below is the pseudo-code for the fallback logic with retries:

warnings = {}
model = "model_in_request_that_will_fail"
keywords_native_failover = keywords_ai_models_data[model]["fallback"] # A slice of data from our database for the native failover
# Native failovers usually are the same models, but just from different providers, they are optimally ranked to reduce cost and latency
fallback_models = [*keywords_native_failover, "model1", "model2"]

# Keywords AI will filter away models that do not meet the hard requirements of the request (function_calling support, etc.)
# To save time from unnecessary retries
fallback_models = filter_by_hard_requirements(request, fallback_models)

for model in fallback_models:
  for _ in range(retries):
    try:
        response = generate_response_with_load_balance(model)
        break
    except Exception as e:
        error_code, error_message = keywords_ai_exception_handler(e)
        warnings[model] = error_message
raise Exception("All models failed", warnings)
  • To learn more about how generate_response_with_load_balance works, see the Load Balancing section.
  • To learn more about how filter_by_hard_requirements works, see the Model Filtering section.