Keywords AI retries a failed request when the failure is caused by a rate limit from the upstream provider. The pseudo-code below shows the retry logic:

def respond_with_retries(model):  # model: the user-requested model
    # Load the models into compatible_models_format
    model_params = keywords_models_data[model]  # A slice from our database
    # Exponential backoff retry logic
    for i in range(fallback_retries):
        try:
            return keywords_response_with_load_balance(model)
        except RateLimitError:
            if model_params["fallback_models"]:
                # If there are fallbacks, try them immediately without waiting
                for fallback_model in model_params["fallback_models"]:
                    try:
                        return keywords_response_with_load_balance(fallback_model)
                    except RateLimitError:
                        continue  # This fallback is rate limited too; try the next one
            # No fallback succeeded: wait 1s, 2s, 4s, ... before the next attempt
            sleep(2 ** i)
        # Any other exception is not retried and propagates to the caller
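
The pseudo-code above depends on internal helpers, so it cannot be run as written. The following is a minimal, self-contained sketch of the same pattern for experimenting locally. It is not Keywords AI's actual implementation: `call_with_retries`, `fake_provider`, the stand-in `RateLimitError` class, and the model names are all hypothetical, and the sketch assumes that each fallback is tried in order and that the error is re-raised once every attempt is exhausted.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the upstream provider's rate limit error."""


def call_with_retries(call_model, model, fallback_models=(), max_retries=3,
                      sleep=time.sleep):
    """Call `model`; on a rate limit, try fallbacks at once, then back off."""
    for attempt in range(max_retries):
        try:
            return call_model(model)
        except RateLimitError:
            # Fallbacks are tried immediately, without waiting.
            for fallback in fallback_models:
                try:
                    return call_model(fallback)
                except RateLimitError:
                    continue  # this fallback is rate limited too
            # Nothing succeeded: wait 1s, 2s, 4s, ... before the next attempt.
            if attempt < max_retries - 1:
                sleep(2 ** attempt)
    raise RateLimitError(f"{model} and its fallbacks are still rate limited")


def fake_provider(name):
    """Hypothetical provider: only "backup-model" has spare capacity."""
    if name != "backup-model":
        raise RateLimitError(name)
    return f"response from {name}"


print(call_with_retries(fake_provider, "primary-model", ["backup-model"]))
# prints "response from backup-model"
```

Passing `sleep` as a parameter is a design choice for testability: a test can substitute a function that records the requested delays instead of actually waiting.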