This feature is only available for AI gateway users.

The user-level rate limit helps you control the LLM usage of each user. You can set a rate limit of requests per minute for each user and we will block the API calls that exceed the limit.

Why user-level rate limit?

  • Prevent random users from abusing your system
  • Control the cost of your LLM usage

How to set user-level rate limit

"customer_params": {
        "customer_identifier": "xxxx", // The user you want to set the rate limit for
        "rate_limit": 100 // The rate limit of the user, requests per minute
    },

Detailed example

import requests
def demo_call(input, 
              model="gpt-4o-mini",
              token="YOUR_KEYWORDS_AI_API_KEY"
              ):
    headers = {
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {token}',
    }

    data = {
        'model': model,
        'messages': [{'role': 'user', 'content': input}],
        "customer_params": {"customer_identifier": "xxxx", "rate_limit": 100}
    }

    response = requests.post('https://api.keywordsai.co/api/chat/completions', headers=headers, json=data)
    return response

messages = "Say 'Hello World'"
print(demo_call(messages).json())