Prompt caching lets you cache a prompt's processed prefix so that later requests with the same prefix can reuse it, cutting latency and input-token costs. You can enable the `prompt_caching` feature through the LLM proxy, or log which LLM requests were served from the cache for better observability.
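As a minimal sketch of what this looks like, the example below calls the Anthropic Messages API directly and marks a large system prompt as cacheable; the model name, file path, and user message are placeholders, and a proxy deployment would send the same request shape to the proxy's endpoint instead.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder for the large, stable context you want cached. The prefix must
# meet the model's minimum cacheable length (e.g. ~1024 tokens on Sonnet).
long_reference_document = open("reference.txt").read()

# Mark the stable prefix as cacheable: the first request writes the cache,
# and later requests with an identical prefix read from it.
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": long_reference_document,
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Summarize the document."}],
)

# The usage block reports cache activity; this is what you would log
# for observability.
print(response.usage.cache_creation_input_tokens)  # tokens written to the cache
print(response.usage.cache_read_input_tokens)      # tokens served from the cache
```

The two usage counters map directly onto the Cache Writes and Cache Hits columns in the pricing table below.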
| Model | Base Input Tokens | Cache Writes | Cache Hits | Output Tokens |
|---|---|---|---|---|
| Claude 3.5 Sonnet | $3 / MTok | $3.75 / MTok | $0.30 / MTok | $15 / MTok |
| Claude 3.5 Haiku | $1 / MTok | $1.25 / MTok | $0.10 / MTok | $5 / MTok |
| Claude 3 Haiku | $0.25 / MTok | $0.30 / MTok | $0.03 / MTok | $1.25 / MTok |
| Claude 3 Opus | $15 / MTok | $18.75 / MTok | $1.50 / MTok | $75 / MTok |
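To make the rates concrete, here is a small worked example using the Claude 3.5 Sonnet row; the 10,000-token prefix size is made up for illustration.

```python
MTOK = 1_000_000
prompt_tokens = 10_000  # hypothetical cached prefix size

# Claude 3.5 Sonnet rates from the table above, in dollars per token.
base_rate = 3.00 / MTOK
write_rate = 3.75 / MTOK
hit_rate = 0.30 / MTOK

first_request = prompt_tokens * write_rate  # $0.0375 (25% premium over base)
cache_hit = prompt_tokens * hit_rate        # $0.0030 (90% cheaper than base)
uncached = prompt_tokens * base_rate        # $0.0300

print(f"first request (cache write): ${first_request:.4f}")
print(f"subsequent hit:              ${cache_hit:.4f}")
print(f"uncached baseline:           ${uncached:.4f}")
```

In short, you pay a one-time premium to write the cache, and every subsequent hit on that prefix costs a tenth of the base input rate.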