# LLM Providers
mnesis uses litellm internally, which means it works with any provider litellm supports — Anthropic, OpenAI, Google Gemini, OpenRouter, Azure, and more. You never interact with litellm directly; just pass a model string and set the corresponding API key.
## Model string format
| Provider | Example model string | API key env var |
|---|---|---|
| Anthropic | "anthropic/claude-opus-4-6" | ANTHROPIC_API_KEY |
| OpenAI | "openai/gpt-4o" | OPENAI_API_KEY |
| Google Gemini | "gemini/gemini-1.5-pro" | GEMINI_API_KEY |
| OpenRouter | "openrouter/anthropic/claude-opus-4-6" | OPENROUTER_API_KEY |
| Azure OpenAI | "azure/<your-deployment>" | AZURE_API_KEY + AZURE_API_BASE |
The provider prefix is optional for well-known model names (e.g. "gpt-4o" resolves to OpenAI automatically), but the full form is recommended for clarity.
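As a quick sanity check, here is a minimal sketch of starting a session against one provider. It reuses the `MnesisSession.create` and `session.send` calls shown in the examples below; the API key value is a placeholder:

```python
import os

from mnesis import MnesisSession

# Assumes your OpenAI key is exported as OPENAI_API_KEY (placeholder shown here).
os.environ.setdefault("OPENAI_API_KEY", "sk-...")

# The provider prefix makes the routing explicit.
session = await MnesisSession.create(model="openai/gpt-4o")
result = await session.send("Hello!")
```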
## Examples

### Using a cheaper model for compaction
Compaction (context summarisation) defaults to the session model. You can point it at a cheaper, faster model to save cost:
```python
from mnesis import MnesisSession, MnesisConfig, CompactionConfig

config = MnesisConfig(
    compaction=CompactionConfig(
        compaction_model="openai/gpt-4o-mini",
    )
)
session = await MnesisSession.create(model="openai/gpt-4o", config=config)
```
### Passing extra litellm parameters
For provider-specific options (custom api_base, extra headers for OpenRouter, etc.), configure litellm globally before creating a session:
```python
import litellm

litellm.api_base = "https://my-proxy.example.com"
litellm.headers = {"X-Custom-Header": "value"}
```
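Note that these are process-global litellm settings: they affect every litellm call made from the current process, not just calls issued by a single mnesis session.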
See the litellm provider docs for the full list of supported providers and their options.
## Custom or fine-tuned models
For models that litellm does not recognise — including fine-tuned variants,
private deployments, or newly released models with unknown context limits —
use MnesisConfig.model_overrides to supply the correct limits:
```python
from mnesis import MnesisSession, MnesisConfig

config = MnesisConfig(
    model_overrides={
        "context_limit": 128_000,
        "max_output_tokens": 16_384,
    }
)
session = await MnesisSession.create(model="openai/acme-support-ft-v1", config=config)
result = await session.send("Hello!")
```
Both context_limit and max_output_tokens affect the compaction budget, so
setting them accurately ensures Mnesis neither over-compacts nor sends an
over-limit context to the provider. See Configuration — model_overrides
for the full field reference.
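As a rough illustration of why both limits matter (this is not Mnesis's exact internal formula), the token budget left for the prompt and conversation history is bounded by the context window minus the tokens reserved for output:

```python
# Illustrative arithmetic only; Mnesis computes its own compaction budget internally.
context_limit = 128_000      # total context window, in tokens
max_output_tokens = 16_384   # tokens reserved for the model's response
input_budget = context_limit - max_output_tokens
print(input_budget)  # 111616 tokens available for prompt + history
```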
## Prefer your own SDK?
If you want to use the Anthropic, OpenAI, or another SDK directly for LLM calls, see BYO-LLM — mnesis can act as a pure memory/compaction layer without routing calls through litellm at all.