Bring Your Own LLM (BYOM)
Enterprise Add-on. Contact sales to enable BYOM for your account.
Overview
Bring Your Own LLM (BYOM) lets you use your own accounts with model providers such as OpenAI, Anthropic, and Google when using Zep Cloud. You keep using Zep’s orchestration, context, and security controls while routing inference through credentials you manage. This approach ensures:
- Contract continuity: Apply your negotiated pricing, quotas, and compliance commitments with each LLM vendor.
- Data governance: Enforce provider-specific policies for data usage, retention, and residency.
- Operational flexibility: Configure the best vendor or model for each project, including fallbacks for high availability.
Model recommendations
Zep intentionally disables thinking and reasoning budgets, or keeps them low, to minimize cost and latency. We recommend smaller, faster models optimized for speed rather than extended reasoning.
Recommended model: Gemini 2.5 Flash Lite is the most thoroughly tested model with Zep.
Not all larger models support disabling reasoning entirely. If you configure a model that requires reasoning tokens, you may experience higher costs and latency. Smaller models avoid this issue.
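To illustrate what a disabled thinking budget looks like on a direct provider call (this is a sketch, not Zep's internal code), here is a minimal example assuming the google-genai Python SDK and a Gemini Developer API key:

```python
from google import genai
from google.genai import types

# Assumes a Gemini Developer API key; with BYOM, Zep uses the credentials you provide.
client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents="Summarize the last three user messages.",
    config=types.GenerateContentConfig(
        # thinking_budget=0 turns reasoning tokens off; models that cannot
        # disable thinking will still emit them, adding cost and latency.
        thinking_config=types.ThinkingConfig(thinking_budget=0),
    ),
)
print(response.text)
```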
Supported providers
Zep only uses text generation endpoints; it does not use embeddings, fine-tuning, file uploads, or assistants. LLM providers are configured at the project level, meaning each project within your account can use its own credentials.
Google Vertex AI recommended for production: For production workloads, use Google Vertex AI rather than Google Gemini (AI Studio). Vertex AI offers better control over rate limits, lets you increase quotas, and supports purchasing provisioned throughput if needed.
Getting started
Add a provider
Select a provider type from the dropdown and enter your credentials. For providers requiring JSON credentials (Vertex AI, Bedrock), paste the full JSON object.
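Before pasting a credential into the dashboard, it can help to confirm it works for plain text generation using the provider's own SDK. A minimal sketch, assuming the openai Python SDK and a hypothetical model choice:

```python
from openai import OpenAI

# Hypothetical sanity check; any inexpensive chat model on your account will do.
client = OpenAI(api_key="sk-...")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Reply with the single word: ok"}],
    max_tokens=5,
)
print(response.choices[0].message.content)
```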
Configure provider settings
Enter any provider-specific settings such as endpoint URLs, project IDs, or regions.
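For Vertex AI, the project ID and region you enter here are the same values a direct call would use. A minimal sketch, assuming the google-genai Python SDK, a service-account key file on disk, and hypothetical project and region values:

```python
import os
from google import genai

# Point application-default credentials at the same service-account JSON
# you pasted into the provider configuration (hypothetical path).
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account.json"

# project and location correspond to the project ID and region settings.
client = genai.Client(vertexai=True, project="my-gcp-project", location="us-central1")

response = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents="Reply with the single word: ok",
)
print(response.text)
```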
Select a model
Choose a model from the list of verified models for your provider. Mark it as primary or fallback.
Configuration options
When configuring a provider, you can set the following options:
- Provider type and credentials (an API key, or a full JSON credentials object for Vertex AI and Bedrock).
- Provider-specific settings such as endpoint URLs, project IDs, or regions.
- One or more verified models, each marked as primary or fallback.
FAQ
Does Zep store our provider keys in its databases? No. Credentials are stored in an encrypted secrets manager (AWS SSM Parameter Store). Values are decrypted in memory only when needed and are never written to Zep databases or logs.
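For illustration only (this is not Zep's internal code), a SecureString parameter in SSM Parameter Store is written and read roughly like this sketch, assuming boto3 and a hypothetical parameter name:

```python
import boto3

ssm = boto3.client("ssm")

# Stored encrypted at rest; SecureString values are encrypted with a KMS key.
ssm.put_parameter(
    Name="/byom/openai/api-key",   # hypothetical name
    Value="sk-...",
    Type="SecureString",
    Overwrite=True,
)

# Decrypted only at read time; the plaintext lives in memory, not in a database or log.
parameter = ssm.get_parameter(Name="/byom/openai/api-key", WithDecryption=True)
api_key = parameter["Parameter"]["Value"]
```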
Can we use different vendors or models per project? Yes. Each project maintains its own provider configuration, including defaults and fallbacks. This is useful for isolating production from staging or testing providers side by side.
Can we prevent vendors from training on our data? Yes. Use the vendor endpoints and contractual controls that disable data retention or training. Zep routes requests accordingly and sets the necessary flags in each call.
How is usage billed? You receive invoices from Zep for Zep services only. LLM inference charges come directly from your vendors under your existing contract and pricing.
What happens if a key is compromised or needs rotation? Add a new credential in the dashboard and verify it. Then disable or delete the previous credential. Requests start using the new credential immediately with no downtime required.
How does BYOM affect observability? Requests are tagged by project and provider, so you can attribute usage and costs. Rate limits are applied per provider to protect budgets and enforce quotas.
Can we use a customer-managed KMS key? Contact support if you require customer-controlled encryption for credential storage.