ElevenLabs Agents
A complete working example is available on GitHub: elevenlabs-zep-example
ElevenLabs Agents is a platform for building intelligent voice agents. This guide shows how to integrate Zep with ElevenLabs using a custom LLM proxy.
Why use a proxy instead of tools
ElevenLabs supports custom tools, but using tools for context retrieval creates two problems:
- Latency — Tool calls add round-trips where the LLM decides whether to call the tool. For voice agents, this delay is noticeable.
- Reliability — The LLM may skip retrieval when it shouldn’t, or call it when it doesn’t need to.
A proxy solves both problems. Context retrieval happens transparently on every request, without LLM involvement.
Architecture
The proxy sits between ElevenLabs and your LLM. On every request it:
- Adds the user message to Zep and retrieves context in one call
- Injects context into the system prompt
- Forwards to the LLM and streams the response back
- Persists the assistant response to Zep
Implementation
The proxy endpoint
The proxy exposes an OpenAI-compatible /v1/chat/completions endpoint:
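A minimal sketch of that handler is below, simplified to a non-streaming response. The Zep and LLM clients are modeled as plain interfaces so the request flow is visible without SDK details; method names like addAndGetContext and persist are placeholders for illustration, not the real zep-cloud API, and the web framework is omitted.

```typescript
interface ChatMessage { role: "system" | "user" | "assistant"; content: string }

// Assumed shapes standing in for the real Zep and LLM clients.
interface ZepLike {
  // Add the latest user message and get the retrieved context in one call.
  addAndGetContext(userId: string, message: ChatMessage): Promise<string>;
  persist(userId: string, message: ChatMessage): Promise<void>;
}

interface LlmLike {
  complete(messages: ChatMessage[]): Promise<string>;
}

// Inject Zep's retrieved context into the system prompt, so the LLM sees
// the user's memory on every request without ever deciding to call a tool.
export function injectContext(messages: ChatMessage[], context: string): ChatMessage[] {
  const [system, ...rest] = messages;
  return [
    { role: "system", content: `${system.content}\n\n<memory>\n${context}\n</memory>` },
    ...rest,
  ];
}

// The body of the /v1/chat/completions handler, with dependencies injected.
export async function handleChatCompletion(
  zep: ZepLike,
  llm: LlmLike,
  userId: string, // passed by the frontend via customLlmExtraBody
  messages: ChatMessage[],
): Promise<string> {
  const userMessage = messages[messages.length - 1];
  // 1. Add the user message to Zep and retrieve context in one call.
  const context = await zep.addAndGetContext(userId, userMessage);
  // 2. Inject context into the system prompt; 3. forward to the LLM.
  const reply = await llm.complete(injectContext(messages, context));
  // 4. Persist the assistant response to Zep.
  await zep.persist(userId, { role: "assistant", content: reply });
  return reply;
}
```

In production you would stream the LLM response back to ElevenLabs chunk by chunk and persist the assistant message after the stream completes, rather than awaiting a full completion.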
The key optimization is return_context=True, which retrieves context in the same call as adding the message.
Frontend integration
Your frontend passes user identity via customLlmExtraBody:
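A sketch of what that looks like, assuming the ElevenLabs JS SDK's Conversation.startSession option shape. The user_id key is an assumption for illustration; it just needs to match whatever field your proxy reads from the request body, and it should be your auth system's stable user ID.

```typescript
// Everything in customLlmExtraBody is merged into the body of each request
// the agent sends to your custom LLM endpoint (the proxy).
export function buildSessionOptions(agentId: string, userId: string) {
  return {
    agentId,
    customLlmExtraBody: { user_id: userId },
  };
}

// Usage with the SDK (assumed API, shown for orientation only):
// import { Conversation } from "@elevenlabs/client";
// const conversation = await Conversation.startSession(
//   buildSessionOptions("your-agent-id", currentUser.id),
// );
```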
ElevenLabs configuration
- In your agent’s LLM section, select Custom LLM and set the server URL to your proxy
- Add an Authorization header for authentication
- In Security > Overrides, enable Custom LLM extra body (required for the proxy to receive user identity)
Production considerations
- User identity — Use your auth system’s user ID, not random IDs
- User metadata — Create users in Zep during registration with first_name, last_name, and email for better personalization
- Cache warming — Call zep.user.warm(user_id) when users arrive on your page to pre-fetch their data
- Proxy location — Embed the endpoint in your existing backend for direct access to user data, or deploy as a standalone service
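The registration and page-load hooks described above can be sketched as follows. The client is modeled as an interface so the block stands alone; the field names and the user.add / user.warm method shapes are assumptions based on this guide's mentions, so check them against the actual zep-cloud SDK.

```typescript
interface ZepUserApi {
  add(u: { userId: string; firstName?: string; lastName?: string; email?: string }): Promise<void>;
  warm(userId: string): Promise<void>;
}

// On registration: create the Zep user with metadata so the agent can
// personalize from the first conversation.
export async function onUserRegistered(
  zep: { user: ZepUserApi },
  user: { id: string; firstName: string; lastName: string; email: string },
): Promise<void> {
  await zep.user.add({
    userId: user.id, // your auth system's stable ID, not a random one
    firstName: user.firstName,
    lastName: user.lastName,
    email: user.email,
  });
}

// On page load: pre-fetch the user's data so the first proxy call is fast.
export async function onPageLoad(zep: { user: ZepUserApi }, userId: string): Promise<void> {
  await zep.user.warm(userId);
}
```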