For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
PlaygroundDiscordStatusDashboardSign Up >
DocumentationSDK ReferenceGraphiti
DocumentationSDK ReferenceGraphiti
      • LangGraph
      • Autogen
      • LiveKit
      • ElevenLabs
      • CrewAI
      • NVIDIA NeMo Agent Toolkit
      • Google ADK
LogoLogo
PlaygroundDiscordStatusDashboardSign Up >
On this page
  • Why use a proxy instead of tools
  • Architecture
  • Implementation
  • The proxy endpoint
  • Frontend integration
  • ElevenLabs configuration
  • Production considerations
  • Learn more
Ecosystem

ElevenLabs Agents

Add persistent context to ElevenLabs voice agents using a custom LLM proxy.
Was this page helpful?
Previous

CrewAI integration

Add persistent context and knowledge graphs to CrewAI agents
Next
Built with

A complete working example is available on GitHub: elevenlabs-zep-example

ElevenLabs Agents is a platform for building intelligent voice agents. This guide shows how to integrate Zep with ElevenLabs using a custom LLM proxy.

Why use a proxy instead of tools

ElevenLabs supports custom tools, but using tools for context retrieval has problems:

  • Latency — Tool calls add round-trips where the LLM decides whether to call the tool. For voice agents, this delay is noticeable.
  • Unreliable — The LLM may skip retrieval when it shouldn’t, or call it unnecessarily.

A proxy solves both problems. Context retrieval happens transparently on every request, without LLM involvement.

Architecture

┌────────┐ ┌──────────┐ ┌─────────┐ ┌────────┐
│Frontend│ ◄───► │ElevenLabs│ ◄───► │LLM Proxy│ ◄───► │ OpenAI │
└────────┘ └──────────┘ └─────────┘ └────────┘
▲
│
▼
┌─────┐
│ Zep │
└─────┘

The proxy sits between ElevenLabs and your LLM. On every request it:

  1. Adds the user message to Zep and retrieves context in one call
  2. Injects context into the system prompt
  3. Forwards to the LLM and streams the response back
  4. Persists the assistant response to Zep

Implementation

The proxy endpoint

The proxy exposes an OpenAI-compatible /v1/chat/completions endpoint:

1@app.post("/v1/chat/completions")
2async def chat_completions(request: Request):
3 body = await request.json()
4
5 # ElevenLabs puts customLlmExtraBody in "elevenlabs_extra_body"
6 extra = body.get("elevenlabs_extra_body", {})
7 user_id = extra.get("user_id")
8 conversation_id = extra.get("conversation_id")
9
10 # Add user message to Zep and get context in one call
11 user_message = get_latest_user_message(body["messages"])
12 response = await zep.thread.add_messages(
13 thread_id=conversation_id,
14 messages=[Message(role="user", content=user_message)],
15 return_context=True # Returns context without separate call
16 )
17
18 # Inject context into system prompt
19 messages = inject_context(body["messages"], response.context)
20
21 # Stream response from LLM
22 return StreamingResponse(
23 stream_and_persist(messages, conversation_id)
24 )

The key optimization is return_context=True, which retrieves context in the same call as adding the message.

Frontend integration

Your frontend passes user identity via customLlmExtraBody:

1await conversation.startSession({
2 agentId: 'your-agent-id',
3 customLlmExtraBody: {
4 user_id: user.id,
5 conversation_id: crypto.randomUUID(),
6 },
7});

ElevenLabs configuration

  1. In your agent’s LLM section, select Custom LLM and set the server URL to your proxy
  2. Add an Authorization header for authentication
  3. In Security > Overrides, enable Custom LLM extra body (required for the proxy to receive user identity)

Production considerations

  • User identity — Use your auth system’s user ID, not random IDs
  • User metadata — Create users in Zep during registration with first_name, last_name, email for better personalization
  • Cache warming — Call zep.user.warm(user_id) when users arrive on your page to pre-fetch their data
  • Proxy location — Embed the endpoint in your existing backend for direct access to user data, or deploy as a standalone service

Learn more

  • ElevenLabs Custom LLM documentation
  • Zep context retrieval
  • Creating users in Zep