VectorStore Example

The Zep Python SDK ships with a ZepVectorStore class, which can be used with LangChain Expression Language (LCEL).

Let’s explore how to create a RAG chain using the ZepVectorStore for semantic search.

You can generate a Project API key in the Zep Dashboard.

Before diving into these examples, please ensure you’ve set the following environment variables:

ZEP_API_KEY - the API key for your Zep project

OPENAI_API_KEY - the OpenAI API key the chain needs to generate answers

You will need an existing collection to initialize the vector store in this example.

If you want to create a collection from a web article, you can run the Python ingest script. Try modifying the script to ingest an article of your choice.

Alternatively, you can create a collection by running either the Document example in the Python SDK repository or the Document example in the TypeScript SDK repository.

ZEP_API_KEY = os.environ.get("ZEP_API_KEY")
ZEP_COLLECTION_NAME = os.environ.get("ZEP_COLLECTION_NAME")  # the collection created above


Initialize the AsyncZep client with the necessary imports.

import os
from typing import List

from langchain.schema import format_document
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts.prompt import PromptTemplate
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.runnables import (
    ConfigurableField,
    RunnableParallel,
)
from langchain_core.runnables.utils import ConfigurableFieldSingleOption
from langchain_openai import ChatOpenAI

from zep_cloud.client import AsyncZep
from zep_cloud.langchain import ZepVectorStore

zep = AsyncZep(
    api_key=os.environ["ZEP_API_KEY"],
)

Initialize ZepVectorStore

vectorstore = ZepVectorStore(
    collection_name=ZEP_COLLECTION_NAME,
    api_key=os.environ["ZEP_API_KEY"],
)

Let’s set up the retriever. We’ll use the vector store for this and configure it to rerank search results with MMR (maximal marginal relevance).

retriever = vectorstore.as_retriever(search_type="mmr")

Create a prompt template for synthesizing answers.

template = """Answer the question based only on the following context:
    <context>
    {context}
    </context>"""
answer_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", template),
        ("user", "{question}"),
    ]
)
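At invoke time, the prompt template simply substitutes the chain’s inputs into the message templates. A dependency-free sketch of that substitution (plain Python, illustrative only; `render_messages` is a hypothetical helper, not a LangChain API):

```python
template = """Answer the question based only on the following context:
<context>
{context}
</context>"""

def render_messages(context: str, question: str) -> list:
    # Sketch of what ChatPromptTemplate.from_messages produces at invoke
    # time: a system message with the context filled in, plus the user turn.
    return [
        ("system", template.format(context=context)),
        ("user", question),
    ]

messages = render_messages("Zep is a memory platform.", "What is Zep?")
```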

Create the default document prompt and define the helper function for merging documents.

DEFAULT_DOCUMENT_PROMPT = PromptTemplate.from_template(template="{page_content}")

def _combine_documents(
    docs: List[Document],
    document_prompt: PromptTemplate = DEFAULT_DOCUMENT_PROMPT,
    document_separator: str = "\n\n",
) -> str:
    doc_strings = [format_document(doc, document_prompt) for doc in docs]
    return document_separator.join(doc_strings)
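Conceptually, this helper renders each retrieved document through the document prompt and joins the results into one context string. A dependency-free sketch of that behavior (plain Python; `Doc` is a hypothetical stand-in for LangChain’s Document):

```python
from dataclasses import dataclass

@dataclass
class Doc:
    # Hypothetical stand-in for langchain_core.documents.Document
    page_content: str

def combine_documents(docs, separator="\n\n"):
    # With the default "{page_content}" document prompt, formatting a
    # document yields its raw text; the helper then joins the pieces.
    return separator.join(d.page_content for d in docs)

combined = combine_documents([Doc("First chunk."), Doc("Second chunk.")])
```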

Let’s set up user input and the context retrieval chain.

# User input
class UserInput(BaseModel):
    question: str

inputs = RunnableParallel(
    {"question": lambda x: x["question"], "context": retriever | _combine_documents},
).with_types(input_type=UserInput)
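RunnableParallel runs every branch against the same input and returns a dict of each branch’s result. A plain-Python sketch of that fan-out (illustrative only, not the LangChain implementation):

```python
def run_parallel(branches, value):
    # Sketch of RunnableParallel semantics: every branch receives the same
    # input dict; the output maps each key to its branch's result.
    return {key: fn(value) for key, fn in branches.items()}

result = run_parallel(
    {
        "question": lambda x: x["question"],
        # stand-in for `retriever | _combine_documents`
        "context": lambda x: "retrieved context for: " + x["question"],
    },
    {"question": "What is Zep?"},
)
```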

Compose the final chain.

chain = inputs | answer_prompt | ChatOpenAI() | StrOutputParser()

Here’s a quick rundown of how the process works:

  1. inputs grabs the user’s question and fetches relevant document context to add to the prompt.
  2. answer_prompt then takes this context and question, combining them in the prompt with instructions to answer the question using only the provided context.
  3. ChatOpenAI calls an OpenAI model to generate an answer based on the prompt.
  4. Finally, StrOutputParser extracts the LLM’s result into a string.
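The `|` piping in the steps above works because each LCEL runnable supports composition. A minimal, dependency-free sketch of the idea (the `Step` class is hypothetical, not the real LangChain classes):

```python
class Step:
    # Minimal sketch of LCEL-style piping: `a | b` yields a step that
    # feeds a's output into b. Not the actual LangChain implementation.
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

# Stand-ins for the four stages of the chain above
inputs = Step(lambda x: {"question": x["question"], "context": "some context"})
prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
llm = Step(lambda p: "LLM answer for -> " + p.splitlines()[-1])
parser = Step(lambda s: s.strip())

chain = inputs | prompt | llm | parser
answer = chain.invoke({"question": "What is Zep?"})
```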

To invoke this chain manually, simply pass the question into the chain’s input.

chain.invoke(
    {"question": "-"},
)

Running the Chain with LangServe

You can run this chain, along with others, using our LangServe sample project.

Here’s what you’ll need to do:

Clone our Python SDK

$ git clone
$ cd examples/langchain-langserve

Review the README in the langchain-langserve directory for setup instructions.

After firing up the server, head over to http://localhost:8000/rag_vector_store/playground to explore the LangServe playground using this chain.