Q&A Chatbot for the Langfuse Docs
Summary of how we've built a Q&A chatbot for the Langfuse docs and how Langfuse helps us to improve it
We've built a new Q&A chatbot to help users find the right information in the Langfuse docs. In this post, I'll briefly summarize how we've built it and how we use Langfuse to monitor and improve it.
Update: All Langfuse Cloud users now have view-only access to this project. View live demo to explore it yourself.
Do you have any questions about Langfuse? Ask me!
⚠️ Warning: Do not enter sensitive information. All chat messages can be viewed in the public demo project. Humans (the founders) are available via the chat widget.
Implementation
Technologies used
- Embeddings of docs (.mdx files)
  - Model: OpenAI text-embedding-ada-002
  - GitHub Action workflow to update embeddings on cron schedule: supabase/embeddings-generator
  - Storage
    - Postgres with pgvector: Supabase
    - Schema: supabase/headless-vector-search
- Streaming responses
  - Model: OpenAI GPT-3.5-turbo
  - Retrieval (embedding similarity search functions): supabase/headless-vector-search
  - Next.js edge function using Vercel AI SDK
- UI components with streaming and markdown support: ai-chatbot
- Observability, analytics and feedback collection: Langfuse, integrated via TypeScript SDK (edge & web)
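To make the retrieval step listed above more concrete, here is a minimal in-memory sketch of an embedding similarity search. It is not the code from supabase/headless-vector-search; in the actual setup the search runs inside Postgres with pgvector. The `DocSection` type, the `topK` function, the file paths, and the three-dimensional toy vectors are all assumptions made for illustration (text-embedding-ada-002 vectors have 1536 dimensions).

```typescript
// Hypothetical shape of an embedded docs section (illustration only).
type DocSection = { path: string; embedding: number[] };

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as not similar to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k sections most similar to the query embedding.
function topK(query: number[], sections: DocSection[], k: number): DocSection[] {
  return sections
    .map((section) => ({ section, score: cosineSimilarity(query, section.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.section);
}

// Toy data with made-up paths; real embeddings come from the embedding model.
const sections: DocSection[] = [
  { path: "docs/sdk/typescript.mdx", embedding: [0.9, 0.1, 0.0] },
  { path: "docs/integrations/langchain.mdx", embedding: [0.1, 0.9, 0.1] },
  { path: "docs/scores.mdx", embedding: [0.0, 0.2, 0.9] },
];

const matches = topK([0.2, 0.8, 0.1], sections, 2);
console.log(matches.map((m) => m.path));
```

A database-side search returns the same kind of ranked list, but it avoids loading every embedding into the edge function.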
Code
All of the code is open source and available on GitHub in the langfuse-docs repo. Relevant files:
- generate_embeddings.yml
- index.tsx
- qa-chatbot.ts
- qa-chatbot.mdx
Langfuse
Want to explore the project in Langfuse yourself? Create an account to get view-only access to this project in Langfuse.
Usage reporting
The reporting helps us to:
- Monitor usage (cost control)
- Understand what new users of Langfuse want to know, which helps us improve the docs
- Track latency, quality (based on user feedback) and OpenAI errors
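As a rough illustration of the metrics listed above, the following sketch aggregates a handful of request records into a usage summary. This is not how Langfuse computes its reports; the `RequestRecord` fields, the `summarize` function, and the price per 1,000 tokens are hypothetical placeholders.

```typescript
// Hypothetical per-request record; field names are assumptions.
type RequestRecord = {
  latencyMs: number;
  totalTokens: number;
  failed: boolean; // e.g. the OpenAI call returned an error
  feedback?: 0 | 1; // 1 = positive, 0 = negative, undefined = no feedback given
};

type UsageSummary = {
  requests: number;
  totalTokens: number;
  estimatedCostUsd: number;
  avgLatencyMs: number;
  errorRate: number;
  positiveFeedbackRate: number | null; // null when nobody left feedback
};

// Aggregate usage, latency, errors and feedback over a set of requests.
function summarize(records: RequestRecord[], usdPer1kTokens: number): UsageSummary {
  const requests = records.length;
  const totalTokens = records.reduce((sum, r) => sum + r.totalTokens, 0);
  const totalLatencyMs = records.reduce((sum, r) => sum + r.latencyMs, 0);
  const failures = records.filter((r) => r.failed).length;
  const rated = records.filter((r) => r.feedback !== undefined);
  const positive = rated.filter((r) => r.feedback === 1).length;
  return {
    requests,
    totalTokens,
    estimatedCostUsd: (totalTokens / 1000) * usdPer1kTokens,
    avgLatencyMs: requests === 0 ? 0 : totalLatencyMs / requests,
    errorRate: requests === 0 ? 0 : failures / requests,
    positiveFeedbackRate: rated.length === 0 ? null : positive / rated.length,
  };
}

const summary = summarize(
  [
    { latencyMs: 1200, totalTokens: 1500, failed: false, feedback: 1 },
    { latencyMs: 1800, totalTokens: 2500, failed: false, feedback: 0 },
    { latencyMs: 600, totalTokens: 0, failed: true },
  ],
  0.002, // placeholder price, not a real quote
);
console.log(summary);
```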
Tracing
Each response is produced by the following steps, any of which can fail, be slow, or be expensive:
- Embedding of user request
- Embedding similarity search in Postgres
- Summary of docs as markdown
- Generation of response based on retrieved context and chat history
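The four steps above can be sketched as a pipeline in which each step is timed and its outcome recorded, which is roughly the information a trace needs to show where a request failed or was slow. This is an assumption-laden sketch, not the code in qa-chatbot.ts and not the Langfuse SDK: the `Span` type, the `traced` helper, the step names, and the stubbed step bodies are hypothetical stand-ins for the real network calls.

```typescript
// Hypothetical in-memory span record; this is not the Langfuse SDK.
type Span = { name: string; durationMs: number; status: "ok" | "error" };

// Run one step, recording how long it took and whether it threw.
async function traced<T>(spans: Span[], name: string, fn: () => Promise<T>): Promise<T> {
  const start = Date.now();
  try {
    const result = await fn();
    spans.push({ name, durationMs: Date.now() - start, status: "ok" });
    return result;
  } catch (err) {
    spans.push({ name, durationMs: Date.now() - start, status: "error" });
    throw err;
  }
}

// Stubbed pipeline: every step body stands in for a real network call.
async function answer(question: string, history: string[], spans: Span[]): Promise<string> {
  const embedding = await traced(spans, "embed-request", async () => [question.length, 0, 0]);
  const sections = await traced(spans, "similarity-search", async () =>
    embedding.length > 0 ? ["docs section A", "docs section B"] : [],
  );
  const context = await traced(spans, "summarize-docs", async () => sections.join("\n\n"));
  return traced(
    spans,
    "generate-response",
    async () =>
      `Answer to "${question}" from ${context.length} chars of context and ${history.length} earlier messages`,
  );
}

const spans: Span[] = [];
answer("How do I use Langchain?", [], spans).then((text) => {
  console.log(text);
  console.log(spans.map((s) => `${s.name}: ${s.durationMs}ms (${s.status})`));
});
```

Recording the span in both the success and the error path matters: a step that throws is exactly the one you want to see in the trace.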
This is what a single trace looks like in Langfuse:
User feedback collection
In this example, we can see how we:
- Collect feedback using the Langfuse Web SDK
  Negative, Langchain not included in response
- Browse the feedback
- Identify the root cause of the low-quality response
  Docs on Langchain integration are not included in embedding similarity search
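The feedback loop above can be sketched as a small payload that ties a thumbs-up or thumbs-down to the trace of the response it rates. The sketch does not call the Langfuse Web SDK; the `Feedback` type, the `buildFeedback` and `negativeFeedback` helpers, the `"user-feedback"` name, and the trace ids are assumptions for illustration only.

```typescript
// Hypothetical feedback payload linked to a trace; not the SDK's own type.
type Feedback = { traceId: string; name: string; value: 0 | 1; comment?: string };

// Turn a thumbs-up/down click (plus optional comment) into a payload.
function buildFeedback(traceId: string, thumbsUp: boolean, comment?: string): Feedback {
  const trimmed = comment?.trim();
  return {
    traceId,
    name: "user-feedback",
    value: thumbsUp ? 1 : 0,
    // Only attach a comment when the user actually wrote something.
    ...(trimmed ? { comment: trimmed } : {}),
  };
}

// Browsing step: keep only the negative feedback for review.
function negativeFeedback(all: Feedback[]): Feedback[] {
  return all.filter((f) => f.value === 0);
}

const collected: Feedback[] = [
  buildFeedback("trace-123", false, "Langchain not included in response"),
  buildFeedback("trace-456", true),
];
console.log(negativeFeedback(collected));
```

Because each payload carries the trace id, a negative rating leads straight back to the retrieval and generation steps that produced the response.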
Why build this?
A user was surprised when I (a human) answered their question. It's 2023, so a bot was expected. As we added a lot to the docs over the last few days, building a retrieval-based chatbot finally made sense to help users explore the project. Also, I love having an additional production app to dogfood new Langfuse features and demonstrate how Langfuse can be used.
Get in touch
We're super excited to offer users of the Langfuse docs a faster way to find the right information and dogfood Langfuse to monitor it. Check out the repo for all backend and frontend code including the integration with Langfuse.
If you have any questions, join the Discord or reach out to me on Twitter.