Kotaemon is an open-source RAG (Retrieval-Augmented Generation) platform for interacting with your documents, including those containing figures and tables. Based on your query, it extracts the relevant information and uses LLMs to generate accurate, contextually appropriate responses.
The system is designed for both end-users and developers.
- As an end-user, you can ask questions about your documents, and the system will provide answers using a combination of information retrieval and text generation. For example: “Find all references to ‘machine learning’ in the technical documentation.” or “Identify any emerging trends in the market analysis report.”
- As a developer, you can create your own RAG-based question-answering pipelines by integrating the retrieval of relevant data with AI-driven generation of answers.
Kotaemon runs as a local web UI built with Gradio that lets you “chat with your documents” through RAG. It supports various LLMs, including API providers like OpenAI as well as local models served via frameworks like Ollama and llama-cpp-python.
The method
Kotaemon is built on RAG technology, which involves two main steps:
Retrieval: In this step, Kotaemon indexes a large corpus of documents and creates embeddings for each document. The embeddings are vector representations that capture the semantic meaning of the text. When a user submits a query, Kotaemon uses these embeddings to retrieve the documents that are most contextually aligned with the query.
Generation: Once the relevant documents are retrieved, they are combined with the original query to form a context and passed to an LLM. Kotaemon supports various language model API providers, including OpenAI and Azure OpenAI, as well as local language models. The generative model then produces a coherent response that addresses the user’s query.
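To make the two steps concrete, here is a minimal, self-contained sketch of embedding-based retrieval followed by LLM generation. It is an illustrative outline, not Kotaemon’s internal code: it assumes the sentence-transformers and openai packages, a tiny in-memory corpus, and an OPENAI_API_KEY in the environment.

```python
# Illustrative sketch of the two RAG steps; not Kotaemon's internal code.
# Assumes: pip install sentence-transformers openai, and OPENAI_API_KEY set.
from sentence_transformers import SentenceTransformer, util
from openai import OpenAI

documents = [
    "Machine learning is discussed throughout the technical documentation.",
    "The market analysis report notes several emerging trends.",
]

# Retrieval: embed the corpus once, then rank documents by similarity to the query.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = embedder.encode(documents, convert_to_tensor=True)

def retrieve(query: str, top_k: int = 2) -> list[str]:
    query_embedding = embedder.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, doc_embeddings)[0]
    best = scores.topk(min(top_k, len(documents))).indices
    return [documents[i] for i in best]

# Generation: combine the retrieved passages with the query and ask an LLM.
client = OpenAI()

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(answer("What trends appear in the market analysis?"))
```

Kotaemon wraps this same idea in a full UI, adding document parsing, persistent vector storage, and configurable models.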
The pipeline below outlines how Kotaemon interacts with your documents and answers your questions.
Pipeline
- Document upload and query input: The user uploads documents in various formats (e.g., PDF, DOCX, HTML) and submits a query.
- Document indexing and embedding: Once the documents are available in the system, they are indexed and converted into numerical representations (embeddings) that capture their semantic meaning.
- Retrieval: Relevant documents are identified based on their similarity to the user’s query.
- Context creation: The retrieved documents are combined with the original query to form a context.
- Generation: The context is fed into an LLM to generate the response.
This process ensures that the system understands the query, finds the most relevant information, and generates a coherent, informative response.
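For the indexing and context-creation steps, the sketch below shows one simple way documents could be chunked, embedded, and searched. It is a simplified stand-in for a real vector store, not Kotaemon’s implementation; the chunk size and overlap values are arbitrary examples.

```python
# Minimal indexing sketch: chunk a document, embed the chunks, keep them in memory.
# Stands in for a real vector store; chunk size/overlap here are arbitrary choices.
from sentence_transformers import SentenceTransformer
import numpy as np

def chunk(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows so ideas aren't cut in half."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

class InMemoryIndex:
    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.embedder = SentenceTransformer(model_name)
        self.chunks = []       # list of text chunks
        self.vectors = None    # np.ndarray of normalized chunk embeddings

    def add(self, document: str) -> None:
        new_chunks = chunk(document)
        new_vectors = self.embedder.encode(new_chunks, normalize_embeddings=True)
        self.chunks.extend(new_chunks)
        self.vectors = new_vectors if self.vectors is None else np.vstack([self.vectors, new_vectors])

    def search(self, query: str, top_k: int = 3) -> list[str]:
        q = self.embedder.encode([query], normalize_embeddings=True)[0]
        scores = self.vectors @ q  # cosine similarity, since vectors are normalized
        return [self.chunks[i] for i in np.argsort(scores)[::-1][:top_k]]

index = InMemoryIndex()
index.add("Long report text about market trends and machine learning adoption...")
context = "\n".join(index.search("emerging trends"))
```

Overlapping chunks help keep related sentences together, so a retrieved passage carries enough context for the generation step.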
Key features
Kotaemon offers several features that make it a flexible and user-friendly tool:
- Open-source: Kotaemon is freely available to developers and researchers.
- Multi-provider support: It is compatible with various language model API providers, allowing users to choose the best model for their specific needs.
- Local model integration: In addition to cloud-based APIs, Kotaemon supports local language models.
- Scalability: Kotaemon can handle large volumes of documents while maintaining high performance.
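The multi-provider and local-model features are easiest to picture through the OpenAI-compatible API that local runtimes such as Ollama expose: the same client code can target either a hosted or a local endpoint. The snippet below is a generic illustration rather than Kotaemon’s own configuration format; the USE_LOCAL switch, the model names, and the local URL are assumed defaults.

```python
# Generic illustration of swapping providers via an OpenAI-compatible endpoint.
# Not Kotaemon's configuration format; model names and URLs are common defaults.
import os
from openai import OpenAI

USE_LOCAL = os.getenv("USE_LOCAL", "0") == "1"

if USE_LOCAL:
    # Ollama serves an OpenAI-compatible API at this default local address.
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    model = "llama3.1"
else:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    model = "gpt-4o-mini"

reply = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
)
print(reply.choices[0].message.content)
```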
How to use Kotaemon
Below is a simplified guide. You can find more details in the GitHub repository.
- Installation: Clone the repository and set up the necessary dependencies.
- Model configuration: Configure the LLMs and the embeddings. If you’re using the OpenAI API, set up your API key and adjust the settings accordingly. For local usage, you can use frameworks like Ollama; follow this guide to set up your local LLMs and embeddings.
- Document upload: Upload your documents in various formats (e.g., PDF, DOCX, HTML) directly through the UI.
- Querying documents: Start interacting with your documents via a chat interface, querying the content for specific information. The system will retrieve relevant excerpts and generate natural language responses.
- Fine-tuning: More advanced users can fine-tune the RAG pipeline to improve response quality for domain-specific queries.
If you are a developer, you can adjust the retrieval and generation algorithms to better suit specific tasks.
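As a rough picture of what adjusting the retrieval step can look like, the sketch below reranks candidate passages by blending their vector-similarity score with a simple keyword-overlap score. This is a generic pattern, not Kotaemon’s extension API; the candidate list and the 0.3 weight are illustrative assumptions.

```python
# Generic reranking sketch: blend vector-similarity scores with keyword overlap.
# Not Kotaemon's extension API; the weight and candidates are illustrative.
def keyword_overlap(query: str, passage: str) -> float:
    """Fraction of query terms that literally appear in the passage."""
    terms = {t.lower() for t in query.split()}
    hits = sum(1 for t in terms if t in passage.lower())
    return hits / max(len(terms), 1)

def rerank(query: str, candidates: list[tuple[str, float]], weight: float = 0.3) -> list[str]:
    """candidates: (passage, vector_score) pairs from any retriever."""
    scored = [
        (passage, (1 - weight) * score + weight * keyword_overlap(query, passage))
        for passage, score in candidates
    ]
    return [p for p, _ in sorted(scored, key=lambda x: x[1], reverse=True)]

# Example: candidates as they might come back from a vector search.
candidates = [
    ("Machine learning adoption is accelerating in retail.", 0.72),
    ("The appendix lists all survey respondents.", 0.70),
]
print(rerank("machine learning trends", candidates))
```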
Conclusion
Kotaemon acts as an AI assistant, helping you search for information across multiple documents. By combining RAG with LLMs, it efficiently locates the information you need and generates detailed answers to your questions, saving you time.
As an open-source platform, Kotaemon is accessible to everyone and can be customized to suit specific needs.
Read more:
- GitHub repository
- Kotaemon docs