DocuMind — AI Document Q&A

Upload any document and ask questions in plain English. DocuMind uses Retrieval-Augmented Generation (RAG) to find relevant passages and generate accurate, source-cited answers.

React · FastAPI · RAG · Google Gemini · ChromaDB · Python · Ubuntu VPS · Nginx · systemd

What is DocuMind?

A production-ready RAG system that turns your documents into an intelligent knowledge base.

DocuMind is an AI-powered document assistant built on Retrieval-Augmented Generation (RAG). Instead of sending your entire document to an LLM, it splits your content into chunks, embeds each chunk, and indexes the embeddings in a vector database. When you ask a question, it retrieves only the most relevant passages and feeds them to the language model, which produces grounded answers with source citations.

This approach addresses two common problems with vanilla LLMs: hallucination (making up facts), which grounding answers in retrieved passages reduces, and context-window limits (documents too large to fit in a single prompt). DocuMind handles PDFs, Word docs, plain text, Markdown, and CSV files.

The app comes pre-loaded with three sample documents so you can start asking questions immediately — no upload required.

How It Works

Two pipelines power DocuMind — one for ingestion, one for querying.

Document Ingestion Pipeline

Upload Doc → Extract Text → Chunk → Embed (Gemini) → Store (ChromaDB)
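
The ingestion stages above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not DocuMind's actual code: `chunk_text`, `toy_embed`, `ingest`, and the in-memory `index` list are hypothetical names, the character-based chunk size and overlap are assumed values, and the hashing embedder is only a stand-in for the Gemini embedding call and the ChromaDB store.

```python
import hashlib
import math


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Stand-in embedder (assumption): hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def ingest(doc_name: str, text: str, index: list[dict]) -> int:
    """Chunk, embed, and store one document; returns the number of chunks."""
    chunks = chunk_text(text)
    for i, chunk in enumerate(chunks):
        index.append({
            "id": f"{doc_name}:{i}",
            "source": doc_name,
            "text": chunk,
            "embedding": toy_embed(chunk),
        })
    return len(chunks)
```

Overlapping windows are a common choice because a sentence that straddles a chunk boundary still appears whole in at least one chunk.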

Query Pipeline

Ask Question → Embed Query → Vector Search → Build Context → LLM Generate → Cited Answer
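
The middle of the query pipeline (vector search and context building) might look roughly like the sketch below. It assumes cosine similarity over precomputed embeddings; `retrieve`, `build_prompt`, the prompt wording, and the two-dimensional `toy_index` are all made up for illustration, and the embedding and LLM calls are left out.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_vec: list[float], index: list[dict], top_k: int = 3):
    """Return the top_k (score, item) pairs, best match first."""
    scored = [(cosine(query_vec, item["embedding"]), item) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def build_prompt(question: str, hits) -> str:
    """Number each retrieved passage so the model can cite it as [n]."""
    lines = ["Answer using only the numbered passages below and cite them like [1]."]
    for n, (_score, item) in enumerate(hits, start=1):
        lines.append(f"[{n}] ({item['source']}) {item['text']}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Toy data (assumption): two passages with hand-written 2-D embeddings.
toy_index = [
    {"source": "doc-a", "text": "passage about topic A", "embedding": [1.0, 0.0]},
    {"source": "doc-b", "text": "passage about topic B", "embedding": [0.0, 1.0]},
]
```

Calling `retrieve([0.9, 0.1], toy_index, top_k=1)` ranks the `doc-a` passage first, and `build_prompt` then produces the text that would be sent to the language model.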

How to Test

Follow these 6 steps to explore every feature of DocuMind.

1. Explore Pre-loaded Documents

DocuMind comes with 3 seed documents already indexed and ready to query. You'll see them in the document sidebar when you open the app:

Pre-loaded docs:
• Acme Corp Employee Handbook
• CloudSync API Documentation
• Q4 2024 Sales Report

2. Ask a Question

Type a natural-language question in the chat input. DocuMind will search the indexed documents, retrieve relevant chunks, and generate an answer with source citations.

Try these:
• "What is the remote work policy?"
• "How do I authenticate with the API?"
• "What was Q4 revenue?"

3. View the RAG Debug Panel

Click the debug toggle to open the RAG transparency panel. It shows what happened behind the scenes for each query: which chunks were retrieved, how closely they matched, and what the query cost.

What you'll see: Retrieved chunks with similarity scores, embedding model info, token count, response latency, and estimated cost per query.
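
As a rough illustration of where numbers like these come from, the sketch below records a per-query trace and derives an estimated cost from token counts. `QueryTrace` and `timed` are hypothetical names, and the per-1,000-token prices are placeholders passed in by the caller, not real Gemini pricing.

```python
import time
from dataclasses import dataclass, field


@dataclass
class QueryTrace:
    """Per-query figures a debug panel could display (hypothetical shape)."""
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float
    chunk_scores: list[float] = field(default_factory=list)

    def estimated_cost(self, input_price_per_1k: float,
                       output_price_per_1k: float) -> float:
        """Cost = input tokens * input price + output tokens * output price."""
        return ((self.prompt_tokens / 1000) * input_price_per_1k
                + (self.completion_tokens / 1000) * output_price_per_1k)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) * 1000
```

With placeholder prices of 0.5 and 1.0 per 1,000 tokens, a query that used 2,000 prompt tokens and 500 completion tokens would be estimated at 1.0 + 0.5 = 1.5 in whatever currency unit the prices use.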

4. Upload Your Own Document

Click the upload button or drag and drop a file into the upload area. Supported formats: PDF, DOCX, TXT, MD, and CSV. The document is chunked, embedded, and indexed as soon as it is uploaded.

Tip: Try uploading a multi-page PDF and then asking questions that span different sections to see cross-section retrieval in action.
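
One way an upload handler could route files by type is sketched below. The extension list comes from the supported formats above; everything else is an assumption. `extract_text` is a hypothetical helper that handles only the text-based formats with the standard library, and PDF and DOCX are deliberately left unimplemented because they need a dedicated parser.

```python
import csv
import io
from pathlib import Path

# Extensions for the formats the page lists as supported.
SUPPORTED = {".pdf", ".docx", ".txt", ".md", ".csv"}


def extract_text(filename: str, data: bytes) -> str:
    """Return plain text for an uploaded file, or raise for unknown types."""
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    if ext in {".txt", ".md"}:
        return data.decode("utf-8", errors="replace")
    if ext == ".csv":
        # Flatten each CSV row into one comma-separated line of text.
        rows = csv.reader(io.StringIO(data.decode("utf-8", errors="replace")))
        return "\n".join(", ".join(row) for row in rows)
    # .pdf and .docx: binary formats, not handled in this sketch.
    raise NotImplementedError(f"{ext} needs a dedicated parser (not shown here)")
```

Rejecting unknown extensions before any parsing keeps unsupported files out of the chunking and embedding stages.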

5. Adjust RAG Settings

Open the settings panel to fine-tune how DocuMind processes and answers queries. You can change the language model, temperature, number of chunks retrieved (top-K), similarity threshold, and response style.

Experiment: Lower the temperature for factual answers, raise it for creative summaries. Increase top-K to pull in more context for complex questions.
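
To see how top-K and the similarity threshold interact, consider this small sketch. `RagSettings`, its default values, and `select_chunks` are illustrative assumptions rather than DocuMind's real configuration: the threshold first discards weak matches, then top-K caps how many of the remaining chunks reach the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagSettings:
    """Hypothetical settings object; defaults are assumed values."""
    temperature: float = 0.2
    top_k: int = 4
    similarity_threshold: float = 0.5


def select_chunks(scored: list[tuple[float, str]],
                  settings: RagSettings) -> list[tuple[float, str]]:
    """Drop chunks below the threshold, then keep the top_k best."""
    kept = [(score, text) for score, text in scored
            if score >= settings.similarity_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:settings.top_k]
```

Raising top-K only helps if enough chunks clear the threshold, so a very high threshold can leave the model with little or no context regardless of top-K.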

6. Multi-Document RAG

Upload multiple documents and ask questions that require information from more than one source. DocuMind retrieves chunks from all indexed documents and cites each source separately in the answer.

Try: With all 3 seed docs loaded, ask "Compare the remote work policy with the Q4 sales performance" — you'll see citations from both documents.
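
Per-document citations can be produced by grouping retrieved chunks by their source. The sketch below assumes each hit carries a `source` name and a `chunk_id`; both field names and the citation format are hypothetical.

```python
def cite_by_document(hits: list[dict]) -> list[str]:
    """Group retrieved chunks by source and emit one citation per document."""
    grouped: dict[str, list] = {}
    for hit in hits:
        # Dicts keep insertion order, so documents are cited in first-seen order.
        grouped.setdefault(hit["source"], []).append(hit["chunk_id"])
    return [
        f"[{n}] {source} (chunks: {', '.join(map(str, ids))})"
        for n, (source, ids) in enumerate(grouped.items(), start=1)
    ]
```

Given hits from `doc-a` (chunks 3 and 7) and `doc-b` (chunk 1), this yields one citation line for each document, which matches the idea of citing each source separately.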

Key Features

Everything built into DocuMind.

Multi-Format Upload

Upload PDF, DOCX, TXT, Markdown, and CSV files. Text is extracted, chunked, and embedded automatically.

Natural Language Q&A

Ask questions in plain English. Get accurate, contextual answers generated from your document content.

Source Citations

Every answer includes citations pointing to the exact document and passage the information came from.

RAG Debug Panel

Full transparency into the RAG pipeline — see retrieved chunks, similarity scores, token usage, and cost per query.

Configurable Settings

Tune the model, temperature, top-K, similarity threshold, chunking strategy, and response style to your needs.

Multi-Document RAG

Index multiple documents and ask cross-document questions. Answers pull from all sources with per-document citations.