Advanced RAG Technology

Advanced RAG with 98% Accuracy

Experience hybrid search combining vector embeddings and full-text search, unified through Reciprocal Rank Fusion for unparalleled retrieval precision.

rag-agent-demo — hybrid-search

Upload

PDF, DOCX, TXT

Embed

1536 Dimensions

Hybrid Search

Vector + Text + RRF

AI Response

Streaming via AI SDK

The Pipeline

How It Works

From document upload to AI-powered answers in four seamless steps, each optimized for maximum retrieval accuracy.

01

Document Upload

Upload your documents in any format. Our processor automatically extracts and chunks content for optimal retrieval.

  • PDF, DOCX, TXT, MD support
  • Automatic text extraction
  • Smart chunking (500 tokens)
  • Metadata preservation
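As an illustration of the chunking step, here is a minimal fixed-size chunker. It approximates the 500-token limit by counting whitespace-delimited words (a rough proxy; the demo's actual chunker and tokenizer are not shown here) and preserves source metadata on each chunk. All names are illustrative:

```typescript
// Sketch of a fixed-size chunker. The demo chunks at 500 tokens; here we
// approximate tokens with whitespace-delimited words, which real tokenizers
// (e.g. tiktoken) would count differently.
interface Chunk {
  text: string;
  metadata: { source: string; index: number };
}

function chunkDocument(text: string, source: string, maxTokens = 500): Chunk[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: Chunk[] = [];
  for (let i = 0; i < words.length; i += maxTokens) {
    chunks.push({
      // Join this window of words back into one chunk of text
      text: words.slice(i, i + maxTokens).join(" "),
      // Preserve provenance so answers can cite their source later
      metadata: { source, index: chunks.length },
    });
  }
  return chunks;
}
```

A 1200-word document would yield three chunks of 500, 500, and 200 words, each tagged with its source file and position.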
02

Embedding Generation

Each chunk is transformed into a high-dimensional vector using OpenAI's latest embedding model for semantic understanding.

  • text-embedding-3-small
  • 1536-dimensional vectors
  • Semantic representation
  • Batch processing
03

Hybrid Search

Your queries trigger both vector similarity search and full-text search, combined using Reciprocal Rank Fusion.

  • MongoDB Atlas Vector Search
  • Lucene full-text search
  • RRF with K=60
  • 98% retrieval accuracy
04

AI Response

The retrieved context is passed to an LLM which generates accurate, contextual responses streamed in real-time.

  • Claude 3.5 via OpenRouter
  • Vercel AI SDK streaming
  • Source citations
  • Context-aware answers
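To illustrate how retrieved context might be handed to the LLM, here is a sketch of assembling chunks and a query into chat messages with numbered source markers. The prompt format and type names are illustrative, not the demo's actual template:

```typescript
// Sketch: turn retrieved chunks plus the user's query into chat messages.
interface RetrievedChunk {
  text: string;
  source: string;
  score: number;
}

type Message = { role: "system" | "user"; content: string };

function buildMessages(query: string, chunks: RetrievedChunk[]): Message[] {
  // Number each chunk so the model can cite it as [n]
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`)
    .join("\n\n");
  return [
    {
      role: "system",
      content:
        "Answer using only the context below. Cite sources as [n].\n\n" + context,
    },
    { role: "user", content: query },
  ];
}
```

In a stack like the demo's, the resulting messages would be passed to a streaming call such as the Vercel AI SDK's streamText and consumed on the client via useChat.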
Architecture

Hybrid Search with Reciprocal Rank Fusion

Our dual-path architecture combines the semantic understanding of vector search with the precision of full-text matching.

Diagram: a user query (e.g. "How does RRF work?") fans out along two parallel paths. The semantic path runs MongoDB Atlas vector search (cosine similarity, top 10 results); the lexical path runs Lucene full-text search (BM25 ranking, top 10 results). Both lists feed into Reciprocal Rank Fusion with K = 60, scoring each result as 1/(K + rank), to produce a single ranked list (e.g. #1 chunk_47 at 0.89, #2 chunk_12 at 0.76, #3 chunk_89 at 0.71).

Vector Search

Finds semantically similar content using 1536-dimensional embeddings and cosine similarity. Great for understanding intent and context.
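The cosine similarity at the heart of the semantic path reduces to a few lines. A minimal implementation over plain number arrays (in practice the database computes this against stored 1536-dimensional embeddings):

```typescript
// Cosine similarity: dot product of the vectors divided by the product of
// their magnitudes. Returns 1 for identical directions, 0 for orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because the measure depends only on direction, not magnitude, a chunk embedding scaled by any positive factor scores identically.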


Text Search

Matches exact keywords and phrases using Lucene's BM25 algorithm. Perfect for specific terms and acronyms.


Rank Fusion

Combines rankings using the 1/(K + rank) formula with K = 60. This weighting creates balanced, highly accurate results.
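The fusion step can be sketched directly. Given the ranked ID lists from the vector and text paths, each document accumulates 1/(K + rank) per list and the combined scores are re-sorted (a minimal sketch of the algorithm, independent of where the demo actually computes it):

```typescript
// Reciprocal Rank Fusion: each document scores 1/(K + rank) in every list
// that contains it, with K = 60; scores are summed, then results re-sorted.
function rrfFuse(rankedLists: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, i) => {
      const rank = i + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest combined score first
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}
```

A document ranked first in both lists scores 2/61 ≈ 0.0328, outranking one that appears first in only a single list (1/61 ≈ 0.0164), which is the balancing effect described above.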

Agentic Intelligence

Beyond Traditional RAG: Agentic Retrieval

Autonomous AI agents that reason, plan, and adapt. Our agentic harness transforms simple retrieval into intelligent, multi-step problem-solving.

Diagram: a user query ("Analyse Q3 sales across all regions") enters the agent harness. Planning handles task decomposition, strategy selection, and step sequencing; the LLM (Claude Opus 4.5 / GPT-5.2) provides reasoning; a tool layer exposes vector search, Graph RAG, and external APIs; memory maintains conversation history, working memory, and retrieved context; verification applies output validation, safety guardrails, and quality thresholds. An iteration loop repeats until the refined response is produced: multi-source synthesis, verified accuracy, cited sources.

Multi-Step Reasoning

Autonomous planning and decomposition of complex queries into manageable sub-tasks, enabling sophisticated problem-solving across multiple retrieval cycles.

Tool & Function Calling

Dynamic integration with external tools, APIs, and databases. The agent decides which tools to invoke based on query requirements.

Self-Correction

Continuous reflection and refinement of outputs. When initial results are insufficient, the agent iterates until quality thresholds are met.

Context-Aware Retrieval

Adaptive retrieval strategies that consider conversation history, user intent, and document relevance to deliver precise results.

What is an Agent Harness?

The agent harness is the complete architectural system surrounding an LLM—everything except the model itself. It manages the entire context lifecycle: planning queries, orchestrating tool calls, maintaining memory, and verifying outputs.

Think of it as the "operating system" for your AI agent. While the LLM provides reasoning capabilities, the harness provides structure, control, and reliability.

Planning & Decomposition

Breaks complex queries into executable steps

Memory & State Management

Maintains context across conversation turns

Verification & Guardrails

Ensures output quality and safety constraints

Tool Integration Layer

Orchestrates connections to external systems

Traditional RAG vs Agentic RAG

See how agentic capabilities transform the retrieval experience from simple question-answering to intelligent problem-solving.

1x

Traditional RAG

  • Single retrieval pass

    One query, one search, one response

  • Static context window

    Fixed chunk retrieval without adaptation

  • No self-correction

    Cannot identify or fix retrieval failures

  • Limited to vector similarity

    Cannot leverage external tools or APIs

N+

Agentic RAG

  • Multi-step reasoning

    Iterative retrieval with query refinement

  • Dynamic context management

    Adapts retrieval strategy per query

  • Self-correction & reflection

    Detects gaps and re-queries as needed

  • Tool & function calling

    Integrates databases, APIs, calculators

Use Case Example

"Analyse our Q3 performance and recommend improvements"

STEP 1
Plan & Decompose

Agent breaks query into sub-tasks: retrieve Q3 data, compare to Q2, identify patterns

STEP 2
Multi-Source Retrieval

Queries vector DB, calls SQL database, fetches external market data

STEP 3
Verify & Iterate

Checks completeness, identifies gaps in regional data, performs additional retrieval

STEP 4
Synthesise & Respond

Combines insights into actionable recommendations with cited sources
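The verify-and-iterate behaviour in the steps above can be sketched as a small loop, with retrieval, verification, and query refinement plugged in as callbacks. Everything here is an illustrative skeleton, not the demo's actual harness:

```typescript
// Skeleton of an agentic retrieve-verify-refine loop. The retrieve,
// isComplete, and refine callbacks stand in for real search, guardrail,
// and query-rewriting components.
interface AgentStep {
  query: string;
  results: string[];
}

function runAgentLoop(
  initialQuery: string,
  retrieve: (q: string) => string[],
  isComplete: (results: string[]) => boolean,
  refine: (q: string, results: string[]) => string,
  maxIterations = 3
): AgentStep[] {
  const trace: AgentStep[] = [];
  let query = initialQuery;
  let results: string[] = [];
  for (let i = 0; i < maxIterations; i++) {
    // Accumulate context across retrieval cycles
    results = [...results, ...retrieve(query)];
    trace.push({ query, results: [...results] });
    if (isComplete(results)) break; // verification passed
    query = refine(query, results); // self-correction: re-query for gaps
  }
  return trace;
}
```

With real components plugged in, the loop stops as soon as verification passes, or after a bounded number of cycles, matching the "verify and iterate" step above.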

Knowledge Graphs

GraphRAG with Neo4j Knowledge Graphs

Enhance retrieval accuracy by combining vector search with knowledge graph traversal. Entity relationships provide contextual depth that traditional RAG cannot achieve.

Diagram: a user query ("Who created Neo4j?") first goes through hybrid search (vector + text with RRF fusion) for the top-K results. Entity extraction identifies people, organisations, technologies, concepts, and locations in those results. The knowledge graph is then traversed from the extracted entities (max 2 hops) to collect related entities and relationship paths, and results are re-ranked with a 30% entity boost weight: score × (1 + boost).

Entity Extraction

Automatically identifies key entities from retrieved documents: people, organisations, technologies, concepts, and locations for graph-based analysis.


Knowledge Graph

Neo4j's native graph database stores entity relationships with index-free adjacency, enabling up to 1000x faster traversal than relational queries.


Graph-Boosted Ranking

Results are re-ranked using entity connection strength with a 30% boost weight, surfacing contextually related content that pure vector search misses.
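The re-ranking formula can be shown concretely. In this sketch (names and the connection-strength input are illustrative), each result's hybrid score is multiplied by (1 + boost), where boost is the entity connection strength scaled by the 30% entity weight:

```typescript
// Graph-boosted re-ranking: score × (1 + boost), where boost is the
// entity connection strength (0..1, as returned by graph traversal)
// scaled by the entity weight (30% by default).
function graphBoost(
  results: { id: string; score: number }[],
  connectionStrength: Map<string, number>,
  entityWeight = 0.3
): { id: string; score: number }[] {
  return results
    .map((r) => {
      const boost = (connectionStrength.get(r.id) ?? 0) * entityWeight;
      return { id: r.id, score: r.score * (1 + boost) };
    })
    .sort((a, b) => b.score - a.score);
}
```

With a full-strength entity connection, a 0.7 hybrid score becomes 0.7 × 1.3 = 0.91, enough to overtake an unconnected result scoring 0.75.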

MongoDB Hybrid Search vs Neo4j GraphRAG

Both approaches excel in different scenarios. Choose based on your data structure and query patterns.

MongoDB Hybrid Search

Vector + Full-Text + RRF
  • Semantic similarity through vector embeddings
  • Exact keyword matching with BM25
  • RRF fusion for balanced ranking (K=60)
  • Best for document-centric queries

Neo4j GraphRAG

Hybrid + Knowledge Graph
  • All hybrid search capabilities included
  • Entity relationship traversal (2+ hops)
  • Graph-boosted ranking (30% entity weight)
  • Best for interconnected knowledge queries

Deployment Use Cases

GraphRAG excels when understanding relationships between entities is critical.

🏥

Healthcare

Patient records, treatment relationships, drug interactions

💼

Enterprise Knowledge

Organisational structures, project dependencies, expertise mapping

🔬

Research & Development

Citation networks, author collaborations, concept evolution

🛡️

Fraud Detection

Transaction patterns, network analysis, anomaly detection

Built With

Production-Ready Stack

Enterprise-grade technologies working together to deliver fast, accurate, and scalable RAG experiences.

MongoDB Atlas

Cloud-native document database with built-in vector search capabilities for scalable RAG applications.

  • Native vector indexing
  • Atlas Search (Lucene)
  • Aggregation pipelines
  • Free tier available

OpenAI Embeddings

State-of-the-art text-embedding-3-small model generating high-quality 1536-dimensional vectors.

  • 1536 dimensions
  • Semantic understanding
  • Batch processing
  • Cost-effective

Reciprocal Rank Fusion

Advanced ranking algorithm that combines multiple search result lists into a single, optimized ranking.

  • K=60 constant
  • Rank-based scoring
  • ~98% retrieval accuracy
  • Position agnostic

Vercel AI SDK

Unified API for AI model integration with first-class streaming support and React hooks.

  • Real-time streaming
  • OpenRouter support
  • useChat hook
  • Edge runtime

Framework: Next.js 15
Runtime: React 19
Auth: Clerk
Styling: Tailwind CSS 4
Performance

Built for Production

Every component optimized for speed, accuracy, and reliability. Real metrics from real-world testing.

98%

Retrieval Accuracy

Hybrid search with RRF achieves near-perfect context retrieval

1536

Vector Dimensions

High-dimensional embeddings capture semantic nuance

<2s

Response Time

First token delivered within 2 seconds of query

K=60

RRF Constant

Optimized fusion constant for balanced ranking

Ready to see it in action?

Upload your documents and experience the power of hybrid search with Reciprocal Rank Fusion firsthand.

FAQ

Frequently Asked Questions

Learn more about how the RAG Agent demo works and the technology behind it.

What is RAG?

RAG (Retrieval-Augmented Generation) combines document search with AI to answer questions based on your uploaded documents. It retrieves relevant content and uses an LLM to generate accurate, contextual responses.

What file formats are supported?

The demo supports PDF, DOCX, TXT, and Markdown files. Documents are processed, chunked into 500-token segments, and stored with 1536-dimensional vector embeddings for semantic search.

How does the hybrid search work?

We combine vector search (semantic matching via cosine similarity) and full-text search (keyword matching via Lucene/BM25), merged using Reciprocal Rank Fusion (RRF) with K=60 for optimal ranking.

  • Vector search finds semantically similar content based on meaning
  • Text search matches exact keywords, phrases, and acronyms
  • RRF combines both rankings using the 1/(K+rank) formula

What is Reciprocal Rank Fusion?

RRF is a rank aggregation algorithm that combines multiple ranked lists into a single ranking. Each result receives a score of 1/(K+rank) from each list, where K=60 is a constant that balances the contribution of different ranking positions.

Why is hybrid search more accurate than vector search alone?

The hybrid approach achieves ~98% retrieval accuracy because it combines the semantic understanding of vector search with the precision of lexical matching. Vector search catches paraphrases and related concepts, while text search ensures exact term matches aren't missed.

Is my data secure?

Yes. Documents are isolated by Clerk user ID using query filters in MongoDB, a row-level-security equivalent, and data is stored in MongoDB Atlas with encryption at rest. Note that this is a demo application; production deployments may require additional security measures.