Abstract
In 2025, Retrieval-Augmented Generation (RAG) is undergoing a paradigm shift from "static pipelines" to "autonomous agents". OpenAI's Deep Research demonstrates the potential of this direction, compressing complex research tasks into minutes through multi-step planning, tool use, and dynamic reasoning. However, its closed architecture and fixed strategies struggle to meet enterprise demands for controllability, cost-efficiency, and data sovereignty. This article introduces Open Deep Wide Research (ODWR), an open-source, MCP-compatible Agentic RAG framework that supports runtime policy tuning. It aims to replicate the core capabilities of Deep Research while giving developers fine-grained control over depth, width, and latency.
Problem Background: The Evolutionary Bottlenecks of RAG
Traditional RAG systems use a linear "retrieve → rerank → generate" pipeline, which is suitable for factual question-answering but falls short in the following scenarios:
- Multi-hop reasoning: For example, "Compare the open-source strategies of three AI companies from 2024–2025 and their impact on the developer ecosystem."
- Heterogeneous data fusion: Requires simultaneously parsing web pages, PDF technical whitepapers, and user-uploaded CSV reports.
- Dynamic task adjustment: When initial retrieval results are low-quality, it cannot autonomously correct queries or switch data sources.
OpenAI's Deep Research addresses these issues by introducing an agentic architecture: it decomposes tasks into sub-goals, calls browser and Python tools, adjusts strategies in real-time, and outputs structured reports with citations. This design validates the feasibility of Agentic RAG but also exposes key limitations: black-box models, no custom toolchains, and a lack of resource scheduling interfaces.
Methodology: Distilling Core Mechanisms from Deep Research
We analyzed the public technical descriptions of Deep Research (OpenAI, 2025) and distilled three reusable design principles:
- Hierarchical task planning: Translating user instructions into an executable research path (e.g., "identify competitors → collect parameters → cross-validate → generate comparison table").
- Collaborative tool execution: Integrating web browsers, code interpreters, and file parsers to form a closed loop.
- Evidence-driven output: Each conclusion is linked to its original source, supporting traceability and verification.
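The first and third principles can be sketched together: a planner turns an instruction into an ordered research path, and each sub-goal carries the evidence that supports its conclusion. This is a minimal illustration, not ODWR's actual planner; a real implementation would use an LLM to produce the path, which here is hard-coded to mirror the example above.

```python
from dataclasses import dataclass, field

@dataclass
class SubGoal:
    description: str
    evidence: list = field(default_factory=list)  # (snippet, source_url) pairs
    done: bool = False

def plan(instruction: str) -> list[SubGoal]:
    """Decompose a research instruction into an executable research path.
    Hard-coded here for illustration; a real planner would query an LLM."""
    steps = [
        "identify competitors",
        "collect parameters",
        "cross-validate",
        "generate comparison table",
    ]
    return [SubGoal(description=s) for s in steps]

path = plan("Compare the open-source strategies of three AI companies")
```

Because every `SubGoal` stores its own evidence list, the final report can link each conclusion back to its original source.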
These mechanisms can be standardized and encapsulated via the Model Context Protocol (MCP). MCP defines the protocols for context passing, state synchronization, and error recovery between agents and tools, allowing different components (like LLMs, crawlers, and databases) to be plug-and-play.
Implementation: The Open Deep Wide Research Architecture
Based on these insights, we developed Open Deep Wide Research (ODWR), an open-source, self-hostable Agentic RAG system with the following features:
1. MCP-Compatible Agent Controller
- The agent controller adheres to the MCP specification, supporting dynamic loading of tools (e.g., Selenium browser, PDF parser, SQL query engine).
- Context is passed as structured JSON, including task status, visited URLs, cited snippets, and confidence scores.
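The context object described above can be sketched as plain JSON. The field names below are illustrative, not a published ODWR schema; the point is that the full task state serializes losslessly between agent and tool.

```python
import json

# Hypothetical shape of the structured context; field names are
# illustrative, not a published schema.
context = {
    "task_status": "in_progress",
    "visited_urls": ["https://example.com/whitepaper.pdf"],
    "cited_snippets": [
        {"text": "...", "source": "https://example.com/whitepaper.pdf"}
    ],
    "confidence": 0.82,
}

payload = json.dumps(context)   # agent side: serialize for transport
restored = json.loads(payload)  # tool side: deserialize and inspect
```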
2. Three-Dimensional Tunable Policies
Users can specify at runtime:
- Depth: Maximum reasoning steps (1–10), controlling logical complexity.
- Width: Number of parallel retrieval sources (5–100+), affecting information coverage.
- Latency Budget: Hard deadline (30s–30min), with automatic fallback on timeout.
Example: A lightweight mode (Depth=2, Width=10, Latency=2min) is suitable for product comparisons; a deep mode (Depth=8, Width=50, Latency=20min) is used for scientific literature reviews.
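The three knobs can be modeled as a small policy object; the sketch below (class and function names are my own, not ODWR's API) shows depth capping the step count and the latency budget triggering an early fallback.

```python
from dataclasses import dataclass
import time

@dataclass
class Policy:
    depth: int        # max reasoning steps (1-10)
    width: int        # parallel retrieval sources
    latency_s: float  # hard deadline in seconds

    def __post_init__(self):
        if not 1 <= self.depth <= 10:
            raise ValueError("depth must be in 1..10")

def run(policy: Policy, step_fn):
    """Run up to policy.depth reasoning steps, falling back if the deadline passes."""
    deadline = time.monotonic() + policy.latency_s
    results = []
    for step in range(policy.depth):
        if time.monotonic() > deadline:
            results.append("fallback: partial report")  # automatic fallback on timeout
            break
        results.append(step_fn(step))
    return results

lightweight = Policy(depth=2, width=10, latency_s=120)   # product comparisons
deep = Policy(depth=8, width=50, latency_s=1200)         # literature reviews
```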
3. Adaptive Retrieval and Self-Correction
- Initial retrieval uses a HyDE + vector + keyword hybrid strategy.
- If a critical sub-task fails (e.g., a company's financial report is not found), it triggers a backtrack-rewrite-retry loop.
- Supports user-uploaded files as "anchor knowledge" to guide the retrieval direction.
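The backtrack-rewrite-retry loop can be sketched as follows. This is a minimal sketch under assumed interfaces: `retrieve` returns documents with a quality score, `rewrite` reformulates the query, and the 0.5 threshold is illustrative.

```python
def research(query, retrieve, rewrite, max_retries=3):
    """Backtrack-rewrite-retry: if retrieval quality is too low, rewrite
    the query and try again.

    retrieve(query) -> (docs, quality_score); rewrite(query, attempt) -> new query.
    Both are assumed callables, not part of a published ODWR API.
    """
    for attempt in range(max_retries):
        docs, score = retrieve(query)
        if score >= 0.5:                  # illustrative quality threshold
            return docs
        query = rewrite(query, attempt)   # backtrack and reformulate
    return []                             # critical sub-task failed after retries
```

For example, a failed search for a company's financials might be rewritten to target its annual report before the loop gives up.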
4. Open Source and Self-Hostable
- Code is hosted on GitHub and supports one-click deployment with Docker.
- Compatible with major LLMs (e.g., GPT-4o, Claude 3.5, DeepSeek-R1) via a unified MCP adapter.
- Output format is Markdown + JSON, facilitating integration with Notion, Obsidian, or internal systems.
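The unified adapter idea can be illustrated with a tiny routing shim: any backend exposing the same `complete` method becomes interchangeable. Class names and the method signature here are my own illustration, not ODWR's actual adapter API.

```python
class MCPAdapter:
    """Routes a normalized completion request to whichever backend is configured."""
    def __init__(self, backend):
        self.backend = backend  # any object with a .complete(prompt) -> str method

    def complete(self, prompt: str) -> str:
        return self.backend.complete(prompt)

class EchoBackend:
    """Stand-in for a real LLM client (GPT-4o, Claude 3.5, DeepSeek-R1, ...)."""
    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"

adapter = MCPAdapter(EchoBackend())
```

Swapping models then means constructing the adapter with a different backend; the agent controller's code is unchanged.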
Comparison with Deep Research
| Dimension | OpenAI Deep Research | Open Deep Wide Research |
|---|---|---|
| Accessibility | Limited to ChatGPT subscribers | Open-source, self-hostable |
| Tool Extension | Closed (OpenAI-provided only) | MCP-compatible, any tool is pluggable |
| Control Granularity | Fixed policy | Tunable via three parameters: Depth/Width/Latency |
| Data Sovereignty | Relies on OpenAI Cloud | Supports private knowledge bases and local execution |
| Output Export | Within ChatGPT only | Supports API, JSON, and Markdown export |
Call to Action: Experience ODWR's Capabilities Now
We have integrated a simplified version of ODWR on the puppyone platform, allowing users to quickly build enterprise-grade Agentic RAG applications:
- Upload technical documents to automatically generate competitive analysis reports.
- Connect to internal databases to enable "natural language queries + supplementary external research."
- Deploy as a customer service bot that automatically cites policy documents and user manuals.
puppyone offers a free trial and a Professional plan for team collaboration and high-concurrency scenarios. Visit https://www.puppyone.ai/ to start your Agentic RAG journey.
FAQ
Q1: Can ODWR replace Deep Research?
Functionally, it can cover over 80% of its use cases and is especially suitable for enterprises that require data privacy, cost control, or custom tools. However, for extremely complex tasks that rely on OpenAI's proprietary models (like o3), performance may be slightly lower.
Q2: Is a programming background required to use it?
Non-technical users can configure task templates through puppyone's graphical interface, while developers can deeply customize agent behavior via the MCP API.
Q3: How can I control costs?
ODWR allows you to set maximum token consumption, tool call limits, and timeout thresholds. It also supports switching to lightweight models (like o4-mini or DeepSeek-Lite) to significantly reduce inference costs.
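Those caps amount to a budget object checked before each model call or tool invocation. A minimal sketch, with assumed names (this is not ODWR's actual configuration API):

```python
class Budget:
    """Hard caps on token consumption and tool calls; raises once a cap is exceeded."""
    def __init__(self, max_tokens: int, max_tool_calls: int):
        self.max_tokens = max_tokens
        self.max_tool_calls = max_tool_calls
        self.tokens = 0
        self.tool_calls = 0

    def charge_tokens(self, n: int):
        self.tokens += n
        if self.tokens > self.max_tokens:
            raise RuntimeError("token budget exceeded")

    def charge_tool_call(self):
        self.tool_calls += 1
        if self.tool_calls > self.max_tool_calls:
            raise RuntimeError("tool-call budget exceeded")
```

The agent catches the exception and degrades gracefully, for example by switching to a lighter model or emitting a partial report, instead of running up an unbounded bill.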