Abstract
In 2025, Retrieval-Augmented Generation (RAG) is undergoing a paradigm shift from "static pipelines" to "autonomous agents". OpenAI's Deep Research demonstrates the potential of this direction, compressing complex research tasks into minutes through multi-step planning, tool use, and dynamic reasoning. However, its closed architecture and fixed strategies struggle to meet enterprise demands for controllability, cost-efficiency, and data sovereignty. This article introduces Open Deep Wide Research (ODWR), an open-source, MCP-compatible Agentic RAG framework that supports runtime policy tuning. It aims to replicate the core capabilities of Deep Research while giving developers fine-grained control over depth, width, and latency.
Problem Background: The Evolutionary Bottlenecks of RAG
Traditional RAG systems use a linear "retrieve → rerank → generate" pipeline, which is suitable for factual question-answering but falls short in the following scenarios:
- Multi-hop reasoning: For example, "Compare the open-source strategies of three AI companies from 2024–2025 and their impact on the developer ecosystem."
- Heterogeneous data fusion: Requires simultaneously parsing web pages, PDF technical whitepapers, and user-uploaded CSV reports.
- Dynamic task adjustment: When initial retrieval results are low-quality, it cannot autonomously correct queries or switch data sources.
OpenAI's Deep Research addresses these issues by introducing an agentic architecture: it decomposes tasks into sub-goals, calls browser and Python tools, adjusts strategies in real-time, and outputs structured reports with citations. This design validates the feasibility of Agentic RAG but also exposes key limitations: black-box models, no custom toolchains, and a lack of resource scheduling interfaces.
Methodology: Distilling Core Mechanisms from Deep Research
We analyzed the public technical descriptions of Deep Research (OpenAI, 2025) and distilled three reusable design principles:
- Hierarchical task planning: Translating user instructions into an executable research path (e.g., "identify competitors → collect parameters → cross-validate → generate comparison table").
- Collaborative tool execution: Integrating web browsers, code interpreters, and file parsers to form a closed loop.
- Evidence-driven output: Each conclusion is linked to its original source, supporting traceability and verification.
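The first and third principles can be sketched together: a planner turns an instruction into an ordered research path, and each sub-goal carries the evidence that supports its conclusion. This is a minimal illustration, not ODWR's actual planner; a real implementation would use an LLM to produce the path, which here is hard-coded to mirror the example above.

```python
from dataclasses import dataclass, field

@dataclass
class SubGoal:
    description: str
    evidence: list = field(default_factory=list)  # (snippet, source_url) pairs
    done: bool = False

def plan(instruction: str) -> list[SubGoal]:
    """Decompose a research instruction into an executable research path.
    Hard-coded here for illustration; a real planner would query an LLM."""
    steps = [
        "identify competitors",
        "collect parameters",
        "cross-validate",
        "generate comparison table",
    ]
    return [SubGoal(description=s) for s in steps]

path = plan("Compare the open-source strategies of three AI companies")
```

Because every `SubGoal` stores its own evidence list, the final report can link each conclusion back to its original source.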
These mechanisms can be standardized and encapsulated via the Model Context Protocol (MCP). MCP defines the protocols for context passing, state synchronization, and error recovery between agents and tools, allowing different components (like LLMs, crawlers, and databases) to be plug-and-play.
Implementation: The Open Deep Wide Research Architecture
Based on these insights, we developed Open Deep Wide Research (ODWR), an open-source, self-hostable Agentic RAG system with the following features:
1. MCP-Compatible Agent Controller
- The agent controller adheres to the MCP specification, supporting dynamic loading of tools (e.g., Selenium browser, PDF parser, SQL query engine).
- Context is passed as structured JSON, including task status, visited URLs, cited snippets, and confidence scores.
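The context object described above can be sketched as plain JSON. The field names below are illustrative, not a published ODWR schema; the point is that the full task state serializes losslessly between agent and tool.

```python
import json

# Hypothetical shape of the structured context; field names are
# illustrative, not a published schema.
context = {
    "task_status": "in_progress",
    "visited_urls": ["https://example.com/whitepaper.pdf"],
    "cited_snippets": [
        {"text": "...", "source": "https://example.com/whitepaper.pdf"}
    ],
    "confidence": 0.82,
}

payload = json.dumps(context)   # agent side: serialize for transport
restored = json.loads(payload)  # tool side: deserialize and inspect
```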
2. Three-Dimensional Tunable Policies
Users can specify at runtime:
- Depth: Maximum reasoning steps (1–10), controlling logical complexity.
- Width: Number of parallel retrieval sources (5–100+), affecting information coverage.
- Latency Budget: Hard deadline (30s–30min), with automatic fallback on timeout.
Example: A lightweight mode (Depth=2, Width=10, Latency=2min) is suitable for product comparisons; a deep mode (Depth=8, Width=50, Latency=20min) is used for scientific literature reviews.
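The three knobs can be modeled as a small policy object; the sketch below (class and function names are my own, not ODWR's API) shows depth capping the step count and the latency budget triggering an early fallback.

```python
from dataclasses import dataclass
import time

@dataclass
class Policy:
    depth: int        # max reasoning steps (1-10)
    width: int        # parallel retrieval sources
    latency_s: float  # hard deadline in seconds

    def __post_init__(self):
        if not 1 <= self.depth <= 10:
            raise ValueError("depth must be in 1..10")

def run(policy: Policy, step_fn):
    """Run up to policy.depth reasoning steps, falling back if the deadline passes."""
    deadline = time.monotonic() + policy.latency_s
    results = []
    for step in range(policy.depth):
        if time.monotonic() > deadline:
            results.append("fallback: partial report")  # automatic fallback on timeout
            break
        results.append(step_fn(step))
    return results

lightweight = Policy(depth=2, width=10, latency_s=120)   # product comparisons
deep = Policy(depth=8, width=50, latency_s=1200)         # literature reviews
```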
3. Adaptive Retrieval and Self-Correction
- Initial retrieval uses a HyDE + vector + keyword hybrid strategy.
- If a critical sub-task fails (e.g., a company's financial report is not found), it triggers a backtrack-rewrite-retry loop.
- Supports user-uploaded files as "anchor knowledge" to guide the retrieval direction.
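The backtrack-rewrite-retry loop can be sketched as follows. This is a minimal sketch under assumed interfaces: `retrieve` returns documents with a quality score, `rewrite` reformulates the query, and the 0.5 threshold is illustrative.

```python
def research(query, retrieve, rewrite, max_retries=3):
    """Backtrack-rewrite-retry: if retrieval quality is too low, rewrite
    the query and try again.

    retrieve(query) -> (docs, quality_score); rewrite(query, attempt) -> new query.
    Both are assumed callables, not part of a published ODWR API.
    """
    for attempt in range(max_retries):
        docs, score = retrieve(query)
        if score >= 0.5:                  # illustrative quality threshold
            return docs
        query = rewrite(query, attempt)   # backtrack and reformulate
    return []                             # critical sub-task failed after retries
```

For example, a failed search for a company's financials might be rewritten to target its annual report before the loop gives up.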
4. Open Source and Self-Hostable
- Code is hosted on GitHub and supports one-click deployment with Docker.
- Compatible with major LLMs (e.g., GPT-4o, Claude 3.5, DeepSeek-R1) via a unified MCP adapter.
- Output format is Markdown + JSON, facilitating integration with Notion, Obsidian, or internal systems.
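The unified adapter idea can be illustrated with a tiny routing shim: any backend exposing the same `complete` method becomes interchangeable. Class names and the method signature here are my own illustration, not ODWR's actual adapter API.

```python
class MCPAdapter:
    """Routes a normalized completion request to whichever backend is configured."""
    def __init__(self, backend):
        self.backend = backend  # any object with a .complete(prompt) -> str method

    def complete(self, prompt: str) -> str:
        return self.backend.complete(prompt)

class EchoBackend:
    """Stand-in for a real LLM client (GPT-4o, Claude 3.5, DeepSeek-R1, ...)."""
    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"

adapter = MCPAdapter(EchoBackend())
```

Swapping models then means constructing the adapter with a different backend; the agent controller's code is unchanged.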
Comparison with Deep Research
| Dimension | OpenAI Deep Research | Open Deep Wide Research |
|---|---|---|
| Accessibility | Limited to ChatGPT subscribers | Open-source, self-hostable |
| Tool Extension | Closed (OpenAI-provided only) | MCP-compatible, any tool is pluggable |
| Control Granularity | Fixed policy | Tunable via three parameters: Depth/Width/Latency |
| Data Sovereignty | Relies on OpenAI Cloud | Supports private knowledge bases and local execution |
| Output Export | Within ChatGPT only | Supports API, JSON, and Markdown export |
Call to Action: Experience ODWR's Capabilities Now
We have integrated a simplified version of ODWR on the puppyone platform, allowing users to quickly build enterprise-grade Agentic RAG applications:
- Upload technical documents to automatically generate competitive analysis reports.
- Connect to internal databases to enable "natural language queries + supplementary external research."
- Deploy as a customer service bot that automatically cites policy documents and user manuals.
puppyone offers a free trial and a Professional plan for team collaboration and high-concurrency scenarios. Visit https://www.puppyone.ai/ to start your Agentic RAG journey.
FAQ
Q1: Can ODWR replace Deep Research?
Functionally, it can cover over 80% of its use cases and is especially suitable for enterprises that require data privacy, cost control, or custom tools. However, for extremely complex tasks that rely on OpenAI's proprietary models (like o3), performance may be slightly lower.
Q2: Is a programming background required to use it?
Non-technical users can configure task templates through puppyone's graphical interface, while developers can deeply customize agent behavior via the MCP API.
Q3: How can I control costs?
ODWR allows you to set maximum token consumption, tool call limits, and timeout thresholds. It also supports switching to lightweight models (like o4-mini or DeepSeek-Lite) to significantly reduce inference costs.
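Those caps amount to a budget object checked before each model call or tool invocation. A minimal sketch, with assumed names (this is not ODWR's actual configuration API):

```python
class Budget:
    """Hard caps on token consumption and tool calls; raises once a cap is exceeded."""
    def __init__(self, max_tokens: int, max_tool_calls: int):
        self.max_tokens = max_tokens
        self.max_tool_calls = max_tool_calls
        self.tokens = 0
        self.tool_calls = 0

    def charge_tokens(self, n: int):
        self.tokens += n
        if self.tokens > self.max_tokens:
            raise RuntimeError("token budget exceeded")

    def charge_tool_call(self):
        self.tool_calls += 1
        if self.tool_calls > self.max_tool_calls:
            raise RuntimeError("tool-call budget exceeded")
```

The agent catches the exception and degrades gracefully, for example by switching to a lighter model or emitting a partial report, instead of running up an unbounded bill.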