Abstract
A novel AI research paradigm automates high-breadth information-gathering tasks (such as horizontal research across hundreds of entities) by assigning a dedicated cloud virtual machine to each user session, within which multiple general-purpose agents execute subtasks in parallel. This architecture relies on a Turing-complete execution environment and a role-agnostic multi-agent collaboration mechanism, offering high flexibility. However, it still faces engineering challenges in latency control, resource scheduling, and cost predictability.
Problem Background
Traditional Retrieval-Augmented Generation (RAG) systems typically follow a linear flow: User Input → Retrieval → Generation. While effective for single-point Q&A, this design is significantly limited when faced with tasks requiring multi-round validation, structured comparison, or exploration across numerous heterogeneous sources (e.g., "Analyze the post-graduation career paths of PhDs from the computer science departments of the world's top 50 universities"). The main bottlenecks include:
- Lack of proactive exploration and task decomposition capabilities in the retrieval phase.
- Inability to dynamically plan or backtrack during the generation phase.
- The overall process is non-interruptible and non-extensible, making it difficult to support long-running tasks.
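The linear flow described above can be sketched in a few lines. This is a deliberately minimal illustration, not any particular system's implementation: the `retrieve` and `generate` functions are stand-ins for a real search index and language model.

```python
# Minimal sketch of the linear RAG flow: one retrieval pass, one generation
# pass, and no loop back. If retrieval misses, there is no second attempt,
# no task decomposition, and no backtracking.

def retrieve(query: str) -> list[str]:
    # Placeholder: a real system would query a vector index or search API.
    corpus = {"capital of france": ["Paris is the capital of France."]}
    return corpus.get(query.lower(), [])

def generate(query: str, passages: list[str]) -> str:
    # Placeholder: a real system would call an LLM with query + passages.
    return passages[0] if passages else "No answer found."

def linear_rag(query: str) -> str:
    # User Input -> Retrieval -> Generation, with nothing in between.
    return generate(query, retrieve(query))

print(linear_rag("capital of France"))  # Paris is the capital of France.
```

The single straight-through call chain is exactly what makes multi-round validation or structured comparison awkward to express in this design.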
To overcome these limitations, the new generation of systems models large-scale research tasks as a distributed agent collaboration problem.
Method Overview
The core design is to assign a dedicated cloud virtual machine (VM) to each user session. This VM provides a full operating system, network access, and an execution environment, forming a Turing-complete sandbox. Within this sandbox, the system dynamically launches multiple sub-agents. Each is a fully functional, general-purpose instance (rather than having a predefined role like "Researcher" or "Validator") with the following capabilities:
- Independently initiate HTTP requests or call external APIs.
- Execute scripts to parse unstructured data from web pages, PDFs, tables, etc.
- Call built-in toolchains (e.g., headless browsers, document extractors).
- Exchange intermediate results with other sub-agents.
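The role-agnostic design can be sketched as follows: every sub-agent carries the same toolset and the subtask itself names which tools to apply, rather than the agent being typed as a "Researcher" or "Validator". All names here are hypothetical, and the fetch/parse bodies are placeholders for real HTTP and parsing logic.

```python
# Sketch of a general-purpose sub-agent: identical capabilities per instance,
# behavior determined entirely by the subtask it receives.
import json
import queue

class SubAgent:
    def __init__(self, bus: queue.Queue):
        self.bus = bus  # shared channel for exchanging intermediate results
        # Every agent gets the full, identical toolset.
        self.tools = {"fetch": self.fetch, "parse": self.parse}

    def fetch(self, url: str) -> str:
        # Placeholder for an HTTP request or external API call.
        return f'{{"url": "{url}", "body": "raw page"}}'

    def parse(self, raw: str) -> dict:
        # Placeholder for script-based parsing of web pages, PDFs, tables.
        return json.loads(raw)

    def run(self, subtask: dict) -> dict:
        # The subtask names the tools to apply, in order; the agent has no
        # predefined role of its own.
        data = subtask["input"]
        for tool_name in subtask["tools"]:
            data = self.tools[tool_name](data)
        self.bus.put(data)  # share the intermediate result with peers
        return data

bus = queue.Queue()
agent = SubAgent(bus)
result = agent.run({"input": "https://example.com", "tools": ["fetch", "parse"]})
print(result["url"])  # https://example.com
```

Because the pipeline lives in the subtask rather than the agent class, the controller can reshape the work arbitrarily without defining new agent types.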
Task decomposition is dynamically generated by a main controller. For example, to "research the generative AI tool ecosystem," the system might automatically break it down into:
- Obtain a list of tools from multiple platforms (GitHub, Product Hunt, official aggregator pages).
- For each tool, concurrently scrape documentation, version history, and user reviews.
- Extract key metrics (e.g., open-source status, API support, pricing model).
- Align entities and output a structured comparison table.
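A decomposition like the one above might be represented as a flat list of subtasks with explicit dependencies, which the controller can then schedule. The structure below is a hypothetical sketch; field names and the fan-out shape are illustrative.

```python
# Sketch of dynamic task decomposition: a research goal expands into a
# fan-out/fan-in plan (discover -> collect per entity -> align) rather
# than a fixed template.

def decompose(goal: str, platforms: list[str], tools: list[str]) -> list[dict]:
    plan = []
    # Phase 1: one discovery subtask per listing platform.
    for platform in platforms:
        plan.append({"phase": "discover", "target": platform, "deps": []})
    # Phase 2: one collection subtask per discovered tool (the fan-out),
    # each depending on the discovery phase.
    for tool in tools:
        plan.append({"phase": "collect", "target": tool, "deps": list(platforms)})
    # Phase 3: extraction/alignment depends on every collection subtask.
    plan.append({"phase": "align", "target": goal, "deps": list(tools)})
    return plan

plan = decompose("generative AI tool ecosystem",
                 ["GitHub", "Product Hunt"], ["tool-a", "tool-b"])
print(len(plan))  # 2 discovery + 2 collection + 1 alignment = 5 subtasks
```

In a real system the `tools` list would itself be produced by the discovery subtasks, so the plan grows incrementally as results arrive.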
Since all sub-agents share the same execution environment and possess general-purpose capabilities, the task logic is not constrained by predefined roles, significantly enhancing generalization.
Key Technical Details
1. Virtual Machines as Execution Units
- Each session has exclusive use of a lightweight Linux VM (possibly based on micro-virtualization technology like Firecracker).
- Pre-installed with common runtimes (Python, Node.js), parsing libraries (BeautifulSoup, PyPDF2), and browser automation tools.
- Network egress is rotated through a proxy pool to reduce the risk of being blocked by anti-scraping measures.
- All operations are performed in an isolated environment, ensuring security and data boundaries.
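The proxy rotation mentioned above can be as simple as a round-robin over a pool of egress addresses. This is an illustrative sketch only; the addresses are placeholders, and a real client would pass the chosen proxy to its HTTP library.

```python
# Sketch of egress rotation through a proxy pool: each outbound request
# picks the next proxy round-robin, spreading traffic across exit IPs to
# reduce the chance of any single one being rate-limited or blocked.
import itertools

PROXY_POOL = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
_next_proxy = itertools.cycle(PROXY_POOL)

def fetch_via_proxy(url: str) -> dict:
    proxy = next(_next_proxy)
    # A real implementation would route the request through `proxy`, e.g.
    # requests.get(url, proxies={"https": f"http://{proxy}"}).
    return {"url": url, "proxy": proxy}

reqs = [fetch_via_proxy(f"https://example.com/page/{i}") for i in range(4)]
print([r["proxy"] for r in reqs])  # cycles back to the first proxy on the 4th
```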
2. Multi-Agent Communication and Scheduling
- Sub-agents exchange data via shared memory or a lightweight message broker (like Redis Pub/Sub).
- Intermediate results are persisted in a structured format (e.g., JSON or JSON-LD) to facilitate subsequent aggregation and validation.
- The main controller maintains a task dependency graph (DAG), supporting dynamic scheduling, failure retries, and result caching.
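The controller's DAG scheduling with retries and caching can be sketched in miniature. This is a simplified, sequential stand-in for what would really be an asynchronous scheduler; the structure and names are hypothetical.

```python
# Minimal sketch of dependency-graph scheduling: a task runs only once all
# of its dependencies have finished; transient failures are retried; results
# are cached so a task never recomputes.

def run_dag(tasks: dict, deps: dict, max_retries: int = 2) -> dict:
    cache: dict = {}
    done: set = set()
    while len(done) < len(tasks):
        # Pick every task whose dependencies are all finished.
        ready = [t for t in tasks if t not in done
                 and all(d in done for d in deps.get(t, []))]
        for name in ready:
            for attempt in range(max_retries + 1):
                try:
                    if name not in cache:          # cached results are reused
                        cache[name] = tasks[name]()
                    break
                except Exception:
                    if attempt == max_retries:     # retries exhausted
                        raise
            done.add(name)
    return cache

calls = {"n": 0}
def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient fetch error")  # fails once, then succeeds
    return "fetched"

results = run_dag({"fetch": flaky_fetch, "parse": lambda: "parsed"},
                  deps={"parse": ["fetch"]})
print(results)  # {'fetch': 'fetched', 'parse': 'parsed'}
```

A production scheduler would additionally run ready tasks concurrently and detect unsatisfiable dependencies, both omitted here for brevity.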
3. Data Processing Pipeline
Take "Fortune 500 company analysis" as an example:
- Discovery Phase: Call search engines or public databases to get a list of companies.
- Collection Phase: Each sub-agent is responsible for several companies, scraping official websites, annual report PDFs, and press releases.
- Parsing Phase: Use rule-based matching, OCR, or multimodal models to extract key fields (e.g., revenue, employee count, CEO).
- Alignment Phase: Perform entity resolution based on a unified identifier (like a stock ticker) to build a standardized knowledge table.
This process is highly I/O-intensive, placing high demands on the VM's concurrent processing capabilities and network bandwidth.
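The alignment phase in particular reduces to a merge on a shared identifier. The sketch below uses invented records and a `ticker` key to show the shape of the operation; a real pipeline would also resolve conflicts between sources rather than letting the last one win.

```python
# Sketch of the alignment phase: records scraped from different sources
# are merged into one row per entity, keyed on the stock ticker.

def align(records: list[dict], key: str = "ticker") -> dict:
    table: dict = {}
    for rec in records:
        # Merge fields from every source that mentions the same entity.
        table.setdefault(rec[key], {}).update(rec)
    return table

scraped = [
    {"ticker": "WMT", "name": "Walmart", "source": "annual_report_pdf"},
    {"ticker": "WMT", "revenue_usd_bn": 648, "source": "press_release"},
    {"ticker": "AMZN", "name": "Amazon", "source": "official_site"},
]
table = align(scraped)
print(sorted(table))         # ['AMZN', 'WMT']
print(table["WMT"]["name"])  # Walmart
```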
Limitations and Scalability Challenges
Current Limitations
- Uncontrollable Response Time: Task completion time is determined by the slowest subtask, with no mechanism for timeouts, circuit breaking, or returning partial results.
- Non-Transparent Resource Costs: No resource consumption model is provided based on task scale, making it difficult for users to predict expenses.
- Single-Node Scaling Bottleneck: All sub-agents run on the same VM, and contention for CPU/memory can lead to performance jitter.
- Strong Dependency on the Public Internet: Cannot directly access private knowledge bases or internal data sources.
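The first limitation above (completion time pinned to the slowest subtask) has a well-known remedy: bounded waits that return partial results at a deadline. The sketch below uses Python's standard `concurrent.futures` to illustrate the idea; it is a proposed pattern, not part of the architecture as described.

```python
# Sketch of deadline-bounded fan-out: collect whatever subtasks finished by
# the deadline and stop waiting for stragglers, instead of letting the
# slowest subtask determine the overall response time.
import concurrent.futures
import time

def run_with_deadline(subtasks: dict, deadline_s: float) -> dict:
    pool = concurrent.futures.ThreadPoolExecutor()
    futures = {pool.submit(fn): name for name, fn in subtasks.items()}
    done, _not_done = concurrent.futures.wait(futures, timeout=deadline_s)
    partial = {futures[f]: f.result() for f in done}
    # Don't block on stragglers; cancel anything still queued (Python 3.9+).
    pool.shutdown(wait=False, cancel_futures=True)
    return partial

def fast():
    return "ok"

def slow():
    time.sleep(2)       # simulates the slowest subtask
    return "too late"

partial = run_with_deadline({"fast": fast, "slow": slow}, deadline_s=0.5)
print(partial)  # {'fast': 'ok'}: the slow subtask is dropped, not waited for
```

A circuit-breaker variant would additionally stop dispatching new subtasks once the failure or timeout rate crosses a threshold.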
Large-Scale Deployment Challenges
- Cold-Start Latency: VM creation and initialization typically take anywhere from a few seconds to tens of seconds, affecting user experience.
- Concurrent Scheduling Overhead: When a large number of subtasks run simultaneously, process management and communication can become bottlenecks.
- Storage Costs: If intermediate results are not cleaned up promptly, a large amount of temporary data will accumulate.
- Security and Compliance: A sandbox that dynamically executes arbitrary code requires strict auditing, especially in enterprise environments.
Improvement Directions
- Introduce depth-breadth control parameters: Allow users to explicitly limit the maximum parallelism (breadth) and number of reasoning steps (depth).
- Adopt a layered execution strategy: Prioritize high-value subtasks, while low-priority tasks can be downgraded or skipped.
- Support hybrid data source access: Combine public web scraping with private vector database retrieval.
- Provide a cost estimation API: Predict resource consumption for the current configuration based on historical task statistics.
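The depth-breadth controls proposed above could take roughly this shape: breadth caps concurrent expansions with a semaphore, while depth caps the number of recursive reasoning steps. The class and parameter names are hypothetical, and the sequential traversal here is only a stand-in for parallel exploration.

```python
# Sketch of user-visible depth/breadth limits on an exploration task:
# max_breadth bounds how many expansions may be in flight at once,
# max_depth bounds how many levels the task may recurse.
import threading

class BoundedExplorer:
    def __init__(self, max_breadth: int, max_depth: int):
        self.breadth = threading.Semaphore(max_breadth)  # parallelism cap
        self.max_depth = max_depth                       # reasoning-step cap

    def explore(self, node: str, expand, depth: int = 0) -> list[str]:
        if depth >= self.max_depth:
            return [node]            # depth budget exhausted: stop expanding
        with self.breadth:           # at most max_breadth expansions at once
            children = expand(node)
        found = [node]
        for child in children:
            found += self.explore(child, expand, depth + 1)
        return found

# Toy expansion: every node fans out into two children.
expand = lambda n: [n + ".a", n + ".b"]
ex = BoundedExplorer(max_breadth=4, max_depth=2)
nodes = ex.explore("root", expand)
print(len(nodes))  # 1 + 2 + 4 = 7 nodes: depth limit stops the third level
```

Exposing these two numbers to the user turns cost from an open-ended function of the task into something bounded by roughly breadth^depth subtasks.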
If you are looking for a production-ready, self-hostable Agentic RAG solution with fine-grained control, puppyone offers an out-of-the-box implementation path. Built on the MCP protocol, puppyone supports dynamic adjustment of depth and breadth, multi-model backend switching, and seamless integration with private knowledge bases, making it suitable for a variety of scenarios from customer service Q&A to enterprise-level intelligent analysis. Visit https://www.puppyone.ai/ to learn how to deploy your own controllable research agent in minutes.
FAQ
Q1: What is the fundamental difference between this architecture and traditional multi-agent systems?
A: Traditional systems rely on predefined roles (e.g., "Planner," "Executor"), whereas in this architecture, all sub-agents are general-purpose instances that can autonomously decide their course of action. This makes the task structure more flexible and enhances generalization capabilities.
Q2: Can a similar system be deployed on-premises or in a private cloud?
A: Yes, but you would need to handle virtualization scheduling, network proxying, sandbox security, and task coordination yourself. A lightweight alternative is to use containers (like Docker) instead of full VMs and implement agent communication via a message queue.
Q3: What are the main performance bottlenecks in high-concurrency scenarios?
A: The main bottlenecks include VM cold-start latency, the throughput of the subtask scheduler, and the serialization overhead of inter-agent communication. Optimization techniques include using a pre-warmed pool, asynchronous task queues, and caching/reusing intermediate results.
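The pre-warmed pool mentioned in the answer above is conceptually a buffer of VMs provisioned ahead of demand, so a new session acquires one instantly instead of paying the cold-start cost. The sketch below fakes provisioning with a counter; `provision_vm` stands in for real VM creation, which takes seconds.

```python
# Sketch of a pre-warmed VM pool: boot instances ahead of time, hand them
# out instantly, and fall back to a cold start only if the pool drains.
import itertools
import queue

_ids = itertools.count(1)

def provision_vm() -> str:
    # Stand-in for slow VM creation + image boot (seconds in reality).
    return f"vm-{next(_ids)}"

class WarmPool:
    def __init__(self, size: int):
        self.pool: queue.Queue = queue.Queue()
        for _ in range(size):                # warm up ahead of demand
            self.pool.put(provision_vm())

    def acquire(self) -> str:
        try:
            vm = self.pool.get_nowait()      # instant: already booted
        except queue.Empty:
            vm = provision_vm()              # pool drained: cold start
        # A real pool would refill asynchronously here to stay warm.
        return vm

pool = WarmPool(size=2)
print(pool.acquire())  # vm-1, served instantly from the warm pool
```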