Puppyone Sandbox Comparison: Docker Sandbox vs Cloudflare Sandbox — Which One Is Right for Your AI Agent?

April 23, 2026 · Guanqun @puppyone

[Image: Docker container stack versus Cloudflare edge network in a split-screen comparison]

When you run AI agents in production, isolation isn't optional. Agents execute code, touch the filesystem, and call external services—and without a proper sandbox, a single bad output can do damage that's hard to walk back.

Puppyone supports two distinct sandbox environments for different workloads: a Docker-based sandbox for full-featured, flexible execution, and a Cloudflare-based sandbox for fast, globally-distributed, lightweight workloads. This post breaks down what each one actually is, where they differ, and how to pick the right fit.

What Is a Sandbox?

A sandbox is an isolated execution environment where everything an agent does—running code, touching files, calling tools—stays contained and controlled. A good sandbox enforces clear boundaries:

  • What the agent can read and write
  • What network calls it can make
  • How much CPU and memory it can consume
  • How long it can run before being terminated

Both sandbox types enforce these boundaries. The interesting question is how they enforce them, and what trade-offs come with each approach.

Docker Sandbox vs Cloudflare Sandbox: The Core Difference

The whole comparison really boils down to one sentence: a Docker sandbox gives you a full Linux machine; a Cloudflare sandbox gives you a slot to run JavaScript.

The Docker sandbox is built on containers. Each sandbox is effectively its own Linux server—you can install packages (apt install, pip install), run any runtime (Python, Node, Go), spawn subprocesses, read and write a real filesystem, and let tasks run for hours. The cost is a multi-second cold start, and every sandbox burns real CPU and memory while it exists.

The Cloudflare sandbox is built on V8 Isolates—the same isolation model Chrome uses to keep browser tabs apart. It can only run JavaScript / WebAssembly: no package installs, no subprocesses, no local filesystem, and a single execution typically caps at 30 seconds of CPU time. In return, it cold-starts in under 5 milliseconds, packs thousands of isolates onto a single machine, and runs across Cloudflare's 300+ global edge locations automatically.

| What you want to do | Pick |
| --- | --- |
| Run Python scripts, install libraries, process data | Docker |
| Let an agent edit files, generate reports, run git | Docker |
| Tasks running for minutes to hours | Docker |
| Respond to a request in milliseconds | Cloudflare |
| High-concurrency JavaScript logic (routing, transforming, filtering) | Cloudflare |
| Execute close to users in multiple regions | Cloudflare |

The rest of this post unpacks each one.

The Docker-Based Sandbox

How It Works

Every agent task runs inside its own container—a Linux environment isolated from the host with namespaces and cgroups. When a task arrives, Puppyone spins up a fresh container from a pre-built image, mounts inputs, allocates a working directory, and applies resource quotas. When the task finishes, the entire container is torn down. Each execution starts from a known clean state—no leftover processes, files, or environment variables from the previous run.

What You Get

Customizable images. The default base image is Ubuntu with Python, Node, Git, and curl preinstalled. If that's not enough, bring your own Dockerfile—Puppyone supports the standard instructions (FROM, RUN, COPY, WORKDIR, ENV)—and bake in whatever runtimes, CLI tools, or private dependencies your team needs. Your agent boots straight into an environment shaped for your workflow.
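
As a rough sketch, a custom image might look like the following; the specific packages and paths are illustrative placeholders, not requirements:

```dockerfile
# Illustrative custom image for a Puppyone Docker sandbox.
# The packages below are examples; bake in whatever your agent actually needs.
FROM ubuntu:22.04

# System tools the agent shells out to
RUN apt-get update && apt-get install -y --no-install-recommends \
      python3 python3-pip git curl pandoc \
    && rm -rf /var/lib/apt/lists/*

# Project-specific Python dependencies, baked in at build time
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt

# Working directory and environment the agent boots into
WORKDIR /workspace
ENV PYTHONUNBUFFERED=1
```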

Configurable resource specs. Defaults are 2 vCPUs and 512 MB of RAM, scalable per task up to 8 vCPUs and 8 GB. If a sandbox runs out of memory, the container is killed cleanly—it can't take down the host.

Controlled task duration. Default timeout is 5 minutes per task, configurable up to 24 hours. For longer-running workflows, the sandbox supports pause / resume—the entire filesystem and memory state are snapshotted on pause and restored on resume, so you don't pay to re-bootstrap an environment from scratch.

Fine-grained network egress control. All outbound traffic is blocked by default. You explicitly allowlist domains or ports the agent is permitted to reach (e.g. api.openai.com, *.github.com)—everything else is denied. This is how you stop a prompt-injected agent from quietly exfiltrating data.
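
The exact configuration surface isn't shown here, but the shape of the policy is simple: deny by default, allow only what you name. Purely as a hypothetical illustration:

```ts
// Hypothetical egress policy shape, NOT the actual Puppyone configuration API.
// The point is the model: default deny, plus an explicit allowlist of domains and ports.
const egressPolicy = {
  defaultAction: 'deny',
  allow: [
    { domain: 'api.openai.com', ports: [443] },
    { domain: '*.github.com', ports: [443] },
  ],
};
```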

A workspace wired into your filesystem. Each sandbox mounts a namespace inside the Puppyone filesystem—not an ephemeral scratch directory. When the agent writes a file inside the container, it's effectively committing a change to Puppyone: versioned, permissioned, and auditable. You can roll it back, hand it off to another agent, or trace it back to the exact task and step that produced it.

Billed by container lifetime. You're charged for every second the container exists, whether or not the CPU is doing anything. This rewards a "run-and-tear-down" pattern—finish the task, kill the container. If your agent spends most of its time waiting on an LLM response or human input, you're paying for that idle time too.

When It's Not the Right Fit

  • Sub-100ms response paths (cold starts are seconds, not milliseconds)
  • Thousands of independent requests per second (every container costs real resources)
  • A few lines of lightweight JavaScript logic (overkill)

When It Is

  • Running Python / Node / shell scripts that touch files, hit libraries, or generate artifacts
  • Multi-step workflows where each step builds on the previous step's workspace
  • Anything that needs to install packages, compile code, or call system utilities
  • In short: anything that genuinely needs a machine

How Puppyone Supports It

Puppyone's Docker sandbox is API-compatible with E2B—if your agent already runs on E2B, you can point it at Puppyone with no business logic changes. Sandbox.create(), runCode(), commands.run(), file I/O—they all map over directly. What you gain is that the sandbox and filesystem are a single thing. You no longer have to manage the awkward chain of "files in the sandbox → persist to object storage → sync to the next sandbox." Every write lands in the Puppyone filesystem with version history, permissions, and audit logs—the same state, visible across tasks, agents, and even sandbox types.
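
A minimal sketch of the run-and-tear-down pattern using those E2B-style calls (the import shown is the standard E2B SDK; pointing it at Puppyone, e.g. an endpoint or API key setting, is configuration not shown here):

```ts
import { Sandbox } from '@e2b/code-interpreter'

// Create a fresh container-backed sandbox with a 5-minute timeout (the default described above).
const sbx = await Sandbox.create({ timeoutMs: 5 * 60 * 1000 })

// Install a package and run Python inside the container.
await sbx.commands.run('pip install pandas')
const run = await sbx.runCode('import pandas as pd; print(pd.__version__)')
console.log(run.logs.stdout)

// File writes land in the Puppyone filesystem: versioned, permissioned, auditable.
await sbx.files.write('/home/user/report.md', '# Findings\n')

// Tear the container down explicitly; billing for its lifetime stops here.
await sbx.kill()
```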


The Cloudflare-Based Sandbox

How It Works

Agent code runs inside a V8 Isolate on Cloudflare's global edge network—the same technology Chrome uses to isolate browser tabs from each other. Each isolate is a sealed JavaScript execution context that starts in milliseconds (no operating system to boot) and is destroyed when the request finishes. Requests are routed to the edge location closest to the user, keeping round-trip latency to tens of milliseconds.
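
In practice, the unit of execution is a single exported fetch handler. A minimal sketch, using the standard Cloudflare Workers API (nothing Puppyone-specific):

```ts
// A complete Worker: one fetch handler per request, no filesystem, no subprocesses.
// The isolate is created on demand and discarded once the Response is returned.
export default {
  async fetch(request: Request): Promise<Response> {
    const { pathname } = new URL(request.url);
    return Response.json({ route: pathname, handledAt: new Date().toISOString() });
  },
};
```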

What You Get

Millisecond cold starts. A single isolate spins up in under 5 ms—roughly three orders of magnitude faster than a container. You can afford to start a fresh sandbox for every request without warm pools or long-lived instances.

JavaScript / WebAssembly only. No Python, no shell, no native binaries. Your logic is either JS/TS or compiles to WASM. Most Node.js built-ins are available, but anything that depends on system calls (fs, child_process, native C extensions) is not.

128 MB memory ceiling, 30 seconds of CPU per request. Each request gets at most 128 MB of memory and 30 seconds of CPU time (configurable up to 5 minutes on paid plans). Worth knowing: only active CPU time counts—time spent waiting on fetch() or a database query doesn't, so a request bound by 30 seconds of CPU can stay open for several wall-clock minutes.

No local filesystem. Isolates are stateless. Anything that needs to persist has to go to external storage—Cloudflare KV, R2, D1, or your own database. To share intermediate state across requests, you serialize it out and read it back, or front it with a separate stateful service.
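
For a sense of what "serialize it out" looks like, here is a sketch using standard Workers bindings; the binding names (STATE, ARTIFACTS, DB) are assumptions you would declare in wrangler.toml, not anything Puppyone-specific:

```ts
// Assumed bindings, declared in wrangler.toml: STATE (KV), ARTIFACTS (R2), DB (D1).
interface Env {
  STATE: KVNamespace;
  ARTIFACTS: R2Bucket;
  DB: D1Database;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const taskId = new URL(request.url).searchParams.get('task') ?? 'demo';

    // Persist intermediate agent state outside the isolate.
    await env.STATE.put(`task:${taskId}:state`, JSON.stringify({ step: 1 }));

    // Store a larger artifact in object storage.
    await env.ARTIFACTS.put(`outputs/${taskId}.json`, JSON.stringify({ ok: true }));

    // Read something back from a relational store.
    const row = await env.DB.prepare('SELECT ? AS task_id').bind(taskId).first();

    return Response.json({ taskId, row });
  },
};
```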

Global edge execution + autoscaling. Your code is automatically deployed across Cloudflare's 300+ edge locations and runs at whichever one is closest to the user. From zero to tens of thousands of QPS, you don't plan capacity—the platform handles it.

Cloudflare's security layer comes free. DDoS protection, rate limiting, bot management, and TLS termination are baked into the network. You don't have to stand up a separate layer for any of that.

Billed only for active CPU milliseconds. When an isolate is waiting on fetch(), a database, or a user event, you're not paying. You're billed only for the milliseconds V8 is actually executing code. For I/O-bound workloads—like agents that mostly call LLMs and aggregate API responses—the cost of the wait time is essentially free.

When It's Not the Right Fit

  • Tasks that need to install packages, run Python / shell, or execute native binaries
  • Single-request workloads needing more than a few minutes of CPU
  • Workflows that read or write local files, or maintain state across requests inside the sandbox
  • Memory-heavy tasks that need more than 128 MB

When It Is

  • High-concurrency, short-duration requests (parsing, transforming, filtering, routing)
  • Acting as the agent's "front door"—deciding fast, extracting parameters, picking which tool to call next
  • Globally-distributed, low-latency services (smart API gateways, edge RAG)
  • Glue code that wires Cloudflare primitives (KV, R2, D1, Queues) together

How Puppyone Supports It

The Cloudflare sandbox is currently in development and not yet publicly available. We're working on wiring it into the Puppyone filesystem the same way the Docker sandbox is: when JS code in an isolate "writes a file," it commits a versioned, permissioned, auditable change to Puppyone—and shares state with the Docker sandbox. The same agent will be able to run heavy compute in a Docker sandbox and respond at the edge in a Cloudflare sandbox, looking at the same workspace from both ends.

If your use case needs edge execution, subscribe for updates—we'll let you know the moment it's available.


Side-by-Side

A condensed view of everything covered above, in one table you can scan.

| Dimension | Docker Sandbox | Cloudflare Sandbox |
| --- | --- | --- |
| Underlying tech | Linux containers (namespaces + cgroups) | V8 Isolates |
| Cold start | 2-10 seconds | < 5 milliseconds |
| Runtime | Python, Node, Go, shell—any Linux binary | JavaScript / TypeScript / WebAssembly |
| Install packages? | ✅ apt, pip, npm, custom Dockerfile | ❌ Must be bundled ahead of time |
| Spawn subprocesses? | ✅ | ❌ |
| Local filesystem | ✅ Full POSIX, mounted to Puppyone | ❌ External storage only (KV / R2 / D1) |
| Default resources | 2 vCPU + 512 MB | 128 MB memory |
| Resource ceiling | 8 vCPU + 8 GB (configurable) | 128 MB memory (fixed) |
| Per-execution duration | 5 min default, up to 24 hours | 30 s CPU default, up to 5 min (paid) |
| State preservation | Pause / resume snapshots (memory + filesystem) | None (each request gets a fresh isolate) |
| Concurrency model | One container per sandbox (real resource cost) | Thousands of isolates per machine |
| Geographic distribution | Single region | 300+ edge locations, automatic routing |
| Network egress control | Domain/port allowlist | Configured via Workers |
| Platform security layer | Bring your own | DDoS / rate limiting / bot / TLS built in |
| Billing granularity | Per second of container lifetime (incl. wait time) | Per millisecond of active CPU (waits don't count) |
| Process model | Run-and-tear-down | On-demand, no maintenance |
| API compatibility | Compatible with E2B | Cloudflare Workers API |
| Puppyone status | ✅ Available | 🚧 In development, not yet open |
| Typical use cases | 5-min data processing, Python with packages, multi-step workflows | Millisecond request routing, high-concurrency JS logic, edge APIs |
| A 10-minute agent—which one? | Docker, if it's actually computing | Cloudflare, if it's spending 90% of that time waiting on LLMs/APIs |

How to Choose

Choosing a sandbox isn't about which one is "better"—it's about what your workload actually looks like. The five dimensions below cover almost every real decision.

1. Runtime Language

If your agent needs to run Python, shell, Go, or native binaries (think ffmpeg, pandoc, git), it has to be Docker. Cloudflare can only run JavaScript / TypeScript / WebAssembly; you can't even pip install.

If your business logic is already JS / TS and doesn't reach into system calls—either could work. Keep reading.

2. Task Duration

| Task duration | Pick | Why |
| --- | --- | --- |
| < 30 s of CPU | Cloudflare | Millisecond startup + per-ms billing—cheapest by far |
| 30 s - 5 min of CPU | Cloudflare (paid plans extend to 5 min) | Still inside the limit |
| Several minutes to hours | Docker | Cloudflare hits its CPU ceiling |
| Multi-day workflows | Docker + pause / resume snapshots | Freeze state and pick it up later |

3. Cost Structure (the dimension people get wrong most often)

The two billing models are genuinely different—the same workload can cost an order of magnitude more if you pick wrong.

  • Docker bills for the time the container exists. The meter starts when the container boots, regardless of whether the CPU is doing anything. This rewards short, dense work: start the task, finish it, kill the container. If you let an idle container sit around waiting on user input or an LLM response, you're paying for that wait time.
  • Cloudflare bills for active CPU milliseconds only. When the isolate is waiting on fetch(), a database, or a streaming LLM response, you're not paying. You're charged only for the milliseconds V8 is actively executing code.

In concrete terms:

  • A 5-minute data processing task (CPU pegged the whole time) → Docker is cheaper
  • 10,000 users per second, each triggering 50ms of logic → Cloudflare is cheaper by orders of magnitude
  • A 30-minute agent that spends 90% of its time waiting on LLM streams → Cloudflare is dramatically cheaper; Docker bills the full 30 minutes

Don't compare hourly rates. Compare what fraction of your workload's wall-clock time is actually spent computing. That ratio decides which model wins.
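
A back-of-envelope version of the 30-minute example above, leaving unit prices out and comparing only what each model actually meters:

```ts
// A 30-minute agent run that spends 10% of wall-clock time computing and 90% waiting.
const wallClockSeconds = 30 * 60;   // 1,800 s total
const computeFraction = 0.10;

// Docker meters container lifetime, including all the waiting.
const dockerBilledSeconds = wallClockSeconds;                          // 1,800 s

// Cloudflare meters active CPU only; time spent waiting on fetch()/LLM streams is free.
const cloudflareBilledCpuSeconds = wallClockSeconds * computeFraction; // 180 s

console.log(dockerBilledSeconds / cloudflareBilledCpuSeconds);         // 10x less billable time, before unit prices even enter
```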

4. Concurrency

  • A few to a few dozen requests per second → either works
  • Hundreds per second of short-lived tasks → Cloudflare (thousands of isolates per machine; Docker would need a container per concurrent request)
  • A few requests per second, but each is a multi-minute compute → Docker (Cloudflare can't hold that long)

5. Data Volume and State

  • Tasks that read or write lots of files, produce GB-scale artifacts, or maintain a multi-step workspace → Docker (real filesystem, mounted to Puppyone, automatically versioned)
  • Tasks that just hit a few APIs, do light transformations, and write back to external storage → Cloudflare (no local filesystem, but reading KV / R2 / D1 is fast)
  • More than 128 MB of memory → Docker (a single Cloudflare isolate caps at 128 MB)

Still unsure? Default to Docker. Its capabilities are a strict superset of Cloudflare's—you almost can't pick wrong; at worst you pay more in extreme high-concurrency or I/O-heavy scenarios. When traffic actually grows enough to matter, you can move the hot paths to Cloudflare then.

The ideal architecture, once Cloudflare sandboxes are open: the two work together. Cloudflare sandboxes act as the agent's front door—handling millisecond-scale routing, parameter extraction, and lightweight decisions. When the workload genuinely needs heavy compute, it gets dispatched into a Docker sandbox. Cloudflare handles the fast path; Docker handles the heavy work; the same Puppyone filesystem ties both ends together.
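
A sketch of that split, with the dispatch URL as a placeholder (how a Docker sandbox task actually gets triggered will depend on your setup):

```ts
// Edge front door: answer lightweight intents immediately, hand heavy work off.
export default {
  async fetch(request: Request): Promise<Response> {
    const payload = (await request.json()) as { intent: string; args?: unknown };

    // Fast path: routing, parameter extraction, quick decisions stay at the edge.
    if (payload.intent === 'route' || payload.intent === 'status') {
      return Response.json({ handled: 'edge', intent: payload.intent });
    }

    // Heavy path: forward to an endpoint backed by a Docker sandbox (placeholder URL).
    return fetch('https://example.com/dispatch-heavy-task', {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify(payload),
    });
  },
};
```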


Getting Started With Sandboxes on Puppyone

If your agent is already on E2B, you can just point it at Puppyone—Sandbox.create(), runCode(), commands.run() are all API-compatible, no business logic changes required, and you immediately get filesystem versioning, permissions, and audit logs.

If you're starting from scratch, the Puppyone documentation has a quick-start guide that gets you a versioned-filesystem sandbox in a few lines of code.

Need the Cloudflare edge sandbox? Subscribe for updates—we'll let you know the moment it's open.