
When you run AI agents in production, isolation isn't optional. Agents execute code, touch the filesystem, and call external services—and without a proper sandbox, a single bad output can do damage that's hard to walk back.
Puppyone supports two distinct sandbox environments for different workloads: a Docker-based sandbox for full-featured, flexible execution, and a Cloudflare-based sandbox for fast, globally-distributed, lightweight workloads. This post breaks down what each one actually is, where they differ, and how to pick the right fit.
A sandbox is an isolated execution environment where everything an agent does—running code, touching files, calling tools—stays contained and controlled. A good sandbox enforces clear boundaries: which code the agent can run, which files it can read and write, which network destinations it can reach, and how much CPU, memory, and time it can consume.
Both sandbox types enforce these boundaries. The interesting question is how they enforce them, and what trade-offs come with each approach.
The whole comparison really boils down to one sentence: a Docker sandbox gives you a full Linux machine; a Cloudflare sandbox gives you a slot to run JavaScript.
The Docker sandbox is built on containers. Each sandbox is effectively its own Linux server—you can install packages (apt install, pip install), run any runtime (Python, Node, Go), spawn subprocesses, read and write a real filesystem, and let tasks run for hours. The cost is a multi-second cold start, and every sandbox burns real CPU and memory while it exists.
The Cloudflare sandbox is built on V8 Isolates—the same isolation model Chrome uses to keep browser tabs apart. It can only run JavaScript / WebAssembly: no package installs, no subprocesses, no local filesystem, and a single execution typically caps at 30 seconds of CPU time. In return, it cold-starts in under 5 milliseconds, packs thousands of isolates onto a single machine, and runs across Cloudflare's 300+ global edge locations automatically.
| What you want to do | Pick |
|---|---|
| Run Python scripts, install libraries, process data | Docker |
| Let an agent edit files, generate reports, run git | Docker |
| Tasks running for minutes to hours | Docker |
| Respond to a request in milliseconds | Cloudflare |
| High-concurrency JavaScript logic (routing, transforming, filtering) | Cloudflare |
| Execute close to users in multiple regions | Cloudflare |
The rest of this post unpacks each one.
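The decision table above can be condensed into a small helper. This is an illustrative sketch, not a Puppyone API: the `WorkloadProfile` shape and `pickSandbox` name are made up, but the rules mirror the table.

```typescript
type SandboxKind = "docker" | "cloudflare";

// Hypothetical workload description; field names are illustrative only.
interface WorkloadProfile {
  runtime: "javascript" | "python" | "shell" | "native";
  needsSubprocesses: boolean;
  needsLocalFilesystem: boolean;
  cpuSeconds: number;        // estimated active CPU time per task
  latencySensitive: boolean; // must respond in milliseconds?
}

function pickSandbox(w: WorkloadProfile): SandboxKind {
  // Anything outside JS/WASM, or anything needing a real OS, forces Docker.
  if (w.runtime !== "javascript") return "docker";
  if (w.needsSubprocesses || w.needsLocalFilesystem) return "docker";
  // Cloudflare caps a single execution at 30s of CPU (5 min on paid plans).
  if (w.cpuSeconds > 300) return "docker";
  // Short, latency-sensitive JS belongs at the edge.
  return "cloudflare";
}
```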
Every agent task runs inside its own container—a Linux environment isolated from the host with namespaces and cgroups. When a task arrives, Puppyone spins up a fresh container from a pre-built image, mounts inputs, allocates a working directory, and applies resource quotas. When the task finishes, the entire container is torn down. Each execution starts from a known clean state—no leftover processes, files, or environment variables from the previous run.
Customizable images. The default base image is Ubuntu with Python, Node, Git, and curl preinstalled. If that's not enough, bring your own Dockerfile—Puppyone supports the standard instructions (FROM, RUN, COPY, WORKDIR, ENV)—and bake in whatever runtimes, CLI tools, or private dependencies your team needs. Your agent boots straight into an environment shaped for your workflow.
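A custom image might look like the sketch below. The package names and paths are placeholders, not requirements; the point is that the standard instructions the post lists are enough to bake a team-specific environment.

```dockerfile
FROM ubuntu:22.04

# Example tooling only; install whatever your agent actually needs.
RUN apt-get update && apt-get install -y python3 python3-pip git curl ffmpeg

# Bake in project dependencies so the agent boots into a ready environment.
COPY requirements.txt /opt/agent/
WORKDIR /opt/agent
RUN pip3 install -r requirements.txt

ENV AGENT_HOME=/opt/agent
```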
Configurable resource specs. Defaults are 2 vCPUs and 512 MB of RAM, scalable per task up to 8 vCPUs and 8 GB. If a sandbox runs out of memory, the container is killed cleanly—it can't take down the host.
Controlled task duration. Default timeout is 5 minutes per task, configurable up to 24 hours. For longer-running workflows, the sandbox supports pause / resume—the entire filesystem and memory state are snapshotted on pause and restored on resume, so you don't pay to re-bootstrap an environment from scratch.
Fine-grained network egress control. All outbound traffic is blocked by default. You explicitly allowlist domains or ports the agent is permitted to reach (e.g. api.openai.com, *.github.com)—everything else is denied. This is how you stop a prompt-injected agent from quietly exfiltrating data.
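The matching logic behind a domain allowlist can be sketched like this. The `egressAllowlist` shape and `isAllowed` helper are hypothetical, not a documented Puppyone config; they only illustrate deny-by-default with explicit wildcard entries.

```typescript
// Hypothetical allowlist using the two entry styles from the post:
// exact hosts and "*." wildcard subdomains. Everything else is denied.
const egressAllowlist = ["api.openai.com", "*.github.com"];

function isAllowed(host: string): boolean {
  return egressAllowlist.some((pattern) =>
    pattern.startsWith("*.")
      ? host === pattern.slice(2) || host.endsWith(pattern.slice(1))
      : host === pattern
  );
}
```

Note the wildcard check keeps the leading dot, so `evilgithub.com` does not sneak past a `*.github.com` entry.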
A workspace wired into your filesystem. Each sandbox mounts a namespace inside the Puppyone filesystem—not an ephemeral scratch directory. When the agent writes a file inside the container, it's effectively committing a change to Puppyone: versioned, permissioned, and auditable. You can roll it back, hand it off to another agent, or trace it back to the exact task and step that produced it.
Billed by container lifetime. You're charged for every second the container exists, whether or not the CPU is doing anything. This rewards a "run-and-tear-down" pattern—finish the task, kill the container. If your agent spends most of its time waiting on an LLM response or human input, you're paying for that idle time too.
Puppyone's Docker sandbox is API-compatible with E2B—if your agent already runs on E2B, you can point it at Puppyone with no business logic changes. Sandbox.create(), runCode(), commands.run(), file I/O—they all map over directly. What you gain is that the sandbox and filesystem are a single thing. You no longer have to manage the awkward chain of "files in the sandbox → persist to object storage → sync to the next sandbox." Every write lands in the Puppyone filesystem with version history, permissions, and audit logs—the same state, visible across tasks, agents, and even sandbox types.
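To make the call shape concrete, here is a self-contained sketch of the E2B-style surface the post names. The `Sandbox` class below is an in-memory stand-in so the example runs anywhere; a real client would provision a container and execute against it.

```typescript
interface ExecResult { stdout: string; exitCode: number }

// Mock mirroring the E2B-compatible surface; illustrative only.
class Sandbox {
  private fs = new Map<string, string>();

  // Mirrors Sandbox.create(); the real client provisions a container here.
  static async create(): Promise<Sandbox> {
    return new Sandbox();
  }

  // Mirrors runCode(); the mock just acknowledges the code.
  async runCode(code: string): Promise<ExecResult> {
    return { stdout: `executed ${code.length} chars`, exitCode: 0 };
  }

  // Mirrors commands.run() for shell commands.
  commands = {
    run: async (cmd: string): Promise<ExecResult> =>
      ({ stdout: `$ ${cmd}`, exitCode: 0 }),
  };

  // Mirrors file I/O; on Puppyone every write lands in the versioned
  // filesystem rather than an ephemeral scratch directory.
  async writeFile(path: string, content: string): Promise<void> {
    this.fs.set(path, content);
  }
  async readFile(path: string): Promise<string | undefined> {
    return this.fs.get(path);
  }
}

async function demo(): Promise<string | undefined> {
  const sbx = await Sandbox.create();
  await sbx.runCode("print('hello')");
  await sbx.commands.run("git status");
  await sbx.writeFile("/workspace/report.md", "# Report");
  return sbx.readFile("/workspace/report.md");
}
```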
Agent code runs inside a V8 Isolate on Cloudflare's global edge network—the same technology Chrome uses to isolate browser tabs from each other. Each isolate is a sealed JavaScript execution context that starts in milliseconds (no operating system to boot) and is destroyed when the request finishes. Requests are routed to the edge location closest to the user, keeping round-trip latency to tens of milliseconds.
Millisecond cold starts. A single isolate spins up in under 5 ms—roughly three orders of magnitude faster than a container. You can afford to start a fresh sandbox for every request without warm pools or long-lived instances.
JavaScript / WebAssembly only. No Python, no shell, no native binaries. Your logic is either JS/TS or compiles to WASM. Most Node.js built-ins are available, but anything that depends on system calls (fs, child_process, native C extensions) is not.
128 MB memory ceiling, 30 seconds of CPU per request. Each request gets at most 128 MB of memory and 30 seconds of CPU time (configurable up to 5 minutes on paid plans). Worth knowing: only active CPU time counts—time spent waiting on fetch() or a database query doesn't, so a request bound by 30 seconds of CPU can stay open for several wall-clock minutes.
No local filesystem. Isolates are stateless. Anything that needs to persist has to go to external storage—Cloudflare KV, R2, D1, or your own database. To share intermediate state across requests, you serialize it out and read it back, or front it with a separate stateful service.
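The serialize-out, read-back pattern looks roughly like this. A `Map` stands in for Cloudflare KV so the sketch is self-contained; in a real Worker you would call `put`/`get` on a KV namespace binding instead.

```typescript
// Map-backed stand-in for a KV store's string put/get interface.
const kv = new Map<string, string>();

interface AgentState { step: number; notes: string[] }

// Persist intermediate state at the end of one request...
async function saveState(key: string, state: AgentState): Promise<void> {
  kv.set(key, JSON.stringify(state)); // real Worker: env.KV.put(key, ...)
}

// ...and rehydrate it inside a fresh isolate on the next request.
async function loadState(key: string): Promise<AgentState | null> {
  const raw = kv.get(key); // real Worker: env.KV.get(key)
  return raw ? (JSON.parse(raw) as AgentState) : null;
}
```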
Global edge execution + autoscaling. Your code is automatically deployed across Cloudflare's 300+ edge locations and runs at whichever one is closest to the user. From zero to tens of thousands of QPS, you don't plan capacity—the platform handles it.
Cloudflare's security layer comes free. DDoS protection, rate limiting, bot management, and TLS termination are baked into the network. You don't have to stand up a separate layer for any of that.
Billed only for active CPU milliseconds. When an isolate is waiting on fetch(), a database, or a user event, you're not paying. You're billed only for the milliseconds V8 is actually executing code. For I/O-bound workloads—like agents that mostly call LLMs and aggregate API responses—the cost of the wait time is essentially free.
The Cloudflare sandbox is currently in development and not yet publicly available. We're working on wiring it into the Puppyone filesystem the same way the Docker sandbox is: when JS code in an isolate "writes a file," it commits a versioned, permissioned, auditable change to Puppyone—and shares state with the Docker sandbox. The same agent will be able to run heavy compute in a Docker sandbox and respond at the edge in a Cloudflare sandbox, looking at the same workspace from both ends.
If your use case needs edge execution, subscribe for updates—we'll let you know the moment it's available.
A condensed view of everything covered above, in one table you can scan.
| Dimension | Docker Sandbox | Cloudflare Sandbox |
|---|---|---|
| Underlying tech | Linux containers (namespaces + cgroups) | V8 Isolates |
| Cold start | 2-10 seconds | < 5 milliseconds |
| Runtime | Python, Node, Go, shell—any Linux binary | JavaScript / TypeScript / WebAssembly |
| Install packages? | ✅ apt, pip, npm, custom Dockerfile | ❌ Must be bundled ahead of time |
| Spawn subprocesses? | ✅ | ❌ |
| Local filesystem | ✅ Full POSIX, mounted to Puppyone | ❌ External storage only (KV / R2 / D1) |
| Default resources | 2 vCPU + 512 MB | 128 MB memory |
| Resource ceiling | 8 vCPU + 8 GB (configurable) | 128 MB memory (fixed) |
| Per-execution duration | 5 min default, up to 24 hours | 30s CPU default, up to 5 min (paid) |
| State preservation | pause / resume snapshots (memory + filesystem) | None (each request gets a fresh isolate) |
| Concurrency model | One container per sandbox (real resource cost) | Thousands of isolates per machine |
| Geographic distribution | Single region | 300+ edge locations, automatic routing |
| Network egress control | Domain/port allowlist | Configured via Workers |
| Platform security layer | Bring your own | DDoS / rate limiting / bot / TLS built in |
| Billing granularity | Per second of container lifetime (incl. wait time) | Per millisecond of active CPU (waits don't count) |
| Process model | Run-and-tear-down | On-demand, no maintenance |
| API compatibility | Compatible with E2B | Cloudflare Workers API |
| Puppyone status | ✅ Available | 🚧 In development, not yet open |
| Typical use cases | 5-min data processing, Python with packages, multi-step workflows | Millisecond request routing, high-concurrency JS logic, edge APIs |
| A 10-minute agent—which one? | Docker, if it's actually computing | Cloudflare, if it's spending 90% of that time waiting on LLMs/APIs |
Choosing a sandbox isn't about which one is "better"—it's about what your workload actually looks like. The five dimensions below cover almost every real decision.
If your agent needs to run Python, shell, Go, or native binaries (think ffmpeg, pandoc, git)—it has to be Docker. Cloudflare can only run JavaScript / TypeScript / WebAssembly. You can't even pip install.
If your business logic is already JS / TS and doesn't reach into system calls—either could work. Keep reading.
| Task duration | Pick | Why |
|---|---|---|
| < 30s of CPU | Cloudflare | Millisecond startup + per-ms billing—cheapest by far |
| 30s - 5 min of CPU | Cloudflare (paid plans extend to 5 min) | Still inside the limit |
| Several minutes to hours | Docker | Cloudflare hits its CPU ceiling |
| Multi-day workflows | Docker + pause / resume snapshots | Freeze state and pick it up later |
The two billing models are genuinely different—the same workload can cost an order of magnitude more if you pick wrong.
Docker bills per second of container lifetime: you pay for the whole time the container exists, idle or not. Cloudflare bills per millisecond of active CPU: while an isolate waits on fetch(), a database, or a streaming LLM response, you're not paying. You're charged only for the milliseconds V8 is actively executing code. In concrete terms:
Don't compare hourly rates. Compare what fraction of your workload's wall-clock time is actually spent computing. That ratio decides which model wins.
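That ratio is easy to see with a toy cost model. The rates below are made up; only the shape of the two formulas reflects the billing models described above.

```typescript
// Hypothetical rates for illustration only.
const DOCKER_RATE_PER_SEC = 0.0001;    // charged for every second the container exists
const CF_RATE_PER_CPU_MS = 0.00000002; // charged only for active CPU milliseconds

// Docker: wall-clock lifetime is what you pay for, idle or not.
function dockerCost(wallClockSec: number): number {
  return wallClockSec * DOCKER_RATE_PER_SEC;
}

// Cloudflare: only the computing fraction of wall-clock time is billed.
function cloudflareCost(wallClockSec: number, cpuFraction: number): number {
  return wallClockSec * 1000 * cpuFraction * CF_RATE_PER_CPU_MS;
}
```

Under these made-up rates, a 10-minute task that actually computes only 10% of the time costs 0.06 on Docker but 0.0012 on Cloudflare, a 50x gap driven entirely by the idle fraction.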
Still unsure? Default to Docker. Its capabilities are a strict superset of Cloudflare's—you almost can't pick wrong; you just might pay more in extreme high-concurrency or I/O-heavy scenarios. When traffic grows enough to matter, you can move the hot paths to Cloudflare at that point.
The ideal architecture, once Cloudflare sandboxes are open: the two work together. Cloudflare sandboxes act as the agent's front door—handling millisecond-scale routing, parameter extraction, and lightweight decisions. When the workload genuinely needs heavy compute, it gets dispatched into a Docker sandbox. Cloudflare handles the fast path; Docker handles the heavy work; the same Puppyone filesystem ties both ends together.
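A sketch of that front-door pattern: the edge handler answers light requests directly and hands heavy work off. The `AgentRequest` shape and `dispatchToDocker` stub are hypothetical, not a Puppyone API.

```typescript
interface AgentRequest {
  intent: "route" | "transform" | "heavy_compute";
  payload: string;
}

// Hypothetical stub; a real implementation would enqueue a Docker
// sandbox task and await its result.
async function dispatchToDocker(payload: string): Promise<string> {
  return `docker:${payload}`;
}

// Runs in the Cloudflare sandbox: millisecond-scale decisions stay at
// the edge, heavy compute is dispatched to a Docker sandbox.
async function handleAtEdge(req: AgentRequest): Promise<string> {
  switch (req.intent) {
    case "route":
      return `edge routed ${req.payload}`;
    case "transform":
      return req.payload.toUpperCase();
    case "heavy_compute":
      return dispatchToDocker(req.payload);
  }
}
```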
If your agent is already on E2B, you can just point it at Puppyone—Sandbox.create(), runCode(), commands.run() are all API-compatible, no business logic changes required, and you immediately get filesystem versioning, permissions, and audit logs.
If you're starting from scratch, the Puppyone documentation has a quick-start guide that gets you a versioned-filesystem sandbox in a few lines of code.
Need the Cloudflare edge sandbox? Subscribe for updates—we'll let you know the moment it's open.