puppyone for LangChain: a real workspace as a Tool, Retriever, and DocumentLoader

23 April 2026 · puppyone team

TL;DR

  • LangChain is a framework for assembling agents: chains, tools, retrievers, memory abstractions, callbacks. It provides abstractions, not infrastructure. Where the durable state actually lives is up to you.
  • In practice, "where state lives" resolves to one of three bad choices: an in-memory dict that dies when the process exits, a vector DB you have to operate, or an ad-hoc Postgres table you maintain yourself.
  • puppyone slots in as a drop-in Tool, Retriever, and DocumentLoader. Your agent gets read_file, write_file, list_directory, search — backed by a persistent, version-controlled, per-agent-permissioned file workspace. Tomorrow's run reads what today's run wrote. Other agents (Cursor, Claude Code, n8n, your other Python services) read it too.
  • The change isn't conceptual — LangChain stays LangChain. The change is that you stop having to build the state layer yourself.

What changes when you wire puppyone into LangChain

Most LangChain projects have the same shape:

  1. Build an agent.
  2. Give it some tools.
  3. Realise it has no real memory between runs.
  4. Bolt on something — ConversationBufferMemory, a vector DB, a custom Postgres table, a folder of pickled files.
  5. Realise that thing now needs scoping, permissions, version history, multi-agent visibility, and a way for humans to read it.
  6. Now you're maintaining infrastructure instead of shipping agents.

puppyone collapses steps 4–6 into "register one Tool / Retriever and use it":

  • A persistent file workspace that survives process restarts, deployments, and laptop reboots.
  • Tool surface that LangChain agents understand: read_file, write_file, list_directory, search. Standard BaseTool shape.
  • Retriever surface for RAG: hybrid keyword + vector search across the workspace, returned as Documents LangChain already knows how to consume.
  • DocumentLoader surface for ingestion: the connectors that mirror Notion / Slack / Postgres / Gmail / GitHub issues into puppyone become free DocumentLoaders for your LangChain pipelines.
  • Per-agent Access Points. A research chain reads /research/. A writer chain writes /drafts/. A reviewer chain reads /drafts/ and writes /reviews/. No agent sees more than it needs to.
  • Automatic version history. Every write is a commit with author (which chain), timestamp, diff. Roll back agents that go off the rails without losing the rest.
  • Cross-process, cross-language visibility. Your LangChain agent in Python writes a file; your TypeScript service / Cursor / Claude Code / n8n flow reads it.

LangChain stays the orchestration framework. puppyone stops you from rebuilding the persistence layer for the fifth time.

What stays exactly the same

  • All your existing chains, runnables, prompts, output parsers. We don't replace any of them.
  • Your model providers. OpenAI, Anthropic, your local model, anything LangChain supports — unchanged.
  • Your existing tools. Add puppyone alongside them; don't replace them.
  • LCEL. Compose puppyone into runnables exactly like any other tool / retriever.
  • Your eval / tracing / LangSmith setup. Unchanged. Calls into puppyone show up in traces like any other tool call.
  • Your vector DB, if you already have one. puppyone's retriever is a complement, not a replacement — keep your domain vectors where they are; use puppyone for the file-shaped, multi-source, evolving context.

How to wire it up (the short version)

  1. Spin up a puppyone workspace. Cloud (try.puppyone.ai) or self-hosted Docker.
  2. Define an Access Point per chain / per agent role. Read+write /scratch/<agent>/, read /specs, read /integrations/, etc.
  3. Install the puppyone Python SDK (and TypeScript SDK if you're polyglot).
  4. Register the puppyone Tool with your agent:
    • read_file, write_file, list_directory, search, version_history.
    • These are standard BaseTool instances. Add them to your AgentExecutor / LCEL chain like any other tool.
  5. Use the puppyone Retriever for RAG: it returns Document objects with metadata (path, version, source connector, last modified by which agent).
  6. Use the puppyone DocumentLoaders to pull ingested SaaS data into your pipelines — NotionLoader, SlackLoader, PostgresLoader, etc., all backed by puppyone's connectors so you don't write integration code yourself.
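
In code, the tool surface is small. The sketch below is a minimal stand-in: `PuppyoneClient` and its method signatures are assumptions about the SDK's shape, not the real client, and in a real project you would wrap each method with LangChain's `@tool` decorator before adding it to your agent.

```python
# Minimal sketch. `PuppyoneClient` is a hypothetical in-memory stand-in for
# the puppyone Python SDK; the method names mirror the tool surface above.
# In a real project, wrap each method with langchain_core's @tool decorator
# and pass the resulting tools to your AgentExecutor / LCEL chain.
class PuppyoneClient:
    def __init__(self):
        self._files: dict[str, str] = {}

    def read_file(self, path: str) -> str:
        return self._files[path]

    def write_file(self, path: str, content: str) -> str:
        self._files[path] = content
        return f"wrote {path} ({len(content)} bytes)"

    def list_directory(self, prefix: str) -> list[str]:
        return sorted(p for p in self._files if p.startswith(prefix))

    def search(self, query: str) -> list[str]:
        return [p for p, c in self._files.items() if query.lower() in c.lower()]

ws = PuppyoneClient()
ws.write_file("/research/agents/2026-04-23.md", "# Findings on tool use\n...")
print(ws.list_directory("/research/"))  # ['/research/agents/2026-04-23.md']
print(ws.search("tool use"))            # ['/research/agents/2026-04-23.md']
```

The point of the shape: every tool call is just a path operation, so the same surface works for the agent, for other agents, and for humans with a text editor.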

Workflows that actually get better

1. Long-horizon research / planning agents

Without puppyone: A research agent assembles findings into a string in memory. Process exits. Findings gone unless you wrote a custom serialiser.

With puppyone: The agent's tool spec includes write_file. It writes findings to /research/<topic>/<date>.md. Tomorrow's agent reads the directory, builds on it. Full version history. Other agents (review, summarisation) read the same files.

2. Multi-step plan-and-execute

Without puppyone: "Plan" chain produces a plan. "Execute" chain takes it. You serialise the plan into the conversation buffer or a JSON file you wrote yourself.

With puppyone: Plan chain writes /plans/<run-id>.md. Execute chain reads it. Reviewer chain reads /plans/<run-id>.md and /executions/<run-id>.md. Each chain has its own Access Point. The "plan format" is just "a markdown file" — humans can read it and edit it too.
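
The handoff can be sketched in a few lines. Here a plain dict stands in for the workspace, and `plan_chain` / `execute_chain` are hypothetical stand-ins for your real chains; the path convention is the point, not the logic.

```python
# Sketch of the plan -> execute handoff, assuming a shared dict stands in
# for the puppyone workspace. Paths follow the /plans/<run-id>.md convention
# from the text; the chain bodies are placeholders.
import uuid

workspace: dict[str, str] = {}
run_id = str(uuid.uuid4())[:8]

def plan_chain(goal: str) -> str:
    path = f"/plans/{run_id}.md"
    workspace[path] = f"# Plan for: {goal}\n1. research\n2. draft\n3. review\n"
    return path

def execute_chain(plan_path: str) -> str:
    plan = workspace[plan_path]            # reads exactly what plan_chain wrote
    out_path = f"/executions/{run_id}.md"
    workspace[out_path] = f"# Executed\nFollowed:\n{plan}"
    return out_path

plan_path = plan_chain("summarise LangChain integrations")
exec_path = execute_chain(plan_path)
# A reviewer chain would now read both files; so can a human.
print(sorted(workspace))
```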

3. Multi-agent crews / orchestration

Without puppyone: Each agent in your crew has its own memory. Coordinating context between them means message passing through the orchestrator, which becomes the bottleneck.

With puppyone: Agents share a workspace. Researcher writes /research/. Writer reads /research/ and writes /drafts/. Reviewer reads /drafts/ and writes /reviews/. The orchestrator just routes; the state lives in files all agents can see.
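
The scoping above can be illustrated with a toy `AccessPoint` class. This is a client-side sketch of the shape only: in puppyone the enforcement happens at the workspace layer, not in your agent code.

```python
# Hedged sketch: `AccessPoint` is a hypothetical stand-in showing per-agent
# scoping by path prefix. Real enforcement is server-side in puppyone.
workspace: dict[str, str] = {}

class AccessPoint:
    def __init__(self, read_prefixes, write_prefixes):
        self.read_prefixes = tuple(read_prefixes)
        self.write_prefixes = tuple(write_prefixes)

    def write_file(self, path: str, content: str) -> None:
        if not path.startswith(self.write_prefixes):
            raise PermissionError(f"no write access to {path}")
        workspace[path] = content

    def read_file(self, path: str) -> str:
        if not path.startswith(self.read_prefixes):
            raise PermissionError(f"no read access to {path}")
        return workspace[path]

researcher = AccessPoint(read_prefixes=["/research/"], write_prefixes=["/research/"])
writer = AccessPoint(read_prefixes=["/research/", "/drafts/"], write_prefixes=["/drafts/"])

researcher.write_file("/research/notes.md", "raw findings")
writer.write_file("/drafts/post.md", writer.read_file("/research/notes.md") + " -> draft")

try:
    researcher.read_file("/drafts/post.md")  # researcher can't see drafts
except PermissionError as err:
    print(err)
```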

4. RAG that pulls from many sources

Without puppyone: You stand up a vector DB, write loaders for each source (Notion, Slack, Postgres, GitHub issues), maintain ingestion pipelines, dedupe, sync.

With puppyone: Connectors mirror sources into the workspace. The puppyone Retriever does hybrid search across all of them. The DocumentLoaders return Documents with provenance (which source, which path, what version). Your RAG chain doesn't care that some came from Notion and some from Slack — it just retrieves files.
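
A sketch of what the retriever hands back. `Document` mirrors LangChain's (page_content, metadata) shape; the naive keyword-overlap scoring below is a stand-in for puppyone's hybrid keyword + vector search, and the file contents and paths are invented for illustration.

```python
# Sketch only: naive keyword overlap stands in for hybrid search, and the
# FILES mapping stands in for connector-mirrored workspace content.
from dataclasses import dataclass, field

@dataclass
class Document:
    page_content: str
    metadata: dict = field(default_factory=dict)

FILES = {
    "/integrations/notion/roadmap.md": ("Q3 roadmap: ship retriever", "notion"),
    "/integrations/slack/standup.md": ("standup: retriever demo on Friday", "slack"),
    "/research/pricing.md": ("pricing survey notes", "none"),
}

def retrieve(query: str, k: int = 2) -> list[Document]:
    terms = set(query.lower().split())
    scored = []
    for path, (text, source) in FILES.items():
        score = len(terms & set(text.lower().split()))
        if score:
            scored.append((score, Document(text, {"path": path, "source": source})))
    scored.sort(key=lambda pair: -pair[0])
    return [doc for _, doc in scored[:k]]

for doc in retrieve("retriever demo"):
    print(doc.metadata["source"], doc.metadata["path"])
```

The chain consuming these Documents never branches on provenance; it is there in metadata when you need it for citation or debugging.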

5. Human-in-the-loop without separate UI

Without puppyone: Agent produces output, you build a separate UI for humans to review and edit, you serialise edits back into the agent's memory.

With puppyone: Agent writes to /drafts/. Human opens puppyone (or any text editor talking to it) and edits. Next agent reads the edited version. The "review UI" is "a folder of files".

6. Cross-language agent stacks

Without puppyone: Your Python LangChain agent and your TypeScript LangChain.js agent each have their own memory. Sharing requires a shared DB and you maintaining the schema.

With puppyone: Both connect to the same workspace. Both read and write the same files. Schema is "files in a directory tree". No coordination code.

Patterns to avoid

  • Don't put per-step scratch in puppyone. Intermediate RunnableLambda outputs that nobody else needs to see can stay in memory. puppyone is for state you'll want to read later, by another run, or another agent, or a human.
  • Don't replace your domain vector DB with puppyone for high-QPS lookup. If you have a 10M-doc product catalogue and need millisecond p99 retrieval, that's still a job for a dedicated vector DB. puppyone's retriever is great for the multi-source, evolving, file-shaped context — not for serving a product search engine.
  • Don't share one Access Point across all agents. A single overly-broad token undoes all the per-agent permissioning. Make them distinct.
  • Don't let one agent's writes overwrite another's canonical files. Have agents write to scratch paths; promote into canonical paths via review (manual or another chain).
  • Don't forget the git-shape. puppyone gives you commits and rollback. Use them — particularly for agents that occasionally do dramatic things.
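
The scratch-then-promote bullet above can be sketched like this, with a dict standing in for the workspace and a trivial predicate standing in for the human or reviewer-chain gate:

```python
# Sketch of scratch -> promote. The `approved` predicate is a placeholder
# for a real review step (human or another chain); paths are illustrative.
workspace: dict[str, str] = {}

def agent_write(agent: str, name: str, content: str) -> str:
    path = f"/scratch/{agent}/{name}"   # agents never write canonical paths
    workspace[path] = content
    return path

def promote(scratch_path: str, canonical_path: str, approved: bool) -> bool:
    if not approved:
        return False
    workspace[canonical_path] = workspace[scratch_path]
    return True

draft = agent_write("writer", "post.md", "draft v3, reviewed")
promote(draft, "/posts/post.md", approved="reviewed" in workspace[draft])
print("/posts/post.md" in workspace)  # True
```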

How this fits with the rest of your stack

  • LangChain orchestrates. Tools, chains, prompts, output parsing — your job.
  • puppyone persists. Workspace files, version history, per-agent permissions, connector ingestion — its job.
  • Cursor / Claude Code see the same workspace via MCP, so what your LangChain agent writes shows up live in your editor. See puppyone for Cursor and puppyone for Claude Code.
  • n8n (or any workflow tool) reads/writes the same workspace via REST. See puppyone for n8n.
  • Postgres / Pinecone / S3 stay in your stack. Your LangChain chains keep talking to them where they're the right tool. See puppyone vs Postgres and puppyone vs Pinecone for where the line sits.

LangChain stays the framework you reach for. puppyone stops you from rebuilding the boring infrastructure underneath.

FAQ

Is this an official LangChain integration? We ship a Python and TypeScript SDK with BaseTool, BaseRetriever, and BaseLoader implementations following LangChain's standard interfaces. Drop them into your existing chains.

Does puppyone replace ConversationBufferMemory / chat memory? For long-running, multi-session agents — yes, you'll want puppyone-backed memory so it survives the process. For a 5-turn chat that doesn't need to outlive the request, the in-memory class is fine.

Does puppyone replace my vector DB? For multi-source, evolving, file-shaped context where you'd otherwise be building loaders and ingestion plumbing — yes. For a high-throughput, single-domain semantic search index — no, keep your dedicated vector DB.

Can I use puppyone with LangGraph? Yes. Each node in the graph can read/write puppyone via the same Tool. Persistent state across graph executions becomes a property of the workspace, not of the graph runtime.
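
A toy illustration of that property, with plain functions standing in for graph nodes and a dict for the workspace: state written during one execution is visible to the next without any graph-runtime checkpointing. This is not LangGraph code; it only shows the shape.

```python
# Sketch: two plain functions stand in for graph nodes, a dict for the
# workspace. In real life, execution 2 could be a fresh process, a different
# machine, or a different agent entirely.
workspace: dict[str, str] = {}

def gather_node(topic: str) -> None:
    key = f"/graph-state/{topic}.md"
    prior = workspace.get(key, "")
    workspace[key] = prior + f"gathered facts about {topic}\n"

def summarise_node(topic: str) -> str:
    notes = workspace[f"/graph-state/{topic}.md"]
    return f"{notes.count('gathered')} passes so far"

gather_node("pricing")            # graph execution 1
gather_node("pricing")            # graph execution 2
print(summarise_node("pricing"))  # 2 passes so far
```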

What about LangSmith / tracing? Calls into puppyone show up as standard tool calls in your traces — file path, operation, latency, return size. No special integration needed.

Does it work with LangChain.js? Yes. Same SDK shape, same Access Points, same workspace. A Python agent and a TypeScript agent can share the workspace transparently.

How do permissions work in a multi-tenant SaaS where each customer has their own LangChain agents? Each tenant gets their own puppyone workspace (or their own scoped tree within one), each agent has its own Access Point, no cross-tenant leakage at the storage layer. Your application logic doesn't need to enforce tenancy — the storage does.

What if I'm using LCEL only, not the older agent abstractions? Same answer — register the tools as RunnableLambda wrappers (or use the SDK helpers) and compose them into your chain.
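
A sketch of that shape using plain callables; in real code you would wrap each step in `RunnableLambda` and compose with `|`, but the composition is the same. The workspace content and step names here are invented.

```python
# Hedged sketch of the LCEL shape with plain functions. In real code:
# chain = RunnableLambda(load_spec) | RunnableLambda(build_prompt) | llm
workspace = {"/specs/tone.md": "terse, technical"}

def load_spec(query: str) -> dict:
    # First step: pull shared context out of the workspace.
    return {"query": query, "tone": workspace["/specs/tone.md"]}

def build_prompt(inputs: dict) -> str:
    # Second step: fold it into the prompt for the model call.
    return f"Answer '{inputs['query']}' in a {inputs['tone']} tone."

def chain(query: str) -> str:   # stands in for load_spec | build_prompt
    return build_prompt(load_spec(query))

print(chain("what is an Access Point?"))
```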

TL;DR (again)

LangChain gives you the agents. It doesn't give you a place to put their state — that's been an exercise for the user. puppyone fills the gap as a drop-in Tool, Retriever, and DocumentLoader, so your agents stop forgetting and your team stops maintaining ad-hoc persistence layers. Same chains. Same prompts. Real workspace underneath.

Give your LangChain agents a workspace, not a Python dict that dies when the process exits. Get started.