Pydantic's Monty: A Sandboxed Python for AI Agents That Actually Makes Sense

5 min read
Victor Jimenez
Software Engineer & AI Agent Builder

Simon Willison shared research on running Pydantic's Monty in WebAssembly. Monty is a minimal, secure Python interpreter written in Rust, designed specifically for safely executing code generated by LLMs. I put together a demo project to test both the Python integration and the WebAssembly build.

This is one of the few sandboxing approaches I have seen that actually addresses the real problem.

The Problem

"AI agents often need to execute code to solve problems. Traditional sandboxing (Docker, Firecracker) has significant overhead."

— Simon Willison, Research Notes

Context

Monty is not trying to be a full Python implementation. It is a subset of Python implemented in Rust. Unlike Pyodide or MicroPython, which aim for full or broad compatibility, Monty is built specifically for speed and security in LLM-generated code execution scenarios.

Monty vs The Alternatives

| Feature | Monty | Pyodide | MicroPython | Docker Sandbox |
| --- | --- | --- | --- | --- |
| Language coverage | Python subset | Near-full CPython | Broad subset | Full Python |
| Runtime | Rust / WASM | Emscripten / WASM | C / WASM | Full OS |
| Startup latency | Microseconds | Seconds | Milliseconds | Seconds |
| File system access | None by default | Emulated | Limited | Configurable |
| Network access | None by default | Limited | Limited | Configurable |
| Ideal for agent inner loops | Yes | No (too slow) | Maybe | No (too heavy) |
| Browser execution | Yes (WASM) | Yes (WASM) | Yes (WASM) | No |
Four design choices define Monty:
  1. Restricted Environment: No access to the host file system or network by default.
  2. Fast Startup: Ideal for serverless or agentic workflows where you need to run small snippets frequently.
  3. Rust Foundation: Leveraging Rust's safety and performance guarantees.
  4. WASM Target: Compiles to WebAssembly for browser or edge execution.
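The restricted-environment idea can be illustrated in plain CPython. This is not Monty's API and, unlike Monty, a builtins-stripping approach in CPython is famously bypassable via object introspection, so it is a sketch of the surface behavior only: when `open` and `__import__` simply do not exist in the execution namespace, file and import access fail by construction.

```python
# Illustration only (NOT Monty, and NOT actually secure in CPython):
# execute a snippet against a minimal builtins namespace so that names
# like open() and __import__() are simply undefined.

SAFE_BUILTINS = {"abs": abs, "len": len, "range": range, "sum": sum,
                 "min": min, "max": max, "print": print}

def run_restricted(source: str) -> dict:
    """Execute `source` with no file, network, or import builtins."""
    namespace = {"__builtins__": SAFE_BUILTINS}
    exec(source, namespace)
    return namespace

# Plain computation works fine:
ns = run_restricted("total = sum(range(10))")
assert ns["total"] == 45

# File access fails because open() is not defined in this namespace:
try:
    run_restricted("data = open('/etc/passwd').read()")
except NameError as e:
    print("blocked:", e)
```

Monty enforces this at the interpreter level in Rust, which is what makes the guarantee actually hold.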

Why This Matters for AI Agents

AI agents often need to execute code to solve problems — math, data processing, validation. Traditional sandboxing (Docker, Firecracker) has significant overhead that makes it impractical for the inner loop of an LLM interaction. Monty offers a "sandbox-in-a-sandbox" approach that is lightweight enough to be part of every turn.
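One way to picture the "subset" trade-off in the inner loop: before executing an LLM-generated snippet, walk its AST and reject anything outside a small whitelist of node types. This sketch is illustrative (it is not how Monty is implemented, and the particular whitelist is my own choice), but it shows why a deliberately small language surface pairs well with agent tool use.

```python
import ast

# Illustrative gate for an agent inner loop (not Monty's mechanism):
# reject any LLM-generated snippet that uses syntax outside a small
# whitelisted subset of Python AST node types.
ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.Name, ast.Load, ast.Store,
    ast.Constant, ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
    ast.Call, ast.Compare, ast.List, ast.Tuple, ast.If, ast.For,
    ast.Lt, ast.Gt, ast.Eq,
)

def check_subset(source: str) -> bool:
    """Return True if every node in the snippet is whitelisted."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    return all(isinstance(node, ALLOWED_NODES) for node in ast.walk(tree))

assert check_subset("x = 2 + 2")              # simple math passes
assert not check_subset("import os")           # ast.Import is rejected
assert not check_subset("open('x').read()")    # ast.Attribute is rejected
```

A gate like this would still need to be combined with a restricted execution environment; the point is that most agent-generated snippets fit comfortably inside a small subset.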

Reality Check

Monty is a Python subset. It will not run your Django app or your pandas pipeline. It is designed for small, self-contained computations — exactly the kind of code LLMs tend to generate for tool use. If your agent needs full Python, you still need a heavier sandbox. The question is whether your agent actually needs full Python for most of its code execution, and the answer is usually no.

What Monty cannot do (by design)
  • Import arbitrary third-party packages
  • Access the file system
  • Make network requests
  • Use threading or multiprocessing
  • Run code that depends on CPython C extensions
  • Execute long-running computations (configurable timeout)

These limitations are features, not bugs. They are what make the microsecond startup possible.
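The configurable-timeout limitation can be sketched from the host side. This POSIX-only example (it relies on `SIGALRM`, so it will not work on Windows) is not Monty's mechanism, which enforces limits inside the interpreter, but it demonstrates the same contract: a runaway snippet gets cut off rather than hanging the agent.

```python
import signal

# Sketch of a configurable execution timeout (POSIX-only; not Monty's
# mechanism). A SIGALRM handler raises TimeoutError inside the running
# snippet after the budget expires.

def run_with_timeout(source: str, seconds: int = 1) -> None:
    def on_alarm(signum, frame):
        raise TimeoutError(f"snippet exceeded {seconds}s")
    old = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)
    try:
        exec(source, {"__builtins__": {}})
    finally:
        signal.alarm(0)                     # cancel any pending alarm
        signal.signal(signal.SIGALRM, old)  # restore previous handler

try:
    run_with_timeout("while True: pass", seconds=1)
except TimeoutError as e:
    print("terminated:", e)
```

Doing this in-interpreter, as Monty does, avoids both the signal machinery and the platform dependence.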

What I Learned

  • Monty fills a real gap: microsecond-latency sandboxed execution for LLM-generated code.
  • The WASM compilation target makes browser-side and edge-side execution practical.
  • For AI agent inner loops, the subset limitation is actually the right trade-off.
  • Docker and Firecracker are overkill for "compute 2+2 and format a date string" — which is most of what agents need.

Why This Matters for Drupal and WordPress

Drupal and WordPress sites increasingly embed interactive elements — code playgrounds in documentation, calculators in content, and AI-assisted form validation. Monty's WASM target enables safe, client-side Python execution without server round-trips, which is valuable for CMS-powered educational sites or developer portals. For agencies building AI-augmented Drupal or WordPress experiences, Monty provides a sandboxed execution layer that avoids the security risks of running user-generated or AI-generated code on the server.

Looking for an Architect who doesn't just write code, but builds the AI systems that multiply your team's output? View my enterprise CMS case studies at victorjimenezdev.github.io or connect with me on LinkedIn.