Review: Simon Willison's Agentic Engineering Practices Turned Into Testing, Sandboxing, Prompt-Injection, and Secret-Handling Guardrails for Drupal and WordPress Teams

March 16, 2026 · 9 min read

Software Engineer & AI Agent Builder

Simon Willison's recent Agentic Engineering Patterns guide is valuable because it is not promising magic. It is mostly operational discipline: run tests first, use red/green TDD when possible, do real manual testing, keep reviewable changes small, and treat prompt injection as a live systems-design problem instead of a prompt-writing problem.

For Drupal and WordPress teams, that translates into a workable rule set for plugin, theme, and module development. The right takeaway is not "let agents code more." It is "make agent output pass through tighter engineering boundaries than human output would have needed a year ago."

What Simon Willison Is Actually Arguing

Across the guide and related security posts, four ideas matter most:

start with the existing test suite so the agent learns the project's safety rails early;
use test-first or red/green loops when introducing new behavior;
use browser-based manual testing for real interfaces, not just unit tests;
assume prompt injection remains unsolved, especially once an agent can see private data, read untrusted content, and communicate externally.

That combination maps cleanly to Drupal and WordPress work because both ecosystems have a long tail of admin UI code, AJAX/REST handlers, upgrade routines, and extension marketplaces where one weak workflow can turn into many downstream installs.

Guardrail 1: Treat "First Run the Tests" as a Repository Policy

Willison's "first run the tests" prompt is more than convenience. It forces the agent to discover the project's verification commands before it edits anything. For Drupal and WordPress teams, that should become a repo-level expectation:

Drupal modules: run PHPUnit, PHPCS, PHPStan, and any kernel/functional suites before changes.
WordPress plugins: run PHPUnit, PHPCS with WordPress rules, and compatibility checks before changes.
If there are no tests, the task starts by adding a minimal safety test around the behavior being changed.

This matters because agent failures in CMS repos are often not "cannot code." They are "edited the wrong hook", "broke a capability check", "changed output escaping", or "altered a config form flow without noticing." Existing tests expose those boundaries early.

Practical team rule:

No agent-authored Drupal module or WordPress plugin change starts implementation until the agent has run the repo's documented validation commands and reported the result.

If your repo does not have a single documented entry point such as composer test, npm test, or make qa, fix that first. Ambiguous QA commands are an agent hazard.

Guardrail 2: Use Red/Green TDD for Risky CMS Surfaces

Willison's red/green advice fits CMS extension work especially well because dangerous regressions tend to cluster around state-changing behavior:

WordPress admin-post handlers, REST endpoints, AJAX actions, cron callbacks, and plugin activation/migration routines
Drupal forms, controllers, routes, queue workers, migrations, update hooks, and custom access logic

These are exactly the places where teams need proof that:

nonce and capability checks remain intact in WordPress,
CSRF, access, and input-handling expectations remain intact in Drupal,
error paths behave safely,
old data still upgrades cleanly.

Practical team rule:

For any change that mutates state, changes permissions, or touches install/update logic, require a failing test first or a documented reason that browser/manual verification is the only realistic path.

That rule is boring. It is also the difference between an agent helping with delivery and an agent spraying plausible changes into privileged code paths.

Guardrail 3: Make Browser-Based Manual Testing a First-Class Requirement

Willison's guidance on agentic manual testing is the missing piece for Drupal and WordPress teams because so much breakage is interface and workflow breakage:

field widgets and block editor/sidebar interactions,
Drupal administrative forms and Views UI,
media flows, autosaves, revisions, and preview states,
plugin settings pages, multisite screens, and role-specific admin menus.

These are precisely the areas where "tests passed" is not enough. A coding agent can preserve PHP correctness while still breaking the editor, introducing an inaccessible admin interaction, or causing a silent JS error in one privileged screen.

Practical team rule:

Every agent-authored PR that changes admin UX, content editing, or multi-step workflows must include evidence of manual browser validation: Playwright output, screenshots, or explicit test notes.

Willison is right that unreviewed giant PRs are a collaboration failure. For CMS teams, the fix is not just smaller diffs. It is requiring proof that the human owner actually exercised the changed behavior in a browser.

Guardrail 4: Design Around Prompt Injection, Do Not Pretend to Detect It Reliably

This is where many AI coding policies still fail. Willison has been consistent for years: prompt injection is not solved, delimiters are not a real fix, and once trusted instructions and attacker-controlled content share context, the model can be manipulated.

For Drupal and WordPress development teams, that means all of the following are untrusted input:

issue descriptions from public trackers,
support tickets,
scraped advisory text,
README files from third-party repos,
page HTML fetched from the web,
database content pulled from a site,
plugin or module code the agent did not author.

Practical team rule:

Never let the same agent session both ingest untrusted project/web content and hold secrets, production credentials, or write access to external systems.

That is the direct CMS translation of Willison's "lethal trifecta": private data, untrusted content, and external communication. If your agent can read wp-config.php, inspect Drupal settings.php, browse attacker-controlled pages, and post somewhere external, you have built an exfiltration path.

Guardrail 5: Sandboxing Must Be Real, Narrow, and Task-Specific

Willison's security writing points toward a practical architecture lesson: if you cannot trust the model to distinguish instructions from data, then your real protection comes from what the runtime physically lets it do.

For Drupal and WordPress teams, a useful sandbox baseline looks like this:

local disposable environments for analysis and testing;
read-only mounts for production snapshots unless a task explicitly needs writes;
no default access to browser cookies, password managers, shell history, or SSH keys;
no outbound network for tasks that only need local code analysis;
separate identities for read-only review versus release/deploy work.

Practical team rule:

Default every agent task to the smallest filesystem, network, and credential scope that can still finish the job. Escalation should be explicit and logged.

This is especially important for agencies juggling many Drupal and WordPress clients. A single over-privileged agent workspace can collapse tenant boundaries faster than a traditional CI mistake.

Guardrail 6: Secret Handling Should Use Indirection, Not Trust

Willison's prompt-injection model implies a strong secret rule: if a secret ever becomes visible in a model context that also processes untrusted content, you should assume eventual exposure is possible.

For WordPress and Drupal work, the common failures are familiar:

pasting .env, wp-config.php, or settings.php into chat;
giving agents raw API keys so they can "test the integration";
exposing database dumps that still contain salts, tokens, or customer data;
letting logs with bearer tokens flow back into chat transcripts.

Practical team rule:

Agents should receive capabilities, not raw secrets.

That means:

short-lived credentials instead of long-lived keys;
proxy tools that perform one approved action without revealing the underlying secret;
redacted fixtures and sanitized snapshots for debugging;
pre-commit and CI secret scanning for every agent-touched branch.

For Drupal and WordPress teams, this is not abstract policy. It is the difference between safe assisted development and leaking production credentials through a support export, a browser transcript, or an agent log.

A Practical Baseline for Drupal and WordPress Teams

If you need a short operating policy, start here:

Agents run the repo's full documented QA command before editing.
State-changing changes require red/green tests or explicit manual-test justification.
Admin and editor workflow changes require browser validation evidence.
Untrusted content must not share a session with secrets plus outbound tool access.
Sandboxes start read-only, low-network, and low-credential by default.
Secrets are brokered through narrow tools, never pasted into prompts.
Every agent-authored PR stays small enough for a human maintainer to fully review.

That baseline is realistic for Drupal agencies, WordPress product teams, and maintainers shipping contrib/plugins on public ecosystems.

Bottom Line

Simon Willison's agentic engineering practices are useful precisely because they are not mystical. They push teams back toward engineering fundamentals:

stronger tests,
real browser verification,
smaller reviewed changes,
hard runtime boundaries,
and less fantasy about prompt injection being solved by clever prompting.

For Drupal and WordPress teams, that is the right posture. The more agentic your workflow becomes, the more your quality and security depend on enforceable guardrails around testing, sandboxing, prompt exposure, and secret access.

What Simon Willison Is Actually Arguing​

Guardrail 1: Treat "First Run the Tests" as a Repository Policy​

Guardrail 2: Use Red/Green TDD for Risky CMS Surfaces​

Guardrail 3: Make Browser-Based Manual Testing a First-Class Requirement​

Guardrail 4: Design Around Prompt Injection, Do Not Pretend to Detect It Reliably​

Guardrail 5: Sandboxing Must Be Real, Narrow, and Task-Specific​

Guardrail 6: Secret Handling Should Use Indirection, Not Trust​

A Practical Baseline for Drupal and WordPress Teams​

Bottom Line​

References​

What Simon Willison Is Actually Arguing

Guardrail 1: Treat "First Run the Tests" as a Repository Policy

Guardrail 2: Use Red/Green TDD for Risky CMS Surfaces

Guardrail 3: Make Browser-Based Manual Testing a First-Class Requirement

Guardrail 4: Design Around Prompt Injection, Do Not Pretend to Detect It Reliably

Guardrail 5: Sandboxing Must Be Real, Narrow, and Task-Specific

Guardrail 6: Secret Handling Should Use Indirection, Not Trust

A Practical Baseline for Drupal and WordPress Teams

Bottom Line

References