Five tools now define how software teams and AI agents control browsers in 2026. The right choice depends on whether you are writing tests, building AI agents, or both. This brief maps each tool's architecture, AI depth, benchmarks, and enterprise readiness.
The landscape has fractured into scripted testing, LLM-driven agents, and coding agents. Each paradigm has a different answer to the same question: who decides what the browser does next?
WebWright's score on Online-Mind2Web (300 tasks, GPT-5.4), the highest among open-source harnesses.
Microsoft Research, 2026
95.7k GitHub stars for browser-use, the parent project of browser-harness. Fastest-growing browser automation project on GitHub.
GitHub, May 2026
Playwright completes the same test suite 38% faster than Selenium Grid, with lower CPU and memory usage.
Markaicode, 2025
Playwright is the dominant framework for scripted cross-browser testing with 60k+ GitHub stars.
GitHub / npm, 2026
WebWright gives an LLM a terminal to write and execute Python/Playwright scripts. The browser is disposable. The code is the persistent artifact. This approach achieves +15.6 points over prior state of the art on long-horizon tasks.
The most useful frame is not a comparison of five tools but an understanding of the three paradigms they represent: scripted testing, LLM-driven agents, and coding agents.
Scripted testing: A human writes explicit instructions. The tool executes them deterministically. When the app changes, a human fixes the script. Optimized for CI/CD and regression testing.
LLM-driven agents: An LLM decides what the browser does next based on page state. Self-healing: when the app changes, the agent adapts. Optimized for autonomous task completion.
Coding agents: An LLM writes and executes Python/Playwright scripts. The code is the persistent artifact. Optimized for complex, long-horizon tasks and reusable automation libraries.
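The difference between the first two paradigms comes down to where the next action originates. The sketch below illustrates that with a recording stand-in for a browser. `FakeBrowser`, `run_scripted`, `run_agent`, and `toy_policy` are all hypothetical names invented for this illustration, and the policy is a hard-coded rule where a real agent would call an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a browser session; it only records actions."""
    url: str = "about:blank"
    log: list[str] = field(default_factory=list)

    def perform(self, action: str) -> None:
        self.log.append(action)
        if action.startswith("goto "):
            self.url = action.removeprefix("goto ")


def run_scripted(browser: FakeBrowser, steps: list[str]) -> None:
    """Scripted paradigm: a human fixed every step in advance."""
    for step in steps:
        browser.perform(step)


def run_agent(
    browser: FakeBrowser,
    policy: Callable[[FakeBrowser], Optional[str]],
    max_steps: int = 10,
) -> None:
    """Agent paradigm: a policy inspects page state and picks each next action."""
    for _ in range(max_steps):
        action = policy(browser)
        if action is None:  # the policy decides the task is finished
            break
        browser.perform(action)


def toy_policy(browser: FakeBrowser) -> Optional[str]:
    """Rule-based placeholder for an LLM call (an assumption for this sketch)."""
    if browser.url == "about:blank":
        return "goto https://example.com/login"
    if "click #submit" not in browser.log:
        return "click #submit"
    return None
```

Both runners can end up issuing the same actions. What differs is that the scripted list has to be edited by a person when the page changes, while the policy re-derives its next step from whatever state it observes.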
WebWright gives LLMs a terminal to launch browser sessions, write Python/Playwright scripts, execute them, inspect screenshots and logs, and iterate. The codebase is approximately 1,500 lines. Dependencies: httpx, pydantic, playwright, typer. No hidden frameworks.
Agent writes free-form Python scripts. Browser is disposable. Workspace (code + logs) is the persistent state.
LLM is the driver. Supports OpenAI, Anthropic, OpenRouter. Claude Code plugin via /webwright:run and /webwright:craft.
Benchmark: Online-Mind2Web, 300 tasks, GPT-5.4. Highest score among open-source harnesses (Microsoft Research, 2026).
Microsoft Research project. 1.5k GitHub stars. No formal enterprise support. MIT license.
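The write, execute, inspect, iterate loop described for WebWright can be sketched in a few lines. This is not WebWright's code: `coding_agent_loop`, `run_script`, and `stub_llm` are hypothetical names, the "LLM" is a stub that returns a buggy first draft and a fixed second one, and the generated scripts print text where a real run would drive Playwright.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_script(path: Path) -> subprocess.CompletedProcess:
    """Run a generated script in a fresh interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, str(path)], capture_output=True, text=True, timeout=60
    )


def coding_agent_loop(
    write_script: Callable[[str], str], workspace: Path, max_attempts: int = 3
) -> Optional[Path]:
    """Write a script, execute it, and feed the error log back until it passes."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        script = workspace / f"attempt_{attempt}.py"
        script.write_text(write_script(feedback))
        result = run_script(script)
        # Code and logs stay in the workspace; the process itself is disposable.
        log = workspace / f"attempt_{attempt}.log"
        log.write_text(result.stdout + result.stderr)
        if result.returncode == 0:
            return script
        feedback = result.stderr
    return None


def stub_llm(feedback: str) -> str:
    """Placeholder for an LLM call: first draft fails, the retry succeeds."""
    if not feedback:
        return "print(undefined_name)\n"
    return "print('task complete')\n"
```

Calling `coding_agent_loop(stub_llm, Path(tempfile.mkdtemp()))` leaves two scripts and two logs in the workspace, which mirrors the idea that the workspace, not the browser, carries state between attempts.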
browser-harness is four Python files, approximately 600 lines in total. It controls Chrome directly over the Chrome DevTools Protocol (CDP), bypassing framework abstractions entirely. The self-healing architecture is the key differentiator: when an agent cannot complete a task, it writes new helper functions mid-execution and immediately uses them.
Single WebSocket to Chrome's CDP endpoint. No abstraction layer. Raw CDP commands: Page.navigate, Input.dispatchMouseEvent, DOM.querySelector.
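To make "raw CDP commands" concrete: each command is a JSON message carrying an `id`, a `method`, and `params`, written to the WebSocket. The sketch below only frames those messages. `CDPClient` and `click` are hypothetical names, not browser-harness code, and the transport is injected as a plain callable so that the example runs without Chrome.

```python
import itertools
import json
from typing import Any, Callable


class CDPClient:
    """Minimal CDP message framer; `send` is any callable that transmits a string."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send
        self._ids = itertools.count(1)  # CDP replies echo the request id

    def call(self, method: str, **params: Any) -> int:
        message_id = next(self._ids)
        payload = {"id": message_id, "method": method, "params": params}
        self._send(json.dumps(payload))
        return message_id


def click(client: CDPClient, x: float, y: float) -> None:
    """A click is a press followed by a release at the same coordinates."""
    for event_type in ("mousePressed", "mouseReleased"):
        client.call(
            "Input.dispatchMouseEvent",
            type=event_type, x=x, y=y, button="left", clickCount=1,
        )


if __name__ == "__main__":
    outbox: list[str] = []
    client = CDPClient(outbox.append)
    client.call("Page.navigate", url="https://example.com")
    click(client, 120, 240)
    print("\n".join(outbox))
```

In a live session, `send` would be the send method of a WebSocket connection to a page's debugger URL, which Chrome exposes when started with `--remote-debugging-port`. Reading replies and events off the socket is left out here.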
Agent edits helpers.py mid-execution. Builds persistent knowledge base of site-specific skills. Adapts to novel workflows without human intervention.
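One way to picture an agent editing `helpers.py` mid-run and using the result immediately is shown below. This is an illustrative sketch, not the project's implementation: `load_helpers`, `call_helper`, and `stub_llm_write_helper` are hypothetical, and the stub returns fixed source text where a real agent would ask an LLM. Because it executes freshly written code, it belongs only inside a sandbox.

```python
import tempfile
from pathlib import Path
from typing import Any, Callable


def load_helpers(path: Path) -> dict[str, Any]:
    """Execute the helpers file in a fresh namespace and return what it defines."""
    namespace: dict[str, Any] = {}
    exec(compile(path.read_text(), str(path), "exec"), namespace)
    return namespace


def call_helper(
    path: Path, name: str, write_helper: Callable[[str], str], *args: Any
) -> Any:
    """Call a helper by name; if it is missing, have it written, then reload."""
    helpers = load_helpers(path)
    if name not in helpers:
        with path.open("a", encoding="utf-8") as f:
            f.write("\n" + write_helper(name))  # the file is the knowledge base
        helpers = load_helpers(path)
    return helpers[name](*args)


def stub_llm_write_helper(name: str) -> str:
    """Placeholder for the LLM: returns source code for the requested helper."""
    return f"def {name}(text):\n    return text.strip().lower()\n"


if __name__ == "__main__":
    helpers_file = Path(tempfile.mkdtemp()) / "helpers.py"
    helpers_file.write_text("def noop():\n    return None\n")
    print(call_helper(helpers_file, "normalize_label", stub_llm_write_helper, "  Sign In "))
```

Since the new function is appended to the file rather than held in memory, a later run finds it already defined and skips the generation step.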
95.7k GitHub stars for the browser-use parent project. Fastest-growing browser automation project on GitHub as of May 2026.
Self-healing creates a remote code execution (RCE) surface, because the agent runs code it has just written. Run it in isolated environments with restricted file system access and audit logging of agent-generated code.
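The audit-logging half of that advice can be as simple as recording every generated snippet, with a content hash, before it is allowed to run. The sketch below assumes a JSON Lines log file. `audit_generated_code` is a hypothetical helper, and isolation itself (containers, restricted mounts) has to come from the runtime environment rather than from this code.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


def audit_generated_code(code: str, audit_log: Path) -> str:
    """Append the code and its SHA-256 digest to a JSON Lines audit log."""
    digest = hashlib.sha256(code.encode("utf-8")).hexdigest()
    record = {"timestamp": time.time(), "sha256": digest, "code": code}
    with audit_log.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return digest


if __name__ == "__main__":
    log_path = Path(tempfile.mkdtemp()) / "agent_audit.jsonl"
    snippet = "def click_login(page):\n    page.click('#login')\n"
    print(audit_generated_code(snippet, log_path))
```

Logging before execution means the record exists even if the snippet crashes or misbehaves, and the digest lets a reviewer confirm that what ran is what was logged.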
Playwright is Microsoft-backed and Puppeteer is Google-backed. Both are Apache 2.0 licensed and mature. Playwright is the better choice for new projects. Puppeteer remains dominant for Chrome-specific and stealth automation workloads.
Playwright suite completion time versus Selenium's 68.9 s: 38% faster. CPU usage: 18.4% versus 27.6%. Memory: 412 MB versus 583 MB.
Markaicode, 2025
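The CPU and memory figures are quoted as raw pairs, so it can help to express them as relative reductions, the same way the speed figure is. The snippet below does that arithmetic using only the numbers quoted above. `relative_reduction` is a helper name introduced here for illustration.

```python
def relative_reduction(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` is lower than `baseline`."""
    return (baseline - candidate) / baseline * 100


# Figures as quoted: Playwright versus Selenium Grid.
cpu_saving = relative_reduction(baseline=27.6, candidate=18.4)
memory_saving = relative_reduction(baseline=583, candidate=412)

print(f"CPU: {cpu_saving:.1f}% lower, memory: {memory_saving:.1f}% lower")
# prints "CPU: 33.3% lower, memory: 29.3% lower"
```

On these numbers the resource savings are roughly a third for CPU and a little under 30% for memory, broadly in line with the reported speed advantage.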
Playwright: 60k+ GitHub stars. 7 million weekly npm downloads. 412k+ repositories using Playwright as of 2026. Microsoft-backed with enterprise support.
GitHub stars. 4 million npm weekly downloads. Google-backed. Chrome-first. Best for deep DevTools access and stealth automation.
Playwright added an MCP interface and Agent CLI in 2025, and Puppeteer gained chrome-devtools-mcp the same year. Both remain scripted tools with AI control layers added on top.
Selenium has twenty years of deployments in regulated industries. Its performance gap with Playwright widened in 2025, and new enterprise projects almost universally select Playwright. Selenium's installed base is enormous and will persist for years.
Benchmark conditions: standard enterprise test suite, cross-browser scenario. Specific results vary by suite complexity and infrastructure. Source: Markaicode, 2025.
The critical distinction is whether AI is the driver (native) or a control layer added on top of a scripted tool (added). Native AI integration means the LLM decides what happens next. Added AI integration means a human still defines the task structure.
| Tool | AI integration type | LLM is the driver? | Self-healing? | Agent platform support |
|---|---|---|---|---|
| WebWright | Native: LLM writes and executes scripts | Yes | Yes: agent repairs code | Claude Code, Codex, OpenClaw, Hermes |
| browser-harness | Native: LLM controls CDP directly | Yes | Yes: agent edits helpers mid-run | Claude Code MCP, browser-use Cloud |
| Playwright | Added: MCP interface, Agent CLI (2025) | Via wrapper only | No | MCP-compatible agents |
| Puppeteer | Added: chrome-devtools-mcp (2025) | Via wrapper only | No | MCP-compatible agents |
| Selenium | Minimal: ecosystem plugins only | No | No | Limited ecosystem integrations |
Five tools across the dimensions that matter most for enterprise procurement. The right tool depends on the use case, not on GitHub stars or vendor brand.
| Dimension | WebWright | browser-harness | Playwright | Puppeteer | Selenium |
|---|---|---|---|---|---|
| Best for | AI writes automation scripts | LLM autonomous tasks, self-healing | Scripted cross-browser CI/CD | Chrome stealth, scraping | Legacy regulated environments |
| AI native? | Yes | Yes | Added (MCP) | Added (MCP) | No |
| Self-healing? | Yes | Yes | No | No | No |
| Enterprise support | None (research) | Cloud tier | Microsoft-backed | Google-backed | 20-year track record |
| Setup complexity | Low | Low (4 files) | Low | Low | High (Grid + drivers) |
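Read as a routing rule, the "Best for" row of the table above maps a workload type to a default tool. The lookup below is a minimal sketch of that reading. The use-case keys and the `recommend_tool` function are labels invented for this example, not terms used by any of the tools.

```python
# Hypothetical use-case labels mapped to the table's "Best for" row.
BEST_FOR = {
    "ai_written_scripts": "WebWright",
    "llm_autonomous_tasks": "browser-harness",
    "scripted_cross_browser_ci": "Playwright",
    "chrome_stealth_scraping": "Puppeteer",
    "legacy_regulated": "Selenium",
}


def recommend_tool(use_case: str) -> str:
    """Return the table's default tool for a use case, or raise on unknown input."""
    try:
        return BEST_FOR[use_case]
    except KeyError:
        known = ", ".join(sorted(BEST_FOR))
        raise ValueError(f"unknown use case {use_case!r}; expected one of: {known}") from None


print(recommend_tool("scripted_cross_browser_ci"))  # prints "Playwright"
```

A real selection would also weigh the other rows, such as enterprise support and setup complexity, so this covers only the first cut.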
The adoption data from 2025 and 2026 shows a clear directional pattern: AI-native tools are growing faster than scripted tools, and Playwright is displacing Selenium in new enterprise projects.
The browser-use parent project reached 95.7k stars and 10.8k forks by May 2026, signaling rapid enterprise and research adoption of LLM-native browser control.
Playwright surpassed Selenium in new project adoption and added MCP and Agent CLI interfaces, positioning itself as both the scripted testing standard and the browser engine for AI agent frameworks.
Microsoft Research published WebWright's Odysseys results showing the coding-agent paradigm outperforms vision-based click-prediction by 15.6 points on long-horizon tasks, establishing a new performance standard.
Most enterprise teams are using the wrong tool for their use case, or using the right tool incorrectly. Chander Dhall's Browser Automation Diagnostic maps your current automation stack, identifies the paradigm mismatch, and delivers a phased migration plan with working code.
Map all existing browser automation: test suites, scraping scripts, AI agent integrations. Identify paradigm mismatches and maintenance cost drivers.
Scripted testing to Playwright. LLM-driven tasks to browser-harness. Coding agent workflows to WebWright. Legacy regulated to Selenium. Deliver the decision with working proof-of-concept code.
Prioritized migration roadmap with fixed-fee or outcome-aligned structures. One team, one line of accountability, from diagnosis to working automation.
All benchmarks and statistics in this brief trace to the sources below. The full report carries extended source notes and links.
Full source list with links available in the full report at /reports/browser-automation-tools/review.
Chander Dhall Methodworks, LLC