Browser Automation · 2026 Research

Browser automation has a new pecking order.

Chander Dhall Builder • Leader • Speaker

Five tools now define how software teams and AI agents control browsers in 2026. The right choice depends on whether you are writing tests, building AI agents, or both. This report maps each tool's architecture, AI depth, benchmarks, and enterprise readiness.


5 tools · 3 paradigms · 1 decision

Executive Snapshot

Five tools. Three paradigms. One decision.

The landscape has fractured into scripted testing, LLM-driven agents, and coding agents. Each paradigm has a different answer to the same question: who decides what the browser does next?

WebWright benchmark 86.7%

On Online-Mind2Web (300 tasks, GPT-5.4). Highest among open-sourced harnesses.
Microsoft Research, 2026

browser-use community 95.7k

GitHub stars for browser-use, the parent project of browser-harness. Fastest-growing browser automation project on GitHub.
GitHub, May 2026

Playwright vs Selenium speed 38%

Playwright completes the same test suite 38% faster than Selenium Grid, with lower CPU and memory usage.
Markaicode, 2025

Playwright weekly downloads 7M npm

Playwright is the dominant framework for scripted cross-browser testing with 60k+ GitHub stars.
GitHub / npm, 2026

WebWright Benchmark

86.7%

Online-Mind2Web · 300 tasks · GPT-5.4

The coding-agent paradigm is demonstrably more capable than click-prediction for complex tasks.

WebWright gives an LLM a terminal in which to write and execute Python/Playwright scripts. The browser is disposable; the code is the persistent artifact. On the long-horizon Odysseys tasks, this approach scores 15.6 points above the prior state of the art.

84.7%

Claude Opus 4.7
Microsoft Research, 2026

60.1%

Odysseys (200 long-horizon)
GPT-5.4, 2026

+26.6pt

Over coordinate baseline
Microsoft Research, 2026

The Point

Three paradigms, not five tools.

The most useful frame is not comparing five tools but understanding the three paradigms they represent. Each paradigm has a different answer to: who decides what the browser does next?

Paradigm 1: Scripted

Playwright · Puppeteer · Selenium

A human writes explicit instructions. The tool executes them deterministically. When the app changes, a human fixes the script. Optimized for CI/CD and regression testing.
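
As a rough illustration of the scripted paradigm, the sketch below fixes every step ahead of time and runs them in order. It relies only on three methods that Playwright's sync `Page` exposes (`goto`, `click`, `inner_text`); the URL, the selectors, and the names `LOGIN_SMOKE_TEST` and `run_script` are hypothetical placeholders, not part of any tool's API.

```python
from typing import Protocol


class PageLike(Protocol):
    """The subset of Playwright's sync Page API this sketch relies on."""

    def goto(self, url: str) -> object: ...
    def click(self, selector: str) -> None: ...
    def inner_text(self, selector: str) -> str: ...


# The script is fixed by a human in advance; nothing is decided at run time.
# The URL and selectors are hypothetical placeholders.
LOGIN_SMOKE_TEST = [
    ("goto", "https://example.test/login"),
    ("click", "#sign-in"),
    ("expect_text", "h1", "Welcome"),
]


def run_script(page: PageLike, steps: list[tuple]) -> None:
    """Execute each step in order and fail loudly on the first mismatch."""
    for step in steps:
        action, *args = step
        if action == "goto":
            page.goto(args[0])
        elif action == "click":
            page.click(args[0])
        elif action == "expect_text":
            selector, expected = args
            actual = page.inner_text(selector)
            if expected not in actual:
                raise AssertionError(
                    f"{selector!r}: expected {expected!r}, got {actual!r}"
                )
        else:
            raise ValueError(f"unknown action: {action}")
```

With Playwright installed, a real page from `sync_playwright()` (for example `p.chromium.launch().new_page()`) can be passed as `page`. If the app renames `#sign-in`, the run fails until a human edits the step list, which is the maintenance cost this paradigm accepts in exchange for determinism.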

Paradigm 2: LLM Agent

browser-harness · browser-use

An LLM decides what the browser does next based on page state. Self-healing: when the app changes, the agent adapts. Optimized for autonomous task completion.
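
The control flow of this paradigm can be sketched as an observe, decide, act loop. This is a toy under stated assumptions, not browser-harness or browser-use code: `agent_loop`, `keyword_policy`, and the one-line-per-element page summary are all hypothetical, and the keyword matcher merely stands in for an LLM call.

```python
from typing import Callable, Optional

# An action is a plain dict such as {"type": "click", "selector": "#buy"}.
Action = dict


def agent_loop(
    task: str,
    observe: Callable[[], str],
    act: Callable[[Action], None],
    policy: Callable[[str, str], Optional[Action]],
    max_steps: int = 10,
) -> list[Action]:
    """Ask the policy for the next action until it signals completion."""
    history: list[Action] = []
    for _ in range(max_steps):
        state = observe()              # current page state, as text
        action = policy(task, state)   # the policy decides what happens next
        if action is None:             # None means "task complete"
            break
        act(action)
        history.append(action)
    return history


def keyword_policy(task: str, state: str) -> Optional[Action]:
    """Toy stand-in for an LLM: click the element whose label mentions the task.

    Assumes a hypothetical state format of one "<selector> <label>" per line,
    with the literal string "done" once the task has been completed.
    """
    if state == "done":
        return None
    for line in state.splitlines():
        selector, _, label = line.partition(" ")
        if task.lower() in label.lower():
            return {"type": "click", "selector": selector}
    return None
```

Because the policy picks the target from the page's current labels rather than from a hard-coded selector, a renamed element id does not break the run. That is the sense in which this paradigm adapts when the app changes.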

Paradigm 3: Coding Agent

WebWright

An LLM writes and executes Python/Playwright scripts. The code is the persistent artifact. Optimized for complex, long-horizon tasks and reusable automation libraries.

Tool Profile 1 of 5

Microsoft WebWright: a terminal is all you need.

WebWright gives LLMs a terminal to launch browser sessions, write Python/Playwright scripts, execute them, inspect screenshots and logs, and iterate. The codebase is approximately 1,500 lines. Dependencies: httpx, pydantic, playwright, typer. No hidden frameworks.

Architecture Code as action

Agent writes free-form Python scripts. Browser is disposable. Workspace (code + logs) is the persistent state.
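
A minimal sketch of the code-as-action loop, assuming nothing about WebWright's internals: each attempt is written to a workspace, run in a fresh process, and logged, and the error output is fed back to the script writer. The names `run_attempt`, `code_as_action`, and `fake_llm` are hypothetical, and `fake_llm` stands in for the model; in a real run the generated script would drive Playwright.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable


def run_attempt(workspace: Path, attempt: int, source: str) -> subprocess.CompletedProcess:
    """Persist the generated script, run it in a fresh process, save its log."""
    script = workspace / f"attempt_{attempt}.py"
    script.write_text(source)
    result = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
        timeout=60,
    )
    (workspace / f"attempt_{attempt}.log").write_text(result.stdout + result.stderr)
    return result


def code_as_action(
    workspace: Path,
    write_script: Callable[[str], str],
    max_attempts: int = 3,
) -> bool:
    """Iterate: generate a script, execute it, feed any error back to the writer."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        source = write_script(feedback)
        result = run_attempt(workspace, attempt, source)
        if result.returncode == 0:
            return True
        feedback = result.stderr  # the traceback informs the next attempt
    return False


def fake_llm(feedback: str) -> str:
    """Hypothetical stand-in for the model: repairs its script after a NameError."""
    if "NameError" in feedback:
        return "title = 'Example Domain'\nprint(title)\n"
    return "print(title)\n"  # bug: `title` is never defined
```

Each process is thrown away after it exits, while every script and log stays in the workspace. That mirrors the split between disposable execution and persistent code described above.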

AI integration Native

LLM is the driver. Supports OpenAI, Anthropic, OpenRouter. Claude Code plugin via /webwright:run and /webwright:craft.

Top benchmark 86.7%

Online-Mind2Web, 300 tasks, GPT-5.4. Highest among open-sourced harnesses. Microsoft Research, 2026.

Maturity Early stage

Microsoft Research project. 1.5k GitHub stars. No formal enterprise support. MIT license.

Tool Profile 2 of 5

browser-harness: the thinnest possible harness.

Four Python files. Approximately 600 lines. Direct Chrome control via CDP, bypassing framework abstractions entirely. The self-healing architecture is the key differentiator: when an agent cannot complete a task, it writes new helper functions mid-execution and immediately uses them.

Architecture CDP direct

Single WebSocket to Chrome's CDP endpoint. No abstraction layer. Raw CDP commands: Page.navigate, Input.dispatchMouseEvent, DOM.querySelector.
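
For readers unfamiliar with CDP, the commands named above travel as JSON text frames carrying an `id`, a `method`, and `params`. The sketch below only builds those frames. The helper name `cdp_message`, the URL, the coordinates, and the selector are illustrative assumptions, and actually sending the frames would need a WebSocket client connected to Chrome's debugging endpoint.

```python
import itertools
import json

_ids = itertools.count(1)


def cdp_message(method: str, **params) -> str:
    """Serialize one CDP command; the id lets the caller match the reply."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


messages = [
    cdp_message("Page.navigate", url="https://example.com"),
    # A click is a press followed by a release at the same coordinates.
    cdp_message(
        "Input.dispatchMouseEvent",
        type="mousePressed", x=120, y=80, button="left", clickCount=1,
    ),
    cdp_message(
        "Input.dispatchMouseEvent",
        type="mouseReleased", x=120, y=80, button="left", clickCount=1,
    ),
    # nodeId would come from an earlier DOM.getDocument reply; 1 is a placeholder.
    cdp_message("DOM.querySelector", nodeId=1, selector="button[type=submit]"),
]

for message in messages:
    print(message)
```

Working at this level removes every abstraction between the agent and Chrome. It also means conveniences that frameworks provide, such as waiting for elements, have to be written by hand or by the agent.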

Self-healing Yes

Agent edits helpers.py mid-execution. Builds persistent knowledge base of site-specific skills. Adapts to novel workflows without human intervention.

Community 95.7k

GitHub stars for browser-use parent project. Fastest-growing browser automation project on GitHub as of May 2026.

Security note Evolving

Self-healing creates RCE surface. Run in isolated environments with restricted file system access and audit logging of agent-generated code.
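
One way to approach the file-system restriction and audit logging is sketched below. It is an assumption-laden illustration, not a browser-harness feature: `write_with_audit`, the sandbox directory, and the JSON Lines log format are hypothetical. A path check like this complements, and does not replace, running the agent in a container or VM.

```python
import hashlib
import json
import time
from pathlib import Path


def write_with_audit(sandbox: Path, relative_path: str, source: str, audit_log: Path) -> Path:
    """Write agent-generated code inside `sandbox` only and append an audit record."""
    sandbox = sandbox.resolve()
    target = (sandbox / relative_path).resolve()
    # Refuse paths that escape the sandbox, for example "../../etc/cron.d/x".
    if not target.is_relative_to(sandbox):
        raise PermissionError(f"refusing to write outside sandbox: {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    data = source.encode("utf-8")
    target.write_bytes(data)
    record = {
        "ts": time.time(),
        "path": str(target.relative_to(sandbox)),
        "sha256": hashlib.sha256(data).hexdigest(),
        "bytes": len(data),
    }
    # One JSON object per line keeps the log append-only and easy to grep.
    with audit_log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return target
```

Keeping the audit log outside the sandbox, as the separate `audit_log` argument allows, stops the agent from overwriting the record of its own edits through this helper.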

Tool Profiles 3 and 4 of 5

Playwright and Puppeteer: the scripted testing leaders.

Playwright is backed by Microsoft and Puppeteer by Google. Both are Apache 2.0 licensed and mature. Playwright is the better choice for new projects. Puppeteer remains dominant for Chrome-specific and stealth automation workloads.

Playwright speed 42.3s

Suite completion time, versus 68.9s on Selenium Grid: roughly 38% less wall-clock time. CPU usage was 18.4% versus 27.6%, and memory was 412 MB versus 583 MB.
Markaicode, 2025
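
The headline percentage can be reproduced from the raw figures. The short calculation below uses only the numbers quoted above; the helper name `reduction` is ours. It treats "38% faster" as a reduction in wall-clock time, which is the reading the quoted figures support.

```python
def reduction(baseline: float, value: float) -> float:
    """Percentage by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline * 100


suite_time = reduction(68.9, 42.3)  # seconds, Selenium Grid vs Playwright
cpu = reduction(27.6, 18.4)         # percent CPU
memory = reduction(583, 412)        # MB
speedup = 68.9 / 42.3               # the same gap expressed as a throughput ratio

print(f"Suite time: {suite_time:.1f}% lower")  # Suite time: 38.6% lower
print(f"CPU:        {cpu:.1f}% lower")         # CPU:        33.3% lower
print(f"Memory:     {memory:.1f}% lower")      # Memory:     29.3% lower
print(f"Speedup:    {speedup:.2f}x")           # Speedup:    1.63x
```

The same gap reads as 38.6% less time or as a 1.63x speedup, so it is worth confirming which framing a vendor or benchmark is using before comparing numbers across sources.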

Playwright community 60k+

GitHub stars. 7 million npm weekly downloads. 412k+ repositories using Playwright as of 2026. Microsoft-backed with enterprise support.

Puppeteer community 85k+

GitHub stars. 4 million npm weekly downloads. Google-backed. Chrome-first. Best for deep DevTools access and stealth automation.

AI integration Added, not native

Playwright added its MCP interface and Agent CLI in 2025. Puppeteer added chrome-devtools-mcp in 2025. Both remain scripted tools with AI control layers added on top.

Tool Profile 5 of 5

Selenium: the legacy enterprise standard.

Twenty years of deployments in regulated industries. The performance gap with Playwright widened in 2025, and new enterprise projects almost universally select Playwright. Selenium's installed base is nonetheless enormous and will persist for years.

Playwright suite completion · Markaicode, 2025

42.3s

Selenium Grid suite completion · Markaicode, 2025

68.9s

Playwright parallel stability · Markaicode, 2025

98.2%

Selenium Grid parallel stability · Markaicode, 2025

92.1%

Benchmark conditions: standard enterprise test suite, cross-browser scenario. Specific results vary by suite complexity and infrastructure. Source: Markaicode, 2025.

AI Integration

AI integration depth: native vs. added vs. minimal.

The critical distinction is whether AI is the driver (native) or a control layer added on top of a scripted tool (added). Native AI integration means the LLM decides what happens next. Added AI integration means a human still defines the task structure.

Tool	AI integration type	LLM is the driver?	Self-healing?	Agent platform support
WebWright	Native: LLM writes and executes scripts	Yes	Yes: agent repairs code	Claude Code, Codex, OpenClaw, Hermes
browser-harness	Native: LLM controls CDP directly	Yes	Yes: agent edits helpers mid-run	Claude Code MCP, browser-use Cloud
Playwright	Added: MCP interface, Agent CLI (2025)	Via wrapper only	No	MCP-compatible agents
Puppeteer	Added: chrome-devtools-mcp (2025)	Via wrapper only	No	MCP-compatible agents
Selenium	Minimal: ecosystem plugins only	No	No	Limited ecosystem integrations

Buyer Comparison

The procurement decision in one table.

Five tools across the dimensions that matter most for enterprise procurement. The right tool depends on the use case, not on GitHub stars or vendor brand.

Dimension	WebWright	browser-harness	Playwright	Puppeteer	Selenium
Best for	AI writes automation scripts	LLM autonomous tasks, self-healing	Scripted cross-browser CI/CD	Chrome stealth, scraping	Legacy regulated environments
AI native?	Yes	Yes	Added (MCP)	Added (MCP)	No
Self-healing?	Yes	Yes	No	No	No
Enterprise support	None (research)	Cloud tier	Microsoft-backed	Google-backed	20-year track record
Setup complexity	Low	Low (4 files)	Low	Low	High (Grid + drivers)
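
For teams that want to filter the comparison programmatically, the table can be held as plain data. The sketch below copies the cell values verbatim from the rows above; the `Tool` class and the `shortlist` helper are hypothetical conveniences, not part of any of the five tools.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tool:
    name: str
    best_for: str
    ai_native: str           # "Yes", "Added (MCP)", or "No", as in the table
    self_healing: bool
    enterprise_support: str
    setup_complexity: str


TOOLS = [
    Tool("WebWright", "AI writes automation scripts", "Yes", True,
         "None (research)", "Low"),
    Tool("browser-harness", "LLM autonomous tasks, self-healing", "Yes", True,
         "Cloud tier", "Low (4 files)"),
    Tool("Playwright", "Scripted cross-browser CI/CD", "Added (MCP)", False,
         "Microsoft-backed", "Low"),
    Tool("Puppeteer", "Chrome stealth, scraping", "Added (MCP)", False,
         "Google-backed", "Low"),
    Tool("Selenium", "Legacy regulated environments", "No", False,
         "20-year track record", "High (Grid + drivers)"),
]


def shortlist(*, ai_native: Optional[str] = None,
              self_healing: Optional[bool] = None) -> list[str]:
    """Return tool names matching every filter that was supplied."""
    return [
        tool.name
        for tool in TOOLS
        if (ai_native is None or tool.ai_native == ai_native)
        and (self_healing is None or tool.self_healing == self_healing)
    ]


print(shortlist(self_healing=True))  # ['WebWright', 'browser-harness']
```

A filter like this narrows the field. It does not replace the use-case judgment the table is meant to support.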

Market Signal

Three signals the market is already sending.

The adoption data from 2025 and 2026 shows a clean directional pattern: AI-native tools are growing faster than scripted tools, and Playwright is displacing Selenium in new enterprise projects.

browser-use · May 2026

95.7k GitHub stars: fastest-growing browser automation project

The browser-use parent project reached 95.7k stars and 10.8k forks by May 2026, signaling rapid enterprise and research adoption of LLM-native browser control.

Playwright · 2025-2026

7M weekly npm downloads: Playwright becomes the default scripted tool

Playwright surpassed Selenium in new project adoption and added MCP and Agent CLI interfaces, positioning itself as both the scripted testing standard and the browser engine for AI agent frameworks.

WebWright · May 2026

+15.6 points over prior SOTA: coding agents redefine the benchmark ceiling

Microsoft Research published WebWright's Odysseys results showing the coding-agent paradigm outperforms vision-based click-prediction by 15.6 points on long-horizon tasks, establishing a new performance standard.

Call to Action

The right tool for the right job, deployed correctly.

Most enterprise teams are using the wrong tool for their use case, or using the right tool incorrectly. Chander Dhall's Browser Automation Diagnostic maps your current automation stack, identifies the paradigm mismatch, and delivers a phased migration plan with working code.


1

Audit current stack

Map all existing browser automation: test suites, scraping scripts, AI agent integrations. Identify paradigm mismatches and maintenance cost drivers.

2

Match tool to use case

Scripted testing to Playwright. LLM-driven tasks to browser-harness. Coding agent workflows to WebWright. Legacy regulated to Selenium. Deliver the decision with working proof-of-concept code.

3

Phased migration plan

Prioritized migration roadmap with fixed-fee or outcome-aligned structures. One team, one line of accountability, from diagnosis to working automation.

Sources

Source notes.

All benchmarks and statistics in this report trace to the sources below. The full report carries extended source notes and links.

Microsoft Research · WebWright (2026). Lu, Yadong; Xu, Lingrui; Huang, Chao; Awadallah, Ahmed. "Webwright: A Terminal Is All You Need For Web Agents." github.com/microsoft/webwright.
browser-use / browser-harness (2025-2026). github.com/browser-use/browser-harness. Architecture: Medium (Dr. Fadi Shaar), Pyshine, NeuralStackly, Flowtivity.
Playwright (2025-2026). playwright.dev. Performance: Markaicode 2025 (42.3s, 18.4% CPU, 412 MB, 98.2% stability).
Puppeteer (2025-2026). pptr.dev. MCP via chrome-devtools-mcp, WebDriver BiDi support 2025.
Selenium (2025-2026). selenium.dev. Performance: Markaicode 2025 (68.9s, 27.6% CPU, 583 MB, 92.1% stability).
Markaicode · Playwright MCP vs. Selenium Grid: 2025 Performance Benchmarks. markaicode.com/vs/playwright-mcp-vs-selenium-grid/.
MorphLLM · Playwright vs Puppeteer (2026). morphllm.com/comparisons/playwright-vs-puppeteer.
NXCode · Stagehand vs Browser Use vs Playwright: AI Browser Automation Compared (2026). nxcode.io.

Full source list with links available in the full report at /reports/browser-automation-tools/review.

The Board Question

The only browser automation question that matters: "Who decides what the browser does next?"


Chander Dhall Methodworks, LLC