Executive Slide Deck
Executive Research · AI Security · May 2026

Hacked for just $20.

46 million messages. 2 hours. One AI agent.
Chander Dhall · Builder • Leader • Speaker

An AI agent got attacker-level access to McKinsey's Lilli. This is not a SQL-injection story. It is an identity, permissions, and production-readiness story.

$20 · Attacker spend
What Was Exposed
46M
Chat messages exposed
Source: CodeWall, May 2026

The attacker got read-write access to the production database.

The exposed system included confidential files, user accounts, authentication tokens, and writable prompts. The dangerous part was not only data access. It was authority over the rules the agent followed.

Why It Matters

Writable prompts mean poisoned advice at consultant scale.

CodeWall stopped at disclosure. A motivated adversary would not need to stop at stealing data. They could reshape what the AI system recommends.

Exposed · 46.5M

Plaintext client conversations across engagements.

Writable · 95

System prompts that controlled agent behavior.

Scale · 43K

Consultants could receive altered guidance.

Risk · Trust

The breach moved from confidentiality to decision integrity.

The Real Failure

22 endpoints shipped without authentication.

The report describes 200+ documented API endpoints, including 22 that required no authentication. Some write paths were exposed too.
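One way to turn that finding into a repeatable check is to audit the route inventory before release. The sketch below is illustrative only: the `Route` type, the paths, and the allow-list are hypothetical and are not taken from the CodeWall report or from Lilli. It flags any route that skips authentication unless it is explicitly approved as public, and lists write paths first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    method: str
    path: str
    requires_auth: bool


# Hypothetical allow-list: routes that are intentionally public.
PUBLIC_ALLOWLIST = {("GET", "/health")}
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def find_unauthenticated(routes):
    """Return routes that skip authentication and are not explicitly approved."""
    return [
        r for r in routes
        if not r.requires_auth and (r.method, r.path) not in PUBLIC_ALLOWLIST
    ]


def release_gate(routes):
    """Raise if any unapproved unauthenticated route exists; write paths first."""
    findings = find_unauthenticated(routes)
    # False sorts before True, so write methods lead the list.
    findings.sort(key=lambda r: (r.method not in WRITE_METHODS, r.path))
    if findings:
        listing = ", ".join(f"{r.method} {r.path}" for r in findings)
        raise RuntimeError(f"unauthenticated routes: {listing}")


if __name__ == "__main__":
    # Hypothetical inventory, e.g. exported from API documentation.
    inventory = [
        Route("GET", "/health", requires_auth=False),
        Route("GET", "/search", requires_auth=False),
        Route("POST", "/prompts", requires_auth=False),
        Route("GET", "/files", requires_auth=True),
    ]
    try:
        release_gate(inventory)
    except RuntimeError as err:
        print(err)
```

The design choice is that public access is an explicit, reviewed exception rather than a default someone forgot to change.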

Open surface
22

Unauthenticated endpoints

Production routes were reachable without the identity checks leaders assume exist.

Direct query
SQL

Keys entered the database path

Queries were built from request values, turning weak access control into data reach.

Executive lesson
Proof

Security claims need operational evidence

Ask what is enforced in production, not only what the platform supports in theory.
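The "Direct query" point can be made concrete with a small sketch. The deck does not say which database or language Lilli uses, so the SQLite schema, table, and column names here are assumptions chosen for illustration. It contrasts a query assembled from request text with one that binds values as parameters and checks field names against an allow-list.

```python
import sqlite3

# Hypothetical allow-list: the only columns a caller may filter on.
ALLOWED_FILTER_COLUMNS = {"author", "channel"}


def unsafe_search(conn, field, value):
    # Anti-pattern: request-supplied text is concatenated into the SQL string.
    query = f"SELECT id, body FROM messages WHERE {field} = '{value}'"
    return conn.execute(query).fetchall()


def safe_search(conn, field, value):
    # Identifiers cannot be bound as parameters, so validate them explicitly.
    if field not in ALLOWED_FILTER_COLUMNS:
        raise ValueError(f"unsupported filter column: {field!r}")
    # The value travels as a bound parameter and is never parsed as SQL.
    query = f"SELECT id, body FROM messages WHERE {field} = ?"
    return conn.execute(query, (value,)).fetchall()


def build_demo_db():
    """Create an in-memory database with two illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, author TEXT, channel TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages (author, channel, body) VALUES (?, ?, ?)",
        [("alice", "deal-a", "draft pricing"), ("bob", "deal-b", "board memo")],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    injected = "nobody' OR '1'='1"
    print(len(unsafe_search(conn, "author", injected)))  # 2: every row is returned
    print(len(safe_search(conn, "author", injected)))    # 0: treated as a literal
```

Parameter binding covers request values, and the allow-list covers request-supplied field names. Neither replaces authentication on the endpoint itself.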

Architecture Shift

SaaS was built for humans. Agents play by different rules.

The old browser screen was a practical permissions layer. Agentic systems call APIs directly, which moves trust into code, tokens, scopes, and runtime policy.

Decision area | SaaS era | Agent era | Board question
Primary actor | Humans click screens | Agents call APIs and tools directly | Can the system identify agent actors?
Permission boundary | Screen, role, workflow | Code, scopes, tokens, policies | Can defaults bypass review?
Operational proof | Vendor claim plus demo | Trace, audit, gates, revocation | Can a reviewer replay what happened?
Review timing | Architecture after purchase | Architecture before commitment | Who validates viability before signing?

Market Signal

Six vendors. One signal: the model is not enough.

The market is moving toward implementation support, persistent context, governed tool catalogs, and API-native business systems.

Signal | Move | Why it matters | Report read
Anthropic | Enterprise AI JV | Applied AI engineers embedded with customers | Deployment depth
OpenAI | The Development Company | Closer to enterprise deployment reality | Implementation ownership
SAP + WalkMe | Persistent enterprise AI | Real-time AI layer over business data | Runtime context
Pinecone Nexus | Compiled knowledge | Persistent context across agent sessions | Memory governance
Salesforce | Headless 360 | Full CRM through API, not browser UI | Agent permissions
ServiceNow | MCP registry | Governed and auditable agent tool catalog | Tool control plane

The Inversion

If the agent cannot authenticate, the strategy fails.

Agent identity, permission boundaries, and auditability are not technical cleanup. They are business conditions for using AI safely.

01 · Reframe
Permissions

Access is a business decision

Identity and audit belong on the strategy table, not the IT backlog.

02 · Validate
Proof

Test before you sign

Implementation viability must be proven before purchase, not after.

03 · Include
Review

Architects in the room

Technical reviewers belong in procurement from day one.

Board Readiness

Two questions separate theory from production control.

These questions expose the gap between vendor capability and what actually happens when teams are rushed, defaults remain unchanged, and agents gain tool access.

Question 01 · Actor identity

Does your platform know the difference between a human and an agent?

Why it matters · Blast radius

Agents need narrower, task-scoped access than humans.

Question 02 · Pressure test

What happens when the team is under delivery pressure?

Why it matters · Defaults

The gap appears when configuration is never revisited.

Readiness Review

A production AI agent should pass these checks before scale.

If the answers are vague, split across owners, or dependent on human memory, the deployment needs more control before it touches sensitive workflows.

Identity
Who

Separate human and agent actors

Use narrower access, task-scoped permissions, and real-time revocation.

Runtime
How

Replay the decision path

Capture traces, tool calls, policy checks, and environment context.

Release
When

Gate deployment on proof

Require explicit evidence before production rollout and vendor commitment.
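The Identity and Runtime checks can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not a reference design: `Actor`, `PolicyGate`, the scope strings, and the tool names are all hypothetical. Every tool call passes through one gate that distinguishes agents from humans, enforces task-scoped permissions, honors revocation immediately, and records a trace a reviewer could replay.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Actor:
    actor_id: str
    kind: str          # "human" or "agent"
    scopes: frozenset  # task-scoped permissions, e.g. {"documents:read"}


@dataclass
class PolicyGate:
    revoked: set = field(default_factory=set)
    trace: list = field(default_factory=list)

    def revoke(self, actor_id):
        """Take effect on the next call; no token expiry to wait for."""
        self.revoked.add(actor_id)

    def authorize(self, actor, tool, required_scope):
        """Decide one tool call and append the decision to the audit trace."""
        if actor.actor_id in self.revoked:
            decision, reason = "deny", "revoked"
        elif required_scope not in actor.scopes:
            decision, reason = "deny", "missing scope"
        else:
            decision, reason = "allow", "scope granted"
        self.trace.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor.actor_id,
            "kind": actor.kind,
            "tool": tool,
            "scope": required_scope,
            "decision": decision,
            "reason": reason,
        })
        return decision == "allow"


if __name__ == "__main__":
    gate = PolicyGate()
    agent = Actor("agent-7", "agent", frozenset({"documents:read"}))
    gate.authorize(agent, "search_documents", "documents:read")
    gate.authorize(agent, "update_prompt", "prompts:write")
    gate.revoke("agent-7")
    gate.authorize(agent, "search_documents", "documents:read")
    for entry in gate.trace:
        print(entry["tool"], entry["decision"], entry["reason"])
```

In this sketch a read-only agent cannot rewrite prompts, and the trace shows who asked for what, which rule decided it, and when access was cut off.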

The Operating Question

The $20 breach was preventable.
So is yours.

The organizations that avoid the next Lilli ask the identity, permission, and pressure-test questions before they deploy.

© 2026 Chander Dhall Methodworks, LLC. All rights reserved.