So we decided to point our autonomous offensive agent at it. No credentials. No insider knowledge. And no human-in-the-loop. Just a domain name and a dream.

Within 2 hours, the agent had full read and write access to the entire production database.

Fun fact: As part of our research preview, the CodeWall research agent autonomously suggested McKinsey as a target citing their public responsible disclosure policy, to keep within guardrails, and recent updates to their Lilli platform. In the AI era, the threat landscape is shifting drastically. AI agents autonomously selecting and attacking targets will become the new normal.

How We Hacked McKinsey’s AI Platform — CodeWall.ai

Simon Willison’s lethal trifecta framework is a great way to predict where attack surfaces are likely to appear.
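To make that concrete, here is a rough sketch of how I might audit an agent's tool list against the trifecta, which Willison describes as the combination of access to private data, exposure to untrusted content, and the ability to communicate externally. Everything here is hypothetical and for illustration only: the `ToolProfile` class, the flag names, and the example tools are my own assumptions, not part of any real framework or of the CodeWall write-up.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class ToolProfile:
    """Hypothetical description of one tool an agent is allowed to call."""
    name: str
    reads_private_data: bool = False          # e.g. inbox, internal database
    ingests_untrusted_content: bool = False   # e.g. web pages, inbound email
    can_communicate_externally: bool = False  # e.g. HTTP requests, sending mail


def trifecta_legs(tools: Sequence[ToolProfile]) -> Dict[str, List[str]]:
    """Group tool names by which leg of the trifecta they contribute to."""
    return {
        "private_data": [t.name for t in tools if t.reads_private_data],
        "untrusted_content": [t.name for t in tools if t.ingests_untrusted_content],
        "external_communication": [t.name for t in tools if t.can_communicate_externally],
    }


def has_lethal_trifecta(tools: Sequence[ToolProfile]) -> bool:
    """True when the combined tool set covers all three legs at once."""
    return all(trifecta_legs(tools).values())


# Assumed example: an assistant that can read mail, browse, and post to webhooks.
agent_tools = [
    ToolProfile("read_inbox", reads_private_data=True, ingests_untrusted_content=True),
    ToolProfile("fetch_url", ingests_untrusted_content=True, can_communicate_externally=True),
    ToolProfile("summarize_text"),
]

if has_lethal_trifecta(agent_tools):
    for leg, names in trifecta_legs(agent_tools).items():
        print(f"{leg}: {', '.join(names)}")
```

The point of the sketch is that no single tool has to be dangerous on its own. The risk shows up when the combined set covers all three legs, so removing any one leg (for example, dropping `fetch_url` inside the sandbox) breaks the trifecta.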

Trust in the world of AI is still developing. My only piece of advice, as someone who is just as excited about trying new tech and seeing what is possible, is this: understand the risks and protect yourself. Let these things cook in a secure sandbox.

With people now moving away from MCPs, we are heading back to a world of APIs and sandboxes. Skills plus a secure environment have also proven to be far more token-efficient than MCPs.