Can you actually go straight from test case doc to automated execution?
Thought I’d give this a go and see what kind of results I came up with.
“Can I just give Claude Code my test case document, connect the Chrome DevTools MCP, and have it run the tests? Like, actually run them in a real browser, without writing any test code or setting up a framework?”
Potential scenarios for this include…
When you have documented test cases but no automation yet, could this give you instant automated execution?
Maybe you could use this to figure out how to interact with difficult applications before building proper test frameworks.
Anyway, no point speculating about how to use this if it doesn’t work. We just need to know if it does work!
I was curious whether the Chrome DevTools MCP could bridge the gap between human-readable test documentation and actual automated execution.
Here’s how I got this working:
Installation:
First you need Node.js and npm installed. Then install the Chrome DevTools MCP:
I already had ffmpeg installed (it comes with a Playwright install); if you don’t have it, you’ll need it for screenshots.
Configuration:
Add the Chrome DevTools MCP to Claude Code’s configuration. You can do this via command line:
This updates your Claude Code configuration file to include the MCP integration.
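For reference, here is a minimal sketch of what the resulting MCP entry typically looks like, built in Python so it can be merged into an existing config. It assumes the common `mcpServers` layout and that the server is launched through `npx` from the `chrome-devtools-mcp` package; the helper name `with_chrome_devtools` is made up for illustration, and your actual configuration file may contain other keys.

```python
import json


def with_chrome_devtools(config: dict) -> dict:
    """Return a copy of an MCP config with a chrome-devtools server entry added."""
    # Copy the existing servers so the caller's dict is not mutated.
    servers = dict(config.get("mcpServers", {}))
    # Assumption: the server is started with npx from the chrome-devtools-mcp package.
    servers["chrome-devtools"] = {
        "command": "npx",
        "args": ["-y", "chrome-devtools-mcp@latest"],
    }
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    # Print the entry as JSON, starting from an empty config.
    print(json.dumps(with_chrome_devtools({}), indent=2))
```

If your setup reads project-scoped servers from a JSON file such as `.mcp.json`, the printed structure is the shape to look for after running the command.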
Verification:
Start Claude Code and check that Claude can see the MCP using:
You should see chrome-devtools listed and connected. When I ran this, Claude actually fired up an initial Chrome instance just to verify it had access.
Test Case:
I used the test case created in Experiment #1 – a markdown file with structured test steps for adding an investment account. I added the application URL to the preconditions so Claude would know where to navigate.
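To give a feel for the shape of such a file, here is a rough sketch. The test case text below is a hypothetical stand-in, not my actual file from Experiment #1: the headings, steps, and the `localhost` URL are all placeholders. The small helper (also just illustrative) checks that the preconditions really do contain a URL before you hand the file to Claude.

```python
import re

# Hypothetical test case: headings, steps, and URL are placeholders,
# not the real file from Experiment #1.
TEST_CASE_MD = """\
# Test Case: Add an investment account

## Preconditions
- Application URL: http://localhost:3000/dashboard
- User is logged in

## Steps
1. Click the "Add Account" button
2. Enter an account name and description
3. Submit the form

## Expected Results
- The new account appears on the dashboard
"""


def precondition_urls(markdown: str) -> list[str]:
    """Return URLs found in the Preconditions section of a markdown test case."""
    # Capture everything between "## Preconditions" and the next "## " heading.
    match = re.search(
        r"^## Preconditions\n(.*?)(?=^## |\Z)", markdown, re.S | re.M
    )
    if match is None:
        return []
    return re.findall(r"https?://[^\s)]+", match.group(1))


if __name__ == "__main__":
    print(precondition_urls(TEST_CASE_MD))
```

An empty result is a hint that Claude would have to guess where to navigate.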
Starting Point:
So my starting point was a fresh Claude Code session with the test case file and Chrome DevTools MCP configured.
I primed Claude Code:
“You are an expert test automation engineer. Our goal is to take a test case documented in markdown and run it directly as an automated test using the Chrome DevTools MCP.”
Then I gave it the path to my test case file.
What happened:
Claude Code reviewed the test case, created a todo list to track execution, and… opened Chrome automatically. It navigated straight to the dashboard page.
Then it clicked the “Add Account” button.
That’s pretty amazing. We’ve gone from a standard human-readable test case directly to automating it in a browser with no framework, no code – just directly via the Chrome DevTools MCP.
I’m genuinely surprised. It followed every step in my test documentation – clicked buttons, filled forms, validated results. All from my markdown test case.
It actually caught specific details – the account name I specified, the exact description text, even the modal form behavior and validation messages.
That’s pretty impressive!
I ran the test again to see if it would work consistently.
What happened:
The second run was actually quicker. Claude had context from the first run and executed more efficiently.
What I learned:
The approach is repeatable. Once Claude understands your test case structure, subsequent runs are faster and smoother.
I wanted to understand the cost implications, so I used a tool called CC Usage to monitor token consumption:
Result:
Each test run consumed about 1-2% of my session token allocation, which works out to a maximum of roughly 50-100 test runs per session.
Observation:
Not free, but not prohibitive either. For exploratory testing or figuring out automation approaches, this token cost is worth it. For CI/CD pipeline runs with hundreds of tests daily, you’d want a more deterministic coded approach.
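The arithmetic behind that 50-100 figure is simple enough to sketch. This assumes every run costs the same share of the session allocation, which is a simplification, and the function name is just for illustration.

```python
import math


def max_runs_per_session(percent_per_run: float) -> int:
    """Upper bound on test runs if each run uses a fixed percentage of the session."""
    if percent_per_run <= 0:
        raise ValueError("percent_per_run must be positive")
    # 100% of the allocation divided by the cost of one run, rounded down.
    return math.floor(100 / percent_per_run)


if __name__ == "__main__":
    for pct in (1, 2):
        print(f"{pct}% per run: at most {max_runs_per_session(pct)} runs per session")
```

At a few hundred tests a day, you would burn through several sessions’ worth of tokens, which is why a coded approach makes more sense for pipelines.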
After testing this approach, some patterns emerged:
Works well for:
Gets messy with:
Surprises:
⚡ Quick Verdict:
🟢 I’d use this tomorrow
Would I use this?
Absolutely. Not for everything, but definitely for specific scenarios.
For what?
When would I NOT?
Token usage will add up with frequent runs, but you could test with lower-spec models to reduce costs. The real value is using this approach to figure out automation strategies, then converting to deterministic Playwright or similar frameworks for production use.
What I’m still curious about and want to test further…
If you’re interested in trying this, these are the exact prompts I used:
Want to try this yourself?
It really was simple to get set up once you have the Chrome DevTools MCP installed and configured. Let me know what happens when you try it. I’m especially curious about how it handles complex component libraries like AG Grid or applications with difficult authentication flows.