Blog
Engineering stories, security deep-dives, and lessons from shipping AI-powered test automation.
Value Per Token: The Number the Agent-Loop Hype Forgets
Loops where agents prompt agents to write code are useful in the right hands — and oversold as a universal law by the people selling the tokens. A case for value per token, treating models as commodity suppliers, and keeping one deterministic check between generation and trust.
23 Free QA Skills for Your Coding Agent
We open-sourced a catalogue of 23 diagnostic QA skills for Claude Code and Codex — Core Web Vitals, secret scanning, dependency audits, dead-code detection, flaky-selector hunting, and more. No signup. Copy a folder and go.
You Can't Review Your Own Work
AI code generation and AI code testing are adversarial systems that should be separate products. When the same agent writes the code and its tests, they pass by construction. The thesis QualityMax is built on — and the deterministic guardrail harness that makes it real.
May in Review: The Month We Made It More Trustworthy
A rebuilt AI crawl planner grounded in the live page, platform stability work, security hardening, qmax-code going open source, and the dogfooding loop behind it all.
Dogfooding
Six days of shipping QualityMax from my phone
A Faroe Islands trip May 18–23, 43 PRs landed, zero days at a desk, and one revert that proved why the gates exist.
You're already paying for an AI subscription. Get the full QA loop for free.
The free tier is the product, not a trial. Bring your existing Claude Code or Codex subscription, get crawl → generate → run → fix without leaving your terminal — plus 60 free isolated cloud-sandbox minutes a month for the runs that shouldn't live on a laptop.
We built our iPhone app in 4 days — without being a mobile testing platform
QualityMax started as a web E2E testing platform. This week we shipped our own iPhone app to TestFlight. The next day, the new app caught its own production bug end-to-end in under 20 minutes. Here's how the dogfooding loop closed.
qmax-code is now open source
The Go + Charm TUI agent that orchestrates Claude over the QualityMax API is now public on GitHub. Read every line, fork it, send PRs — FSL-1.1-ALv2 (converts to Apache 2.0 in 2 years).
qmax-code 1.13: Claude Code and Codex on QA Steroids
v1.13 doesn't just find the bug: it patches it in your terminal while you watch. Multi-model routing, in-terminal auto-fix, instant PR security review. Works free with your existing Claude Code or Codex subscription.
We Redesigned 14 Landing Pages — Through Our Own AI Review Gates
22 commits, 3 PRs, a 5-persona AI review, every commit gated by SAST + prompt-injection + brute-force checks. If we don’t trust our pipeline with our own brand pages, why would you?
Teaching the Reviewer: How 👍/👎 on a PR Comment Rewires the Next Review
A single click on a QualityMax PR comment becomes durable, per-repo knowledge the reviewer retrieves on the next PR. Here’s the plumbing — three feedback channels, one storage layer, and the GitHub-webhook limitation that forced us to build a poller.
Two Posts, Same Day: The Gap Between AI Policy and Vibe Coding
One mature engineering org writes a 27-page AI policy with the rule “if you can’t explain the code, don’t commit it.” One workshop ships 10 live websites in an afternoon with Lovable and Cursor. The gap between them is the whole QualityMax market.
The Möbius Strip QA Loop: When the Tool Tests Itself
Most QA tools sit outside the code they test. QualityMax sits inside — and now monitors its own errors, generates its own regression tests, closes its own loops. A single-sided surface where tool and target merge.
Your AI Reviewer Now Asks What You Care About
Interactive calibration for AI code reviews: pick which categories to check, which to skip, and get structured findings with a one-command fix for your LLM agent.
Engineering
When Claude Goes Down, Your Tests Shouldn't
Today's Anthropic outage left claude.ai partially unavailable and Claude Code degraded. Every AI test platform built on a single LLM provider went down with it. Here's why QualityMax routes per-task across Claude, GPT, and Gemini, and what that costs.
Building qmax-code: Why We Built Our Own AI Testing Agent
7,951 lines of Go. Charm framework TUI. 48 MCP tools. Not based on Claude Code. Two tools, one mission — here's the engineering story.
AI Coding Agents Are Secured in the Wrong Direction
The Claude Code source leak reveals an industry-wide gap: AI tools invest in containing the agent but barely verify whether the code it produces is secure. 4% of GitHub commits are now AI-generated. Who's checking them?
Real Incident
We Got Brute-Forced on Launch Day
We posted our vibe-check page on Reddit and Hacker News. 1,145 users came. So did a brute-force attack that blew through our Resend email quota in 4 minutes.
Building the Matrix Demo
Behind the scenes of our interactive demo page — boot sequences, the red pill / blue pill choice, a chat-driven AI terminal, and live Playwright execution in the browser.
Building a Hostile Site to Test Our AI
How we created an adversary website full of prompt injections, XSS traps, and redirect loops to stress-test our AI crawl pipeline — and what we learned.