NEW · GitHub triggers + Slack/Discord notifications shipped

One QA board for humans and AI agents.

A universal execution board where testers and autonomous agents collaborate on the same interface. AI progressively takes over the cognitive work — writing test cases, generating scripts, executing runs, reporting back — without ever leaving the board.

Start free · no card
Local mode · zero infra · self-host on a $5/mo VPS
MIT licensed · OSS · Self-host on a $5/mo VPS · OpenAPI & Postman import
Works with the tools your team already runs
Claude · GitHub · MCP · Slack · Discord · OpenAPI · Postman · pytest
The QA gap

QA tooling was built for one user. Now there are two.

Postman is for humans clicking buttons. Pytest is for engineers reading stack traces. Neither was designed for an AI agent that wants to read context, generate a script, run it, and write back the result. QA Runner is.

Today, without QA Runner
Tests live in 5 places. Postman collections, pytest files, Jira tickets, Notion pages, screenshots in Slack.
AI agents can't run them. No machine-readable definition. No tool interface. No way to track results.
Failures get lost. A nightly run fails at 3 AM. You find out at standup.
Onboarding takes weeks. New QAs spend their first month learning where the tests are.
With QA Runner
One board. Feature → Suite → Test Case. Same source of truth for humans and agents.
Agents have an MCP server. Claude can read TCs, run them, and write back results — no glue code.
Slack alerts. A run fails — your channel knows in 2 seconds with the failing assertion inline.
Day-1 productive. Open the board, see every test, run it. No setup, no IDE, no YAML.
Tiny: Dependency surface, auditable in a single afternoon.
5-min: From clone to first run on a fresh machine.
$5/mo: Self-host on a VPS that can run docker compose.
100%: Audit-logged, every run and every script snapshot.
The loop

From ticket to passing test in four steps.

No ceremony. Drop in a description, let AI do the busywork, override anything you want, and ship the result to the same board your teammates see.

01
Describe the case
Paste the ticket or a one-liner. Line 1 is the display name; everything below is context the AI reads.
02
AI drafts Part 2
Claude generates a pytest script using your API Library, fixtures, and dependency outputs. You review.
03
Run anywhere
Locally via docker, in CI via GitHub Actions, or right from the board. Same script, same result schema.
04
Reported back
Every run lands on the board with run_by attribution and a snapshot of the exact script that executed.
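
Step 01 can be pictured with a few lines of Python. This is only a sketch of the rule "line 1 is the display name, the rest is context"; the `CaseDescription` type and `parse_description` function are hypothetical names, not part of QA Runner's API.

```python
from dataclasses import dataclass


@dataclass
class CaseDescription:
    display_name: str  # first line of the pasted text
    context: str       # everything below it, read by the AI as context


def parse_description(raw: str) -> CaseDescription:
    """Split a pasted ticket into a display name and AI context."""
    lines = raw.strip().splitlines()
    if not lines:
        raise ValueError("description is empty")
    return CaseDescription(
        display_name=lines[0].strip(),
        context="\n".join(lines[1:]).strip(),
    )


case = parse_description(
    "Login rejects expired tokens\n"
    "POST /auth/login with an expired JWT should return 401.\n"
    "See ticket for the token fixture."
)
print(case.display_name)
```

A one-liner works too: the context is simply empty.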
Built for two audiences

Designed for testers. Trusted by agents.

For testers

  • Feature → Suite → TC tree: The information hierarchy your team already thinks in.
  • Part 1 / Part 2 split: Setup is yours and stays put. Test logic regenerates from AI on demand.
  • Fixture library, your way: Bring your own setup functions and they get injected into every run.
  • Multi-env with secret isolation: Tokens never leave your device; AI sees URLs and header keys only.
  • Postman + OpenAPI import: Bring an existing collection and you're 80% set up.
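
To make the secret-isolation idea concrete, here is a minimal sketch of what "AI sees URLs and header keys only" could look like. The environment layout and the `ai_visible_env` function are assumptions for illustration; the real storage format may differ.

```python
from typing import Any


def ai_visible_env(env: dict[str, Any]) -> dict[str, Any]:
    """Return the subset of an environment that is safe to show an AI.

    Keeps the base URL and the *names* of headers; header values and
    every other field (tokens, passwords) are left out entirely.
    """
    return {
        "name": env["name"],
        "base_url": env["base_url"],
        "header_keys": sorted(env.get("headers", {})),
    }


# Hypothetical environment definition, for illustration only.
staging = {
    "name": "staging",
    "base_url": "https://staging.example.test",
    "headers": {"Authorization": "Bearer s3cr3t", "X-Tenant": "acme"},
    "db_password": "hunter2",
}
visible = ai_visible_env(staging)
print(visible)
```

Building the visible view from an allow-list, rather than deleting known secret fields, means a newly added secret stays hidden by default.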

For AI agents

  • MCP server, public: Any agent (Claude Code, Cursor, custom) connects via SSE over HTTP.
  • report_result is mandatory: Agents can only persist results through MCP. No back-channel writes.
  • Tool registry, not prompts: get_tc_context, gen_script, execute_script are typed, audited, approval-gated.
  • Script snapshot per run: Every result carries the exact bytes the agent executed. Traceability is structural.
  • Approval flow, your call: Auto-run, review-each, or human-in-loop on writes. Configurable per env.
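
The three approval modes can be sketched as a small per-environment policy check. Everything here is an assumption made for illustration: the `ApprovalMode` enum, the `POLICY` table, and the choice of which tools count as writes are not taken from the QA Runner codebase.

```python
from enum import Enum


class ApprovalMode(Enum):
    AUTO_RUN = "auto-run"                # never ask a human
    REVIEW_EACH = "review-each"          # ask before every tool call
    HUMAN_ON_WRITES = "human-on-writes"  # ask only before writes


# Hypothetical classification of which tools write to the board.
WRITE_TOOLS = {"report_result"}

# Hypothetical per-environment policy.
POLICY = {
    "local": ApprovalMode.AUTO_RUN,
    "staging": ApprovalMode.HUMAN_ON_WRITES,
    "prod": ApprovalMode.REVIEW_EACH,
}


def needs_approval(env: str, tool: str) -> bool:
    """Decide whether a human must approve this tool call first."""
    # Unknown environments fall back to the strictest mode.
    mode = POLICY.get(env, ApprovalMode.REVIEW_EACH)
    if mode is ApprovalMode.AUTO_RUN:
        return False
    if mode is ApprovalMode.REVIEW_EACH:
        return True
    return tool in WRITE_TOOLS


print(needs_approval("staging", "report_result"))
```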
Built for the workflow

Everything a QA team and an AI agent both need.

Run tests by hand. Run them on every PR. Run them at 3 AM. Run them from a chat with Claude. Same TC, same result, one history.

AI assistant, in context

Claude sees your TC, your fixtures, your environment, and your last 10 runs. Ask it to write Part 2, refine an assertion, or explain why TC-014 has been flaky for three days. It has tools, not vibes.

claude-sonnet-4.5 · get_tc_context · execute_script · report_result · +8 tools

GitHub triggers

Run on push, PR, or schedule. Block merges on fail.

Slack & Discord

Failures land in your channel with the failing TCs already inline.
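
For a sense of what such a notification carries, here is a rough sketch that formats failing TCs and wraps them for each service. Slack incoming webhooks take a JSON body with a `text` field and Discord webhooks take one with a `content` field; the message layout and function names are illustrative assumptions, not QA Runner's actual format.

```python
import json


def failure_message(run_id: str, failing: list[tuple[str, str]]) -> str:
    """Format failing test cases, one line per TC with its assertion."""
    lines = [f"Run {run_id} failed: {len(failing)} test case(s)"]
    lines += [f"- {tc_id}: {assertion}" for tc_id, assertion in failing]
    return "\n".join(lines)


def slack_payload(message: str) -> str:
    # Slack incoming webhooks accept a JSON body with a "text" field.
    return json.dumps({"text": message})


def discord_payload(message: str) -> str:
    # Discord webhooks accept a JSON body with a "content" field.
    return json.dumps({"content": message})


msg = failure_message("run-42", [("TC-014", "expected 200, got 500")])
print(slack_payload(msg))
```

Sending is then a plain HTTP POST of the payload to the webhook URL configured for the channel.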

Two-part scripts: human owns setup, AI owns logic.

Part 1 is yours — fixtures, dependencies, secrets. Part 2 is the AI's — the actual test logic, regenerable from a spec change. The boundary is enforced. AI never touches your auth.

Part 1: human · Part 2: AI · resettable · diff-able
MCP-native

Your QA board, exposed to every agent on the planet.

A typed tool registry over SSE. Connect Claude Code to your qa-runner instance and it writes results to the same board your team is staring at — with attribution, script snapshot, and a verifiable trail.

SSE over HTTP · streams agent steps in realtime
Tools are typed and audited · approvals are first-class
Works with Claude Code, Cursor, GitHub Actions, custom pipelines
Read the MCP spec · See tool registry
tools.py
python
@mcp_tool
def report_result(
    tc_id: str,
    verdict: Literal["pass", "fail", "error"],
    http_status: int,
    duration_ms: int,
    actual_response: dict,
    note: str,
    env: str,
    run_by: str,           # "human" | "agent:{id}"
    script_snapshot: str,  # exact script executed
) -> RunResult: ...

@mcp_tool
def get_tc_context(tc_id: str) -> TCSpec: ...

@mcp_tool
def gen_script(tc_id: str, part: int = 2) -> str: ...

@mcp_tool
def execute_script(
    tc_id: str, timeout: int = 30
) -> ExecutionResult: ...
connect · http://localhost:8000/mcp
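
The order in which an agent might call these tools can be sketched with local stand-ins. Nothing below talks to a real server: the stub bodies, the in-memory `BOARD` list, and the `run_case` helper are assumptions for illustration, and the reporting tool is called `report_result` here, following the feature list.

```python
# Stand-ins for the tools listed above; a real agent would call them over MCP.
def get_tc_context(tc_id: str) -> dict:
    return {"tc_id": tc_id, "name": "Health check returns 200"}


def gen_script(tc_id: str, part: int = 2) -> str:
    return "def test_health():\n    assert True\n"


def execute_script(tc_id: str, timeout: int = 30) -> dict:
    return {"verdict": "pass", "http_status": 200, "duration_ms": 12}


BOARD: list[dict] = []  # hypothetical in-memory stand-in for the board


def report_result(**fields) -> dict:
    BOARD.append(fields)
    return fields


def run_case(tc_id: str, agent_id: str, env: str) -> dict:
    """Read context, draft Part 2, execute, then report back."""
    spec = get_tc_context(tc_id)
    script = gen_script(tc_id, part=2)
    outcome = execute_script(tc_id, timeout=30)
    return report_result(
        tc_id=tc_id,
        verdict=outcome["verdict"],
        http_status=outcome["http_status"],
        duration_ms=outcome["duration_ms"],
        actual_response={},
        note=f"ran '{spec['name']}'",
        env=env,
        run_by=f"agent:{agent_id}",  # attribution shown on the board
        script_snapshot=script,      # exact script that was executed
    )


result = run_case("TC-014", agent_id="claude-code", env="local")
print(result["verdict"], result["run_by"])
```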
Deploy

Cloud or air-gapped — same product, same data model.

Managed cloud

Self-host on Vercel + Railway

Deploy frontend to Vercel, backend to Railway/Fly/Render. Bring your own Anthropic key. Per-target recipes shipped in the repo.

Frontend: Vercel
Backend: Railway · FastAPI + ARQ
Database: Supabase · PostgreSQL
Auth: Email + GitHub OAuth
Deploy recipes

Local · self-hosted

Air-gapped friendly

One command. Docker-compose-based. Agent has full local file/process access — perfect for CI runners and privacy-strict shops.

# docker-compose.yml
services:
  frontend:
    image: qa-runner-frontend
  backend:
    image: qa-runner-backend
    volumes:
      - ./data:/app/data
      - ./scripts:/app/scripts

$ docker compose up
Self-host guide
FAQ

The questions every team asks.

Do I need to know Python?

No. The AI writes Part 2 from a one-sentence description. You can run, edit, or refine without ever opening the script tab. But if you do know Python, you have full pytest underneath: every TC compiles to plain pytest, no proprietary DSL.

Stop pasting Postman screenshots into Slack.

Spin up a board in two minutes. Connect your first agent in five. Self-host is free, forever — no per-seat pricing, no usage caps.