STRUT is a spec-first, TDD-enforced orchestration layer for Claude Code. Instead of generating apps, it processes changes through a structured Read Truth → Process Change → Update Truth cycle, with human gates at the points where judgment matters.
In METR's randomized controlled trial, experienced developers forecast that AI would make them 24% faster and, after the study, still believed it had made them about 20% faster. Measured, they were 19% slower. The bottleneck wasn't generation speed. It was intent clarity, spec quality, and review rigor.
STRUT is designed around that finding. Every change goes through classification, specification, test-first implementation, fail-fast review, and build verification, with human gates at the two points where judgment can't be automated: spec approval and PR review.
8 orchestrator skills, 10 worker agents, 1 bash script. All inter-agent communication passes through structured JSON file contracts in .pipeline/. Orchestrators read only status fields, never content.
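To make the status-only rule concrete, here is a minimal sketch of how an orchestrator could inspect a contract file. The `status` key, the `<name>.json` file naming, and the `read_status` helper are assumptions for illustration; the real contract schema is defined in the repo.

```python
import json
from pathlib import Path

PIPELINE_DIR = Path(".pipeline")  # contract directory named in the text


def read_status(contract_name: str, pipeline_dir: Path = PIPELINE_DIR) -> str:
    """Return only the status field of a worker's contract file.

    Hypothetical layout: one JSON file per contract, with a top-level
    "status" key. Neither detail is confirmed by the STRUT docs.
    """
    path = pipeline_dir / f"{contract_name}.json"
    with path.open(encoding="utf-8") as fh:
        contract = json.load(fh)
    # Everything except the status is discarded here, so routing
    # decisions can never depend on the worker's content.
    return str(contract["status"])
```

Keeping the orchestrator blind to content means a long spec or review report stays out of its context; only the worker that needs the file reads it in full.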
v0, Bolt, and Lovable generate whole apps fast. STRUT assumes you already have a codebase and processes individual changes through a rigorous pipeline. Different tool, different job.
STRUT orchestrates your build, lint, typecheck, and test commands. It has no opinions about your language, framework, or test runner.
Adding a button doesn't need a 19-component pipeline. STRUT is for changes where getting it wrong costs more than getting it slow.
The pipeline surfaces problems. It doesn't fix your architecture. You still need to make judgment calls at every gate. That's the point.
Every change is classified from scan evidence along two independent binary modifiers, trust and decompose. The pipeline adds components only when the risk warrants them.
The trust modifier adds MUST NEVER constraints, negative criteria, security review by Opus, mandatory knowledge capture, and describe-flow traceability.
The decompose modifier adds a task breakdown (up to 5 tasks), a per-task TDD loop, and a human gate after task 1 to verify the approach before the remaining tasks proceed.
Four combinations: standard (both OFF), trust-only, decompose-only, guarded-decompose (both ON, which adds adversarial spec review).
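The routing described above can be sketched as a small lookup. The mode names and added components come from the text; the `Modifiers` class, the `trust` and `decompose` field names, and both functions are hypothetical and not STRUT's actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Modifiers:
    trust: bool       # assumed name for the first modifier
    decompose: bool   # assumed name for the second modifier


def pipeline_mode(m: Modifiers) -> str:
    """Map the two independent flags to one of the four named combinations."""
    if m.trust and m.decompose:
        return "guarded-decompose"
    if m.trust:
        return "trust-only"
    if m.decompose:
        return "decompose-only"
    return "standard"


def added_components(m: Modifiers) -> List[str]:
    """List the extra pipeline components each modifier switches on."""
    parts: List[str] = []
    if m.trust:
        parts += [
            "MUST NEVER constraints",
            "negative criteria",
            "security review",
            "knowledge capture",
            "describe-flow traceability",
        ]
    if m.decompose:
        parts += ["task breakdown", "per-task TDD loop", "human gate after task 1"]
    if m.trust and m.decompose:
        # Only the combined mode adds this extra review.
        parts.append("adversarial spec review")
    return parts
```

Because the flags are independent, a standard change (both OFF) gets an empty list and runs the base pipeline only.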
Every architectural decision maps to tiered citations. The full rationale is in architectural-decisions.md and research-index.md in the repo.
Every architectural decision in STRUT is tagged with its platform coupling. The patterns (spec-first, file contracts, modifier-based risk routing, TDD enforcement) are grounded in LLM research that applies to any model. The current implementation uses Claude Code's skill/agent system. That's a starting point, not a ceiling.
STRUT integrates into any project with a working build/test pipeline. Claude handles the mechanical setup; you provide the domain knowledge.
Step mode is useful for your first few pipeline runs. At each pause you see the completed agent, its output file, and what runs next. Type continue or abort at each step. The flag is per-invocation, so omitting it on a resume returns to normal flow.
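The pause-and-confirm behavior can be pictured with a rough sketch like the one below. It is not STRUT's code: `run_in_step_mode`, the `run_agent` callback, and the prompt wording are all assumptions, and the real pipeline runs inside Claude Code rather than a Python loop.

```python
from typing import Callable, Iterable, List


def run_in_step_mode(
    agents: Iterable[str],
    run_agent: Callable[[str], str],
    ask: Callable[[str], str] = input,
) -> List[str]:
    """Run agents in order, pausing between them for 'continue' or 'abort'.

    run_agent (hypothetical) runs one agent and returns the path of its
    output file. Returns the names of the agents that completed.
    """
    queue = list(agents)
    completed: List[str] = []
    for i, name in enumerate(queue):
        output_file = run_agent(name)
        completed.append(name)
        if i + 1 == len(queue):
            break  # nothing runs next, so there is nothing to confirm
        prompt = (
            f"completed: {name}\n"
            f"output: {output_file}\n"
            f"next: {queue[i + 1]}\n"
            "continue or abort? "
        )
        # In this sketch, any answer other than 'abort' continues.
        if ask(prompt).strip().lower() == "abort":
            break
    return completed
```

Passing `ask` as a parameter keeps the loop testable without a terminal; by default it falls back to the built-in `input`.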