Build Phases
The MVP was built in 8 phases over 20 PRs, with 350 tests and zero lint errors throughout.
Domain Foundation
CaseSchema (19 Pydantic v2 models), domain entities (Session, Event, State, Artifacts), StateTransitionEngine with 27 action handlers, deterministic replay, case validator CLI (5 commands).
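Deterministic replay follows from keeping each action handler a pure function of the current state and the event payload: reapplying the same event log to the same initial state always gives the same result. The sketch below is a minimal illustration under that assumption. `Event`, `HANDLERS`, `apply`, `replay`, and the two action names are hypothetical stand-ins, not the project's actual entities or handlers.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical, simplified stand-in; the project's real State entity
# is not shown in this document.
State = dict[str, int]


@dataclass(frozen=True)
class Event:
    action: str
    payload: dict


Handler = Callable[[State, dict], State]


def handle_allocate(state: State, payload: dict) -> State:
    # Return a new state rather than mutating the input.
    return {**state, "budget": state["budget"] - payload["amount"]}


def handle_advance(state: State, payload: dict) -> State:
    return {**state, "day": state["day"] + 1}


# One handler per action name (illustrative; the real engine has 27).
HANDLERS: dict[str, Handler] = {
    "allocate_budget": handle_allocate,
    "advance_day": handle_advance,
}


def apply(state: State, event: Event) -> State:
    return HANDLERS[event.action](state, event.payload)


def replay(initial: State, events: list[Event]) -> State:
    # Fold the event log over the initial state, in order.
    state = initial
    for event in events:
        state = apply(state, event)
    return state


initial = {"budget": 100, "day": 0}
log = [
    Event("allocate_budget", {"amount": 30}),
    Event("advance_day", {}),
    Event("allocate_budget", {"amount": 20}),
]
final_state = replay(initial, log)
```

Because the handlers never mutate their input, `initial` is unchanged after the run and calling `replay(initial, log)` again yields an equal state.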
packages/case-schema/, packages/domain/, packages/simulation-engine/, apps/api/cli.py
Participant Flow
FastAPI application, SQLAlchemy 2.0 ORM (3 models), Alembic migrations, session/action/event/artifact endpoints, React + Vite + TanStack Query workspace with 3-pane layout.
apps/api/, apps/web/, tests/integration/
Stakeholders
Hybrid stakeholder engine: PolicyEngine (6 rule types) + LanguageRenderer with post-generation validation. ModelGateway with provider-neutral Protocol + MockProvider.
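A provider-neutral gateway can be expressed with `typing.Protocol`, so any object with the right method shape qualifies as a provider without inheriting from a base class. The sketch below pairs that with a simple post-generation check. The `complete` method name, the `render_reply` helper, and the leak rule (reject text that mentions a hidden fact) are assumptions for illustration, not the project's actual interface or validation logic.

```python
from typing import Protocol


class ModelProvider(Protocol):
    """Hypothetical provider-neutral interface; the method name is assumed."""

    def complete(self, prompt: str) -> str: ...


class MockProvider:
    """Deterministic stand-in that needs no network access."""

    def __init__(self, canned: str) -> None:
        self.canned = canned

    def complete(self, prompt: str) -> str:
        return self.canned


def render_reply(provider: ModelProvider, decision: str, hidden_facts: list[str]) -> str:
    """Phrase a policy decision as text, then validate the generated output."""
    text = provider.complete(f"Phrase this stakeholder decision: {decision}")
    # Post-generation validation (assumed rule): fall back to a fixed
    # template if the text mentions any fact that should stay hidden.
    if any(fact.lower() in text.lower() for fact in hidden_facts):
        return f"[{decision}] I can't go into more detail on that."
    return text


hidden = ["budget was cut"]
safe = render_reply(MockProvider("We need more evidence first."), "request_evidence", hidden)
leaky = render_reply(MockProvider("Honestly, the budget was cut last week."), "request_evidence", hidden)
```

Here `safe` passes through unchanged, while `leaky` is replaced by the template because the generated text contained a hidden fact. The policy decision itself is never produced by the model in this arrangement; only its wording is.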
packages/stakeholder-engine/, packages/model-gateway/
Evaluation
12 automated validators, 6 hard constraints (3 critical, 1 major, 2 minor), 6-dimension scoring, auto-triggered report generation on session completion.
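One way to model hard constraints with severities is a list of named checks, each tagged with a severity level. The sketch below is illustrative only: the three constraint names, the session fields, and the rule that a critical violation fails the session are assumptions, since the actual six constraints and their scoring rules are not listed here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Severity(Enum):
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"


@dataclass(frozen=True)
class HardConstraint:
    name: str
    severity: Severity
    # Returns True when the session satisfies the constraint.
    check: Callable[[dict], bool]


# Hypothetical constraints, one per severity level.
CONSTRAINTS = [
    HardConstraint("no_unapproved_deploy", Severity.CRITICAL,
                   lambda s: not s["deployed_without_approval"]),
    HardConstraint("budget_not_exceeded", Severity.MAJOR,
                   lambda s: s["spent"] <= s["budget"]),
    HardConstraint("summary_written", Severity.MINOR,
                   lambda s: s["has_summary"]),
]


def evaluate(session: dict) -> dict:
    violations = [c for c in CONSTRAINTS if not c.check(session)]
    return {
        "violations": [(c.name, c.severity.value) for c in violations],
        # Assumed rule: any critical violation fails the session outright.
        "passed": all(c.severity is not Severity.CRITICAL for c in violations),
    }


report = evaluate({
    "deployed_without_approval": False,
    "spent": 120,
    "budget": 100,
    "has_summary": True,
})
```

In this run the session overspends, so `report` records one major violation but still passes, because no critical constraint was broken.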
packages/evaluation-engine/, routes/evaluations.py
Seed Cases
3 complete cases: Wrong Use-Case Selection, Unsafe Autonomy Transition, Unmaintainable Prototype. All hidden facts reachable. 2+ valid strategy patterns per case.
cases/wrong-use-case/, cases/unsafe-autonomy/, cases/unmaintainable-prototype/
Replay & Expert Review
ReplayService with state diff computation and dimension tagging. Expert review panel with dimension scoring and event citations. Pairwise trajectory comparison.
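State diff computation between two consecutive replay steps can be as simple as comparing key sets. The sketch below assumes state is a flat dictionary and reports added, removed, and changed keys. `diff_states` and the example fields are hypothetical; the actual ReplayService may diff nested structures or use a different output shape.

```python
from typing import Any


def diff_states(before: dict[str, Any], after: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Return added, removed, and changed top-level keys between two states."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: {"from": before[k], "to": after[k]}
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


before = {"budget": 100, "phase": "discovery"}
after = {"budget": 70, "phase": "discovery", "risk_flag": True}
step_diff = diff_states(before, after)
```

For this pair, `step_diff` reports `risk_flag` as added and `budget` as changed from 100 to 70, while the unchanged `phase` key is omitted.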
ReplayTimeline.tsx, ExpertReviewPanel.tsx, routes/replay.py
Agent Interface
Dedicated agent API endpoints with machine-readable action schemas. Reference baseline agent script using heuristic policy. Agent-friendly error handling.
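A machine-readable action schema lets an agent discover what parameters an action needs, and a structured error tells it how to recover without parsing prose. The sketch below uses a JSON Schema style dictionary and a small validator. The action name, the field names, the error codes, and `validate_action` are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Any

# Hypothetical action schema in JSON Schema style.
ACTION_SCHEMAS: dict[str, dict[str, Any]] = {
    "ask_stakeholder": {
        "type": "object",
        "required": ["stakeholder_id", "question"],
        "properties": {
            "stakeholder_id": {"type": "string"},
            "question": {"type": "string"},
        },
    },
}


def validate_action(name: str, params: dict[str, Any]) -> dict[str, Any]:
    """Check required fields only; return a structured result an agent can act on."""
    schema = ACTION_SCHEMAS.get(name)
    if schema is None:
        # Tell the agent which actions do exist.
        return {"ok": False, "error": {"code": "unknown_action",
                                       "valid_actions": sorted(ACTION_SCHEMAS)}}
    missing = [field for field in schema["required"] if field not in params]
    if missing:
        # Name the missing fields and echo the schema for self-correction.
        return {"ok": False, "error": {"code": "missing_fields",
                                       "fields": missing, "schema": schema}}
    return {"ok": True}


ok = validate_action("ask_stakeholder", {"stakeholder_id": "cto", "question": "Why?"})
bad = validate_action("ask_stakeholder", {"question": "Why?"})
unknown = validate_action("fly", {})
```

Each failure carries a stable `code` plus the data needed to retry, which is the kind of error shape a scripted agent can branch on.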
routes/agent.py, scripts/baseline_agent.py
Hardening
GitHub Actions CI pipeline, Docker Compose, RBAC middleware, slowapi rate limiting, 15 adversarial tests (prompt injection, state extraction, event tampering, cross-session access).
.github/workflows/ci.yml, docker-compose.yml, tests/adversarial/