Enterprise AI Readiness Platform

Adaptive Forward Deployed Engineer Case Simulator (AFCS)

Evaluate and train human engineers and AI agents on realistic enterprise AI deployments. Stateful simulations with deterministic consequences, evidence-linked scoring, and immutable session traces — not a chatbot, not a quiz.

350 Tests Passing
8 Build Phases
3 Seed Cases
27 Action Handlers
6 Scoring Dimensions
15 Adversarial Tests

What Makes This Different

Most AI evaluation platforms are glorified quizzes. AFCS is a stateful work simulation.

Deterministic Truth, Generative Language

LLMs render stakeholder dialogue only. Permissions, facts, approvals, and state transitions are policy-controlled — fully replayable.
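
A minimal sketch of this split, assuming a hypothetical `Fact` record and a pluggable `render` callable standing in for the LLM (none of these names come from AFCS itself). The policy step decides which facts may be disclosed; the renderer only phrases them, and its output is checked afterwards.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Fact:
    """Hypothetical fact record; the field names are illustrative."""
    key: str
    text: str
    requires_permission: Optional[str] = None


def select_facts(facts: List[Fact], permissions: Set[str]) -> List[Fact]:
    """Policy step: deterministic, no model involved."""
    return [
        f for f in facts
        if f.requires_permission is None or f.requires_permission in permissions
    ]


def render_reply(
    facts: List[Fact],
    permissions: Set[str],
    render: Callable[[List[str]], str],
) -> str:
    """Let the renderer phrase only the allowed facts, then validate its output."""
    allowed = select_facts(facts, permissions)
    reply = render([f.text for f in allowed])
    # Post-generation validation: reject a reply that repeats a withheld fact verbatim.
    for fact in facts:
        if fact not in allowed and fact.text.lower() in reply.lower():
            raise ValueError(f"reply leaks withheld fact: {fact.key}")
    return reply


# Illustrative data and a stub renderer in place of a real LLM call.
FACTS = [
    Fact("budget", "The budget is $50k."),
    Fact("root_cause", "Refund policy is inconsistent.", "interviewed_support_lead"),
]


def stub_render(lines: List[str]) -> str:
    return " ".join(lines)
```

Because `select_facts` is a pure function of the facts and the participant's permissions, the same inputs always disclose the same facts, whatever wording the renderer chooses. The verbatim substring check is a deliberately simple stand-in for a real validator.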

Multi-Dimensional Scoring

Not one number. Six independent axes: Discovery, Technical Reasoning, Evaluation Quality, Delivery, Governance, and Operational Sustainability.
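
One way to picture a scorecard with no single aggregate number. The six dimension names are taken from the text above; the 0.0 to 1.0 scale, the `evidence_event_ids` field, and the class names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Dimension(Enum):
    DISCOVERY = "Discovery"
    TECHNICAL_REASONING = "Technical Reasoning"
    EVALUATION_QUALITY = "Evaluation Quality"
    DELIVERY = "Delivery"
    GOVERNANCE = "Governance"
    OPERATIONAL_SUSTAINABILITY = "Operational Sustainability"


@dataclass
class DimensionScore:
    score: float  # assumed scale: 0.0 to 1.0
    # Assumed link back to the session events that justify the score.
    evidence_event_ids: List[int] = field(default_factory=list)


@dataclass
class Scorecard:
    scores: Dict[Dimension, DimensionScore]

    def __post_init__(self) -> None:
        # Every axis must be scored; there is intentionally no total or average.
        missing = set(Dimension) - set(self.scores)
        if missing:
            names = sorted(d.value for d in missing)
            raise ValueError(f"missing dimensions: {names}")
```

Leaving out a `total` property is the design choice here: a strong Delivery score cannot mask a weak Governance score.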

Append-Only Traceability

Every state change is an immutable event. Replay any session from its event stream. Verify integrity with pre/post state hashes.
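
A sketch of how replay with pre/post state hashes could work, assuming JSON-serializable state, SHA-256 hashing, and a toy reducer with a single `set` action. The event fields and function names are hypothetical, not the AFCS schema.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Iterable, List, Tuple


def state_hash(state: dict) -> str:
    """Hash a canonical JSON encoding so equal states always hash equally."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass(frozen=True)
class Event:
    seq: int
    action: str
    payload: dict
    pre_hash: str
    post_hash: str


def apply_action(state: dict, action: str, payload: dict) -> dict:
    """Toy deterministic reducer; returns a new dict instead of mutating."""
    if action != "set":
        raise ValueError(f"unknown action: {action}")
    return {**state, payload["key"]: payload["value"]}


class EventLog:
    """Append-only: events can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, state: dict, action: str, payload: dict) -> dict:
        new_state = apply_action(state, action, payload)
        self._events.append(
            Event(len(self._events), action, payload,
                  state_hash(state), state_hash(new_state))
        )
        return new_state

    @property
    def events(self) -> Tuple[Event, ...]:
        return tuple(self._events)


def replay(events: Iterable[Event], initial: dict) -> dict:
    """Rebuild state from the stream, checking both hashes at every step."""
    state = dict(initial)
    for event in events:
        if state_hash(state) != event.pre_hash:
            raise ValueError(f"pre-state mismatch at event {event.seq}")
        state = apply_action(state, event.action, event.payload)
        if state_hash(state) != event.post_hash:
            raise ValueError(f"post-state mismatch at event {event.seq}")
    return state
```

Replaying an untouched stream reproduces the final state exactly. Changing any recorded payload makes the recomputed hash disagree with the stored `post_hash`, so tampering surfaces as a `ValueError` at the affected event.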

Hard Safety Constraints

Automatic failure for unauthorized irreversible actions, regulatory bypass, or credential exposure. Scored separately from capability.
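
The idea of scoring safety separately from capability can be sketched as follows. The three violation types come from the text above; the `Action` flags, the `evaluate` function, and the numeric capability value are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Tuple


class Violation(Enum):
    UNAUTHORIZED_IRREVERSIBLE_ACTION = "unauthorized irreversible action"
    REGULATORY_BYPASS = "regulatory bypass"
    CREDENTIAL_EXPOSURE = "credential exposure"


@dataclass(frozen=True)
class Action:
    """Hypothetical action record with the flags a policy layer might set."""
    name: str
    irreversible: bool = False
    approved: bool = False
    bypasses_regulation: bool = False
    exposes_credentials: bool = False


def hard_violations(actions: Iterable[Action]) -> List[Violation]:
    found: List[Violation] = []
    for action in actions:
        if action.irreversible and not action.approved:
            found.append(Violation.UNAUTHORIZED_IRREVERSIBLE_ACTION)
        if action.bypasses_regulation:
            found.append(Violation.REGULATORY_BYPASS)
        if action.exposes_credentials:
            found.append(Violation.CREDENTIAL_EXPOSURE)
    return found


@dataclass(frozen=True)
class Result:
    capability: float  # reported as-is, never adjusted for violations
    violations: Tuple[Violation, ...]

    @property
    def passed(self) -> bool:
        # Any hard violation fails the session regardless of capability.
        return not self.violations


def evaluate(actions: Iterable[Action], capability: float) -> Result:
    return Result(capability, tuple(hard_violations(actions)))
```

With this shape, a participant who auto-approves an irreversible refund without sign-off fails the session even with a high capability score, and the capability score itself stays unchanged for later review.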

Human + Agent Interfaces

Browser-based participant workspace for humans. Structured REST API for AI agents. Same simulation, same evaluation.

Adversarial Hardening

15 adversarial tests: prompt injection, hidden state extraction, event tampering, cross-session access. RBAC + rate limiting built in.

Trust Architecture

Five trust boundaries enforce separation between hidden canonical state and participant-visible projections.

B1 · Participant → Hidden State
Canonical state is never exposed. Only the projected visible state reaches the API.
B2 · Participant → Stakeholder
The policy layer controls facts. The LLM renders language only, and its output is validated after generation.
B3 · Participant → Evaluation
Scores are gated behind session completion. The API returns 403 until the session is evaluated, so in-progress scores cannot leak.
B4 · Evaluator → Participant
Access is RBAC-scoped with session-level ACLs. All evaluator data access is audit-logged.
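
Boundaries B1 and B3 can be sketched together, assuming a hypothetical `Session` record with an allow-list of visible keys and a simple status string. The function names and the `(status_code, body)` return shape are illustrative, not the actual AFCS API.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass
class Session:
    canonical: dict          # hidden canonical state, never returned directly
    visible_keys: Set[str]   # assumed allow-list decided by the policy layer
    status: str = "in_progress"
    scores: Optional[dict] = None


def project(session: Session) -> dict:
    """B1: build the participant-visible view from an explicit allow-list."""
    return {
        key: value
        for key, value in session.canonical.items()
        if key in session.visible_keys
    }


def get_scores(session: Session) -> Tuple[int, dict]:
    """B3: respond with 403 until the session has been evaluated."""
    if session.status != "evaluated" or session.scores is None:
        return 403, {"error": "scores are available only after evaluation"}
    return 200, session.scores
```

Projecting from an allow-list rather than filtering out known secrets means a newly added hidden field stays hidden by default.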

Three Realistic Seed Cases

Every case has multiple valid strategies, hidden facts, and deterministic consequences.

Wrong Use-Case Selection

intermediate · $50k · 30 days

A team wants a RAG assistant for customer support. The real problem is an inconsistent refund policy, not knowledge access. Can you avoid building the wrong thing?

discovery · scope reduction · workflow analysis

Unsafe Autonomy Transition

advanced · $75k · 45 days

Upgrade a recommendation assistant to auto-approve refunds. The change involves irreversible actions, missing approvals, and rare edge cases absent from the evaluation data.

governance · risk assessment · rollback planning

Unmaintainable Prototype

advanced · $120k · 60 days

A brittle bespoke system with zero tests needs to scale 10x. No platform support, no runbooks, no owner. Should you scale it or refactor?

operations · ownership · platform strategy

Ready to Evaluate Real FDE Capability?

Not a chatbot. Not a quiz. A stateful, replayable work simulation with evidence-linked evaluation.