Overview

Guidance Modules are the turn-to-turn intelligence layer for Autopilot Full Auto runs. They replace the manual “continue” loop with a structured decision system that can be evaluated, improved, and composed over time.

Definitions

  • Turn: one Codex execution window that ends in turn/completed or turn/error.
  • Run: a multi-turn Full Auto session composed of turns.
  • Guidance: a soft recommendation for what to do next.
  • Guardrails: deterministic constraints that can override guidance.

Why They Exist

Without guidance, long runs rely on ad hoc prompts and manual intervention. A Guidance Module:
  • Sees the full context of the last turn
  • Understands goal, constraints, and budget
  • Chooses the next action with measurable confidence
  • Enforces deterministic guardrails for safety
  • Logs every decision for replay and optimization

Guidance Contract (Conceptual)

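// What a Guidance Module receives after each turn.
type GuidanceInputs = {
  // The run's goal and, optionally, how success is judged.
  goal: { intent: string; success_criteria?: string[] }
  // Summary of the last turn, built from Codex events.
  summary: FullAutoTurnSummary
  // Run-level state: progress counters, remaining budget, and permissions.
  state: {
    turn_count: number
    no_progress_count: number
    tokens_remaining?: number
    time_remaining_ms?: number
    permissions: { can_exec: boolean; can_write: boolean; network: "none" | "scoped" | "full" }
  }
}

// What a Guidance Module returns: a soft recommendation that guardrails may override.
type GuidanceDecision = {
  action: "continue" | "pause" | "stop" | "review"
  // Optional prompt for the next turn.
  next_input?: string
  reason: string
  confidence: number
  tags?: string[]
}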
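
To make the contract concrete, here is a minimal sketch of a rule-based module that satisfies it. The function name simpleGuidance, the two-field FullAutoTurnSummary shape, the 0 to 1 confidence scale, and the specific thresholds are all assumptions made for illustration; the real summary type and the DSPy-backed decision step are not specified on this page.

```typescript
// Hypothetical, reduced summary shape; the real FullAutoTurnSummary is richer.
type FullAutoTurnSummary = { status: "completed" | "error"; diff_changed: boolean };

type GuidanceInputs = {
  goal: { intent: string; success_criteria?: string[] };
  summary: FullAutoTurnSummary;
  state: {
    turn_count: number;
    no_progress_count: number;
    tokens_remaining?: number;
    time_remaining_ms?: number;
    permissions: { can_exec: boolean; can_write: boolean; network: "none" | "scoped" | "full" };
  };
};

type GuidanceDecision = {
  action: "continue" | "pause" | "stop" | "review";
  next_input?: string;
  reason: string;
  confidence: number; // assumed to lie in [0, 1]
  tags?: string[];
};

// Hypothetical rule-based module: same inputs and outputs as the contract,
// but with hand-written rules instead of a learned decision step.
function simpleGuidance(inputs: GuidanceInputs): GuidanceDecision {
  if (inputs.summary.status === "error") {
    return { action: "review", reason: "last turn ended in turn/error", confidence: 0.9, tags: ["error"] };
  }
  // Assumed threshold: two consecutive turns without progress.
  if (!inputs.summary.diff_changed && inputs.state.no_progress_count >= 2) {
    return { action: "pause", reason: "no progress in recent turns", confidence: 0.6, tags: ["stalled"] };
  }
  return {
    action: "continue",
    next_input: `Continue working toward: ${inputs.goal.intent}`,
    reason: "last turn completed and the run is still progressing",
    confidence: 0.7,
  };
}
```

Because the module is a pure function of GuidanceInputs, any other implementation with the same signature could replace it without changing the surrounding run loop.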

Current Pipeline (Full Auto Today)

Full Auto currently runs four concrete steps between Codex turns:
  1. Turn Summary → Build a FullAutoTurnSummary from Codex events.
  2. DSPy Decision → Choose the next action and optional prompt.
  3. Guardrails → Enforce budget/safety limits and override if needed.
  4. Dispatch → Execute the action and start the next turn if continuing.
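
The four steps can be sketched as a loop. This is an illustrative outline only: the PipelineSteps interface, the runFullAuto function, and the maxIterations safety cap are hypothetical names introduced here, and the sketch is synchronous for brevity where a real run would presumably be asynchronous.

```typescript
type Action = "continue" | "pause" | "stop" | "review";
type Decision = { action: Action; next_input?: string; reason: string; confidence: number };

// Hypothetical interface: one function per pipeline step.
interface PipelineSteps<Event, Summary> {
  summarize(events: Event[]): Summary;                  // 1. Turn Summary
  decide(summary: Summary): Decision;                   // 2. Decision (DSPy in Full Auto)
  guard(summary: Summary, proposed: Decision): Decision; // 3. Guardrails
  dispatch(decision: Decision): Event[] | null;         // 4. Dispatch; returns the next turn's events
}

// Runs the between-turn pipeline until a decision other than "continue" is made.
// Returns the final (post-guardrail) decision for each turn.
function runFullAuto<E, S>(steps: PipelineSteps<E, S>, firstTurn: E[], maxIterations = 100): Decision[] {
  const log: Decision[] = [];
  let events: E[] | null = firstTurn;
  while (events !== null && log.length < maxIterations) {
    const summary = steps.summarize(events);
    const proposed = steps.decide(summary);
    const final = steps.guard(summary, proposed);
    log.push(final);
    // Only "continue" starts another turn; every other action ends the loop.
    events = final.action === "continue" ? steps.dispatch(final) : null;
  }
  return log;
}
```

Keeping the guardrail step after the decision step means a deterministic rule always has the last word, which matches the definition of guardrails as constraints that can override guidance.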

Turn Summary Inputs (Examples)

  • turn/plan/updated, turn/diff/updated
  • thread/tokenUsage/updated
  • item/commandExecution/requestApproval, item/fileChange/requestApproval
  • item/tool/requestUserInput
  • turn/error, turn/completed
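
As a rough illustration of how such events might be folded into a summary, the sketch below counts the event types listed above. The event method names come from the list; the CodexEvent payload shape (including params.totalTokens), the TurnSummary fields, and the summarizeTurn function are assumptions, not the actual FullAutoTurnSummary definition.

```typescript
// Hypothetical event envelope; real Codex event payloads may differ.
type CodexEvent = { method: string; params?: { totalTokens?: number } };

// Hypothetical, simplified stand-in for FullAutoTurnSummary.
type TurnSummary = {
  status: "completed" | "error" | "incomplete";
  plan_updates: number;
  diff_updates: number;
  pending_approvals: number;
  needs_user_input: boolean;
  total_tokens?: number;
};

function summarizeTurn(events: CodexEvent[]): TurnSummary {
  const summary: TurnSummary = {
    status: "incomplete",
    plan_updates: 0,
    diff_updates: 0,
    pending_approvals: 0,
    needs_user_input: false,
  };
  for (const ev of events) {
    switch (ev.method) {
      case "turn/plan/updated":
        summary.plan_updates += 1;
        break;
      case "turn/diff/updated":
        summary.diff_updates += 1;
        break;
      case "thread/tokenUsage/updated":
        // Keep the most recent usage figure, if the payload carries one.
        if (ev.params?.totalTokens !== undefined) summary.total_tokens = ev.params.totalTokens;
        break;
      case "item/commandExecution/requestApproval":
      case "item/fileChange/requestApproval":
        summary.pending_approvals += 1;
        break;
      case "item/tool/requestUserInput":
        summary.needs_user_input = true;
        break;
      case "turn/completed":
        summary.status = "completed";
        break;
      case "turn/error":
        summary.status = "error";
        break;
    }
  }
  return summary;
}
```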

Guardrail Rules (Current)

  • turn_failed -> stop
  • turn_interrupted -> pause
  • max_turns / max_tokens -> stop
  • no_progress_limit -> stop
  • low_confidence or review -> pause
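
The rules above can be sketched as a single deterministic function. The rule names and their target actions are taken from the list; everything else is assumed for illustration, including the GuardState and Limits shapes, the min_confidence threshold, the use of >= for limit checks, and the precedence (rules are evaluated in the order listed).

```typescript
type Action = "continue" | "pause" | "stop" | "review";
type Decision = { action: Action; next_input?: string; reason: string; confidence: number };

// Hypothetical state and limit shapes.
type GuardState = {
  turn_failed: boolean;
  turn_interrupted: boolean;
  turn_count: number;
  tokens_used: number;
  no_progress_count: number;
};
type Limits = { max_turns: number; max_tokens: number; no_progress_limit: number; min_confidence: number };

// The final decision plus the name of the rule that fired (null if none did).
type Guarded = { decision: Decision; triggered: string | null };

function applyGuardrails(d: Decision, s: GuardState, l: Limits): Guarded {
  // Build an override that replaces the proposed action and records the rule.
  const override = (action: Action, rule: string): Guarded => ({
    decision: { action, reason: `guardrail:${rule}`, confidence: d.confidence },
    triggered: rule,
  });
  if (s.turn_failed) return override("stop", "turn_failed");
  if (s.turn_interrupted) return override("pause", "turn_interrupted");
  if (s.turn_count >= l.max_turns) return override("stop", "max_turns");
  if (s.tokens_used >= l.max_tokens) return override("stop", "max_tokens");
  if (s.no_progress_count >= l.no_progress_limit) return override("stop", "no_progress_limit");
  if (d.confidence < l.min_confidence) return override("pause", "low_confidence");
  if (d.action === "review") return override("pause", "review");
  return { decision: d, triggered: null }; // no rule fired; guidance stands
}
```

Returning the triggering rule alongside the decision gives the caller what it needs to write a guardrail audit entry.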

Decision Records

Every decision is logged with:
  • Input summary + hashes
  • Decision output + confidence
  • Guardrail audit trail
  • Versioned model/package info
This enables replay, evaluation, and attribution.
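
A decision record along these lines might look like the sketch below. The field names, the makeRecord helper, and the version fields are hypothetical. The hash is a 32-bit FNV-1a over UTF-16 code units, used here only as a dependency-free stand-in; a production record would more plausibly use a cryptographic hash over a canonical serialization, since JSON.stringify output depends on key order.

```typescript
type Decision = { action: "continue" | "pause" | "stop" | "review"; reason: string; confidence: number };

// Hypothetical record shape covering the four logged items.
type DecisionRecord = {
  input_summary: Record<string, unknown>;
  input_hash: string;
  decision: Decision;
  guardrail_audit: string[];
  versions: { model_version: string; package_version: string };
};

// Stand-in hash: 32-bit FNV-1a over UTF-16 code units, as 8 hex digits.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

function makeRecord(
  summary: Record<string, unknown>,
  decision: Decision,
  guardrailAudit: string[],
  versions: { model_version: string; package_version: string },
): DecisionRecord {
  return {
    input_summary: summary,
    input_hash: fnv1a(JSON.stringify(summary)),
    decision,
    guardrail_audit: guardrailAudit,
    versions,
  };
}

// Replay check: two records are comparable only if they saw the same inputs.
function sameInputs(a: DecisionRecord, b: DecisionRecord): boolean {
  return a.input_hash === b.input_hash;
}
```

With the input hash and version info stored together, two decisions over identical inputs can be attributed to a change in the model or package rather than a change in the turn.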

Future Direction

The long-term goal is an extensible, packageable guidance stack:
  • Composable modules (BudgetPolicy, NextActionSelector, Verifier)
  • Clear signatures for drop-in replacements
  • Replayable decision records with versioned manifests
  • Evaluation and optimization loops (DSPy optimizers)
In that model, guidance becomes a marketplace surface for packaged agent intelligence rather than a collection of one-off prompts.