
AEGIS Parameter Reference Guide

Version: 1.0.0 | Updated: 2026-02-12 | Status: Active

This guide provides complete parameter documentation for the AEGIS Governance Decision SDK. It covers every input parameter, configuration value, and decision output — with derivation guidance, domain examples, and boundary behavior.

Audience: Integration engineers, AI agent developers, platform teams, risk analysts.

Authoritative sources: schema/interface-contract.yaml (frozen values), src/config.py (AegisConfig), src/engine/gates.py (gate logic), src/integration/pcw_decide.py (decision flow).

Interactive guidance: For domain-specific derivation formulas and worked examples via MCP, call aegis_get_scoring_guide(domain) — available for trading, cicd, moderation, agents, and generic domains.


Quick Reference Table

| Parameter | Type | Range | Default | Gate | Section |
|---|---|---|---|---|---|
| proposal_summary | string | non-empty | | | 1.1 |
| estimated_impact | string | low/medium/high/critical | medium | | 1.2 |
| agent_id | string | any | "aegis-cli" | | 1.3 |
| session_id | string | any | auto-generated | | 1.3 |
| risk_baseline | float | 0.0-1.0 | 0.0 | Risk | 2.1 |
| risk_proposed | float | 0.0-1.0 | 0.0 | Risk | 2.2 |
| profit_baseline | float | any | 0.0 | Profit | 3.1 |
| profit_proposed | float | any | 0.0 | Profit | 3.2 |
| novelty_score | float | 0.0-1.0 | 0.5 | Novelty | 4 |
| complexity_score | float | 0.0-1.0 | 0.5 | Complexity | 5 |
| quality_score | float | 0.0-1.0 | 0.7 | Quality | 6 |
| quality_subscores | float[] | 0.0-1.0 each | [0.7, 0.7, 0.7] | Quality | 6 |
| utility_result | UtilityResult | | None | Utility | 7 |
| requires_human_approval | bool | | false | | 8 |
| time_sensitive | bool | | false | | 8 |
| reversible | bool | | true | | 8 |
| shadow_mode | bool | | false | | 8 |
| drift_baseline_data | float[] | 30+ values | None | Drift | 9 |

1. Proposal Context Parameters

1.1 proposal_summary

What it represents: A human-readable description of the proposed action. Stored in the audit trail and telemetry events.

  • Type: string (non-empty)
  • Used by: Telemetry, audit trail, decision rationale
  • Tip: Keep it specific enough for an auditor to understand the proposal months later.

1.2 estimated_impact

What it represents: The blast radius of the proposal — how much of the system could be affected if something goes wrong.

| Value | Semantics | Gate Behavior |
|---|---|---|
| low | Isolated change, easy rollback | Standard gate evaluation |
| medium | Moderate scope, reversible | Standard gate evaluation |
| high | Wide impact, partial reversibility | Escalate on any gate failure |
| critical | System-wide, hard to reverse | Always escalate; requires human approval |

1.3 agent_id

What it represents: Identifier of the AI agent or user making the proposal. Recorded in telemetry and audit trail for accountability.

  • Default: "aegis-cli" (CLI), "mcp-agent" (MCP)
  • Best practice: Use a stable identifier (e.g., "claude-code-v4", "codex-pr-reviewer", "deploy-bot-prod")

session_id is auto-generated (UUID) if not provided. Use it to correlate multiple decisions within a single agent session.


2. Risk Parameters

The risk gate uses Bayesian posterior evaluation: P(delta >= trigger_factor | data) > confidence_threshold.

2.1 risk_baseline

What it represents: The current risk level before the proposed change.

  • Range: 0.0 - 1.0
  • Default: 0.0 (no existing risk)
  • Units: Dimensionless probability / normalized ratio

How to derive it:

| Domain | Metric | Derivation |
|---|---|---|
| Trading | Value-at-Risk / limit | Current VaR divided by position limit |
| CI/CD | Error rate | Current production error rate (e.g., 0.02 = 2%) |
| Content moderation | False positive rate | Current FPR from quality dashboard |
| Autonomous agents | Failure rate | Recent task failure count / total tasks |

Boundary behavior:

  • risk_baseline = 0.0: Any non-zero proposed risk triggers the gate (0 to anything is an infinite multiplier, but epsilon_R = 0.01 floors the denominator and prevents division by zero)
  • risk_baseline = risk_proposed: Delta is 0; the gate always passes

2.2 risk_proposed

What it represents: The predicted risk level after the proposed change is applied.

  • Range: 0.0 - 1.0
  • Default: 0.0
  • Alias: risk_score (MCP/CLI shorthand)

Gate math: The risk gate normalizes the delta as (risk_proposed - risk_baseline) / max(|risk_baseline|, epsilon_R) and computes a Bayesian posterior probability that this delta exceeds risk_trigger_factor (default: 2.0). If P(delta >= 2.0 | data) > 0.95, the gate fails.

Common mistakes:

  • Setting risk_proposed without setting risk_baseline — the gate compares the two, so both matter
  • Using values > 1.0 — the gate accepts them, but they indicate the normalization is wrong
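The normalization step, epsilon floor included, can be sketched in a few lines. This is an illustrative simplification, not the SDK's code; the full Bayesian posterior evaluation lives in src/engine/gates.py.

```python
# Sketch of the risk gate's delta normalization (illustrative, not the SDK code).
EPSILON_R = 0.01           # config: epsilon_R, floors the denominator
RISK_TRIGGER_FACTOR = 2.0  # config: risk_trigger_factor

def normalized_risk_delta(risk_baseline: float, risk_proposed: float) -> float:
    """(risk_proposed - risk_baseline) / max(|risk_baseline|, epsilon_R)."""
    return (risk_proposed - risk_baseline) / max(abs(risk_baseline), EPSILON_R)

# Zero baseline: epsilon_R prevents division by zero but still yields a large delta.
delta = normalized_risk_delta(0.0, 0.05)   # (0.05 - 0.0) / 0.01 = 5.0
print(delta > RISK_TRIGGER_FACTOR)         # True: well above the trigger factor
print(normalized_risk_delta(0.4, 0.4))     # 0.0: identical values always pass
```

The same arithmetic explains both boundary cases above: a zero baseline produces a large but finite multiplier, and equal baseline/proposed values produce a delta of exactly zero.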


3. Profit Parameters

The profit gate mirrors the risk gate (Bayesian posterior) but measures performance improvement.

3.1 profit_baseline

What it represents: Current performance/profit metric before the change.

  • Range: Any real number (not normalized to 0-1)
  • Default: 0.0
  • Units: Domain-specific

| Domain | Metric | Typical Values |
|---|---|---|
| Trading | Sharpe ratio | 0.5 - 3.0 |
| CI/CD | Deployment throughput (deploys/day) | 1 - 50 |
| Content moderation | Precision | 0.85 - 0.99 |
| Autonomous agents | Task success rate | 0.7 - 0.99 |

3.2 profit_proposed

What it represents: Expected performance/profit after the change.

  • Range: Any real number
  • Default: 0.0

Gate math: Mirrors the risk gate — (profit_proposed - profit_baseline) / max(|profit_baseline|, epsilon_P). The gate triggers when performance drops by a factor of profit_trigger_factor (default: 2.0) with 95% confidence.

Note: The profit gate detects decreases in performance, not increases. A positive delta (improvement) always passes.
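The mirrored normalization can be sketched the same way as the risk gate's. Again this is a simplified illustration; the actual Bayesian posterior check is in src/engine/gates.py.

```python
EPSILON_P = 0.01  # config: epsilon_P, floors the denominator

def normalized_profit_delta(profit_baseline: float, profit_proposed: float) -> float:
    """Mirror of the risk normalization, applied to the performance metric."""
    return (profit_proposed - profit_baseline) / max(abs(profit_baseline), EPSILON_P)

# An improvement gives a positive delta, which always passes:
print(normalized_profit_delta(1.0, 1.5) > 0)   # True
# A performance drop gives a negative delta, which the Bayesian check
# then tests against profit_trigger_factor:
print(normalized_profit_delta(1.0, 0.5))       # -0.5
```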


4. Novelty Score

What it represents: How unprecedented or novel the proposal is. Higher novelty means more scrutiny.

  • Range: 0.0 (routine/well-understood) to 1.0 (completely novel)
  • Default: 0.5

Gate math: Logistic function G(N) = 1 / (1 + exp(-k * (N - N0))) where N0=0.7 (inflection point) and k=10 (steepness). The gate passes when G(N) >= 0.8 (threshold).

Key transition points:

| Novelty | G(N) | Outcome |
|---|---|---|
| 0.0 | ~0.001 | Fail (not novel enough) |
| 0.5 | ~0.12 | Fail (insufficient novelty) |
| 0.65 | ~0.38 | Fail (moderate, still below threshold) |
| 0.7 | 0.50 | Fail (at inflection, still below 0.8) |
| 0.8 | ~0.73 | Fail (approaching threshold) |
| 0.84 | ~0.80 | Threshold (barely passes) |
| 0.85 | ~0.82 | Pass (sufficient novelty) |
| 1.0 | ~0.95 | Pass (highly novel) |
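The transition points above can be reproduced directly from the logistic formula. A minimal sketch, with constants mirroring the novelty_N0, novelty_k, and novelty_threshold config values:

```python
import math

N0 = 0.7         # config: novelty_N0 (inflection point)
K = 10.0         # config: novelty_k (steepness)
THRESHOLD = 0.8  # config: novelty_threshold

def novelty_gate_value(n: float) -> float:
    """G(N) = 1 / (1 + exp(-k * (N - N0)))."""
    return 1.0 / (1.0 + math.exp(-K * (n - N0)))

for n in (0.5, 0.7, 0.84, 1.0):
    g = novelty_gate_value(n)
    verdict = "pass" if g >= THRESHOLD else "fail"
    print(f"N={n:.2f}  G(N)={g:.3f}  {verdict}")
```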

Important: The novelty gate passes for highly novel proposals (G(N) >= 0.8, meaning N >= ~0.84). Proposals with low novelty fail this gate. This reflects the spec's design that proposals should demonstrate sufficient novelty to warrant consideration. If your domain treats high novelty as a risk factor, consider using 1.0 - your_novelty_metric as the novelty_score input.

How to derive it:

| Domain | Method |
|---|---|
| Trading | 1.0 - cosine_similarity(new_strategy, nearest_historical_strategy) |
| CI/CD | By change type: config change = 0.2, dependency update = 0.5, new service = 0.8, architecture change = 0.95 |
| Content moderation | 0.3 if policy exists, 0.7 if new policy area, 0.95 if unprecedented category |
| Autonomous agents | 1.0 - (count of similar past actions / total historical actions) |

Common mistakes:

  • Setting novelty_score = 1.0 for every new feature. Reserve 0.9+ for truly unprecedented changes
  • Confusing novelty with risk. A novel change can be low-risk (new but safe pattern)


5. Complexity Score

What it represents: Normalized complexity where HIGHER = SIMPLER. This is counterintuitive by design — it measures "simplicity headroom" above the complexity floor.

  • Range: 0.0 (maximally complex) to 1.0 (trivial)
  • Default: 0.5

Gate math: Hard floor at complexity_floor = 0.5. If complexity_score < 0.5, the proposal receives a HALT status that cannot be overridden.

Why HIGHER = SIMPLER?: The gate evaluates complexity_score >= floor. This means the score represents "how far above the complexity limit" the proposal is — essentially "simplicity margin."

How to derive it:

```
complexity_score = 1.0 - (raw_complexity / max_complexity)
```

| Domain | Raw Complexity Source | Normalization |
|---|---|---|
| Trading | Number of instruments * number of markets | 1.0 - (count / max_allowed) |
| CI/CD | Lines changed * number of services touched | 1.0 - (diff_size / max_diff_policy) |
| Content moderation | Number of policy rules involved | 1.0 - (rules / total_rules) |
| Autonomous agents | Action step count * dependency count | 1.0 - (steps / max_plan_length) |
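The derivation can be written as a small helper. Both the function and the CI/CD numbers below are illustrative, not part of the SDK:

```python
COMPLEXITY_FLOOR = 0.5  # config: complexity_floor (HALT below this, non-overridable)

def complexity_score(raw_complexity: float, max_complexity: float) -> float:
    """Invert raw complexity into a 'simplicity margin' in [0.0, 1.0]."""
    return max(0.0, 1.0 - raw_complexity / max_complexity)

# CI/CD example: a 300-line diff under a 1000-line policy cap.
score = complexity_score(300, 1000)
print(score, score >= COMPLEXITY_FLOOR)   # 0.7 True: passes the floor
```

Note the clamp to 0.0: a proposal that exceeds the policy cap should bottom out at maximal complexity rather than go negative.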

Boundary behavior:

  • complexity_score = 0.49: HALT, non-overridable (below floor)
  • complexity_score = 0.50: Pass (at floor)
  • complexity_score = 1.0: Pass (trivially simple)

Common mistakes:

  • Passing raw complexity (e.g., LOC count) without normalizing to 0-1
  • Forgetting that this is an inverted scale


6. Quality Score

What it represents: Overall quality assessment of the proposal.

  • Range: 0.0 - 1.0
  • Default: 0.7
  • Threshold: Must be >= 0.7 (quality_min_score)

quality_subscores: Optional array of per-dimension quality scores. If provided:

  • Each must be 0.0-1.0
  • No zero rule: If quality_no_zero_subscore is true (default), any zero subscore fails the gate

How to derive it:

| Domain | Subscores Example |
|---|---|
| Trading | [backtest_quality, data_quality, code_review_score] |
| CI/CD | [test_coverage, lint_score, review_approval] |
| Content moderation | [reviewer_confidence, policy_coverage, precedent_match] |
| Autonomous agents | [plan_coherence, safety_check, resource_efficiency] |

Boundary behavior:

  • quality_score = 0.69: Fail (below threshold)
  • quality_score = 0.70: Pass
  • quality_subscores = [0.9, 0.0, 0.8]: Fail (zero subscore)
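The threshold and the no-zero rule combine as follows. A sketch of the documented behavior, not the SDK's gate code:

```python
QUALITY_MIN_SCORE = 0.7   # config: quality_min_score
NO_ZERO_SUBSCORE = True   # config: quality_no_zero_subscore

def quality_gate_passes(score, subscores=None):
    """Apply the minimum-score threshold, then the no-zero-subscore rule."""
    if score < QUALITY_MIN_SCORE:
        return False
    if subscores and NO_ZERO_SUBSCORE and any(s == 0.0 for s in subscores):
        return False
    return True

print(quality_gate_passes(0.70))                    # True: at threshold
print(quality_gate_passes(0.69))                    # False: below threshold
print(quality_gate_passes(0.85, [0.9, 0.0, 0.8]))   # False: zero subscore
```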


7. Utility Parameters

The utility gate uses the Lower Confidence Bound (LCB) of a utility function. It is optional — if utility_result is not provided, the gate auto-passes.

UtilityResult dataclass fields:

| Field | Type | Description |
|---|---|---|
| mean | float | Expected utility value |
| variance | float | Uncertainty in the utility estimate (must be >= 0) |
| components | UtilityComponents | Breakdown of utility sources |
| decision_type | str | "INVESTMENT", "MAINTENANCE", "RESEARCH", or "OPERATIONAL" |
| metadata | dict | Additional context |

Gate math: LCB = mean - z_{1-alpha/2} * sqrt(variance) where alpha=0.05 (95% confidence). Gate passes if LCB > utility_threshold (default: 0.0).

When to provide it: Only when you have a quantitative utility model. Most simple proposals should omit it (auto-pass). Provide it for investment decisions, capacity planning, or resource allocation where you can estimate expected value and uncertainty.

Common mistake: Providing a UtilityResult with high variance and low mean. The LCB will be very negative, causing a gate failure even with positive expected utility.
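The LCB arithmetic is easy to check by hand: for alpha = 0.05, z_{1-alpha/2} ≈ 1.96. A sketch of the formula (not the SDK implementation) that also demonstrates the high-variance pitfall:

```python
import math
from statistics import NormalDist

LCB_ALPHA = 0.05         # config: lcb_alpha
UTILITY_THRESHOLD = 0.0  # config: utility_threshold

def lower_confidence_bound(mean: float, variance: float) -> float:
    """LCB = mean - z_{1 - alpha/2} * sqrt(variance)."""
    z = NormalDist().inv_cdf(1 - LCB_ALPHA / 2)   # ~1.96 for alpha = 0.05
    return mean - z * math.sqrt(variance)

# Positive expected utility but high variance still fails the gate:
lcb = lower_confidence_bound(mean=0.5, variance=1.0)
print(lcb > UTILITY_THRESHOLD)   # False: LCB ~= -1.46
```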


8. Execution Context Flags

| Flag | Default | Effect |
|---|---|---|
| requires_human_approval | false | If true, decision always includes "Obtain human approval" in next_steps |
| time_sensitive | false | Noted in rationale; does not change gate thresholds |
| reversible | true | If false, next_steps include rollback planning |
| shadow_mode | false | Gates evaluate but results are advisory only; no enforcement |

shadow_mode: Use during calibration periods (30+ days recommended). Shadow mode still records telemetry and drift observations but does not enforce HALT/PAUSE decisions. The decision result includes a shadow_result object with observation values and baseline hash.


9. Drift Detection

KL divergence drift detection compares current observations against a historical baseline distribution.

drift_baseline_data

  • Type: Array of numbers (float)
  • Minimum: 30+ values recommended (fewer produces unreliable KL divergence)
  • Default: None (drift detection disabled)

Gate behavior:

| KL Divergence | Status | Action |
|---|---|---|
| < 0.3 | NORMAL | No constraint |
| 0.3 - 0.5 | WARNING | Added as constraint in decision |
| > 0.5 | CRITICAL | Decision status forced to HALT |

How to derive baseline data: Collect 30+ historical metric observations from production. These should represent the "normal" operating distribution of whatever metric you're monitoring (e.g., error rates, latency percentiles, throughput values).
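One way to estimate KL divergence of current observations against a baseline is a shared, smoothed histogram. This is an illustrative estimator only; the SDK's own implementation may bin, smooth, or window differently.

```python
import math

TAU_WARNING, TAU_CRITICAL = 0.3, 0.5  # config: kl_drift.tau_warning / tau_critical

def kl_divergence(baseline, current, bins=10):
    """Discrete KL(current || baseline) over a shared histogram (smoothed)."""
    lo, hi = min(baseline + current), max(baseline + current)
    width = (hi - lo) / bins or 1.0   # degenerate case: all values identical

    def hist(values):
        counts = [1e-9] * bins        # tiny smoothing so no bin is empty
        for v in values:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        total = sum(counts)
        return [c / total for c in counts]

    p, q = hist(current), hist(baseline)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def drift_status(kl):
    if kl > TAU_CRITICAL:
        return "CRITICAL"
    return "WARNING" if kl >= TAU_WARNING else "NORMAL"
```

With identical distributions the estimate is ~0 (NORMAL); a distribution that has shifted well outside the baseline range produces a large divergence (CRITICAL).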


10. Configuration Parameters

These are set via AegisConfig (YAML, dict, or defaults) and control gate behavior.

Gate Thresholds

| Parameter | Default | Gate | Description |
|---|---|---|---|
| epsilon_R | 0.01 | Risk | Floor for risk normalization denominator (prevents div-by-zero) |
| epsilon_P | 0.01 | Profit | Floor for profit normalization denominator |
| risk_trigger_factor | 2.0 | Risk | Bayesian trigger: risk must double to fail gate |
| profit_trigger_factor | 2.0 | Profit | Bayesian trigger: profit must halve to fail gate |
| trigger_confidence_prob | 0.95 | Risk/Profit | Bayesian confidence threshold (95%) |
| novelty_N0 | 0.7 | Novelty | Logistic function inflection point |
| novelty_k | 10.0 | Novelty | Logistic function steepness |
| novelty_threshold | 0.8 | Novelty | Pass/fail threshold for G(N) |
| complexity_floor | 0.5 | Complexity | Hard floor (non-overridable) |
| quality_min_score | 0.7 | Quality | Minimum quality score |
| quality_no_zero_subscore | true | Quality | Reject zero subscores |
| utility_threshold | 0.0 | Utility | Minimum LCB value |
| lcb_alpha | 0.05 | Utility | LCB confidence level (95%) |

Utility Function Parameters

| Parameter | Default | Description |
|---|---|---|
| gamma | 0.3 | Discount factor for uncertain value |
| kappa | 1.0 | Risk coefficient (when risk delta < 0) |
| phi_S | 100.0 | Static complexity cost ($/month/kLOC) |
| phi_D | 2000.0 | Dynamic complexity cost ($/month/service) |
| migration_budget | 5000.0 | Refactoring budget allowance ($) |

Drift Detection Parameters

| Parameter | Default | Description |
|---|---|---|
| kl_drift.tau_warning | 0.3 | KL divergence warning threshold |
| kl_drift.tau_critical | 0.5 | KL divergence critical threshold |
| kl_drift.window_days | 30 | Rolling window for baseline |

Operational Parameters

| Parameter | Default | Description |
|---|---|---|
| telemetry_url | None | HTTPS URL for telemetry event POST |
| mcp_rate_limit | 60 | MCP requests per minute (0 = disabled) |

11. Decision Status Reference

| Status | Meaning | When | Override? |
|---|---|---|---|
| PROCEED | Safe to proceed | All gates pass | N/A |
| PAUSE | Pause for review | 1+ gates fail, all overridable | Yes (two-key BIP-322) |
| ESCALATE | Escalate to authority | high/critical impact with gate failures | Yes (two-key BIP-322) |
| HALT | Stop immediately | Complexity below floor OR critical drift | No (complexity); Yes (drift) |

Override eligibility: A decision is override-eligible when failing gates are all overridable. The complexity floor gate is never overridable. Drift CRITICAL is overridable but requires explicit re-baseline.


Appendix A: Parameter Derivation Flowchart

1. Identify your domain metrics
   └─ What do you measure? (error rate, VaR, throughput, etc.)

2. Map to AEGIS parameters
   └─ risk_baseline/proposed → your risk metric (normalized 0-1)
   └─ profit_baseline/proposed → your performance metric
   └─ novelty_score → similarity to past actions (inverted)
   └─ complexity_score → 1.0 - (raw_complexity / max_complexity)
   └─ quality_score → composite quality signal

3. Set estimated_impact
   └─ How much could go wrong? → low/medium/high/critical

4. Collect drift baseline (optional)
   └─ 30+ historical observations of your key metric

5. Start in shadow_mode for 30+ days
   └─ Calibrate thresholds before enforcing

Appendix B: Frozen Parameters (interface-contract.yaml)

All gate threshold values are frozen in schema/interface-contract.yaml. They cannot be changed at runtime (AegisConfig is a frozen dataclass). Changes require formal recalibration approval per AEGIS governance policy.

To inspect current values programmatically:

```python
from aegis_governance import AegisConfig

config = AegisConfig.default()
print(config.to_dict())
```

Or via MCP:

```json
{"method": "tools/call", "params": {"name": "aegis_check_thresholds", "arguments": {}}}
```
