
AEGIS Parameter Reference Guide

Version: 1.0.0 | Updated: 2026-02-12 | Status: Active

This guide provides complete parameter documentation for the AEGIS Governance Decision SDK. It covers every input parameter, configuration value, and decision output — with derivation guidance, domain examples, and boundary behavior.

Audience: Integration engineers, AI agent developers, platform teams, risk analysts.

Authoritative sources: schema/interface-contract.yaml (frozen values), src/config.py (AegisConfig), src/engine/gates.py (gate logic), src/integration/pcw_decide.py (decision flow).

Interactive guidance: For domain-specific derivation formulas and worked examples via MCP, call aegis_get_scoring_guide(domain) — available for trading, cicd, moderation, agents, and generic domains.


Quick Reference Table

| Parameter | Type | Range | Default | Gate | Section |
|---|---|---|---|---|---|
| proposal_summary | string | non-empty | | | 1.1 |
| estimated_impact | string | low/medium/high/critical | medium | | 1.2 |
| agent_id | string | any | "aegis-cli" | | 1.3 |
| session_id | string | any | auto-generated | | 1.3 |
| risk_baseline | float | 0.0-1.0 | 0.0 | Risk | 2.1 |
| risk_proposed | float | 0.0-1.0 | 0.0 | Risk | 2.2 |
| profit_baseline | float | any | 0.0 | Profit | 3.1 |
| profit_proposed | float | any | 0.0 | Profit | 3.2 |
| novelty_score | float | 0.0-1.0 | 0.5 | Novelty | 4 |
| complexity_score | float | 0.0-1.0 | 0.5 | Complexity | 5 |
| quality_score | float | 0.0-1.0 | 0.7 | Quality | 6 |
| quality_subscores | float[] | 0.0-1.0 each | [0.7, 0.7, 0.7] | Quality | 6 |
| utility_result | UtilityResult | | None | Utility | 7 |
| requires_human_approval | bool | | false | | 8 |
| time_sensitive | bool | | false | | 8 |
| reversible | bool | | true | | 8 |
| shadow_mode | bool | | false | | 8 |
| drift_baseline_data | float[] | 30+ values | None | Drift | 9 |

1. Proposal Context Parameters

1.1 proposal_summary

What it represents: A human-readable description of the proposed action. Stored in the audit trail and telemetry events.

  • Type: string (non-empty)
  • Used by: Telemetry, audit trail, decision rationale
  • Tip: Keep it specific enough for an auditor to understand the proposal months later.

1.2 estimated_impact

What it represents: The blast radius of the proposal — how much of the system could be affected if something goes wrong.

| Value | Semantics | Gate Behavior |
|---|---|---|
| low | Isolated change, easy rollback | Standard gate evaluation |
| medium | Moderate scope, reversible | Standard gate evaluation |
| high | Wide impact, partial reversibility | Escalate on any gate failure |
| critical | System-wide, hard to reverse | Always escalate; requires human approval |

1.3 agent_id

What it represents: Identifier of the AI agent or user making the proposal. Recorded in telemetry and audit trail for accountability.

  • Default: "aegis-cli" (CLI), "mcp-agent" (MCP)
  • Best practice: Use a stable identifier (e.g., "claude-code-v4", "codex-pr-reviewer", "deploy-bot-prod")

session_id is auto-generated (UUID) if not provided. Use it to correlate multiple decisions within a single agent session.


2. Risk Parameters

The risk gate uses Bayesian posterior evaluation: P(delta >= trigger_factor | data) > confidence_threshold.

2.1 risk_baseline

What it represents: The current risk level before the proposed change.

  • Range: 0.0 - 1.0
  • Default: 0.0 (no existing risk)
  • Units: Dimensionless probability / normalized ratio

How to derive it:

| Domain | Metric | Derivation |
|---|---|---|
| Trading | Value-at-Risk / limit | Current VaR divided by position limit |
| CI/CD | Error rate | Current production error rate (e.g., 0.02 = 2%) |
| Content moderation | False positive rate | Current FPR from quality dashboard |
| Autonomous agents | Failure rate | Recent task failure count / total tasks |

Boundary behavior:

  • risk_baseline = 0.0: Any non-zero proposed risk triggers the gate (0 to anything is an infinite multiplier, but epsilon_R = 0.01 floors the denominator and prevents division by zero)
  • risk_baseline = risk_proposed: Delta is 0; the gate always passes

2.2 risk_proposed

What it represents: The predicted risk level after the proposed change is applied.

  • Range: 0.0 - 1.0
  • Default: 0.0
  • Alias: risk_score (MCP/CLI shorthand)

Gate math: The risk gate normalizes the delta as (risk_proposed - risk_baseline) / max(|risk_baseline|, epsilon_R) and computes a Bayesian posterior probability that this delta exceeds risk_trigger_factor (default: 2.0). If P(delta >= 2.0 | data) > 0.95, the gate fails.

Common mistakes:

  • Setting risk_proposed without setting risk_baseline — the gate compares the two, so both matter
  • Using values > 1.0 — the gate accepts them, but they indicate the normalization is wrong
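The normalization step, epsilon floor included, can be sketched in a few lines. This is an illustrative simplification, not the SDK's code; the full Bayesian posterior evaluation lives in src/engine/gates.py.

```python
# Sketch of the risk gate's delta normalization (illustrative, not the SDK code).
EPSILON_R = 0.01           # config: epsilon_R, floors the denominator
RISK_TRIGGER_FACTOR = 2.0  # config: risk_trigger_factor

def normalized_risk_delta(risk_baseline: float, risk_proposed: float) -> float:
    """(risk_proposed - risk_baseline) / max(|risk_baseline|, epsilon_R)."""
    return (risk_proposed - risk_baseline) / max(abs(risk_baseline), EPSILON_R)

# Zero baseline: epsilon_R prevents division by zero but still yields a large delta.
delta = normalized_risk_delta(0.0, 0.05)   # (0.05 - 0.0) / 0.01 = 5.0
print(delta > RISK_TRIGGER_FACTOR)         # True: well above the trigger factor
print(normalized_risk_delta(0.4, 0.4))     # 0.0: identical values always pass
```

The same arithmetic explains both boundary cases above: a zero baseline produces a large but finite multiplier, and equal baseline/proposed values produce a delta of exactly zero.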


3. Profit Parameters

The profit gate mirrors the risk gate (Bayesian posterior) but measures performance improvement.

3.1 profit_baseline

What it represents: Current performance/profit metric before the change.

  • Range: Any real number (not normalized to 0-1)
  • Default: 0.0
  • Units: Domain-specific

| Domain | Metric | Typical Values |
|---|---|---|
| Trading | Sharpe ratio | 0.5 - 3.0 |
| CI/CD | Deployment throughput (deploys/day) | 1 - 50 |
| Content moderation | Precision | 0.85 - 0.99 |
| Autonomous agents | Task success rate | 0.7 - 0.99 |

3.2 profit_proposed

What it represents: Expected performance/profit after the change.

  • Range: Any real number
  • Default: 0.0

Gate math: Mirrors the risk gate — (profit_proposed - profit_baseline) / max(|profit_baseline|, epsilon_P). The gate triggers when performance drops by a factor of profit_trigger_factor (default: 2.0) with 95% confidence.

Note: The profit gate detects decreases in performance, not increases. A positive delta (improvement) always passes.
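The mirrored normalization can be sketched the same way as the risk gate's. Again this is a simplified illustration; the actual Bayesian posterior check is in src/engine/gates.py.

```python
EPSILON_P = 0.01  # config: epsilon_P, floors the denominator

def normalized_profit_delta(profit_baseline: float, profit_proposed: float) -> float:
    """Mirror of the risk normalization, applied to the performance metric."""
    return (profit_proposed - profit_baseline) / max(abs(profit_baseline), EPSILON_P)

# An improvement gives a positive delta, which always passes:
print(normalized_profit_delta(1.0, 1.5) > 0)   # True
# A performance drop gives a negative delta, which the Bayesian check
# then tests against profit_trigger_factor:
print(normalized_profit_delta(1.0, 0.5))       # -0.5
```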


4. Novelty Score

What it represents: How unprecedented or novel the proposal is. Higher novelty means more scrutiny.

  • Range: 0.0 (routine/well-understood) to 1.0 (completely novel)
  • Default: 0.5

Gate math: Logistic function G(N) = 1 / (1 + exp(-k * (N - N0))) where N0=0.7 (inflection point) and k=10 (steepness). The gate passes when G(N) >= 0.8 (threshold).

Key transition points:

| Novelty | G(N) | Outcome |
|---|---|---|
| 0.0 | ~0.001 | Fail (not novel enough) |
| 0.5 | ~0.12 | Fail (insufficient novelty) |
| 0.65 | ~0.38 | Fail (moderate, still below threshold) |
| 0.7 | 0.50 | Fail (at inflection, still below 0.8) |
| 0.8 | ~0.73 | Fail (approaching threshold) |
| 0.84 | ~0.80 | Threshold (barely passes) |
| 0.85 | ~0.82 | Pass (sufficient novelty) |
| 1.0 | ~0.95 | Pass (highly novel) |
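The transition points above can be reproduced directly from the logistic formula. A minimal sketch, with constants mirroring the novelty_N0, novelty_k, and novelty_threshold config values:

```python
import math

N0 = 0.7         # config: novelty_N0 (inflection point)
K = 10.0         # config: novelty_k (steepness)
THRESHOLD = 0.8  # config: novelty_threshold

def novelty_gate_value(n: float) -> float:
    """G(N) = 1 / (1 + exp(-k * (N - N0)))."""
    return 1.0 / (1.0 + math.exp(-K * (n - N0)))

for n in (0.5, 0.7, 0.84, 1.0):
    g = novelty_gate_value(n)
    verdict = "pass" if g >= THRESHOLD else "fail"
    print(f"N={n:.2f}  G(N)={g:.3f}  {verdict}")
```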

Important: The novelty gate passes for highly novel proposals (G(N) >= 0.8, meaning N >= ~0.84). Proposals with low novelty fail this gate. This reflects the spec's design that proposals should demonstrate sufficient novelty to warrant consideration. If your domain treats high novelty as a risk factor, consider using 1.0 - your_novelty_metric as the novelty_score input.

How to derive it:

| Domain | Method |
|---|---|
| Trading | 1.0 - cosine_similarity(new_strategy, nearest_historical_strategy) |
| CI/CD | By change type: config change = 0.2, dependency update = 0.5, new service = 0.8, architecture change = 0.95 |
| Content moderation | 0.3 if policy exists, 0.7 if new policy area, 0.95 if unprecedented category |
| Autonomous agents | 1.0 - (count of similar past actions / total historical actions) |

Common mistakes:

  • Setting novelty_score = 1.0 for every new feature. Reserve 0.9+ for truly unprecedented changes
  • Confusing novelty with risk. A novel change can be low-risk (new but safe pattern)


5. Complexity Score

What it represents: Normalized complexity where HIGHER = SIMPLER. This is counterintuitive by design — it measures "simplicity headroom" above the complexity floor.

  • Range: 0.0 (maximally complex) to 1.0 (trivial)
  • Default: 0.5

Gate math: Hard floor at complexity_floor = 0.5. If complexity_score < 0.5, the proposal receives a HALT status that cannot be overridden.

Why HIGHER = SIMPLER?: The gate evaluates complexity_score >= floor. This means the score represents "how far above the complexity limit" the proposal is — essentially "simplicity margin."

How to derive it:

```
complexity_score = 1.0 - (raw_complexity / max_complexity)
```

| Domain | Raw Complexity Source | Normalization |
|---|---|---|
| Trading | Number of instruments * number of markets | 1.0 - (count / max_allowed) |
| CI/CD | Lines changed * number of services touched | 1.0 - (diff_size / max_diff_policy) |
| Content moderation | Number of policy rules involved | 1.0 - (rules / total_rules) |
| Autonomous agents | Action step count * dependency count | 1.0 - (steps / max_plan_length) |
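The derivation can be written as a small helper. Both the function and the CI/CD numbers below are illustrative, not part of the SDK:

```python
COMPLEXITY_FLOOR = 0.5  # config: complexity_floor (HALT below this, non-overridable)

def complexity_score(raw_complexity: float, max_complexity: float) -> float:
    """Invert raw complexity into a 'simplicity margin' in [0.0, 1.0]."""
    return max(0.0, 1.0 - raw_complexity / max_complexity)

# CI/CD example: a 300-line diff under a 1000-line policy cap.
score = complexity_score(300, 1000)
print(score, score >= COMPLEXITY_FLOOR)   # 0.7 True: passes the floor
```

Note the clamp to 0.0: a proposal that exceeds the policy cap should bottom out at maximal complexity rather than go negative.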

Boundary behavior:

  • complexity_score = 0.49: HALT, non-overridable (below floor)
  • complexity_score = 0.50: Pass (at floor)
  • complexity_score = 1.0: Pass (trivially simple)

Common mistakes:

  • Passing raw complexity (e.g., LOC count) without normalizing to 0-1
  • Forgetting that this is an inverted scale


6. Quality Score

What it represents: Overall quality assessment of the proposal.

  • Range: 0.0 - 1.0
  • Default: 0.7
  • Threshold: Must be >= 0.7 (quality_min_score)

quality_subscores: Optional array of per-dimension quality scores. If provided:

  • Each must be 0.0-1.0
  • No zero rule: If quality_no_zero_subscore is true (default), any zero subscore fails the gate

How to derive it:

| Domain | Subscores Example |
|---|---|
| Trading | [backtest_quality, data_quality, code_review_score] |
| CI/CD | [test_coverage, lint_score, review_approval] |
| Content moderation | [reviewer_confidence, policy_coverage, precedent_match] |
| Autonomous agents | [plan_coherence, safety_check, resource_efficiency] |

Boundary behavior:

  • quality_score = 0.69: Fail (below threshold)
  • quality_score = 0.70: Pass
  • quality_subscores = [0.9, 0.0, 0.8]: Fail (zero subscore)
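The threshold and the no-zero rule combine as follows. A sketch of the documented behavior, not the SDK's gate code:

```python
QUALITY_MIN_SCORE = 0.7   # config: quality_min_score
NO_ZERO_SUBSCORE = True   # config: quality_no_zero_subscore

def quality_gate_passes(score, subscores=None):
    """Apply the minimum-score threshold, then the no-zero-subscore rule."""
    if score < QUALITY_MIN_SCORE:
        return False
    if subscores and NO_ZERO_SUBSCORE and any(s == 0.0 for s in subscores):
        return False
    return True

print(quality_gate_passes(0.70))                    # True: at threshold
print(quality_gate_passes(0.69))                    # False: below threshold
print(quality_gate_passes(0.85, [0.9, 0.0, 0.8]))   # False: zero subscore
```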


7. Utility Parameters

The utility gate uses the Lower Confidence Bound (LCB) of a utility function. It is optional — if utility_result is not provided, the gate auto-passes.

UtilityResult dataclass fields:

| Field | Type | Description |
|---|---|---|
| mean | float | Expected utility value |
| variance | float | Uncertainty in the utility estimate (must be >= 0) |
| components | UtilityComponents | Breakdown of utility sources |
| decision_type | str | "INVESTMENT", "MAINTENANCE", "RESEARCH", or "OPERATIONAL" |
| metadata | dict | Additional context |

Gate math: LCB = mean - z_{1-alpha/2} * sqrt(variance) where alpha=0.05 (95% confidence). Gate passes if LCB > utility_threshold (default: 0.0).

When to provide it: Only when you have a quantitative utility model. Most simple proposals should omit it (auto-pass). Provide it for investment decisions, capacity planning, or resource allocation where you can estimate expected value and uncertainty.

Common mistake: Providing a UtilityResult with high variance and low mean. The LCB will be very negative, causing a gate failure even with positive expected utility.
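The LCB arithmetic is easy to check by hand: for alpha = 0.05, z_{1-alpha/2} ≈ 1.96. A sketch of the formula (not the SDK implementation) that also demonstrates the high-variance pitfall:

```python
import math
from statistics import NormalDist

LCB_ALPHA = 0.05         # config: lcb_alpha
UTILITY_THRESHOLD = 0.0  # config: utility_threshold

def lower_confidence_bound(mean: float, variance: float) -> float:
    """LCB = mean - z_{1 - alpha/2} * sqrt(variance)."""
    z = NormalDist().inv_cdf(1 - LCB_ALPHA / 2)   # ~1.96 for alpha = 0.05
    return mean - z * math.sqrt(variance)

# Positive expected utility but high variance still fails the gate:
lcb = lower_confidence_bound(mean=0.5, variance=1.0)
print(lcb > UTILITY_THRESHOLD)   # False: LCB ~= -1.46
```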


8. Execution Context Flags

| Flag | Default | Effect |
|---|---|---|
| requires_human_approval | false | If true, decision always includes "Obtain human approval" in next_steps |
| time_sensitive | false | Noted in rationale; does not change gate thresholds |
| reversible | true | If false, next_steps include rollback planning |
| shadow_mode | false | Gates evaluate but results are advisory only; no enforcement |

shadow_mode: Use during calibration periods (30+ days recommended). Shadow mode still records telemetry and drift observations but does not enforce HALT/PAUSE decisions. The decision result includes a shadow_result object with observation values and baseline hash.


9. Drift Detection

KL divergence drift detection compares current observations against a historical baseline distribution.

drift_baseline_data

  • Type: Array of numbers (float)
  • Minimum: 30+ values recommended (fewer produces unreliable KL divergence)
  • Default: None (drift detection disabled)

Gate behavior:

| KL Divergence | Status | Action |
|---|---|---|
| < 0.3 | NORMAL | No constraint |
| 0.3 - 0.5 | WARNING | Added as constraint in decision |
| > 0.5 | CRITICAL | Decision status forced to HALT |

How to derive baseline data: Collect 30+ historical metric observations from production. These should represent the "normal" operating distribution of whatever metric you're monitoring (e.g., error rates, latency percentiles, throughput values).
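One way to estimate KL divergence of current observations against a baseline is a shared, smoothed histogram. This is an illustrative estimator only; the SDK's own implementation may bin, smooth, or window differently.

```python
import math

TAU_WARNING, TAU_CRITICAL = 0.3, 0.5  # config: kl_drift.tau_warning / tau_critical

def kl_divergence(baseline, current, bins=10):
    """Discrete KL(current || baseline) over a shared histogram (smoothed)."""
    lo, hi = min(baseline + current), max(baseline + current)
    width = (hi - lo) / bins or 1.0   # degenerate case: all values identical

    def hist(values):
        counts = [1e-9] * bins        # tiny smoothing so no bin is empty
        for v in values:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        total = sum(counts)
        return [c / total for c in counts]

    p, q = hist(current), hist(baseline)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def drift_status(kl):
    if kl > TAU_CRITICAL:
        return "CRITICAL"
    return "WARNING" if kl >= TAU_WARNING else "NORMAL"
```

With identical distributions the estimate is ~0 (NORMAL); a distribution that has shifted well outside the baseline range produces a large divergence (CRITICAL).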


10. Configuration Parameters

These are set via AegisConfig (YAML, dict, or defaults) and control gate behavior.

Gate Thresholds

| Parameter | Default | Gate | Description |
|---|---|---|---|
| epsilon_R | 0.01 | Risk | Floor for risk normalization denominator (prevents div-by-zero) |
| epsilon_P | 0.01 | Profit | Floor for profit normalization denominator |
| risk_trigger_factor | 2.0 | Risk | Bayesian trigger: risk must double to fail gate |
| profit_trigger_factor | 2.0 | Profit | Bayesian trigger: profit must halve to fail gate |
| trigger_confidence_prob | 0.95 | Risk/Profit | Bayesian confidence threshold (95%) |
| novelty_N0 | 0.7 | Novelty | Logistic function inflection point |
| novelty_k | 10.0 | Novelty | Logistic function steepness |
| novelty_threshold | 0.8 | Novelty | Pass/fail threshold for G(N) |
| complexity_floor | 0.5 | Complexity | Hard floor (non-overridable) |
| quality_min_score | 0.7 | Quality | Minimum quality score |
| quality_no_zero_subscore | true | Quality | Reject zero subscores |
| utility_threshold | 0.0 | Utility | Minimum LCB value |
| lcb_alpha | 0.05 | Utility | LCB confidence level (95%) |

Utility Function Parameters

| Parameter | Default | Description |
|---|---|---|
| gamma | 0.3 | Discount factor for uncertain value |
| kappa | 1.0 | Risk coefficient (when risk delta < 0) |
| phi_S | 100.0 | Static complexity cost ($/month/kLOC) |
| phi_D | 2000.0 | Dynamic complexity cost ($/month/service) |
| migration_budget | 5000.0 | Refactoring budget allowance ($) |

Drift Detection Parameters

| Parameter | Default | Description |
|---|---|---|
| kl_drift.tau_warning | 0.3 | KL divergence warning threshold |
| kl_drift.tau_critical | 0.5 | KL divergence critical threshold |
| kl_drift.window_days | 30 | Rolling window for baseline |

Operational Parameters

| Parameter | Default | Description |
|---|---|---|
| telemetry_url | None | HTTPS URL for telemetry event POST |
| mcp_rate_limit | 60 | MCP requests per minute (0 = disabled) |

11. Decision Status Reference

| Status | Meaning | When | Override? |
|---|---|---|---|
| PROCEED | Safe to proceed | All gates pass | N/A |
| PAUSE | Pause for review | 1+ gates fail, all overridable | Yes (two-key BIP-322) |
| ESCALATE | Escalate to authority | high/critical impact with gate failures | Yes (two-key BIP-322) |
| HALT | Stop immediately | Complexity below floor OR critical drift | No (complexity); Yes (drift) |

Override eligibility: A decision is override-eligible when failing gates are all overridable. The complexity floor gate is never overridable. Drift CRITICAL is overridable but requires explicit re-baseline.


Appendix A: Parameter Derivation Flowchart

1. Identify your domain metrics
   └─ What do you measure? (error rate, VaR, throughput, etc.)

2. Map to AEGIS parameters
   └─ risk_baseline/proposed → your risk metric (normalized 0-1)
   └─ profit_baseline/proposed → your performance metric
   └─ novelty_score → similarity to past actions (inverted)
   └─ complexity_score → 1.0 - (raw_complexity / max_complexity)
   └─ quality_score → composite quality signal

3. Set estimated_impact
   └─ How much could go wrong? → low/medium/high/critical

4. Collect drift baseline (optional)
   └─ 30+ historical observations of your key metric

5. Start in shadow_mode for 30+ days
   └─ Calibrate thresholds before enforcing

Appendix B: Frozen Parameters (interface-contract.yaml)

All gate threshold values are frozen in schema/interface-contract.yaml. They cannot be changed at runtime (AegisConfig is a frozen dataclass). Changes require formal recalibration approval per AEGIS governance policy.

To inspect current values programmatically:

```python
from aegis_governance import AegisConfig

config = AegisConfig.default()
print(config.to_dict())
```

Or via MCP:

```json
{"method": "tools/call", "params": {"name": "aegis_check_thresholds", "arguments": {}}}
```
