AEGIS Roadmap

Version: 1.93.0
Updated: 2026-02-25
Status: Active
Cross-References: README.md, CLAUDE.md, gap-analysis.md

This document is the single source of truth for AEGIS future work, including active PRs, open issues, and release milestones.

Next Steps (Ordered Checklist)

Work through these in order. Check off each item as completed. Source: Discovery Analysis 2026-02-08.

Immediate (no blockers, can start now)

[x] 1. Fix dependency misclassification — scipy and prometheus_client moved from dev to dedicated engine and telemetry optional groups with try/except import guards and clear ImportError messages at point of use. Files: pyproject.toml, src/engine/utility.py, src/telemetry/prometheus_exporter.py, README.md, tests/test_optional_deps.py
[x] 2. Docs version sync — repository-structure.md:28 and comprehensive-todo-discovery.md synced to CLAUDE.md v4.2.3, metrics 1475 tests / 94.21% coverage. Commit: d84330a
[x] 3. Update dependency versions — safety>=2.3.0 → >=3.0.0; python-bitcoinlib already pinned at >=0.12.0 (no change needed). All quality gates pass. File: pyproject.toml, Commit: e2ab0a5
[x] 4. Document broad exception catches — Added # Intentional: <reason> comments to 15 except Exception sites across 8 files per CLAUDE.md §3. Commit: fc0b41c
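The import-guard pattern from item 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in src/engine/utility.py: the helper names `require_optional` and `normal_cdf` are hypothetical, and only the `engine` group name comes from the checklist above.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise a clear ImportError at point of use.

    Hypothetical helper: the name and message wording are assumptions.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this feature. "
            f"Install the '{extra}' optional dependency group."
        ) from exc


def normal_cdf(x: float) -> float:
    """Hypothetical engine function that needs scipy only when it is called."""
    stats = require_optional("scipy.stats", extra="engine")
    return float(stats.norm.cdf(x))
```

Because the import happens inside the function, importing the package itself never fails when scipy is absent; only the call that needs it does.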

Short-term (v1.1.0)

[x] 5. Boundary tests for all gates — 77 parametrized BVA tests verifying comparison operators at exact thresholds for all 6 gates + drift detector. File: tests/test_gate_boundaries.py
[x] 6. GOVERNANCE actor type — Override workflow orchestration: Governance actor class with override lifecycle (initiate/sign/approve/reject/expire), compliance checking (complexity gate non-overridable), emergency halt; ultrathink-hardened (halt guards, fail-closed compliance, thread safety). Files: src/actors/governance.py, src/actors/base.py, tests/test_governance_actor.py (41 tests)
[x] 7. CALIBRATOR actor type — Statistical threshold tuning: Calibrator actor class with drift recalibration (delegates to DriftMonitor.calibrate_thresholds()), Bayesian prior update (delegates to BayesianPosterior.update_prior()), gate parameter proposals with recognized parameter whitelist, approval-gated application workflow, telemetry emission; thread-safe with threading.Lock; ultrathink-hardened (U-1..U-5). Files: src/actors/calibrator.py, src/actors/base.py, tests/test_calibrator_actor.py (69 tests incl. 12 regression)
[x] 8. Extract shared serialization pattern — ensure_utc() extracted from 3 workflow files to src/workflows/serialization.py. Files: serialization.py, consensus.py, override.py, proposal.py
[x] 9. Extract shared parameter validation — 4 validators (validate_positive, validate_range, validate_normalized, validate_threshold_ordering) extracted to src/engine/validation.py, replacing ~24 inline checks across 5 engine modules. Files: validation.py, bayesian.py, drift.py, gates.py, utility.py, complexity.py. Deferred: consolidating inline timezone checks in persistence/telemetry (different module boundary).
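A rough sketch of what two of the shared validators from item 9 might look like. The function names come from the item above, but the signatures, the `_require_real` helper, and the error messages are assumptions. The bool and NaN guards reflect the recurring "bool-is-int" and NaN fixes listed in the changelog.

```python
import math


def _require_real(name: str, value: object) -> float:
    """Reject bools (bool is a subclass of int), non-numbers, NaN and infinities."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"{name} must be a real number, got {type(value).__name__}")
    try:
        number = float(value)
    except OverflowError:
        raise ValueError(f"{name} is too large to represent as a float") from None
    if not math.isfinite(number):
        raise ValueError(f"{name} must be finite, got {value!r}")
    return number


def validate_positive(name: str, value: object) -> float:
    """Return value as a float if it is strictly greater than zero."""
    number = _require_real(name, value)
    if number <= 0:
        raise ValueError(f"{name} must be > 0, got {number}")
    return number


def validate_range(name: str, value: object, low: float, high: float) -> float:
    """Return value as a float if low <= value <= high (both bounds inclusive)."""
    number = _require_real(name, value)
    if not low <= number <= high:
        raise ValueError(f"{name} must be in [{low}, {high}], got {number}")
    return number
```

Without the explicit bool check, `True` would pass an inclusive `[0, 1]` range test because `True == 1.0`.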

Medium-term (v1.2.0)

[x] 10. Production deployment guide — docs/deployment/production-guide.md with Docker, K8s, AWS examples, Docker Compose, HSM integration, multi-region DR, observability setup, production checklist. Files: docs/deployment/production-guide.md, Dockerfile, docker-compose.yaml, monitoring/prometheus/prometheus.yml
[x] 11. Migration guide — Parameter recalibration via Calibrator actor, workflow state migration, schema upgrade path, version compatibility matrix. File: docs/deployment/migration-guide.md
[x] 12. Performance SLAs — Latency targets (p50 < 100ms, p95 < 500ms, p99 < 1s), throughput baselines, component latency budget, recorded benchmark results. File: docs/deployment/performance-slas.md
[x] 13. Shadow mode for KL divergence calibration — shadow_mode=True on pcw_decide() with ShadowResult dataclass, drift monitor integration, SHADOW_EVALUATION telemetry, Prometheus mode label, CLI --shadow flag, MCP shadow_mode param. 44 new tests. Unblocks issue #1. Files: src/integration/pcw_decide.py, src/telemetry/emitter.py, src/telemetry/prometheus_exporter.py, src/cli.py, src/aegis_governance/mcp_server.py, src/aegis_governance/__init__.py, monitoring/prometheus/alerting-rules.yaml, monitoring/prometheus/recording-rules.yaml, tests/integration/test_shadow_mode.py
[x] 14. HTTP telemetry sink — HTTPEventSink (per-event POST), BatchHTTPSink (batching + retry), http_sink() factory; stdlib-only (urllib.request); AegisConfig.telemetry_url; CLI --telemetry-url; MCP telemetry_url param; SDK re-exports. 45 new tests. Files: src/telemetry/emitter.py, src/config.py, src/cli.py, src/aegis_governance/mcp_server.py, src/aegis_governance/__init__.py, src/telemetry/__init__.py, tests/telemetry/test_http_sink.py
[x] 15. Drift detection → policy connection — DriftMonitor wired into production pcw_decide() path: CRITICAL → HALT (non-overridable), WARNING → advisory constraint, NORMAL → no change; _evaluate_drift_policy() + _apply_drift_overrides() helpers; DRIFT_POLICY_ENFORCED telemetry; create_drift_monitor() factory; CLI --drift-baseline; MCP drift_baseline_data; DriftAction/DriftResult re-exports; null-value filtering; drift-specific next_steps. 39 new tests. Files: src/integration/pcw_decide.py, src/config.py, src/cli.py, src/aegis_governance/mcp_server.py, src/aegis_governance/__init__.py, src/engine/__init__.py, src/telemetry/emitter.py, tests/integration/test_drift_enforcement.py, tests/integration/test_drift_regression.py
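The drift-to-policy mapping in item 15 (CRITICAL → HALT, WARNING → advisory constraint, NORMAL → no change) can be sketched as below. The type and function names here (`DriftStatus`, `PolicyOutcome`, `apply_drift_policy`) and the constraint text are hypothetical; the real helpers are `_evaluate_drift_policy()` and `_apply_drift_overrides()`, whose signatures are not shown in this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class DriftStatus(Enum):
    NORMAL = "normal"
    WARNING = "warning"
    CRITICAL = "critical"


@dataclass(frozen=True)
class PolicyOutcome:
    decision: str                      # decision after drift policy is applied
    forced_halt: bool                  # True when drift forced a non-overridable HALT
    constraints: Tuple[str, ...] = ()  # advisory constraints attached to the decision


def apply_drift_policy(decision: str, status: DriftStatus) -> PolicyOutcome:
    """Apply the drift mapping to an already-computed gate decision."""
    if status is DriftStatus.CRITICAL:
        # Critical drift replaces whatever the gates decided.
        return PolicyOutcome("HALT", forced_halt=True)
    if status is DriftStatus.WARNING:
        # Warning drift keeps the decision but attaches an advisory constraint.
        return PolicyOutcome(decision, forced_halt=False,
                             constraints=("drift warning: review recommended",))
    return PolicyOutcome(decision, forced_halt=False)
```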

Long-term (v2.0.0) — AWS Infrastructure DEPLOYED

[x] 16. Agent integration guide & parameter cookbook — (a) docs/integration/parameter-reference.md — comprehensive parameter reference with derivation guidance, domain examples, boundary behavior for all inputs; (b) docs/integration/domain-templates.md — 4 worked examples (trading, CI/CD, content moderation, autonomous agent) with parameter mappings, JSON inputs, gate-by-gate walkthroughs; (c) MCP tool descriptions enriched with semantic context, minimum/maximum JSON Schema constraints, instructions field in initialize response. Files: docs/integration/parameter-reference.md, docs/integration/domain-templates.md, src/aegis_governance/mcp_server.py
[x] 17. GAP-L1 Deployment — Grafana deployment, production Prometheus, alert routing (Slack/PagerDuty). Issue #9. DEPLOYED: AegisMonitoringStack-dev — CloudWatch dashboard AEGIS-Governance-dev, SNS topic aegis-governance-alarms-dev, 4 alarms (Lambda errors, Lambda throttles, ECS unhealthy, billing). Grafana/Prometheus observability via ADOT sidecar on ECS Fargate.
[x] 18. GAP-L2 — OpenTelemetry distributed tracing. DEPLOYED: ADOT sidecar running on ECS Fargate (AegisMcpStack-dev), configured for Prometheus remote write to AMP. Full OTLP span correlation deferred to production workload phase.
[x] 19. Issues #2, #5, #7, #8, #9 Phase 2 — Infrastructure requirements met. DEPLOYED: DynamoDB aegis-governance-state-dev, S3 aegis-governance-audit-dev-164171672016, Secrets Manager aegis/signing-keys-dev, KMS encryption key, IAM auth on API Gateway, SNS alarm topic. Remaining: Slack/email subscription on SNS, multi-region replication, Locust load testing against live API Gateway endpoint.
[x] 20. Red-team fuzzing — Phase 2 adversarial testing. DEPLOYED: Lambda aegis-evaluate-proposal-dev + API Gateway https://yd1xm4ahcg.execute-api.us-west-2.amazonaws.com/dev/ available as live targets. ECS aegis-mcp-dev running. Fuzzing execution pending.
[x] 20a. MCP hardening (CoSAI/Red Hat) — Per research/003: ✅ (a) MCP audit logging — structured log for every tool invocation; ✅ (b) MCP rate limiting — token bucket/sliding window on mcp_server.py; ✅ (c) TLS enforcement — _validate_sink_url() enforces HTTPS on HTTPEventSink/BatchHTTPSink (with allow_insecure escape hatch for local dev), MCP _ALLOWED_TELEMETRY_SCHEMES restricted to {"https"}; ✅ (d) CoSAI MCP-T cross-reference in CLAUDE.md §11.4.1 (MCP-T1..T12 → AEGIS control matrix); ✅ (e) MCP tool schema signing — AMTSS Protocol v1 (src/crypto/schema_signer.py, ToolSchemaSigner, Ed25519, RFC 8785, _meta inline delivery, capabilities.experimental keyset); research: docs/research/004-mcp-schema-signing-design.md. All 5 sub-items complete.
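Item 20a(b) names token bucket as one of the rate-limiting approaches. The following is a generic token bucket sketch, not the implementation in mcp_server.py; the class name, parameters, and injectable clock are assumptions made for illustration and testing.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """Generic token bucket: refills at rate_per_sec up to capacity."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if rate_per_sec <= 0 or capacity <= 0:
            raise ValueError("rate_per_sec and capacity must be > 0")
        self._rate = rate_per_sec
        self._capacity = capacity
        self._tokens = capacity          # start full so an initial burst is allowed
        self._clock = clock
        self._last = clock()
        self._lock = threading.Lock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits, else False."""
        with self._lock:
            now = self._clock()
            elapsed = max(0.0, now - self._last)
            self._last = now
            # Refill proportionally to elapsed time, capped at capacity.
            self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

A caller would typically keep one bucket per client identifier and reject the tool invocation when `allow()` returns False.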

Infrastructure (ECS MCP)

[x] 23. MCP Streamable HTTP transport — Implemented MCP Streamable HTTP transport (2025-03-26 spec) using stdlib http.server (zero new deps). --transport http flag on aegis-mcp-server, POST /mcp (JSON-RPC single + batch), GET /health, origin validation, 1 MiB request limit. ECS stack updated: internal ALB (:80 → :8080), /health health check, ALB 5xx alarm. 50 new tests, 1923 total passing, 94.62% coverage. Deferred: SSE streaming, session management, resumability (all AEGIS tools are synchronous/stateless). Files: src/aegis_governance/mcp_server.py, tests/test_mcp_http_transport.py, Dockerfile, infra/stacks/ecs_stack.py, infra/stacks/monitoring_stack.py
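As a rough illustration of the POST /mcp behaviour described in item 23 (JSON-RPC single and batch requests with a 1 MiB limit), the sketch below dispatches a request body to handlers. It is not the server's code: `handle_body`, the `methods` mapping, and reporting an oversize body as a JSON-RPC error (rather than an HTTP status) are assumptions. The error codes are the standard JSON-RPC 2.0 ones.

```python
import json
from typing import Any, Callable, Dict, List, Mapping, Optional, Union

MAX_BODY_BYTES = 1024 * 1024  # 1 MiB request limit, as stated in item 23

Handler = Callable[[Any], Any]


def _error(code: int, message: str, request_id: Any = None) -> Dict[str, Any]:
    return {"jsonrpc": "2.0", "error": {"code": code, "message": message}, "id": request_id}


def _handle_one(request: Any, methods: Mapping[str, Handler]) -> Optional[Dict[str, Any]]:
    """Handle one request object; notifications (no "id") produce no response."""
    if (not isinstance(request, dict) or request.get("jsonrpc") != "2.0"
            or not isinstance(request.get("method"), str)):
        return _error(-32600, "Invalid Request")
    is_notification = "id" not in request
    handler = methods.get(request["method"])
    if handler is None:
        return None if is_notification else _error(-32601, "Method not found", request["id"])
    result = handler(request.get("params"))  # handler exceptions are not caught in this sketch
    return None if is_notification else {"jsonrpc": "2.0", "result": result, "id": request["id"]}


def handle_body(body: bytes, methods: Mapping[str, Handler]) -> Union[Dict[str, Any], List[Dict[str, Any]], None]:
    """Dispatch a raw POST body containing a single request or a batch."""
    if len(body) > MAX_BODY_BYTES:
        return _error(-32600, "Request too large")
    try:
        payload = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return _error(-32700, "Parse error")
    if isinstance(payload, list):
        if not payload:
            return _error(-32600, "Invalid Request")  # empty batch is invalid
        responses = [r for r in (_handle_one(item, methods) for item in payload) if r is not None]
        return responses or None
    return _handle_one(payload, methods)
```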

Business & Legal (pre-public-release)

[ ] 21. IP & licensing review — Engage IP attorney to evaluate: (a) patentability of the unified governance architecture (6-gate Bayesian framework + two-key crypto override + MCP agent integration + shadow calibration workflow); (b) trademark feasibility for "AEGIS Governance" in AI/software governance class; (c) license strategy — MIT is maximally permissive, evaluate BSL/SSPL/dual-license if commercial protection needed before public launch; (d) prior art landscape assessment. Decision required before broad public release or commercialization.
[x] 22. Commercialization strategy — Market research complete: AI governance market $300-850M (2025), 35-40% CAGR to $1.5B-$4.8B by 2033. Open core model recommended (free engine + paid enterprise features). Pricing tiers defined: Community (free), Professional ($2-5K/mo), Enterprise ($10-25K/mo), Financial Services ($25-50K/mo). File: docs/research/002-market-research-competitive-landscape.md

Active Work (In Progress)

Gap Closure Sprint (PR #25) — Phase 1 Complete

Task	Type	Status	Details
RBAC Enforcement (#7)	Security	✅ Phase 1	RBACEnforcer, YAMLRoleResolver, wired into override + pcw_decide
Override Audit (#5)	Observability	✅ Phase 1	Override telemetry events, AlertSink protocol, LogAlertSink + WebhookAlertSink
Performance Benchmarks (#2)	Testing	✅ Phase 1	13 pytest-benchmark functions across 3 files
DR Verification (#8)	Reliability	✅ Phase 1	Crash recovery tests, hash chain integrity, health CLI
Monitoring Infrastructure (#9)	Observability	✅ Phase 1	MetricsServer, CLI metrics/health, Grafana + Prometheus configs
validate() Refactor (#24)	Tech Debt	✅ Complete	CC=56 → CC~6 via data-driven _validate_section()

Metrics (at PR #25 merge): 1997 tests, 94.47% coverage, 6 issues addressed + 13 rigor findings + 11 bug-hunt #5 + 6 bug-hunt #6 + 6 bug-hunt #8 + 8 bug-hunt #9 + 2 ultrathink + 5 QG-ultrathink-10 + shadow mode + HTTP sink + drift enforcement + MCP HTTP transport + H-1 SSRF fix + MCP hardening + TLS enforcement + parameter cookbook + QG56 ultrathink + QG57 ultrathink + BH10 (7 bugs)
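The validate() refactor in the table above replaced branching logic with a data-driven `_validate_section()`. A minimal sketch of that general technique follows. The section names, keys, rule format, and messages are all hypothetical; only the idea of one generic loop over a rules table is taken from the table entry.

```python
from typing import Any, Callable, Dict, List, Mapping, Tuple

# (predicate, message reported when the predicate returns False)
Rule = Tuple[Callable[[Any], bool], str]


def _is_number(value: Any) -> bool:
    """True for int/float but not bool (bool is a subclass of int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Hypothetical sections and keys, for illustration only.
RULES: Dict[str, Dict[str, Rule]] = {
    "gates": {
        "risk_threshold": (lambda v: _is_number(v) and 0.0 <= v <= 1.0,
                           "must be a number in [0, 1]"),
        "max_complexity": (lambda v: _is_number(v) and v > 0,
                           "must be a number > 0"),
    },
    "telemetry": {
        "flush_interval": (lambda v: _is_number(v) and v > 0,
                           "must be a number > 0"),
    },
}


def _validate_section(name: str, section: Mapping[str, Any],
                      rules: Mapping[str, Rule]) -> List[str]:
    """Check one section against its rules; one loop replaces per-key branches."""
    errors: List[str] = []
    for key, (predicate, message) in rules.items():
        if key not in section:
            errors.append(f"{name}.{key}: missing")
        elif not predicate(section[key]):
            errors.append(f"{name}.{key}: {message}")
    return errors


def validate(config: Mapping[str, Any]) -> List[str]:
    """Return a list of error strings; an empty list means the config is valid."""
    errors: List[str] = []
    for name, rules in RULES.items():
        section = config.get(name)
        if not isinstance(section, dict):
            errors.append(f"{name}: section must be a mapping")
            continue
        errors.extend(_validate_section(name, section, rules))
    return errors
```

Adding a new check becomes a one-line table entry, which is what keeps the cyclomatic complexity of the validation function itself low.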

AWS Deployment (ROADMAP Items 16-20) — DEPLOYED

Stack	AWS Resource	Status	Details
AegisSharedStack-dev	DynamoDB, KMS, S3, Secrets Manager	✅ DEPLOYED	`aegis-governance-state-dev`, `aegis-governance-audit-dev-164171672016`
AegisLambdaStack-dev	Lambda + API Gateway	✅ DEPLOYED	`aegis-evaluate-proposal-dev`, REST API with IAM auth
AegisMcpStack-dev	ECS Fargate	✅ DEPLOYED	`aegis-mcp-dev` (1/1 running), keepalive loop (stdio transport)
AegisMonitoringStack-dev	CloudWatch + SNS	✅ DEPLOYED	Dashboard, 4 alarms, SNS topic

API Endpoint: https://yd1xm4ahcg.execute-api.us-west-2.amazonaws.com/dev/
Routes: POST /evaluate, POST /risk-check, GET /health

Completed (v3.26.0 — Rigor Protocol)

Task	Type	Status	Details
Rigor Protocol Phase 1	Bug Fix	✅ Complete	v3.24.0: 7 fixes (M7, M8, L13, L16, L19, L31, M11 doc)
Rigor Protocol Phase 2	Bug Fix	✅ Complete	v3.25.0: 17 fixes, 25 regression tests
Rigor Protocol Phase 3	Bug Fix	✅ Complete	v3.26.0: 13 fixes (M14-M18, L33-L40)
Quality Gate Ultrathink	Hardening	✅ Complete	M1-M4, L4: input validation, error handling

Metrics: 1689 tests, 94.60% coverage, 103/103 bugs fixed (100% fix rate)

Previously Completed (v3.11.0-v3.13.0)

Task	Type	Status	Details
Posterior Predictive (NEW-A)	Math Fix	✅ Complete	ADR-006, `compute_posterior_predictive()`
Covariance Matrix (U1+)	Math Fix	✅ Complete	`cov_pv`, `cov_pr`, `cov_vr` parameters
PERT Variance (P1*)	Documentation	✅ Complete	Docstring warning ±22-40% error
Fail-Closed Default (I1)	Security Fix	✅ Complete	`lcb=float('-inf')`
Input Validation	Robustness	✅ Complete	ValueError for invalid std values

See: Multi-Model Coherence Review for full analysis.
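The "Fail-Closed Default (I1)" row sets `lcb=float('-inf')`. A small sketch shows why that default fails closed, assuming a gate that passes when the lower confidence bound meets or exceeds a threshold. The function name `passes_gate` and the comparison direction are assumptions, not the project's API.

```python
import math


def passes_gate(threshold: float, lcb: float = float("-inf")) -> bool:
    """Hypothetical gate: pass only when a known lower confidence bound clears the threshold."""
    if math.isnan(lcb):
        return False  # NaN compares False anyway; made explicit for readability
    return lcb >= threshold
```

With the default, `-inf >= threshold` is False for every finite threshold, so a caller that forgets to supply `lcb` is rejected rather than silently approved.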

Recently Merged PRs

PR	Title	Merged Commit	Status
#25	feat: gap closure sprint — RBAC, alerts, metrics, DR, benchmarks	`edc278c`	✅ Merged
#23	feat: AEGIS v1.0 Governance Decision SDK	`cfa3783`	✅ Merged
#22	chore(claude): audit & regenerate CLAUDE.md v4.0.0	`d114f07`	✅ Merged
#21	Default legacy algorithm on deserialization	`cd4572a`	✅ Merged
#20	Fix telemetry timestamp validation for ISO strings	`d9ea971`	✅ Merged
#19	Add structured decision trace to pcw_decide	`7df8bf7`	✅ Merged

Open Issues

#	Title	Priority	Status	Labels	Milestone
#1	GAP-DriftThreshold: Calibrate KL Divergence Threshold	MEDIUM	Open — needs production data	`GAP`, `team:risk`	v1.2.0+
#2	GAP-PerfTest: Load-Test Guardrail Service (<500 ms p95)	MEDIUM	Phase 1 complete — needs Locust testing	`GAP`, `team:devops`	v1.2.0
#5	GAP-OverrideAudit: Enhance Override Logging & Alerts	MEDIUM	Phase 1 complete — needs SNS subscriptions	`GAP`, `team:seceng`	v1.2.0
#7	GAP-RBAC-Enforcement: Apply Role-Based Access Controls	MEDIUM	Phase 1 complete — needs IAM integration	`GAP`, `team:seceng`	v1.2.0
#8	GAP-DR-Drill: Test Disaster Recovery Process	LOW	Phase 1 complete — needs multi-region replication	`GAP`, `team:devops`	v2.0.0
#9	GAP-MonitoringDashboard: Implement Guardrail Monitoring Dashboard	LOW	Phase 1 complete — needs Grafana provisioning	`GAP`, `team:devops`	v2.0.0
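Issue #2 above targets a p95 latency under 500 ms. A minimal sketch of how collected latency samples could be checked against that target, using only the standard library. The function names are hypothetical and the choice of the "inclusive" quantile method is an assumption; a load-testing tool such as Locust reports its own percentiles.

```python
import statistics
from typing import Sequence

P95_TARGET_MS = 500.0  # p95 target from issue #2, in milliseconds


def p95_ms(samples_ms: Sequence[float]) -> float:
    """95th percentile of latency samples, in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # n=100 yields 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(samples_ms, n=100, method="inclusive")[94]


def meets_p95_target(samples_ms: Sequence[float]) -> bool:
    """True when the observed p95 is strictly below the target."""
    return p95_ms(samples_ms) < P95_TARGET_MS
```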

Recently Closed Issues

#	Title	Closed	Notes
#24	validate() CC=56 refactor	2026-02-08	Refactored to CC~6 via data-driven `_validate_section()`
#6	GAP-TelemetryPrivacy: PII Redaction	2026-01-31	12-field PII encryption via HybridKEM

Sprint Update (PR #25): Issues #2, #5, #7, #8, #9 have Phase 1 code-side implementations complete. AWS infrastructure is now DEPLOYED (4 CDK stacks live in us-west-2). Remaining work: Slack/email SNS subscriptions, multi-region replication, Locust load testing against live endpoints. Issues #6, #18, #24 closed. Milestone "Guardrail β-to-Prod" due date updated to 2026-06-30.

Release Roadmap

v1.0.1 (Patch - Pre-Release Bug Fixes) ✅ RELEASED

Released: 2026-01-31 (pre-release fixes merged before v1.0.0 SDK release)
Focus: Bug fixes merged

Task	PR/Issue	Status
Timestamp validation fix	PR #20	✅ Merged (`d9ea971`)
Signature algorithm preservation	PR #21	✅ Merged (`cd4572a`)
Structured decision trace	PR #19	✅ Merged (`7df8bf7`)
Fix broken documentation links	Issues #14-18	✅ Fixed (`0f18c71`)

v1.0.0 (Major - SDK Release) ✅ RELEASED

Released: 2026-02-06 (PR #23, commit cfa3783)
Focus: Governance Decision SDK — public API, CLI, MCP server

Task	PR/Issue	Status
AegisConfig frozen dataclass (`src/config.py`)	PR #23	✅ Complete
CLI entry point (`src/cli.py`, `aegis` command)	PR #23	✅ Complete
Public API facade (`src/aegis_governance/__init__.py`)	PR #23	✅ Complete
MCP server (`src/aegis_governance/mcp_server.py`)	PR #23	✅ Complete
79 new tests (config, CLI, facade, MCP)	PR #23	✅ Complete
4 runnable examples (`examples/`)	PR #23	✅ Complete
README rewrite (SDK positioning)	PR #23	✅ Complete
pyproject.toml `[project.scripts]` entries	PR #23	✅ Complete

v1.1.0 (Minor - Enhancements)

Target: Q1 2026
Focus: Testing improvements and new features

Task	Effort	Status	Notes
Mathematical coherence fixes	8h	✅ Complete	v3.11.0 (NEW-A, U1+, P1*, I1)
Boundary tests for all gates	4h	✅ Complete	77 parametrized BVA tests (`tests/test_gate_boundaries.py`)
Integration test: Proposal → Execution	8h	✅ Complete	`tests/integration/test_e2e_proposal_lifecycle.py` (5 tests)
GOVERNANCE actor type	6h	✅ Complete	Override orchestration, compliance, emergency halt (41 tests incl. 6 regression)
CALIBRATOR actor type	6h	✅ Complete	Statistical threshold tuning, approval-gated workflow (69 tests incl. 12 regression)

v1.2.0 (Minor - Features)

Target: Q2 2026
Focus: Production readiness

Task	Effort	Status	Notes
Shadow mode deployment prerequisites	16h	✅ Complete	`pcw_decide(shadow_mode=True)`, ShadowResult, 44 tests (ROADMAP Item 13)
HTTP telemetry sink	4h	✅ Complete	HTTPEventSink + BatchHTTPSink + http_sink() factory, config/CLI/MCP wiring, 41 tests (ROADMAP Item 14)
Configuration management system	12h	✅ Complete	AegisConfig in v1.0.0 (PR #23)
Drift detection → policy connection	4h	✅ Complete	DriftMonitor wired into pcw_decide; CRITICAL→HALT, WARNING→constraint (ROADMAP Item 15)
GAP-DriftThreshold (#1)	TBD	Unblocked	Shadow mode enables data collection; needs 30+ days of observed KL values
GAP-PerfTest (#2)	8h	Phase 1 Complete	Benchmarks established; Locust load testing now possible against live API Gateway
GAP-OverrideAudit (#5)	8h	Phase 1 Complete	Override telemetry + AlertSink protocol; Slack/email sinks pluggable
GAP-RBAC-Enforcement (#7)	12h	Phase 1 Complete	RBACEnforcer + YAMLRoleResolver; IAM integration pluggable via RoleResolver protocol
MCP Streamable HTTP transport	12h	✅ Complete	`--transport http` on `aegis-mcp-server`; POST `/mcp` (JSON-RPC single + batch), `/health` endpoint; internal ALB; origin validation; SSRF protection; 50 new tests (ROADMAP Item 23)
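GAP-DriftThreshold (#1) depends on observed KL divergence values collected in shadow mode. As a rough sketch of the quantity being calibrated, the function below computes KL divergence between two histograms over the same bins. The function name, the direction (observed relative to baseline), and the epsilon smoothing are assumptions; DriftMonitor's actual binning and smoothing are not described in this document.

```python
import math
from typing import Sequence


def kl_divergence(observed_counts: Sequence[float],
                  baseline_counts: Sequence[float],
                  epsilon: float = 1e-9) -> float:
    """KL(observed || baseline) in nats for two histograms over identical bins."""
    if not observed_counts or len(observed_counts) != len(baseline_counts):
        raise ValueError("histograms must be non-empty and have the same number of bins")
    if any(c < 0 for c in observed_counts) or any(c < 0 for c in baseline_counts):
        raise ValueError("counts must be non-negative")
    # Add epsilon to every bin so empty bins do not produce log(0) or division by zero.
    p_raw = [c + epsilon for c in observed_counts]
    q_raw = [c + epsilon for c in baseline_counts]
    p_total, q_total = sum(p_raw), sum(q_raw)
    p = [c / p_total for c in p_raw]
    q = [c / q_total for c in q_raw]
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Logging this value per evaluation window over the 30+ day observation period would give the empirical distribution from which a threshold can be chosen.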

v2.0.0 (Major - Backlog)

Target: 2026 H2
Focus: Operational excellence

Task	Effort	Status	Notes
GAP-L1 Phase 2-3: Grafana dashboards & alerting	87h	✅ DEPLOYED	Phases 1-3 code-complete; CloudWatch dashboard + SNS alarms deployed (`AegisMonitoringStack-dev`)
GAP-L2: OpenTelemetry distributed tracing	16h	✅ DEPLOYED (foundation)	ADOT sidecar on ECS Fargate; full OTLP span correlation deferred
GAP-DR-Drill (#8)	16h	Phase 1 Complete	DR verification tests + health CLI; live drill now possible against deployed infrastructure
GAP-MonitoringDashboard (#9)	16h	✅ DEPLOYED	CloudWatch `AEGIS-Governance-dev` dashboard + Grafana configs available
Phase 2 red-team fuzzing	20h	Infrastructure Ready	Lambda + API Gateway + ECS deployed as live targets
Parameter freezing mechanism	8h	Backlog	Governance compliance

GAP Status Summary

Completed GAPs

GAP	Description	Completion	Implementation
GAP-C1	Decision Logic Divergence	100%	`src/engine/gates.py`
GAP-C2	Override Mechanism	100%	`src/workflows/override.py`
GAP-C3	AFABridge Gate Integration	100%	`src/integration/afa_bridge.py`
GAP-H1	Parameter Naming	100%	`schema/interface-contract.yaml`
GAP-H2	Telemetry Schema	100%	`src/telemetry/schema.py`
GAP-H3	RBAC Reconciliation	100%	`schema/rbac-definitions.yaml`
GAP-M1	Feedback Timing	100%	`src/engine/drift.py`
GAP-M2	Actor Types	100%	`src/actors/`
GAP-M3	Workflow Persistence	100%	`src/workflows/persistence/`
GAP-M4	Signature Format	100%	`src/crypto/`
GAP-Q1	Post-Quantum Signatures	100%	`src/crypto/mldsa.py`, `hybrid_provider.py`
GAP-Q2	Post-Quantum Encryption	100%	`src/crypto/mlkem.py`, `hybrid_kem.py`

In-Progress GAPs

GAP	Description	Completion	Phase
GAP-L1	Unified Monitoring Dashboard	100% code + deployed	Phases 1-3 code-complete; CloudWatch + SNS deployed; Grafana available via configs

Planned GAPs

GAP	Description	Completion	Target
GAP-L2	Cross-Component Tracing	Foundation deployed	ADOT sidecar running; full OTLP deferred to v2.0.0

Production Readiness Metrics

Metric	Current	Target	Status
Test Coverage	~94.9%	90%	Exceeds
Tests Passing	3041 (2 skipped)	All	Pass
Security Vulnerabilities	0	0	Pass
CI/CD	All green	All green	Pass
Documentation Accuracy	99.6%	95%	Exceeds
AWS Deployment	4/4 stacks deployed	All stacks	Pass

Quick Links

Architecture

Gap Analysis - Component gap tracking
Repository Structure - Codebase organization
Unified AEGIS Specification - System architecture

Decision Records

ADR Index - All Architecture Decision Records
ADR-001 - Workflow Persistence
ADR-002 - BIP-322 Signatures
ADR-003 - Post-Quantum Signatures
ADR-004 - Post-Quantum Encryption
ADR-005 - KL Divergence Threshold Calibration
ADR-006 - Posterior Predictive for Bayesian Gates
ADR-007 - AWS Deployment Architecture

Integration Guides

Parameter Reference - Complete parameter reference with derivation guidance
Domain Templates - Worked examples for 4 domains (trading, CI/CD, content moderation, autonomous agents)

Implementation

Implementation Plans - EPCC detailed plans
Shadow Mode Prerequisites - Deployment guide

Research

Research Index - All research documents
KL Threshold Calibration - Drift detection tuning
Market Research & Competitive Landscape - Market sizing, competitors, GTM
MCP Security Ecosystem Review - CoSAI taxonomy + Red Hat patterns mapped to AEGIS

Analysis

Comprehensive TODO Discovery - Historical progress
Test Count Methodology - Test verification
Cross-Reference Verification - Documentation accuracy

Changelog

Version	Date	Changes
1.93.0	2026-02-25	Bug Hunt #45 (Hybrid): 6 fixes (1 Codex, 2M, 2L + 1 ultrathink), 31 regression tests; BH45-Codex-M1 proposal metadata deep copy, BH45-M1 MCP risk_score eager eval transport parity, BH45-M2 BayesianPosterior update_prior validation, BH45-T1 update_prior bool guard, BH45-L1 PipelineConfig int validation, BH45-L2 PipelineConfig enum validation; 3029 tests, ~94.8% coverage
1.92.0	2026-02-25	Scoring Guide MCP Tool + Advisor v2: aegis_get_scoring_guide with 5-domain derivation guidance, Advisor rewrite with domain funnel + factual rubric + real API calls, demo API key provisioned; 2998 tests, ~94.8% coverage
1.91.0	2026-02-24	SaaS Commercialization Sprint: API key auth + usage plans (CDK), tenant context extraction (Lambda), customer provisioning script, OpenAPI 3.1 spec, mkdocs-material docs site (10 pages), PyPI trusted publishing, SECURITY.md, CHANGELOG.md; pyproject.toml v1.1.0; 2967 tests, ~94.8% coverage
1.90.0	2026-02-24	Transport Parity Fix: 15 gaps closed across CLI/MCP/Lambda (GAP 2-4 CRITICAL: MCP missing bool flags, GAP 1 metadata, GAP 6-8 inputSchema + Lambda telemetry, GAP 12 strict impact, GAP 15 UUID session, GAP 17 SSRF, GAP 18-22 output fields); new telemetry/url_validation.py shared module; 2958 tests, ~94.8% coverage
1.89.0	2026-02-23	Bug Hunt #44 (Hybrid): 4 fixes (1 Codex, 2M, 1L), 15 regression tests; BH44-Codex-M1 schema_signer chain state corruption, BH44-M1 calibrator utility_threshold constraint, BH44-M2 proposer TypeError catch, BH44-L1 pcw_decide drift alias; 2923 tests, ~94.8% coverage
1.88.0	2026-02-23	Bug Hunt #43 (Hybrid): 11 fixes (2 Codex, 5M, 4L) + 1 ultrathink fix, 31 regression tests; BH43-Codex-M1 analyst gate exception handling, BH43-Codex-M2 analyst quality_subscores TypeError, BH43-M1 CLI null subscores crash, BH43-M2 ComplexityBreakdown bool fields, BH43-M3 value_variance negative floor, BH43-M4+M5 pipeline ingest() aliasing, BH43-L1 CLI metric alias null, BH43-L2 utility value_low_conf NaN, BH43-L3 utility covariance NaN, BH43-L4 ProposalWorkflow from_dict cls(), QG-T1 from_dict evaluation_result; 2908 tests, ~94.8% coverage
1.87.0	2026-02-23	Bug Hunt #42 (Hybrid): 13 fixes (3 Codex, 6M, 2L + 2 ultrathink), 29 regression tests; BH42-M1 complexity mutable default, BH42-M2 calibrator novelty_k positive, BH42-M3 prometheus NaN latency, BH42-M4 prometheus NaN KL divergence, BH42-M5 emitter correlation_id or-falsy, BH42-M6 lambda shadow_mode bool, BH42-L1 pcw_decide posterior or-falsy, BH42-L2 afa_bridge posterior or-falsy, BH42-Codex-M1 auth falsy fail-open, BH42-Codex-M2 allow_abstain bool, BH42-Codex-L1 checkpoint collision retry, QG-T1 MCP shadow_mode parity, QG-T2 analyst confidence or-falsy; 2877 tests, 94.81% coverage
1.86.0	2026-02-22	Bug Hunt #41 (Hybrid): 7 bugs (1 Codex + 4M, 2L), 33 regression tests; BH41-M1 analyst None subscores saw_non_null (analyst.py), BH41-M2 validate_range check_nan default False→True (validation.py), BH41-M3 schema_signer _prev_digests atomic commit (schema_signer.py), BH41-M4 consensus DEFER excluded from required_missing (consensus.py), BH41-L1 calibrator list_proposals lock-snapshot race (calibrator.py), BH41-L2 emitter correlation_id or-coercion (emitter.py), BH41-Codex complexity_floor bool guard (complexity.py); QG verify: ruff B017 narrowed, black format, mypy attr-defined; 2848 tests, 94.82% coverage
1.85.0	2026-02-22	Bug Hunt #40 (Hybrid): 9 bugs (4M, 5L), 40 regression tests; BH40-M1 quality_subscores empty-list bypass (Codex+Claude), BH40-M2 BatchHTTPSink.stop() lock-before-join race, BH40-M3 validate_normalized bool guard missing, BH40-M4 _parse_mcp_rate_limit string-fractional truncation, BH40-L1 GateEvaluator negative threshold values disable gates, BH40-L2 _parse_kl_drift_dict string-fractional window_days, BH40-L3 stdio size guard char vs byte count, BH40-L4 get_decision_history truthy agent_id bypass, BH40-L5 DEKRotator readers without lock; 2815 tests, 94.78% coverage
1.84.0	2026-02-21	Bug Hunt #39: 13 bugs (1H, 6M, 6L), 54 regression tests; BH39-H1 chain root forgery, BH39-M1/M3 lock-before-join, BH39-M2 TOCTOU, BH39-M4 inf trigger factor, BH39-M5 NaN utility, BH39-M6/L5 float truncation, BH39-L1 from_dict cls.new, BH39-L2 novelty_k=0, BH39-L3 JSON-RPC §4.1, BH39-L4 bip322 length, BH39-Codex-2 memory_sink maxlen=0; 2775 tests, 94.77% coverage
1.83.0	2026-02-21	QG-UT1: GateEvaluator(trigger_confidence_prob=True) silently accepted via validate_range inclusive upper bound (True==1.0); explicit bool guard added; 2721 tests, 94.78% coverage
1.82.0	2026-02-21	Bug Hunt #38 (Hybrid): 6 bugs (1H, 4M, 1L), 35 regression tests; BH38-H1 key_store.py Python 3.10+ async-with SyntaxError on 3.9 (+ fmt:off guard), BH38-M1 UtilityCalculator bool-is-int bypass (phi_S/phi_D/gamma/kappa/migration_budget), BH38-M2 GateEvaluator bool-is-int bypass (trigger factors + thresholds), BH38-M3 CalibrationProposal + _validate_gate_param bool bypass, BH38-M4 MetricsServer.stop() lock held during join, BH38-L1 BatchHTTPSink non-int params (Codex); 2720 tests, 94.78% coverage
1.81.0	2026-02-20	Bug Hunt #37: 6 bugs (3M, 3L); BayesianPosterior NaN, emergency_halt audit, calibrator novelty_N0, PipelineConfig float, ThreePointEstimate bool, DriftMonitor window_days; 2685 tests, 94.76% coverage
1.80.0	2026-02-20	Bug Hunt #36 (Hybrid): 6 bugs (4M, 2L), 17 regression tests; QG Ultrathink: 2 findings (2L); BH36-M1 Lambda `or` pattern falsy bypass (Codex), BH36-M2 mark_completed non-enum state injection, BH36-M3 CLI `or` estimated_impact, BH36-M4 MCP `or` estimated_impact, BH36-L1 complexity_tax bool guard, BH36-L2 proposal_summary `or` pattern; 2659 tests, 94.74% coverage
1.79.0	2026-02-20	Bug Hunt #35 (Hybrid): 6 bugs (4M, 2L), 22 regression tests; QG Ultrathink: 4 findings (4L), 19 regression tests; BH35-M1 check_and_mark_expired terminal state downgrade (Codex), BH35-M2 RBAC NaN signer_count bypass, BH35-M3 PipelineConfig flush_interval no validation, BH35-M4 BatchHTTPSink flush_interval no validation, BH35-L1 PipelineConfig bool-is-int, BH35-L2 DEKCache ttl_seconds no validation; 2642 tests, 94.79% coverage
1.78.0	2026-02-20	Bug Hunt #34 (Hybrid): 5 bugs (4M, 1L), 14 regression tests; BH34-M1 DriftMonitor num_bins float accepted, BH34-M2 CLI cmd_evaluate missing TypeError catch, BH34-M3 DualSignatureValidator expiration_hours upper bound, BH34-M4 TelemetryPipeline worker_loop inconsistent state, BH34-L1 AegisConfig.from_dict() telemetry_url type coercion; 2601 tests, 94.79% coverage
1.77.0	2026-02-20	Bug Hunt #33 (Hybrid): 5 bugs (5M), 15 regression tests; BH33-M1 config._parse_flat_numeric non-numeric type silently accepted, BH33-M2 config._from_raw_dict DIRECT param non-numeric type, BH33-M3 DriftMonitor.evaluate() unfiltered window, BH33-M4 OverrideWorkflow failed_gates no defensive copy, BH33-M5 mark_completed() state_data desync (Codex); 2587 tests, 94.80% coverage
1.76.0	2026-02-20	Bug Hunt #32 (Hybrid): 3 bugs (2M, 1L), 20 regression tests; BH32-M1 DriftMonitor constructor negative/Inf threshold parity, BH32-M2 calibrator negative threshold governance bypass, BH32-L1 KLDriftConfig window_days validation; 2572 tests, 94.80% coverage
1.75.0	2026-02-20	Bug Hunt #31 (Hybrid) + QG73 Ultrathink: 4 bugs (1M, 3L) + 2 QG73 findings (1M, 1L), 22 regression tests; BH31-M1 MCP caller_id non-string guard, BH31-L1 Lambda threshold dict.get() null, BH31-L2 ConsensusConfig fractional minimum, BH31-L3 DualSignatureValidator fractional minimum; QG73-L1 CLI agent_id transport parity, QG73-M1 AFABridge timeout fractional minimum; 2552 tests, 94.80% coverage
1.74.0	2026-02-19	Bug Hunt #30 (Hybrid) + QG72 Ultrathink: 5 bugs (2M, 3L) + 4 QG72 findings (2M, 2L), 12 regression tests; BH30 dict.get() null gotcha transport parity (CLI/MCP/Lambda), AFABridge float limit, pipeline config mutation; QG72 remaining null gaps; 2530 tests, 94.76% coverage
1.73.0	2026-02-18	Bug Hunt #29 (Hybrid) + QG71 Ultrathink: 8 bugs (3M, 5L) + 3 QG71 findings (3L), 26 regression tests; BH29-M1 estimated_impact case bypass, BH29-M2 executor TOCTOU, BH29-M3 calibrator novelty_k zero; QG71 MCP null guards + drain broadening; 2518 tests, 94.76% coverage
1.72.0	2026-02-18	Bug Hunt #28 (Hybrid) + QG70 Ultrathink: 5 bugs (3M, 2L) + 3 QG70 findings (3L), 22 regression tests; BH28-M1 consensus quorum revert, BH28-M2 governance expired override eviction, BH28-M3 CLI risk alias priority; QG70 config bool coercion + drift baseline Inf; 2492 tests, 94.73% coverage
1.71.0	2026-02-17	Quality-Gate QG69 Ultrathink: 1 finding (1M), 7 regression tests; QG69-M1 MCP+CLI drift_baseline_data isfinite transport parity; 2470 tests, 94.73% coverage
1.70.1	2026-02-17	Bug Hunt #27 (Hybrid): 4 bugs (3M, 1L), 13 regression tests; BH27-M1 (resume_or_create ID propagation), BH27-M2 (_from_raw_dict string-to-float), BH27-M3 (Lambda/MCP null bypass), BH27-L4 (Lambda drift_baseline isfinite); 2470 tests, 94.73% coverage
1.70.0	2026-02-17	Scaffold Adoption: Engineering Standards ai_scaffold_package v2.1.1 (50 files); ai/ (8 artifacts), docs/compliance/ (7 runbooks), tools/ci/ (9 validators), GitHub (templates, workflows, 15 labels), Makefile, .pre-commit-config; 100% placeholder elimination; CLAUDE.md v4.5.33; 2448 tests, 94.83% coverage
1.69.0	2026-02-16	Bug Hunt #26 (Hybrid): 4 bugs (3M, 1L), 18 regression tests; BH26-M1 (validate_positive bool-is-int — Codex), BH26-M2 (bayesian update_prior variance overflow), BH26-M3 (RBAC bool constraint None fail-open), BH26-L1 (complexity delta NaN/Inf propagation); 0 deferred bugs; 2448 tests, 94.83% coverage
1.68.0	2026-02-16	Bug Hunt #25 (Hybrid): 6 bugs (3M, 3L), 18 regression tests; BH25-M1 (analyst utility components null), BH25-M2 (CLI risk_score transport parity), BH25-M3 (drift histogram large-magnitude), BH25-L1 (analyst risk_delta/profit_delta null — Codex), BH25-L2 (bayesian overflow), BH25-L3 (config string NaN); PLR0912: `_parse_flat_numeric()` helper; 0 deferred bugs; 2430 tests, 94.81% coverage
1.67.0	2026-02-16	Bug Hunt #24 (Hybrid) + QG68 Ultrathink: 10 bugs (4M, 6L), 26 regression tests; BH24-M1 (analyst `_evaluate_utility_gate` null guard), BH24-M2 (Lambda `_float` bool), BH24-M3 (MCP `_float_arg` bool), BH24-M4 (CLI `_parse_proposal` bool), BH24-L1 (drift `evaluate` baseline bool), BH24-L2 (override `add_signature` bool), BH24-L3 (MCP `risk_check` threshold null), BH24-L4 (config `mcp_rate_limit` bool), BH24-L5 (pcw_decide `quality_subscores` null), BH24-L6 (analyst `profit_baseline` null); QG68: analyst utility null guards; 0 deferred bugs; 2412 tests, 94.80% coverage
1.66.0	2026-02-16	AMTSS Protocol v1 — MCP Tool Schema Signing: `src/crypto/schema_signer.py` (ToolSchemaSigner, SigningKeyPair, compute_tool_digest), Ed25519 per-tool + manifest dual signing, RFC 8785 canonicalization, `_meta` inline delivery, `capabilities.experimental` keyset; MCP server integration (tools/list proofs + initialize keyset); research doc `004-mcp-schema-signing-design.md`; Claude-GPT dialogue (GPT 5.2 Pro xhigh); QG ultrathink: 5+4 findings fixed (manifest duplicate-name bypass, `_meta` stripping, statement type validation, digest chain, strict base64url + QG67: null sig crash, NaN canonicalization, manifest revision increment, signing error log level); ROADMAP 20a(e) complete — all 5 MCP hardening sub-items done; 2386 tests, 94.74% coverage
1.65.0	2026-02-16	CoSAI MCP-T Cross-Reference: CLAUDE.md §11.4.1 MCP-T1..T12 threat mapping (9 STRONG, 2 MODERATE, 1 PARTIAL); ROADMAP 20a(d) complete; docs-only; 2304 tests, 94.63% coverage
1.64.0	2026-02-16	Bug Hunt #23 (Hybrid): 7 bugs (3M, 4L), 29 regression tests; BH23-M1 (CLI drift baseline bool), BH23-M2 (CLI quality_subscores empty list), BH23-M3 (Calibrator eviction race), BH23-L1 (CLI subscores type check), BH23-L2 (BayesianPosterior prior_mean NaN/Inf), BH23-L3 (ConsensusWorkflow check_timeout), BH23-L4 (KeyStore audit lock TOCTOU); 0 deferred bugs; 2304 tests, 94.63% coverage
1.63.0	2026-02-15	Quality-Gate QG66 Ultrathink: 2 findings (2L), 2 regression tests; UT-1 MCP empty subscores parity, UT-2 MCP non-numeric string crash; 2275 tests, 94.63% coverage
1.62.0	2026-02-15	Bug Hunt #22 (Hybrid): 8 bugs (4M, 4L), 20 regression tests; BH22-M1 (override reject() wall-clock), BH22-M2 (MCP quality_subscores extraction), BH22-M3 (DriftMonitor update_thresholds validation), BH22-M4 (persistence re-completion guard), BH22-L1 (drift_baseline_data bool guard), BH22-L2 (governance override eviction), BH22-L3 (afa_bridge string-as-iterable), BH22-L4 (analyst null subscores); 0 deferred bugs; 2273 tests, 94.64% coverage
1.61.0	2026-02-15	Bug Hunt #21 (Hybrid): 8 bugs (3M, 5L), 16 regression tests; BH21-M1 (KLDriftConfig post_init), BH21-M2 (Lambda subscores bool), BH21-M3 (AFABridge subscores validation), BH21-L1 (DriftMonitor window_days), BH21-L2 (Calibrator unbounded proposals), BH21-L3 (shadow eval key collision), BH21-L4 (drift status label cardinality), BH21-L5 (MCP 405 Allow header); 0 deferred bugs; 2273 tests, 94.64% coverage
1.60.0	2026-02-15	Bug Hunt #20 (Hybrid) + QG65 Ultrathink: 9 bugs (7M, 2L) + 5 QG65 fixes; 22 regression tests total; durable non-dict crash, override mutable sharing, base64 strict (override+crypto+lambda), consensus voter aliasing + timeout overflow, pcw_decide trace crash, encryption base64, config window_days, transport bool guards, CLI risk/subscore bool guards; 2236 tests, 94.68% coverage
1.59.0	2026-02-15	Rigor: Resolve All Deferred Bugs — fixed BH16-L5 (WorkflowTransition.verify_hash standalone false negatives, added previous_hash column), closed BH15-L6 (Lambda telemetry by-design); 8 regression tests; 0 deferred bugs remaining; 2214 tests, 94.68% coverage
1.58.0	2026-02-14	Bug Hunt #19 (Hybrid): 5 bugs (2M, 3L), 12 regression tests; proposal.py from_dict mutable aliasing, override key rotation TOCTOU, afa_bridge bool guard + non-boolean execution flags + null authorization crash; 2206 tests, 94.68% coverage
1.57.0	2026-02-14	Bug Hunt #18 (Hybrid): 7 bugs (3M, 4L), 25 regression tests; lambda_handler/cli non-boolean control flags, config flat key NaN/Inf validation, bayesian ddof bool, consensus config bool guards, afa_bridge timeout_hours bool; 2194 tests, 94.61% coverage
1.56.0	2026-02-14	Bug Hunt #17 (Hybrid): 6 bugs (1M, 5L), 13 regression tests; afa_bridge risk_check transport parity, config NaN/Inf validation, ensure_utc timezone conversion, BatchHTTPSink negative max_retries, governance emergency_halt; 2169 tests, 94.60% coverage
1.55.0	2026-02-14	Quality Gate #62 (Ultrathink): 6 findings (1M, 5L), 11 regression tests; afa_bridge isfinite, config kl_drift NaN validation, lambda null subscores; 2156 tests, 94.58% coverage
1.54.0	2026-02-14	Bug Hunt #16: 9 bugs (4M, 5L), 22 regression tests; 1 deferred (BH16-L5); 2145 tests, 94.56% coverage
1.53.0	2026-02-14	Bug Hunt #15 (Hybrid): 8 bugs (2M, 6L), 22 regression tests + Quality Gate #61 (Ultrathink): 7 findings (4M, 3L), 5 fixed + 8 regression tests; CLI observation_values sanitization; 2123 tests, 94.53% coverage
1.52.0	2026-02-13	Bug Hunt #14 (Hybrid): 3 bugs (3M) — ConsensusConfig bool timeout_hours, DualSignatureValidator expiration_hours validation, Lambda quality_subscores isfinite parity; 2101 tests, 94.54% coverage
1.51.0	2026-02-13	Rigor Close Deferrals v3: closed all 5 deferred bugs (BH12-L2 fixed + QG60-6/7/8/9 documented/accepted-risk); 0 deferred remaining; 2091 tests, 94.52% coverage
1.50.0	2026-02-13	Bug Hunt #13: 7 bugs (4M, 3L), 16 regression tests
1.49.0	2026-02-13	Quality-Gate Ultrathink (QG60): 5 fixes — validate_positive Inf FAIL-OPEN, UtilityCalculator gamma/kappa/migration_budget Inf, MCP POST 404 body drain, MCP 413 connection close, ThreePointEstimate Inf; SDK facade Calibrator/Governance exports; 2072 tests, 94.50% coverage
1.48.0	2026-02-12	Bug Hunt #12 (Hybrid): 10 bugs (1H, 7M, 2L) — GateEvaluator NaN governance lockout, complexity analyze NaN, Lambda _float NaN/Inf parity, risk_check NaN, ExecutionPlan NaN timeout, CalibrationProposal data_window, config null params, proposal to_dict mutable leak; 2053 tests, 94.52% coverage
1.47.0	2026-02-12	Quality-Gate Ultrathink (QG59): 12 fixes from 21 findings (8M, 4L) — NaN trigger_factor bypass, trigger_confidence_prob fail-OPEN, YAML null crash, CalibrationProposal NaN/Inf, analyst coerce NaN strings, proposer PERT NaN/Inf, MCP _float_arg NaN/Inf, emitter dropped-event semantics; 2031 tests, 94.52% coverage
1.46.0	2026-02-12	Bug Hunt #11 (Hybrid): 10 bugs (8M, 2L) — CLI null subscores/phase, calibrator capability check, governance halt override cancel, consensus NaN timeout, MCP POST /health body, pipeline PII encryptor bypass, BatchHTTPSink batch_size=0, utility lcb_alpha NaN, stdio strip order; 2009 tests, 94.49% coverage
1.45.0	2026-02-12	Quality-Gate Ultrathink (QG58): Docs sync — test metrics updated to 1997 tests, 94.47% coverage across all documentation files
1.44.0	2026-02-12	Bug Hunt #10 + QG57: validate_positive/validate_threshold_ordering NaN guards, stdio MCP size limit, CLI null-coalesce, Lambda phase type guard + drift baseline guard, governance emergency_halt lock atomicity, MCP drift baseline guard; 1997 tests, 94.47% coverage
1.43.0	2026-02-12	Quality-Gate Ultrathink (QG56): stdio batch array support, WebhookAlertSink TLS enforcement, URL whitespace stripping, mcp_rate_limit negative clamp; 1978 tests, 94.47% coverage
1.42.0	2026-02-12	ROADMAP Items 16 + 20a(c): TLS enforcement on HTTPEventSink/BatchHTTPSink (`_validate_sink_url()` + `allow_insecure` escape hatch), MCP `_ALLOWED_TELEMETRY_SCHEMES` restricted to `{"https"}`, parameter reference guide, domain integration templates (4 domains), MCP tool description enrichment with `instructions` field + JSON Schema min/max constraints; closes CoSAI MCP-T7 gap (G2); 1964 tests, 94.47% coverage
1.41.0	2026-02-12	MCP Hardening Phase 1 (ROADMAP Item 20a): Token bucket rate limiter + structured audit logging; closes CoSAI MCP-T10 and MCP-T12 gaps; 1948 tests, 94.59% coverage
1.40.0	2026-02-11	H-1 SSRF hex/decimal IP bypass fix: resolve-then-validate via `socket.getaddrinfo()`, `_is_forbidden_ip()` uses `not is_global` (blocks CGNAT 100.64.0.0/10); M-3 Slowloris timeout (30s per-connection); 14 regression tests; 1923 tests, 94.62% coverage
1.39.0	2026-02-11	Completed ROADMAP Item 23 (MCP Streamable HTTP transport): stdlib `http.server` implementation (zero new deps), POST `/mcp` (JSON-RPC single + batch), origin validation, internal ALB; SSE/sessions/resumability deferred; 50 new tests; 1909 tests, 94.63% coverage
1.38.0	2026-02-11	Added ROADMAP Item 23: MCP Streamable HTTP transport — MCP spec (2025-03-26) already standardizes network transport; updated KNOWN_ISSUES.md with resolution path and spec references; added to v1.2.0 release roadmap and Next Steps checklist
1.37.0	2026-02-11	Post-deployment security hardening: 17 ultrathink findings fixed (3H, 11M, 3L) — CORS restriction, script injection fixes (env vars + heredoc delimiters), error message sanitization, IAM least-privilege (Scan/PutObjectAcl removed), ADOT pinned v0.41.2, CDK approval broadening, billing alarm all stages, deploy test gate; 1859 tests, 94.54% coverage
1.36.0	2026-02-10	AWS Deployment Complete: All 4 CDK stacks deployed to us-west-2 (AegisSharedStack-dev, AegisLambdaStack-dev, AegisMcpStack-dev, AegisMonitoringStack-dev); Items 17-20 updated to DEPLOYED; 7 deployment bugs fixed (cdk.json context, pyproject py-modules, Dockerfile pins, ECS ALB removal, Lambda cyclic refs, CloudWatch math, CDK protocol); added AWS Deployment section to Active Work; added ADR-007 to Quick Links; 1859 tests, 94.55% coverage
1.35.0	2026-02-10	AWS Deployment Infrastructure (ROADMAP Items 16-20): Hybrid Lambda+ECS CDK stacks, `src/lambda_handler.py`, `Dockerfile.lambda`, aegis-deploy.yml, aegis-gate action, ADR-007; ultrathink hardening (U-1 null subscores, U-2 injection fix); 42 new tests; 1859 tests, 94.55% coverage
1.34.0	2026-02-10	ROADMAP Item 15: Drift detection → policy connection — `DriftMonitor` wired into production `pcw_decide()` path (CRITICAL→HALT, WARNING→constraint, NORMAL→no change); `_evaluate_drift_policy()` + `_apply_drift_overrides()` helpers; `DRIFT_POLICY_ENFORCED` telemetry; CLI `--drift-baseline`; MCP `drift_baseline_data`; SDK re-exports; 39 new tests; 1817 tests, 94.56% coverage
1.33.0	2026-02-09	Research 003: MCP Security Ecosystem Review — CoSAI MCP-T1..T12 taxonomy (12 threat categories, ~40 threats, 11 control families) + Red Hat enterprise MCP architecture (4-stage progressive promotion) mapped to AEGIS controls; identified 6 gaps (MCP audit logging, rate limiting, TLS enforcement, tool schema signing, shadow server detection, SPIFFE identity); added ROADMAP Item 20a (MCP hardening)
1.32.0	2026-02-09	ROADMAP Item 22: Market research & competitive landscape — AI governance market sizing ($300-850M → $1.5-4.8B), 7 direct + 6 adjacent competitors profiled, unique positioning matrix, regulatory timeline (EU AI Act Aug 2026), open core pricing model, go-to-market strategy
1.31.0	2026-02-09	ROADMAP Item 14: HTTP telemetry sink — `HTTPEventSink` (per-event POST), `BatchHTTPSink` (batching + retry + background flush), `http_sink()` factory; `AegisConfig.telemetry_url`; CLI `--telemetry-url`; MCP `telemetry_url` param; SDK re-exports; stdlib-only (urllib.request); 45 new tests; 1778 tests, 94.44% coverage
1.30.0	2026-02-09	ROADMAP Item 13: Shadow mode for KL divergence calibration — `shadow_mode` keyword param on `pcw_decide()`, `ShadowResult` dataclass, `DriftMonitor`/`TelemetryEmitter` integration, Prometheus `mode` label + shadow counter, CLI `--shadow` flag, MCP `shadow_mode` param, SDK re-export, alerting/recording rule filters; 44 new tests; 1733 tests, 94.48% coverage
1.29.0	2026-02-09	ROADMAP Items 10-12: Production deployment guide (`docs/deployment/production-guide.md`), migration guide (`docs/deployment/migration-guide.md`), performance SLAs with recorded benchmarks (`docs/deployment/performance-slas.md`); Dockerfile + docker-compose.yaml + Prometheus scrape config; no code changes
1.28.0	2026-02-09	ROADMAP Item 7: CALIBRATOR actor type — statistical threshold tuning with drift recalibration, Bayesian prior update, gate parameter proposals, approval-gated application, recognized parameter whitelist, telemetry emission; ultrathink-hardened (U-1..U-5); `ActorRole.CALIBRATOR` + `ActorCapabilities`; 69 new tests (12 regression); 1689 tests, 94.60% coverage
1.27.0	2026-02-09	ROADMAP Item 6: GOVERNANCE actor type — override orchestration (initiate/sign/approve/reject/expire), compliance checking (complexity gate non-overridable), emergency halt; ultrathink-hardened (halt guards, fail-closed compliance, thread safety); `ActorRole.GOVERNANCE` + `ActorCapabilities`; 41 new tests; 1620 tests, 94.36% coverage
1.26.0	2026-02-08	Docs-Sync Audit: Fixed GAP-L1 status (66%→code-complete), repo-structure tree (6 files added), telemetry schema v2.0→v2.1.0, stale counts, TD-2/TD-3 resolved, gap-analysis changelog gaps, ActorBase→Actor, duplicate sections merged
1.25.0	2026-02-08	ROADMAP Items 8 & 9: DRY extraction — `ensure_utc()` shared across 3 workflows, 4 validation helpers shared across 5 engine modules; 27 new tests; deferred: persistence/telemetry timezone consolidation; 1579 tests, 94.31% coverage
1.24.0	2026-02-08	ROADMAP Item 5: 77 boundary tests for all 6 gates + drift detector via `@pytest.mark.parametrize`; verifies comparison operators at exact thresholds; 1552 tests, 94.27% coverage
1.23.0	2026-02-08	ROADMAP Items 2-4: docs version sync committed, safety 2.3→3.x upgrade, broad exception catch documentation (15 sites, 8 files); 1475 tests, 94.21% coverage
1.22.0	2026-02-08	Dependency fix: scipy/prometheus_client moved to dedicated `engine`/`telemetry` optional groups with graceful degradation; 4 regression tests; 1475 tests, 94.21% coverage
1.21.0	2026-02-08	Added "Next Steps (Ordered Checklist)" section — 19 prioritized items from Discovery Analysis 2026-02-08; single place to find what's next
1.20.0	2026-02-08	Quality-Gate Ultrathink #10: 5 MEDIUM bugs fixed (Bayesian overflow, pipeline validator exception, executor rollback retry); 7 regression tests; 1475 tests, 94.21% coverage
1.19.0	2026-02-08	Rigor Close Deferrals v2: 4 bugs fixed + 3 closed as intentional; 6 regression tests; 1466 tests, 94.22% coverage
1.18.0	2026-02-08	Bug-Hunt #9 + Ultrathink: 8 bugs fixed (4M, 4L) + 2 ultrathink findings (T-1 critical, T-4 low); 19 regression tests; 1466 tests, 94.22% coverage
1.17.0	2026-02-07	Docs-sync: Issue #18 closed, changelog alignment, stale reference cleanup
1.16.0	2026-02-07	Bug-Hunt #8: 6 bugs fixed (config YAML drop, drift histogram, Bayesian NaN, consensus premature rejection, pipeline buffer, repository async); 8 regression tests; 1398 tests, 94.13% coverage
1.14.0	2026-02-06	Gap closure sprint: issues #24, #2, #7, #5, #8, #9; new modules (rbac.py, alert.py, metrics_server.py); RBAC wired into override + pcw_decide; monitoring/ configs; 115 new tests; 1390 tests, 93.98% coverage
1.13.0	2026-02-06	v1.0 SDK Release: PR #23 merged — AegisConfig, CLI, facade, MCP server, 79 new tests, 4 examples, README rewrite; 1172 tests, 94.61% coverage
1.12.0	2026-02-05	Deferred Bug Fixes v3.34.0: All 17 deferred bugs fixed (1 MEDIUM, 16 LOW); 1037 tests, 94.11% coverage
1.11.0	2026-02-05	Bug Hunt v3.32.0: Codex+Claude hybrid bug-hunt, 5 bug fixes (bayesian zero-override, prometheus idempotent, override rejection metadata, proposal exporter DI); 956 tests, 93.63% coverage
1.10.0	2026-02-05	Claude-GPT Dialogue v3.31.0: phi_S/phi_D Single Source of Truth, KNOWN_ISSUES.md cleanup (L45→Intentional, L7→HSM mitigation), docs-consistency.yml CI workflow; 946 tests, 93.48% coverage
1.9.0	2026-02-04	Deferred Bug Fix v3.30.0: L44 type coercion validation in analyst.py, L49 audit_mode for timing side-channel mitigation in hybrid_provider.py; 946 tests, 93.48% coverage
1.8.0	2026-02-04	Hybrid Bug Hunt v3.29.0: H-WF-001 consensus fix, H-WF-003 pipeline thread safety, M24/M25 crypto validation, M-ENG-005 exception handling; 931 tests, 93.48% coverage
1.7.0	2026-02-04	Quality Gate v3.28.0: 16 deferred bugs fixed, 4 regression tests added; pip CVE-2026-1703 patched; 916 tests, 93.39% coverage
1.6.0	2026-02-04	Rigor Protocol complete (v3.24.0-v3.26.0): 60/62 bugs fixed (97% fix rate); Quality Gate hardening; 910 tests, 93.48% coverage
1.5.0	2026-02-03	All LOW severity bugs fixed (L1-L9): bounded deques, public gate API, scipy z-score, input validation, thread-safe singleton, timezone parsing, docstring updates; 867 tests, 93.81% coverage
1.4.0	2026-01-31	Bug fixes v3.14.0: empty data validation, timezone-aware datetime, specific exception handling, pipeline refactor; 839 tests, 93.74% coverage
1.3.0	2026-01-31	Mathematical coherence review: ddof parameter, public API usage, GateType enum; 821 tests, 93.34% coverage
1.2.0	2026-01-31	Optional deps installed (btclib, liboqs-python); all 807 tests now pass (0 skipped); 93.76% coverage
1.1.0	2026-01-31	PRs #19-21 merged; v3.11.0 math fixes complete; ADR-006 added; Test counts updated
1.0.1	2026-01-31	Updated PR #20 status (CI failing); Added ADR-005 to Quick Links
1.0.0	2026-01-30	Initial roadmap creation; Added PRs #19-21; Added open issues; Release milestones; GAP status summary
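
Many of the entries above (for example BH26-M1, QG60, and Bug Hunt #10) concern the same class of input-validation bug: a `bool` accepted as a number because `bool` subclasses `int`, or a NaN/Inf value passing a comparison-based check. The following is a minimal sketch of such a guard, not the project's actual code. The name `validate_positive` appears in the changelog, but the signature, parameter names, and error messages here are assumptions made for illustration.

```python
import math
from numbers import Real


def validate_positive(value: object, name: str = "value") -> float:
    """Return value as a float if it is a finite, strictly positive real number.

    Hypothetical signature: the real AEGIS helper may differ.
    """
    # bool is a subclass of int, so True would otherwise pass as 1.
    if isinstance(value, bool):
        raise TypeError(f"{name} must be a number, not bool")
    if not isinstance(value, Real):
        raise TypeError(f"{name} must be a real number, got {type(value).__name__}")
    try:
        number = float(value)
    except OverflowError:
        # Integers too large for a float are treated like Inf.
        raise ValueError(f"{name} is too large to represent") from None
    # NaN fails every ordering comparison, so "number <= 0" alone would let
    # it through; Inf would pass a "> 0" check. Reject both explicitly.
    if not math.isfinite(number):
        raise ValueError(f"{name} must be finite, got {number}")
    if number <= 0:
        raise ValueError(f"{name} must be > 0, got {number}")
    return number
```

The checks are ordered so that the type is settled before any numeric comparison runs, which keeps the function fail-closed: anything that is not provably a finite positive number raises.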