AEGIS Roadmap

Version: 1.93.0
Updated: 2026-02-25
Status: Active
Cross-References: README.md, CLAUDE.md, gap-analysis.md

This document is the single source of truth for AEGIS future work, including active PRs, open issues, and release milestones.


Next Steps (Ordered Checklist)

Work through these in order. Check off each item as completed. Source: Discovery Analysis 2026-02-08.

Immediate (no blockers, can start now)

  • [x] 1. Fix dependency misclassification: scipy and prometheus_client moved from dev to dedicated engine and telemetry optional groups with try/except import guards and clear ImportError messages at point of use. Files: pyproject.toml, src/engine/utility.py, src/telemetry/prometheus_exporter.py, README.md, tests/test_optional_deps.py
  • [x] 2. Docs version sync: repository-structure.md:28 and comprehensive-todo-discovery.md synced to CLAUDE.md v4.2.3, metrics 1475 tests / 94.21% coverage. Commit: d84330a
  • [x] 3. Update dependency versions: safety bumped from >=2.3.0 to >=3.0.0; python-bitcoinlib already pinned at >=0.12.0 (no change needed). All quality gates pass. File: pyproject.toml, Commit: e2ab0a5
  • [x] 4. Document broad exception catches — Added # Intentional: <reason> comments to 15 except Exception sites across 8 files per CLAUDE.md §3. Commit: fc0b41c
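
The import-guard pattern from item 1 can be sketched as follows. This is a minimal illustration, not the code in src/engine/utility.py: the helper name `require_optional` and the distribution name `aegis-governance` in the message are assumptions; only the `engine` and `telemetry` group names come from the item above.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency at point of use, or explain how to get it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # "aegis-governance" is an assumed distribution name, used for illustration.
        raise ImportError(
            f"{module_name} is required for this feature but is not installed. "
            f"Install the '{extra}' optional group, for example: "
            f"pip install 'aegis-governance[{extra}]'"
        ) from exc
```

A call site would then import lazily, for example `stats = require_optional("scipy.stats", "engine")`, so that users who never touch the engine features do not need SciPy installed.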

Short-term (v1.1.0)

  • [x] 5. Boundary tests for all gates — 77 parametrized BVA tests verifying comparison operators at exact thresholds for all 6 gates + drift detector. File: tests/test_gate_boundaries.py
  • [x] 6. GOVERNANCE actor type — Override workflow orchestration: Governance actor class with override lifecycle (initiate/sign/approve/reject/expire), compliance checking (complexity gate non-overridable), emergency halt; ultrathink-hardened (halt guards, fail-closed compliance, thread safety). Files: src/actors/governance.py, src/actors/base.py, tests/test_governance_actor.py (41 tests)
  • [x] 7. CALIBRATOR actor type — Statistical threshold tuning: Calibrator actor class with drift recalibration (delegates to DriftMonitor.calibrate_thresholds()), Bayesian prior update (delegates to BayesianPosterior.update_prior()), gate parameter proposals with recognized parameter whitelist, approval-gated application workflow, telemetry emission; thread-safe with threading.Lock; ultrathink-hardened (U-1..U-5). Files: src/actors/calibrator.py, src/actors/base.py, tests/test_calibrator_actor.py (69 tests incl. 12 regression)
  • [x] 8. Extract shared serialization pattern: ensure_utc() extracted from 3 workflow files to src/workflows/serialization.py. Files: serialization.py, consensus.py, override.py, proposal.py
  • [x] 9. Extract shared parameter validation — 4 validators (validate_positive, validate_range, validate_normalized, validate_threshold_ordering) extracted to src/engine/validation.py, replacing ~24 inline checks across 5 engine modules. Files: validation.py, bayesian.py, drift.py, gates.py, utility.py, complexity.py Deferred: consolidating inline timezone checks in persistence/telemetry (different module boundary).
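
As a rough picture of what the four shared validators in item 9 might look like, here is a minimal sketch. Only the function names come from this document; the signatures, error messages, and the private helper `_require_finite_number` are assumptions. The explicit bool check reflects the bool-is-int findings recorded in the Changelog (True == 1.0 would otherwise pass a range check).

```python
import math


def _require_finite_number(name: str, value) -> None:
    # bool is a subclass of int, so True would otherwise be accepted as 1.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"{name} must be a number, got {type(value).__name__}")
    if not math.isfinite(value):
        raise ValueError(f"{name} must be finite, got {value!r}")


def validate_positive(name: str, value) -> None:
    _require_finite_number(name, value)
    if value <= 0:
        raise ValueError(f"{name} must be > 0, got {value!r}")


def validate_range(name: str, value, low: float, high: float) -> None:
    _require_finite_number(name, value)
    if not low <= value <= high:
        raise ValueError(f"{name} must be in [{low}, {high}], got {value!r}")


def validate_normalized(name: str, value) -> None:
    # Normalized parameters live in the closed interval [0, 1].
    validate_range(name, value, 0.0, 1.0)


def validate_threshold_ordering(low_name: str, low, high_name: str, high) -> None:
    _require_finite_number(low_name, low)
    _require_finite_number(high_name, high)
    if low >= high:
        raise ValueError(f"{low_name} ({low!r}) must be below {high_name} ({high!r})")
```

Centralizing the type and finiteness checks in one helper means each inline check in the engine modules can be replaced by a single call.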

Medium-term (v1.2.0)

  • [x] 10. Production deployment guide: docs/deployment/production-guide.md with Docker, K8s, AWS examples, Docker Compose, HSM integration, multi-region DR, observability setup, production checklist. Files: docs/deployment/production-guide.md, Dockerfile, docker-compose.yaml, monitoring/prometheus/prometheus.yml
  • [x] 11. Migration guide — Parameter recalibration via Calibrator actor, workflow state migration, schema upgrade path, version compatibility matrix. File: docs/deployment/migration-guide.md
  • [x] 12. Performance SLAs — Latency targets (p50 < 100ms, p95 < 500ms, p99 < 1s), throughput baselines, component latency budget, recorded benchmark results. File: docs/deployment/performance-slas.md
  • [x] 13. Shadow mode for KL divergence calibration: shadow_mode=True on pcw_decide() with ShadowResult dataclass, drift monitor integration, SHADOW_EVALUATION telemetry, Prometheus mode label, CLI --shadow flag, MCP shadow_mode param. 44 new tests. Unblocks issue #1. Files: src/integration/pcw_decide.py, src/telemetry/emitter.py, src/telemetry/prometheus_exporter.py, src/cli.py, src/aegis_governance/mcp_server.py, src/aegis_governance/__init__.py, monitoring/prometheus/alerting-rules.yaml, monitoring/prometheus/recording-rules.yaml, tests/integration/test_shadow_mode.py
  • [x] 14. HTTP telemetry sink: HTTPEventSink (per-event POST), BatchHTTPSink (batching + retry), http_sink() factory; stdlib-only (urllib.request); AegisConfig.telemetry_url; CLI --telemetry-url; MCP telemetry_url param; SDK re-exports. 45 new tests. Files: src/telemetry/emitter.py, src/config.py, src/cli.py, src/aegis_governance/mcp_server.py, src/aegis_governance/__init__.py, src/telemetry/__init__.py, tests/telemetry/test_http_sink.py
  • [x] 15. Drift detection → policy connection: DriftMonitor wired into production pcw_decide() path: CRITICAL → HALT (non-overridable), WARNING → advisory constraint, NORMAL → no change; _evaluate_drift_policy() + _apply_drift_overrides() helpers; DRIFT_POLICY_ENFORCED telemetry; create_drift_monitor() factory; CLI --drift-baseline; MCP drift_baseline_data; DriftAction/DriftResult re-exports; null-value filtering; drift-specific next_steps. 39 new tests. Files: src/integration/pcw_decide.py, src/config.py, src/cli.py, src/aegis_governance/mcp_server.py, src/aegis_governance/__init__.py, src/engine/__init__.py, src/telemetry/emitter.py, tests/integration/test_drift_enforcement.py, tests/integration/test_drift_regression.py
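
The drift-to-policy mapping in item 15 (CRITICAL → HALT, WARNING → advisory constraint, NORMAL → no change) can be illustrated with a small sketch. The `DriftStatus` enum, the dict-shaped decision, and the function name `apply_drift_policy` are hypothetical stand-ins; the real helpers named above (`_evaluate_drift_policy()`, `_apply_drift_overrides()`) may be structured quite differently.

```python
from enum import Enum


class DriftStatus(Enum):
    NORMAL = "normal"
    WARNING = "warning"
    CRITICAL = "critical"


def apply_drift_policy(decision: dict, drift: DriftStatus) -> dict:
    """Return a copy of `decision` with the drift policy applied."""
    result = dict(decision)
    # Copy the list so the caller's decision is never mutated.
    result["constraints"] = list(decision.get("constraints", []))
    if drift is DriftStatus.CRITICAL:
        # Critical drift halts the decision and cannot be overridden.
        result["status"] = "HALT"
        result["overridable"] = False
    elif drift is DriftStatus.WARNING:
        # Warning drift leaves the status alone and adds an advisory constraint.
        result["constraints"].append("drift warning: review before execution")
    return result
```

Returning a modified copy rather than mutating the input keeps the pre-drift decision available for telemetry and comparison.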

Long-term (v2.0.0) — AWS Infrastructure DEPLOYED

  • [x] 16. Agent integration guide & parameter cookbook — (a) docs/integration/parameter-reference.md — comprehensive parameter reference with derivation guidance, domain examples, boundary behavior for all inputs; (b) docs/integration/domain-templates.md — 4 worked examples (trading, CI/CD, content moderation, autonomous agent) with parameter mappings, JSON inputs, gate-by-gate walkthroughs; (c) MCP tool descriptions enriched with semantic context, minimum/maximum JSON Schema constraints, instructions field in initialize response. Files: docs/integration/parameter-reference.md, docs/integration/domain-templates.md, src/aegis_governance/mcp_server.py
  • [x] 17. GAP-L1 Deployment — Grafana deployment, production Prometheus, alert routing (Slack/PagerDuty). Issue #9. DEPLOYED: AegisMonitoringStack-dev — CloudWatch dashboard AEGIS-Governance-dev, SNS topic aegis-governance-alarms-dev, 4 alarms (Lambda errors, Lambda throttles, ECS unhealthy, billing). Grafana/Prometheus observability via ADOT sidecar on ECS Fargate.
  • [x] 18. GAP-L2 — OpenTelemetry distributed tracing. DEPLOYED: ADOT sidecar running on ECS Fargate (AegisMcpStack-dev), configured for Prometheus remote write to AMP. Full OTLP span correlation deferred to production workload phase.
  • [x] 19. Issues #2, #5, #7, #8, #9 Phase 2 — Infrastructure requirements met. DEPLOYED: DynamoDB aegis-governance-state-dev, S3 aegis-governance-audit-dev-164171672016, Secrets Manager aegis/signing-keys-dev, KMS encryption key, IAM auth on API Gateway, SNS alarm topic. Remaining: Slack/email subscription on SNS, multi-region replication, Locust load testing against live API Gateway endpoint.
  • [x] 20. Red-team fuzzing — Phase 2 adversarial testing. DEPLOYED: Lambda aegis-evaluate-proposal-dev + API Gateway https://yd1xm4ahcg.execute-api.us-west-2.amazonaws.com/dev/ available as live targets. ECS aegis-mcp-dev running. Fuzzing execution pending.
  • [x] 20a. MCP hardening (CoSAI/Red Hat) — Per research/003: ✅ (a) MCP audit logging — structured log for every tool invocation; ✅ (b) MCP rate limiting — token bucket/sliding window on mcp_server.py; ✅ (c) TLS enforcement — _validate_sink_url() enforces HTTPS on HTTPEventSink/BatchHTTPSink (with allow_insecure escape hatch for local dev), MCP _ALLOWED_TELEMETRY_SCHEMES restricted to {"https"}; ✅ (d) CoSAI MCP-T cross-reference in CLAUDE.md §11.4.1 (MCP-T1..T12 → AEGIS control matrix); ✅ (e) MCP tool schema signing — AMTSS Protocol v1 (src/crypto/schema_signer.py, ToolSchemaSigner, Ed25519, RFC 8785, _meta inline delivery, capabilities.experimental keyset); research: docs/research/004-mcp-schema-signing-design.md. All 5 sub-items complete.
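
The TLS enforcement in sub-item (c) amounts to a scheme check on the sink URL. The sketch below shows one way such a check could work, assuming a standalone function; it is not the project's `_validate_sink_url()`, and the name `validate_sink_url_sketch` and its error messages are illustrative. Only the HTTPS requirement and the `allow_insecure` escape hatch come from the item above.

```python
from urllib.parse import urlsplit


def validate_sink_url_sketch(url: str, allow_insecure: bool = False) -> str:
    """Accept only https URLs, or http as well when allow_insecure is set."""
    parts = urlsplit(url)
    allowed_schemes = {"https", "http"} if allow_insecure else {"https"}
    if parts.scheme not in allowed_schemes:
        raise ValueError(
            f"telemetry URL scheme {parts.scheme!r} is not allowed; "
            f"expected one of {sorted(allowed_schemes)}"
        )
    if not parts.hostname:
        raise ValueError("telemetry URL must include a host")
    return url
```

For local development one would call `validate_sink_url_sketch("http://localhost:9000/events", allow_insecure=True)`; every other caller gets HTTPS-only behavior by default.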

Infrastructure (ECS MCP)

  • [x] 23. MCP Streamable HTTP transport — Implemented MCP Streamable HTTP transport (2025-03-26 spec) using stdlib http.server (zero new deps). --transport http flag on aegis-mcp-server, POST /mcp (JSON-RPC single + batch), GET /health, origin validation, 1 MiB request limit. ECS stack updated: internal ALB (:80 → :8080), /health health check, ALB 5xx alarm. 50 new tests, 1923 total passing, 94.62% coverage. Deferred: SSE streaming, session management, resumability (all AEGIS tools are synchronous/stateless). Files: src/aegis_governance/mcp_server.py, tests/test_mcp_http_transport.py, Dockerfile, infra/stacks/ecs_stack.py, infra/stacks/monitoring_stack.py
  • [ ] 21. IP & licensing review — Engage IP attorney to evaluate: (a) patentability of the unified governance architecture (6-gate Bayesian framework + two-key crypto override + MCP agent integration + shadow calibration workflow); (b) trademark feasibility for "AEGIS Governance" in AI/software governance class; (c) license strategy — MIT is maximally permissive, evaluate BSL/SSPL/dual-license if commercial protection needed before public launch; (d) prior art landscape assessment. Decision required before broad public release or commercialization.
  • [x] 22. Commercialization strategy — Market research complete: AI governance market $300-850M (2025), 35-40% CAGR to $1.5B-$4.8B by 2033. Open core model recommended (free engine + paid enterprise features). Pricing tiers defined: Community (free), Professional ($2-5K/mo), Enterprise ($10-25K/mo), Financial Services ($25-50K/mo). File: docs/research/002-market-research-competitive-landscape.md
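
To make the shape of item 23 concrete, here is a minimal stdlib sketch of a server with POST /mcp (single and batch JSON-RPC), GET /health, and a 1 MiB request limit. It is an illustration under simplifying assumptions, not the code in mcp_server.py: the `ping`-only dispatcher, the helper names, and the error payloads are hypothetical, and origin validation and JSON-RPC notification handling are omitted.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

MAX_BODY_BYTES = 1024 * 1024  # 1 MiB request limit, per item 23


def _error(code, message, request_id=None):
    return {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}


def dispatch(request):
    """Hypothetical dispatcher: answers only a 'ping' method."""
    if not isinstance(request, dict):
        return _error(-32600, "Invalid Request")
    if request.get("method") == "ping":
        return {"jsonrpc": "2.0", "id": request.get("id"), "result": {}}
    return _error(-32601, "Method not found", request.get("id"))


def handle_body(raw):
    """Map a POST /mcp body (single request or batch) to (http_status, payload)."""
    if len(raw) > MAX_BODY_BYTES:
        return 413, _error(-32600, "Request too large")
    try:
        parsed = json.loads(raw)
    except ValueError:  # covers JSONDecodeError and invalid UTF-8
        return 400, _error(-32700, "Parse error")
    if isinstance(parsed, list) and parsed:
        return 200, [dispatch(item) for item in parsed]
    return 200, dispatch(parsed)


class McpHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/health":
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/mcp":
            self._send_json(404, {"error": "not found"})
            return
        raw_length = self.headers.get("Content-Length", "")
        if not raw_length.isdigit():
            self._send_json(400, _error(-32600, "Missing or invalid Content-Length"))
            return
        if int(raw_length) > MAX_BODY_BYTES:
            # Reject before reading so an oversized body is never buffered.
            self._send_json(413, _error(-32600, "Request too large"))
            return
        status, payload = handle_body(self.rfile.read(int(raw_length)))
        self._send_json(status, payload)


def make_server(port=8080):
    """Build (but do not start) a server bound to localhost."""
    return ThreadingHTTPServer(("127.0.0.1", port), McpHandler)
```

Calling `make_server().serve_forever()` would start listening on port 8080, the container port the ALB forwards to in item 23. Keeping `handle_body` free of socket code makes the JSON-RPC logic testable without a running server.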

Active Work (In Progress)

Gap Closure Sprint (PR #25) — Phase 1 Complete

| Task | Type | Status | Details |
| --- | --- | --- | --- |
| RBAC Enforcement (#7) | Security | ✅ Phase 1 | RBACEnforcer, YAMLRoleResolver, wired into override + pcw_decide |
| Override Audit (#5) | Observability | ✅ Phase 1 | Override telemetry events, AlertSink protocol, LogAlertSink + WebhookAlertSink |
| Performance Benchmarks (#2) | Testing | ✅ Phase 1 | 13 pytest-benchmark functions across 3 files |
| DR Verification (#8) | Reliability | ✅ Phase 1 | Crash recovery tests, hash chain integrity, health CLI |
| Monitoring Infrastructure (#9) | Observability | ✅ Phase 1 | MetricsServer, CLI metrics/health, Grafana + Prometheus configs |
| validate() Refactor (#24) | Tech Debt | ✅ Complete | CC=56 → CC~6 via data-driven _validate_section() |

Metrics (at PR #25 merge): 1997 tests, 94.47% coverage, 6 issues addressed + 13 rigor findings + 11 bug-hunt #5 + 6 bug-hunt #6 + 6 bug-hunt #8 + 8 bug-hunt #9 + 2 ultrathink + 5 QG-ultrathink-10 + shadow mode + HTTP sink + drift enforcement + MCP HTTP transport + H-1 SSRF fix + MCP hardening + TLS enforcement + parameter cookbook + QG56 ultrathink + QG57 ultrathink + BH10 (7 bugs)

AWS Deployment (ROADMAP Items 16-20) — DEPLOYED

| Stack | AWS Resource | Status | Details |
| --- | --- | --- | --- |
| AegisSharedStack-dev | DynamoDB, KMS, S3, Secrets Manager | ✅ DEPLOYED | aegis-governance-state-dev, aegis-governance-audit-dev-164171672016 |
| AegisLambdaStack-dev | Lambda + API Gateway | ✅ DEPLOYED | aegis-evaluate-proposal-dev, REST API with IAM auth |
| AegisMcpStack-dev | ECS Fargate | ✅ DEPLOYED | aegis-mcp-dev (1/1 running), keepalive loop (stdio transport) |
| AegisMonitoringStack-dev | CloudWatch + SNS | ✅ DEPLOYED | Dashboard, 4 alarms, SNS topic |

API Endpoint: https://yd1xm4ahcg.execute-api.us-west-2.amazonaws.com/dev/
Routes: POST /evaluate, POST /risk-check, GET /health

Completed (v3.26.0 — Rigor Protocol)

| Task | Type | Status | Details |
| --- | --- | --- | --- |
| Rigor Protocol Phase 1 | Bug Fix | ✅ Complete | v3.24.0: 7 fixes (M7, M8, L13, L16, L19, L31, M11 doc) |
| Rigor Protocol Phase 2 | Bug Fix | ✅ Complete | v3.25.0: 17 fixes, 25 regression tests |
| Rigor Protocol Phase 3 | Bug Fix | ✅ Complete | v3.26.0: 13 fixes (M14-M18, L33-L40) |
| Quality Gate Ultrathink | Hardening | ✅ Complete | M1-M4, L4: input validation, error handling |

Metrics: 1689 tests, 94.60% coverage, 103/103 bugs fixed (100% fix rate)

Previously Completed (v3.11.0-v3.13.0)

| Task | Type | Status | Details |
| --- | --- | --- | --- |
| Posterior Predictive (NEW-A) | Math Fix | ✅ Complete | ADR-006, compute_posterior_predictive() |
| Covariance Matrix (U1+) | Math Fix | ✅ Complete | cov_pv, cov_pr, cov_vr parameters |
| PERT Variance (P1*) | Documentation | ✅ Complete | Docstring warning ±22-40% error |
| Fail-Closed Default (I1) | Security Fix | ✅ Complete | lcb=float('-inf') |
| Input Validation | Robustness | ✅ Complete | ValueError for invalid std values |

See: Multi-Model Coherence Review for full analysis.
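
The fail-closed default from fix I1 (lcb=float('-inf')) can be read as follows: when no lower confidence bound is supplied, the comparison against any finite threshold fails, so the gate denies rather than passes. A minimal sketch, assuming a hypothetical gate function; the name `lcb_gate_passes` and its signature are not from the codebase.

```python
def lcb_gate_passes(threshold: float, lcb: float = float("-inf")) -> bool:
    """Pass only when the lower confidence bound clears the threshold."""
    # With the -inf default, a missing bound can never clear a finite threshold.
    # A NaN bound also fails, because every comparison with NaN is False.
    return lcb >= threshold
```

The design choice is that a missing or malformed input should produce a denial that a human can investigate, never an accidental approval.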

Recently Merged PRs

PR Title Merged Commit Status
#25 feat: gap closure sprint — RBAC, alerts, metrics, DR, benchmarks edc278c ✅ Merged
#23 feat: AEGIS v1.0 Governance Decision SDK cfa3783 ✅ Merged
#22 chore(claude): audit & regenerate CLAUDE.md v4.0.0 d114f07 ✅ Merged
#21 Default legacy algorithm on deserialization cd4572a ✅ Merged
#20 Fix telemetry timestamp validation for ISO strings d9ea971 ✅ Merged
#19 Add structured decision trace to pcw_decide 7df8bf7 ✅ Merged

Open Issues

| # | Title | Priority | Status | Labels | Milestone |
| --- | --- | --- | --- | --- | --- |
| #1 | GAP-DriftThreshold: Calibrate KL Divergence Threshold | MEDIUM | Open (needs production data) | GAP, team:risk | v1.2.0+ |
| #2 | GAP-PerfTest: Load-Test Guardrail Service (<500 ms p95) | MEDIUM | Phase 1 complete (needs Locust testing) | GAP, team:devops | v1.2.0 |
| #5 | GAP-OverrideAudit: Enhance Override Logging & Alerts | MEDIUM | Phase 1 complete (needs SNS subscriptions) | GAP, team:seceng | v1.2.0 |
| #7 | GAP-RBAC-Enforcement: Apply Role-Based Access Controls | MEDIUM | Phase 1 complete (needs IAM integration) | GAP, team:seceng | v1.2.0 |
| #8 | GAP-DR-Drill: Test Disaster Recovery Process | LOW | Phase 1 complete (needs multi-region replication) | GAP, team:devops | v2.0.0 |
| #9 | GAP-MonitoringDashboard: Implement Guardrail Monitoring Dashboard | LOW | Phase 1 complete (needs Grafana provisioning) | GAP, team:devops | v2.0.0 |

Recently Closed Issues

| # | Title | Closed | Notes |
| --- | --- | --- | --- |
| #24 | validate() CC=56 refactor | 2026-02-08 | Refactored to CC~6 via data-driven _validate_section() |
| #6 | GAP-TelemetryPrivacy: PII Redaction | 2026-01-31 | 12-field PII encryption via HybridKEM |

Sprint Update (PR #25): Issues #2, #5, #7, #8, #9 have Phase 1 code-side implementations complete. AWS infrastructure now DEPLOYED (4 CDK stacks live in us-west-2). Remaining work: Slack/email SNS subscriptions, multi-region replication, Locust load testing against live endpoints. Issues #6, #18, #24 closed. Milestone "Guardrail β-to-Prod" due date updated to 2026-06-30.


Release Roadmap

v1.0.1 (Patch - Pre-Release Bug Fixes) ✅ RELEASED

Released: 2026-01-31 (pre-release fixes merged before v1.0.0 SDK release)
Focus: Bug fixes merged

| Task | PR/Issue | Status |
| --- | --- | --- |
| Timestamp validation fix | PR #20 | ✅ Merged (d9ea971) |
| Signature algorithm preservation | PR #21 | ✅ Merged (cd4572a) |
| Structured decision trace | PR #19 | ✅ Merged (7df8bf7) |
| Fix broken documentation links | Issues #14-18 | ✅ Fixed (0f18c71) |

v1.0.0 (Major - SDK Release) ✅ RELEASED

Released: 2026-02-06 (PR #23, commit cfa3783)
Focus: Governance Decision SDK (public API, CLI, MCP server)

| Task | PR/Issue | Status |
| --- | --- | --- |
| AegisConfig frozen dataclass (src/config.py) | PR #23 | ✅ Complete |
| CLI entry point (src/cli.py, aegis command) | PR #23 | ✅ Complete |
| Public API facade (src/aegis_governance/__init__.py) | PR #23 | ✅ Complete |
| MCP server (src/aegis_governance/mcp_server.py) | PR #23 | ✅ Complete |
| 79 new tests (config, CLI, facade, MCP) | PR #23 | ✅ Complete |
| 4 runnable examples (examples/) | PR #23 | ✅ Complete |
| README rewrite (SDK positioning) | PR #23 | ✅ Complete |
| pyproject.toml [project.scripts] entries | PR #23 | ✅ Complete |

v1.1.0 (Minor - Enhancements)

Target: Q1 2026
Focus: Testing improvements and new features

| Task | Effort | Status | Notes |
| --- | --- | --- | --- |
| Mathematical coherence fixes | 8h | ✅ Complete | v3.11.0 (NEW-A, U1+, P1*, I1) |
| Boundary tests for all gates | 4h | ✅ Complete | 77 parametrized BVA tests (tests/test_gate_boundaries.py) |
| Integration test: Proposal → Execution | 8h | ✅ Complete | tests/integration/test_e2e_proposal_lifecycle.py (5 tests) |
| GOVERNANCE actor type | 6h | ✅ Complete | Override orchestration, compliance, emergency halt (41 tests incl. 6 regression) |
| CALIBRATOR actor type | 6h | ✅ Complete | Statistical threshold tuning, approval-gated workflow (69 tests incl. 12 regression) |

v1.2.0 (Minor - Features)

Target: Q2 2026
Focus: Production readiness

| Task | Effort | Status | Notes |
| --- | --- | --- | --- |
| Shadow mode deployment prerequisites | 16h | ✅ Complete | pcw_decide(shadow_mode=True), ShadowResult, 44 tests (ROADMAP Item 13) |
| HTTP telemetry sink | 4h | ✅ Complete | HTTPEventSink + BatchHTTPSink + http_sink() factory, config/CLI/MCP wiring, 41 tests (ROADMAP Item 14) |
| Configuration management system | 12h | ✅ Complete | AegisConfig in v1.0.0 (PR #23) |
| Drift detection → policy connection | 4h | ✅ Complete | DriftMonitor wired into pcw_decide; CRITICAL→HALT, WARNING→constraint (ROADMAP Item 15) |
| GAP-DriftThreshold (#1) | TBD | Unblocked | Shadow mode enables data collection; needs 30+ days of observed KL values |
| GAP-PerfTest (#2) | 8h | Phase 1 Complete | Benchmarks established; Locust load testing now possible against live API Gateway |
| GAP-OverrideAudit (#5) | 8h | Phase 1 Complete | Override telemetry + AlertSink protocol; Slack/email sinks pluggable |
| GAP-RBAC-Enforcement (#7) | 12h | Phase 1 Complete | RBACEnforcer + YAMLRoleResolver; IAM integration pluggable via RoleResolver protocol |
| MCP Streamable HTTP transport | 12h | ✅ Complete | --transport http on aegis-mcp-server; POST /mcp (JSON-RPC single + batch), /health endpoint; internal ALB; origin validation; SSRF protection; 50 new tests (ROADMAP Item 23) |
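
For readers unfamiliar with the quantity GAP-DriftThreshold (#1) needs to calibrate, the sketch below computes a KL divergence between a current and a baseline histogram over the same bins. It is a generic textbook formulation with additive smoothing, not the DriftMonitor implementation; the function name, the argument order (current relative to baseline), and the `eps` value are assumptions.

```python
import math
from typing import Sequence


def kl_divergence(current: Sequence[float], baseline: Sequence[float], eps: float = 1e-9) -> float:
    """KL(current || baseline) in nats for two histograms over the same bins."""
    if len(current) != len(baseline) or len(current) == 0:
        raise ValueError("histograms must be non-empty and share the same bins")
    # Additive smoothing keeps empty bins from producing log(0) or division by zero.
    p = [c + eps for c in current]
    q = [b + eps for b in baseline]
    p_total, q_total = sum(p), sum(q)
    return sum(
        (pi / p_total) * math.log((pi / p_total) / (qi / q_total))
        for pi, qi in zip(p, q)
    )
```

Identical histograms give a value of zero, and the value grows as the current distribution moves away from the baseline, which is why a threshold on it needs observed production values to calibrate.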

v2.0.0 (Major - Backlog)

Target: 2026 H2
Focus: Operational excellence

| Task | Effort | Status | Notes |
| --- | --- | --- | --- |
| GAP-L1 Phase 2-3: Grafana dashboards & alerting | 87h | ✅ DEPLOYED | Phases 1-3 code-complete; CloudWatch dashboard + SNS alarms deployed (AegisMonitoringStack-dev) |
| GAP-L2: OpenTelemetry distributed tracing | 16h | ✅ DEPLOYED (foundation) | ADOT sidecar on ECS Fargate; full OTLP span correlation deferred |
| GAP-DR-Drill (#8) | 16h | Phase 1 Complete | DR verification tests + health CLI; live drill now possible against deployed infrastructure |
| GAP-MonitoringDashboard (#9) | 16h | ✅ DEPLOYED | CloudWatch AEGIS-Governance-dev dashboard + Grafana configs available |
| Phase 2 red-team fuzzing | 20h | Infrastructure Ready | Lambda + API Gateway + ECS deployed as live targets |
| Parameter freezing mechanism | 8h | Backlog | Governance compliance |

GAP Status Summary

Completed GAPs

| GAP | Description | Completion | Implementation |
| --- | --- | --- | --- |
| GAP-C1 | Decision Logic Divergence | 100% | src/engine/gates.py |
| GAP-C2 | Override Mechanism | 100% | src/workflows/override.py |
| GAP-C3 | AFABridge Gate Integration | 100% | src/integration/afa_bridge.py |
| GAP-H1 | Parameter Naming | 100% | schema/interface-contract.yaml |
| GAP-H2 | Telemetry Schema | 100% | src/telemetry/schema.py |
| GAP-H3 | RBAC Reconciliation | 100% | schema/rbac-definitions.yaml |
| GAP-M1 | Feedback Timing | 100% | src/engine/drift.py |
| GAP-M2 | Actor Types | 100% | src/actors/ |
| GAP-M3 | Workflow Persistence | 100% | src/workflows/persistence/ |
| GAP-M4 | Signature Format | 100% | src/crypto/ |
| GAP-Q1 | Post-Quantum Signatures | 100% | src/crypto/mldsa.py, hybrid_provider.py |
| GAP-Q2 | Post-Quantum Encryption | 100% | src/crypto/mlkem.py, hybrid_kem.py |

In-Progress GAPs

| GAP | Description | Completion | Phase |
| --- | --- | --- | --- |
| GAP-L1 | Unified Monitoring Dashboard | 100% code + deployed | Phases 1-3 code-complete; CloudWatch + SNS deployed; Grafana available via configs |

Planned GAPs

| GAP | Description | Completion | Target |
| --- | --- | --- | --- |
| GAP-L2 | Cross-Component Tracing | Foundation deployed | ADOT sidecar running; full OTLP deferred to v2.0.0 |

Production Readiness Metrics

| Metric | Current | Target | Status |
| --- | --- | --- | --- |
| Test Coverage | ~94.9% | 90% | Exceeds |
| Tests Passing | 3041 (2 skipped) | All | Pass |
| Security Vulnerabilities | 0 | 0 | Pass |
| CI/CD | All green | All green | Pass |
| Documentation Accuracy | 99.6% | 95% | Exceeds |
| AWS Deployment | 4/4 stacks deployed | All stacks | Pass |

Architecture

Decision Records

  • ADR Index - All Architecture Decision Records
  • ADR-001 - Workflow Persistence
  • ADR-002 - BIP-322 Signatures
  • ADR-003 - Post-Quantum Signatures
  • ADR-004 - Post-Quantum Encryption
  • ADR-005 - KL Divergence Threshold Calibration
  • ADR-006 - Posterior Predictive for Bayesian Gates
  • ADR-007 - AWS Deployment Architecture

Integration Guides

  • Parameter Reference - Complete parameter reference with derivation guidance
  • Domain Templates - Worked examples for 4 domains (trading, CI/CD, content moderation, autonomous agents)

Implementation

Research

Analysis


Changelog

Version Date Changes
1.93.0 2026-02-25 Bug Hunt #45 (Hybrid): 6 fixes (1 Codex, 2M, 2L + 1 ultrathink), 31 regression tests; BH45-Codex-M1 proposal metadata deep copy, BH45-M1 MCP risk_score eager eval transport parity, BH45-M2 BayesianPosterior update_prior validation, BH45-T1 update_prior bool guard, BH45-L1 PipelineConfig int validation, BH45-L2 PipelineConfig enum validation; 3029 tests, ~94.8% coverage
1.92.0 2026-02-25 Scoring Guide MCP Tool + Advisor v2: aegis_get_scoring_guide with 5-domain derivation guidance, Advisor rewrite with domain funnel + factual rubric + real API calls, demo API key provisioned; 2998 tests, ~94.8% coverage
1.91.0 2026-02-24 SaaS Commercialization Sprint: API key auth + usage plans (CDK), tenant context extraction (Lambda), customer provisioning script, OpenAPI 3.1 spec, mkdocs-material docs site (10 pages), PyPI trusted publishing, SECURITY.md, CHANGELOG.md; pyproject.toml v1.1.0; 2967 tests, ~94.8% coverage
1.90.0 2026-02-24 Transport Parity Fix: 15 gaps closed across CLI/MCP/Lambda (GAP 2-4 CRITICAL: MCP missing bool flags, GAP 1 metadata, GAP 6-8 inputSchema + Lambda telemetry, GAP 12 strict impact, GAP 15 UUID session, GAP 17 SSRF, GAP 18-22 output fields); new telemetry/url_validation.py shared module; 2958 tests, ~94.8% coverage
1.89.0 2026-02-23 Bug Hunt #44 (Hybrid): 4 fixes (1 Codex, 2M, 1L), 15 regression tests; BH44-Codex-M1 schema_signer chain state corruption, BH44-M1 calibrator utility_threshold constraint, BH44-M2 proposer TypeError catch, BH44-L1 pcw_decide drift alias; 2923 tests, ~94.8% coverage
1.88.0 2026-02-23 Bug Hunt #43 (Hybrid): 11 fixes (2 Codex, 5M, 4L) + 1 ultrathink fix, 31 regression tests; BH43-Codex-M1 analyst gate exception handling, BH43-Codex-M2 analyst quality_subscores TypeError, BH43-M1 CLI null subscores crash, BH43-M2 ComplexityBreakdown bool fields, BH43-M3 value_variance negative floor, BH43-M4+M5 pipeline ingest() aliasing, BH43-L1 CLI metric alias null, BH43-L2 utility value_low_conf NaN, BH43-L3 utility covariance NaN, BH43-L4 ProposalWorkflow from_dict cls(), QG-T1 from_dict evaluation_result; 2908 tests, ~94.8% coverage
1.87.0 2026-02-23 Bug Hunt #42 (Hybrid): 13 fixes (3 Codex, 6M, 2L + 2 ultrathink), 29 regression tests; BH42-M1 complexity mutable default, BH42-M2 calibrator novelty_k positive, BH42-M3 prometheus NaN latency, BH42-M4 prometheus NaN KL divergence, BH42-M5 emitter correlation_id or-falsy, BH42-M6 lambda shadow_mode bool, BH42-L1 pcw_decide posterior or-falsy, BH42-L2 afa_bridge posterior or-falsy, BH42-Codex-M1 auth falsy fail-open, BH42-Codex-M2 allow_abstain bool, BH42-Codex-L1 checkpoint collision retry, QG-T1 MCP shadow_mode parity, QG-T2 analyst confidence or-falsy; 2877 tests, 94.81% coverage
1.86.0 2026-02-22 Bug Hunt #41 (Hybrid): 7 bugs (1 Codex + 4M, 2L), 33 regression tests; BH41-M1 analyst None subscores saw_non_null (analyst.py), BH41-M2 validate_range check_nan default False→True (validation.py), BH41-M3 schema_signer _prev_digests atomic commit (schema_signer.py), BH41-M4 consensus DEFER excluded from required_missing (consensus.py), BH41-L1 calibrator list_proposals lock-snapshot race (calibrator.py), BH41-L2 emitter correlation_id or-coercion (emitter.py), BH41-Codex complexity_floor bool guard (complexity.py); QG verify: ruff B017 narrowed, black format, mypy attr-defined; 2848 tests, 94.82% coverage
1.85.0 2026-02-22 Bug Hunt #40 (Hybrid): 9 bugs (4M, 5L), 40 regression tests; BH40-M1 quality_subscores empty-list bypass (Codex+Claude), BH40-M2 BatchHTTPSink.stop() lock-before-join race, BH40-M3 validate_normalized bool guard missing, BH40-M4 _parse_mcp_rate_limit string-fractional truncation, BH40-L1 GateEvaluator negative threshold values disable gates, BH40-L2 _parse_kl_drift_dict string-fractional window_days, BH40-L3 stdio size guard char vs byte count, BH40-L4 get_decision_history truthy agent_id bypass, BH40-L5 DEKRotator readers without lock; 2815 tests, 94.78% coverage
1.84.0 2026-02-21 Bug Hunt #39: 13 bugs (1H, 6M, 6L), 54 regression tests; BH39-H1 chain root forgery, BH39-M1/M3 lock-before-join, BH39-M2 TOCTOU, BH39-M4 inf trigger factor, BH39-M5 NaN utility, BH39-M6/L5 float truncation, BH39-L1 from_dict cls.new, BH39-L2 novelty_k=0, BH39-L3 JSON-RPC §4.1, BH39-L4 bip322 length, BH39-Codex-2 memory_sink maxlen=0; 2775 tests, 94.77% coverage
1.83.0 2026-02-21 QG-UT1: GateEvaluator(trigger_confidence_prob=True) silently accepted via validate_range inclusive upper bound (True==1.0); explicit bool guard added; 2721 tests, 94.78% coverage
1.82.0 2026-02-21 Bug Hunt #38 (Hybrid): 6 bugs (1H, 4M, 1L), 35 regression tests; BH38-H1 key_store.py Python 3.10+ async-with SyntaxError on 3.9 (+ fmt:off guard), BH38-M1 UtilityCalculator bool-is-int bypass (phi_S/phi_D/gamma/kappa/migration_budget), BH38-M2 GateEvaluator bool-is-int bypass (trigger factors + thresholds), BH38-M3 CalibrationProposal + _validate_gate_param bool bypass, BH38-M4 MetricsServer.stop() lock held during join, BH38-L1 BatchHTTPSink non-int params (Codex); 2720 tests, 94.78% coverage
1.81.0 2026-02-20 Bug Hunt #37: 6 bugs (3M, 3L) -- BayesianPosterior NaN, emergency_halt audit, calibrator novelty_N0, PipelineConfig float, ThreePointEstimate bool, DriftMonitor window_days; 2685 tests, 94.76% coverage
1.80.0 2026-02-20 Bug Hunt #36 (Hybrid): 6 bugs (4M, 2L), 17 regression tests; QG Ultrathink: 2 findings (2L); BH36-M1 Lambda or pattern falsy bypass (Codex), BH36-M2 mark_completed non-enum state injection, BH36-M3 CLI or estimated_impact, BH36-M4 MCP or estimated_impact, BH36-L1 complexity_tax bool guard, BH36-L2 proposal_summary or pattern; 2659 tests, 94.74% coverage
1.79.0 2026-02-20 Bug Hunt #35 (Hybrid): 6 bugs (4M, 2L), 22 regression tests; QG Ultrathink: 4 findings (4L), 19 regression tests; BH35-M1 check_and_mark_expired terminal state downgrade (Codex), BH35-M2 RBAC NaN signer_count bypass, BH35-M3 PipelineConfig flush_interval no validation, BH35-M4 BatchHTTPSink flush_interval no validation, BH35-L1 PipelineConfig bool-is-int, BH35-L2 DEKCache ttl_seconds no validation; 2642 tests, 94.79% coverage
1.78.0 2026-02-20 Bug Hunt #34 (Hybrid): 5 bugs (4M, 1L), 14 regression tests; BH34-M1 DriftMonitor num_bins float accepted, BH34-M2 CLI cmd_evaluate missing TypeError catch, BH34-M3 DualSignatureValidator expiration_hours upper bound, BH34-M4 TelemetryPipeline worker_loop inconsistent state, BH34-L1 AegisConfig.from_dict() telemetry_url type coercion; 2601 tests, 94.79% coverage
1.77.0 2026-02-20 Bug Hunt #33 (Hybrid): 5 bugs (5M), 15 regression tests; BH33-M1 config._parse_flat_numeric non-numeric type silently accepted, BH33-M2 config._from_raw_dict DIRECT param non-numeric type, BH33-M3 DriftMonitor.evaluate() unfiltered window, BH33-M4 OverrideWorkflow failed_gates no defensive copy, BH33-M5 mark_completed() state_data desync (Codex); 2587 tests, 94.80% coverage
1.76.0 2026-02-20 Bug Hunt #32 (Hybrid): 3 bugs (2M, 1L), 20 regression tests; BH32-M1 DriftMonitor constructor negative/Inf threshold parity, BH32-M2 calibrator negative threshold governance bypass, BH32-L1 KLDriftConfig window_days validation; 2572 tests, 94.80% coverage
1.75.0 2026-02-20 Bug Hunt #31 (Hybrid) + QG73 Ultrathink: 4 bugs (1M, 3L) + 2 QG73 findings (1M, 1L), 22 regression tests; BH31-M1 MCP caller_id non-string guard, BH31-L1 Lambda threshold dict.get() null, BH31-L2 ConsensusConfig fractional minimum, BH31-L3 DualSignatureValidator fractional minimum; QG73-L1 CLI agent_id transport parity, QG73-M1 AFABridge timeout fractional minimum; 2552 tests, 94.80% coverage
1.74.0 2026-02-19 Bug Hunt #30 (Hybrid) + QG72 Ultrathink: 5 bugs (2M, 3L) + 4 QG72 findings (2M, 2L), 12 regression tests; BH30 dict.get() null gotcha transport parity (CLI/MCP/Lambda), AFABridge float limit, pipeline config mutation; QG72 remaining null gaps; 2530 tests, 94.76% coverage
1.73.0 2026-02-18 Bug Hunt #29 (Hybrid) + QG71 Ultrathink: 8 bugs (3M, 5L) + 3 QG71 findings (3L), 26 regression tests; BH29-M1 estimated_impact case bypass, BH29-M2 executor TOCTOU, BH29-M3 calibrator novelty_k zero; QG71 MCP null guards + drain broadening; 2518 tests, 94.76% coverage
1.72.0 2026-02-18 Bug Hunt #28 (Hybrid) + QG70 Ultrathink: 5 bugs (3M, 2L) + 3 QG70 findings (3L), 22 regression tests; BH28-M1 consensus quorum revert, BH28-M2 governance expired override eviction, BH28-M3 CLI risk alias priority; QG70 config bool coercion + drift baseline Inf; 2492 tests, 94.73% coverage
1.71.0 2026-02-17 Quality-Gate QG69 Ultrathink: 1 finding (1M), 7 regression tests; QG69-M1 MCP+CLI drift_baseline_data isfinite transport parity; 2470 tests, 94.73% coverage
1.70.1 2026-02-17 Bug Hunt #27 (Hybrid): 4 bugs (3M, 1L), 13 regression tests; BH27-M1 (resume_or_create ID propagation), BH27-M2 (_from_raw_dict string-to-float), BH27-M3 (Lambda/MCP null bypass), BH27-L4 (Lambda drift_baseline isfinite); 2470 tests, 94.73% coverage
1.70.0 2026-02-17 Scaffold Adoption: Engineering Standards ai_scaffold_package v2.1.1 (50 files); ai/ (8 artifacts), docs/compliance/ (7 runbooks), tools/ci/ (9 validators), GitHub (templates, workflows, 15 labels), Makefile, .pre-commit-config; 100% placeholder elimination; CLAUDE.md v4.5.33; 2448 tests, 94.83% coverage
1.69.0 2026-02-16 Bug Hunt #26 (Hybrid): 4 bugs (3M, 1L), 18 regression tests; BH26-M1 (validate_positive bool-is-int — Codex), BH26-M2 (bayesian update_prior variance overflow), BH26-M3 (RBAC bool constraint None fail-open), BH26-L1 (complexity delta NaN/Inf propagation); 0 deferred bugs; 2448 tests, 94.83% coverage
1.68.0 2026-02-16 Bug Hunt #25 (Hybrid): 6 bugs (3M, 3L), 18 regression tests; BH25-M1 (analyst utility components null), BH25-M2 (CLI risk_score transport parity), BH25-M3 (drift histogram large-magnitude), BH25-L1 (analyst risk_delta/profit_delta null — Codex), BH25-L2 (bayesian overflow), BH25-L3 (config string NaN); PLR0912: _parse_flat_numeric() helper; 0 deferred bugs; 2430 tests, 94.81% coverage
1.67.0 2026-02-16 Bug Hunt #24 (Hybrid) + QG68 Ultrathink: 10 bugs (4M, 6L), 26 regression tests; BH24-M1 (analyst _evaluate_utility_gate null guard), BH24-M2 (Lambda _float bool), BH24-M3 (MCP _float_arg bool), BH24-M4 (CLI _parse_proposal bool), BH24-L1 (drift evaluate baseline bool), BH24-L2 (override add_signature bool), BH24-L3 (MCP risk_check threshold null), BH24-L4 (config mcp_rate_limit bool), BH24-L5 (pcw_decide quality_subscores null), BH24-L6 (analyst profit_baseline null); QG68: analyst utility null guards; 0 deferred bugs; 2412 tests, 94.80% coverage
1.66.0 2026-02-16 AMTSS Protocol v1 — MCP Tool Schema Signing: src/crypto/schema_signer.py (ToolSchemaSigner, SigningKeyPair, compute_tool_digest), Ed25519 per-tool + manifest dual signing, RFC 8785 canonicalization, _meta inline delivery, capabilities.experimental keyset; MCP server integration (tools/list proofs + initialize keyset); research doc 004-mcp-schema-signing-design.md; Claude-GPT dialogue (GPT 5.2 Pro xhigh); QG ultrathink: 5+4 findings fixed (manifest duplicate-name bypass, _meta stripping, statement type validation, digest chain, strict base64url + QG67: null sig crash, NaN canonicalization, manifest revision increment, signing error log level); ROADMAP 20a(e) complete — all 5 MCP hardening sub-items done; 2386 tests, 94.74% coverage
1.65.0 2026-02-16 CoSAI MCP-T Cross-Reference: CLAUDE.md §11.4.1 MCP-T1..T12 threat mapping (9 STRONG, 2 MODERATE, 1 PARTIAL); ROADMAP 20a(d) complete; docs-only; 2304 tests, 94.63% coverage
1.64.0 2026-02-16 Bug Hunt #23 (Hybrid): 7 bugs (3M, 4L), 29 regression tests; BH23-M1 (CLI drift baseline bool), BH23-M2 (CLI quality_subscores empty list), BH23-M3 (Calibrator eviction race), BH23-L1 (CLI subscores type check), BH23-L2 (BayesianPosterior prior_mean NaN/Inf), BH23-L3 (ConsensusWorkflow check_timeout), BH23-L4 (KeyStore audit lock TOCTOU); 0 deferred bugs; 2304 tests, 94.63% coverage
1.63.0 2026-02-15 Quality-Gate QG66 Ultrathink: 2 findings (2L), 2 regression tests; UT-1 MCP empty subscores parity, UT-2 MCP non-numeric string crash; 2275 tests, 94.63% coverage
1.62.0 2026-02-15 Bug Hunt #22 (Hybrid): 8 bugs (4M, 4L), 20 regression tests; BH22-M1 (override reject() wall-clock), BH22-M2 (MCP quality_subscores extraction), BH22-M3 (DriftMonitor update_thresholds validation), BH22-M4 (persistence re-completion guard), BH22-L1 (drift_baseline_data bool guard), BH22-L2 (governance override eviction), BH22-L3 (afa_bridge string-as-iterable), BH22-L4 (analyst null subscores); 0 deferred bugs; 2273 tests, 94.64% coverage
1.61.0 2026-02-15 Bug Hunt #21 (Hybrid): 8 bugs (3M, 5L), 16 regression tests; BH21-M1 (KLDriftConfig __post_init__), BH21-M2 (Lambda subscores bool), BH21-M3 (AFABridge subscores validation), BH21-L1 (DriftMonitor window_days), BH21-L2 (Calibrator unbounded proposals), BH21-L3 (shadow eval key collision), BH21-L4 (drift status label cardinality), BH21-L5 (MCP 405 Allow header); 0 deferred bugs; 2273 tests, 94.64% coverage
1.60.0 2026-02-15 Bug Hunt #20 (Hybrid) + QG65 Ultrathink: 9 bugs (7M, 2L) + 5 QG65 fixes; 22 regression tests total; durable non-dict crash, override mutable sharing, base64 strict (override+crypto+lambda), consensus voter aliasing + timeout overflow, pcw_decide trace crash, encryption base64, config window_days, transport bool guards, CLI risk/subscore bool guards; 2236 tests, 94.68% coverage
1.59.0 2026-02-15 Rigor: Resolve All Deferred Bugs — fixed BH16-L5 (WorkflowTransition.verify_hash standalone false negatives, added previous_hash column), closed BH15-L6 (Lambda telemetry by-design); 8 regression tests; 0 deferred bugs remaining; 2214 tests, 94.68% coverage
1.58.0 2026-02-14 Bug Hunt #19 (Hybrid): 5 bugs (2M, 3L), 12 regression tests; proposal.py from_dict mutable aliasing, override key rotation TOCTOU, afa_bridge bool guard + non-boolean execution flags + null authorization crash; 2206 tests, 94.68% coverage
1.57.0 2026-02-14 Bug Hunt #18 (Hybrid): 7 bugs (3M, 4L), 25 regression tests; lambda_handler/cli non-boolean control flags, config flat key NaN/Inf validation, bayesian ddof bool, consensus config bool guards, afa_bridge timeout_hours bool; 2194 tests, 94.61% coverage
1.56.0 2026-02-14 Bug Hunt #17 (Hybrid): 6 bugs (1M, 5L), 13 regression tests; afa_bridge risk_check transport parity, config NaN/Inf validation, ensure_utc timezone conversion, BatchHTTPSink negative max_retries, governance emergency_halt; 2169 tests, 94.60% coverage
1.55.0 2026-02-14 Quality Gate #62 (Ultrathink): 6 findings (1M, 5L), 11 regression tests; afa_bridge isfinite, config kl_drift NaN validation, lambda null subscores; 2156 tests, 94.58% coverage
1.54.0 2026-02-14 Bug Hunt #16: 9 bugs (4M, 5L), 22 regression tests; 1 deferred (BH16-L5); 2145 tests, 94.56% coverage
1.53.0 2026-02-14 Bug Hunt #15 (Hybrid): 8 bugs (2M, 6L), 22 regression tests + Quality Gate #61 (Ultrathink): 7 findings (4M, 3L), 5 fixed + 8 regression tests; CLI observation_values sanitization; 2123 tests, 94.53% coverage
1.52.0 2026-02-13 Bug Hunt #14 (Hybrid): 3 bugs (3M) — ConsensusConfig bool timeout_hours, DualSignatureValidator expiration_hours validation, Lambda quality_subscores isfinite parity; 2101 tests, 94.54% coverage
1.51.0 2026-02-13 Rigor Close Deferrals v3: closed all 5 deferred bugs (BH12-L2 fixed + QG60-6/7/8/9 documented/accepted-risk); 0 deferred remaining; 2091 tests, 94.52% coverage
1.50.0 2026-02-13 Bug Hunt #13: 7 bugs (4M, 3L), 16 regression tests
1.49.0 2026-02-13 Quality-Gate Ultrathink (QG60): 5 fixes — validate_positive Inf FAIL-OPEN, UtilityCalculator gamma/kappa/migration_budget Inf, MCP POST 404 body drain, MCP 413 connection close, ThreePointEstimate Inf; SDK facade Calibrator/Governance exports; 2072 tests, 94.50% coverage
1.48.0 2026-02-12 Bug Hunt #12 (Hybrid): 10 bugs (1H, 7M, 2L) — GateEvaluator NaN governance lockout, complexity analyze NaN, Lambda _float NaN/Inf parity, risk_check NaN, ExecutionPlan NaN timeout, CalibrationProposal data_window, config null params, proposal to_dict mutable leak; 2053 tests, 94.52% coverage
1.47.0 2026-02-12 Quality-Gate Ultrathink (QG59): 12 fixes from 21 findings (8M, 4L) — NaN trigger_factor bypass, trigger_confidence_prob fail-OPEN, YAML null crash, CalibrationProposal NaN/Inf, analyst coerce NaN strings, proposer PERT NaN/Inf, MCP _float_arg NaN/Inf, emitter dropped-event semantics; 2031 tests, 94.52% coverage
1.46.0 2026-02-12 Bug Hunt #11 (Hybrid): 10 bugs (8M, 2L) — CLI null subscores/phase, calibrator capability check, governance halt override cancel, consensus NaN timeout, MCP POST /health body, pipeline PII encryptor bypass, BatchHTTPSink batch_size=0, utility lcb_alpha NaN, stdio strip order; 2009 tests, 94.49% coverage
1.45.0 2026-02-12 Quality-Gate Ultrathink (QG58): Docs sync — test metrics updated to 1997 tests, 94.47% coverage across all documentation files
1.44.0 2026-02-12 Bug Hunt #10 + QG57: validate_positive/validate_threshold_ordering NaN guards, stdio MCP size limit, CLI null-coalesce, Lambda phase type guard + drift baseline guard, governance emergency_halt lock atomicity, MCP drift baseline guard; 1997 tests, 94.47% coverage
1.43.0 2026-02-12 Quality-Gate Ultrathink (QG56): stdio batch array support, WebhookAlertSink TLS enforcement, URL whitespace stripping, mcp_rate_limit negative clamp; 1978 tests, 94.47% coverage
1.42.0 2026-02-12 ROADMAP Items 16 + 20a(c): TLS enforcement on HTTPEventSink/BatchHTTPSink (_validate_sink_url() + allow_insecure escape hatch), MCP _ALLOWED_TELEMETRY_SCHEMES restricted to {"https"}, parameter reference guide, domain integration templates (4 domains), MCP tool description enrichment with instructions field + JSON Schema min/max constraints; closes CoSAI MCP-T7 gap (G2); 1964 tests, 94.47% coverage
1.41.0 2026-02-12 MCP Hardening Phase 1 (ROADMAP Item 20a): Token bucket rate limiter + structured audit logging; closes CoSAI MCP-T10 and MCP-T12 gaps; 1948 tests, 94.59% coverage
1.40.0 2026-02-11 H-1 SSRF hex/decimal IP bypass fix: resolve-then-validate via socket.getaddrinfo(), _is_forbidden_ip() uses not is_global (blocks CGNAT 100.64/10); M-3 Slowloris timeout (30s per-connection); 14 regression tests; 1923 tests, 94.62% coverage
1.39.0 2026-02-11 Completed ROADMAP Item 23 (MCP Streamable HTTP transport): stdlib http.server implementation (zero new deps), POST /mcp (JSON-RPC single + batch), origin validation, internal ALB; deferred SSE/sessions/resumability; 50 new tests; 1909 tests, 94.63% coverage
1.38.0 2026-02-11 Added ROADMAP Item 23: MCP Streamable HTTP transport — MCP spec (2025-03-26) already standardizes network transport; updated KNOWN_ISSUES.md with resolution path and spec references; added to v1.2.0 release roadmap and Next Steps checklist
1.37.0 2026-02-11 Post-deployment security hardening: 17 ultrathink findings fixed (3H, 11M, 3L) — CORS restriction, script injection fixes (env vars + heredoc delimiters), error message sanitization, IAM least-privilege (Scan/PutObjectAcl removed), ADOT pinned v0.41.2, CDK approval broadening, billing alarm all stages, deploy test gate; 1859 tests, 94.54% coverage
1.36.0 2026-02-10 AWS Deployment Complete: All 4 CDK stacks deployed to us-west-2 (AegisSharedStack-dev, AegisLambdaStack-dev, AegisMcpStack-dev, AegisMonitoringStack-dev); Items 17-20 updated to DEPLOYED; 7 deployment bugs fixed (cdk.json context, pyproject py-modules, Dockerfile pins, ECS ALB removal, Lambda cyclic refs, CloudWatch math, CDK protocol); added AWS Deployment section to Active Work; added ADR-007 to Quick Links; 1859 tests, 94.55% coverage
1.35.0 2026-02-10 AWS Deployment Infrastructure (ROADMAP Items 16-20): Hybrid Lambda+ECS CDK stacks, src/lambda_handler.py, Dockerfile.lambda, aegis-deploy.yml, aegis-gate action, ADR-007; ultrathink hardening (U-1 null subscores, U-2 injection fix); 42 new tests; 1859 tests, 94.55% coverage
1.34.0 2026-02-10 ROADMAP Item 15: Drift detection → policy connection — DriftMonitor wired into production pcw_decide() path (CRITICAL→HALT, WARNING→constraint, NORMAL→no change); _evaluate_drift_policy() + _apply_drift_overrides() helpers; DRIFT_POLICY_ENFORCED telemetry; CLI --drift-baseline; MCP drift_baseline_data; SDK re-exports; 39 new tests; 1817 tests, 94.56% coverage
1.33.0 2026-02-09 Research 003: MCP Security Ecosystem Review — CoSAI MCP-T1..T12 taxonomy (12 threat categories, ~40 threats, 11 control families) + Red Hat enterprise MCP architecture (4-stage progressive promotion) mapped to AEGIS controls; identified 6 gaps (MCP audit logging, rate limiting, TLS enforcement, tool schema signing, shadow server detection, SPIFFE identity); added ROADMAP Item 20a (MCP hardening)
1.32.0 2026-02-09 ROADMAP Item 22: Market research & competitive landscape — AI governance market sizing ($300-850M → $1.5-4.8B), 7 direct + 6 adjacent competitors profiled, unique positioning matrix, regulatory timeline (EU AI Act Aug 2026), open core pricing model, go-to-market strategy
1.31.0 2026-02-09 ROADMAP Item 14: HTTP telemetry sink — HTTPEventSink (per-event POST), BatchHTTPSink (batching + retry + background flush), http_sink() factory; AegisConfig.telemetry_url; CLI --telemetry-url; MCP telemetry_url param; SDK re-exports; stdlib-only (urllib.request); 45 new tests; 1778 tests, 94.44% coverage
1.30.0 2026-02-09 ROADMAP Item 13: Shadow mode for KL divergence calibration — shadow_mode keyword param on pcw_decide(), ShadowResult dataclass, DriftMonitor/TelemetryEmitter integration, Prometheus mode label + shadow counter, CLI --shadow flag, MCP shadow_mode param, SDK re-export, alerting/recording rule filters; 44 new tests; 1733 tests, 94.48% coverage
1.29.0 2026-02-09 ROADMAP Items 10-12: Production deployment guide (docs/deployment/production-guide.md), migration guide (docs/deployment/migration-guide.md), performance SLAs with recorded benchmarks (docs/deployment/performance-slas.md); Dockerfile + docker-compose.yaml + Prometheus scrape config; no code changes
1.28.0 2026-02-09 ROADMAP Item 7: CALIBRATOR actor type — statistical threshold tuning with drift recalibration, Bayesian prior update, gate parameter proposals, approval-gated application, recognized parameter whitelist, telemetry emission; ultrathink-hardened (U-1..U-5); ActorRole.CALIBRATOR + ActorCapabilities; 69 new tests (12 regression); 1689 tests, 94.60% coverage
1.27.0 2026-02-09 ROADMAP Item 6: GOVERNANCE actor type — override orchestration (initiate/sign/approve/reject/expire), compliance checking (complexity gate non-overridable), emergency halt; ultrathink-hardened (halt guards, fail-closed compliance, thread safety); ActorRole.GOVERNANCE + ActorCapabilities; 41 new tests; 1620 tests, 94.36% coverage
1.26.0 2026-02-08 Docs-Sync Audit: Fixed GAP-L1 status (66%→code-complete), repo-structure tree (6 files added), telemetry schema v2.0→v2.1.0, stale counts, TD-2/TD-3 resolved, gap-analysis changelog gaps, ActorBase→Actor, duplicate sections merged
1.25.0 2026-02-08 ROADMAP Items 8 & 9: DRY extraction — ensure_utc() shared across 3 workflows, 4 validation helpers shared across 5 engine modules; 27 new tests; deferred: persistence/telemetry timezone consolidation; 1579 tests, 94.31% coverage
1.24.0 2026-02-08 ROADMAP Item 5: 77 boundary tests for all 6 gates + drift detector via @pytest.mark.parametrize; verifies comparison operators at exact thresholds; 1552 tests, 94.27% coverage
1.23.0 2026-02-08 ROADMAP Items 2-4: docs version sync committed, safety 2.3→3.x upgrade, broad exception catch documentation (15 sites, 8 files); 1475 tests, 94.21% coverage
1.22.0 2026-02-08 Dependency fix: scipy/prometheus_client moved to dedicated engine/telemetry optional groups with graceful degradation; 4 regression tests; 1475 tests, 94.21% coverage
1.21.0 2026-02-08 Added "Next Steps (Ordered Checklist)" section — 19 prioritized items from Discovery Analysis 2026-02-08; single place to find what's next
1.20.0 2026-02-08 Quality-Gate Ultrathink #10: 5 MEDIUM bugs fixed (Bayesian overflow, pipeline validator exception, executor rollback retry); 7 regression tests; 1475 tests, 94.21% coverage
1.19.0 2026-02-08 Rigor Close Deferrals v2: 4 bugs fixed + 3 closed as intentional; 6 regression tests; 1466 tests, 94.22% coverage
1.18.0 2026-02-08 Bug-Hunt #9 + Ultrathink: 8 bugs fixed (4M, 4L) + 2 ultrathink findings (T-1 critical, T-4 low); 19 regression tests; 1466 tests, 94.22% coverage
1.17.0 2026-02-07 Docs-sync: Issue #18 closed, changelog alignment, stale reference cleanup
1.16.0 2026-02-07 Bug-Hunt #8: 6 bugs fixed (config YAML drop, drift histogram, Bayesian NaN, consensus premature rejection, pipeline buffer, repository async); 8 regression tests; 1398 tests, 94.13% coverage
1.14.0 2026-02-06 Gap closure sprint: issues #24, #2, #7, #5, #8, #9; new modules (rbac.py, alert.py, metrics_server.py); RBAC wired into override + pcw_decide; monitoring/ configs; 115 new tests; 1390 tests, 93.98% coverage
1.13.0 2026-02-06 v1.0 SDK Release: PR #23 merged — AegisConfig, CLI, facade, MCP server, 79 new tests, 4 examples, README rewrite; 1172 tests, 94.61% coverage
1.12.0 2026-02-05 Deferred Bug Fixes v3.34.0: All 17 deferred bugs fixed (1 MEDIUM, 16 LOW); 1037 tests, 94.11% coverage
1.11.0 2026-02-05 Bug Hunt v3.32.0: Codex+Claude hybrid bug-hunt, 5 bug fixes (bayesian zero-override, prometheus idempotent, override rejection metadata, proposal exporter DI); 956 tests, 93.63% coverage
1.10.0 2026-02-05 Claude-GPT Dialogue v3.31.0: phi_S/phi_D Single Source of Truth, KNOWN_ISSUES.md cleanup (L45→Intentional, L7→HSM mitigation), docs-consistency.yml CI workflow; 946 tests, 93.48% coverage
1.9.0 2026-02-04 Deferred Bug Fix v3.30.0: L44 type coercion validation in analyst.py, L49 audit_mode for timing side-channel mitigation in hybrid_provider.py; 946 tests, 93.48% coverage
1.8.0 2026-02-04 Hybrid Bug Hunt v3.29.0: H-WF-001 consensus fix, H-WF-003 pipeline thread safety, M24/M25 crypto validation, M-ENG-005 exception handling; 931 tests, 93.48% coverage
1.7.0 2026-02-04 Quality Gate v3.28.0: 16 deferred bugs fixed, 4 regression tests added; pip CVE-2026-1703 patched; 916 tests, 93.39% coverage
1.6.0 2026-02-04 Rigor Protocol complete (v3.24.0-v3.26.0): 60/62 bugs fixed (97% fix rate); Quality Gate hardening; 910 tests, 93.48% coverage
1.5.0 2026-02-03 All LOW severity bugs fixed (L1-L9): bounded deques, public gate API, scipy z-score, input validation, thread-safe singleton, timezone parsing, docstring updates; 867 tests, 93.81% coverage
1.4.0 2026-01-31 Bug fixes v3.14.0: empty data validation, timezone-aware datetime, specific exception handling, pipeline refactor; 839 tests, 93.74% coverage
1.3.0 2026-01-31 Mathematical coherence review: ddof parameter, public API usage, GateType enum; 821 tests, 93.34% coverage
1.2.0 2026-01-31 Optional deps installed (btclib, liboqs-python); all 807 tests now pass (0 skipped); 93.76% coverage
1.1.0 2026-01-31 PRs #19-21 merged; v3.11.0 math fixes complete; ADR-006 added; Test counts updated
1.0.1 2026-01-31 Updated PR #20 status (CI failing); Added ADR-005 to Quick Links
1.0.0 2026-01-30 Initial roadmap creation; Added PRs #19-21; Added open issues; Release milestones; GAP status summary
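
Many of the entries above record the same class of fix: a numeric input that arrives as a bool, a null, or a NaN/Inf value and slips past a threshold check. The sketch below illustrates that guard pattern in isolation. The helper name `coerce_finite_float` and its error messages are hypothetical and are not taken from the AEGIS source; the actual helpers named in the changelog (`_float`, `_float_arg`, `validate_positive`) may behave differently.

```python
import math
from typing import Any


def coerce_finite_float(value: Any, name: str = "value") -> float:
    """Hypothetical helper: return `value` as a finite float or raise ValueError."""
    # bool is a subclass of int, so float(True) == 1.0 would otherwise pass silently.
    if isinstance(value, bool):
        raise ValueError(f"{name} must be numeric, got bool")
    if value is None:
        raise ValueError(f"{name} must not be null")
    try:
        result = float(value)
    except (TypeError, ValueError, OverflowError) as exc:
        raise ValueError(f"{name} must be numeric, got {value!r}") from exc
    # float("nan") and float("inf") convert without error, so check afterwards.
    # Comparisons against NaN are always False, which is how a NaN can fail open.
    if not math.isfinite(result):
        raise ValueError(f"{name} must be finite, got {result!r}")
    return result


if __name__ == "__main__":
    print(coerce_finite_float("0.25", name="risk_score"))
    for bad in (True, None, "nan", float("inf"), "abc"):
        try:
            coerce_finite_float(bad, name="risk_score")
        except ValueError as err:
            print(f"rejected {bad!r}: {err}")
```

Checking for bool before calling `float()` matters because the conversion itself succeeds for `True` and `False`; the finiteness check has to come after conversion because strings such as `"nan"` only become NaN once parsed.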