AEGIS Performance SLAs Version: 1.0.0 | Updated: 2026-02-09 | Status: Active
Performance targets, resource baselines, and measurement methodology for AEGIS production deployments. Latency targets are sourced from docs/implementation-plans/002-performance-load-testing.md in the repository (not published on the website). Key targets are reproduced inline below.
Source: docs/implementation-plans/002-performance-load-testing.md section 2.2.
| Percentile | Target | Alert Threshold | Critical Threshold |
|---|---|---|---|
| p50 | < 100 ms | > 150 ms | > 250 ms |
| p95 | < 500 ms | > 500 ms | > 750 ms |
| p99 | < 1000 ms | > 1000 ms | > 1500 ms |
These targets apply to end-to-end pcw_decide() evaluation, measured by the aegis_decision_latency_seconds Prometheus histogram.
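As a minimal sketch, the thresholds in the latency table could be encoded as a lookup and used to classify an observed percentile. The function name and the `"at_risk"` label (for values between the target and the alert threshold) are illustrative assumptions, not part of AEGIS.

```python
# Thresholds in milliseconds, copied from the latency table above.
LATENCY_SLA_MS = {
    "p50": {"target": 100, "alert": 150, "critical": 250},
    "p95": {"target": 500, "alert": 500, "critical": 750},
    "p99": {"target": 1000, "alert": 1000, "critical": 1500},
}


def classify_latency(percentile: str, observed_ms: float) -> str:
    """Classify an observed latency percentile (hypothetical helper)."""
    limits = LATENCY_SLA_MS[percentile]
    if observed_ms > limits["critical"]:
        return "critical"
    if observed_ms > limits["alert"]:
        return "alert"
    if observed_ms < limits["target"]:
        return "ok"
    # Neither under target nor over the alert threshold.
    return "at_risk"


if __name__ == "__main__":
    for pct, value in [("p50", 80), ("p50", 120), ("p95", 600), ("p99", 1600)]:
        print(pct, value, classify_latency(pct, value))
```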
| Metric | Target | Alert Threshold | Measurement Window |
|---|---|---|---|
| Minimum throughput | 100 evaluations/sec | < 80 eval/s | 60 seconds |
| Burst throughput | 500 evaluations/sec | N/A | 60 seconds |
| Error rate | < 0.1% | > 0.5% | 5 minutes |
Throughput is measured as rate(aegis_decision_latency_seconds_count[1m]).
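To illustrate what the rate() expression reports, the sketch below computes a per-second increase from raw counter samples. It is a simplification: Prometheus additionally handles counter resets and extrapolates to the window edges, which this hypothetical helper does not.

```python
def simple_rate(samples: list[tuple[float, float]]) -> float:
    """Per-second increase between the first and last (timestamp, counter) samples.

    Simplified stand-in for PromQL rate(): no counter-reset handling and no
    extrapolation to the window boundaries.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (c1 - c0) / (t1 - t0)


if __name__ == "__main__":
    # Counter scraped every 15 s over a 60 s window (illustrative values).
    window = [(0, 1000), (15, 2500), (30, 4000), (45, 5500), (60, 7000)]
    evals_per_sec = simple_rate(window)
    print(evals_per_sec, "below alert threshold" if evals_per_sec < 80 else "ok")
```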
| Resource | Target | Alert Threshold | Notes |
|---|---|---|---|
| CPU utilization | < 70% | > 80% | Per-process |
| Memory utilization | < 80% | > 85% | Per-process |
| Per-evaluation memory | ~2 MB peak | N/A | Working set |
| Startup time | < 5 seconds | N/A | Process initialization |
Breakdown of per-evaluation latency budget:
| Component | Budget | Notes |
|---|---|---|
| Gate evaluation (6 gates) | < 50 ms | Pure computation: risk, profit, novelty, complexity, quality, utility |
| Bayesian posterior | < 20 ms | With scipy (engine optional group) |
| Utility calculation | < 10 ms | With scipy z-score computation |
| Telemetry emission | < 5 ms | Async, non-blocking via pipeline |
| Crypto signing | < 15 ms | Ed25519 (~0.5 ms); ML-DSA-44 adds ~2-5 ms; HSM adds variable latency |
| Total overhead | < 100 ms | Target p50 |
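The component budgets add up to the 100 ms total (50 + 20 + 10 + 5 + 15). A small sketch of checking measured component timings against these budgets follows; the dictionary keys and the helper name are illustrative assumptions.

```python
# Per-component budgets in milliseconds, copied from the budget table.
BUDGET_MS = {
    "gate_evaluation": 50,
    "bayesian_posterior": 20,
    "utility_calculation": 10,
    "telemetry_emission": 5,
    "crypto_signing": 15,
}


def over_budget(measured_ms: dict[str, float]) -> dict[str, float]:
    """Return components exceeding their budget, mapped to the excess in ms."""
    return {
        name: measured_ms[name] - limit
        for name, limit in BUDGET_MS.items()
        if measured_ms.get(name, 0.0) > limit
    }


if __name__ == "__main__":
    print("total budget:", sum(BUDGET_MS.values()), "ms")
    # Illustrative measurement: HSM-backed signing blows the crypto budget.
    print(over_budget({"gate_evaluation": 10, "crypto_signing": 40}))
```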
| Algorithm | Sign | Verify | Notes |
|---|---|---|---|
| Ed25519 | ~0.5 ms | ~0.5 ms | Software implementation |
| ML-DSA-44 | ~2-5 ms | ~1-3 ms | Post-quantum (liboqs) |
| Hybrid (Ed25519 + ML-DSA-44) | ~3-6 ms | ~2-4 ms | Combined |
| HSM Ed25519 | ~10-50 ms | ~10-50 ms | Hardware dependent |
[0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
Corresponding to: 5 ms, 10 ms, 25 ms, 50 ms, 100 ms, 250 ms, 500 ms, 1000 ms, 2500 ms, 5000 ms.
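Because percentiles are derived from these bucket boundaries rather than from raw samples, reported values are interpolated estimates. The sketch below approximates how a quantile is estimated from cumulative bucket counts using linear interpolation within the matching bucket, in the spirit of PromQL's histogram_quantile(). The function name and the example counts are assumptions for illustration.

```python
import math

# Bucket upper bounds in seconds, as listed above.
BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0]


def estimate_quantile(q: float, cumulative_counts: list[int]) -> float:
    """Estimate quantile q from cumulative counts (one per bucket, then +Inf)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    bounds = BUCKETS + [math.inf]
    if len(cumulative_counts) != len(bounds):
        raise ValueError("expected one count per bucket plus the +Inf bucket")
    total = cumulative_counts[-1]
    if total == 0:
        return math.nan
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    i = next(idx for idx, c in enumerate(cumulative_counts) if c >= rank)
    if math.isinf(bounds[i]):
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return bounds[i - 1]
    lower = bounds[i - 1] if i > 0 else 0.0
    prev = cumulative_counts[i - 1] if i > 0 else 0
    in_bucket = cumulative_counts[i] - prev
    if in_bucket == 0:
        return lower
    return lower + (bounds[i] - lower) * (rank - prev) / in_bucket


if __name__ == "__main__":
    counts = [0, 0, 0, 0, 500, 900, 990, 1000, 1000, 1000, 1000]
    for q in (0.5, 0.95, 0.99):
        print(f"p{int(q * 100)} ~ {estimate_quantile(q, counts):.3f} s")
```

With the bucket layout above, any latency between 500 ms and 1000 ms lands in a single bucket, so p95 and p99 estimates near their targets carry that interpolation uncertainty.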
Exclude cold starts: First 30 seconds after process start
Measure end-to-end: pcw_decide() entry to return
Prometheus metric: aegis_decision_latency_seconds (histogram)
Recording rule: aegis:p99_latency_5m (pre-computed)
Minimum sample size: 1000 evaluations before reporting percentiles
Measurement window: Rolling 5-minute windows for alerting
Baseline: Establish on identical hardware with standardized input
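The cold-start exclusion and minimum sample size rules above can be sketched as a small offline filter over recorded samples. This is an illustration under stated assumptions: the function names are hypothetical, percentiles use the nearest-rank method, and production reporting would rely on the Prometheus histogram rather than raw samples.

```python
COLD_START_SECONDS = 30  # exclude the first 30 s after process start
MIN_SAMPLES = 1000       # minimum evaluations before reporting percentiles


def nearest_rank(sorted_values: list[float], pct: int) -> float:
    """Nearest-rank percentile; integer arithmetic avoids float rounding."""
    rank = max(1, (pct * len(sorted_values) + 99) // 100)  # ceil(pct * n / 100)
    return sorted_values[rank - 1]


def report_percentiles(samples, process_start: float):
    """samples: iterable of (timestamp_seconds, latency) pairs.

    Returns a dict of p50/p95/p99, or None when too few warm samples exist.
    """
    warm = sorted(
        latency
        for ts, latency in samples
        if ts - process_start >= COLD_START_SECONDS
    )
    if len(warm) < MIN_SAMPLES:
        return None
    return {f"p{p}": nearest_rank(warm, p) for p in (50, 95, 99)}


if __name__ == "__main__":
    cold = [(t, 9999) for t in range(30)]             # discarded
    warm = [(30 + i, i + 1) for i in range(1000)]     # latencies 1..1000
    print(report_percentiles(cold + warm, process_start=0))
```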
Run benchmarks to establish local baselines:
pytest tests/benchmarks/ --benchmark-only -v
| Test File | Functions | What It Measures |
|---|---|---|
| test_gate_benchmarks.py | Gate evaluation for each of the 6 gates | Individual gate latency |
| test_pcw_decide_benchmark.py | Full pcw_decide() evaluation | End-to-end decision latency |
| test_bayesian_benchmark.py | Bayesian posterior computation | Mathematical kernel performance |
Environment: Python 3.12.11, macOS (Darwin), pytest-benchmark 5.2.3
Date: 2026-02-09
Note: Run pytest tests/benchmarks/ --benchmark-only on your target hardware to record environment-specific baselines. Results vary by CPU, Python version, and available dependencies.
| Benchmark | Min | Median | Mean | Max | OPS (Kops/s) |
|---|---|---|---|---|---|
| test_posterior_calculation | 416 ns | 500 ns | 537 ns | 57.9 us | 1,862 |
| test_posterior_with_overrides | 433 ns | 510 ns | 511 ns | 4.6 us | 1,958 |
| test_posterior_predictive | 498 ns | 583 ns | 587 ns | 6.3 us | 1,705 |
| test_compute_full | 667 ns | 833 ns | 840 ns | 46.3 us | 1,191 |
| test_update_prior | 1.17 us | 1.42 us | 1.43 us | 49.0 us | 701 |

| Benchmark | Min | Median | Mean | Max | OPS (Kops/s) |
|---|---|---|---|---|---|
| test_complexity_gate_evaluation | 354 ns | 410 ns | 421 ns | 74.2 us | 2,374 |
| test_quality_gate_evaluation | 500 ns | 625 ns | 650 ns | 30.9 us | 1,538 |
| test_utility_gate_evaluation | 625 ns | 792 ns | 811 ns | 98.2 us | 1,232 |
| test_novelty_gate_evaluation | 708 ns | 875 ns | 912 ns | 35.9 us | 1,096 |
| test_risk_gate_evaluation | 1.04 us | 1.25 us | 1.27 us | 63.2 us | 788 |
| test_profit_gate_evaluation | 1.54 us | 1.87 us | 1.88 us | 65.1 us | 533 |
| test_full_gate_pipeline | 7.04 us | 7.21 us | 7.97 us | 53.2 us | 125 |

| Benchmark | Min | Median | Mean | Max | OPS (Kops/s) |
|---|---|---|---|---|---|
| test_pcw_decide_passing | 14.25 us | 15.04 us | 16.29 us | 133.9 us | 61.4 |
| test_pcw_decide_failing | 14.54 us | 16.63 us | 17.15 us | 1,217 us | 58.3 |
| test_pcw_decide_with_prometheus | 15.38 us | 17.54 us | 17.97 us | 379.5 us | 55.7 |
Full gate pipeline: ~7 us median (well within 50 ms budget)
End-to-end pcw_decide: ~15-18 us median (well within 100 ms p50 target)
Single-process throughput: ~55-61K ops/s (exceeds 100 eval/s target by orders of magnitude)
Bayesian posterior: ~500 ns median (negligible latency contribution)
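The OPS column in the benchmark tables is the reciprocal of the mean latency, which is how the throughput figures above follow from the timings. A short worked check, with a hypothetical helper name:

```python
def kops_per_second(mean_latency_ns: float) -> float:
    """Convert a mean latency in nanoseconds to thousands of operations/sec."""
    ops_per_second = 1e9 / mean_latency_ns
    return ops_per_second / 1e3


if __name__ == "__main__":
    # Mean latencies taken from the benchmark tables.
    for name, mean_ns in [
        ("test_posterior_calculation", 537),
        ("test_full_gate_pipeline", 7_970),
        ("test_pcw_decide_passing", 16_290),
    ]:
        print(f"{name}: {kops_per_second(mean_ns):.1f} Kops/s")
```

At roughly 61 Kops/s, the benchmarked single-process rate is about 600 times the 100 eval/s minimum throughput target.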
| Configuration | Expected Throughput | Notes |
|---|---|---|
| Single process | ~100 eval/s | Baseline |
| Multi-worker (2 workers) | ~200 eval/s | Near-linear scaling |
| Multi-worker (4 workers) | ~350-400 eval/s | I/O bound begins |
| Multi-worker (8 workers) | ~500-600 eval/s | Diminishing returns |
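The "near-linear" and "diminishing returns" notes can be made concrete by computing scaling efficiency, defined here as observed throughput divided by the ideal linear throughput. The helper is an illustrative assumption; the inputs are the expected figures from the table.

```python
BASELINE_EVAL_PER_SEC = 100  # single-process expected throughput


def scaling_efficiency(workers: int, throughput: float,
                       baseline: float = BASELINE_EVAL_PER_SEC) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return throughput / (workers * baseline)


if __name__ == "__main__":
    # (workers, low estimate, high estimate) from the scaling table.
    for workers, low, high in [(2, 200, 200), (4, 350, 400), (8, 500, 600)]:
        lo_eff = scaling_efficiency(workers, low)
        hi_eff = scaling_efficiency(workers, high)
        print(f"{workers} workers: {lo_eff:.1%} to {hi_eff:.1%} of linear")
```

By this measure, efficiency falls from 100% at 2 workers to roughly 62-75% at 8 workers.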
| Feature | Additional Latency | Notes |
|---|---|---|
| PostgreSQL persistence | +5-10 ms | Async write per evaluation |
| PII encryption (12 fields) | +1-3 ms | AES-256-GCM per field |
| ML-DSA-44 signing | +2-5 ms | vs Ed25519 ~0.5 ms |
| ML-KEM-768 encryption | +1-3 ms | Key encapsulation |
| Prometheus metrics | < 1 ms | Counter/histogram increment |
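For rough capacity planning, the per-feature figures can be summed into a latency range for a given configuration. The sketch assumes the overheads are simply additive, which is a pessimistic simplification (for example, the PostgreSQL write is asynchronous and may not sit on the critical path). The keys and helper name are hypothetical.

```python
# (low, high) additional latency in ms, from the feature table.
# "< 1 ms" for Prometheus metrics is represented as (0, 1).
FEATURE_LATENCY_MS = {
    "postgresql_persistence": (5, 10),
    "pii_encryption": (1, 3),
    "ml_dsa_44_signing": (2, 5),
    "ml_kem_768_encryption": (1, 3),
    "prometheus_metrics": (0, 1),
}


def added_latency_ms(enabled: list[str]) -> tuple[int, int]:
    """Sum the low and high estimates for the enabled features."""
    low = sum(FEATURE_LATENCY_MS[name][0] for name in enabled)
    high = sum(FEATURE_LATENCY_MS[name][1] for name in enabled)
    return low, high


if __name__ == "__main__":
    print(added_latency_ms(["postgresql_persistence", "ml_dsa_44_signing"]))
    print(added_latency_ms(list(FEATURE_LATENCY_MS)))
```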
CPU-bound: Gate evaluation and Bayesian computation (mitigate with multi-worker)
I/O-bound: Database persistence and HSM communication (mitigate with async and connection pooling)
Memory: Large batch evaluations (mitigate with streaming)
Defined in monitoring/prometheus/alerting-rules.yaml:
| Alert | Expression | For | Severity |
|---|---|---|---|
| AegisHighLatency | aegis:p99_latency_5m > 1.0 | 5m | Warning |
| AegisErrorRate | aegis:error_rate_5m > 0.05 | 5m | Warning |
| AegisHighGateFailRate | aegis:gate_pass_rate_5m < 0.5 | 10m | Warning |
| AegisOverrideSpike | Override rate > 0.1/min | 15m | Critical |
| AegisDriftCritical | KL divergence = critical | 5m | Critical |
| AegisOverrideStalePartial | Partial override stuck | 2h | Warning |
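The "For" column means the expression must hold continuously for that duration before the alert fires. The following sketch models that behavior over a series of evaluations. It is a simplified illustration of Prometheus's pending-to-firing transition, not the alerting implementation, and the function name and sample values are assumptions.

```python
def alert_fires(samples: list[tuple[float, float]], threshold: float,
                for_seconds: float, above: bool = True) -> bool:
    """Return True if the condition has held continuously for `for_seconds`.

    samples: chronologically ordered (timestamp_seconds, value) evaluations.
    above:   True for `value > threshold`, False for `value < threshold`.
    """
    pending_since = None
    for ts, value in samples:
        breached = value > threshold if above else value < threshold
        if breached:
            if pending_since is None:
                pending_since = ts  # condition just started holding
        else:
            pending_since = None    # any recovery resets the timer
    if pending_since is None:
        return False
    return samples[-1][0] - pending_since >= for_seconds


if __name__ == "__main__":
    # AegisHighLatency: p99 > 1.0 s for 5 minutes, evaluated every 30 s.
    sustained = [(t, 1.2) for t in range(0, 301, 30)]
    interrupted = [(t, 0.8 if t == 150 else 1.2) for t in range(0, 301, 30)]
    print(alert_fires(sustained, 1.0, 300))    # sustained breach
    print(alert_fires(interrupted, 1.0, 300))  # dip at t=150 resets the timer
```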
| Dashboard | File | Key Panels |
|---|---|---|
| AEGIS Overview | monitoring/grafana/overview-dashboard.json | Decision rate, gate pass rate, latency p50/p95/p99 |
| Risk Analysis | monitoring/grafana/risk-analysis-dashboard.json | KL divergence, Bayesian posteriors, override history |
Pre-computed queries in monitoring/prometheus/recording-rules.yaml:
| Rule | Expression | Interval |
|---|---|---|
| aegis:gate_pass_rate_5m | Pass rate by gate | 30s |
| aegis:decision_rate_5m | Decision rate by status | 30s |
| aegis:p99_latency_5m | p99 latency by operation | 30s |
| aegis:override_rate_1h | Override rate by outcome | 30s |
| aegis:error_rate_5m | Error rate by component | 30s |