
AEGIS Performance SLAs

Version: 1.0.0 | Updated: 2026-02-09 | Status: Active


Performance targets, resource baselines, and measurement methodology for AEGIS production deployments. Latency targets are sourced from docs/implementation-plans/002-performance-load-testing.md in the repository (not published on the website). Key targets are reproduced inline below.


1. Decision Latency Targets

Source: docs/implementation-plans/002-performance-load-testing.md section 2.2.

| Percentile | Target | Alert Threshold | Critical Threshold |
|---|---|---|---|
| p50 | < 100 ms | > 150 ms | > 250 ms |
| p95 | < 500 ms | > 500 ms | > 750 ms |
| p99 | < 1000 ms | > 1000 ms | > 1500 ms |

These targets apply to end-to-end pcw_decide() evaluation, measured by the aegis_decision_latency_seconds Prometheus histogram.
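
The thresholds above can be spot-checked offline against a batch of recorded latencies. The following is a minimal stdlib sketch; the nearest-rank percentile and the classification helper are illustrative, not part of AEGIS:

```python
import math

def percentile(samples_ms, p):
    """Nearest-rank percentile of latency samples (milliseconds)."""
    ranked = sorted(samples_ms)
    k = max(0, math.ceil(p / 100 * len(ranked)) - 1)
    return ranked[k]

# Percentile -> (alert threshold ms, critical threshold ms), from the SLA table.
THRESHOLDS_MS = {50: (150, 250), 95: (500, 750), 99: (1000, 1500)}

def sla_status(samples_ms):
    """Classify a sample set as ok / alert / critical per percentile."""
    status = {}
    for p, (alert, critical) in THRESHOLDS_MS.items():
        value = percentile(samples_ms, p)
        status[p] = "critical" if value > critical else "alert" if value > alert else "ok"
    return status
```

For example, 100 samples all at 100 ms report "ok" at every percentile, while a uniform 200 ms trips the p50 alert threshold.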


2. Throughput Targets

| Metric | Target | Alert Threshold | Measurement Window |
|---|---|---|---|
| Minimum throughput | 100 evaluations/sec | < 80 eval/s | 60 seconds |
| Burst throughput | 500 evaluations/sec | N/A | 60 seconds |
| Error rate | < 0.1% | > 0.5% | 5 minutes |

Throughput measured as rate(aegis_decision_latency_seconds_count[1m]).
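
In-process, the same rolling-window rate can be approximated with a timestamp deque. This is a sketch for local testing only; the production measurement is the Prometheus rate() query above:

```python
import time
from collections import deque

class ThroughputMonitor:
    """Rolling evaluations/sec over a fixed window (60 s per the table above)."""

    def __init__(self, window_s=60.0):
        self.window_s = window_s
        self.stamps = deque()

    def record(self, now=None):
        """Record one evaluation and prune timestamps outside the window."""
        now = time.monotonic() if now is None else now
        self.stamps.append(now)
        self._prune(now)

    def rate(self, now=None):
        """Evaluations per second over the trailing window."""
        now = time.monotonic() if now is None else now
        self._prune(now)
        return len(self.stamps) / self.window_s

    def _prune(self, now):
        cutoff = now - self.window_s
        while self.stamps and self.stamps[0] < cutoff:
            self.stamps.popleft()
```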


3. Resource Baselines

| Resource | Target | Alert Threshold | Notes |
|---|---|---|---|
| CPU utilization | < 70% | > 80% | Per-process |
| Memory utilization | < 80% | > 85% | Per-process |
| Per-evaluation memory | ~2 MB peak | N/A | Working set |
| Startup time | < 5 seconds | N/A | Process initialization |
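
Peak per-process memory can be spot-checked against the baseline with the stdlib resource module (POSIX only). This helper is illustrative and not part of the AEGIS codebase; note the platform-dependent units of ru_maxrss:

```python
import resource
import sys

def peak_memory_mb():
    """Peak RSS of the current process in MB.
    ru_maxrss is reported in kilobytes on Linux but in bytes on macOS."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 ** 2 if sys.platform == "darwin" else 1024
    return rss / divisor
```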

4. Component Latency Budget

Breakdown of per-evaluation latency budget:

| Component | Budget | Notes |
|---|---|---|
| Gate evaluation (6 gates) | < 50 ms | Pure computation: risk, profit, novelty, complexity, quality, utility |
| Bayesian posterior | < 20 ms | With scipy (engine optional group) |
| Utility calculation | < 10 ms | With scipy z-score computation |
| Telemetry emission | < 5 ms | Async, non-blocking via pipeline |
| Crypto signing | < 15 ms | Ed25519 (~0.5 ms); ML-DSA-44 adds ~2-5 ms; HSM adds variable latency |
| Total overhead | < 100 ms | Target p50 |

Crypto Latency by Algorithm

| Algorithm | Sign | Verify | Notes |
|---|---|---|---|
| Ed25519 | ~0.5 ms | ~0.5 ms | Software implementation |
| ML-DSA-44 | ~2-5 ms | ~1-3 ms | Post-quantum (liboqs) |
| Hybrid (Ed25519 + ML-DSA-44) | ~3-6 ms | ~2-4 ms | Combined |
| HSM Ed25519 | ~10-50 ms | ~10-50 ms | Hardware dependent |
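
Per-algorithm numbers like these can be reproduced locally with a small perf_counter harness. The harness below is illustrative (not from the repository); pass in whatever sign or verify callable your build provides:

```python
import statistics
import time

def time_op_ms(fn, iterations=1000):
    """Median wall-clock latency of fn() in milliseconds.
    One warm-up call is excluded so import and caching costs
    don't skew the median."""
    fn()  # warm-up
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)
```

Median is reported rather than mean because occasional scheduler pauses inflate the tail (compare the Max column in the benchmark tables below).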

5. Measurement Methodology

Prometheus Histogram Buckets

[0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0]

Corresponding to: 5 ms, 10 ms, 25 ms, 50 ms, 100 ms, 250 ms, 500 ms, 1000 ms, 2500 ms, 5000 ms.
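
Given cumulative bucket counts, a quantile can be estimated the way Prometheus's histogram_quantile() does: linear interpolation inside the bucket that crosses the target rank. This is a simplified sketch (no +Inf bucket handling); buckets are (upper_bound_seconds, cumulative_count) pairs:

```python
def histogram_quantile(q, buckets):
    """Approximate the q-quantile (0..1) from cumulative histogram buckets,
    interpolating linearly within the crossing bucket."""
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if count == prev_count:
                return bound  # empty bucket; no interpolation possible
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]
```

Because quantiles are interpolated within a bucket, the estimate is only as precise as the bucket spacing; that is why the buckets above cluster around the 100 ms and 500 ms targets.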

Measurement Rules

  • Exclude cold starts: First 30 seconds after process start
  • Measure end-to-end: pcw_decide() entry to return
  • Prometheus metric: aegis_decision_latency_seconds (histogram)
  • Recording rule: aegis:p99_latency_5m (pre-computed)

Statistical Requirements

  • Minimum sample size: 1000 evaluations before reporting percentiles
  • Measurement window: Rolling 5-minute windows for alerting
  • Baseline: Establish on identical hardware with standardized input
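
The minimum-sample rule can be enforced at reporting time: return nothing until 1000 evaluations have accumulated. A sketch (function and parameter names are illustrative):

```python
import math

def safe_percentiles(samples_ms, min_samples=1000):
    """(p50, p95, p99) once enough samples exist; None signals
    'insufficient data' instead of a noisy early estimate."""
    if len(samples_ms) < min_samples:
        return None
    ranked = sorted(samples_ms)

    def pct(p):
        return ranked[max(0, math.ceil(p / 100 * len(ranked)) - 1)]

    return pct(50), pct(95), pct(99)
```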

6. Benchmark Baselines

Run benchmarks to establish local baselines:

pytest tests/benchmarks/ --benchmark-only -v

Benchmark Tests

| Test File | Functions | What It Measures |
|---|---|---|
| test_gate_benchmarks.py | Gate evaluation for each of the 6 gates | Individual gate latency |
| test_pcw_decide_benchmark.py | Full pcw_decide() evaluation | End-to-end decision latency |
| test_bayesian_benchmark.py | Bayesian posterior computation | Mathematical kernel performance |

Recorded Baselines

Environment: Python 3.12.11, macOS (Darwin), pytest-benchmark 5.2.3
Date: 2026-02-09

Note: Run pytest tests/benchmarks/ --benchmark-only on your target hardware to record environment-specific baselines. Results vary by CPU, Python version, and available dependencies.

Bayesian Computation

| Benchmark | Min | Median | Mean | Max | OPS (Kops/s) |
|---|---|---|---|---|---|
| test_posterior_calculation | 416 ns | 500 ns | 537 ns | 57.9 us | 1,862 |
| test_posterior_with_overrides | 433 ns | 510 ns | 511 ns | 4.6 us | 1,958 |
| test_posterior_predictive | 498 ns | 583 ns | 587 ns | 6.3 us | 1,705 |
| test_compute_full | 667 ns | 833 ns | 840 ns | 46.3 us | 1,191 |
| test_update_prior | 1.17 us | 1.42 us | 1.43 us | 49.0 us | 701 |

Gate Evaluation

| Benchmark | Min | Median | Mean | Max | OPS (Kops/s) |
|---|---|---|---|---|---|
| test_complexity_gate_evaluation | 354 ns | 410 ns | 421 ns | 74.2 us | 2,374 |
| test_quality_gate_evaluation | 500 ns | 625 ns | 650 ns | 30.9 us | 1,538 |
| test_utility_gate_evaluation | 625 ns | 792 ns | 811 ns | 98.2 us | 1,232 |
| test_novelty_gate_evaluation | 708 ns | 875 ns | 912 ns | 35.9 us | 1,096 |
| test_risk_gate_evaluation | 1.04 us | 1.25 us | 1.27 us | 63.2 us | 788 |
| test_profit_gate_evaluation | 1.54 us | 1.87 us | 1.88 us | 65.1 us | 533 |
| test_full_gate_pipeline | 7.04 us | 7.21 us | 7.97 us | 53.2 us | 125 |

End-to-End Decision (pcw_decide)

| Benchmark | Min | Median | Mean | Max | OPS (Kops/s) |
|---|---|---|---|---|---|
| test_pcw_decide_passing | 14.25 us | 15.04 us | 16.29 us | 133.9 us | 61.4 |
| test_pcw_decide_failing | 14.54 us | 16.63 us | 17.15 us | 1,217 us | 58.3 |
| test_pcw_decide_with_prometheus | 15.38 us | 17.54 us | 17.97 us | 379.5 us | 55.7 |

Summary

  • Full gate pipeline: ~7 us median (well within 50 ms budget)
  • End-to-end pcw_decide: ~15-18 us median (well within 100 ms p50 target)
  • Single-process throughput: ~55-61K ops/s (exceeds 100 eval/s target by orders of magnitude)
  • Bayesian posterior: ~500 ns median (negligible latency contribution)

7. Scaling Characteristics

| Configuration | Expected Throughput | Notes |
|---|---|---|
| Single process | ~100 eval/s | Baseline |
| Multi-worker (2 workers) | ~200 eval/s | Near-linear scaling |
| Multi-worker (4 workers) | ~350-400 eval/s | I/O contention begins |
| Multi-worker (8 workers) | ~500-600 eval/s | Diminishing returns |

Latency Impact by Feature

| Feature | Additional Latency | Notes |
|---|---|---|
| PostgreSQL persistence | +5-10 ms | Async write per evaluation |
| PII encryption (12 fields) | +1-3 ms | AES-256-GCM per field |
| ML-DSA-44 signing | +2-5 ms | vs Ed25519 ~0.5 ms |
| ML-KEM-768 encryption | +1-3 ms | Key encapsulation |
| Prometheus metrics | < 1 ms | Counter/histogram increment |
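
Because these feature costs are roughly additive, a deployment's expected overhead can be estimated by summing the ranges for the features it enables. A sketch (the feature keys and ranges below simply encode the table, with "< 1 ms" treated as 0-1 ms):

```python
# Feature -> (low ms, high ms) added latency, per the table above.
FEATURE_LATENCY_MS = {
    "postgres_persistence": (5, 10),
    "pii_encryption": (1, 3),
    "mldsa44_signing": (2, 5),
    "mlkem768_encryption": (1, 3),
    "prometheus_metrics": (0, 1),
}

def estimated_added_latency_ms(enabled_features):
    """(low, high) total added latency for a set of enabled features."""
    low = sum(FEATURE_LATENCY_MS[f][0] for f in enabled_features)
    high = sum(FEATURE_LATENCY_MS[f][1] for f in enabled_features)
    return low, high
```

For example, enabling PostgreSQL persistence plus PII encryption adds roughly 6-13 ms, comfortably inside the 100 ms p50 budget.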

Bottlenecks

  1. CPU-bound: Gate evaluation and Bayesian computation (mitigate with multi-worker)
  2. I/O-bound: Database persistence and HSM communication (mitigate with async and connection pooling)
  3. Memory: Large batch evaluations (mitigate with streaming)

8. SLA Monitoring

Prometheus Alerting Rules

Defined in monitoring/prometheus/alerting-rules.yaml:

| Alert | Expression | For | Severity |
|---|---|---|---|
| AegisHighLatency | aegis:p99_latency_5m > 1.0 | 5m | Warning |
| AegisErrorRate | aegis:error_rate_5m > 0.05 | 5m | Warning |
| AegisHighGateFailRate | aegis:gate_pass_rate_5m < 0.5 | 10m | Warning |
| AegisOverrideSpike | Override rate > 0.1/min | 15m | Critical |
| AegisDriftCritical | KL divergence = critical | 5m | Critical |
| AegisOverrideStalePartial | Partial override stuck | 2h | Warning |
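
A rule from this table takes roughly the following shape in monitoring/prometheus/alerting-rules.yaml. This is a sketch of standard Prometheus rule syntax only; the group name and annotation text are illustrative, not copied from the repository file:

```yaml
groups:
  - name: aegis-sla
    rules:
      - alert: AegisHighLatency
        # aegis:p99_latency_5m is pre-computed (see Recording Rules below)
        expr: aegis:p99_latency_5m > 1.0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "AEGIS p99 decision latency above 1 s for 5 minutes"
```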

Grafana Dashboards

| Dashboard | File | Key Panels |
|---|---|---|
| AEGIS Overview | monitoring/grafana/overview-dashboard.json | Decision rate, gate pass rate, latency p50/p95/p99 |
| Risk Analysis | monitoring/grafana/risk-analysis-dashboard.json | KL divergence, Bayesian posteriors, override history |

Recording Rules

Pre-computed queries in monitoring/prometheus/recording-rules.yaml:

| Rule | Expression | Interval |
|---|---|---|
| aegis:gate_pass_rate_5m | Pass rate by gate | 30s |
| aegis:decision_rate_5m | Decision rate by status | 30s |
| aegis:p99_latency_5m | p99 latency by operation | 30s |
| aegis:override_rate_1h | Override rate by outcome | 30s |
| aegis:error_rate_5m | Error rate by component | 30s |
