The same physical event that used to generate 3 red rows on the board now consolidates into a single line with source_systems = [anomaly, energy, prognostics]. One action closes the incident.

Case: Unified Inbox — 1 spike, 1 alert

Context

Industry: metalworking
Equipment: process centrifugal pump B-07, 75 kW
Sensors: triaxial vibration, discharge pressure, motor current, instantaneous kWh
Criticality: high — feeds the water-jet cutting line
Rela AI stack active: ML anomaly detection + energy monitoring + prognostics

The problem before the aggregator

When a bearing started failing, three subsystems caught the same anomaly at the same time and produced three independent alerts:

[14:02:15]  🔴 CRITICAL  anomaly_detection   B-07  ensemble score 0.92
[14:02:17]  🔴 CRITICAL  energy_anomaly      B-07  z=3.4 on kwh
[14:02:21]  🔴 CRITICAL  prognostics         B-07  RUL 18h

Real consequences the customer experienced:

The operator saw 3 red rows at once and didn't know which to address first.
3 duplicate tasks landed on the kanban for the same asset: the technician closed all of them, and the supervisor confirmed the same thing 3 times.
3 WhatsApp messages arrived within 6 seconds, and the operator read them as 3 different problems.
MTTR metrics were skewed: the CMMS synced 3 incidents when there was really one.

What Rela AI does now

sequenceDiagram
  participant S as Sensors
  participant A as anomaly_detection
  participant E as energy_service
  participant P as prognostics
  participant AG as alert_aggregator
  participant I as Inbox

  S->>A: Vibration spike
  A->>A: Ensemble score 0.75 (warning)
  A->>AG: ingest_alert(warning)
  AG->>I: Insert row · status=open · severity=warning

  S->>E: kWh spike
  E->>E: z-score 3.4 (high)
  E->>AG: ingest_alert(high)
  AG->>I: Upgrade · severity=high · source_systems=[anomaly,energy] · count=2

  S->>P: AHI collapses
  P->>P: RUL 18h (critical)
  P->>AG: ingest_alert(critical)
  AG->>I: Upgrade · severity=critical · source_systems=[anomaly,energy,prognostics] · count=3

All three detections reach the aggregator within 6 seconds of each other, well inside the configured window (alert_dedup_window_minutes, default 60), so they merge into one row. Result in _alerts:

| Field | Value |
| --- | --- |
| `asset_id` | B-07 |
| `status` | `open` |
| `severity` | `critical` (maximum of the 3 detections) |
| `source_systems` | `["anomaly", "energy", "prognostics"]` |
| `count` | 3 |
| `sources` (upgrade trail) | warning→ / →high / →critical |
| `first_seen_at` | 14:02:15 |
| `last_seen_at` | 14:02:21 |
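
The merge that produces this row can be sketched in a few lines. This is a minimal in-memory illustration, not the actual alert_aggregator_service: the `AlertRow` dataclass, the `rows` dictionary, the `ingest_alert` signature, and the choice to measure the window from `last_seen_at` are all assumptions made for the example. Only the field names and the 60-minute default come from the case.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Canonical scale from the case, lowest first.
SEVERITY_ORDER = ["info", "warning", "high", "critical"]


@dataclass
class AlertRow:  # hypothetical stand-in for a document in _alerts
    asset_id: str
    severity: str
    first_seen_at: datetime
    last_seen_at: datetime
    status: str = "open"
    source_systems: list = field(default_factory=list)
    count: int = 0


def ingest_alert(rows, asset_id, source, severity, seen_at, window_minutes=60):
    """Merge a detection into the asset's current row, or start a new row."""
    window = timedelta(minutes=window_minutes)
    row = rows.get(asset_id)
    # Assumption: the window is measured from the last detection seen.
    if row is None or seen_at - row.last_seen_at > window:
        row = AlertRow(asset_id, severity, seen_at, seen_at)
        rows[asset_id] = row
    # Keep the worst severity observed so far.
    if SEVERITY_ORDER.index(severity) > SEVERITY_ORDER.index(row.severity):
        row.severity = severity
    if source not in row.source_systems:
        row.source_systems.append(source)
    row.count += 1
    row.last_seen_at = seen_at
    return row


rows = {}
day = datetime(2024, 1, 1)  # arbitrary date; only the times of day come from the case
ingest_alert(rows, "B-07", "anomaly", "warning", day.replace(hour=14, minute=2, second=15))
ingest_alert(rows, "B-07", "energy", "high", day.replace(hour=14, minute=2, second=17))
row = ingest_alert(rows, "B-07", "prognostics", "critical", day.replace(hour=14, minute=2, second=21))
print(row.severity, row.source_systems, row.count)
# critical ['anomaly', 'energy', 'prognostics'] 3
```

A detection that arrives after the window has elapsed starts a fresh row, which is what keeps two unrelated incidents on the same pump from being merged.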

What the operator sees

Before:

[14:02:15] 🔴 CRITICAL  anomaly    B-07  score 0.92
[14:02:17] 🔴 CRITICAL  energy     B-07  z=3.4
[14:02:21] 🔴 CRITICAL  prognostics B-07  RUL 18h

After:

[14:02:15 → 14:02:21]  🔴 CRITICAL  B-07  [anomaly, energy, prognostics]  count=3
  └ trail:
      14:02:15  anomaly      warning→    ensemble score 0.75
      14:02:17  energy       →high       z=3.4 on kwh
      14:02:21  prognostics  →critical   RUL 18h

One row. Severity is the worst observed. The trail tells the escalation story, so an audit no longer means opening 3 collections.

Post-ACK escalation

The maintenance lead marks the row as acknowledged. Fifteen minutes later, a new critical detection from anomaly_detection arrives for the same asset.

Before auto re-open: the acknowledged alert stayed silenced while worse detections arrived.

Now: the row flips to status = open with reopened_reason = severity_upgrade. The UI promotes it back to the active inbox view and fires a new notification, so visibility is not lost during escalation.
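
A rough sketch of the re-open rule, assuming the dedup lookup has already found the acknowledged row. The function name `merge_into_row`, the dict-shaped row, and the stored status string `"acknowledged"` are hypothetical. Only `status = open` and `reopened_reason = severity_upgrade` are taken from the behaviour described here.

```python
SEVERITY_RANK = {"info": 0, "warning": 1, "high": 2, "critical": 3}


def merge_into_row(row, incoming_severity):
    """Apply a detection to an existing row; re-open it if acknowledged and outranked."""
    upgraded = SEVERITY_RANK[incoming_severity] > SEVERITY_RANK[row["severity"]]
    if upgraded:
        row["severity"] = incoming_severity
        # Only a strictly worse detection pulls an acknowledged row back to the inbox.
        if row["status"] == "acknowledged":
            row["status"] = "open"
            row["reopened_reason"] = "severity_upgrade"
    return row


acked = {"asset_id": "B-07", "status": "acknowledged", "severity": "high"}
merge_into_row(acked, "critical")
print(acked["status"], acked["reopened_reason"])
# open severity_upgrade
```

In this sketch a detection at the same or lower severity leaves the acknowledgement untouched, so the operator is only interrupted again when the situation has actually worsened.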

Hysteresis: trail stays clean

If an incoming detection has equal or lower severity than the stored one, the aggregator still bumps count and source_systems (stats) but does not push to sources. This prevents a critical alert from ending up with a trail of 87 warning entries that add no information.
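
The hysteresis rule can be illustrated as follows. This is a sketch under assumptions: `record_detection`, the dict-shaped row, and the shape of each trail entry are invented for the example. The field names `count`, `source_systems`, and `sources` are the ones used in this case.

```python
SEVERITY_RANK = {"info": 0, "warning": 1, "high": 2, "critical": 3}


def record_detection(row, source, severity, note):
    """Always update the stats; extend the trail only when severity rises."""
    row["count"] += 1
    if source not in row["source_systems"]:
        row["source_systems"].append(source)
    if SEVERITY_RANK[severity] > SEVERITY_RANK[row["severity"]]:
        row["severity"] = severity
        row["sources"].append({"source": source, "severity": severity, "note": note})
    return row


# The row as it stands at the end of the case: already critical, trail of 3.
row = {
    "severity": "critical",
    "count": 3,
    "source_systems": ["anomaly", "energy", "prognostics"],
    "sources": [
        {"source": "anomaly", "severity": "warning", "note": "ensemble score 0.75"},
        {"source": "energy", "severity": "high", "note": "z=3.4 on kwh"},
        {"source": "prognostics", "severity": "critical", "note": "RUL 18h"},
    ],
}
for _ in range(87):
    record_detection(row, "anomaly", "warning", "repeat warning")
print(row["count"], len(row["sources"]))
# 90 3
```

The count still reflects every detection, so the volume of repeats remains visible without diluting the escalation story.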

Impact

| Metric | No aggregator | With aggregator |
| --- | --- | --- |
| Red rows per incident | 3 | 1 |
| Tasks on kanban | 3 duplicates | 1 |
| WhatsApp notifications | 3 in 6s | 1 |
| Operator decision time | +90s of confusion | immediate |
| MTTR to CMMS | 3 mixed incidents | 1 clean incident |
| Post-ACK escalation visibility | zero (silenced) | automatic re-open |

Configuration used

{
  "alert_dedup_window_minutes": 60,
  "ahi_weights": {
    "condition": 0.35,
    "alarm_health": 0.20,
    "maintenance_compliance": 0.15,
    "trend_stability": 0.10,
    "anomaly_pressure": 0.20
  }
}

See Predictive Config and Unified Inbox for the full parameter list.
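
The ahi_weights above sum to 1.0, which suggests a weighted average of component scores. The sketch below is one plausible reading, not the documented formula: it assumes each component is already normalised to a 0-100 score where higher means healthier, and the function name `weighted_ahi` and the example scores are hypothetical.

```python
import math

AHI_WEIGHTS = {
    "condition": 0.35,
    "alarm_health": 0.20,
    "maintenance_compliance": 0.15,
    "trend_stability": 0.10,
    "anomaly_pressure": 0.20,
}


def weighted_ahi(scores, weights=AHI_WEIGHTS):
    """Weighted sum of component scores; rejects weights that do not sum to 1."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("ahi_weights must sum to 1.0")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical component scores for a pump with a degrading bearing.
scores = {
    "condition": 40,
    "alarm_health": 60,
    "maintenance_compliance": 90,
    "trend_stability": 30,
    "anomaly_pressure": 20,
}
ahi = weighted_ahi(scores)
print(round(ahi, 1))
```

Validating the sum up front catches the common mistake of editing one weight without rebalancing the others.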

What makes it possible

Backed by concrete technical changes:

R1: alert_aggregator_service with per-asset merge rules and configurable window.
R3: canonical severity (info/warning/high/critical) shared by the 3 detectors.
R6: E2E test (test_single_spike_consolidates_into_one_alert) that codifies the "1 spike → 1 alert" contract.
R8: trail hysteresis, re-open on upgrade, dedup search that includes acknowledged.
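
To illustrate what a canonical severity shared by the 3 detectors might look like, here is a small sketch. The `Severity` enum mirrors the four levels named in R3; the `classify` helper and every cut-off value are hypothetical, chosen only so that the readings from this case land on the severities the case reports.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 0
    WARNING = 1
    HIGH = 2
    CRITICAL = 3


def classify(value, cutoffs, higher_is_worse=True):
    """Map a raw detector reading to Severity using (warning, high, critical) cut-offs."""
    severity = Severity.INFO
    levels = (Severity.WARNING, Severity.HIGH, Severity.CRITICAL)
    for level, cutoff in zip(levels, cutoffs):
        crossed = value >= cutoff if higher_is_worse else value <= cutoff
        if crossed:
            severity = level
    return severity


# Hypothetical cut-offs, not the product's real thresholds.
ENSEMBLE_SCORE = (0.70, 0.80, 0.90)
ENERGY_Z = (2.0, 3.0, 4.0)
RUL_HOURS = (168, 72, 24)  # less remaining life is worse

readings = [
    classify(0.75, ENSEMBLE_SCORE),
    classify(3.4, ENERGY_Z),
    classify(18, RUL_HOURS, higher_is_worse=False),
]
print([s.name.lower() for s in readings], max(readings).name.lower())
# ['warning', 'high', 'critical'] critical
```

Because the levels are ordered integers, "severity is the worst observed" reduces to a plain `max` across detectors.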

Why it matters commercially

Less noise → more action. The operator no longer ignores alerts on the grounds that "they all shout the same thing".
Clean MTTR. The CMMS reports real incidents, not duplicates, so the metric is trustworthy again.
Post-ACK confidence. Shift leads can acknowledge without fear of missing escalations.
External integrations. WhatsApp, email, ServiceNow and SAP PM all receive one notification per incident, not three.

This is not a UI "nice to have": these are persisted contract changes in _alerts, backed by integration tests that guard the semantics against regressions. See Unified Inbox for the full technical contract.
