One inbox per asset: detections from anomalies, energy, and prognostics collapse into a single row. Canonical severity, windowed dedup, automatic re-open on escalation.

Unified Inbox — Alert Aggregator

The classic problem: three independent detection systems (mechanical anomalies, energy anomalies, RUL prognostics) produce three "critical" alerts when a pump degrades. The operator sees three rows, hesitates, and noise kills signal.

The solution: a single inbox consolidating the three sources into one row per asset, with severity = the highest observed, source(s) = union of detectors, and a severity-change trail to audit how the incident escalated.

Executive summary

One spike → one alert. Before: 3 reds in the inbox for the same physical event. After: 1 row with source_systems = [anomaly, energy, prognostics]. Operator acts once; the system closes all three detections together.

What it's for

Eliminate redundancy: the same phenomenon can fire ML detection + energy residual + critical RUL in seconds. Same story.
Preserve the trail: consolidation does not lose information — each detector contributes to the alert's sources with its summary and timestamp.
Canonical max severity: the alert reflects the worst diagnosis.
Automatic escalation: if operator acknowledges but a new higher-severity detection arrives, the alert reopens (reopened_reason = severity_upgrade).

How it works

Input contract

Each of the three detectors calls the aggregator after persisting its own record:

ingest_alert(
    tenant_id,
    asset_id,
    source_system="anomaly" | "energy" | "prognostics",
    severity="warning" | "high" | "critical",
    summary="...",
    source_doc_id="<optional origin doc id>",
    extra={...}
)

Input filter: only severity >= warning feeds the aggregator. info is noise, not incident.

Merge rules

flowchart TD
  IN[New detection<br/>asset_id + severity] --> Q{Active row<br/>for this asset<br/>in window?}
  Q -- No --> INS[Insert new row<br/>status=open]
  Q -- Yes --> CMP{Incoming severity<br/>&gt; stored?}
  CMP -- Yes --> UP[Upgrade<br/>severity=max<br/>push sources<br/>re-open if acked]
  CMP -- No --> EQ[Silent merge<br/>count += 1<br/>source_systems addToSet<br/>NO push sources]

Dedup window: alert_dedup_window_minutes (default 60, per-tenant).
Search: includes open and acknowledged rows. An acknowledged alert within the window is still "live" for dedup.
Severity upgrade: severity := max(stored, incoming) per canonical order.
Re-open on escalation: acknowledged row + higher incoming → back to open with reopened_reason = severity_upgrade.
Trail hysteresis: equal or lower severity still bumps count and source_systems, but does not push to sources.

Persistence

Per-tenant _alerts collection. Key fields:

Field	Meaning
`asset_id`	Affected asset. Dedup key.
`status`	`open` / `acknowledged` (with `reopened_*` if reopened).
`severity`	Max historic within the row.
`source_systems`	Set of contributing detectors.
`sources`	Append-only trail of upgrades.
`count`	Total ingested detections.
`first_seen_at`, `last_seen_at`	Timestamps.

Contract-codifying tests

Behavior is codified in integration tests (tests/predictive/test_alert_aggregator_cross_system.py):

test_single_spike_consolidates_into_one_alert — all 3 detectors fire in the same window on the same asset → len(_alerts) == 1, source_systems == {anomaly, energy, prognostics}, severity == critical, count == 3, len(sources) == 3.
test_two_assets_two_separate_alerts — spikes on different assets → 2 separate rows.
test_upgrade_reopens_acknowledged_alert — ACK + critical incoming → status returns to open.
test_same_severity_on_ack_keeps_it_acknowledged — ACK + same severity incoming → does not reopen.
test_lower_severity_does_not_push_source_trail — stored critical + warning incoming → $push to trail does NOT happen.

How to use it

View the inbox

Operations → Alerts — tenant list, sorted by last_seen_at desc.
Columns: Asset · Severity · Source systems · Count · First seen · Last seen · Actions.
Filters: severity, source, status, date range.

Per-alert actions

Action	Effect
Acknowledge	`status := acknowledged`, records actor.
Create task	Converts into work order, linking all `sources`.
Notify	Fires WhatsApp/email.
Close	Closes the incident. New detections start a new one.

Per-tenant config

Via Predictive Config:

alert_dedup_window_minutes — default 60.

Before / After

Before

[14:02:15] CRITICAL  anomaly_detection   pump-B07  vibration ensemble score 0.92
[14:02:17] CRITICAL  energy_anomaly      pump-B07  z=3.4 on kwh
[14:02:21] CRITICAL  prognostics         pump-B07  RUL 18h

After

[14:02:15→14:02:21] CRITICAL  pump-B07  [anomaly,energy,prognostics]  count=3
  └ sources:
      14:02:15  anomaly      warning→   ensemble score 0.75
      14:02:17  energy       →high      z=3.4 on kwh
      14:02:21  prognostics  →critical  RUL 18h

Limitations

Dedup is by asset_id. Different metrics of the same asset collapse into the same row.
Window slides on last_seen_at. A detection arriving outside the window opens a new row.
Trail captures upgrades, not repetitions. Use count for frequency stats.
Not a full workflow system. Shift management, assignments, SLAs live in Work Orders.

Key benefits

Single inbox per asset — less noise, more action.
Canonical max severity, auditable trail.
Automatic re-open on escalation post-ACK.
Smart hysteresis — trail reflects severity changes, not repetitions.
Per-tenant configurable window.
E2E integration tests guaranteeing the contract.

Unified Inbox (Alert Aggregator)