Rela AIRela AI Docs
Condition Monitoring

Unified Inbox (Alert Aggregator)

One inbox per asset: detections from anomalies, energy, and prognostics collapse into a single row. Canonical severity, windowed dedup, automatic re-open on escalation.

Unified Inbox — Alert Aggregator

The classic problem: three independent detection systems (mechanical anomalies, energy anomalies, RUL prognostics) produce three "critical" alerts when a pump degrades. The operator sees three rows, hesitates, and noise kills signal.

The solution: a single inbox consolidating the three sources into one row per asset, with severity = the highest observed, source(s) = union of detectors, and a severity-change trail to audit how the incident escalated.

Executive summary

One spike → one alert. Before: 3 reds in the inbox for the same physical event. After: 1 row with source_systems = [anomaly, energy, prognostics]. Operator acts once; the system closes all three detections together.

What it's for

  • Eliminate redundancy: the same phenomenon can fire ML detection + energy residual + critical RUL in seconds. Same story.
  • Preserve the trail: consolidation does not lose information — each detector contributes to the alert's sources with its summary and timestamp.
  • Canonical max severity: the alert reflects the worst diagnosis.
  • Automatic escalation: if operator acknowledges but a new higher-severity detection arrives, the alert reopens (reopened_reason = severity_upgrade).

How it works

Input contract

Each of the three detectors calls the aggregator after persisting its own record:

ingest_alert(
    tenant_id,
    asset_id,
    source_system="anomaly" | "energy" | "prognostics",
    severity="warning" | "high" | "critical",
    summary="...",
    source_doc_id="<optional origin doc id>",
    extra={...}
)

Input filter: only severity >= warning feeds the aggregator. info is noise, not incident.

Merge rules

flowchart TD
  IN[New detection<br/>asset_id + severity] --> Q{Active row<br/>for this asset<br/>in window?}
  Q -- No --> INS[Insert new row<br/>status=open]
  Q -- Yes --> CMP{Incoming severity<br/>&gt; stored?}
  CMP -- Yes --> UP[Upgrade<br/>severity=max<br/>push sources<br/>re-open if acked]
  CMP -- No --> EQ[Silent merge<br/>count += 1<br/>source_systems addToSet<br/>NO push sources]
  • Dedup window: alert_dedup_window_minutes (default 60, per-tenant).
  • Search: includes open and acknowledged rows. An acknowledged alert within the window is still "live" for dedup.
  • Severity upgrade: severity := max(stored, incoming) per canonical order.
  • Re-open on escalation: acknowledged row + higher incoming → back to open with reopened_reason = severity_upgrade.
  • Trail hysteresis: equal or lower severity still bumps count and source_systems, but does not push to sources.

Persistence

Per-tenant _alerts collection. Key fields:

FieldMeaning
asset_idAffected asset. Dedup key.
statusopen / acknowledged (with reopened_* if reopened).
severityMax historic within the row.
source_systemsSet of contributing detectors.
sourcesAppend-only trail of upgrades.
countTotal ingested detections.
first_seen_at, last_seen_atTimestamps.

Contract-codifying tests

Behavior is codified in integration tests (tests/predictive/test_alert_aggregator_cross_system.py):

  • test_single_spike_consolidates_into_one_alert — all 3 detectors fire in the same window on the same asset → len(_alerts) == 1, source_systems == {anomaly, energy, prognostics}, severity == critical, count == 3, len(sources) == 3.
  • test_two_assets_two_separate_alerts — spikes on different assets → 2 separate rows.
  • test_upgrade_reopens_acknowledged_alert — ACK + critical incoming → status returns to open.
  • test_same_severity_on_ack_keeps_it_acknowledged — ACK + same severity incoming → does not reopen.
  • test_lower_severity_does_not_push_source_trail — stored critical + warning incoming → $push to trail does NOT happen.

How to use it

View the inbox

  1. Operations → Alerts — tenant list, sorted by last_seen_at desc.
  2. Columns: Asset · Severity · Source systems · Count · First seen · Last seen · Actions.
  3. Filters: severity, source, status, date range.

Per-alert actions

ActionEffect
Acknowledgestatus := acknowledged, records actor.
Create taskConverts into work order, linking all sources.
NotifyFires WhatsApp/email.
CloseCloses the incident. New detections start a new one.

Per-tenant config

Via Predictive Config:

  • alert_dedup_window_minutes — default 60.

Before / After

Before

[14:02:15] CRITICAL  anomaly_detection   pump-B07  vibration ensemble score 0.92
[14:02:17] CRITICAL  energy_anomaly      pump-B07  z=3.4 on kwh
[14:02:21] CRITICAL  prognostics         pump-B07  RUL 18h

After

[14:02:15→14:02:21] CRITICAL  pump-B07  [anomaly,energy,prognostics]  count=3
  └ sources:
      14:02:15  anomaly      warning→   ensemble score 0.75
      14:02:17  energy       →high      z=3.4 on kwh
      14:02:21  prognostics  →critical  RUL 18h

Limitations

  • Dedup is by asset_id. Different metrics of the same asset collapse into the same row.
  • Window slides on last_seen_at. A detection arriving outside the window opens a new row.
  • Trail captures upgrades, not repetitions. Use count for frequency stats.
  • Not a full workflow system. Shift management, assignments, SLAs live in Work Orders.

Key benefits

  • Single inbox per asset — less noise, more action.
  • Canonical max severity, auditable trail.
  • Automatic re-open on escalation post-ACK.
  • Smart hysteresis — trail reflects severity changes, not repetitions.
  • Per-tenant configurable window.
  • E2E integration tests guaranteeing the contract.

On this page