End-to-end ISO 13374 predictive maintenance: from sensor to prognostics, with a single unified alert inbox and a health index blending 5 signals.

Condition Monitoring

Condition Monitoring is Rela AI's predictive maintenance system. It goes beyond threshold alarms: it learns how each asset behaves normally, monitors health over time, detects anomalies with ML, estimates when it might fail, and consolidates all those signals into a single alert row per asset — not three for the same physical event.

Executive summary

The shift: stop reacting, start anticipating. Before, three independent systems shouted the same thing and no one knew what to do. Today, one unified inbox consolidates mechanical, energy, and RUL detections into a single alert per asset, with a canonical severity, an A/B/C/D/F health grade, and actionable recommendations.

Before vs After

Dimension	Before (reactive)	After (Rela AI predictive)
Fault detection	Equipment fails → operator calls → tech arrives	18–72h lead time with estimated RUL
Operator inbox	3 alerts for the same spike (anomaly + energy + RUL)	1 consolidated row per asset, severity = max
Maintenance decision	Calendar-fixed or reactive	Real condition + confidence + per-tenant thresholds
Early ML signal	Ignored until an ISA-18.2 alarm fired	`anomaly_pressure` drops AHI from the first ML event
Silent drift	Motor loses 0.5%/week for months undetected	Page-Hinkley catches it within 20 samples
Auditability	"What threshold was active when this got an A?" unanswered	Every snapshot carries a traceable `config_version`

What it's for

Reactive maintenance (wait for failure) and calendar-based preventive maintenance (change oil every 3 months regardless of actual state) are the two extremes. Predictive maintenance is the sweet spot: intervene exactly when the asset needs it, based on real condition.

Condition Monitoring lets you:

Detect gradual degradation weeks before a failure.
Estimate remaining useful life of a component in days/hours.
Prioritize maintenance on equipment that needs it most.
Reduce unplanned downtime — the costliest form of production loss.
Consolidate the noise from multiple detection systems into an actionable inbox.

End-to-end pipeline

flowchart LR
  S[Sensors / PLC / SCADA] --> I[Ingest MQTT / OPC UA / Modbus / S7 / EtherNet-IP / HTTP]
  I --> F[Field mapping + normalization]
  F --> T[Trends + baselines]
  T --> A1[ML detection<br/>IsolationForest + LOF]
  T --> A2[Energy residuals<br/>z-score + Page-Hinkley]
  T --> H[Asset Health Index<br/>5 sub-indices]
  H --> P[RUL prognostics<br/>+ CBM triggers]
  A1 --> AG[Alert Aggregator<br/>per-asset dedup]
  A2 --> AG
  P --> AG
  AG --> INBOX[Unified inbox<br/>1 row per asset]
  INBOX --> TASK[Work order]
  INBOX --> CMMS[CMMS sync]
  INBOX --> NOTIF[WhatsApp / Email]

Each block is documentable, observable, and per-tenant configurable. See Anomaly Detection, Alert Aggregator, Industrial Protocols.

ISO 13374 — 6 levels

L1 — Data acquisition

Raw sensor, PLC, and SCADA data arrive via natively supported protocols: HTTP, MQTT, OPC UA (incl. Reverse Connect), Modbus TCP, S7comm, EtherNet/IP (CIP). Readings are stored with their original timestamp and normalized for uniform downstream processing.

L2 — Trend analysis

Moving averages, min/max, rate of change, and standard deviation are computed over the raw data, turning loose readings into readable trends.
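As a rough illustration of the statistics listed above, the following sketch summarizes the most recent readings of one signal. The function name `trend_features`, the window size, and the use of seconds for timestamps are assumptions made for the example, not part of the product.

```python
from statistics import mean, pstdev

def trend_features(values, timestamps, window=5):
    """Summarize the most recent `window` readings of a single signal."""
    recent = values[-window:]
    recent_t = timestamps[-window:]
    elapsed = recent_t[-1] - recent_t[0]
    return {
        "moving_avg": mean(recent),
        "min": min(recent),
        "max": max(recent),
        "std_dev": pstdev(recent),
        # Change per second across the window; 0.0 if the timestamps coincide.
        "rate_of_change": (recent[-1] - recent[0]) / elapsed if elapsed else 0.0,
    }

# Hypothetical temperature readings, one per minute (timestamps in seconds).
temps = [70.0, 70.5, 71.0, 72.0, 74.0, 77.0]
ts = [0, 60, 120, 180, 240, 300]
features = trend_features(temps, ts, window=5)
```

A rising `rate_of_change` together with a growing `std_dev` is the kind of pattern the later levels can compare against a baseline.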

L3 — State detection

Two complementary flows:

Condition vs baseline: current data compared against learned baselines.
ML anomaly detection: an IsolationForest + LocalOutlierFactor ensemble scores each reading from 0 to 1; readings scoring above 0.7 enter the inbox.
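The scoring step can be pictured with the sketch below. Only the 0 to 1 range and the 0.7 cutoff come from the text; the normalization against reference scores, the simple averaging of the two detectors, and every function name are assumptions, and the raw scores are made-up numbers rather than output from a real model.

```python
INBOX_THRESHOLD = 0.7  # from the text: scores above 0.7 enter the inbox

def normalize(raw, normal_ref, anomalous_ref):
    """Map a raw detector score to 0..1 (0 = normal_ref, 1 = anomalous_ref).

    Works whether the detector's raw score rises or falls for anomalies,
    because the direction is encoded in the two reference values.
    """
    span = anomalous_ref - normal_ref
    if span == 0:
        raise ValueError("reference scores must differ")
    return min(1.0, max(0.0, (raw - normal_ref) / span))

def ensemble_score(iforest_score, lof_score):
    """Average two normalized detector scores (assumed combination rule)."""
    return (iforest_score + lof_score) / 2

def enters_inbox(score):
    return score > INBOX_THRESHOLD

# Hypothetical raw scores where lower means more anomalous.
s1 = normalize(-0.65, normal_ref=-0.40, anomalous_ref=-0.70)
s2 = normalize(-2.6, normal_ref=-1.0, anomalous_ref=-3.0)
score = ensemble_score(s1, s2)
```

Averaging is only one way to combine detectors; taking the maximum would be more sensitive at the cost of more inbox rows.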

Each detection is translated to a canonical severity (info, warning, high, critical) — the common language for every detector.

L4 — Health assessment

AHI 0–100 combining 5 sub-indices:

Sub-index	Default weight	Measures
`condition`	0.35	Instantaneous vs baselines
`alarm_health`	0.20	Accumulated alarm-hours (ISA-18.2), cap per alarm
`maintenance_compliance`	0.15	Overdue preventive plans
`trend_stability`	0.10	24h trend direction + r²
`anomaly_pressure`	0.20	Recent ML detection density (7d), severity-weighted

Weights tunable per tenant and per asset_type. See Predictive Config.

Grades:

Grade	AHI	Status
A	90-100	Excellent
B	70-89	Good
C	50-69	Acceptable
D	30-49	Unsatisfactory
F	0-29	Critical — imminent failure risk

A/B/C/D thresholds are tenant-configurable via `ahi_grades`.
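To make the weighting and grading concrete, here is a minimal sketch that uses the default weights and grade boundaries from the tables above. It assumes each sub-index is already expressed on a 0 to 100 scale and that the AHI is a plain weighted sum; the text does not state the exact combination rule, and the function names are illustrative.

```python
DEFAULT_WEIGHTS = {
    "condition": 0.35,
    "alarm_health": 0.20,
    "maintenance_compliance": 0.15,
    "trend_stability": 0.10,
    "anomaly_pressure": 0.20,
}

# Lower bound of each grade, taken from the grade table; below D is F.
DEFAULT_GRADES = {"A": 90, "B": 70, "C": 50, "D": 30}

def asset_health_index(sub_indices, weights=DEFAULT_WEIGHTS):
    """Weighted sum of sub-indices, each assumed to be on a 0..100 scale."""
    if set(sub_indices) != set(weights):
        raise ValueError("sub-indices and weights must use the same keys")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * sub_indices[name] for name in weights)

def grade(ahi, thresholds=DEFAULT_GRADES):
    """Return the first grade whose lower bound the AHI reaches."""
    for letter, lower in sorted(thresholds.items(), key=lambda kv: kv[1], reverse=True):
        if ahi >= lower:
            return letter
    return "F"

# Hypothetical asset: healthy on maintenance, but ML detections are piling up.
ahi = asset_health_index({
    "condition": 80,
    "alarm_health": 90,
    "maintenance_compliance": 100,
    "trend_stability": 60,
    "anomaly_pressure": 50,
})
```

In this example the low `anomaly_pressure` value pulls the index down even though no alarm-related sub-index has degraded, which is the early-signal behavior described in the comparison table.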

L5 — Prognostics

Based on AHI history: RUL with bootstrap confidence, degradation rate (AHI pts/day), CBM trigger, failure probability. See Prognostics.
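One way to picture RUL with bootstrap confidence is the sketch below. It assumes linear degradation, treats AHI 30 (the top of grade F) as the failure threshold, and resamples history points with replacement to get an interval. These are illustrative choices, not a description of the actual model, and all names are hypothetical.

```python
import random

def linear_fit(points):
    """Least-squares slope and intercept for (day, ahi) pairs, or None if degenerate."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        return None
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    return slope, my - slope * mx

def rul_days(points, now, threshold=30.0):
    """Days from `now` until the fitted line reaches `threshold`; None if not degrading."""
    fit = linear_fit(points)
    if fit is None or fit[0] >= 0:
        return None
    slope, intercept = fit
    return max(0.0, (threshold - intercept) / slope - now)

def bootstrap_rul(points, now, n_boot=500, seed=0, threshold=30.0):
    """Approximate 5th and 95th percentile of RUL over resampled histories."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(points) for _ in points]
        est = rul_days(sample, now, threshold)
        if est is not None:
            estimates.append(est)
    if not estimates:
        return None
    estimates.sort()
    last = len(estimates) - 1
    return estimates[int(0.05 * last)], estimates[int(0.95 * last)]

# Hypothetical history: AHI falls about 2 points/day with small alternating noise.
history = [(d, 80.0 - 2.0 * d + (0.5 if d % 2 == 0 else -0.5)) for d in range(11)]
degradation_rate = linear_fit(history)[0]      # AHI points per day
point_estimate = rul_days(history, now=10)
interval = bootstrap_rul(history, now=10)
```

A narrow interval suggests the trend is stable enough to schedule against; a wide one is a reason to wait for more snapshots before acting on the estimate.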

L6 — Recommendations + consolidated inbox

AI generates natural-language recommendations. Each detection is published to the Alert Aggregator, which consolidates across the 3 systems into one row per asset and routes to WhatsApp/email/CMMS/tasks.
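The consolidation rule (one row per asset, severity equal to the maximum, sources listed together) can be sketched as follows. The `Detection` and `InboxRow` types, the sliding dedup window, and its 30-minute default are assumptions for illustration; only the severity names, the three source systems, and the max-severity rule come from the text.

```python
from dataclasses import dataclass

SEVERITY_ORDER = ["info", "warning", "high", "critical"]

@dataclass
class Detection:
    asset_id: str
    source_system: str    # "anomaly", "energy", or "prognostics"
    severity: str
    timestamp_min: float  # minutes since an arbitrary epoch

@dataclass
class InboxRow:
    asset_id: str
    severity: str
    source_systems: list
    last_seen_min: float

def aggregate(detections, dedup_window_minutes=30):
    """Merge detections for the same asset that fall within the dedup window."""
    rows = []
    open_rows = {}  # asset_id -> most recent InboxRow for that asset
    for d in sorted(detections, key=lambda det: det.timestamp_min):
        row = open_rows.get(d.asset_id)
        if row is None or d.timestamp_min - row.last_seen_min > dedup_window_minutes:
            # No recent row for this asset: open a new one.
            row = InboxRow(d.asset_id, d.severity, [d.source_system], d.timestamp_min)
            open_rows[d.asset_id] = row
            rows.append(row)
            continue
        # Same physical event: keep the highest severity and record the source.
        if SEVERITY_ORDER.index(d.severity) > SEVERITY_ORDER.index(row.severity):
            row.severity = d.severity
        if d.source_system not in row.source_systems:
            row.source_systems.append(d.source_system)
        row.last_seen_min = d.timestamp_min
    return rows

rows = aggregate([
    Detection("B-07", "anomaly", "high", 0),
    Detection("B-07", "energy", "warning", 2),
    Detection("B-07", "prognostics", "critical", 5),
    Detection("C-03", "energy", "info", 3),
])
```

Here the three B-07 detections collapse into one row with severity `critical`, while the unrelated C-03 detection keeps its own row.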

Predictive maturity levels

Level	Name	Requirements	Capabilities
0	Monitoring	Insufficient data	Basic alerts
1	Health tracking	10+ snapshots	AHI active, trends visible
2	Prediction	30+ snapshots + 1 registered failure	Reliable RUL, recommendations
3	Optimized	30+ snapshots + 3 failures + confidence > 70%	Full automation, auto-CBM

Progression between levels is automatic; the thresholds are tenant-configurable.
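Read as a rule, the maturity table might look like the sketch below. It hard-codes the default requirements from the table and assumes confidence is passed as a fraction between 0 and 1; the function name and signature are hypothetical.

```python
def maturity_level(snapshots, failures, confidence):
    """Return the predictive maturity level (0..3) using the default thresholds."""
    if snapshots >= 30 and failures >= 3 and confidence > 0.70:
        return 3  # Optimized
    if snapshots >= 30 and failures >= 1:
        return 2  # Prediction
    if snapshots >= 10:
        return 1  # Health tracking
    return 0      # Monitoring
```

For example, an asset with 45 snapshots and one registered failure sits at level 2 regardless of confidence, and only moves to level 3 after two more failures are registered and confidence exceeds 70%.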

Audit trail: `config_version`

Every health snapshot and every prognostics record carries `config_version`, the number of the predictive configuration active at compute time. Past alerts are not rewritten when config changes. Audit can rebuild exactly which thresholds produced each historical grade.
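A small sketch of how such an audit lookup could work is shown below. The `HealthSnapshot` type, the in-memory `CONFIG_HISTORY` store, and the version 2 thresholds are all hypothetical; only the idea that each snapshot records its `config_version` comes from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HealthSnapshot:
    asset_id: str
    ahi: float
    grade: str
    config_version: int

# Hypothetical append-only history: version -> grade lower bounds active at the time.
CONFIG_HISTORY = {
    1: {"A": 90, "B": 70, "C": 50, "D": 30},
    2: {"A": 92, "B": 75, "C": 55, "D": 30},
}

def grade_for(ahi, thresholds):
    """Return the first grade whose lower bound the AHI reaches, else F."""
    for letter, lower in sorted(thresholds.items(), key=lambda kv: kv[1], reverse=True):
        if ahi >= lower:
            return letter
    return "F"

def explain(snapshot):
    """Rebuild the grade from the thresholds that were active for this snapshot."""
    thresholds = CONFIG_HISTORY[snapshot.config_version]
    return {"thresholds": thresholds, "reproduced_grade": grade_for(snapshot.ahi, thresholds)}

snap = HealthSnapshot("C-03", ahi=91.0, grade="A", config_version=1)
audit = explain(snap)
```

Under the stricter version 2 thresholds the same AHI would grade as B, which is why the stored version, not the current configuration, has to drive the audit.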

Use cases with measurable impact

Case 1: Compressor pre-failure caught 18h early. Compressor C-03 AHI drops 87 → 65 over 3 weeks. Prognostics estimates 9 days until F. A task is created automatically and the technician finds a clogged oil filter. After replacement, AHI recovers to 83 within 2 days.

Case 2: Silent drift. An extruder motor stays within ±2σ for 4 months while its consumption drifts +0.3%/week. Page-Hinkley catches the regime change at sample 20. The technician realigns the coupling and consumption returns to baseline.
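For readers unfamiliar with Page-Hinkley, the sketch below shows the textbook one-sided form of the test, which accumulates deviations from the running mean and raises an alarm when the accumulated value rises far enough above its minimum. The `delta` and `threshold` values and the synthetic series are illustrative assumptions, not the product's tuning, so the detection point here will differ from the sample counts quoted in this page.

```python
class PageHinkley:
    """One-sided Page-Hinkley test for an upward shift in the mean."""

    def __init__(self, delta=0.005, threshold=1.0):
        self.delta = delta          # tolerated drift per sample (illustrative value)
        self.threshold = threshold  # alarm level, often written lambda (illustrative value)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one sample; return True once a sustained upward shift is detected."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold

# Synthetic consumption: 50 stable samples, then a slow upward drift.
stable = [100.0] * 50
drifting = [100.0 + 0.3 * k for k in range(1, 31)]
detector = PageHinkley()
alarms = [i for i, x in enumerate(stable + drifting) if detector.update(x)]
first_alarm = alarms[0] if alarms else None
```

Because the test reacts to accumulated deviation rather than to any single reading, it can flag a drift whose individual samples would never cross a fixed ±2σ band.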

Case 3: Unified inbox. A vibration spike on pump B-07 fires 3 detectors. Before: 3 alerts. After: 1 row with `source_systems = [anomaly, energy, prognostics]`. See Unified Inbox case.

Rollout

Phase 1: connect sources, accumulate 7-14 days of data, review trends.
Phase 2: compute baselines, train ML anomaly models, turn on health assessment.
Phase 3: enable RUL prognostics, tune `alert_dedup_window_minutes`, wire recommendations.

Key benefits

Reactive → predictive with days of lead time.
5-signal health index including early ML detection.
Single inbox — cross-system consolidation with max severity.
Per-tenant and per-asset-type thresholds.
Full `config_version` traceability.
Gradual drift detection (Page-Hinkley).
ISO 13374 standard-aligned.
