Condition Monitoring

End-to-end ISO 13374 predictive maintenance: from sensor to prognostics, with a single unified alert inbox and a health index blending 5 signals.

Condition Monitoring is Rela AI's predictive maintenance system. It goes beyond threshold alarms: it learns how each asset behaves normally, monitors health over time, detects anomalies with ML, estimates when it might fail, and consolidates all those signals into a single alert row per asset — not three for the same physical event.

Executive summary

The shift: stop reacting, start anticipating. Before, three independent systems shouted the same thing and no one knew what to do. Today, one unified inbox consolidates mechanical, energy, and RUL detections into a single alert per asset, with a canonical severity, an A/B/C/D/F health grade, and actionable recommendations.

Before vs After

| Dimension | Before (reactive) | After (Rela AI predictive) |
|---|---|---|
| Fault detection | Equipment fails → operator calls → tech arrives | 18–72h lead time with estimated RUL |
| Operator inbox | 3 alerts for the same spike (anomaly + energy + RUL) | 1 consolidated row per asset, severity = max |
| Maintenance decision | Calendar-fixed or reactive | Real condition + confidence + per-tenant thresholds |
| Early ML signal | Ignored until an ISA-18.2 alarm fired | anomaly_pressure drops AHI from the first ML event |
| Silent drift | Motor loses 0.5%/week for months undetected | Page-Hinkley catches it within 20 samples |
| Auditability | "What threshold was active when this got an A?" unanswered | Every snapshot carries a traceable config_version |

What it's for

Reactive maintenance (wait for failure) and calendar-based preventive maintenance (change oil every 3 months regardless of actual state) are the two extremes. Predictive maintenance is the sweet spot: intervene exactly when the asset needs it, based on real condition.

Condition Monitoring lets you:

  • Detect gradual degradation weeks before a failure.
  • Estimate remaining useful life of a component in days/hours.
  • Prioritize maintenance on equipment that needs it most.
  • Reduce unplanned downtime — the costliest form of production loss.
  • Consolidate the noise from multiple detection systems into an actionable inbox.

End-to-end pipeline

flowchart LR
  S[Sensors / PLC / SCADA] --> I[Ingest MQTT / OPC UA / Modbus / S7 / EtherNet-IP / HTTP]
  I --> F[Field mapping + normalization]
  F --> T[Trends + baselines]
  T --> A1[ML detection<br/>IsolationForest + LOF]
  T --> A2[Energy residuals<br/>z-score + Page-Hinkley]
  T --> H[Asset Health Index<br/>5 sub-indices]
  H --> P[RUL prognostics<br/>+ CBM triggers]
  A1 --> AG[Alert Aggregator<br/>per-asset dedup]
  A2 --> AG
  P --> AG
  AG --> INBOX[Unified inbox<br/>1 row per asset]
  INBOX --> TASK[Work order]
  INBOX --> CMMS[CMMS sync]
  INBOX --> NOTIF[WhatsApp / Email]

Each block is documented, observable, and configurable per tenant. See Anomaly Detection, Alert Aggregator, Industrial Protocols.

ISO 13374 — 6 levels

L1 — Data acquisition

Raw sensor, PLC, and SCADA data arrive via natively supported protocols: HTTP, MQTT, OPC UA (incl. Reverse Connect), Modbus TCP, S7comm, and EtherNet/IP (CIP). Readings are stored with their original timestamp and normalized for uniform downstream processing.

L2 — Trend analysis

Raw data is summarized with moving averages, min/max, rate of change, and standard deviation, turning loose readings into readable trends.
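The trend features named above can be sketched in a few lines. This is a minimal illustration, not Rela AI's implementation: the function name `trend_features`, the window size, and the sample readings are all assumptions made for the example.

```python
from statistics import mean, pstdev

def trend_features(values, timestamps, window=5):
    """Summarize the last `window` readings of one sensor channel (hypothetical helper)."""
    if len(values) != len(timestamps) or len(values) < 2:
        raise ValueError("need at least two aligned readings")
    recent = values[-window:]
    recent_t = timestamps[-window:]
    elapsed = recent_t[-1] - recent_t[0]
    return {
        "moving_avg": mean(recent),
        "min": min(recent),
        "max": max(recent),
        "std_dev": pstdev(recent),
        # Average change per time unit across the window.
        "rate_of_change": (recent[-1] - recent[0]) / elapsed if elapsed else 0.0,
    }

# Hypothetical bearing temperature in °C, sampled once a minute (timestamps in seconds).
readings = [70.0, 70.5, 71.0, 72.0, 74.0]
stamps = [0, 60, 120, 180, 240]
features = trend_features(readings, stamps)
```

A rising `rate_of_change` together with a growing `std_dev` is the kind of pattern the later levels consume; the exact features and window lengths the product uses are not specified here.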

L3 — State detection

Two complementary flows:

  • Condition vs baseline: current data compared against learned baselines.
  • ML anomaly detection: an IsolationForest + LocalOutlierFactor ensemble scores each reading from 0 to 1; readings scoring above 0.7 enter the inbox.

Each detection is translated to a canonical severity (info, warning, high, critical) — the common language for every detector.
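As a rough sketch of the inbox gate described above: the 0.7 cutoff comes from the text, but how the two detector scores are combined is not documented, so the simple average below is an assumption, as are the function names.

```python
INBOX_THRESHOLD = 0.7  # from the text: scores above 0.7 enter the inbox

def ensemble_score(iforest_score, lof_score):
    """Combine two detector scores already normalized to 0..1.

    Averaging is an assumed combination rule, used here only for illustration.
    """
    for score in (iforest_score, lof_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must be normalized to 0..1")
    return (iforest_score + lof_score) / 2

def enters_inbox(iforest_score, lof_score):
    """True when the combined score is strictly above the inbox threshold."""
    return ensemble_score(iforest_score, lof_score) > INBOX_THRESHOLD
```

With this rule, a reading that only one detector finds unusual (for example 0.9 and 0.4) stays out of the inbox, which is the usual motivation for an ensemble.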

L4 — Health assessment

The Asset Health Index (AHI) is a 0–100 score combining 5 sub-indices:

| Sub-index | Default weight | Measures |
|---|---|---|
| condition | 0.35 | Instantaneous vs baselines |
| alarm_health | 0.20 | Accumulated alarm-hours (ISA-18.2), cap per alarm |
| maintenance_compliance | 0.15 | Overdue preventive plans |
| trend_stability | 0.10 | 24h trend direction + r² |
| anomaly_pressure | 0.20 | Recent ML detection density (7d), severity-weighted |

Weights tunable per tenant and per asset_type. See Predictive Config.

Grades:

| Grade | AHI | Status |
|---|---|---|
| A | 90-100 | Excellent |
| B | 70-89 | Good |
| C | 50-69 | Acceptable |
| D | 30-49 | Unsatisfactory |
| F | 0-29 | Critical: imminent failure risk |

A/B/C/D thresholds are tenant-configurable via ahi_grades.
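To make the weighting and grading concrete, here is a minimal sketch using the default weights and grade bands from the tables above. It assumes each sub-index is itself a 0–100 score and that the AHI is a plain weighted average; the function names and the sample values are hypothetical.

```python
DEFAULT_WEIGHTS = {
    "condition": 0.35,
    "alarm_health": 0.20,
    "maintenance_compliance": 0.15,
    "trend_stability": 0.10,
    "anomaly_pressure": 0.20,
}
# Lower bound of each grade, highest first (defaults from the grade table).
DEFAULT_GRADES = [("A", 90), ("B", 70), ("C", 50), ("D", 30)]

def asset_health_index(sub_indices, weights=DEFAULT_WEIGHTS):
    """Weighted average of 0..100 sub-indices (assumed combination rule)."""
    if set(sub_indices) != set(weights):
        raise ValueError("sub-indices and weights must use the same keys")
    total_weight = sum(weights.values())
    score = sum(weights[k] * sub_indices[k] for k in weights) / total_weight
    return round(score, 1)

def grade(ahi, grades=DEFAULT_GRADES):
    """Map an AHI to a letter grade; anything under the lowest floor is F."""
    for letter, floor in grades:
        if ahi >= floor:
            return letter
    return "F"

# Hypothetical asset: healthy on maintenance, but under ML anomaly pressure.
sample = {
    "condition": 80,
    "alarm_health": 90,
    "maintenance_compliance": 100,
    "trend_stability": 60,
    "anomaly_pressure": 40,
}
ahi = asset_health_index(sample)
letter = grade(ahi)
```

Dividing by the total weight keeps the result on the 0–100 scale even if a tenant's custom weights do not sum exactly to 1.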

L5 — Prognostics

Prognostics work from the AHI history and produce: RUL with bootstrap confidence, degradation rate (AHI pts/day), a CBM trigger, and a failure probability. See Prognostics.
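One way to picture RUL with bootstrap confidence is a linear fit over the AHI history, extrapolated to the failure boundary and re-fitted on resampled histories. The sketch below assumes a linear degradation model, a failure boundary at AHI 30 (the top of grade F), and a 90% interval; none of these are confirmed as the product's actual method, and all names and data are hypothetical.

```python
import random

def fit_line(points):
    """Least-squares fit of AHI against time; returns (slope, intercept) or None."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:  # every sample shares one timestamp: no trend to fit
        return None
    slope = sum((t - mean_t) * (y - mean_y) for t, y in points) / var_t
    return slope, mean_y - slope * mean_t

def rul_days(points, now, failure_ahi=30.0):
    """Days after `now` until the fitted line reaches failure_ahi; None if not degrading."""
    fit = fit_line(points)
    if fit is None or fit[0] >= 0:
        return None
    slope, intercept = fit
    return max(0.0, (failure_ahi - intercept) / slope - now)

def bootstrap_rul(history, n_boot=500, seed=0, failure_ahi=30.0):
    """Point RUL plus a 90% bootstrap interval from resampled (day, AHI) pairs."""
    now = max(t for t, _ in history)
    point = rul_days(history, now, failure_ahi)
    if point is None:
        return None
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(history) for _ in history]
        estimate = rul_days(resample, now, failure_ahi)
        if estimate is not None:
            estimates.append(estimate)
    if not estimates:
        return None
    estimates.sort()
    low = estimates[int(0.05 * (len(estimates) - 1))]
    high = estimates[int(0.95 * (len(estimates) - 1))]
    return {
        "rul_days": point,
        "ci90": (low, high),
        "degradation_rate": fit_line(history)[0],  # AHI points per day
    }

# Hypothetical daily AHI snapshots for one asset: (day, AHI).
history = [(0, 80.0), (1, 78.5), (2, 76.0), (3, 74.2), (4, 72.0),
           (5, 69.8), (6, 68.1), (7, 66.0), (8, 64.3), (9, 62.0)]
result = bootstrap_rul(history)
```

A wide interval signals that the history is too noisy or too short for the estimate to be trusted, which is one plausible reason reliable RUL is tied to a minimum snapshot count.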

L6 — Recommendations + consolidated inbox

AI generates natural-language recommendations. Each detection is published to the Alert Aggregator, which consolidates the 3 detection systems (anomaly, energy, prognostics) into one row per asset and routes it to WhatsApp, email, CMMS, and tasks.

Predictive maturity levels

| Level | Name | Requirements | Capabilities |
|---|---|---|---|
| 0 | Monitoring | Insufficient data | Basic alerts |
| 1 | Health tracking | 10+ snapshots | AHI active, trends visible |
| 2 | Prediction | 30+ snapshots + 1 registered failure | Reliable RUL, recommendations |
| 3 | Optimized | 30+ snapshots + 3 failures + confidence > 70% | Full automation, auto-CBM |

Assets progress through the levels automatically once the requirements are met; the thresholds are tenant-configurable.
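The level requirements in the table translate directly into a small decision function. This is a sketch using the default thresholds; the function name and the representation of confidence as a 0..1 fraction are assumptions.

```python
def maturity_level(snapshots, registered_failures, confidence):
    """Return the predictive maturity level (0-3) using the default thresholds.

    `confidence` is a fraction in 0..1 (assumed representation of the 70% rule).
    """
    if snapshots >= 30 and registered_failures >= 3 and confidence > 0.70:
        return 3  # Optimized
    if snapshots >= 30 and registered_failures >= 1:
        return 2  # Prediction
    if snapshots >= 10:
        return 1  # Health tracking
    return 0      # Monitoring

levels = [
    maturity_level(5, 0, 0.0),
    maturity_level(12, 0, 0.0),
    maturity_level(40, 1, 0.5),
    maturity_level(40, 3, 0.8),
]
```

Checking the strictest level first means an asset always lands on the highest level whose requirements it fully meets.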

Audit trail: config_version

Every health snapshot and every prognostics record carries config_version — the number of the predictive configuration active at compute time. Past alerts are not rewritten when config changes. Audit can rebuild exactly which thresholds produced each historical grade.
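The audit idea can be illustrated with a snapshot that stores its config_version next to an append-only configuration history. The data shapes, the `CONFIG_HISTORY` store, and the version-2 thresholds below are hypothetical; only the config_version field and the default grade floors come from this page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HealthSnapshot:
    asset_id: str
    ahi: float
    grade: str
    config_version: int  # predictive configuration active at compute time

# Hypothetical append-only history: version -> grade floors.
CONFIG_HISTORY = {
    1: {"A": 90, "B": 70, "C": 50, "D": 30},  # defaults from the grade table
    2: {"A": 85, "B": 65, "C": 45, "D": 25},  # illustrative later tuning
}

def grade_under(ahi, floors):
    """Grade an AHI against one version's floors."""
    for letter in "ABCD":
        if ahi >= floors[letter]:
            return letter
    return "F"

def thresholds_for(snapshot, history=CONFIG_HISTORY):
    """Look up the thresholds that produced a historical snapshot."""
    return history[snapshot.config_version]

old = HealthSnapshot(asset_id="C-03", ahi=87.0, grade="B", config_version=1)
replayed = grade_under(old.ahi, thresholds_for(old))
under_latest = grade_under(old.ahi, CONFIG_HISTORY[2])
```

Here the stored grade can be reproduced from version 1 even though the same AHI would grade differently under version 2, which is why past records keep their original version instead of being recomputed.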

Use cases with measurable impact

Case 1: Compressor pre-failure caught 18h early. The AHI of compressor C-03 drops 87 → 65 over 3 weeks. Prognostics estimates 9 days until grade F. A task is created automatically; the tech finds a clogged oil filter and replaces it, and the AHI is back to 83 within 2 days.

Case 2: Silent drift. An extruder motor stays within ±2σ for 4 months while drifting +0.3%/week. Page-Hinkley flags the regime change at sample 20. The tech realigns the coupling and consumption returns to baseline.
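For readers unfamiliar with Page-Hinkley, the following is a minimal sketch of the standard test for an upward mean shift. The `delta` and `threshold` values and the synthetic signal are assumptions chosen for the example; they are not the product's tuning.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.05, threshold=2.0, min_samples=5):
        self.delta = delta          # tolerated drift per sample
        self.threshold = threshold  # alarm level for the test statistic
        self.min_samples = min_samples
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0
        self.cumulative_min = 0.0

    def update(self, x):
        """Feed one sample; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n           # running mean
        self.cumulative += x - self.mean - self.delta   # deviation sum
        self.cumulative_min = min(self.cumulative_min, self.cumulative)
        statistic = self.cumulative - self.cumulative_min
        return self.n >= self.min_samples and statistic > self.threshold

# Synthetic consumption: 30 stable samples, then a slow upward drift.
signal = [100.0] * 30 + [100.0 + 0.3 * k for k in range(1, 21)]

detector = PageHinkley()
first_alarm = None
for index, value in enumerate(signal):
    if detector.update(value):
        first_alarm = index
        break
```

Because the statistic accumulates small deviations, it reacts to a drift that never leaves a fixed ±2σ band, which is the failure mode of per-sample threshold checks.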

Case 3: Unified inbox. A vibration spike on pump B-07 fires 3 detectors. Before: 3 alerts. After: 1 row with source_systems = [anomaly, energy, prognostics]. See Unified Inbox case.
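The consolidation in this case can be sketched as a per-asset merge inside a time window, keeping the maximum severity. The severity order follows the canonical list on this page and `alert_dedup_window_minutes` is the setting named in the rollout; the 30-minute default, the window semantics, and the record shapes are assumptions.

```python
from datetime import datetime, timedelta

SEVERITY_ORDER = ["info", "warning", "high", "critical"]

def consolidate(detections, alert_dedup_window_minutes=30):
    """Merge detections for the same asset within the window into one inbox row."""
    window = timedelta(minutes=alert_dedup_window_minutes)
    rows = []
    open_rows = {}  # asset_id -> most recent row for that asset
    for det in sorted(detections, key=lambda d: d["at"]):
        row = open_rows.get(det["asset_id"])
        if row is None or det["at"] - row["first_seen"] > window:
            row = {
                "asset_id": det["asset_id"],
                "severity": det["severity"],
                "source_systems": [],
                "first_seen": det["at"],
            }
            rows.append(row)
            open_rows[det["asset_id"]] = row
        # Row severity is the maximum across merged detections.
        if SEVERITY_ORDER.index(det["severity"]) > SEVERITY_ORDER.index(row["severity"]):
            row["severity"] = det["severity"]
        if det["source"] not in row["source_systems"]:
            row["source_systems"].append(det["source"])
    return rows

t0 = datetime(2024, 1, 1, 8, 0)  # arbitrary example timestamp
spike = [
    {"asset_id": "B-07", "source": "anomaly", "severity": "high", "at": t0},
    {"asset_id": "B-07", "source": "energy", "severity": "warning",
     "at": t0 + timedelta(minutes=1)},
    {"asset_id": "B-07", "source": "prognostics", "severity": "critical",
     "at": t0 + timedelta(minutes=2)},
]
inbox = consolidate(spike)
```

A detection arriving after the window closes opens a new row for the asset, so a recurring fault is not hidden behind an old alert.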

Rollout

  • Phase 1: connect sources, accumulate 7-14 days of data, review trends.
  • Phase 2: compute baselines, train ML anomaly models, turn on health assessment.
  • Phase 3: enable RUL prognostics, tune alert_dedup_window_minutes, wire recommendations.

Key benefits

  • Reactive → predictive with days of lead time.
  • 5-signal health index including early ML detection.
  • Single inbox — cross-system consolidation with max severity.
  • Per-tenant and per-asset-type thresholds.
  • Full config_version traceability.
  • Gradual drift detection (Page-Hinkley).
  • ISO 13374 standard-aligned.
