Anomaly Detection

ML-based anomaly detection (IsolationForest + LocalOutlierFactor). Trains on your own data and warns you when an asset behaves unusually — before threshold alarms fire.

This page covers the predictive pipeline anomaly detector (IsolationForest + LocalOutlierFactor on industrial asset sensors). For anomalies in quality control (SPC, manufacturing processes), see quality/anomaly-detection.

Rela AI's anomaly detection system finds unusual patterns in your equipment data before an ISA-18.2 alarm fires, before a trend is visible by eye, and before the equipment fails. No fixed thresholds — it learns from your data what "normal" looks like and flags anything that doesn't fit.

What it's for

Traditional alarms only fire when a value crosses a fixed limit ("temperature > 80°C"). But many industrial problems never cross a threshold until it's too late. A bearing with early wear may vibrate 12% above its usual level: well within any alarm range, but clearly anomalous when compared with the regime the asset has been running in.

ML anomaly detection catches exactly that case. Trained on the asset's own history, it distinguishes normal variation from genuinely unusual behavior.

Concrete benefits:

  • Extended lead time: detects degradation weeks earlier than classical alarms.
  • Per-asset adaptation: what's normal for pump A may be anomalous for pump B; the model is per-asset + per-metric.
  • Feeds the AHI: each detection drains points from anomaly_pressure, lowering the asset's grade even without ISA alarms (see Condition Monitoring).
  • Feeds the unified inbox: warning+ detections reach the Alert Aggregator with canonical severity.

How it works — IsolationForest + LOF ensemble

flowchart LR
  H[Reading history<br/>per asset + metric] --> E[Feature extraction<br/>value, mean, std, rate-of-change]
  E --> M1[IsolationForest<br/>isolates global outliers]
  E --> M2[LocalOutlierFactor<br/>detects local outliers]
  M1 --> B[Boost by asset<br/>criticality]
  M2 --> B
  B --> S{Score ≥ 0.7?}
  S -- No --> X[Discard]
  S -- Yes --> D[Persist detection<br/>with canonical severity]
  D --> AG[Alert Aggregator]
  D --> AHI[anomaly_pressure<br/>of AHI]

IsolationForest: builds random trees that isolate "rare" points in few splits. Good for global outliers (extreme spikes, impossible values).

LocalOutlierFactor (LOF): compares a point's local density against its neighbors. Good for regime changes — values that used to be normal but today sit outside the usual neighborhood.
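
To make the density comparison concrete, here is a minimal pure-Python sketch of the textbook LOF computation on one-dimensional readings. It is an illustration only: the function name `lof_scores`, the neighbor count `k`, and the sample values are assumptions, and the production detector presumably relies on a library implementation rather than this O(n²) loop.

```python
from statistics import mean


def lof_scores(values, k=3):
    """Return the Local Outlier Factor of each 1-D reading (about 1.0 = inlier)."""
    n = len(values)
    neighbours, k_dist = [], []
    for i in range(n):
        # Distances from point i to every other point, nearest first.
        dists = sorted((abs(values[i] - values[j]), j) for j in range(n) if j != i)
        neighbours.append([j for _, j in dists[:k]])
        k_dist.append(dists[k - 1][0])  # distance to the k-th nearest neighbour

    def reach(i, j):
        # Reachability distance of i from j: never smaller than j's k-distance.
        return max(k_dist[j], abs(values[i] - values[j]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [1.0 / mean(reach(i, j) for j in neighbours[i]) for i in range(n)]
    # LOF: how much denser the neighbours are than the point itself.
    return [mean(lrd[j] for j in neighbours[i]) / lrd[i] for i in range(n)]


# Hypothetical vibration readings: a tight cluster plus one reading in a new regime.
readings = [2.0, 2.1, 2.2, 2.05, 2.15, 1.95, 2.9]
scores = lof_scores(readings, k=3)
```

Readings inside the cluster score close to 1, while the 2.9 reading scores several times higher because its neighbors sit in a much denser region. The sketch assumes distinct values; duplicated readings would need a guard against zero distances.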

Ensemble: combining both covers the two most common failure modes. Final score between 0 and 1.
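
The feature-extraction stage in the flowchart (value, mean, std, rate-of-change) could look roughly like the sketch below. The function name `extract_features`, the count-based `window`, and the per-sample rate of change are assumptions made for illustration; the real pipeline uses a time-based feature window, as described under "Tune sensitivity".

```python
from statistics import mean, pstdev


def extract_features(readings, window=5):
    """Build one feature row per reading once `window` readings are available.

    Each row is (value, rolling mean, rolling std, rate of change).
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    rows = []
    for i in range(window - 1, len(readings)):
        recent = readings[i - window + 1 : i + 1]  # last `window` readings
        value = readings[i]
        rate = value - readings[i - 1]  # change since the previous reading
        rows.append((value, mean(recent), pstdev(recent), rate))
    return rows


# Hypothetical series: four steady readings followed by a jump.
rows = extract_features([2.0, 2.0, 2.0, 2.0, 3.0], window=5)
```

Each row would then be passed to both detectors, so that a point can stand out either by its raw value or by how it relates to its recent context.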

Asset criticality boost

Not all assets are equal. An anomaly on a critical compressor weighs more than the same anomaly on a low-priority fan. The asset's criticality boosts the score: a raw 0.75 on a critical asset can become a final 0.85 and cross the high threshold (0.8).
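
The exact boost formula is not documented here. As a hedged sketch, assume an additive boost per criticality level, capped at 1.0. Only the +0.10 for critical is chosen to reproduce the 0.75 → 0.85 example above; the other values and the names `CRITICALITY_BOOST` and `boost_score` are hypothetical.

```python
# Hypothetical additive boosts. Only "critical" (+0.10) is derived from the
# 0.75 -> 0.85 example in the text; the rest are placeholders.
CRITICALITY_BOOST = {"low": 0.0, "medium": 0.03, "high": 0.06, "critical": 0.10}


def boost_score(raw_score, criticality):
    """Raise an ensemble score according to asset criticality, capped at 1.0."""
    if not 0.0 <= raw_score <= 1.0:
        raise ValueError("raw_score must be between 0 and 1")
    return min(1.0, raw_score + CRITICALITY_BOOST[criticality])
```

Under these assumptions, `boost_score(0.75, "critical")` lands at about 0.85, while the same raw score on a low-criticality asset stays at 0.75.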

Canonical severity

The boosted score is translated to the common severity language:

Boosted score | Canonical severity
------------- | ------------------
≥ 0.9         | critical
≥ 0.8         | high
≥ 0.7         | warning
< 0.7         | info (discarded; does not enter the inbox)

The same scale is used by prognostics (over RUL) and energy (over z-score). That lets the Alert Aggregator collapse the 3 signals into a single row per asset.
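
The score-to-severity mapping above can be written as a small function. The thresholds come straight from the table; the function name `canonical_severity` is an assumption.

```python
def canonical_severity(boosted_score):
    """Map a boosted anomaly score (0..1) to the canonical severity label."""
    if boosted_score >= 0.9:
        return "critical"
    if boosted_score >= 0.8:
        return "high"
    if boosted_score >= 0.7:
        return "warning"
    return "info"  # discarded: does not enter the inbox
```

For example, the scores in the use cases further down (0.84, 0.91, 0.73) map to high, critical, and warning respectively.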

Integration with AHI's anomaly_pressure

Each persisted detection counts toward anomaly_pressure, one of the 5 AHI sub-indices:

  • critical → −8 points
  • high → −4 points
  • warning → −1.5 points
  • info → 0

Detections are counted over a 7-day window, and the sub-index has a floor of 10: even an asset flooded with critical detections keeps some headroom, because the other sub-indices still matter.
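
A rough sketch of how these penalties, the 7-day window, and the floor could combine. The penalty values, window, and floor are from the text above; the starting value of 100 and the names `anomaly_pressure`, `PENALTY`, and `START` are assumptions, since the page does not state the sub-index scale.

```python
from datetime import datetime, timedelta

PENALTY = {"critical": 8.0, "high": 4.0, "warning": 1.5, "info": 0.0}
WINDOW = timedelta(days=7)
FLOOR = 10.0
START = 100.0  # assumption: the sub-index starts from a full score of 100


def anomaly_pressure(detections, now):
    """Compute the sub-index from (timestamp, severity) pairs in the window."""
    cutoff = now - WINDOW
    drained = sum(PENALTY[sev] for ts, sev in detections if cutoff <= ts <= now)
    return max(FLOOR, START - drained)


now = datetime(2024, 1, 15)
recent = [
    (now - timedelta(days=1), "critical"),   # counts: -8
    (now - timedelta(days=2), "high"),       # counts: -4
    (now - timedelta(days=10), "critical"),  # outside the 7-day window
]
score = anomaly_pressure(recent, now)  # 88.0 under the assumptions above
```

With twenty critical detections in the window the raw result would be negative, so the floor keeps the sub-index at 10.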

Training strategy

The model learns from the client

Rela AI does not ship generic pre-trained models. Each tenant trains on its own historical data, because what's normal in a cement plant is not what's normal in a food plant.

  • Minimum dataset: 200 readings per (asset, metric) — typically 2–3 weeks of capture.
  • Retraining: models automatically retrain when RUL drift is detected (Page-Hinkley on prognostics residuals) or when a failure event closes (feedback loop).
  • Persistence: each model is versioned and tied to asset_id + metric_name.
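
For readers unfamiliar with the Page-Hinkley test mentioned in the retraining rule, here is a minimal sketch of the standard one-sided version, which flags an upward shift in the mean of a stream. The class name and the `delta` and `threshold` values are illustrative assumptions, not the settings used in production.

```python
class PageHinkley:
    """One-sided Page-Hinkley test: detects an increase in the stream mean."""

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated drift per sample
        self.threshold = threshold  # alarm level for the test statistic
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0       # running sum of deviations from the mean
        self.minimum = 0.0          # smallest cumulative sum seen so far

    def update(self, x):
        """Feed one residual; return True when drift is detected."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cumulative += x - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold
```

Fed with prognostics residuals, a `True` result would be the signal to schedule retraining. Detecting downward shifts as well would need a second, mirrored detector.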

Cold start

While an asset lacks data, ML detection is dormant for it. AHI keeps computing without anomaly_pressure (effective weight 0) until the model is ready. Maturity Levels reflect that state.

How to use it

See active anomalies

  1. Assets → list: the "Anomalies 7d" column shows counts per asset.
  2. Asset → detail: the Anomalies tab shows a score-vs-time scatter plot with filters.
  3. Inbox: consolidated alerts via the Alert Aggregator.

Tune sensitivity

Per-model:

  • Score threshold (default 0.7).
  • Feature window (default 24h).
  • Asset criticality (low/medium/high/critical).

UI: Asset → Anomalies → Configure.
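
As a mental model of the three per-model settings, the sketch below groups them in a validated dataclass. The defaults for the threshold and window come from the list above; the class name, field names, and the default criticality are assumptions, and this is not the actual configuration API.

```python
from dataclasses import dataclass

CRITICALITY_LEVELS = ("low", "medium", "high", "critical")


@dataclass(frozen=True)
class DetectorConfig:
    score_threshold: float = 0.7     # default from the docs
    feature_window_hours: int = 24   # default from the docs
    criticality: str = "medium"      # assumption: no default is documented

    def __post_init__(self):
        if not 0.0 <= self.score_threshold <= 1.0:
            raise ValueError("score_threshold must be between 0 and 1")
        if self.feature_window_hours <= 0:
            raise ValueError("feature_window_hours must be positive")
        if self.criticality not in CRITICALITY_LEVELS:
            raise ValueError(f"criticality must be one of {CRITICALITY_LEVELS}")


# A stricter setup for a critical compressor (hypothetical values).
strict = DetectorConfig(score_threshold=0.8, criticality="critical")
```

Raising the threshold reduces the number of warning-level detections; lowering it makes the detector more sensitive at the cost of more false positives.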

Automatic actions

A rule can turn a critical anomaly into: work order, WhatsApp/email notification, supervisor escalation, CMMS sync.

See Event rules.

Limitations & assumptions

  • Models are per-asset + per-metric. No parameter sharing across similar assets; that protects against over-generalization but requires data per combination.
  • Sparse data degrades signal. Below ~200 points the detection stays dormant or produces false positives until convergence.
  • Sensor staleness. If the sensor watchdog marks the source as stale, anomalies from that source are suppressed and don't drain AHI — dead hardware must not falsely tank asset health.
  • Doesn't replace root cause analysis. The system tells you what is anomalous and when, not always the why. AI recommendation offers hypotheses; physical inspection is the final step.
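
The sensor-staleness rule in the list above amounts to a filter applied before detections reach the inbox or the AHI. A minimal sketch, assuming each detection is a dict with a hypothetical `source_id` field and that the watchdog exposes a collection of stale source ids:

```python
def drop_stale(detections, stale_sources):
    """Keep only detections whose source is not flagged stale by the watchdog."""
    stale = set(stale_sources)
    return [d for d in detections if d["source_id"] not in stale]


# Hypothetical detections: the second one comes from a dead sensor.
detections = [
    {"source_id": "vib-01", "score": 0.84},
    {"source_id": "temp-07", "score": 0.95},
]
kept = drop_stale(detections, stale_sources={"temp-07"})
```

Only the detections that pass this filter would count toward anomaly_pressure.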

Use cases

Case 1 — Early bearing wear. Pump B-07: vibration_rms avg 2.1 mm/s, std 0.3. Suddenly consecutive readings at 2.9 mm/s. Within ISA limits (under 4 mm/s). LOF detects regime change. Score 0.84 → high. Inbox flags it, AHI drops 4 points, tech inspects and finds incipient wear in rear bearing.
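
A quick check with the Case 1 figures shows why the fixed limit stays silent while the reading is still far from the asset's own baseline. The z-score here is only an illustrative stand-in for "distance from normal"; it is not how LOF computes its score.

```python
baseline_mean = 2.1  # mm/s, vibration_rms average from Case 1
baseline_std = 0.3   # mm/s
reading = 2.9        # mm/s, the new consecutive readings
isa_limit = 4.0      # mm/s, fixed alarm limit

fixed_alarm = reading > isa_limit                    # threshold check
z_score = (reading - baseline_mean) / baseline_std   # deviation from baseline

print(fixed_alarm, round(z_score, 2))  # False 2.67
```

The threshold alarm does not fire, yet the reading sits more than two and a half standard deviations above the asset's usual level.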

Case 2 — Internal valve leak. Compressor C-03: pressure_avg normal, temperature_avg rising 0.3°C/day in correlated fashion. IsolationForest flags the combined temperature × time point as global outlier. Score 0.91 → critical. Investigation reveals internal leak in discharge valve.

Case 3 — Useful false positive. System flags score 0.73 on extractor X-02 during night shift. Tech reviews: fan was running in planned reduced mode (undocumented setpoint change). Not a fault, but a real regime change. Engineering team adds the new setpoint to operational log; model learns it on next retraining window.

Key benefits

  • Catches patterns fixed thresholds can't see — extended lead time.
  • Per-tenant, per-asset, per-metric: the model speaks your plant's language.
  • IsolationForest + LOF ensemble covers global outliers and regime changes.
  • Canonical severity compatible with the whole system.
  • Feeds anomaly_pressure of the AHI — health drops before ISA alarms fire.
  • Automatic consolidation in the Alert Aggregator.
