Anomaly Detection
ML-based anomaly detection (IsolationForest + LocalOutlierFactor). Trains on your own data and warns you when an asset behaves unusually — before threshold alarms fire.
This page covers the predictive pipeline anomaly detector (IsolationForest + LocalOutlierFactor on industrial asset sensors). For anomalies in quality control (SPC, manufacturing processes), see quality/anomaly-detection.
Rela AI's anomaly detection system finds unusual patterns in your equipment data before an ISA-18.2 alarm fires, before a trend is visible by eye, and before the equipment fails. No fixed thresholds — it learns from your data what "normal" looks like and flags anything that doesn't fit.
What it's for
Traditional alarms only fire when a value crosses a fixed limit ("temperature > 80°C"). But many industrial problems never cross a threshold until it's too late. A bearing with early wear may vibrate 12% above its usual level: well within any alarm range, but clearly anomalous when compared with the regime the asset normally runs in.
ML anomaly detection catches exactly that case. Trained on the asset's own history, it distinguishes normal variation from genuinely unusual behavior.
Concrete benefits:
- Extended lead time: detects degradation weeks earlier than classical alarms.
- Per-asset adaptation: what's normal for pump A may be anomalous for pump B; the model is per-asset + per-metric.
- Feeds the AHI: each detection drains points from anomaly_pressure, lowering the asset's grade even without ISA alarms (see Condition Monitoring).
- Feeds the unified inbox: warning+ detections reach the Alert Aggregator with canonical severity.
How it works — IsolationForest + LOF ensemble
flowchart LR
H[Reading history<br/>per asset + metric] --> E[Feature extraction<br/>value, mean, std, rate-of-change]
E --> M1[IsolationForest<br/>isolates global outliers]
E --> M2[LocalOutlierFactor<br/>detects local outliers]
M1 --> B[Boost by asset<br/>criticality]
M2 --> B
B --> S{Score ≥ 0.7?}
S -- No --> X[Discard]
S -- Yes --> D[Persist detection<br/>with canonical severity]
D --> AG[Alert Aggregator]
D --> AHI[anomaly_pressure<br/>of AHI]
IsolationForest: builds random trees that isolate "rare" points in few splits. Good for global outliers (extreme spikes, impossible values).
LocalOutlierFactor (LOF): compares a point's local density against its neighbors. Good for regime changes — values that used to be normal but today sit outside the usual neighborhood.
Ensemble: combining both covers the two most common failure modes. Final score between 0 and 1.
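The ensemble idea can be sketched with scikit-learn's `IsolationForest` and `LocalOutlierFactor`. This is a minimal illustration, not Rela AI's implementation: the normalization of each raw score into the 0 to 1 range, the equal weighting, the hyperparameters, and the two-column feature layout are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor


def fit_ensemble(X_train, random_state=0):
    """Fit both detectors on the same per-asset, per-metric feature matrix."""
    iso = IsolationForest(n_estimators=100, random_state=random_state).fit(X_train)
    # novelty=True lets LOF score readings it was not fitted on.
    lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_train)
    return iso, lof


def ensemble_score(iso, lof, X):
    """Return one anomaly score per row in [0, 1]; higher means more anomalous."""
    # scikit-learn's score_samples is "higher = more normal", so negate it.
    iso_score = np.clip(-iso.score_samples(X), 0.0, 1.0)
    lof_factor = np.maximum(-lof.score_samples(X), 1.0)  # about 1 for inliers
    lof_score = 1.0 - 1.0 / lof_factor                   # assumed squashing to [0, 1)
    return 0.5 * iso_score + 0.5 * lof_score             # assumed equal weights


rng = np.random.default_rng(42)
# Hypothetical features: [vibration value, rate of change]
X_train = rng.normal(loc=[2.1, 0.0], scale=[0.3, 0.05], size=(300, 2))
iso, lof = fit_ensemble(X_train)
scores = ensemble_score(iso, lof, np.array([[2.1, 0.0], [9.0, 2.0]]))
print(scores)  # the second (far-away) reading should score higher than the first
```

`IsolationForest` contributes most when a reading is extreme relative to the whole history, while the LOF term reacts to readings that sit in a sparsely populated region compared with their nearest neighbors.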
Asset criticality boost
Not all assets are equal. An anomaly on a critical compressor weighs more than the same on a low-priority fan. The asset's criticality boosts the score: a 0.75 on a critical asset can equate to 0.85 final and cross the high threshold.
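One way to picture the boost is as a small additive term per criticality level. The sketch below is hypothetical: the text does not state the actual formula, and only the `critical` value (+0.10) is chosen so that it reproduces the 0.75 to 0.85 example above; the other values are placeholders.

```python
# Hypothetical additive boosts. Only "critical" is derived from the
# 0.75 -> 0.85 example in the text; the rest are illustrative placeholders.
CRITICALITY_BOOST = {"low": 0.0, "medium": 0.03, "high": 0.06, "critical": 0.10}


def boost_score(score: float, criticality: str) -> float:
    """Raise an ensemble score by the asset's criticality, capped at 1.0."""
    return min(1.0, score + CRITICALITY_BOOST[criticality])


boosted = boost_score(0.75, "critical")  # lands at about 0.85, above the high cutoff
```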
Canonical severity
The boosted score is translated to the common severity language:
| Boosted score | Canonical severity |
|---|---|
| ≥ 0.9 | critical |
| ≥ 0.8 | high |
| ≥ 0.7 | warning |
| < 0.7 | info (discarded — does not enter the inbox) |
The same scale is used by prognostics (over RUL) and energy (over z-score). That lets the Alert Aggregator collapse the 3 signals into a single row per asset.
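The table maps directly to a small lookup function. The thresholds come from the table above; the function name is illustrative.

```python
def canonical_severity(boosted_score: float) -> str:
    """Translate a boosted score into the canonical severity scale."""
    if boosted_score >= 0.9:
        return "critical"
    if boosted_score >= 0.8:
        return "high"
    if boosted_score >= 0.7:
        return "warning"
    return "info"  # discarded, never reaches the inbox


def enters_inbox(boosted_score: float) -> bool:
    """Only warning and above are persisted and forwarded."""
    return canonical_severity(boosted_score) != "info"
```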
Integration with AHI's anomaly_pressure
Each persisted detection counts toward anomaly_pressure, one of the 5 AHI sub-indices:
- critical → −8 points
- high → −4 points
- warning → −1.5 points
- info → 0

Detections are counted over a 7-day window, and the sub-index has a floor of 10: even an asset flooded with critical detections keeps some headroom, because the other sub-indices still matter.
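The drain can be sketched as a windowed sum of penalties. The per-severity points, the 7-day window, and the floor of 10 come from the text; the starting value of 100 and the `(timestamp, severity)` input shape are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Points drained per persisted detection, by canonical severity.
PENALTY = {"critical": 8.0, "high": 4.0, "warning": 1.5, "info": 0.0}


def anomaly_pressure(detections, now, start=100.0, floor=10.0,
                     window=timedelta(days=7)):
    """detections: iterable of (timestamp, severity) pairs.

    start=100.0 is an assumed full-health value; floor and window follow the text.
    """
    cutoff = now - window
    drained = sum(PENALTY[sev] for ts, sev in detections if cutoff <= ts <= now)
    return max(floor, start - drained)


now = datetime(2024, 1, 8)
recent = [
    (now - timedelta(days=1), "high"),       # counts: 4 points
    (now - timedelta(days=2), "warning"),    # counts: 1.5 points
    (now - timedelta(days=10), "critical"),  # outside the 7-day window, ignored
]
pressure = anomaly_pressure(recent, now)
```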
Training strategy
The model learns from the client
Rela AI does not ship generic pre-trained models. Each tenant trains on its own historical data, because what's normal in a cement plant is not what's normal in a food plant.
- Minimum dataset: 200 readings per (asset, metric) — typically 2–3 weeks of capture.
- Retraining: models automatically retrain when RUL drift is detected (Page-Hinkley on prognostics residuals) or when a failure event closes (feedback loop).
- Persistence: each model is versioned and tied to asset_id + metric_name.
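The Page-Hinkley test named in the retraining bullet can be sketched in a few lines. This is a generic one-sided version that flags a sustained upward shift in the mean of a residual stream; the `delta` and `threshold` values are illustrative assumptions, not the product's settings.

```python
class PageHinkley:
    """One-sided Page-Hinkley test for an upward shift in the mean."""

    def __init__(self, delta=0.005, threshold=20.0):
        self.delta = delta          # tolerated drift magnitude (assumed value)
        self.threshold = threshold  # alarm level, often written lambda (assumed value)
        self.n = 0
        self.mean = 0.0     # running mean of the stream
        self.cum = 0.0      # cumulative deviation from the running mean
        self.min_cum = 0.0  # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Feed one residual; return True when drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.threshold


# Synthetic residuals: stable at 0, then a sustained jump to 5.
residuals = [0.0] * 100 + [5.0] * 20
detector = PageHinkley()
first_alarm = next(
    (i for i, r in enumerate(residuals) if detector.update(r)), None
)
```

In this synthetic stream the alarm index falls shortly after the jump at position 100; a retraining job could be scheduled at that point. Detecting downward shifts as well would require a second, mirrored statistic.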
Cold start
While an asset lacks data, ML detection is dormant for it. AHI keeps computing without anomaly_pressure (effective weight 0) until the model is ready. Maturity Levels reflect that state.
How to use it
See active anomalies
- Assets → list — "Anomalies 7d" column shows counts per asset.
- Asset → detail — Anomalies tab with score-vs-time scatter, filters.
- Inbox — consolidated alerts via Alert Aggregator.
Tune sensitivity
Per-model:
- Score threshold (default 0.7).
- Feature window (default 24h).
- Asset criticality (low/medium/high/critical).
UI: Asset → Anomalies → Configure.
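To make the feature window concrete, the sketch below computes the four features named in the pipeline diagram (value, mean, std, rate-of-change) over the most recent window of readings. The input shape, the population standard deviation, and the per-hour rate definition are assumptions for illustration.

```python
from datetime import datetime, timedelta
from statistics import fmean, pstdev


def extract_features(readings, now, window=timedelta(hours=24)):
    """readings: list of (timestamp, value) sorted by time.

    Returns a feature dict, or None when the window holds fewer than 2 points.
    """
    recent = [(ts, v) for ts, v in readings if now - window <= ts <= now]
    if len(recent) < 2:
        return None  # not enough points to compute spread or slope
    values = [v for _, v in recent]
    (t_prev, v_prev), (t_last, v_last) = recent[-2], recent[-1]
    hours = (t_last - t_prev).total_seconds() / 3600.0
    return {
        "value": v_last,
        "mean": fmean(values),
        "std": pstdev(values),
        # Change per hour between the last two readings (assumed definition).
        "rate_of_change": (v_last - v_prev) / hours if hours > 0 else 0.0,
    }


now = datetime(2024, 1, 2, 12, 0)
readings = [
    (now - timedelta(hours=30), 9.0),  # older than 24h, excluded
    (now - timedelta(hours=2), 2.0),
    (now - timedelta(hours=1), 2.0),
    (now, 3.0),
]
features = extract_features(readings, now)
```

A shorter window makes the mean and std react faster to recent behavior; a longer one smooths out short excursions.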
Automatic actions
A rule can turn a critical anomaly into: work order, WhatsApp/email notification, supervisor escalation, CMMS sync.
See Event rules.
Limitations & assumptions
- Models are per-asset + per-metric. No parameter sharing across similar assets; that protects against over-generalization but requires data per combination.
- Sparse data degrades signal. Below ~200 points the detection stays dormant or produces false positives until convergence.
- Sensor staleness. If the sensor watchdog marks the source as stale, anomalies from that source are suppressed and don't drain AHI: dead hardware must not falsely tank asset health.
- Doesn't replace root cause analysis. The system tells you what is anomalous and when, not always the why. AI recommendation offers hypotheses; physical inspection is the final step.
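The staleness rule amounts to a filter applied before detections are persisted or counted. A minimal sketch, assuming each detection carries a hypothetical `source_id` field and the watchdog exposes a set of stale source IDs:

```python
def drop_stale(detections, stale_sources):
    """Keep only detections whose source the watchdog has not marked stale.

    `source_id` is a hypothetical field name used for illustration.
    """
    return [d for d in detections if d["source_id"] not in stale_sources]


detections = [
    {"source_id": "sensor-a", "severity": "high"},
    {"source_id": "sensor-b", "severity": "critical"},
]
kept = drop_stale(detections, stale_sources={"sensor-b"})
# Only the sensor-a detection remains, so sensor-b cannot drain the AHI.
```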
Use cases
Case 1 — Early bearing wear. Pump B-07: vibration_rms avg 2.1 mm/s, std 0.3. Suddenly consecutive readings at 2.9 mm/s. Within ISA limits (under 4 mm/s). LOF detects regime change. Score 0.84 → high. Inbox flags it, AHI drops 4 points, tech inspects and finds incipient wear in rear bearing.
Case 2 — Internal valve leak. Compressor C-03: pressure_avg normal, temperature_avg rising 0.3°C/day in correlated fashion. IsolationForest flags the combined temperature × time point as global outlier. Score 0.91 → critical. Investigation reveals internal leak in discharge valve.
Case 3 — Useful false positive. System flags score 0.73 on extractor X-02 during night shift. Tech reviews: fan was running in planned reduced mode (undocumented setpoint change). Not a fault, but a real regime change. Engineering team adds the new setpoint to operational log; model learns it on next retraining window.
Key benefits
- Catches patterns fixed thresholds can't see — extended lead time.
- Per-tenant, per-asset, per-metric: the model speaks your plant's language.
- IsolationForest + LOF ensemble covers global outliers and regime changes.
- Canonical severity compatible with the whole system.
- Feeds anomaly_pressure of the AHI: health drops before ISA alarms fire.
- Automatic consolidation in the Alert Aggregator.