HACCP plans, CCPs with real grace period, corrective actions that create work orders and per-batch chain of custody. Certifiable under SQF / BRC / FSSC 22000.

HACCP — Verifiable Food-Safety Control

Rela AI's HACCP module is not a deviation log. It's an auditable implementation of the 7 HACCP principles integrated with the rest of the industrial pipeline: from the temperature sensor to the work order the operator closes with evidence.

What is it for?

Monitor each HACCP Critical Control Point (CCP) online with ranges, grace periods and dead-sensor suppression.
Fire a corrective action + assigned work order with deadline on every real deviation.
Generate auditable evidence (per-batch chain of custody, hashed audit trail) acceptable for SQF / BRC / FSSC 22000.

How it works

Each CCP defines an acceptable range, a grace period, and a dead-sensor suppression rule. Readings arrive via POST /api/v1/haccp/readings (sensor push) or MQTT/OPC UA. The pipeline validates each reading against the range, discards transients that resolve inside the grace period, and suppresses the fake 0 from dead sensors. On a real deviation it opens a corrective action, a work order, and an entry in the batch chain of custody.

Executive summary

From reactive record to control system. Before, a temperature deviation produced a text-only record with no traceability. Today it produces: a consolidated alert in the unified inbox, an urgent work order assigned to the responsible operator, an auditable configuration entry, and — if the batch was held — an entry in the batch's chain of custody.

Before vs After

Dimension	Before	After
Transients (door-open, defrost)	9 false deviations per door opening	0 — `grace_period_seconds` respected
Dead sensor sending 0	Phantom "critical low"	Suppressed by sensor_watchdog + H2
Operator inbox	3 silos: anomaly, energy, HACCP	1 consolidated row, canonical severity
30-min sustained excursion	~180 persisted deviations	1 deviation with `sample_count=180` and `max_deviation`
Corrective action	Text record with no owner	Urgent `HAC-` prefixed work order, assigned, traceable
Critical limit change	No trail	`_audit_trail` with prev→new, actor, timestamp
HACCP plan past review	Nobody notices	`plan_review_overdue` alert via heartbeat
Batch held due to deviation	Free-text search over CAs	Direct query `GET /batches/{id}/history`

CCP? HACCP Plan?

HACCP Plan: the regulatory document. Groups hazard analysis, process flow, CCPs controlling the risks, and verification procedures. Each plan has review_frequency_days and next_review_at.
CCP (Critical Control Point): a specific step in the process where a hazard is controlled. It belongs to a plan (or is a legacy CCP with no plan), monitors a metric on an asset_id, and has critical limits, a grace period, corrective actions, and a responsible person.

flowchart LR
  A[Sensor/PLC/SCADA] --> B[Ingest via machine_events<br/>or POST /haccp/readings]
  B --> C{Sensor stale?}
  C -- Yes --> SKIP[Skip: sensor_watchdog]
  C -- No --> D[evaluate_limits<br/>high/low band]
  D -->|Inside| BUF[Clear buffer]
  D -->|Outside| E{grace_period<br/>elapsed?}
  E -- No --> BUF2[Open/extend buffer]
  E -- Yes --> F[record_deviation<br/>with M3 dedup]
  F -->|is_new=true| G[alert_aggregator<br/>source_system=haccp]
  F -->|is_new=false| H[merge into existing]
  G --> I[Unified inbox]
  F --> J[Corrective Action]
  J --> K[Work Order HAC-xxx]
  J --> L[Batch disposition<br/>if hold/rework/destroy]

Create a CCP

Dashboard

Food Safety → HACCP → Plans.
Create a plan with hazard analysis, process flow, and review cadence.
New CCP inside the plan:
- Name, asset (validated against _assets), data source, metric.
- Hazard type: biological / chemical / physical / allergen.
- High/low critical limits (at least one).
- Grace period in seconds: the UI explains that 300 means "5 minutes of allowed excursion before a deviation is recorded".
- Structured corrective steps (description + estimated minutes + required role + verification method).
- Responsible person.

API

# 1) Create plan
POST /api/v1/haccp/plans
{
  "name": "UHT Milk 1L Plan",
  "product_type": "UHT Milk 1L",
  "process_flow": ["Receiving", "Filter", "Pasteurize", "Pack"],
  "hazards": [
    {
      "stage_name": "Pasteurize",
      "hazard_description": "Pathogen survival if T<72C",
      "hazard_type": "biological",
      "severity": "critical",
      "likelihood": "possible",
      "control_measure": "T >= 72C for 15s",
      "is_ccp": true
    }
  ],
  "review_frequency_days": 365
}

# 2) Create CCP linked to the plan
POST /api/v1/haccp/ccps
{
  "name": "Pasteurizer Temp",
  "asset_id": "64f...pasteurizer",
  "source_id": "src_plc_uht",
  "monitoring_metric": "temperature",
  "hazard_type": "biological",
  "critical_limit_high": 85,
  "critical_limit_low": 72,
  "unit": "C",
  "grace_period_seconds": 15,
  "corrective_steps": [
    { "description": "Stop feed pump", "estimated_minutes": 1, "required_role": "operator" },
    { "description": "Verify hold tank temperature", "estimated_minutes": 5, "required_role": "qa_lead",
      "verification_method": "QA signature on CCP-01" }
  ],
  "plan_id": "64f...plan"
}
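
The CCP field rules (a known hazard type, at least one critical limit) can be checked client-side before calling POST /api/v1/haccp/ccps. The following is a minimal Python sketch. validate_ccp_payload is a hypothetical helper, not part of the Rela AI API, and the low-below-high and non-negative grace rules are assumptions rather than documented server behavior.

```python
HAZARD_TYPES = {"biological", "chemical", "physical", "allergen"}


def validate_ccp_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks sane."""
    errors = []
    if payload.get("hazard_type") not in HAZARD_TYPES:
        errors.append("hazard_type must be one of " + ", ".join(sorted(HAZARD_TYPES)))

    low = payload.get("critical_limit_low")
    high = payload.get("critical_limit_high")
    if low is None and high is None:
        errors.append("at least one critical limit is required")
    # Assumption: when both limits are set, low must sit below high.
    if low is not None and high is not None and low >= high:
        errors.append("critical_limit_low must be below critical_limit_high")
    # Assumption: a negative grace period is meaningless.
    if payload.get("grace_period_seconds", 0) < 0:
        errors.append("grace_period_seconds must be >= 0")
    return errors
```

For the "Pasteurizer Temp" payload above this returns an empty list; a payload with neither limit returns one error.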

Monitoring — four protection layers

1. Sensor staleness (HACCP-H2)

If the sensor_watchdog marks the source as stale (no readings within the expected interval), HACCP skips evaluation. A dead thermometer sending 0.0 cannot fabricate a false "critical low".
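
The ordering matters: the stale check runs before any limit evaluation, so the value of a dead sensor is never compared against the band. A minimal sketch of that gate follows. The names evaluate_band and check_reading are hypothetical, and the real evaluate_limits() signature in thresholds.py is not shown in this page.

```python
def evaluate_band(value, low, high):
    """Return 'low', 'high', or None when the value is inside the band.

    Either limit may be None, since a CCP needs only one critical limit.
    """
    if low is not None and value < low:
        return "low"
    if high is not None and value > high:
        return "high"
    return None


def check_reading(value, low, high, source_is_stale):
    """Hypothetical gate: skip stale sources before looking at the value."""
    if source_is_stale:
        return "skipped"  # a dead sensor's 0.0 is never evaluated
    return evaluate_band(value, low, high) or "in_band"
```

With limits 72 to 85, a 0.0 reading from a stale source yields "skipped", while the same reading from a live source yields "low".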

2. Real grace period (HACCP-H1)

An out-of-band reading opens a persistent buffer in _haccp_deviation_buffers (MongoDB, not memory — a worker restart must not discard accumulated time). Only when elapsed >= grace_period_seconds AND the reading is still out of band does the deviation persist. Resolved transients (door openings, defrost cycles) never produce records.
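
The buffer logic can be sketched as follows. This is an illustration only: GraceBuffer is a hypothetical in-memory stand-in, whereas the real buffers live in MongoDB so that accumulated time survives a worker restart.

```python
from datetime import datetime, timedelta


class GraceBuffer:
    """In-memory stand-in for _haccp_deviation_buffers (illustration only)."""

    def __init__(self):
        self._opened_at = {}  # ccp_id -> time of the first out-of-band reading

    def observe(self, ccp_id: str, out_of_band: bool, at: datetime,
                grace_period_seconds: int) -> bool:
        """Return True only when a deviation should be persisted."""
        if not out_of_band:
            # Transient resolved: clear the buffer, nothing is recorded.
            self._opened_at.pop(ccp_id, None)
            return False
        # Open the buffer on the first out-of-band reading, else keep it.
        opened = self._opened_at.setdefault(ccp_id, at)
        return at - opened >= timedelta(seconds=grace_period_seconds)
```

A door opening that pushes readings out of band for 2 minutes and then recovers never returns True with a 300-second grace period. A later excursion that is still out of band 300 seconds after it started does.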

3. Incident dedup (HACCP-M3)

A 30-min sustained excursion does not produce 180 separate rows. Within _HACCP_DEDUP_WINDOW_MINUTES (default 15), subsequent samples of the same (ccp_id, direction) merge into the existing deviation: sample_count increments, last_seen_at advances, and max_reading/max_deviation track the incident peak.
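
A sketch of the merge, under two assumptions that this page does not confirm: the window is measured from last_seen_at (so a continuous excursion keeps extending the same row), and "peak" means the reading farthest from the violated limit. The function below is a hypothetical dict-backed version, not the service's record_deviation.

```python
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(minutes=15)  # mirrors the documented default


def record_deviation(open_deviations: dict, ccp_id: str, direction: str,
                     reading: float, limit: float, at: datetime):
    """Merge into an open deviation inside the window, else create one.

    Returns (deviation, is_new).
    """
    key = (ccp_id, direction)
    magnitude = abs(reading - limit)
    dev = open_deviations.get(key)
    # Assumption: the window slides with last_seen_at.
    if dev is not None and at - dev["last_seen_at"] <= DEDUP_WINDOW:
        dev["sample_count"] += 1
        dev["last_seen_at"] = at
        if magnitude > dev["max_deviation"]:
            dev["max_deviation"] = magnitude
            dev["max_reading"] = reading
        return dev, False
    dev = {
        "ccp_id": ccp_id, "direction": direction,
        "first_seen_at": at, "last_seen_at": at,
        "sample_count": 1,
        "max_reading": reading, "max_deviation": magnitude,
    }
    open_deviations[key] = dev
    return dev, True
```

Feeding 180 samples at 10-second intervals produces a single deviation with sample_count equal to 180, and only the first call returns is_new=True.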

4. Unified inbox (HACCP-H3)

Every confirmed deviation publishes to the Alert Aggregator with source_system="haccp". Canonical severity mapped from hazard_type:

Hazard	Severity
biological	`critical`
chemical	`high`
physical	`warning`
allergen / unknown	`warning` (HACCP never drops to `info`)

Natural consolidation: a temperature spike that fires anomaly + energy + HACCP shows up as one row per asset with source_systems = [anomaly, energy, haccp].
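
The mapping and the per-asset consolidation can be sketched as follows. The mapping table comes from this page. The severity ordering (info < warning < high < critical) and the rule that a consolidated row takes the highest severity are assumptions, and consolidate is a hypothetical function, not the Alert Aggregator's API.

```python
SEVERITY_BY_HAZARD = {
    "biological": "critical",
    "chemical": "high",
    "physical": "warning",
    "allergen": "warning",
}
SEVERITY_RANK = {"info": 0, "warning": 1, "high": 2, "critical": 3}  # assumed order


def haccp_severity(hazard_type: str) -> str:
    """Unknown hazard types fall back to 'warning', never 'info'."""
    return SEVERITY_BY_HAZARD.get(hazard_type, "warning")


def consolidate(alerts: list) -> list:
    """Collapse alerts into one row per asset_id, keeping every source system."""
    rows = {}
    for alert in alerts:
        row = rows.setdefault(alert["asset_id"], {
            "asset_id": alert["asset_id"],
            "source_systems": [],
            "severity": "info",
        })
        if alert["source_system"] not in row["source_systems"]:
            row["source_systems"].append(alert["source_system"])
        # Assumption: the row shows the highest severity seen so far.
        if SEVERITY_RANK[alert["severity"]] > SEVERITY_RANK[row["severity"]]:
            row["severity"] = alert["severity"]
    return list(rows.values())
```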

Direct ingest — `POST /api/v1/haccp/readings` (HACCP-L2)

For sources outside /machine/events (a manual-entry tablet, a standalone HACCP thermometer with its own webhook):

POST /api/v1/haccp/readings
{
  "source_id": "tablet-qa-01",
  "metadata": { "temperature": 80.5 }
}

Full pipeline (staleness / grace / dedup / aggregator) applies identically. No "second path".
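
A client for this endpoint only needs to wrap the metric in metadata. The sketch below builds the request with Python's standard library without sending it. The base URL is a placeholder, and authentication is left out because this page does not describe it.

```python
import json
from urllib.request import Request


def build_reading_request(base_url: str, source_id: str, metadata: dict) -> Request:
    """Build (but do not send) a POST to /api/v1/haccp/readings."""
    body = json.dumps({"source_id": source_id, "metadata": metadata}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + "/api/v1/haccp/readings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host; pass the result to urllib.request.urlopen() to actually send it.
req = build_reading_request("https://rela.example", "tablet-qa-01", {"temperature": 80.5})
```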

Corrective action → Work Order (HACCP-H5)

When an operator records a corrective action, the system creates a kanban task:

task_code = HAC-xxx (distinguishable from maintenance MAI-xxx).
priority = urgent (food safety doesn't wait for weekly triage).
status = todo, assigned_to = taken_by.
Denormalised back-links: haccp_deviation_id, haccp_ccp_id, haccp_batch_id, haccp_product_disposition.
Narrative description with full deviation context.
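
The resulting task can be pictured as a plain record. In the sketch below, build_work_order is a hypothetical function; the zero-padded numbering of HAC-xxx and the keys read from the deviation (id, ccp_id, direction, max_reading) are assumptions made for illustration.

```python
def build_work_order(action: dict, deviation: dict, sequence: int) -> dict:
    """Shape a kanban task from a corrective action (illustrative only)."""
    return {
        "task_code": f"HAC-{sequence:03d}",  # assumed numbering format
        "priority": "urgent",
        "status": "todo",
        "assigned_to": action["taken_by"],
        # Denormalised back-links to the HACCP records.
        "haccp_deviation_id": deviation["id"],
        "haccp_ccp_id": deviation["ccp_id"],
        "haccp_batch_id": action.get("batch_id"),
        "haccp_product_disposition": action.get("product_disposition", "none"),
        # Narrative description carrying the deviation context.
        "description": (
            f"CCP {deviation['ccp_id']} deviated {deviation['direction']} "
            f"(peak {deviation['max_reading']}). "
            f"Action taken: {action['description']}"
        ),
    }
```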

Batch chain of custody (HACCP-L4)

If the corrective action includes batch_id + product_disposition != "none", an immutable entry is recorded in _haccp_batch_dispositions:

GET /api/v1/haccp/batches/B-2026-03-15-A/history
→ [
  { "batch_id": "B-2026-03-15-A", "disposition": "hold", ... },
  { "batch_id": "B-2026-03-15-A", "disposition": "destroy", ... }
]

When an inspector asks "what happened to batch X?", a single query returns the chain.
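
The append-only behavior can be sketched with an in-memory log. BatchDispositionLog and the recorded_at field are hypothetical; the real entries live in _haccp_batch_dispositions.

```python
from datetime import datetime


class BatchDispositionLog:
    """Append-only stand-in for _haccp_batch_dispositions (illustration only)."""

    def __init__(self):
        self._entries = []

    def record(self, batch_id, disposition: str, at: datetime):
        """Append an entry; skip when there is no batch or the disposition is 'none'."""
        if not batch_id or disposition == "none":
            return None
        entry = {"batch_id": batch_id, "disposition": disposition, "recorded_at": at}
        self._entries.append(entry)
        return dict(entry)

    def history(self, batch_id: str) -> list:
        """Return copies in insertion order so callers cannot alter the log."""
        return [dict(e) for e in self._entries if e["batch_id"] == batch_id]
```

Recording "hold" and later "destroy" for the same batch yields a two-entry history in that order; there is deliberately no update or delete method.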

Configuration audit trail (HACCP-M5)

Every CCP mutation is recorded in _audit_trail with actor, timestamp and prev→new diff:

ccp_created — new_state with the 14 regulatory-relevant fields.
ccp_updated — only changed fields (no updated_at noise).
ccp_deleted — previous_state preserved.
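
The "only changed fields" rule for ccp_updated amounts to a field-level diff that ignores bookkeeping keys. A minimal sketch follows; diff_states is a hypothetical helper and the prev/new key names are assumptions about the stored shape.

```python
def diff_states(previous: dict, new: dict, ignored=("updated_at",)) -> dict:
    """Return {field: {"prev": ..., "new": ...}} for fields that actually changed."""
    changes = {}
    for key in sorted(set(previous) | set(new)):
        if key in ignored:
            continue  # bookkeeping noise is left out of the trail
        if previous.get(key) != new.get(key):
            changes[key] = {"prev": previous.get(key), "new": new.get(key)}
    return changes
```

Lowering critical_limit_high from 75 to 72 while leaving the unit untouched yields a single-field diff, regardless of the updated_at change.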

"Who changed critical_limit_high from 75 to 72 on March 3?" is answerable with actor email, exact time, both values.

Plan review heartbeat (HACCP-M6)

POST /internal/haccp-health-check — periodic job (Cloud Scheduler):

Lists HACCP plans with next_review_at <= now.
Emits a warning alert to the aggregator per plan with extra.kind = "plan_review_overdue".
Does NOT duplicate silent-sensor detection (already covered by sensor_watchdog + H2).

A plan past review shows up in the same inbox as operational incidents.
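
The check itself is a date comparison. In the sketch below, overdue_plan_alerts and next_review_at are hypothetical functions; the alert fields other than severity and extra.kind, and the way next_review_at is derived from review_frequency_days, are assumptions.

```python
from datetime import datetime, timedelta


def next_review_at(last_reviewed_at: datetime, review_frequency_days: int) -> datetime:
    """Assumed derivation: last review plus the plan's review cadence."""
    return last_reviewed_at + timedelta(days=review_frequency_days)


def overdue_plan_alerts(plans: list, now: datetime) -> list:
    """Build one warning alert per plan whose review date has passed."""
    alerts = []
    for plan in plans:
        if plan["next_review_at"] <= now:
            alerts.append({
                "source_system": "haccp",
                "severity": "warning",
                "extra": {"kind": "plan_review_overdue", "plan_id": plan["id"]},
            })
    return alerts
```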

Technical architecture

haccp_service — CCP + plan CRUD + checking pipeline.
thresholds.py (shared) — evaluate_limits() reused by HACCP and cold_chain (HACCP-M4).
_haccp_ccps, _haccp_plans, _haccp_deviations, _haccp_deviation_buffers, _haccp_corrective_actions, _haccp_batch_dispositions — per-tenant.
alert_aggregator_service.ingest_alert — fire-and-forget integration.
audit_trail_service.log_action — fire-and-forget over CCP mutations.

Contract-codifying tests

test_check_ccp_reading_grace_* — H1.
test_check_ccp_reading_skips_stale_source — H2.
test_check_ccp_reading_biological_feeds_aggregator_as_critical — H3.
test_create_ccp_* / test_update_ccp_* — M2 + M5.
test_record_deviation_merges_into_open_deviation_in_window — M3.
test_record_corrective_action_creates_work_order_with_expected_shape — H5.
test_heartbeat_flags_overdue_plans — M6.
test_record_corrective_action_logs_batch_disposition — L4.
test_get_haccp_plan_with_ccps_attaches_linked_ccps — H4.

Limitations

ProductDisposition doesn't block downstream systems. A hold records the disposition but doesn't integrate with ERP/MES to prevent shipment. That's operations work, not HACCP.
No dedicated silent-sensor heartbeat. HACCP reuses sensor_watchdog; if that chain breaks, HACCP does not detect silent sensors on its own (it only filters sources that are already marked stale).
Corrective steps are not enforced as a checklist. The UI can render the steps, but the backend does not verify that each one was completed. Step tracking is a dashboard UX concern, not a backend guarantee.

Key benefits

Transients don't pollute the record — grace respected.
Dead sensors don't fabricate false positives — staleness filter.
One inbox, not three — alert_aggregator integration.
Incident = one row even when it lasts hours — internal dedup with peak tracking.
Corrective actions are real kanban work — urgent work orders with back-links.
Full auditability — CCP audit trail + per-batch chain of custody.
Formal HACCP plans — group hazards + CCPs + verification, certifiable.
Shared logic with cold_chain — one threshold-detection math.
