
Information flows

ADMINISTRATOR

::: danger Restricted
Internal pipeline documentation.
:::

End-to-end data flows for the major operations. Every flow is deterministic-first, AI-advisory-second.

Flow 1 — Vessel onboarding (natural sequence)

The authoritative onboarding sequence per project_tracked_matters_spec:

Add vessel
  → Particulars upload (Ship's Particulars PDF)
    → general_particulars extractor
      → writes vessel_particulars row
      → writes vessel_particulars_provenance entries (extracted)
      → IMO + DWT (Summer Salt Water) propagate to vessels table on approval
  → GA Plan upload
    → general_arrangement extractor
      → writes vessel_equipment_inventory (nested JSON in vessel_particulars.json_blob)
      → writes vessel_tanks
      → writes vessel_particulars_provenance entries
  → Sister-vessel detection (matches on dimensions + flag + class)
  → CL Skeleton wizard (12-step Windows-installer-style)
    → blocked unless 6 mandatory docs present (Particulars · Capacity Plan · GA Plan · Form E/SER · GMDSS · ECDIS)
    → emits per-group XLSX (Groups 2–8)
  → Helper-doc agent (B1 milestone — agent-only, not in UI)
  → Canonical import (POST /api/vessels/:id/components/import, 207 Multi-Status)
  → Dashboard live
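The sister-vessel detection step matches on dimensions + flag + class. A minimal sketch of such a matcher, assuming field names (`loaM`, `beamM`, `flag`, `classSociety`) and a tolerance that are illustrative only, not the production comparison:

```typescript
// Illustrative only — field names and the 0.5 m tolerance are assumptions,
// not the production matcher.
interface VesselDims {
  loaM: number;          // length overall, metres
  beamM: number;         // moulded breadth, metres
  flag: string;          // flag state code
  classSociety: string;  // classification society
}

// Dimensions must agree within a small tolerance; flag and class must be exact.
function isSisterCandidate(a: VesselDims, b: VesselDims, tolM = 0.5): boolean {
  return (
    Math.abs(a.loaM - b.loaM) <= tolM &&
    Math.abs(a.beamM - b.beamM) <= tolM &&
    a.flag === b.flag &&
    a.classSociety === b.classSociety
  );
}
```

Candidates that pass this gate would still be surfaced for operator confirmation rather than auto-linked.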

Delete-ship preserves RAG

Deleting a vessel removes the vessels row and its assignments but preserves rag_chunks and cl_knowledge_base entries. Only the administrator can delete, and the operation is audit-logged.

Flow 2 — Source-document upload (wizard)

Operator drops file in Class & Flag Documents wizard
  → POST /api/vessels/:vesselId/source-documents
  → Anti-misuse hardening:
      → Workers AI filename classifier (Llama-4-Scout)
      → magic-byte sniff (catches .jpeg→.pdf spoof and CERT→GA rename)
  → If prior evidence exists for the same (vessel_id, wizard_slot):
      → archiveOnSupersede(prior) writes source_documents_archive row
        with superseded_by_source_document_id pointing at the new doc
  → UPDATE source_documents (idempotent on (vessel_id, wizard_slot))
  → On UPDATE failure: rollback the archive insert (no orphan archive)
  → Receipt response includes:
      action ∈ {created, replaced}
      wizard_slot
      g1_ucs_code
      superseded_source_document_id (when applicable)
  → ai-status messenger thread on supersede event (Sofia 23-08 quiet-hours guard)
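The magic-byte sniff in the hardening step can be sketched as follows. The PDF (`%PDF-`) and JPEG (`FF D8 FF`) signatures are standard; the routing helper `isSpoofedPdf` is an illustrative name, not the production function:

```typescript
// Illustrative magic-byte check. Signatures are standard file-format magic;
// the spoof-detection wrapper is an assumption about how the check is wired.
function sniffMime(bytes: Uint8Array): "pdf" | "jpeg" | "unknown" {
  // %PDF- → 25 50 44 46 2D
  if (bytes.length >= 5 &&
      bytes[0] === 0x25 && bytes[1] === 0x50 &&
      bytes[2] === 0x44 && bytes[3] === 0x46 && bytes[4] === 0x2d) {
    return "pdf";
  }
  // JPEG SOI marker → FF D8 FF
  if (bytes.length >= 3 &&
      bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff) {
    return "jpeg";
  }
  return "unknown";
}

// The .jpeg→.pdf spoof: filename claims PDF, bytes say otherwise.
function isSpoofedPdf(filename: string, bytes: Uint8Array): boolean {
  return filename.toLowerCase().endsWith(".pdf") && sniffMime(bytes) !== "pdf";
}
```

The filename classifier and the byte sniff are complementary: the classifier catches semantic misuse (a cert uploaded into a GA slot), the sniff catches content spoofing.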

Flow 3 — Run 1 / Run 2 separation

Run 1 — Group 1 cert population. The cert filename prefix maps to a G1 row. Only cert-level metadata is written (validity date, surveyor, certificate number); equipment details extracted from inside the cert are routed to vessel_equipment_inventory, NOT the G1 row.

Run 2 — Groups 2–8 component population. CL Skeleton Builder reads:

vessel_particulars + vessel_equipment_inventory + vessel_tanks + Master List
  → emits per-group XLSX

Tech Detail injection priority (highest first):

  1. Vessel CL "Other Detail" column
  2. equipment_inventory keyword longest-match
  3. tank_inventory keyword
  4. Empty
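The priority chain above can be sketched as a resolver. Input shapes and the keyword-matching details are assumptions; the "longest-match" rule for equipment keywords follows the list:

```typescript
// Sketch of the four-tier Tech Detail priority. Input shapes and keyword
// semantics are assumptions; only the ordering is taken from the spec above.
interface TechDetailSources {
  clOtherDetail?: string;                    // 1. Vessel CL "Other Detail" column
  equipmentKeywords: Record<string, string>; // 2. keyword → detail text
  tankKeywords: Record<string, string>;      // 3. keyword → detail text
}

function resolveTechDetail(componentName: string, src: TechDetailSources): string {
  if (src.clOtherDetail) return src.clOtherDetail;
  const name = componentName.toLowerCase();
  // Longest matching equipment-inventory keyword wins.
  const eqHit = Object.keys(src.equipmentKeywords)
    .filter((k) => name.includes(k.toLowerCase()))
    .sort((a, b) => b.length - a.length)[0];
  if (eqHit) return src.equipmentKeywords[eqHit];
  const tankHit = Object.keys(src.tankKeywords)
    .find((k) => name.includes(k.toLowerCase()));
  if (tankHit) return src.tankKeywords[tankHit];
  return ""; // 4. Empty
}
```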

Flow 4 — UCS Foundation cascade (A4)

The cascade is what keeps every dependent table consistent when a new UCS Master version is activated.

Affected stores when a Master code changes:

ucs_master_list   → renamed | moved | split | merged | deleted | new

vessel_components.code
source_documents.g1_ucs_code
rag_chunks.code (in RAG_DB binding)
cl_builds.code references
cl_knowledge_base.target_code
vessel_particulars_provenance.field_path (when path includes a code)
pms_jobs.code
code_history (audit trail — backwards-compatible reads)

Two-step UX (v2.31.0.34/.35):

Step 1 — POST /api/admin/ucs-foundation/cascade-preview
  body: { fromVersionId?, toVersionId, dryRun: true (default) }
  → reads-only diff classifies each code as
    kept | renamed | moved | split | merged | deleted | new
  → returns per-table impact counts
  → returns sample rows that would be touched

Step 2 — POST /api/admin/ucs-foundation/cascade-apply
  body: { toVersionId, dryRun: false, confirmApply: true }
  → guards:
      toVersionId must be is_active=1
      dryRun:false REQUIRES confirmApply:true
  → atomic D1 batch write
  → quarantine guard: ambiguous splits routed to operator quarantine pick
  → audit_log entries for every row touched (migration 0104)
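A simplified sketch of the diff classification in Step 1, assuming entries carry `code`, `name`, and `group`. This version handles only kept/renamed/moved/deleted/new; the split and merged cases (which feed the quarantine guard) are omitted:

```typescript
// Minimal diff-classifier sketch. Entry shape and matching heuristics are
// assumptions; splits/merges (the quarantine path) are deliberately omitted.
interface UcsEntry { code: string; name: string; group: number; }

type Change = "kept" | "renamed" | "moved" | "deleted" | "new";

function classify(oldV: UcsEntry[], newV: UcsEntry[]): Map<string, Change> {
  const out = new Map<string, Change>();
  const byCodeNew = new Map(newV.map((e) => [e.code, e] as [string, UcsEntry]));
  const byNameNew = new Map(newV.map((e) => [e.name, e] as [string, UcsEntry]));
  for (const o of oldV) {
    const n = byCodeNew.get(o.code);
    if (n) {
      out.set(o.code,
        n.group !== o.group ? "moved" : n.name !== o.name ? "renamed" : "kept");
    } else if (byNameNew.has(o.name)) {
      out.set(o.code, "renamed"); // same component, new code
    } else {
      out.set(o.code, "deleted");
    }
  }
  for (const n of newV) {
    if (!out.has(n.code) && !oldV.some((o) => o.name === n.name)) {
      out.set(n.code, "new");
    }
  }
  return out;
}
```

The preview endpoint would run this read-only and aggregate the map into per-table impact counts.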

Backwards-compatible reads

The code_history table preserves old → new code mappings. Reads against old codes return the new row, with a code_history_via=N annotation. This means RAG chunks for old codes still resolve until the next rag-cascade pass (F-5 milestone) rewrites them.
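The old → new resolution can be sketched as chain-following over the mapping table, with the hop count surfaced as the `code_history_via` annotation. The in-memory `Map` stands in for the real `code_history` table:

```typescript
// Sketch of backwards-compatible read resolution. A Map stands in for the
// code_history table; the hop count becomes the code_history_via annotation.
function resolveCode(
  code: string,
  history: Map<string, string>, // oldCode → newCode
): { code: string; codeHistoryVia: number } {
  let current = code;
  let hops = 0;
  const seen = new Set<string>([current]);
  while (history.has(current)) {
    current = history.get(current)!;
    hops += 1;
    if (seen.has(current)) break; // guard against accidental cycles
    seen.add(current);
  }
  return { code: current, codeHistoryVia: hops };
}
```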

Flow 5 — KB orphan heal

Trigger: POST /api/admin/kb-orphan-heal { dryRun? }
  → SELECT cl_knowledge_base WHERE quarantined=1
  → For each orphan row:
      → SELECT candidates FROM ucs_master_list
        WHERE component_name LIKE '%<token>%'
          AND version_id IN (SELECT id FROM ucs_foundation_versions WHERE is_active=1)
      → Score each candidate with Jaccard(orphan.target_code_text, candidate.component_name)
      → If best ≥ 0.72:
          UPDATE cl_knowledge_base SET target_code = <new>, quarantined = 0
          push to sampleHeals[] (cap 10)
        Else:
          push to sampleUnhealed[] (cap 10)
      → On D1 error:
          push to errorSamples[] (cap 10) with raw e?.message
  → Returns { scanned, healed, unhealed, errors, sampleHeals, sampleUnhealed, errorSamples }
  → Posts ai-status messenger thread on completion (no dedupe — admin-triggered)
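The candidate scoring uses Jaccard similarity with a 0.72 threshold. A sketch using token-level Jaccard — the tokenisation is an assumption; the production tokenizer may differ:

```typescript
// Token-level Jaccard sketch for orphan candidate scoring. The 0.72 threshold
// comes from the flow above; the tokenisation rule is an assumption.
function tokenJaccard(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  if (ta.size === 0 && tb.size === 0) return 0;
  let inter = 0;
  for (const t of ta) if (tb.has(t)) inter += 1;
  return inter / (ta.size + tb.size - inter);
}

const HEAL_THRESHOLD = 0.72;

// Returns the best-scoring candidate, or null → row stays quarantined.
function bestHeal(orphanText: string, candidates: string[]): string | null {
  let best: string | null = null;
  let bestScore = 0;
  for (const c of candidates) {
    const s = tokenJaccard(orphanText, c);
    if (s > bestScore) { bestScore = s; best = c; }
  }
  return bestScore >= HEAL_THRESHOLD ? best : null;
}
```

A `null` result maps to the `sampleUnhealed[]` branch; a hit maps to the UPDATE + `sampleHeals[]` branch.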

Flow 6 — RAG eval cron

02:00 UTC daily
  → Run a fixed eval set against retrieveHybrid()
  → Compute recall@5 vs rolling 7-day baseline
  → If regression ≥ 5pp:
      → SELECT FROM rag_eval_sentinel WHERE date = today  (per-UTC-day dedupe)
      → If no sentinel:
          INSERT sentinel row
          enqueue Postmark email
          createSystemNotification('ai-status', ...)
      → If SELECT errors: notify anyway, log [cron:rag-eval]

Flow 7 — Backup & restore

03:00 UTC daily
  → wrangler d1 export → R2 bucket pms (binding BACKUPS) at backups/pms-db-YYYY-MM-DD.sql
  → Vectorize export → R2 at backups/vec-YYYY-MM-DD.ndjson
  → 30-day retention sweep at 04:00 UTC
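The snapshot key naming and the retention predicate are simple enough to sketch directly from the paths above; the helper names are illustrative:

```typescript
// Key naming for the daily R2 snapshots (UTC date, zero-padded), matching the
// backups/pms-db-YYYY-MM-DD.sql and backups/vec-YYYY-MM-DD.ndjson paths above.
function backupKeys(d: Date): { db: string; vec: string } {
  const stamp = d.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  return {
    db: `backups/pms-db-${stamp}.sql`,
    vec: `backups/vec-${stamp}.ndjson`,
  };
}

// Retention-sweep predicate: snapshots older than 30 days are eligible.
function isExpired(keyDate: Date, now: Date, retentionDays = 30): boolean {
  const ageMs = now.getTime() - keyDate.getTime();
  return ageMs > retentionDays * 24 * 60 * 60 * 1000;
}
```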

Restore:
  → POST /api/admin/backups/restore { snapshotKey }
  → reads R2 object → applies via wrangler d1 execute --remote
  → audit_log row: restore_backup

Flow 8 — In-app messenger + email mirror

Source event (e.g. supersede, heal complete, RAG eval regression, draft submitted)
  → createSystemNotification(env, db, { subject, body, recipients, domain })
  → INSERT into message_threads + messages + message_recipients
  → Quiet hours guard: 23:00–08:00 Europe/Sofia
      → suppressed message logged with quiet_hours_skipped: true
      → morning digest at 08:00 sweeps the queue and dispatches
  → enqueueEmailMirror(threadId)
      → resolveRecipientEmails(recipients)
      → Postmark Send Email API (50-recipient batching, sandbox-aware 412 ACK)
      → From: [email protected] (verified SPF + DKIM + Return-Path)

Domains in use:
  notification     — generic
  upload-request   — supervisor asks superintendent for a doc
  audit            — compliance events
  general          — open thread
  ai-status        — automatic AI-pipeline events

Flow 9 — Draft → approve

Superintendent: creates row → status = 'draft'

                submits → status = 'pending_approval'

Supervisor or Administrator:
  approves → status = 'approved' → live in PMS
  rejects  → status = 'rejected' → returns to superintendent with reason

Batch approve (supervisor): selects N pending rows of one entity type
  → atomic D1 batch UPDATE
  → audit_log entries per row
  → messenger thread to the originating superintendent
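The lifecycle above is a small state machine. A transition-table sketch — the role names mirror the flow, but the enforcement mechanics are assumptions:

```typescript
// Transition-table sketch of the draft → approve lifecycle. States and roles
// come from the flow above; enforcement details are assumptions.
type Status = "draft" | "pending_approval" | "approved" | "rejected";
type Role = "superintendent" | "supervisor" | "administrator";

const TRANSITIONS: Record<string, { to: Status; roles: Role[] }> = {
  "draft:submit":
    { to: "pending_approval", roles: ["superintendent"] },
  "pending_approval:approve":
    { to: "approved", roles: ["supervisor", "administrator"] },
  "pending_approval:reject":
    { to: "rejected", roles: ["supervisor", "administrator"] },
};

function transition(from: Status, action: string, role: Role): Status {
  const t = TRANSITIONS[`${from}:${action}`];
  if (!t) throw new Error(`illegal transition ${from}:${action}`);
  if (!t.roles.includes(role)) throw new Error(`role ${role} not allowed`);
  return t.to;
}
```

Batch approve is then just this transition applied per row inside one atomic D1 batch, with an audit_log entry per row.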

Cross-cutting hooks

  • Cron (*/10 * * * *): classifier sweep + 03:00 UTC daily backup + 04:00 UTC retention sweep + hourly session GC + 02:00 UTC RAG eval + watchdogs
  • Login probe (post-deploy): /api/auth/login with admin / Spb812 — exit 3 on failure
  • /api/health (50-byte JSON): public uptime probe; carries version, classifierVersion, runnerVersion
  • Migration sentinel: every migration writes a sentinel row in kv_state so re-runs skip; db:migrate is idempotent

RAPAX PMS Help · v2.31.0.26 · released 2026-04-28