fetch_ml/docs/src/verification.md
Jeremie Fraeys 5d75f3576b
docs: comprehensive documentation updates
- Update TEST_COVERAGE_MAP with current requirements
- Refresh ADR-004 with C++ implementation details
- Update architecture, deployment, and security docs
- Improve CLI/TUI UX contract documentation
2026-03-04 13:23:48 -05:00

Verification & Maintenance

Continuous enforcement, drift detection, and compliance maintenance for the FetchML platform.

Overview

The verification layer provides structural enforcement at compile time, behavioral invariants checked across random inputs, detection of drift from the security baseline, supply chain integrity checks, and audit log verification. Together, these mark the difference between "we tested this once" and "we can prove it holds continuously."

Components

Schema Validation

Purpose: Ensures that manifest.Artifacts, Config, and SandboxConfig structs match a versioned schema at compile time. If a field is added, removed, or retyped without updating the schema, the build fails.

Files:

  • internal/manifest/schema.json - Canonical JSON Schema for manifest validation
  • internal/manifest/schema_version.go - Schema versioning and compatibility
  • internal/manifest/schema_test.go - Drift detection tests

Key Invariants:

  • Environment field is required and non-null in every Artifacts record
  • Environment.ConfigHash is a non-empty string
  • Environment.DetectionMethod is one of enumerated values
  • Exclusions is present (may be empty array, never null)
  • compliance_mode, if present, is one of "hipaa" or "standard"
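
The invariants above can also be read as a runtime check. The following is a minimal sketch, not the project's validator: the Environment and Artifacts types are hypothetical stand-ins for the real manifest structs, the field names follow the list above, and the allowed detection methods are placeholder values because the real enumeration is not given here.

```go
package main

import (
	"errors"
	"fmt"
)

// Environment and Artifacts are hypothetical stand-ins for the real
// manifest types; only the fields named in the invariants are modeled.
type Environment struct {
	ConfigHash      string
	DetectionMethod string
}

type Artifacts struct {
	Environment    *Environment
	Exclusions     []string
	ComplianceMode string // empty string means "not present"
}

// allowedDetectionMethods holds placeholder values for illustration only;
// the real enumerated values live in the schema.
var allowedDetectionMethods = map[string]bool{
	"example-method-a": true,
	"example-method-b": true,
}

// ValidateArtifacts returns the first invariant violation found, or nil.
func ValidateArtifacts(a Artifacts) error {
	if a.Environment == nil {
		return errors.New("environment is required and must be non-null")
	}
	if a.Environment.ConfigHash == "" {
		return errors.New("environment.config_hash must be non-empty")
	}
	if !allowedDetectionMethods[a.Environment.DetectionMethod] {
		return fmt.Errorf("detection method %q is not an enumerated value",
			a.Environment.DetectionMethod)
	}
	if a.Exclusions == nil {
		return errors.New("exclusions must be present (empty slice, not nil)")
	}
	switch a.ComplianceMode {
	case "", "hipaa", "standard":
		// absent or one of the permitted modes
	default:
		return fmt.Errorf("compliance_mode %q is not hipaa or standard",
			a.ComplianceMode)
	}
	return nil
}

func main() {
	ok := Artifacts{
		Environment: &Environment{ConfigHash: "abc123", DetectionMethod: "example-method-a"},
		Exclusions:  []string{},
	}
	fmt.Println("valid record:", ValidateArtifacts(ok))

	bad := ok
	bad.Exclusions = nil // violates "never null"
	fmt.Println("nil exclusions:", ValidateArtifacts(bad))
}
```

The distinction between an empty slice and a nil slice matters because a nil slice serializes to JSON null, which the Exclusions invariant forbids.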

Commands:

make verify-schema           # Check schema hasn't drifted
make test-schema-validation  # Test validation works correctly

CI Integration: Runs on every commit via verification.yml workflow.

Custom Linting Rules

Purpose: Enforces structural invariants that can't be expressed as tests—such as CreateDetector never being called without capturing DetectionInfo, or any function returning manifest.Artifacts populating Environment.

Tool: Custom go vet analyzers using golang.org/x/tools/go/analysis.

Analyzers:

| Analyzer | Rule | Rationale |
|----------|------|-----------|
| nobaredetector | Flag any call to GPUDetectorFactory.CreateDetector() whose result is not assigned to a variable that also receives CreateDetectorWithInfo() | CreateDetector silently discards GPUDetectionInfo needed for manifest and audit log |
| manifestenv | Flag any function with return type manifest.Artifacts where the Environment field is not explicitly set before return | Enforces schema validation at the call site, not just in tests |
| noinlinecreds | Flag any struct literal of type Config where RedisPassword, SecretKey, or AccessKey fields are set to non-empty string literals | Credentials must not appear in source or config files |
| hipaacomplete | Flag any switch or if-else on compliance_mode == "hipaa" that does not check all six hard-required fields | Prevents partial HIPAA enforcement from silently passing |

Files:

  • tools/fetchml-vet/cmd/fetchml-vet/main.go - CLI entry point
  • tools/fetchml-vet/analyzers/nobaredetector.go
  • tools/fetchml-vet/analyzers/manifestenv.go
  • tools/fetchml-vet/analyzers/noinlinecredentials.go
  • tools/fetchml-vet/analyzers/hipaacomplete.go

Commands:

make lint-custom             # Build and run custom analyzers

CI Integration: Runs on every commit via verification.yml workflow. Lint failures are build failures, not warnings.

Audit Chain Integrity Verification

Purpose: Proves audit logs have not been tampered with by verifying the integrity chain. Each entry includes a hash of the previous entry, forming a hash chain. Any insertion, deletion, or modification breaks the chain from that point onward.

Implementation:

type Event struct {
    Timestamp    time.Time              `json:"timestamp"`
    EventType    EventType              `json:"event_type"`
    UserID       string                 `json:"user_id,omitempty"`
    // ... other fields
    PrevHash     string                 `json:"prev_hash"`     // hash of previous entry
    EntryHash    string                 `json:"entry_hash"`    // hash of this entry's fields + prev_hash
    SequenceNum  int64                  `json:"sequence_num"`
}

Components:

  • internal/audit/verifier.go - Chain verification logic
  • cmd/audit-verifier/main.go - Standalone CLI tool
  • tests/unit/audit/verifier_test.go - Unit tests

Features:

  • Continuous verification: Background job runs every 15 minutes (HIPAA) or hourly (other)
  • Tamper detection: Identifies first sequence number where chain breaks
  • External verification: Chain root hash can be published to append-only store (S3 Object Lock, Azure Immutable Blob)
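
Tamper detection can be illustrated with a small sketch of chain construction and verification. This is not the code in internal/audit/verifier.go: the Entry type is a trimmed, hypothetical version of the Event struct above, and the use of SHA-256 with a simple delimited field encoding is an assumption made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Entry is a trimmed, hypothetical version of the Event struct.
type Entry struct {
	SequenceNum int64
	Payload     string // stands in for the event's other fields
	PrevHash    string
	EntryHash   string
}

// hashEntry derives the entry hash from the entry's fields plus PrevHash.
// SHA-256 and the "|"-delimited encoding are assumptions for this sketch;
// a real encoding must be unambiguous for arbitrary field contents.
func hashEntry(e Entry) string {
	data := fmt.Sprintf("%d|%s|%s", e.SequenceNum, e.Payload, e.PrevHash)
	sum := sha256.Sum256([]byte(data))
	return hex.EncodeToString(sum[:])
}

// Append links a new entry to the end of the chain. Sequence numbers
// start at 1 and the first entry has an empty PrevHash.
func Append(chain []Entry, payload string) []Entry {
	e := Entry{SequenceNum: int64(len(chain)) + 1, Payload: payload}
	if len(chain) > 0 {
		e.PrevHash = chain[len(chain)-1].EntryHash
	}
	e.EntryHash = hashEntry(e)
	return append(chain, e)
}

// FirstBreak returns the first sequence number at which the chain fails
// verification, or 0 if the whole chain is intact.
func FirstBreak(chain []Entry) int64 {
	prev := ""
	for i, e := range chain {
		want := int64(i) + 1
		if e.SequenceNum != want || e.PrevHash != prev || e.EntryHash != hashEntry(e) {
			return want
		}
		prev = e.EntryHash
	}
	return 0
}

func main() {
	var chain []Entry
	for _, p := range []string{"login", "job_submit", "logout"} {
		chain = Append(chain, p)
	}
	fmt.Println("intact chain, first break:", FirstBreak(chain))

	// Modify the second entry without recomputing its hash.
	tampered := append([]Entry(nil), chain...)
	tampered[1].Payload = "job_delete"
	fmt.Println("tampered chain, first break:", FirstBreak(tampered))
}
```

Because every entry hash covers PrevHash, an attacker who rewrites one entry must recompute every later hash. Publishing the latest hash to an append-only store, as described above, is what makes such a full rewrite detectable.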

Commands:

make verify-audit                                    # Run unit tests
make verify-audit-chain AUDIT_LOG_PATH=/path/to.log  # Verify specific log file

CLI Usage:

# Single verification
./bin/audit-verifier -log-path=/var/log/fetchml/audit.log

# Continuous monitoring
./bin/audit-verifier -log-path=/var/log/fetchml/audit.log -continuous -interval=15m

CI Integration: Runs on every commit via verification.yml workflow.

Maintenance Cadence

| Activity | Frequency | Blocking | Location |
|----------|-----------|----------|----------|
| Schema drift check | Every commit | Yes | verification.yml |
| Property-based tests | Every commit | Yes | verification.yml (planned) |
| Custom lint rules | Every commit | Yes | verification.yml |
| gosec + nancy | Every commit | Yes | security-scan.yml |
| trivy image scan | Every commit | Yes (CRITICAL) | security-scan.yml |
| Audit chain verification | 15 min (HIPAA), hourly | Alerts | Deployment config |
| Mutation testing | Pre-release | Yes (< 80%) | Release workflow (planned) |
| Fault injection | Nightly + pre-release | Yes (pre-release) | Nightly workflow (planned) |
| OpenSSF Scorecard | Weekly | Alerts (>1pt drop) | Weekly workflow (planned) |
| Reproducibility | Toolchain changes | Yes | Verify workflow (planned) |

Usage

Quick Verification (Development)

make verify-quick    # Fast checks: schema only

Full Verification (CI)

make verify-all      # All verification checks

Install Verification Tools

make install-verify-deps    # Install all verification tooling

CI/CD Integration

The verification.yml workflow runs automatically on:

  • Every push to main or develop
  • Every pull request to main or develop
  • Nightly (for scorecard and extended checks)

Jobs:

  1. schema-drift-check - Schema validation
  2. custom-lint - Custom analyzers
  3. audit-verification - Audit chain integrity
  4. security-scan-extended - Extended security scanning
  5. scorecard - OpenSSF Scorecard (weekly)

Planned Components

| Component | Status | Description |
|-----------|--------|-------------|
| Property-Based Testing | Planned | gopter for behavioral invariants across all valid inputs |
| Mutation Testing | Planned | go-mutesting to verify tests catch security invariants |
| SLSA Conformance | Planned | Supply chain provenance at Level 2/3 |
| Continuous Scanning | Partial | trivy, grype, checkov, nancy integration |
| Reproducible Builds | Planned | Binary and container image reproducibility |
| Fault Injection | Planned | toxiproxy, libfiu for resilience testing |
| OpenSSF Scorecard | Partial | Scorecard evaluation and badge |

Relationship to Security Plan

This verification layer builds on the Security Plan by adding continuous enforcement:

Security Plan (implement controls)
    ↓
Verification Plan (enforce and maintain controls)
    ↓
Ongoing: scanning, scoring, fault injection, audit verification

The Compliance Dashboard and Compliance Reporting features from the Security Plan consume outputs from this verification layer—scan results, mutation scores, SLSA provenance, Scorecard ratings, and audit chain verification status feed directly into compliance reporting.