docs(security): document comprehensive security hardening

Updates documentation with new security features and hardening guide:

**CHANGELOG.md:**
- Added detailed security hardening section (2026-02-23)
- Documents all phases: file ingestion, sandbox, secrets, audit logging, tests
- Lists specific files changed and security controls implemented

**docs/src/security.md:**
- Added Overview section with defense-in-depth layers
- Added Comprehensive Security Hardening section with:
  - File ingestion security with code examples
  - Sandbox hardening with complete YAML config
  - Secrets management with env expansion syntax
  - HIPAA audit logging with tamper-evident chain hashing
2026-02-23 18:03:25 -05:00 · b00439b86e
commit b00439b86e
parent fccced6bb3
2 changed files with 158 additions and 0 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,52 @@
 ## [Unreleased]

+### Security - Comprehensive Hardening (2026-02-23)
+
+**File Ingestion Security (Phase 1):**
+- `internal/fileutil/secure.go`: Added `SecurePathValidator` with symlink resolution and path boundary enforcement to prevent path traversal attacks
+- `internal/fileutil/filetype.go`: New file with magic bytes validation for ML artifacts (safetensors, GGUF, HDF5, numpy)
+- `internal/fileutil/filetype.go`: Dangerous extension blocking (.pt, .pkl, .pickle, .exe, .sh, .zip) to prevent pickle deserialization and executable injection
+- `internal/worker/artifacts.go`: Integrated `SecurePathValidator` for artifact path validation
+- `internal/worker/config.go`: Added upload limits to `SandboxConfig` (MaxUploadSizeBytes: 10 GiB, MaxUploadRateBps: 100 MiB/s, MaxUploadsPerMinute: 10)
+
+**Sandbox Hardening (Phase 2):**
+- `internal/worker/config.go`: Added `ApplySecurityDefaults()` with secure-by-default principle
+  - NetworkMode: "none" (was empty string)
+  - ReadOnlyRoot: true
+  - NoNewPrivileges: true  
+  - DropAllCaps: true
+  - UserNS: true (user namespace)
+  - RunAsUID/RunAsGID: 1000 (non-root)
+  - SeccompProfile: "default-hardened"
+- `internal/container/podman.go`: Added `PodmanSecurityConfig` struct and `BuildSecurityArgs()` function
+- `internal/container/podman.go`: `BuildPodmanCommand` now accepts security config with full sandbox hardening
+- `internal/worker/executor/container.go`: Container executor now passes `SandboxConfig` to Podman command builder
+- `configs/seccomp/default-hardened.json`: New hardened seccomp profile blocking dangerous syscalls (ptrace, mount, reboot, kexec_load)
+
+**Secrets Management (Phase 3):**
+- `internal/worker/config.go`: Added `expandSecrets()` for environment variable expansion using `${VAR}` syntax
+- `internal/worker/config.go`: Added `validateNoPlaintextSecrets()` with entropy-based detection and pattern matching
+- `internal/worker/config.go`: Detects AWS keys (AKIA/ASIA), GitHub tokens (ghp_/gho_), GitLab (glpat-), OpenAI/Stripe (sk-)
+- `internal/worker/config.go`: Shannon entropy calculation to detect high-entropy secrets (>4 bits/char)
+- Secrets are expanded from environment during `LoadConfig()` before validation
+
+**HIPAA-Compliant Audit Logging (Phase 5):**
+- `internal/audit/audit.go`: Added tamper-evident chain hashing with SHA-256
+- `internal/audit/audit.go`: New file access event types: `EventFileRead`, `EventFileWrite`, `EventFileDelete`
+- `internal/audit/audit.go`: `Event` struct extended with `PrevHash`, `EventHash`, `SequenceNum` for integrity chain
+- `internal/audit/audit.go`: Added `LogFileAccess()` helper for HIPAA file access logging
+- `internal/audit/audit.go`: Added `VerifyChain()` function for tamper detection
+
+**Security Testing (Phase 7):**
+- `tests/unit/security/path_traversal_test.go`: 3 tests for `SecurePathValidator` including symlink escape prevention
+- `tests/unit/security/filetype_test.go`: 3 tests for magic bytes validation and dangerous extension detection
+- `tests/unit/security/secrets_test.go`: 3 tests for env expansion and plaintext secret detection with entropy validation
+- `tests/unit/security/audit_test.go`: 4 tests for audit logger chain integrity and file access logging
+
+**Supporting Changes:**
+- `internal/storage/db_jobs.go`: Added `DeleteJob()` and `DeleteJobsByPrefix()` methods
+- `tests/benchmarks/payload_performance_test.go`: Updated to use `DeleteJob()` for proper test isolation
+
 ### Added - CSV Export Features (2026-02-18)
 - CLI: `ml compare --csv` - Export run comparisons as CSV with actual run IDs as column headers
 - CLI: `ml find --csv` - Export search results as CSV for spreadsheet analysis  
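For reference, the Phase 3 entries above (`${VAR}` expansion, prefix matching, and the 4 bits/char entropy threshold) could be sketched roughly as follows. This is a minimal illustration, not the code in `internal/worker/config.go`: the names `expandSecrets`, `looksLikeSecret`, and `shannonEntropy`, the injected `lookup` parameter, and the 20-character minimum length are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
	"os"
	"strings"
)

// knownPrefixes lists the token prefixes named in the changelog entry.
var knownPrefixes = []string{"AKIA", "ASIA", "ghp_", "gho_", "glpat-", "sk-"}

// shannonEntropy returns the Shannon entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	counts := make(map[rune]int)
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	var h float64
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// looksLikeSecret reports whether value starts with a known token prefix or
// is a long, high-entropy string. The 20-character minimum is an assumption
// added here so that short, ordinary values are not flagged.
func looksLikeSecret(value string) bool {
	for _, p := range knownPrefixes {
		if strings.HasPrefix(value, p) {
			return true
		}
	}
	return len(value) >= 20 && shannonEntropy(value) > 4.0
}

// expandSecrets replaces ${VAR} references in s using lookup and returns an
// error naming the first variable that is not set. Note that os.Expand also
// expands the brace-less $VAR form.
func expandSecrets(s string, lookup func(string) (string, bool)) (string, error) {
	var missing string
	out := os.Expand(s, func(name string) string {
		v, ok := lookup(name)
		if !ok && missing == "" {
			missing = name
		}
		return v
	})
	if missing != "" {
		return "", fmt.Errorf("environment variable %q is not set", missing)
	}
	return out, nil
}
```

In a real loader `os.LookupEnv` can be passed directly as `lookup`, since it has the same signature; injecting it keeps the function testable without touching the process environment.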
--- a/docs/src/security.md
+++ b/docs/src/security.md
@@ -2,6 +2,18 @@

 This document outlines security features, best practices, and hardening procedures for FetchML.

+## Overview
+
+FetchML implements defense-in-depth security with multiple layers of protection:
+
+1. **File Ingestion Security** - Path traversal prevention, file type validation
+2. **Sandbox Hardening** - Container isolation with seccomp, capability dropping
+3. **Secrets Management** - Environment-based credential injection with plaintext detection
+4. **Audit Logging** - Tamper-evident logging for compliance (HIPAA)
+5. **Authentication** - API key-based access control with RBAC
+
+---
+
 ## Security Features

 ### Authentication & Authorization
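The path traversal prevention listed as layer 1 of the overview can be illustrated with a short sketch. It assumes the validator resolves symlinks on both the base directory and the candidate path and then checks that the result is still inside the base; only the names `NewSecurePathValidator` and `ValidatePath` come from this commit, and the bodies below are hypothetical rather than the contents of `internal/fileutil/secure.go`.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// SecurePathValidator confines user-supplied paths to a base directory.
// Hypothetical sketch; the real type may differ.
type SecurePathValidator struct {
	basePath string
}

// NewSecurePathValidator returns a validator rooted at basePath.
func NewSecurePathValidator(basePath string) *SecurePathValidator {
	return &SecurePathValidator{basePath: basePath}
}

// ValidatePath resolves userInput (relative inputs are joined to the base),
// follows symlinks, and rejects any result outside the base directory.
// The target must already exist, because EvalSymlinks fails otherwise.
func (v *SecurePathValidator) ValidatePath(userInput string) (string, error) {
	absBase, err := filepath.Abs(v.basePath)
	if err != nil {
		return "", fmt.Errorf("resolve base path: %w", err)
	}
	base, err := filepath.EvalSymlinks(absBase)
	if err != nil {
		return "", fmt.Errorf("resolve base symlinks: %w", err)
	}

	candidate := userInput
	if !filepath.IsAbs(candidate) {
		candidate = filepath.Join(base, candidate)
	}
	resolved, err := filepath.EvalSymlinks(candidate)
	if err != nil {
		return "", fmt.Errorf("resolve path: %w", err)
	}

	// A relative path that starts with ".." points outside the base.
	rel, err := filepath.Rel(base, resolved)
	if err != nil {
		return "", fmt.Errorf("compare with base: %w", err)
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes base directory", userInput)
	}
	return resolved, nil
}
```

Comparing the resolved paths with `filepath.Rel` rather than a plain string prefix avoids accepting a sibling such as `/data-other` when the base is `/data`.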
@@ -25,6 +37,105 @@ This document outlines security features, best practices, and hardening procedur
 - **Firewall Rules**: Restrictive port access
 - **Container Isolation**: Services run in separate containers/pods

+---
+
+## Comprehensive Security Hardening (2026-02)
+
+### File Ingestion Security
+
+All file operations are protected against path traversal attacks:
+
+```go
+// All paths are validated with symlink resolution
+validator := fileutil.NewSecurePathValidator(basePath)
+cleanPath, err := validator.ValidatePath(userInput)
+if err != nil {
+    return fmt.Errorf("path validation failed: %w", err)
+}
+```
+
+**Features:**
+- Symlink resolution and canonicalization
+- Path boundary enforcement (cannot escape base directory)
+- Magic bytes validation for ML artifacts (safetensors, GGUF, HDF5, NumPy)
+- Dangerous extension blocking (.pt, .pkl, .pickle, .exe, .sh, .zip)
+- Upload limits (size, rate, frequency)
+
+### Sandbox Hardening
+
+Containers run with hardened security defaults:
+
+```yaml
+# configs/worker/homelab-sandbox.yaml
+sandbox:
+  network_mode: "none"           # No network access by default
+  read_only_root: true          # Read-only filesystem
+  no_new_privileges: true       # Prevent privilege escalation
+  drop_all_caps: true           # Drop all capabilities
+  allowed_caps: []              # Re-add individual capabilities (CAP_*) only if required
+  user_ns: true                 # User namespace isolation
+  run_as_uid: 1000               # Run as non-root user
+  run_as_gid: 1000
+  seccomp_profile: "default-hardened"  # Restricted syscall profile
+  max_runtime_hours: 24
+  max_upload_size_bytes: 10737418240   # 10 GiB
+  max_upload_rate_bps: 104857600       # 100 MiB/s
+  max_uploads_per_minute: 10
+```
+
+**Seccomp Profile** (`configs/seccomp/default-hardened.json`):
+- Blocks: `ptrace`, `mount`, `umount2`, `reboot`, `kexec_load`
+- Blocks: `open_by_handle_at`, `perf_event_open`
+- Default action: `SCMP_ACT_ERRNO` (deny by default)
+
+### Secrets Management
+
+**Environment Variable Expansion:**
+```yaml
+# config.yaml - use ${VAR} syntax for secrets
+redis_password: "${REDIS_PASSWORD}"
+snapshot_store:
+  access_key: "${AWS_ACCESS_KEY_ID}"
+  secret_key: "${AWS_SECRET_ACCESS_KEY}"
+```
+
+**Plaintext Detection:**
+The system detects and rejects plaintext secrets using:
+- Shannon entropy calculation (values above 4 bits/char are treated as likely secrets)
+- Pattern matching on known prefixes: AWS keys (`AKIA`, `ASIA`), GitHub tokens (`ghp_`, `gho_`), GitLab tokens (`glpat-`), OpenAI/Stripe keys (`sk-`)
+
+**Loading Process:**
+1. Config is loaded from YAML
+2. Environment variables are expanded (`${VAR}` → value)
+3. Plaintext secrets are detected and rejected
+4. Validation fails if a secret does not use the env reference syntax
+
+### HIPAA-Compliant Audit Logging
+
+**Tamper-Evident Logging:**
+```go
+// Each event includes chain hash for integrity
+audit.Log(audit.Event{
+    EventType: audit.EventFileRead,
+    UserID:    "user1",
+    Resource:  "/data/file.txt",
+})
+```
+
+**Event Types:**
+- `file_read` - File access logged
+- `file_write` - File modification logged
+- `file_delete` - File deletion logged
+- `auth_success` / `auth_failure` - Authentication events
+- `job_queued` / `job_started` / `job_completed` - Job lifecycle
+
+**Chain Hashing:**
+- Each event includes the SHA-256 hash of the previous event
+- Modifying any log entry breaks the chain from that entry onward
+- The `VerifyChain()` function detects tampering
+
+---
+
 ## Security Checklist

 ### Initial Setup
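For reference, the chain hashing described in the audit logging section of this hunk could work roughly as sketched below. The `PrevHash`, `EventHash`, and `SequenceNum` fields are named in the changelog; everything else (the helper names `hashEvent`, `appendEvent`, `verifyChain`, the string field types, and the exact bytes that are hashed) is an assumption made for illustration and may not match `internal/audit/audit.go`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Event is a simplified audit record. Only PrevHash, EventHash and
// SequenceNum are taken from the changelog; the layout is illustrative.
type Event struct {
	EventType   string
	UserID      string
	Resource    string
	SequenceNum uint64
	PrevHash    string
	EventHash   string
}

// hashEvent returns the hex SHA-256 of the event's content together with the
// previous event's hash. %q quoting keeps field boundaries unambiguous.
func hashEvent(e Event) string {
	payload := fmt.Sprintf("%d|%q|%q|%q|%q",
		e.SequenceNum, e.EventType, e.UserID, e.Resource, e.PrevHash)
	sum := sha256.Sum256([]byte(payload))
	return hex.EncodeToString(sum[:])
}

// appendEvent links e to the end of chain and returns the extended chain.
func appendEvent(chain []Event, e Event) []Event {
	e.SequenceNum = uint64(len(chain)) + 1
	e.PrevHash = ""
	if n := len(chain); n > 0 {
		e.PrevHash = chain[n-1].EventHash
	}
	e.EventHash = hashEvent(e)
	return append(chain, e)
}

// verifyChain returns the index of the first inconsistent event, or -1 if
// every sequence number, back-link and hash checks out.
func verifyChain(chain []Event) int {
	prev := ""
	for i, e := range chain {
		if e.SequenceNum != uint64(i)+1 || e.PrevHash != prev || e.EventHash != hashEvent(e) {
			return i
		}
		prev = e.EventHash
	}
	return -1
}
```

Because each hash covers the previous hash, editing one entry invalidates that entry's own hash, and recomputing it would in turn break the back-link stored in the next entry.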