docs(security): document comprehensive security hardening

Updates documentation with new security features and hardening guide:

**CHANGELOG.md:**
- Added detailed security hardening section (2026-02-23)
- Documents all phases: file ingestion, sandbox, secrets, audit logging, tests
- Lists specific files changed and security controls implemented

**docs/src/security.md:**
- Added Overview section with defense-in-depth layers
- Added Comprehensive Security Hardening section with:
  - File ingestion security with code examples
  - Sandbox hardening with complete YAML config
  - Secrets management with env expansion syntax
  - HIPAA audit logging with tamper-evident chain hashing
Jeremie Fraeys 2026-02-23 18:03:25 -05:00
parent fccced6bb3
commit b00439b86e
2 changed files with 158 additions and 0 deletions

@@ -1,5 +1,52 @@
## [Unreleased]
### Security - Comprehensive Hardening (2026-02-23)
**File Ingestion Security (Phase 1):**
- `internal/fileutil/secure.go`: Added `SecurePathValidator` with symlink resolution and path boundary enforcement to prevent path traversal attacks
- `internal/fileutil/filetype.go`: New file with magic bytes validation for ML artifacts (safetensors, GGUF, HDF5, numpy)
- `internal/fileutil/filetype.go`: Dangerous extension blocking (.pt, .pkl, .pickle, .exe, .sh, .zip) to prevent pickle deserialization and executable injection
- `internal/worker/artifacts.go`: Integrated `SecurePathValidator` for artifact path validation
- `internal/worker/config.go`: Added upload limits to `SandboxConfig` (MaxUploadSizeBytes: 10GB, MaxUploadRateBps: 100MB/s, MaxUploadsPerMinute: 10)
**Sandbox Hardening (Phase 2):**
- `internal/worker/config.go`: Added `ApplySecurityDefaults()` with secure-by-default principle
  - NetworkMode: "none" (was empty string)
  - ReadOnlyRoot: true
  - NoNewPrivileges: true
  - DropAllCaps: true
  - UserNS: true (user namespace)
  - RunAsUID/RunAsGID: 1000 (non-root)
  - SeccompProfile: "default-hardened"
- `internal/container/podman.go`: Added `PodmanSecurityConfig` struct and `BuildSecurityArgs()` function
- `internal/container/podman.go`: `BuildPodmanCommand` now accepts security config with full sandbox hardening
- `internal/worker/executor/container.go`: Container executor now passes `SandboxConfig` to Podman command builder
- `configs/seccomp/default-hardened.json`: New hardened seccomp profile blocking dangerous syscalls (ptrace, mount, reboot, kexec_load)
**Secrets Management (Phase 3):**
- `internal/worker/config.go`: Added `expandSecrets()` for environment variable expansion using `${VAR}` syntax
- `internal/worker/config.go`: Added `validateNoPlaintextSecrets()` with entropy-based detection and pattern matching
- `internal/worker/config.go`: Detects AWS keys (AKIA/ASIA), GitHub tokens (ghp_/gho_), GitLab (glpat-), OpenAI/Stripe (sk-)
- `internal/worker/config.go`: Shannon entropy calculation to detect high-entropy secrets (>4 bits/char)
- Secrets are expanded from environment during `LoadConfig()` before validation
**HIPAA-Compliant Audit Logging (Phase 5):**
- `internal/audit/audit.go`: Added tamper-evident chain hashing with SHA-256
- `internal/audit/audit.go`: New file access event types: `EventFileRead`, `EventFileWrite`, `EventFileDelete`
- `internal/audit/audit.go`: `Event` struct extended with `PrevHash`, `EventHash`, `SequenceNum` for integrity chain
- `internal/audit/audit.go`: Added `LogFileAccess()` helper for HIPAA file access logging
- `internal/audit/audit.go`: Added `VerifyChain()` function for tamper detection
**Security Testing (Phase 7):**
- `tests/unit/security/path_traversal_test.go`: 3 tests for `SecurePathValidator` including symlink escape prevention
- `tests/unit/security/filetype_test.go`: 3 tests for magic bytes validation and dangerous extension detection
- `tests/unit/security/secrets_test.go`: 3 tests for env expansion and plaintext secret detection with entropy validation
- `tests/unit/security/audit_test.go`: 4 tests for audit logger chain integrity and file access logging
**Supporting Changes:**
- `internal/storage/db_jobs.go`: Added `DeleteJob()` and `DeleteJobsByPrefix()` methods
- `tests/benchmarks/payload_performance_test.go`: Updated to use `DeleteJob()` for proper test isolation
### Added - CSV Export Features (2026-02-18)
- CLI: `ml compare --csv` - Export run comparisons as CSV with actual run IDs as column headers
- CLI: `ml find --csv` - Export search results as CSV for spreadsheet analysis

@@ -2,6 +2,18 @@
This document outlines security features, best practices, and hardening procedures for FetchML.
## Overview
FetchML implements defense-in-depth security with multiple layers of protection:
1. **File Ingestion Security** - Path traversal prevention, file type validation
2. **Sandbox Hardening** - Container isolation with seccomp, capability dropping
3. **Secrets Management** - Environment-based credential injection with plaintext detection
4. **Audit Logging** - Tamper-evident logging for compliance (HIPAA)
5. **Authentication** - API key-based access control with RBAC
---
## Security Features
### Authentication & Authorization
@@ -25,6 +37,105 @@ This document outlines security features, best practices, and hardening procedur
- **Firewall Rules**: Restrictive port access
- **Container Isolation**: Services run in separate containers/pods
---
## Comprehensive Security Hardening (2026-02)
### File Ingestion Security
All file operations are protected against path traversal attacks:
```go
// All paths are validated with symlink resolution
validator := fileutil.NewSecurePathValidator(basePath)
cleanPath, err := validator.ValidatePath(userInput)
if err != nil {
	return fmt.Errorf("path validation failed: %w", err)
}
```
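The internals of `SecurePathValidator` are not shown here. As a rough sketch of the underlying technique (resolve symlinks, then check that the result stays inside the base directory), something like the following could work. The function name `validateWithinBase` is hypothetical, and the sketch assumes the target path already exists, since `filepath.EvalSymlinks` fails on missing paths.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// validateWithinBase is a hypothetical helper, not FetchML's actual API.
// It resolves symlinks in both the base directory and the requested path,
// then rejects any result that falls outside the base.
func validateWithinBase(basePath, userInput string) (string, error) {
	absBase, err := filepath.Abs(basePath)
	if err != nil {
		return "", fmt.Errorf("resolve base: %w", err)
	}
	// Canonicalize the base so the comparison uses its real location.
	base, err := filepath.EvalSymlinks(absBase)
	if err != nil {
		return "", fmt.Errorf("resolve base symlinks: %w", err)
	}
	// Join cleans the path; EvalSymlinks follows any links inside it.
	resolved, err := filepath.EvalSymlinks(filepath.Join(base, userInput))
	if err != nil {
		return "", fmt.Errorf("resolve path: %w", err)
	}
	// A relative path that is ".." or starts with "../" leaves the base.
	rel, err := filepath.Rel(base, resolved)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes base directory", userInput)
	}
	return resolved, nil
}
```

Checking the resolved path rather than the raw input is what catches a symlink inside the base directory that points outside it.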
**Features:**
- Symlink resolution and canonicalization
- Path boundary enforcement (cannot escape base directory)
- Magic bytes validation for ML artifacts (safetensors, GGUF, HDF5, NumPy)
- Dangerous extension blocking (.pt, .pkl, .pickle, .exe, .sh, .zip)
- Upload limits (size, rate, frequency)
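To illustrate the magic bytes and extension checks, here is a minimal sketch. The names `detectArtifactType` and `isBlockedExtension` are hypothetical and do not reflect the contents of `internal/fileutil/filetype.go`. The signatures used are the published ones for GGUF, HDF5, and NumPy `.npy` files; safetensors is left out because it begins with a header length rather than a fixed signature.

```go
package main

import (
	"bytes"
	"path/filepath"
	"strings"
)

// magicPrefixes maps a format name to the bytes its files start with.
var magicPrefixes = map[string][]byte{
	"gguf":  []byte("GGUF"),
	"hdf5":  {0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'},
	"numpy": {0x93, 'N', 'U', 'M', 'P', 'Y'},
}

// blockedExtensions mirrors the extension list in the changelog entry.
var blockedExtensions = map[string]bool{
	".pt": true, ".pkl": true, ".pickle": true,
	".exe": true, ".sh": true, ".zip": true,
}

// detectArtifactType reports which known format the header bytes match.
func detectArtifactType(header []byte) (string, bool) {
	for name, magic := range magicPrefixes {
		if bytes.HasPrefix(header, magic) {
			return name, true
		}
	}
	return "", false
}

// isBlockedExtension checks the file extension case-insensitively.
func isBlockedExtension(filename string) bool {
	return blockedExtensions[strings.ToLower(filepath.Ext(filename))]
}
```

Checking content as well as the extension matters because a renamed pickle file would pass an extension check alone.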
### Sandbox Hardening
Containers run with hardened security defaults:
```yaml
# configs/worker/homelab-sandbox.yaml
sandbox:
  network_mode: "none"                 # No network access by default
  read_only_root: true                 # Read-only filesystem
  no_new_privileges: true              # Prevent privilege escalation
  drop_all_caps: true                  # Drop all capabilities
  allowed_caps: []                     # Add CAP_* capabilities only if required
  user_ns: true                        # User namespace isolation
  run_as_uid: 1000                     # Run as non-root user
  run_as_gid: 1000
  seccomp_profile: "default-hardened"  # Restricted syscall profile
  max_runtime_hours: 24
  max_upload_size_bytes: 10737418240   # 10 GiB
  max_upload_rate_bps: 104857600       # 100 MiB/s
  max_uploads_per_minute: 10
```
**Seccomp Profile** (`configs/seccomp/default-hardened.json`):
- Blocks: `ptrace`, `mount`, `umount2`, `reboot`, `kexec_load`
- Blocks: `open_by_handle_at`, `perf_event_open`
- Default action: `SCMP_ACT_ERRNO` (deny by default)
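As an illustration of how these settings might translate into container flags, the sketch below maps a config struct onto standard `podman run` options. The struct and function names are hypothetical, and the actual mapping in `BuildSecurityArgs()` may differ. In particular, `--userns=auto` is only one of several user namespace modes Podman offers and is chosen here as an assumption.

```go
package main

import "fmt"

// sandboxConfig is a hypothetical stand-in for the worker's sandbox settings.
type sandboxConfig struct {
	NetworkMode        string
	ReadOnlyRoot       bool
	NoNewPrivileges    bool
	DropAllCaps        bool
	AllowedCaps        []string
	UserNS             bool
	RunAsUID, RunAsGID int
	SeccompProfilePath string
}

// buildSecurityArgs turns the config into `podman run` flags.
func buildSecurityArgs(c sandboxConfig) []string {
	var args []string
	if c.NetworkMode != "" {
		args = append(args, "--network="+c.NetworkMode)
	}
	if c.ReadOnlyRoot {
		args = append(args, "--read-only")
	}
	if c.NoNewPrivileges {
		args = append(args, "--security-opt=no-new-privileges")
	}
	if c.DropAllCaps {
		args = append(args, "--cap-drop=ALL")
	}
	// Capabilities are added back one by one after the blanket drop.
	for _, capName := range c.AllowedCaps {
		args = append(args, "--cap-add="+capName)
	}
	if c.UserNS {
		args = append(args, "--userns=auto") // assumed mode
	}
	args = append(args, fmt.Sprintf("--user=%d:%d", c.RunAsUID, c.RunAsGID))
	if c.SeccompProfilePath != "" {
		args = append(args, "--security-opt=seccomp="+c.SeccompProfilePath)
	}
	return args
}
```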
### Secrets Management
**Environment Variable Expansion:**
```yaml
# config.yaml - use ${VAR} syntax for secrets
redis_password: "${REDIS_PASSWORD}"
snapshot_store:
  access_key: "${AWS_ACCESS_KEY_ID}"
  secret_key: "${AWS_SECRET_ACCESS_KEY}"
```
**Plaintext Detection:**
The system detects and rejects plaintext secrets using:
- Shannon entropy calculation (values above 4 bits per character are treated as likely secrets)
- Pattern matching: AWS keys (`AKIA`, `ASIA`), GitHub tokens (`ghp_`, `gho_`), GitLab tokens (`glpat-`), OpenAI/Stripe keys (`sk-`)
**Loading Process:**
1. Config loaded from YAML
2. Environment variables expanded (`${VAR}` → value)
3. Plaintext secrets detected and rejected
4. Validation fails if secrets don't use env reference syntax
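The expansion step might look like the sketch below. It is not the actual `expandSecrets()` implementation. The name `expandEnvRefs` and the choice to fail on unset variables are assumptions. A regular expression is used instead of `os.ExpandEnv` so that only the braced `${VAR}` form is expanded.

```go
package main

import (
	"fmt"
	"regexp"
)

// envRef matches ${VAR} references only; bare $VAR is left untouched.
var envRef = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)\}`)

// expandEnvRefs replaces each ${VAR} in value using lookup and returns
// an error naming any variables that lookup could not resolve.
func expandEnvRefs(value string, lookup func(string) (string, bool)) (string, error) {
	var missing []string
	out := envRef.ReplaceAllStringFunc(value, func(match string) string {
		name := match[2 : len(match)-1] // strip "${" and "}"
		v, ok := lookup(name)
		if !ok {
			missing = append(missing, name)
			return match
		}
		return v
	})
	if len(missing) > 0 {
		return "", fmt.Errorf("unset environment variables: %v", missing)
	}
	return out, nil
}
```

In real use the lookup argument would be `os.LookupEnv`; taking it as a parameter keeps the function easy to test without touching the process environment.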
### HIPAA-Compliant Audit Logging
**Tamper-Evident Logging:**
```go
// Each event includes chain hash for integrity
audit.Log(audit.Event{
	EventType: audit.EventFileRead,
	UserID:    "user1",
	Resource:  "/data/file.txt",
})
```
**Event Types:**
- `file_read` - File access logged
- `file_write` - File modification logged
- `file_delete` - File deletion logged
- `auth_success` / `auth_failure` - Authentication events
- `job_queued` / `job_started` / `job_completed` - Job lifecycle
**Chain Hashing:**
- Each event includes the SHA-256 hash of the previous event
- Modifying any log entry breaks the chain
- `VerifyChain()` function detects tampering
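The chaining idea can be sketched in a few lines. This is not the code in `internal/audit/audit.go`. The type `chainedEvent` and the functions below are hypothetical, though the field names `PrevHash`, `EventHash`, and `SequenceNum` follow the changelog. The payload format that gets hashed is also an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// chainedEvent is a hypothetical, trimmed-down audit event.
type chainedEvent struct {
	SequenceNum uint64
	EventType   string
	UserID      string
	Resource    string
	PrevHash    string
	EventHash   string
}

// hashEvent hashes every field except EventHash itself. %q quotes each
// string so field boundaries cannot be forged with delimiter characters.
func hashEvent(e chainedEvent) string {
	payload := fmt.Sprintf("%d|%q|%q|%q|%q",
		e.SequenceNum, e.EventType, e.UserID, e.Resource, e.PrevHash)
	sum := sha256.Sum256([]byte(payload))
	return hex.EncodeToString(sum[:])
}

// appendEvent links e to the last event in the chain and returns the new chain.
func appendEvent(chain []chainedEvent, e chainedEvent) []chainedEvent {
	e.SequenceNum = uint64(len(chain)) + 1
	if len(chain) > 0 {
		e.PrevHash = chain[len(chain)-1].EventHash
	}
	e.EventHash = hashEvent(e)
	return append(chain, e)
}

// verifyChain recomputes every hash and checks each link to its predecessor.
func verifyChain(chain []chainedEvent) error {
	prev := ""
	for i, e := range chain {
		if e.SequenceNum != uint64(i)+1 {
			return fmt.Errorf("event %d: unexpected sequence number %d", i, e.SequenceNum)
		}
		if e.PrevHash != prev {
			return fmt.Errorf("event %d: previous hash mismatch", i)
		}
		if e.EventHash != hashEvent(e) {
			return fmt.Errorf("event %d: content does not match its hash", i)
		}
		prev = e.EventHash
	}
	return nil
}
```

A chain like this detects edits and deletions in the middle of the log. It cannot by itself detect truncation of the most recent entries or a full rewrite of the chain, so the latest hash would need to be stored somewhere the log writer cannot modify.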
---
## Security Checklist
### Initial Setup