fetch_ml/internal
Jeremie Fraeys 92aab06d76
feat(security): implement comprehensive security hardening phases 1-5,7
Implements defense-in-depth security for HIPAA and multi-tenant requirements:

**Phase 1 - File Ingestion Security:**
- SecurePathValidator with symlink resolution and path boundary enforcement
  in internal/fileutil/secure.go
- Magic bytes validation for ML artifacts (safetensors, GGUF, HDF5, numpy)
  in internal/fileutil/filetype.go
- Dangerous extension blocking (.pt, .pkl, .exe, .sh, .zip)
- Upload limits (10 GB max size, 100 MB/s rate, 10 uploads/min)
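
A minimal sketch of the magic-bytes and extension checks listed above, not the code in internal/fileutil/filetype.go. The function names `detectArtifactType` and `isBlockedExtension` are hypothetical. The GGUF, HDF5 and NumPy signatures are the published ones. safetensors has no magic number, so the sketch assumes a heuristic: an 8-byte little-endian header length followed by a JSON object. The 100 MiB header cap is likewise an assumption.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"path/filepath"
	"strings"
)

// Published file signatures for three of the formats named in the commit.
var (
	magicGGUF  = []byte("GGUF")
	magicHDF5  = []byte("\x89HDF\r\n\x1a\n")
	magicNumpy = []byte("\x93NUMPY")
)

// blockedExt mirrors the extension list from the commit message.
var blockedExt = map[string]bool{
	".pt": true, ".pkl": true, ".exe": true, ".sh": true, ".zip": true,
}

// isBlockedExtension reports whether the file name ends in a blocked
// extension, compared case-insensitively.
func isBlockedExtension(name string) bool {
	return blockedExt[strings.ToLower(filepath.Ext(name))]
}

// detectArtifactType inspects the first bytes of a file and returns a format
// name, or "" when the header matches none of the known formats.
func detectArtifactType(header []byte) string {
	switch {
	case bytes.HasPrefix(header, magicGGUF):
		return "gguf"
	case bytes.HasPrefix(header, magicHDF5):
		return "hdf5"
	case bytes.HasPrefix(header, magicNumpy):
		return "numpy"
	}
	// Assumed heuristic for safetensors: a plausible little-endian uint64
	// header length, then the opening brace of the JSON header.
	if len(header) >= 9 {
		n := binary.LittleEndian.Uint64(header[:8])
		if n > 0 && n < 100<<20 && header[8] == '{' {
			return "safetensors"
		}
	}
	return ""
}

func main() {
	fmt.Println(isBlockedExtension("weights.pkl"))
	fmt.Println(detectArtifactType([]byte("GGUF\x03\x00\x00\x00")))
}
```

Checking content rather than trusting the extension means a renamed pickle file is rejected even when its name ends in an allowed suffix.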

**Phase 2 - Sandbox Hardening:**
- ApplySecurityDefaults() with secure-by-default principle
  - network_mode: none, read_only_root: true, no_new_privileges: true
  - drop_all_caps: true, user_ns: true, run_as_uid/gid: 1000
- PodmanSecurityConfig and BuildSecurityArgs() in internal/container/podman.go
- BuildPodmanCommand now accepts full security configuration
- Container executor passes SandboxConfig to Podman command builder
- configs/seccomp/default-hardened.json blocks dangerous syscalls
  (ptrace, mount, reboot, kexec_load, open_by_handle_at)
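
The mapping from these defaults to Podman flags could look roughly like the sketch below. It is not the actual `BuildSecurityArgs()` from internal/container/podman.go: the `sandboxSecurity` struct, its field names, and the helper names are hypothetical, and translating `user_ns: true` to `--userns=auto` is an assumption. The flags themselves are standard `podman run` options.

```go
package main

import (
	"fmt"
	"strings"
)

// sandboxSecurity is a hypothetical stand-in for the commit's security config.
type sandboxSecurity struct {
	NetworkNone     bool
	ReadOnlyRoot    bool
	NoNewPrivileges bool
	DropAllCaps     bool
	UserNS          bool
	RunAsUID        int
	RunAsGID        int
	SeccompProfile  string
}

// secureDefaults returns the secure-by-default values from the commit message.
func secureDefaults() sandboxSecurity {
	return sandboxSecurity{
		NetworkNone:     true,
		ReadOnlyRoot:    true,
		NoNewPrivileges: true,
		DropAllCaps:     true,
		UserNS:          true,
		RunAsUID:        1000,
		RunAsGID:        1000,
		SeccompProfile:  "configs/seccomp/default-hardened.json",
	}
}

// buildSecurityArgs translates the config into `podman run` flags.
func buildSecurityArgs(c sandboxSecurity) []string {
	var args []string
	if c.NetworkNone {
		args = append(args, "--network=none")
	}
	if c.ReadOnlyRoot {
		args = append(args, "--read-only")
	}
	if c.NoNewPrivileges {
		args = append(args, "--security-opt=no-new-privileges")
	}
	if c.DropAllCaps {
		args = append(args, "--cap-drop=all")
	}
	if c.UserNS {
		// Assumption: the real code may choose a different userns mode.
		args = append(args, "--userns=auto")
	}
	args = append(args, fmt.Sprintf("--user=%d:%d", c.RunAsUID, c.RunAsGID))
	if c.SeccompProfile != "" {
		args = append(args, "--security-opt=seccomp="+c.SeccompProfile)
	}
	return args
}

func main() {
	args := buildSecurityArgs(secureDefaults())
	fmt.Println("podman run " + strings.Join(args, " ") + " IMAGE")
}
```

Building the arguments as a slice rather than a single shell string keeps each flag a separate argv entry, so values such as the seccomp path are never re-parsed by a shell.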

**Phase 3 - Secrets Management:**
- expandSecrets() for environment variable expansion using ${VAR} syntax
- validateNoPlaintextSecrets() with entropy-based detection
- Pattern matching for AWS, GitHub, GitLab, OpenAI, Stripe tokens
- Shannon entropy calculation (>4 bits/char triggers detection)
- Secrets expanded during LoadConfig() before validation
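
To illustrate the two ideas above, here is a hedged sketch of `${VAR}` expansion and entropy-based detection. It does not reproduce `expandSecrets()` or `validateNoPlaintextSecrets()`: the signatures, the injected `lookup` function, the 20-character minimum length, and the two token patterns shown (AWS access key IDs and GitHub personal access tokens) are assumptions chosen for illustration. Only the threshold of more than 4 bits per character comes from the commit message.

```go
package main

import (
	"fmt"
	"math"
	"os"
	"regexp"
	"strings"
)

var (
	refPattern = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)\}`)
	// Illustrative subset of token patterns; the real list is longer.
	tokenPatterns = []*regexp.Regexp{
		regexp.MustCompile(`AKIA[0-9A-Z]{16}`),    // AWS access key ID
		regexp.MustCompile(`ghp_[A-Za-z0-9]{36}`), // GitHub personal access token
	}
)

// expandSecrets replaces every ${VAR} using lookup and fails if any is unset.
func expandSecrets(s string, lookup func(string) (string, bool)) (string, error) {
	var missing []string
	out := refPattern.ReplaceAllStringFunc(s, func(m string) string {
		name := m[2 : len(m)-1] // strip "${" and "}"
		v, ok := lookup(name)
		if !ok {
			missing = append(missing, name)
			return m
		}
		return v
	})
	if len(missing) > 0 {
		return "", fmt.Errorf("unset secret variables: %s", strings.Join(missing, ", "))
	}
	return out, nil
}

// shannonEntropy returns the entropy of s in bits per byte.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	h := 0.0
	for _, c := range counts {
		if c > 0 {
			p := float64(c) / float64(len(s))
			h -= p * math.Log2(p)
		}
	}
	return h
}

// looksLikeSecret flags known token shapes and long high-entropy strings.
func looksLikeSecret(v string) bool {
	for _, re := range tokenPatterns {
		if re.MatchString(v) {
			return true
		}
	}
	// Assumed minimum length; short strings rarely carry enough signal.
	return len(v) >= 20 && shannonEntropy(v) > 4.0
}

func main() {
	out, err := expandSecrets("token=${HOME}", os.LookupEnv)
	fmt.Println(out, err)
	fmt.Println(looksLikeSecret("AKIAIOSFODNN7EXAMPLE"))
}
```

Validating the raw configuration values before expansion is one way to keep `${VAR}` references from being flagged, since the reference itself is short and matches no token pattern.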

**Phase 5 - HIPAA Audit Logging:**
- Tamper-evident chain hashing with SHA-256 in internal/audit/audit.go
- Event struct extended with PrevHash, EventHash, SequenceNum
- File access event types: EventFileRead, EventFileWrite, EventFileDelete
- LogFileAccess() helper for HIPAA compliance
- VerifyChain() function for tamper detection
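
The chain-hashing scheme can be sketched as follows. This is an assumption-laden illustration, not the code in internal/audit/audit.go: only the field names `PrevHash`, `EventHash` and `SequenceNum` come from the commit message, while `EventType`, `Path`, the hashed field layout, and the helper names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Event carries the three chain fields named in the commit message plus two
// hypothetical payload fields.
type Event struct {
	SequenceNum int64
	EventType   string
	Path        string
	PrevHash    string
	EventHash   string
}

// hashEvent computes SHA-256 over every field except EventHash. %q quotes the
// strings so field boundaries stay unambiguous.
func hashEvent(e Event) string {
	h := sha256.New()
	fmt.Fprintf(h, "%d|%q|%q|%q", e.SequenceNum, e.EventType, e.Path, e.PrevHash)
	return hex.EncodeToString(h.Sum(nil))
}

// appendEvent links a new event to the hash of the previous one.
func appendEvent(chain []Event, eventType, path string) []Event {
	e := Event{SequenceNum: int64(len(chain)) + 1, EventType: eventType, Path: path}
	if n := len(chain); n > 0 {
		e.PrevHash = chain[n-1].EventHash
	}
	e.EventHash = hashEvent(e)
	return append(chain, e)
}

// verifyChain returns the index of the first inconsistent event, or -1 when
// the whole chain verifies.
func verifyChain(chain []Event) int {
	prev := ""
	for i, e := range chain {
		if e.SequenceNum != int64(i)+1 || e.PrevHash != prev || e.EventHash != hashEvent(e) {
			return i
		}
		prev = e.EventHash
	}
	return -1
}

func main() {
	var chain []Event
	chain = appendEvent(chain, "file_read", "/data/a.csv")
	chain = appendEvent(chain, "file_write", "/data/b.csv")
	fmt.Println(verifyChain(chain))
	chain[0].Path = "/data/other.csv" // simulate tampering
	fmt.Println(verifyChain(chain))
}
```

Because each event commits to the hash of its predecessor, editing or deleting an earlier record invalidates every later link unless the attacker can rewrite the rest of the log as well.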

**Supporting Changes:**
- Add DeleteJob() and DeleteJobsByPrefix() to storage package
- Integrate SecurePathValidator in artifact scanning
2026-02-23 18:00:33 -05:00
| Directory | Last commit | Date |
|---|---|---|
| api | refactor(api): internal refactoring for TUI and worker modules | 2026-02-20 15:51:23 -05:00 |
| audit | feat(security): implement comprehensive security hardening phases 1-5,7 | 2026-02-23 18:00:33 -05:00 |
| auth | fix(auth): make DeleteAPIKey resilient to keyring errors | 2026-02-21 21:19:46 -05:00 |
| config | refactor(api): internal refactoring for TUI and worker modules | 2026-02-20 15:51:23 -05:00 |
| container | feat(security): implement comprehensive security hardening phases 1-5,7 | 2026-02-23 18:00:33 -05:00 |
| controller | Fix multi-user authentication and clean up debug code | 2025-12-06 12:35:32 -05:00 |
| crypto | feat: implement Argon2id hashing and Ed25519 manifest signing | 2026-02-19 15:34:20 -05:00 |
| domain | refactor(api): internal refactoring for TUI and worker modules | 2026-02-20 15:51:23 -05:00 |
| envpool | feat(worker): add integrity checks, snapshot staging, and prewarm support | 2026-01-05 12:31:13 -05:00 |
| errtypes | feat: implement research-grade maintainability phases 1,3,4,7 | 2026-02-18 15:27:50 -05:00 |
| experiment | refactor: adopt PathRegistry in experiment manager | 2026-02-18 16:53:41 -05:00 |
| fileutil | feat(security): implement comprehensive security hardening phases 1-5,7 | 2026-02-23 18:00:33 -05:00 |
| jupyter | test: Reorganize and add unit tests | 2026-02-18 21:28:13 -05:00 |
| logging | refactor(api): internal refactoring for TUI and worker modules | 2026-02-20 15:51:23 -05:00 |
| manifest | feat: add manifest signing and native hashing support | 2026-02-19 15:34:39 -05:00 |
| metrics | refactor: Phase 6 - Complete migration, remove legacy files | 2026-02-17 14:39:48 -05:00 |
| middleware | fix: resolve TODOs and standardize tests | 2026-02-19 15:34:59 -05:00 |
| network | refactor(dependency-hygiene): Move path functions from config to storage | 2026-02-17 21:15:23 -05:00 |
| privacy | feat: Privacy and PII detection | 2026-02-18 21:27:23 -05:00 |
| prommetrics | feat(api): refactor websocket handlers; add health and prometheus middleware | 2026-01-05 12:31:07 -05:00 |
| queue | feat: native GPU detection and NVML bridge for macOS and Linux | 2026-02-21 17:59:59 -05:00 |
| resources | feat(worker): add integrity checks, snapshot staging, and prewarm support | 2026-01-05 12:31:13 -05:00 |
| security | feat: add security monitoring and validation framework | 2026-02-19 15:34:25 -05:00 |
| storage | feat(security): implement comprehensive security hardening phases 1-5,7 | 2026-02-23 18:00:33 -05:00 |
| telemetry | Fix multi-user authentication and clean up debug code | 2025-12-06 12:35:32 -05:00 |
| tracking | feat(tracking): add pluggable tracking backends and audit support | 2026-01-05 12:33:57 -05:00 |
| validation | feat: add security monitoring and validation framework | 2026-02-19 15:34:25 -05:00 |
| worker | feat(security): implement comprehensive security hardening phases 1-5,7 | 2026-02-23 18:00:33 -05:00 |