fetch_ml/internal/storage
Jeremie Fraeys 92aab06d76
feat(security): implement comprehensive security hardening phases 1-5,7
Implements defense-in-depth security for HIPAA and multi-tenant requirements:

**Phase 1 - File Ingestion Security:**
- SecurePathValidator with symlink resolution and path boundary enforcement
  in internal/fileutil/secure.go
- Magic bytes validation for ML artifacts (safetensors, GGUF, HDF5, numpy)
  in internal/fileutil/filetype.go
- Dangerous extension blocking (.pt, .pkl, .exe, .sh, .zip)
- Upload limits (10GB size, 100MB/s rate, 10 uploads/min)
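
The path-boundary and magic-bytes checks described above could look roughly like the following. This is a minimal sketch, not the code in internal/fileutil: `resolveWithin` and `sniffArtifact` are hypothetical names, and safetensors is left out because it has no fixed magic prefix (it starts with a length-prefixed JSON header that has to be parsed instead).

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// resolveWithin (hypothetical helper) resolves symlinks in candidate and
// rejects any result that falls outside baseDir.
func resolveWithin(baseDir, candidate string) (string, error) {
	base, err := filepath.Abs(baseDir)
	if err != nil {
		return "", err
	}
	// Canonicalize the base too, so both sides are compared after symlink resolution.
	if base, err = filepath.EvalSymlinks(base); err != nil {
		return "", err
	}
	if !filepath.IsAbs(candidate) {
		candidate = filepath.Join(base, candidate)
	}
	// EvalSymlinks fails for paths that do not exist, so this suits existing files.
	resolved, err := filepath.EvalSymlinks(candidate)
	if err != nil {
		return "", err
	}
	rel, err := filepath.Rel(base, resolved)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errors.New("path escapes base directory")
	}
	return resolved, nil
}

// sniffArtifact (hypothetical helper) maps the leading bytes of a file to a
// format name, or returns "" when no known signature matches.
func sniffArtifact(head []byte) string {
	switch {
	case bytes.HasPrefix(head, []byte("GGUF")):
		return "gguf"
	case bytes.HasPrefix(head, []byte("\x89HDF\r\n\x1a\n")):
		return "hdf5"
	case bytes.HasPrefix(head, []byte("\x93NUMPY")):
		return "numpy"
	}
	return ""
}

func main() {
	fmt.Println(sniffArtifact([]byte("GGUF\x03\x00\x00\x00")))
	_, err := resolveWithin(".", "..")
	fmt.Println(err)
}
```

Comparing the resolved path with `filepath.Rel` rather than a raw string prefix avoids treating `/data/uploads-evil` as being inside `/data/uploads`.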

**Phase 2 - Sandbox Hardening:**
- ApplySecurityDefaults() with secure-by-default principle
  - network_mode: none, read_only_root: true, no_new_privileges: true
  - drop_all_caps: true, user_ns: true, run_as_uid/gid: 1000
- PodmanSecurityConfig and BuildSecurityArgs() in internal/container/podman.go
- BuildPodmanCommand now accepts full security configuration
- Container executor passes SandboxConfig to Podman command builder
- configs/seccomp/default-hardened.json blocks dangerous syscalls
  (ptrace, mount, reboot, kexec_load, open_by_handle_at)
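
As an illustration of how the secure defaults listed above might translate into Podman flags, here is a hedged sketch. The type `sandboxSecurity` and the functions `secureDefaults` and `securityArgs` are hypothetical stand-ins, not the actual PodmanSecurityConfig or BuildSecurityArgs() from internal/container/podman.go. Mapping `user_ns: true` to `--userns=auto` is an assumption; the real code may choose a different user-namespace mode.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sandboxSecurity is a hypothetical mirror of the settings named in the commit.
type sandboxSecurity struct {
	NetworkMode     string
	ReadOnlyRoot    bool
	NoNewPrivileges bool
	DropAllCaps     bool
	UserNS          bool
	RunAsUID        int
	RunAsGID        int
	SeccompProfile  string // path to a seccomp JSON profile; empty means Podman's default
}

// secureDefaults returns the secure-by-default values listed in the commit message.
func secureDefaults() sandboxSecurity {
	return sandboxSecurity{
		NetworkMode:     "none",
		ReadOnlyRoot:    true,
		NoNewPrivileges: true,
		DropAllCaps:     true,
		UserNS:          true,
		RunAsUID:        1000,
		RunAsGID:        1000,
		SeccompProfile:  "configs/seccomp/default-hardened.json",
	}
}

// securityArgs converts the config into `podman run` flags.
func securityArgs(c sandboxSecurity) []string {
	var args []string
	if c.NetworkMode != "" {
		args = append(args, "--network="+c.NetworkMode)
	}
	if c.ReadOnlyRoot {
		args = append(args, "--read-only")
	}
	if c.NoNewPrivileges {
		args = append(args, "--security-opt=no-new-privileges")
	}
	if c.DropAllCaps {
		args = append(args, "--cap-drop=all")
	}
	if c.UserNS {
		args = append(args, "--userns=auto") // assumption: automatic user namespace
	}
	args = append(args, "--user="+strconv.Itoa(c.RunAsUID)+":"+strconv.Itoa(c.RunAsGID))
	if c.SeccompProfile != "" {
		args = append(args, "--security-opt=seccomp="+c.SeccompProfile)
	}
	return args
}

func main() {
	fmt.Println("podman run " + strings.Join(securityArgs(secureDefaults()), " "))
}
```

Building the flags as a slice rather than one concatenated string keeps each argument intact when it is later passed to `exec.Command`.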

**Phase 3 - Secrets Management:**
- expandSecrets() for environment variable expansion using ${VAR} syntax
- validateNoPlaintextSecrets() with entropy-based detection
- Pattern matching for AWS, GitHub, GitLab, OpenAI, Stripe tokens
- Shannon entropy calculation (>4 bits/char triggers detection)
- Secrets expanded during LoadConfig() before validation
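
The `${VAR}` expansion and the entropy threshold can be sketched as below. This is an illustrative approximation rather than the project's expandSecrets() or validateNoPlaintextSecrets(): the names `expandVars`, `shannonEntropy`, and `looksLikeSecret` are hypothetical, the 20-character minimum length is an assumption added to limit false positives on short values, and the provider-specific token patterns are omitted.

```go
package main

import (
	"fmt"
	"math"
	"os"
	"regexp"
	"strings"
)

// varRef matches ${NAME} references only; bare $NAME is deliberately ignored.
var varRef = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)\}`)

// expandVars replaces each ${NAME} using lookup and reports unset names
// instead of silently substituting an empty string.
func expandVars(s string, lookup func(string) (string, bool)) (string, error) {
	var missing []string
	out := varRef.ReplaceAllStringFunc(s, func(m string) string {
		name := m[2 : len(m)-1] // strip "${" and "}"
		v, ok := lookup(name)
		if !ok {
			missing = append(missing, name)
			return m
		}
		return v
	})
	if len(missing) > 0 {
		return "", fmt.Errorf("unset variables: %s", strings.Join(missing, ", "))
	}
	return out, nil
}

// shannonEntropy returns the entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	counts := map[rune]int{}
	n := 0
	for _, r := range s {
		counts[r]++
		n++
	}
	var h float64
	for _, c := range counts {
		p := float64(c) / float64(n)
		h -= p * math.Log2(p)
	}
	return h
}

// looksLikeSecret flags long, high-entropy values (>4 bits/char, as in the commit).
func looksLikeSecret(v string) bool {
	const minLen = 20 // assumption, not taken from the source
	return len(v) >= minLen && shannonEntropy(v) > 4.0
}

func main() {
	expanded, err := expandVars("dsn=postgres://app:${DB_PASSWORD}@db/ml", os.LookupEnv)
	fmt.Println(expanded, err)
	fmt.Println(looksLikeSecret("abcdefghijklmnopqrstuvwxyz012345"))
}
```

Passing the lookup function as a parameter (here `os.LookupEnv`) keeps the expansion testable without mutating the process environment.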

**Phase 5 - HIPAA Audit Logging:**
- Tamper-evident chain hashing with SHA-256 in internal/audit/audit.go
- Event struct extended with PrevHash, EventHash, SequenceNum
- File access event types: EventFileRead, EventFileWrite, EventFileDelete
- LogFileAccess() helper for HIPAA compliance
- VerifyChain() function for tamper detection
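
A hash chain of this kind can be sketched as follows. The example is a simplified model, not the code in internal/audit/audit.go: only `PrevHash`, `EventHash`, and `SequenceNum` come from the commit message, while the `Type` and `Path` fields, the `"file_read"` style type strings, the hashed field layout, and the helpers `hashEvent`, `appendEvent`, and `verifyChain` are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Event is a reduced, hypothetical audit record.
type Event struct {
	SequenceNum int64
	Type        string
	Path        string
	PrevHash    string
	EventHash   string
}

// hashEvent hashes every field except EventHash itself. %q quotes the strings
// so that field boundaries stay unambiguous.
func hashEvent(e Event) string {
	h := sha256.New()
	fmt.Fprintf(h, "%d|%q|%q|%q", e.SequenceNum, e.Type, e.Path, e.PrevHash)
	return hex.EncodeToString(h.Sum(nil))
}

// appendEvent links a new event to the current tail of the chain.
func appendEvent(chain []Event, typ, path string) []Event {
	e := Event{SequenceNum: int64(len(chain) + 1), Type: typ, Path: path}
	if len(chain) > 0 {
		e.PrevHash = chain[len(chain)-1].EventHash
	}
	e.EventHash = hashEvent(e)
	return append(chain, e)
}

// verifyChain recomputes every hash and link, returning the first inconsistency.
func verifyChain(chain []Event) error {
	prev := ""
	for i, e := range chain {
		if e.SequenceNum != int64(i+1) {
			return fmt.Errorf("event %d: unexpected sequence number %d", i, e.SequenceNum)
		}
		if e.PrevHash != prev {
			return fmt.Errorf("event %d: previous-hash link is broken", i)
		}
		if e.EventHash != hashEvent(e) {
			return fmt.Errorf("event %d: contents do not match stored hash", i)
		}
		prev = e.EventHash
	}
	return nil
}

func main() {
	var chain []Event
	chain = appendEvent(chain, "file_read", "/data/a.npy")
	chain = appendEvent(chain, "file_write", "/data/b.npy")
	fmt.Println(verifyChain(chain))

	chain[0].Path = "/data/other.npy" // simulate tampering
	fmt.Println(verifyChain(chain))
}
```

A chain like this detects edits, reordering, and removal of interior events, but it cannot detect truncation of the tail on its own; that needs the latest hash to be stored somewhere outside the log.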

**Supporting Changes:**
- Add DeleteJob() and DeleteJobsByPrefix() to storage package
- Integrate SecurePathValidator in artifact scanning
2026-02-23 18:00:33 -05:00
| File | Last commit | Date |
| --- | --- | --- |
| dataset.go | refactor(dependency-hygiene): Fix Redis leak, simplify TUI wrapper, clean go.mod | 2026-02-17 21:13:49 -05:00 |
| db_connect.go | security: implement comprehensive secrets protection | 2026-02-18 16:18:09 -05:00 |
| db_experiments.go | refactor(storage,queue): split storage layer and add sqlite queue backend | 2026-01-05 12:31:02 -05:00 |
| db_jobs.go | feat(security): implement comprehensive security hardening phases 1-5,7 | 2026-02-23 18:00:33 -05:00 |
| db_metrics.go | refactor(api): internal refactoring for TUI and worker modules | 2026-02-20 15:51:23 -05:00 |
| migrate.go | refactor(dependency-hygiene): Fix Redis leak, simplify TUI wrapper, clean go.mod | 2026-02-17 21:13:49 -05:00 |
| paths.go | refactor: Phase 3 - fix config/storage boundaries | 2026-02-17 12:49:53 -05:00 |
| schema_embed.go | refactor(storage,queue): split storage layer and add sqlite queue backend | 2026-01-05 12:31:02 -05:00 |
| schema_postgres.sql | refactor: update WebSocket handlers and database schemas | 2026-02-18 14:36:30 -05:00 |
| schema_sqlite.sql | refactor: update WebSocket handlers and database schemas | 2026-02-18 14:36:30 -05:00 |