fetch_ml

Author	SHA1	Message	Date
Jeremie Fraeys	92aab06d76	feat(security): implement comprehensive security hardening phases 1-5,7 Implements defense-in-depth security for HIPAA and multi-tenant requirements: Phase 1 - File Ingestion Security: - SecurePathValidator with symlink resolution and path boundary enforcement in internal/fileutil/secure.go - Magic bytes validation for ML artifacts (safetensors, GGUF, HDF5, numpy) in internal/fileutil/filetype.go - Dangerous extension blocking (.pt, .pkl, .exe, .sh, .zip) - Upload limits (10GB size, 100MB/s rate, 10 uploads/min) Phase 2 - Sandbox Hardening: - ApplySecurityDefaults() with secure-by-default principle - network_mode: none, read_only_root: true, no_new_privileges: true - drop_all_caps: true, user_ns: true, run_as_uid/gid: 1000 - PodmanSecurityConfig and BuildSecurityArgs() in internal/container/podman.go - BuildPodmanCommand now accepts full security configuration - Container executor passes SandboxConfig to Podman command builder - configs/seccomp/default-hardened.json blocks dangerous syscalls (ptrace, mount, reboot, kexec_load, open_by_handle_at) Phase 3 - Secrets Management: - expandSecrets() for environment variable expansion using ${VAR} syntax - validateNoPlaintextSecrets() with entropy-based detection - Pattern matching for AWS, GitHub, GitLab, OpenAI, Stripe tokens - Shannon entropy calculation (>4 bits/char triggers detection) - Secrets expanded during LoadConfig() before validation Phase 5 - HIPAA Audit Logging: - Tamper-evident chain hashing with SHA-256 in internal/audit/audit.go - Event struct extended with PrevHash, EventHash, SequenceNum - File access event types: EventFileRead, EventFileWrite, EventFileDelete - LogFileAccess() helper for HIPAA compliance - VerifyChain() function for tamper detection Supporting Changes: - Add DeleteJob() and DeleteJobsByPrefix() to storage package - Integrate SecurePathValidator in artifact scanning	2026-02-23 18:00:33 -05:00
Jeremie Fraeys	27c8b08a16	test: Reorganize and add unit tests Reorganize tests for better structure and coverage: - Move container/security_test.go from internal/ to tests/unit/container/ - Move related tests to proper unit test locations - Delete orphaned test files (startup_blacklist_test.go) - Add privacy middleware unit tests - Add worker config unit tests - Update E2E tests for homelab and websocket scenarios - Update test fixtures with utility functions - Add CLI helper script for arraylist fixes	2026-02-18 21:28:13 -05:00
Jeremie Fraeys	4756348c48	feat: Worker sandboxing and security configuration Add security hardening features for worker execution: - Worker config with sandboxing options (network_mode, read_only, secrets) - Execution setup with security context propagation - Podman container runtime security enhancements - Security configuration management in config package - Add homelab-sandbox.yaml example configuration Supports running jobs in isolated, restricted environments.	2026-02-18 21:27:59 -05:00
Jeremie Fraeys	5644338ebd	security: implement Podman secrets for container credential management Add comprehensive Podman secrets support to prevent credential exposure: New types and methods (internal/container/podman.go): - PodmanSecret struct for secret definitions - CreateSecret() - Create Podman secrets from sensitive data - DeleteSecret() - Clean up secrets after use - BuildSecretArgs() - Generate podman run arguments for secrets - SanitizeContainerEnv() - Extract sensitive env vars as secrets - ContainerConfig.Secrets field for secret list Enhanced container lifecycle: - StartContainer() now creates secrets before starting container - Secrets automatically mounted via --secret flag - Cleanup on failure to prevent secret leakage - Secrets logged as count only (not content) Jupyter service integration (internal/jupyter/service_manager.go): - prepareContainerConfig() uses SanitizeContainerEnv() - JUPYTER_TOKEN and JUPYTER_PASSWORD now use secrets - Maintains backward compatibility with env var mounting Security benefits: - Credentials no longer visible in 'podman inspect' output - Secrets not exposed via /proc/*/environ inside container - Automatic cleanup prevents secret accumulation - Compatible with existing Jupyter authentication	2026-02-18 16:35:58 -05:00
Jeremie Fraeys	7194826871	feat: implement research-grade maintainability phases 1,3,4,7 Phase 1: Event Sourcing - Add TaskEvent types (queued, started, completed, failed, etc.) - Create EventStore with Redis Streams (append-only) - Support event querying by task ID and time range Phase 3: Diagnosable Failures - Enhance TaskExecutionError with Context map, Timestamp, Recoverable flag - Update container.go to populate error context (image, GPU, duration) - Add WithContext helper for building error context - Create cmd/errors CLI for querying task errors Phase 4: Testable Security - Add security fields to PodmanConfig (Privileged, Network, ReadOnlyMounts) - Create ValidateSecurityPolicy() with ErrSecurityViolation - Add security contract tests (privileged rejection, host network rejection) - Tests serve as executable security documentation Phase 7: Reproducible Builds - Add BuildHash and BuildTime ldflags to Makefile - Create verify-build target for reproducibility testing - Add -version and -verify flags to api-server All tests pass: - go test ./internal/errtypes/... - go test ./internal/container/... -run Security - go test ./internal/queue/... - go build ./cmd/api-server/...	2026-02-18 15:27:50 -05:00
Jeremie Fraeys	6b771e4a50	feat(jupyter): improve runtime management and update security/workflow docs	2026-01-05 12:37:27 -05:00
Jeremie Fraeys	cd5640ebd2	Slim and secure: move scripts, clean configs, remove secrets - Move ci-test.sh and setup.sh to scripts/ - Trim docs/src/zig-cli.md to current structure - Replace hardcoded secrets with placeholders in configs - Update .gitignore to block .env*, secrets/, keys, build artifacts - Slim README.md to reflect current CLI/TUI split - Add cleanup trap to ci-test.sh - Ensure no secrets are committed	2025-12-07 13:57:51 -05:00
Jeremie Fraeys	ea15af1833	Fix multi-user authentication and clean up debug code - Fix YAML tags in auth config struct (json -> yaml) - Update CLI configs to use pre-hashed API keys - Remove double hashing in WebSocket client - Fix port mapping (9102 -> 9103) in CLI commands - Update permission keys to use jobs:read, jobs:create, etc. - Clean up all debug logging from CLI and server - All user roles now authenticate correctly: * Admin: Can queue jobs and see all jobs * Researcher: Can queue jobs and see own jobs * Analyst: Can see status (read-only access) Multi-user authentication is now fully functional.	2025-12-06 12:35:32 -05:00
Jeremie Fraeys	803677be57	feat: implement Go backend with comprehensive API and internal packages - Add API server with WebSocket support and REST endpoints - Implement authentication system with API keys and permissions - Add task queue system with Redis backend and error handling - Include storage layer with database migrations and schemas - Add comprehensive logging, metrics, and telemetry - Implement security middleware and network utilities - Add experiment management and container orchestration - Include configuration management with smart defaults	2025-12-04 16:53:53 -05:00
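
Commit 92aab06d76 mentions entropy-based plaintext-secret detection, with a Shannon entropy above 4 bits/char as the trigger. The following is a minimal sketch of that idea, not the code in the repository: the function names `shannonEntropy` and `looksLikeSecret`, the byte-level counting, and the minimum-length cutoff of 20 are all assumptions made for illustration. Only the 4 bits/char threshold comes from the commit message.

```go
package main

import (
	"fmt"
	"math"
)

// entropyThreshold mirrors the ">4 bits/char" figure in the commit message.
const entropyThreshold = 4.0

// shannonEntropy returns the Shannon entropy of s in bits per byte.
// Counting bytes rather than runes is an assumption of this sketch.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	h := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// looksLikeSecret is a hypothetical helper: it flags values that are long
// enough to carry high entropy and whose entropy exceeds the threshold.
// The minimum length of 20 is an assumption, not taken from the repository.
func looksLikeSecret(value string) bool {
	const minLen = 20
	return len(value) >= minLen && shannonEntropy(value) > entropyThreshold
}

func main() {
	samples := []string{
		"${API_TOKEN}",             // placeholder, too short to flag
		"aaaaaaaaaaaaaaaaaaaaaaaa", // long but zero entropy
		"q7Zp2LxV9mRtK4sWb8NcYd3F", // made-up random-looking value
	}
	for _, v := range samples {
		fmt.Printf("%-28q entropy=%.2f flagged=%v\n", v, shannonEntropy(v), looksLikeSecret(v))
	}
}
```

A length cutoff is used here because short strings cannot reach high entropy; a real detector would likely combine this check with the token pattern matching the commit also describes.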

9 commits
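
The audit logging in commit 92aab06d76 is described as tamper-evident chain hashing with SHA-256, using `PrevHash`, `EventHash`, and `SequenceNum` fields and a `VerifyChain()` function. A rough sketch of how such a chain can work is shown below. It is an illustration under stated assumptions, not the contents of internal/audit/audit.go: the `Action` and `Path` fields, the `hashEvent` and `appendEvent` helpers, the pipe-delimited payload, and the signature of `verifyChain` are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Event is a simplified stand-in for the audit event. PrevHash, EventHash,
// and SequenceNum are named in the commit message; Action and Path are
// assumptions made for this sketch.
type Event struct {
	SequenceNum int
	Action      string
	Path        string
	PrevHash    string
	EventHash   string
}

// hashEvent hashes every field except EventHash itself. The pipe-delimited
// payload is a simplification; a real log would need an unambiguous encoding.
func hashEvent(e Event) string {
	payload := fmt.Sprintf("%d|%s|%s|%s", e.SequenceNum, e.Action, e.Path, e.PrevHash)
	sum := sha256.Sum256([]byte(payload))
	return hex.EncodeToString(sum[:])
}

// appendEvent links a new event to the hash of the previous one.
func appendEvent(chain []Event, action, path string) []Event {
	e := Event{SequenceNum: len(chain), Action: action, Path: path}
	if len(chain) > 0 {
		e.PrevHash = chain[len(chain)-1].EventHash
	}
	e.EventHash = hashEvent(e)
	return append(chain, e)
}

// verifyChain returns the index of the first event that fails verification,
// or -1 if the whole chain is intact.
func verifyChain(chain []Event) int {
	prev := ""
	for i, e := range chain {
		if e.SequenceNum != i || e.PrevHash != prev || e.EventHash != hashEvent(e) {
			return i
		}
		prev = e.EventHash
	}
	return -1
}

func main() {
	var chain []Event
	chain = appendEvent(chain, "file_read", "/data/model.safetensors")
	chain = appendEvent(chain, "file_write", "/data/output.npy")
	chain = appendEvent(chain, "file_delete", "/data/tmp.bin")
	fmt.Println("first invalid index (intact chain):", verifyChain(chain))

	// Editing a stored event without recomputing later hashes is detected.
	chain[1].Path = "/data/other.npy"
	fmt.Println("first invalid index (after edit):", verifyChain(chain))
}
```

Because each event's hash covers the previous event's hash, changing any stored record invalidates that record and breaks the link for everything after it, which is what makes the log tamper-evident rather than tamper-proof.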