Commit graph

5 commits

Author SHA1 Message Date
Jeremie Fraeys
0b5e99f720
refactor(scheduler,worker): improve service management and GPU detection
Scheduler enhancements:
- auth.go: Group membership validation in authentication
- hub.go: Task distribution with group affinity
- port_allocator.go: Dynamic port allocation with conflict resolution
- scheduler_conn.go: Connection pooling and retry logic
- service_manager.go: Lifecycle management for scheduler services
- service_templates.go: Template-based service configuration
- state.go: Persistent state management with recovery
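
The dynamic port allocation with conflict resolution mentioned for port_allocator.go could be sketched roughly as follows. This is a minimal illustration, not the repository's code: the PortAllocator type, its methods, the injected probe function, and the port range are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"sync"
)

// ErrNoPorts is returned when every port in the range is taken.
var ErrNoPorts = errors.New("no free port in range")

// PortAllocator is a hypothetical allocator; the real port_allocator.go may differ.
type PortAllocator struct {
	mu       sync.Mutex
	lo, hi   int
	reserved map[int]bool
	isFree   func(port int) bool // detects conflicts with other processes
}

// NewPortAllocator builds an allocator over [lo, hi]. A nil probe falls back
// to tcpFree, which tries to bind the port on loopback.
func NewPortAllocator(lo, hi int, probe func(int) bool) *PortAllocator {
	if probe == nil {
		probe = tcpFree
	}
	return &PortAllocator{lo: lo, hi: hi, reserved: map[int]bool{}, isFree: probe}
}

// tcpFree reports whether the port can currently be bound on 127.0.0.1.
func tcpFree(port int) bool {
	ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
	if err != nil {
		return false
	}
	ln.Close()
	return true
}

// Allocate returns the lowest port that is neither reserved by this
// allocator nor reported busy by the probe.
func (p *PortAllocator) Allocate() (int, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for port := p.lo; port <= p.hi; port++ {
		if p.reserved[port] || !p.isFree(port) {
			continue // conflict: try the next candidate
		}
		p.reserved[port] = true
		return port, nil
	}
	return 0, ErrNoPorts
}

// Release makes a previously allocated port available again.
func (p *PortAllocator) Release(port int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.reserved, port)
}

func main() {
	alloc := NewPortAllocator(20000, 20010, nil)
	port, err := alloc.Allocate()
	if err != nil {
		fmt.Println("allocation failed:", err)
		return
	}
	fmt.Println("allocated port", port)
	alloc.Release(port)
}
```

Injecting the probe keeps the allocation logic testable without touching the network. Note that a bind-and-close probe is inherently racy: another process can take the port between the probe and the actual use.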

Worker improvements:
- config.go: Extended configuration for task visibility rules
- execution/setup.go: Sandboxed execution environment setup
- executor/container.go: Container runtime integration
- executor/runner.go: Task runner with visibility enforcement
- gpu_detector.go: Robust GPU detection (NVIDIA, AMD, Apple Silicon, CPU fallback)
- integrity/validate.go: Data integrity validation
- lifecycle/runloop.go: Improved runloop with graceful shutdown
- lifecycle/service_manager.go: Service lifecycle coordination
- process/isolation.go + isolation_unix.go: Process isolation with namespaces/cgroups
- tenant/manager.go: Multi-tenant resource isolation
- tenant/middleware.go: Tenant context propagation
- worker.go: Core worker with group-scoped task execution
2026-03-08 13:03:15 -04:00
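
The GPU detection order named in commit 0b5e99f720 (NVIDIA, AMD, Apple Silicon, CPU fallback) might be sketched as below. Only the GPUType name comes from the commit messages; the constant names, the hostInfo struct, and the choice of /dev/nvidiactl and /dev/kfd as detection signals are assumptions for illustration, and the actual gpu_detector.go may use different checks.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// GPUType identifies the accelerator family. Constant names are hypothetical.
type GPUType string

const (
	GPUTypeNVIDIA GPUType = "nvidia"
	GPUTypeAMD    GPUType = "amd"
	GPUTypeApple  GPUType = "apple"
	GPUTypeNone   GPUType = "cpu"
)

// hostInfo bundles the facts the detector needs so they can be faked in tests.
type hostInfo struct {
	goos, goarch string
	exists       func(path string) bool
}

// detectGPU checks each vendor in priority order and falls back to CPU.
func detectGPU(h hostInfo) GPUType {
	switch {
	case h.goos == "linux" && h.exists("/dev/nvidiactl"):
		return GPUTypeNVIDIA // NVIDIA kernel driver control device
	case h.goos == "linux" && h.exists("/dev/kfd"):
		return GPUTypeAMD // AMD ROCm compute device
	case h.goos == "darwin" && h.goarch == "arm64":
		return GPUTypeApple // Apple Silicon has an integrated GPU
	default:
		return GPUTypeNone
	}
}

// fileExists reports whether the path can be stat'ed.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	h := hostInfo{goos: runtime.GOOS, goarch: runtime.GOARCH, exists: fileExists}
	fmt.Println("detected:", detectGPU(h))
}
```

Ending the switch with a default CPU case means the worker always gets a usable answer, even on hosts where no probe succeeds.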
Jeremie Fraeys
4756348c48
feat: Worker sandboxing and security configuration
Add security hardening features for worker execution:
- Worker config with sandboxing options (network_mode, read_only, secrets)
- Execution setup with security context propagation
- Podman container runtime security enhancements
- Security configuration management in config package
- Add homelab-sandbox.yaml example configuration

Supports running jobs in isolated, restricted environments.
2026-02-18 21:27:59 -05:00
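
To illustrate how the sandboxing options from commit 4756348c48 (network_mode, read_only, secrets) could translate into a Podman invocation, here is a small sketch. The SandboxConfig struct and podmanRunArgs function are hypothetical; only the three option names come from the commit message. The flags used (--rm, --network, --read-only, --secret) are standard podman run options.

```go
package main

import (
	"fmt"
	"strings"
)

// SandboxConfig is a hypothetical mirror of the worker's sandboxing options.
type SandboxConfig struct {
	NetworkMode string   // e.g. "none" to disable networking
	ReadOnly    bool     // mount the container root filesystem read-only
	Secrets     []string // names of secrets already created with podman
}

// podmanRunArgs builds the argument list for "podman run" from the config.
func podmanRunArgs(cfg SandboxConfig, image string, cmd ...string) []string {
	args := []string{"run", "--rm"}
	if cfg.NetworkMode != "" {
		args = append(args, "--network", cfg.NetworkMode)
	}
	if cfg.ReadOnly {
		args = append(args, "--read-only")
	}
	for _, s := range cfg.Secrets {
		args = append(args, "--secret", s)
	}
	args = append(args, image)
	return append(args, cmd...)
}

func main() {
	cfg := SandboxConfig{NetworkMode: "none", ReadOnly: true, Secrets: []string{"api_key"}}
	args := podmanRunArgs(cfg, "docker.io/library/alpine", "echo", "hello")
	fmt.Println("podman " + strings.Join(args, " "))
}
```

Building the arguments as a slice rather than a shell string avoids quoting problems when the list is later passed to exec.Command.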
Jeremie Fraeys
f7afb36a7c
refactor: adopt PathRegistry in execution/setup.go
Update internal/worker/execution/setup.go to use centralized PathRegistry:

Changes:
- Add import for internal/config package
- Update SetupJobDirectories to use config.FromEnv() for directory creation
- Replace all os.MkdirAll calls with paths.EnsureDir()
  - pendingDir creation
  - jobDir creation
  - outputDir (running) creation

Benefits:
- Consistent directory creation via PathRegistry
- Centralized path management for job execution directories
- Better error handling for directory creation failures
2026-02-18 16:57:04 -05:00
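
The switch from direct os.MkdirAll calls to a centralized EnsureDir helper in commit f7afb36a7c can be pictured with the sketch below. The commit only names config.FromEnv() and paths.EnsureDir(); the PathRegistry fields, the Join helper, the JOB_BASE_DIR environment variable, the fallback directory, and the permission bits are all assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// PathRegistry is a hypothetical stand-in for the project's registry type.
type PathRegistry struct {
	Root string
}

// FromEnv reads the root from JOB_BASE_DIR (an assumed variable name) and
// falls back to a directory under the system temp dir.
func FromEnv() *PathRegistry {
	root := os.Getenv("JOB_BASE_DIR")
	if root == "" {
		root = filepath.Join(os.TempDir(), "jobs-sketch")
	}
	return &PathRegistry{Root: root}
}

// Join resolves path elements relative to the registry root.
func (p *PathRegistry) Join(elem ...string) string {
	return filepath.Join(append([]string{p.Root}, elem...)...)
}

// EnsureDir creates dir (and parents) and wraps failures with the path,
// so callers get one consistent error format.
func (p *PathRegistry) EnsureDir(dir string) error {
	if dir == "" {
		return errors.New("ensure dir: empty path")
	}
	if err := os.MkdirAll(dir, 0o750); err != nil {
		return fmt.Errorf("ensure dir %q: %w", dir, err)
	}
	return nil
}

func main() {
	paths := FromEnv()
	dirs := []string{
		paths.Join("pending", "job-1"),
		paths.Join("running", "job-1"),
	}
	for _, d := range dirs {
		if err := paths.EnsureDir(d); err != nil {
			fmt.Println("setup failed:", err)
			return
		}
	}
	fmt.Println("job directories ready under", paths.Root)
}
```

Routing every directory creation through one method gives a single place to change permissions or error wrapping later.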
Jeremie Fraeys
fb2bbbaae5
refactor: Phase 7 - TUI cleanup - reorganize model package
Phase 7 of the monorepo maintainability plan:

New files created:
- model/jobs.go - Job type, JobStatus constants, list.Item interface
- model/messages.go - tea.Msg types (JobsLoadedMsg, StatusMsg, TickMsg, etc.)
- model/styles.go - NewJobListDelegate(), JobListTitleStyle(), SpinnerStyle()
- model/keys.go - KeyMap struct, DefaultKeys() function

Modified files:
- model/state.go - reduced from 226 to ~130 lines
  - Removed: Job, JobStatus, KeyMap, Keys, inline styles
  - Kept: State struct, domain re-exports, ViewMode, DatasetInfo, InitialState()
- controller/commands.go - use model. prefix for message types
- controller/controller.go - use model. prefix for message types
- controller/settings.go - use model.SettingsContentMsg

Deleted files:
- controller/keys.go (moved to model/keys.go since State references KeyMap)

Result:
- No file in the model/ package exceeds 150 lines
- Single concern per file: state, jobs, messages, styles, keys
- All 41 test packages pass
2026-02-17 20:22:04 -05:00
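
As a rough picture of what model/jobs.go in commit fb2bbbaae5 might contain, the sketch below defines a Job type that satisfies a list item interface. To stay dependency-free, it declares a local Item interface with the same single method (FilterValue) as list.Item in the Bubbles list component; the Job fields and JobStatus constant names are hypothetical.

```go
package main

import "fmt"

// Item mirrors the one-method list.Item interface from the Bubbles list
// component. It is declared locally so the sketch needs no external module.
type Item interface {
	FilterValue() string
}

// JobStatus and its constants are hypothetical names.
type JobStatus string

const (
	StatusPending  JobStatus = "pending"
	StatusRunning  JobStatus = "running"
	StatusFinished JobStatus = "finished"
)

// Job is a hypothetical job record shown in the TUI list.
type Job struct {
	Name   string
	Status JobStatus
}

// Title and Description are what a default list delegate would render.
func (j Job) Title() string       { return j.Name }
func (j Job) Description() string { return "status: " + string(j.Status) }

// FilterValue is the text the list matches against when filtering.
func (j Job) FilterValue() string { return j.Name }

// Compile-time check that Job satisfies Item.
var _ Item = Job{}

func main() {
	jobs := []Item{
		Job{Name: "train-resnet", Status: StatusRunning},
		Job{Name: "eval-bert", Status: StatusPending},
	}
	for _, it := range jobs {
		j := it.(Job)
		fmt.Println(j.Title(), "-", j.Description())
	}
}
```

Keeping the type, its status constants, and its interface methods in one file matches the single-concern-per-file goal stated in the commit.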
Jeremie Fraeys
c46be7f815
refactor: Phase 4 deferred - Extract GPU utilities and execution helpers
Extracted from execution.go to focused packages:

1. internal/worker/gpu.go (60 lines)
   - gpuVisibleDevicesString() - GPU device string formatting
   - filterExistingDevicePaths() - Device path filtering
   - gpuVisibleEnvVarName() - GPU env var selection
   - Reuses GPUType constants from gpu_detector.go

2. internal/worker/execution/setup.go (108 lines)
   - SetupJobDirectories() - Job directory creation
   - CopyDir() - Directory tree copying
   - copyFile() - Single file copy helper

3. internal/worker/execution/snapshot.go (52 lines)
   - StageSnapshot() - Snapshot staging for jobs
   - StageSnapshotFromPath() - Snapshot staging from path

Updated execution.go:
- Removed 64 lines of GPU utilities (now in gpu.go)
- Reduced from 1,082 to ~1,018 lines
- Still contains main execution flow (runJob, executeJob, etc.)

Build status: Compiles successfully
2026-02-17 14:03:11 -05:00
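
The three helpers listed for internal/worker/gpu.go in commit c46be7f815 could look something like the sketch below. The function names come from the commit message, but their signatures and bodies are guesses. CUDA_VISIBLE_DEVICES and HIP_VISIBLE_DEVICES are the environment variables conventionally read by the CUDA and ROCm runtimes; whether the project uses exactly these is an assumption, as are the GPUType constant names.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// GPUType and its constants are assumed names, not taken from the repository.
type GPUType string

const (
	GPUTypeNVIDIA GPUType = "nvidia"
	GPUTypeAMD    GPUType = "amd"
)

// gpuVisibleEnvVarName picks the env var that restricts visible devices.
// It returns "" when no such variable applies.
func gpuVisibleEnvVarName(t GPUType) string {
	switch t {
	case GPUTypeNVIDIA:
		return "CUDA_VISIBLE_DEVICES"
	case GPUTypeAMD:
		return "HIP_VISIBLE_DEVICES"
	default:
		return ""
	}
}

// gpuVisibleDevicesString formats device indices as a comma-separated list.
func gpuVisibleDevicesString(ids []int) string {
	parts := make([]string, len(ids))
	for i, id := range ids {
		parts[i] = strconv.Itoa(id)
	}
	return strings.Join(parts, ",")
}

// filterExistingDevicePaths keeps only the paths that exist on this host.
func filterExistingDevicePaths(paths []string) []string {
	var out []string
	for _, p := range paths {
		if _, err := os.Stat(p); err == nil {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	name := gpuVisibleEnvVarName(GPUTypeNVIDIA)
	value := gpuVisibleDevicesString([]int{0, 1})
	fmt.Printf("%s=%s\n", name, value)
	fmt.Println("devices present:", filterExistingDevicePaths([]string{"/dev/nvidia0", "/dev/nvidia1"}))
}
```

Small pure helpers like these are easy to unit test in isolation, which fits the commit's goal of shrinking execution.go.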