fetch_ml/internal/scheduler
Jeremie Fraeys 0b5e99f720
refactor(scheduler,worker): improve service management and GPU detection

Scheduler enhancements:
- auth.go: Group membership validation in authentication
- hub.go: Task distribution with group affinity
- port_allocator.go: Dynamic port allocation with conflict resolution
- scheduler_conn.go: Connection pooling and retry logic
- service_manager.go: Lifecycle management for scheduler services
- service_templates.go: Template-based service configuration
- state.go: Persistent state management with recovery
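
To illustrate what "dynamic port allocation with conflict resolution" could look like, here is a minimal sketch. It is not taken from port_allocator.go: the names `PortAllocator`, `NewPortAllocator`, `Allocate`, `Release`, and `ErrNoPorts` are hypothetical, and so is the strategy, which is assumed to be "scan a range, skip ports already reserved or not bindable".

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"sync"
)

// ErrNoPorts is returned when every port in the range is reserved or busy.
var ErrNoPorts = errors.New("no free port in range")

// PortAllocator is a hypothetical allocator, not the type used in the repo.
type PortAllocator struct {
	mu         sync.Mutex
	start, end int
	used       map[int]bool
	// probe reports whether a port can currently be bound.
	// It is a field so tests can replace it with a fake.
	probe func(port int) bool
}

// NewPortAllocator covers the inclusive range [start, end].
func NewPortAllocator(start, end int) *PortAllocator {
	return &PortAllocator{
		start: start,
		end:   end,
		used:  make(map[int]bool),
		probe: canBind,
	}
}

// canBind tries to listen on the port and closes the listener right away.
func canBind(port int) bool {
	ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
	if err != nil {
		return false
	}
	ln.Close()
	return true
}

// Allocate reserves the lowest port that is neither reserved by this
// allocator nor in conflict with another process.
func (a *PortAllocator) Allocate() (int, error) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for p := a.start; p <= a.end; p++ {
		if a.used[p] || !a.probe(p) {
			continue // conflict: move on to the next candidate
		}
		a.used[p] = true
		return p, nil
	}
	return 0, ErrNoPorts
}

// Release returns a port to the pool.
func (a *PortAllocator) Release(port int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.used, port)
}

func main() {
	alloc := NewPortAllocator(20000, 20010)
	port, err := alloc.Allocate()
	if err != nil {
		fmt.Println("allocation failed:", err)
		return
	}
	fmt.Println("allocated port", port)
	alloc.Release(port)
}
```

A bind probe of this kind is racy by nature: another process can take the port between the check and the service start, so a real allocator would likely retry at bind time as well.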

Worker improvements:
- config.go: Extended configuration for task visibility rules
- execution/setup.go: Sandboxed execution environment setup
- executor/container.go: Container runtime integration
- executor/runner.go: Task runner with visibility enforcement
- gpu_detector.go: Robust GPU detection (NVIDIA, AMD, Apple Silicon, CPU fallback)
- integrity/validate.go: Data integrity validation
- lifecycle/runloop.go: Improved runloop with graceful shutdown
- lifecycle/service_manager.go: Service lifecycle coordination
- process/isolation.go + isolation_unix.go: Process isolation with namespaces/cgroups
- tenant/manager.go: Multi-tenant resource isolation
- tenant/middleware.go: Tenant context propagation
- worker.go: Core worker with group-scoped task execution
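
The GPU detection order listed above (NVIDIA, AMD, Apple Silicon, CPU fallback) can be sketched as an ordered probe chain. This is an assumption about the approach, not the code in gpu_detector.go: the names `GPUKind`, `probe`, `defaultProbes`, and `detect` are hypothetical, and checking for the `nvidia-smi` and `rocm-smi` binaries on `PATH` is only one plausible detection heuristic.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
)

// GPUKind names the accelerator family that was detected.
type GPUKind string

const (
	NVIDIA GPUKind = "nvidia"
	AMD    GPUKind = "amd"
	Apple  GPUKind = "apple"
	CPU    GPUKind = "cpu"
)

// probe pairs a GPU kind with a check that reports whether it is present.
type probe struct {
	kind  GPUKind
	found func() bool
}

// hasBinary reports whether an executable with this name is on PATH.
func hasBinary(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

// defaultProbes returns the checks in priority order (assumed heuristics).
func defaultProbes() []probe {
	return []probe{
		{NVIDIA, func() bool { return hasBinary("nvidia-smi") }},
		{AMD, func() bool { return hasBinary("rocm-smi") }},
		{Apple, func() bool {
			return runtime.GOOS == "darwin" && runtime.GOARCH == "arm64"
		}},
	}
}

// detect returns the first kind whose probe succeeds, or CPU if none do.
func detect(probes []probe) GPUKind {
	for _, p := range probes {
		if p.found() {
			return p.kind
		}
	}
	return CPU
}

func main() {
	fmt.Println("detected:", detect(defaultProbes()))
}
```

Keeping the probes in a slice makes the priority order explicit and lets tests substitute fake checks without touching the host system.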
2026-03-08 13:03:15 -04:00
auth.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
hub.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
pacing.go feat(scheduler): implement multi-tenant job scheduler with gang scheduling 2026-02-26 12:03:23 -05:00
plugin_quota.go refactor(scheduler): remove dead code 2026-03-04 13:35:18 -05:00
port_allocator.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
priority_queue.go feat: enhance task domain and scheduler protocol 2026-03-04 13:23:38 -05:00
protocol.go feat: enhance task domain and scheduler protocol 2026-03-04 13:23:38 -05:00
scheduler_conn.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
service_manager.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
service_manager_unix.go feat(scheduler): implement multi-tenant job scheduler with gang scheduling 2026-02-26 12:03:23 -05:00
service_manager_windows.go feat(scheduler): implement multi-tenant job scheduler with gang scheduling 2026-02-26 12:03:23 -05:00
service_templates.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
state.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
template.go feat(scheduler): implement multi-tenant job scheduler with gang scheduling 2026-02-26 12:03:23 -05:00