Commit graph

5 commits

Author SHA1 Message Date
Jeremie Fraeys
085c23f66a
refactor(phase7): Initialize JobRunner in factory.go
- Create jobRunner using NewJobRunner with local and container executors
- Assign jobRunner to Worker.runner field
- JobRunner is now available for future task-execution orchestration

Build status: Compiles successfully
2026-02-17 16:40:03 -05:00
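
The wiring described in commit 085c23f66a might look roughly like the sketch below. Only NewJobRunner, the Worker.runner field, and the existence of local and container executors come from the commit message. The Executor interface, its Execute method, the executor type names, and the NewJobRunner argument list are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// Executor is a hypothetical interface; the commit does not show the real executor types.
type Executor interface {
	Execute(ctx context.Context, taskID string) error
}

// localExecutor and containerExecutor are stand-ins for the real executors.
type localExecutor struct{}

func (localExecutor) Execute(ctx context.Context, taskID string) error { return nil }

type containerExecutor struct{}

func (containerExecutor) Execute(ctx context.Context, taskID string) error { return nil }

// JobRunner holds both executors so later code can pick one per task.
type JobRunner struct {
	local     Executor
	container Executor
}

// NewJobRunner mirrors the constructor named in the commit; the signature is assumed.
func NewJobRunner(local, container Executor) *JobRunner {
	return &JobRunner{local: local, container: container}
}

// Worker shows only the runner field mentioned in the commit.
type Worker struct {
	runner *JobRunner
}

// NewWorker plays the role of the factory.go constructor: build the runner, assign it.
func NewWorker() *Worker {
	jobRunner := NewJobRunner(localExecutor{}, containerExecutor{})
	return &Worker{runner: jobRunner}
}

func main() {
	w := NewWorker()
	fmt.Println("runner assigned:", w.runner != nil)
}
```

Nothing calls the runner yet in this sketch, which matches the commit's note that it is only made available for future orchestration.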
Jeremie Fraeys
1ba67e419d
refactor(phase7): Integrate resource manager into Worker
- Added resources field to Worker struct
- Updated factory.go to pass resource manager to Worker
- Removed placeholder discard of resource manager

Build status: Compiles successfully
2026-02-17 16:37:33 -05:00
Jeremie Fraeys
a7360869f8
refactor: Implement TaskExecutorAdapter and Worker.runningCount()
- Created executor/TaskExecutorAdapter implementing lifecycle.TaskExecutor
- Wires LocalExecutor and ContainerExecutor through the adapter
- Worker.runningCount() now delegates to runLoop.RunningCount()
- Added lifecycle.RunLoop.RunningCount() public method
- factory.go creates proper executor chain instead of placeholder

Build status: Compiles successfully
2026-02-17 16:15:41 -05:00
Jeremie Fraeys
38fa017b8e
refactor: Phase 6 - Complete migration, remove legacy files
BREAKING CHANGE: Legacy worker files removed, Worker struct simplified

Changes:
1. worker.go - Simplified to 8 fields using composed dependencies:
   - runLoop, runner, metrics, health (from new packages)
   - Removed: server, queue, running, datasetCache, ctx, cancel, etc.

2. factory.go - Updated NewWorker to use new structure
   - Uses lifecycle.NewRunLoop
   - Integrates jupyter.Manager properly

3. Removed legacy files:
   - execution.go (1,016 lines)
   - data_integrity.go (929 lines)
   - runloop.go (555 lines)
   - jupyter_task.go (144 lines)
   - simplified.go (demonstration no longer needed)

4. Fixed references to use new packages:
   - hash_selector.go -> integrity.DirOverallSHA256Hex
   - snapshot_store.go -> integrity.NormalizeSHA256ChecksumHex
   - metrics.go -> resource-dependent metrics temporarily removed

5. Added RecordQueueLatency to metrics.Metrics for lifecycle.MetricsRecorder

Worker struct: 27 fields -> 8 fields (70% reduction)

Build status: Compiles successfully
2026-02-17 14:39:48 -05:00
Jeremie Fraeys
a5c1a9fc0b
refactor: Phase 4 - split worker package into focused files
Split 551-line worker/core.go into single-concern files:

- worker/config.go (+44 lines)
  - Added config parsing: envInt(), parseCPUFromConfig(), parseGPUCountFromConfig(),
    parseGPUSlotsPerGPUFromConfig()
  - Now has all config logic in one place (440 lines total)

- worker/metrics.go (new file, 172 lines)
  - Extracted setupMetricsExporter() with ~30 Prometheus metric registrations
  - Isolated metrics logic for easy modification

- worker/factory.go (new file, 183 lines)
  - Extracted NewWorker() factory function
  - Moved prePullImages(), pullImage() from core.go
  - Centralized worker instantiation

- worker/worker.go (renamed from core.go, ~100 lines)
  - Now just defines Worker struct, MLServer, JupyterManager
  - Clean, focused file without mixed concerns

Lines redistributed: ~350 lines moved from monolithic core.go
Build status: Compiles successfully
2026-02-17 12:57:02 -05:00