fetch_ml/internal/scheduler
Jeremie Fraeys d0266c4a90
refactor: scheduler hub bug fix, test helpers, and orphan recovery tests
Fix bug in scheduler hub orphan reconciliation:
- Move delete(h.pendingAcceptance, taskID) inside the requeue success block
- Prevents premature cleanup of the pendingAcceptance entry when requeue fails

Add comprehensive test infrastructure:
- hub_test_helpers.go: New test helper utilities (78 lines)
  - Mock scheduler components for isolated testing
  - Test fixture setup and teardown helpers

Refactor and enhance hub capabilities tests:
- Restructure hub_capabilities_test.go (213 lines changed)
- Improve test coverage for worker capability matching

Add comprehensive orphan recovery tests:
- internal/scheduler/orphan_recovery_test.go (451 lines)
- Tests orphaned job detection and recovery
- Covers requeue logic, timeout handling, state cleanup
2026-03-12 16:38:33 -04:00
auth.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
capability_routing_test.go refactor: co-locate scheduler non-hub tests with source code 2026-03-12 16:36:29 -04:00
failure_scenarios_test.go refactor: co-locate scheduler non-hub tests with source code 2026-03-12 16:36:29 -04:00
heartbeat_test.go refactor: co-locate scheduler non-hub tests with source code 2026-03-12 16:36:29 -04:00
hub.go refactor: scheduler hub bug fix, test helpers, and orphan recovery tests 2026-03-12 16:38:33 -04:00
hub_capabilities_test.go refactor: scheduler hub bug fix, test helpers, and orphan recovery tests 2026-03-12 16:38:33 -04:00
hub_test_helpers.go refactor: scheduler hub bug fix, test helpers, and orphan recovery tests 2026-03-12 16:38:33 -04:00
orphan_recovery_test.go refactor: scheduler hub bug fix, test helpers, and orphan recovery tests 2026-03-12 16:38:33 -04:00
pacing.go feat(scheduler): implement multi-tenant job scheduler with gang scheduling 2026-02-26 12:03:23 -05:00
plugin_quota.go refactor(scheduler): remove dead code 2026-03-04 13:35:18 -05:00
plugin_quota_test.go refactor: co-locate scheduler non-hub tests with source code 2026-03-12 16:36:29 -04:00
port_allocator.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
port_allocator_test.go refactor: co-locate scheduler non-hub tests with source code 2026-03-12 16:36:29 -04:00
priority_queue.go feat: enhance task domain and scheduler protocol 2026-03-04 13:23:38 -05:00
priority_queue_test.go refactor: co-locate scheduler non-hub tests with source code 2026-03-12 16:36:29 -04:00
protocol.go feat(scheduler): implement capability-based routing and hub v2 2026-03-12 12:00:05 -04:00
scheduler_conn.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
service_manager.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
service_manager_unix.go feat(scheduler): implement multi-tenant job scheduler with gang scheduling 2026-02-26 12:03:23 -05:00
service_manager_windows.go feat(scheduler): implement multi-tenant job scheduler with gang scheduling 2026-02-26 12:03:23 -05:00
service_templates.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
service_templates_test.go refactor: co-locate scheduler non-hub tests with source code 2026-03-12 16:36:29 -04:00
state.go refactor(scheduler,worker): improve service management and GPU detection 2026-03-08 13:03:15 -04:00
state_store_test.go refactor: co-locate scheduler non-hub tests with source code 2026-03-12 16:36:29 -04:00
template.go feat(scheduler): implement capability-based routing and hub v2 2026-03-12 12:00:05 -04:00