fetch_ml

Author	SHA1	Message	Date
Jeremie Fraeys	188cf55939	refactor(api): overhaul WebSocket handler and protocol layer Major WebSocket handler refactor: - Rewrite ws/handler.go with structured message routing and backpressure - Add connection lifecycle management with heartbeats and timeouts - Implement graceful connection draining for zero-downtime restarts Protocol improvements: - Define structured protocol types in protocol.go for hub communication - Add versioned message envelopes for backward compatibility - Standardize error codes and response formats across WebSocket API Job streaming via WebSocket: - Simplify ws/jobs.go with async job status streaming - Add compression for high-volume job updates Testing: - Update websocket_e2e_test.go for new protocol semantics - Add connection resilience tests	2026-03-12 12:01:21 -04:00
Jeremie Fraeys	c52179dcbe	feat(auth): add token-based access and structured logging Add comprehensive authentication and authorization enhancements: - tokens.go: New token management system for public task access and cloning * SHA-256 hashed token storage for security * Token generation, validation, and automatic cleanup * Support for public access and clone permissions - api_key.go: Extend User struct with Groups field * Lab group membership (ml-lab, nlp-group) * Integration with permission system for group-based access - flags.go: Security hardening - migrate to structured logging * Replace log.Printf with log/slog to prevent log injection attacks * Consistent structured output for all auth warnings * Safe handling of file paths and errors in logs - permissions.go: Add task sharing permission constants * PermissionTasksReadOwn: Access own tasks * PermissionTasksReadLab: Access lab group tasks * PermissionTasksReadAll: Admin/institution-wide access * PermissionTasksShare: Grant access to other users * PermissionTasksClone: Create copies of shared tasks * CanAccessTask() method with visibility checks - database.go: Improve error handling * Add structured error logging on row close failures	2026-03-08 12:51:07 -04:00
Jeremie Fraeys	c6a224d5fc	feat(cli,server): unify info command with remote/local support Enhance ml info to query server when connected, falling back to local manifests when offline. Unifies behavior with other commands like run, exec, and cancel. CLI changes: - Add --local and --remote flags for explicit control - Auto-detect connection state via mode.detect() - queryRemoteRun(): Query server via WebSocket for run details - queryLocalRun(): Read local run_manifest.json - displayRunInfo(): Shared display logic for both sources - Add connection status indicators (Remote: connecting.../connected) WebSocket protocol: - Add query_run_info opcode (0x28) to cli and server - Add sendQueryRunInfo() method to ws/client.zig - Protocol: [opcode:1][api_key_hash:16][run_id_len:1][run_id:var] Server changes: - Add handleQueryRunInfo() handler to ws/handler.go - Returns run_id, job_name, user, timestamp, overall_sha, files_count - Checks PermJobsRead permission - Looks up run in experiment manager Usage: ml info abc123 # Auto: tries remote, falls back to local ml info abc123 --local # Force local manifest lookup ml info abc123 --remote # Force remote query (fails if offline)	2026-03-05 12:07:00 -05:00
Jeremie Fraeys	420de879ff	feat(api): integrate scheduler protocol and WebSocket enhancements Update API layer for scheduler integration: - WebSocket handlers with scheduler protocol support - Jobs WebSocket endpoint with priority queue integration - Validation middleware for scheduler messages - Server configuration with security hardening - Protocol definitions for worker-scheduler communication - Dataset handlers with tenant isolation checks - Response helpers with audit context - OpenAPI spec updates for new endpoints	2026-02-26 12:05:57 -05:00
Jeremie Fraeys	6028779239	feat: update CLI, TUI, and security documentation - Add safety checks to Zig build - Add TUI with job management and narrative views - Add WebSocket support and export services - Add smart configuration defaults - Update API routes with security headers - Update SECURITY.md with comprehensive policy - Add Makefile security scanning targets	2026-02-19 15:35:05 -05:00
Jeremie Fraeys	260e18499e	feat: Research features - narrative fields and outcome tracking Add comprehensive research context tracking to jobs: - Narrative fields: hypothesis, context, intent, expected_outcome - Experiment groups and tags for organization - Run comparison (compare command) for diff analysis - Run search (find command) with criteria filtering - Run export (export command) for data portability - Outcome setting (outcome command) for experiment validation Update queue and requeue commands to support narrative fields. Add narrative validation to manifest validator. Add WebSocket handlers for compare, find, export, and outcome operations. Includes E2E tests for phase 2 features.	2026-02-18 21:27:05 -05:00
Jeremie Fraeys	10e6416e11	refactor: update WebSocket handlers and database schemas - Update datasets handlers with improved error handling - Refactor WebSocket handler for better organization - Clean up jobs.go handler implementation - Add websocket_metrics table to Postgres and SQLite schemas	2026-02-18 14:36:30 -05:00
Jeremie Fraeys	f92e0bbdf9	feat: implement WebSocket handlers by delegating to sub-packages Implemented WebSocket handlers by creating and integrating sub-packages: New package: api/datasets - HandleDatasetList, HandleDatasetRegister, HandleDatasetInfo, HandleDatasetSearch - Binary protocol parsing for each operation Updated ws/handler.go - Added jobsHandler, jupyterHandler, datasetsHandler fields - Updated NewHandler to accept sub-handlers - Implemented handleAnnotateRun -> api/jobs - Implemented handleSetRunNarrative -> api/jobs - Implemented handleStartJupyter -> api/jupyter - Implemented handleStopJupyter -> api/jupyter - Implemented handleListJupyter -> api/jupyter - Implemented handleDatasetList -> api/datasets - Implemented handleDatasetRegister -> api/datasets - Implemented handleDatasetInfo -> api/datasets - Implemented handleDatasetSearch -> api/datasets Updated api/routes.go - Create jobs, jupyter, and datasets handlers - Pass all handlers to ws.NewHandler Build passes, all tests pass.	2026-02-17 20:49:31 -05:00
Jeremie Fraeys	3694d4e56f	refactor: extract ws handlers to separate files to reduce handler.go size - Extract job handlers (handleQueueJob, handleQueueJobWithSnapshot, handleCancelJob, handlePrune) to ws/jobs.go (209 lines) - Extract validation handler (handleValidateRequest) to ws/validate.go (167 lines) - Reduce ws/handler.go from 879 to 474 lines (under 500 line target) - Keep core framework in handler.go: Handler struct, dispatch, packet sending, auth helpers - All handlers remain as methods on Handler for backward compatibility Result: handler.go 474 lines, jobs.go 209 lines, validate.go 167 lines	2026-02-17 20:38:03 -05:00
Jeremie Fraeys	fb2bbbaae5	refactor: Phase 7 - TUI cleanup - reorganize model package Phase 7 of the monorepo maintainability plan: New files created: - model/jobs.go - Job type, JobStatus constants, list.Item interface - model/messages.go - tea.Msg types (JobsLoadedMsg, StatusMsg, TickMsg, etc.) - model/styles.go - NewJobListDelegate(), JobListTitleStyle(), SpinnerStyle() - model/keys.go - KeyMap struct, DefaultKeys() function Modified files: - model/state.go - reduced from 226 to ~130 lines - Removed: Job, JobStatus, KeyMap, Keys, inline styles - Kept: State struct, domain re-exports, ViewMode, DatasetInfo, InitialState() - controller/commands.go - use model. prefix for message types - controller/controller.go - use model. prefix for message types - controller/settings.go - use model.SettingsContentMsg Deleted files: - controller/keys.go (moved to model/keys.go since State references KeyMap) Result: - No file >150 lines in model/ package - Single concern per file: state, jobs, messages, styles, keys - All 41 test packages pass	2026-02-17 20:22:04 -05:00
Jeremie Fraeys	a1ce267b86	feat: Implement all worker stub methods with real functionality - VerifySnapshot: SHA256 verification using integrity package - EnforceTaskProvenance: Strict and best-effort provenance validation - RunJupyterTask: Full Jupyter service lifecycle (start/stop/remove/restore/list_packages) - RunJob: Job execution using executor.JobRunner - PrewarmNextOnce: Prewarming with queue integration All methods now use new architecture components instead of placeholders	2026-02-17 17:37:56 -05:00
Jeremie Fraeys	f0ffbb4a3d	refactor: Phase 5 complete - API packages extracted Extracted all deferred API packages from monolithic ws_*.go files: - api/routes.go (75 lines) - Extracted route registration from server.go - api/errors.go (108 lines) - Standardized error responses and error codes - api/jobs/handlers.go (271 lines) - Job WebSocket handlers * HandleAnnotateRun, HandleSetRunNarrative * HandleCancelJob, HandlePruneJobs, HandleListJobs - api/jupyter/handlers.go (244 lines) - Jupyter WebSocket handlers * HandleStartJupyter, HandleStopJupyter * HandleListJupyter, HandleListJupyterPackages * HandleRemoveJupyter, HandleRestoreJupyter - api/validate/handlers.go (163 lines) - Validation WebSocket handlers * HandleValidate, HandleGetValidateStatus, HandleListValidations - api/ws/handler.go (298 lines) - WebSocket handler framework * Core WebSocket handling logic * Opcode constants and error codes Lines redistributed: ~1,150 lines from ws_jobs.go (1,365), ws_jupyter.go (512), ws_validate.go (523), ws_handler.go (379) into focused packages. Note: Original ws_*.go files still present - cleanup in next commit. Build status: Compiles successfully	2026-02-17 13:25:58 -05:00

12 commits
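
Commit c6a224d5fc describes the query_run_info request frame as [opcode:1][api_key_hash:16][run_id_len:1][run_id:var] with opcode 0x28. The Go sketch below illustrates one way such a frame could be built and parsed. Only the field layout and the opcode value come from the commit message; the function names, the constant names, the error messages, and the choice to reject trailing bytes are assumptions made for illustration, and the actual code in ws/handler.go and ws/client.zig may differ.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	opQueryRunInfo byte = 0x28 // opcode value taken from the commit message
	apiKeyHashLen       = 16   // api_key_hash field width from the commit message
)

// encodeQueryRunInfo is a hypothetical helper that lays out
// [opcode:1][api_key_hash:16][run_id_len:1][run_id:var].
func encodeQueryRunInfo(apiKeyHash [apiKeyHashLen]byte, runID string) ([]byte, error) {
	// run_id_len is a single byte, so the run ID cannot exceed 255 bytes.
	if len(runID) > 255 {
		return nil, errors.New("run_id longer than 255 bytes")
	}
	buf := make([]byte, 0, 1+apiKeyHashLen+1+len(runID))
	buf = append(buf, opQueryRunInfo)
	buf = append(buf, apiKeyHash[:]...)
	buf = append(buf, byte(len(runID)))
	buf = append(buf, runID...)
	return buf, nil
}

// decodeQueryRunInfo is the hypothetical inverse. Rejecting frames whose
// length does not match run_id_len exactly is an assumption of this sketch.
func decodeQueryRunInfo(frame []byte) (hash [apiKeyHashLen]byte, runID string, err error) {
	const header = 1 + apiKeyHashLen + 1
	if len(frame) < header {
		return hash, "", errors.New("frame too short")
	}
	if frame[0] != opQueryRunInfo {
		return hash, "", fmt.Errorf("unexpected opcode 0x%02x", frame[0])
	}
	copy(hash[:], frame[1:1+apiKeyHashLen])
	n := int(frame[header-1])
	if len(frame) != header+n {
		return hash, "", errors.New("run_id length mismatch")
	}
	return hash, string(frame[header:]), nil
}

func main() {
	var hash [apiKeyHashLen]byte // placeholder hash, all zeros
	frame, err := encodeQueryRunInfo(hash, "abc123")
	if err != nil {
		panic(err)
	}
	_, runID, err := decodeQueryRunInfo(frame)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes, opcode 0x%02x, run_id %q\n", len(frame), frame[0], runID)
}
```

With the run ID abc123 from the commit's usage examples, the frame is 1 + 16 + 1 + 6 = 24 bytes long. A one-byte length prefix keeps the header fixed-size at the cost of capping run IDs at 255 bytes.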