Jeremie Fraeys
412d7b82e9
security: implement comprehensive secrets protection
Critical fixes:
- Add SanitizeConnectionString() in storage/db_connect.go to remove passwords
- Add SecureEnvVar() in api/factory.go to clear env vars after reading (JWT_SECRET)
- Clear DB password from config after connection
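A minimal sketch of what the two helpers named above might look like. Only the function names come from this commit message; the signatures, the `REDACTED` placeholder, and the key/value fallback pattern are assumptions rather than the repository's actual code.

```go
package main

import (
	"net/url"
	"os"
	"regexp"
)

// keyValuePassword matches the password field of a key/value DSN such as
// "host=db user=app password=secret". The pattern is illustrative only.
var keyValuePassword = regexp.MustCompile(`(?i)(password=)\S+`)

// SanitizeConnectionString returns dsn with any password replaced by a
// placeholder so the result is safe to log. Hypothetical signature.
func SanitizeConnectionString(dsn string) string {
	// URL-style DSNs, e.g. postgres://user:secret@host:5432/db
	if u, err := url.Parse(dsn); err == nil && u.User != nil {
		if _, hasPassword := u.User.Password(); hasPassword {
			u.User = url.UserPassword(u.User.Username(), "REDACTED")
			return u.String()
		}
	}
	// Key/value DSNs, or anything url.Parse found no credentials in.
	return keyValuePassword.ReplaceAllString(dsn, "${1}REDACTED")
}

// SecureEnvVar reads an environment variable and then unsets it, so later
// os.Getenv calls and child processes started by this process no longer
// see it. Hypothetical signature.
func SecureEnvVar(name string) string {
	value := os.Getenv(name)
	_ = os.Unsetenv(name) // best effort: the value was already read
	return value
}
```

Under these assumptions a caller would read the secret once at startup, for example `secret := SecureEnvVar("JWT_SECRET")`, and pass only the sanitized connection string to the logger.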
Logging improvements:
- Enhance logging/sanitize.go with patterns for:
  - PostgreSQL connection strings
  - Generic connection string passwords
  - HTTP Authorization headers
  - Private keys
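The four pattern families listed above could be expressed as a table of regular expressions applied in order. This is a sketch under assumptions: the regexes, the `[REDACTED]` placeholder, and the `SanitizeLogMessage` name are hypothetical and are not taken from logging/sanitize.go.

```go
package main

import "regexp"

// redaction pairs a pattern with its replacement template.
type redaction struct {
	pattern     *regexp.Regexp
	replacement string
}

// redactions covers the four families named in the commit message.
// All patterns are illustrative guesses.
var redactions = []redaction{
	// PostgreSQL URLs: keep scheme and user, drop the password.
	{regexp.MustCompile(`(postgres(?:ql)?://[^:/@\s]+:)[^@\s]+@`), "${1}[REDACTED]@"},
	// Generic "password=..." fragments in connection strings.
	{regexp.MustCompile(`(?i)(password=)[^\s&;]+`), "${1}[REDACTED]"},
	// HTTP Authorization headers with Bearer or Basic credentials.
	{regexp.MustCompile(`(?i)(authorization:\s*(?:bearer|basic)\s+)\S+`), "${1}[REDACTED]"},
	// PEM-encoded private keys, including the multi-line body.
	{regexp.MustCompile(`(?s)-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----`), "[REDACTED PRIVATE KEY]"},
}

// SanitizeLogMessage applies every redaction to msg in order.
func SanitizeLogMessage(msg string) string {
	for _, r := range redactions {
		msg = r.pattern.ReplaceAllString(msg, r.replacement)
	}
	return msg
}
```

A table keeps each secret format in one place, so a new format means appending an entry rather than touching the call sites.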
CLI security:
- Add --security-audit flag to api-server for security checks:
  - Config file permissions
  - Exposed environment variables
  - Running as root
  - API key file permissions
- Add warning when the --api-key flag is used (the key is visible in the process list)
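Two of the audit checks listed above, file permissions and running as root, can be sketched with the standard library alone. The function names and the 0600 threshold are assumptions; the commit message does not say what the real --security-audit implementation checks for.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
)

// IsOverlyPermissive reports whether any group or other permission bit is
// set, i.e. whether the file is accessible to anyone besides its owner.
func IsOverlyPermissive(mode fs.FileMode) bool {
	return mode.Perm()&0o077 != 0
}

// AuditFile returns a warning for path, or "" when the permissions look
// acceptable. The 0600 expectation is an assumed policy.
func AuditFile(path string) (string, error) {
	info, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	if IsOverlyPermissive(info.Mode()) {
		perm := uint32(info.Mode().Perm())
		return fmt.Sprintf("%s has mode %04o; expected 0600 or stricter", path, perm), nil
	}
	return "", nil
}

// RunningAsRoot reports whether the effective user ID is 0.
// On Windows os.Geteuid returns -1, so this is always false there.
func RunningAsRoot() bool {
	return os.Geteuid() == 0
}
```

The same `AuditFile` helper would serve both the config file and the API key file checks, since both reduce to a mode test on a path.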
Files changed:
- internal/storage/db_connect.go
- internal/api/factory.go
- internal/logging/sanitize.go
- internal/auth/flags.go
- cmd/api-server/main.go
2026-02-18 16:18:09 -05:00
Jeremie Fraeys
10e6416e11
refactor: update WebSocket handlers and database schemas
- Update datasets handlers with improved error handling
- Refactor WebSocket handler for better organization
- Clean up jobs.go handler implementation
- Add websocket_metrics table to Postgres and SQLite schemas
2026-02-18 14:36:30 -05:00
Jeremie Fraeys
96a8e139d5
refactor(internal): update native bridge and queue integration
- Improve native queue integration in protocol layer
- Update native bridge library loading
- Clean up queue native implementation
2026-02-18 12:45:59 -05:00
Jeremie Fraeys
320e6fd409
refactor(dependency-hygiene): Move path functions from config to storage
Move ExpandPath function and path-related utilities from internal/config to internal/storage where they belong.
Files updated:
- internal/worker/config.go: use storage.ExpandPath
- internal/network/ssh.go: use storage.ExpandPath
- cmd/data_manager/data_manager_config.go: use storage.ExpandPath
- internal/api/server_config.go: use storage.ExpandPath
internal/storage/paths.go already contained the canonical implementation.
Result: Path utilities now live in storage layer, config package focuses on configuration structs.
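For readers unfamiliar with this kind of helper, here is a sketch of what a function like ExpandPath commonly does. The behavior shown (environment variable expansion, a leading `~`, path cleaning) is an assumption; the commit message only names the function and its new home in internal/storage.

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
)

// ExpandPath expands environment variables and a leading "~" in path and
// returns the cleaned result. Assumed behavior, not the canonical
// implementation in internal/storage/paths.go.
func ExpandPath(path string) string {
	if path == "" {
		return ""
	}
	// Replace $VAR and ${VAR} with values from the environment.
	path = os.ExpandEnv(path)
	// Expand "~" or "~/..." to the current user's home directory.
	if path == "~" || strings.HasPrefix(path, "~/") {
		if home, err := os.UserHomeDir(); err == nil {
			path = filepath.Join(home, path[1:])
		}
	}
	return filepath.Clean(path)
}
```

Callers in the worker, network, and API packages would then share one expansion rule through `storage.ExpandPath` instead of importing the config package for it.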
2026-02-17 21:15:23 -05:00
Jeremie Fraeys
f92e0bbdf9
feat: implement WebSocket handlers by delegating to sub-packages
Implemented WebSocket handlers by creating and integrating sub-packages:
**New package: api/datasets**
- HandleDatasetList, HandleDatasetRegister, HandleDatasetInfo, HandleDatasetSearch
- Binary protocol parsing for each operation
**Updated ws/handler.go**
- Added jobsHandler, jupyterHandler, datasetsHandler fields
- Updated NewHandler to accept sub-handlers
- Implemented handleAnnotateRun -> api/jobs
- Implemented handleSetRunNarrative -> api/jobs
- Implemented handleStartJupyter -> api/jupyter
- Implemented handleStopJupyter -> api/jupyter
- Implemented handleListJupyter -> api/jupyter
- Implemented handleDatasetList -> api/datasets
- Implemented handleDatasetRegister -> api/datasets
- Implemented handleDatasetInfo -> api/datasets
- Implemented handleDatasetSearch -> api/datasets
**Updated api/routes.go**
- Create jobs, jupyter, and datasets handlers
- Pass all handlers to ws.NewHandler
Build passes, all tests pass.
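The delegation described above can be sketched as a handler that owns one sub-handler per domain and routes by opcode. The field names and `NewHandler` follow the commit message; the `SubHandler` interface, the `Dispatch` method, and the opcode values are hypothetical.

```go
package main

import "fmt"

// Opcode values are made up for illustration; the real protocol defines its own.
const (
	OpcodeAnnotateRun byte = iota + 1
	OpcodeStartJupyter
	OpcodeDatasetList
)

// SubHandler is an assumed interface that each sub-package handler satisfies.
type SubHandler interface {
	Handle(opcode byte, payload []byte) error
}

// SubHandlerFunc adapts a plain function to SubHandler.
type SubHandlerFunc func(opcode byte, payload []byte) error

// Handle calls f.
func (f SubHandlerFunc) Handle(opcode byte, payload []byte) error {
	return f(opcode, payload)
}

// Handler holds the sub-handlers it delegates to.
type Handler struct {
	jobsHandler     SubHandler
	jupyterHandler  SubHandler
	datasetsHandler SubHandler
}

// NewHandler accepts the sub-handlers, mirroring the updated constructor.
func NewHandler(jobs, jupyter, datasets SubHandler) *Handler {
	return &Handler{jobsHandler: jobs, jupyterHandler: jupyter, datasetsHandler: datasets}
}

// Dispatch routes a decoded frame to the sub-handler that owns its opcode.
func (h *Handler) Dispatch(opcode byte, payload []byte) error {
	switch opcode {
	case OpcodeAnnotateRun:
		return h.jobsHandler.Handle(opcode, payload)
	case OpcodeStartJupyter:
		return h.jupyterHandler.Handle(opcode, payload)
	case OpcodeDatasetList:
		return h.datasetsHandler.Handle(opcode, payload)
	default:
		return fmt.Errorf("unknown opcode 0x%02x", opcode)
	}
}
```

Injecting the sub-handlers through the constructor is what lets api/routes.go build them once and lets tests substitute stubs.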
2026-02-17 20:49:31 -05:00
Jeremie Fraeys
3694d4e56f
refactor: extract ws handlers to separate files to reduce handler.go size
- Extract job handlers (handleQueueJob, handleQueueJobWithSnapshot, handleCancelJob, handlePrune) to ws/jobs.go (209 lines)
- Extract validation handler (handleValidateRequest) to ws/validate.go (167 lines)
- Reduce ws/handler.go from 879 to 474 lines (under the 500-line target)
- Keep core framework in handler.go: Handler struct, dispatch, packet sending, auth helpers
- All handlers remain as methods on Handler for backward compatibility
Result: handler.go 474 lines, jobs.go 209 lines, validate.go 167 lines
2026-02-17 20:38:03 -05:00
Jeremie Fraeys
fb2bbbaae5
refactor: Phase 7 - TUI cleanup - reorganize model package
Phase 7 of the monorepo maintainability plan:
New files created:
- model/jobs.go - Job type, JobStatus constants, list.Item interface
- model/messages.go - tea.Msg types (JobsLoadedMsg, StatusMsg, TickMsg, etc.)
- model/styles.go - NewJobListDelegate(), JobListTitleStyle(), SpinnerStyle()
- model/keys.go - KeyMap struct, DefaultKeys() function
Modified files:
- model/state.go - reduced from 226 to ~130 lines
  - Removed: Job, JobStatus, KeyMap, Keys, inline styles
  - Kept: State struct, domain re-exports, ViewMode, DatasetInfo, InitialState()
- controller/commands.go - use model. prefix for message types
- controller/controller.go - use model. prefix for message types
- controller/settings.go - use model.SettingsContentMsg
Deleted files:
- controller/keys.go (moved to model/keys.go since State references KeyMap)
Result:
- No file >150 lines in model/ package
- Single concern per file: state, jobs, messages, styles, keys
- All 41 test packages pass
2026-02-17 20:22:04 -05:00
Jeremie Fraeys
a1ce267b86
feat: Implement all worker stub methods with real functionality
- VerifySnapshot: SHA256 verification using integrity package
- EnforceTaskProvenance: Strict and best-effort provenance validation
- RunJupyterTask: Full Jupyter service lifecycle (start/stop/remove/restore/list_packages)
- RunJob: Job execution using executor.JobRunner
- PrewarmNextOnce: Prewarming with queue integration
All methods now use new architecture components instead of placeholders
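As an illustration of the SHA256 verification mentioned for VerifySnapshot, the sketch below hashes a byte stream and compares it with an expected hex digest. It assumes the snapshot is available as a single stream; the function names are hypothetical and the repository's integrity package may work differently, for example by hashing a directory tree.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// VerifySHA256 streams r through SHA-256 and compares the digest with
// expectedHex (case-insensitive, surrounding whitespace ignored).
func VerifySHA256(r io.Reader, expectedHex string) error {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return fmt.Errorf("hashing snapshot: %w", err)
	}
	actual := hex.EncodeToString(h.Sum(nil))
	if !strings.EqualFold(actual, strings.TrimSpace(expectedHex)) {
		return fmt.Errorf("checksum mismatch: got %s, want %s", actual, expectedHex)
	}
	return nil
}

// VerifySHA256Bytes is a convenience wrapper for in-memory data.
func VerifySHA256Bytes(data []byte, expectedHex string) error {
	return VerifySHA256(bytes.NewReader(data), expectedHex)
}
```

Streaming through `io.Copy` avoids loading a large snapshot into memory; a file opened with `os.Open` can be passed directly as the reader.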
2026-02-17 17:37:56 -05:00
Jeremie Fraeys
4c8c9dfe4b
refactor: Export SelectDependencyManifest for API helpers
- Renamed selectDependencyManifest to SelectDependencyManifest (exported)
- Added re-export in worker package for backward compatibility
- Updated internal call in container.go to use exported function
- API helpers can now access via worker.SelectDependencyManifest
Build status: Compiles successfully
2026-02-17 16:45:59 -05:00
Jeremie Fraeys
d8cc2a4efa
refactor: Migrate all test imports from api to api/ws package
Updated 6 test files to use proper api/ws package imports:
1. tests/e2e/websocket_e2e_test.go
   - api.NewWSHandler → ws.NewHandler
2. tests/e2e/wss_reverse_proxy_e2e_test.go
   - api.NewWSHandler → ws.NewHandler
3. tests/integration/ws_handler_integration_test.go
   - api.NewWSHandler → wspkg.NewHandler
   - api.Opcode* → wspkg.Opcode*
4. tests/integration/websocket_queue_integration_test.go
   - api.NewWSHandler → wspkg.NewHandler
   - api.Opcode* → wspkg.Opcode*
5. tests/unit/api/ws_test.go
   - api.NewWSHandler → wspkg.NewHandler
   - api.Opcode* → wspkg.Opcode*
6. tests/unit/api/ws_jobs_args_test.go
   - api.Opcode* → wspkg.Opcode*
Removed api/ws_compat.go shim as all tests now use proper imports.
Build status: Compiles successfully
2026-02-17 13:52:20 -05:00
Jeremie Fraeys
83ca393ebc
fix: Add proper WebSocket compatibility shim for test imports
Updated api/ws_compat.go to properly delegate to api/ws package:
- NewWSHandler returns http.Handler interface (not interface{})
- All Opcode* constants re-exported from ws package
- Maintains backward compatibility for existing tests
This allows gradual migration of tests to use api/ws directly without
breaking the build. Tests can be updated incrementally.
Build status: Compiles successfully
2026-02-17 13:47:47 -05:00
Jeremie Fraeys
d9c5750ed8
refactor: Phase 5 cleanup - Remove original ws_*.go files
Removed original monolithic WebSocket handler files after extracting
to focused packages:
Deleted:
- ws_jobs.go (1,365 lines) → Extracted to api/jobs/handlers.go
- ws_jupyter.go (512 lines) → Extracted to api/jupyter/handlers.go
- ws_validate.go (523 lines) → Extracted to api/validate/handlers.go
- ws_handler.go (379 lines) → Extracted to api/ws/handler.go
- ws_datasets.go (174 lines) - Functionality not migrated
- ws_tls_auth.go (101 lines) - Functionality not migrated
Updated:
- routes.go - Changed NewWSHandler → ws.NewHandler
Lines deleted: ~3,050 lines from the monolithic files (sum of the counts above)
Build status: Compiles successfully
2026-02-17 13:33:00 -05:00
Jeremie Fraeys
f0ffbb4a3d
refactor: Phase 5 complete - API packages extracted
Extracted all deferred API packages from monolithic ws_*.go files:
- api/routes.go (75 lines) - Extracted route registration from server.go
- api/errors.go (108 lines) - Standardized error responses and error codes
- api/jobs/handlers.go (271 lines) - Job WebSocket handlers
  * HandleAnnotateRun, HandleSetRunNarrative
  * HandleCancelJob, HandlePruneJobs, HandleListJobs
- api/jupyter/handlers.go (244 lines) - Jupyter WebSocket handlers
  * HandleStartJupyter, HandleStopJupyter
  * HandleListJupyter, HandleListJupyterPackages
  * HandleRemoveJupyter, HandleRestoreJupyter
- api/validate/handlers.go (163 lines) - Validation WebSocket handlers
  * HandleValidate, HandleGetValidateStatus, HandleListValidations
- api/ws/handler.go (298 lines) - WebSocket handler framework
  * Core WebSocket handling logic
  * Opcode constants and error codes
Lines redistributed: ~1,150 lines from ws_jobs.go (1,365), ws_jupyter.go (512),
ws_validate.go (523), ws_handler.go (379) into focused packages.
Note: Original ws_*.go files still present - cleanup in next commit.
Build status: Compiles successfully
2026-02-17 13:25:58 -05:00
Jeremie Fraeys
db7fbbd8d5
refactor: Phase 5 - split API package into focused files
Reorganized internal/api/ package to follow single-concern principle:
- api/factory.go (new file, 257 lines)
  - Extracted component initialization from server.go
  - initializeComponents(), setupLogger(), initExperimentManager()
  - initTaskQueue(), initDatabase(), initDatabaseSchema()
  - initSecurity(), initJupyterServiceManager(), initAuditLogger()
- api/middleware.go (new file, 31 lines)
  - Extracted wrapWithMiddleware() - security middleware chain
  - Centralized auth, rate limiting, CORS, security headers
- api/server.go (reduced from 446 to 212 lines)
  - Now focused on Server lifecycle: NewServer, Start, WaitForShutdown, Close
  - Removed initialization logic (moved to factory.go)
  - Removed middleware wrapper (moved to middleware.go)
- api/metrics_middleware.go (existing, 64 lines)
  - Already had wrapWithMetrics(), left in place
Lines redistributed: ~180 lines from monolithic server.go
Build status: Compiles successfully
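A middleware chain of the kind wrapWithMiddleware() centralizes can be sketched with plain net/http handlers. Everything here is an assumption for illustration: the exported `WrapWithMiddleware` signature, the two example middlewares, and the `X-API-Key` header are not taken from api/middleware.go.

```go
package main

import (
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler with extra behavior.
type Middleware func(http.Handler) http.Handler

// WrapWithMiddleware applies mws so that the first one listed runs outermost.
func WrapWithMiddleware(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// SecurityHeaders sets a response header before calling the next handler.
func SecurityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("X-Content-Type-Options", "nosniff")
		next.ServeHTTP(w, r)
	})
}

// RequireAPIKey rejects requests that carry no X-API-Key header.
// The header name is hypothetical; no key validation is done here.
func RequireAPIKey(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-API-Key") == "" {
			http.Error(w, "missing API key", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// StatusFor sends one in-memory request through the chain and returns the
// response status, which makes the ordering easy to check without a server.
func StatusFor(apiKey string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := WrapWithMiddleware(ok, SecurityHeaders, RequireAPIKey)
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	if apiKey != "" {
		req.Header.Set("X-API-Key", apiKey)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}
```

Keeping the chain in one function means the order of auth, rate limiting, CORS, and security headers is decided in a single place rather than at each route.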
2026-02-17 13:11:02 -05:00
Jeremie Fraeys
d1bef0a450
refactor: Phase 3 - fix config/storage boundaries
Move schema ownership to infrastructure layer:
- Redis keys: config/constants.go -> queue/keys.go (TaskQueueKey, TaskPrefix, etc.)
- Filesystem paths: config/paths.go -> storage/paths.go (JobPaths)
- Create config/shared.go with RedisConfig, SSHConfig
- Update all imports: worker/, api/helpers, api/ws_jobs, api/ws_validate
- Clean up: remove duplicates from queue/task.go, queue/queue.go, config/paths.go
Build status: Compiles successfully
2026-02-17 12:49:53 -05:00
Jeremie Fraeys
b05470b30a
refactor: improve API structure and WebSocket protocol
- Extract WebSocket protocol handling to dedicated module
- Add helper functions for DB operations, validation, and responses
- Improve WebSocket frame handling and opcodes
- Refactor dataset, job, and Jupyter handlers
- Add duplicate detection processing
2026-02-16 20:38:12 -05:00
Jeremie Fraeys
2e701340e5
feat(core): API, worker, queue, and manifest improvements
- Add protocol buffer optimizations (internal/api/protocol.go)
- Add filesystem queue backend (internal/queue/filesystem_queue.go)
- Add run manifest support (internal/manifest/run_manifest.go)
- Worker and jupyter task refinements
- Exported test wrappers for benchmarking
2026-02-12 12:05:17 -05:00
Jeremie Fraeys
add4a90e62
feat(api): refactor websocket handlers; add health and prometheus middleware
2026-01-05 12:31:07 -05:00
Jeremie Fraeys
cd5640ebd2
Slim and secure: move scripts, clean configs, remove secrets
- Move ci-test.sh and setup.sh to scripts/
- Trim docs/src/zig-cli.md to current structure
- Replace hardcoded secrets with placeholders in configs
- Update .gitignore to block .env*, secrets/, keys, build artifacts
- Slim README.md to reflect current CLI/TUI split
- Add cleanup trap to ci-test.sh
- Ensure no secrets are committed
2025-12-07 13:57:51 -05:00
Jeremie Fraeys
ea15af1833
Fix multi-user authentication and clean up debug code
- Fix YAML tags in auth config struct (json -> yaml)
- Update CLI configs to use pre-hashed API keys
- Remove double hashing in WebSocket client
- Fix port mapping (9102 -> 9103) in CLI commands
- Update permission keys to use jobs:read, jobs:create, etc.
- Clean up all debug logging from CLI and server
- All user roles now authenticate correctly:
  * Admin: Can queue jobs and see all jobs
  * Researcher: Can queue jobs and see own jobs
  * Analyst: Can see status (read-only access)
Multi-user authentication is now fully functional.
2025-12-06 12:35:32 -05:00
Jeremie Fraeys
803677be57
feat: implement Go backend with comprehensive API and internal packages
- Add API server with WebSocket support and REST endpoints
- Implement authentication system with API keys and permissions
- Add task queue system with Redis backend and error handling
- Include storage layer with database migrations and schemas
- Add comprehensive logging, metrics, and telemetry
- Implement security middleware and network utilities
- Add experiment management and container orchestration
- Include configuration management with smart defaults
2025-12-04 16:53:53 -05:00