
main.go	feat: implement research-grade maintainability phases 1,3,4,7	2026-02-18 15:27:50 -05:00
README.md	feat(api): refactor websocket handlers; add health and prometheus middleware	2026-01-05 12:31:07 -05:00

Jeremie Fraeys 7194826871

feat: implement research-grade maintainability phases 1,3,4,7

Phase 1: Event Sourcing
- Add TaskEvent types (queued, started, completed, failed, etc.)
- Create EventStore with Redis Streams (append-only)
- Support event querying by task ID and time range

Phase 3: Diagnosable Failures
- Enhance TaskExecutionError with Context map, Timestamp, Recoverable flag
- Update container.go to populate error context (image, GPU, duration)
- Add WithContext helper for building error context
- Create cmd/errors CLI for querying task errors

Phase 4: Testable Security
- Add security fields to PodmanConfig (Privileged, Network, ReadOnlyMounts)
- Create ValidateSecurityPolicy() with ErrSecurityViolation
- Add security contract tests (privileged rejection, host network rejection)
- Tests serve as executable security documentation

Phase 7: Reproducible Builds
- Add BuildHash and BuildTime ldflags to Makefile
- Create verify-build target for reproducibility testing
- Add -version and -verify flags to api-server

All tests pass:
- go test ./internal/errtypes/...
- go test ./internal/container/... -run Security
- go test ./internal/queue/...
- go build ./cmd/api-server/...

2026-02-18 15:27:50 -05:00

README.md

API Server

WebSocket API server for the ML CLI tool...

Usage

./bin/api-server --config configs/api/dev.yaml

Endpoints

GET /health - Health check
WS /ws - WebSocket endpoint for CLI communication

Binary Protocol

See CLI README for protocol details.

Configuration

The API server uses the same configuration file as the worker. The experiment base path is read from the base_path configuration key.

Example

# Start API server
./bin/api-server --listen :9100

# In another terminal, test with CLI
./cli/zig-out/bin/ml status