
# ADR-001: Use Go for API Server

## Status

Accepted

## Context

We needed to choose a programming language for the Fetch ML API server that would provide:

- High performance for ML experiment management
- Strong concurrency support for handling multiple experiments
- Good ecosystem for HTTP APIs and WebSocket connections
- Easy deployment and containerization
- Strong type safety and reliability

## Decision

We chose Go as the primary language for the API server implementation.

## Consequences

### Positive

- Excellent performance with a low memory footprint
- Built-in concurrency primitives (goroutines, channels) well suited to parallel ML experiment execution
- Rich ecosystem for HTTP servers, WebSockets, and database drivers
- Static compilation produces single-binary deployments
- Strong typing catches many errors at compile time
- Good tooling for testing, benchmarking, and profiling
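The goroutine/channel point above can be sketched as a small worker pool that fans queued experiments out to a fixed number of workers. The `experiment` type and result strings are placeholders, not Fetch ML's real job model.

```go
package main

import (
	"fmt"
	"sync"
)

// experiment stands in for a queued ML run; the real job type is assumed.
type experiment struct {
	ID int
}

// runAll fans experiments out to `workers` goroutines over a jobs
// channel and collects one result per experiment over a results channel.
func runAll(exps []experiment, workers int) []string {
	jobs := make(chan experiment)
	results := make(chan string)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for e := range jobs {
				// A real worker would launch and monitor the experiment here.
				results <- fmt.Sprintf("experiment %d: done", e.ID)
			}
		}()
	}

	// Feed jobs, then close results once every worker has drained the queue.
	go func() {
		for _, e := range exps {
			jobs <- e
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	var out []string
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	out := runAll([]experiment{{ID: 1}, {ID: 2}, {ID: 3}}, 2)
	fmt.Println(len(out)) // prints 3
}
```

The same pattern extends naturally to bounded concurrency (cap the worker count) and cancellation (thread a `context.Context` through the workers).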

### Negative

- Steeper learning curve for team members unfamiliar with Go
- Less expressive than dynamic languages for rapid prototyping
- Smaller ecosystem for ML-specific libraries compared to Python

## Options Considered

### Python with FastAPI

Pros:

- Rich ML ecosystem (TensorFlow, PyTorch, scikit-learn)
- Easy to learn and write
- Great for data science teams
- FastAPI provides good performance

Cons:

- Global Interpreter Lock limits true parallelism
- Higher memory usage
- Slower performance in high-throughput scenarios
- More complex deployment (interpreter, dependencies, multiple files)

### Node.js with Express

Pros:

- Excellent WebSocket support
- Large ecosystem
- Fast development cycle

Cons:

- Single-threaded event loop can be limiting
- Not ideal for CPU-intensive ML operations
- Dynamic typing can lead to runtime errors

### Rust

Pros:

- Maximum performance and memory safety
- Strong type system
- Growing ecosystem

Cons:

- Very steep learning curve
- Longer development time
- Smaller ecosystem for web frameworks

### Java with Spring Boot

Pros:

- Mature ecosystem
- Good performance
- Strong typing

Cons:

- Higher memory usage
- More verbose syntax
- Slower startup time
- Heavier deployment footprint

## Rationale

Go provides the best balance of performance, concurrency support, and deployment simplicity for our API server needs. The ability to handle many concurrent ML experiments efficiently with goroutines is a key advantage. The single binary deployment model also simplifies our containerization and distribution strategy.
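The single-binary deployment model can be sketched as a multi-stage container build: compile a static binary in a full Go image, then copy only that binary into a minimal runtime image. The paths, module layout, and image tags below are assumptions for illustration, not the project's actual Dockerfile.

```dockerfile
# Build stage: disable CGO so the binary is fully static.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /fetch-api ./cmd/server

# Runtime stage: nothing but the compiled binary.
FROM gcr.io/distroless/static
COPY --from=build /fetch-api /fetch-api
ENTRYPOINT ["/fetch-api"]
```

The resulting image contains no shell, package manager, or interpreter, which keeps it small and shrinks the attack surface compared with shipping a Python or Java runtime.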