ADR-001: Use Go for API Server
Status
Accepted
Context
We needed to choose a programming language for the Fetch ML API server that would provide:
- High performance for ML experiment management
- Strong concurrency support for handling multiple experiments
- Good ecosystem for HTTP APIs and WebSocket connections
- Easy deployment and containerization
- Strong type safety and reliability
Decision
We chose Go as the primary language for the API server implementation.
Consequences
Positive
- Excellent performance with low memory footprint
- Built-in concurrency primitives (goroutines, channels) well suited to running many ML experiments in parallel
- Rich ecosystem for HTTP servers, WebSocket, and database drivers
- Static compilation creates single binary deployments
- Strong typing catches many errors at compile time
- Good tooling for testing, benchmarking, and profiling
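The concurrency point above can be sketched with plain goroutines and a channel; `Experiment` and `runExperiment` are hypothetical stand-ins for the server's real job types, not actual API names:

```go
package main

import (
	"fmt"
	"sync"
)

// Experiment and runExperiment are hypothetical placeholders for the
// real job types the API server manages.
type Experiment struct {
	ID string
}

func runExperiment(e Experiment) string {
	return "done: " + e.ID
}

func main() {
	experiments := []Experiment{{"exp-1"}, {"exp-2"}, {"exp-3"}}

	// Buffered channel collects results; WaitGroup tracks completion.
	results := make(chan string, len(experiments))
	var wg sync.WaitGroup

	// One goroutine per experiment: cheap enough that no thread pool
	// or external scheduler is required.
	for _, e := range experiments {
		wg.Add(1)
		go func(e Experiment) {
			defer wg.Done()
			results <- runExperiment(e)
		}(e)
	}

	wg.Wait()
	close(results)

	for r := range results {
		fmt.Println(r) // completion order is nondeterministic
	}
}
```

For a production server the per-experiment goroutine would typically be bounded by a semaphore or worker pool, but the primitives shown here are the whole mechanism.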
Negative
- Steeper learning curve for team members unfamiliar with Go
- Less expressive than dynamic languages for rapid prototyping
- Smaller ecosystem for ML-specific libraries compared to Python
Options Considered
Python with FastAPI
Pros:
- Rich ML ecosystem (TensorFlow, PyTorch, scikit-learn)
- Easy to learn and write
- Great for data science teams
- FastAPI provides good performance
Cons:
- Global Interpreter Lock (GIL) limits true parallelism for CPU-bound work
- Higher memory usage
- Slower performance for high-throughput scenarios
- More complex deployment (multiple files, dependencies)
Node.js with Express
Pros:
- Excellent WebSocket support
- Large ecosystem
- Fast development cycle
Cons:
- Single-threaded event loop can be limiting
- Not ideal for CPU-intensive ML operations
- Dynamic typing can lead to runtime errors
Rust
Pros:
- Maximum performance and memory safety
- Strong type system
- Growing ecosystem
Cons:
- Very steep learning curve
- Longer development time
- Smaller ecosystem for web frameworks
Java with Spring Boot
Pros:
- Mature ecosystem
- Good performance
- Strong typing
Cons:
- Higher memory usage
- More verbose syntax
- Slower startup time
- Heavier deployment footprint
Rationale
Go provides the best balance of performance, concurrency support, and deployment simplicity for our API server needs. The ability to handle many concurrent ML experiments efficiently with goroutines is a key advantage. The single binary deployment model also simplifies our containerization and distribution strategy.