- Create docker-tests.yml for merge-to-main CI pipeline
- Add mock GPU test matrix (NVIDIA, Metal, CPU-only)
- Add AGENTS.md with container architecture rules:
  * Docker for CI/CD testing and deployments
  * Podman for ML experiment isolation only
- Update .gitignore to track AGENTS.md
AGENTS.md - FetchML
Architecture
┌─────────┐     ┌─────────┐     ┌──────────┐     ┌─────────┐     ┌──────────┐
│   CLI   │────▶│   API   │────▶│ Scheduler│────▶│ Worker  │────▶│ Storage  │
│  (Zig)  │◄────│(Go/HTTP)│◄────│   (Go)   │◄────│  (Go)   │◄────│ (MinIO)  │
└─────────┘     └─────────┘     └──────────┘     └─────────┘     └──────────┘
│
▼
┌──────────┐
│  Redis   │
│ (Queue)  │
└──────────┘
CLI ↔ Server: HTTP (default) or Unix socket (local). execution_mode config:
direct (bypass scheduler) or queue (full flow). Auth via API key in header.
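As a rough illustration of the execution_mode switch, the sketch below validates the two documented values and reports whether a task would pass through the scheduler. The names ExecutionMode, ParseExecutionMode, and UsesScheduler are hypothetical and are not taken from the FetchML codebase.

```go
package main

import "fmt"

// ExecutionMode is a hypothetical type mirroring the execution_mode config value.
type ExecutionMode string

const (
	ModeDirect ExecutionMode = "direct" // bypass the scheduler
	ModeQueue  ExecutionMode = "queue"  // full flow through scheduler and queue
)

// ParseExecutionMode accepts only the two documented values.
func ParseExecutionMode(s string) (ExecutionMode, error) {
	switch m := ExecutionMode(s); m {
	case ModeDirect, ModeQueue:
		return m, nil
	default:
		return "", fmt.Errorf("unknown execution_mode %q (want direct or queue)", s)
	}
}

// UsesScheduler reports whether tasks pass through the scheduler in this mode.
func (m ExecutionMode) UsesScheduler() bool { return m == ModeQueue }

func main() {
	for _, raw := range []string{"direct", "queue", "batch"} {
		mode, err := ParseExecutionMode(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: uses scheduler = %v\n", mode, mode.UsesScheduler())
	}
}
```

Rejecting unknown values at parse time keeps a typo in the config from silently falling back to one of the two modes.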
Container Architecture
Docker - Used for:
- CI/CD testing pipelines (.forgejo/workflows/docker-tests.yml)
- Application deployments (staging/production)
- Build environments
Podman - Used for:
- ML experiment isolation only
- Running untrusted/3rd party ML workloads
- Rootless container execution for security
Rule: Never use Podman for CI testing or deployments. Never use Docker for experiment isolation.
Critical Invariants
Audit Log — never break these
- Append-only — entries are never modified or deleted
- Hash chain — every entry includes SHA256 of the previous entry
- All mutations to tasks/groups/tokens must produce an audit entry
- Write the audit entry before the storage write — partial failures must be audited
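
The invariants above can be sketched as a small in-memory hash chain. This is only an illustration under assumptions: the Entry fields, the Log type, the way the hash input is built, and the ":attempt" / ":failed" suffixes are all hypothetical, and the real audit log schema may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// Entry is a hypothetical audit record.
type Entry struct {
	Action   string
	PrevHash string // hex SHA-256 of the previous entry ("" for the first)
	Hash     string // hex SHA-256 over PrevHash and Action
}

// Log is an in-memory, append-only chain used only for illustration.
type Log struct{ entries []Entry }

func hashEntry(prevHash, action string) string {
	sum := sha256.Sum256([]byte(prevHash + "\n" + action))
	return hex.EncodeToString(sum[:])
}

// Append links a new entry to the current tail; existing entries are never rewritten.
func (l *Log) Append(action string) Entry {
	prev := ""
	if n := len(l.entries); n > 0 {
		prev = l.entries[n-1].Hash
	}
	e := Entry{Action: action, PrevHash: prev, Hash: hashEntry(prev, action)}
	l.entries = append(l.entries, e)
	return e
}

// Verify walks the chain and fails on the first broken link.
func (l *Log) Verify() error {
	prev := ""
	for i, e := range l.entries {
		if e.PrevHash != prev || e.Hash != hashEntry(e.PrevHash, e.Action) {
			return fmt.Errorf("audit chain broken at entry %d", i)
		}
		prev = e.Hash
	}
	return nil
}

// Mutate writes the audit entry first and only then runs the storage write,
// so a failed write still leaves a trace in the log.
func (l *Log) Mutate(action string, write func() error) error {
	l.Append(action + ":attempt")
	if err := write(); err != nil {
		l.Append(action + ":failed")
		return err
	}
	return nil
}

func main() {
	var log Log
	_ = log.Mutate("task.create", func() error { return nil })
	_ = log.Mutate("task.delete", func() error { return errors.New("disk full") })
	fmt.Println("entries:", len(log.entries), "verify:", log.Verify())
}
```

Because each hash covers the previous hash, editing or deleting any earlier entry changes every later link, which is what makes tampering detectable in Verify.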
Auth
- TokenFromContext(ctx) is the only authorised way to extract auth in handlers
- Group visibility enforced at DB query level; never filter in application code
- API keys hashed with bcrypt before storage — never log raw keys
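
A single-accessor rule like the one for TokenFromContext is commonly built on an unexported context key. The sketch below shows that pattern only; the Token fields, WithToken, ErrNoToken, and the return signature of TokenFromContext are assumptions, not the project's actual definitions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Token is a hypothetical stand-in for the authenticated principal.
type Token struct {
	Subject string
	Groups  []string
}

// ctxKey is unexported, so only this package can set or read the value.
type ctxKey struct{}

// ErrNoToken is returned when no token was attached to the context.
var ErrNoToken = errors.New("no auth token in context")

// WithToken would be called by auth middleware after validating the API key.
func WithToken(ctx context.Context, t Token) context.Context {
	return context.WithValue(ctx, ctxKey{}, t)
}

// TokenFromContext is the single accessor that handlers use.
func TokenFromContext(ctx context.Context) (Token, error) {
	t, ok := ctx.Value(ctxKey{}).(Token)
	if !ok {
		return Token{}, ErrNoToken
	}
	return t, nil
}

// whoami is a toy handler that reads auth only through TokenFromContext.
func whoami(ctx context.Context) string {
	tok, err := TokenFromContext(ctx)
	if err != nil {
		return "unauthorized"
	}
	return "hello " + tok.Subject
}

// demo simulates one anonymous and one authenticated request.
func demo() (anon, authed string) {
	base := context.Background()
	return whoami(base), whoami(WithToken(base, Token{Subject: "alice"}))
}

func main() {
	anon, authed := demo()
	fmt.Println(anon)
	fmt.Println(authed)
}
```

Keeping the key type unexported means no other package can read or forge the value, so every handler is forced through the one accessor.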
Storage
- All DB access through repository types in internal/db/repository/
- Transactions via WithTx(ctx, db, func(tx *sql.Tx) error); never manage tx manually
- Migrations: additive only. New columns must be nullable or have defaults; never drop columns (mark deprecated, remove later)
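
To show why a helper is preferred over manual transaction handling, here is a minimal sketch of a WithTx with the documented parameter list. The error return, the rollback-on-panic policy, and the tiny fake driver (included only so the sketch runs without a real database) are assumptions rather than the repository's implementation.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// WithTx begins a transaction, runs fn, and commits or rolls back for the caller.
func WithTx(ctx context.Context, db *sql.DB, fn func(tx *sql.Tx) error) (err error) {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return fmt.Errorf("begin tx: %w", err)
	}
	defer func() {
		if p := recover(); p != nil {
			_ = tx.Rollback() // undo partial work, then re-raise the panic
			panic(p)
		}
		if err != nil {
			_ = tx.Rollback() // fn failed: discard its writes
			return
		}
		err = tx.Commit() // fn succeeded: commit and surface commit errors
	}()
	return fn(tx)
}

// The fake driver below only counts commits and rollbacks.
type fakeDriver struct{}
type fakeConn struct{}
type fakeTx struct{}

var commits, rollbacks int

func (fakeDriver) Open(name string) (driver.Conn, error) { return fakeConn{}, nil }
func (fakeConn) Prepare(q string) (driver.Stmt, error) {
	return nil, errors.New("statements are not supported by the fake driver")
}
func (fakeConn) Close() error              { return nil }
func (fakeConn) Begin() (driver.Tx, error) { return fakeTx{}, nil }
func (fakeTx) Commit() error               { commits++; return nil }
func (fakeTx) Rollback() error             { rollbacks++; return nil }

func init() { sql.Register("fetchml-fake", fakeDriver{}) }

// demo runs one successful and one failing transaction and returns the counters.
func demo() (nCommit, nRollback int, err error) {
	commits, rollbacks = 0, 0
	db, err := sql.Open("fetchml-fake", "")
	if err != nil {
		return 0, 0, err
	}
	defer db.Close()
	ctx := context.Background()
	_ = WithTx(ctx, db, func(tx *sql.Tx) error { return nil })
	_ = WithTx(ctx, db, func(tx *sql.Tx) error { return errors.New("constraint violated") })
	return commits, rollbacks, nil
}

func main() {
	c, r, err := demo()
	fmt.Println("commits:", c, "rollbacks:", r, "err:", err)
}
```

With the commit and rollback decisions made in one deferred function, a repository method cannot forget to roll back on an early return.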
CGO / Native Libs
Use -tags native_libs when building with C++ extensions. This has broken twice —
always check build tags when touching GPU detection or native code.
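For orientation, this is roughly what a build-tag split looks like in Go. The file below is a hypothetical pure-Go fallback that is compiled only when -tags native_libs is absent; a sibling file starting with //go:build native_libs would hold the cgo-backed versions of the same names. The identifiers nativeLibsEnabled and detectGPU are made up for this sketch.

```go
//go:build !native_libs

// Fallback file: compiled only when the native_libs tag is NOT set.
package main

import "fmt"

// nativeLibsEnabled would be true in the sibling file guarded by
// //go:build native_libs (not shown here).
const nativeLibsEnabled = false

// detectGPU is a hypothetical fallback that reports no native GPU support.
func detectGPU() string {
	return "cpu-only (built without native_libs)"
}

func main() {
	fmt.Println("native libs:", nativeLibsEnabled, "gpu:", detectGPU())
}
```

If the constraint line is mistyped or dropped from one of the two files, the build either fails with duplicate definitions or silently picks the fallback, which is why the tags are worth re-checking after any change to native code.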
Build Commands
make build # all components
make dev # fast, no LTO
make prod # production-optimized
make prod-with-native # production + C++ libs
make cross-platform # Linux/macOS/Windows
cd cli && make dev # Zig: fast compile + format
cd cli && make prod # Zig: release=fast, LTO
cd cli && make debug # Zig: no optimizations
cd cli && zig build test
Test Commands
make test # all tests (Docker)
make test-unit
make test-integration
make test-e2e
make test-coverage
go test -v ./path/to/package -run TestName
go test -race ./path/to/package/...
LOG_LEVEL=debug go test -v ./path/to/package
FETCH_ML_E2E_PODMAN=1 go test ./tests/e2e/...
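The last command suggests the Podman end-to-end tests are opt-in through an environment variable. One plausible way to gate on it is sketched below; the helper name podmanE2EEnabled and the exact comparison against "1" are assumptions about how the gate works, not code from the repository.

```go
package main

import (
	"fmt"
	"os"
)

// podmanE2EEnabled reports whether the opt-in variable is set to "1".
// getenv is injected so the check can be exercised without touching the
// process environment; pass os.Getenv in real code.
func podmanE2EEnabled(getenv func(string) string) bool {
	return getenv("FETCH_ML_E2E_PODMAN") == "1"
}

func main() {
	if !podmanE2EEnabled(os.Getenv) {
		fmt.Println("skipping Podman e2e tests (set FETCH_ML_E2E_PODMAN=1 to run them)")
		return
	}
	fmt.Println("running Podman e2e tests")
}
```

Inside a test function the same check would typically sit at the top and call t.Skip when the variable is unset.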
Lint / Security
make lint
make security-scan
make configlint
make openapi-validate
go vet ./...
cd cli && zig fmt .
Legacy Go — modernize when touching existing code only
| Legacy | Modern |
|---|---|
| interface{} | any |
| for i := 0; i < n; i++ | for i := range items |
| []byte(fmt.Sprintf(...)) | fmt.Appendf(nil, ...) |
| sort.Slice with closure | slices.Sort(x) |
| Manual contains loop | slices.Contains |
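
The modern forms from the table, collected in one small runnable sketch. The function names describe and modernize are illustrative only. Note that slices.Sort applies to slices of ordered element types; where the legacy closure encoded a custom ordering, slices.SortFunc is the closer replacement.

```go
package main

import (
	"fmt"
	"slices"
)

// describe takes any, which is an alias for interface{}.
func describe(v any) string { return fmt.Sprintf("%T", v) }

// modernize exercises the remaining rows of the table on a string slice.
func modernize(items []string) (indices []int, line []byte, sorted []string, hasB bool) {
	// Range form instead of a manual index loop.
	for i := range items {
		indices = append(indices, i)
	}

	// fmt.Appendf writes straight into a byte slice, replacing []byte(fmt.Sprintf(...)).
	line = fmt.Appendf(nil, "%d items", len(items))

	// slices.Sort replaces sort.Slice plus a closure for ordered element types.
	sorted = slices.Clone(items) // sort a copy so the input stays untouched
	slices.Sort(sorted)

	// slices.Contains replaces a hand-written search loop.
	hasB = slices.Contains(items, "b")
	return indices, line, sorted, hasB
}

func main() {
	idx, line, sorted, hasB := modernize([]string{"c", "a", "b"})
	fmt.Println(idx, string(line), sorted, hasB, describe(42))
}
```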
Dependencies
- Go 1.25+, Zig 0.15+, Python 3.11+
- Redis (integration tests), Docker/Podman (container tests)