fetch_ml/AGENTS.md
Jeremie Fraeys 6646f3a382
ci(docker): add test workflow and container architecture docs
- Create docker-tests.yml for merge-to-main CI pipeline
- Add mock GPU test matrix (NVIDIA, Metal, CPU-only)
- Add AGENTS.md with container architecture rules:
  * Docker for CI/CD testing and deployments
  * Podman for ML experiment isolation only
- Update .gitignore to track AGENTS.md
2026-03-12 14:05:53 -04:00

# AGENTS.md - FetchML
## Architecture
```
┌─────────┐     ┌─────────┐     ┌──────────┐     ┌─────────┐     ┌──────────┐
│   CLI   │────▶│   API   │────▶│ Scheduler│────▶│ Worker  │────▶│ Storage  │
│  (Zig)  │◄────│(Go/HTTP)│◄────│   (Go)   │◄────│  (Go)   │◄────│ (MinIO)  │
└─────────┘     └─────────┘     └──────────┘     └─────────┘     └──────────┘
                                ┌──────────┐
                                │  Redis   │
                                │ (Queue)  │
                                └──────────┘
```
**CLI ↔ Server**: HTTP (default) or Unix socket (local). The `execution_mode` config
option selects `direct` (bypass the scheduler) or `queue` (full flow). Auth is via an
API key sent in a request header.

---
## Container Architecture
**Docker** - Used for:
- CI/CD testing pipelines (`.forgejo/workflows/docker-tests.yml`)
- Application deployments (staging/production)
- Build environments
**Podman** - Used for:
- ML experiment isolation only
- Running untrusted/3rd party ML workloads
- Rootless container execution for security
**Rule**: Never use Podman for CI testing or deployments. Never use Docker for experiment isolation.

---
## Critical Invariants
### Audit Log — never break these
- **Append-only** — entries are never modified or deleted
- **Hash chain** — every entry includes SHA256 of the previous entry
- **All mutations** to tasks/groups/tokens must produce an audit entry
- Write the audit entry before the storage write — partial failures must be audited
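
The append-only and hash-chain rules above can be illustrated with a minimal in-memory sketch. The `Entry` and `Log` types, their field names, and the choice to hash `PrevHash` plus `Action` are assumptions made for illustration; they are not FetchML's actual audit schema.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Entry is a hypothetical audit record; the field names are illustrative only.
type Entry struct {
	Action   string
	PrevHash string // hex SHA-256 of the previous entry ("" for the first entry)
	Hash     string // hex SHA-256 over PrevHash and Action
}

// Log is an in-memory chain that only ever grows through Append.
type Log struct{ entries []Entry }

// hashEntry derives an entry's hash from the previous hash and its own payload.
func hashEntry(prevHash, action string) string {
	sum := sha256.Sum256([]byte(prevHash + "\n" + action))
	return hex.EncodeToString(sum[:])
}

// Append adds a new entry linked to the current tail of the chain.
func (l *Log) Append(action string) Entry {
	prev := ""
	if n := len(l.entries); n > 0 {
		prev = l.entries[n-1].Hash
	}
	e := Entry{Action: action, PrevHash: prev, Hash: hashEntry(prev, action)}
	l.entries = append(l.entries, e)
	return e
}

// Verify recomputes every link and reports the first entry that does not match.
func (l *Log) Verify() error {
	prev := ""
	for i, e := range l.entries {
		if e.PrevHash != prev || e.Hash != hashEntry(e.PrevHash, e.Action) {
			return fmt.Errorf("audit chain broken at entry %d", i)
		}
		prev = e.Hash
	}
	return nil
}

func main() {
	var audit Log
	audit.Append("task.create id=1")
	audit.Append("task.cancel id=1")
	fmt.Println("intact:", audit.Verify() == nil)

	audit.entries[0].Action = "task.create id=2" // simulate an in-place edit
	fmt.Println("tampering detected:", audit.Verify() != nil)
}
```

Because each hash covers the previous one, editing or deleting any earlier entry invalidates every later link, which is why entries can only be appended.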
### Auth
- `TokenFromContext(ctx)` is the only authorised way to extract auth in handlers
- Group visibility enforced at DB query level — never filter in application code
- API keys hashed with bcrypt before storage — never log raw keys
### Storage
- All DB access through repository types in `internal/db/repository/`
- Transactions via `WithTx(ctx, db, func(tx *sql.Tx) error)` — never manage tx manually
- Migrations are additive only: new columns must be nullable or have defaults.
  Never drop a column in place (mark it deprecated, remove it later)
### CGO / Native Libs
Use `-tags native_libs` when building with C++ extensions. This has broken twice —
always check build tags when touching GPU detection or native code.

---
## Build Commands
```bash
make build # all components
make dev # fast, no LTO
make prod # production-optimized
make prod-with-native # production + C++ libs
make cross-platform # Linux/macOS/Windows
cd cli && make dev # Zig: fast compile + format
cd cli && make prod # Zig: release=fast, LTO
cd cli && make debug # Zig: no optimizations
cd cli && zig build test
```
## Test Commands
```bash
make test # all tests (Docker)
make test-unit
make test-integration
make test-e2e
make test-coverage
go test -v ./path/to/package -run TestName
go test -race ./path/to/package/...
LOG_LEVEL=debug go test -v ./path/to/package
FETCH_ML_E2E_PODMAN=1 go test ./tests/e2e/...
```
## Lint / Security
```bash
make lint
make security-scan
make configlint
make openapi-validate
go vet ./...
cd cli && zig fmt .
```
---
## Legacy Go — modernize when touching existing code only
| Legacy                     | Modern                                       |
| -------------------------- | -------------------------------------------- |
| `interface{}`              | `any`                                        |
| `for i := 0; i < n; i++`   | `for i := range n`                           |
| `[]byte(fmt.Sprintf(...))` | `fmt.Appendf(nil, ...)`                      |
| `sort.Slice` with closure  | `slices.Sort(x)` / `slices.SortFunc(x, cmp)` |
| Manual contains loop       | `slices.Contains`                            |
---
## Dependencies
- Go 1.25+, Zig 0.15+, Python 3.11+
- Redis (integration tests), Docker/Podman (container tests)