fetch_ml/README.md
Jeremie Fraeys 799afb9efa
docs: update coverage map and development documentation
Comprehensive documentation updates for 100% test coverage:

- TEST_COVERAGE_MAP.md: 49/49 requirements marked complete (100% coverage)
- CHANGELOG.md: Document Phase 8 test coverage implementation
- DEVELOPMENT.md: Add testing strategy and property-based test guidelines
- README.md: Add Testing & Security section with coverage highlights

All security and reproducibility requirements now tracked and tested
2026-02-23 20:26:13 -05:00

# FetchML
A lightweight ML experiment platform with a tiny Zig CLI and a Go backend. Designed for homelabs and small teams.
## Installation (recommended)
FetchML publishes pre-built release artifacts (CLI + Go services) on GitHub Releases.
For a one-shot check (recommended for most users), run the verification script:
```bash
./scripts/verify_release.sh --dir . --repo <org>/<repo>
```
To verify and install manually instead, follow these steps:
1) Download the right archive for your platform
2) Verify `checksums.txt` signature (recommended)
The release includes a signed `checksums.txt` plus:
- `checksums.txt.sig`
- `checksums.txt.cert`
Verify the signature (keyless Sigstore) using cosign:
```bash
cosign verify-blob \
--certificate checksums.txt.cert \
--signature checksums.txt.sig \
--certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
--certificate-oidc-issuer https://token.actions.githubusercontent.com \
checksums.txt
```
3) Verify the SHA256 checksum against `checksums.txt`
4) Extract and install
Example (CLI on Linux x86_64):
```bash
# Download
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/ml-linux-x86_64.tar.gz
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.sig
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.cert
# Verify
cosign verify-blob \
--certificate checksums.txt.cert \
--signature checksums.txt.sig \
--certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
--certificate-oidc-issuer https://token.actions.githubusercontent.com \
checksums.txt
sha256sum -c --ignore-missing checksums.txt
# Install
tar -xzf ml-linux-x86_64.tar.gz
chmod +x ml-linux-x86_64
sudo mv ml-linux-x86_64 /usr/local/bin/ml
ml --help
```
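The checksum step above checks every listed file that happens to be present. To check one specific archive and fail loudly when it has no entry at all, a small helper along these lines may be useful. This is a minimal sketch, not part of FetchML: the function name `verify_one` is hypothetical, and it assumes `checksums.txt` uses the standard `sha256sum` output format.

```bash
# verify_one FILE [CHECKSUMS]   (hypothetical helper, not shipped with FetchML)
# Succeeds only if FILE's SHA-256 matches its entry in CHECKSUMS.
# The function body is a subshell, so its variables do not leak into the caller.
verify_one() (
  file="$1"
  sums="${2:-checksums.txt}"
  name="$(basename "$file")"

  # Lines look like "<hex>  <name>" (or "<hex> *<name>" in binary mode).
  expected="$(awk -v f="$name" '$2 == f || $2 == ("*" f) { print $1; exit }' "$sums")"
  if [ -z "$expected" ]; then
    echo "verify_one: no entry for $name in $sums" >&2
    exit 2
  fi

  actual="$(sha256sum "$file" | awk '{ print $1 }')"
  if [ "$expected" != "$actual" ]; then
    echo "verify_one: checksum mismatch for $name" >&2
    exit 1
  fi
  echo "$name: OK"
)
```

For example, `verify_one ml-linux-x86_64.tar.gz && tar -xzf ml-linux-x86_64.tar.gz` only extracts the archive once its checksum has matched. The check is only meaningful after the `cosign verify-blob` step has succeeded for `checksums.txt` itself.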
## Quick start
```bash
# Clone and run (dev)
git clone <your-repo>
cd fetch_ml
make dev-up
# Or build the CLI locally
cd cli && make all
./zig-out/bin/ml --help
```
## What you get
- **Zig CLI** (`ml`): Tiny, fast local client. Uses `~/.ml/config.toml` and `FETCH_ML_CLI_*` env vars.
- **Go backends**: API server, worker, and a TUI for richer remote features.
- **TUI over SSH**: `ml monitor` launches the TUI on the server, keeping the local CLI minimal.
- **CI/CD**: Cross-platform builds with `zig build-exe` and Go releases.
## Testing & Security
FetchML maintains **100% requirement coverage** (49/49 tracked requirements have tests) across its security and reproducibility controls:
- **Unit tests**: 150+ tests covering security, reproducibility, and core functionality
- **Property-based tests**: gopter-based invariant verification
- **Integration tests**: Cross-tenant isolation, audit verification, PHI redaction
- **Fault injection**: Prepared tests for toxiproxy integration
- **Custom lint analyzers**: `fetchml-vet` enforces security at compile time
See `docs/TEST_COVERAGE_MAP.md` for detailed coverage tracking and `DEVELOPMENT.md` for testing guidelines.
## CLI usage
```bash
# Configure (create the config directory first; the redirect fails if it is missing)
mkdir -p ~/.ml
cat > ~/.ml/config.toml <<EOF
worker_host = "127.0.0.1"
worker_user = "dev_user"
worker_base = "/tmp/ml-experiments"
worker_port = 22
api_key = "your-api-key"
EOF
# Core commands
ml status
ml queue my-job
ml cancel my-job
ml dataset list
ml monitor # SSH to run TUI remotely
# Research features (see docs/src/research-features.md)
ml queue train.py --hypothesis "LR scaling..." --tags ablation
ml outcome set run_abc --outcome validates --summary "Accuracy +2%"
ml find --outcome validates --tag lr-test
ml compare run_abc run_def
ml privacy set run_abc --level team
ml export run_abc --anonymize
ml dataset verify /path/to/data
```
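Because `config.toml` holds an API key, it may be worth creating it with owner-only permissions. The following is a sketch under assumptions: the helper name `write_ml_config` is hypothetical, the keys and values are the ones from the example above, and nothing here reflects how the `ml` CLI itself treats file modes.

```bash
# write_ml_config API_KEY [CONFIG_DIR]   (hypothetical helper, not part of the ml CLI)
# Writes config.toml readable only by the current user.
# The function body is a subshell, so the umask change does not affect the caller.
write_ml_config() (
  api_key="$1"
  config_dir="${2:-$HOME/.ml}"

  umask 077               # files created below get 0600, directories 0700
  mkdir -p "$config_dir"  # an already existing directory keeps its current mode
  cat > "$config_dir/config.toml" <<EOF
worker_host = "127.0.0.1"
worker_user = "dev_user"
worker_base = "/tmp/ml-experiments"
worker_port = 22
api_key = "$api_key"
EOF
  chmod 600 "$config_dir/config.toml"  # also tightens a file that already existed
)
```

Calling `write_ml_config "your-api-key"` writes `~/.ml/config.toml`. Passing a second argument writes to a different directory, which is convenient for trying the helper without touching a real configuration.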
## Phase 1 (V1) notes
- **Task schema** supports optional `snapshot_id` (opaque identifier) and `dataset_specs` (structured dataset inputs). If `dataset_specs` is present, it takes precedence over the legacy `datasets` / `--datasets` args.
- **Snapshot restore (S1)** stages the verified snapshot identified by `snapshot_id` into each task workspace and exposes it via `FETCH_ML_SNAPSHOT_DIR` and `FETCH_ML_SNAPSHOT_ID`. If `snapshot_store.enabled: true` is set in the worker config, the worker pulls `<prefix>/<snapshot_id>.tar.gz` from an S3-compatible store (e.g. MinIO), verifies it against `snapshot_sha256`, and caches it under `data_dir/snapshots/sha256/<snapshot_sha256>`.
- **Prewarm (best-effort)** can fetch datasets for the next queued task while another task is running. Prewarm state is surfaced in `ml status --json` under the optional `prewarm` field.
- **Env prewarm (best-effort)** can build a warmed Podman image keyed by `deps_manifest_sha256` and reuse it for later tasks.
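The verify-then-cache flow described for snapshot restore can be sketched in shell as follows. This illustrates the idea only and is not the worker's Go implementation. The function name `stage_snapshot` is hypothetical, and the sketch assumes that `snapshot_sha256` is the digest of the `.tar.gz` archive and that the cache directory holds the unpacked contents.

```bash
# stage_snapshot ARCHIVE SNAPSHOT_SHA256 DATA_DIR   (hypothetical sketch)
# Verifies ARCHIVE against SNAPSHOT_SHA256, unpacks it into
# DATA_DIR/snapshots/sha256/<SNAPSHOT_SHA256>, and prints that directory.
stage_snapshot() (
  archive="$1"
  want="$2"
  data_dir="$3"
  cache="$data_dir/snapshots/sha256/$want"

  # Cache hit: this digest was already verified and unpacked.
  if [ -d "$cache" ]; then
    echo "$cache"
    exit 0
  fi

  got="$(sha256sum "$archive" | awk '{ print $1 }')"
  if [ "$got" != "$want" ]; then
    echo "stage_snapshot: sha256 mismatch (got $got)" >&2
    exit 1
  fi

  # Unpack into a temporary sibling and rename afterwards, so a partial
  # extract is never mistaken for a cache hit.
  mkdir -p "$data_dir/snapshots/sha256"
  tmp="$(mktemp -d "$cache.tmp.XXXXXX")" || exit 1
  tar -xzf "$archive" -C "$tmp" || { rm -rf "$tmp"; exit 1; }
  mv "$tmp" "$cache" || exit 1
  echo "$cache"
)
```

A task wrapper could then export the result, for example `FETCH_ML_SNAPSHOT_DIR="$(stage_snapshot snap.tar.gz "$sha" /var/lib/ml)"`, where the archive path and data directory are placeholders. Keying the cache by digest means that a second call with the same digest returns the cached directory without reading the archive again.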
## Changelog
See `CHANGELOG.md`.
## Build
### Native C++ Libraries (Optional)
FetchML includes optional C++ native libraries for performance. See `docs/src/native-libraries.md` for detailed build instructions.
Quick start:
```bash
make native-build # Build native libs
make native-smoke # Run smoke test
go build -tags native_libs # Enable native libraries
```
### Standard Build
```bash
# CLI (Zig), built from cli/
cd cli
make all   # release-small
make tiny  # extra-small
make fast  # release-fast
# Go backends (assumed to be built from the repository root)
cd ..
make cross-platform # builds for Linux/macOS/Windows
```
## Deploy
- **Dev**: `docker-compose up -d`
- **Prod**: Use the provided systemd units or containers on Rocky Linux.
## Docs
See `docs/` for detailed guides:
- `docs/src/native-libraries.md`: Native C++ libraries (build, test, deploy)
- `docs/src/zig-cli.md`: CLI reference
- `docs/src/quick-start.md`: Full setup guide
- `docs/src/deployment.md`: Production deployment
- `docs/src/research-features.md`: Research workflow features (narrative capture, outcomes, search)
- `docs/src/privacy-security.md`: Privacy levels, PII detection, anonymized export
## CLI Architecture (2026-02)
The Zig CLI has been refactored for improved maintainability:
- **Modular 3-layer architecture**: `core/` (foundation), `local/`/`server/` (mode-specific), `commands/` (routers)
- **Unified context**: `core.context.Context` handles mode detection, output formatting, and dispatch
- **Code reduction**: `experiment.zig` reduced from 836 to 348 lines (58% reduction)
- **Bug fixes**: Resolved 15+ compilation errors across multiple commands
See `cli/README.md` for detailed architecture documentation.
## Source code
The FetchML source code is intentionally not hosted on GitHub.
The canonical source repository is available at: `<SOURCE_REPO_URL>`.
## License
FetchML is source-available for transparency and auditability. It is not open-source.
See `LICENSE`.