# FetchML
A lightweight ML experiment platform with a tiny Zig CLI and a Go backend. Designed for homelabs and small teams.
## Installation (recommended)
FetchML publishes pre-built release artifacts (CLI + Go services) on GitHub Releases.
If you prefer a one-shot check (recommended for most users), you can use:

```shell
./scripts/verify_release.sh --dir . --repo <org>/<repo>
```
Otherwise, verify manually:

- Download the right archive for your platform
- Verify the `checksums.txt` signature (recommended)
The release includes a signed `checksums.txt` plus `checksums.txt.sig` and `checksums.txt.cert`.
Verify the signature (keyless Sigstore) using cosign:

```shell
cosign verify-blob \
  --certificate checksums.txt.cert \
  --signature checksums.txt.sig \
  --certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  checksums.txt
```
- Verify the SHA256 checksum against `checksums.txt`
- Extract and install
Example (CLI on Linux x86_64):

```shell
# Download
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/ml-linux-x86_64.tar.gz
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.sig
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.cert

# Verify
cosign verify-blob \
  --certificate checksums.txt.cert \
  --signature checksums.txt.sig \
  --certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  checksums.txt
sha256sum -c --ignore-missing checksums.txt

# Install
tar -xzf ml-linux-x86_64.tar.gz
chmod +x ml-linux-x86_64
sudo mv ml-linux-x86_64 /usr/local/bin/ml
ml --help
```
## Quick start

```shell
# Clone and run (dev)
git clone <your-repo>
cd fetch_ml
make dev-up

# Or build the CLI locally
cd cli && make all
./zig-out/bin/ml --help
```
## What you get

- **Zig CLI (`ml`)**: Tiny, fast local client. Uses `~/.ml/config.toml` and `FETCH_ML_CLI_*` env vars.
- **Go backends**: API server, worker, and a TUI for richer remote features.
- **TUI over SSH**: `ml monitor` launches the TUI on the server, keeping the local CLI minimal.
- **CI/CD**: Cross-platform builds with `zig build-exe` and Go releases.
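The README mentions both `~/.ml/config.toml` and `FETCH_ML_CLI_*` env vars but does not state which wins. A common convention (an assumption, not confirmed here) is that an environment variable overrides the file value; the sketch below shows that pattern in plain shell, with `FETCH_ML_CLI_WORKER_HOST` as a hypothetical name built from the documented prefix:

```shell
# Sketch of env-over-config precedence. FETCH_ML_CLI_WORKER_HOST is a
# hypothetical variable name derived from the documented FETCH_ML_CLI_*
# prefix; the real names and precedence are assumptions, not from this README.
config_worker_host="127.0.0.1"   # value that would come from ~/.ml/config.toml
worker_host="${FETCH_ML_CLI_WORKER_HOST:-$config_worker_host}"
echo "effective worker_host: $worker_host"
```

Check `ml --help` or `cli/README.md` for the actual variable names.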
## CLI usage

```shell
# Configure
cat > ~/.ml/config.toml <<EOF
worker_host = "127.0.0.1"
worker_user = "dev_user"
worker_base = "/tmp/ml-experiments"
worker_port = 22
api_key = "your-api-key"
EOF

# Core commands
ml status
ml queue my-job
ml cancel my-job
ml dataset list
ml monitor   # SSH to run TUI remotely

# Research features (see docs/src/research-features.md)
ml queue train.py --hypothesis "LR scaling..." --tags ablation
ml outcome set run_abc --outcome validates --summary "Accuracy +2%"
ml find --outcome validates --tag lr-test
ml compare run_abc run_def
ml privacy set run_abc --level team
ml export run_abc --anonymize
ml dataset verify /path/to/data
```
## Phase 1 (V1) notes

- Task schema supports an optional `snapshot_id` (opaque identifier) and `dataset_specs` (structured dataset inputs). If `dataset_specs` is present, it takes precedence over the legacy `datasets`/`--datasets` args.
- Snapshot restore (S1) stages the verified `snapshot_id` into each task workspace and exposes it via `FETCH_ML_SNAPSHOT_DIR` and `FETCH_ML_SNAPSHOT_ID`. If `snapshot_store.enabled: true` is set in the worker config, the worker will pull `<prefix>/<snapshot_id>.tar.gz` from an S3-compatible store (e.g. MinIO), verify `snapshot_sha256`, and cache it under `data_dir/snapshots/sha256/<snapshot_sha256>`.
- Prewarm (best-effort) can fetch datasets for the next queued task while another task is running. Prewarm state is surfaced in `ml status --json` under the optional `prewarm` field.
- Env prewarm (best-effort) can build a warmed Podman image keyed by `deps_manifest_sha256` and reuse it for later tasks.
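As a sketch of how a task script might consume the snapshot staging described above: only the two environment variable names come from the notes; the fallback behavior and messages are illustrative assumptions.

```shell
#!/usr/bin/env sh
# Hypothetical task-side consumer of the snapshot env vars. The worker sets
# FETCH_ML_SNAPSHOT_DIR / FETCH_ML_SNAPSHOT_ID when a snapshot was staged;
# everything else in this script is an illustrative sketch.
if [ -n "${FETCH_ML_SNAPSHOT_DIR:-}" ] && [ -d "${FETCH_ML_SNAPSHOT_DIR:-}" ]; then
    start_mode="snapshot"
    echo "resuming from snapshot ${FETCH_ML_SNAPSHOT_ID:-?} staged at ${FETCH_ML_SNAPSHOT_DIR}"
else
    start_mode="fresh"
    echo "no snapshot staged; starting fresh"
fi
```

A guard like this keeps the same script usable both with and without a staged snapshot.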
## Changelog
See CHANGELOG.md.
## Build

### Native C++ Libraries (Optional)

FetchML includes optional C++ native libraries for performance. See docs/src/native-libraries.md for detailed build instructions.

Quick start:

```shell
make native-build             # Build native libs
make native-smoke             # Run smoke test
go build -tags native_libs    # Enable native libraries
```
### Standard Build

```shell
# CLI (Zig)
cd cli && make all   # release-small
make tiny            # extra-small
make fast            # release-fast

# Go backends
make cross-platform  # builds for Linux/macOS/Windows
```
## Deploy

- Dev: `docker-compose up -d`
- Prod: Use the provided systemd units or containers on Rocky Linux.
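For orientation, a minimal sketch of what a production systemd unit typically looks like; the unit name `fetchml-api.service`, binary path, and user below are assumptions, not taken from the repo's provided units, so prefer the shipped units for real deployments.

```ini
# /etc/systemd/system/fetchml-api.service -- illustrative sketch only;
# names and paths are hypothetical. Use the units shipped with the repo.
[Unit]
Description=FetchML API server (sketch)
After=network-online.target
Wants=network-online.target

[Service]
User=fetchml
ExecStart=/usr/local/bin/fetchml-api
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Once the binary and user exist, enable it with `systemctl enable --now fetchml-api.service`.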
## Docs

See docs/ for detailed guides:

- `docs/src/native-libraries.md` – Native C++ libraries (build, test, deploy)
- `docs/src/zig-cli.md` – CLI reference
- `docs/src/quick-start.md` – Full setup guide
- `docs/src/deployment.md` – Production deployment
- `docs/src/research-features.md` – Research workflow features (narrative capture, outcomes, search)
- `docs/src/privacy-security.md` – Privacy levels, PII detection, anonymized export
## CLI Architecture (2026-02)

The Zig CLI has been refactored for improved maintainability:

- Modular 3-layer architecture: `core/` (foundation), `local/`/`server/` (mode-specific), `commands/` (routers)
- Unified context: `core.context.Context` handles mode detection, output formatting, and dispatch
- Code reduction: `experiment.zig` reduced from 836 to 348 lines (58% reduction)
- Bug fixes: resolved 15+ compilation errors across multiple commands
See cli/README.md for detailed architecture documentation.
## Source code
The FetchML source code is intentionally not hosted on GitHub.
The canonical source repository is available at: <SOURCE_REPO_URL>.
## License
FetchML is source-available for transparency and auditability. It is not open-source.
See LICENSE.