FetchML
A lightweight ML experiment platform with a tiny Zig CLI and a Go backend. Designed for homelabs and small teams.
Installation (recommended)
FetchML publishes pre-built release artifacts (CLI + Go services) on GitHub Releases.
If you prefer a one-shot check (recommended for most users), you can use:
./scripts/verify_release.sh --dir . --repo <org>/<repo>
- Download the right archive for your platform.
- Verify the checksums.txt signature (recommended).
  The release includes a signed checksums.txt plus:
  - checksums.txt.sig
  - checksums.txt.cert
  Verify the signature (keyless Sigstore) using cosign:
      cosign verify-blob \
        --certificate checksums.txt.cert \
        --signature checksums.txt.sig \
        --certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
        --certificate-oidc-issuer https://token.actions.githubusercontent.com \
        checksums.txt
- Verify the SHA256 checksum against checksums.txt.
- Extract and install.
Example (CLI on Linux x86_64):
# Download
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/ml-linux-x86_64.tar.gz
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.sig
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.cert
# Verify
cosign verify-blob \
--certificate checksums.txt.cert \
--signature checksums.txt.sig \
--certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
--certificate-oidc-issuer https://token.actions.githubusercontent.com \
checksums.txt
sha256sum -c --ignore-missing checksums.txt
# Install
tar -xzf ml-linux-x86_64.tar.gz
chmod +x ml-linux-x86_64
sudo mv ml-linux-x86_64 /usr/local/bin/ml
ml --help
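For the "download the right archive" step, the archive name can be derived from the local platform. The following is a minimal sketch, not part of the release tooling: archive_name is a hypothetical helper, and only ml-linux-x86_64.tar.gz is confirmed by the example above. Names for other platforms assume the same ml-<os>-<arch>.tar.gz pattern, so check the release page before relying on them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a release archive name from an OS/arch pair.
# Assumption: archives follow the pattern ml-<os>-<arch>.tar.gz. Only the
# Linux x86_64 name is shown in the release example; others are guesses.
archive_name() {
  local os arch
  # Default to the local platform when no arguments are given.
  os="$(printf '%s' "${1:-$(uname -s)}" | tr '[:upper:]' '[:lower:]')"
  arch="${2:-$(uname -m)}"
  # Normalize a common alias for 64-bit x86.
  case "$arch" in
    amd64) arch="x86_64" ;;
  esac
  printf 'ml-%s-%s.tar.gz\n' "$os" "$arch"
}
```

With this in place, the download line in the example could use "$(archive_name)" in place of the hard-coded file name.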
Quick start
# Clone and run (dev)
git clone <your-repo>
cd fetch_ml
make dev-up
# Or build the CLI locally
cd cli && make all
./zig-out/bin/ml --help
What you get
- Zig CLI (ml): Tiny, fast local client. Uses ~/.ml/config.toml and FETCH_ML_CLI_* env vars.
- Go backends: API server, worker, and a TUI for richer remote features.
- TUI over SSH: ml monitor launches the TUI on the server, keeping the local CLI minimal.
- CI/CD: Cross-platform builds with zig build-exe and Go releases.
CLI usage
# Configure (create the config directory first so the redirect does not fail)
mkdir -p ~/.ml
cat > ~/.ml/config.toml <<EOF
worker_host = "127.0.0.1"
worker_user = "dev_user"
worker_base = "/tmp/ml-experiments"
worker_port = 22
api_key = "your-api-key"
EOF
# Core commands
ml status
ml queue my-job
ml cancel my-job
ml dataset list
ml monitor # SSH to run TUI remotely
Phase 1 (V1) notes
- Task schema supports optional snapshot_id (opaque identifier) and dataset_specs (structured dataset inputs). If dataset_specs is present, it takes precedence over legacy datasets / --datasets args.
- Snapshot restore (S1) stages verified snapshot_id into each task workspace and exposes it via FETCH_ML_SNAPSHOT_DIR and FETCH_ML_SNAPSHOT_ID. If snapshot_store.enabled: true in the worker config, the worker will pull <prefix>/<snapshot_id>.tar.gz from an S3-compatible store (e.g. MinIO), verify snapshot_sha256, and cache it under data_dir/snapshots/sha256/<snapshot_sha256>.
- Prewarm (best-effort) can fetch datasets for the next queued task while another task is running. Prewarm state is surfaced in ml status --json under the optional prewarm field.
- Env prewarm (best-effort) can build a warmed Podman image keyed by deps_manifest_sha256 and reuse it for later tasks.
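The verify-then-cache behaviour described for snapshot restore can be illustrated with a small sketch. This is not the worker's implementation: snapshot_cache_path and verify_snapshot are hypothetical helper names, and the sketch covers only the digest check and the documented cache layout, not the download from the S3-compatible store.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the documented behaviour, not the worker's code.
# Both function names are hypothetical.

# Print the cache location for a snapshot digest under a data directory,
# following the documented layout data_dir/snapshots/sha256/<snapshot_sha256>.
snapshot_cache_path() {
  local data_dir="$1" sha256="$2"
  printf '%s/snapshots/sha256/%s\n' "$data_dir" "$sha256"
}

# Succeed only if the archive's SHA256 digest equals the expected value.
# A missing or unreadable archive yields an empty digest and fails the check.
verify_snapshot() {
  local archive="$1" expected="$2" actual
  actual="$(sha256sum "$archive" 2>/dev/null | awk '{print $1}')"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

A caller would run verify_snapshot on the downloaded <snapshot_id>.tar.gz and only move it to snapshot_cache_path when the check succeeds.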
Changelog
See CHANGELOG.md.
Build
Native C++ Libraries (Optional Performance Optimization)
FetchML includes optional C++ native libraries for performance-critical operations:
- dataset_hash: mmap + SIMD SHA256 hashing (78% syscall reduction)
- queue_index: Binary index format (96% syscall reduction)
- artifact_scanner: Fast directory traversal (87% syscall reduction)
- streaming_io: Parallel gzip extraction (95% syscall reduction)
Requirements: CMake 3.15+, C++17 compiler, zlib
# Build native libraries
make native-build # Development build
make native-release # Production optimized (-O3)
make native-debug # Debug build with ASan
# Enable native libraries at runtime
export FETCHML_NATIVE_LIBS=1
# Build Go binaries with native library support
make prod-with-native # Copies .so/.dylib files to bin/
Deployment: Ship the native libraries alongside your Go binaries:
- Linux: lib*.so files
- macOS: lib*.dylib files
The libraries are loaded dynamically via cgo. If not found, FetchML falls back to pure Go implementations.
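Before deploying, it can help to check which path a given bin directory is likely to take. The sketch below is a pre-flight check under stated assumptions, not the actual loader logic: native_mode is a hypothetical helper, it assumes native libraries are used only when FETCHML_NATIVE_LIBS=1 is set, and it only looks for lib*.so or lib*.dylib files next to the binaries.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: report whether a directory looks ready
# for native libraries or would fall back to pure Go.
# Assumption: native libraries are only considered when FETCHML_NATIVE_LIBS=1.
native_mode() {
  local bin_dir="$1" f
  if [ "${FETCHML_NATIVE_LIBS:-0}" != "1" ]; then
    echo "pure-go"
    return 0
  fi
  # An unmatched glob stays literal, so the -e test simply fails for it.
  for f in "$bin_dir"/lib*.so "$bin_dir"/lib*.dylib; do
    if [ -e "$f" ]; then
      echo "native"
      return 0
    fi
  done
  echo "pure-go"
}
```

Running native_mode bin after make prod-with-native gives a quick indication of whether the shared libraries were copied alongside the Go binaries.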
Standard Build
# CLI (Zig)
cd cli && make all # release-small
make tiny # extra-small
make fast # release-fast
# Go backends
make cross-platform # builds for Linux/macOS/Windows
Deploy
- Dev: docker-compose up -d
- Prod: Use the provided systemd units or containers on Rocky Linux.
Docs
See docs/ for detailed guides:
- docs/src/zig-cli.md – CLI reference
- docs/src/quick-start.md – Full setup guide
- docs/src/deployment.md – Production deployment
Source code
The FetchML source code is intentionally not hosted on GitHub.
The canonical source repository is available at: <SOURCE_REPO_URL>.
Contributing
Contributions are welcome.
- Questions / bug reports: Use GitHub Issues: <GITHUB_ISSUES_URL>. Include:
  - how to reproduce
  - expected vs actual behavior
  - logs/config snippets (sanitize secrets)
  - OS + versions (Go, Zig, Podman/Docker if relevant)
- Changes: Submit a patch in a GitHub issue.
  - Create a topic branch.
  - Run tests/linters.
  - Export your change as either:
    - a patch series: git format-patch -N origin/main, or
    - a single bundle: git bundle create fetchml.bundle origin/main..HEAD
  - Attach the generated files to a GitHub issue at <GITHUB_ISSUES_URL>.
License
FetchML is source-available for transparency and auditability. It is not open-source.
See LICENSE.