
# FetchML

A lightweight ML experiment platform with a tiny Zig CLI and a Go backend. Designed for homelabs and small teams.

## Installation (recommended)

FetchML publishes pre-built release artifacts (CLI + Go services) on GitHub Releases.

For a one-shot check (recommended for most users), run:

```bash
./scripts/verify_release.sh --dir . --repo <org>/<repo>
```

To verify and install manually:

1) Download the right archive for your platform

2) Verify the `checksums.txt` signature (recommended)

The release includes a signed `checksums.txt` plus:

- `checksums.txt.sig`
- `checksums.txt.cert`

Verify the signature (keyless Sigstore) using cosign:

```bash
cosign verify-blob \
  --certificate checksums.txt.cert \
  --signature checksums.txt.sig \
  --certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  checksums.txt
```

3) Verify the SHA256 checksum against `checksums.txt`

4) Extract and install

Example (CLI on Linux x86_64):

```bash
# Download
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/ml-linux-x86_64.tar.gz
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.sig
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.cert

# Verify
cosign verify-blob \
  --certificate checksums.txt.cert \
  --signature checksums.txt.sig \
  --certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  checksums.txt
sha256sum -c --ignore-missing checksums.txt

# Install
tar -xzf ml-linux-x86_64.tar.gz
chmod +x ml-linux-x86_64
sudo mv ml-linux-x86_64 /usr/local/bin/ml

ml --help
```
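
`--ignore-missing` is a GNU coreutils option, so the `sha256sum -c` line above may not work on systems that ship a different SHA-256 tool. As a rough sketch of step 3 for those systems, the helper below compares one file against `checksums.txt` by hand. `verify_checksum` is a hypothetical name, not part of FetchML, and the sketch assumes `checksums.txt` uses the usual `<sha256>  <filename>` line format with no spaces in filenames:

```bash
# Hypothetical helper (not shipped with FetchML): verify one file against checksums.txt.
# Assumes each line of checksums.txt looks like "<sha256>  <filename>".
verify_checksum() {
  local file="$1"
  local sums="${2:-checksums.txt}"
  local actual expected

  # Use whichever SHA-256 tool is available on this system.
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  elif command -v shasum >/dev/null 2>&1; then
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  else
    echo "no SHA-256 tool found" >&2
    return 2
  fi

  # Look up the expected hash for this exact filename
  # (a leading "*" marks binary mode in some checksum files and is stripped).
  expected=$(awk -v f="$(basename "$file")" \
    '{ n = $2; sub(/^\*/, "", n); if (n == f) { print $1; exit } }' "$sums")
  if [ -z "$expected" ]; then
    echo "$file: not listed in $sums" >&2
    return 1
  fi

  if [ "$actual" = "$expected" ]; then
    echo "$file: OK"
  else
    echo "$file: FAILED" >&2
    return 1
  fi
}
```

For example, `verify_checksum ml-linux-x86_64.tar.gz` prints `ml-linux-x86_64.tar.gz: OK` on a match and returns non-zero otherwise. This only checks file integrity; the cosign step is still what ties `checksums.txt` to the release workflow.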

## Quick start

```bash
# Clone and run (dev)
git clone <your-repo>
cd fetch_ml
make dev-up

# Or build the CLI locally
cd cli && make all
./zig-out/bin/ml --help
```

## What you get

- **Zig CLI** (`ml`): Tiny, fast local client. Uses `~/.ml/config.toml` and `FETCH_ML_CLI_*` env vars.
- **Go backends**: API server, worker, and a TUI for richer remote features.
- **TUI over SSH**: `ml monitor` launches the TUI on the server, keeping the local CLI minimal.
- **CI/CD**: Cross-platform builds with `zig build-exe` and Go releases.

## CLI usage

```bash
# Configure (the file holds your API key, so keep it private)
mkdir -p ~/.ml
cat > ~/.ml/config.toml <<'EOF'
worker_host = "127.0.0.1"
worker_user = "dev_user"
worker_base = "/tmp/ml-experiments"
worker_port = 22
api_key = "your-api-key"
EOF
chmod 600 ~/.ml/config.toml

# Core commands
ml status
ml queue my-job
ml cancel my-job
ml dataset list
ml monitor  # SSH to run TUI remotely
```

## Phase 1 (V1) notes

- **Task schema** supports optional `snapshot_id` (opaque identifier) and `dataset_specs` (structured dataset inputs). If `dataset_specs` is present, it takes precedence over the legacy `datasets` / `--datasets` args.
- **Snapshot restore (S1)** stages the verified snapshot identified by `snapshot_id` into each task workspace and exposes it via `FETCH_ML_SNAPSHOT_DIR` and `FETCH_ML_SNAPSHOT_ID`. If `snapshot_store.enabled: true` is set in the worker config, the worker pulls `<prefix>/<snapshot_id>.tar.gz` from an S3-compatible store (e.g. MinIO), verifies it against `snapshot_sha256`, and caches it under `data_dir/snapshots/sha256/<snapshot_sha256>`.
- **Prewarm (best-effort)** can fetch datasets for the next queued task while another task is running. Prewarm state is surfaced in `ml status --json` under the optional `prewarm` field.
- **Env prewarm (best-effort)** can build a warmed Podman image keyed by `deps_manifest_sha256` and reuse it for later tasks.
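
From inside a task, the staged snapshot can be consumed through `FETCH_ML_SNAPSHOT_DIR` and `FETCH_ML_SNAPSHOT_ID`. A minimal sketch, using a hypothetical helper name `snapshot_path` and assuming both variables are unset or empty when no snapshot was requested (an assumption, not documented behavior):

```bash
# Hypothetical task-side helper: resolve a path inside the staged snapshot.
# FETCH_ML_SNAPSHOT_DIR and FETCH_ML_SNAPSHOT_ID come from the worker;
# the function name and the layout inside the snapshot are assumptions.
snapshot_path() {
  local rel="$1"

  # Assumption: the variables are unset or empty when no snapshot was staged.
  if [ -z "${FETCH_ML_SNAPSHOT_DIR:-}" ] || [ -z "${FETCH_ML_SNAPSHOT_ID:-}" ]; then
    echo "no snapshot staged for this task" >&2
    return 1
  fi

  # Fail early with the snapshot ID in the message if the file is absent.
  if [ ! -e "$FETCH_ML_SNAPSHOT_DIR/$rel" ]; then
    echo "snapshot $FETCH_ML_SNAPSHOT_ID has no $rel" >&2
    return 1
  fi

  printf '%s\n' "$FETCH_ML_SNAPSHOT_DIR/$rel"
}
```

A task script could then do something like `weights=$(snapshot_path model/weights.bin) || exit 1`, where `model/weights.bin` is only a placeholder for whatever your snapshot contains.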

## Changelog

See `CHANGELOG.md`.

## Build

### Native C++ Libraries (Optional Performance Optimization)

FetchML includes optional C++ native libraries for performance-critical operations:
- **dataset_hash**: mmap + SIMD SHA256 hashing (78% syscall reduction)
- **queue_index**: Binary index format (96% syscall reduction)
- **artifact_scanner**: Fast directory traversal (87% syscall reduction)
- **streaming_io**: Parallel gzip extraction (95% syscall reduction)

**Requirements:** CMake 3.15+, C++17 compiler, zlib

```bash
# Build native libraries
make native-build        # Development build
make native-release      # Production optimized (-O3)
make native-debug        # Debug build with ASan

# Enable native libraries at runtime
export FETCHML_NATIVE_LIBS=1

# Build Go binaries with native library support
make prod-with-native    # Copies .so/.dylib files to bin/
```

**Deployment:** Ship the native libraries alongside your Go binaries:
- Linux: `lib*.so` files
- macOS: `lib*.dylib` files

The libraries are loaded dynamically via cgo. If they are not found, FetchML falls back to its pure Go implementations.
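
Because the fallback is silent from the shell's point of view, a quick preflight check on the deployment host can show which path to expect. The sketch below is illustrative only: `has_native_libs` and `native_mode` are hypothetical helpers, the `bin/` location follows `make prod-with-native`, and the check looks at files on disk rather than at what FetchML actually loads:

```bash
# Hypothetical preflight helpers (not part of FetchML).

# Succeeds if at least one lib*.so (Linux) or lib*.dylib (macOS) exists in the directory.
has_native_libs() {
  local dir="${1:-bin}"
  local ext f
  case "$(uname -s)" in
    Darwin) ext="dylib" ;;
    *)      ext="so" ;;
  esac
  for f in "$dir"/lib*."$ext"; do
    # An unmatched glob stays literal, so confirm the file really exists.
    if [ -e "$f" ]; then
      return 0
    fi
  done
  return 1
}

# Prints "native" when FETCHML_NATIVE_LIBS=1 and libraries are present, else "pure-go".
native_mode() {
  if [ "${FETCHML_NATIVE_LIBS:-0}" = "1" ] && has_native_libs "${1:-bin}"; then
    echo "native"
  else
    echo "pure-go"
  fi
}
```

Running `native_mode bin` before starting the services gives a rough indication of whether the native path or the pure Go fallback is likely to be used.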

### Standard Build

```bash
# CLI (Zig)
cd cli && make all      # release-small
make tiny              # extra-small
make fast              # release-fast

# Go backends
make cross-platform    # builds for Linux/macOS/Windows
```

## Deploy

- **Dev**: `docker-compose up -d`
- **Prod**: Use the provided systemd units or containers on Rocky Linux.

## Docs

See `docs/` for detailed guides:
- `docs/src/zig-cli.md` – CLI reference
- `docs/src/quick-start.md` – Full setup guide
- `docs/src/deployment.md` – Production deployment

## Source code

The FetchML source code is intentionally not hosted on GitHub.

The canonical source repository is available at: `<SOURCE_REPO_URL>`.

## Contributing

Contributions are welcome.

- **Questions / bug reports**: Use GitHub Issues: `<GITHUB_ISSUES_URL>`. Include:
  - how to reproduce
  - expected vs actual behavior
  - logs/config snippets (sanitize secrets)
  - OS + versions (Go, Zig, Podman/Docker if relevant)
- **Changes**: Submit a patch in a GitHub issue.
  - Create a topic branch.
  - Run tests/linters.
  - Export your change as either:
    - a patch series: `git format-patch -N origin/main`, or
    - a single bundle: `git bundle create fetchml.bundle origin/main..HEAD`
  - Attach the generated files to a GitHub issue at `<GITHUB_ISSUES_URL>`.
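
Before attaching a bundle, it can help to confirm that it is well-formed and to see which refs it carries. A small sketch using `git bundle verify` and `git bundle list-heads`; `check_bundle` is a hypothetical helper name, and it should be run from inside your clone so the bundle's prerequisite commits can be found:

```bash
# Hypothetical pre-submission check for a bundle (not part of FetchML).
check_bundle() {
  local bundle="${1:-fetchml.bundle}"

  # verify checks the bundle format and that its prerequisite commits
  # exist in the current repository.
  if ! git bundle verify "$bundle" >/dev/null 2>&1; then
    echo "$bundle: does not verify against this repository" >&2
    return 1
  fi

  # Show the refs contained in the bundle, one "<sha> <ref>" per line.
  git bundle list-heads "$bundle"
}
```

For a bundle created with `origin/main..HEAD`, the listed ref should be `HEAD`.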

## License

FetchML is source-available for transparency and auditability. It is not open-source.

See `LICENSE`.