
FetchML

A lightweight ML experiment platform with a tiny Zig CLI and a Go backend. Designed for homelabs and small teams.

FetchML publishes pre-built release artifacts (CLI + Go services) on GitHub Releases.

If you prefer a one-shot check (recommended for most users), you can use:

./scripts/verify_release.sh --dir . --repo <org>/<repo>
  1. Download the right archive for your platform

  2. Verify checksums.txt signature (recommended)

The release includes a signed checksums.txt plus:

  • checksums.txt.sig
  • checksums.txt.cert

Verify the signature (keyless Sigstore) using cosign:

cosign verify-blob \
  --certificate checksums.txt.cert \
  --signature checksums.txt.sig \
  --certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  checksums.txt
  3. Verify the SHA256 checksum against checksums.txt

  4. Extract and install

Example (CLI on Linux x86_64):

# Download
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/ml-linux-x86_64.tar.gz
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.sig
curl -fsSLO https://github.com/jfraeysd/fetch_ml/releases/download/<tag>/checksums.txt.cert

# Verify
cosign verify-blob \
  --certificate checksums.txt.cert \
  --signature checksums.txt.sig \
  --certificate-identity-regexp "^https://github.com/jfraeysd/fetch_ml/.forgejo/workflows/release-mirror.yml@refs/tags/v.*$" \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  checksums.txt
sha256sum -c --ignore-missing checksums.txt

# Install
tar -xzf ml-linux-x86_64.tar.gz
chmod +x ml-linux-x86_64
sudo mv ml-linux-x86_64 /usr/local/bin/ml

ml --help
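
For readers who want to see what the checksum step actually compares, here is a minimal Python sketch of the check that sha256sum -c --ignore-missing checksums.txt performs. It assumes checksums.txt uses the usual sha256sum line format (hex digest, whitespace, filename); the function names are illustrative and not part of FetchML. It complements the cosign signature check and does not replace it.

```python
import hashlib
from pathlib import Path


def parse_checksums(text):
    """Map filename -> expected SHA-256 hex digest from sha256sum-style lines."""
    entries = {}
    for line in text.splitlines():
        parts = line.strip().split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        # sha256sum prefixes the filename with "*" when it hashed in binary mode.
        if name.startswith("*"):
            name = name[1:]
        entries[name] = digest.lower()
    return entries


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(checksums_file, directory="."):
    """Return {filename: True/False} for listed files that exist in `directory`."""
    expected = parse_checksums(Path(checksums_file).read_text())
    results = {}
    for name, digest in expected.items():
        candidate = Path(directory) / name
        # Like --ignore-missing: files that were not downloaded are skipped.
        if candidate.is_file():
            results[name] = sha256_of(candidate) == digest
    return results
```

Calling verify_downloads("checksums.txt") in the download directory returns one entry per archive that is present; any False value means the archive does not match the signed checksum list and should not be installed.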

Quick start

# Clone and run (dev)
git clone <your-repo>
cd fetch_ml
make dev-up

# Or build the CLI locally
cd cli && make all
./zig-out/bin/ml --help

What you get

  • Zig CLI (ml): Tiny, fast local client. Uses ~/.ml/config.toml and FETCH_ML_CLI_* env vars.
  • Go backends: API server, worker, and a TUI for richer remote features.
  • TUI over SSH: ml monitor launches the TUI on the server, keeping the local CLI minimal.
  • CI/CD: Cross-platform builds with zig build-exe and Go releases.

CLI usage

# Configure (create the config directory first if it does not exist)
mkdir -p ~/.ml
cat > ~/.ml/config.toml <<EOF
worker_host = "127.0.0.1"
worker_user = "dev_user"
worker_base = "/tmp/ml-experiments"
worker_port = 22
api_key = "your-api-key"
EOF

# Core commands
ml status
ml queue my-job
ml cancel my-job
ml dataset list
ml monitor  # SSH to run TUI remotely

# Research features (see docs/src/research-features.md)
ml queue train.py --hypothesis "LR scaling..." --tags ablation
ml outcome set run_abc --outcome validates --summary "Accuracy +2%"
ml find --outcome validates --tag lr-test
ml compare run_abc run_def
ml privacy set run_abc --level team
ml export run_abc --anonymize
ml dataset verify /path/to/data

Phase 1 (V1) notes

  • Task schema supports optional snapshot_id (opaque identifier) and dataset_specs (structured dataset inputs). If dataset_specs is present, it takes precedence over the legacy datasets field and --datasets args.
  • Snapshot restore (S1) stages the verified snapshot identified by snapshot_id into each task workspace and exposes it via FETCH_ML_SNAPSHOT_DIR and FETCH_ML_SNAPSHOT_ID. If snapshot_store.enabled: true is set in the worker config, the worker pulls <prefix>/<snapshot_id>.tar.gz from an S3-compatible store (e.g. MinIO), verifies it against snapshot_sha256, and caches it under data_dir/snapshots/sha256/<snapshot_sha256>.
  • Prewarm (best-effort) can fetch datasets for the next queued task while another task is running. Prewarm state is surfaced in ml status --json under the optional prewarm field.
  • Env prewarm (best-effort) can build a warmed Podman image keyed by deps_manifest_sha256 and reuse it for later tasks.
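
The rules in these notes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the worker's implementation (which lives in the Go backend): the task is modeled as a plain dict, the contents of a dataset_specs entry are hypothetical, ml status --json is assumed to print a single JSON object, and every function name is invented here. Only the field names (dataset_specs, datasets, snapshot_id, snapshot_sha256, prewarm), the <prefix>/<snapshot_id>.tar.gz key, and the data_dir/snapshots/sha256/<snapshot_sha256> cache location come from the notes.

```python
import hashlib
import json
from pathlib import PurePosixPath


def effective_datasets(task):
    """dataset_specs, when present, wins over the legacy datasets list.

    Treating an empty dataset_specs list as "not present" is an assumption.
    """
    if task.get("dataset_specs"):
        return task["dataset_specs"]
    return task.get("datasets", [])


def snapshot_object_key(prefix, snapshot_id):
    """Object key pulled from the S3-compatible store: <prefix>/<snapshot_id>.tar.gz."""
    return f"{prefix.rstrip('/')}/{snapshot_id}.tar.gz"


def snapshot_cache_path(data_dir, snapshot_sha256):
    """Cache location, keyed by content hash rather than by snapshot_id."""
    return str(PurePosixPath(data_dir) / "snapshots" / "sha256" / snapshot_sha256)


def archive_matches(archive_bytes, snapshot_sha256):
    """Check downloaded archive bytes against the expected snapshot_sha256."""
    return hashlib.sha256(archive_bytes).hexdigest() == snapshot_sha256.lower()


def prewarm_state(status_json):
    """Return the optional prewarm field from status output, or None if absent."""
    return json.loads(status_json).get("prewarm")
```

Because the cache path depends only on snapshot_sha256, two snapshot IDs that point at identical content would share one cache entry in this sketch. Inside a running task, the staged snapshot is located through the FETCH_ML_SNAPSHOT_DIR and FETCH_ML_SNAPSHOT_ID environment variables rather than by recomputing these paths.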

Changelog

See CHANGELOG.md.

Build

Native C++ Libraries (Optional)

FetchML includes optional C++ native libraries for performance. See docs/src/native-libraries.md for detailed build instructions.

Quick start:

make native-build           # Build native libs
make native-smoke           # Run smoke test
go build -tags native_libs  # Enable native libraries

Standard Build

# CLI (Zig)
cd cli && make all      # release-small
make tiny              # extra-small
make fast              # release-fast

# Go backends
make cross-platform    # builds for Linux/macOS/Windows

Deploy

  • Dev: docker-compose up -d
  • Prod: Use the provided systemd units or containers on Rocky Linux.

Docs

See docs/ for detailed guides:

  • docs/src/native-libraries.md: Native C++ libraries (build, test, deploy)
  • docs/src/zig-cli.md: CLI reference
  • docs/src/quick-start.md: Full setup guide
  • docs/src/deployment.md: Production deployment
  • docs/src/research-features.md: Research workflow features (narrative capture, outcomes, search)
  • docs/src/privacy-security.md: Privacy levels, PII detection, anonymized export

CLI Architecture (2026-02)

The Zig CLI has been refactored for improved maintainability:

  • Modular 3-layer architecture: core/ (foundation), local/ and server/ (mode-specific), commands/ (routers)
  • Unified context: core.context.Context handles mode detection, output formatting, and dispatch
  • Code reduction: experiment.zig reduced from 836 to 348 lines (58% reduction)
  • Bug fixes: Resolved 15+ compilation errors across multiple commands

See cli/README.md for detailed architecture documentation.

Source code

The FetchML source code is intentionally not hosted on GitHub.

The canonical source repository is available at: <SOURCE_REPO_URL>.

License

FetchML is source-available for transparency and auditability. It is not open-source.

See LICENSE.