ML CLI
Fast CLI tool for managing ML experiments. Supports both local mode (SQLite) and server mode (WebSocket).
Build Policy
Native C++ libraries (dataset_hash, etc.) are available when building natively on any platform. Cross-compilation is supported for development on non-native targets but disables native library features.
| Build Type | Target | Native Libraries | Purpose |
|---|---|---|---|
| Native | Host platform (Linux, macOS) | Yes | Dev, staging, production |
| Cross-compile | Different arch/OS | Stubbed | Testing on foreign targets |
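One way the "stubbed" row could look in code is a compile-time switch between a native binding and a stub with the same signature. This is only a sketch: the `have_native_libs` constant, the `hash` namespace, the `NativeUnavailable` error, and the extern symbol name are all assumptions for illustration, not the project's actual build wiring.

```zig
const std = @import("std");

/// Assumption: the real build would set this from build.zig (for example via
/// a build option); it is a plain constant here to keep the sketch standalone.
const have_native_libs = false;

/// Stub used for cross-compiled builds: same signature as the native path,
/// but it always reports that the feature is unavailable.
const stub = struct {
    pub fn datasetHash(path: []const u8) error{NativeUnavailable}!u64 {
        _ = path;
        return error.NativeUnavailable;
    }
};

/// Placeholder for the native binding. The extern symbol name is hypothetical.
const native = struct {
    extern fn fetchml_dataset_hash(path: [*]const u8, len: usize) u64;

    pub fn datasetHash(path: []const u8) error{NativeUnavailable}!u64 {
        return fetchml_dataset_hash(path.ptr, path.len);
    }
};

/// Resolved at compile time, so only the selected implementation is analyzed
/// and the extern symbol is never required when the stub is chosen.
pub const hash = if (have_native_libs) native else stub;

pub fn main() void {
    if (hash.datasetHash("data/train")) |digest| {
        std.debug.print("hash: {d}\n", .{digest});
    } else |err| {
        std.debug.print("native hashing disabled: {s}\n", .{@errorName(err)});
    }
}
```

Callers handle one error in both build types instead of branching on the target.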
Native Build (Recommended)
Builds on the host platform with full native library support:
zig build -Doptimize=ReleaseSmall
Cross-Compile (Dev Only)
For testing on different architectures without native library support:
zig build -Dtarget=x86_64-linux-gnu # from macOS/Windows
Architecture
The CLI follows a modular 3-layer architecture for maintainability:
src/
├── core/                       # Shared foundation
│   ├── context.zig             # Execution context (allocator, config, mode dispatch)
│   ├── output.zig              # Unified JSON/text output helpers
│   └── flags.zig               # Common flag parsing
├── local/                      # Local mode operations (SQLite)
│   └── experiment_ops.zig      # Experiment CRUD for local DB
├── server/                     # Server mode operations (WebSocket)
│   └── experiment_api.zig      # Experiment API for remote server
├── commands/                   # Thin command routers
│   ├── experiment.zig          # ~100 lines (was 887)
│   ├── queue.zig               # Job submission
│   └── queue/                  # Queue submodules
│       ├── parse.zig           # Job template parsing
│       ├── validate.zig        # Validation logic
│       └── submit.zig          # Job submission
└── utils/                      # Utilities (21 files)
Mode Dispatch Pattern
Commands auto-detect local vs server mode using core.context.Context:
var ctx = core.context.Context.init(allocator, cfg, flags.json);
if (ctx.isLocal()) {
    return try local.experiment.list(ctx.allocator, ctx.json_output);
} else {
    return try server.experiment.list(ctx.allocator, ctx.json_output);
}
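The mode check behind `isLocal()` can be pictured as a scheme test on the configured URI. The sketch below is a minimal stand-in, not the real `core.context.Context`: the `Mode` enum, `detectMode`, the `UnknownScheme` error, and the reduced field set are assumptions, and only the `sqlite://` and `wss://` schemes come from this README.

```zig
const std = @import("std");

/// Execution mode inferred from the URI scheme.
pub const Mode = enum { local, server };

/// Hypothetical helper: maps a tracking URI to a mode.
pub fn detectMode(tracking_uri: []const u8) error{UnknownScheme}!Mode {
    if (std.mem.startsWith(u8, tracking_uri, "sqlite://")) return .local;
    if (std.mem.startsWith(u8, tracking_uri, "wss://")) return .server;
    return error.UnknownScheme;
}

/// Trimmed-down stand-in for core.context.Context (fields are assumptions).
pub const Context = struct {
    mode: Mode,
    json_output: bool,

    pub fn init(tracking_uri: []const u8, json_output: bool) error{UnknownScheme}!Context {
        return Context{
            .mode = try detectMode(tracking_uri),
            .json_output = json_output,
        };
    }

    pub fn isLocal(self: Context) bool {
        return self.mode == .local;
    }
};
```

Returning an error for an unrecognized scheme, rather than defaulting to one mode, keeps a mistyped URI from silently sending a command to the wrong backend.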
Quick Start
# 1. Build
zig build
# 2. Initialize local tracking (creates fetch_ml.db)
./zig-out/bin/ml init
# 3. Create experiment and run locally
./zig-out/bin/ml experiment create --name "baseline"
./zig-out/bin/ml run start --experiment <id> --name "run-1"
./zig-out/bin/ml experiment log --run <id> --name loss --value 0.5
./zig-out/bin/ml run finish --run <id>
Commands
Local Mode Commands (SQLite)
- ml init - Initialize local experiment tracking database
- ml experiment create --name <name> - Create experiment locally
- ml experiment list - List experiments from SQLite
- ml experiment log --run <id> --name <key> --value <val> - Log metrics
- ml run start --experiment <id> [--name <name>] - Start a run
- ml run finish --run <id> - Mark run as finished
- ml run fail --run <id> - Mark run as failed
- ml run list - List all runs
Server Mode Commands (WebSocket)
- ml sync <path> - Sync project to server
- ml queue <job1> [job2 ...] [--commit <id>] [--priority N] [--note <text>] - Queue jobs
- ml status - Check system/queue status
- ml validate <commit_id> [--json] [--task <task_id>] - Validate provenance
- ml cancel <job> - Cancel a running/queued job
Shared Commands (Auto-detect Mode)
- ml experiment log|show|list|delete - Works in both local and server mode
- ml monitor - Launch TUI (local SQLite or remote SSH)
Notes:
- Commands auto-detect mode from config (sqlite:// vs wss://)
- --json mode is designed to be pipe-friendly
Core Modules
core.context
Provides unified execution context for all commands:
- Mode detection: Automatically detects local (SQLite) vs server (WebSocket) mode
- Output handling: JSON vs text output based on the --json flag
- Dispatch helpers: ctx.dispatch(local_fn, server_fn, args) for mode-specific implementations
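const std = @import("std");
const core = @import("../core.zig");
// config, local, and server are assumed to be imported elsewhere in this file.

pub fn execute(allocator: std.mem.Allocator, args: []const []const u8) !void {
    // Parse shared flags (such as --json); remaining args are not needed in this excerpt.
    var flags = core.flags.CommonFlags{};
    _ = try core.flags.parseCommon(allocator, args, &flags);

    const cfg = try config.Config.load(allocator);
    var ctx = core.context.Context.init(allocator, cfg, flags.json);
    defer ctx.deinit();

    // Dispatch to local or server implementation
    if (ctx.isLocal()) {
        return try local.experiment.list(ctx.allocator, ctx.json_output);
    } else {
        return try server.experiment.list(ctx.allocator, ctx.json_output);
    }
}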
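The `ctx.dispatch(local_fn, server_fn, args)` helper mentioned above could be built on Zig's `@call` builtin. The following is a sketch under assumptions: the reduced `Context`, the `Mode` enum, and the `localList`/`serverList` functions are placeholders, and the real helper may differ in signature and error handling.

```zig
const std = @import("std");

const Mode = enum { local, server };

/// Minimal stand-in for core.context.Context; the real struct has more fields.
const Context = struct {
    mode: Mode,

    /// Calls `local_fn` or `server_fn` with the same argument tuple, depending
    /// on the active mode. Both functions must share a return type.
    pub fn dispatch(
        self: Context,
        comptime local_fn: anytype,
        comptime server_fn: anytype,
        args: anytype,
    ) @TypeOf(@call(.auto, local_fn, args)) {
        return switch (self.mode) {
            .local => @call(.auto, local_fn, args),
            .server => @call(.auto, server_fn, args),
        };
    }
};

// Placeholder implementations standing in for local/server operations.
fn localList(json: bool) []const u8 {
    return if (json) "local-json" else "local-text";
}

fn serverList(json: bool) []const u8 {
    return if (json) "server-json" else "server-text";
}

pub fn main() void {
    const ctx = Context{ .mode = .server };
    std.debug.print("{s}\n", .{ctx.dispatch(localList, serverList, .{false})});
}
```

With a helper of this shape, a command router reduces to a single `ctx.dispatch(...)` call instead of repeating the `if (ctx.isLocal())` branch.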
core.output
Unified output helpers that respect the --json flag:
core.output.errorMsg("command", "Error message"); // JSON: {"success":false,...}
core.output.success("command"); // JSON: {"success":true,...}
core.output.successString("cmd", "key", "value"); // JSON with data
core.output.info("Text output", .{}); // Text mode only
core.output.usage("cmd", "usage string"); // Help text
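To illustrate how one helper can serve both output modes, here is a small formatter sketch. It is not the actual `core.output` code: `formatSuccess`, `formatError`, the explicit `json` parameter, and every JSON field other than `success` are assumptions, and the sketch does no JSON escaping.

```zig
const std = @import("std");

/// Hypothetical formatter: renders a success message as JSON or plain text
/// into a caller-provided buffer. No JSON escaping is performed.
pub fn formatSuccess(buf: []u8, json: bool, command: []const u8) ![]const u8 {
    if (json) {
        return std.fmt.bufPrint(buf, "{{\"success\":true,\"command\":\"{s}\"}}", .{command});
    }
    return std.fmt.bufPrint(buf, "{s}: ok", .{command});
}

/// Hypothetical formatter for failures; the "error" field name is assumed.
pub fn formatError(buf: []u8, json: bool, command: []const u8, message: []const u8) ![]const u8 {
    if (json) {
        return std.fmt.bufPrint(
            buf,
            "{{\"success\":false,\"command\":\"{s}\",\"error\":\"{s}\"}}",
            .{ command, message },
        );
    }
    return std.fmt.bufPrint(buf, "{s}: error: {s}", .{ command, message });
}
```

Keeping the JSON/text branch inside the helper means command code never checks the flag itself.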
core.flags
Common flag parsing utilities:
var flags = core.flags.CommonFlags{};
var remaining = try core.flags.parseCommon(allocator, args, &flags);
// Check for subcommands
if (core.flags.matchSubcommand(remaining.items, "list")) |sub_args| {
    return try executeList(ctx, sub_args);
}
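A subcommand matcher with the shape used above can be very small. This sketch assumes `matchSubcommand` compares only the first argument and returns the rest; the real `core.flags.matchSubcommand` may behave differently (for example, around aliases or help flags).

```zig
const std = @import("std");

/// Returns the arguments after `name` when `args[0]` equals `name`,
/// otherwise null. Assumed behavior, modeled on the usage in this README.
pub fn matchSubcommand(args: []const []const u8, name: []const u8) ?[]const []const u8 {
    if (args.len == 0) return null;
    if (!std.mem.eql(u8, args[0], name)) return null;
    return args[1..];
}
```

Because the result is an optional slice, it fits the `if (...) |sub_args|` capture shown in the example above.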
Configuration
Local Mode (SQLite)
# .fetchml/config.toml or ~/.ml/config.toml
tracking_uri = "sqlite://./fetch_ml.db"
artifact_path = "./experiments/"
sync_uri = "" # Optional: server to sync with
Server Mode (WebSocket)
# ~/.ml/config.toml
worker_host = "worker.local"
worker_user = "mluser"
worker_base = "/data/ml-experiments"
worker_port = 22
api_key = "your-api-key"
Building
Development
cd cli
zig build
Production (requires SQLite in assets/)
cd cli
make build-sqlite # Fetch SQLite amalgamation
zig build prod # Build with embedded SQLite
Install
# Install to system
make install
# Or copy binary manually
cp zig-out/bin/ml /usr/local/bin/
Local/Server Module Pattern
Commands that work in both modes follow this structure:
src/
├── local.zig                   # Module index
├── local/
│   └── experiment_ops.zig      # Local implementations
├── server.zig                  # Module index
└── server/
    └── experiment_api.zig      # Server implementations
Adding a New Command
- Create local implementation in src/local/<name>_ops.zig
- Create server implementation in src/server/<name>_api.zig
- Export from src/local.zig and src/server.zig
- Create thin router in src/commands/<name>.zig using ctx.dispatch()
Maintainability Cleanup (2026-02)
Recent refactoring improved code organization:
| Metric | Before | After |
|---|---|---|
| experiment.zig | 836 lines | 348 lines (58% reduction) |
| queue.zig | 1203 lines | Modular structure |
| Duplicate printUsage | 24 functions | 1 shared helper |
| Mode dispatch logic | Inlined everywhere | core.context.Context |
Key Improvements
- Core Modules: Unified core.output, core.flags, core.context eliminate duplication
- Mode Abstraction: Local/server operations separated into dedicated modules
- Queue Decomposition: queue/ submodules for parsing, validation, submission
- Bug Fixes: Resolved 15+ compilation errors in narrative.zig, outcome.zig, annotate.zig, etc.
Need Help?
- ml --help - Show command help
- ml <command> --help - Show command-specific help