ML CLI
Fast CLI tool for managing ML experiments. Supports both local mode (SQLite) and server mode (WebSocket).
Architecture
The CLI follows a modular 3-layer architecture for maintainability:
src/
├── core/                    # Shared foundation
│   ├── context.zig          # Execution context (allocator, config, mode dispatch)
│   ├── output.zig           # Unified JSON/text output helpers
│   └── flags.zig            # Common flag parsing
├── local/                   # Local mode operations (SQLite)
│   └── experiment_ops.zig   # Experiment CRUD for local DB
├── server/                  # Server mode operations (WebSocket)
│   └── experiment_api.zig   # Experiment API for remote server
├── commands/                # Thin command routers
│   ├── experiment.zig       # ~100 lines (was 887)
│   ├── queue.zig            # Job submission
│   └── queue/               # Queue submodules
│       ├── parse.zig        # Job template parsing
│       ├── validate.zig     # Validation logic
│       └── submit.zig       # Job submission
└── utils/                   # Utilities (21 files)
Mode Dispatch Pattern
Commands auto-detect local vs server mode using core.context.Context:
var ctx = core.context.Context.init(allocator, cfg, flags.json);
defer ctx.deinit();

if (ctx.isLocal()) {
    return try local.experiment.list(ctx.allocator, ctx.json_output);
} else {
    return try server.experiment.list(ctx.allocator, ctx.json_output);
}
Quick Start
# 1. Build
zig build
# 2. Initialize local tracking (creates fetch_ml.db)
./zig-out/bin/ml init
# 3. Create experiment and run locally
./zig-out/bin/ml experiment create --name "baseline"
./zig-out/bin/ml run start --experiment <id> --name "run-1"
./zig-out/bin/ml experiment log --run <id> --name loss --value 0.5
./zig-out/bin/ml run finish --run <id>
Commands
Local Mode Commands (SQLite)
- `ml init` - Initialize local experiment tracking database
- `ml experiment create --name <name>` - Create experiment locally
- `ml experiment list` - List experiments from SQLite
- `ml experiment log --run <id> --name <key> --value <val>` - Log metrics
- `ml run start --experiment <id> [--name <name>]` - Start a run
- `ml run finish --run <id>` - Mark run as finished
- `ml run fail --run <id>` - Mark run as failed
- `ml run list` - List all runs
Server Mode Commands (WebSocket)
- `ml sync <path>` - Sync project to server
- `ml queue <job1> [job2 ...] [--commit <id>] [--priority N] [--note <text>]` - Queue jobs
- `ml status` - Check system/queue status
- `ml validate <commit_id> [--json] [--task <task_id>]` - Validate provenance
- `ml cancel <job>` - Cancel a running/queued job
Shared Commands (Auto-detect Mode)
- `ml experiment log|show|list|delete` - Works in both local and server mode
- `ml monitor` - Launch TUI (local SQLite or remote SSH)
Notes:
- Commands auto-detect mode from config (`sqlite://` vs `wss://`)
- `--json` mode is designed to be pipe-friendly
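The scheme-based detection described in the notes could be sketched as below. This is only an illustration under assumptions: the `Mode` enum and `detectMode` function are hypothetical names, not taken from the CLI source, and the real logic in `core.context` may look at other config fields as well.

```zig
const std = @import("std");

// Hypothetical enum; the real CLI may model modes differently.
const Mode = enum { local, server };

// Classify a URI by its scheme prefix.
// Returns null for any scheme this sketch does not recognize.
fn detectMode(uri: []const u8) ?Mode {
    if (std.mem.startsWith(u8, uri, "sqlite://")) return .local;
    if (std.mem.startsWith(u8, uri, "wss://")) return .server;
    return null;
}
```

Returning an optional rather than defaulting to one mode leaves it to the caller to decide how to report an unrecognized URI.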
Core Modules
core.context
Provides unified execution context for all commands:
- Mode detection: Automatically detects local (SQLite) vs server (WebSocket) mode
- Output handling: JSON vs text output based on the `--json` flag
- Dispatch helpers: `ctx.dispatch(local_fn, server_fn, args)` for mode-specific implementations
const std = @import("std");
const core = @import("../core.zig");
// `config`, `local`, and `server` are imported at the top of the file;
// `flags` holds the parsed common flags (see core.flags).

pub fn execute(allocator: std.mem.Allocator, args: []const []const u8) !void {
    const cfg = try config.Config.load(allocator);
    var ctx = core.context.Context.init(allocator, cfg, flags.json);
    defer ctx.deinit();

    // Dispatch to local or server implementation
    if (ctx.isLocal()) {
        return try local.experiment.list(ctx.allocator, ctx.json_output);
    } else {
        return try server.experiment.list(ctx.allocator, ctx.json_output);
    }
}
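For readers curious how a helper shaped like `ctx.dispatch(local_fn, server_fn, args)` might work, here is a minimal sketch. It is not the project's implementation: the `Context` below is a hypothetical stand-in with a single `is_local` field, `localList` and `serverList` are made-up sample functions, and the fixed `anyerror!void` return type is a simplification chosen for this example.

```zig
const std = @import("std");

// Hypothetical stand-in for core.context.Context; the real type also
// carries an allocator, the loaded config, and the JSON output flag.
const Context = struct {
    is_local: bool,

    fn isLocal(self: Context) bool {
        return self.is_local;
    }

    // Call local_fn in local mode and server_fn otherwise, forwarding
    // the same argument tuple to whichever function is chosen.
    fn dispatch(
        self: Context,
        comptime local_fn: anytype,
        comptime server_fn: anytype,
        args: anytype,
    ) anyerror!void {
        if (self.isLocal()) {
            return @call(.auto, local_fn, args);
        }
        return @call(.auto, server_fn, args);
    }
};

// Sample implementations (assumptions) that record how they were called.
var calls = [2]u32{ 0, 0 };

fn localList(n: u32) !void {
    calls[0] += n;
}

fn serverList(n: u32) !void {
    calls[1] += n;
}
```

A router would then call something like `try ctx.dispatch(localList, serverList, .{@as(u32, 1)});` instead of repeating the `if (ctx.isLocal())` branch in every command.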
core.output
Unified output helpers that respect the --json flag:
core.output.errorMsg("command", "Error message"); // JSON: {"success":false,...}
core.output.success("command"); // JSON: {"success":true,...}
core.output.successString("cmd", "key", "value"); // JSON with data
core.output.info("Text output", .{}); // Text mode only
core.output.usage("cmd", "usage string"); // Help text
core.flags
Common flag parsing utilities:
var flags = core.flags.CommonFlags{};
var remaining = try core.flags.parseCommon(allocator, args, &flags);

// Check for subcommands
if (core.flags.matchSubcommand(remaining.items, "list")) |sub_args| {
    return try executeList(ctx, sub_args);
}
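The optional-capture pattern above suggests that `matchSubcommand` returns the remaining arguments on a match and `null` otherwise. A minimal sketch of such a helper, assuming that behavior (the actual `core.flags.matchSubcommand` signature is not shown in this README and may differ):

```zig
const std = @import("std");

// Hypothetical reimplementation for illustration only.
// If the first argument equals `name`, return the arguments after it;
// otherwise return null so the caller can try the next subcommand.
fn matchSubcommand(args: []const []const u8, name: []const u8) ?[]const []const u8 {
    if (args.len == 0) return null;
    if (!std.mem.eql(u8, args[0], name)) return null;
    return args[1..];
}
```

Because the result is a slice into the original argument list, no allocation is needed and the caller's `if (...) |sub_args|` capture receives only the arguments that follow the subcommand name.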
Configuration
Local Mode (SQLite)
# .fetchml/config.toml or ~/.ml/config.toml
tracking_uri = "sqlite://./fetch_ml.db"
artifact_path = "./experiments/"
sync_uri = "" # Optional: server to sync with
Server Mode (WebSocket)
# ~/.ml/config.toml
worker_host = "worker.local"
worker_user = "mluser"
worker_base = "/data/ml-experiments"
worker_port = 22
api_key = "your-api-key"
Building
Development
cd cli
zig build
Production (requires SQLite in assets/)
cd cli
make build-sqlite # Fetch SQLite amalgamation
zig build prod # Build with embedded SQLite
Install
# Install to system
make install
# Or copy binary manually
cp zig-out/bin/ml /usr/local/bin/
Local/Server Module Pattern
Commands that work in both modes follow this structure:
src/
├── local.zig                # Module index
├── local/
│   └── experiment_ops.zig   # Local implementations
├── server.zig               # Module index
└── server/
    └── experiment_api.zig   # Server implementations
Adding a New Command
- Create local implementation in `src/local/<name>_ops.zig`
- Create server implementation in `src/server/<name>_api.zig`
- Export from `src/local.zig` and `src/server.zig`
- Create thin router in `src/commands/<name>.zig` using `ctx.dispatch()`
Maintainability Cleanup (2026-02)
Recent refactoring improved code organization:
| Metric | Before | After |
|---|---|---|
| experiment.zig | 836 lines | 348 lines (58% reduction) |
| queue.zig | 1203 lines | Modular structure |
| Duplicate printUsage | 24 functions | 1 shared helper |
| Mode dispatch logic | Inlined everywhere | core.context.Context |
Key Improvements
- Core Modules: Unified `core.output`, `core.flags`, `core.context` eliminate duplication
- Mode Abstraction: Local/server operations separated into dedicated modules
- Queue Decomposition: `queue/` submodules for parsing, validation, submission
- Bug Fixes: Resolved 15+ compilation errors in `narrative.zig`, `outcome.zig`, `annotate.zig`, etc.
Need Help?
- `ml --help` - Show command help
- `ml <command> --help` - Show command-specific help