# ML CLI

Fast CLI tool for managing ML experiments. Supports both **local mode** (SQLite) and **server mode** (WebSocket).

## Build Policy

**Native C++ libraries** (dataset_hash, etc.) are available when building natively on any platform. Cross-compilation is supported for development on non-native targets but disables native library features.

| Build Type | Target | Native Libraries | Purpose |
|------------|--------|-------------------|---------|
| Native | Host platform (Linux, macOS) | Yes | Dev, staging, production |
| Cross-compile | Different arch/OS | Stubbed | Testing on foreign targets |

### Native Build (Recommended)

Builds on the host platform with full native library support:

```bash
zig build -Doptimize=ReleaseSmall
```

### Cross-Compile (Dev Only)

For testing on different architectures without native library support:

```bash
zig build -Dtarget=x86_64-linux-gnu # from macOS/Windows
```
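The "Stubbed" entry in the table above can be pictured with a small sketch. It assumes a hypothetical compile-time flag `native_libs_enabled` and a hypothetical `datasetHash` function; how the project actually wires this up in `build.zig`, and what the real `dataset_hash` binding looks like, is not shown in this README and may differ.

```zig
const std = @import("std");

// Hypothetical switch: a real build would likely set this from build.zig.
const native_libs_enabled = false;

const HashError = error{NativeLibraryUnavailable};

// Stand-in for the native C++ binding (placeholder FNV-1a loop, not the real hash).
const native = struct {
    fn datasetHash(data: []const u8) HashError!u64 {
        var h: u64 = 0xcbf29ce484222325;
        for (data) |byte| {
            h ^= byte;
            h *%= 0x100000001b3;
        }
        return h;
    }
};

// Stub used for cross-compiled builds: same signature, always reports the feature as missing.
const stub = struct {
    fn datasetHash(data: []const u8) HashError!u64 {
        _ = data;
        return error.NativeLibraryUnavailable;
    }
};

// Selected at compile time, so callers use one name regardless of build type.
const impl = if (native_libs_enabled) native else stub;
```

Callers would then use `impl.datasetHash(bytes)` and handle `error.NativeLibraryUnavailable` when running a cross-compiled binary.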

## Architecture

The CLI follows a modular 3-layer architecture for maintainability:

```
src/
├── core/                    # Shared foundation
│   ├── context.zig          # Execution context (allocator, config, mode dispatch)
│   ├── output.zig           # Unified JSON/text output helpers
│   └── flags.zig            # Common flag parsing
├── local/                   # Local mode operations (SQLite)
│   └── experiment_ops.zig   # Experiment CRUD for local DB
├── server/                  # Server mode operations (WebSocket)
│   └── experiment_api.zig   # Experiment API for remote server
├── commands/                # Thin command routers
│   ├── experiment.zig       # ~100 lines (was 887)
│   ├── queue.zig            # Job submission
│   └── queue/               # Queue submodules
│       ├── parse.zig        # Job template parsing
│       ├── validate.zig     # Validation logic
│       └── submit.zig       # Job submission
└── utils/                   # Utilities (21 files)
```

### Mode Dispatch Pattern

Commands auto-detect local vs server mode using `core.context.Context`:

```zig
var ctx = core.context.Context.init(allocator, cfg, flags.json);
if (ctx.isLocal()) {
    return try local.experiment.list(ctx.allocator, ctx.json_output);
} else {
    return try server.experiment.list(ctx.allocator, ctx.json_output);
}
```

## Quick Start

```bash
# 1. Build
zig build

# 2. Initialize local tracking (creates fetch_ml.db)
./zig-out/bin/ml init

# 3. Create experiment and run locally
./zig-out/bin/ml experiment create --name "baseline"
./zig-out/bin/ml run start --experiment <id> --name "run-1"
./zig-out/bin/ml experiment log --run <id> --name loss --value 0.5
./zig-out/bin/ml run finish --run <id>
```

## Commands

### Local Mode Commands (SQLite)

- `ml init` - Initialize local experiment tracking database
- `ml experiment create --name <name>` - Create experiment locally
- `ml experiment list` - List experiments from SQLite
- `ml experiment log --run <id> --name <key> --value <val>` - Log metrics
- `ml run start --experiment <id> [--name <name>]` - Start a run
- `ml run finish --run <id>` - Mark run as finished
- `ml run fail --run <id>` - Mark run as failed
- `ml run list` - List all runs

### Server Mode Commands (WebSocket)

- `ml sync <path>` - Sync project to server
- `ml queue <job1> [job2 ...] [--commit <id>] [--priority N] [--note <text>]` - Queue jobs
- `ml status` - Check system/queue status
- `ml validate <commit_id> [--json] [--task <task_id>]` - Validate provenance
- `ml cancel <job>` - Cancel a running/queued job

### Shared Commands (Auto-detect Mode)

- `ml experiment log|show|list|delete` - Works in both local and server mode
- `ml monitor` - Launch TUI (local SQLite or remote SSH)

Notes:

- Commands auto-detect mode from config (`sqlite://` vs `wss://`)
- `--json` mode is designed to be pipe-friendly
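
The scheme-based auto-detection mentioned in the notes could look roughly like the sketch below. The `Mode` enum and `detectMode` function are hypothetical names chosen for illustration; only the `sqlite://` and `wss://` prefixes come from this README, and the actual detection code may handle more cases.

```zig
const std = @import("std");

// Hypothetical mode type; the real Context may represent this differently.
const Mode = enum { local, server };

/// Maps a tracking URI to a mode by its scheme prefix.
/// Returns null when the scheme is neither sqlite:// nor wss://.
fn detectMode(tracking_uri: []const u8) ?Mode {
    if (std.mem.startsWith(u8, tracking_uri, "sqlite://")) return .local;
    if (std.mem.startsWith(u8, tracking_uri, "wss://")) return .server;
    return null;
}
```

Returning an optional leaves the decision about unknown schemes (error out, or fall back to a default) to the caller.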

## Core Modules

### `core.context`

Provides unified execution context for all commands:

- **Mode detection**: Automatically detects local (SQLite) vs server (WebSocket) mode
- **Output handling**: JSON vs text output based on `--json` flag
- **Dispatch helpers**: `ctx.dispatch(local_fn, server_fn, args)` for mode-specific implementations

```zig
const std = @import("std");
const core = @import("../core.zig");
const local = @import("../local.zig");
const server = @import("../server.zig");
// `config` and the parsed `flags` (see `core.flags`) are assumed to be in scope.

pub fn execute(allocator: std.mem.Allocator, args: []const []const u8) !void {
    const cfg = try config.Config.load(allocator);
    var ctx = core.context.Context.init(allocator, cfg, flags.json);
    defer ctx.deinit();

    // Dispatch to local or server implementation
    if (ctx.isLocal()) {
        return try local.experiment.list(ctx.allocator, ctx.json_output);
    } else {
        return try server.experiment.list(ctx.allocator, ctx.json_output);
    }
}
```
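
As a rough illustration of the dispatch helper idea, the sketch below uses a simplified, hypothetical `Context` whose `dispatch` takes two function pointers with a fixed signature. The real `ctx.dispatch(local_fn, server_fn, args)` accepts an `args` value and is likely more general; `listLocal` and `listServer` are placeholder implementations invented for this example.

```zig
const std = @import("std");

const Mode = enum { local, server };

// Signature shared by both implementations in this simplified sketch.
const ListFn = *const fn (json_output: bool) []const u8;

// Hypothetical, pared-down stand-in for core.context.Context.
const Context = struct {
    mode: Mode,
    json_output: bool,

    pub fn isLocal(self: Context) bool {
        return self.mode == .local;
    }

    /// Picks the implementation matching the current mode and calls it.
    pub fn dispatch(self: Context, local_fn: ListFn, server_fn: ListFn) []const u8 {
        const chosen = if (self.isLocal()) local_fn else server_fn;
        return chosen(self.json_output);
    }
};

// Placeholder local implementation.
fn listLocal(json_output: bool) []const u8 {
    return if (json_output) "{\"source\":\"sqlite\"}" else "source: sqlite";
}

// Placeholder server implementation.
fn listServer(json_output: bool) []const u8 {
    return if (json_output) "{\"source\":\"websocket\"}" else "source: websocket";
}
```

A command router would then reduce to a single call such as `ctx.dispatch(&listLocal, &listServer)`, keeping the mode check in one place.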

### `core.output`

Unified output helpers that respect `--json` flag:

```zig
core.output.errorMsg("command", "Error message");  // JSON: {"success":false,...}
core.output.success("command");                    // JSON: {"success":true,...}
core.output.successString("cmd", "key", "value");  // JSON with data
core.output.info("Text output", .{});              // Text mode only
core.output.usage("cmd", "usage string");          // Help text
```
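
To make the JSON/text split concrete, here is a minimal sketch of helpers that format into a caller-provided buffer. The function names `formatSuccess` and `formatError`, the `command` and `error` fields, and the text-mode wording are all assumptions for illustration; only the `"success":true` / `"success":false` keys appear in this README. The sketch does not escape special characters in the strings it embeds.

```zig
const std = @import("std");

/// Formats a success message as JSON or plain text (hypothetical shape).
fn formatSuccess(buf: []u8, json: bool, command: []const u8) ![]u8 {
    if (json) {
        return std.fmt.bufPrint(buf, "{{\"success\":true,\"command\":\"{s}\"}}", .{command});
    }
    return std.fmt.bufPrint(buf, "{s}: ok", .{command});
}

/// Formats an error message as JSON or plain text (hypothetical shape).
fn formatError(buf: []u8, json: bool, command: []const u8, message: []const u8) ![]u8 {
    if (json) {
        return std.fmt.bufPrint(
            buf,
            "{{\"success\":false,\"command\":\"{s}\",\"error\":\"{s}\"}}",
            .{ command, message },
        );
    }
    return std.fmt.bufPrint(buf, "{s}: error: {s}", .{ command, message });
}
```

Keeping the `json` decision inside the helper means command code never branches on output format itself.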

### `core.flags`

Common flag parsing utilities:

```zig
var flags = core.flags.CommonFlags{};
var remaining = try core.flags.parseCommon(allocator, args, &flags);

// Check for subcommands
if (core.flags.matchSubcommand(remaining.items, "list")) |sub_args| {
    return try executeList(ctx, sub_args);
}
```
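
One plausible way a helper like `matchSubcommand` could behave is sketched below: it compares the first argument with the expected name and, on a match, returns the remaining arguments. This is an assumption inferred from how the helper is used above with `if (...) |sub_args|`; the actual implementation in `core.flags` is not shown in this README.

```zig
const std = @import("std");

/// Returns the arguments after `name` if `args` starts with it, otherwise null.
/// Sketch only: the real core.flags.matchSubcommand may differ.
fn matchSubcommand(args: []const []const u8, name: []const u8) ?[]const []const u8 {
    if (args.len == 0 or !std.mem.eql(u8, args[0], name)) return null;
    return args[1..];
}
```

Because the result is an optional slice, a router can chain several `if (matchSubcommand(...)) |sub_args|` checks and fall through to a usage message when none match.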

## Configuration

### Local Mode (SQLite)

```toml
# .fetchml/config.toml or ~/.ml/config.toml
tracking_uri = "sqlite://./fetch_ml.db"
artifact_path = "./experiments/"
sync_uri = "" # Optional: server to sync with
```

### Server Mode (WebSocket)

```toml
# ~/.ml/config.toml
worker_host = "worker.local"
worker_user = "mluser"
worker_base = "/data/ml-experiments"
worker_port = 22
api_key = "your-api-key"
```

## Building

### Development

```bash
cd cli
zig build
```

### Production (requires SQLite in assets/)

```bash
cd cli
make build-sqlite # Fetch SQLite amalgamation
zig build prod    # Build with embedded SQLite
```

## Install

```bash
# Install to system
make install

# Or copy binary manually
cp zig-out/bin/ml /usr/local/bin/
```

## Local/Server Module Pattern

Commands that work in both modes follow this structure:

```
src/
├── local.zig                # Module index
├── local/
│   └── experiment_ops.zig   # Local implementations
├── server.zig               # Module index
└── server/
    └── experiment_api.zig   # Server implementations
```

### Adding a New Command

1. Create local implementation in `src/local/<name>_ops.zig`
2. Create server implementation in `src/server/<name>_api.zig`
3. Export from `src/local.zig` and `src/server.zig`
4. Create thin router in `src/commands/<name>.zig` using `ctx.dispatch()`

## Maintainability Cleanup (2026-02)

Recent refactoring improved code organization:

| Metric | Before | After |
|--------|--------|-------|
| experiment.zig | 836 lines | 348 lines (58% reduction) |
| queue.zig | 1203 lines | Modular structure |
| Duplicate printUsage | 24 functions | 1 shared helper |
| Mode dispatch logic | Inlined everywhere | `core.context.Context` |

### Key Improvements

1. **Core Modules**: Unified `core.output`, `core.flags`, `core.context` eliminate duplication
2. **Mode Abstraction**: Local/server operations separated into dedicated modules
3. **Queue Decomposition**: `queue/` submodules for parsing, validation, submission
4. **Bug Fixes**: Resolved 15+ compilation errors in `narrative.zig`, `outcome.zig`, `annotate.zig`, etc.

## Need Help?

- `ml --help` - Show command help
- `ml <command> --help` - Show command-specific help