# ML CLI

Fast CLI tool for managing ML experiments.

## Quick Start

```bash
# 1. Build
zig build

# 2. Set up configuration
./zig-out/bin/ml init

# 3. Run experiment
./zig-out/bin/ml sync ./my-experiment --queue
```

## Commands

- `ml init` - Set up configuration
- `ml sync <path>` - Sync project to server
- `ml queue <job1> [job2 ...] [--commit <id>] [--priority N] [--note <text>]` - Queue one or more jobs
- `ml status` - Check system/queue status for your API key
- `ml validate <commit_id> [--json] [--task <task_id>]` - Validate provenance and integrity for a commit or task (includes `run_manifest.json` consistency checks when validating by task)
- `ml info <path|id> [--json] [--base <path>]` - Show run info from `run_manifest.json` (by path or by scanning `finished/failed/running/pending`)
- `ml annotate <path|run_id|task_id> --note <text> [--author <name>] [--base <path>] [--json]` - Append a human annotation to `run_manifest.json`
- `ml narrative set <path|run_id|task_id> [--hypothesis <text>] [--context <text>] [--intent <text>] [--expected-outcome <text>] [--parent-run <id>] [--experiment-group <text>] [--tags <csv>] [--base <path>] [--json]` - Patch the `narrative` field in `run_manifest.json`
- `ml monitor` - Launch monitoring interface (TUI)
- `ml cancel <job>` - Cancel a running/queued job you own
- `ml prune --keep N` - Keep only the N most recent experiments
- `ml watch <path>` - Auto-sync directory
- `ml experiment log|show|list|delete` - Manage experiments and metrics

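Several of the commands above accept `--json`. As a minimal sketch of consuming that output from a script, the Python helper below runs a command, parses its stdout as JSON, and keeps stderr separate. `run_json` is a hypothetical helper, not part of `ml`; the stand-in command and its `run_id` key are invented for the demo and do not reflect the real `ml` output schema.

```python
import json
import subprocess
import sys


def run_json(argv):
    """Run a command and parse its stdout as JSON.

    stderr is captured separately, so human-facing messages
    never end up mixed into the JSON payload.
    """
    proc = subprocess.run(argv, capture_output=True, text=True)
    payload = json.loads(proc.stdout) if proc.stdout.strip() else None
    return proc.returncode, payload, proc.stderr


# Stand-in for something like `ml info <id> --json`:
# JSON goes to stdout, a human-readable message goes to stderr.
# The JSON key below is invented for this demo.
fake_ml = [
    sys.executable,
    "-c",
    "import sys, json; print(json.dumps({'run_id': 'demo'})); "
    "print('note: demo', file=sys.stderr)",
]

code, payload, messages = run_json(fake_ml)
print(payload["run_id"])
```

To try this against the real CLI, replace `fake_ml` with an argument list such as `["ml", "info", "<id>", "--json"]` and inspect whichever keys the command actually returns.
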

Notes:

- `--json` mode is designed to be pipe-friendly: machine-readable JSON is emitted to stdout, while user-facing messages/errors go to stderr.
- When running `ml validate --task <task_id>`, the server tries to locate the job's `run_manifest.json` under the configured base path (pending/running/finished/failed) and cross-checks key fields (task id, commit id, deps, snapshot).
- For tasks in `running`, `completed`, or `failed` state, a missing `run_manifest.json` is treated as a validation failure. For `queued` tasks, it is treated as a warning (the job may not have started yet).

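The manifest lookup and the state-dependent severity described in the notes can be sketched as follows. This is an illustration only, not the server's implementation: the `<base>/<state_dir>/<job>/run_manifest.json` layout is an assumption, and `find_manifest` and `manifest_check` are hypothetical names.

```python
from pathlib import Path
import tempfile

# State directories named in the notes above.
STATE_DIRS = ("pending", "running", "finished", "failed")


def find_manifest(base, job):
    """Return the first run_manifest.json found for `job`, or None.

    Assumes a <base>/<state_dir>/<job>/run_manifest.json layout (hypothetical).
    """
    for state_dir in STATE_DIRS:
        candidate = Path(base) / state_dir / job / "run_manifest.json"
        if candidate.is_file():
            return candidate
    return None


def manifest_check(base, job, task_state):
    """Classify the manifest lookup as 'ok', 'warning', or 'failure'."""
    if find_manifest(base, job) is not None:
        return "ok"
    # A queued job may not have started yet, so a missing manifest is a warning.
    if task_state == "queued":
        return "warning"
    # Once a job has started, the manifest is expected to exist.
    if task_state in ("running", "completed", "failed"):
        return "failure"
    raise ValueError(f"unknown task state: {task_state}")


with tempfile.TemporaryDirectory() as base:
    run_dir = Path(base) / "finished" / "job-a"
    run_dir.mkdir(parents=True)
    (run_dir / "run_manifest.json").write_text("{}")

    print(manifest_check(base, "job-a", "completed"))  # manifest present
    print(manifest_check(base, "job-b", "queued"))     # missing, not started
    print(manifest_check(base, "job-b", "running"))    # missing, already started
```
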

### Experiment workflow (minimal)

- `ml sync ./my-experiment --queue`
  Syncs files, computes a unique commit ID for the directory, and queues a job.

- `ml queue my-job`
  Queues a job named `my-job`. If `--commit` is omitted, the CLI generates a random commit ID and records `(job_name, commit_id)` in `~/.ml/history.log` so you don't have to remember hashes.

- `ml queue my-job --note "baseline run; lr=1e-3"`
  Adds a human-readable note to the run; it will be persisted into the run's `run_manifest.json` (under `metadata.note`).

- `ml experiment list`
  Shows recent experiments from history with alias (job name) and commit ID.

- `ml experiment delete <alias|commit>`
  Cancels a running/queued experiment by job name, full commit ID, or short commit prefix.

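To illustrate how an alias, a full commit ID, or a short prefix could be resolved against the recorded `(job_name, commit_id)` pairs, here is a rough sketch. The on-disk format of `~/.ml/history.log` is not documented here, so the one-pair-per-line, whitespace-separated format below is an assumption, as are the `parse_history` and `resolve` helpers and the rule that an alias resolves to its most recent entry.

```python
def parse_history(text):
    """Parse lines of 'job_name commit_id' into (job_name, commit_id) tuples.

    The line format is assumed for this sketch; malformed lines are skipped.
    """
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            entries.append((parts[0], parts[1]))
    return entries


def resolve(entries, ident):
    """Resolve a job name, full commit ID, or commit prefix to one commit ID."""
    if not ident:
        raise LookupError("empty identifier")
    # An alias (job name) resolves to its most recent entry.
    for job, commit in reversed(entries):
        if job == ident:
            return commit
    # Otherwise treat the identifier as a commit ID or prefix.
    matches = sorted({c for _, c in entries if c.startswith(ident)})
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"no experiment matches {ident!r}")
    raise LookupError(f"ambiguous prefix {ident!r}: {matches}")


# Invented sample data for the demo.
history = "baseline a1b2c3d4\nsweep a1ff0000\nbaseline 9e8d7c6b\n"
entries = parse_history(history)
print(resolve(entries, "baseline"))  # most recent 'baseline' entry
print(resolve(entries, "a1b"))       # unambiguous short prefix
```

Rejecting ambiguous prefixes, rather than picking the first match, avoids acting on the wrong experiment when two commit IDs share leading characters.
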

## Configuration

Create `~/.ml/config.toml`:

```toml
worker_host = "worker.local"
worker_user = "mluser"
worker_base = "/data/ml-experiments"
worker_port = 22
api_key = "your-api-key"
```

## Install

```bash
# Install to system
make install

# Or copy binary manually
cp zig-out/bin/ml /usr/local/bin/
```

## Need Help?

- `ml --help` - Show command help
- `ml <command> --help` - Show command-specific help