fetch_ml/docs/_pages/cli-reference.md
Jeremie Fraeys 385d2cf386 docs: add comprehensive documentation with MkDocs site
- Add complete API documentation and architecture guides
- Include quick start, installation, and deployment guides
- Add troubleshooting and security documentation
- Include CLI reference and configuration schema docs
- Add production monitoring and operations guides
- Implement MkDocs configuration with search functionality
- Include comprehensive user and developer documentation

Provides complete documentation for users and developers
covering all aspects of the FetchML platform.
2025-12-04 16:54:57 -05:00

---
layout: page
title: "CLI Reference"
permalink: /cli-reference/
nav_order: 2
---
# Fetch ML CLI Reference
Command-line tools for managing ML experiments in your homelab, centered on a high-performance CLI written in Zig.
## Overview
Fetch ML provides a CLI toolkit built with performance and security in mind:
- **Zig CLI** - High-performance experiment management written in Zig
- **Go Commands** - API server, TUI, and data management utilities
- **Management Scripts** - Service orchestration and deployment
- **Setup Scripts** - One-command installation and configuration
## Zig CLI (`./cli/zig-out/bin/ml`)
High-performance command-line interface for experiment management, written in Zig for speed and efficiency.
### Available Commands
| Command | Description | Example |
|---------|-------------|----------|
| `init` | Interactive configuration setup | `ml init` |
| `sync` | Sync project to worker with deduplication | `ml sync ./project --name myjob --queue` |
| `queue` | Queue job for execution | `ml queue myjob --commit abc123 --priority 8` |
| `status` | Get system and worker status | `ml status` |
| `monitor` | Launch TUI monitoring via SSH | `ml monitor` |
| `cancel` | Cancel running job | `ml cancel job123` |
| `prune` | Clean up old experiments | `ml prune --keep 10` |
| `watch` | Auto-sync directory on changes | `ml watch ./project --queue` |
### Command Details
#### `init` - Configuration Setup
```bash
ml init
```
Creates a configuration template at `~/.ml/config.toml` with:
- Worker connection details
- API authentication
- Base paths and ports
#### `sync` - Project Synchronization
```bash
# Basic sync
ml sync ./my-project
# Sync with custom name and queue
ml sync ./my-project --name "experiment-1" --queue
# Sync with priority
ml sync ./my-project --priority 9
```
**Features:**
- Content-addressed storage for deduplication
- SHA256 commit ID generation
- Rsync-based file transfer
- Automatic queuing (with `--queue` flag)
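To make the idea of a content-derived commit ID concrete, here is a minimal sketch that hashes a directory into a single SHA256 value. The function name `tree_id` is hypothetical, and this is not the algorithm the Zig CLI actually uses; it only illustrates why identical project contents map to the same ID. It assumes GNU `find`, `sort`, `xargs`, and `sha256sum`.

```bash
# Hypothetical sketch: derive one SHA256 ID from a directory's contents.
# Not the CLI's actual algorithm; it only illustrates the idea.
tree_id() {
  (
    cd "$1" || exit 1
    # Hash every file in a stable order, then hash the resulting list.
    find . -type f -print0 \
      | LC_ALL=C sort -z \
      | xargs -0 -r sha256sum \
      | sha256sum \
      | cut -d' ' -f1
  )
}
```

Because paths are taken relative to the directory, two copies of the same project in different locations produce the same ID, and changing any file's content or name produces a different one.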
#### `queue` - Job Management
```bash
# Queue with commit ID
ml queue my-job --commit abc123def456
# Queue with priority (1-10, default 5)
ml queue my-job --commit abc123 --priority 8
```
**Features:**
- WebSocket-based communication
- Priority queuing system
- API key authentication
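If you call `ml queue` from your own scripts, a small pre-flight check can catch an out-of-range priority before the job is submitted. The sketch below is a hypothetical wrapper helper (`resolve_priority` is not part of the CLI); it only encodes the 1-10 range and default of 5 stated above.

```bash
# Hypothetical pre-flight check for a wrapper script.
# Prints the priority to use (default 5) or fails if it is outside 1-10.
resolve_priority() {
  local p=${1:-5}
  case $p in
    *[!0-9]*)
      echo "priority must be an integer" >&2
      return 1
      ;;
  esac
  if [ "$p" -lt 1 ] || [ "$p" -gt 10 ]; then
    echo "priority must be between 1 and 10" >&2
    return 1
  fi
  echo "$p"
}

# Possible use in a wrapper:
#   priority=$(resolve_priority "$1") && ml queue my-job --commit abc123 --priority "$priority"
```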
#### `watch` - Auto-Sync Monitoring
```bash
# Watch directory for changes
ml watch ./project
# Watch and auto-queue on changes
ml watch ./project --name "dev-exp" --queue
```
**Features:**
- Real-time file system monitoring
- Automatic re-sync on changes
- Configurable polling interval (default: 2 seconds)
- Commit ID comparison for efficiency
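The poll-and-compare behavior listed above can be sketched in a few lines of shell. This is an illustration under assumptions, not the CLI's implementation: `snapshot_id` and `poll_for_changes` are hypothetical names, the fingerprint is a plain SHA256 over file hashes, and fractional `sleep` intervals require GNU coreutils.

```bash
# Hypothetical sketch of a polling watcher; not the CLI's implementation.
# Usage: poll_for_changes DIR ON_CHANGE_CMD [INTERVAL_SECONDS] [MAX_POLLS]
# MAX_POLLS of 0 (the default) means poll forever.

snapshot_id() {
  # Fingerprint of file paths and contents under "$1".
  (cd "$1" && find . -type f -exec sha256sum {} + \
    | LC_ALL=C sort | sha256sum | cut -d' ' -f1)
}

poll_for_changes() {
  local dir=$1 on_change=$2 interval=${3:-2} max_polls=${4:-0}
  local last current polls=0
  last=$(snapshot_id "$dir")
  while [ "$max_polls" -eq 0 ] || [ "$polls" -lt "$max_polls" ]; do
    sleep "$interval"
    current=$(snapshot_id "$dir")
    # Only act when the fingerprint differs from the previous poll.
    if [ "$current" != "$last" ]; then
      "$on_change" "$dir"
      last=$current
    fi
    polls=$((polls + 1))
  done
}
```

The callback could be a function that runs `ml sync "$1" --queue`. Comparing fingerprints rather than re-syncing on every tick is what keeps an idle watcher cheap.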
#### `prune` - Cleanup Management
```bash
# Keep last N experiments
ml prune --keep 20
# Remove experiments older than N days
ml prune --older-than 30
```
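As a rough picture of what "keep last N" means, the sketch below lists the experiment directories that would fall outside the newest N by modification time. The helper `prune_candidates` is hypothetical, the real command may select experiments differently, and the code assumes GNU `find` (for `-printf`) and directory names without newlines.

```bash
# Hypothetical sketch of "keep the newest N" selection; the real command may differ.
# Prints the directories directly under "$1" that are older than the newest "$2".
prune_candidates() {
  local base=$1 keep=$2
  # Emit "<mtime> <path>", sort newest first, skip the first $keep, drop the mtime.
  find "$base" -mindepth 1 -maxdepth 1 -type d -printf '%T@ %p\n' \
    | sort -rn \
    | tail -n +"$((keep + 1))" \
    | cut -d' ' -f2-
}
```

Printing candidates instead of deleting them gives you a dry run to review before removing anything.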
#### `monitor` - Remote Monitoring
```bash
ml monitor
```
Launches the TUI over SSH for real-time monitoring.
#### `cancel` - Job Cancellation
```bash
ml cancel running-job-id
```
Cancels a running job by its ID.
### Configuration
The Zig CLI reads configuration from `~/.ml/config.toml`:
```toml
worker_host = "worker.local"
worker_user = "mluser"
worker_base = "/data/ml-experiments"
worker_port = 22
api_key = "your-api-key"
```
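When debugging, it can help to read these values back from a script, for example to build the same SSH target the CLI would use. The helper below is a hypothetical sketch (`toml_get` is not shipped with Fetch ML) and handles only flat `key = value` lines like those shown above, not full TOML.

```bash
# Hypothetical helper: read one top-level key from a flat TOML file.
# Handles only simple `key = value` lines, with or without double quotes.
toml_get() {
  local file=$1 key=$2
  sed -n -E "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"?([^\"#]*)\"?.*/\1/p" "$file" \
    | head -n 1 \
    | sed -E 's/[[:space:]]+$//'
}

# Possible use:
#   cfg="$HOME/.ml/config.toml"
#   ssh -p "$(toml_get "$cfg" worker_port)" \
#     "$(toml_get "$cfg" worker_user)@$(toml_get "$cfg" worker_host)"
```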
### Performance Features
- **Content-Addressed Storage**: Automatic deduplication of identical files
- **Incremental Sync**: Only transfers changed files
- **SHA256 Hashing**: Reliable commit ID generation
- **WebSocket Communication**: Efficient real-time messaging
- **Multi-threaded**: Concurrent operations where applicable
## Go Commands
### API Server (`./cmd/api-server/main.go`)
Main HTTPS API server for experiment management.
```bash
# Build and run
go run ./cmd/api-server/main.go
# With configuration
./bin/api-server --config configs/config-local.yaml
```
**Features:**
- HTTPS-only communication
- API key authentication
- Rate limiting and IP whitelisting
- WebSocket support for real-time updates
- Redis integration for caching
### TUI (`./cmd/tui/main.go`)
Terminal User Interface for monitoring experiments.
```bash
# Launch TUI
go run ./cmd/tui/main.go
# With custom config
./tui --config configs/config-local.yaml
```
**Features:**
- Real-time experiment monitoring
- Interactive job management
- Status visualization
- Log viewing
### Data Manager (`./cmd/data_manager/`)
Utilities for data synchronization and management.
```bash
# Sync data
./data_manager --sync ./data
# Clean old data
./data_manager --cleanup --older-than 30d
```
### Config Lint (`./cmd/configlint/main.go`)
Configuration validation and linting tool.
```bash
# Validate configuration
./configlint configs/config-local.yaml
# Check schema compliance
./configlint --schema configs/schema/config_schema.yaml
```
## Management Script (`./tools/manage.sh`)
Simple service management for your homelab.
### Commands
```bash
./tools/manage.sh start # Start all services
./tools/manage.sh stop # Stop all services
./tools/manage.sh status # Check service status
./tools/manage.sh logs # View logs
./tools/manage.sh monitor # Basic monitoring
./tools/manage.sh security # Security status
./tools/manage.sh cleanup # Clean project artifacts
```
## Setup Script (`./setup.sh`)
One-command homelab setup.
### Usage
```bash
# Full setup
./setup.sh
# Setup includes:
# - SSL certificate generation
# - Configuration creation
# - Build all components
# - Start Redis
# - Setup Fail2Ban (if available)
```
## API Testing
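Test the API with curl. The `-k` flag skips TLS certificate verification, so use it only against a server you trust, such as a local instance with a generated certificate: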
```bash
# Health check
curl -k -H 'X-API-Key: password' https://localhost:9101/health
# List experiments
curl -k -H 'X-API-Key: password' https://localhost:9101/experiments
# Submit experiment
curl -k -X POST -H 'X-API-Key: password' \
  -H 'Content-Type: application/json' \
  -d '{"name":"test","config":{"type":"basic"}}' \
  https://localhost:9101/experiments
```
## Zig CLI Architecture
The Zig CLI is designed for performance and reliability:
### Core Components
- **Commands** (`cli/src/commands/`): Individual command implementations
- **Config** (`cli/src/config.zig`): Configuration management
- **Network** (`cli/src/net/ws.zig`): WebSocket client implementation
- **Utils** (`cli/src/utils/`): Cryptography, storage, and rsync utilities
- **Errors** (`cli/src/errors.zig`): Centralized error handling
### Performance Optimizations
- **Content-Addressed Storage**: Deduplicates identical files across experiments
- **SHA256 Hashing**: Fast, reliable commit ID generation
- **Rsync Integration**: Efficient incremental file transfers
- **WebSocket Protocol**: Low-latency communication with worker
- **Memory Management**: Efficient allocation with Zig's allocator system
### Security Features
- **API Key Hashing**: Secure authentication token handling
- **SSH Integration**: Secure file transfers
- **Input Validation**: Comprehensive argument checking
- **Error Handling**: Secure error reporting without information leakage
## Configuration
Main configuration file: `configs/config-local.yaml`
### Key Settings
```yaml
auth:
  enabled: true
  api_keys:
    homelab_user:
      hash: "5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8"
      admin: true
server:
  address: ":9101"
  tls:
    enabled: true
    cert_file: "./ssl/cert.pem"
    key_file: "./ssl/key.pem"
security:
  rate_limit:
    enabled: true
    requests_per_minute: 30
  ip_whitelist:
    - "127.0.0.1"
    - "::1"
    - "192.168.0.0/16"
    - "10.0.0.0/8"
```
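The sample `hash` value above is the SHA256 digest of the string `password`, which is the key used in the curl examples. Assuming the server stores a plain, unsalted SHA256 of the raw key (an inference from that sample, not a documented guarantee), a hash for a new key could be generated as follows. The helper name `hash_api_key` is hypothetical, and `sha256sum` is the GNU coreutils tool (on macOS, `shasum -a 256` is the usual substitute).

```bash
# Hypothetical helper: hex SHA256 digest of an API key.
# Assumption: the server compares against an unsalted SHA256 of the raw key.
hash_api_key() {
  # printf avoids the trailing newline that echo would add to the hashed input.
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

hash_api_key "password"
# 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
```

Replace `password` with a long random key before exposing the server beyond localhost.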
## Docker Commands
If using Docker Compose:
```bash
# Start services
docker-compose up -d
# View logs
docker-compose logs -f
# Stop services
docker-compose down
# Check status
docker-compose ps
```
## Troubleshooting
### Common Issues
**Zig CLI not found:**
```bash
# Build the CLI
cd cli && make build
# Check binary exists
ls -la ./cli/zig-out/bin/ml
```
**Configuration not found:**
```bash
# Create configuration
./cli/zig-out/bin/ml init
# Check config file
ls -la ~/.ml/config.toml
```
**Worker connection failed:**
```bash
# Test SSH connection
ssh -p 22 mluser@worker.local
# Check configuration
cat ~/.ml/config.toml
```
**Sync not working:**
```bash
# Check rsync availability
rsync --version
# Test manual sync
rsync -avz ./project/ mluser@worker.local:/tmp/test/
```
**WebSocket connection failed:**
```bash
# Check worker WebSocket port
telnet worker.local 9100
# Verify API key
./cli/zig-out/bin/ml status
```
**API not responding:**
```bash
./tools/manage.sh status
./tools/manage.sh logs
```
**Authentication failed:**
```bash
# Check API key in config-local.yaml
grep -A 5 "api_keys:" configs/config-local.yaml
```
**Redis connection failed:**
```bash
# Check Redis status
redis-cli ping
# Start Redis
redis-server
```
### Getting Help
```bash
# CLI help
./cli/zig-out/bin/ml help
# Management script help
./tools/manage.sh help
# Check all available commands
make help
```
---
**That's it for the CLI reference!** For complete setup instructions, see the main [README](/).