---
layout: page
title: "Zig CLI Guide"
permalink: /zig-cli/
nav_order: 3
---

# Zig CLI Guide

High-performance command-line interface for ML experiment management, written in Zig for speed and efficiency.

## Overview

The Zig CLI (`ml`) is the primary interface for managing ML experiments in your homelab. Built with Zig, it is designed for fast file operations, network communication, and experiment management.

## Installation

### Pre-built Binaries (Recommended)

Download from [GitHub Releases](https://github.com/jfraeys/fetch_ml/releases):

```bash
# Download for your platform
curl -LO https://github.com/jfraeys/fetch_ml/releases/latest/download/ml-<platform>.tar.gz

# Extract
tar -xzf ml-<platform>.tar.gz

# Install
chmod +x ml-<platform>
sudo mv ml-<platform> /usr/local/bin/ml

# Verify
ml --help
```

**Platforms:**
- `ml-linux-x86_64.tar.gz` - Linux (fully static, zero dependencies)
- `ml-macos-x86_64.tar.gz` - macOS Intel
- `ml-macos-arm64.tar.gz` - macOS Apple Silicon

All release binaries include **embedded static rsync** for complete independence.

### Build from Source

**Development Build** (uses system rsync):
```bash
cd cli
zig build dev
./zig-out/dev/ml-dev --help
```

**Production Build** (embedded rsync):
```bash
cd cli
# For testing: uses rsync wrapper
zig build prod

# For release with static rsync:
# 1. Place static rsync binary at src/assets/rsync_release.bin
# 2. Build
zig build prod
strip zig-out/prod/ml  # Optional: reduce size

# Verify
./zig-out/prod/ml --help
ls -lh zig-out/prod/ml
```

See [cli/src/assets/README.md](https://github.com/jfraeys/fetch_ml/blob/main/cli/src/assets/README.md) for details on obtaining static rsync binaries.

### Verify Installation
```bash
ml --help
ml --version  # Shows build config
```

## Quick Start

These steps assume `ml` is on your `PATH` (see Installation). If you built from source, use the binary path printed by your build instead, for example `./zig-out/prod/ml`.

1. **Initialize Configuration**
   ```bash
   ml init
   ```

2. **Sync Your First Project**
   ```bash
   ml sync ./my-project --queue
   ```

3. **Monitor Progress**
   ```bash
   ml status
   ```

## Command Reference

### `init` - Configuration Setup

Initialize the CLI configuration file.

```bash
ml init
```

**Creates:** `~/.ml/config.toml`

**Configuration Template:**
```toml
worker_host = "worker.local"
worker_user = "mluser"
worker_base = "/data/ml-experiments"
worker_port = 22
api_key = "your-api-key"
```

### `sync` - Project Synchronization

Sync project files to the worker with intelligent deduplication.

```bash
# Basic sync
ml sync ./project

# Sync with custom name and auto-queue
ml sync ./project --name "experiment-1" --queue

# Sync with priority
ml sync ./project --priority 8
```

**Options:**
- `--name <name>`: Custom experiment name
- `--queue`: Automatically queue after sync
- `--priority N`: Set priority (1-10, default 5)

**Features:**
- **Content-Addressed Storage**: Automatic deduplication
- **SHA256 Commit IDs**: Reliable change detection
- **Incremental Transfer**: Only sync changed files
- **Rsync Backend**: Efficient file transfer

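
The exact scheme `ml sync` uses to derive a commit ID is not described here. As a minimal sketch of the general idea, assuming the ID is a SHA-256 over each file's relative path and content digest in sorted order, the Python below shows why such an ID is stable for unchanged trees and changes when any file changes. `file_digest` and `commit_id` are hypothetical helper names.

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path: Path) -> str:
    """SHA-256 of a file's bytes, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def commit_id(root: Path) -> str:
    """Hash (relative path, content digest) pairs in sorted path order."""
    h = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        h.update(rel.encode() + b"\0" + file_digest(path).encode() + b"\n")
    return h.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train.py").write_text("print('hi')\n")
    before = commit_id(root)
    unchanged = commit_id(root)          # same tree, same ID
    (root / "train.py").write_text("print('bye')\n")
    after = commit_id(root)              # content changed, new ID

print(before == unchanged, before == after)  # True False
```
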
### `queue` - Job Management

Queue experiments for execution on the worker.

```bash
# Queue with commit ID
ml queue my-job --commit abc123def456

# Queue with priority
ml queue my-job --commit abc123 --priority 8
```

**Options:**
- `--commit <id>`: Commit ID from sync output
- `--priority N`: Execution priority (1-10)

**Features:**
- **WebSocket Communication**: Real-time job submission
- **Priority Queuing**: Higher priority jobs run first
- **API Authentication**: Secure job submission

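
To make "higher priority jobs run first" concrete, here is a small Python sketch of a priority queue built on `heapq`. It is not the worker's implementation: the `JobQueue` class, the default priority of 5 (borrowed from `sync`), and the first-in-first-out tie-break between equal priorities are all assumptions for illustration.

```python
import heapq
import itertools

class JobQueue:
    """Max-priority queue; ties are served in submission order (an assumption)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def push(self, name: str, commit: str, priority: int = 5) -> None:
        if not 1 <= priority <= 10:
            raise ValueError("priority must be between 1 and 10")
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), name, commit))

    def pop(self):
        _, _, name, commit = heapq.heappop(self._heap)
        return name, commit

q = JobQueue()
q.push("baseline", "abc123", priority=5)
q.push("urgent", "def456", priority=8)
q.push("background", "0a1b2c", priority=1)
order = [q.pop()[0] for _ in range(3)]
print(order)  # ['urgent', 'baseline', 'background']
```
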
### `watch` - Auto-Sync Monitoring

Monitor directories for changes and auto-sync.

```bash
# Watch for changes
ml watch ./project

# Watch and auto-queue on changes
ml watch ./project --name "dev-exp" --queue
```

**Options:**
- `--name <name>`: Custom experiment name
- `--queue`: Auto-queue on changes
- `--priority N`: Set priority for queued jobs

**Features:**
- **Near-real-time Monitoring**: Polls every 2 seconds
- **Change Detection**: File modification time tracking
- **Commit Comparison**: Only sync when content changes
- **Automatic Queuing**: Seamless development workflow

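
The polling approach described above can be sketched in Python as two snapshots of file modification times that are compared on each tick. This is an illustration under assumptions, not the code in `watch.zig`; `snapshot` and `changed_paths` are hypothetical names.

```python
import os
import tempfile
from pathlib import Path

def snapshot(root: Path) -> dict:
    """Map each file's relative path to its modification time in nanoseconds."""
    return {
        p.relative_to(root).as_posix(): p.stat().st_mtime_ns
        for p in root.rglob("*")
        if p.is_file()
    }

def changed_paths(old: dict, new: dict) -> list:
    """Paths added, removed, or modified between two snapshots."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train.py").write_text("v1")
    first = snapshot(root)
    # Bump the mtime explicitly so the demo does not depend on clock resolution.
    later = first["train.py"] + 2_000_000_000
    os.utime(root / "train.py", ns=(later, later))
    (root / "data.csv").write_text("a,b\n")
    second = snapshot(root)

print(changed_paths(first, second))  # ['data.csv', 'train.py']
```

A watch loop would take a snapshot every 2 seconds and, only when `changed_paths` is non-empty, recompute the commit ID and sync if it differs from the last one. That second check is what keeps a touched but otherwise identical file from triggering a sync.
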
### `status` - System Status

Check system and worker status.

```bash
ml status
```

**Displays:**
- Worker connectivity
- Queue status
- Running jobs
- System health

### `monitor` - Remote Monitoring

Launch the TUI over SSH for real-time monitoring.

```bash
ml monitor
```

**Features:**
- **Real-time Updates**: Live experiment status
- **Interactive Interface**: Browse and manage experiments
- **SSH Integration**: Secure remote access

### `cancel` - Job Cancellation

Cancel running or queued jobs.

```bash
ml cancel <job-id>
```

**Arguments:**
- `job-id`: Job identifier from status output

### `prune` - Cleanup Management

Clean up old experiments to save space.

```bash
# Keep last N experiments
ml prune --keep 20

# Remove experiments older than N days
ml prune --older-than 30
```

**Options:**
- `--keep N`: Keep N most recent experiments
- `--older-than N`: Remove experiments older than N days

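
The two pruning rules can be expressed as a small selection function. The Python sketch below is only an illustration: the `(name, created_at)` record shape and the `select_for_pruning` helper are hypothetical, and treating the two options as alternatives rather than combinable filters is an assumption.

```python
from datetime import datetime, timedelta

def select_for_pruning(experiments, now, keep=None, older_than_days=None):
    """Return the names to remove. `experiments` is a list of (name, created_at)."""
    newest_first = sorted(experiments, key=lambda e: e[1], reverse=True)
    if keep is not None:
        # Everything after the N most recent entries is removed.
        return [name for name, _ in newest_first[keep:]]
    if older_than_days is not None:
        cutoff = now - timedelta(days=older_than_days)
        return [name for name, created in newest_first if created < cutoff]
    return []

now = datetime(2024, 6, 30)
experiments = [
    ("exp-a", datetime(2024, 6, 29)),
    ("exp-b", datetime(2024, 6, 1)),
    ("exp-c", datetime(2024, 4, 15)),
]
print(select_for_pruning(experiments, now, keep=2))             # ['exp-c']
print(select_for_pruning(experiments, now, older_than_days=7))  # ['exp-b', 'exp-c']
```
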
## Architecture

### Core Components

```
cli/src/
├── commands/           # Command implementations
│   ├── init.zig        # Configuration setup
│   ├── sync.zig        # Project synchronization
│   ├── queue.zig       # Job management
│   ├── watch.zig       # Auto-sync monitoring
│   ├── status.zig      # System status
│   ├── monitor.zig     # Remote monitoring
│   ├── cancel.zig      # Job cancellation
│   └── prune.zig       # Cleanup operations
├── config.zig          # Configuration management
├── errors.zig          # Error handling
├── net/                # Network utilities
│   └── ws.zig          # WebSocket client
└── utils/              # Utility functions
    ├── crypto.zig      # Hashing and encryption
    ├── storage.zig     # Content-addressed storage
    └── rsync.zig       # File synchronization
```

### Performance Features

#### Content-Addressed Storage
- **Deduplication**: Identical files shared across experiments
- **Hash-based Storage**: Files stored by SHA256 hash
- **Space Efficiency**: Reduces storage by up to 90%, depending on how much content experiments share

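
A content-addressed store can be sketched in a few lines of Python. This is an illustration of the technique, not the layout used by `storage.zig`: the `ContentStore` class and the two-level directory layout (first two hex characters as a subdirectory, as in Git-style object stores) are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path

class ContentStore:
    """Store blobs under their SHA-256 digest; identical content is stored once."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, digest: str) -> Path:
        # Two-level layout (ab/cdef...) is an assumption, not the CLI's format.
        return self.root / digest[:2] / digest[2:]

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        target = self._path(digest)
        if not target.exists():          # already stored: nothing to write
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._path(digest).read_bytes()

with tempfile.TemporaryDirectory() as tmp:
    store = ContentStore(Path(tmp) / "objects")
    a = store.put(b"weights-v1")
    b = store.put(b"weights-v1")         # same bytes from a second experiment
    c = store.put(b"weights-v2")
    blob_count = sum(1 for p in store.root.rglob("*") if p.is_file())
    roundtrip = store.get(a)

print(a == b, blob_count)  # True 2
```

Three `put` calls leave two blobs on disk, which is where the space savings come from when experiments share files.
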
#### SHA256 Commit IDs
- **Reliable Detection**: Cryptographic change detection
- **Collision Resistance**: Accidental collisions are computationally infeasible, so IDs are unique in practice
- **Fast Computation**: Optimized for large directories

#### WebSocket Protocol
- **Low Latency**: Real-time communication
- **Binary Protocol**: Efficient message format
- **Connection Pooling**: Reused connections

#### Memory Management
- **Arena Allocators**: Efficient memory allocation
- **Zero-copy Operations**: Avoids redundant buffer copies
- **Resource Cleanup**: Automatic resource management

### Security Features

#### Authentication
- **API Key Hashing**: Secure token storage
- **SHA256 Hashes**: Irreversible token protection
- **Config Validation**: Input sanitization

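
The idea behind API key hashing is that only a one-way digest of the key needs to be stored or compared. The Python sketch below illustrates this with SHA-256 and a constant-time comparison; the helper names are hypothetical, and where exactly the CLI and server hash or compare keys is not specified here.

```python
import hashlib
import hmac

def hash_api_key(api_key: str) -> str:
    """One-way SHA-256 digest of the key, as a hex string."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()

def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time to the stored hash."""
    return hmac.compare_digest(hash_api_key(presented), stored_hash)

stored = hash_api_key("your-api-key")
ok = verify_api_key("your-api-key", stored)
bad = verify_api_key("wrong-key", stored)
print(ok, bad)  # True False
```
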
#### Secure Communication
- **SSH Integration**: Encrypted file transfers
- **WebSocket Security**: TLS-protected communication
- **Input Validation**: Comprehensive argument checking

#### Error Handling
- **Secure Reporting**: No sensitive information leakage
- **Graceful Degradation**: Safe error recovery
- **Audit Logging**: Operation tracking

## Advanced Usage

### Workflow Integration

#### Development Workflow
```bash
# 1. Initial sync and queue
ml sync ./project --name "dev" --queue

# 2. Auto-sync during development
ml watch ./project --name "dev" --queue

# 3. Monitor progress
ml status
```

#### Batch Processing
```bash
# Process multiple experiments
for dir in experiments/*/; do
    ml sync "$dir" --queue
done
```

#### Priority Management
```bash
# High priority experiment
ml sync ./urgent --priority 10 --queue

# Background processing
ml sync ./background --priority 1 --queue
```

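
When batch submissions are driven from a script rather than a shell loop, it can help to validate priorities before invoking the CLI. The Python sketch below builds `ml sync` argument lists from the flags documented above; `sync_command` is a hypothetical helper, and the commands are only printed, not executed.

```python
import shlex

def sync_command(path, name=None, priority=5, queue=False):
    """Build the argument list for `ml sync` using the documented flags."""
    if not 1 <= priority <= 10:
        raise ValueError("priority must be between 1 and 10")
    argv = ["ml", "sync", path, "--priority", str(priority)]
    if name is not None:
        argv += ["--name", name]
    if queue:
        argv.append("--queue")
    return argv

jobs = [("./urgent", 10), ("./background", 1)]
commands = [sync_command(path, priority=p, queue=True) for path, p in jobs]
for argv in commands:
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` to run the sync.
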
### Configuration Management

#### Multiple Workers
```toml
# ~/.ml/config.toml
worker_host = "worker.local"
worker_user = "mluser"
worker_base = "/data/ml-experiments"
worker_port = 22
api_key = "your-api-key"
```

#### Security Settings
```bash
# Set restrictive permissions
chmod 600 ~/.ml/config.toml

# Verify configuration
ml status
```

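
Because the config file holds an API key, it is worth confirming that the permissions actually took effect. The Python sketch below checks that group and others have no access to a file on a POSIX system; `is_private` is a hypothetical helper and the demo runs against a temporary file rather than the real config.

```python
import os
import stat
import tempfile

def is_private(path: str) -> bool:
    """True if group and others have no permissions on the file (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

# Demo on a temporary file so the real config is never touched.
with tempfile.NamedTemporaryFile(delete=False) as f:
    cfg_path = f.name
os.chmod(cfg_path, 0o644)
loose = is_private(cfg_path)
os.chmod(cfg_path, 0o600)
tight = is_private(cfg_path)
os.remove(cfg_path)

print(loose, tight)  # False True
```
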
## Troubleshooting

### Common Issues

#### Build Problems
```bash
# Check Zig installation
zig version

# Clean build
cd cli && make clean && make build
```

#### Connection Issues
```bash
# Test SSH connectivity
# ($worker_port, $worker_user, $worker_host stand for the values in ~/.ml/config.toml)
ssh -p "$worker_port" "$worker_user@$worker_host"

# Verify configuration
cat ~/.ml/config.toml
```

#### Sync Failures
```bash
# Check rsync
rsync --version

# Manual sync test (uses the same SSH port as the config)
rsync -avz -e "ssh -p $worker_port" ./test/ "$worker_user@$worker_host:/tmp/"
```

#### Performance Issues
```bash
# Monitor resource usage (Linux; -x matches the exact process name "ml")
top -p "$(pgrep -d, -x ml)"

# Check disk space (run on the worker)
df -h "$worker_base"
```

### Debug Mode

Enable verbose logging:
```bash
# Environment variable
export ML_DEBUG=1
ml sync ./project

# Or use debug build
cd cli && make debug
```

## Performance Benchmarks

### File Operations
- **Sync Speed**: 100MB/s+ (network limited)
- **Hash Computation**: 500MB/s+ (CPU limited)
- **Deduplication**: Up to 90% space savings (workload dependent)

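
Hash throughput depends heavily on the CPU, so it can be useful to measure a baseline on your own machine. The Python sketch below times SHA-256 over an in-memory buffer using `hashlib`; it measures Python's hashing backend, not the Zig CLI, so treat the result only as a rough reference point. The function name and buffer sizes are arbitrary choices.

```python
import hashlib
import time

def sha256_throughput_mb_s(total_mb: int = 64, chunk_mb: int = 1) -> float:
    """Hash `total_mb` MiB of zero bytes and return the rate in MiB per second."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    h = hashlib.sha256()
    start = time.perf_counter()
    for _ in range(total_mb // chunk_mb):
        h.update(chunk)
    elapsed = time.perf_counter() - start
    return total_mb / elapsed

rate = sha256_throughput_mb_s()
print(f"SHA-256: {rate:.0f} MiB/s")
```
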
### Memory Usage
- **Base Memory**: ~10MB
- **Large Projects**: ~50MB (1GB+ projects)
- **Memory Efficiency**: Constant per-file overhead

### Network Performance
- **WebSocket Latency**: <10ms (local network)
- **Connection Setup**: <100ms
- **Throughput**: Network limited

## Contributing

### Development Setup
```bash
cd cli
zig build-exe src/main.zig
```

### Testing
```bash
# Run tests (zig test takes a root source file, not a directory)
cd cli && zig test src/main.zig

# Integration tests (substitute a test file from tests/)
zig test tests/<test-file>.zig
```

### Code Style
- Follow Zig style guidelines
- Use explicit error handling
- Document public APIs
- Add comprehensive tests

---

**For more information, see the [CLI Reference](/cli-reference/) and [Architecture](/architecture/) pages.**