| layout | title | permalink | nav_order |
|---|---|---|---|
| page | Zig CLI Guide | /zig-cli/ | 3 |
Zig CLI Guide
High-performance command-line interface for ML experiment management, written in Zig for maximum speed and efficiency.
Overview
The Zig CLI (ml) is the primary interface for managing ML experiments in your homelab. Built with Zig, it provides exceptional performance for file operations, network communication, and experiment management.
Installation
Pre-built Binaries (Recommended)
Download from GitHub Releases:
# Download for your platform
curl -LO https://github.com/jfraeys/fetch_ml/releases/latest/download/ml-<platform>.tar.gz
# Extract
tar -xzf ml-<platform>.tar.gz
# Install
chmod +x ml-<platform>
sudo mv ml-<platform> /usr/local/bin/ml
# Verify
ml --help
Platforms:
- ml-linux-x86_64.tar.gz: Linux (fully static, zero dependencies)
- ml-macos-x86_64.tar.gz: macOS Intel
- ml-macos-arm64.tar.gz: macOS Apple Silicon
All release binaries embed a static rsync, so they do not depend on a system rsync installation.
Build from Source
Development Build (uses system rsync):
cd cli
zig build dev
./zig-out/dev/ml-dev --help
Production Build (embedded rsync):
cd cli
# For testing: uses rsync wrapper
zig build prod
# For release with static rsync:
# 1. Place static rsync binary at src/assets/rsync_release.bin
# 2. Build
zig build prod
strip zig-out/prod/ml # Optional: reduce size
# Verify
./zig-out/prod/ml --help
ls -lh zig-out/prod/ml
See cli/src/assets/README.md for details on obtaining static rsync binaries.
Verify Installation
ml --help
ml --version # Shows build config
Quick Start
1. Initialize Configuration
ml init
2. Sync Your First Project
ml sync ./my-project --queue
3. Monitor Progress
ml status
Command Reference
init - Configuration Setup
Initialize the CLI configuration file.
ml init
Creates: ~/.ml/config.toml
Configuration Template:
worker_host = "worker.local"
worker_user = "mluser"
worker_base = "/data/ml-experiments"
worker_port = 22
api_key = "your-api-key"
sync - Project Synchronization
Sync project files to the worker with intelligent deduplication.
# Basic sync
ml sync ./project
# Sync with custom name and auto-queue
ml sync ./project --name "experiment-1" --queue
# Sync with priority
ml sync ./project --priority 8
Options:
- --name <name>: Custom experiment name
- --queue: Automatically queue after sync
- --priority N: Set priority (1-10, default 5)
Features:
- Content-Addressed Storage: Automatic deduplication
- SHA256 Commit IDs: Reliable change detection
- Incremental Transfer: Only sync changed files
- Rsync Backend: Efficient file transfer
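The CLI's exact commit ID scheme is not specified in this guide. As a minimal sketch of the general idea, assuming a commit ID is a SHA256 digest over the sorted relative paths and per-file content hashes, the following shows how such an ID stays stable for unchanged content and changes when any file is edited. `file_sha256` and `tree_commit_id` are hypothetical helpers for illustration only.

```python
import hashlib
import os
import tempfile

def file_sha256(path: str) -> str:
    """Hash a file in chunks so large files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def tree_commit_id(root: str) -> str:
    """Assumed scheme: hash the sorted (relative path, file hash) pairs."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            entries.append((os.path.relpath(full, root), file_sha256(full)))
    digest = hashlib.sha256()
    for rel_path, file_hash in sorted(entries):
        digest.update(rel_path.encode("utf-8") + b"\0" + file_hash.encode("ascii") + b"\n")
    return digest.hexdigest()

with tempfile.TemporaryDirectory() as project:
    script = os.path.join(project, "train.py")
    with open(script, "w") as handle:
        handle.write("print('v1')\n")
    before = tree_commit_id(project)
    unchanged = tree_commit_id(project)
    with open(script, "w") as handle:
        handle.write("print('v2')\n")
    after = tree_commit_id(project)

print(before == unchanged)  # identical content gives the same ID
print(before != after)      # edited content gives a new ID
```

Sorting the entries makes the result independent of directory traversal order, which is what lets two machines agree on the ID for the same content.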
queue - Job Management
Queue experiments for execution on the worker.
# Queue with commit ID
ml queue my-job --commit abc123def456
# Queue with priority
ml queue my-job --commit abc123 --priority 8
Options:
- --commit <id>: Commit ID from sync output
- --priority N: Execution priority (1-10)
Features:
- WebSocket Communication: Real-time job submission
- Priority Queuing: Higher priority jobs run first
- API Authentication: Secure job submission
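To make "higher priority jobs run first" concrete, here is a small sketch of a priority queue built on Python's `heapq`. It is not the worker's actual scheduler. The `JobQueue` class is hypothetical, and both the default priority of 5 (borrowed from the `sync` options) and first-in-first-out ordering within a priority level are assumptions.

```python
import heapq
import itertools

class JobQueue:
    """Illustrative queue: higher priority first, FIFO within a priority."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()

    def push(self, name: str, commit: str, priority: int = 5) -> None:
        if not 1 <= priority <= 10:
            raise ValueError("priority must be between 1 and 10")
        # heapq is a min-heap, so negate the priority; the counter breaks ties.
        heapq.heappush(self._heap, (-priority, next(self._counter), name, commit))

    def pop(self) -> tuple[str, str]:
        _neg_priority, _order, name, commit = heapq.heappop(self._heap)
        return name, commit

queue = JobQueue()
queue.push("baseline", "abc123")            # assumed default priority 5
queue.push("urgent", "def456", priority=8)
queue.push("baseline-2", "abc789")          # also priority 5, queued later

order = [queue.pop()[0] for _ in range(3)]
print(order)  # ['urgent', 'baseline', 'baseline-2']
```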
watch - Auto-Sync Monitoring
Monitor directories for changes and auto-sync.
# Watch for changes
ml watch ./project
# Watch and auto-queue on changes
ml watch ./project --name "dev-exp" --queue
Options:
- --name <name>: Custom experiment name
- --queue: Auto-queue on changes
- --priority N: Set priority for queued jobs
Features:
- Near-real-time Monitoring: Changes are picked up within the 2-second polling interval
- Change Detection: File modification time tracking
- Commit Comparison: Only sync when content changes
- Automatic Queuing: Seamless development workflow
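The polling approach listed above can be sketched as follows. This is an illustration of modification-time tracking in general, not the CLI's code: `snapshot_mtimes` and `changed_paths` are hypothetical helpers, and the edit is simulated by bumping a file's mtime.

```python
import os
import tempfile

def snapshot_mtimes(root: str) -> dict[str, int]:
    """Map each file's relative path to its modification time in nanoseconds."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            mtimes[os.path.relpath(full, root)] = os.stat(full).st_mtime_ns
    return mtimes

def changed_paths(old: dict[str, int], new: dict[str, int]) -> set[str]:
    """Paths that were added, removed, or whose mtime differs."""
    return {path for path in old.keys() | new.keys() if old.get(path) != new.get(path)}

with tempfile.TemporaryDirectory() as project:
    target = os.path.join(project, "train.py")
    with open(target, "w") as handle:
        handle.write("print('v1')\n")
    first = snapshot_mtimes(project)

    # Simulate an edit by moving the mtime two seconds forward.
    info = os.stat(target)
    os.utime(target, ns=(info.st_atime_ns, info.st_mtime_ns + 2_000_000_000))
    second = snapshot_mtimes(project)

changed = changed_paths(first, second)
print(changed)  # {'train.py'}
```

A watch loop built on this would take a snapshot, sleep for two seconds, take another, and only when the change set is non-empty go on to compare commit IDs before syncing.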
status - System Status
Check system and worker status.
ml status
Displays:
- Worker connectivity
- Queue status
- Running jobs
- System health
monitor - Remote Monitoring
Launch the TUI over SSH for real-time monitoring.
ml monitor
Features:
- Real-time Updates: Live experiment status
- Interactive Interface: Browse and manage experiments
- SSH Integration: Secure remote access
cancel - Job Cancellation
Cancel running or queued jobs.
ml cancel job-id
Options:
job-id: Job identifier from status output
prune - Cleanup Management
Clean up old experiments to save space.
# Keep last N experiments
ml prune --keep 20
# Remove experiments older than N days
ml prune --older-than 30
Options:
- --keep N: Keep N most recent experiments
- --older-than N: Remove experiments older than N days
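The two selection rules can be illustrated with a short sketch. How `ml prune` stores experiment metadata is not described here, so the `(name, created_at)` records and the `select_for_pruning` helper below are hypothetical; each rule is shown on its own.

```python
from datetime import datetime, timedelta

def select_for_pruning(experiments, now, keep=None, older_than_days=None):
    """Return names to remove. `experiments` is a list of (name, created_at)."""
    newest_first = sorted(experiments, key=lambda item: item[1], reverse=True)
    doomed = []
    if keep is not None:
        # Everything after the N most recent experiments is removed.
        doomed += [name for name, _created in newest_first[keep:]]
    if older_than_days is not None:
        cutoff = now - timedelta(days=older_than_days)
        doomed += [name for name, created in newest_first
                   if created < cutoff and name not in doomed]
    return doomed

now = datetime(2024, 6, 30)
experiments = [
    ("exp-a", datetime(2024, 6, 29)),
    ("exp-b", datetime(2024, 6, 1)),
    ("exp-c", datetime(2024, 4, 15)),
]

print(select_for_pruning(experiments, now, keep=2))              # ['exp-c']
print(select_for_pruning(experiments, now, older_than_days=30))  # ['exp-c']
```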
Architecture
Core Components
cli/src/
├── commands/ # Command implementations
│ ├── init.zig # Configuration setup
│ ├── sync.zig # Project synchronization
│ ├── queue.zig # Job management
│ ├── watch.zig # Auto-sync monitoring
│ ├── status.zig # System status
│ ├── monitor.zig # Remote monitoring
│ ├── cancel.zig # Job cancellation
│ └── prune.zig # Cleanup operations
├── config.zig # Configuration management
├── errors.zig # Error handling
├── net/ # Network utilities
│ └── ws.zig # WebSocket client
└── utils/ # Utility functions
├── crypto.zig # Hashing and encryption
├── storage.zig # Content-addressed storage
└── rsync.zig # File synchronization
Performance Features
Content-Addressed Storage
- Deduplication: Identical files shared across experiments
- Hash-based Storage: Files stored by SHA256 hash
- Space Efficiency: Reduces storage by up to 90%
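A minimal sketch of hash-based storage, assuming each blob is written once under its SHA256 hex digest, shows where the deduplication comes from. The `ContentStore` class and the flat directory layout are hypothetical and are not taken from `storage.zig`.

```python
import hashlib
import os
import tempfile

class ContentStore:
    """Illustrative store: each blob is saved once under its SHA256 hex digest."""

    def __init__(self, root: str) -> None:
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        path = os.path.join(self.root, digest)
        if not os.path.exists(path):  # identical content is written only once
            with open(path, "wb") as handle:
                handle.write(data)
        return digest

    def get(self, digest: str) -> bytes:
        with open(os.path.join(self.root, digest), "rb") as handle:
            return handle.read()

with tempfile.TemporaryDirectory() as root:
    store = ContentStore(root)
    # Two experiments share the same dataset file and differ in one script.
    exp1 = {"data.csv": store.put(b"x,y\n1,2\n"), "train.py": store.put(b"lr = 0.1\n")}
    exp2 = {"data.csv": store.put(b"x,y\n1,2\n"), "train.py": store.put(b"lr = 0.01\n")}
    stored_blobs = len(os.listdir(root))
    roundtrip = store.get(exp2["train.py"])

print(stored_blobs)  # 3 blobs on disk for 4 logical files
```

Each experiment then only needs a manifest mapping file names to digests, so the savings grow with the amount of content experiments have in common.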
SHA256 Commit IDs
- Reliable Detection: Cryptographic change detection
- Collision Resistance: Collisions are computationally infeasible, so identifiers are unique in practice
- Fast Computation: Optimized for large directories
WebSocket Protocol
- Low Latency: Real-time communication
- Binary Protocol: Efficient message format
- Connection Pooling: Reused connections
Memory Management
- Arena Allocators: Efficient memory allocation
- Zero-copy Operations: Minimized memory usage
- Resource Cleanup: Automatic resource management
Security Features
Authentication
- API Key Hashing: Secure token storage
- SHA256 Hashes: One-way hashing, so a stored hash does not reveal the key (assuming a high-entropy key)
- Config Validation: Input sanitization
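As a hedged illustration of the hashing idea, the sketch below stores only the SHA256 digest of a key and verifies a presented key against it. `hash_api_key` and `verify_api_key` are hypothetical names, and whether the CLI or server compares hashes exactly this way is an assumption.

```python
import hashlib
import hmac

def hash_api_key(api_key: str) -> str:
    """Return the SHA256 hex digest of an API key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()

def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Compare in constant time so timing does not leak how much matched."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)

stored = hash_api_key("your-api-key")
print(len(stored))                             # 64 hex characters
print(verify_api_key("your-api-key", stored))  # True
print(verify_api_key("wrong-key", stored))     # False
```

A plain SHA256 hash is only a reasonable choice for long random keys; short or guessable secrets would call for a dedicated password hashing function instead.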
Secure Communication
- SSH Integration: Encrypted file transfers
- WebSocket Security: TLS-protected communication
- Input Validation: Comprehensive argument checking
Error Handling
- Secure Reporting: No sensitive information leakage
- Graceful Degradation: Safe error recovery
- Audit Logging: Operation tracking
Advanced Usage
Workflow Integration
Development Workflow
# 1. Initial sync and queue
ml sync ./project --name "dev" --queue
# 2. Auto-sync during development
ml watch ./project --name "dev" --queue
# 3. Monitor progress
ml status
Batch Processing
# Process multiple experiments
for dir in experiments/*/; do
ml sync "$dir" --queue
done
Priority Management
# High priority experiment
ml sync ./urgent --priority 10 --queue
# Background processing
ml sync ./background --priority 1 --queue
Configuration Management
Multiple Workers
# ~/.ml/config.toml
worker_host = "worker.local"
worker_user = "mluser"
worker_base = "/data/ml-experiments"
worker_port = 22
api_key = "your-api-key"
Security Settings
# Set restrictive permissions
chmod 600 ~/.ml/config.toml
# Verify configuration
ml status
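A script can confirm that the permissions took effect before relying on the file. The sketch below is a POSIX-only illustration using the standard `stat` module; `is_private` is a hypothetical helper, and a temporary file stands in for `~/.ml/config.toml`.

```python
import os
import stat
import tempfile

def is_private(path: str) -> bool:
    """True if the file grants no permissions to group or others (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.toml")
    with open(config_path, "w") as handle:
        handle.write('api_key = "your-api-key"\n')

    os.chmod(config_path, 0o644)
    loose = is_private(config_path)
    os.chmod(config_path, 0o600)
    tight = is_private(config_path)

print(loose, tight)  # False True
```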
Troubleshooting
Common Issues
Build Problems
# Check Zig installation
zig version
# Clean build
cd cli && make clean && make build
Connection Issues
# Test SSH connectivity (substitute the values from ~/.ml/config.toml)
ssh -p "$worker_port" "$worker_user@$worker_host"
# Verify configuration
cat ~/.ml/config.toml
Sync Failures
# Check rsync
rsync --version
# Manual sync test (-e passes the configured SSH port to rsync)
rsync -avz -e "ssh -p $worker_port" ./test/ "$worker_user@$worker_host:/tmp/"
Performance Issues
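# Monitor resource usage (Linux; -x matches only processes named exactly "ml")
top -p "$(pgrep -d, -x ml)"
# Check disk space on the worker
ssh -p "$worker_port" "$worker_user@$worker_host" df -h "$worker_base"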
Debug Mode
Enable verbose logging:
# Environment variable
export ML_DEBUG=1
ml sync ./project
# Or use debug build
cd cli && make debug
Performance Benchmarks
File Operations
- Sync Speed: 100 MB/s+ (network limited)
- Hash Computation: 500 MB/s+ (CPU limited)
- Deduplication: up to 90% space savings, depending on how much content experiments share
Memory Usage
- Base Memory: ~10 MB
- Large Projects: ~50 MB (1 GB+ projects)
- Memory Efficiency: Constant per-file overhead
Network Performance
- WebSocket Latency: <10 ms (local network)
- Connection Setup: <100 ms
- Throughput: Network limited
Contributing
Development Setup
cd cli
zig build-exe src/main.zig
Testing
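# Run unit tests (zig test takes a root source file, not a directory)
cd cli && zig test src/main.zig
# Integration tests: pass a specific test file
zig test tests/<test-file>.zig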
Code Style
- Follow Zig style guidelines
- Use explicit error handling
- Document public APIs
- Add comprehensive tests
For more information, see the CLI Reference and Architecture pages.