Docker Compose Deployments
This directory contains Docker Compose configurations for different deployment environments.
Environment Configurations
Development (docker-compose.dev.yml)
- Full development stack with monitoring
- Includes: API, Worker, Redis, MinIO (snapshots), Prometheus, Grafana, Loki, Promtail
- Optimized for local development and testing
- Usage:
docker-compose -f deployments/docker-compose.dev.yml up -d
Homelab - Secure (docker-compose.homelab-secure.yml)
- Secure homelab deployment with authentication and a Caddy reverse proxy
- TLS is terminated at the reverse proxy (Approach A)
- Includes: API, Redis (password protected), Caddy reverse proxy
- Usage:
docker-compose -f deployments/docker-compose.homelab-secure.yml up -d
Production (docker-compose.prod.yml)
- Production deployment configuration
- Optimized for performance and security
- External services assumed (Redis, monitoring)
- Usage:
docker-compose -f deployments/docker-compose.prod.yml up -d
Note: docker-compose.prod.yml is a reproducible staging/testing harness. Real production deployments do not require Docker; you can run the Go services directly (systemd) and use Caddy for TLS/WSS termination.
TLS / WSS Policy
- The Zig CLI currently supports ws:// only (native wss:// is not implemented).
- Production deployments terminate TLS/WSS at a reverse proxy (Caddy in docker-compose.prod.yml) and keep the API server on internal ws://.
- Homelab deployments terminate TLS/WSS at a reverse proxy (Caddy) and keep the API server on internal ws://.
- Health checks in compose files should use http://localhost:9101/health when server.tls.enabled: false.
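To illustrate the health check endpoint mentioned in this policy, here is a minimal sketch of a readiness wait loop. Only the URL comes from this README; the function name wait_for_health, the retry count, and the delay are assumptions made for illustration. It also assumes curl is installed and that the API server runs with TLS disabled.

```shell
#!/bin/sh
# Hypothetical helper: poll the API health endpoint until it answers.
# The default URL is the one this README gives for server.tls.enabled: false.
wait_for_health() {
    url="${1:-http://localhost:9101/health}"
    attempts="${2:-30}"   # assumed retry budget
    delay="${3:-2}"       # assumed pause between attempts, in seconds

    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl return non-zero on HTTP error statuses.
        if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
            echo "healthy after $i attempt(s): $url"
            return 0
        fi
        # Pause only when another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done

    echo "not healthy after $attempts attempt(s): $url" >&2
    return 1
}
```

Calling wait_for_health with no arguments after bringing a stack up blocks until the endpoint responds or the retry budget runs out; a non-zero return can be used to fail a deploy script early.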
Required Volume Mounts
- base_path (experiments) must be writable by the API server.
- data_dir should be mounted if you want snapshot/dataset integrity validation via ml validate.
For the default configs:
- base_path: /data/experiments (dev/homelab configs) or /app/data/experiments (prod configs)
- data_dir: /data/active
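The mount requirements can be checked before starting the API server. The following is a sketch under stated assumptions: check_mounts is a hypothetical helper, not part of this repository, and its defaults are the dev/homelab paths listed here. The writability test only reflects the user running the script, so it is meaningful only when run as the same user as the API server (for example inside the container); root will pass it regardless of ownership.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the two mounts described in this README.
# Returns 1 if base_path is unusable; a missing data_dir only warns,
# because that mount is optional.
check_mounts() {
    base_path="${1:-/data/experiments}"
    data_dir="${2:-/data/active}"

    if [ ! -d "$base_path" ] || [ ! -w "$base_path" ]; then
        echo "error: base_path is missing or not writable: $base_path" >&2
        return 1
    fi

    if [ ! -d "$data_dir" ]; then
        echo "warning: data_dir is not mounted, ml validate integrity checks will be unavailable: $data_dir" >&2
    fi

    return 0
}
```

For the prod configs, the first argument would be /app/data/experiments instead of the default.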
Quick Start
# Development (most common)
docker-compose -f deployments/docker-compose.dev.yml up -d
# Check status
docker-compose -f deployments/docker-compose.dev.yml ps
# View logs
docker-compose -f deployments/docker-compose.dev.yml logs -f api-server
# Stop services
docker-compose -f deployments/docker-compose.dev.yml down
Dev: MinIO-backed snapshots (smoke test)
The dev compose file provisions a MinIO bucket and uploads a small example snapshot object at:
s3://fetchml-snapshots/snapshots/snap-1.tar.gz
To queue a task that forces the worker to pull the snapshot from MinIO:
- Start the dev stack:
  docker-compose -f deployments/docker-compose.dev.yml up -d
- Read the snapshot_sha256 printed by the init job:
  docker-compose -f deployments/docker-compose.dev.yml logs minio-init
- Queue a job using the snapshot fields:
  ml queue <job-name> --snapshot-id snap-1 --snapshot-sha256 <snapshot_sha256>
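Copying the digest by hand can be avoided with a small filter. This is only a sketch: the exact format of the minio-init log line is not documented here, so the hypothetical extract_sha256 helper assumes nothing more than that the digest appears as a 64-character lowercase hex string (the length of any SHA-256 digest) and takes the last such token it sees.

```shell
#!/bin/sh
# Hypothetical helper: print the last 64-character lowercase hex token on stdin.
# Assumption: the init job prints the snapshot digest in lowercase hex.
extract_sha256() {
    # Turn every run of non-hex characters into a newline, then keep
    # only the tokens that are exactly 64 hex characters long.
    tr -cs '0-9a-f' '\n' | grep -E '^[0-9a-f]{64}$' | tail -n 1
}

# Usage sketch (run from the project root with the dev stack up):
#   sha=$(docker-compose -f deployments/docker-compose.dev.yml logs minio-init | extract_sha256)
#   [ -n "$sha" ] && ml queue <job-name> --snapshot-id snap-1 --snapshot-sha256 "$sha"
```

If the output is empty, no digest was found in the logs and the job should not be queued.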
Smoke tests
- make dev-smoke runs the development stack smoke test.
- make prod-smoke runs a Docker-based staging smoke test for the production stack, using a localhost-only Caddy configuration.

Note: ml queue by itself will generate a random commit ID. For full provenance enforcement (manifest + dependency manifest), use ml sync ./your-project --queue so the server has real code + dependency files.

Examples:
ml queue train-mnist --priority 3 --snapshot-id snap-1 --snapshot-sha256 <snapshot_sha256>
ml queue train-a train-b train-c --priority 5 --snapshot-id snap-1 --snapshot-sha256 <snapshot_sha256>
Environment Variables
Create a .env file in the project root:
# Grafana
GRAFANA_ADMIN_PASSWORD=your_secure_password
# API Configuration
LOG_LEVEL=info
# TLS (for secure deployments)
TLS_CERT_PATH=/app/ssl/cert.pem
TLS_KEY_PATH=/app/ssl/key.pem
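One way to avoid committing a placeholder password is to generate the file. The sketch below writes the same variables shown in this section with a random Grafana password; write_env_file is a hypothetical helper, and the use of /dev/urandom and owner-only file permissions are choices made for this example rather than project requirements.

```shell
#!/bin/sh
# Hypothetical helper: write a starter .env with a random Grafana password.
# Refuses to overwrite an existing file.
write_env_file() {
    target="${1:-.env}"

    if [ -e "$target" ]; then
        echo "refusing to overwrite existing file: $target" >&2
        return 1
    fi

    # 16 random bytes rendered as 32 lowercase hex characters.
    password=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')

    (
        umask 077   # create the file readable by its owner only
        cat > "$target" <<EOF
# Grafana
GRAFANA_ADMIN_PASSWORD=$password
# API Configuration
LOG_LEVEL=info
# TLS (for secure deployments)
TLS_CERT_PATH=/app/ssl/cert.pem
TLS_KEY_PATH=/app/ssl/key.pem
EOF
    )
}
```

Running write_env_file from the project root creates .env there; pass a path to write it elsewhere.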
Service Ports
| Service | Development | Homelab | Production |
|---|---|---|---|
| API Server | 9101 | 9101 | 9101 |
| Redis | 6379 | 6379 | - |
| Prometheus | 9090 | - | - |
| Grafana | 3000 | - | - |
| Loki | 3100 | - | - |
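The port table can double as a quick connectivity check. This sketch encodes the table as a lookup and probes each port on localhost; ports_for and check_ports are hypothetical helpers, and the check assumes that nc (netcat) with -z support is installed and that the compose file publishes these ports on the host, which this README does not state for every service.

```shell
#!/bin/sh
# Ports from the Service Ports table, keyed by environment name.
ports_for() {
    case "$1" in
        development) echo "9101 6379 9090 3000 3100" ;;
        homelab)     echo "9101 6379" ;;
        production)  echo "9101" ;;
        *) echo "unknown environment: $1" >&2; return 1 ;;
    esac
}

# Report which of the expected ports accept TCP connections on localhost.
check_ports() {
    ports=$(ports_for "$1") || return 1
    for port in $ports; do
        # -z: connect without sending data, -w 2: two-second timeout.
        if nc -z -w 2 localhost "$port" >/dev/null 2>&1; then
            echo "$port open"
        else
            echo "$port closed"
        fi
    done
}
```

For example, check_ports development prints one line per port in the Development column.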
Monitoring
- Development: Full monitoring stack included
- Homelab: Basic monitoring (configurable)
- Production: External monitoring assumed
Security Notes
- If you need HTTPS externally, terminate TLS at a reverse proxy.
- API keys should be managed via environment variables
- Database credentials should use secrets management in production