From 86f9ae5a7e0967a216da42f112795b574fef5ae4 Mon Sep 17 00:00:00 2001
From: Jeremie Fraeys
Date: Thu, 26 Feb 2026 12:04:11 -0500
Subject: [PATCH] docs(config): reorganize configuration structure and add documentation

Restructure configuration files for better organization:
- Add scheduler configuration examples (scheduler.yaml.example)
- Reorganize worker configs into subdirectories:
  - distributed/ - Multi-node cluster configurations
  - standalone/ - Single-node deployment configs
- Add environment-specific configs:
  - dev-local.yaml, docker-dev.yaml, docker-prod.yaml
  - homelab-secure.yaml, worker-prod.toml
- Add deployment configs for different security modes:
  - docker-standard.yaml, docker-hipaa.yaml, docker-dev.yaml

Add documentation:
- configs/README.md with configuration guidelines
- configs/SECURITY.md with security configuration best practices
---
 configs/README.md                             |  60 ++++++++
 configs/SECURITY.md                           | 130 ++++++++++++++++++
 configs/scheduler/scheduler.yaml.example      |  32 +++++
 configs/{workers => worker}/dev-local.yaml    |   0
 .../worker/distributed/worker.yaml.example    |  33 +++++
 configs/{workers => worker}/docker-dev.yaml   |   0
 configs/{workers => worker}/docker-prod.yaml  |   0
 configs/{workers => worker}/docker.yaml       |   0
 .../{workers => worker}/homelab-secure.yaml   |   0
 configs/worker/standalone/worker.yaml.example |  32 +++++
 configs/{workers => worker}/worker-prod.toml  |   0
 configs/workers/examples/prewarm-worker.yaml  |  27 ----
 deployments/configs/worker/docker-dev.yaml    |  31 +++++
 deployments/configs/worker/docker-hipaa.yaml  |  53 +++++++
 .../configs/worker/docker-standard.yaml       |  35 +++++
 15 files changed, 406 insertions(+), 27 deletions(-)
 create mode 100644 configs/README.md
 create mode 100644 configs/SECURITY.md
 create mode 100644 configs/scheduler/scheduler.yaml.example
 rename configs/{workers => worker}/dev-local.yaml (100%)
 create mode 100644 configs/worker/distributed/worker.yaml.example
 rename configs/{workers => worker}/docker-dev.yaml (100%)
 rename configs/{workers => worker}/docker-prod.yaml (100%)
 rename configs/{workers => worker}/docker.yaml (100%)
 rename configs/{workers => worker}/homelab-secure.yaml (100%)
 create mode 100644 configs/worker/standalone/worker.yaml.example
 rename configs/{workers => worker}/worker-prod.toml (100%)
 delete mode 100644 configs/workers/examples/prewarm-worker.yaml
 create mode 100644 deployments/configs/worker/docker-dev.yaml
 create mode 100644 deployments/configs/worker/docker-hipaa.yaml
 create mode 100644 deployments/configs/worker/docker-standard.yaml

diff --git a/configs/README.md b/configs/README.md
new file mode 100644
index 0000000..31e5042
--- /dev/null
+++ b/configs/README.md
@@ -0,0 +1,60 @@
+# fetch_ml Configuration Guide
+
+## Quick Start
+
+### Standalone Mode (Existing Behavior)
+```bash
+# Single worker, direct queue access
+go run ./cmd/worker -config configs/worker/standalone/worker.yaml
+```
+
+### Distributed Mode
+```bash
+# Terminal 1: Start scheduler
+go run ./cmd/scheduler -config configs/scheduler/scheduler.yaml
+
+# Terminal 2: Start worker
+go run ./cmd/worker -config configs/worker/distributed/worker.yaml
+```
+
+### Single-Node Mode (Zero Config)
+```bash
+# Both scheduler and worker in one process
+go run ./cmd/fetch_ml -config configs/multi-node/single-node.yaml
+```
+
+## Config Structure
+
+```
+configs/
+├── scheduler/
+│   └── scheduler.yaml          # Central scheduler configuration
+├── worker/
+│   ├── standalone/
+│   │   └── worker.yaml         # Direct queue access (Redis/SQLite)
+│   └── distributed/
+│       └── worker.yaml         # WebSocket to scheduler
+└── multi-node/
+    └── single-node.yaml        # Combined scheduler+worker
+```
+
+## Key Configuration Modes
+
+| Mode | Use Case | Backend |
+|------|----------|---------|
+| `standalone` | Single machine, existing behavior | Redis/SQLite/Filesystem |
+| `distributed` | Multiple workers, central scheduler | WebSocket to scheduler |
+| `both` | Quick testing, single process | In-process scheduler |
+
+## Worker Mode Selection
+
+Set `worker.mode` to switch between implementations:
+
+```yaml
+worker:
+  mode: "standalone"   # Uses Redis/SQLite queue.Backend
+  # OR
+  mode: "distributed"  # Uses SchedulerBackend over WebSocket
+```
+
+The worker code is unchanged — only the backend implementation changes.
diff --git a/configs/SECURITY.md b/configs/SECURITY.md
new file mode 100644
index 0000000..7138ff1
--- /dev/null
+++ b/configs/SECURITY.md
@@ -0,0 +1,130 @@
+# Security Guidelines for fetch_ml Distributed Mode
+
+## Token Management
+
+### Quick Start (Recommended)
+
+```bash
+# 1. Generate config with tokens
+scheduler -init -config scheduler.yaml
+
+# 2. Or generate a single token
+scheduler -generate-token
+```
+
+### Generating Tokens
+
+**Option 1: Initialize full config (recommended)**
+```bash
+# Generate config with 3 worker tokens
+scheduler -init -config /etc/fetch_ml/scheduler.yaml
+
+# Generate with more tokens
+scheduler -init -config /etc/fetch_ml/scheduler.yaml -tokens 5
+```
+
+**Option 2: Generate single token**
+```bash
+# Generate one token
+scheduler -generate-token
+# Output: wkr_abc123...
+```
+
+**Option 3: Using OpenSSL**
+```bash
+openssl rand -hex 32
+```
+
+### Token Storage
+
+- **NEVER commit tokens to git** — config files with real tokens are gitignored
+- Store tokens in environment variables or secure secret management
+- Use `.env` files locally (already gitignored)
+- Rotate tokens periodically
+
+### Config File Security
+
+```
+configs/
+├── scheduler/scheduler.yaml          # ⛔ NEVER commit with real tokens
+├── scheduler/scheduler.yaml.example  # ✅ Safe to commit (placeholders)
+└── worker/distributed/worker.yaml    # ⛔ NEVER commit with real tokens
+```
+
+All `*.yaml` files in `configs/` subdirectories are gitignored by default.
+
+### Distribution Workflow
+
+```bash
+# On scheduler host:
+$ scheduler -init -config /etc/fetch_ml/scheduler.yaml
+Config generated: /etc/fetch_ml/scheduler.yaml
+
+Generated 3 worker tokens. Copy the appropriate token to each worker's config.
+
+=== Generated Worker Tokens ===
+Copy these to your worker configs:
+
+Worker: worker-01
+Token: wkr_abc123...
+
+Worker: worker-02
+Token: wkr_def456...
+
+# On each worker host - copy the appropriate token:
+$ cat > /etc/fetch_ml/worker.yaml <