# Configuration Reference

## Overview

This document is a reference for FetchML configuration. It covers the API server, worker, and CLI config files, the options they accept, the permission system, and the supported environment variables.

## Environment Configurations

### Local Development
**File:** `configs/api/dev.yaml`

```yaml
auth:
  enabled: true
  api_keys:
    dev_user:
      hash: "CHANGE_ME_SHA256_DEV_USER_KEY"
      admin: true
      roles: ["admin"]
      permissions:
        "*": true

server:
  address: ":9101"
  tls:
    enabled: false

security:
  rate_limit:
    enabled: false
  ip_whitelist:
    - "127.0.0.1"
    - "::1"
    - "localhost"
```

### Multi-User Setup
**File:** `configs/api/multi-user.yaml`

```yaml
auth:
  enabled: true
  api_keys:
    admin_user:
      hash: "CHANGE_ME_SHA256_ADMIN_USER_KEY"
      admin: true
      roles: ["user", "admin"]
      permissions:
        read: true
        write: true
        delete: true
    
    researcher1:
      hash: "CHANGE_ME_SHA256_RESEARCHER1_KEY"
      admin: false
      roles: ["user", "researcher"]
      permissions:
        jobs:read: true
        jobs:create: true
        jobs:update: true
        jobs:delete: false
    
    analyst1:
      hash: "CHANGE_ME_SHA256_ANALYST1_KEY"
      admin: false
      roles: ["user", "analyst"]
      permissions:
        jobs:read: true
        jobs:create: false
        jobs:update: false
        jobs:delete: false
```

### Production
**File:** `configs/api/prod.yaml`

```yaml
auth:
  enabled: true
  api_keys:
    # Production users configured here

server:
  address: ":9101"
  tls:
    enabled: true
    cert_file: "/app/ssl/cert.pem"
    key_file: "/app/ssl/key.pem"

security:
  rate_limit:
    enabled: true
    requests_per_minute: 30
  ip_whitelist:
    - "127.0.0.1"
    - "::1"
    - "192.168.0.0/16"
    - "10.0.0.0/8"

redis:
  addr: "redis:6379"
  password: ""
  db: 0

logging:
  level: "info"
  file: "/app/logs/app.log"
  audit_log: "/app/logs/audit.log"
```

## Worker Configurations

### Production Worker
**File:** `configs/workers/worker-prod.toml`

```toml
worker_id = "worker-prod-01"
base_path = "/data/ml-experiments"
max_workers = 4

redis_addr = "localhost:6379"
redis_password = "CHANGE_ME_REDIS_PASSWORD"
redis_db = 0

host = "localhost"
user = "ml-user"
port = 22
ssh_key = "~/.ssh/id_rsa"

podman_image = "ml-training:latest"
gpu_vendor = "none"
gpu_visible_devices = []
gpu_devices = []
container_workspace = "/workspace"
container_results = "/results"
train_script = "train.py"

[resources]
max_workers = 4
desired_rps_per_worker = 2
podman_cpus = "4"
podman_memory = "16g"

[metrics]
enabled = true
listen_addr = ":9100"
```

```toml
# Production Worker (NVIDIA, UUID-based GPU selection)
worker_id = "worker-prod-01"
base_path = "/data/ml-experiments"

podman_image = "ml-training:latest"
gpu_vendor = "nvidia"
gpu_visible_device_ids = ["GPU-REPLACE_WITH_REAL_UUID"]
gpu_devices = ["/dev/dri"]
container_workspace = "/workspace"
container_results = "/results"
train_script = "train.py"
```

### Docker Worker
**File:** `configs/workers/docker.yaml`

```yaml
worker_id: "docker-worker"
base_path: "/tmp/fetchml-jobs"
train_script: "train.py"

redis_addr: "redis:6379"
redis_password: ""
redis_db: 0

local_mode: true

max_workers: 1
poll_interval_seconds: 5

podman_image: "python:3.9-slim"
container_workspace: "/workspace"
container_results: "/results"
gpu_devices: []
gpu_vendor: "none"
gpu_visible_devices: []

metrics:
  enabled: true
  listen_addr: ":9100"
metrics_flush_interval: "500ms"
```

## CLI Configuration

### User Config File
**Location:** `~/.ml/config.toml`

```toml
[server]
worker_host = "localhost"
worker_user = "appuser"
worker_base = "/app"
worker_port = 22

[auth]
api_key = "<your-api-key>"

[cli]
default_timeout = 30
verbose = false
```

### Multi-User CLI Configs

**Admin Config:** `~/.ml/config-admin.toml`
```toml
[server]
worker_host = "localhost"
worker_user = "appuser"
worker_base = "/app"
worker_port = 22

[auth]
api_key = "<admin-api-key>"
```

**Researcher Config:** `~/.ml/config-researcher.toml`
```toml
[server]
worker_host = "localhost"
worker_user = "appuser"
worker_base = "/app"
worker_port = 22

[auth]
api_key = "<researcher-api-key>"
```

**Analyst Config:** `~/.ml/config-analyst.toml`
```toml
[server]
worker_host = "localhost"
worker_user = "appuser"
worker_base = "/app"
worker_port = 22

[auth]
api_key = "<analyst-api-key>"
```
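
The per-role files above can be selected at run time through the `CLI_CONFIG` environment variable listed under Environment Variables. The helper below is a minimal sketch: `ml_config_for_role` is a hypothetical function name, and it assumes the CLI reads its config path from `CLI_CONFIG` and nothing else.

```bash
# Map a role name to its CLI config file, following the file names above.
# ml_config_for_role is a hypothetical helper, not part of FetchML.
ml_config_for_role() {
  case "$1" in
    admin|researcher|analyst)
      printf '%s\n' "$HOME/.ml/config-$1.toml" ;;
    *)
      # Any other value falls back to the default user config.
      printf '%s\n' "$HOME/.ml/config.toml" ;;
  esac
}

# Usage (assumes CLI_CONFIG is honored as the config path):
# CLI_CONFIG="$(ml_config_for_role researcher)" ./cli/zig-out/bin/ml status
```

Keeping one file per role avoids editing `api_key` by hand when switching between accounts.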

## Configuration Options

### Authentication

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `auth.enabled` | bool | false | Enable authentication |
| `auth.api_keys` | map | {} | API key configurations, keyed by user name |
| `auth.api_keys.[user].hash` | string | - | SHA256 hash of the user's API key |
| `auth.api_keys.[user].admin` | bool | false | Admin privileges |
| `auth.api_keys.[user].roles` | array | [] | User roles |
| `auth.api_keys.[user].permissions` | map | {} | User permissions |
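
To fill in a `hash` field, the API key has to be hashed before it goes into the config. The sketch below assumes the server expects the lowercase hex SHA256 digest of the raw key, with no salt and no trailing newline. That encoding is an assumption, and `hash_api_key` is a hypothetical helper name.

```bash
# Print the lowercase hex SHA256 digest of an API key.
# Assumption: FetchML compares against the plain, unsalted digest of the raw key.
hash_api_key() {
  # printf avoids the trailing newline that echo would add to the hashed input.
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
  else
    # macOS ships shasum instead of sha256sum.
    printf '%s' "$1" | shasum -a 256 | cut -d' ' -f1
  fi
}

# Hash a placeholder key; paste the printed digest into the user's `hash` field.
hash_api_key "example-dev-key"
```

If authentication fails with a digest produced this way, check whether the key was hashed with a trailing newline, which changes the result.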

### Server

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `server.address` | string | ":9101" | Server bind address |
| `server.tls.enabled` | bool | false | Enable TLS |
| `server.tls.cert_file` | string | - | TLS certificate file |
| `server.tls.key_file` | string | - | TLS private key file |

### Security

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `security.rate_limit.enabled` | bool | true | Enable rate limiting |
| `security.rate_limit.requests_per_minute` | int | 60 | Rate limit |
| `security.ip_whitelist` | array | [] | Allowed IP addresses |

### Redis

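| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `redis.url` | string | "redis://localhost:6379" | Redis connection URL |
| `redis.addr` | string | - | Redis address as `host:port` (used in the production example) |
| `redis.password` | string | - | Redis password (used in the production example) |
| `redis.db` | int | - | Redis database index (used in the production example) |
| `redis.max_connections` | int | 10 | Max Redis connections |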

### Logging

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `logging.level` | string | "info" | Log level |
| `logging.file` | string | - | Log file path |
| `logging.audit_log` | string | - | Audit log path |

## Permission System

### Permission Keys

| Permission | Description |
|------------|-------------|
| `jobs:read` | Read job information |
| `jobs:create` | Create new jobs |
| `jobs:update` | Update existing jobs |
| `jobs:delete` | Delete jobs |
| `*` | All permissions (admin only) |

### Role-Based Permissions

| Role | Default Permissions |
|------|-------------------|
| admin | All permissions |
| researcher | jobs:read, jobs:create, jobs:update |
| analyst | jobs:read |
| user | No default permissions |
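
The role defaults in this table can be expressed as a small lookup. The sketch below covers only the table: how the server combines role defaults with the explicit `permissions` map of an API key is not described here, so that part is left out. The names `role_defaults` and `role_allows` are hypothetical.

```bash
# Print the default permissions for a role as a space-separated list.
role_defaults() {
  case "$1" in
    admin)      echo "*" ;;
    researcher) echo "jobs:read jobs:create jobs:update" ;;
    analyst)    echo "jobs:read" ;;
    *)          : ;;  # "user" and unknown roles grant nothing by default
  esac
}

# role_allows ROLE PERMISSION
# Succeeds when the role's defaults contain the permission or the "*" wildcard.
role_allows() {
  local defaults
  [ -n "$2" ] || return 1          # an empty permission is never granted
  defaults="$(role_defaults "$1")"
  case " $defaults " in
    *" * "*|*" $2 "*) return 0 ;;  # quoted parts are matched literally
  esac
  return 1
}

# Usage:
# role_allows researcher jobs:update && echo "allowed"
```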

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `FETCHML_CONFIG` | - | Path to config file |
| `FETCHML_LOG_LEVEL` | "info" | Override log level |
| `CLI_CONFIG` | - | Path to CLI config file |

## Troubleshooting

### Common Configuration Issues

1. **Authentication Failures**
   - Check that each `hash` value is the SHA256 hash of the corresponding API key
   - Verify the YAML syntax of the config file
   - Ensure `auth.enabled: true` is set

2. **Connection Issues**
   - Verify server address and ports
   - Check firewall settings
   - Validate network connectivity

3. **Permission Issues**
   - Check the user's roles and permissions
   - Verify the permission key format (for example, `jobs:read`)
   - Ensure admin users have `"*": true`
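
One quick check for authentication and connection failures is to look for template placeholders that were never replaced. The example configs in this document use markers such as `CHANGE_ME_SHA256_DEV_USER_KEY`, `CHANGE_ME_REDIS_PASSWORD`, and `GPU-REPLACE_WITH_REAL_UUID`. The sketch below searches a file for those markers; `find_placeholders` is a hypothetical helper name.

```bash
# List lines in a config file that still contain template placeholders.
# Exit status follows grep: 0 when placeholders remain (and are printed),
# 1 when the file is clean.
find_placeholders() {
  grep -nE 'CHANGE_ME|REPLACE_WITH' "$1"
}

# Usage:
# find_placeholders configs/api/prod.yaml && echo "replace these values before deploying"
```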

### Configuration Validation

```bash
# Validate server configuration
go run cmd/api-server/main.go --config configs/api/dev.yaml --validate

# Test CLI configuration
./cli/zig-out/bin/ml status --debug
```