Jeremie Fraeys 385d2cf386 docs: add comprehensive documentation with MkDocs site

- Add complete API documentation and architecture guides
- Include quick start, installation, and deployment guides
- Add troubleshooting and security documentation
- Include CLI reference and configuration schema docs
- Add production monitoring and operations guides
- Implement MkDocs configuration with search functionality
- Include comprehensive user and developer documentation

Provides complete documentation for users and developers
covering all aspects of the FetchML platform.

2025-12-04 16:54:57 -05:00

Security Guide

This document outlines security features, best practices, and hardening procedures for FetchML.

Security Features

Authentication & Authorization

API Keys: Stored as SHA-256 hashes, with role-based access control (RBAC)
Permissions: Granular read/write/delete permissions per user
IP Whitelisting: Network-level access control
Rate Limiting: Per-user request quotas

Communication Security

TLS/HTTPS: Encrypts API traffic in transit
WebSocket Auth: API key required before upgrade
Redis Auth: Password-protected task queue

Data Privacy

Log Sanitization: Automatically redacts API keys, passwords, tokens
Experiment Isolation: User-specific experiment directories
No Anonymous Access: All services require authentication
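To illustrate what log sanitization means in practice, here is a minimal sketch of a redaction filter. It is not FetchML's implementation: the function name `redact_secrets` and the `key=value` log format are assumptions made for the example.

```shell
# Hypothetical filter: mask the value of any key=value pair whose key
# looks like a secret. The key names and log format are assumptions.
redact_secrets() {
  sed -E 's/((api_key|password|token)=)[^[:space:]]+/\1[REDACTED]/g'
}

# Example: the secret value is replaced, the rest of the line is kept.
echo 'user=alice api_key=abc123 action=submit' | redact_secrets
# prints: user=alice api_key=[REDACTED] action=submit
```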

Network Security

Internal Networks: Backend services (Redis, Loki) not exposed publicly
Firewall Rules: Restrictive port access
Container Isolation: Services run in separate containers/pods

Security Checklist

Initial Setup

Generate Strong Passwords

# Grafana admin password
openssl rand -base64 32 > .grafana-password

# Redis password
openssl rand -base64 32

Configure Environment Variables

cp .env.example .env
# Edit .env and set:
# - GRAFANA_ADMIN_PASSWORD

Enable TLS (Production only)

# configs/config-prod.yaml
server:
  tls:
    enabled: true
    cert_file: "/secrets/cert.pem"
    key_file: "/secrets/key.pem"
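Before enabling TLS it can help to confirm that the certificate and key belong together. The sketch below compares their public keys with standard `openssl` subcommands. The helper name `cert_matches_key` and the `CERT_FILE`/`KEY_FILE` variables are hypothetical; the default paths are the ones from the config above.

```shell
# Hypothetical helper: succeed only if the certificate's public key is
# identical to the public key derived from the private key.
cert_matches_key() {
  cert_pub="$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null)" || return 1
  key_pub="$(openssl pkey -in "$2" -pubout 2>/dev/null)" || return 1
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Paths default to the values used in configs/config-prod.yaml above.
cert="${CERT_FILE:-/secrets/cert.pem}"
key="${KEY_FILE:-/secrets/key.pem}"

if cert_matches_key "$cert" "$key"; then
  # Print the expiry date so renewals can be scheduled.
  openssl x509 -in "$cert" -noout -enddate
else
  echo "certificate and key are missing or do not match" >&2
fi
```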

Configure Firewall

# Allow only necessary ports
sudo ufw allow 22/tcp    # SSH
sudo ufw allow 443/tcp   # HTTPS
sudo ufw allow 80/tcp    # HTTP (redirect to HTTPS)
sudo ufw enable

Production Hardening

Restrict IP Access

# configs/config-prod.yaml
auth:
  ip_whitelist:
    - "10.0.0.0/8"
    - "192.168.0.0/16"
    - "127.0.0.1"
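The entries above use CIDR notation, where the number after the slash is how many leading bits of the address must match. How FetchML evaluates the list internally is not documented here; the sketch below only illustrates standard CIDR matching, and `ip_to_int`, `in_cidr`, and `CLIENT_IP` are hypothetical names.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 falls inside network $2 (CIDR or single address).
in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}"
  # An entry without "/" is treated as a single host (/32).
  if [ "$bits" = "$2" ]; then bits=32; fi
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Check an example client address against the whitelist entries above.
client_ip="${CLIENT_IP:-192.168.4.20}"
for entry in 10.0.0.0/8 192.168.0.0/16 127.0.0.1; do
  if in_cidr "$client_ip" "$entry"; then
    echo "$client_ip allowed by $entry"
  fi
done
```

With the default address, only the `192.168.0.0/16` entry matches.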

Enable Audit Logging

logging:
  level: "info"
  audit: true
  file: "/var/log/fetch_ml/audit.log"

Harden Redis

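# Redis security
# Set a password at runtime (also set requirepass in redis.conf so it survives restarts)
redis-cli CONFIG SET requirepass "your-strong-password"

# rename-command cannot be changed with CONFIG SET.
# Add these lines to redis.conf and restart Redis:
rename-command FLUSHDB ""
rename-command FLUSHALL ""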

Secure Grafana

# Change default admin password
docker-compose exec grafana grafana-cli admin reset-admin-password new-strong-password

Regular Updates

# Update system packages
sudo apt update && sudo apt upgrade -y

# Update containers
docker-compose pull
docker-compose up -d   # testing only

Password Management

Generate Secure Passwords

# Method 1: OpenSSL
openssl rand -base64 32

# Method 2: pwgen (if installed)
pwgen -s 32 1

# Method 3: /dev/urandom
head -c 32 /dev/urandom | base64

Store Passwords Securely

Development: Use .env file (gitignored)

echo "REDIS_PASSWORD=$(openssl rand -base64 32)" >> .env
echo "GRAFANA_ADMIN_PASSWORD=$(openssl rand -base64 32)" >> .env

Production: Use systemd environment files

sudo mkdir -p /etc/fetch_ml/secrets
sudo chmod 700 /etc/fetch_ml/secrets
echo "REDIS_PASSWORD=..." | sudo tee /etc/fetch_ml/secrets/redis.env
sudo chmod 600 /etc/fetch_ml/secrets/redis.env
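A quick way to confirm the permissions took effect is to check each secrets file's mode. This is a minimal sketch: `is_private_file` and `SECRETS_DIR` are hypothetical names, and `stat -c '%a'` is the GNU form (BSD/macOS `stat` uses different flags).

```shell
# Hypothetical helper: succeed only if the file's mode is exactly 600.
# Relies on GNU stat; adjust the flags on BSD/macOS.
is_private_file() {
  [ "$(stat -c '%a' "$1" 2>/dev/null)" = "600" ]
}

# Directory defaults to the one created above; run as a user who can read it.
secrets_dir="${SECRETS_DIR:-/etc/fetch_ml/secrets}"
for f in "$secrets_dir"/*.env; do
  [ -e "$f" ] || continue   # the glob matched nothing
  if is_private_file "$f"; then
    echo "ok: $f"
  else
    echo "WARNING: $f is not mode 600"
  fi
done
```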

API Key Management

Generate API Keys

# Generate random API key
openssl rand -hex 32

# Hash for storage
echo -n "your-api-key" | sha256sum
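The two steps above can be combined so the stored value is the bare hex digest, without the trailing ` -` that `sha256sum` prints when reading from stdin. This is a sketch; the function name `hash_api_key` is an assumption, not part of FetchML.

```shell
# Hypothetical helper: print only the SHA-256 hex digest of the given key.
# printf avoids the trailing newline that would change the hash.
hash_api_key() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Generate a key, then derive the hash that goes into the config file.
api_key="$(openssl rand -hex 32)"
api_key_hash="$(hash_api_key "$api_key")"

echo "API key (give to the user): $api_key"
echo "SHA-256 hash (store this):  $api_key_hash"
```

Only the hash belongs in the config file; the plaintext key is shown once and handed to the user.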

Rotate API Keys

1. Generate a new API key.
2. Update config-local.yaml with the new hash.
3. Distribute the new key to users.
4. Remove the old key after a grace period.

Revoke API Keys

Remove user entry from config-local.yaml:

auth:
  apikeys:
    # user_to_revoke:  # Comment out or delete

Network Security

Production Network Topology

Internet
    ↓
[Firewall] (ports 3000, 9102)
    ↓
[Reverse Proxy] (nginx/Apache) - TLS termination
    ↓
┌─────────────────────┐
│   Application Pod   │
│                     │
│  ┌──────────────┐   │
│  │ API Server   │   │  ← Public (via reverse proxy)
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │   Redis      │   │  ← Internal only
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │   Grafana    │   │  ← Public (via reverse proxy)
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │ Prometheus   │   │  ← Internal only
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │    Loki      │   │  ← Internal only
│  └──────────────┘   │
└─────────────────────┘

Recommended Firewall Rules

# Allow only necessary inbound connections
sudo firewall-cmd --permanent --zone=public --add-rich-rule='
  rule family="ipv4"
  source address="YOUR_NETWORK"
  port port="3000" protocol="tcp" accept'

sudo firewall-cmd --permanent --zone=public --add-rich-rule='
  rule family="ipv4"
  source address="YOUR_NETWORK"
  port port="9102" protocol="tcp" accept'

# Block all other traffic
sudo firewall-cmd --permanent --set-default-zone=drop
sudo firewall-cmd --reload

Incident Response

Suspected Breach

Immediate Actions
- Rotate all API keys
- Stop affected services
- Review audit logs

Investigation

# Check recent logins
sudo journalctl -u fetchml-api --since "1 hour ago"

# Review failed auth attempts
grep "authentication failed" /var/log/fetch_ml/*.log

# Check active connections
ss -tnp | grep :9102

Recovery
- Rotate all passwords and API keys
- Update firewall rules
- Patch vulnerabilities
- Resume services

Security Monitoring

# Monitor failed authentication
tail -f /var/log/fetch_ml/api.log | grep "auth.*failed"

# Monitor unusual activity
journalctl -u fetchml-api -f | grep -E "(ERROR|WARN)"

# Check open ports
nmap -p- localhost
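For a periodic check rather than a live tail, the failure count can be compared with a threshold. A minimal sketch follows: the match string is the one used under Investigation, while `count_auth_failures`, `API_LOG`, and `THRESHOLD` (and its default of 10) are hypothetical.

```shell
# Hypothetical helper: print how many lines in the file report a failed login.
# grep -c exits 1 when there are no matches, so the status is ignored.
count_auth_failures() {
  grep -c "authentication failed" "$1" 2>/dev/null || true
}

log="${API_LOG:-/var/log/fetch_ml/api.log}"
threshold="${THRESHOLD:-10}"

count="$(count_auth_failures "$log")"
count="${count:-0}"   # an unreadable or missing log counts as zero

if [ "$count" -gt "$threshold" ]; then
  echo "ALERT: $count failed authentication attempts in $log"
else
  echo "ok: $count failed authentication attempts in $log"
fi
```

Run from cron or a systemd timer, the ALERT line could be forwarded to whatever alerting channel is already in place.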

Security Best Practices

Principle of Least Privilege: Grant minimum necessary permissions
Defense in Depth: Multiple security layers (firewall + auth + TLS)
Regular Updates: Keep all components patched
Audit Regularly: Review logs and access patterns
Secure Secrets: Never commit passwords/keys to git
Network Segmentation: Isolate services with internal networks
Monitor Everything: Enable comprehensive logging and alerting
Test Security: Regular penetration testing and vulnerability scans

Compliance

Data Privacy

Logs are sanitized (no passwords/API keys)
Experiment data is user-isolated
No telemetry or external data sharing

Audit Trail

All API access is logged with:

Timestamp
User/API key
Action performed
Source IP
Result (success/failure)

Getting Help

Security Issues: Report privately via email
Questions: See documentation or create issue
Updates: Monitor releases for security patches
