fetch_ml/docs/src/security.md
Jeremie Fraeys 385d2cf386 docs: add comprehensive documentation with MkDocs site
- Add complete API documentation and architecture guides
- Include quick start, installation, and deployment guides
- Add troubleshooting and security documentation
- Include CLI reference and configuration schema docs
- Add production monitoring and operations guides
- Implement MkDocs configuration with search functionality
- Include comprehensive user and developer documentation

Provides complete documentation for users and developers
covering all aspects of the FetchML platform.
2025-12-04 16:54:57 -05:00

Security Guide

This document outlines security features, best practices, and hardening procedures for FetchML.

Security Features

Authentication & Authorization

  • API Keys: Stored as SHA-256 hashes, with role-based access control (RBAC)
  • Permissions: Granular read/write/delete permissions per user
  • IP Whitelisting: Network-level access control
  • Rate Limiting: Per-user request quotas

Communication Security

  • TLS/HTTPS: Encryption in transit for API traffic
  • WebSocket Auth: API key required before upgrade
  • Redis Auth: Password-protected task queue

Data Privacy

  • Log Sanitization: Automatically redacts API keys, passwords, and tokens
  • Experiment Isolation: User-specific experiment directories
  • No Anonymous Access: All services require authentication
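
As an illustration of what log sanitization involves, the sketch below masks unquoted `key=value` and `key: value` pairs with `sed`. It is not FetchML's sanitizer: the `redact` function name and the matched field names (`api_key`, `password`, `token`) are assumptions chosen for this example.

```shell
# Hypothetical helper: mask the value of api_key/password/token fields.
# Only handles unquoted values and lowercase field names.
redact() {
  sed -E 's/((api[_-]?key|password|token)[=:] *)[^[:space:]"]+/\1[REDACTED]/g'
}

echo 'login ok api_key=abc123 user=bob' | redact
```

A real sanitizer would also need to cover quoted values, JSON fields, and authorization headers, which this sketch leaves out.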

Network Security

  • Internal Networks: Backend services (Redis, Loki) not exposed publicly
  • Firewall Rules: Restrictive port access
  • Container Isolation: Services run in separate containers/pods

Security Checklist

Initial Setup

  1. Generate Strong Passwords
# Grafana admin password
openssl rand -base64 32 > .grafana-password

# Redis password
openssl rand -base64 32
  2. Configure Environment Variables
cp .env.example .env
# Edit .env and set:
# - GRAFANA_ADMIN_PASSWORD
  3. Enable TLS (Production only)
# configs/config-prod.yaml
server:
  tls:
    enabled: true
    cert_file: "/secrets/cert.pem"
    key_file: "/secrets/key.pem"
  4. Configure Firewall
# Allow only necessary ports
sudo ufw allow 22/tcp    # SSH
sudo ufw allow 443/tcp   # HTTPS
sudo ufw allow 80/tcp    # HTTP (redirect to HTTPS)
sudo ufw enable
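
Before enabling TLS in the step above, it can help to confirm that the certificate and the private key belong together. The following is a minimal sketch using standard `openssl` subcommands. The `cert_matches_key` function name is hypothetical, and the paths in the usage comment are the ones from the config example.

```shell
# Hypothetical helper: succeed only if the certificate's public key
# equals the public key derived from the private key.
cert_matches_key() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 1
  key_pub=$(openssl pkey -in "$2" -pubout) || return 1
  [ "$cert_pub" = "$key_pub" ]
}

# Example with the paths from the TLS config:
# cert_matches_key /secrets/cert.pem /secrets/key.pem && echo "certificate and key match"
```

A mismatch usually means the wrong key file was deployed next to a renewed certificate.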

Production Hardening

  1. Restrict IP Access
# configs/config-prod.yaml
auth:
  ip_whitelist:
    - "10.0.0.0/8"
    - "192.168.0.0/16"
    - "127.0.0.1"
  2. Enable Audit Logging
logging:
  level: "info"
  audit: true
  file: "/var/log/fetch_ml/audit.log"
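  3. Harden Redis
# Redis security
# Set the password at runtime; also set requirepass in redis.conf so it survives a restart
redis-cli CONFIG SET requirepass "your-strong-password"
# rename-command cannot be changed with CONFIG SET; add these lines to redis.conf and restart Redis:
#   rename-command FLUSHDB ""
#   rename-command FLUSHALL ""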
  4. Secure Grafana
# Change default admin password
docker-compose exec grafana grafana-cli admin reset-admin-password new-strong-password
  5. Regular Updates
# Update system packages
sudo apt update && sudo apt upgrade -y

# Update containers
docker-compose pull
docker-compose up -d  # testing only
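
To make the semantics of the `ip_whitelist` entries in the Restrict IP Access step concrete, the sketch below checks an IPv4 address against CIDR ranges and bare addresses in plain shell arithmetic. It only illustrates how such matching generally works and is not FetchML's implementation. The `ip_to_int` and `ip_in_cidr` function names are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell so the IFS change stays local.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Return 0 if address $1 falls inside entry $2.
# An entry is either CIDR notation (10.0.0.0/8) or a bare address (treated as /32).
ip_in_cidr() (
  ip=$1
  entry=$2
  net=${entry%/*}
  case $entry in
    */*) bits=${entry#*/} ;;
    *)   bits=32 ;;
  esac
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
)

# Check one client address against the example whitelist.
for entry in 10.0.0.0/8 192.168.0.0/16 127.0.0.1; do
  if ip_in_cidr 192.168.4.20 "$entry"; then
    echo "192.168.4.20 allowed by $entry"
  fi
done
```

The sketch covers IPv4 only and does not validate its input.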

Password Management

Generate Secure Passwords

# Method 1: OpenSSL
openssl rand -base64 32

# Method 2: pwgen (if installed)
pwgen -s 32 1

# Method 3: /dev/urandom
head -c 32 /dev/urandom | base64

Store Passwords Securely

Development: Use .env file (gitignored)

echo "REDIS_PASSWORD=$(openssl rand -base64 32)" >> .env
echo "GRAFANA_ADMIN_PASSWORD=$(openssl rand -base64 32)" >> .env

Production: Use systemd environment files

sudo mkdir -p /etc/fetch_ml/secrets
sudo chmod 700 /etc/fetch_ml/secrets
echo "REDIS_PASSWORD=..." | sudo tee /etc/fetch_ml/secrets/redis.env > /dev/null
sudo chmod 600 /etc/fetch_ml/secrets/redis.env
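
In the production commands above, the file is created first and restricted with `chmod` afterwards. The `700` directory limits the exposure, but the permissions can also be set at creation time. A minimal sketch follows; the `write_secret` helper name and the `./redis.env` path in the usage comment are hypothetical.

```shell
# Hypothetical helper: write NAME=VALUE to a file readable only by its owner.
# Usage: write_secret NAME VALUE FILE
write_secret() (
  umask 077                           # files created from here on get mode 600
  printf '%s=%s\n' "$1" "$2" > "$3"   # note: an existing file keeps its old mode
)

# Example:
# write_secret REDIS_PASSWORD "$(openssl rand -base64 32)" ./redis.env
```

Because the function body runs in a subshell, the stricter umask does not leak into the calling shell. Writing under `/etc/fetch_ml/secrets` still requires root privileges.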

API Key Management

Generate API Keys

# Generate random API key
openssl rand -hex 32

# Hash for storage
echo -n "your-api-key" | sha256sum
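
The two commands above can be combined into one step. Note that `sha256sum` prints the digest followed by a file name marker (` -` for standard input), so only the first field belongs in the config. The `hash_api_key` helper below is a hypothetical name for this sketch, and it uses `printf` rather than `echo -n` because `echo -n` is not portable across shells.

```shell
# Hypothetical helper: print only the hex digest of the key passed as $1.
hash_api_key() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

api_key=$(openssl rand -hex 32)          # 64 hex characters; hand this to the user
api_key_hash=$(hash_api_key "$api_key")  # store only this hash
echo "key:  $api_key"
echo "hash: $api_key_hash"
```

Only the hash needs to be kept on the server. The plaintext key should be shown once and then discarded.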

Rotate API Keys

  1. Generate new API key
  2. Update config-local.yaml with new hash
  3. Distribute new key to users
  4. Remove old key after grace period

Revoke API Keys

Remove user entry from config-local.yaml:

auth:
  apikeys:
    # user_to_revoke:  # Comment out or delete

Network Security

Production Network Topology

Internet
    ↓
[Firewall] (ports 3000, 9102)
    ↓
[Reverse Proxy] (nginx/Apache) - TLS termination
    ↓
┌─────────────────────┐
│   Application Pod   │
│                     │
│  ┌──────────────┐   │
│  │ API Server   │   │  ← Public (via reverse proxy)
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │   Redis      │   │  ← Internal only
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │   Grafana    │   │  ← Public (via reverse proxy)
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │ Prometheus   │   │  ← Internal only
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │    Loki      │   │  ← Internal only
│  └──────────────┘   │
└─────────────────────┘
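# Allow only necessary inbound connections.
# The rules go into the "drop" zone because it becomes the default zone below;
# rules in "public" would no longer apply to interfaces that follow the default zone.
sudo firewall-cmd --permanent --zone=drop --add-rich-rule='
  rule family="ipv4"
  source address="YOUR_NETWORK"
  port port="3000" protocol="tcp" accept'

sudo firewall-cmd --permanent --zone=drop --add-rich-rule='
  rule family="ipv4"
  source address="YOUR_NETWORK"
  port port="9102" protocol="tcp" accept'

# Block all other traffic. --set-default-zone is already a runtime and permanent
# change, so --permanent is not needed. Allow SSH first if you manage the host remotely.
sudo firewall-cmd --set-default-zone=drop
sudo firewall-cmd --reload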

Incident Response

Suspected Breach

  1. Immediate Actions

    • Rotate all API keys
    • Stop affected services
    • Review audit logs
  2. Investigation

    # Check recent logins
    sudo journalctl -u fetchml-api --since "1 hour ago"

    # Review failed auth attempts
    grep "authentication failed" /var/log/fetch_ml/*.log

    # Check active connections
    ss -tnp | grep :9102

  3. Recovery

    • Rotate all passwords and API keys
    • Update firewall rules
    • Patch vulnerabilities
    • Resume services

Security Monitoring

# Monitor failed authentication
tail -f /var/log/fetch_ml/api.log | grep "auth.*failed"

# Monitor unusual activity
journalctl -u fetchml-api -f | grep -E "(ERROR|WARN)"

# Check open ports
nmap -p- localhost
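
When reviewing failed logins, a per-address summary is often easier to read than a live tail. The sketch below counts `authentication failed` lines per source IP. It assumes each log line carries the client address as an `ip=ADDR` field, which is a guess about the log format and not something this guide specifies. The `failed_auth_by_ip` helper name is hypothetical.

```shell
# Hypothetical helper: count "authentication failed" lines per source IP.
# ASSUMPTION: each log line carries the client address as ip=ADDR.
failed_auth_by_ip() {
  grep -h 'authentication failed' "$@" \
    | sed -n 's/.*ip=\([0-9.]*\).*/\1/p' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $2, $1 }'
}

# Example:
# failed_auth_by_ip /var/log/fetch_ml/*.log
```

Each output line has the form `ADDRESS COUNT`, with the most frequent address first. Adjust the `sed` pattern to match the actual log layout.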

Security Best Practices

  1. Principle of Least Privilege: Grant minimum necessary permissions
  2. Defense in Depth: Multiple security layers (firewall + auth + TLS)
  3. Regular Updates: Keep all components patched
  4. Audit Regularly: Review logs and access patterns
  5. Secure Secrets: Never commit passwords/keys to git
  6. Network Segmentation: Isolate services with internal networks
  7. Monitor Everything: Enable comprehensive logging and alerting
  8. Test Security: Regular penetration testing and vulnerability scans

Compliance

Data Privacy

  • Logs are sanitized (no passwords/API keys)
  • Experiment data is user-isolated
  • No telemetry or external data sharing

Audit Trail

All API access is logged with:

  • Timestamp
  • User/API key
  • Action performed
  • Source IP
  • Result (success/failure)

Getting Help

  • Security Issues: Report privately via email
  • Questions: See the documentation or create an issue
  • Updates: Monitor releases for security patches