# Security Guide

This document outlines security features, best practices, and hardening procedures for FetchML.

## Security Features

### Authentication & Authorization
- **API Keys**: SHA256-hashed with role-based access control (RBAC)
- **Permissions**: Granular read/write/delete permissions per user
- **IP Whitelisting**: Network-level access control
- **Rate Limiting**: Per-user request quotas

### Communication Security
- **TLS/HTTPS**: Encrypts API traffic in transit
- **WebSocket Auth**: API key required before upgrade
- **Redis Auth**: Password-protected task queue

### Data Privacy
- **Log Sanitization**: Automatically redacts API keys, passwords, tokens
- **Experiment Isolation**: User-specific experiment directories
- **No Anonymous Access**: All services require authentication

### Network Security
- **Internal Networks**: Backend services (Redis, Loki) not exposed publicly
- **Firewall Rules**: Restrictive port access
- **Container Isolation**: Services run in separate containers/pods

## Security Checklist

### Initial Setup

1. **Generate Strong Passwords**
  ```bash
  # Grafana admin password
  openssl rand -base64 32 > .grafana-password
  
  # Redis password
  openssl rand -base64 32
  ```

2. **Configure Environment Variables**
  ```bash
  cp .env.example .env
  # Edit .env and set:
  # - GRAFANA_ADMIN_PASSWORD
  # - REDIS_PASSWORD
  ```

3. **Enable TLS** (Production only)
  ```yaml
  # configs/api/prod.yaml
  server:
    tls:
      enabled: true
      cert_file: "/secrets/cert.pem"
      key_file: "/secrets/key.pem"
  ```

4. **Configure Firewall**
  ```bash
  # Allow only necessary ports
  sudo ufw allow 22/tcp    # SSH
  sudo ufw allow 443/tcp   # HTTPS
  sudo ufw allow 80/tcp    # HTTP (redirect to HTTPS)
  sudo ufw enable
  ```
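
Before enabling TLS (step 3), it can help to confirm that the certificate and private key belong together. The following is a minimal sketch using `openssl`; the `cert_matches_key` helper and the throwaway self-signed pair are illustrative assumptions, not part of FetchML.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds only if the certificate's public key equals
# the public key derived from the private key file.
cert_matches_key() {
  local cert="$1" key="$2" cert_pub key_pub
  cert_pub="$(openssl x509 -in "$cert" -noout -pubkey)" || return 1
  key_pub="$(openssl pkey -in "$key" -pubout)" || return 1
  [ "$cert_pub" = "$key_pub" ]
}

# Demo with a throwaway self-signed pair (local testing only).
tls_dir="$(mktemp -d)"
openssl req -x509 -newkey rsa:2048 -nodes -days 30 -subj "/CN=localhost" \
  -keyout "$tls_dir/key.pem" -out "$tls_dir/cert.pem" 2>/dev/null
chmod 600 "$tls_dir/key.pem"   # keep the private key readable by its owner only

if cert_matches_key "$tls_dir/cert.pem" "$tls_dir/key.pem"; then
  echo "certificate and key match"
fi
```

For a real deployment, point the helper at the paths configured in `cert_file` and `key_file` instead of the temporary pair.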

### Production Hardening

5. **Restrict IP Access**
  ```yaml
  # configs/api/prod.yaml
  auth:
    ip_whitelist:
      - "10.0.0.0/8"
      - "192.168.0.0/16"
      - "127.0.0.1"
  ```

6. **Enable Audit Logging**
  ```yaml
  logging:
    level: "info"
    audit: true
    file: "/var/log/fetch_ml/audit.log"
  ```

7. **Harden Redis**
  ```bash
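  # Set a password at runtime (also set `requirepass` in redis.conf so it survives a restart)
  redis-cli CONFIG SET requirepass "your-strong-password"

  # rename-command cannot be changed with CONFIG SET.
  # Add these lines to redis.conf and restart Redis to disable the commands:
  #   rename-command FLUSHDB ""
  #   rename-command FLUSHALL ""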
  ```

8. **Secure Grafana**
  ```bash
  # Change default admin password
  docker-compose exec grafana grafana-cli admin reset-admin-password new-strong-password
  ```

9. **Regular Updates**
  ```bash
  # Update system packages
  sudo apt update && sudo apt upgrade -y
  
  # Update containers
  docker-compose pull
  docker-compose up -d   # testing only
  ```
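
To make the `ip_whitelist` entries from step 5 more concrete, the sketch below shows how an IPv4 address can be matched against CIDR ranges. It only illustrates the matching rule; it is not FetchML's implementation. The `ip_to_int` and `ip_in_cidr` helpers are hypothetical, and an entry without a prefix (such as `127.0.0.1`) is assumed to mean `/32`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Convert a dotted-quad IPv4 address to a 32-bit integer.
# Assumes plain decimal octets without leading zeros.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if IPv4 address $1 falls inside CIDR range $2.
ip_in_cidr() {
  local ip="$1" cidr="$2" net bits mask
  net="${cidr%/*}"
  if [[ "$cidr" == */* ]]; then bits="${cidr#*/}"; else bits=32; fi
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}

# Check a few sample addresses against the whitelist from step 5.
for ip in 10.1.2.3 172.16.0.1 127.0.0.1; do
  verdict="denied"
  for cidr in "10.0.0.0/8" "192.168.0.0/16" "127.0.0.1"; do
    if ip_in_cidr "$ip" "$cidr"; then verdict="allowed"; break; fi
  done
  echo "$ip: $verdict"
done
```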

## Password Management

### Generate Secure Passwords

```bash
# Method 1: OpenSSL
openssl rand -base64 32

# Method 2: pwgen (if installed)
pwgen -s 32 1

# Method 3: /dev/urandom
head -c 32 /dev/urandom | base64
```

### Store Passwords Securely

**Development**: Use a `.env` file (gitignored)
```bash
echo "REDIS_PASSWORD=$(openssl rand -base64 32)" >> .env
echo "GRAFANA_ADMIN_PASSWORD=$(openssl rand -base64 32)" >> .env
```

**Production**: Use systemd environment files
```bash
sudo mkdir -p /etc/fetch_ml/secrets
sudo chmod 700 /etc/fetch_ml/secrets
echo "REDIS_PASSWORD=..." | sudo tee /etc/fetch_ml/secrets/redis.env
sudo chmod 600 /etc/fetch_ml/secrets/redis.env
```
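
With the commands above, the secret file briefly exists with default permissions before `chmod 600` runs. One way to avoid that window is to create the file under a restrictive `umask`. The sketch below assumes a hypothetical `write_secret_env` helper and writes to a temporary directory so it can be tried without root.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write NAME=VALUE to a file that is created with
# mode 600 from the start. The umask change is confined to the subshell.
write_secret_env() {
  local file="$1" name="$2" value="$3"
  ( umask 077; printf '%s=%s\n' "$name" "$value" > "$file" )
  chmod 600 "$file"   # also tighten a file that already existed
}

# Demo in a temporary directory; a real deployment would target
# /etc/fetch_ml/secrets and run as root.
secrets_dir="$(mktemp -d)"
write_secret_env "$secrets_dir/redis.env" REDIS_PASSWORD "$(openssl rand -base64 32)"

# Show the resulting mode and path.
stat -c '%a %n' "$secrets_dir/redis.env"
```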

## API Key Management

### Generate API Keys

```bash
# Generate random API key
openssl rand -hex 32

# Hash for storage
echo -n "your-api-key" | sha256sum
```
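
The two commands can be combined so that the raw key and its hash are produced together. This is a sketch under the assumption that the stored hash is the plain SHA256 hex digest of the key with no trailing newline, as in the `echo -n` example above; `hash_api_key` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Print the SHA256 hex digest of an API key (no trailing newline is hashed).
hash_api_key() {
  printf '%s' "$1" | sha256sum | awk '{print $1}'
}

# Generate a 64-character hex key and derive the hash to store.
api_key="$(openssl rand -hex 32)"
api_key_hash="$(hash_api_key "$api_key")"

echo "API key (hand to the user, do not store): $api_key"
echo "SHA256 hash (goes in the server config):  $api_key_hash"
```

Using `printf '%s'` instead of `echo -n` avoids differences between shells in how `echo` treats its flags.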

### Rotate API Keys

1. Generate new API key
2. Update your chosen API server config (for example a private copy of `configs/api/homelab-secure.yaml`) with the new hash
3. Distribute new key to users
4. Remove old key after grace period

### Revoke API Keys

Remove the user's entry from your API server config file:
```yaml
auth:
  api_keys:
    # user_to_revoke:  # Comment out or delete
```

## Secret Flow (What lives where)

- **API server config (`configs/api/*.yaml`)**
  - Stores **SHA256 hashes** of API keys (never raw keys).
  - The repo-shipped configs intentionally contain `CHANGE_ME_...` placeholders.
  - For real deployments, make a private copy (e.g. `/etc/fetch_ml/config.yaml`) and fill in real hashes.

- **Docker Compose `.env` / secret files**
  - Used for values that should not be committed (e.g. `REDIS_PASSWORD`, Grafana admin password).
  - `deployments/docker-compose.homelab-secure.yml` requires `REDIS_PASSWORD` to be set explicitly.

- **TLS certs**
  - Provided as mounted files (e.g. `/app/ssl/cert.pem`, `/app/ssl/key.pem`).

## Network Security

### Production Network Topology

```
Internet
    ↓
[Firewall] (ports 3000, 9102)
    ↓
[Reverse Proxy] (nginx/Apache) - TLS termination
    ↓
┌─────────────────────┐
│   Application Pod   │
│                     │
│  ┌──────────────┐   │
│  │ API Server   │   │  ← Public (via reverse proxy)
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │   Redis      │   │  ← Internal only
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │   Grafana    │   │  ← Public (via reverse proxy)
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │ Prometheus   │   │  ← Internal only
│  └──────────────┘   │
│                     │
│  ┌──────────────┐   │
│  │    Loki      │   │  ← Internal only
│  └──────────────┘   │
└─────────────────────┘
```

### Recommended Firewall Rules

```bash
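# Allow only necessary inbound connections.
# The rules go into the drop zone because it becomes the default zone below;
# rules left in the public zone would not apply to interfaces using the default zone.
sudo firewall-cmd --permanent --zone=drop --add-rich-rule='
  rule family="ipv4"
  source address="YOUR_NETWORK"
  port port="3000" protocol="tcp" accept'

sudo firewall-cmd --permanent --zone=drop --add-rich-rule='
  rule family="ipv4"
  source address="YOUR_NETWORK"
  port port="9102" protocol="tcp" accept'

# Drop all other inbound traffic
# (--set-default-zone is already persistent and is not combined with --permanent)
sudo firewall-cmd --set-default-zone=drop
sudo firewall-cmd --reload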
```

## Incident Response

### Suspected Breach

1. **Immediate Actions**
   - Rotate all API keys
   - Stop affected services
   - Review audit logs
   
2. **Investigation**
   ```bash
   # Check recent API service logs
   sudo journalctl -u fetchml-api --since "1 hour ago"
   
   # Review failed auth attempts
   grep "authentication failed" /var/log/fetch_ml/*.log
   
   # Check active connections
   ss -tnp | grep :9102
   ```

3. **Recovery**
   - Rotate all passwords and API keys
   - Update firewall rules
   - Patch vulnerabilities
   - Resume services

### Security Monitoring

```bash
# Monitor failed authentication
tail -f /var/log/fetch_ml/api.log | grep "auth.*failed"

# Monitor unusual activity
journalctl -u fetchml-api -f | grep -E "(ERROR|WARN)"

# Check open ports
nmap -p- localhost
```
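
The `tail -f` approach needs someone watching the terminal. A periodic count against a threshold is one simple alternative, sketched below. The `count_auth_failures` helper, the threshold value, and the sample log lines are assumptions for illustration; real FetchML log lines may look different.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: count lines containing "authentication failed".
count_auth_failures() {
  # grep -c exits with status 1 when the count is 0, so do not let
  # that abort the script under `set -e`.
  grep -c "authentication failed" "$1" || true
}

# Demo with a made-up log file.
demo_log="$(mktemp)"
printf '%s\n' \
  "authentication failed for user alice" \
  "request ok" \
  "authentication failed for user bob" > "$demo_log"

threshold=1   # assumed value; tune for your environment
failures="$(count_auth_failures "$demo_log")"
if [ "$failures" -gt "$threshold" ]; then
  echo "ALERT: $failures failed authentication attempts" >&2
fi
```

Run from cron or a systemd timer against `/var/log/fetch_ml/api.log`, a check like this can feed whatever alerting channel is already in place.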

## Security Best Practices

1. **Principle of Least Privilege**: Grant minimum necessary permissions
2. **Defense in Depth**: Multiple security layers (firewall + auth + TLS)
3. **Regular Updates**: Keep all components patched
4. **Audit Regularly**: Review logs and access patterns
5. **Secure Secrets**: Never commit passwords/keys to git
6. **Network Segmentation**: Isolate services with internal networks
7. **Monitor Everything**: Enable comprehensive logging and alerting
8. **Test Security**: Regular penetration testing and vulnerability scans

## Compliance

### Data Privacy
- Logs are sanitized (no passwords/API keys)
- Experiment data is user-isolated
- No telemetry or external data sharing

### Audit Trail
All API access is logged with:
- Timestamp
- User/API key
- Action performed
- Source IP
- Result (success/failure)

## Getting Help

- **Security Issues**: Report privately via email
- **Questions**: See the documentation or open an issue
- **Updates**: Monitor releases for security patches