
# Security Policy

## Reporting a Vulnerability

Please report security vulnerabilities to security@fetchml.io.
Do NOT open public issues for security bugs.

Response timeline:
- Acknowledgment: within 48 hours
- Initial assessment: within 5 days
- Fix released: within 30 days (critical), 90 days (high)

## Security Features

FetchML implements defense-in-depth security for ML research systems:

### Authentication & Authorization
- **Argon2id API Key Hashing**: Memory-hard hashing resists GPU cracking
- **RBAC with Role Inheritance**: Granular permissions (admin, data_scientist, data_engineer, viewer, operator)
- **Constant-time Comparison**: Prevents timing attacks on key validation
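
To illustrate the constant-time comparison above, here is a minimal sketch using only the Go standard library. The function name `keysEqual` is hypothetical, and SHA-256 is used here only to give both inputs the same length; it does not stand in for the Argon2id hashing that FetchML applies to stored keys.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// keysEqual compares two secrets without an early exit on the first
// differing byte. Hashing both sides first yields equal-length inputs,
// so the comparison does not reveal the length of the expected value.
// Hypothetical helper, not FetchML's actual validation code.
func keysEqual(presented, expected string) bool {
	p := sha256.Sum256([]byte(presented))
	e := sha256.Sum256([]byte(expected))
	return subtle.ConstantTimeCompare(p[:], e[:]) == 1
}

func main() {
	fmt.Println(keysEqual("key-a", "key-a"))
	fmt.Println(keysEqual("key-a", "key-b"))
}
```

A plain `==` on strings may return as soon as the first mismatching byte is found, which is the timing signal this approach avoids.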

### Cryptographic Practices
- **Ed25519 Manifest Signing**: Tamper detection for run manifests
- **SHA-256 with Salt**: Legacy key support with migration path
- **Secure Key Generation**: 256-bit entropy for all API keys
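
The following sketch shows what 256-bit key generation and Ed25519 manifest signing can look like with the Go standard library. All function names and the manifest contents are assumptions made for illustration; the real manifest format and key storage are not described in this document.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newAPIKey returns 32 random bytes (256 bits) from the OS CSPRNG, hex-encoded.
func newAPIKey() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// newSigningKey creates a fresh Ed25519 key pair.
func newSigningKey() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// signManifest signs the raw manifest bytes.
func signManifest(priv ed25519.PrivateKey, manifest []byte) []byte {
	return ed25519.Sign(priv, manifest)
}

// verifyManifest reports whether sig is a valid signature over manifest.
func verifyManifest(pub ed25519.PublicKey, manifest, sig []byte) bool {
	return ed25519.Verify(pub, manifest, sig)
}

func main() {
	key, err := newAPIKey()
	if err != nil {
		panic(err)
	}
	fmt.Println("API key length (hex chars):", len(key))

	pub, priv, err := newSigningKey()
	if err != nil {
		panic(err)
	}
	manifest := []byte(`{"run_id":"example","status":"done"}`) // placeholder content
	sig := signManifest(priv, manifest)
	fmt.Println("valid:", verifyManifest(pub, manifest, sig))

	manifest[0] = '[' // simulate tampering
	fmt.Println("valid after tampering:", verifyManifest(pub, manifest, sig))
}
```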

### Container Security
- **Rootless Podman**: No privileged containers
- **Capability Dropping**: `--cap-drop ALL` by default
- **No New Privileges**: `no-new-privileges` security opt
- **Read-only Root Filesystem**: Immutable base image
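
As a rough sketch of how the flags above combine, the helper below builds a `podman run` argument list. The function name, image, and command are placeholders; how FetchML actually launches containers is not shown in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// hardenedRunArgs returns arguments for `podman run` with the hardening
// options listed above. Hypothetical helper for illustration only.
func hardenedRunArgs(image string, cmd ...string) []string {
	args := []string{
		"run", "--rm",
		"--cap-drop=ALL",                   // drop every Linux capability
		"--security-opt=no-new-privileges", // block setuid-style escalation
		"--read-only",                      // mount the root filesystem read-only
	}
	args = append(args, image)
	return append(args, cmd...)
}

func main() {
	// The image name and command are placeholders.
	args := hardenedRunArgs("example.com/trainer:latest", "python", "train.py")
	fmt.Println("podman " + strings.Join(args, " "))
}
```

Passing the result to `exec.Command("podman", args...)` as a slice, rather than through a shell, also avoids shell interpretation of job-supplied values.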

### Input Validation
- **Path Traversal Prevention**: Canonical path validation
- **Command Injection Protection**: Shell metacharacter filtering
- **Length Limits**: Prevents DoS via oversized inputs
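
One way to combine a length limit with canonical path validation is sketched below. `safeJoin`, `maxPathLen`, and the base directory are hypothetical names and values. The sketch does not resolve symlinks, so a production check would likely also call `filepath.EvalSymlinks`.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

const maxPathLen = 4096 // assumed limit for this example

var (
	errPathTooLong = errors.New("path too long")
	errUnsafePath  = errors.New("path escapes base directory")
)

// safeJoin joins userPath onto base and rejects any result that would
// land outside base after cleaning.
func safeJoin(base, userPath string) (string, error) {
	if len(userPath) > maxPathLen {
		return "", errPathTooLong
	}
	full := filepath.Join(base, userPath) // Join cleans "." and ".." segments
	rel, err := filepath.Rel(base, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errUnsafePath
	}
	return full, nil
}

func main() {
	for _, p := range []string{"exp1/out.csv", "../../etc/passwd"} {
		full, err := safeJoin("/data/runs", p)
		fmt.Printf("%q -> %q, err=%v\n", p, full, err)
	}
}
```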

### Audit & Monitoring
- **Structured Audit Logging**: JSON-formatted security events
- **Hash-chained Logs**: Tamper-evident audit trail
- **Anomaly Detection**: Brute force, privilege escalation alerts
- **Security Metrics**: Prometheus integration
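
The idea behind a hash-chained log can be sketched as follows: each entry stores the hash of its predecessor, so changing any earlier entry invalidates every later hash. The `entry` type and its field names are assumptions for illustration and do not reflect FetchML's actual audit log schema.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// entry is a hypothetical audit record linked to its predecessor by hash.
type entry struct {
	Event    string
	PrevHash string
	Hash     string
}

// hashEntry derives an entry hash from the previous hash and the event text.
func hashEntry(prevHash, event string) string {
	sum := sha256.Sum256([]byte(prevHash + "\n" + event))
	return hex.EncodeToString(sum[:])
}

// appendEntry adds an event, chaining it to the last entry in the log.
func appendEntry(log []entry, event string) []entry {
	prev := ""
	if len(log) > 0 {
		prev = log[len(log)-1].Hash
	}
	return append(log, entry{Event: event, PrevHash: prev, Hash: hashEntry(prev, event)})
}

// verifyChain recomputes every hash and checks each link.
func verifyChain(log []entry) bool {
	prev := ""
	for _, e := range log {
		if e.PrevHash != prev || e.Hash != hashEntry(prev, e.Event) {
			return false
		}
		prev = e.Hash
	}
	return true
}

func main() {
	var log []entry
	log = appendEntry(log, `{"event":"login_failed","user":"alice"}`)
	log = appendEntry(log, `{"event":"role_changed","user":"bob"}`)
	fmt.Println("intact:", verifyChain(log))

	log[0].Event = `{"event":"login_ok","user":"alice"}` // simulate tampering
	fmt.Println("after tampering:", verifyChain(log))
}
```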

### Supply Chain
- **Dependency Scanning**: gosec + govulncheck in CI
- **No `unsafe` Package**: Prohibited in production code
- **Manifest Signing**: Ed25519 signatures for integrity

## Supported Versions

| Version | Supported          |
| ------- | ------------------ |
| 0.2.x   | :white_check_mark: |
| 0.1.x   | :x:                |

## Security Checklist (Pre-Release)

### Code Review
- [ ] No hardcoded secrets
- [ ] No `unsafe` usage without justification
- [ ] All user inputs validated
- [ ] All file paths canonicalized
- [ ] No secrets in error messages

### Dependency Audit
- [ ] `go mod verify` passes
- [ ] `govulncheck` shows no vulnerabilities
- [ ] All dependencies pinned
- [ ] No unmaintained dependencies

### Container Security
- [ ] No privileged containers
- [ ] Rootless execution
- [ ] Seccomp/AppArmor applied
- [ ] Network isolation

### Cryptography
- [ ] Argon2id for key hashing
- [ ] Ed25519 for signing
- [ ] TLS 1.3 only
- [ ] No weak ciphers
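
For the TLS items, a minimal Go sketch of a TLS 1.3-only server configuration is shown below. The function name is hypothetical. In Go's `crypto/tls`, the TLS 1.3 cipher suites are not configurable, so setting the minimum version is sufficient to exclude older suites.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// newTLSConfig returns a configuration that refuses anything below TLS 1.3.
// Hypothetical helper; certificate loading is omitted.
func newTLSConfig() *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS13,
	}
}

func main() {
	cfg := newTLSConfig()
	fmt.Printf("minimum TLS version: 0x%04x\n", cfg.MinVersion)
}
```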

### Testing
- [ ] Security tests pass
- [ ] Fuzz tests for parsers
- [ ] Authentication bypass tested
- [ ] Container escape tested

## Security Commands

```bash
# Run security scan
make security-scan

# Check for vulnerabilities
govulncheck ./...

# Static analysis
gosec ./...

# Check for unsafe usage
grep -r "unsafe\." --include="*.go" ./internal ./cmd

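# Build native code with AddressSanitizer
# (out-of-source build; assumes CMakeLists.txt lives in native/)
cmake -S native -B native/build -DENABLE_ASAN=ON && cmake --build native/build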
```

## Threat Model

### Attack Surfaces
1. **External API**: Researchers submitting malicious jobs
2. **Container Runtime**: Escape to host system
3. **Data Exfiltration**: Stealing datasets/models
4. **Privilege Escalation**: Researcher → admin
5. **Supply Chain**: Compromised dependencies
6. **Secrets Leakage**: API keys in logs/errors

### Mitigations
| Threat | Mitigation |
|--------|------------|
| Malicious Jobs | Input validation, container sandboxing, resource limits |
| Container Escape | Rootless, no-new-privileges, seccomp, read-only root |
| Data Exfiltration | Network policies, audit logging, rate limiting |
| Privilege Escalation | RBAC, least privilege, anomaly detection |
| Supply Chain | Dependency scanning, manifest signing, pinned versions |
| Secrets Leakage | Log sanitization, secrets manager, memory clearing |
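
As an illustration of the log sanitization mitigation, the sketch below redacts credentials that appear after an `X-API-Key:` header name or a `Bearer` scheme before a line is written to a log. The pattern and function name are assumptions; a real sanitizer would need to cover every place a secret can appear.

```go
package main

import (
	"fmt"
	"regexp"
)

// secretPattern matches a credential that follows "X-API-Key:" or "Bearer ".
// Group 1 captures the prefix so it can be kept in the output.
var secretPattern = regexp.MustCompile(`(?i)(x-api-key:\s*|bearer\s+)\S+`)

// redact replaces matched credentials with a fixed marker.
func redact(line string) string {
	return secretPattern.ReplaceAllString(line, "${1}[REDACTED]")
}

func main() {
	fmt.Println(redact("auth failed: Authorization: Bearer abc123"))
	fmt.Println(redact("request headers: X-API-Key: s3cret status=401"))
}
```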

## Responsible Disclosure

We follow responsible disclosure practices:

1. **Report privately**: Email security@fetchml.io with details
2. **Provide details**: Steps to reproduce, impact assessment
3. **Allow time**: Give us 30 to 90 days, depending on severity, to ship a fix before public disclosure
4. **Acknowledgment**: We credit researchers who report valid issues

## Security Team

- security@fetchml.io - Security issues and questions
- security-response@fetchml.io - Active incident response

---

*Last updated: 2026-02-19*

---

# Security Guide for Fetch ML Homelab

*The following section covers security best practices for deploying Fetch ML in a homelab environment.*

## Quick Setup

Secure setup requires manual configuration:

1. **Generate API keys**: Use the instructions in [API Security](#api-security) below
2. **Create TLS certificates**: Use OpenSSL commands in [Troubleshooting](#troubleshooting)
3. **Configure security**: Copy and edit `configs/api/homelab-secure.yaml`
4. **Set permissions**: Ensure `.api-keys` and `.env.secure` have 600 permissions
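
The permission requirement in step 4 can be checked programmatically. The sketch below is a hypothetical helper that flags any file whose group or other permission bits are set; it assumes it is run from the directory that contains `.api-keys` and `.env.secure`.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
)

// tooPermissive reports whether any group or other permission bit is set,
// that is, whether the mode is wider than 600.
func tooPermissive(mode fs.FileMode) bool {
	return mode.Perm()&0o077 != 0
}

func main() {
	for _, name := range []string{".api-keys", ".env.secure"} {
		info, err := os.Stat(name)
		if err != nil {
			fmt.Printf("%s: %v\n", name, err)
			continue
		}
		perm := uint32(info.Mode().Perm())
		if tooPermissive(info.Mode()) {
			fmt.Printf("%s: mode %04o is too open, run: chmod 600 %s\n", name, perm, name)
		} else {
			fmt.Printf("%s: mode %04o is fine\n", name, perm)
		}
	}
}
```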

## Security Features

### Authentication
- **API Key Authentication**: SHA-256 hashed API keys
- **Role-based Access Control**: Admin, researcher, analyst roles
- **Permission System**: Granular permissions per resource

### Network Security
- **TLS/SSL**: HTTPS encrypted communication
- **IP Whitelisting**: Restrict access to trusted networks
- **Rate Limiting**: Prevent abuse and DoS attacks
- **Reverse Proxy**: Caddy with security headers

### Data Protection
- **Path Traversal Protection**: Prevents directory escape attacks
- **Package Validation**: Blocks dangerous Python packages
- **Input Validation**: Comprehensive input sanitization

## Configuration Files

### Secure Config Location
- `configs/api/homelab-secure.yaml` - Main secure configuration

### API Keys
- `.api-keys` - Generated API keys (600 permissions)
- Never commit to version control
- Store in password manager

### TLS Certificates
- `ssl/cert.pem` - TLS certificate
- `ssl/key.pem` - Private key (600 permissions)

### Environment Variables
- `.env.secure` - JWT secret and other secrets (600 permissions)

## Deployment Options

### Option 1: Docker Compose (Recommended)

```bash
# Configure secure setup manually (see Quick Setup above)
# Copy and edit the secure configuration
cp configs/api/homelab-secure.yaml configs/api/my-secure.yaml
# Edit with your API keys, TLS settings, and IP whitelist

# Deploy with security overlay
docker-compose -f docker-compose.yml -f docker-compose.homelab-secure.yml up -d
```

### Option 2: Direct Deployment

```bash
# Configure secure setup manually (see Quick Setup above)
# Copy and edit the secure configuration
cp configs/api/homelab-secure.yaml configs/api/my-secure.yaml
# Edit with your API keys, TLS settings, and IP whitelist

# Load environment variables (set -a exports them so the server process can read them)
set -a; source .env.secure; set +a

# Start server
./api-server -config configs/api/my-secure.yaml
```

## Security Checklist

### Before Deployment
- [ ] Generate unique API keys (don't use defaults)
- [ ] Set strong JWT secret
- [ ] Enable TLS/SSL
- [ ] Configure IP whitelist for your network
- [ ] Set up rate limiting
- [ ] Enable Redis authentication

### Network Security
- [ ] Use HTTPS only (disable HTTP)
- [ ] Restrict API access to trusted IPs
- [ ] Use reverse proxy (Caddy)
- [ ] Enable security headers
- [ ] Monitor access logs

### Data Protection
- [ ] Regular backups of configuration
- [ ] Secure storage of API keys
- [ ] Encrypt sensitive data at rest
- [ ] Regular security updates

### Monitoring
- [ ] Enable security logging
- [ ] Monitor failed authentication attempts
- [ ] Set up alerts for suspicious activity
- [ ] Regular security audits

## API Security

### Authentication Headers
```bash
# Use API key in header
# (with a self-signed certificate, add: --cacert ssl/cert.pem)
curl -H "X-API-Key: your-api-key" https://localhost:9101/health

# Or Bearer token
curl -H "Authorization: Bearer your-api-key" https://localhost:9101/health
```
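
On the server side, accepting both header styles could look roughly like the sketch below. `extractAPIKey` is a hypothetical helper, and the precedence (`X-API-Key` first, then `Authorization: Bearer`) is an assumption rather than documented FetchML behavior.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// extractAPIKey returns the key from the X-API-Key header value if present,
// otherwise from an "Authorization: Bearer <key>" header value.
func extractAPIKey(xAPIKey, authorization string) (string, bool) {
	if key := strings.TrimSpace(xAPIKey); key != "" {
		return key, true
	}
	const prefix = "Bearer "
	if len(authorization) > len(prefix) && strings.EqualFold(authorization[:len(prefix)], prefix) {
		if key := strings.TrimSpace(authorization[len(prefix):]); key != "" {
			return key, true
		}
	}
	return "", false
}

func main() {
	req, err := http.NewRequest(http.MethodGet, "https://localhost:9101/health", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Authorization", "Bearer your-api-key")

	key, ok := extractAPIKey(req.Header.Get("X-API-Key"), req.Header.Get("Authorization"))
	fmt.Println(key, ok)
}
```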

### Rate Limits
- Default: 60 requests per minute
- Burst: 10 requests
- Per IP address
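
These three settings describe a token bucket: each IP address gets a bucket holding up to 10 tokens (the burst) that refills at one token per second (60 per minute). The sketch below shows that technique with the standard library only. The type and method names are hypothetical, and the actual limiter in FetchML may work differently.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type bucket struct {
	tokens  float64
	lastSec float64
}

// limiter is a per-IP token bucket.
type limiter struct {
	mu      sync.Mutex
	rate    float64 // tokens added per second
	burst   float64 // bucket capacity
	buckets map[string]*bucket
}

func newLimiter(perMinute, burst int) *limiter {
	return &limiter{
		rate:    float64(perMinute) / 60,
		burst:   float64(burst),
		buckets: make(map[string]*bucket),
	}
}

// allowAt decides a request from ip at time nowSec (seconds).
func (l *limiter) allowAt(ip string, nowSec float64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	b, ok := l.buckets[ip]
	if !ok {
		b = &bucket{tokens: l.burst, lastSec: nowSec}
		l.buckets[ip] = b
	}
	// Refill for the elapsed time, capped at the burst size.
	b.tokens += (nowSec - b.lastSec) * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.lastSec = nowSec
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// allow uses the wall clock.
func (l *limiter) allow(ip string) bool {
	return l.allowAt(ip, float64(time.Now().UnixNano())/1e9)
}

func main() {
	l := newLimiter(60, 10)
	allowed := 0
	for i := 0; i < 15; i++ {
		if l.allow("192.168.1.42") {
			allowed++
		}
	}
	fmt.Printf("%d of 15 back-to-back requests allowed\n", allowed)
}
```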

### IP Whitelisting
Configure in `configs/api/homelab-secure.yaml` (or your edited copy):
```yaml
security:
  ip_whitelist:
    - "127.0.0.1"
    - "192.168.1.0/24"  # Your local network
```
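
To show how entries like these can be evaluated, here is a sketch using Go's `net/netip` package. It accepts both single addresses and CIDR ranges, as in the YAML above. The function names are hypothetical and the real matching logic may differ.

```go
package main

import (
	"fmt"
	"net/netip"
)

// parseWhitelist converts entries such as "127.0.0.1" or "192.168.1.0/24"
// into prefixes. A bare address becomes a single-host prefix.
func parseWhitelist(entries []string) ([]netip.Prefix, error) {
	prefixes := make([]netip.Prefix, 0, len(entries))
	for _, e := range entries {
		if p, err := netip.ParsePrefix(e); err == nil {
			prefixes = append(prefixes, p.Masked())
			continue
		}
		addr, err := netip.ParseAddr(e)
		if err != nil {
			return nil, fmt.Errorf("invalid whitelist entry %q: %w", e, err)
		}
		prefixes = append(prefixes, netip.PrefixFrom(addr, addr.BitLen()))
	}
	return prefixes, nil
}

// ipAllowed reports whether ip falls inside any whitelisted prefix.
func ipAllowed(prefixes []netip.Prefix, ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	for _, p := range prefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	wl, err := parseWhitelist([]string{"127.0.0.1", "192.168.1.0/24"})
	if err != nil {
		panic(err)
	}
	for _, ip := range []string{"127.0.0.1", "192.168.1.42", "10.0.0.5"} {
		fmt.Println(ip, ipAllowed(wl, ip))
	}
}
```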

## Container Security

### Docker Security
- Use non-root users
- Minimal container images
- Resource limits
- Network segmentation

### Podman Security
- Rootless containers
- SELinux confinement
- Seccomp profiles
- Read-only filesystems where possible

## Troubleshooting

### Common Issues

**TLS Certificate Errors**
```bash
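# Regenerate a self-signed certificate
# (-addext needs OpenSSL 1.1.1+; many clients require a subjectAltName and ignore CN)
openssl req -x509 -newkey rsa:4096 -keyout ssl/key.pem -out ssl/cert.pem -days 365 -nodes \
  -subj "/C=US/ST=Homelab/L=Local/O=FetchML/CN=localhost" \
  -addext "subjectAltName=DNS:localhost,IP:127.0.0.1"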
```

**API Key Authentication Failed**
```bash
# Check your API key
grep "ADMIN_API_KEY" .api-keys

# Verify hash matches config
echo -n "your-api-key" | sha256sum | cut -d' ' -f1
```
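
The same hash can be computed in Go, which helps when debugging a client. The sketch below assumes the config stores a plain hex-encoded SHA-256 digest of the key, as the `sha256sum` command above suggests; `hashKey` is a hypothetical name. It also shows a common cause of mismatches: a trailing newline changes the digest.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashKey returns the hex-encoded SHA-256 digest of key, matching the
// output of `echo -n "$key" | sha256sum | cut -d' ' -f1`.
func hashKey(key string) string {
	sum := sha256.Sum256([]byte(key))
	return hex.EncodeToString(sum[:])
}

func main() {
	key := "your-api-key"
	fmt.Println("without newline:", hashKey(key))
	// Running echo without -n appends "\n" and yields a different digest.
	fmt.Println("with newline:   ", hashKey(key+"\n"))
}
```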

**IP Whitelist Blocking**
```bash
# Check your public IP (for clients on the local network, check the LAN address with `hostname -I` instead)
curl -s https://api.ipify.org

# Add to whitelist in config
```

### Security Logs

Monitor these files:
- `logs/fetch_ml.log` - Application logs
- Caddy access logs (configure if enabled)
- Docker logs: `docker logs ml-experiments-api`

## Best Practices

1. **Regular Updates**: Keep dependencies updated
2. **Principle of Least Privilege**: Minimal required permissions
3. **Defense in Depth**: Multiple security layers
4. **Monitor and Alert**: Security monitoring
5. **Backup and Recovery**: Regular secure backups

## Emergency Procedures

### Compromised API Keys
1. Immediately revoke compromised keys
2. Generate new API keys
3. Update all clients
4. Review access logs

### Security Incident
1. Isolate affected systems
2. Preserve evidence
3. Review access logs
4. Update security measures
5. Document incident

## Support

For security issues:
- Check logs for error messages
- Review configuration files
- Test with minimal setup
- Report security vulnerabilities privately to security@fetchml.io (see [Responsible Disclosure](#responsible-disclosure))