fetch_ml/SECURITY.md
Jeremie Fraeys 6028779239
feat: update CLI, TUI, and security documentation
- Add safety checks to Zig build
- Add TUI with job management and narrative views
- Add WebSocket support and export services
- Add smart configuration defaults
- Update API routes with security headers
- Update SECURITY.md with comprehensive policy
- Add Makefile security scanning targets
2026-02-19 15:35:05 -05:00


# Security Policy
## Reporting a Vulnerability
Please report security vulnerabilities to security@fetchml.io.
Do NOT open public issues for security bugs.
Response timeline:
- Acknowledgment: within 48 hours
- Initial assessment: within 5 days
- Fix released: within 30 days for critical issues, within 90 days for high-severity issues
## Security Features
FetchML implements defense-in-depth security for ML research systems:
### Authentication & Authorization
- **Argon2id API Key Hashing**: Memory-hard hashing resists GPU cracking
- **RBAC with Role Inheritance**: Granular permissions (admin, data_scientist, data_engineer, viewer, operator)
- **Constant-time Comparison**: Prevents timing attacks on key validation
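
To illustrate the constant-time comparison point, here is a minimal sketch using Go's `crypto/subtle`. It is not FetchML's actual code: the function names are hypothetical, and SHA-256 stands in for the real hashing step (the policy names Argon2id) so that the example stays within the standard library.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// digest returns the SHA-256 digest of an API key.
// ASSUMPTION: SHA-256 is a stand-in for the real key-hashing step;
// only the comparison below is the point of this sketch.
func digest(key string) [sha256.Size]byte {
	return sha256.Sum256([]byte(key))
}

// keyMatches compares two fixed-length digests with ConstantTimeCompare,
// whose running time does not depend on where the inputs first differ.
func keyMatches(presented string, stored [sha256.Size]byte) bool {
	got := digest(presented)
	return subtle.ConstantTimeCompare(got[:], stored[:]) == 1
}

func main() {
	stored := digest("example-key") // hypothetical key, for illustration only
	fmt.Println(keyMatches("example-key", stored))
	fmt.Println(keyMatches("wrong-key", stored))
}
```

A plain `==` or `bytes.Equal` on the digests may return as soon as a byte differs, which is the timing signal this technique is meant to remove.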
### Cryptographic Practices
- **Ed25519 Manifest Signing**: Tamper detection for run manifests
- **SHA-256 with Salt**: Legacy key support with migration path
- **Secure Key Generation**: 256-bit entropy for all API keys
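
As a rough illustration of Ed25519 manifest signing, the sketch below signs and verifies a byte slice with Go's standard `crypto/ed25519`. The manifest contents and function names are hypothetical; how FetchML serializes manifests and stores keys is not covered here.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// newSigningKey generates an Ed25519 key pair from the OS CSPRNG.
func newSigningKey() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// signManifest returns a detached signature over the raw manifest bytes.
func signManifest(priv ed25519.PrivateKey, manifest []byte) []byte {
	return ed25519.Sign(priv, manifest)
}

// verifyManifest reports whether sig is valid for manifest under pub.
func verifyManifest(pub ed25519.PublicKey, manifest, sig []byte) bool {
	return ed25519.Verify(pub, manifest, sig)
}

func main() {
	pub, priv, err := newSigningKey()
	if err != nil {
		panic(err)
	}
	manifest := []byte(`{"run_id":"example","seed":42}`) // hypothetical manifest
	sig := signManifest(priv, manifest)
	fmt.Println("original verifies:", verifyManifest(pub, manifest, sig))

	tampered := []byte(`{"run_id":"example","seed":43}`)
	fmt.Println("tampered verifies:", verifyManifest(pub, tampered, sig))
}
```

Because the signature covers the exact bytes, any scheme like this needs a stable serialization of the manifest before signing.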
### Container Security
- **Rootless Podman**: No privileged containers
- **Capability Dropping**: `--cap-drop ALL` by default
- **No New Privileges**: `no-new-privileges` security opt
- **Read-only Root Filesystem**: Immutable base image
### Input Validation
- **Path Traversal Prevention**: Canonical path validation
- **Command Injection Protection**: Shell metacharacter filtering
- **Length Limits**: Prevents DoS via oversized inputs
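
One common way to implement canonical path validation is to join the user-supplied path onto a fixed base directory and reject anything that resolves outside it. The sketch below assumes that approach; `safeJoin` and the base directory are hypothetical names, not FetchML APIs.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errEscapes = errors.New("path escapes base directory")

// safeJoin joins a user-supplied relative path onto base and rejects any
// result that lands outside base after lexical cleaning.
func safeJoin(base, userPath string) (string, error) {
	if filepath.IsAbs(userPath) {
		return "", errEscapes
	}
	full := filepath.Join(base, userPath) // Join also cleans "." and ".." elements
	rel, err := filepath.Rel(base, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errEscapes
	}
	return full, nil
}

func main() {
	base := "/data/runs" // hypothetical storage root
	for _, p := range []string{"exp1/out.csv", "../../etc/passwd", "/etc/passwd"} {
		full, err := safeJoin(base, p)
		fmt.Println(p, "->", full, err)
	}
}
```

This check is purely lexical. If symlinks can exist under the base directory, resolving the result with `filepath.EvalSymlinks` and re-checking it would be a reasonable extra step.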
### Audit & Monitoring
- **Structured Audit Logging**: JSON-formatted security events
- **Hash-chained Logs**: Tamper-evident audit trail
- **Anomaly Detection**: Alerts on brute-force attempts and privilege escalation
- **Security Metrics**: Prometheus integration
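
The idea behind a hash-chained log is that each entry's hash covers the previous entry's hash, so editing or deleting an earlier record breaks every later link. The following is a simplified sketch under that assumption; the entry layout and field names are hypothetical and are not FetchML's audit format.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// entry is one audit record; Hash covers PrevHash plus the event payload.
type entry struct {
	Event    string `json:"event"`
	PrevHash string `json:"prev_hash"`
	Hash     string `json:"hash"`
}

// hashEntry derives an entry hash. PrevHash is hex, so the newline
// separator cannot be confused with its contents.
func hashEntry(prevHash, event string) string {
	sum := sha256.Sum256([]byte(prevHash + "\n" + event))
	return hex.EncodeToString(sum[:])
}

// appendEvent links a new entry to the tail of the chain.
func appendEvent(chain []entry, event string) []entry {
	prev := ""
	if len(chain) > 0 {
		prev = chain[len(chain)-1].Hash
	}
	return append(chain, entry{Event: event, PrevHash: prev, Hash: hashEntry(prev, event)})
}

// verifyChain returns the index of the first broken entry, or -1 if intact.
func verifyChain(chain []entry) int {
	prev := ""
	for i, e := range chain {
		if e.PrevHash != prev || e.Hash != hashEntry(prev, e.Event) {
			return i
		}
		prev = e.Hash
	}
	return -1
}

func main() {
	var chain []entry
	chain = appendEvent(chain, `{"type":"auth_failure","user":"alice"}`)
	chain = appendEvent(chain, `{"type":"role_change","user":"bob"}`)
	line, _ := json.Marshal(chain[1])
	fmt.Println(string(line))
	fmt.Println("first broken entry:", verifyChain(chain))

	chain[0].Event = `{"type":"auth_success","user":"alice"}` // simulate tampering
	fmt.Println("first broken entry:", verifyChain(chain))
}
```

A chain like this detects edits only if the latest hash is also kept somewhere the attacker cannot rewrite, for example in a separate system.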
### Supply Chain
- **Static Analysis & Dependency Scanning**: gosec (static analysis) and govulncheck (known-vulnerability scanning) in CI
- **No `unsafe` Package**: Prohibited in production code
- **Manifest Signing**: Ed25519 signatures for integrity
## Supported Versions
| Version | Supported |
| ------- | ------------------ |
| 0.2.x | :white_check_mark: |
| 0.1.x | :x: |
## Security Checklist (Pre-Release)
### Code Review
- [ ] No hardcoded secrets
- [ ] No `unsafe` usage without justification
- [ ] All user inputs validated
- [ ] All file paths canonicalized
- [ ] No secrets in error messages
### Dependency Audit
- [ ] `go mod verify` passes
- [ ] `govulncheck` shows no vulnerabilities
- [ ] All dependencies pinned
- [ ] No unmaintained dependencies
### Container Security
- [ ] No privileged containers
- [ ] Rootless execution
- [ ] Seccomp/AppArmor applied
- [ ] Network isolation
### Cryptography
- [ ] Argon2id for key hashing
- [ ] Ed25519 for signing
- [ ] TLS 1.3 only
- [ ] No weak ciphers
### Testing
- [ ] Security tests pass
- [ ] Fuzz tests for parsers
- [ ] Authentication bypass tested
- [ ] Container escape tested
## Security Commands
```bash
# Run security scan
make security-scan
# Check for vulnerabilities
govulncheck ./...
# Static analysis
gosec ./...
# Check for unsafe usage
grep -r "unsafe\." --include="*.go" ./internal ./cmd
# Build with sanitizers
cd native && cmake -DENABLE_ASAN=ON .. && make
```
## Threat Model
### Attack Surfaces
1. **External API**: Researchers submitting malicious jobs
2. **Container Runtime**: Escape to host system
3. **Data Exfiltration**: Stealing datasets/models
4. **Privilege Escalation**: Researcher → admin
5. **Supply Chain**: Compromised dependencies
6. **Secrets Leakage**: API keys in logs/errors
### Mitigations
| Threat | Mitigation |
|--------|------------|
| Malicious Jobs | Input validation, container sandboxing, resource limits |
| Container Escape | Rootless, no-new-privileges, seccomp, read-only root |
| Data Exfiltration | Network policies, audit logging, rate limiting |
| Privilege Escalation | RBAC, least privilege, anomaly detection |
| Supply Chain | Dependency scanning, manifest signing, pinned versions |
| Secrets Leakage | Log sanitization, secrets manager, memory clearing |
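
As one possible shape for the log sanitization mitigation, the sketch below redacts credential values before a line is written. The patterns target the `X-API-Key` and `Authorization: Bearer` headers shown in the API Security section; the function name and the `[REDACTED]` marker are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// secretPatterns match credential-bearing headers. Group 1 captures the
// label so that only the secret value is replaced.
var secretPatterns = []*regexp.Regexp{
	regexp.MustCompile(`(?i)(x-api-key:\s*)\S+`),
	regexp.MustCompile(`(?i)(authorization:\s*bearer\s+)\S+`),
}

// redact replaces credential values with a fixed marker before logging.
func redact(line string) string {
	for _, re := range secretPatterns {
		line = re.ReplaceAllString(line, "${1}[REDACTED]")
	}
	return line
}

func main() {
	fmt.Println(redact("request failed: X-API-Key: abc123 status=401"))
	fmt.Println(redact("Authorization: Bearer abc123"))
	fmt.Println(redact("GET /health 200"))
}
```

Pattern-based redaction is a backstop. Not passing secrets to the logger in the first place is the stronger control.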
## Responsible Disclosure
We follow responsible disclosure practices:
1. **Report privately**: Email security@fetchml.io with details
2. **Provide details**: Steps to reproduce, impact assessment
3. **Allow time**: Please allow 30 to 90 days, depending on severity, for a fix before public disclosure
4. **Acknowledgment**: We credit researchers who report valid issues
## Security Team
- security@fetchml.io - Security issues and questions
- security-response@fetchml.io - Active incident response
---
*Last updated: 2026-02-19*
---
# Security Guide for Fetch ML Homelab
*The following section covers security best practices for deploying Fetch ML in a homelab environment.*
## Quick Setup
Secure setup requires manual configuration:
1. **Generate API keys**: Use the instructions in [API Security](#api-security) below
2. **Create TLS certificates**: Use OpenSSL commands in [Troubleshooting](#troubleshooting)
3. **Configure security**: Copy and edit `configs/api/homelab-secure.yaml`
4. **Set permissions**: Ensure `.api-keys` and `.env.secure` have 600 permissions
## Security Features
### Authentication
- **API Key Authentication**: SHA-256 hashed API keys
- **Role-based Access Control**: Admin, researcher, analyst roles
- **Permission System**: Granular permissions per resource
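
To make the role and permission model more concrete, here is a small sketch of a deny-by-default permission check with role inheritance. The role names come from the list above; the permission strings and the inheritance order (admin over researcher over analyst) are assumptions for illustration and may not match the real policy.

```go
package main

import "fmt"

// role lists permissions granted directly plus the roles it inherits from.
type role struct {
	perms    []string
	inherits []string
}

// roles is a hypothetical policy used only for this example.
var roles = map[string]role{
	"analyst":    {perms: []string{"jobs:read"}},
	"researcher": {perms: []string{"jobs:submit"}, inherits: []string{"analyst"}},
	"admin":      {perms: []string{"users:manage"}, inherits: []string{"researcher"}},
}

// can reports whether roleName holds perm directly or via inheritance.
func can(roleName, perm string) bool {
	return canVisit(roleName, perm, map[string]bool{})
}

// canVisit walks the inheritance graph; seen guards against cycles
// in a misconfigured policy.
func canVisit(roleName, perm string, seen map[string]bool) bool {
	if seen[roleName] {
		return false
	}
	seen[roleName] = true
	r, ok := roles[roleName]
	if !ok {
		return false // unknown roles get nothing (deny by default)
	}
	for _, p := range r.perms {
		if p == perm {
			return true
		}
	}
	for _, parent := range r.inherits {
		if canVisit(parent, perm, seen) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("admin can read jobs:", can("admin", "jobs:read"))
	fmt.Println("analyst can submit jobs:", can("analyst", "jobs:submit"))
}
```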
### Network Security
- **TLS/SSL**: HTTPS encrypted communication
- **IP Whitelisting**: Restrict access to trusted networks
- **Rate Limiting**: Prevent abuse and DoS attacks
- **Reverse Proxy**: Caddy with security headers
### Data Protection
- **Path Traversal Protection**: Prevents directory escape attacks
- **Package Validation**: Blocks dangerous Python packages
- **Input Validation**: Comprehensive input sanitization
## Configuration Files
### Secure Config Location
- `configs/api/homelab-secure.yaml` - Main secure configuration
### API Keys
- `.api-keys` - Generated API keys (600 permissions)
- Never commit to version control
- Store in password manager
### TLS Certificates
- `ssl/cert.pem` - TLS certificate
- `ssl/key.pem` - Private key (600 permissions)
### Environment Variables
- `.env.secure` - JWT secret and other secrets (600 permissions)
## Deployment Options
### Option 1: Docker Compose (Recommended)
```bash
# Configure secure setup manually (see Quick Setup above)
# Copy and edit the secure configuration
cp configs/api/homelab-secure.yaml configs/api/my-secure.yaml
# Edit with your API keys, TLS settings, and IP whitelist
# Deploy with security overlay
docker-compose -f docker-compose.yml -f docker-compose.homelab-secure.yml up -d
```
### Option 2: Direct Deployment
```bash
# Configure secure setup manually (see Quick Setup above)
# Copy and edit the secure configuration
cp configs/api/homelab-secure.yaml configs/api/my-secure.yaml
# Edit with your API keys, TLS settings, and IP whitelist
# Load environment variables (set -a exports them so the server process inherits them)
set -a; source .env.secure; set +a
# Start server
./api-server -config configs/api/my-secure.yaml
```
## Security Checklist
### Before Deployment
- [ ] Generate unique API keys (don't use defaults)
- [ ] Set strong JWT secret
- [ ] Enable TLS/SSL
- [ ] Configure IP whitelist for your network
- [ ] Set up rate limiting
- [ ] Enable Redis authentication
### Network Security
- [ ] Use HTTPS only (disable HTTP)
- [ ] Restrict API access to trusted IPs
- [ ] Use reverse proxy (Caddy)
- [ ] Enable security headers
- [ ] Monitor access logs
### Data Protection
- [ ] Regular backups of configuration
- [ ] Secure storage of API keys
- [ ] Encrypt sensitive data at rest
- [ ] Regular security updates
### Monitoring
- [ ] Enable security logging
- [ ] Monitor failed authentication attempts
- [ ] Set up alerts for suspicious activity
- [ ] Regular security audits
## API Security
### Authentication Headers
```bash
# Use API key in header
curl -H "X-API-Key: your-api-key" https://localhost:9101/health
# Or Bearer token
curl -H "Authorization: Bearer your-api-key" https://localhost:9101/health
```
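
For generating a key, one option is a short program like the sketch below. It draws 256 bits from the OS random source and prints both the key and its SHA-256 hex digest, which is the same value the `sha256sum` check in Troubleshooting produces. The function names are hypothetical, and the exact config field that holds the hash is not specified in this guide.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// newAPIKey returns 32 random bytes (256 bits) from the OS CSPRNG, hex-encoded.
func newAPIKey() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// hashAPIKey returns the hex SHA-256 digest of the key, equivalent to
// `echo -n "$key" | sha256sum | cut -d' ' -f1`.
func hashAPIKey(key string) string {
	sum := sha256.Sum256([]byte(key))
	return hex.EncodeToString(sum[:])
}

func main() {
	key, err := newAPIKey()
	if err != nil {
		panic(err)
	}
	fmt.Println("key: ", key)             // hand to the client; keep in a password manager
	fmt.Println("hash:", hashAPIKey(key)) // value to compare against the server config
}
```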
### Rate Limits
- Default: 60 requests per minute
- Burst: 10 requests
- Per IP address
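
These three settings map naturally onto a per-IP token bucket: tokens refill at 60 per minute and the bucket holds at most 10. The sketch below assumes that interpretation; the type names and the injectable clock are illustrative and are not taken from the FetchML code.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type bucket struct {
	tokens float64
	last   time.Time
}

// limiter is a per-IP token bucket: tokens refill at rate per second up to burst.
type limiter struct {
	mu      sync.Mutex
	rate    float64 // tokens added per second
	burst   float64 // bucket capacity
	now     func() time.Time
	buckets map[string]*bucket
}

func newLimiter(perMinute, burst int, now func() time.Time) *limiter {
	return &limiter{
		rate:    float64(perMinute) / 60,
		burst:   float64(burst),
		now:     now,
		buckets: make(map[string]*bucket),
	}
}

// allow refills the bucket for ip based on elapsed time, then spends one token.
func (l *limiter) allow(ip string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	t := l.now()
	b, ok := l.buckets[ip]
	if !ok {
		b = &bucket{tokens: l.burst, last: t}
		l.buckets[ip] = b
	}
	b.tokens += t.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = t
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// manualClock is a settable clock so the demo does not depend on wall time.
type manualClock struct{ t time.Time }

func (c *manualClock) Now() time.Time          { return c.t }
func (c *manualClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

func main() {
	clock := &manualClock{t: time.Unix(0, 0)}
	l := newLimiter(60, 10, clock.Now) // 60 requests/minute, burst of 10
	allowed := 0
	for i := 0; i < 15; i++ {
		if l.allow("192.168.1.20") {
			allowed++
		}
	}
	fmt.Println("allowed out of 15 back-to-back requests:", allowed)
	clock.Advance(2 * time.Second)
	fmt.Println("after 2s:", l.allow("192.168.1.20"), l.allow("192.168.1.20"), l.allow("192.168.1.20"))
}
```

In a real server `time.Now` would be passed as the clock, and idle buckets would need periodic eviction so the map does not grow without bound.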
### IP Whitelisting
Configure in `configs/api/homelab-secure.yaml` (or your edited copy):
```yaml
security:
  ip_whitelist:
    - "127.0.0.1"
    - "192.168.1.0/24" # Your local network
```
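
For intuition about how entries such as `127.0.0.1` and `192.168.1.0/24` are matched, here is a minimal sketch using Go's `net/netip`. It assumes single addresses are treated as exact matches and CIDR entries as ranges; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseWhitelist accepts single addresses ("127.0.0.1") and CIDR ranges
// ("192.168.1.0/24"), normalising both to prefixes.
func parseWhitelist(entries []string) ([]netip.Prefix, error) {
	prefixes := make([]netip.Prefix, 0, len(entries))
	for _, e := range entries {
		if strings.Contains(e, "/") {
			p, err := netip.ParsePrefix(e)
			if err != nil {
				return nil, fmt.Errorf("whitelist entry %q: %w", e, err)
			}
			prefixes = append(prefixes, p.Masked())
			continue
		}
		a, err := netip.ParseAddr(e)
		if err != nil {
			return nil, fmt.Errorf("whitelist entry %q: %w", e, err)
		}
		prefixes = append(prefixes, netip.PrefixFrom(a, a.BitLen()))
	}
	return prefixes, nil
}

// allowed reports whether the client address falls inside any prefix.
func allowed(prefixes []netip.Prefix, client string) bool {
	addr, err := netip.ParseAddr(client)
	if err != nil {
		return false // unparseable addresses are denied
	}
	addr = addr.Unmap() // treat ::ffff:a.b.c.d as plain IPv4
	for _, p := range prefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	wl, err := parseWhitelist([]string{"127.0.0.1", "192.168.1.0/24"})
	if err != nil {
		panic(err)
	}
	for _, c := range []string{"192.168.1.42", "10.0.0.5"} {
		fmt.Println(c, "allowed:", allowed(wl, c))
	}
}
```

Behind a reverse proxy the address to check is the original client's, not the proxy's, so the proxy's forwarding header should be trusted only when the connection really comes from that proxy.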
## Container Security
### Docker Security
- Use non-root users
- Minimal container images
- Resource limits
- Network segmentation
### Podman Security
- Rootless containers
- SELinux confinement
- Seccomp profiles
- Read-only filesystems where possible
## Troubleshooting
### Common Issues
**TLS Certificate Errors**
```bash
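# Regenerate a self-signed certificate (most modern TLS clients require the SAN entry)
openssl req -x509 -newkey rsa:4096 -keyout ssl/key.pem -out ssl/cert.pem -days 365 -nodes \
  -subj "/C=US/ST=Homelab/L=Local/O=FetchML/CN=localhost" \
  -addext "subjectAltName=DNS:localhost,IP:127.0.0.1"
chmod 600 ssl/key.pem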
```
**API Key Authentication Failed**
```bash
# Check your API key
grep "ADMIN_API_KEY" .api-keys
# Verify hash matches config
echo -n "your-api-key" | sha256sum | cut -d' ' -f1
```
**IP Whitelist Blocking**
```bash
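# Check your public IP (relevant when connecting from outside your network)
curl -s https://api.ipify.org
# Check your LAN IP on Linux (relevant for a local range such as 192.168.1.0/24)
hostname -I
# Add the address or its CIDR range to ip_whitelist in the config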
```
### Security Logs
Monitor these files:
- `logs/fetch_ml.log` - Application logs
- Caddy access logs (configure if enabled)
- Docker logs: `docker logs ml-experiments-api`
## Best Practices
1. **Regular Updates**: Keep dependencies updated
2. **Principle of Least Privilege**: Minimal required permissions
3. **Defense in Depth**: Multiple security layers
4. **Monitor and Alert**: Security monitoring
5. **Backup and Recovery**: Regular secure backups
## Emergency Procedures
### Compromised API Keys
1. Immediately revoke compromised keys
2. Generate new API keys
3. Update all clients
4. Review access logs
### Security Incident
1. Isolate affected systems
2. Preserve evidence
3. Review access logs
4. Update security measures
5. Document incident
## Support
For security issues:
- Check logs for error messages
- Review configuration files
- Test with minimal setup
- Report security vulnerabilities privately to security@fetchml.io (see Reporting a Vulnerability above)