Troubleshooting

Common issues and solutions for Fetch ML.

Quick Fixes

Services Not Starting

# Check container status
docker ps --filter "name=ml-"

# Restart development stack
make dev-down
make dev-up

API Not Responding

# Check health endpoint
curl http://localhost:8080/health

# Check if port is in use
lsof -i :8080
lsof -i :8443

# Force-kill the process holding port 8080 (SIGKILL skips graceful shutdown)
kill -9 $(lsof -ti :8080)
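
After a restart the API can take a few seconds to come up, so a single curl may fail even when nothing is wrong. The sketch below polls the health endpoint a few times before giving up. The function name wait_for_health and its defaults (10 attempts, 2 seconds apart) are illustrative assumptions, not part of Fetch ML.

```shell
# Poll a health URL until it answers successfully or the attempts run out.
# wait_for_health is a hypothetical helper; adjust URL, attempts and delay as needed.
wait_for_health() {
  local url="${1:-http://localhost:8080/health}"
  local attempts="${2:-10}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failures; --max-time: do not hang on a dead socket
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}
```

A typical call would be `make dev-up && wait_for_health`; a non-zero exit status means the endpoint never answered.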

Database / Redis Issues

# Check Redis from inside its container (a healthy server replies PONG)
docker exec ml-experiments-redis redis-cli ping

# Check API can reach database (via health endpoint)
curl -f http://localhost:8080/health || echo "API not healthy"
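
The Redis check above can be wrapped so that scripts get a clear pass or fail instead of raw output. This is a minimal sketch: check_redis is a hypothetical helper name, and the container name is taken from the command above.

```shell
# Succeed only if redis-cli inside the container answers PING with PONG.
# check_redis is a hypothetical helper, not a Fetch ML command.
check_redis() {
  local container="${1:-ml-experiments-redis}"
  local reply
  reply="$(docker exec "$container" redis-cli ping 2>/dev/null)" || reply=""
  if [ "$reply" = "PONG" ]; then
    echo "redis ok"
  else
    echo "redis not reachable in container $container" >&2
    return 1
  fi
}
```

Comparing against the literal PONG reply also catches the case where the container is running but Redis inside it is not answering.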

Common Errors

Authentication Errors

  • Invalid API key: Check config and regenerate hash
  • JWT expired: Check jwt_expiry setting

Database Errors

  • Connection failed: Verify database type and connection params
  • No such table: Run migrations with --migrate (see Quick Start)

Container Errors

  • Runtime not found: Set runtime: docker (testing only) in config
  • Image pull failed: Check registry access

Performance Issues

  • High memory: Adjust resources.memory_limit
  • Slow jobs: Check worker count and queue size

Development Issues

  • Build fails: Run go mod tidy, then clear the Zig build output with cd cli && rm -rf zig-out zig-cache
  • Tests fail: Ensure the dev stack is running (make dev-up), or use make test-auth

CLI Issues

  • Not found: Build the CLI with cd cli && zig build --release=fast
  • Connection errors: Check --server and --api-key

Network Issues

  • Port conflicts: Find the processes holding the ports with lsof -i :8080 and lsof -i :8443, then stop them
  • Firewall: Allow ports 8080, 8443, 6379, 5432
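
To check all of the ports listed above in one pass, a small loop around lsof is enough. This is a sketch under the assumption that lsof is installed; port_report is a hypothetical helper name.

```shell
# Print, for each port given, whether something is already listening on it.
# port_report is a hypothetical helper, not a Fetch ML command.
port_report() {
  local port pids
  for port in "$@"; do
    # -t prints bare PIDs; -sTCP:LISTEN restricts the match to listening sockets
    pids="$(lsof -t -iTCP:"$port" -sTCP:LISTEN 2>/dev/null | tr '\n' ' ')"
    if [ -n "$pids" ]; then
      echo "port $port in use by PID(s): ${pids% }"
    else
      echo "port $port free"
    fi
  done
}
```

For the ports in this section the call would be `port_report 8080 8443 6379 5432`. Listeners owned by other users may only show up when the command runs with elevated privileges.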

Configuration Issues

  • Invalid YAML: Validate the file with python3 -c "import yaml; yaml.safe_load(open('config.yaml'))"
  • Missing fields: See [Configuration Schema](configuration-schema.md)

Debug Information

./bin/api-server --version
docker ps --filter "name=ml-"
docker logs ml-experiments-api 2>&1 | grep ERROR
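
When filing a bug report it can help to capture these commands into one file. The sketch below does that; collect_debug, the default file name fetchml-debug.txt, and the 50-line cap are illustrative assumptions.

```shell
# Write version, container status and recent API errors into a single file.
# collect_debug is a hypothetical helper, not a Fetch ML command.
collect_debug() {
  local out="${1:-fetchml-debug.txt}"
  {
    echo "== version =="
    if [ -x ./bin/api-server ]; then
      ./bin/api-server --version 2>&1
    else
      echo "api-server binary not found"
    fi
    echo "== containers =="
    docker ps --filter "name=ml-" 2>&1 || true
    echo "== api errors (last 50) =="
    # docker logs replays the container's stderr on stderr, so merge it before grep
    docker logs ml-experiments-api 2>&1 | grep ERROR | tail -n 50 || true
  } >"$out"
  echo "wrote $out"
}
```

Review the file before sharing it, since log lines may contain hostnames or other details from your environment.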

Emergency Reset

# Stop the dev stack, then prune unused Docker volumes
# (docker volume prune is host-wide, not limited to Fetch ML volumes)
make dev-down
docker volume prune

# Remove local data if needed
rm -rf data/ results/ *.db

# Start fresh dev stack
make dev-up
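
Because the reset deletes local data, a confirmation step in front of it can prevent accidents. This is a minimal sketch; confirm_reset and the word "reset" as the confirmation token are assumptions made for illustration.

```shell
# Ask for an explicit confirmation word before any destructive step runs.
# confirm_reset is a hypothetical helper, not a Fetch ML command.
confirm_reset() {
  local answer
  printf 'This deletes data/, results/, *.db and unused Docker volumes. Type "reset" to continue: '
  read -r answer || answer=""
  [ "$answer" = "reset" ]
}

# Example (commented out so that sourcing this file deletes nothing):
# confirm_reset && {
#   make dev-down
#   docker volume prune
#   rm -rf data/ results/ *.db
#   make dev-up
# }
```

Anything other than the exact word, including empty input, makes the function return non-zero, so the reset steps after && are skipped.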