Jeremie Fraeys c5049a2fdf feat: initialize FetchML ML platform with core project structure
- Add comprehensive README with architecture overview and quick start guide
- Set up Go module with production-ready dependencies
- Configure build system with Makefile for development and production builds
- Add Docker Compose for local development environment
- Include project configuration files (linting, Python, etc.)

This establishes the foundation for a production-ready ML experiment platform
with task queuing, monitoring, and modern CLI/API interface.
2025-12-04 16:52:09 -05:00
Files (all last changed in the commit above):

  • .flake8
  • .gitignore
  • .golangci.yml
  • .pylintrc
  • docker-compose.yml
  • go.mod
  • go.sum
  • LICENSE
  • Makefile
  • pyproject.toml
  • README.md

FetchML - Machine Learning Platform

A production-ready ML experiment platform with task queuing, monitoring, and a modern CLI/API.

Features

  • 🚀 Production Resilience - Task leasing, smart retries, dead-letter queues
  • 📊 Monitoring - Grafana/Prometheus/Loki with auto-provisioned dashboards
  • 🔐 Security - API key auth, TLS, rate limiting, IP whitelisting
  • ⚡ Performance - Go API server + Zig CLI for speed
  • 📦 Easy Deployment - Docker Compose (dev) or systemd (prod)
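
The retry and dead-letter behavior named in the feature list can be illustrated with a minimal Go sketch. This is not FetchML's actual code: the `Task` type, `Backoff`, `Process`, the delay values, and the in-memory dead-letter slice are all hypothetical names and choices made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Task is a hypothetical unit of work; the real task type is not shown in this README.
type Task struct {
	ID       string
	Attempts int
}

// errTransient stands in for any retryable failure.
var errTransient = errors.New("transient failure")

// noSleep skips real waiting so the demo finishes immediately.
func noSleep(time.Duration) {}

// Backoff returns the delay before the next try after the given attempt
// (1-based). The delay doubles from base on each attempt and is capped at max.
func Backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 1; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// Process runs handler until it succeeds or maxAttempts is reached.
// A task that exhausts its attempts is appended to the dead-letter slice.
func Process(task *Task, maxAttempts int, handler func(*Task) error,
	sleep func(time.Duration), dead *[]Task) bool {
	for task.Attempts < maxAttempts {
		task.Attempts++
		if err := handler(task); err == nil {
			return true
		}
		if task.Attempts < maxAttempts {
			sleep(Backoff(task.Attempts, 100*time.Millisecond, 5*time.Second))
		}
	}
	*dead = append(*dead, *task)
	return false
}

func main() {
	var dead []Task

	// Fails twice, then succeeds on the third attempt.
	flaky := func(task *Task) error {
		if task.Attempts < 3 {
			return errTransient
		}
		return nil
	}
	ok := Process(&Task{ID: "a"}, 5, flaky, noSleep, &dead)
	fmt.Println("task a succeeded:", ok, "dead letters:", len(dead))

	// Always fails, so it ends up in the dead-letter slice.
	always := func(*Task) error { return errTransient }
	ok = Process(&Task{ID: "b"}, 3, always, noSleep, &dead)
	fmt.Println("task b succeeded:", ok, "dead letters:", len(dead))
}
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays; a production worker would likely pass `time.Sleep` and persist dead letters in Redis rather than in memory.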

Quick Start

Development (macOS/Linux)

# Clone and start
git clone <your-repo>
cd fetch_ml
docker-compose up -d

# Access Grafana: http://localhost:3000 (admin/admin)

Production (Linux)

# Set up the application
sudo ./scripts/setup-prod.sh

# Set up monitoring
sudo ./scripts/setup-monitoring-prod.sh

# Build and install
make prod
make install

# Start services
sudo systemctl start fetchml-api fetchml-worker
sudo systemctl start prometheus grafana loki promtail

Architecture

┌──────────────┐   WebSocket   ┌──────────────┐
│  Zig CLI/TUI │◄─────────────►│  API Server  │
└──────────────┘               │    (Go)      │
                               └──────┬───────┘
                                      │
                        ┌─────────────┼─────────────┐
                        │             │             │
                   ┌────▼────┐   ┌───▼────┐   ┌───▼────┐
                   │  Redis  │   │ Worker │   │  Loki  │
                   │ (Queue) │   │  (Go)  │   │ (Logs) │
                   └─────────┘   └────────┘   └────────┘

Usage

API Server

# Development (stderr logging)
go run cmd/api-server/main.go --config configs/config-dev.yaml

# Production (file logging)
go run cmd/api-server/main.go --config configs/config-no-tls.yaml

CLI

# Build
cd cli && zig build prod

# Run experiment
./cli/zig-out/bin/ml run --config config.toml

# Check status
./cli/zig-out/bin/ml status

Docker

make docker-run      # Start all services
make docker-logs     # View logs
make docker-stop     # Stop services

Development

Prerequisites

  • Go 1.21+
  • Zig 0.11+
  • Redis
  • Docker (for local dev)

Build

make build           # All components
make dev             # Fast dev build
make prod            # Optimized production build

Test

make test            # All tests
make test-unit       # Unit tests only
make test-coverage   # With coverage report

Configuration

Development (configs/config-dev.yaml)

logging:
  level: "info"
  file: ""  # stderr only

redis:
  url: "redis://localhost:6379"

Production (configs/config-no-tls.yaml)

logging:
  level: "info"
  file: "./logs/fetch_ml.log"  # file only

redis:
  url: "redis://redis:6379"
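
As a rough illustration of how the `logging.file` setting could select the log destination, here is a minimal Go sketch. The `LoggingConfig` struct and `OpenLogWriter` function are hypothetical and are not taken from the FetchML source; only the "empty string means stderr" convention comes from the configs above.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// LoggingConfig mirrors the `logging` keys shown above (hypothetical struct).
type LoggingConfig struct {
	Level string
	File  string // empty string means "log to stderr"
}

// ToStderr reports whether the config selects stderr logging.
func (c LoggingConfig) ToStderr() bool { return c.File == "" }

// OpenLogWriter returns the destination for log output plus a close function.
// With an empty File it returns os.Stderr; otherwise it creates the parent
// directory if needed and opens the file in append mode.
func OpenLogWriter(cfg LoggingConfig) (io.Writer, func() error, error) {
	if cfg.ToStderr() {
		return os.Stderr, func() error { return nil }, nil
	}
	if err := os.MkdirAll(filepath.Dir(cfg.File), 0o755); err != nil {
		return nil, nil, err
	}
	f, err := os.OpenFile(cfg.File, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o644)
	if err != nil {
		return nil, nil, err
	}
	return f, f.Close, nil
}

func main() {
	// Development-style config: file is empty, so output goes to stderr.
	w, closeFn, err := OpenLogWriter(LoggingConfig{Level: "info", File: ""})
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot open log:", err)
		os.Exit(1)
	}
	defer closeFn()
	fmt.Fprintln(w, "logging to stderr")
}
```

Returning a close function alongside the writer lets the caller treat both destinations the same way without closing `os.Stderr` by accident.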

Monitoring

Grafana Dashboards (Auto-Provisioned)

  • ML Task Queue - Queue depth, task duration, failure rates
  • Application Logs - Log streams, error tracking, search

Access: http://localhost:3000 (dev) or http://YOUR_SERVER:3000 (prod)

Metrics

  • Queue depth and task processing rates
  • Retry attempts by error category
  • Dead letter queue size
  • Lease expirations
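
To make the "lease expirations" metric concrete, the following Go sketch shows one way task leasing can work. It is an in-memory illustration only: `LeaseTable`, `Acquire`, and `Reap` are hypothetical names, and the real platform presumably keeps this state in Redis.

```go
package main

import (
	"fmt"
	"time"
)

// lease records which worker holds a task and when the claim expires
// (Unix seconds).
type lease struct {
	worker  string
	expires int64
}

// LeaseTable is a hypothetical in-memory lease store.
// It is not safe for concurrent use without external locking.
type LeaseTable struct {
	ttl    int64 // lease lifetime in seconds
	leases map[string]lease
}

// NewLeaseTable creates a table whose leases last ttlSeconds.
func NewLeaseTable(ttlSeconds int64) *LeaseTable {
	return &LeaseTable{ttl: ttlSeconds, leases: make(map[string]lease)}
}

// Acquire grants taskID to worker unless another unexpired lease exists.
func (lt *LeaseTable) Acquire(taskID, worker string, now int64) bool {
	if l, ok := lt.leases[taskID]; ok && now < l.expires {
		return false
	}
	lt.leases[taskID] = lease{worker: worker, expires: now + lt.ttl}
	return true
}

// Reap removes expired leases and returns their task IDs so the caller can
// re-queue them. The length of the result is the number of lease expirations.
func (lt *LeaseTable) Reap(now int64) []string {
	var expired []string
	for id, l := range lt.leases {
		if now >= l.expires {
			expired = append(expired, id)
			delete(lt.leases, id)
		}
	}
	return expired
}

func main() {
	lt := NewLeaseTable(30)
	now := time.Now().Unix()

	fmt.Println("worker-1 acquires:", lt.Acquire("task-1", "worker-1", now))
	fmt.Println("worker-2 acquires:", lt.Acquire("task-1", "worker-2", now+10))
	fmt.Println("expired after 31s:", lt.Reap(now+31))
}
```

A worker that crashes simply stops renewing its lease, so the task becomes available again once the lease expires.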

Documentation

Makefile Targets

# Build
make build               # Build all components
make prod                # Production build
make clean               # Clean artifacts

# Docker
make docker-build        # Build image
make docker-run          # Start services
make docker-stop         # Stop services

# Test
make test                # All tests
make test-coverage       # With coverage

# Production (Linux only)
make setup               # Set up the application
make setup-monitoring    # Set up monitoring
make install             # Install binaries

Security

  • TLS/HTTPS - Encryption in transit
  • API Keys - Hashed with SHA-256
  • Rate Limiting - Per-user quotas
  • IP Whitelist - Network restrictions
  • Audit Logging - All API access logged
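
The API key hashing mentioned above could look roughly like the Go sketch below. The function names `HashAPIKey` and `VerifyAPIKey` are hypothetical, and hex encoding of the digest is an assumption; the README only states that keys are hashed with SHA-256.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// HashAPIKey returns the lowercase hex SHA-256 digest of key.
// Only this digest would be stored, never the key itself.
func HashAPIKey(key string) string {
	sum := sha256.Sum256([]byte(key))
	return hex.EncodeToString(sum[:])
}

// VerifyAPIKey hashes the presented key and compares it with the stored
// digest in constant time to avoid leaking information through timing.
func VerifyAPIKey(presented, storedHash string) bool {
	got := HashAPIKey(presented)
	return subtle.ConstantTimeCompare([]byte(got), []byte(storedHash)) == 1
}

func main() {
	stored := HashAPIKey("example-key") // placeholder key for the demo
	fmt.Println("stored digest:", stored)
	fmt.Println("correct key accepted:", VerifyAPIKey("example-key", stored))
	fmt.Println("wrong key accepted:", VerifyAPIKey("other-key", stored))
}
```

A plain SHA-256 digest is reasonable for long, randomly generated API keys; it would not be an appropriate choice for user-chosen passwords.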

License

MIT - See LICENSE

Contributing

Contributions welcome! This is a personal homelab project, but PRs are appreciated.