fetch_ml/cmd/tui/README.md
Jeremie Fraeys 803677be57 feat: implement Go backend with comprehensive API and internal packages
- Add API server with WebSocket support and REST endpoints
- Implement authentication system with API keys and permissions
- Add task queue system with Redis backend and error handling
- Include storage layer with database migrations and schemas
- Add comprehensive logging, metrics, and telemetry
- Implement security middleware and network utilities
- Add experiment management and container orchestration
- Include configuration management with smart defaults
2025-12-04 16:53:53 -05:00

# FetchML TUI - Terminal User Interface
An interactive terminal dashboard for managing ML experiments, monitoring system resources, and controlling job execution.
## Features
### 📊 Real-time Monitoring
- **Job Status** - Track pending, running, finished, and failed jobs
- **GPU Metrics** - Monitor GPU utilization, memory, and temperature
- **Container Status** - View running Podman/Docker containers
- **Task Queue** - See queued tasks with priorities and status
### 🎮 Interactive Controls
- **Queue Jobs** - Submit jobs with custom arguments and priorities
- **View Logs** - Real-time log viewing for running jobs
- **Cancel Tasks** - Stop running tasks
- **Delete Jobs** - Remove pending jobs
- **Mark Failed** - Manually mark stuck jobs as failed
### ⚙️ Settings Management
- **API Key Configuration** - Set and update API keys on the fly
- **In-memory Storage** - Settings are held in memory and last only for the current session
### 🎨 Modern UI
- **Clean Design** - Dark-mode friendly with adaptive colors
- **Responsive Layout** - Adjusts to terminal size
- **Context-aware Help** - Shows relevant shortcuts for each view
- **Mouse Support** - Optional mouse navigation
## Quick Start
### Running the TUI
```bash
# Using make (recommended)
make tui-dev # Dev mode (remote server)
make tui # Local mode
# Direct execution with CLI config (TOML)
./bin/tui --config ~/.ml/config.toml
# With custom TOML config
./bin/tui --config path/to/config.toml
```
### First Time Setup
1. **Build the binary**
```bash
make build
```
2. **Get your API key**
```bash
./bin/user_manager --config configs/config_dev.yaml --cmd generate-key --username your_name
```
3. **Launch the TUI**
```bash
make tui-dev
```
## Keyboard Shortcuts
### Navigation
| Key | Action |
|-----|--------|
| `1` | Switch to Job List view |
| `g` | Switch to GPU Status view |
| `l` | View logs for selected job |
| `v` | Switch to Task Queue view |
| `o` | Switch to Container Status view |
| `s` | Open Settings |
| `h` or `?` | Toggle help screen |
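
The navigation table can be read as a lookup from key to view. The following is a minimal sketch of that idea, not the TUI's actual code: the `View` type, its constants, and `viewForKey` are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// View identifies one screen of the TUI. The type and constant names are
// hypothetical; only the key bindings come from the table above.
type View int

const (
	ViewJobs View = iota
	ViewGPU
	ViewLogs
	ViewQueue
	ViewContainers
	ViewSettings
	ViewHelp
)

// navKeys mirrors the Navigation table.
var navKeys = map[string]View{
	"1": ViewJobs,
	"g": ViewGPU,
	"l": ViewLogs,
	"v": ViewQueue,
	"o": ViewContainers,
	"s": ViewSettings,
	"h": ViewHelp,
	"?": ViewHelp,
}

// viewForKey returns the view bound to key, or ok=false for unbound keys.
func viewForKey(key string) (View, bool) {
	v, ok := navKeys[key]
	return v, ok
}

func main() {
	for _, k := range []string{"g", "?", "x"} {
		v, ok := viewForKey(k)
		fmt.Printf("key=%q view=%d bound=%v\n", k, v, ok)
	}
}
```

Because the map is keyed on the exact string, `g` (GPU Status view) and `G` (refresh GPU status) stay distinct bindings.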
### Job Management
| Key | Action |
|-----|--------|
| `t` | Queue selected job (default args) |
| `a` | Queue job with custom arguments |
| `c` | Cancel running task |
| `d` | Delete pending job |
| `f` | Mark running job as failed |
### System
| Key | Action |
|-----|--------|
| `r` | Refresh all data |
| `G` | Refresh GPU status only |
| `q` or `Ctrl+C` | Quit |
### Settings View
| Key | Action |
|-----|--------|
| `↑`/`↓` or `j`/`k` | Navigate options |
| `Enter` | Select/Save |
| `Esc` | Exit settings |
## Views
### Job List (Default)
- Shows all jobs across all statuses
- Filter with `/` key
- Navigate with arrow keys or `j`/`k`
- Select and press `l` to view logs
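
How the `/` filter matches jobs is not described here. As a rough sketch, assuming a case-insensitive substring match on job name and status (the real filter may be fuzzy or match other fields), filtering could look like this. `Job` and `filterJobs` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Job is a hypothetical, minimal stand-in for the TUI's job record.
type Job struct {
	Name   string
	Status string // "pending", "running", "finished", or "failed"
}

// filterJobs returns the jobs whose name or status contains query,
// ignoring case. A blank query matches everything.
func filterJobs(jobs []Job, query string) []Job {
	q := strings.ToLower(strings.TrimSpace(query))
	if q == "" {
		return jobs
	}
	var out []Job
	for _, j := range jobs {
		if strings.Contains(strings.ToLower(j.Name), q) ||
			strings.Contains(strings.ToLower(j.Status), q) {
			out = append(out, j)
		}
	}
	return out
}

func main() {
	jobs := []Job{
		{Name: "resnet50-train", Status: "running"},
		{Name: "bert-finetune", Status: "failed"},
		{Name: "resnet18-eval", Status: "pending"},
	}
	for _, j := range filterJobs(jobs, "ResNet") {
		fmt.Println(j.Name, j.Status)
	}
}
```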
### GPU Status
- Real-time GPU metrics on NVIDIA hardware (via `nvidia-smi`)
- GPU info on macOS (via `system_profiler`)
- Utilization, memory, temperature
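
The exact `nvidia-smi` invocation the TUI uses is not shown here. One plausible approach, sketched below as an assumption, is to request CSV output with `nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total,temperature.gpu --format=csv,noheader,nounits` and parse each row. `GPUStat` and `parseGPULine` are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// GPUStat holds one GPU's metrics. The struct is illustrative only.
type GPUStat struct {
	UtilPct     int // GPU utilization in percent
	MemUsedMiB  int // memory in use, MiB
	MemTotalMiB int // total memory, MiB
	TempC       int // temperature in degrees Celsius
}

// parseGPULine parses one CSV row whose columns are, in order:
// utilization.gpu, memory.used, memory.total, temperature.gpu.
func parseGPULine(line string) (GPUStat, error) {
	fields := strings.Split(line, ",")
	if len(fields) != 4 {
		return GPUStat{}, fmt.Errorf("expected 4 fields, got %d", len(fields))
	}
	var vals [4]int
	for i, f := range fields {
		n, err := strconv.Atoi(strings.TrimSpace(f))
		if err != nil {
			return GPUStat{}, fmt.Errorf("field %d: %w", i, err)
		}
		vals[i] = n
	}
	return GPUStat{
		UtilPct:     vals[0],
		MemUsedMiB:  vals[1],
		MemTotalMiB: vals[2],
		TempC:       vals[3],
	}, nil
}

func main() {
	// A sample row in the assumed CSV format; no GPU is needed to run this.
	stat, err := parseGPULine("45, 2048, 8192, 67")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("util=%d%% mem=%d/%d MiB temp=%dC\n",
		stat.UtilPct, stat.MemUsedMiB, stat.MemTotalMiB, stat.TempC)
}
```

Fields that a GPU does not report come back as non-numeric text, so this sketch returns an error for such rows rather than guessing a value.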
### Container Status
- Running Podman/Docker containers
- Container health and status
- System info (Podman/Docker version)
### Task Queue
- All queued tasks with priorities
- Task status and creation time
- Running duration for active tasks
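
The ordering rule for the queue is not specified here. The sketch below assumes that a higher priority number runs first and that ties go to the oldest task; both are assumptions, as are the `Task` fields and the function names.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Task is a hypothetical, minimal task record.
type Task struct {
	ID        string
	Priority  int
	Status    string
	CreatedAt time.Time
	StartedAt time.Time // zero until the task starts
}

// sortQueue orders tasks by descending priority, then oldest first.
// "Higher number wins" is an assumption, not documented behaviour.
func sortQueue(tasks []Task) {
	sort.SliceStable(tasks, func(i, j int) bool {
		if tasks[i].Priority != tasks[j].Priority {
			return tasks[i].Priority > tasks[j].Priority
		}
		return tasks[i].CreatedAt.Before(tasks[j].CreatedAt)
	})
}

// runningFor returns how long a running task has been active at time now,
// rounded to whole seconds. Tasks that are not running report zero.
func runningFor(t Task, now time.Time) time.Duration {
	if t.Status != "running" || t.StartedAt.IsZero() {
		return 0
	}
	return now.Sub(t.StartedAt).Round(time.Second)
}

func main() {
	start := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	tasks := []Task{
		{ID: "eval", Priority: 1, Status: "queued", CreatedAt: start},
		{ID: "train", Priority: 5, Status: "running", CreatedAt: start.Add(time.Minute), StartedAt: start.Add(2 * time.Minute)},
	}
	sortQueue(tasks)
	now := start.Add(10 * time.Minute)
	for _, t := range tasks {
		fmt.Printf("%-6s prio=%d status=%-8s running=%s\n", t.ID, t.Priority, t.Status, runningFor(t, now))
	}
}
```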
### Logs
- Last 200 lines of job output
- Auto-scroll to bottom
- Refreshes with job status
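
Keeping only the last 200 lines can be done with a small helper. This is a sketch of the general technique; `tailLines` is a hypothetical name, and the TUI may instead trim the log on the remote side.

```go
package main

import (
	"fmt"
	"strings"
)

// tailLines returns the last n lines of text. Trailing newlines are ignored
// so that they do not count as extra empty lines.
func tailLines(text string, n int) []string {
	text = strings.TrimRight(text, "\n")
	if text == "" || n <= 0 {
		return nil
	}
	lines := strings.Split(text, "\n")
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	return lines
}

func main() {
	// Build a 300-line log and keep the last 200 lines.
	var b strings.Builder
	for i := 1; i <= 300; i++ {
		fmt.Fprintf(&b, "line %d\n", i)
	}
	tail := tailLines(b.String(), 200)
	fmt.Println(len(tail), tail[0], tail[len(tail)-1])
}
```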
### Settings
- View current API key status
- Update API key
- Save configuration (in-memory)
## Terminal Compatibility
The TUI is built with [Bubble Tea](https://github.com/charmbracelet/bubbletea) and should work in any modern terminal emulator, including the terminals and multiplexers listed here.
### ✅ Fully Supported
- **WezTerm** (recommended)
- **Alacritty**
- **Kitty**
- **iTerm2** (macOS)
- **Terminal.app** (macOS)
- **Windows Terminal**
- **GNOME Terminal**
- **Konsole**
### ✅ Multiplexers
- **tmux**
- **screen**
### Supported Terminal Features
- ✅ 256 colors
- ✅ True color (24-bit)
- ✅ Mouse support
- ✅ Alt screen buffer
- ✅ Adaptive colors (light/dark themes)
## Architecture
### Key Components
- **Model** - Pure data structures (State, Job, Task)
- **View** - Rendering functions (no business logic)
- **Controller** - Message handling and state updates
- **Services** - SSH/Redis communication
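
To illustrate how these components fit together, here is a dependency-free sketch of the model/update/view split. It deliberately does not import Bubble Tea, whose model interface follows the same `Init`/`Update`/`View` shape; every type and function name below is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Model: plain data with no behaviour.
type Job struct {
	Name   string
	Status string
}

type State struct {
	Jobs     []Job
	Selected int
}

// Msg is anything the controller reacts to.
type Msg interface{}

type KeyMsg string    // a key press such as "j" or "k"
type JobsLoaded []Job // a result delivered by a service

// update is the controller: it maps (state, message) to the next state.
func update(s State, msg Msg) State {
	switch m := msg.(type) {
	case JobsLoaded:
		s.Jobs = m
		if s.Selected >= len(s.Jobs) {
			s.Selected = 0
		}
	case KeyMsg:
		switch m {
		case "j", "down":
			if s.Selected < len(s.Jobs)-1 {
				s.Selected++
			}
		case "k", "up":
			if s.Selected > 0 {
				s.Selected--
			}
		}
	}
	return s
}

// view renders the state as text and contains no business logic.
func view(s State) string {
	var b strings.Builder
	for i, j := range s.Jobs {
		cursor := "  "
		if i == s.Selected {
			cursor = "> "
		}
		fmt.Fprintf(&b, "%s%-12s %s\n", cursor, j.Name, j.Status)
	}
	return b.String()
}

func main() {
	s := update(State{}, JobsLoaded{{Name: "train", Status: "running"}, {Name: "eval", Status: "pending"}})
	s = update(s, KeyMsg("j"))
	fmt.Print(view(s))
}
```

Keeping `update` free of I/O means state transitions can be tested without a terminal, while services deliver their results as messages.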
## Configuration
The TUI reads its settings from the CLI's TOML configuration file:
```toml
# ~/.ml/config.toml
worker_host = "localhost"
worker_user = "your_user"
worker_base = "~/ml_jobs"
worker_port = 22
api_key = "your_api_key_here"
```
For CLI usage, run `ml init` to create a default configuration file.
See [Configuration Documentation](../docs/documentation.md#configuration) for details.
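
As an illustration of how these keys might map onto a Go type, the sketch below defines a struct with `toml` tags and a basic sanity check. The struct, the tag convention, and the rules in `Validate` (which keys are required, the placeholder check) are assumptions, not the project's actual loader.

```go
package main

import (
	"errors"
	"fmt"
)

// Config mirrors the keys of ~/.ml/config.toml shown above. The struct is
// hypothetical; the tags follow the convention of common Go TOML decoders.
type Config struct {
	WorkerHost string `toml:"worker_host"`
	WorkerUser string `toml:"worker_user"`
	WorkerBase string `toml:"worker_base"`
	WorkerPort int    `toml:"worker_port"`
	APIKey     string `toml:"api_key"`
}

// Validate returns the first problem found, or nil if the config looks usable.
// Which fields are mandatory is an assumption made for this sketch.
func (c Config) Validate() error {
	if c.WorkerHost == "" {
		return errors.New("worker_host is required")
	}
	if c.WorkerUser == "" {
		return errors.New("worker_user is required")
	}
	if c.WorkerPort < 1 || c.WorkerPort > 65535 {
		return fmt.Errorf("worker_port %d is outside 1-65535", c.WorkerPort)
	}
	if c.APIKey == "" || c.APIKey == "your_api_key_here" {
		return errors.New("api_key is not set")
	}
	return nil
}

func main() {
	cfg := Config{
		WorkerHost: "localhost",
		WorkerUser: "your_user",
		WorkerBase: "~/ml_jobs",
		WorkerPort: 22,
		APIKey:     "your_api_key_here", // placeholder copied from the example
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("config ok")
}
```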
## Troubleshooting
### TUI doesn't start
```bash
# Check if binary exists
ls -la bin/tui
# Rebuild if needed
make build
# Check CLI config
cat ~/.ml/config.toml
```
### Authentication errors
```bash
# Verify CLI config exists
ls -la ~/.ml/config.toml
# Initialize CLI config if needed
ml init
# Test connection
./bin/tui --config ~/.ml/config.toml
```
### Display issues
```bash
# Check terminal type
echo $TERM
# Should be xterm-256color or similar
# If not, set it:
export TERM=xterm-256color
```
### Connection issues
```bash
# Test SSH connection
ssh your_user@your_server
# Test Redis connection
redis-cli ping
```
## Development
### Building
```bash
# Build TUI only
go build -o bin/tui ./cmd/tui
# Build all binaries
make build
```
### Testing
```bash
# Run with stderr redirected to a log file
./bin/tui --config ~/.ml/config.toml 2>tui.log
# Check logs
tail -f tui.log
```
### Code Organization
- Keep files under 300 lines
- Separate concerns (MVC pattern)
- Use descriptive function names
- Add comments for complex logic
## Tips & Tricks
### Efficient Workflow
1. Keep TUI open in one terminal
2. Edit code in another terminal
3. Use `r` to refresh after changes
4. Use `h` to quickly reference shortcuts
### Custom Arguments
When queuing jobs with `a`:
```text
--epochs 100 --lr 0.001 --priority 5
```
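
Whether the TUI consumes `--priority` itself or forwards it to the job is not stated here. Assuming it is extracted before the remaining arguments are passed on, a minimal sketch could look like this. `splitPriority` is a hypothetical helper, and it splits on whitespace only, so quoted arguments containing spaces are not handled.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// splitPriority pulls "--priority N" out of an argument string and returns N
// together with the remaining arguments. If the flag is absent, def is used.
func splitPriority(input string, def int) (int, []string, error) {
	fields := strings.Fields(input)
	priority := def
	var rest []string
	for i := 0; i < len(fields); i++ {
		if fields[i] == "--priority" {
			if i+1 >= len(fields) {
				return 0, nil, errors.New("--priority needs a value")
			}
			n, err := strconv.Atoi(fields[i+1])
			if err != nil {
				return 0, nil, fmt.Errorf("invalid priority %q: %w", fields[i+1], err)
			}
			priority = n
			i++ // skip the value that was just consumed
			continue
		}
		rest = append(rest, fields[i])
	}
	return priority, rest, nil
}

func main() {
	prio, args, err := splitPriority("--epochs 100 --lr 0.001 --priority 5", 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("priority:", prio)
	fmt.Println("job args:", strings.Join(args, " "))
}
```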
### Monitoring
- Use `G` for quick GPU refresh (faster than `r`)
- Check queue with `v` before queuing new jobs
- Use `l` to debug failed jobs
### Settings
- Update API key without restarting
- Changes are held in memory only and are lost when the TUI exits
- Restart the TUI to discard them
## See Also
- [Main Documentation](../docs/documentation.md)
- [Worker Documentation](../cmd/worker/README.md)
- [Configuration Guide](../docs/documentation.md#configuration)
- [Bubble Tea Documentation](https://github.com/charmbracelet/bubbletea)