Jeremie Fraeys ea15af1833 Fix multi-user authentication and clean up debug code

2025-12-06 12:35:32 -05:00

Performance Monitoring Quick Start

Get started with performance monitoring and profiling in 5 minutes.

Quick Start Options

Option 1: Basic Benchmarking

# Run benchmarks
make benchmark

# View results in Grafana
open http://localhost:3001

Option 2: CPU Profiling

# Generate CPU profile
make profile-load-norate

# View interactive profile
go tool pprof -http=:8080 cpu_load.out

Option 3: Full Monitoring Stack

# Start monitoring services
make monitoring-performance

# Run benchmarks with metrics collection
make benchmark

# View in Grafana dashboard
open http://localhost:3001

Prerequisites

Docker and Docker Compose
Go 1.21 or later
Redis (for load tests)
GitHub repository (for CI/CD integration)

1. Setup & Installation

Start Monitoring Stack (Optional)

For full metrics visualization:

make monitoring-performance

This starts:

Grafana: http://localhost:3001 (admin/admin)
Pushgateway: http://localhost:9091
Loki: http://localhost:3100

Start Redis (Required for Load Tests)

docker run -d -p 6379:6379 redis:alpine

2. Performance Testing

Benchmarks

# Run benchmarks locally
make benchmark

# Or run with detailed output
go test -bench=. -benchmem ./tests/benchmarks/...

Load Testing

# Run load test suite
make load-test

3. CPU Profiling

HTTP Load Test Profiling

# CPU profile MediumLoad HTTP test (with rate limiting)
make profile-load

# CPU profile MediumLoad HTTP test (no rate limiting - recommended)
make profile-load-norate

Analyze Results:

# View interactive profile (web UI)
go tool pprof -http=:8081 cpu_load.out

# View interactive profile (terminal)
go tool pprof cpu_load.out

# Generate flame graph (needs go-flamegraph.pl on your PATH; the pprof web UI also has a built-in Flame Graph view)
go tool pprof -raw cpu_load.out | go-flamegraph.pl > cpu_flame.svg

# View top functions
go tool pprof -top cpu_load.out

Web UI: http://localhost:8081

WebSocket Queue Profiling

# CPU profile WebSocket → Redis queue → worker path
make profile-ws-queue

Analyze Results:

# View interactive profile (web UI)
go tool pprof -http=:8082 cpu_ws.out

# View interactive profile (terminal)
go tool pprof cpu_ws.out

Profiling Tips

Use profile-load-norate for cleaner CPU profiles (no rate limiting delays)
Profiles run for 60 seconds by default
Requires Redis running on localhost:6379
Results show throughput, latency, and error rate metrics

4. Results & Visualization

Grafana Dashboard

Open: http://localhost:3001 (admin/admin)

Navigate to the Performance Dashboard to see:

Real-time benchmark results
Historical trends
Performance comparisons

Key Metrics

benchmark_time_per_op - Execution time
benchmark_memory_per_op - Memory usage
benchmark_allocs_per_op - Allocation count

5. CI/CD Integration

Setup GitHub Integration

Add GitHub secret:

PROMETHEUS_PUSHGATEWAY_URL=http://your-pushgateway:9091

Now benchmarks run automatically on:

Every push to main/develop
Pull requests
Daily schedule

Verify Integration

Push code to trigger workflow
Check Pushgateway: http://localhost:9091/metrics
View metrics in Grafana
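To check the Pushgateway connection independently of the workflow, you can push a throwaway sample by hand. The Pushgateway accepts metrics in the Prometheus text format via `POST /metrics/job/<job>`. The sketch below assumes the same `PROMETHEUS_PUSHGATEWAY_URL` variable is set locally. The job name `benchmark_smoke_test`, the metric name `benchmark_smoke_value`, and the `newPushRequest` helper are hypothetical and not what the CI workflow necessarily uses.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
	"strings"
)

// newPushRequest builds a request that pushes a single sample to the
// Pushgateway under the given job name.
func newPushRequest(baseURL, job, metric string, value float64) (*http.Request, error) {
	endpoint := strings.TrimRight(baseURL, "/") + "/metrics/job/" + url.PathEscape(job)
	// Text exposition format: "<metric_name> <value>" terminated by a newline.
	body := fmt.Sprintf("%s %g\n", metric, value)
	return http.NewRequest(http.MethodPost, endpoint, strings.NewReader(body))
}

func main() {
	base := os.Getenv("PROMETHEUS_PUSHGATEWAY_URL")
	if base == "" {
		base = "http://localhost:9091" // local default from the setup section
	}
	req, err := newPushRequest(base, "benchmark_smoke_test", "benchmark_smoke_value", 1)
	if err != nil {
		fmt.Fprintln(os.Stderr, "building request:", err)
		os.Exit(1)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, "push failed:", err)
		os.Exit(1)
	}
	resp.Body.Close()
	fmt.Println("Pushgateway responded:", resp.Status)
}
```

If the push succeeds, the sample should appear in the output of `curl http://localhost:9091/metrics`.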

6. Troubleshooting

Monitoring Stack Issues

No metrics in Grafana?

# Check services
docker ps --filter "name=monitoring"

# Check Pushgateway
curl http://localhost:9091/metrics

Workflow failing?

Verify GitHub secret configuration
Check workflow logs in GitHub Actions

Profiling Issues

Flag error such as "flag provided but not defined: -test.paniconexit0"?

# The make targets should no longer trigger this; if it persists, run the test directly:
go test ./tests/load -run TestLoadProfile_Medium -count=1 -cpuprofile cpu_load.out -v -args -profile-norate

Redis not available?

# Start Redis for profiling tests
docker run -d -p 6379:6379 redis:alpine

# Check profile file generated
ls -la cpu_load.out

Port conflicts?

# Check if ports are in use
lsof -i :3001  # Grafana
lsof -i :8080  # pprof web UI
lsof -i :6379  # Redis

7. Advanced Usage

Performance Regression Detection

# Create baseline
make detect-regressions

# Capture current results (plain-text go test output, despite the .json extension)
go test -bench=. -benchmem ./tests/benchmarks/... | tee current.json
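How `make detect-regressions` decides what counts as a regression is not described in this guide. As an illustration of the general idea only, the sketch below flags a benchmark when its time per operation grows by more than a chosen percentage over the baseline. The 10% threshold, the `isRegression` function, and the sample numbers are all assumptions.

```go
package main

import "fmt"

// isRegression reports whether current is slower than baseline by more than
// thresholdPct percent. Both values are times per operation in nanoseconds.
func isRegression(baselineNs, currentNs, thresholdPct float64) bool {
	if baselineNs <= 0 {
		return false // no usable baseline to compare against
	}
	changePct := (currentNs - baselineNs) / baselineNs * 100
	return changePct > thresholdPct
}

func main() {
	// Invented numbers, standing in for parsed baseline and current results.
	baseline := map[string]float64{"BenchmarkEncodeJob": 1200, "BenchmarkQueueJob": 5000}
	current := map[string]float64{"BenchmarkEncodeJob": 1250, "BenchmarkQueueJob": 6100}

	for name, base := range baseline {
		cur, ok := current[name]
		if !ok {
			continue // benchmark missing from the current run
		}
		if isRegression(base, cur, 10) {
			fmt.Printf("%s regressed: %.0f ns/op -> %.0f ns/op\n", name, base, cur)
		}
	}
}
```

A percentage threshold keeps small run-to-run noise from being reported, at the cost of missing slow drifts that stay under the limit on each run.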

Custom Benchmarks

# Run specific benchmark
go test -bench=BenchmarkName -benchmem ./tests/benchmarks/...

# Run with race detection (checks for data races; timings will be slower than a normal run)
go test -race -bench=. ./tests/benchmarks/...

8. Further Reading

Ready in 5 minutes!
