docs: update documentation for streamlined Makefile

- Replace 'make test-full' with 'make test' throughout docs
- Replace 'make self-cleanup' with 'make clean'
- Replace 'make tech-excellence' with 'make complete-suite'
- Replace 'make deploy-up' with 'make dev-up'
- Update docker-compose commands to docker compose v2
- Update CI workflow to use new Makefile targets
Jeremie Fraeys 2026-03-04 13:22:29 -05:00
parent e08feae6ab
commit 7948639b1e
9 changed files with 283 additions and 29 deletions


@@ -111,7 +111,7 @@ jobs:
run: go test -v -race -coverprofile=queue-coverage.out ./internal/queue/...
- name: Run comprehensive tests
-run: make test-full
+run: make test
- name: Run linters
run: make lint


@@ -79,7 +79,7 @@ make test
make lint
# Full test suite
-make test-full
+make test
# Optional: Pre-commit checks
pre-commit run --all-files
@@ -197,7 +197,7 @@ make db-reset
make db-migrate
# Access database
-docker-compose exec redis redis-cli
+docker compose exec redis redis-cli
```
### Schema Changes
@@ -262,7 +262,7 @@ Configure these tools according to your workflow preferences.
1. **Go modules not found**: Run `go mod download`
2. **Zig build fails**: Check Zig version with `zig version`
3. **Docker containers not starting**: Check Docker daemon
-4. **Redis connection failed**: Start Redis with `docker-compose up redis`
+4. **Redis connection failed**: Start Redis with `docker compose up redis`
### Getting Help


@@ -38,6 +38,12 @@ High-performance command-line interface for experiment management, written in Zi
| `info` | Show run info from `run_manifest.json` | `ml info <run_dir>` |
| `requeue` | Re-submit an existing run/commit with new args/resources | `ml requeue <commit_id|run_id|task_id|path> -- --epochs 20` |
| `logs` | Fetch and follow job logs | `ml logs job123 -n 100` |
| `compare` | Diff two runs showing narrative/metadata/metrics | `ml compare run1 run2` |
| `find` | Search experiments by tags/outcome/dataset | `ml find --outcome validated --tag bert` |
| `export` | Export run bundles with optional anonymization | `ml export run123 --anonymize` |
| `outcome` | Set post-run outcome tracking | `ml outcome set run123 --outcome validated` |
| `narrative` | Set run narrative fields | `ml narrative set run123 --hypothesis "LR matters"` |
| `dataset` | Dataset operations (verify, list, etc.) | `ml dataset verify ./data` |
### Command Details
@@ -91,8 +97,11 @@ ml queue my-job --commit abc123 -- --epochs 5 --lr 1e-3
- `--commit` may be a full 40-hex commit id or a unique prefix (>=7 hex chars) resolvable under `worker_base`.
#### `requeue` - Re-submit a Previous Run
Re-submit an existing run with optional config changes, resource overrides, and provenance tracking.
```bash
-# Requeue directly by commit_id
+# Basic requeue
ml requeue <commit_id> -- --epochs 20
# Requeue by commit_id prefix (>=7 hex chars; must be unique)
@@ -106,6 +115,22 @@ ml requeue /data/ml-experiments/finished/<run_id> -- --epochs 20
# Override priority/resources on requeue
ml requeue <task_id> --priority 10 --gpu 1 -- --epochs 20
# Iterative experimentation with config changes
ml requeue run-id-123 --with-changes --lr=0.002 --batch-size=64
# Inherit narrative from parent run
ml requeue run-id-123 --inherit-narrative -- --epochs 50
# Inherit config/metadata from parent
ml requeue run-id-123 --inherit-config --with-changes --lr=0.001
# Link as child run for provenance tracking
ml requeue run-id-123 --parent -- --epochs 100
# Full provenance chain example
ml requeue run-id-123 --parent --inherit-narrative --inherit-config \
--with-changes --lr=0.0001 -- --epochs 200
```
**What it does:**
@@ -113,7 +138,16 @@ ml requeue <task_id> --priority 10 --gpu 1 -- --epochs 20
- Extracts `commit_id`
- Submits a new queue request using that `commit_id` with optional overridden args/resources
**Flags:**
- `--with-changes`: Override specific config values (key=value pairs)
- `--inherit-narrative`: Copy hypothesis/context/intent from parent run
- `--inherit-config`: Copy metadata/tags from parent run
- `--parent`: Link as child run for provenance tracking
**Notes:**
- `--priority` is passed to the server as a single byte (0-255).
- Args are sent via a dedicated queue opcode and become `task.Args` on the worker.
- `--commit` may be a full 40-hex commit id or a unique prefix (>=7 hex chars) resolvable under `worker_base`.
- Tasks support optional `snapshot_id` and `dataset_specs` fields server-side (for provenance and dataset resolution).
#### `watch` - Auto-Sync Monitoring
@@ -211,6 +245,169 @@ ml logs job123 -n 20 | grep -i error
---
#### `compare` - Diff Two Runs
Compare two experiment runs showing differences in narrative, metadata, and metrics.
```bash
# Compare two runs
ml compare run-id-1 run-id-2
# Compare with full JSON output
ml compare run1 run2 --json
# Export comparison as CSV
ml compare run1 run2 --csv
```
**Output includes:**
- Narrative differences (hypothesis, context, intent)
- Metadata changes (config, resources, args)
- Metric deltas (before/after comparison)
- Provenance linkage
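The metric-delta portion of the output can be pictured as a per-key numeric comparison between two runs' metric maps. A minimal sketch in Go (not the actual `ml compare` implementation; the in-memory metric layout is an assumption):

```go
package main

import "fmt"

// diffMetrics returns the numeric delta (b - a) for every metric key
// present in both runs; keys missing from either side are skipped.
// Hypothetical helper: the real CLI reads metrics from run manifests.
func diffMetrics(a, b map[string]float64) map[string]float64 {
	deltas := make(map[string]float64)
	for k, av := range a {
		if bv, ok := b[k]; ok {
			deltas[k] = bv - av
		}
	}
	return deltas
}

func main() {
	run1 := map[string]float64{"loss": 0.42, "accuracy": 0.88}
	run2 := map[string]float64{"loss": 0.31, "accuracy": 0.91}
	d := diffMetrics(run1, run2)
	fmt.Printf("loss: %+.2f, accuracy: %+.2f\n", d["loss"], d["accuracy"])
	// → loss: -0.11, accuracy: +0.03
}
```

Keys present in only one run would need separate handling, for example being reported as added or removed.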
---
#### `find` - Search Experiments
Search and filter experiments by various criteria.
```bash
# Find by outcome
ml find --outcome validated
ml find --outcome refuted
# Find by tags
ml find --tag bert --tag training
# Find by dataset
ml find --dataset mnist
# Find by experiment group
ml find --experiment-group hyperparam-search
# Find by author
ml find --author alice
# Combine filters
ml find --outcome validated --tag production --dataset cifar10
# Export results as CSV
ml find --outcome validated --csv
```
**Filter Options:**
- `--outcome`: validated, refuted, inconclusive, partial
- `--tag`: Match any tag (repeatable)
- `--dataset`: Dataset name used
- `--experiment-group`: Experiment group name
- `--author`: Username who created the run
---
#### `export` - Export Run Bundles
Export experiment runs with optional anonymization for sharing.
```bash
# Export a single run
ml export run-id-123
# Export with metadata-only anonymization
ml export run-id-123 --anonymize --anonymize-level metadata-only
# Export with full anonymization (paths, IPs, usernames)
ml export run-id-123 --anonymize --anonymize-level full
# Export multiple runs
ml export run1 run2 run3 --anonymize
```
**Anonymization Levels:**
- `metadata-only`: Redact narrative content but keep structure
- `full`: Redact paths, IPs, usernames, and sensitive metadata
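One way to picture the `full` level is a regex-based redaction pass over exported text. The patterns and placeholder tokens below are illustrative assumptions, not FetchML's actual rules:

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative redaction patterns; the real export command's rules
// are defined by the server and may differ.
var (
	ipv4Re = regexp.MustCompile(`\b(?:\d{1,3}\.){3}\d{1,3}\b`)
	pathRe = regexp.MustCompile(`(?:/[\w.-]+){2,}`)
	userRe = regexp.MustCompile(`\buser=\w+\b`)
)

// redactFull applies a "full"-style anonymization pass: IPs, multi-segment
// paths, and user= fields are replaced with stable placeholder tokens.
func redactFull(s string) string {
	s = ipv4Re.ReplaceAllString(s, "<IP>")
	s = pathRe.ReplaceAllString(s, "<PATH>")
	s = userRe.ReplaceAllString(s, "user=<REDACTED>")
	return s
}

func main() {
	line := "user=alice ran /data/ml-experiments/run123 from 10.0.0.7"
	fmt.Println(redactFull(line))
	// → user=<REDACTED> ran <PATH> from <IP>
}
```

Using fixed placeholders (rather than random ones) keeps redacted bundles diffable across exports.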
---
#### `outcome` - Post-Run Outcome Tracking
Set and track experiment outcomes after runs complete.
```bash
# Set validated outcome (experiment confirmed hypothesis)
ml outcome set run-id-123 --outcome validated
# Set refuted outcome (experiment contradicted hypothesis)
ml outcome set run-id-456 --outcome refuted
# Set inconclusive outcome
ml outcome set run-id-789 --outcome inconclusive
# Set partial outcome (partially validated)
ml outcome set run-id-abc --outcome partial
```
**Outcomes:**
- `validated`: Results confirm the hypothesis
- `refuted`: Results contradict the hypothesis
- `inconclusive`: Results neither confirm nor refute
- `partial`: Partial validation with caveats
---
#### `narrative` - Set Run Narrative
Set narrative fields for runs to track scientific context.
```bash
# Set hypothesis
ml narrative set run-id-123 --hypothesis "Learning rate of 0.001 improves convergence"
# Set research context
ml narrative set run-id-123 --context "Testing different optimizers for transformer training"
# Set research intent
ml narrative set run-id-123 --intent "Determine optimal LR for AdamW"
# Set expected outcome
ml narrative set run-id-123 --expected-outcome "Faster convergence with lower final loss"
# Set experiment group
ml narrative set run-id-123 --experiment-group lr-sweep-2024
# Add tags
ml narrative set run-id-123 --tags "bert,training,optimizer"
```
**Note:** These fields can also be set at queue time using flags like `--hypothesis`, `--context`, etc.
---
#### `dataset` - Dataset Operations
Manage and verify datasets.
```bash
# Verify dataset integrity (checksum validation)
ml dataset verify /path/to/dataset
# Verify and export metrics as CSV
ml dataset verify /path/to/dataset --csv
# List registered datasets
ml dataset list
# Get dataset info
ml dataset info dataset-name
```
**Features:**
- Fast SHA256 checksum validation
- Missing/corrupted file detection
- CSV export for analysis
---
#### `jupyter` - Jupyter Notebook Management
Manage Jupyter notebook services via WebSocket protocol.
@@ -438,16 +635,16 @@ If using Docker Compose:
```bash
# Start services
-docker-compose up -d (testing only)
+docker compose up -d (testing only)
# View logs
-docker-compose logs -f
+docker compose logs -f
# Stop services
-docker-compose down
+docker compose down
# Check status
-docker-compose ps
+docker compose ps
```
## Troubleshooting


@@ -190,7 +190,7 @@ vim ~/.ml/config.toml
make test-auth
# Clean up
-make self-cleanup
+make clean
```
### Full Test Suite


@@ -343,7 +343,7 @@ jq 'select(.event_type == "authentication_failure")' /var/log/fetch_ml/audit.log
8. **Secure Grafana**
```bash
# Change default admin password
-docker-compose exec grafana grafana-cli admin reset-admin-password new-strong-password
+docker compose exec grafana grafana-cli admin reset-admin-password new-strong-password
```
9. **Regular Updates**
@@ -352,8 +352,8 @@ jq 'select(.event_type == "authentication_failure")' /var/log/fetch_ml/audit.log
sudo apt update && sudo apt upgrade -y
# Update containers
-docker-compose pull
-docker-compose up -d (testing only)
+docker compose pull
+docker compose up -d (testing only)
```
## Password Management


@@ -8,7 +8,7 @@ Comprehensive testing documentation for FetchML platform with integrated monitor
```bash
# Clean environment
-make self-cleanup
+make clean
# Start development stack with monitoring
make dev-up
@@ -112,7 +112,7 @@ make prod-up
# Test production deployment
make test-auth # Multi-user auth test
-make self-cleanup # Clean up after testing
+make clean # Clean up after testing
# Verify production monitoring
curl -f https://your-domain.com/health
@@ -267,7 +267,7 @@ curl -I http://localhost:9090/api/v1/query?query=up
```bash
# Test cleanup script
-make self-cleanup
+make clean
make dev-down
# Expected: Containers stopped and removed
```
@@ -306,27 +306,27 @@ echo "TEST_END: $(date)" | tee -a /logs/test.log
## Test Execution Protocol
-### Phase 1: Environment Setup
+### 1: Environment Setup
1. Clean up any existing containers
2. Start fresh Docker environment with monitoring
3. Verify all services are running
-### Phase 2: Authentication Testing
+### 2: Authentication Testing
1. Test all user roles (admin, researcher, analyst)
2. Test invalid authentication scenarios
3. Verify role-based permissions
-### Phase 3: Functional Testing
+### 3: Functional Testing
1. Test CLI commands (queue, status)
2. Test API endpoints
3. Test WebSocket communication
-### Phase 4: Integration Testing
+### 4: Integration Testing
1. Test complete workflows
2. Test error scenarios
3. Test cleanup procedures
-### Phase 5: Performance Testing
+### 5: Performance Testing
1. Run benchmarks
2. Perform load testing
3. Validate performance metrics
@@ -456,6 +456,63 @@ log.WithFields(log.Fields{
- CLI tests: `cli/tests/` directory
- Follow Zig testing conventions
## Security Test Coverage
FetchML implements comprehensive security testing with **100% coverage** across all security-critical components (49/49 tests completed as of 2026-02).
### Security Test Categories
| Category | Tests | Description |
|----------|-------|-------------|
| Prerequisites | 11 | Config integrity, HIPAA validation, manifest nonce, GPU audit logging, resource quotas |
| Reproducibility | 14 | Environment capture, config hash computation, GPU detection recording, scan exclusions |
| Property-Based | 4 | Config hash properties, detection source validation using gopter |
| Lint Rules | 4 analyzers | no-bare-create-detector, manifest-environment-required, no-inline-credentials, hipaa-completeness |
| Audit Log | 3 | Chain verification, tamper detection, background verification job |
| Fault Injection | 6 stubs | NVML failures, manifest write failures, Redis unavailability, audit log failures, disk full scenarios |
| Integration | 4 | Cross-tenant isolation, run manifest reproducibility, PHI redaction |
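The audit-log rows above (chain verification, tamper detection) rest on a hash chain: each entry stores the hash of its predecessor, so editing any entry breaks every later link. A simplified sketch with an assumed entry layout, not FetchML's actual audit record format:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Entry is a hypothetical audit record: each entry carries the hex
// SHA256 of the previous entry, forming a tamper-evident chain.
type Entry struct {
	PrevHash string
	Payload  string
}

func hashEntry(e Entry) string {
	sum := sha256.Sum256([]byte(e.PrevHash + "|" + e.Payload))
	return hex.EncodeToString(sum[:])
}

// verifyChain recomputes every link; editing any payload invalidates
// all subsequent PrevHash values.
func verifyChain(entries []Entry) bool {
	prev := ""
	for _, e := range entries {
		if e.PrevHash != prev {
			return false
		}
		prev = hashEntry(e)
	}
	return true
}

func main() {
	log := []Entry{{PrevHash: "", Payload: "login alice"}}
	log = append(log, Entry{PrevHash: hashEntry(log[0]), Payload: "queue job"})
	fmt.Println(verifyChain(log)) // true

	log[0].Payload = "login mallory" // tamper with the first entry
	fmt.Println(verifyChain(log))    // false
}
```

A background verification job, as listed in the table, would simply rerun this walk periodically and alert on the first broken link.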
### Running Security Tests
```bash
# Run all security tests
make test
# Run security-specific unit tests
go test -v ./tests/unit/security/...
# Run audit verification tests
go test -v ./tests/integration/audit/...
# Run fault injection tests
FETCH_ML_FAULT_INJECTION=1 go test -v ./tests/fault/...
# Run property-based tests
go test -v ./tests/property/...
```
### Security Test Files
- `tests/unit/security/path_traversal_test.go` - Path traversal prevention
- `tests/unit/security/filetype_test.go` - Magic bytes validation
- `tests/unit/security/secrets_test.go` - Environment expansion & secret detection
- `tests/unit/security/audit_test.go` - Audit chain integrity
- `tests/unit/security/config_integrity_test.go` - Config validation
- `tests/integration/security/cross_tenant_test.go` - Tenant isolation
- `tests/integration/audit/verification_test.go` - Audit verification
- `tests/property/*_test.go` - Property-based testing
- `tests/fault/fault_test.go` - Fault injection scenarios
### Test Coverage Validation
```bash
# Generate coverage report
make test-coverage
# View detailed coverage
go tool cover -html=coverage.out
```
## Test Result Analysis
### Grafana Dashboard Analysis
@@ -497,7 +554,7 @@ log.WithFields(log.Fields{
```bash
# Clean up test data
-make self-cleanup
+make clean
# Clean up Docker resources
make clean-all
@@ -543,7 +600,7 @@ Tasks: X total, X queued, X running, X failed, X completed
```bash
make help # Show all commands
make test-auth # Quick auth test
-make self-cleanup # Clean environment
+make clean # Clean environment
make test-status # Check system status
make dev-up # Start dev environment
make dev-down # Stop dev environment


@@ -43,7 +43,7 @@ Grafana provisioning is how Grafana auto-configures itself on startup (no clicki
```bash
# Start deployment
-make deploy-up
+make dev-up
# Access services
open http://localhost:3000 # Grafana (admin/admin123)
@@ -88,7 +88,7 @@ Available dashboard configurations in `grafana/dashboards/`:
## Usage
-1. Start monitoring stack: `make deploy-up`
+1. Start monitoring stack: `make dev-up`
2. Access Grafana: http://localhost:3000 (admin/admin123)
3. Import dashboards from `grafana/dashboards/` directory
4. View metrics and test results in real-time


@@ -55,7 +55,7 @@ make test-coverage # All tests with coverage report
make benchmark # Run performance benchmarks
make load-test # Run load testing suite
make chaos-test # Run chaos engineering tests
-make tech-excellence # Run complete technical excellence suite
+make complete-suite # Run complete technical excellence suite
```
### Individual Test Execution


@@ -132,9 +132,9 @@ test_all() {
echo "===================="
echo ""
-if make_target_exists test-full; then
-print_info "Running full test suite..."
-make test-full
+if make_target_exists test; then
+print_info "Running test suite..."
+make test
else
print_info "Running test suite..."
make test