From f35762468559e8c38f4da00cf2bc71908e47b578 Mon Sep 17 00:00:00 2001 From: Jeremie Fraeys Date: Wed, 18 Feb 2026 21:28:25 -0500 Subject: [PATCH] docs: Update CHANGELOG and add feature documentation Update documentation for new features: - Add CHANGELOG entries for research features and privacy enhancements - Update README with new CLI commands and security features - Add privacy-security.md documentation for PII detection - Add research-features.md for narrative and outcome tracking --- CHANGELOG.md | 31 ++++ README.md | 11 ++ docs/src/privacy-security.md | 320 ++++++++++++++++++++++++++++++++++ docs/src/research-features.md | 320 ++++++++++++++++++++++++++++++++++ 4 files changed, 682 insertions(+) create mode 100644 docs/src/privacy-security.md create mode 100644 docs/src/research-features.md diff --git a/CHANGELOG.md b/CHANGELOG.md index 7da9c4e..34c6e0f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,36 @@ ## [Unreleased] +### Added - CSV Export Features (2026-02-18) +- CLI: `ml compare --csv` - Export run comparisons as CSV with actual run IDs as column headers +- CLI: `ml find --csv` - Export search results as CSV for spreadsheet analysis +- CLI: `ml dataset verify --csv` - Export dataset verification metrics as CSV +- Shell: Updated bash/zsh completions with --csv flags for compare, find commands + +### Added - Phase 3 Features (2026-02-18) +- CLI: `ml requeue --with-changes` - Iterative experimentation with config overrides (--lr=0.002, etc.) 
+- CLI: `ml requeue --inherit-narrative` - Copy hypothesis/context from parent run +- CLI: `ml requeue --inherit-config` - Copy metadata from parent run +- CLI: `ml requeue --parent` - Link as child run for provenance tracking +- CLI: `ml dataset verify` - Fast dataset checksum validation +- CLI: `ml logs --follow` - Real-time log streaming via WebSocket +- API/WebSocket: Add opcodes for compare (0x30), find (0x31), export (0x32), set outcome (0x33) + +### Added - Phase 2 Features (2026-02-18) +- CLI: `ml compare` - Diff two runs showing narrative/metadata/metrics differences +- CLI: `ml find` - Search experiments by tags, outcome, dataset, experiment-group, author +- CLI: `ml export --anonymize` - Export bundles with path/IP/username redaction +- CLI: `ml export --anonymize-level` - 'metadata-only' or 'full' anonymization +- CLI: `ml outcome set` - Post-run outcome tracking (validates/refutes/inconclusive/partial) +- CLI: Error suggestions with Levenshtein distance for typos +- Shell: Updated bash/zsh completions for all new commands +- Tests: E2E tests for compare, find, export, requeue changes + +### Added - Phase 0 Features (2026-02-18) +- CLI: Queue-time narrative flags (--hypothesis, --context, --intent, --expected-outcome, --experiment-group, --tags) +- CLI: Enhanced `ml status` output with queue position [pos N] and priority (P:N) +- CLI: `ml narrative set` command for setting run narrative fields +- Shell: Updated completions with new commands and flags + ### Security - Native: fix buffer overflow vulnerabilities in `dataset_hash` (replaced `strcpy` with `strncpy` + null termination) - Native: fix unsafe `memcpy` in `queue_index` priority queue (added explicit null terminators for string fields) diff --git a/README.md b/README.md index 6e2f3e7..4d490f4 100644 --- a/README.md +++ b/README.md @@ -100,6 +100,15 @@ ml queue my-job ml cancel my-job ml dataset list ml monitor # SSH to run TUI remotely + +# Research features (see docs/src/research-features.md) 
+ml queue train.py --hypothesis "LR scaling..." --tags ablation +ml outcome set run_abc --outcome validates --summary "Accuracy +2%" +ml find --outcome validates --tag lr-test +ml compare run_abc run_def +ml privacy set run_abc --level team +ml export run_abc --anonymize +ml dataset verify /path/to/data ``` ## Phase 1 (V1) notes @@ -150,6 +159,8 @@ See `docs/` for detailed guides: - `docs/src/zig-cli.md` – CLI reference - `docs/src/quick-start.md` – Full setup guide - `docs/src/deployment.md` – Production deployment +- `docs/src/research-features.md` – Research workflow features (narrative capture, outcomes, search) +- `docs/src/privacy-security.md` – Privacy levels, PII detection, anonymized export ## Source code diff --git a/docs/src/privacy-security.md b/docs/src/privacy-security.md new file mode 100644 index 0000000..d08a0d9 --- /dev/null +++ b/docs/src/privacy-security.md @@ -0,0 +1,320 @@ +# Privacy & Security + +FetchML includes privacy-conscious features for research environments handling sensitive data. + +--- + +## Privacy Levels + +Control experiment visibility with four privacy levels. 
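Conceptually, each level reduces to a visibility check against the viewer's identity and team membership. The sketch below is illustrative only: the function, its signature, and the decision logic are assumptions based on this page's manifest example, not FetchML's actual code.

```python
# Hypothetical visibility check for the four privacy levels.
# Manifest field names follow the "Privacy in Manifest" example on
# this page; the logic itself is an assumption, not FetchML's code.
def is_visible(manifest: dict, viewer: str, viewer_teams: set) -> bool:
    privacy = manifest.get("privacy", {})
    level = privacy.get("level", "private")  # private is the default
    if level in ("public", "anonymized"):
        return True  # visible to all authenticated users
    if level == "team":
        # team members plus the owner can see team-level runs
        return privacy.get("team") in viewer_teams or privacy.get("owner") == viewer
    return privacy.get("owner") == viewer  # private: owner only
```

Note that `anonymized` widens *who* can see a run but not *what* they see; PII stripping is handled separately at export time.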
+ +### Available Levels + +| Level | Visibility | Use Case | +|-------|-----------|----------| +| `private` | Owner only (default) | Sensitive/unpublished research | +| `team` | Same team members | Collaborative team projects | +| `public` | All authenticated users | Open research, shared datasets | +| `anonymized` | All users with PII stripped | Public release, papers | + +### Setting Privacy + +```bash +# Make experiment private (default) +ml privacy set run_abc --level private + +# Share with team +ml privacy set run_abc --level team --team vision-research + +# Make public within organization +ml privacy set run_abc --level public + +# Prepare for anonymized export +ml privacy set run_abc --level anonymized +``` + +### Privacy in Manifest + +Privacy settings are stored in the experiment manifest: + +```json +{ + "privacy": { + "level": "team", + "team": "vision-research", + "owner": "researcher@lab.edu" + } +} +``` + +--- + +## PII Detection + +Automatically detect potentially identifying information in experiment metadata. + +### What Gets Detected + +- **Email addresses** - `user@example.com` +- **IP addresses** - `192.168.1.1`, `10.0.0.5` +- **Phone numbers** - Basic pattern matching +- **SSN patterns** - `123-45-6789` + +### Using Privacy Scan + +When adding annotations with sensitive context: + +```bash +# Scan for PII before storing +ml annotate run_abc \ + --note "Contact at user@example.com for questions" \ + --privacy-scan + +# Output: +# Warning: Potential PII detected: +# - email: 'user@example.com' +# Use --force to store anyway, or edit your note. +``` + +### Override Warnings + +If PII is intentional and acceptable: + +```bash +ml annotate run_abc \ + --note "Contact at user@example.com" \ + --privacy-scan \ + --force +``` + +### Redacting PII + +For anonymized exports, PII is automatically redacted: + +```bash +ml export run_abc --anonymize +``` + +Redacted content becomes: `[EMAIL-1]`, `[IP-1]`, etc. 
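The scan-and-redact flow can be pictured as simple pattern matching with numbered placeholders. The patterns and placeholder format below follow the examples on this page, but the code itself is an illustrative sketch, not FetchML's implementation:

```python
import re

# Simplified versions of the PII classes listed above (email, IP
# address, SSN); illustrative only, not FetchML's actual rules.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str):
    """Replace each distinct PII match with a numbered placeholder
    like [EMAIL-1], and report what was found."""
    findings = []
    for kind, pattern in PII_PATTERNS.items():
        seen = {}  # distinct match -> placeholder, for consistency
        def sub(m, kind=kind, seen=seen):
            if m.group(0) not in seen:
                seen[m.group(0)] = f"[{kind}-{len(seen) + 1}]"
                findings.append(f"{kind.lower()}: '{m.group(0)}'")
            return seen[m.group(0)]
        text = pattern.sub(sub, text)
    return text, findings
```

A `--privacy-scan` without `--force` would report the findings and refuse to store; an anonymized export would keep the redacted text.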
+ +--- + +## Anonymized Export + +Export experiments for external sharing without leaking sensitive information. + +### Basic Anonymization + +```bash +ml export run_abc --bundle run_abc.tar.gz --anonymize +``` + +### Anonymization Levels + +**Metadata-only** (default): +- Strips internal paths: `/nas/private/data` → `/datasets/data` +- Replaces internal IPs: `10.0.0.5` → `[INTERNAL-1]` +- Hashes email addresses: `user@lab.edu` → `[RESEARCHER-A]` +- Keeps experiment structure and metrics + +**Full**: +- Everything in metadata-only, plus: +- Removes logs entirely +- Removes annotations +- Redacts all PII from notes + +```bash +# Full anonymization +ml export run_abc --anonymize --anonymize-level full +``` + +### What Gets Anonymized + +| Original | Anonymized | Notes | +|----------|------------|-------| +| `/home/user/data` | `/workspace/data` | Paths generalized | +| `/nas/private/lab` | `/datasets/lab` | Internal mounts hidden | +| `user@lab.edu` | `[RESEARCHER-A]` | Consistent per user | +| `10.0.0.5` | `[INTERNAL-1]` | IP ranges replaced | +| `john@example.com` | `[EMAIL-1]` | PII redacted | + +### Export Verification + +Review what's in the export: + +```bash +# Export and list contents +ml export run_abc --anonymize -o /tmp/run_abc.tar.gz +tar tzf /tmp/run_abc.tar.gz | head -20 +``` + +--- + +## Dataset Identity & Checksums + +Verify dataset integrity with SHA256 checksums. + +### Computing Checksums + +Datasets are automatically checksummed when registered: + +```bash +ml dataset register /path/to/dataset --name my-dataset +# Computes SHA256 of all files in dataset +``` + +### Verifying Datasets + +```bash +# Verify dataset integrity +ml dataset verify /path/to/my-dataset + +# Output: +# ✓ Dataset checksum verified +# Expected: sha256:abc123... +# Actual: sha256:abc123... 
+``` + +### Checksum in Manifest + +```json +{ + "datasets": [{ + "name": "imagenet-train", + "checksum": "sha256:def456...", + "sample_count": 1281167 + }] +} +``` + +--- + +## Security Best Practices + +### 1. Default to Private + +Keep experiments private until ready to share: + +```bash +# Private by default +ml queue train.py --hypothesis "..." + +# Later, when ready to share +ml privacy set run_abc --level team --team my-team +``` + +### 2. Scan Before Sharing + +Always use `--privacy-scan` when adding notes that might contain PII: + +```bash +ml annotate run_abc --note "..." --privacy-scan +``` + +### 3. Anonymize for External Release + +Before exporting for papers or public release: + +```bash +ml export run_abc --anonymize --anonymize-level full +``` + +### 4. Verify Dataset Integrity + +Regularly verify datasets, especially shared ones: + +```bash +ml dataset verify /path/to/shared/dataset +``` + +### 5. Use Team Privacy for Collaboration + +Share with specific teams rather than making public: + +```bash +ml privacy set run_abc --level team --team ml-group +``` + +--- + +## Compliance Considerations + +### GDPR / Research Ethics + +| Requirement | FetchML Support | Status | +|-------------|-----------------|--------| +| Right to access | `ml export` creates data bundles | ✅ | +| Right to erasure | Delete command (future) | ⏳ | +| Data minimization | Narrative fields collect only necessary data | ✅ | +| PII detection | `ml annotate --privacy-scan` | ✅ | +| Anonymization | `ml export --anonymize` | ✅ | + +### Handling Sensitive Data + +For experiments with sensitive data: + +1. **Keep private**: Use `--level private` +2. **PII scan all annotations**: Always use `--privacy-scan` +3. **Anonymize before export**: Use `--anonymize-level full` +4. 
**Verify team membership**: Before sharing at `--level team` + +--- + +## Configuration + +### Worker Privacy Settings + +Configure privacy defaults in worker config: + +```yaml +privacy: + default_level: private + enforce_teams: true + audit_access: true +``` + +### API Server Privacy + +Enable privacy enforcement: + +```yaml +security: + privacy: + enabled: true + default_level: private + audit_access: true +``` + +--- + +## Troubleshooting + +### PII Scan False Positives + +Some valid text may trigger PII warnings: + +```bash +# Example: "batch@32" looks like email +ml annotate run_abc --note "Use batch@32 for training" --privacy-scan +# Warning triggers, use --force if intended +``` + +### Privacy Changes Not Applied + +- Verify you own the experiment +- Check server supports privacy enforcement +- Try with explicit base path: `--base /path/to/experiments` + +### Export Not Anonymized + +- Ensure `--anonymize` flag is set +- Check `--anonymize-level` is correct (metadata-only vs full) +- Verify manifest contains privacy data + +--- + +## See Also + +- `docs/src/research-features.md` - Research workflow features +- `docs/src/deployment.md` - Production deployment with privacy +- `docs/src/quick-start.md` - Getting started guide diff --git a/docs/src/research-features.md b/docs/src/research-features.md new file mode 100644 index 0000000..7405b5e --- /dev/null +++ b/docs/src/research-features.md @@ -0,0 +1,320 @@ +# Research Features + +FetchML includes research-focused features for experiment tracking, knowledge capture, and collaboration. + +--- + +## Queue-Time Narrative Capture + +Document your hypothesis and intent when queuing experiments. This creates provenance for your research and helps others understand your work. + +### Basic Usage + +```bash +ml queue train.py \ + --hypothesis "Linear LR scaling improves convergence" \ + --context "Following up on Smith et al. 
2023" \ + --intent "Test batch=64 with 2x LR" \ + --expected-outcome "Same accuracy, 50% less time" \ + --experiment-group "lr-scaling-study" \ + --tags "ablation,learning-rate,batch-size" +``` + +### Available Flags + +| Flag | Description | Example | +|------|-------------|---------| +| `--hypothesis` | What you expect to happen | "LR scaling improves convergence" | +| `--context` | Background and motivation | "Following paper XYZ" | +| `--intent` | What you're testing | "Test batch=64 with 2x LR" | +| `--expected-outcome` | Predicted results | "Same accuracy, 50% less time" | +| `--experiment-group` | Group related experiments | "batch-scaling" | +| `--tags` | Comma-separated tags | "ablation,lr-test" | + +### Viewing Narrative + +After queuing, view the narrative with: + +```bash +ml info run_abc +``` + +--- + +## Post-Run Outcome Capture + +Record findings after experiments complete. This preserves institutional knowledge and helps track what worked. + +### Setting Outcomes + +```bash +ml outcome set run_abc \ + --outcome validates \ + --summary "Accuracy improved 2.3% with 2x learning rate" \ + --learning "Linear scaling works for batch sizes 32-128" \ + --learning "GPU utilization increased from 60% to 95%" \ + --next-step "Try batch=96 with gradient accumulation" \ + --validation-status "cross-validated" +``` + +### Outcome Types + +- **validates** - Hypothesis confirmed +- **refutes** - Hypothesis rejected +- **inconclusive** - Results unclear +- **partial** - Partial confirmation + +### Repeatable Fields + +Use multiple flags to capture multiple learnings or next steps: + +```bash +ml outcome set run_abc \ + --learning "Finding 1" \ + --learning "Finding 2" \ + --learning "Finding 3" \ + --next-step "Follow-up experiment A" \ + --next-step "Follow-up experiment B" +``` + +--- + +## Experiment Search & Discovery + +Find past experiments with powerful filters and export results. 
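Conceptually, `ml find` filters stored experiment manifests against the given criteria. The sketch below covers a subset of the flags; the manifest field names are assumptions for illustration, not FetchML's actual query code:

```python
def find_runs(manifests, tag=None, outcome=None, group=None):
    """Illustrative filter matching --tag/--outcome/--experiment-group.
    A run must satisfy every criterion that was supplied."""
    results = []
    for m in manifests:
        if tag is not None and tag not in m.get("tags", []):
            continue
        if outcome is not None and m.get("outcome") != outcome:
            continue
        if group is not None and m.get("experiment_group") != group:
            continue
        results.append(m)
    return results
```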
+ +### Basic Search + +```bash +# Search by tags +ml find --tag ablation + +# Search by outcome +ml find --outcome validates + +# Search by experiment group +ml find --experiment-group lr-scaling +``` + +### Combined Filters + +```bash +# Multiple criteria +ml find \ + --tag ablation \ + --outcome validates \ + --dataset imagenet \ + --after 2024-01-01 \ + --before 2024-03-01 + +# With limit +ml find --tag production --limit 50 +``` + +### Export Results + +```bash +# JSON output for programmatic use +ml find --outcome validates --json > results.json + +# CSV for analysis in spreadsheet tools +ml find --experiment-group lr-study --csv > study_results.csv +``` + +### CSV Output Format + +The CSV includes columns: +- `id` - Run identifier +- `job_name` - Job name +- `outcome` - validates/refutes/inconclusive/partial +- `status` - running/finished/failed +- `experiment_group` - Group name +- `tags` - Comma-separated tags +- `hypothesis` - Narrative hypothesis + +--- + +## Experiment Comparison + +Compare two or more experiments side-by-side. + +### Basic Comparison + +```bash +ml compare run_abc run_def +``` + +### Output Formats + +```bash +# Human-readable (default) +ml compare run_abc run_def + +# JSON for programmatic analysis +ml compare run_abc run_def --json + +# CSV for spreadsheet analysis +ml compare run_abc run_def --csv + +# Show all fields (including unchanged) +ml compare run_abc run_def --all +``` + +### What Gets Compared + +- **Narrative fields** - Hypothesis, context, intent differences +- **Metadata** - Batch size, learning rate, epochs, model, dataset +- **Metrics** - Accuracy, loss, training time with deltas +- **Outcomes** - validates vs refutes, etc. + +--- + +## Dataset Verification + +Verify dataset integrity with SHA256 checksums. 
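A dataset-level digest can be built by hashing every file in a deterministic order and folding the per-file paths and contents into a single SHA256. How FetchML actually combines files is not specified here, so treat this as a minimal sketch of the idea:

```python
import hashlib
from pathlib import Path

def dataset_checksum(root: str) -> str:
    """Fold every file under root (sorted for determinism) into one
    SHA256 digest; includes relative paths so renames are detected.
    Illustrative sketch, not FetchML's exact scheme."""
    h = hashlib.sha256()
    base = Path(root)
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        h.update(str(path.relative_to(base)).encode())
        h.update(path.read_bytes())
    return "sha256:" + h.hexdigest()
```

Because the digest covers both paths and bytes, verification flags added, removed, renamed, or modified files alike.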
+ +### Basic Verification + +```bash +ml dataset verify /path/to/dataset +``` + +### Registration with Checksums + +When registering datasets, checksums are automatically computed: + +```bash +ml dataset register /path/to/imagenet --name imagenet-train +``` + +### View Dataset Info + +```bash +ml dataset info imagenet-train +``` + +This shows: +- Dataset name and path +- SHA256 checksums +- Sample count +- Privacy level (if set) + +--- + +## Best Practices + +### 1. Always Document Hypothesis + +```bash +ml queue train.py \ + --hypothesis "Data augmentation reduces overfitting" \ + --experiment-group "regularization-study" +``` + +### 2. Capture Outcomes Promptly + +Record findings while they're fresh: + +```bash +ml outcome set run_abc \ + --outcome validates \ + --summary "Augmentation improved validation accuracy by 3%" \ + --learning "Rotation=15 worked best" +``` + +### 3. Use Consistent Tags + +Establish tag conventions for your team: +- `ablation` - Ablation studies +- `baseline` - Baseline experiments +- `production` - Production-ready models +- `exploratory` - Initial exploration + +### 4. Export for Analysis + +Regularly export experiment data for analysis: + +```bash +ml find --experiment-group my-study --csv > my_study.csv +``` + +### 5. 
Compare Systematically + +Use comparison to understand what changed: + +```bash +# Compare to baseline +ml compare baseline_run current_run + +# Compare successful runs +ml compare run_abc run_def run_ghi +``` + +--- + +## Integration with Research Workflow + +### Iterative Experiments + +```bash +# Run baseline +ml queue train.py --hypothesis "Baseline is strong" --tags baseline + +# Run variant A +ml queue train_v2.py \ + --hypothesis "V2 improves on baseline" \ + --tags ablation,v2 \ + --experiment-group v2-study + +# Compare results +ml compare baseline_run v2_run + +# Document outcome +ml outcome set v2_run --outcome validates --summary "V2 improved 2.3%" +``` + +### Ablation Studies + +```bash +# Full model +ml queue train.py --hypothesis "Full model works best" --tags ablation,full + +# Without component A +ml queue train_no_a.py --hypothesis "Component A is critical" --tags ablation,no-a + +# Without component B +ml queue train_no_b.py --hypothesis "Component B adds value" --tags ablation,no-b + +# Search all ablations +ml find --tag ablation --csv > ablations.csv +``` + +--- + +## Troubleshooting + +### Search Returns No Results + +- Check your filters aren't too restrictive +- Try broadening date ranges +- Verify tags are spelled correctly + +### Outcome Not Saved + +- Ensure run ID exists: `ml info run_abc` +- Check you have permission to modify the run +- Try with explicit base path: `ml outcome set run_abc ... --base /path/to/experiments` + +### Comparison Shows No Differences + +- Use `--all` flag to show unchanged fields +- Verify you're comparing different runs +- Check that runs have narrative data + +--- + +## See Also + +- `docs/src/privacy-security.md` - Privacy levels and PII detection +- `docs/src/quick-start.md` - Full setup guide +- `docs/src/zig-cli.md` - Complete CLI reference