docs: Update CHANGELOG and add feature documentation

Update documentation for new features:
- Add CHANGELOG entries for research features and privacy enhancements
- Update README with new CLI commands and security features
- Add privacy-security.md documentation for PII detection
- Add research-features.md for narrative and outcome tracking
Jeremie Fraeys 2026-02-18 21:28:25 -05:00
parent 27c8b08a16
commit f357624685
4 changed files with 682 additions and 0 deletions


@@ -1,5 +1,36 @@
## [Unreleased]
### Added - CSV Export Features (2026-02-18)
- CLI: `ml compare --csv` - Export run comparisons as CSV with actual run IDs as column headers
- CLI: `ml find --csv` - Export search results as CSV for spreadsheet analysis
- CLI: `ml dataset verify --csv` - Export dataset verification metrics as CSV
- Shell: Updated bash/zsh completions with --csv flags for compare, find commands
### Added - Phase 3 Features (2026-02-18)
- CLI: `ml requeue --with-changes` - Iterative experimentation with config overrides (--lr=0.002, etc.)
- CLI: `ml requeue --inherit-narrative` - Copy hypothesis/context from parent run
- CLI: `ml requeue --inherit-config` - Copy metadata from parent run
- CLI: `ml requeue --parent` - Link as child run for provenance tracking
- CLI: `ml dataset verify` - Fast dataset checksum validation
- CLI: `ml logs --follow` - Real-time log streaming via WebSocket
- API/WebSocket: Add opcodes for compare (0x30), find (0x31), export (0x32), set outcome (0x33)
### Added - Phase 2 Features (2026-02-18)
- CLI: `ml compare` - Diff two runs showing narrative/metadata/metrics differences
- CLI: `ml find` - Search experiments by tags, outcome, dataset, experiment-group, author
- CLI: `ml export --anonymize` - Export bundles with path/IP/username redaction
- CLI: `ml export --anonymize-level` - 'metadata-only' or 'full' anonymization
- CLI: `ml outcome set` - Post-run outcome tracking (validates/refutes/inconclusive/partial)
- CLI: Suggest corrections for mistyped commands and flags using Levenshtein distance
- Shell: Updated bash/zsh completions for all new commands
- Tests: E2E tests for compare, find, export, requeue changes
### Added - Phase 0 Features (2026-02-18)
- CLI: Queue-time narrative flags (--hypothesis, --context, --intent, --expected-outcome, --experiment-group, --tags)
- CLI: Enhanced `ml status` output with queue position [pos N] and priority (P:N)
- CLI: `ml narrative set` command for setting run narrative fields
- Shell: Updated completions with new commands and flags
### Security
- Native: fix buffer overflow vulnerabilities in `dataset_hash` (replaced `strcpy` with `strncpy` + null termination)
- Native: fix unsafe `memcpy` in `queue_index` priority queue (added explicit null terminators for string fields)


@@ -100,6 +100,15 @@ ml queue my-job
ml cancel my-job
ml dataset list
ml monitor # SSH to run TUI remotely
# Research features (see docs/src/research-features.md)
ml queue train.py --hypothesis "LR scaling..." --tags ablation
ml outcome set run_abc --outcome validates --summary "Accuracy +2%"
ml find --outcome validates --tag lr-test
ml compare run_abc run_def
ml privacy set run_abc --level team
ml export run_abc --anonymize
ml dataset verify /path/to/data
```
## Phase 1 (V1) notes
@@ -150,6 +159,8 @@ See `docs/` for detailed guides:
- `docs/src/zig-cli.md` CLI reference
- `docs/src/quick-start.md` Full setup guide
- `docs/src/deployment.md` Production deployment
- `docs/src/research-features.md` Research workflow features (narrative capture, outcomes, search)
- `docs/src/privacy-security.md` Privacy levels, PII detection, anonymized export
## Source code


@@ -0,0 +1,320 @@
# Privacy & Security
FetchML includes privacy-conscious features for research environments handling sensitive data.
---
## Privacy Levels
Control experiment visibility with four privacy levels.
### Available Levels
| Level | Visibility | Use Case |
|-------|-----------|----------|
| `private` | Owner only (default) | Sensitive/unpublished research |
| `team` | Same team members | Collaborative team projects |
| `public` | All authenticated users | Open research, shared datasets |
| `anonymized` | All users with PII stripped | Public release, papers |
### Setting Privacy
```bash
# Make experiment private (default)
ml privacy set run_abc --level private
# Share with team
ml privacy set run_abc --level team --team vision-research
# Make public within organization
ml privacy set run_abc --level public
# Prepare for anonymized export
ml privacy set run_abc --level anonymized
```
### Privacy in Manifest
Privacy settings are stored in the experiment manifest:
```json
{
"privacy": {
"level": "team",
"team": "vision-research",
"owner": "researcher@lab.edu"
}
}
```
---
## PII Detection
Automatically detect potentially identifying information in experiment metadata.
### What Gets Detected
- **Email addresses** - `user@example.com`
- **IP addresses** - `192.168.1.1`, `10.0.0.5`
- **Phone numbers** - Basic pattern matching
- **SSN patterns** - `123-45-6789`
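The categories above can be approximated with standard regular expressions. The following grep sketch is illustrative only; the CLI's actual scanner may use different or stricter rules:

```bash
# Illustrative PII patterns (approximation; the ml CLI may match differently)
note="Contact user@example.com at 10.0.0.5, SSN 123-45-6789"

# Email addresses
echo "$note" | grep -Eo '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}'
# -> user@example.com

# IPv4 addresses (loose match; does not validate octet ranges)
echo "$note" | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}'
# -> 10.0.0.5

# SSN-shaped numbers
echo "$note" | grep -Eo '[0-9]{3}-[0-9]{2}-[0-9]{4}'
# -> 123-45-6789
```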
### Using Privacy Scan
When adding annotations with sensitive context:
```bash
# Scan for PII before storing
ml annotate run_abc \
--note "Contact at user@example.com for questions" \
--privacy-scan
# Output:
# Warning: Potential PII detected:
# - email: 'user@example.com'
# Use --force to store anyway, or edit your note.
```
### Override Warnings
If PII is intentional and acceptable:
```bash
ml annotate run_abc \
--note "Contact at user@example.com" \
--privacy-scan \
--force
```
### Redacting PII
For anonymized exports, PII is automatically redacted:
```bash
ml export run_abc --anonymize
```
Redacted content becomes: `[EMAIL-1]`, `[IP-1]`, etc.
---
## Anonymized Export
Export experiments for external sharing without leaking sensitive information.
### Basic Anonymization
```bash
ml export run_abc --bundle run_abc.tar.gz --anonymize
```
### Anonymization Levels
**Metadata-only** (default):
- Strips internal paths: `/nas/private/data` → `/datasets/data`
- Replaces internal IPs: `10.0.0.5` → `[INTERNAL-1]`
- Hashes email addresses: `user@lab.edu` → `[RESEARCHER-A]`
- Keeps experiment structure and metrics
**Full**:
- Everything in metadata-only, plus:
- Removes logs entirely
- Removes annotations
- Redacts all PII from notes
```bash
# Full anonymization
ml export run_abc --anonymize --anonymize-level full
```
### What Gets Anonymized
| Original | Anonymized | Notes |
|----------|------------|-------|
| `/home/user/data` | `/workspace/data` | Paths generalized |
| `/nas/private/lab` | `/datasets/lab` | Internal mounts hidden |
| `user@lab.edu` | `[RESEARCHER-A]` | Consistent per user |
| `10.0.0.5` | `[INTERNAL-1]` | IP ranges replaced |
| `john@example.com` | `[EMAIL-1]` | PII redacted |
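The substitutions in the table amount to pattern rewriting. A sed sketch of that kind of rewriting (illustrative, not the exporter's actual code; the real implementation also keeps placeholders consistent per user and per host):

```bash
# Sketch of anonymization-style rewriting (assumed patterns, not FetchML's code)
meta='path=/nas/private/lab host=10.0.0.5 owner=user@lab.edu'

echo "$meta" \
  | sed -E 's#/nas/private#/datasets#g' \
  | sed -E 's/([0-9]{1,3}\.){3}[0-9]{1,3}/[INTERNAL-1]/g' \
  | sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[RESEARCHER-A]/g'
# -> path=/datasets/lab host=[INTERNAL-1] owner=[RESEARCHER-A]
```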
### Export Verification
Review what's in the export:
```bash
# Export and list contents
ml export run_abc --anonymize -o /tmp/run_abc.tar.gz
tar tzf /tmp/run_abc.tar.gz | head -20
```
---
## Dataset Identity & Checksums
Verify dataset integrity with SHA256 checksums.
### Computing Checksums
Datasets are automatically checksummed when registered:
```bash
ml dataset register /path/to/dataset --name my-dataset
# Computes SHA256 of all files in dataset
```
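A deterministic whole-dataset checksum can be built by hashing a sorted list of per-file hashes, which is order-independent across filesystems. A sketch only; FetchML's own canonicalization may differ:

```bash
# Deterministic dataset-level checksum: hash the sorted per-file hash list.
# Sketch under assumed canonicalization; not necessarily what `ml` computes.
cd /path/to/dataset
find . -type f -print0 \
  | sort -z \
  | xargs -0 sha256sum \
  | sha256sum \
  | awk '{print "sha256:" $1}'
```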
### Verifying Datasets
```bash
# Verify dataset integrity
ml dataset verify /path/to/my-dataset
# Output:
# ✓ Dataset checksum verified
# Expected: sha256:abc123...
# Actual: sha256:abc123...
```
### Checksum in Manifest
```json
{
"datasets": [{
"name": "imagenet-train",
"checksum": "sha256:def456...",
"sample_count": 1281167
}]
}
```
---
## Security Best Practices
### 1. Default to Private
Keep experiments private until ready to share:
```bash
# Private by default
ml queue train.py --hypothesis "..."
# Later, when ready to share
ml privacy set run_abc --level team --team my-team
```
### 2. Scan Before Sharing
Always use `--privacy-scan` when adding notes that might contain PII:
```bash
ml annotate run_abc --note "..." --privacy-scan
```
### 3. Anonymize for External Release
Before exporting for papers or public release:
```bash
ml export run_abc --anonymize --anonymize-level full
```
### 4. Verify Dataset Integrity
Regularly verify datasets, especially shared ones:
```bash
ml dataset verify /path/to/shared/dataset
```
### 5. Use Team Privacy for Collaboration
Share with specific teams rather than making public:
```bash
ml privacy set run_abc --level team --team ml-group
```
---
## Compliance Considerations
### GDPR / Research Ethics
| Requirement | FetchML Support | Status |
|-------------|-----------------|--------|
| Right to access | `ml export` creates data bundles | ✅ |
| Right to erasure | Delete command (future) | ⏳ |
| Data minimization | Narrative fields collect only necessary data | ✅ |
| PII detection | `ml annotate --privacy-scan` | ✅ |
| Anonymization | `ml export --anonymize` | ✅ |
### Handling Sensitive Data
For experiments with sensitive data:
1. **Keep private**: Use `--level private`
2. **PII scan all annotations**: Always use `--privacy-scan`
3. **Anonymize before export**: Use `--anonymize-level full`
4. **Verify team membership**: Before sharing at `--level team`
---
## Configuration
### Worker Privacy Settings
Configure privacy defaults in worker config:
```yaml
privacy:
default_level: private
enforce_teams: true
audit_access: true
```
### API Server Privacy
Enable privacy enforcement:
```yaml
security:
privacy:
enabled: true
default_level: private
audit_access: true
```
---
## Troubleshooting
### PII Scan False Positives
Some valid text may trigger PII warnings:
```bash
# Example: "batch@32" looks like email
ml annotate run_abc --note "Use batch@32 for training" --privacy-scan
# Warning triggers, use --force if intended
```
### Privacy Changes Not Applied
- Verify you own the experiment
- Check server supports privacy enforcement
- Try with explicit base path: `--base /path/to/experiments`
### Export Not Anonymized
- Ensure `--anonymize` flag is set
- Check `--anonymize-level` is correct (metadata-only vs full)
- Verify manifest contains privacy data
---
## See Also
- `docs/src/research-features.md` - Research workflow features
- `docs/src/deployment.md` - Production deployment with privacy
- `docs/src/quick-start.md` - Getting started guide


@@ -0,0 +1,320 @@
# Research Features
FetchML includes research-focused features for experiment tracking, knowledge capture, and collaboration.
---
## Queue-Time Narrative Capture
Document your hypothesis and intent when queuing experiments. This creates provenance for your research and helps others understand your work.
### Basic Usage
```bash
ml queue train.py \
--hypothesis "Linear LR scaling improves convergence" \
--context "Following up on Smith et al. 2023" \
--intent "Test batch=64 with 2x LR" \
--expected-outcome "Same accuracy, 50% less time" \
--experiment-group "lr-scaling-study" \
--tags "ablation,learning-rate,batch-size"
```
### Available Flags
| Flag | Description | Example |
|------|-------------|---------|
| `--hypothesis` | What you expect to happen | "LR scaling improves convergence" |
| `--context` | Background and motivation | "Following paper XYZ" |
| `--intent` | What you're testing | "Test batch=64 with 2x LR" |
| `--expected-outcome` | Predicted results | "Same accuracy, 50% less time" |
| `--experiment-group` | Group related experiments | "batch-scaling" |
| `--tags` | Comma-separated tags | "ablation,lr-test" |
### Viewing Narrative
After queuing, view the narrative with:
```bash
ml info run_abc
```
---
## Post-Run Outcome Capture
Record findings after experiments complete. This preserves institutional knowledge and helps track what worked.
### Setting Outcomes
```bash
ml outcome set run_abc \
--outcome validates \
--summary "Accuracy improved 2.3% with 2x learning rate" \
--learning "Linear scaling works for batch sizes 32-128" \
--learning "GPU utilization increased from 60% to 95%" \
--next-step "Try batch=96 with gradient accumulation" \
--validation-status "cross-validated"
```
### Outcome Types
- **validates** - Hypothesis confirmed
- **refutes** - Hypothesis rejected
- **inconclusive** - Results unclear
- **partial** - Partial confirmation
### Repeatable Fields
Use multiple flags to capture multiple learnings or next steps:
```bash
ml outcome set run_abc \
--learning "Finding 1" \
--learning "Finding 2" \
--learning "Finding 3" \
--next-step "Follow-up experiment A" \
--next-step "Follow-up experiment B"
```
---
## Experiment Search & Discovery
Find past experiments with powerful filters and export results.
### Basic Search
```bash
# Search by tags
ml find --tag ablation
# Search by outcome
ml find --outcome validates
# Search by experiment group
ml find --experiment-group lr-scaling
```
### Combined Filters
```bash
# Multiple criteria
ml find \
--tag ablation \
--outcome validates \
--dataset imagenet \
--after 2024-01-01 \
--before 2024-03-01
# With limit
ml find --tag production --limit 50
```
### Export Results
```bash
# JSON output for programmatic use
ml find --outcome validates --json > results.json
# CSV for analysis in spreadsheet tools
ml find --experiment-group lr-study --csv > study_results.csv
```
### CSV Output Format
The CSV includes columns:
- `id` - Run identifier
- `job_name` - Job name
- `outcome` - validates/refutes/inconclusive/partial
- `status` - running/finished/failed
- `experiment_group` - Group name
- `tags` - Comma-separated tags
- `hypothesis` - Narrative hypothesis
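Assuming the column order above, exported CSVs work with standard command-line tools. A sketch (simple fields only; values containing embedded commas would need a real CSV parser):

```bash
# Count validated runs per experiment group from a `ml find --csv` export.
# Assumes the column order listed above (outcome = col 3, group = col 5).
ml find --tag ablation --csv > runs.csv
awk -F, 'NR > 1 && $3 == "validates" {count[$5]++}
         END {for (g in count) print g, count[g]}' runs.csv
```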
---
## Experiment Comparison
Compare two or more experiments side-by-side.
### Basic Comparison
```bash
ml compare run_abc run_def
```
### Output Formats
```bash
# Human-readable (default)
ml compare run_abc run_def
# JSON for programmatic analysis
ml compare run_abc run_def --json
# CSV for spreadsheet analysis
ml compare run_abc run_def --csv
# Show all fields (including unchanged)
ml compare run_abc run_def --all
```
### What Gets Compared
- **Narrative fields** - Hypothesis, context, intent differences
- **Metadata** - Batch size, learning rate, epochs, model, dataset
- **Metrics** - Accuracy, loss, training time with deltas
- **Outcomes** - validates vs refutes, etc.
---
## Dataset Verification
Verify dataset integrity with SHA256 checksums.
### Basic Verification
```bash
ml dataset verify /path/to/dataset
```
### Registration with Checksums
When registering datasets, checksums are automatically computed:
```bash
ml dataset register /path/to/imagenet --name imagenet-train
```
### View Dataset Info
```bash
ml dataset info imagenet-train
```
This shows:
- Dataset name and path
- SHA256 checksums
- Sample count
- Privacy level (if set)
---
## Best Practices
### 1. Always Document Hypothesis
```bash
ml queue train.py \
--hypothesis "Data augmentation reduces overfitting" \
--experiment-group "regularization-study"
```
### 2. Capture Outcomes Promptly
Record findings while they're fresh:
```bash
ml outcome set run_abc \
--outcome validates \
--summary "Augmentation improved validation accuracy by 3%" \
--learning "Rotation=15 worked best"
```
### 3. Use Consistent Tags
Establish tag conventions for your team:
- `ablation` - Ablation studies
- `baseline` - Baseline experiments
- `production` - Production-ready models
- `exploratory` - Initial exploration
### 4. Export for Analysis
Regularly export experiment data for analysis:
```bash
ml find --experiment-group my-study --csv > my_study.csv
```
### 5. Compare Systematically
Use comparison to understand what changed:
```bash
# Compare to baseline
ml compare baseline_run current_run
# Compare successful runs
ml compare run_abc run_def run_ghi
```
---
## Integration with Research Workflow
### Iterative Experiments
```bash
# Run baseline
ml queue train.py --hypothesis "Baseline is strong" --tags baseline
# Run variant A
ml queue train_v2.py \
--hypothesis "V2 improves on baseline" \
--tags ablation,v2 \
--experiment-group v2-study
# Compare results
ml compare baseline_run v2_run
# Document outcome
ml outcome set v2_run --outcome validates --summary "V2 improved 2.3%"
```
### Ablation Studies
```bash
# Full model
ml queue train.py --hypothesis "Full model works best" --tags ablation,full
# Without component A
ml queue train_no_a.py --hypothesis "Component A is critical" --tags ablation,no-a
# Without component B
ml queue train_no_b.py --hypothesis "Component B adds value" --tags ablation,no-b
# Search all ablations
ml find --tag ablation --csv > ablations.csv
```
---
## Troubleshooting
### Search Returns No Results
- Check your filters aren't too restrictive
- Try broadening date ranges
- Verify tags are spelled correctly
### Outcome Not Saved
- Ensure run ID exists: `ml info run_abc`
- Check you have permission to modify the run
- Try with explicit base path: `ml outcome set run_abc ... --base /path/to/experiments`
### Comparison Shows No Differences
- Use `--all` flag to show unchanged fields
- Verify you're comparing different runs
- Check that runs have narrative data
---
## See Also
- `docs/src/privacy-security.md` - Privacy levels and PII detection
- `docs/src/quick-start.md` - Full setup guide
- `docs/src/zig-cli.md` - Complete CLI reference