# Research Features

FetchML includes research-focused features for experiment tracking, knowledge capture, and collaboration.

## Queue-Time Narrative Capture

Document your hypothesis and intent when queuing experiments. This creates provenance for your research and helps others understand your work.

### Basic Usage

```bash
ml queue train.py \
  --hypothesis "Linear LR scaling improves convergence" \
  --context "Following up on Smith et al. 2023" \
  --intent "Test batch=64 with 2x LR" \
  --expected-outcome "Same accuracy, 50% less time" \
  --experiment-group "lr-scaling-study" \
  --tags "ablation,learning-rate,batch-size"
```
### Available Flags

| Flag | Description | Example |
|---|---|---|
| `--hypothesis` | What you expect to happen | "LR scaling improves convergence" |
| `--context` | Background and motivation | "Following paper XYZ" |
| `--intent` | What you're testing | "Test batch=64 with 2x LR" |
| `--expected-outcome` | Predicted results | "Same accuracy, 50% less time" |
| `--experiment-group` | Group related experiments | "batch-scaling" |
| `--tags` | Comma-separated tags | "ablation,lr-test" |
### Viewing Narrative

After queuing, view the narrative with:

```bash
ml info run_abc
```
## Post-Run Outcome Capture

Record findings after experiments complete. This preserves institutional knowledge and helps track what worked.

### Setting Outcomes

```bash
ml outcome set run_abc \
  --outcome validates \
  --summary "Accuracy improved 2.3% with 2x learning rate" \
  --learning "Linear scaling works for batch sizes 32-128" \
  --learning "GPU utilization increased from 60% to 95%" \
  --next-step "Try batch=96 with gradient accumulation" \
  --validation-status "cross-validated"
```
### Outcome Types

- `validates` - Hypothesis confirmed
- `refutes` - Hypothesis rejected
- `inconclusive` - Results unclear
- `partial` - Partial confirmation
### Repeatable Fields

Use multiple flags to capture multiple learnings or next steps:

```bash
ml outcome set run_abc \
  --learning "Finding 1" \
  --learning "Finding 2" \
  --learning "Finding 3" \
  --next-step "Follow-up experiment A" \
  --next-step "Follow-up experiment B"
```
## Experiment Search & Discovery

Find past experiments with powerful filters and export results.

### Basic Search

```bash
# Search by tags
ml find --tag ablation

# Search by outcome
ml find --outcome validates

# Search by experiment group
ml find --experiment-group lr-scaling
```
### Combined Filters

```bash
# Multiple criteria
ml find \
  --tag ablation \
  --outcome validates \
  --dataset imagenet \
  --after 2024-01-01 \
  --before 2024-03-01

# With limit
ml find --tag production --limit 50
```
### Export Results

```bash
# JSON output for programmatic use
ml find --outcome validates --json > results.json

# CSV for analysis in spreadsheet tools
ml find --experiment-group lr-study --csv > study_results.csv
```
### CSV Output Format

The CSV includes columns:

- `id` - Run identifier
- `job_name` - Job name
- `outcome` - validates/refutes/inconclusive/partial
- `status` - running/finished/failed
- `experiment_group` - Group name
- `tags` - Comma-separated tags
- `hypothesis` - Narrative hypothesis
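These exports can be consumed with standard tooling. A minimal sketch in Python's standard library, assuming the column names listed above (the helper name and filter are illustrative, not a FetchML API):

```python
import csv

def load_by_outcome(path, outcome="validates"):
    """Read an `ml find --csv` export and keep rows with the given outcome.

    Hypothetical helper for illustration; assumes the export has an
    `outcome` column as documented above.
    """
    with open(path, newline="") as f:
        return [row for row in csv.DictReader(f) if row["outcome"] == outcome]
```

Grouping by `experiment_group`, or splitting the comma-separated `tags` column with `row["tags"].split(",")`, works the same way on top of `csv.DictReader`.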
## Experiment Comparison

Compare two or more experiments side-by-side.

### Basic Comparison

```bash
ml compare run_abc run_def
```
### Output Formats

```bash
# Human-readable (default)
ml compare run_abc run_def

# JSON for programmatic analysis
ml compare run_abc run_def --json

# CSV for spreadsheet analysis
ml compare run_abc run_def --csv

# Show all fields (including unchanged)
ml compare run_abc run_def --all
```
### What Gets Compared

- **Narrative fields** - Hypothesis, context, intent differences
- **Metadata** - Batch size, learning rate, epochs, model, dataset
- **Metrics** - Accuracy, loss, training time with deltas
- **Outcomes** - validates vs refutes, etc.
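When post-processing `ml compare --json` output, metric deltas can be recomputed from the per-run values. A sketch assuming each run's metrics arrive as a plain name-to-value mapping (the exact JSON layout is not specified here, so treat the field shapes as an assumption):

```python
def metric_deltas(base, other):
    """Return {metric: other - base} for metrics present in both runs.

    `base` and `other` are assumed to be dicts of metric name -> float,
    e.g. extracted from two runs' entries in the compare JSON.
    """
    return {name: other[name] - base[name] for name in base if name in other}
```

A positive delta means the second run improved on that metric (for accuracy-like metrics) or regressed (for loss-like metrics), so interpret signs per metric.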
## Dataset Verification

Verify dataset integrity with SHA256 checksums.

### Basic Verification

```bash
ml dataset verify /path/to/dataset
```
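Conceptually, verification streams each file through SHA256 and compares the digest against the value stored at registration time. A minimal sketch of that hashing step in Python (the chunk size and helper name are illustrative, not FetchML internals):

```python
import hashlib

def sha256_file(path, chunk_size=65536):
    """Hash a file in fixed-size chunks so large dataset files never
    need to be loaded fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing this hex digest against the stored one detects any byte-level change to the file, which is exactly the integrity guarantee checksums provide.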
### Registration with Checksums

When registering datasets, checksums are automatically computed:

```bash
ml dataset register /path/to/imagenet --name imagenet-train
```

### View Dataset Info

```bash
ml dataset info imagenet-train
```

This shows:

- Dataset name and path
- SHA256 checksums
- Sample count
- Privacy level (if set)
## Best Practices

### 1. Always Document Hypothesis

```bash
ml queue train.py \
  --hypothesis "Data augmentation reduces overfitting" \
  --experiment-group "regularization-study"
```
### 2. Capture Outcomes Promptly

Record findings while they're fresh:

```bash
ml outcome set run_abc \
  --outcome validates \
  --summary "Augmentation improved validation accuracy by 3%" \
  --learning "Rotation=15 worked best"
```
### 3. Use Consistent Tags

Establish tag conventions for your team:

- `ablation` - Ablation studies
- `baseline` - Baseline experiments
- `production` - Production-ready models
- `exploratory` - Initial exploration
### 4. Export for Analysis

Regularly export experiment data for analysis:

```bash
ml find --experiment-group my-study --csv > my_study.csv
```
### 5. Compare Systematically

Use comparison to understand what changed:

```bash
# Compare to baseline
ml compare baseline_run current_run

# Compare successful runs
ml compare run_abc run_def run_ghi
```
## Integration with Research Workflow

### Iterative Experiments

```bash
# Run baseline
ml queue train.py --hypothesis "Baseline is strong" --tags baseline

# Run variant A
ml queue train_v2.py \
  --hypothesis "V2 improves on baseline" \
  --tags ablation,v2 \
  --experiment-group v2-study

# Compare results
ml compare baseline_run v2_run

# Document outcome
ml outcome set v2_run --outcome validates --summary "V2 improved 2.3%"
```
### Ablation Studies

```bash
# Full model
ml queue train.py --hypothesis "Full model works best" --tags ablation,full

# Without component A
ml queue train_no_a.py --hypothesis "Component A is critical" --tags ablation,no-a

# Without component B
ml queue train_no_b.py --hypothesis "Component B adds value" --tags ablation,no-b

# Search all ablations
ml find --tag ablation --csv > ablations.csv
```
## Troubleshooting

### Search Returns No Results

- Check your filters aren't too restrictive
- Try broadening date ranges
- Verify tags are spelled correctly
### Outcome Not Saved

- Ensure the run ID exists: `ml info run_abc`
- Check you have permission to modify the run
- Try with an explicit base path: `ml outcome set run_abc ... --base /path/to/experiments`
### Comparison Shows No Differences

- Use the `--all` flag to show unchanged fields
- Verify you're comparing different runs
- Check that runs have narrative data
## See Also

- `docs/src/privacy-security.md` - Privacy levels and PII detection
- `docs/src/quick-start.md` - Full setup guide
- `docs/src/zig-cli.md` - Complete CLI reference