fetch_ml/CHANGELOG.md
Jeremie Fraeys f357624685
docs: Update CHANGELOG and add feature documentation
Update documentation for new features:
- Add CHANGELOG entries for research features and privacy enhancements
- Update README with new CLI commands and security features
- Add privacy-security.md documentation for PII detection
- Add research-features.md for narrative and outcome tracking
2026-02-18 21:28:25 -05:00


## [Unreleased]
### Added - CSV Export Features (2026-02-18)
- CLI: `ml compare --csv` - Export run comparisons as CSV with actual run IDs as column headers
- CLI: `ml find --csv` - Export search results as CSV for spreadsheet analysis
- CLI: `ml dataset verify --csv` - Export dataset verification metrics as CSV
- Shell: Updated bash/zsh completions with `--csv` flags for the `compare` and `find` commands
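
As a rough illustration of the `ml compare --csv` layout (run IDs as column headers), here is a minimal Go sketch. The function name `ComparisonCSV`, the `metric` header for the first column, and the map-of-maps input shape are assumptions made for this example, not the project's actual code.

```go
package main

import (
	"encoding/csv"
	"sort"
	"strings"
)

// ComparisonCSV renders one row per metric and one column per run,
// using the run IDs themselves as column headers.
// runs maps run ID -> metric name -> formatted value (hypothetical shape).
func ComparisonCSV(runIDs []string, runs map[string]map[string]string) (string, error) {
	// Collect the union of metric names so a missing value becomes an empty cell.
	seen := map[string]bool{}
	var metrics []string
	for _, id := range runIDs {
		for name := range runs[id] {
			if !seen[name] {
				seen[name] = true
				metrics = append(metrics, name)
			}
		}
	}
	sort.Strings(metrics) // stable row order regardless of map iteration

	var sb strings.Builder
	cw := csv.NewWriter(&sb)
	header := append([]string{"metric"}, runIDs...)
	if err := cw.Write(header); err != nil {
		return "", err
	}
	for _, name := range metrics {
		row := []string{name}
		for _, id := range runIDs {
			row = append(row, runs[id][name]) // "" when the run lacks this metric
		}
		if err := cw.Write(row); err != nil {
			return "", err
		}
	}
	cw.Flush()
	if err := cw.Error(); err != nil {
		return "", err
	}
	return sb.String(), nil
}
```

Calling `ComparisonCSV` with two run IDs yields a header row of `metric` followed by those IDs, which a spreadsheet can import directly.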
### Added - Phase 3 Features (2026-02-18)
- CLI: `ml requeue --with-changes` - Iterative experimentation with config overrides (e.g. `--lr=0.002`)
- CLI: `ml requeue --inherit-narrative` - Copy hypothesis/context from parent run
- CLI: `ml requeue --inherit-config` - Copy metadata from parent run
- CLI: `ml requeue --parent` - Link as child run for provenance tracking
- CLI: `ml dataset verify` - Fast dataset checksum validation
- CLI: `ml logs --follow` - Real-time log streaming via WebSocket
- API/WebSocket: Add opcodes for compare (0x30), find (0x31), export (0x32), set outcome (0x33)
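
The opcode values listed above could be modeled as typed constants, as in the Go sketch below. Only the numeric values come from this changelog; the type name `Opcode`, the constant names, and the `OpcodeName` helper are hypothetical.

```go
package main

// Opcode identifies a WebSocket request type.
// The values are taken from the changelog entry; the names are illustrative.
type Opcode byte

const (
	OpCompare    Opcode = 0x30
	OpFind       Opcode = 0x31
	OpExport     Opcode = 0x32
	OpSetOutcome Opcode = 0x33
)

// OpcodeName maps a raw opcode byte to a readable name.
// The second return value is false for bytes outside the four opcodes above.
func OpcodeName(b byte) (string, bool) {
	switch Opcode(b) {
	case OpCompare:
		return "compare", true
	case OpFind:
		return "find", true
	case OpExport:
		return "export", true
	case OpSetOutcome:
		return "set outcome", true
	default:
		return "", false
	}
}
```

Keeping the values in one typed block makes it harder for the CLI and server to drift apart on the numbering.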
### Added - Phase 2 Features (2026-02-18)
- CLI: `ml compare` - Diff two runs showing narrative/metadata/metrics differences
- CLI: `ml find` - Search experiments by tags, outcome, dataset, experiment-group, author
- CLI: `ml export --anonymize` - Export bundles with path/IP/username redaction
- CLI: `ml export --anonymize-level` - 'metadata-only' or 'full' anonymization
- CLI: `ml outcome set` - Post-run outcome tracking (validates/refutes/inconclusive/partial)
- CLI: "Did you mean" suggestions in error messages for typos, based on Levenshtein distance
- Shell: Updated bash/zsh completions for all new commands
- Tests: E2E tests for compare, find, export, requeue changes
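
For readers unfamiliar with the typo-suggestion technique, the following Go sketch shows one common way to do it: compute the Levenshtein distance from the input to each known command and suggest the closest one within a threshold. The function names and the threshold parameter are assumptions for illustration; the CLI's actual implementation may differ.

```go
package main

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the number of single-character insertions, deletions,
// and substitutions needed to turn a into b (two-row dynamic programming).
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr := make([]int, len(rb)+1)
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev = curr
	}
	return prev[len(rb)]
}

// suggest returns the candidate closest to input, provided its distance
// does not exceed maxDist. Ties go to the earliest candidate.
func suggest(input string, candidates []string, maxDist int) (string, bool) {
	best, bestDist := "", maxDist+1
	for _, c := range candidates {
		if d := levenshtein(input, c); d < bestDist {
			best, bestDist = c, d
		}
	}
	return best, bestDist <= maxDist
}
```

With candidates `compare`, `find`, `export`, `requeue` and a threshold of 2, the input `comapre` resolves to `compare`, while an unrelated word produces no suggestion.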
### Added - Phase 0 Features (2026-02-18)
- CLI: Queue-time narrative flags (`--hypothesis`, `--context`, `--intent`, `--expected-outcome`, `--experiment-group`, `--tags`)
- CLI: Enhanced `ml status` output with queue position `[pos N]` and priority `(P:N)`
- CLI: `ml narrative set` command for setting run narrative fields
- Shell: Updated completions with new commands and flags
### Security
- Native: fix buffer overflow vulnerabilities in `dataset_hash` (replaced `strcpy` with `strncpy` + null termination)
- Native: fix unsafe `memcpy` in `queue_index` priority queue (added explicit null terminators for string fields)
- Native: add path traversal protection in `queue_index` storage (rejects `..` and null bytes in queue directory paths)
- Native: add mmap size limits (100 MB max) to prevent unbounded memory mappings
- Native: modularize C++ libraries with clean layering (common, queue_index, dataset_hash)
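
The path traversal check described above lives in the native C++ code; the Go sketch below only illustrates the same idea. It assumes that `..` is rejected as a whole path component (so a directory named `my..dir` is still allowed), which this changelog does not specify. `ValidateQueueDir` and `ErrUnsafePath` are hypothetical names.

```go
package main

import (
	"errors"
	"strings"
)

// ErrUnsafePath is a hypothetical sentinel error for this sketch.
var ErrUnsafePath = errors.New("unsafe queue directory path")

// ValidateQueueDir rejects paths that contain a null byte or a ".." component.
// Assumption: ".." is matched per path component, not as a raw substring.
func ValidateQueueDir(path string) error {
	// A null byte can truncate the path when it is passed to C APIs.
	if strings.IndexByte(path, 0) >= 0 {
		return ErrUnsafePath
	}
	// Split on both separators so the check is not bypassed with backslashes.
	isSep := func(r rune) bool { return r == '/' || r == '\\' }
	for _, part := range strings.FieldsFunc(path, isSep) {
		if part == ".." {
			return ErrUnsafePath
		}
	}
	return nil
}
```

Rejecting the input outright, rather than trying to normalize it, keeps the check simple and avoids surprises from platform-specific path cleaning.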
### Added
- API/WebSocket: add dataset handlers (list, register, info, search) with DB integration
- API/WebSocket: add metrics persistence to `handleLogMetric` with `websocket_metrics` table
- Storage: add `db_metrics.go` with `RecordMetric`, `GetMetrics`, `GetMetricSummary` methods
- Tests: add payload parsing tests for WebSocket handlers
### Changed
- Config: replace `panic()` with error returns in `smart_defaults.go` for better error handling
- Tests: move WebSocket handler tests to `tests/unit/api/ws/`
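
The `panic()` to error-return change follows a common Go pattern, sketched below. `defaultDataDir`, `ErrNoHome`, and the `.ml/data` suffix are all invented for this example and are not taken from `smart_defaults.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoHome is a hypothetical sentinel error for this sketch.
var ErrNoHome = errors.New("home directory not set")

// defaultDataDir is a hypothetical helper, not a function from smart_defaults.go.
// A panic-based version would crash the process on an empty home directory;
// this version reports the problem to the caller instead.
func defaultDataDir(home string) (string, error) {
	if home == "" {
		// Wrap the sentinel so callers can still match it with errors.Is.
		return "", fmt.Errorf("resolve default data dir: %w", ErrNoHome)
	}
	return home + "/.ml/data", nil
}
```

Callers then decide how to react, for example by printing a message and exiting with a non-zero status instead of dumping a stack trace.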
### Fixed
- Storage: remove duplicate `db_datasets.go`, consolidate with `db_experiments.go`
### Deprecated
- Config: `ToTUIConfig()` now returns `(*Config, error)` instead of `*Config`; existing callers must be updated to handle the returned error
### Removed
- Storage: deleted `internal/storage/db_datasets.go` (duplicate implementation)