- README.md: Replace FETCHML_NATIVE_LIBS with -tags native_libs
- docs/src/native-libraries.md: Update all examples to use build tags
- .forgejo/workflows/ci-native.yml: Use -tags native_libs in all test steps
- Remove deprecated FETCHML_NATIVE_LIBS=1/0 env var references
- Replace worker.DirOverallSHA256HexParallel with worker.DirOverallSHA256Hex
- Fixes in dataset_hash_bench_test.go and hash_bench_test.go
- All benchmarks pass with native_libs build tag
- Remove duplicate hash_selector.go (build tags handle switching)
- Fix benchmark to use worker.DirOverallSHA256Hex
- Fix snapshot_store.go to use integrity.DirOverallSHA256Hex directly
- Native tests pass, benchmarks now correctly test native vs Go
Go Worker (internal/worker/native_bridge_libs.go):
- Add global hashCtx with sync.Once for lazy initialization
- Eliminates 5-20ms fh_init/fh_cleanup per hash operation
- Uses runtime.NumCPU() for optimal thread count
- Log initialization time for observability
Zig CLI (cli/src/native/hash.zig):
- Add global_ctx with atomic flag and mutex
- Thread-safe initialization with double-check pattern
- Idempotent init() callable from multiple threads
- Log init time for debugging
- Remove separate 'hash' subcommand
- Integrate native SHA256 hash into 'dataset verify'
- Hash is now computed automatically when verifying datasets
- Shows hash in output (JSON, CSV, and text formats)
- Help text updated to indicate auto-hashing
Replace FETCHML_NATIVE_LIBS=1 environment variable with -tags native_libs:
Changes:
- internal/queue/native_queue.go: UseNativeQueue is now const true
- internal/queue/native_queue_stub.go: UseNativeQueue is now const false
- build/docker/simple.Dockerfile: Add -tags native_libs to go build
- deployments/docker-compose.dev.yml: Remove FETCHML_NATIVE_LIBS env var
- native/README.md: Update documentation for build tags
- scripts/test-native-with-redis.sh: New test script with Redis via docker-compose
Benefits:
- Compile-time enforcement (no runtime checks needed)
- Cleaner deployment (no env var management)
- Type safety (const vs var)
- Simpler testing with docker-compose Redis integration
Add comprehensive explanation of the reproducibility problem and fix:
- Document readdir filesystem-dependent ordering issue
- Explain std::sort fix for lexicographic ordering
- Clarify recursive traversal with cycle detection
- Document hidden file and special file exclusions
- Warn researchers about silent omissions and empty hash edge cases
This addresses the core concern that researchers need to understand
the hash is computed over sorted paths to trust cross-machine verification.
- Renamed note.zig to annotate.zig (preserves user's preferred naming)
- Updated all references from 'ml note' to 'ml annotate'
- Re-added experiment.zig with create/list/show subcommands
- Updated main.zig dispatch: 'a' for annotate, 'e' for experiment
- Updated printUsage and test block to reflect changes
- store/store.go: New SQLite storage for TUI local mode
- Open() with WAL mode and NORMAL synchronous
- Schema initialization for ml_experiments, ml_runs, ml_metrics, ml_params, ml_tags
- GetUnsyncedRuns(), GetRunsByExperiment(), MarkRunSynced()
- GetRunMetrics(), GetRunParams() for run details
- config/config.go: Add local mode configuration fields
- DBPath, ForceLocal, ProjectRoot fields
- Experiment struct with Name and Entrypoint
- IsLocalMode() and GetDBPath() helper methods
- go.mod: Add modernc.org/sqlite v1.36.0 dependency
- queue.zig: Add --rerun <run_id> flag to re-queue completed local runs
- Requires server connection, rejects in offline mode with clear error
- HandleRerun function sends rerun request via WebSocket
- sync.zig: Rewrite for WebSocket experiment sync protocol
- Queries unsynced runs from SQLite ml_runs table
- Builds sync JSON with metrics and params
- Sends sync_run message, waits for sync_ack response
- MarkRunSynced updates synced flag in database
- watch.zig: Add --sync flag for continuous experiment sync
- Auto-sync runs to server every 30 seconds when online
- Mode detection with offline error handling
- note.zig: New unified metadata annotation command
- Supports --text, --hypothesis, --outcome, --confidence, --privacy, --author
- Stores metadata as tags in SQLite ml_tags table
- log.zig: Simplified to unified logs command (fetch/stream only)
- Removed metric/param/tag subcommands (now in run wrapper)
- Supports --follow for live log streaming from server
- cancel.zig: Add local process termination support
- Sends SIGTERM first, waits 5s, then SIGKILL if needed
- Updates run status to CANCELLED in SQLite
- Also supports server job cancellation via WebSocket
- Fork child process and capture stdout/stderr via pipe
- Parse FETCHML_METRIC key=value [step=N] lines from output
- Write run_manifest.json with run metadata
- Insert/update ml_runs table in SQLite with PID tracking
- Stream output to output.log file
- Support entrypoint from config or explicit command after --
- Update experiment.zig with unified commands (local + server modes)
- Add init.zig for local project initialization
- Update sync.zig for project synchronization
- Update main.zig to route new local mode commands (experiment, run, log)
- Support automatic mode detection from config (sqlite:// vs wss://)
- Add SQLite amalgamation fetch script (make build-sqlite)
- Embed SQLite in release builds, link system lib in dev
- Create sqlite_embedded.zig utility module
- Unify experiment/run/log commands with auto mode detection
- Add Forgejo CI workflow for building with embedded SQLite
- Update READMEs for local mode and build instructions
SQLite follows rsync embedding pattern: assets/sqlite_release_<os>_<arch>/
Zero external dependencies for release builds.
- Fix duplicate check in security_test.go lint warning
- Mark SHA256 tests as Legacy for backward compatibility
- Convert TODO comments to documentation (task, handlers, privacy)
- Update user_manager_test to use GenerateAPIKey pattern
Update documentation for new features:
- Add CHANGELOG entries for research features and privacy enhancements
- Update README with new CLI commands and security features
- Add privacy-security.md documentation for PII detection
- Add research-features.md for narrative and outcome tracking
Reorganize tests for better structure and coverage:
- Move container/security_test.go from internal/ to tests/unit/container/
- Move related tests to proper unit test locations
- Delete orphaned test files (startup_blacklist_test.go)
- Add privacy middleware unit tests
- Add worker config unit tests
- Update E2E tests for homelab and websocket scenarios
- Update test fixtures with utility functions
- Add CLI helper script for arraylist fixes
Add comprehensive research context tracking to jobs:
- Narrative fields: hypothesis, context, intent, expected_outcome
- Experiment groups and tags for organization
- Run comparison (compare command) for diff analysis
- Run search (find command) with criteria filtering
- Run export (export command) for data portability
- Outcome setting (outcome command) for experiment validation
Update queue and requeue commands to support narrative fields.
Add narrative validation to manifest validator.
Add WebSocket handlers for compare, find, export, and outcome operations.
Includes E2E tests for phase 2 features.
Add comprehensive testing for TUI usability over SSH in production-like environment:
Infrastructure:
- Caddy reverse proxy config for WebSocket and API routing
- Docker Compose with SSH test server container
- TUI test configuration for smoke testing
Test Harness:
- SSH server Go test fixture with container management
- TUI driver with PTY support for automated input/output testing
- 8 E2E tests covering SSH connectivity, TERM propagation,
API/WebSocket connectivity, and TUI configuration
Scripts:
- SSH key generation for test environment
- Manual testing script with interactive TUI verification
The setup allows automated verification that the BubbleTea TUI works
correctly over SSH with proper terminal handling, alt-screen buffer,
and mouse support through Caddy reverse proxy.
Update internal/jupyter/workspace_metadata.go to use centralized PathRegistry:
Changes:
- Add import for internal/config package
- Update saveMetadata() to use config.FromEnv() for directory creation
- Replace os.MkdirAll with paths.EnsureDir() for metadata directory
Benefits:
- Consistent directory creation via PathRegistry
- Centralized path management for workspace metadata
- Better error handling for directory creation
Update internal/queue/filesystem_queue.go to use centralized PathRegistry:
Changes:
- Add import for internal/config package
- Update NewFilesystemQueue to use config.FromEnv() for directory creation
- Replace os.MkdirAll with paths.EnsureDir() for all queue directories:
- pending/entries
- running
- finished
- failed
Benefits:
- Consistent directory creation via PathRegistry
- Centralized path management for queue storage
- Better error handling for directory creation