Jeremie Fraeys
c89d970210
refactor: migrate from env var to build tags for native libs
...
Replace FETCHML_NATIVE_LIBS=1 environment variable with -tags native_libs:
Changes:
- internal/queue/native_queue.go: UseNativeQueue is now const true
- internal/queue/native_queue_stub.go: UseNativeQueue is now const false
- build/docker/simple.Dockerfile: Add -tags native_libs to go build
- deployments/docker-compose.dev.yml: Remove FETCHML_NATIVE_LIBS env var
- native/README.md: Update documentation for build tags
- scripts/test-native-with-redis.sh: New test script with Redis via docker-compose
Benefits:
- Compile-time enforcement (no runtime checks needed)
- Cleaner deployment (no env var management)
- Type safety (const vs var)
- Simpler testing with docker-compose Redis integration
2026-02-21 13:43:58 -05:00
Jeremie Fraeys
472590f831
docs: expand Research Trustworthiness section with detailed design rationale
...
Add comprehensive explanation of the reproducibility problem and fix:
- Document readdir filesystem-dependent ordering issue
- Explain std::sort fix for lexicographic ordering
- Clarify recursive traversal with cycle detection
- Document hidden file and special file exclusions
- Warn researchers about silent omissions and empty hash edge cases
This addresses the core concern that researchers need to understand
the hash is computed over sorted paths to trust cross-machine verification.
2026-02-21 13:38:25 -05:00
Jeremie Fraeys
7efe8bbfbf
native: security hardening, research trustworthiness, and CVE mitigations
...
Security Fixes:
- CVE-2024-45339: Add O_EXCL flag to temp file creation in storage_write_entries()
Prevents symlink attacks on predictable .tmp file paths
- CVE-2025-47290: Use openat_nofollow() in storage_open()
Closes TOCTOU race condition via path_sanitizer infrastructure
- CVE-2025-0838: Add MAX_BATCH_SIZE=10000 to add_tasks()
Prevents integer overflow in batch operations
Research Trustworthiness (dataset_hash):
- Deterministic file ordering: std::sort after collect_files()
- Recursive directory traversal: depth-limited with cycle detection
- Documented exclusions: hidden files and special files noted in API
Bug Fixes:
- R1: storage_init path validation for non-existent directories
- R2: safe_strncpy return value check before strcat
- R3: parallel_hash 256-file cap replaced with std::vector
- R4: wire qi_compact_index/qi_rebuild_index stubs
- R5: CompletionLatch race condition fix (hold mutex during decrement)
- R6: ARMv8 SHA256 transform fix (save abcd_pre before vsha256hq_u32)
- R7: fuzz_index_storage header format fix
- R8: enforce null termination in add_tasks/update_tasks
- R9: use 64 bytes (not 65) in combined hash to exclude null terminator
- R10: status field persistence in save()
New Tests:
- test_recursive_dataset.cpp: Verify deterministic recursive hashing
- test_storage_symlink_resistance.cpp: Verify CVE-2024-45339 fix
- test_queue_index_batch_limit.cpp: Verify CVE-2025-0838 fix
- test_sha256_arm_kat.cpp: ARMv8 known-answer tests
- test_storage_init_new_dir.cpp: F1 verification
- test_parallel_hash_large_dir.cpp: F3 verification
- test_queue_index_compact.cpp: F4 verification
All 8 native tests passing. Library ready for research lab deployment.
2026-02-21 13:33:45 -05:00
Jeremie Fraeys
37aad7ae87
feat: add manifest signing and native hashing support
...
- Integrate RunManifest.Validate with existing Validator
- Add manifest Sign() and Verify() methods
- Add native C++ hashing libraries (dataset_hash, queue_index)
- Add native bridge for Go/C++ integration
- Add deduplication support in queue
2026-02-19 15:34:39 -05:00
Jeremie Fraeys
43d241c28d
feat: implement C++ native libraries for performance-critical operations
...
- Add arena allocator for zero-allocation hot paths
- Add thread pool for parallel operations
- Add mmap utilities for memory-mapped I/O
- Implement queue_index with heap-based priority queue
- Implement dataset_hash with SIMD support (SHA-NI, ARMv8)
- Add runtime SIMD detection for cross-platform correctness
- Add comprehensive tests and benchmarks
2026-02-16 20:38:04 -05:00
Jeremie Fraeys
1de9cc2738
fix: add missing C++ headers for queue and condition_variable
Checkout test / test (push) Successful in 4s
CI with Native Libraries / Check Build Environment (push) Successful in 12s
Documentation / build-and-publish (push) Failing after 33s
CI with Native Libraries / Build and Test Native Libraries (push) Failing after 6m36s
CI with Native Libraries / Build Release Libraries (push) Has been skipped
2026-02-12 13:56:09 -05:00
Jeremie Fraeys
d408a60eb1
ci: push all workflow updates
Documentation / build-and-publish (push) Waiting to run
Test / test (push) Waiting to run
Checkout test / test (push) Successful in 5s
CI with Native Libraries / test-native (push) Has been cancelled
CI with Native Libraries / build-release (push) Has been cancelled
2026-02-12 13:28:15 -05:00