# Native C++ Libraries

High-performance C++ libraries for critical system components.

## Overview

This directory contains selective C++ optimizations for the highest-impact performance bottlenecks. Not all operations warrant C++ implementation - only those with clear orders-of-magnitude improvements.

## Current Libraries

### queue_index (Priority Queue Index)

- **Purpose**: High-performance task queue with binary heap
- **Performance**: 21,000x faster than JSON-based Go implementation
- **Memory**: 99% allocation reduction
- **Status**: ✅ Production ready

### dataset_hash (SHA256 Hashing)

- **Purpose**: SIMD-accelerated file hashing (ARMv8 crypto / Intel SHA-NI)
- **Performance**: 78% syscall reduction, batch-first API
- **Memory**: 99% less memory than Go implementation
- **Status**: ✅ Production ready

## Build Requirements

- CMake 3.20+
- C++20 compiler (GCC 11+, Clang 14+, or MSVC 2022+)
- Go 1.25+ (for CGo integration)

## Quick Start

```bash
# Build all native libraries
make native-build

# Run with native libraries enabled
FETCHML_NATIVE_LIBS=1 go run ./...

# Run benchmarks
FETCHML_NATIVE_LIBS=1 go test -bench=. ./tests/benchmarks/
```

## Build Options

```bash
# Debug build with AddressSanitizer
cd native/build && cmake .. -DCMAKE_BUILD_TYPE=Debug -DENABLE_ASAN=ON

# Release build (optimized)
cd native/build && cmake .. -DCMAKE_BUILD_TYPE=Release

# Build specific library
cd native/build && make queue_index
```

## Architecture

### Design Principles

1. **Selective optimization**: Only 2 libraries out of 80+ profiled functions
2. **Batch-first APIs**: Minimize CGo overhead (~100ns/call)
3. **Zero-allocation hot paths**: Arena allocators, no malloc in critical sections
4. **C ABI for CGo**: Simple C structs, no C++ exceptions across boundary
5. **Cross-platform**: Runtime SIMD detection (ARMv8 / x86_64 SHA-NI)

### CGo Integration

```go
// #cgo LDFLAGS: -L${SRCDIR}/../../native/build -lqueue_index
// #include "../../native/queue_index/queue_index.h"
import "C"
```

### Error Handling

- C functions return `-1` for errors, positive values for success
- Use `qi_last_error()` / `fh_last_error()` for error messages
- Go code checks `rc < 0` not `rc != 0`

## When to Add New C++ Libraries

**DO implement when:**

- Profile shows >90% syscall overhead
- Batch operations amortize CGo cost
- SIMD can provide 3x+ speedup
- Memory pressure is critical

**DON'T implement when:**

- Speedup <2x (CGo overhead negates gains)
- Single-file operations (per-call overhead too high)
- Team <3 backend engineers (maintenance burden)
- Complex error handling required

## History

**Implemented:**

- ✅ queue_index: Binary priority queue replacing JSON filesystem queue
- ✅ dataset_hash: SIMD SHA256 for artifact verification

**Deferred:**

- ⏸️ task_json_codec: 2-3x speedup not worth maintenance (small team)
- ⏸️ artifact_scanner: Go filepath.Walk faster for typical workloads
- ⏸️ streaming_io: Complexity exceeds benefit without io_uring

## Maintenance

**Build verification:**

```bash
make native-build
FETCHML_NATIVE_LIBS=1 make test
```

**Adding new library:**

1. Create subdirectory with CMakeLists.txt
2. Implement C ABI in `.h` / `.cpp` files
3. Add to root CMakeLists.txt
4. Create Go bridge in `internal/`
5. Add benchmarks in `tests/benchmarks/`
6. Document in this README

## Troubleshooting

**Library not found:**

- Ensure `native/build/lib*.dylib` (macOS) or `.so` (Linux) exists
- Check `LD_LIBRARY_PATH` or `DYLD_LIBRARY_PATH`

**CGo undefined symbols:**

- Verify C function names match exactly (no name mangling)
- Check `#include` paths are correct
- Rebuild: `make native-clean && make native-build`

**Performance regression:**

- Verify `FETCHML_NATIVE_LIBS=1` is set
- Check benchmark: `go test -bench=BenchmarkQueue -v`
- Profile with: `go test -bench=. -cpuprofile=cpu.prof`
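
**Return-code convention (sketch):**

When debugging a Go bridge, it can help to see the convention from the Error Handling section in isolation: negative return codes signal failure, positive values signal success, and the Go side checks `rc < 0`. The following is a minimal pure-Go sketch, not the library's actual API. `nativeAddBatch`, `lastError`, and `AddBatch` are hypothetical names standing in for a CGo call, for a message accessor such as `qi_last_error()`, and for a Go wrapper.

```go
package main

import (
	"errors"
	"fmt"
)

// lastError is a hypothetical stand-in for the message a native library
// would expose through an accessor such as qi_last_error().
var lastError string

// nativeAddBatch is a hypothetical stand-in for a C function reached via CGo.
// It follows the convention above: -1 on error, a positive value on success
// (here, the number of items accepted).
func nativeAddBatch(priorities []int) int {
	if len(priorities) == 0 {
		lastError = "empty batch"
		return -1
	}
	lastError = ""
	return len(priorities)
}

// AddBatch converts the C-style return code into a Go error.
// It checks rc < 0 rather than rc != 0, because positive values mean success.
func AddBatch(priorities []int) (int, error) {
	rc := nativeAddBatch(priorities)
	if rc < 0 {
		return 0, errors.New(lastError)
	}
	return rc, nil
}

func main() {
	// One call carries the whole batch, so any per-call overhead is paid once.
	n, err := AddBatch([]int{3, 1, 2})
	fmt.Println(n, err)

	_, err = AddBatch(nil)
	fmt.Println(err)
}
```

A wrapper that checked `rc != 0` instead would treat every successful positive return as a failure, which is why the bridge code tests the sign of the result.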