# Native C++ Libraries

High-performance C++ libraries for critical system components.

## Overview

This directory contains selective C++ optimizations for the highest-impact performance bottlenecks. Not every operation warrants a C++ implementation; only those with clear orders-of-magnitude improvements do.

## Current Libraries

### queue_index (Priority Queue Index)
- **Purpose**: High-performance task queue backed by a binary heap
- **Performance**: 21,000x faster than the JSON-based Go implementation
- **Memory**: 99% allocation reduction
- **Status**: ✅ Production ready
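
To illustrate the data structure only, here is a minimal binary-heap priority queue in pure Go using the standard `container/heap` package. It is a sketch, not the `queue_index` implementation: the `Task` type, its fields, and the "higher priority first" ordering are assumptions made for the example.

```go
package main

import (
	"container/heap"
	"fmt"
)

// Task is a hypothetical queue entry; the real queue_index record
// layout is not described in this README.
type Task struct {
	ID       string
	Priority int // assumption: a higher value is dequeued first
}

// taskHeap implements heap.Interface as a max-heap on Priority.
type taskHeap []Task

func (h taskHeap) Len() int           { return len(h) }
func (h taskHeap) Less(i, j int) bool { return h[i].Priority > h[j].Priority }
func (h taskHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *taskHeap) Push(x any)        { *h = append(*h, x.(Task)) }
func (h *taskHeap) Pop() any {
	old := *h
	n := len(old)
	t := old[n-1]
	*h = old[:n-1]
	return t
}

// drain heapifies a copy of tasks and returns the IDs in dequeue order.
func drain(tasks []Task) []string {
	h := make(taskHeap, len(tasks))
	copy(h, tasks)
	heap.Init(&h) // O(n) heap construction
	ids := make([]string, 0, len(tasks))
	for h.Len() > 0 {
		ids = append(ids, heap.Pop(&h).(Task).ID) // O(log n) per pop
	}
	return ids
}

func main() {
	fmt.Println(drain([]Task{{"a", 1}, {"b", 5}, {"c", 3}}))
}
```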

### dataset_hash (SHA256 Hashing)
- **Purpose**: SIMD-accelerated file hashing (ARMv8 crypto / Intel SHA-NI)
- **Performance**: 78% syscall reduction, batch-first API
- **Memory**: 99% less memory than Go implementation
- **Status**: ✅ Production ready

## Build Requirements

- CMake 3.20+
- C++20 compiler (GCC 11+, Clang 14+, or MSVC 2022+)
- Go 1.25+ (for CGo integration)

## Quick Start

```bash
# Build all native libraries
make native-build

# Run with native libraries enabled
FETCHML_NATIVE_LIBS=1 go run ./...

# Run benchmarks
FETCHML_NATIVE_LIBS=1 go test -bench=. ./tests/benchmarks/
```

## Build Options

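```bash
# Run from the repository root; -S/-B create native/build if it is missing.

# Debug build with AddressSanitizer
cmake -S native -B native/build -DCMAKE_BUILD_TYPE=Debug -DENABLE_ASAN=ON

# Release build (optimized)
cmake -S native -B native/build -DCMAKE_BUILD_TYPE=Release

# Build a specific library (works with any CMake generator)
cmake --build native/build --target queue_index
```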

## Architecture

### Design Principles

1. **Selective optimization**: Only 2 of 80+ profiled functions became native libraries
2. **Batch-first APIs**: Minimize CGo overhead (~100ns/call)
3. **Zero-allocation hot paths**: Arena allocators, no malloc in critical sections
4. **C ABI for CGo**: Simple C structs, no C++ exceptions across boundary
5. **Cross-platform**: Runtime SIMD detection (ARMv8 / x86_64 SHA-NI)

### CGo Integration

```go
// #cgo LDFLAGS: -L${SRCDIR}/../../native/build -lqueue_index
// #include "../../native/queue_index/queue_index.h"
import "C"
```

### Error Handling

- C functions return `-1` for errors, positive values for success
- Use `qi_last_error()` / `fh_last_error()` for error messages
- Go code checks `rc < 0`, not `rc != 0`, because successful calls return positive values
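
The return-code convention above might be wrapped on the Go side roughly as follows. The sketch is pure Go so it runs without the native libraries: `fakeEnqueue` and `fakeLastError` are hypothetical stand-ins for a native call and for `qi_last_error()`, and the "returns the number of tasks accepted" semantics are an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// lastErr holds the most recent failure message, mimicking the state
// that a native *_last_error() function would expose.
var lastErr string

// fakeEnqueue is a hypothetical stand-in for a native call: it returns
// -1 on error and a positive count on success.
func fakeEnqueue(n int) int {
	if n <= 0 {
		lastErr = "batch must contain at least one task"
		return -1
	}
	return n
}

// fakeLastError is a hypothetical stand-in for qi_last_error().
func fakeLastError() string { return lastErr }

// enqueue converts the C-style return code into an idiomatic Go error.
// It tests rc < 0 because positive return codes mean success.
func enqueue(n int) (int, error) {
	rc := fakeEnqueue(n)
	if rc < 0 {
		return 0, errors.New(fakeLastError())
	}
	return rc, nil
}

func main() {
	if _, err := enqueue(0); err != nil {
		fmt.Println("error:", err)
	}
	if n, err := enqueue(3); err == nil {
		fmt.Println("accepted:", n)
	}
}
```

A check written as `rc != 0` would treat the successful call `enqueue(3)` as a failure, which is the mistake the convention guards against.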

## When to Add New C++ Libraries

**DO implement when:**
- Profile shows >90% syscall overhead
- Batch operations amortize CGo cost
- SIMD can provide 3x+ speedup
- Memory pressure is critical

**DON'T implement when:**
- Speedup <2x (CGo overhead negates gains)
- Single-file operations (per-call overhead too high)
- Team <3 backend engineers (maintenance burden)
- Complex error handling required

## History

**Implemented:**
- ✅ queue_index: Binary priority queue replacing JSON filesystem queue
- ✅ dataset_hash: SIMD SHA256 for artifact verification

**Deferred:**
- ⏸️ task_json_codec: 2-3x speedup not worth maintenance (small team)
- ⏸️ artifact_scanner: Go filepath.Walk faster for typical workloads
- ⏸️ streaming_io: Complexity exceeds benefit without io_uring

## Maintenance

**Build verification:**
```bash
make native-build
FETCHML_NATIVE_LIBS=1 make test
```

**Adding a new library:**
1. Create a subdirectory with its own `CMakeLists.txt`
2. Implement the C ABI in `.h` / `.cpp` files
3. Add the subdirectory to the root `CMakeLists.txt`
4. Create Go bridge in `internal/`
5. Add benchmarks in `tests/benchmarks/`
6. Document in this README

## Troubleshooting

**Library not found:**
- Ensure `native/build/lib*.dylib` (macOS) or `native/build/lib*.so` (Linux) exists
- Check that `DYLD_LIBRARY_PATH` (macOS) or `LD_LIBRARY_PATH` (Linux) includes the `native/build` directory

**CGo undefined symbols:**
- Verify C function names match exactly and are declared `extern "C"` (no C++ name mangling)
- Check `#include` paths are correct
- Rebuild: `make native-clean && make native-build`

**Performance regression:**
- Verify `FETCHML_NATIVE_LIBS=1` is set
- Check benchmark: `FETCHML_NATIVE_LIBS=1 go test -bench=BenchmarkQueue -v ./tests/benchmarks/`
- Profile with: `FETCHML_NATIVE_LIBS=1 go test -bench=. -cpuprofile=cpu.prof ./tests/benchmarks/`