fetch_ml/native_rust/dataset_hash/include/dataset_hash.h
Jeremie Fraeys 6949287fb3
feat(native_rust): implement BLAKE3 dataset_hash and priority queue_index
Implements two production-ready Rust native libraries:

## dataset_hash (BLAKE3-based hashing)
- FFI exports: ds_hash_file, ds_hash_directory_batch, ds_hash_directory_combined
- BLAKE3 hashing for files and directory trees
- Hidden file filtering (respects .hidden and _prefix files)
- Prometheus-compatible metrics export
- Comprehensive integration tests (12 tests)
- Benchmarks: hash_file_1kb (~14µs), hash_file_1mb (~610µs), dir_100files (~1.6ms)

## queue_index (priority queue)
- FFI exports: 25+ functions matching C++ API
  - Lifecycle: qi_open, qi_close
  - Task ops: add_tasks, update_tasks, remove_tasks, get_task_by_id
  - Queue ops: get_next_batch, peek_next, mark_completed
  - Priority: get_next_priority_task, peek_priority_task
  - Query: get_all_tasks, get_tasks_by_status, get_task_count
  - Retry/DLQ: retry_task, move_to_dlq
  - Lease: renew_lease, release_lease
  - Maintenance: rebuild_index, compact_index
- BinaryHeap-based priority queue with correct Ord (max-heap)
- Memory-mapped storage with safe Rust wrappers
- Panic-safe FFI boundaries using catch_unwind
- Comprehensive integration tests (7 tests, 1 ignored for persistence)
- Benchmarks: add_100 (~60µs), get_10 (~24ns), priority (~5µs)

## Architecture
- Cargo workspace with shared common crate
- Criterion benchmarks for both crates
- Rust 1.85.0 toolchain pinned
- Zero compiler warnings
- All 19 tests passing

Compare: make compare-benchmarks (Rust/Go/C++ comparison)
2026-03-23 12:52:13 -04:00

28 lines, 691 B, C

#ifndef DATASET_HASH_H
#define DATASET_HASH_H

#include <stddef.h>
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

// Hash a directory and return combined digest
// dir: path to directory
// out_hash: output parameter, receives allocated hex string (caller must free with fh_free_string)
// Returns: 0 on success, -1 on error
int fh_hash_directory_combined(const char* dir, char** out_hash);

// Free a string previously returned by FFI functions
void fh_free_string(char* s);

// Get metrics in Prometheus format
// Returns: allocated string with metrics (caller must free with fh_free_string)
char* fh_get_metrics(void);

#ifdef __cplusplus
}
#endif

#endif // DATASET_HASH_H
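
The ownership rule in the comments above (every string returned by the library goes back through fh_free_string) can be sketched from the caller's side as follows. This is a minimal illustration, not part of the library: copy_directory_digest and stub_copy are hypothetical names. The three fh_* definitions inside the #ifndef are placeholder stand-ins so the sketch builds without the Rust library. They return 64 zero characters, on the assumption that the combined digest is a default-length BLAKE3 hash in hex, and they do no real hashing. Define USE_REAL_DATASET_HASH and link the real library to drop them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Declarations as they appear in dataset_hash.h. */
int fh_hash_directory_combined(const char* dir, char** out_hash);
void fh_free_string(char* s);
char* fh_get_metrics(void);

#ifndef USE_REAL_DATASET_HASH
/* HYPOTHETICAL stand-ins, not the real implementation. They only mimic the
 * documented contract (0 / -1 return, heap-allocated strings). */
static char* stub_copy(const char* text) {
    size_t n = strlen(text) + 1;
    char* s = malloc(n);
    if (s != NULL) memcpy(s, text, n);
    return s;
}

int fh_hash_directory_combined(const char* dir, char** out_hash) {
    if (dir == NULL || out_hash == NULL) return -1;
    char* hex = malloc(65);          /* 64 hex chars + NUL (assumed length) */
    if (hex == NULL) return -1;
    memset(hex, '0', 64);            /* placeholder digest, not a real hash */
    hex[64] = '\0';
    *out_hash = hex;
    return 0;
}

void fh_free_string(char* s) { free(s); }

char* fh_get_metrics(void) { return stub_copy("# stub metrics\n"); }
#endif

/* Hypothetical helper: hash `dir`, copy the hex digest into the caller's
 * buffer, and release the library-owned string before returning.
 * Returns 0 on success, -1 if hashing fails or the buffer is too small. */
int copy_directory_digest(const char* dir, char* buf, size_t buf_len) {
    assert(buf != NULL);             /* caller must supply a buffer */
    char* hex = NULL;
    if (fh_hash_directory_combined(dir, &hex) != 0 || hex == NULL) {
        return -1;
    }
    size_t n = strlen(hex) + 1;      /* bytes needed, including the NUL */
    int rc = -1;
    if (n <= buf_len) {
        memcpy(buf, hex, n);
        rc = 0;
    }
    fh_free_string(hex);             /* always free with the library's function */
    return rc;
}
```

A caller would declare something like char digest[128], call copy_directory_digest(path, digest, sizeof digest), and check for a 0 return before using the buffer. Copying into caller-owned memory keeps the fh_free_string call in one place, so the rest of the program never holds a pointer that must be freed by the library.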