fetch_ml/native_rust/queue_index/src/task.rs
Jeremie Fraeys 6949287fb3
feat(native_rust): implement BLAKE3 dataset_hash and priority queue_index
Implements two production-ready Rust native libraries:

## dataset_hash (BLAKE3-based hashing)
- FFI exports: ds_hash_file, ds_hash_directory_batch, ds_hash_directory_combined
- BLAKE3 hashing for files and directory trees
- Hidden file filtering (respects .hidden and _prefix files)
- Prometheus-compatible metrics export
- Comprehensive integration tests (12 tests)
- Benchmarks: hash_file_1kb (~14µs), hash_file_1mb (~610µs), dir_100files (~1.6ms)
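
The hidden-file filtering listed above could look roughly like the following sketch. The rule used here (skip any entry whose final path component starts with `.` or `_`) is an assumption read off the bullet, and `is_hidden` is a hypothetical helper name, not a function confirmed to exist in dataset_hash.

```rust
use std::path::Path;

/// Hypothetical helper: returns true when the final path component
/// starts with `.` or `_` (assumed rule, not confirmed by the source).
/// Parent directories are not inspected here.
pub fn is_hidden(path: &Path) -> bool {
    path.file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.starts_with('.') || name.starts_with('_'))
        .unwrap_or(false)
}

/// Keep only the paths that the assumed rule treats as visible.
pub fn visible_only<'a>(paths: &[&'a Path]) -> Vec<&'a Path> {
    paths.iter().copied().filter(|p| !is_hidden(p)).collect()
}
```

A directory walker would apply `is_hidden` to each entry before hashing it. If hidden directories should also prune their contents, the check has to run on every component during traversal rather than only on the file name.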

## queue_index (priority queue)
- FFI exports: 25+ functions matching C++ API
  - Lifecycle: qi_open, qi_close
  - Task ops: add_tasks, update_tasks, remove_tasks, get_task_by_id
  - Queue ops: get_next_batch, peek_next, mark_completed
  - Priority: get_next_priority_task, peek_priority_task
  - Query: get_all_tasks, get_tasks_by_status, get_task_count
  - Retry/DLQ: retry_task, move_to_dlq
  - Lease: renew_lease, release_lease
  - Maintenance: rebuild_index, compact_index
- BinaryHeap-based priority queue with correct Ord (max-heap)
- Memory-mapped storage with safe Rust wrappers
- Panic-safe FFI boundaries using catch_unwind
- Comprehensive integration tests (7 tests, 1 ignored for persistence)
- Benchmarks: add_100 (~60µs), get_10 (~24ns), priority (~5µs)
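
The "BinaryHeap-based priority queue with correct Ord (max-heap)" bullet can be illustrated with a minimal, std-only sketch. `HeapEntry`, `drain_in_order`, and both tie-break rules (older `created_at` first, then `id`) are assumptions made for illustration; the actual ordering used by queue_index is not shown in this file.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical heap entry; field names mirror `Task`, but this type is illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HeapEntry {
    pub priority: i64,
    pub created_at: i64,
    pub id: String,
}

impl Ord for HeapEntry {
    fn cmp(&self, other: &Self) -> Ordering {
        // Higher priority compares greater, so it pops first from the max-heap.
        self.priority
            .cmp(&other.priority)
            // Assumed tie-break: the older entry (smaller created_at) wins,
            // so the comparison is reversed.
            .then_with(|| other.created_at.cmp(&self.created_at))
            // Final tie-break on id keeps Ord consistent with the derived Eq.
            .then_with(|| other.id.cmp(&self.id))
    }
}

impl PartialOrd for HeapEntry {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Pop every entry and return the ids in scheduling order.
pub fn drain_in_order(entries: Vec<HeapEntry>) -> Vec<String> {
    let mut heap: BinaryHeap<HeapEntry> = entries.into_iter().collect();
    let mut order = Vec::with_capacity(heap.len());
    while let Some(entry) = heap.pop() {
        order.push(entry.id);
    }
    order
}
```

Implementing `Ord` directly and deriving `PartialOrd` from it, rather than the other way around, avoids the two orderings disagreeing, which `BinaryHeap` relies on.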

## Architecture
- Cargo workspace with shared common crate
- Criterion benchmarks for both crates
- Rust 1.85.0 toolchain pinned
- Zero compiler warnings
- All 19 tests passing

Compare: make compare-benchmarks (Rust/Go/C++ comparison)
2026-03-23 12:52:13 -04:00


//! Task definition and serialization

use serde::{Deserialize, Serialize};

/// Task structure - matches both C FFI and Go queue.Task
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct Task {
    pub id: String,
    pub job_name: String,
    pub priority: i64,
    /// Creation time in nanoseconds since the Unix epoch
    pub created_at: i64,
    /// Earliest retry time in nanoseconds since the Unix epoch; 0 means no retry is scheduled
    pub next_retry: i64,
    pub status: String,
    pub retries: u32,
}

impl Task {
    /// Create a new task with default values
    pub fn new(id: impl Into<String>, job_name: impl Into<String>) -> Self {
        Self {
            id: id.into(),
            job_name: job_name.into(),
            priority: 0,
            created_at: current_time(),
            next_retry: 0,
            status: "queued".to_string(),
            retries: 0,
        }
    }

    /// Serialize to JSON bytes
    pub fn to_json(&self) -> Result<Vec<u8>, serde_json::Error> {
        serde_json::to_vec(self)
    }

    /// Deserialize from JSON bytes
    pub fn from_json(data: &[u8]) -> Result<Self, serde_json::Error> {
        serde_json::from_slice(data)
    }

    /// Check if task is ready to be scheduled: it must be queued and its
    /// retry time, if any, must already have passed
    pub fn is_ready(&self) -> bool {
        self.status == "queued" && (self.next_retry == 0 || self.next_retry <= current_time())
    }
}

/// Current UTC time in nanoseconds since the Unix epoch; falls back to 0
/// when the value does not fit in an i64
fn current_time() -> i64 {
    chrono::Utc::now().timestamp_nanos_opt().unwrap_or(0)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_task_serialization() {
        let task = Task::new("task-1", "test-job");
        let json = task.to_json().unwrap();
        let deserialized = Task::from_json(&json).unwrap();
        // Task derives PartialEq, so the round trip can be checked field by field in one assert
        assert_eq!(task, deserialized);
    }

    #[test]
    fn test_task_ready() {
        let mut task = Task::new("task-1", "test-job");
        task.status = "queued".to_string();
        task.next_retry = 0;
        assert!(task.is_ready());

        task.next_retry = current_time() + 1_000_000_000; // 1 second in future
        assert!(!task.is_ready());
    }
}
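
The readiness rule in `is_ready` reads the wall clock internally, which is why the test above has to schedule a retry one second ahead. As a hedged sketch only, the same predicate can take the current time as a parameter so that boundary cases are testable with fixed values. `RetryState` and `is_ready_at` are hypothetical names, and the `"running"` status used in the usage note is an arbitrary non-queued value, not one confirmed by the source.

```rust
/// Hypothetical stand-in for `Task`, reduced to the fields `is_ready` reads.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RetryState {
    pub status: String,
    /// Nanoseconds since the Unix epoch; 0 means no retry is scheduled.
    pub next_retry: i64,
}

impl RetryState {
    /// Same predicate as `Task::is_ready`, but the caller supplies "now"
    /// (in nanoseconds) instead of the function reading the clock.
    pub fn is_ready_at(&self, now_nanos: i64) -> bool {
        self.status == "queued" && (self.next_retry == 0 || self.next_retry <= now_nanos)
    }
}
```

With a retry scheduled at `2_000`, `is_ready_at(1_999)` is false and `is_ready_at(2_000)` is true; changing the status to `"running"` makes it false regardless of the time. In the real module, a wrapper could pass `current_time()` to keep the existing `is_ready` signature.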