33 commits

Author SHA1 Message Date
Jeremie Fraeys
e08feae6ab
chore: migrate scripts from docker-compose v1 to v2
- Update all scripts to use 'docker compose' instead of 'docker-compose'
- Fix compose file paths after consolidation (test.yml, prod.yml)
- Update cleanup.sh to handle --profile debug and --profile smoke
- Update test fixtures to reference consolidated compose files
2026-03-04 13:22:26 -05:00
Jeremie Fraeys
dddc2913e1
chore(tools): update scripts, native libs, and documentation
Update tooling and documentation:
- Smoke test script with scheduler health checks
- Release cleanup script
- Native test scripts with Redis integration
- TUI SSH test script
- Performance regression detector with scheduler metrics
- Profiler with distributed tracing
- Native CMake with test targets
- Dataset hash tests
- Storage symlink resistance tests
- Configuration reference documentation updates
2026-02-26 12:08:58 -05:00
Jeremie Fraeys
6e0e7d9d2e
fix(smoke-test): copy promtail config file instead of directory
Copy just promtail-config.yml to temp root instead of entire monitoring/
directory. This fixes the mount error where promtail couldn't find its
config at the expected path.
2026-02-24 11:57:35 -05:00
Jeremie Fraeys
cebcb6115f
fix(smoke-test): add FETCHML_REPO_ROOT to env file
Ensure FETCHML_REPO_ROOT is set in the env file passed to docker-compose.
This fixes path resolution so fallback paths don't incorrectly use parent directory.
2026-02-24 11:48:10 -05:00
Jeremie Fraeys
ce4106a837
fix(smoke-test): copy monitoring configs to temp directory
Promtail mounts monitoring configs from repo root which fails in Colima:

- Copy monitoring/ directory to temp SMOKE_TEST_DATA_DIR
- Update promtail volume path to use SMOKE_TEST_DATA_DIR for configs
- This ensures all mounts are from accessible temp directories
2026-02-24 11:40:32 -05:00
Jeremie Fraeys
225ef5bfb5
fix(smoke-test): use actual env file instead of process substitution
Process substitution <(echo ...) doesn't work with docker-compose.
Write the env file to an actual temp file instead.
2026-02-24 11:38:18 -05:00
Jeremie Fraeys
bff2336db2
fix(smoke-test): use temp directory for smoke test data
Use /tmp for smoke test data to avoid file sharing issues on macOS/Colima:

- smoke-test.sh: Create temp dir with mktemp, export SMOKE_TEST_DATA_DIR
- docker-compose.dev.yml: Use SMOKE_TEST_DATA_DIR with fallback to data/dev
- Remove file sharing permission checks (no longer needed with tmp)

This avoids Docker Desktop/Colima file sharing permission issues entirely
by using a system temp directory that's always accessible.
2026-02-24 11:37:45 -05:00
Jeremie Fraeys
d3a861063f
fix(smoke-test): add Colima-specific file sharing instructions
Detect if user is running Colima and provide appropriate fix instructions:

- Check for colima command presence
- If Colima detected: suggest virtiofs/sshfs mount options
- Show colima.yaml mount configuration example
- Include verification command: colima ssh -- ls ...

Maintains Docker Desktop instructions for non-Colima users.
2026-02-24 11:35:58 -05:00
Jeremie Fraeys
00f938861c
fix(smoke-test): add Docker file sharing permission check for macOS
Add pre-flight check to detect Docker Desktop file sharing issues:

- After creating data directories, verify Docker can access them
- If access fails, print helpful error message with fix instructions
- Directs users to Docker Desktop Settings -> Resources -> File sharing

Prevents confusing 'operation not permitted' errors during smoke tests.
2026-02-24 11:35:23 -05:00
Jeremie Fraeys
305e1b3f2e
ci: update test and benchmark scripts
**scripts/benchmarks/run-benchmarks-local.sh:**
- Add support for native library benchmarks

**scripts/ci/test.sh:**
- Update CI test commands for new test structure

**scripts/dev/smoke-test.sh:**
- Improve smoke test reliability and output
2026-02-23 18:04:01 -05:00
Jeremie Fraeys
6d200b5ac2
fix(docker): Use named volume for Redis to fix permission errors
Replace bind mount with Docker named volume for Redis data.

This fixes 'operation not permitted' errors on macOS Docker Desktop,
where bind mounts fail due to file sharing restrictions.
2026-02-23 14:20:23 -05:00
Jeremie Fraeys
0ea2ac00cd
fix(scripts): Create data directories before starting Docker
Fix Docker mount permission error by creating data/dev/* directories
before docker-compose up, preventing 'operation not permitted' error.
2026-02-23 14:17:37 -05:00
Jeremie Fraeys
fa97521488
chore(scripts): Update CI, dev, release, and testing scripts
2026-02-23 14:13:55 -05:00
Jeremie Fraeys
4b2ee75072
chore: move test-native-with-redis.sh to scripts/testing/
2026-02-21 13:58:19 -05:00
Jeremie Fraeys
c89d970210
refactor: migrate from env var to build tags for native libs
Replace FETCHML_NATIVE_LIBS=1 environment variable with -tags native_libs:

Changes:
- internal/queue/native_queue.go: UseNativeQueue is now const true
- internal/queue/native_queue_stub.go: UseNativeQueue is now const false
- build/docker/simple.Dockerfile: Add -tags native_libs to go build
- deployments/docker-compose.dev.yml: Remove FETCHML_NATIVE_LIBS env var
- native/README.md: Update documentation for build tags
- scripts/test-native-with-redis.sh: New test script with Redis via docker-compose

Benefits:
- Compile-time enforcement (no runtime checks needed)
- Cleaner deployment (no env var management)
- Type safety (const vs var)
- Simpler testing with docker-compose Redis integration
2026-02-21 13:43:58 -05:00
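The build-tag approach described in the commit above can be sketched as follows. This is a minimal illustration, not the repository's actual code: only the `UseNativeQueue` constant and the `native_libs` tag come from the commit message, while `queueBackend` and the `package main` layout are hypothetical.

```go
//go:build !native_libs

// Sketch of a stub file such as native_queue_stub.go: it is compiled only
// when the native_libs build tag is absent.
package main

import "fmt"

// UseNativeQueue is a compile-time constant. A counterpart file guarded by
// "//go:build native_libs" would declare it as true instead, so exactly one
// of the two declarations is ever compiled into a given binary.
const UseNativeQueue = false

// queueBackend is a hypothetical helper that shows how callers can branch on
// the constant; the compiler can eliminate the unused branch.
func queueBackend() string {
	if UseNativeQueue {
		return "native"
	}
	return "pure-go"
}

func main() {
	fmt.Println(queueBackend())
}
```

Built with plain `go build`, this file is selected; built with `go build -tags native_libs`, it is excluded and the counterpart file would supply the constant.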
Jeremie Fraeys
94020e4ca4
chore: move detect_native.go and setup_monitoring.py to dev/
2026-02-18 17:57:57 -05:00
Jeremie Fraeys
8b75f71a6a
refactor: reorganize scripts into categorized structure
Consolidate 26+ scattered scripts into maintainable hierarchy:

New Structure:
- ci/          CI/CD validation (checks.sh, test.sh, verify-paths.sh)
- dev/         Development workflow (smoke-test.sh, manage-artifacts.sh)
- release/     Release preparation (cleanup.sh, prepare.sh, sanitize.sh, verify.sh, verify-checksums.sh)
- testing/     Test infrastructure (unchanged)
- benchmarks/  Performance tools (track-performance.sh)
- maintenance/ System cleanup (unchanged)
- lib/         Shared functions (unchanged)

Key Changes:
- Unified 6 cleanup-*.sh scripts into release/cleanup.sh with targets
- Merged smoke-test-native.sh into dev/smoke-test.sh --native flag
- Renamed scripts to follow lowercase-hyphen convention
- Moved root-level scripts to appropriate categories
- Updated all Makefile references
- Updated scripts/README.md with new structure

Script count: 26 → 17 (35% reduction)

Breaking Changes:
- Old paths no longer exist, update any direct script calls
- Use make targets (e.g., make ci-checks) for stability
2026-02-18 17:56:59 -05:00
Jeremie Fraeys
b4672a6c25
feat: add TUI SSH usability testing infrastructure
Add comprehensive testing for TUI usability over SSH in production-like environment:

Infrastructure:
- Caddy reverse proxy config for WebSocket and API routing
- Docker Compose with SSH test server container
- TUI test configuration for smoke testing

Test Harness:
- SSH server Go test fixture with container management
- TUI driver with PTY support for automated input/output testing
- 8 E2E tests covering SSH connectivity, TERM propagation,
  API/WebSocket connectivity, and TUI configuration

Scripts:
- SSH key generation for test environment
- Manual testing script with interactive TUI verification

The setup allows automated verification that the BubbleTea TUI works
correctly over SSH with proper terminal handling, alt-screen buffer,
and mouse support through Caddy reverse proxy.
2026-02-18 17:48:02 -05:00
Jeremie Fraeys
e127f97442
chore: implement centralized path registry and file organization conventions
Add PathRegistry for centralized path management:
- Create internal/config/paths.go with PathRegistry type
- Binary paths: BinDir(), APIServerBinary(), WorkerBinary(), etc.
- Data paths: DataDir(), JupyterStateDir(), ExperimentsDir()
- Config paths: ConfigDir(), APIServerConfig()
- Helper methods: EnsureDir(), EnsureDirSecure(), FileExists()
- Auto-detect repo root by looking for go.mod

Update .gitignore for root protection:
- Add explicit /api-server, /worker, /tui, /data_manager rules
- Add /coverage.out and .DS_Store to root protection
- Prevents accidental commits of binaries to root

Add path verification script:
- Create scripts/verify-paths.sh
- Checks for binaries in root directory
- Checks for .DS_Store files
- Checks for coverage.out in root
- Verifies data/ is gitignored
- Returns exit code 1 on violations

Cleaned .DS_Store files from repository
2026-02-18 16:48:50 -05:00
Jeremie Fraeys
c9b6532dfb
fix: remove accidentally committed api-server binary
2026-02-18 16:31:40 -05:00
Jeremie Fraeys
8ecdd36155
test(integration): add websocket queue and hash benchmarks
- Add websocket queue integration test
- Add worker hash benchmark test
- Add native detection script
2026-02-18 12:46:06 -05:00
Jeremie Fraeys
4813228a0c
chore: make file size check a warning instead of failure
2026-02-17 20:32:50 -05:00
Jeremie Fraeys
3187ff26ea
refactor: complete maintainability phases 1-9 and fix all tests
Test fixes (all 41 test packages now pass):
- Fix ComputeTaskProvenance - add dataset_specs JSON output
- Fix EnforceTaskProvenance - populate all metadata fields in best-effort mode
- Fix PrewarmNextOnce - preserve prewarm state when queue empty
- Fix RunManifest directory creation in SetupJobDirectories
- Add ManifestWriter to test worker (simpleManifestWriter)
- Fix worker ID mismatch (use cfg.WorkerID)
- Fix WebSocket binary protocol responses
- Implement all WebSocket handlers: QueueJob, QueueJobWithSnapshot, StatusRequest,
  CancelJob, Prune, ValidateRequest (with run manifest validation), LogMetric,
  GetExperiment, DatasetList/Register/Info/Search

Maintainability phases completed:
- Phases 1-6: Domain types, error system, config boundaries, worker/API/queue splits
- Phase 7: TUI cleanup - reorganize model package (jobs.go, messages.go, styles.go, keys.go)
- Phase 8: MLServer unification - consolidate worker + TUI into internal/network/mlserver.go
- Phase 9: CI enforcement - add scripts/ci-checks.sh with 5 checks:
  * No internal/ -> cmd/ imports
  * domain/ has zero internal imports
  * File size limit (500 lines, rigid)
  * No circular imports
  * Package naming conventions

Documentation:
- Add docs/src/file-naming-conventions.md
- Add make ci-checks target

Lines changed: +756/-36 (WebSocket fixes), +518/-320 (TUI), +263/-20 (Phase 8-9)
2026-02-17 20:32:14 -05:00
Jeremie Fraeys
355d2e311a
docs: update README and CHANGELOG
- Update project documentation with latest features
- Update manage-artifacts.sh script
2026-02-16 20:38:57 -05:00
Jeremie Fraeys
6cc02b5efc
docs: add native libraries documentation and smoke tests
- Add comprehensive native-libraries.md documentation
- Add smoke-test-native.sh for testing native library builds
- Document build process, architecture, and testing strategy
2026-02-16 20:38:46 -05:00
Jeremie Fraeys
1dcc1e11d5
chore(build): update build system, scripts, and additional tests
- Update Makefile with native build targets (preparing for C++)
- Add profiler and performance regression detector commands
- Update CI/testing scripts
- Add additional unit tests for API, jupyter, queue, manifest
2026-02-12 12:05:55 -05:00
Jeremie Fraeys
f726806770
chore(ops): reorganize deployments/monitoring and remove legacy scripts
2026-01-05 12:31:26 -05:00
Jeremie Fraeys
cd5640ebd2
Slim and secure: move scripts, clean configs, remove secrets
- Move ci-test.sh and setup.sh to scripts/
- Trim docs/src/zig-cli.md to current structure
- Replace hardcoded secrets with placeholders in configs
- Update .gitignore to block .env*, secrets/, keys, build artifacts
- Slim README.md to reflect current CLI/TUI split
- Add cleanup trap to ci-test.sh
- Ensure no secrets are committed
2025-12-07 13:57:51 -05:00
Jeremie Fraeys
03cead6319
Organize docker-compose files and fix test output paths
- Move docker-compose.prod.yml and docker-compose.homelab-secure.yml to deployments/
- Create deployments/README.md with usage instructions
- Update test scripts to use new deployment paths
- Fix performance regression detection to output to tests/bin/
- All test outputs now properly organized in tests/bin/
2025-12-06 13:45:05 -05:00
Jeremie Fraeys
5a19358d00
Organize configs and scripts, create testing protocol
- Reorganize configs into environments/, workers/, deprecated/ folders
- Reorganize scripts into testing/, deployment/, maintenance/, benchmarks/ folders
- Add comprehensive testing guide documentation
- Add new Makefile targets: test-full, test-auth, test-status
- Update script paths in Makefile to match new organization
- Create testing protocol documentation
- Add cleanup status checking functionality

Testing framework now includes:
- Quick authentication tests (make test-auth)
- Full test suite runner (make test-full)
- Cleanup status monitoring (make test-status)
- Comprehensive documentation and troubleshooting guides
2025-12-06 13:08:15 -05:00
Jeremie Fraeys
c80e01b752
Add comprehensive self-cleaning system
- Add cleanup.sh script with dry-run, force, and all options
- Add auto-cleanup service setup for macOS (launchd) and Linux (systemd)
- Add cleanup-status.sh for monitoring Docker resources
- Add Makefile targets: self-cleanup, auto-cleanup
- Features colored output, confirmation prompts, and detailed logging
- Auto-cleanup runs daily to keep system clean
- Status monitoring shows resources and service state
2025-12-06 12:40:35 -05:00
Jeremie Fraeys
ea15af1833
Fix multi-user authentication and clean up debug code
- Fix YAML tags in auth config struct (json -> yaml)
- Update CLI configs to use pre-hashed API keys
- Remove double hashing in WebSocket client
- Fix port mapping (9102 -> 9103) in CLI commands
- Update permission keys to use jobs:read, jobs:create, etc.
- Clean up all debug logging from CLI and server
- All user roles now authenticate correctly:
  * Admin: Can queue jobs and see all jobs
  * Researcher: Can queue jobs and see own jobs
  * Analyst: Can see status (read-only access)

Multi-user authentication is now fully functional.
2025-12-06 12:35:32 -05:00
Jeremie Fraeys
bb25743b0f
feat: add comprehensive setup scripts and management tools
- Add production setup scripts for automated deployment
- Include monitoring setup and configuration validation
- Add legacy setup scripts for various Linux distributions
- Implement Bitwarden integration for secure credential management
- Add development and production environment setup
- Include comprehensive management tools and utilities
- Add shell script library with common functions

Provides complete automation for setup, deployment, and management
of FetchML platform in development and production environments.
2025-12-04 16:55:04 -05:00