Commit graph

8 commits

Author SHA1 Message Date
Jeremie Fraeys
7880ea8d79
refactor: reorganize podman directory structure
Organize podman/ directory into logical subdirectories:

New structure:
- docs/          - ML_TOOLS_GUIDE.md, jupyter_workflow.md
- configs/       - environment*.yml, security_policy.json
- containers/    - *.dockerfile, *.podfile
- scripts/       - *.sh, *.py (secure_runner, cli_integration, etc.)
- jupyter/       - jupyter_cookie_secret (flattened from jupyter_runtime/runtime/)
- workspace/     - Example projects (cleaned of temp files)

Cleaned workspace:
- Removed .DS_Store, mlflow.db, cache/
- Removed duplicate cli_integration.py

Removed unnecessary nesting:
- Flattened jupyter_runtime/runtime/ to just jupyter/

Improves maintainability by grouping files by purpose and eliminating root directory clutter.
2026-02-18 16:40:46 -05:00
Jeremie Fraeys
8271277dc3
feat: implement research-grade maintainability phases 2, 5, 8, 10
Phase 2: Deterministic Manifests
- Add manifest.Validator with required field checking
- Support Validate() and ValidateStrict() modes
- Integrate validation into worker executor before execution
- Block execution if manifest missing commit_id or deps_manifest_sha256
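
A minimal sketch of the required-field check described above, written in Python for illustration only. The commit does not show the validator's real API: the function names `validate` and `validate_strict`, the `ManifestError` type, and the lenient-versus-strict split (report problems versus raise) are all assumptions. Only the two required field names come from the commit message.

```python
# Hypothetical sketch; not the project's manifest.Validator.
REQUIRED_FIELDS = ("commit_id", "deps_manifest_sha256")


class ManifestError(ValueError):
    """Raised when a manifest must not be executed (assumed error type)."""


def validate(manifest: dict) -> list:
    """Return the required fields that are missing or empty."""
    return [field for field in REQUIRED_FIELDS if not manifest.get(field)]


def validate_strict(manifest: dict) -> None:
    """Raise ManifestError if any required field is missing or empty."""
    missing = validate(manifest)
    if missing:
        raise ManifestError("manifest missing: " + ", ".join(missing))


def run_job(manifest: dict) -> str:
    """Assumed executor hook: validate before doing any work."""
    validate_strict(manifest)
    return "executed " + manifest["commit_id"]
```

With this shape, a caller that only wants a report can use `validate`, while an executor that must block on bad input calls `validate_strict` first.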

Phase 5: Pinned Dependencies
- Add hermetic.dockerfile template with pinned system deps
- Frozen package versions: libblas3, libcudnn8, etc.
- Support for deps_manifest.json and requirements.txt with hashes
- Image tagging strategy: deps-<first-8-of-sha256>
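
The tagging scheme above can be sketched as follows. This is an illustration under assumptions: the function name `deps_image_tag` is hypothetical, and hashing the raw bytes of the dependency manifest is a guess at what the SHA-256 is computed over. Only the `deps-<first-8-of-sha256>` format comes from the commit message.

```python
import hashlib


def deps_image_tag(manifest_bytes: bytes) -> str:
    """Build a tag of the form deps-<first 8 hex chars of the SHA-256>."""
    digest = hashlib.sha256(manifest_bytes).hexdigest()
    return "deps-" + digest[:8]


# Identical manifest bytes always map to the same tag, so an image
# built for one set of pinned dependencies can be found again by tag.
tag = deps_image_tag(b'{"numpy": "1.26.4"}')  # example input, not from the repo
```

Eight hex characters keep tags short; a collision between two different manifests is possible in principle, so the full digest would still be the authoritative identifier.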

Phase 8: Tests as Specifications
- Add queue_spec_test.go with executable scheduler specs
- Document priority ordering (higher first)
- Document FIFO tiebreaker for same priority
- Test cases for negative/zero priorities
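
The ordering rules above (higher priority first, FIFO among equal priorities, negative and zero priorities allowed) can be sketched with a heap and an insertion counter. This is not the scheduler exercised by queue_spec_test.go; the class name `JobQueue` and its methods are hypothetical, and Python is used only for illustration.

```python
import heapq
import itertools


class JobQueue:
    """Priority queue: higher priority pops first, ties pop in FIFO order."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # monotonically increasing insertion index

    def push(self, job, priority=0):
        # heapq is a min-heap, so negate the priority to pop the largest first.
        # The sequence number breaks ties in arrival order and keeps the
        # comparison from ever reaching the job object itself.
        heapq.heappush(self._heap, (-priority, next(self._seq), job))

    def pop(self):
        _, _, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)


queue = JobQueue()
for name, priority in [("a", 1), ("b", 5), ("c", 5), ("d", 0), ("e", -3)]:
    queue.push(name, priority)
order = [queue.pop() for _ in range(len(queue))]
```

Here `b` and `c` share the highest priority and leave in arrival order, followed by `a`, then the zero-priority `d`, then the negative-priority `e`.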

Phase 10: Local Dev Parity
- Create root-level docker-compose.dev.yml
- Simplified from deployments/ for quick local dev
- Redis + API server + Worker with hot reload volumes
- Debug ports: 9101 (API), 6379 (Redis)
2026-02-18 15:34:28 -05:00
Jeremie Fraeys
f726806770 chore(ops): reorganize deployments/monitoring and remove legacy scripts 2026-01-05 12:31:26 -05:00
Jeremie Fraeys
7312451cfe Test and verify CLI-Jupyter workflow integration
- Successfully tested Jupyter notebook server with ML tools
- Verified all 6 ML tools working; versions recorded for five of them: MLflow 3.7.0, Streamlit 1.52.1, Dash 3.3.0, Panel 1.8.4, Bokeh 3.8.1
- MLflow experiment tracking working (created run ID: 25e6b467101845f1ab577c9cfe553c9c)
- CLI integration helper copied to workspace
- Fixed Jupyter permission issues with proper directory setup
2025-12-06 16:01:03 -05:00
Jeremie Fraeys
34c632dcde Create CLI-Jupyter integration workflow
- Add jupyter_launcher.sh script to start Jupyter with ML tools
- Create cli_integration.py helper for CLI operations
- Add sample notebook structure for experiments
- Create workflow documentation for seamless data science integration
2025-12-06 15:59:08 -05:00
Jeremie Fraeys
5cf16ac27d Clean up podman directory and verify Jupyter integration
- Remove redundant requirements file
- Test and verify Jupyter notebook 7.5.0 works
- ML tools container successfully built with all tools
- All 6 ML tools (MLflow, WandB, Streamlit, Dash, Panel, Bokeh) working
2025-12-06 15:56:01 -05:00
Jeremie Fraeys
3178cdf575 Enable ML tools integration for data scientists
- Add MLflow, WandB, Streamlit, Dash, Panel, Bokeh to environment.yml
- Update security policy to allow network access for ML tools
- Modify secure_runner.py to check tool permissions
- Add test script and usage guide
- Enable localhost network access for dashboard tools
2025-12-06 15:49:21 -05:00
Jeremie Fraeys
4aecd469a1 feat: implement comprehensive monitoring and container orchestration
- Add Prometheus, Grafana, and Loki monitoring stack
- Include pre-configured dashboards for ML metrics and logs
- Add Podman container support with security policies
- Implement ML runtime environments for multiple frameworks
- Add containerized ML project templates (PyTorch, TensorFlow, etc.)
- Include secure runner with isolation and resource limits
- Add comprehensive log aggregation and alerting
2025-12-04 16:54:49 -05:00