
# CLI-Jupyter Integration Workflow

## Workflow Overview

This workflow integrates the FetchML CLI with Jupyter notebooks for seamless data science experiments.

## Step 1: Start Jupyter Server

```bash
# Build the container with ML tools
podman build -f ml-tools-runner.podfile -t ml-tools-runner .

# Start Jupyter server
podman run -d -p 8888:8888 --name ml-jupyter \
  -v "$(pwd)/workspace:/workspace:Z" \
  --entrypoint conda localhost/ml-tools-runner \
  run -n ml_env jupyter notebook --no-browser --ip=0.0.0.0 --port=8888 \
  --NotebookApp.token='' --NotebookApp.password='' --allow-root

# Access at http://localhost:8888
```
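The container can take a few seconds to come up, so it may help to poll the server before opening a browser. A minimal sketch in Python, assuming the defaults above (localhost, port 8888); the `jupyter_ready` helper name is ours, not part of FetchML:

```python
import time
import urllib.error
import urllib.request

def jupyter_ready(url="http://localhost:8888", timeout=30):
    """Poll the Jupyter server until it answers, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Any HTTP response (even 403) means the server is accepting connections
            with urllib.request.urlopen(url, timeout=2):
                return True
        except urllib.error.HTTPError:
            return True
        except (urllib.error.URLError, OSError):
            time.sleep(1)
    return False
```

With the token and password disabled as in the command above, a plain request to the root URL succeeds once the server is listening.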

## Step 2: Use CLI to Sync Projects

```bash
# From another terminal, sync your project
cd cli && ./zig-out/bin/ml sync ./my_project --queue

# Check status
./zig-out/bin/ml status
```

## Step 3: Run Experiments in Jupyter

```python
# In your Jupyter notebook
import mlflow
import wandb
import pandas as pd

# Start MLflow tracking
mlflow.start_run()
mlflow.log_param("model", "random_forest")
mlflow.log_metric("accuracy", 0.95)
mlflow.end_run()  # close the run so it is recorded as finished
```

## Step 4: Monitor with CLI

```bash
# Monitor jobs from CLI
./zig-out/bin/ml monitor

# View logs
./zig-out/bin/ml experiment log my_experiment
```