Add Known Limitations section to AGENTS.md documenting:
- AMD GPU support is not implemented (use NVIDIA, Apple Silicon, or CPU)
- Stress testing of gang allocation across 100+ nodes is not yet implemented
- Podman-in-Docker CI requires privileged mode and is not yet automated
- Error handling patterns for unimplemented features
- Container usage rules (Docker for testing/deployments, Podman for experiments)
- Error codes table (NOT_IMPLEMENTED, NOT_FOUND, INVALID_CONFIGURATION)

Update testing documentation to reflect new test locations:
- Unit tests moved from tests/unit/ to internal/ (Go convention)
- Update all test file path references in security testing docs
- Create docker-tests.yml for merge-to-main CI pipeline
- Add mock GPU test matrix (NVIDIA, Metal, CPU-only)
- Add AGENTS.md with container architecture rules:
  * Docker for CI/CD testing and deployments
  * Podman for ML experiment isolation only
- Update .gitignore to track AGENTS.md