Commit graph

33 commits

Author SHA1 Message Date
Jeremie Fraeys
dbe7b1b6b2
feat(docker): add timezone mounts to all containers for log sync
Add /etc/localtime:/etc/localtime:ro volume mount to:
- alertmanager, authelia, traefik
- exporters (node-exporter, cadvisor)
- fail2ban, lldap, postfix
- forgejo, forgejo_runner
- grafana, loki, prometheus
- watchtower, app_core (postgres, redis)

Ensures container logs use host timezone for consistent timestamps.
2026-03-06 15:13:52 -05:00
Jeremie Fraeys
e2f732c0f5
infra: cleanup repository and add rollback documentation
- Remove unimplemented placeholder roles (airflow, spark)
- Delete cache files (__pycache__, .DS_Store) and generated inventory
- Remove outdated INFRA_GAP_ANALYSIS.md (functionality now in README)
- Standardize DISABLED comments for monitoring stack (Prometheus, Loki, Grafana)
- Add ROLLBACK.md with comprehensive recovery procedures
- Expand vault.example.yml with all backup and alerting variables
- Update README with complete vault variables documentation
2026-03-06 14:40:56 -05:00
Jeremie Fraeys
0fd3b4f9d0
refactor(apps): update forgejo and backups task configurations
2026-03-06 14:31:13 -05:00
Jeremie Fraeys
0cc53c9976
refactor(infrastructure): update traefik, firewall, docker, watchtower configurations
2026-03-06 14:31:02 -05:00
Jeremie Fraeys
8c834ee7d7
refactor(monitoring): update exporters, loki, and prometheus configs
- Update exporters docker-compose configuration
- Modify Loki templates for log aggregation
- Adjust Prometheus configuration and templates

Part of: Monitoring stack maintenance
2026-03-06 14:30:20 -05:00
Jeremie Fraeys
0cad272d46
refactor(hardening): update security handlers and tasks
- Update hardening handlers for service restart management
- Modify hardening tasks for improved security configurations
- Align with container scanning integration

Part of: Infrastructure hardening improvements
2026-03-06 14:30:01 -05:00
Jeremie Fraeys
dc722848c5
feat(backups): add backup verification automation
- Add systemd service and timer for backup verification
- Add backup-verify.sh script for integrity checks
- Schedule periodic verification of backup archives

Implements: Automated backup integrity validation
2026-03-06 14:27:29 -05:00
Jeremie Fraeys
0eb8c1b139
feat(hardening): add container security scanning with Trivy
- Add container-scanning.yml task file for vulnerability scans
- Add systemd timer and service for scheduled scans
- Add container-security-scan.sh script for manual scans
- Integrate Trivy for Docker image vulnerability detection

Implements: Automated container security monitoring
2026-03-06 14:27:20 -05:00
Jeremie Fraeys
ac8b0b9abd
fix(alertmanager): use domain-based email for alerts
- Change default ALERTMANAGER_EMAIL_TO from admin@localhost to domain-based
- Use alerts@auth.jfraeys.com as default (configurable via env/vault)
- Remove hardcoded localhost email reference

Fixes: Alert delivery to proper domain email instead of localhost
2026-03-06 14:25:52 -05:00
Jeremie Fraeys
5791172575
feat(grafana): add SMTP configuration for email alerts
- Enable SMTP with GF_SMTP_ENABLED: true
- Configure internal Postfix relay (postfix:25)
- Set FROM address to grafana@grafana.jfraeys.com
- Disable TLS verification for internal relay (GF_SMTP_SKIP_VERIFY)
- Clear username/password for unauthenticated internal relay

Note: Grafana role currently commented out in playbook (1GB node constraint)
2026-03-06 14:25:43 -05:00
Jeremie Fraeys
465aed31c6
feat(forgejo): add SMTP configuration for email notifications
- Enable mailer with protocol: smtp
- Configure internal Postfix relay (postfix:25)
- Set FROM address to forgejo@git.jfraeys.com
- Use Jinja2 variable for customizable mailer_from

Enables: Password reset emails, issue notifications, webhook alerts
2026-03-06 14:25:36 -05:00
Jeremie Fraeys
6837683608
feat(lldap): add container healthcheck
- Add healthcheck using wget to /health endpoint
- Set interval: 30s, timeout: 3s, retries: 3, start_period: 10s
2026-03-06 14:25:23 -05:00
Jeremie Fraeys
3e0e97a00c
fix(postfix): enable TLS and fix Postmark authentication
- Add Python script to extract certificates from Traefik acme.json
- Mount extracted certs to /etc/ssl in container for TLS support
- Enable smtpd_tls_security_level: may for incoming STARTTLS
- Remove failed_when: false on cert extraction to catch failures early
- Fix relayhost username to default to password (Postmark server token auth)
- Change default Postmark port from 2525 to 587 (blocked on some networks)
- Create SSL directory before extraction

Fixes: SMTP authentication failures and enables TLS for Authelia password reset
2026-03-06 14:25:10 -05:00
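The certificate-extraction step described in commit 3e0e97a00c above could look roughly like the following. This is a minimal sketch, not the script from the repository: it assumes the usual Traefik v2 `acme.json` layout (one top-level key per certificate resolver, each holding a `Certificates` list with base64-encoded `certificate` and `key` fields), and the function name `extract_cert`, the resolver name `letsencrypt`, and the domain `mail.example.com` are hypothetical.

```python
import base64


def extract_cert(acme: dict, domain: str) -> tuple[bytes, bytes]:
    """Return (certificate_pem, key_pem) for `domain` from parsed acme.json data.

    Assumes the Traefik v2 layout: {resolver: {"Certificates": [...]}}.
    """
    for resolver in acme.values():
        # A resolver with no issued certificates may hold null here.
        for entry in (resolver or {}).get("Certificates") or []:
            names = [entry["domain"]["main"]]
            names += entry["domain"].get("sans") or []
            if domain in names:
                # Both PEM blobs are stored base64-encoded.
                return (
                    base64.b64decode(entry["certificate"]),
                    base64.b64decode(entry["key"]),
                )
    # Fail loudly instead of continuing without a certificate.
    raise LookupError(f"no certificate for {domain} in acme.json")


# Hypothetical sample data standing in for a real acme.json file.
sample = {
    "letsencrypt": {
        "Certificates": [
            {
                "domain": {"main": "mail.example.com"},
                "certificate": base64.b64encode(b"CERT").decode(),
                "key": base64.b64encode(b"KEY").decode(),
            }
        ]
    }
}
cert, key = extract_cert(sample, "mail.example.com")
```

Raising on a missing domain mirrors the commit's intent of removing `failed_when: false` so that extraction failures surface early. A real script would load the file with `json.load` and write the two PEM blobs into the directory mounted at `/etc/ssl`.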
Jeremie Fraeys
64defbd528
fix(authelia): resolve 502 error and SMTP authentication issues
- Remove read_only from docker-compose to fix healthcheck file creation
- Add container healthcheck for proper monitoring
- Disable SMTP auth for internal Postfix connections (username/password cleared)
- Remove NoTLS workaround now that Postfix has proper TLS
- Set startup_check_address to domain-based email (admin@auth.jfraeys.com)
- Fix conditional SMTP username/password in configuration template

Fixes: auth.jfraeys.com 502 Bad Gateway and password reset email failures
2026-03-06 14:24:56 -05:00
Jeremie Fraeys
74fb183b7f
chore(deps): bump watchtower to v1.14 and update Docker API version
- Update watchtower from 1.7.1 to 1.14
- Set DOCKER_API_VERSION to 1.44 for compatibility
2026-03-06 10:31:58 -05:00
Jeremie Fraeys
0a85b23a33
refactor(monitoring): update Alertmanager and exporter configurations
- Simplify Alertmanager to use localhost:25 by default (Postfix)
- Update node-exporter and cadvisor compose configurations
- Bump Loki, Grafana, Prometheus image versions
2026-03-06 10:31:52 -05:00
Jeremie Fraeys
1a7cde2939
feat(forgejo): add AI scrapers blocklist, OIDC config, and UI settings
- Add AI scrapers robots.txt update script with weekly cron job
- Add OIDC group claim and admin group configuration for Authelia
- Add UI settings (SHOW_USER_EMAIL: false)
- Increase memory limit to 512M
2026-03-06 10:31:46 -05:00
Jeremie Fraeys
6ea9c060bd
feat(postfix): configure Postmark SMTP relay for transactional email
- Change default relay port from 587 to 2525 (Postmark)
- Add Docker provider environment variables for API version compatibility
- Configure for Postmark server token authentication
2026-03-06 10:31:39 -05:00
Jeremie Fraeys
6bf29f90e6
fix(traefik): add Docker provider and file provider fallback for service discovery
- Add vault vars include with traefik tag for CF_DNS_API_TOKEN availability
- Add Docker provider socket and API version to home compose
- Add Forgejo router to file provider as fallback (Docker provider broken due to API version mismatch)
- Fixes 404 errors on git.jfraeys.com when Docker provider fails
2026-03-06 10:31:05 -05:00
Jeremie Fraeys
2ce1af3b1e
Update Traefik reverse proxy configuration
- Enhance home-docker-compose.yml template with improved networking
- Update deployment tasks for better label handling
- Improve TLS certificate verification flow
2026-02-21 18:31:25 -05:00
Jeremie Fraeys
b9c5cdff12
Add app deployer role for automated deployments
- Systemd service and timer for deployment orchestration
- Webhook listener for Git-triggered deployments
- Forgejo Actions workflow for CI/CD pipeline
- Deployment scripts with rollback capability
- Deploy token validation for security
2026-02-21 18:31:12 -05:00
Jeremie Fraeys
e364538206
Update Forgejo and runner with new features
- Add Redis cache support to Forgejo for improved performance
- Add AI scrapers blocking with update script and robots.txt
- Update Forgejo runner tasks with improved caching support
- Add OIDC authentication configuration tasks
2026-02-21 18:31:06 -05:00
Jeremie Fraeys
e4634484f8
Update authentication stack (Authelia, LLDAP)
- Update Authelia configuration template for OIDC and access control
- Enhance Authelia deployment tasks
- Update LLDAP deployment tasks
2026-02-21 18:31:01 -05:00
Jeremie Fraeys
ed6101be76
Enhance monitoring stack (Prometheus, Grafana)
- Add Prometheus alert rules configuration (alerts.yml.j2)
- Update Prometheus docker-compose and main configuration
- Add Grafana tasks for improved deployment and verification
- Integrate Alertmanager with Prometheus for alerting pipeline
2026-02-21 18:30:57 -05:00
Jeremie Fraeys
7d66552482
Add Alertmanager role for Prometheus alerting
- Docker Compose deployment for Alertmanager v0.27.0
- Optional Discord webhook integration for notifications
- Persistent storage for alert state
2026-02-21 18:30:51 -05:00
Jeremie Fraeys
78ad592664
Add core infrastructure security and utility roles
- Add firewall role for UFW/iptables management
- Add fail2ban role for intrusion prevention with Docker-aware jails
- Add postfix role for mail relay capabilities
- Add backups role for automated infrastructure backups
  - systemd timer for scheduled backups
  - Backup scripts for Docker volumes and configurations
2026-02-21 18:30:42 -05:00
Jeremie Fraeys
d36d3db10d
Add Redis cache to Forgejo
2026-02-21 18:27:04 -05:00
Jeremie Fraeys
0c6d09abcd
fix(ssh): allow dual-stack runner source for restricted keys
- Include web IPv6 alongside IPv4 in authorized_keys from= allowlist
- Write web public IPv6 into inventory/host_vars/web.yml from Terraform outputs
2026-01-21 15:08:36 -05:00
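The dual-stack `from=` allowlist in commit 0c6d09abcd above can be illustrated with a short sketch. It is an assumption-laden example rather than the role's actual template: the helper name `restricted_key_line`, the command path, the documentation-range addresses, and the truncated key are all hypothetical. The `from=`, `command=`, and `restrict` options are standard OpenSSH `authorized_keys` options.

```python
import ipaddress


def restricted_key_line(sources: list[str], command: str, public_key: str) -> str:
    """Build one authorized_keys entry limited to the given source addresses."""
    if not sources:
        raise ValueError("at least one source address is required")
    # ip_address() raises ValueError on anything that is not valid IPv4/IPv6.
    addrs = [str(ipaddress.ip_address(s)) for s in sources]
    allowlist = ",".join(addrs)
    options = f'from="{allowlist}",command="{command}",restrict'
    return f"{options} {public_key}"


# Hypothetical values: documentation-range addresses and a placeholder key.
line = restricted_key_line(
    ["203.0.113.10", "2001:db8::10"],
    "/usr/local/bin/infra-register-stdin",
    "ssh-ed25519 AAAAexample runner@web",
)
```

Listing both address families matters because a dual-stack runner may connect over either one, and a key restricted to the IPv4 address alone is rejected whenever the connection arrives over IPv6.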
Jeremie Fraeys
92003e8f1c
fix(forgejo-runner): prevent duplicate runner registrations
- Persist runner registration state by setting container working_dir to /data
- Add post-register assertion that /opt/forgejo-runner/data/.runner exists
2026-01-20 18:06:51 -05:00
Jeremie Fraeys
a22381492e
feat(infra-controller): add restricted SSH access role
- Add infra_controller role to provision a dedicated user
- Install register/deregister forced-command authorized_keys entries
- Read SSH public keys from vault/env and restrict access by source IP
2026-01-20 17:14:31 -05:00
Jeremie Fraeys
a3da8deb0f
feat(actions-ssh): use register/deregister keys for services access
- Add app_ssh_access role to install forced-command keys for infra-register-stdin and infra-deregister
- Ensure required infra-controller runtime directories exist on services host
- Add helper script to generate/register both Actions SSH secrets and update vault public keys
2026-01-20 17:10:02 -05:00
Jeremie Fraeys
c2056d4cd4
fix(forgejo-runner): validate label executor scheme
- Set default runner label to 'self-hosted:docker://…'
- Add an early assert to fail fast when labels use an invalid executor scheme
2026-01-20 17:09:17 -05:00
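The fail-fast label check from commit c2056d4cd4 above is implemented as an Ansible assert in the repository. The same idea can be sketched in Python as follows. The set of accepted schemes (`docker`, `host`, `lxc`), the function name `label_scheme`, and the image name in the usage line are assumptions made for illustration, and this sketch requires every label to carry an explicit scheme.

```python
# Assumed set of executor schemes; adjust to what the runner version accepts.
ALLOWED_SCHEMES = ("docker", "host", "lxc")


def label_scheme(label: str) -> str:
    """Return the executor scheme of a `name:scheme[://args]` label.

    Raises ValueError when the label has no scheme or an unknown one.
    """
    name, sep, rest = label.partition(":")
    if not name or not sep or not rest:
        raise ValueError(f"label {label!r} must look like 'name:scheme[://args]'")
    # The scheme is everything up to the next colon, e.g. 'docker' in 'docker://img'.
    scheme = rest.split(":", 1)[0]
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"label {label!r} uses invalid executor scheme {scheme!r}")
    return scheme


# Hypothetical label; the image name is illustrative only.
scheme = label_scheme("self-hosted:docker://node:20-bookworm")
```

Checking labels before registration means a typo such as `dokcer://` stops the play immediately instead of producing a runner that registers but never picks up jobs.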
Jeremie Fraeys
997aff6be3
initial infra commit
2026-01-19 15:02:13 -05:00