00 TASKS
Scrublord MacBad edited this page 2026-05-14 23:56:34 +02:00

aXion1337.Chat: Task List & Milestones

Last Updated: 2026-05-14
Status overview: [✅ 6 Completed] [🔄 1 In Progress] [📋 15+ Pending] [🔒 10 Security]


📊 Status Summary (Quick View)

| Category | Count | Status | Details |
|---|---|---|---|
| Completed | 6 | Done | K3S, Flux, ESS, Themes, Desktop, Monitoring, TURN |
| In Progress | 1 | 🔄 Blocked | Authentik Stage 2 (awaiting manual config) |
| Backlog | 15+ | 📋 Pending | Element Call Fork, DB Backups, NetworkPolicies, etc. |
| Security Tasks | 10 | 🔒 Pending | Firewall, SSH, auditd, Kernel hardening, CrowdSec, Falco |

Priority Distribution

| Priority | Count | Timeline |
|---|---|---|
| 🔴 CRITICAL | 3 | This week |
| 🟠 HIGH | 4 | 1-2 weeks |
| 🟡 MEDIUM | 8 | ~1 month |
| 🟢 LOW | 4+ | Nice-to-have |

🎯 Next Steps (Prioritized)

🔴 THIS WEEK - CRITICAL

  1. Complete Authentik Stage 2

    • Manual: Create OIDC Provider + Application in the Authentik UI
    • Code: Add upstream_oauth2_config to mas-secret.yaml
    • Code: Set passwords: enabled: false
    • Commit: enable-authentik-oidc-integration-in-mas
    • Est. Time: 1-2 hours
    • Blocker: Manual Authentik config (user action)
  2. Hetzner Cloud Firewall Default-Deny Setup

    • Ingress: Allow 80/443 only
    • Allow SSH from your IP or via WireGuard/Tailscale
    • Est. Time: 30 min
    • Cost: Free
    • Impact: Blocks 99% of internet background noise
  3. SSH Hardening

    • Disable password auth (key-only)
    • Disable root login
    • MaxAuthTries 3
    • Est. Time: 1-2 hours
    • Priority: HIGH
  4. Database Backup Strategy Decision & First Backup

    • Decision: CloudNativePG (on K3S) or Hetzner Postgres (managed)?
    • Setup: Daily automated backups
    • Setup: Off-site storage (S3 / Storage Box)
    • Setup: Monthly verified restores
    • Est. Time: 2-3 days
    • Priority: CRITICAL (disaster recovery)

🟠 NEXT 1-2 WEEKS - HIGH

  1. Authentik End-to-End Test

    • Test: Login flow Element → MAS → Authentik → Matrix User
    • Test: Password reset
    • Create: Test invite links
    • Est. Time: 2 hours
  2. Element Call Fork

    • Fork: element-hq/element-call
    • Feature: Video/audio constraints parameters
    • Integration: Synapse well-known config
    • Est. Time: 2-3 days
  3. External PostgreSQL Migration

    • Decision: CloudNativePG vs. Hetzner Postgres
    • Setup: HA + Replication
    • Migration: Move data from ESS embedded Postgres
    • Testing: Verify all services work
    • Est. Time: 1-2 days
  4. NetworkPolicies Deployment

    • Create: Default-Deny for matrix namespace
    • Create: Allow rules (Synapse↔Postgres, MAS↔Postgres, Ingress→Web, etc.)
    • Test: Ensure no service breakage
    • Est. Time: 1 day

Completed Tasks (Chronological)

Phase 1: Base Setup

  • Set up K3S cluster: single-node on Hetzner Cloud (49.13.132.245)

    • Commit: initial-setup (pre-project)
    • Status: Running
  • Flux CD Installation

    • SOPS + age Encryption
    • Configure GitOps repository
    • Commit: setup-flux (pre-project)
    • Status: Running
  • Element Server Suite v26.4.0 Deployment

    • Synapse Homeserver (matrix.axion1337.chat)
    • Matrix Authentication Service (account.axion1337.chat)
    • Element Web (axion1337.chat)
    • Element Admin (admin.axion1337.chat)
    • MatrixRTC/Element Call (mrtc.axion1337.chat)
    • Commit: deploy-ess-matrix-stack
    • Status: Running

Phase 2: Core Features

  • 7 Custom Element Web Themes

    • aXion1337 Dark, Deep Purple, Discord Dark, Electric Blue, Everforest, Gruvbox, Wal
    • Sorted alphabetically
    • Commit: add-custom-element-themes
    • Status: Deployed
  • Element Desktop Setup Scripts (Windows/macOS/Linux)

    • Auto-Download + Install + Config
    • Hosted at axion1337.chat/docs/setup/
    • Commits: add-element-desktop-setup-scripts, fix-element-setup-script-hosting
    • Status: Deployed
  • Room Policies

    • Message Retention (1d-1y lifecycle)
    • Room Publication Rules (allow all)
    • Auto-Join Rooms for onboarding
    • Commit: add-synapse-retention-publication-autojoin
    • Status: Deployed
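For reference, the three room policies above map onto Synapse homeserver settings roughly as sketched below. This is an illustration, not a copy of the deployed values: the room alias `#welcome:axion1337.chat` is hypothetical, and reading "1d-1y" as both the default policy and the allowed range is an assumption.

```yaml
# Sketch of the Synapse settings behind the room policies (values are assumptions).
retention:
  enabled: true
  default_policy:
    min_lifetime: 1d      # assumed lower bound of the "1d-1y" lifecycle
    max_lifetime: 1y      # assumed upper bound
  allowed_lifetime_min: 1d
  allowed_lifetime_max: 1y

# "Allow all": any user may publish any room to the public directory.
room_list_publication_rules:
  - user_id: "*"
    alias: "*"
    room_id: "*"
    action: allow

# New accounts are joined to these rooms on registration.
auto_join_rooms:
  - "#welcome:axion1337.chat"   # hypothetical alias, replace with the real onboarding room
```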

Phase 3: WebRTC & Media Transport

  • TURN Server (coturn) for video calls
    • Domain: turn.axion1337.chat
    • HMAC auth with shared secret
    • Ports: 3478/udp, 3478/tcp, 5349/tcp, 49152-65535/udp
    • Commit: implement-turn-server-coturn-for-webrtc-video-calls
    • Status: Deployed
    • Manual: Create DNS A record + open firewall ports (still required)
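On the Synapse side, the TURN settings listed above would typically look like the following fragment. It is a sketch only: the secret is a placeholder, and the lifetime and guest settings are assumptions rather than the deployed values.

```yaml
# Sketch of the Synapse TURN client settings (placeholder secret, assumed lifetime).
turn_uris:
  - "turn:turn.axion1337.chat:3478?transport=udp"
  - "turn:turn.axion1337.chat:3478?transport=tcp"
  - "turns:turn.axion1337.chat:5349?transport=tcp"
turn_shared_secret: "REPLACE_WITH_COTURN_STATIC_AUTH_SECRET"  # must match coturn's shared secret
turn_user_lifetime: 1h        # assumption: lifetime of the generated HMAC credentials
turn_allow_guests: false      # assumption: guests do not get TURN credentials
```

Until the DNS A record exists and the listed ports are open in the firewall, clients will receive these URIs but fail to reach the relay.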

Phase 4: Monitoring & Observability

  • Monitoring Stack Integration
    • Alloy (Grafana Agent) as collector
    • Remote write to Selendis (10.0.0.3:9090 Prometheus, :3100 Loki)
    • kube-state-metrics, node-exporter DaemonSet
    • Commits: integrate-monitoring-alloy-prometheus-loki, fix-prometheus-remote-write-docker
    • Status: Deployed

Phase 5: Identity Provider (Authentik)

  • Authentik Stage 1 Deployment
    • HelmRelease v2026.x in authentik namespace
    • Embedded PostgreSQL + Alloy-compatible
    • cert-manager for TLS
    • Commit: deploy-authentik-as-identity-provider-for-matrix-stage-1
    • Status: Deployed
    • Manual: Set admin password + create OIDC Provider (required)

🔄 [IN PROGRESS] Authentik Stage 2 - MAS Integration

  • MAS Upstream OIDC Configuration
    • Copy Client ID/Secret from the Authentik Admin UI
    • Add upstream_oauth2_config to mas-secret.yaml
    • passwords: enabled: false
    • Commit: (pending)
    • Status: Waiting for manual Authentik configuration

Phase 6: Documentation

  • Create Deployment Guides
    • 5 Markdown files in docs/deployment-guides/
    • Ordered chronologically
    • Troubleshooting + Best Practices
    • Commit: add-comprehensive-deployment-configuration-documentation
    • Status: Deployed

🔄 In Progress / Blocked

Authentik Stage 2 - MAS Integration (Depends on Manual Config)

Description: The Authentik OIDC Provider must be configured manually in the Authentik Admin UI before the Stage 2 deployment is possible.

Steps:

  1. Authentik Stage 1 Deployment (done)
  2. Authentik Admin UI: Create OIDC Provider (MANUAL - user action)
  3. Authentik Admin UI: Create Application with slug matrix (MANUAL - user action)
  4. Authentik Admin UI: Enrollment Flow with Invitation Stage (MANUAL - user action)
  5. Authentik Admin UI: Copy Client ID + Secret (MANUAL - user action)
  6. 📋 Update MAS upstream_oauth2_config with client credentials
  7. 📋 Set passwords: enabled: false
  8. 📋 Commit + Push

Blocker: Manual Authentik configuration (waiting on user)
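Steps 6 and 7 could end up looking like the fragment below once the credentials exist. This is a hedged sketch: the MAS documentation names the top-level key `upstream_oauth2` (the `upstream_oauth2_config` mentioned on this page may be a repo-local name), the Authentik hostname `auth.example.com` is hypothetical, and the ULID, scopes, and claim templates are illustrative assumptions.

```yaml
# Sketch of the MAS config fragment for mas-secret.yaml (all values are placeholders).
upstream_oauth2:
  providers:
    - id: 01HFVBY12TMNTYTBV8W921M5FA          # any fixed ULID; illustrative value
      # Authentik issuer URLs end in /application/o/<slug>/; "matrix" is the slug from step 3.
      issuer: https://auth.example.com/application/o/matrix/   # hypothetical host
      client_id: REPLACE_WITH_CLIENT_ID        # from step 5
      client_secret: REPLACE_WITH_CLIENT_SECRET
      token_endpoint_auth_method: client_secret_basic
      scope: "openid profile email"
      claims_imports:
        localpart:
          action: require
          template: "{{ user.preferred_username }}"
        displayname:
          action: suggest
          template: "{{ user.name }}"
        email:
          action: suggest
          template: "{{ user.email }}"

# Step 7: disable local password login so that all logins go through Authentik.
passwords:
  enabled: false
```

Disabling passwords before the upstream provider has been verified end to end would lock out existing accounts, so it may be safer to commit the provider first and flip `passwords.enabled` in a second commit.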


📋 Backlog (Further Tasks)

Authentik Completion

  • Finish Authentik Stage 2 MAS Integration

    • Prerequisites: Authentik OIDC Provider fully configured
    • Task: Update mas-secret.yaml, disable password login
    • Commit: enable-authentik-oidc-integration-in-mas
    • Est. Effort: 30 min (manual + scripted)
  • Test End-to-End Login Flow

    • Element Web login → MAS → Authentik → Matrix User Creation
    • Create test users via Authentik
    • Verify password reset flow
    • Commit: (implicit in Stage 2)
    • Est. Effort: 20 min
  • Create Invite Links for New Users

    • Authentik Admin UI → Invitations → Create
    • Set expiry dates (7d) + use limits
    • Document procedure
    • Est. Effort: 15 min

Element Call Enhancement

  • Element Call Fork for Custom Constraints
    • Repository: Fork element-hq/element-call
    • Feature: Video/audio constraints parameter in the config
    • Include: Bandwidth limiting, resolution limits, frame rate control
    • Integration with Synapse well-known
    • Est. Effort: 2-3 days (fork + feature + test)
    • Priority: HIGH (user feature)

Database Hardening

  • External/Dedicated PostgreSQL Deployment

    • Option 1: CloudNativePG Operator (open-source, on K3S)
    • Option 2: Managed Hetzner Postgres
    • Separate from the ESS matrix-stack embedded Postgres
    • HA + Replication
    • Est. Effort: 1-2 days
    • Priority: HIGH (reliability)
  • Database Backup Strategy

    • Daily automated backups (pgBackRest or Velero)
    • Off-site backup storage (S3 / Hetzner Storage Box)
    • Monthly verified restores (test restore → verify data integrity)
    • Backup + restore documentation
    • Est. Effort: 2-3 days
    • Priority: CRITICAL (disaster recovery)
  • Synapse Media PVC Backups

    • Separate backup pipeline for the /data/media_store PVC
    • Reason: Media is often >100 GB and should not be part of the DB backup
    • Velero + Restic for file-system backup of the volume
    • Est. Effort: 1 day
    • Priority: HIGH (data preservation)
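If the CloudNativePG option is chosen, daily off-site backups could be declared roughly as follows. This is a sketch under assumptions: the cluster name `matrix-pg`, the bucket path, the endpoint URL, and the Secret `pg-backup-s3` are all hypothetical, and newer CloudNativePG releases may prefer the Barman Cloud plugin over the in-tree `barmanObjectStore` field used here.

```yaml
# Sketch: CloudNativePG cluster with S3-compatible backups (hypothetical names and paths).
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: matrix-pg                 # hypothetical
  namespace: matrix
spec:
  instances: 2                    # on a single-node K3S both replicas share the same host
  storage:
    size: 20Gi                    # assumption
  backup:
    retentionPolicy: "30d"        # assumption
    barmanObjectStore:
      destinationPath: s3://example-backups/matrix-pg    # hypothetical bucket
      endpointURL: https://s3.example.com                # hypothetical S3-compatible endpoint
      s3Credentials:
        accessKeyId:
          name: pg-backup-s3      # hypothetical Secret
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: pg-backup-s3
          key: ACCESS_SECRET_KEY
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: matrix-pg-daily
  namespace: matrix
spec:
  schedule: "0 0 2 * * *"         # six-field cron including seconds: daily at 02:00:00
  backupOwnerReference: self
  cluster:
    name: matrix-pg
```

A backup only counts once a restore has been tested, which is what the monthly verified restore item covers.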

Network Security

  • NetworkPolicies - K8s-Layer Segmentation

    • Default-Deny Ingress for the matrix namespace
    • Allow rules:
      • Ingress → MAS:443
      • Ingress → ElementWeb:443
      • MAS ↔ Synapse:8008
      • Synapse ↔ Postgres:5432
      • Authentik → Postgres:5432
      • Authentik → Loki:3100 (monitoring)
    • Egress: Matrix-specific (federation, etc.)
    • Est. Effort: 1 day
    • Priority: MEDIUM (compliance, least-privilege)
  • Pod Security Admission (Restricted)

    • Apply to matrix & authentik namespaces
    • Enforce: non-root, no privileged containers; read-only root fs (set per workload via securityContext, not part of the Restricted profile)
    • Test: Ensure no chart breakage
    • Est. Effort: 1 day
    • Priority: MEDIUM (hardening)
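The two items above can be sketched as plain manifests. The pod labels (`app.kubernetes.io/name: synapse` and `postgres`) are assumptions and need to be checked against the labels the ESS chart actually sets. Applying the Namespace manifest to an existing namespace only adds the labels.

```yaml
# Sketch: default-deny ingress for the matrix namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: matrix
spec:
  podSelector: {}            # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress
---
# Sketch: one allow rule (Synapse → Postgres:5432). Labels are assumed, not verified.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-synapse-to-postgres
  namespace: matrix
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: postgres     # assumed label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: synapse   # assumed label
      ports:
        - protocol: TCP
          port: 5432
---
# Sketch: Pod Security Admission labels for the Restricted profile.
apiVersion: v1
kind: Namespace
metadata:
  name: matrix
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

Starting with only `warn` and `audit` and adding `enforce` after the charts run cleanly is one way to cover the "no chart breakage" test step.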

Federation & Access Control

  • Federation Allowlist or Closed Federation
    • Decision: Which servers to federate with?
    • If allowlist: explicit federation_domain_whitelist
    • If closed: empty federation_domain_whitelist: [] (disables federation)
    • Synapse config in synapse-values.yaml
    • Est. Effort: 4 hours
    • Priority: MEDIUM (security policy)
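A minimal sketch of the allowlist variant in the Synapse config follows. The listed domains are placeholders, not a recommendation of which servers to trust.

```yaml
# Sketch for synapse-values.yaml (Synapse homeserver config section); domains are placeholders.
federation_domain_whitelist:
  - matrix.org
  - example.org

# Closed federation would instead use an empty list:
# federation_domain_whitelist: []
```

The Synapse documentation notes that the whitelist does not stop the server from joining rooms that also contain non-listed servers, so it is best combined with firewalling the federation listener.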

Moderation & Anti-Abuse

  • Mjolnir/Draupnir Bot Deployment

    • Open-source moderation bot for Matrix
    • Reason: Sign-up is invitation-based, but federation can still bring in spam
    • Auto-ban known bad servers/users
    • Spam-detection rules
    • Helm chart or custom Deployment
    • Est. Effort: 1-2 days
    • Priority: MEDIUM (ops safety)
  • Content Scanner for Media

    • matrix-content-scanner + ClamAV antivirus
    • Scan uploaded media for malware
    • Block suspicious files
    • Est. Effort: 1-2 days
    • Priority: LOW-MEDIUM (optional but good practice)

Secrets Management

  • External-Secrets Operator or SOPS for Flux
    • Current: SOPS with age encryption
    • Consideration: External-Secrets for cloud-native (AWS Secrets Manager, Hetzner Vault, etc.)
    • OR: Improve SOPS rotation strategy
    • Decision needed: Keep SOPS or upgrade?
    • Est. Effort: 2-3 days (if switching)
    • Priority: LOW (current SOPS setup working)

Image & Dependency Management

  • Renovate / Dependabot Setup

    • Auto-update Helm Chart versions
    • Auto-update Container Image Tags
    • Monitor for security patches
    • Est. Effort: 4 hours
    • Priority: MEDIUM (maintenance)
  • Trivy Image Scanning

    • Scan images in Flux HelmReleases for CVEs
    • Block deployment if critical CVE found
    • CI/CD hook in git workflow
    • Est. Effort: 8 hours
    • Priority: LOW-MEDIUM (security posture)
  • Monitor ESS & Element Security Advisories

    • Subscribe to element-hq security mailing list
    • Monitor #matrix-community security channels
    • Auto-alerts on new CVEs/patches
    • Est. Effort: Ongoing (low maintenance)
    • Priority: MEDIUM (security awareness)

Container Security

  • Disable automountServiceAccountToken Everywhere
    • Audit all Deployments/StatefulSets
    • Disable for: Synapse, ElementWeb, MAS, Postgres, Authentik (where not needed)
    • Add automountServiceAccountToken: false to spec.template.spec
    • Test: Ensure no breakage
    • Est. Effort: 4 hours
    • Priority: MEDIUM (least-privilege)
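As an illustration of the change above, a strategic-merge patch for a single workload might look like this. The Deployment name `element-web` is hypothetical. For Helm-managed workloads the patch would have to be applied through chart values or a post-render step rather than directly.

```yaml
# Sketch: patch fragment that stops the ServiceAccount token from being mounted.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: element-web          # hypothetical workload name
  namespace: matrix
spec:
  template:
    spec:
      automountServiceAccountToken: false
```

Workloads that talk to the Kubernetes API (operators, some sidecars) need the token and should be left out of the patch.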

🔒 Security Hardening (Host & Cluster Level)

Host OS Layer (Ubuntu/Debian)

  • Hetzner Cloud Firewall

    • Default-Deny inbound
    • Allow: 80/443 (HTTP/HTTPS)
    • Allow: 22 (SSH) from your IP only (or via WireGuard/Tailscale)
    • Status: Can be done in Hetzner UI
    • Est. Effort: 30 min
    • Priority: CRITICAL (immediate, zero config cost)
  • SSH Hardening

    • Disable password auth (key-only)
    • Disable root login
    • PermitRootLogin no
    • PasswordAuthentication no
    • MaxAuthTries 3
    • Optional: Change SSH port (cosmetic, reduces log noise)
    • Optional: Put SSH behind WireGuard/Tailscale (removes the need for fail2ban on SSH)
    • Est. Effort: 2 hours
    • Priority: HIGH (immediate)
  • unattended-upgrades

    • Enable automatic security updates
    • Configure: APT::Periodic::Update-Package-Lists "1";
    • Configure: APT::Periodic::Unattended-Upgrade "1";
    • Configure: APT::Periodic::AutocleanInterval "7";
    • Est. Effort: 30 min
    • Priority: HIGH (set & forget)
  • K3S API Security

    • Current: K3S API listening on :6443 on all interfaces (default)
    • Hardening:
      • Option 1: Firewall restrict :6443 to localhost only
      • Option 2: K3S --bind-address + --advertise-address to WireGuard IP
      • Option 3: kubectl access only via jumphost/bastion
    • Est. Effort: 2 hours
    • Priority: HIGH (API is high-value target)
  • auditd for File Integrity & Syscall Audit

    • Monitor: /etc, ~/.kube, /var/lib/rancher/k3s
    • Audit rules for sensitive file changes
    • Low overhead, good signal/noise ratio
    • Output to syslog / centralized logging
    • Est. Effort: 2 hours
    • Priority: MEDIUM (forensics + compliance)
  • Kernel Hardening (sysctl)

    • Apply hardening recommendations from Lynis
    • Key settings:
      • kernel.kptr_restrict=2 (hide kernel pointers)
      • kernel.dmesg_restrict=1 (restrict dmesg)
      • net.ipv4.tcp_syncookies=1 (SYN flood protection)
      • net.ipv4.conf.all.rp_filter=1 (reverse path filtering)
      • net.ipv4.conf.all.send_redirects=0
      • net.ipv6.conf.all.disable_ipv6=0 (or =1 if no IPv6 needed)
    • Persist via /etc/sysctl.d/99-hardening.conf
    • Est. Effort: 2 hours
    • Priority: MEDIUM (defense in depth)
  • Lynis Security Baseline

    • Run lynis audit system
    • Review recommendations
    • Implement high-priority findings
    • Aim for score >80
    • Re-run quarterly
    • Est. Effort: 4 hours (initial) + 1 hour quarterly
    • Priority: MEDIUM (baseline verification)
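For option 2 of the K3S API Security item above, the server flags can also be set in the K3S config file instead of on the command line. This is a sketch: `10.8.0.1` stands in for the WireGuard address and is hypothetical, and the change should be tried on a non-production node first, since a wrong bind address cuts off kubectl access.

```yaml
# /etc/rancher/k3s/config.yaml (sketch; 10.8.0.1 is a hypothetical WireGuard IP)
bind-address: 10.8.0.1        # equivalent to --bind-address
advertise-address: 10.8.0.1   # equivalent to --advertise-address
tls-san:
  - 10.8.0.1                  # include the address in the API server certificate
```

K3S reads this file at start-up, so the service has to be restarted for the change to take effect.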

Cluster Layer (K3S / Kubernetes)

  • CrowdSec Integration

    • Install CrowdSec agent on host
    • Connect to CrowdSec Hub (commercial platform, free tier available)
    • Feed auth.log, syslog → CrowdSec for attack detection
    • Auto-block IPs via local firewall or Hetzner Firewall API
    • Est. Effort: 4 hours
    • Priority: MEDIUM (proactive threat response)
  • Falco Runtime Monitoring

    • Install Falco DaemonSet in K3S
    • Monitor: Shell spawning in containers, suspicious syscalls, privilege escalation
    • Output to Loki / syslog
    • Alert on anomalies
    • Est. Effort: 1 day
    • Priority: MEDIUM (runtime detection)
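Since the cluster is managed with Flux, Falco could be added as a HelmRelease along the lines below. This is a sketch with assumptions: the `falco` namespace, the intervals, the `modern_ebpf` driver choice, and forwarding to Loki at 10.0.0.3:3100 through Falcosidekick are illustrative and should be checked against the current Falco chart values.

```yaml
# Sketch: Falco via Flux (namespaces, intervals, and values are assumptions).
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: falcosecurity
  namespace: flux-system
spec:
  interval: 1h
  url: https://falcosecurity.github.io/charts
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: falco
  namespace: falco              # assumed namespace; it must exist before reconciliation
spec:
  interval: 30m
  chart:
    spec:
      chart: falco
      sourceRef:
        kind: HelmRepository
        name: falcosecurity
        namespace: flux-system
  values:
    driver:
      kind: modern_ebpf         # assumption: kernel supports the modern eBPF probe
    falcosidekick:
      enabled: true
      config:
        loki:
          hostport: http://10.0.0.3:3100   # Loki endpoint from the monitoring stack
```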

🎯 Milestones

| Milestone | Description | Status | ETA |
|---|---|---|---|
| M1: Base Setup | K3S + Flux + ESS deployed | Done | - |
| M2: Core Matrix | Themes, Scripts, Policies | Done | - |
| M3: WebRTC & Monitoring | TURN + Alloy/Prometheus/Loki | Done | - |
| M4: Identity Provider | Authentik Stage 1+2 (pending Stage 2) | 🔄 In Progress | ~1-2 days |
| M5: Production-Ready | DB Backups, NetworkPolicies, Security Hardening | 📋 Backlog | ~2-3 weeks |
| M6: Advanced Features | Element Call Fork, Content Scanner, Mjolnir | 📋 Backlog | ~4+ weeks |
| M7: Enterprise-Ready | Full compliance (GDPR/DSGVO), HA setup, Disaster Recovery | 🎯 Future | ~8+ weeks |

📊 Priority Categories

🔴 CRITICAL (do immediately)

  • Hetzner Cloud Firewall setup
  • Database backup strategy
  • SSH hardening

🟠 HIGH (do within 1-2 weeks)

  • Authentik Stage 2 completion
  • External PostgreSQL migration
  • NetworkPolicies
  • Element Call fork

🟡 MEDIUM (do within 1 month)

  • CrowdSec + Falco
  • Mjolnir bot
  • Renovate/Trivy
  • PSA restricted mode
  • Kernel hardening

🟢 LOW (nice-to-have, do if time allows)

  • Content scanner (ClamAV)
  • External-Secrets upgrade
  • SSH port relocation
  • Advanced federation rules

📝 Notes & Decision Points

Authentik Stage 2 Blocker

Waiting for: User to manually configure Authentik OIDC Provider in Authentik Admin UI.

  • Once done, provide Client ID + Secret
  • Then: Commit Stage 2 MAS config

Database: CloudNativePG vs. Hetzner Postgres

  • CloudNativePG: Open-source, runs on K3S, full control
  • Hetzner Postgres: Managed, backups included, less ops overhead
  • Decision: Recommend CloudNativePG for now (cost-effective), migrate to Hetzner later if operational overhead too high

Federation: Allowlist vs. Closed?

  • Open (current default): Federates with all public servers; largest attack surface
  • Allowlist: Only federate with trusted servers (higher security, lower interop)
  • Closed: No federation at all
  • Decision: Depends on user intent. For now: allow all, add Mjolnir for abuse protection

Security Framework

  • Layers: Perimeter (Firewall) → Host (SSH, auditd, hardening) → Cluster (NetworkPolicies, PSA, Falco) → App (Rate-limits, Mjolnir)
  • Approach: Implement incrementally, test after each layer

  • docs/deployment-guides/README.md: Overview
  • docs/deployment-guides/01-turn-server-setup.md: TURN
  • docs/deployment-guides/02-authentik-identity-provider.md: Authentik (Stage 1 + Stage 2 plan)
  • docs/deployment-guides/03-monitoring-integration.md: Monitoring
  • docs/deployment-guides/04-element-customization.md: Themes, Desktop
  • docs/deployment-guides/05-room-policies.md: Policies

Last Updated: 2026-05-14
Next Review: 2026-05-21