Skip to content

Post-severance verification #245

@timtalbot

Description

@timtalbot

Parent: #208

Summary

After severance completes, verify the workload still operates independently. Produce a health report for the operator.

Requirements

  • Verify pods are running (no CrashLoopBackOff, no restarts caused by severance) — via EKS for AWS, AKS for Azure
  • Verify sites are accessible (HTTP health check on each site's domain)
  • Verify Alloy is running and writing to local Mimir (not erroring on missing control room endpoint)
  • Verify no control room references remain in active Helm values / deployed config
  • Report results: pass/fail per check, with details on failures
  • If any check fails, provide actionable guidance (not just "verification failed")

Acceptance Criteria

  • Operator gets a clear post-eject health report after severance completes
  • Works for both AWS (EKS) and Azure (AKS) workloads
  • Any failures include specific remediation steps
  • A fully healthy workload reports all checks passing

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions