Prepare v0.1.8 diagnostics release #24

Merged
Colvin-Y merged 1 commit into main from codex/v0.1.8-diagnostics-release
Jun 2, 2026

Conversation

@Colvin-Y Colvin-Y commented Jun 1, 2026

Owner

Summary: prepare v0.1.8 diagnostics release with expanded topology, failure fixtures, zero-clone Helm docs, and opt-in Secret RBAC. Validation: rtk make ci; helm package chart.

Summary by CodeRabbit

Release Notes: v0.1.8

  • New Features

    • Extended topology discovery for Ingress, EndpointSlice, NetworkPolicy, HPA, PodDisruptionBudget, and VolumeAttachment resources.
    • New diagnostic recipes for storage-CSI, service-routing, identity, and node-context investigation.
    • Enhanced Pod runtime evidence with container state tracking.
    • URL-driven node focusing in the topology viewer.
    • Offline failure-mode diagnostic samples for common Kubernetes issues.
  • Changed

    • Secret reads are now disabled by default in the Helm chart; enabling them requires an explicit rbac.readSecrets=true opt-in.
  • Documentation

    • New UPGRADING.md compatibility guide.
    • Updated installation and configuration examples for v0.1.8.

Expand diagnostic topology coverage, add failure fixtures, and document zero-clone release workflows.

Keep Helm Secret reads opt-in by default and package upgrade guidance with release archives.

vercel Bot commented Jun 1, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| kubernetes-ontology | Ready | Preview, Comment | Jun 1, 2026 2:16pm |


coderabbitai Bot commented Jun 1, 2026

Walkthrough

This v0.1.8 release extends kubernetes-ontology with six new Kubernetes resource kinds (Ingress, EndpointSlice, NetworkPolicy, HPA, PodDisruptionBudget, VolumeAttachment), new diagnostic recipes (storage-csi, service-routing, identity, node-context), enhanced pod runtime evidence (container states), event target name fallback correlation, and offline failure-mode diagnostic fixtures. The Helm chart defaults Secret reads to disabled (opt-in via rbac.readSecrets=true). Documentation, release artifacts, and the viewer UI are updated throughout.

Changes

Topology Resource Support v0.1.8

Layer / File(s) Summary
Public API contract: Node and Edge kinds
internal/api/types.go, internal/model/node.go, internal/model/edge.go
Six new NodeKind constants (Ingress, EndpointSlice, NetworkPolicy, HPA, PodDisruptionBudget, VolumeAttachment) and six new EdgeKind constants (RoutesToService, TargetsPod, AppliesToPod, ScalesWorkload, ProtectsPod, AttachesPV) extend the diagnostic graph type system.
Kubernetes resource collection and normalization
internal/collect/k8s/collector.go, internal/collect/k8s/resources/types.go
Snapshot struct adds six new resource slices. ContainerState type captures per-container runtime state (waiting/running/terminated with reasons/messages). Event type expands to include involved object details, timestamps, count, and source. Six new normalization functions (Ingress, EndpointSlice, NetworkPolicy, HPA, PodDisruptionBudget, VolumeAttachment) and helpers (timestampPtr, uniqueStrings) support the collection pipeline. Collector performs additional Kubernetes API list calls with Forbidden/NotFound tolerance.
Streaming model and change detection for topology resources
internal/collect/k8s/stream.go
Streaming model registers new resources via informers (namespaced and cluster-scoped as appropriate) and classifies them as "topology" change kind. Fingerprinting for topology resources, container states, and timestamps. Pod fingerprinting incorporates container state; event fingerprinting includes involved UID and first/last timestamps.
Graph builder: nodes, edges, and event target correlation
internal/graph/builder.go, internal/graph/builder_test.go, internal/resolve/explicit/edges.go
Builder creates nodes for all new resource kinds and maintains eventTargetNames index for event target resolution by kind+namespace+name fallback. Pod nodes gain container state attributes. Event nodes gain timestamps, source, reporting component. Six resolver-backed edge construction loops (Ingress→Service, EndpointSlice→Pod, NetworkPolicy→Pod, HPA→Workload, PodDisruptionBudget→Pod, VolumeAttachment→PV) with confidence scoring. New tests validate topology resource nodes/edges and event fallback behavior.
Diagnostic recipes, lanes, and pod runtime evidence
internal/query/facade.go, internal/query/facade_test.go, internal/service/diagnostic/service.go
Four new diagnostic recipes (storage-csi, service-routing, identity, node-context) with dedicated lane sets. DiagnosticRecipeForEntry maps new resource kinds to recipes. Pod runtime evidence extraction: rankPodRuntimeEvidence emits ranked evidence for container waiting/terminated/restart states with base scores. Event evidence includes timestamp. Helper functions for timestamp parsing, container status normalization, numeric coercion.
Schema, ontology, and relation specifications
docs/ontology/kubernetes-ontology.owl, internal/model/relation_spec.go, internal/owl/schema.go, charts/kubernetes-ontology/templates/rbac.yaml
OWL ontology adds six Class and six ObjectProperty declarations with domain/range/resolver hints. Relation specs define semantic attributes for each new edge kind. Helm RBAC template grants read/watch access to additional API groups (autoscaling, discovery.k8s.io, networking.k8s.io, policy, storage.k8s.io).
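
The graph builder row above mentions resolving event targets through a kind+namespace+name fallback. A minimal sketch of that idea follows, assuming the primary lookup is by involved-object UID (the walkthrough does not say what the fallback falls back from). The `event` and `targetIndex` types and the `nameKey` helper are hypothetical and are not taken from the project's code.

```go
package main

import "fmt"

// event carries only the involved-object fields used for correlation.
// The field names are illustrative, not the project's actual Event type.
type event struct {
	InvolvedUID       string
	InvolvedKind      string
	InvolvedNamespace string
	InvolvedName      string
}

// targetIndex maps UIDs and kind/namespace/name keys to graph node IDs.
type targetIndex struct {
	byUID  map[string]string
	byName map[string]string
}

// nameKey builds the fallback lookup key for an object.
func nameKey(kind, namespace, name string) string {
	return kind + "/" + namespace + "/" + name
}

// resolve tries the UID first (an assumption of this sketch) and then
// falls back to the kind+namespace+name key.
func (idx targetIndex) resolve(ev event) (string, bool) {
	if ev.InvolvedUID != "" {
		if id, ok := idx.byUID[ev.InvolvedUID]; ok {
			return id, true
		}
	}
	id, ok := idx.byName[nameKey(ev.InvolvedKind, ev.InvolvedNamespace, ev.InvolvedName)]
	return id, ok
}

func main() {
	idx := targetIndex{
		byUID:  map[string]string{"uid-new": "pod-node-1"},
		byName: map[string]string{nameKey("Pod", "default", "web-0"): "pod-node-1"},
	}
	// The event still references the UID of a replaced Pod, so only the
	// name-based fallback can find the current node.
	stale := event{InvolvedUID: "uid-old", InvolvedKind: "Pod", InvolvedNamespace: "default", InvolvedName: "web-0"}
	id, ok := idx.resolve(stale)
	fmt.Println(id, ok)
}
```

A name-based match is weaker evidence than a UID match, which may be one reason the walkthrough mentions confidence scoring on resolver-backed edges.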

Helm RBAC: Secret reads default to disabled

Layer / File(s) Summary
Helm chart values and verification
charts/kubernetes-ontology/values.yaml, scripts/ci/verify_helm.sh
rbac.readSecrets flipped from true to false. CI verification ensures no secrets in default render; confirms secrets present only when explicitly enabled via --set rbac.readSecrets=true.

Documentation and release process updates

Layer / File(s) Summary
User documentation: README, QUICKSTART, SECURITY, UPGRADING
README.md, README.zh-CN.md, QUICKSTART.md, SECURITY.md, UPGRADING.md
v0.1.8 release notes with new diagnostic entrypoints, graph recovery evidence, Secret reads opt-in guidance, and offline failure fixture references. Chinese README equivalent. New UPGRADING.md guide covers compatibility, RBAC change, upgrade/rollback procedures.
AI contract, design evolution, and release procedures
AI_CONTRACT.md, docs/design/open-source-diagnostics-evolution-plan.md, docs/release.md, docs/agent-recipes.md, docs/articles/kubernetes-ontology-intro.zh-CN.md, CHANGELOG.md
AI_CONTRACT.md expands phase-1 entry kinds, recipes, and edge semantics. Design doc emphasizes zero-clone Helm install from release .tgz and opt-in Secret reads. Release procedures, agent recipes, and design articles updated for v0.1.8. CHANGELOG.md records additions and RBAC default change.
Helm chart version, skill guide, and release artifacts
charts/kubernetes-ontology/Chart.yaml, skills/kubernetes-ontology-access/SKILL.md, .github/workflows/release.yml, cmd/kubernetes-ontology/main.go
Helm chart version → 0.1.8. Skill guide documents zero-clone release .tgz install, Secret reads opt-in, and diagnostic recipe hints. Release workflow copies documentation files into release archives. CLI --recipe flag lists new recipe options.
Topology Viewer UI enhancements and web metadata
tools/visualize/index.html, index.html
Diagnostic UI expands with new entry/recipe kinds, node colors, relation interestingness, kind presets, and namespace scoping. New focusNode URL parameter support for viewer state. kindRank updated. Web metadata (canonical URL, social previews) refreshed.
Offline failure-mode diagnostic fixture samples
samples/failure-modes/*, tools/visualize/fixture_test.py
Seven failure-mode diagnostic graphs (CrashLoopBackOff, pending scheduling, PVC CSI, service selector mismatch, missing config/secret, RBAC forbidden, webhook admission) demonstrating recipes. fixture_test.py validates fixture structure, references, and recipe presence.
JSON schema and samples directory update
schemas/diagnostic-subgraph.schema.json, samples/README.md
Recipe description examples expanded. Samples README documents new failure-modes directory.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 Six resources now shine bright,
Ingress, slice, and policy in sight,
Container state whispers what went wrong,
Events find their way along,
Secrets opt-in, as they should be—
v0.1.8 sets topology free! 🌟

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description is incomplete. While it briefly mentions what changed (expanded topology, failure fixtures, zero-clone Helm docs, opt-in Secret RBAC) and lists validation steps, it lacks the required template structure with a formal Summary section, detailed Validation commands, and Safety Checklist items. | Expand the description to follow the template: add a formal Summary section explaining the changes in detail, provide complete validation commands (not abbreviated), and fill out all Safety Checklist items with explicit confirmations or explanations. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Prepare v0.1.8 diagnostics release' accurately summarizes the main change: a version release preparation with expanded features and documentation updates. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
index.html (1)

832-832: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update the version to v0.1.8.

The installation example hardcodes v0.1.6, but this PR prepares the v0.1.8 release. Users copying this snippet will install the wrong version.

🔧 Proposed fix
-export KO_VERSION=v0.1.6
+export KO_VERSION=v0.1.8
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@index.html` at line 832, Update the hardcoded installation version by
changing the KO_VERSION value in the HTML snippet from v0.1.6 to v0.1.8; locate
the line containing the export KO_VERSION=... (the installation example code
block) and replace the version string so users copying the snippet install
v0.1.8.
🧹 Nitpick comments (2)
internal/service/diagnostic/service.go (1)

465-516: ⚡ Quick win

Consider capping the restart count contribution to evidence score.

The current implementation adds the raw restart count to the base score (65 + restartCount on line 510). For containers with very high restart counts (e.g., 100+), this could result in scores that dominate the ranked evidence list, potentially crowding out other critical evidence like error events when multiple containers are restarting.

While high restart counts are indeed critical signals, consider capping the contribution to maintain evidence diversity:

-				Score:      65 + restartCount,
+				Score:      65 + min(restartCount, 25),

This would cap the maximum restart evidence score at 90, keeping it comparable to warning-level event scores while still reflecting restart severity.

Helper function to add
func min(a, b float64) float64 {
	if a < b {
		return a
	}
	return b
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/service/diagnostic/service.go` around lines 465 - 516, In
rankPodRuntimeEvidence, cap the restart-count contribution to the evidence score
so extremely high restart counts can't push the score above your proposed
maximum (90); replace the current Score computation for the restart case
(currently "Score: 65 + restartCount") with a capped value e.g. compute
increment := min(float64(restartCount), 25) or use math.Min to ensure
65+increment <= 90, then set Score to int(65+increment); update imports if you
use math.Min and/or add a small helper min function if preferred.
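
The capping suggestion above can be sketched as a small standalone function. This assumes the score and restart count are plain integers, which the review does not confirm; `cappedRestartScore` and both constants are hypothetical names that only reuse the numbers quoted in the comment (base 65, cap 25).

```go
package main

import "fmt"

// Hypothetical constants mirroring the numbers quoted in the review comment.
const (
	restartBaseScore = 65 // base score for restart evidence
	restartBonusCap  = 25 // largest bonus a restart count may add
)

// cappedRestartScore returns the base score plus the restart count,
// with the bonus clamped to the range [0, restartBonusCap].
func cappedRestartScore(restartCount int) int {
	bonus := restartCount
	if bonus < 0 {
		bonus = 0
	}
	if bonus > restartBonusCap {
		bonus = restartBonusCap
	}
	return restartBaseScore + bonus
}

func main() {
	for _, n := range []int{0, 3, 25, 140} {
		fmt.Printf("restarts=%d score=%d\n", n, cappedRestartScore(n))
	}
}
```

On Go 1.21 or later the built-in `min` could replace the explicit clamp, in which case a package-level helper named `min` would shadow the built-in.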
internal/collect/k8s/resources/types.go (1)

480-495: ⚡ Quick win

ReportingComponent duplicates Source instead of capturing ReportingController.

Both Source and ReportingComponent are set to in.Source.Component. Modern Kubernetes events use ReportingController field for the component name. Consider capturing both:

♻️ Proposed fix to capture ReportingController
-		Source:             in.Source.Component,
-		ReportingComponent: in.Source.Component,
+		Source:             in.Source.Component,
+		ReportingComponent: in.ReportingController,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/collect/k8s/resources/types.go` around lines 480 - 495,
NormalizeEvent currently sets both Source and ReportingComponent to
in.Source.Component; change ReportingComponent to use the event's
ReportingController (i.e., set ReportingComponent: in.ReportingController) so it
reflects modern Kubernetes events; if the Event type should also capture
reporting instance, add a ReportingInstance field to the Event struct and
populate it from in.ReportingInstance in NormalizeEvent.
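
A minimal sketch of the mapping discussed above, using simplified stand-in types rather than the real Kubernetes core/v1 Event. The fallback to `Source.Component` when `ReportingController` is empty is an assumption added here so that older events keep a non-empty value; the review itself only proposes reading `in.ReportingController`.

```go
package main

import "fmt"

// eventSource and rawEvent are simplified stand-ins for the fields of the
// Kubernetes core/v1 Event that matter here; they are not the real types.
type eventSource struct {
	Component string
}

type rawEvent struct {
	Source              eventSource
	ReportingController string
}

// reportingComponent prefers ReportingController and falls back to the
// legacy Source.Component when the newer field is empty (sketch assumption).
func reportingComponent(in rawEvent) string {
	if in.ReportingController != "" {
		return in.ReportingController
	}
	return in.Source.Component
}

func main() {
	modern := rawEvent{
		Source:              eventSource{Component: "kubelet"},
		ReportingController: "example.com/custom-controller",
	}
	legacy := rawEvent{Source: eventSource{Component: "kubelet"}}
	fmt.Println(reportingComponent(modern))
	fmt.Println(reportingComponent(legacy))
}
```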
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 397d103b-ab6d-4bd9-b0a6-4682ebe13958

📥 Commits

Reviewing files that changed from the base of the PR and between 6b01a1d and dfb63de.

📒 Files selected for processing (46)
  • .github/workflows/release.yml
  • AI_CONTRACT.md
  • CHANGELOG.md
  • QUICKSTART.md
  • README.md
  • README.zh-CN.md
  • SECURITY.md
  • UPGRADING.md
  • charts/kubernetes-ontology/Chart.yaml
  • charts/kubernetes-ontology/templates/rbac.yaml
  • charts/kubernetes-ontology/values.yaml
  • cmd/kubernetes-ontology/main.go
  • docs/agent-recipes.md
  • docs/articles/kubernetes-ontology-intro.zh-CN.md
  • docs/design/open-source-diagnostics-evolution-plan.md
  • docs/ontology/kubernetes-ontology.owl
  • docs/release.md
  • index.html
  • internal/api/types.go
  • internal/collect/k8s/collector.go
  • internal/collect/k8s/resources/types.go
  • internal/collect/k8s/stream.go
  • internal/graph/builder.go
  • internal/graph/builder_test.go
  • internal/model/edge.go
  • internal/model/node.go
  • internal/model/relation_spec.go
  • internal/owl/schema.go
  • internal/query/facade.go
  • internal/query/facade_test.go
  • internal/resolve/explicit/edges.go
  • internal/service/diagnostic/service.go
  • samples/README.md
  • samples/failure-modes/README.md
  • samples/failure-modes/crashloopbackoff/diagnostic-graph.json
  • samples/failure-modes/missing-config-secret/diagnostic-graph.json
  • samples/failure-modes/pending-scheduling/diagnostic-graph.json
  • samples/failure-modes/pvc-pending-csi/diagnostic-graph.json
  • samples/failure-modes/rbac-forbidden/diagnostic-graph.json
  • samples/failure-modes/service-selector-mismatch/diagnostic-graph.json
  • samples/failure-modes/webhook-admission/diagnostic-graph.json
  • schemas/diagnostic-subgraph.schema.json
  • scripts/ci/verify_helm.sh
  • skills/kubernetes-ontology-access/SKILL.md
  • tools/visualize/fixture_test.py
  • tools/visualize/index.html

Comment on lines +200 to 211

		volumeAttachments, err := c.client.StorageV1().VolumeAttachments().List(ctx, metav1.ListOptions{})
		if err != nil {
			if !apierrors.IsForbidden(err) && !apierrors.IsNotFound(err) {
				return Snapshot{}, err
			}
		} else {
			for _, item := range volumeAttachments.Items {
				out.VolumeAttachments = append(out.VolumeAttachments, resources.NormalizeVolumeAttachment(item))
			}
		}
	}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

VolumeAttachments collection incorrectly nested inside CSIDrivers success block.

VolumeAttachments will only be collected if CSIDrivers listing succeeds. These are independent cluster-scoped resources; if CSIDrivers returns Forbidden/NotFound, VolumeAttachments are silently skipped. Move this block outside the CSIDrivers else clause.

🐛 Proposed fix to move VolumeAttachments collection outside CSIDrivers block
 	csiDrivers, err := c.client.StorageV1().CSIDrivers().List(ctx, metav1.ListOptions{})
 	if err != nil {
 		if !apierrors.IsForbidden(err) && !apierrors.IsNotFound(err) {
 			return Snapshot{}, err
 		}
 	} else {
 		for _, item := range csiDrivers.Items {
 			out.CSIDrivers = append(out.CSIDrivers, resources.NormalizeCSIDriver(item))
 		}
+	}
 
-		volumeAttachments, err := c.client.StorageV1().VolumeAttachments().List(ctx, metav1.ListOptions{})
-		if err != nil {
-			if !apierrors.IsForbidden(err) && !apierrors.IsNotFound(err) {
-				return Snapshot{}, err
-			}
-		} else {
-			for _, item := range volumeAttachments.Items {
-				out.VolumeAttachments = append(out.VolumeAttachments, resources.NormalizeVolumeAttachment(item))
-			}
+	volumeAttachments, err := c.client.StorageV1().VolumeAttachments().List(ctx, metav1.ListOptions{})
+	if err != nil {
+		if !apierrors.IsForbidden(err) && !apierrors.IsNotFound(err) {
+			return Snapshot{}, err
 		}
+	} else {
+		for _, item := range volumeAttachments.Items {
+			out.VolumeAttachments = append(out.VolumeAttachments, resources.NormalizeVolumeAttachment(item))
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/collect/k8s/collector.go` around lines 200 - 211, The
VolumeAttachments listing block is currently nested inside the CSIDrivers
success else branch, causing VolumeAttachments to be skipped when CSIDrivers
list returns Forbidden/NotFound; move the entire block that declares
volumeAttachments (the call to
c.client.StorageV1().VolumeAttachments().List(ctx, metav1.ListOptions{}), the
error handling that checks apierrors.IsForbidden/IsNotFound, and the loop that
appends to out.VolumeAttachments via resources.NormalizeVolumeAttachment) out of
the CSIDrivers else clause so it runs independently (i.e., place it after the
CSIDrivers handling logic in the same function in collector.go).

Comment thread internal/graph/builder.go
Comment on lines +358 to +391
	for _, networkPolicy := range snapshot.NetworkPolicies {
		networkPolicyID := networkPolicyIDs[networkPolicy.Metadata.Namespace+"/"+networkPolicy.Metadata.Name]
		for _, pod := range snapshot.Pods {
			if pod.Metadata.Namespace != networkPolicy.Metadata.Namespace {
				continue
			}
			if selector.LabelsMatch(networkPolicy.Selector, pod.Metadata.Labels) {
				edges = append(edges, model.NewEdgeWithResolver(networkPolicyID, podIDs[pod.Metadata.Namespace+"/"+pod.Metadata.Name], model.EdgeKindAppliesToPod, "networkpolicy-selector/v1"))
			}
		}
	}

	for _, hpa := range snapshot.HPAs {
		hpaID := hpaIDs[hpa.Metadata.Namespace+"/"+hpa.Metadata.Name]
		_, targetName, ok := strings.Cut(hpa.ScaleTargetRef, "/")
		if !ok {
			targetName = hpa.ScaleTargetRef
		}
		if targetID, ok := workloadIDs[hpa.Metadata.Namespace+"/"+targetName]; ok {
			edges = append(edges, model.NewEdgeWithResolver(hpaID, targetID, model.EdgeKindScalesWorkload, "hpa-scale-target/v1"))
		}
	}

	for _, pdb := range snapshot.PodDisruptionBudgets {
		pdbID := pdbIDs[pdb.Metadata.Namespace+"/"+pdb.Metadata.Name]
		for _, pod := range snapshot.Pods {
			if pod.Metadata.Namespace != pdb.Metadata.Namespace {
				continue
			}
			if selector.LabelsMatch(pdb.Selector, pod.Metadata.Labels) {
				edges = append(edges, model.NewEdgeWithResolver(pdbID, podIDs[pod.Metadata.Namespace+"/"+pod.Metadata.Name], model.EdgeKindProtectsPod, "pdb-selector/v1"))
			}
		}
	}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect LabelsMatch semantics for empty selector
ast-grep --pattern 'func LabelsMatch($$$) bool {
  $$$
}'
# Also check how normalization represents an empty/nil NetworkPolicy/PDB selector
rg -nP --type=go -C3 '\bSelector\b' -g '**/resources/*.go'

Repository: Colvin-Y/kubernetes-ontology

Length of output: 4154


Fix empty-selector handling for NetworkPolicy/PDB so they select all pods

  • selector.LabelsMatch returns false when selector is empty (len(selector)==0) (internal/resolve/selector/service.go), so “match all” selectors yield no matches.
  • Normalization feeds empty maps for these cases (internal/collect/k8s/resources/types.go):
    • NetworkPolicy: Selector: in.Spec.PodSelector.MatchLabels
    • PDB: selector := map[string]string{} when in.Spec.Selector == nil
  • internal/graph/builder.go (lines 358-391) passes networkPolicy.Selector / pdb.Selector directly into LabelsMatch, so NetworkPolicies with {} podSelector and PDBs with omitted selector silently produce zero applies_to_pod / protects_pod edges.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/graph/builder.go` around lines 358 - 391, NetworkPolicy and PDB
selectors that are empty maps are being treated as non-matching because
selector.LabelsMatch returns false for empty selectors; update the loops in
builder.go that iterate snapshot.NetworkPolicies and
snapshot.PodDisruptionBudgets so that when networkPolicy.Selector or
pdb.Selector has length 0 they are treated as "match all" (i.e., skip calling
selector.LabelsMatch and always add the edge for every pod in the same
namespace), otherwise keep the existing selector.LabelsMatch check; refer to
networkPolicy.Selector, pdb.Selector, selector.LabelsMatch,
snapshot.NetworkPolicies, snapshot.PodDisruptionBudgets and where edges are
created with model.NewEdgeWithResolver to locate the changes.
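
The match-all handling described above can be sketched with a thin wrapper around the label check. `labelsMatch` below is a stand-in written to behave as the review describes `selector.LabelsMatch` (empty selector matches nothing), and `selectsPod` is a hypothetical helper. Two caveats apply. A selector that uses only matchExpressions also has empty MatchLabels, so this sketch would wrongly treat it as match-all. And for policy/v1 PodDisruptionBudgets the Kubernetes API reference distinguishes a null selector (matches no pods) from an empty `{}` selector (matches all pods in the namespace), so a normalization step that turns a nil selector into an empty map loses that distinction.

```go
package main

import "fmt"

// labelsMatch is a stand-in for the behavior the review describes:
// an empty selector matches nothing, otherwise every key must match.
func labelsMatch(selector, labels map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for key, want := range selector {
		got, ok := labels[key]
		if !ok || got != want {
			return false
		}
	}
	return true
}

// selectsPod treats an empty selector as "every pod in the namespace"
// and otherwise defers to labelsMatch.
func selectsPod(selector, labels map[string]string) bool {
	if len(selector) == 0 {
		return true
	}
	return labelsMatch(selector, labels)
}

func main() {
	podLabels := map[string]string{"app": "web"}
	fmt.Println(selectsPod(map[string]string{}, podLabels))
	fmt.Println(selectsPod(map[string]string{"app": "web"}, podLabels))
	fmt.Println(selectsPod(map[string]string{"app": "db"}, podLabels))
}
```

The namespace check from the original loops still has to run before this helper, since match-all applies only within the policy's own namespace.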

                self.assertIn("edges", data)
                self.assertIn("budgets", data)
                self.assertIn("rankedEvidence", data)
                self.assertFalse(data["budgets"].get("truncated", True))


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the default value to match intended behavior.

The test uses True as the default for a missing truncated field, which will cause the test to fail if truncated is absent from the JSON. Based on the summary indicating that "truncated is not set (defaults to False)", the default should be False to allow fixtures to omit this field.

🐛 Proposed fix for the default value
-                self.assertFalse(data["budgets"].get("truncated", True))
+                self.assertFalse(data["budgets"].get("truncated", False))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/visualize/fixture_test.py` at line 56, The test currently uses
data["budgets"].get("truncated", True) which makes the default True and causes
failures when "truncated" is missing; change the default to False so the
assertion becomes self.assertFalse(data["budgets"].get("truncated", False)) to
reflect that truncated is not set by default and should default to False.

Comment on lines 3051 to 3079
   function kindRank(kind) {
     const order = {
       WebhookConfig: 0,
-      Service: 1,
-      Workload: 2,
-      Pod: 3,
+      Ingress: 1,
+      Service: 2,
+      EndpointSlice: 3,
+      Workload: 4,
+      HPA: 5,
+      PodDisruptionBudget: 6,
+      Pod: 7,
       ServiceAccount: 4,
       RoleBinding: 4,
       ClusterRoleBinding: 4,
-      ConfigMap: 5,
-      Secret: 5,
-      PVC: 6,
-      PV: 7,
-      StorageClass: 8,
-      CSIDriver: 9,
-      HelmRelease: 10,
-      HelmChart: 11,
-      Node: 12,
-      Image: 13,
-      Event: 14
+      NetworkPolicy: 8,
+      ConfigMap: 9,
+      Secret: 9,
+      PVC: 10,
+      PV: 11,
+      VolumeAttachment: 12,
+      StorageClass: 13,
+      CSIDriver: 14,
+      HelmRelease: 15,
+      HelmChart: 16,
+      Node: 17,
+      Image: 18,
+      Event: 19
     };
     return order[kind] ?? 11;
   }


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix duplicate and out-of-sequence rank assignments in kindRank.

Lines 3061-3063 assign rank 4 to ServiceAccount, RoleBinding, and ClusterRoleBinding, but these definitions appear after Pod: 7 (line 3060). This creates an inconsistent sequence where ranks 4-6 are reused after rank 7. Either these lines should be removed (if the kinds should use the default rank 11), or they should be moved earlier in the order object or assigned different rank values that maintain sequence integrity.

🔧 Proposed fix: assign sequential ranks after Pod
     const order = {
       WebhookConfig: 0,
       Ingress: 1,
       Service: 2,
       EndpointSlice: 3,
       Workload: 4,
       HPA: 5,
       PodDisruptionBudget: 6,
       Pod: 7,
-      ServiceAccount: 4,
-      RoleBinding: 4,
-      ClusterRoleBinding: 4,
+      ServiceAccount: 8,
+      RoleBinding: 8,
+      ClusterRoleBinding: 8,
       NetworkPolicy: 8,
       ConfigMap: 9,
       Secret: 9,
       PVC: 10,
       PV: 11,
       VolumeAttachment: 12,
       StorageClass: 13,
       CSIDriver: 14,
       HelmRelease: 15,
       HelmChart: 16,
       Node: 17,
       Image: 18,
       Event: 19
     };

Note: NetworkPolicy at line 3064 should also be adjusted to rank 9 if identity kinds take rank 8.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/visualize/index.html` around lines 3051 - 3079, The kindRank function's
order object has duplicate/out-of-sequence rank values (ServiceAccount,
RoleBinding, ClusterRoleBinding set to 4 after Pod is 7), causing inconsistent
ordering; update the order object inside kindRank by either removing those three
identity entries if they should use the default rank, or move/reassign them to
sequential ranks after Pod (e.g., assign ServiceAccount, RoleBinding,
ClusterRoleBinding to 8/9/10 and then bump NetworkPolicy, ConfigMap, Secret,
PVC, etc. accordingly) so the numeric sequence is consistent and no ranks are
reused out of order.

@Colvin-Y Colvin-Y merged commit 3462363 into main Jun 2, 2026
10 checks passed
@Colvin-Y Colvin-Y deleted the codex/v0.1.8-diagnostics-release branch June 2, 2026 03:03