Merged
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -58,7 +58,7 @@ jobs:
-ldflags="-s -w -X ${BUILDINFO_PACKAGE}.Version=${VERSION} -X ${BUILDINFO_PACKAGE}.Commit=${COMMIT} -X ${BUILDINFO_PACKAGE}.Date=${BUILD_DATE}" \
-o "${work}/${cmd}${ext}" "./cmd/${cmd}"
done
-cp README.md README.zh-CN.md QUICKSTART.md CHANGELOG.md AI_CONTRACT.md LICENSE NOTICE SECURITY.md "${work}/"
+cp README.md README.zh-CN.md QUICKSTART.md UPGRADING.md CHANGELOG.md AI_CONTRACT.md LICENSE NOTICE SECURITY.md "${work}/"
mkdir -p "${work}/docs"
cp -R docs/assets "${work}/docs/"
mkdir -p "${work}/local"
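
Review sketch for the hunk above: since the archive step now also copies `UPGRADING.md`, a preflight check can name any missing doc before `cp` runs. This is only an illustration; `require_release_docs` is a hypothetical helper and is not part of the workflow in this PR.

```bash
# Hypothetical helper (not in release.yml): report every missing release doc
# and return 1 if any are missing, so the job fails with a clear message.
require_release_docs() {
  missing=0
  for doc in "$@"; do
    if [ ! -f "$doc" ]; then
      echo "missing release doc: $doc" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could run immediately before the copy, for example `require_release_docs README.md UPGRADING.md CHANGELOG.md && cp ...`, assuming the step runs from the repository root.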
20 changes: 19 additions & 1 deletion AI_CONTRACT.md
@@ -40,6 +40,15 @@ The response always includes:
- `PV`
- `StorageClass`
- `CSIDriver`
+- `Service`
+- `Ingress`
+- `EndpointSlice`
+- `NetworkPolicy`
+- `ServiceAccount`
+- `RoleBinding`
+- `ClusterRoleBinding`
+- `Node`
+- `VolumeAttachment`
- `HelmRelease`
- `HelmChart`

@@ -101,7 +110,8 @@ best-effort reasoning.
`recipe` and `lanes` are additive Incident Context Pack v1 metadata. `recipe`
is a product label for how the graph should be interpreted; it does not change
the underlying graph identity contract. v1 recipes are: `pod-incident`,
-`workload-incident`, `helm-ownership`, and `helm-upgrade-runtime-failure`.
+`workload-incident`, `storage-csi`, `service-routing`, `identity`,
+`node-context`, `helm-ownership`, and `helm-upgrade-runtime-failure`.

`warnings`, `partial`, `degradedSources`, `budgets`, `rankedEvidence`, and
`conflicts` are additive diagnostic metadata. Agents should read them before
@@ -232,6 +242,14 @@ The following edge kinds are intended as stable phase-1 diagnostic semantics:
- `affected_by_webhook`
- `managed_by_csi_controller`
- `served_by_csi_node_agent`
+- `attaches_pv`
+
+### Routing, policy, and scaling
+- `routes_to_service`
+- `targets_pod`
+- `applies_to_pod`
+- `scales_workload`
+- `protects_pod`

### Helm / package provenance
- `managed_by_helm_release`
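
Review sketch for the recipe vocabulary earlier in this file's diff: a consumer that wants to reject unknown `recipe` values could keep the eight v1 names in one place. The names come from the updated `AI_CONTRACT.md` text; the variable and function names are hypothetical.

```bash
# v1 recipe names as listed in AI_CONTRACT.md after this change.
KNOWN_RECIPES="pod-incident workload-incident storage-csi service-routing identity node-context helm-ownership helm-upgrade-runtime-failure"

# Hypothetical helper: returns 0 when $1 is a known v1 recipe, 1 otherwise.
is_known_recipe() {
  for known in $KNOWN_RECIPES; do
    [ "$1" = "$known" ] && return 0
  done
  return 1
}
```

Because the contract describes `recipe` as additive metadata, a client may prefer to warn on an unknown value rather than fail.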
31 changes: 31 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,36 @@
# Changelog

+## v0.1.8 - 2026-06-01
+
+### Added
+
+- Added diagnostic recipe vocabulary for `storage-csi`, `service-routing`,
+`identity`, and `node-context` across the CLI/API/viewer contract.
+- Added richer Event and Pod runtime evidence fields, including Event counts
+and timestamps plus container waiting, terminated, and restart status.
+- Added fallback Event correlation by involved object kind/name when UID
+evidence is absent.
+- Added topology coverage for Ingress, EndpointSlice, NetworkPolicy,
+HorizontalPodAutoscaler, PodDisruptionBudget, and VolumeAttachment.
+- Added offline failure-mode fixtures for CrashLoopBackOff, FailedScheduling,
+Service selector mismatch, missing ConfigMap/Secret, RBAC forbidden,
+admission webhook failure, and PVC/CSI provisioning failure.
+- Added viewer support for `focusNode` URL state and the expanded recipe and
+entry-kind vocabulary.
+- Added `UPGRADING.md` with user-facing compatibility and upgrade guidance.
+
+### Changed
+
+- Changed the Helm chart default to `rbac.readSecrets=false`; Secret topology
+collection now requires an explicit opt-in.
+- Updated release, installation, documentation, and access skill examples for
+`v0.1.8`.
+
+### Validation
+
+- Local Go test, viewer, Helm, binary, client, schema, and sample checks are
+expected before tagging.
+
## v0.1.7 - 2026-05-06

### Added
26 changes: 19 additions & 7 deletions QUICKSTART.md
@@ -62,7 +62,7 @@ the CLI at that local server.
Set the version and choose the release archive for your machine:

```bash
-export KO_VERSION=v0.1.7
+export KO_VERSION=v0.1.8
curl -LO "https://github.com/Colvin-Y/kubernetes-ontology/releases/download/${KO_VERSION}/kubernetes-ontology_${KO_VERSION}_linux_amd64.tar.gz"
tar -xzf "kubernetes-ontology_${KO_VERSION}_linux_amd64.tar.gz"
cd "kubernetes-ontology_${KO_VERSION}_linux_amd64"
@@ -158,8 +158,9 @@ path above.
Set the version and image namespace you want to use:

```bash
-export KO_VERSION=v0.1.7
+export KO_VERSION=v0.1.8
export KO_IMAGE=ghcr.io/colvin-y/kubernetes-ontology
+curl -LO "https://github.com/Colvin-Y/kubernetes-ontology/releases/download/${KO_VERSION}/kubernetes-ontology-0.1.8.tgz"
```
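
Review sketch: the added `curl` line takes the release tag from `KO_VERSION` but hard-codes `0.1.8` in the chart file name, so the two can drift at the next release. Assuming chart archives are always named `kubernetes-ontology-<version>.tgz` with the tag's leading `v` removed (an inference from the README's example archive name, not something this PR states), the name could be derived instead. `chart_archive_name` is a hypothetical helper.

```bash
# Assumption: chart archives follow kubernetes-ontology-<version>.tgz, where
# <version> is the release tag without its leading "v".
chart_archive_name() {
  # ${1#v} strips one leading "v" if present (POSIX parameter expansion).
  printf 'kubernetes-ontology-%s.tgz\n' "${1#v}"
}

# Example use (not executed here):
#   export KO_VERSION=v0.1.8
#   KO_CHART="$(chart_archive_name "${KO_VERSION}")"
#   curl -LO "https://github.com/Colvin-Y/kubernetes-ontology/releases/download/${KO_VERSION}/${KO_CHART}"
```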

Use the `KO_VERSION` value for the release tag you want to install. If you
@@ -169,7 +170,7 @@ image reference.
Install the Helm chart:

```bash
-helm upgrade --install kubernetes-ontology ./charts/kubernetes-ontology \
+helm upgrade --install kubernetes-ontology kubernetes-ontology-0.1.8.tgz \
--namespace kubernetes-ontology \
--create-namespace \
--set image.repository="${KO_IMAGE}" \
@@ -183,14 +184,14 @@ read-only RBAC. The daemon uses those in-cluster credentials only for
`get`/`list`/`watch` collection. Inside the pod, the server listens on `:18080`
rather than `0.0.0.0:18080` so Kubernetes IPv4, IPv6, and dual-stack networking
can use the wildcard listener supported by the runtime. By default the chart
-grants `secrets` `get`/`list`/`watch` permission so Secret nodes and
-`uses_secret` edges can be collected. To run without Secret collection:
+does not grant Secret reads. To include Secret nodes and `uses_secret` edges,
+opt in explicitly:

```bash
-helm upgrade --install kubernetes-ontology ./charts/kubernetes-ontology \
+helm upgrade --install kubernetes-ontology kubernetes-ontology-0.1.8.tgz \
--namespace kubernetes-ontology \
--reuse-values \
---set rbac.readSecrets=false
+--set rbac.readSecrets=true
```
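
Review sketch: because the default flips to `rbac.readSecrets=false`, it may help to confirm what the installed RBAC actually grants. One way, assuming you know the daemon's ServiceAccount name (the name passed below is a placeholder, not taken from the chart) and your own user is allowed to impersonate it, is `kubectl auth can-i` with `--as`.

```bash
# Hypothetical check: ask the API server whether a ServiceAccount may list
# Secrets in its own namespace. Prints "yes" or "no".
# $1 = namespace, $2 = ServiceAccount name (assumed; check your install).
sa_can_list_secrets() {
  namespace="$1"
  service_account="$2"
  kubectl auth can-i list secrets \
    --namespace "$namespace" \
    --as "system:serviceaccount:${namespace}:${service_account}"
}

# Example use (not executed here):
#   sa_can_list_secrets kubernetes-ontology <service-account-name>
```

With the new default the answer is expected to be `no`; after the opt-in command above it is expected to be `yes`.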

Wait for the server:
@@ -564,6 +565,9 @@ fault-diagnosis workflow and downstream AI-agent consumption.
It includes additive `schemaVersion`, `recipe`, `lanes`, `partial`,
`warnings`, `budgets`, `rankedEvidence`, `degradedSources`, and `conflicts`
fields so agents can tell bounded evidence from complete cluster truth.
+Canonical recipe hints are `pod-incident`, `workload-incident`, `storage-csi`,
+`service-routing`, `identity`, `node-context`, `helm-ownership`, and
+`helm-upgrade-runtime-failure`.
When resources carry standard Helm metadata, diagnostic graphs can also include
`HelmRelease` and `HelmChart` nodes connected by `managed_by_helm_release` and
`installs_chart` edges. These edges are label evidence with confidence scores,
@@ -590,6 +594,7 @@ the Helm output.

An offline reference fixture for this story is available at
`samples/helm-upgrade-failure/diagnostic-graph.json`.
+Additional failure-mode fixtures live under `samples/failure-modes/`.

Pod-centered diagnostic queries keep shared nodes bounded by default. For
example, a pod's `ServiceAccount` is shown, but the traversal does not continue
@@ -650,6 +655,13 @@ recipe metadata, freshness, budget truncation, warnings, conflicts, degraded
sources, and ranked evidence. Evidence and conflict entries focus the related
node or edge when the fixture or daemon response includes IDs.

+Viewer URLs can carry state for handoff between docs, issues, and agents:
+
+```text
+http://127.0.0.1:8765/?diagnostic=1&kind=Pod&namespace=default&name=my-pod&recipe=pod-incident
+http://127.0.0.1:8765/?file=samples/failure-modes/crashloopbackoff/diagnostic-graph.json&focusNode=demo-cluster/core/Pod/default/api-7c9d/pod-crash/_
+```
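
Review sketch: the first URL form above can be assembled from its parts when generating handoff links in scripts. The query parameter names are copied from the example; `viewer_diagnostic_url` is a hypothetical helper, and it does no URL encoding, so it assumes every value is already URL-safe.

```bash
# Hypothetical helper: build a diagnostic viewer URL.
# $1 = viewer base URL, $2 = kind, $3 = namespace, $4 = name, $5 = recipe.
# No percent-encoding is applied; values must already be URL-safe.
viewer_diagnostic_url() {
  printf '%s/?diagnostic=1&kind=%s&namespace=%s&name=%s&recipe=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example use (not executed here):
#   viewer_diagnostic_url http://127.0.0.1:8765 Pod default my-pod pod-incident
```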

Select a node and use `Expand 1 hop` to fetch the next layer from the daemon.
The CLI equivalent is:

40 changes: 33 additions & 7 deletions README.md
@@ -56,6 +56,14 @@ This project turns those object reads into a graph:
- `PV`
- `StorageClass`
- `CSIDriver`
+- `Service`
+- `Ingress`
+- `EndpointSlice`
+- `NetworkPolicy`
+- `ServiceAccount`
+- `RoleBinding`
+- `ClusterRoleBinding`
+- `Node`
- `HelmRelease`
- `HelmChart`

@@ -89,6 +97,10 @@ The graph can recover and correlate:
- display-only controller ownership rules for controller pods that Kubernetes
does not expose through owner references
- service selector matches
+- Ingress backend Service references
+- EndpointSlice Pod target references
+- NetworkPolicy, HPA, and PodDisruptionBudget selector/target evidence
+- VolumeAttachment to PersistentVolume evidence
- pod to node placement
- pod to Secret, ConfigMap, ServiceAccount, image, PVC, PV, StorageClass, and
CSI driver paths
@@ -224,9 +236,9 @@ There are three deployment modes:
- Helm mode installs this project's own Deployment, Service, ServiceAccount,
ConfigMap, and read-only RBAC so the daemon and viewer can run in-cluster.
That install-time footprint is expected. The granted RBAC is limited to
-`get`, `list`, and `watch` for collected resources. Secret reads are enabled
-by default so Secret nodes and `uses_secret` edges can be collected; set
-`rbac.readSecrets=false` to disable them.
+`get`, `list`, and `watch` for collected resources. Secret reads are disabled
+by default; set `rbac.readSecrets=true` only when Secret nodes and
+`uses_secret` edges are needed.

The HTTP API is intended for local or controlled environments, not public
multi-tenant exposure.
@@ -241,7 +253,7 @@ the published GHCR image. The release archive includes the server
viewer `kubernetes-ontology-viewer`.

```bash
-export KO_VERSION=v0.1.7
+export KO_VERSION=v0.1.8
curl -LO "https://github.com/Colvin-Y/kubernetes-ontology/releases/download/${KO_VERSION}/kubernetes-ontology_${KO_VERSION}_linux_amd64.tar.gz"
tar -xzf "kubernetes-ontology_${KO_VERSION}_linux_amd64.tar.gz"
cd "kubernetes-ontology_${KO_VERSION}_linux_amd64"
@@ -297,10 +309,11 @@ clusters, mirror `ghcr.io/colvin-y/kubernetes-ontology` to an internal registry
and set `KO_IMAGE` to that mirror, or use the release binary path above.

```bash
-export KO_VERSION=v0.1.7
+export KO_VERSION=v0.1.8
export KO_IMAGE=ghcr.io/colvin-y/kubernetes-ontology
+curl -LO "https://github.com/Colvin-Y/kubernetes-ontology/releases/download/${KO_VERSION}/kubernetes-ontology-0.1.8.tgz"

-helm upgrade --install kubernetes-ontology ./charts/kubernetes-ontology \
+helm upgrade --install kubernetes-ontology kubernetes-ontology-0.1.8.tgz \
--namespace kubernetes-ontology \
--create-namespace \
--set image.repository="${KO_IMAGE}" \
@@ -328,6 +341,15 @@ The Helm chart creates the project Deployment, Service, ServiceAccount,
ConfigMap, and read-only RBAC required to run in-cluster. It also deploys the
topology viewer by default:

+To include Secret nodes and `uses_secret` edges, opt in explicitly:
+
+```bash
+helm upgrade --install kubernetes-ontology kubernetes-ontology-0.1.8.tgz \
+--namespace kubernetes-ontology \
+--reuse-values \
+--set rbac.readSecrets=true
+```
+
```bash
kubectl -n kubernetes-ontology port-forward svc/kubernetes-ontology-viewer 8765:8765
```
@@ -468,6 +490,10 @@ Diagnostic responses include additive `schemaVersion`, `recipe`, `lanes`,
`partial`, `warnings`, `budgets`, `rankedEvidence`, `degradedSources`, and
`conflicts` fields. Agents should use
those fields to distinguish bounded evidence from complete cluster truth.
+Current recipe hints are `pod-incident`, `workload-incident`, `storage-csi`,
+`service-routing`, `identity`, `node-context`, `helm-ownership`, and
+`helm-upgrade-runtime-failure`. Offline failure-mode fixtures live under
+`samples/failure-modes/`.

Expand one graph node:

@@ -610,7 +636,7 @@ Tagged releases publish:
`kubernetes-ontology-viewer`, Quickstart docs, release notes, and a local
config example
- a packaged Helm chart archive, for example
-`kubernetes-ontology-0.1.7.tgz`
+`kubernetes-ontology-0.1.8.tgz`
- a multi-architecture image at
`ghcr.io/colvin-y/kubernetes-ontology:<tag>`
- SemVer aliases without the leading `v`, plus `latest`
39 changes: 34 additions & 5 deletions README.zh-CN.md
@@ -42,8 +42,8 @@ daemon 运行时不会:
- Helm 模式会安装本项目自己的 Deployment、Service、ServiceAccount、
ConfigMap 和只读 RBAC,让 daemon 和 viewer 能在集群内运行。这些是安装阶段
的预期资源。chart 授予的 RBAC 只包含对采集资源的 `get`、`list`、
-`watch` 权限;Secret 读取默认开启,以便采集 Secret 节点和 `uses_secret`
-关系;如需关闭可设置 `rbac.readSecrets=false`。
+`watch` 权限;Secret 读取默认关闭。只有需要采集 Secret 节点和
+`uses_secret` 关系时,才显式设置 `rbac.readSecrets=true`。

HTTP API 建议只暴露在本机或受控内网环境中,不要直接作为公网多租户服务使用。

@@ -53,12 +53,28 @@ HTTP API 建议只暴露在本机或受控内网环境中,不要直接作为

- `Pod`
- `Workload`
+- `PVC`
+- `PV`
+- `StorageClass`
+- `CSIDriver`
+- `Service`
+- `Ingress`
+- `EndpointSlice`
+- `NetworkPolicy`
+- `ServiceAccount`
+- `RoleBinding`
+- `ClusterRoleBinding`
+- `Node`
+- `HelmRelease`
+- `HelmChart`

可以恢复和展示的关系包括:

- `Pod -> ReplicaSet -> Deployment` 等 ownerReference 链路
- 自定义 workload 资源,例如 OpenKruise ASTS、Redis Cluster 等 CRD 对象
- Service selector 到 Pod 的匹配关系
+- Ingress 到 Service、EndpointSlice 到 Pod、NetworkPolicy 到 Pod 的路由与策略证据
+- HPA 到 Workload、PodDisruptionBudget 到 Pod、VolumeAttachment 到 PV 的关系
- Pod 到 Node、Secret、ConfigMap、ServiceAccount、Image、PVC 的关系
- PVC、PV、StorageClass、CSI Driver 的存储链路
- ServiceAccount 到 RoleBinding、ClusterRoleBinding 的证据
@@ -143,7 +159,7 @@ Skill marketplace 对外链接故意指向默认分支,这样 Agent 会拿到
`kubernetes-ontology-viewer`。

```bash
-export KO_VERSION=v0.1.7
+export KO_VERSION=v0.1.8
curl -LO "https://github.com/Colvin-Y/kubernetes-ontology/releases/download/${KO_VERSION}/kubernetes-ontology_${KO_VERSION}_linux_amd64.tar.gz"
tar -xzf "kubernetes-ontology_${KO_VERSION}_linux_amd64.tar.gz"
cd "kubernetes-ontology_${KO_VERSION}_linux_amd64"
@@ -205,10 +221,11 @@ streamMode: informer
`KO_IMAGE` 改成内部地址,或者使用上面的二进制方式。

```bash
-export KO_VERSION=v0.1.7
+export KO_VERSION=v0.1.8
export KO_IMAGE=ghcr.io/colvin-y/kubernetes-ontology
+curl -LO "https://github.com/Colvin-Y/kubernetes-ontology/releases/download/${KO_VERSION}/kubernetes-ontology-0.1.8.tgz"

-helm upgrade --install kubernetes-ontology ./charts/kubernetes-ontology \
+helm upgrade --install kubernetes-ontology kubernetes-ontology-0.1.8.tgz \
--namespace kubernetes-ontology \
--create-namespace \
--set image.repository="${KO_IMAGE}" \
@@ -231,6 +248,15 @@ release tag,然后查询状态:
kubernetes-ontology --server "http://127.0.0.1:18080" --status
```

+如果需要采集 Secret 节点和 `uses_secret` 边,显式开启:
+
+```bash
+helm upgrade --install kubernetes-ontology kubernetes-ontology-0.1.8.tgz \
+--namespace kubernetes-ontology \
+--reuse-values \
+--set rbac.readSecrets=true
+```
+
短期试用结束后,用 `Ctrl-C` 停止 `kubectl port-forward`;如果不再长期使用,
可以卸载集群内资源:

@@ -353,6 +379,9 @@ OpenKruise,这是正常情况,不需要为此中断启动。
诊断返回会额外包含 `partial`、`warnings`、`budgets`、
`rankedEvidence`、`degradedSources` 和 `conflicts`。Agent 应优先读取
这些字段,区分“有边界的证据图”和“完整集群事实”。
+当前 recipe 包括 `pod-incident`、`workload-incident`、`storage-csi`、
+`service-routing`、`identity`、`node-context`、`helm-ownership` 和
+`helm-upgrade-runtime-failure`。离线故障样例位于 `samples/failure-modes/`。

展开一个图节点:

5 changes: 3 additions & 2 deletions SECURITY.md
@@ -41,8 +41,9 @@ Expected behavior:
- The daemon and viewer should be exposed only on localhost or controlled
internal networks unless an operator adds external protection.
- The HTTP API has no built-in authentication or TLS yet.
-- Secret reads are used only to model Secret nodes and `uses_secret` edges.
-Disable them with `rbac.readSecrets=false` when that evidence is not needed.
+- Secret reads are disabled by default in the Helm chart. If enabled with
+`rbac.readSecrets=true`, they are used only to model Secret nodes and
+`uses_secret` edges.

Out of scope for the current MVP:
