
Add CUDN pod churn and CUDN churn periodic tests for 5.0#80485

Open
mohit-sheth wants to merge 1 commit into
openshift:mainfrom
mohit-sheth:cudn-churn-tests


@mohit-sheth mohit-sheth commented Jun 12, 2026

Member

Summary

Test plan

  • pj-rehearse both new jobs
  • Revert e2e-benchmarking fork URL after upstream kube-burner-ocp PR merges

Summary by CodeRabbit

This PR adds two new periodic test jobs to the OpenShift perfscale CI infrastructure for validating CUDN (Cluster User Defined Network) churn support in kube-burner-ocp on AWS 5.0 nightly testing.

New Periodic Test Jobs:
Two 24-node test jobs are added to validate CUDN churn scenarios:

  • cudn-pod-churn-250-24nodes (runs daily at 6 AM UTC): Tests pod churn behavior with 50% churn rate, 30-minute test duration, and 1-minute delay between churn operations
  • cudn-churn-250-24nodes (runs daily at 8 AM UTC): Tests CUDN group churn with 10% churn rate across 3 cycles over a 10-minute duration
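As a rough worked example of what these parameters imply, the sketch below computes how many of the 250 iterations each churn cycle would touch. It assumes the churn percentage applies to the job's iteration count and that each cycle is followed by the configured delay; both are assumptions about kube-burner-ocp behavior, not statements from this PR, and the function names are illustrative.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Iterations touched in one churn cycle, assuming the churn percentage
# applies to the job's iteration count (an assumption, not from the PR).
churned_per_cycle() {
  local iterations=$1 percent=$2
  echo $(( iterations * percent / 100 ))
}

# Upper bound on churn cycles that fit in a duration when each cycle is
# followed by a fixed delay; the time a cycle itself takes is ignored.
max_cycles() {
  local duration_min=$1 delay_min=$2
  echo $(( duration_min / delay_min ))
}

echo "pod churn:  $(churned_per_cycle 250 50) iterations per cycle, at most $(max_cycles 30 1) cycles"
echo "CUDN churn: $(churned_per_cycle 250 10) iterations per cycle, 3 cycles"
```

The cycle count for the pod churn job is only an upper bound, since deleting and recreating objects takes time that this estimate leaves out.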

Both jobs execute on the aws-perfscale-qe cluster with 21 additional worker nodes, using m6a.2xlarge compute/control-plane instances and r5.2xlarge infra instances. They run the existing openshift-qe-cudn-density test chain with 250 iterations configured per job.

Temporary Fork Reference:
The CUDN density test step script (openshift-qe-cudn-density-commands.sh) is temporarily updated to use a fork of e2e-benchmarking. Instead of cloning from the upstream cloud-bulldozer/e2e-benchmarking repository with dynamic tag selection, the script now clones from mohit-sheth/e2e-benchmarking on the use-fork-kube-burner-ocp branch. This enables testing against in-flight kube-burner-ocp enhancements before they merge upstream. The fork reference is intended to be reverted once the upstream kube-burner-ocp PR (referenced as #458) is merged.
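The switch between the upstream repository and the temporary fork can be sketched as follows. This is a minimal illustration, not the contents of the actual step script: the `clone_command` helper, its `use_fork` argument, and the `EXAMPLE_TAG` placeholder are hypothetical, and the real script resolves the upstream tag dynamically rather than taking it as an argument.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the git clone command for e2e-benchmarking.
# The use_fork switch and the EXAMPLE_TAG placeholder are illustrative only.
clone_command() {
  local use_fork=${1:-false} upstream_tag=${2:-EXAMPLE_TAG}
  local repo_url tag_option
  if [ "$use_fork" = "true" ]; then
    # Temporary override described in this PR.
    repo_url="https://github.com/mohit-sheth/e2e-benchmarking"
    tag_option="--branch use-fork-kube-burner-ocp"
  else
    # Upstream repository with a tag chosen elsewhere.
    repo_url="https://github.com/cloud-bulldozer/e2e-benchmarking"
    tag_option="--branch ${upstream_tag}"
  fi
  echo "git clone ${repo_url} ${tag_option} --depth 1"
}

clone_command true
clone_command false
```

Printing the command instead of running it keeps the sketch free of network access; `git clone --branch` accepts both branch and tag names, which is why one option can cover both cases.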

Testing Plan:
The PR uses pj-rehearse validation for both new test jobs to ensure they function correctly with the forked kube-burner-ocp changes.

Add two new 24-node periodic tests to validate kube-burner-ocp
CUDN churn support (kube-burner/kube-burner-ocp#458):
- cudn-pod-churn-250-24nodes: 50% pod churn with 30m duration
- cudn-churn-250-24nodes: 10% CUDN group churn with 3 cycles

Temporarily points e2e-benchmarking to mohit-sheth fork with
use-fork-kube-burner-ocp branch for pj-rehearse validation.

Signed-off-by: Mohit Sheth <msheth@redhat.com>
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Contributor

Walkthrough

This PR adds two new CUDN churn performance test jobs to the OpenShift perfscale CI suite, updating the benchmarking script to use a forked e2e-benchmarking repository pinned to a specific branch, and configuring the new test workloads to run pod and CUDN churn scenarios on 24-node AWS clusters with 250-iteration overrides.

Changes

CUDN Churn Test Jobs and Script Configuration

Layer / File(s) Summary
CUDN Density Script Repository and Build Configuration
ci-operator/step-registry/openshift-qe/cudn-density/openshift-qe-cudn-density-commands.sh
The script now clones e2e-benchmarking from https://github.com/mohit-sheth/e2e-benchmarking and pins the build to the use-fork-kube-burner-ocp branch, replacing the previous dynamic tag-selection logic.
CUDN Churn Test Job Definitions
ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__aws-5.0-nightly-x86.yaml
Adds cudn-pod-churn-250-24nodes job targeting pod churn with EXTRA_FLAGS for duration, percent, and delay; adds cudn-churn-250-24nodes job targeting CUDN churn with different parameters. Both run 250 iterations on 24-node aws-perfscale-qe clusters with m6a.2xlarge compute and r5.2xlarge infra nodes.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Suggested labels

lgtm, rehearsals-ack


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

Check name Status Explanation Resolution
No-Sensitive-Data-In-Logs ❌ Error openshift-qe-cudn-density-commands.sh enables xtrace (set -x) and sets ES_SERVER with $ES_PASSWORD in the URL, so the password/host could be printed in logs. Disable xtrace (set +x) before reading ES_PASSWORD/building ES_SERVER, or avoid embedding credentials in URLs (use headers/safer methods) and ensure secrets aren’t printed.
Ipv6 And Disconnected Network Test Compatibility ⚠️ Warning New periodic CUDN jobs run openshift-qe-cudn-density, whose commands.sh git-clones from github.com and builds an external AWS ES URL—requires internet and risks IPv6 URL issues. Add the IPv6 & disconnected network compatibility notice; this job needs external internet (github.com clone + public AWS ES), so verify on disconnected/IPv6 or skip with [Skipped:Disconnected].
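One way to address the No-Sensitive-Data-In-Logs finding is to switch xtrace off while the credentialed URL is assembled and restore it afterwards. The sketch below illustrates that pattern only; `ES_PASSWORD` and `ES_SERVER` come from the check output, while `ES_USERNAME`, `ES_HOST`, the `build_es_server` helper, and the demo values are assumptions made for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a credentialed URL without letting xtrace echo the secret.
# ES_USERNAME and ES_HOST are placeholder names assumed for this sketch.
build_es_server() {
  local restore_xtrace=false
  # Remember whether tracing was on so it can be restored afterwards.
  case $- in *x*) restore_xtrace=true ;; esac
  set +x   # stop tracing before the secret is expanded
  ES_SERVER="https://${ES_USERNAME}:${ES_PASSWORD}@${ES_HOST}"
  export ES_SERVER
  if [ "$restore_xtrace" = true ]; then set -x; fi
}

# Demo values only; a real job would read these from mounted secrets.
ES_USERNAME=demo-user ES_PASSWORD=demo-pass ES_HOST=es.example.com
build_es_server
echo "ES_SERVER is set (${#ES_SERVER} characters)"
```

Any later command that expands `ES_SERVER` while tracing is on would still print the secret, so the same guard is needed wherever the variable is used.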
✅ Passed checks (13 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding two new CUDN churn-related periodic tests for OpenShift 5.0.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed In the PR’s changed files, no Ginkgo title calls (It/Describe/Context/When) were found; the new as: job names are fixed static strings.
Test Structure And Quality ✅ Passed PR changes only perfscale CI YAML and a step-registry shell script; no Ginkgo/Go test code is present to review, so the quality requirements are not applicable.
Microshift Test Compatibility ✅ Passed The PR only adds perfscale periodic job configs (YAML) and updates a shell script; no new Go/Ginkgo e2e tests or MicroShift-API references are introduced in these files.
Single Node Openshift (Sno) Test Compatibility ✅ Passed PR only adds AWS perfscale periodic jobs and a kube-burner wrapper script; modified files contain no new Ginkgo (It/Describe/Context) tests or SNO skip guards to evaluate.
Topology-Aware Scheduling Compatibility ✅ Passed PR adds two perfscale CUDN CI jobs (EXTRA_FLAGS for churn) and tweaks cudn-density step-registry script to clone a fork/branch; no added manifests/operators/controllers introduce topology-unaware s...
Ote Binary Stdout Contract ✅ Passed PR changes only a CI YAML and a bash step script; no Go OTE binary code (main/init/TestMain) with stdout writes is introduced.
No-Weak-Crypto ✅ Passed PR changes only CUDN periodic YAML and e2e-benchmarking clone/tag logic; scans of both files found no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB or secret non-constant-time comparisons.
Container-Privileges ✅ Passed PR adds CUDN periodic CI jobs and updates kube-burner-ocp script clone URL; diff contains no privileged/hostPID/hostNetwork/hostIPC/SYS_ADMIN/allowPrivilegeEscalation/runAsUser/root/securityContext...

@openshift-ci

openshift-ci Bot commented Jun 12, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mohit-sheth
Once this PR has been reviewed and has the lgtm label, please assign jtaleric for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from krishvoor and rpattath June 12, 2026 22:01
@mohit-sheth

Member Author

/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-pod-churn-250-24nodes
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-churn-250-24nodes

@openshift-merge-bot

Contributor

@mohit-sheth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot

Contributor

[REHEARSALNOTIFIER]
@mohit-sheth: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

Test name Repo Type Reason
periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-pod-churn-250-24nodes N/A periodic Periodic changed
periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-churn-250-24nodes N/A periodic Periodic changed
periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-density-single-ns-250-24nodes N/A periodic Registry content changed
periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.23-nightly-x86-cudn-density-multi-ns-500-24nodes N/A periodic Registry content changed
periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.23-nightly-x86-cudn-density-single-ns-250-24nodes N/A periodic Registry content changed
periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.22-nightly-x86-cudn-density-multi-ns-500-24nodes N/A periodic Registry content changed
periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.22-nightly-x86-cudn-density-single-ns-250-24nodes N/A periodic Registry content changed
periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-density-multi-ns-500-24nodes N/A periodic Registry content changed

Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@openshift-merge-bot

Contributor

@mohit-sheth: requesting more than one rehearsal in one comment is not supported. If you would like to rehearse multiple specific jobs, please separate the job names by a space in a single command.
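The bot's message means both job names belong on a single /pj-rehearse line. A small sketch of composing that comment follows; the `rehearse_comment` helper and the `prefix` variable are illustrative names, not part of any tooling referenced in this PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Join job names into the single-line comment the bot message describes.
# "$*" expands the arguments separated by spaces (the default IFS).
rehearse_comment() {
  echo "/pj-rehearse $*"
}

prefix="periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86"
rehearse_comment "${prefix}-cudn-pod-churn-250-24nodes" "${prefix}-cudn-churn-250-24nodes"
```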

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/openshift-qe/cudn-density/openshift-qe-cudn-density-commands.sh`:
- Around line 38-39: Hardcoded temporary fork settings (REPO_URL and TAG_OPTION)
lack a revert tracker; update the file to add a TODO comment adjacent to the
REPO_URL and TAG_OPTION declarations that includes the upstream PR/issue URL (or
CI ticket) and a short "revert when merged" note, or reference a scheduled
task/milestone to perform the revert—mention the exact upstream PR number/link
and the intended revert action so future maintainers can find and remove the
fork override.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 532e70f5-63f4-4e29-97ac-bb105a1eb576

📥 Commits

Reviewing files that changed from the base of the PR and between 85bbd08 and cc8a5ad.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (2)
  • ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__aws-5.0-nightly-x86.yaml
  • ci-operator/step-registry/openshift-qe/cudn-density/openshift-qe-cudn-density-commands.sh

Comment on lines +38 to +39
REPO_URL="https://github.com/mohit-sheth/e2e-benchmarking";
TAG_OPTION="--branch use-fork-kube-burner-ocp";

Contributor

📐 Maintainability & Code Quality | 🟠 Major | ⚖️ Poor tradeoff

Temporary fork URL lacks tracking mechanism for revert.

Lines 38-39 hardcode the fork URL and branch as explicitly temporary (per PR description: "Revert e2e-benchmarking fork URL after upstream kube-burner-ocp PR merges"). Without a TODO comment, issue reference, or other tracking, this revert could be forgotten, leaving the repository diverged from upstream.

Recommendation: Add a TODO comment with the upstream PR reference or GitHub issue link to track the revert, or ensure the revert is tied to a scheduled task/milestone.
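Beyond a TODO comment, the revert could also be tracked mechanically. The sketch below is a hypothetical guard, not something proposed in this PR: it reports whether a script still assigns `REPO_URL` to the personal fork, so a reviewer or a periodic check could notice a forgotten override. The `has_fork_override` name and the idea of running it as a check are assumptions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical guard: succeed (status 0) when the script still assigns
# REPO_URL to the personal fork, fail (status 1) otherwise.
has_fork_override() {
  local script=$1
  grep -Eq '^[[:space:]]*REPO_URL="https://github\.com/mohit-sheth/' "$script"
}

# Default path is the step script named in this review; pass another as $1.
script="${1:-ci-operator/step-registry/openshift-qe/cudn-density/openshift-qe-cudn-density-commands.sh}"
if [ -f "$script" ] && has_fork_override "$script"; then
  echo "fork override still present in $script"
else
  echo "no fork override found"
fi
```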

📝 Suggested fix to add tracking
+# TODO: Revert to cloud-bulldozer after upstream kube-burner-ocp PR merges
+# See: https://github.com/kube-burner/kube-burner-ocp/pull/458
 REPO_URL="https://github.com/mohit-sheth/e2e-benchmarking";
 TAG_OPTION="--branch use-fork-kube-burner-ocp";
@mohit-sheth

Member Author

/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-churn-250-24nodes

@openshift-merge-bot

Contributor

@mohit-sheth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci

openshift-ci Bot commented Jun 13, 2026

Contributor

@mohit-sheth: all tests passed!

