update helm chart to use preferred backends for gke #1681

Open
rlakhtakia wants to merge 1 commit into llm-d:main from rlakhtakia:feature/preferred-backends-ha

@rlakhtakia (Contributor) commented Jun 17, 2026

What type of PR is this?
/kind feature

What this PR does / why we need it:
Enable the Helm chart to deploy multiple EPP Services and GCPBackendPolicies for GKE preferred backends.

Release note (write NONE if no user-facing change):

Enable the Helm chart to deploy EPP with GKE preferred backends.

Test Results

Active-Passive HA (preferredReplicas: 1, defaultReplicas: 1)

| Test Scenario | Traffic Profile | Succeeded / Total | Median Latency (p50) | Notes |
|---|---|---|---|---|
| Baseline Routing | 20 sequential requests | 20 / 20 (0 dropped) | 0.092 s | Normal routing to active Preferred Pod 0. |
| Live Failover | Continuous stream during `scale deploy ... --replicas=0` | 24 / 25 (1 timed out) | 0.092 s | 1 timeout (HTTP 504) during ~3s SIGTERM window; 100% instant failover to warm standby pod once marked unhealthy. |
| Graceful Failback | Continuous stream during scale up to 1 replica | 25 / 25 (0 dropped) | 0.092 s | 0 dropped connections. Traffic continues hitting standby pod until GCP probers mark recovered primary pod healthy. |
| Load Spillover | 100 concurrent requests (~40 req/s vs `maxRate: 5`) | 99 / 100 (1 dropped) | 0.100 s | Excess burst traffic spills over natively in Envoy memory to standby tier without dropped connections. |
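
The Succeeded / Total and p50 columns could be derived from raw per-request samples roughly as follows. This is a minimal sketch, assuming each sample is an (HTTP status, latency in seconds) pair and that p50 is taken over successful requests only. The PR does not say how the numbers were actually collected, `summarize` is a hypothetical helper, and the sample values are synthetic placeholders, not the measurements reported here.

```python
from statistics import median


def summarize(samples):
    """Tally one test run.

    samples: list of (http_status, latency_seconds) tuples (assumed shape).
    Returns succeeded count, total count, and median latency of successes.
    """
    ok_latencies = [lat for status, lat in samples if status == 200]
    return {
        "succeeded": len(ok_latencies),
        "total": len(samples),
        # p50 over successful requests only (an assumption, see lead-in)
        "p50": median(ok_latencies) if ok_latencies else None,
    }


# Synthetic placeholder data: three successes and one HTTP 504 timeout.
samples = [(200, 0.09), (200, 0.10), (504, 3.0), (200, 0.11)]
print(summarize(samples))
```

Excluding failed requests from the median keeps a single multi-second timeout from skewing the latency figure, which would be consistent with a p50 that stays flat across a run containing one 504.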

Active-Active HA (preferredReplicas: 2, defaultReplicas: 2)

| Test Scenario | Traffic Profile | Succeeded / Total | Median Latency (p50) | Notes |
|---|---|---|---|---|
| Active-Active Baseline | 50 concurrent requests across 10 threads | 50 / 50 (0 dropped) | 0.104 s | Traffic actively balances across both leader replicas (...-6lg2r and ...-ngm7w). |
| Scenario A: 1 Active Pod Down | Stream during `scale deploy ... --replicas=1` | 25 / 25 (0 dropped) | 0.104 s | When 1 active leader pod crashes or scales down, the remaining active pod takes over 100% of Preferred traffic without dropping connections. |
| Scenario B: Both Actives Down | Stream during `scale deploy ... --replicas=0` | 25 / 25 (0 dropped) | 0.100 s | Once the last active Preferred leader pod terminates, 100% of incoming traffic shifts seamlessly to the warm standby backup pod. |
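
The failover and spillover behavior in these results can be pictured with a small model. This is a simplified sketch, not GKE's actual load-balancing algorithm: `pick_tier`, its parameters, and the choice to treat `max_rate` as the total capacity of the preferred tier are all assumptions made for illustration.

```python
def pick_tier(preferred_healthy, default_healthy, current_rate=0.0, max_rate=None):
    """Return the tier that would serve the next request in this toy model.

    preferred_healthy / default_healthy: number of healthy pods per tier.
    current_rate: observed request rate in req/s (hypothetical input).
    max_rate: assumed total capacity of the preferred tier; None means no cap.
    """
    if preferred_healthy > 0:
        under_capacity = max_rate is None or current_rate < max_rate
        if under_capacity or default_healthy == 0:
            # Normal routing, or nowhere to spill over to.
            return "preferred"
        # Preferred tier is saturated: excess traffic spills to standby.
        return "default"
    if default_healthy > 0:
        # All preferred pods are down: fail over to the warm standby tier.
        return "default"
    return None  # no healthy backend in either tier


print(pick_tier(1, 1))                                 # baseline routing
print(pick_tier(0, 1))                                 # failover
print(pick_tier(1, 1, current_rate=40, max_rate=5))    # load spillover
```

In this model, Scenario A (one of two preferred pods down) still returns `"preferred"` because at least one preferred pod remains healthy, while Scenario B returns `"default"`.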

@rlakhtakia requested a review from a team as a code owner June 17, 2026 20:54
@rlakhtakia requested review from liu-cong and vMaroon June 17, 2026 20:54
@github-actions bot added the size/L (denotes a PR that changes 100-499 lines, ignoring generated files) and kind/feature (categorizes issue or PR as related to a new feature) labels Jun 17, 2026
@rlakhtakia force-pushed the feature/preferred-backends-ha branch from 2acf74b to 9c46c3d June 18, 2026 00:04
@github-actions bot added the kind/feature label and removed the kind/feature label Jun 18, 2026
epp:
  replicas: 1
  enablePreferredBackends: false
  preferredReplicas: 1

Member

Can you add a comment explaining what preferred and default replicas mean, and what it means to set a value greater than 1?

9003 by default, so an explicit HealthCheckPolicy is only needed for other ports. */}}
{{- if ne $eppHealthPort 9003 }}
---
kind: HealthCheckPolicy

Member

Shall we update the HealthCheckPolicy to also loop over the additional backup services?

@rlakhtakia force-pushed the feature/preferred-backends-ha branch from 9c46c3d to 763a600 June 22, 2026 22:57
@github-actions bot added the kind/feature label and removed the kind/feature label Jun 23, 2026
Signed-off-by: Radhika Lakhtakia <rlakhtakia@google.com>
@rlakhtakia force-pushed the feature/preferred-backends-ha branch from 763a600 to e7d83f0 June 23, 2026 08:21