pimd: fix BSR_PENDING timer being overwritten by BS liveness timer (backport #22460) #22463

Merged
Jafaral merged 1 commit into stable/10.7 from mergify/bp/stable/10.7/pr-22460
Jun 24, 2026
@mergify mergify Bot commented Jun 24, 2026

This addresses the likely root cause of the flaky test pim_cand_rp_bsr.test_pim_bsr_priority_modify, which had a 3.4% to 11% failure rate in the weekly topotest report.

The bs_timer is used for two different purposes:

  1. During BSR_PENDING state: a ~5 second timer before becoming BSR_ELECTED (callback: pim_cand_bsr_pending_expire)
  2. During other states: a 130 second BS liveness timer (callback: pim_on_bs_timer)

When a candidate BSR is in BSR_PENDING state and receives a BSM from another BSR, the pim_bs_timer_restart() call would overwrite the pending timer with the liveness timer. This prevented the BSR_PENDING timer from ever expiring, causing the candidate BSR to never become elected even when it had higher priority. Fix by skipping pim_bs_timer_restart() when in BSR_PENDING state.
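
To make the overwrite concrete, here is a minimal, self-contained sketch of one timer slot shared by two callbacks. It is not the pimd code: `struct scope`, `timer_arm()`, `timer_fire()`, the `OTHER_STATE` value, the `bsr_expired` flag, and the two `bs_timer_restart_*` helpers are hypothetical stand-ins, and the real timer API is different. Only the state names, the roughly 5 second and 130 second durations, and the idea of skipping the restart in BSR_PENDING come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; only BSR_PENDING and BSR_ELECTED are named in the PR. */
enum bsr_state { OTHER_STATE, BSR_PENDING, BSR_ELECTED };

struct scope;
typedef void (*timer_cb)(struct scope *);

struct scope {
	enum bsr_state state;
	timer_cb bs_timer_cb; /* single slot shared by both timers */
	int bs_timer_secs;
	bool bsr_expired;     /* hypothetical marker set by the liveness timer */
};

/* Stand-in for the pending-timer callback: the candidate becomes elected. */
static void pending_expire(struct scope *s)
{
	s->state = BSR_ELECTED;
}

/* Stand-in for the liveness callback: the current BSR is considered gone. */
static void liveness_expire(struct scope *s)
{
	s->bsr_expired = true;
}

/* Arming the slot replaces whatever callback was there before. */
static void timer_arm(struct scope *s, timer_cb cb, int secs)
{
	s->bs_timer_cb = cb;
	s->bs_timer_secs = secs;
}

/* Simulate expiry: run the stored callback once and clear the slot. */
static void timer_fire(struct scope *s)
{
	timer_cb cb = s->bs_timer_cb;

	s->bs_timer_cb = NULL;
	if (cb)
		cb(s);
}

/* Old behaviour: always re-arm the liveness timer when a BSM arrives. */
static void bs_timer_restart_unguarded(struct scope *s)
{
	timer_arm(s, liveness_expire, 130);
}

/* Behaviour after the fix: leave the pending timer alone in BSR_PENDING. */
static void bs_timer_restart_guarded(struct scope *s)
{
	if (s->state == BSR_PENDING)
		return;
	timer_arm(s, liveness_expire, 130);
}
```

In this model, arming the pending timer and then calling the unguarded restart leaves `liveness_expire` in the slot, so firing the timer never moves the scope to BSR_ELECTED. With the guarded restart the pending callback survives and the election completes.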

Additionally, when pim_bsm_update() drops a router out of BSR_PENDING state (due to receiving a BSM from a different BSR), the bs_timer with the pim_cand_bsr_pending_expire callback was not cancelled. This could cause an assertion failure when the timer fired with state != BSR_PENDING. Fix by cancelling the bs_timer in pim_bsm_update() before leaving BSR_PENDING state.
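
The stale-timer problem can be sketched in the same spirit. This is an illustrative model under assumptions, not the actual pimd functions: `struct scope`, its two boolean flags, `bsm_update_buggy()`, `bsm_update_fixed()`, and `run_timers()` are hypothetical names, and the callback returns `false` where the real one would trip its assertion. The transition to ACCEPT_PREFERRED and the re-arming of the liveness timer follow the commit message for this change.

```c
#include <assert.h>
#include <stdbool.h>

enum bsr_state { ACCEPT_PREFERRED, BSR_PENDING, BSR_ELECTED };

/* Hypothetical model: flags stand in for scheduled timer events. */
struct scope {
	enum bsr_state state;
	bool pending_timer_armed;
	bool liveness_timer_armed;
};

/*
 * Stand-in for the pending-timer callback. It returns false in the
 * situation where the real callback would hit its state assertion.
 */
static bool pending_expire(struct scope *s)
{
	s->pending_timer_armed = false;
	if (s->state != BSR_PENDING)
		return false;
	s->state = BSR_ELECTED;
	return true;
}

/* Old behaviour: leave BSR_PENDING but keep the pending timer armed. */
static void bsm_update_buggy(struct scope *s)
{
	if (s->state == BSR_PENDING)
		s->state = ACCEPT_PREFERRED;
}

/* Behaviour after the fix: cancel first, transition, then arm liveness. */
static void bsm_update_fixed(struct scope *s)
{
	if (s->state == BSR_PENDING) {
		s->pending_timer_armed = false;
		s->state = ACCEPT_PREFERRED;
		s->liveness_timer_armed = true;
	}
}

/* Fire the pending timer if armed; true means no precondition was violated. */
static bool run_timers(struct scope *s)
{
	if (s->pending_timer_armed)
		return pending_expire(s);
	return true;
}
```

Calling `bsm_update_buggy()` on a scope in BSR_PENDING and then `run_timers()` reports the violated precondition. The fixed variant leaves nothing armed that expects BSR_PENDING.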

This also fixes a related issue where pim_cand_bsr_apply() would return early if the address selection hadn't changed, preventing priority-only changes from triggering the BSR state machine re-evaluation. Fix by always calling pim_cand_bsr_trigger() regardless of address change.
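
A rough sketch of the early-return problem, assuming a simplified candidate-BSR record: `struct cand_bsr`, its fields, and the `cand_bsr_*` functions are hypothetical and only mirror the shape of the logic described above, with a counter standing in for the state machine re-evaluation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified candidate-BSR configuration record. */
struct cand_bsr {
	uint32_t current_addr; /* currently selected candidate address */
	uint8_t priority;      /* configured candidate priority */
	int trigger_count;     /* how often re-evaluation was requested */
};

/* Stand-in for kicking the BSR state machine. */
static void cand_bsr_trigger(struct cand_bsr *c)
{
	c->trigger_count++;
}

/* Old shape: an unchanged address returns before the trigger. */
static void cand_bsr_apply_buggy(struct cand_bsr *c, uint32_t new_addr,
				 uint8_t new_prio)
{
	c->priority = new_prio;
	if (new_addr == c->current_addr)
		return; /* priority-only change is silently dropped here */
	c->current_addr = new_addr;
	cand_bsr_trigger(c);
}

/* Shape after the fix: always trigger, whether or not the address moved. */
static void cand_bsr_apply_fixed(struct cand_bsr *c, uint32_t new_addr,
				 uint8_t new_prio)
{
	c->priority = new_prio;
	c->current_addr = new_addr;
	cand_bsr_trigger(c);
}
```

With the same address and a new priority, the buggy variant stores the priority but never requests a re-evaluation, which matches the symptom seen in the priority-modify test.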


This is an automatic backport of pull request #22460 done by Mergify.

This addresses the likely root cause of the flaky test
pim_cand_rp_bsr.test_pim_bsr_priority_modify, which had a 3.4% to 11%
failure rate in the weekly topotest report.

The bs_timer is used for two different purposes:
1. During BSR_PENDING state: a ~5 second timer before becoming BSR_ELECTED
   (callback: pim_cand_bsr_pending_expire)
2. During other states: a 130 second BS liveness timer
   (callback: pim_on_bs_timer)

When a candidate BSR is in BSR_PENDING state and receives a BSM from
another BSR, the pim_bs_timer_restart() call would overwrite the pending
timer with the liveness timer. This prevented the BSR_PENDING timer from
ever expiring, causing the candidate BSR to never become elected even
when it had higher priority. Fix by skipping pim_bs_timer_restart() when
in BSR_PENDING state.

Additionally, when pim_bsm_update() drops a router out of BSR_PENDING
state (due to receiving a BSM from a different BSR), the bs_timer with
the pim_cand_bsr_pending_expire callback was not cancelled. This could
cause an assertion failure when the timer fired with state != BSR_PENDING.
Fix by cancelling the bs_timer in pim_bsm_update() before leaving
BSR_PENDING state, and by starting the BS liveness timer after
transitioning to ACCEPT_PREFERRED so that BSR expiry detection keeps
working.

This also fixes a related issue where pim_cand_bsr_apply() would return
early if the address selection hadn't changed, preventing priority-only
changes from triggering the BSR state machine re-evaluation. Fix by
always calling pim_cand_bsr_trigger() regardless of address change,
with a guard that handles the case where an operator lowers the BSR
priority while in BSR_PENDING or BSR_ELECTED state. For BSR_PENDING,
cancel the pending timer and transition to ACCEPT_PREFERRED to avoid
assertion failures in pim_cand_bsr_pending_expire.

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
(cherry picked from commit 89b31a4)
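
The BSR_PENDING half of the priority-lowering guard from the commit message might look roughly like the sketch below. Everything here is an assumption for illustration: `struct scope`, its fields, and `on_priority_change()` are hypothetical, and "lowered" is taken to mean lower than the previously configured value, which the commit message does not spell out. The BSR_ELECTED case is left out because the text does not describe what the guard does there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum bsr_state { ACCEPT_PREFERRED, BSR_PENDING, BSR_ELECTED };

/* Hypothetical model of the per-scope candidate state. */
struct scope {
	enum bsr_state state;
	uint8_t cand_priority;
	bool pending_timer_armed;
};

/*
 * Assumed guard: if the operator lowers the priority while the scope is
 * in BSR_PENDING, cancel the pending timer and fall back to
 * ACCEPT_PREFERRED so the pending callback can never run in the wrong
 * state.
 */
static void on_priority_change(struct scope *s, uint8_t new_prio)
{
	bool lowered = new_prio < s->cand_priority;

	s->cand_priority = new_prio;
	if (lowered && s->state == BSR_PENDING) {
		s->pending_timer_armed = false;
		s->state = ACCEPT_PREFERRED;
	}
	/* BSR_ELECTED handling is intentionally not modelled here. */
}
```

In this model, raising the priority while pending leaves the timer and state untouched, while lowering it cancels the timer before the state changes.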

greptile-apps Bot commented Jun 24, 2026

Target branch is not in the allowed branches list.

@Jafaral Jafaral merged commit f2c0e02 into stable/10.7 Jun 24, 2026
16 checks passed