Align LLBP-X provider and enable in kmhv3 #919

Open
CaoJiaming776 wants to merge 2 commits into xs-dev from llbpx-provider-pb-align

CaoJiaming776 (Member) commented Jun 24, 2026

This branch adds an experimental BTB-side LLBP-X direction override component to the XiangShan kmhv3 branch predictor flow, and enables it by default in kmhv3.py.

The implementation is inspired by the upstream LLBP-X project:

  • Source repository: https://github.com/dhschall/LLBP-X
  • Relevant upstream concepts: LLBP/LLBP-X provider arbitration, Recent Control Records, Context/Pattern prediction, and Pattern Buffer timing.

This is not a direct patch import. The upstream gem5 integration targets a different predictor structure, while this repository uses the DecoupledBPUWithBTB component chain. This branch ports the main LLBP-X ideas into the local BTB predictor framework.

Summary by CodeRabbit

  • New Features

    • Added an optional experimental branch-prediction mode with new configuration flags.
    • Enabled a new prediction component in supported branch predictor setups.
    • Expanded branch prediction metadata handling to support more prediction components.
  • Bug Fixes

    • Improved bounds checking when recording prediction metadata.
    • Added safeguards to prevent enabling more prediction components than supported.

Cao Jiaming added 2 commits June 24, 2026 13:15
Change-Id: I08c3756fcc972e6d4170eb7fff4dee654a2320a4
Change-Id: Ided6257af49a9bdbddff49ba29a8a1d2d53e8942
Copilot AI review requested due to automatic review settings June 24, 2026 07:29
coderabbitai Bot commented Jun 24, 2026

Walkthrough

Introduces BTBLLBPX, a new BTB-side direction-override predictor component for DecoupledBPUWithBTB. It adds a templated set-associative context/pattern store, a pattern buffer timing gate, RCR history snapshots, saturating counters, hashing utilities, stats, and full build/SimObject/CLI wiring.

Changes

BTBLLBPX BTB Direction-Override Predictor

| Layer / File(s) | Summary |
|---|---|
| Shared data contracts and component capacity guard (src/cpu/pred/btb/common.hh, src/cpu/pred/btb/btb_tage.cc) | Introduces MaxBTBPredComponents = 16, widens FetchTarget::predMetas to 16 entries, extends TageInfoForMGSC with tage_final_provider_table and tage_final_provider_is_alt fields, populates them in BTBTAGE::lookupHelper, and adds required standard-library includes. |
| SetAssociativeStore cache utility (src/cpu/pred/btb/llbpx_cache.hh) | New templated SetAssociativeStore<Entry> providing LRU set-associative lookup, tick-based lastTouch update on hit, and replacement-score/lastTouch victim selection on allocate. |
| BTBLLBPX class declaration (src/cpu/pred/btb/btb_llbpx.hh) | Declares BTBLLBPX (extends TimedBaseBTBPredictor) with internal structures PatternEntry, ContextEntry, PatternBufferEntry, BranchMeta, RCRRecord, LLBPXMeta, ThreadState, public override declarations, helper method declarations, and conditional LLBPXStats definition. |
| BTBLLBPX construction, stats, lookup, and prediction (src/cpu/pred/btb/btb_llbpx.cc, lines 1–299) | Constructor with parameter validation and LLBPXStats registration; predictorTid/getPredictionMeta accessors; putPCHistory prediction loop iterating stages and BTB entries; lookup() computing context/pattern keys, consulting stores, evaluating depth/timing eligibility, and optionally overriding direction. |
| BTBLLBPX history, training, and pattern buffer (src/cpu/pred/btb/btb_llbpx.cc, lines 301–499) | specUpdateGHist, recoverHist, recoverPHist for RCR snapshot/restore; update() for reconciling predictions and triggering allocateFor() or updatePattern(); rememberPattern, findPatternBuffer, pushRCR, restoreRCR for pattern-buffer bookkeeping. |
| BTBLLBPX hashing and counter utilities (src/cpu/pred/btb/btb_llbpx.cc, lines 500–568) | hashBits, mix (64-bit), contextKey, patternKey, tagFromKey, and updateCounter (saturating) used across prediction and training paths. |
| DecoupledBPUWithBTB wiring and bounds guards (src/cpu/pred/btb/decoupled_bpred.hh, src/cpu/pred/btb/decoupled_bpred.cc) | Adds BTBLLBPX* llbpx member and include; initializes it in the constructor, conditionally appends to components, enforces panic_if(numComponents > MaxBTBPredComponents), and adds assert(i < MaxBTBPredComponents) in createFetchTargetEntry. |
| SimObject, build registration, and CLI config (src/cpu/pred/BranchPredictor.py, src/cpu/pred/SConscript, configs/common/xiangshan.py, configs/example/kmhv3.py) | Declares BTBLLBPX SimObject with all parameters and wires llbpx into DecoupledBPUWithBTB; registers BTBLLBPX type, btb/btb_llbpx.cc source, and LLBPX debug flag; adds --enable-llbpx and --enable-llbpx-timing CLI flags; sets llbpx and llbpx.enableTiming in kmhv3.py. |
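
The SetAssociativeStore entry in the table above describes lookup with a tick-based lastTouch update on hit, plus victim selection by replacement score and lastTouch on allocate. A minimal Python model of that behavior might look like the following. The class layout, the set-index computation (`key % num_sets`), and the victim ordering (invalid way first, then lowest score, then oldest lastTouch) are assumptions made for illustration and are not taken from llbpx_cache.hh.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    valid: bool = False
    tag: int = 0
    last_touch: int = 0
    score: int = 0  # hypothetical replacement score (higher = keep longer)


class SetAssociativeStore:
    def __init__(self, num_sets: int, ways: int):
        if num_sets <= 0 or ways <= 0:
            raise ValueError("num_sets and ways must be positive")
        self.num_sets = num_sets
        self.sets = [[Entry() for _ in range(ways)] for _ in range(num_sets)]
        self.tick = 0  # stand-in for the simulator tick

    def _set_for(self, key: int):
        return self.sets[key % self.num_sets]

    def find(self, key: int, tag: int):
        """Return the matching entry and refresh its last_touch, else None."""
        self.tick += 1
        for entry in self._set_for(key):
            if entry.valid and entry.tag == tag:
                entry.last_touch = self.tick
                return entry
        return None

    def allocate(self, key: int, tag: int):
        """Overwrite a victim way: invalid first, then low score, then oldest."""
        self.tick += 1
        ways = self._set_for(key)
        victim = min(ways, key=lambda e: (e.valid, e.score, e.last_touch))
        victim.valid = True
        victim.tag = tag
        victim.score = 0
        victim.last_touch = self.tick
        return victim


store = SetAssociativeStore(num_sets=4, ways=2)
store.allocate(key=1, tag=0xA)
store.allocate(key=5, tag=0xB)  # same set as key 1, since 5 % 4 == 1
store.find(1, 0xA)              # touching A makes B the older entry
store.allocate(key=9, tag=0xC)  # set is full, so B is evicted
```

With equal scores the model degenerates to plain LRU, which is why the third allocation replaces the entry that was not touched most recently.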

Sequence Diagram(s)

sequenceDiagram
  participant DecoupledBPU as DecoupledBPUWithBTB
  participant BTBLLBPX
  participant ContextStore as SetAssociativeStore[Context]
  participant PatternStore as SetAssociativeStore[Pattern]
  participant PatternBuffer

  DecoupledBPU->>BTBLLBPX: putPCHistory(startAddr, history, stagePreds)
  loop each BTB entry in stagePreds
    BTBLLBPX->>BTBLLBPX: compute contextKey / patternKey
    BTBLLBPX->>ContextStore: find(contextKey, contextTag)
    ContextStore-->>BTBLLBPX: ContextEntry* (hit) or nullptr
    BTBLLBPX->>PatternStore: find(patternKey, patternTag)
    PatternStore-->>BTBLLBPX: PatternEntry* (hit) or nullptr
    alt enableTiming
      BTBLLBPX->>PatternBuffer: findPatternBuffer(key, tag)
      PatternBuffer-->>BTBLLBPX: readyTick check result
    end
    BTBLLBPX->>BTBLLBPX: apply depth/timing eligibility gates
    BTBLLBPX->>DecoupledBPU: override direction in stagePreds (if eligible)
  end
  BTBLLBPX->>BTBLLBPX: store LLBPXMeta in ThreadState

  DecoupledBPU->>BTBLLBPX: update(FetchTarget)
  BTBLLBPX->>BTBLLBPX: locate BranchMeta per BTB entry
  alt provider was used
    BTBLLBPX->>PatternStore: updatePattern (saturating counter)
    BTBLLBPX->>PatternBuffer: rememberPattern (dirty/readyTick)
  else misprediction or missing context/pattern
    BTBLLBPX->>ContextStore: allocate(contextKey, contextTag)
    BTBLLBPX->>PatternStore: allocate(patternKey, patternTag)
    BTBLLBPX->>PatternBuffer: rememberPattern
  end

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • OpenXiangShan/GEM5#627: Both PRs modify BTBTAGE::lookupHelper in btb_tage.cc; this PR adds final-provider metadata fields that sit alongside the TAGE lookup refactoring in that PR.
  • OpenXiangShan/GEM5#775: Both PRs extend DecoupledBPUWithBTB by adding new BTB predictor components and adjusting per-component metadata wiring.
  • OpenXiangShan/GEM5#877: This PR's BTBLLBPX implements specUpdateGHist/recoverPHist interfaces that were refactored in that PR, making the two tightly coupled at the history-replay API level.

Suggested labels

align-kmhv3

Suggested reviewers

  • jensen-yan
  • Yakkhini

Poem

🐇 Hopping through the BTB maze,
A new LLBPX lights the ways—
Context keys and patterns hashed,
Saturating counters, timing gated fast.
RCR records fill my burrow deep,
Direction overrides, no mis-predict sleep! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title matches the main change: adding LLBP-X provider support and enabling it in kmhv3. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces a new BTB-side direction override component (LLBP-X) into the decoupled BTB branch predictor stack, extends the predictor metadata capacity to accommodate the added component, and enables LLBP-X in the XiangShan kmhv3 configuration (with an optional timing gate).

Changes:

  • Add the BTBLLBPX predictor component (C++ implementation + SimObject params) and integrate it into DecoupledBPUWithBTB.
  • Expand per-stream predictor metadata capacity (and add a component-count guard) to support additional BTB predictor components.
  • Enable LLBP-X in configs/example/kmhv3.py and add CLI options related to LLBP-X timing.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| src/cpu/pred/SConscript | Registers BTBLLBPX SimObject, builds btb_llbpx.cc, and adds LLBPX debug flag. |
| src/cpu/pred/btb/llbpx_cache.hh | Adds a small set-associative store utility used by LLBP-X. |
| src/cpu/pred/btb/decoupled_bpred.hh | Wires BTBLLBPX into the decoupled BTB predictor composition. |
| src/cpu/pred/btb/decoupled_bpred.cc | Adds LLBP-X to component list, introduces component-count guard, and stores per-component metadata. |
| src/cpu/pred/btb/common.hh | Extends TageInfoForMGSC provider fields; increases FetchTarget::predMetas capacity via MaxBTBPredComponents. |
| src/cpu/pred/btb/btb_tage.cc | Populates new provider-related fields for downstream consumers (e.g., LLBP-X). |
| src/cpu/pred/btb/btb_llbpx.hh | Declares the new BTBLLBPX predictor component. |
| src/cpu/pred/btb/btb_llbpx.cc | Implements LLBP-X lookup/override, timing gate, recovery, and update/allocation logic. |
| src/cpu/pred/BranchPredictor.py | Adds the BTBLLBPX SimObject and plugs it into DecoupledBPUWithBTB params. |
| configs/example/kmhv3.py | Enables LLBP-X by default in kmhv3 and optionally enables its timing gate via CLI. |
| configs/common/xiangshan.py | Adds CLI flags related to LLBP-X enable/timing. |

Comment on lines +890 to +903
parser.add_argument(
    "--enable-llbpx",
    "--enable-llbp",
    action="store_true",
    default=False,
    help="Enable the experimental BTB-side LLBP-X direction override component",
)
parser.add_argument(
    "--enable-llbpx-timing",
    "--enable-llbp-timing",
    action="store_true",
    default=False,
    help="Enable the LLBP-X Pattern Buffer timing gate",
)

Comment on lines +107 to +111
BTBLLBPX::putPCHistory(Addr startAddr,
                       const boost::dynamic_bitset<> &history,
                       std::vector<FullBTBPrediction> &stagePreds)
{
    if (stagePreds.empty()) {

Comment on lines +94 to +96
panic_if(numComponents > MaxBTBPredComponents,
         "BTB predictor component count %u exceeds predMetas capacity %u",
         numComponents, MaxBTBPredComponents);

coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@configs/example/kmhv3.py`:
- Around line 113-115: The kmhv3 branch predictor setup currently hard-codes
llbpx on, so the new CLI toggle is ignored. Update the predictor initialization
in the kmhv3 config so `cpu.branchPred.llbpx.enabled` is driven by the parsed
args flag (the same one used for `enable_llbpx_timing`), and keep the timing
option tied to that same `args` value in the `cpu.branchPred.llbpx` block.

In `@src/cpu/pred/btb/btb_llbpx.cc`:
- Around line 423-434: Snapshot the old prediction in BTBLLBPX::updatePattern
before calling updateCounter, because pattern->taken() is currently read after
the counter is modified and can misclassify just-corrected weak states as
correct. Store the pre-update taken/not-taken result in a local variable, use
that saved value for the confidence adjustment, and keep the rest of the logic
in updatePattern and replacementScore unchanged.
- Around line 141-145: The LLBPX gating in lookupHelper() is using the wrong
TAGE depth field, so it may reject valid overrides when TAGE falls back from the
main provider to alt/base. Update the base provider depth lookup in btb_llbpx.cc
to read tage_final_provider_table from stagePred.tageInfoForMgscs instead of
tage_provider_table, keeping the existing branchPC lookup and
providerDepthEligible comparison intact.
- Around line 347-350: The recoverPHist() override in BTBLLBPX currently does
not restore threadState[tid].rcr, leaving the recent-control-record state
speculative after a squash when path-history recovery is used. Update
recoverPHist() to mirror the RCR restoration done in recoverHist(), using the
provided history/entry/update data so threadState[tid].rcr is brought back to
the correct committed state before the next contextKey() call.
- Around line 82-86: The BTBLLBPX initialization currently validates tagBits,
keyBits, and patternBufferSize but never checks counterBits, so invalid values
can break updateCounter() with an illegal shift or overflow the short saturation
range. Add a panic_if guard near the existing constructor validation in BTBLLBPX
to require counterBits be within a safe positive bound before any call sites
that rely on updateCounter(), and make sure the check is placed alongside the
other parameter validations in btb_llbpx.cc.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6e323617-b9b2-4ceb-95bd-dd48e57b2ae1

📥 Commits

Reviewing files that changed from the base of the PR and between 648e246 and 320bfa6.

📒 Files selected for processing (11)
  • configs/common/xiangshan.py
  • configs/example/kmhv3.py
  • src/cpu/pred/BranchPredictor.py
  • src/cpu/pred/SConscript
  • src/cpu/pred/btb/btb_llbpx.cc
  • src/cpu/pred/btb/btb_llbpx.hh
  • src/cpu/pred/btb/btb_tage.cc
  • src/cpu/pred/btb/common.hh
  • src/cpu/pred/btb/decoupled_bpred.cc
  • src/cpu/pred/btb/decoupled_bpred.hh
  • src/cpu/pred/btb/llbpx_cache.hh

Comment thread configs/example/kmhv3.py
Comment on lines +113 to +115
cpu.branchPred.llbpx.enabled = True
cpu.branchPred.llbpx.enableTiming = bool(
    getattr(args, 'enable_llbpx_timing', False))

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Wire --enable-llbpx into the kmhv3 predictor toggle.

Line 113 hard-codes cpu.branchPred.llbpx.enabled = True, so the new --enable-llbpx/--enable-llbp flag has no effect.

Suggested fix
-            cpu.branchPred.llbpx.enabled = True
+            cpu.branchPred.llbpx.enabled = bool(
+                getattr(args, 'enable_llbpx', False))
             cpu.branchPred.llbpx.enableTiming = bool(
                 getattr(args, 'enable_llbpx_timing', False))
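
To illustrate the suggestion, the sketch below reproduces the two `add_argument` calls quoted earlier in this review with Python's standard `argparse` and derives both settings from the parsed arguments. `LLBPXConfig` and `configure_llbpx` are hypothetical stand-ins for the `cpu.branchPred.llbpx` SimObject and the kmhv3 setup code. Note that with `default=False` the component stays off unless the flag is passed, which differs from the PR description's stated intent of enabling LLBP-X by default in kmhv3.py; which behavior is wanted is a decision for the author.

```python
import argparse
from dataclasses import dataclass


@dataclass
class LLBPXConfig:
    """Hypothetical stand-in for cpu.branchPred.llbpx."""
    enabled: bool = False
    enableTiming: bool = False


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # argparse derives dest from the first long option string: enable_llbpx.
    parser.add_argument("--enable-llbpx", "--enable-llbp",
                        action="store_true", default=False)
    parser.add_argument("--enable-llbpx-timing", "--enable-llbp-timing",
                        action="store_true", default=False)
    return parser


def configure_llbpx(args: argparse.Namespace) -> LLBPXConfig:
    # getattr with a default tolerates callers whose parser lacks the flags.
    return LLBPXConfig(
        enabled=bool(getattr(args, "enable_llbpx", False)),
        enableTiming=bool(getattr(args, "enable_llbpx_timing", False)),
    )
```

Both spellings of each flag set the same attribute, so `--enable-llbp` and `--enable-llbpx` are interchangeable in this sketch.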

Comment on lines +82 to +86
panic_if(tagBits == 0 || tagBits >= 64, "BTBLLBPX tagBits must be in [1, 63]");
panic_if(keyBits == 0 || keyBits >= 64, "BTBLLBPX keyBits must be in [1, 63]");
panic_if(enableTiming && patternBufferSize == 0,
         "BTBLLBPX timing mode requires a non-zero patternBufferSize");
warn_if(adaptCtxDepth, "BTBLLBPX Phase 1 ignores adaptive context depth");

🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Inspect BTBLLBPX parameter definitions and uses:"
fd -a 'BranchPredictor.py|btb_llbpx.hh|btb_llbpx.cc' .

echo
echo "Search for counterBits declarations/defaults/validation:"
rg -n -C3 '\bcounterBits\b|BTBLLBPX' src configs

Repository: OpenXiangShan/GEM5

Length of output: 19375


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "BTBLLBPX header (parameter/member definitions):"
sed -n '180,230p' src/cpu/pred/btb/btb_llbpx.hh

echo
echo "BTBLLBPX constructor and counter update implementation:"
sed -n '65,90p' src/cpu/pred/btb/btb_llbpx.cc
echo
sed -n '549,565p' src/cpu/pred/btb/btb_llbpx.cc

echo
echo "BTBLLBPX Python params:"
sed -n '1122,1145p' src/cpu/pred/BranchPredictor.py

Repository: OpenXiangShan/GEM5

Length of output: 4879


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
# Read-only probe of the shift expression's edge cases.
# This mirrors the source expression logically: max = (1 << (counterBits - 1)) - 1
# and min = -(1 << (counterBits - 1))
for bits in [0, 1, 2, 3, 31, 32, 63, 64]:
    try:
        v = 1 << (bits - 1)
        print(bits, v.bit_length())
    except Exception as e:
        print(bits, type(e).__name__, e)
PY

Repository: OpenXiangShan/GEM5

Length of output: 226


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Search for any validation or assumptions on BTBLLBPX counterBits:"
rg -n -C2 'counterBits|Pattern counter width|panic_if|fatal_if|warn_if' src/cpu/pred/btb src/cpu/pred/BranchPredictor.py

echo
echo "Inspect the generated param type declaration, if present:"
fd -a 'BTBLLBPX.hh|BTBLLBPX.py|BTBLLBPX.*' build src params 2>/dev/null || true

Repository: OpenXiangShan/GEM5

Length of output: 5541


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
# Probe the exact problematic ranges for the source expression.
# We keep this read-only and do not import or execute repository code.
def expr(bits):
    return (1 << (bits - 1)) - 1, -(1 << (bits - 1))

for bits in [0, 1, 2, 3, 15, 16, 17, 31, 32, 33]:
    try:
        print(bits, expr(bits))
    except Exception as e:
        print(bits, type(e).__name__, e)
PY

Repository: OpenXiangShan/GEM5

Length of output: 364


Guard counterBits before updateCounter()

counterBits has no validation. A value of 0 makes (1 << (counterBits - 1)) a shift by a negative count, which is undefined behavior in C++, and sufficiently large values overflow the short saturation bounds.
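
A minimal sketch of the missing guard and of a signed saturating counter, using the bounds expression quoted in the analysis scripts above (max = (1 << (counterBits - 1)) - 1, min = -(1 << (counterBits - 1))). The upper limit of 16 assumes the counter is stored in a 16-bit signed short; the function names are hypothetical and the real updateCounter() may differ.

```python
from typing import Tuple

MAX_COUNTER_BITS = 16  # assumption: the counter lives in a 16-bit signed short


def counter_bounds(counter_bits: int) -> Tuple[int, int]:
    """Return (min, max) for a signed saturating counter of the given width."""
    if not 1 <= counter_bits <= MAX_COUNTER_BITS:
        raise ValueError(
            f"counterBits must be in [1, {MAX_COUNTER_BITS}], got {counter_bits}")
    half = 1 << (counter_bits - 1)
    return -half, half - 1


def update_counter(taken: bool, counter: int, counter_bits: int) -> int:
    """Step the counter toward taken (+1) or not-taken (-1), saturating."""
    lo, hi = counter_bounds(counter_bits)
    if taken:
        return min(counter + 1, hi)
    return max(counter - 1, lo)
```

Rejecting the width up front means the shift is never evaluated with a negative count, which is the same effect a panic_if next to the existing constructor checks would have.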

Source: Linters/SAST tools

Comment on lines +141 to +145
int baseProviderHistIdx = -2;
auto tageInfoIt = stagePred.tageInfoForMgscs.find(branchPC);
if (tageInfoIt != stagePred.tageInfoForMgscs.end()) {
    baseProviderHistIdx = tageInfoIt->second.tage_provider_table;
}

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Use TAGE’s final provider depth here, not the main provider depth.

lookupHelper() now publishes both tage_provider_table and tage_final_provider_table, but this path still gates LLBPX against tage_provider_table. When TAGE falls back to alt/base, Line 144 makes providerDepthEligible compare against the wrong source and can reject overrides that should be legal.

Suggested fix
             int baseProviderHistIdx = -2;
             auto tageInfoIt = stagePred.tageInfoForMgscs.find(branchPC);
             if (tageInfoIt != stagePred.tageInfoForMgscs.end()) {
-                baseProviderHistIdx = tageInfoIt->second.tage_provider_table;
+                baseProviderHistIdx =
+                    tageInfoIt->second.tage_final_provider_table;
             }
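
As a rough illustration of why the field matters, the sketch below assumes a gate that lets LLBP-X override only when its pattern depth exceeds the depth of the TAGE table it is compared against. That rule, the depth values, and the helper name are assumptions for illustration; the actual providerDepthEligible logic in btb_llbpx.cc may be different.

```python
from dataclasses import dataclass


@dataclass
class TageInfo:
    tage_provider_table: int        # main (longest-matching) provider
    tage_final_provider_table: int  # table that actually supplied the prediction


def provider_depth_eligible(llbpx_depth: int, base_depth: int) -> bool:
    # Assumed rule: LLBP-X may override only a shallower TAGE provider.
    return llbpx_depth > base_depth


# TAGE matched table 7 but fell back to its alternate prediction from table 2.
info = TageInfo(tage_provider_table=7, tage_final_provider_table=2)
llbpx_depth = 5

gated_on_main = provider_depth_eligible(llbpx_depth, info.tage_provider_table)
gated_on_final = provider_depth_eligible(llbpx_depth, info.tage_final_provider_table)
```

Under these assumed numbers the gate rejects the override when it reads the main provider and accepts it when it reads the final provider, which is the mismatch the comment describes.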

Comment on lines +347 to +350
BTBLLBPX::recoverPHist(const boost::dynamic_bitset<> &history,
                       const FetchTarget &entry,
                       const PathHistoryUpdate &update)
{

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Recover the RCR in path-history mode too.

This empty override leaves threadState[tid].rcr on speculative state after a squash when recovery flows through recoverPHist() instead of recoverHist(). The next contextKey() then hashes the wrong recent-control record sequence.

Suggested fix
 void
 BTBLLBPX::recoverPHist(const boost::dynamic_bitset<> &history,
                        const FetchTarget &entry,
                        const PathHistoryUpdate &update)
 {
+    auto meta = std::static_pointer_cast<LLBPXMeta>(
+        entry.predMetas[getComponentIdx()]);
+    if (!meta || entry.tid >= threadState.size()) {
+        return;
+    }
+
+    restoreRCR(entry.tid, *meta);
+    if (entry.exeTaken) {
+        pushRCR(entry.tid, entry.exeBranchInfo, true);
+    }
+#ifndef UNIT_TEST
+    llbpxStats.rcrRecover++;
+#endif
 }
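
The recovery pattern in this suggestion (restore the snapshot taken at prediction time, then replay the resolved branch if it was taken) can be modeled in a few lines. The `RCR` class below, its bounded deque, and the method names are hypothetical simplifications, not the PR's pushRCR/restoreRCR implementations.

```python
from collections import deque


class RCR:
    """Toy recent-control-record history: a bounded list of branch PCs."""

    def __init__(self, depth: int):
        self.records = deque(maxlen=depth)

    def push(self, pc: int) -> None:
        self.records.append(pc)

    def snapshot(self) -> tuple:
        return tuple(self.records)

    def restore(self, snap: tuple) -> None:
        self.records.clear()
        self.records.extend(snap)


def recover(rcr: RCR, snap: tuple, exe_taken: bool, exe_pc: int) -> None:
    """Roll back to the prediction-time snapshot, then apply the real outcome."""
    rcr.restore(snap)
    if exe_taken:
        rcr.push(exe_pc)


rcr = RCR(depth=4)
rcr.push(0x100)
snap = rcr.snapshot()  # state saved alongside the prediction metadata
rcr.push(0x200)        # speculative, wrong-path updates
rcr.push(0x300)
recover(rcr, snap, exe_taken=True, exe_pc=0x180)
```

If recovery is skipped, as with the empty override, the two wrong-path records stay in the history and feed into the next context hash.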

Comment on lines +423 to +434
BTBLLBPX::updatePattern(const BranchMeta &meta, bool actualTaken)
{
    auto *pattern = patterns.find(meta.key, meta.patternTag);
    if (!pattern) {
        return;
    }
    updateCounter(actualTaken, pattern->counter);
    if (pattern->taken() == actualTaken && pattern->confidence < 15) {
        pattern->confidence++;
    } else if (pattern->confidence > 0) {
        pattern->confidence--;
    }

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Snapshot the old prediction before updating the counter.

Line 430 checks pattern->taken() after updateCounter(). A weak counter that was wrong and just crossed zero will be treated as “correct” and gain confidence, which skews replacementScore() toward recently corrected mistakes.

Suggested fix
 void
 BTBLLBPX::updatePattern(const BranchMeta &meta, bool actualTaken)
 {
     auto *pattern = patterns.find(meta.key, meta.patternTag);
     if (!pattern) {
         return;
     }
+    const bool predictedTaken = pattern->taken();
     updateCounter(actualTaken, pattern->counter);
-    if (pattern->taken() == actualTaken && pattern->confidence < 15) {
+    if (predictedTaken == actualTaken && pattern->confidence < 15) {
         pattern->confidence++;
     } else if (pattern->confidence > 0) {
         pattern->confidence--;
     }
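
A worked case makes the effect concrete. The sketch assumes taken() means `counter >= 0` and that the counter moves by one per update within a 3-bit signed range; both are assumptions about PatternEntry, and the Python functions are hypothetical models of the two orderings.

```python
def step(counter: int, actual_taken: bool, lo: int = -4, hi: int = 3) -> int:
    """Saturating +/-1 update (3-bit signed range assumed)."""
    return min(counter + 1, hi) if actual_taken else max(counter - 1, lo)


def taken(counter: int) -> bool:
    return counter >= 0  # assumed sign convention


def update_after(counter: int, confidence: int, actual_taken: bool):
    """Ordering in the PR: compare against the already-updated counter."""
    counter = step(counter, actual_taken)
    if taken(counter) == actual_taken and confidence < 15:
        confidence += 1
    elif confidence > 0:
        confidence -= 1
    return counter, confidence


def update_snapshot(counter: int, confidence: int, actual_taken: bool):
    """Suggested ordering: compare against the pre-update prediction."""
    predicted = taken(counter)
    counter = step(counter, actual_taken)
    if predicted == actual_taken and confidence < 15:
        confidence += 1
    elif confidence > 0:
        confidence -= 1
    return counter, confidence


# Weakly not-taken counter (-1) that mispredicts a taken branch.
print(update_after(-1, 5, True))     # (0, 6): the miss is rewarded
print(update_snapshot(-1, 5, True))  # (0, 4): the miss is penalized
```

Both models keep the if/else-if structure of the quoted code, so a correct prediction at confidence 15 falls through to the decrement branch in either ordering; that may be worth a separate look.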

@github-actions

🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
|---|---|---|
| Base (xs-dev) | 2.3155 | - |
| This PR | 2.3155 | ➡️ 0.0000 (0.00%) |

✅ Difftest smoke test passed!

@XiangShanRobot

[Generated by GEM5 Performance Robot]
commit: 320bfa6
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

| | PR | Master | Diff(%) |
|---|---|---|---|
| Score | 18.86 | 18.86 | 0.00 |

[Generated by GEM5 Performance Robot]
commit: 320bfa6
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

| | PR | Previous Commit | Diff(%) |
|---|---|---|---|
| Score | 18.86 | 18.86 | 0.00 |
