
cpu-o3, mem-cache: Model L1D fake DCache mainpipe #912

Merged: XXtaoo merged 9 commits into xs-dev from merge/l1d-fake-mainpipe-xs-dev on Jun 26, 2026
XXtaoo (Collaborator) commented Jun 16, 2026

Motivation

This PR adds a fake L1D DCache mainpipe model for timing and performance modeling.

In the current classic-cache path, L1D refill and StoreBuffer traffic can access cache resources without modeling the same mainpipe-side arbitration and resource conflicts as the hardware pipeline. This can make GEM5 underestimate contention from refill/store traffic, especially bank/data-array conflicts, same-set refill hazards, tag-write blocking, and StoreBuffer backpressure.

The fake mainpipe models these mainpipe-side timing constraints without turning the classic cache into a real mainpipe-based cache implementation. The goal is to improve performance modeling accuracy for L1D refill and StoreBuffer interactions.

Changes

  • Add an LSQ-local fake DCache mainpipe with S1/S2/S3-style buffered stages.
  • Model refill and StoreBuffer requests as fake mainpipe requests.
  • Model mainpipe resource conflicts, including:
    • bank/data-array conflicts
    • same-set refill hazards
    • tag-write blocking
    • S1 backpressure
    • StoreBuffer replay after failed S2 issue
  • Delay effective MSHR availability until the refill has completed fake mainpipe writeback.
  • Issue StoreBuffer eviction packets from fake mainpipe S2 instead of sending them to the classic cache before mainpipe admission.
  • Route prefetch refill responses through the same fake mainpipe path.
  • Count allocated L1D fill/update traffic as fake mainpipe refill traffic.
  • Add basic fake mainpipe statistics for refill/store entries and blocking causes.
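
As a rough illustration of the S1/S2/S3-style buffered stages and S1 backpressure listed above, here is a minimal sketch. All names (`FakeReq`, `FakePipe`, `admit`, `tick`) are hypothetical and are not taken from the PR's code; the real model also handles bank, set, and tag-write conflicts that this sketch leaves out.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical request sources, mirroring the two sources named in the PR.
enum class Source { Refill, StoreBuffer };

struct FakeReq {
    Source source;
    uint64_t addr;
};

// Assumed three-slot pipe: index 0 is S1, index 1 is S2, index 2 is S3.
class FakePipe {
  public:
    // Admission fails while S1 is still occupied (S1 backpressure).
    bool admit(const FakeReq &req) {
        if (stages[0].has_value())
            return false;
        stages[0] = req;
        return true;
    }

    // Advance one cycle: the S3 occupant leaves, S2 moves to S3, S1 to S2.
    std::optional<FakeReq> tick() {
        std::optional<FakeReq> retired = stages[2];
        stages[2] = stages[1];
        stages[1] = stages[0];
        stages[0].reset();
        return retired;
    }

    // stage is 1-based (1 = S1, 2 = S2, 3 = S3).
    bool occupied(int stage) const { return stages[stage - 1].has_value(); }

  private:
    std::array<std::optional<FakeReq>, 3> stages;
};
```

In this sketch a second request offered in the same cycle is simply rejected; a caller would keep it queued and offer it again after the next `tick()`.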

Scope

This PR keeps the fake mainpipe as a performance model. It does not make the classic cache use a real DCache mainpipe for functional accesses.

The modeled request sources are currently L1D refill and StoreBuffer traffic.

Validation

  • Built successfully with:
      scons build/RISCV/gem5.opt --gold-linker -j64
  • Triggered manual performance test:
      Manual Performance Test
      configuration: idealkmhv3.py
      benchmark_type: gcc15-spec06-0.8c
      branch: merge/l1d-fake-mainpipe-xs-dev
      run: https://github.com/OpenXiangShan/GEM5/actions/runs/27602319072

The performance workflow was still running at the time this PR description was prepared.

Summary by CodeRabbit

Release Notes

  • Refactor / New Features

    • Introduced a new fake D-cache “MainPipe” flow to orchestrate store-buffer eviction/admission, replay behavior, and eviction blocking.
    • Updated store-buffer writeback routing and writeback completion to reflect MainPipe stage activity and replay queue status.
  • Metrics

    • Added detailed MainPipe statistics for refill notification outcomes plus store-buffer admission/blocking and replay timing reasons.
  • Cache Behavior

    • Added MainPipe-aware MSHR credit tracking and updated D-cache prefetch readiness.
    • Added packet flags to track MainPipe store-buffer requests and hits.

XXtaoo added 6 commits June 16, 2026 15:17
Replace the old per-div refill bitmask model with an LSQ-local fake DCache MainPipe that models refill and store-buffer resource blocking. The fake pipe keeps S1/S2/S3 state, uses refill priority over store-buffer requests, models same-set hazards, refill tag-write blocking, and S1 data-read versus S3 data-write bank conflicts.

Gate store-buffer writeback through the fake MainPipe before sending to the classic cache, while keeping MainPipe-local blocking separate from classic-cache port blocking. Derive load bank occupancy from fake MainPipe S1/S3 resource usage and add stats for refill/store admission and blocking causes.

The classic cache remains the real memory access path; this MainPipe is only a timing/resource blocking model.

Change-Id: I1e9957e34042bfd3162bbdecdae297a59aa31e31
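
The bank-conflict and same-set checks described in this commit message can be sketched with bitmasks. The geometry below (64-byte lines, eight 8-byte banks, 64 sets) and every function name are assumptions made for illustration only; the PR's actual bank-mask and set-key helpers may differ.

```cpp
#include <cassert>
#include <cstdint>

// Assumed cache geometry, chosen only to keep the example small.
constexpr unsigned kLineBytes = 64;
constexpr unsigned kBankBytes = 8;
constexpr unsigned kNumBanks = 8;
constexpr unsigned kNumSets = 64;

// Bitmask of the data banks touched by [addr, addr + size) inside one line.
inline uint8_t bankMask(uint64_t addr, unsigned size) {
    unsigned offset = static_cast<unsigned>(addr % kLineBytes);
    unsigned first = offset / kBankBytes;
    unsigned last = (offset + size - 1) / kBankBytes;
    uint8_t mask = 0;
    for (unsigned b = first; b <= last && b < kNumBanks; ++b)
        mask |= static_cast<uint8_t>(1u << b);
    return mask;
}

// Set index used to detect same-set hazards (hypothetical mapping).
inline uint64_t setKey(uint64_t addr) {
    return (addr / kLineBytes) % kNumSets;
}

// An S1 data read conflicts with an S3 data write if they share any bank.
inline bool bankConflict(uint8_t s1ReadMask, uint8_t s3WriteMask) {
    return (s1ReadMask & s3WriteMask) != 0;
}

// Two requests hazard on each other if they map to the same set.
inline bool sameSetHazard(uint64_t a, uint64_t b) {
    return setKey(a) == setKey(b);
}
```

A full-line refill write produces a mask with every bank set, so under these assumptions it conflicts with any concurrent S1 data read.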
Keep the classic cache refill response path functional while delaying the effective availability of released L1D MSHR entries through the LSQ-local fake DCache mainpipe.

The refill response still services CPU targets when data returns, but the freed MSHR capacity is held by a fake credit until the refill leaves the fake mainpipe. This models refill mainpipe occupancy without moving actual cache accesses out of the classic cache path.

Sample the pre-release MSHR-full state before taking the fake credit so cache retry/unblock state is updated only when effective MSHR capacity is really released.

Change-Id: I4a0b5ad7825deb49ac65fc84f00462997f72c2b0
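
The fake-credit idea in this commit message can be sketched as a counter that keeps a freed MSHR slot unavailable until the refill leaves the fake mainpipe. The class and method names below are hypothetical and do not correspond to gem5's `MSHRQueue` interface; the sketch only shows why the full state is sampled before the release.

```cpp
#include <cassert>

// Hypothetical MSHR occupancy model with held credits.
class CreditedMshrs {
  public:
    explicit CreditedMshrs(int capacity) : capacity_(capacity) {}

    // Held credits count against capacity just like real entries.
    bool effectivelyFull() const {
        return allocated_ + heldCredits_ >= capacity_;
    }

    bool allocate() {
        if (effectivelyFull())
            return false;
        ++allocated_;
        return true;
    }

    // Refill data returned. The full state is sampled first, so the caller
    // is told to unblock only if effective capacity was really released.
    bool release(bool holdCredit) {
        bool wasFull = effectivelyFull();
        --allocated_;
        if (holdCredit)
            ++heldCredits_;
        return wasFull && !effectivelyFull();
    }

    // The refill left the fake mainpipe; give the slot back.
    bool returnCredit() {
        bool wasFull = effectivelyFull();
        --heldCredits_;
        return wasFull && !effectivelyFull();
    }

  private:
    int capacity_;
    int allocated_ = 0;
    int heldCredits_ = 0;
};
```

With a credit held, `release(true)` returns false even though the real entry is gone, and the unblock signal is deferred to `returnCredit()`.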
Delay StoreBuffer classic-cache timing requests until the LSQ-local fake DCache MainPipe reaches S2. StoreBuffer admission now only enters the fake pipe; the fake S2 callback performs the real sendTimingReq(), so StoreBuffer misses cannot allocate or merge MSHRs at fake-pipe admission time.

Add StoreBuffer entry state for fake-pipe residency and S2 replay exits. Requests that fail fake S2 issue now leave the fake pipe, enter a replay queue, and retry from S0 later. Keep pre-admission blocking on the existing blocked entry path.

Treat sending, in-mainpipe, and replay-queued entries as eviction-in-progress so younger same-line stores use a vice entry instead of mutating an eviction packet's payload. Keep StoreBuffer release on the existing cache response path.

Validation:
- git diff --check
- scons build/RISCV/gem5.opt --gold-linker -j64

Change-Id: I2a8943c04ff2dc5f2f900f7727dec57685a6c8be
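
The StoreBuffer entry states described in this commit message can be sketched as a small state machine. The enum values, struct, and function names are hypothetical; they illustrate the "eviction in progress" predicate and the S2 replay exit, not the PR's actual `StoreBufferEntry` fields.

```cpp
#include <cassert>
#include <deque>

// Hypothetical lifecycle of a store-buffer entry being evicted.
enum class SbState { Idle, InMainPipe, Sending, ReplayQueued };

struct SbEntry {
    int id;
    SbState state;
};

// Sending, in-mainpipe, and replay-queued entries all count as eviction in
// progress, so a younger same-line store must not modify their payload.
inline bool evictionInProgress(const SbEntry &e) {
    return e.state == SbState::Sending ||
           e.state == SbState::InMainPipe ||
           e.state == SbState::ReplayQueued;
}

// Called when the entry reaches fake S2. A successful issue corresponds to
// the real cache request being sent; a failed one leaves the pipe and waits
// in the replay queue to retry from S0 later.
inline void onFakeS2(SbEntry &e, bool issueOk, std::deque<SbEntry *> &replayQ) {
    if (issueOk) {
        e.state = SbState::Sending;
    } else {
        e.state = SbState::ReplayQueued;
        replayQ.push_back(&e);
    }
}
```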
Change-Id: I4c9143524d76850db3cd1e6fa80f0861db4d73df
Send allocating L1D demand and prefetch read refills into the existing fake DCache mainpipe, while excluding permission-only and special whole-line write responses from refill modeling.

Add counters to distinguish packet-LSQ refill notifications, owner-LSQ refill notifications, and skipped refill notifications without an LSQ target.

Change-Id: Ib5ea665c71304390186d3b1ae5364997704f39e6
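
The refill filtering and the three notification counters described in this commit message might look roughly like the sketch below. The flattened `RefillResp` fields are hypothetical stand-ins for packet properties, and the preference for a packet-attached LSQ over an owner LSQ is an assumption, not something the commit message states.

```cpp
#include <cassert>

// Hypothetical flattened view of a refill response.
struct RefillResp {
    bool allocatesLine;   // response fills a line in L1D
    bool isRead;          // demand or prefetch read
    bool permissionOnly;  // upgrade that carries no data
    bool wholeLineWrite;  // special whole-line write response
    bool hasPacketLsq;    // an LSQ is attached to the packet
    bool hasOwnerLsq;     // an LSQ is registered as cache owner
};

enum class Notify { NotModeled, PacketLsq, OwnerLsq, SkippedNoLsq };

struct NotifyStats {
    int packetLsq = 0;
    int ownerLsq = 0;
    int skippedNoLsq = 0;
};

inline Notify classifyRefill(const RefillResp &r, NotifyStats &stats) {
    // Only allocating read refills enter the fake mainpipe model.
    if (!r.allocatesLine || !r.isRead || r.permissionOnly || r.wholeLineWrite)
        return Notify::NotModeled;
    if (r.hasPacketLsq) {   // assumed to take priority over the owner LSQ
        ++stats.packetLsq;
        return Notify::PacketLsq;
    }
    if (r.hasOwnerLsq) {
        ++stats.ownerLsq;
        return Notify::OwnerLsq;
    }
    ++stats.skippedNoLsq;
    return Notify::SkippedNoLsq;
}
```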
Change-Id: I4bedc68ed0bc83ed444852d33b4f59d506b4dd8e
coderabbitai Bot commented Jun 16, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3f8ac6d9-9df4-4c5f-919c-b35dad051c5f

📥 Commits

Reviewing files that changed from the base of the PR and between 956a8d3 and 52dcc9c.

📒 Files selected for processing (3)
  • src/cpu/o3/issue_queue.cc
  • src/cpu/o3/lsq.cc
  • src/cpu/o3/lsq.hh
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/cpu/o3/lsq.hh
  • src/cpu/o3/lsq.cc

Walkthrough

This PR adds a fake Dcache MainPipe model in LSQ for store-buffer eviction and refill flow, and connects BaseCache MSHR credit handling, refill notification, and prefetch gating to that model.

Changes

LSQ MainPipe and store-buffer flow

| Layer | File(s) | Summary |
|---|---|---|
| StoreBuffer state and packet flags | src/cpu/o3/lsq.hh, src/cpu/o3/lsq.cc, src/cpu/o3/lsq_unit.cc, src/mem/packet.hh | Adds MainPipe and replay state to StoreBufferEntry, updates eviction-active checks and reset/release assertions, changes the merge guard in insertStoreBuffer, and adds packet flags/accessors for MainPipe sbuffer requests and hits. |
| LSQ MainPipe contracts and stats | src/cpu/o3/lsq.hh | Declares MainPipe request/slot types, stage/source enums, bank-mask and admission APIs, replay/retry entry points, the set-key helper, and expanded LSQ members and counters. |
| MainPipe advancement and refill queuing | src/cpu/o3/lsq.cc | Implements MainPipe stage advancement, conflict checks, bank marking, set-key computation, and refill request enqueueing; clearAddresses() now advances the MainPipe each cycle. |
| Store-buffer admission and replay handling | src/cpu/o3/lsq.cc | Routes store-buffer eviction requests through MainPipe admission, handles S2 issue outcomes, retries replayed entries before blocked entries, and updates writeback completion gating for active MainPipe and replay queues. |

BaseCache MainPipe integration

| Layer | File(s) | Summary |
|---|---|---|
| BaseCache MainPipe MSHR contracts | src/mem/cache/base.hh, src/mem/cache/mshr_queue.hh | Adds MainPipe credit and LSQ state, declares effective-fullness and prefetch helpers, and switches MSHR blocking checks to MainPipe-aware helpers. |
| BaseCache refill notification and prefetch gating | src/mem/cache/base.cc, src/cpu/o3/issue_queue.cc | Registers LSQ owners, tracks refill LSQ selection, coordinates refill notification with optional MSHR-credit holding and release, updates prefetch gating sites, and switches issue blocking to next-cycle tag-write prediction. |

Sequence Diagram(s)

sequenceDiagram
  participant SbufferRequest
  participant LSQ
  participant BaseCache
  participant MSHRQueue

  SbufferRequest->>LSQ: sendPacketToCache()
  LSQ->>LSQ: sbufferEnterDcacheMainPipe(pkt)
  LSQ->>BaseCache: sendTimingReq(pkt)
  BaseCache->>BaseCache: recvTimingResp()
  BaseCache->>LSQ: notifyDcacheRefill(addr, need_data_read, on_complete)
  BaseCache->>MSHRQueue: hold/release MainPipe credit
  LSQ->>LSQ: retryReplayStoreBuffer()

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • OpenXiangShan/GEM5#651: Both PRs modify src/cpu/o3/issue_queue.cc to block/replay loads based on LSQ D-cache tag-write timing.
  • OpenXiangShan/GEM5#780: Both PRs touch the LSQ store-buffer writeback path and the call site that drives it from the O3 pipeline.
  • OpenXiangShan/GEM5#892: Both PRs change LSQ store-buffer eviction state and replay-related paths in src/cpu/o3/lsq.cc.

Suggested labels

perf

Suggested reviewers

  • happy-lx
  • jensen-yan


🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 3.80%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding a fake L1D DCache mainpipe model across cpu-o3 and mem-cache. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@github-actions


🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
|---|---|---|
| Base (xs-dev) | 2.3155 | - |
| This PR | 2.3125 | 📉 -0.0030 (-0.13%) |

✅ Difftest smoke test passed!
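
For reference, the percentage in the Change column is consistent with a plain relative difference against the base IPC. A minimal sketch, using a hypothetical helper name that is not part of the CI workflow:

```cpp
#include <cassert>
#include <cmath>

// Relative change of `value` against `base`, in percent.
inline double relChangePct(double base, double value) {
    return (value - base) / base * 100.0;
}
```

For the figures above, `relChangePct(2.3155, 2.3125)` evaluates to roughly -0.13.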

@XXtaoo XXtaoo requested a review from happy-lx June 16, 2026 08:15
1. Model S3 as the tag write stage and keep S3 tag writes blocking S0 tag reads.

2. Model S4 as the data write stage and move fake mainpipe completion after S4 data write.

3. Move StoreBuffer cache request issue to fake S2 and let S2 misses exit the fake pipe instead of occupying S3/S4.

4. Release fake mainpipe MSHR credit no earlier than the next cache cycle.

Change-Id: I3fb7b6bce803197d0427667ac2498a32b71b78de
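
The stage roles listed in this commit message (S3 tag write, S4 data write, StoreBuffer misses leaving at S2) can be sketched as a five-slot pipe. Everything here is hypothetical: the `hit` flag stands in for the outcome of the S2 cache request, and treating only refills as S3 tag writers is an assumption of the sketch.

```cpp
#include <array>
#include <cassert>
#include <optional>

enum class Src { Refill, StoreBuffer };

struct Slot {
    Src src;
    bool hit;  // only meaningful for StoreBuffer requests in this sketch
};

struct Pipe5 {
    std::array<std::optional<Slot>, 5> s;  // s[0] = S0 ... s[4] = S4
    int completed = 0;   // requests that finished the S4 data write
    int exitedAtS2 = 0;  // StoreBuffer misses that left the pipe at S2

    // A refill in S3 performs a tag write, which blocks S0 tag reads.
    bool tagWriteInS3() const {
        return s[3].has_value() && s[3]->src == Src::Refill;
    }

    void tick() {
        if (s[4].has_value())
            ++completed;              // completion comes after S4
        s[4] = s[3];
        if (s[2].has_value() && s[2]->src == Src::StoreBuffer && !s[2]->hit) {
            ++exitedAtS2;             // miss exits instead of using S3/S4
            s[3].reset();
        } else {
            s[3] = s[2];
        }
        s[2] = s[1];
        s[1] = s[0];
        s[0].reset();
    }
};
```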
@github-actions


🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
|---|---|---|
| Base (xs-dev) | 2.3155 | - |
| This PR | 2.3181 | 📈 +0.0026 (+0.11%) |

✅ Difftest smoke test passed!

happy-lx previously approved these changes Jun 18, 2026
1. Merge StoreBuffer same-set admission checks into the common blocked path to avoid overlapping early returns.

2. Release fake mainpipe MSHR credit at the S4 completion tick without adding another cache cycle.

Change-Id: I49039d9c0a69a446a15c864dd7f2b3ddf78bdd4e
@github-actions


🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
|---|---|---|
| Base (xs-dev) | 2.3155 | - |
| This PR | 2.3181 | 📈 +0.0026 (+0.11%) |

✅ Difftest smoke test passed!

happy-lx previously approved these changes Jun 22, 2026
Make load issue blocking observe whether the fake DCache mainpipe will have a refill tag write in S3 when the selected load reaches loadpipe S0. This keeps the issue-side block aligned with next-cycle loadpipe admission and avoids spurious refill S4 load bank conflicts caused by checking the current live S3 state.

Change-Id: Id6a00ebbc7e1e530329c829052404cd92b04eec3
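
The difference between checking the live S3 state and predicting the state one cycle ahead can be sketched as follows. The sketch assumes the selected load reaches loadpipe S0 exactly one cycle after the issue decision and that the fake pipe advances one stage per cycle without stalling; the struct and function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical snapshot of where refills sit in the fake mainpipe.
struct PipeView {
    bool refillInS2;
    bool refillInS3;
};

// Old-style check: looks at the refill that is in S3 right now. By the time
// the load reaches S0, that refill has already moved on to S4.
inline bool blockLoadLive(const PipeView &now) {
    return now.refillInS3;
}

// Next-cycle prediction: a refill in S2 now will be writing tags in S3 when
// the load reaches S0.
inline bool blockLoadPredicted(const PipeView &now) {
    return now.refillInS2;
}

// One-cycle advance of the snapshot under the no-stall assumption.
inline PipeView advance(const PipeView &now, bool refillEntersS2) {
    return PipeView{refillEntersS2, now.refillInS2};
}
```

Under these assumptions the predicted check always agrees with the S3 state the load will actually see, while the live check can block a load that would not have conflicted.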
@github-actions


🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
|---|---|---|
| Base (xs-dev) | 2.3155 | - |
| This PR | 2.3184 | 📈 +0.0029 (+0.13%) |

✅ Difftest smoke test passed!

@XXtaoo XXtaoo merged commit b42594c into xs-dev Jun 26, 2026
2 checks passed
@XXtaoo XXtaoo deleted the merge/l1d-fake-mainpipe-xs-dev branch June 26, 2026 08:00